Function definitions
Symbols and definitions
The symbols and auxiliary function definitions used throughout are given in the following. Vectors are typeset in bold and refer to column vectors.
\otimes : indicates element-wise multiplication of two D-dimensional vectors, \otimes:\mathcal{R}^D\times\mathcal{R}^D\to\mathcal{R}^D, (\mathbf{x},\mathbf{y})\mapsto\mathrm{{diag}}(\mathbf{x})\times\mathbf{y}=(x_i\times y_i)_{i=1,\dots,D}
\|.\| : denotes the Euclidean norm, \|\mathbf{x}\|^2=\sum_i x_i^2.
[.] : denotes the nearest integer value
\mathbf{0} : =(0,\dots,0)^{\mathrm{T}} all zero vector
\mathbf{1} : =(1,\dots,1)^{\mathrm{T}} all one vector
\Lambda^{\!\alpha} : is a diagonal matrix in D dimensions with the ith diagonal element as \lambda_{ii} = \alpha^{\frac{1}{2}\frac{i-1}{D-1}}
f_\mathrm{pen} : \mathcal{R}^D\to\mathcal{R}, \mathbf{x}\mapsto 100\sum_{i=1}^D\max(0,|x_i| - 5)^2
\mathbf{1}_-^+ : a D-dimensional vector with entries of -1 or 1, each drawn with equal probability.
\mathbf{Q}, \mathbf{R} : orthogonal (rotation) matrices. For each function instance and each dimension, a single realization of \mathbf{Q} and of \mathbf{R} is used. Rotation matrices are generated from standard normally distributed entries by Gram-Schmidt orthogonalization. Columns and rows of a rotation matrix form an orthonormal basis.
\mathbf{R} : see \mathbf{Q}
T^{\beta}_\mathrm{asy} : \mathcal{R}^D\to\mathcal{R}^D, x_i\mapsto \begin{cases} x_i^{1+\beta \frac{i-1}{D-1}\sqrt{x_i}} & \text{~if~} x_i>0\\ x_i & \text{~otherwise~} \end{cases}, for i=1,\dots,D.
T_\mathrm{osz} : \mathcal{R}^n\to\mathcal{R}^n, for any positive integer n, maps element-wise x\mapsto\mathrm{{sign}}(x)\exp\left(\hat{x} + 0.049\left(\sin(c_1 \hat{x}) + \sin(c_2 \hat{x})\right)\right) with \hat{x}= \begin{cases} \log(|x|) & \text{if~} x\not=0 \\ \in\mathcal{R}& \text{otherwise} \end{cases}, \mathrm{{sign}}(x) = \begin{cases} -1 & \text{if~} x < 0 \\ 0 & \text{if~} x = 0 \\ 1 & \text{otherwise} \end{cases}, c_1 = \begin{cases} 10 & \text{if~} x > 0\\ 5.5 & \text{otherwise} \end{cases} and c_2 = \begin{cases} 7.9 & \text{if~} x > 0\\ 3.1 & \text{otherwise} \end{cases}
\mathbf{x}^\mathrm{opt} : optimal solution vector, such that f(\mathbf{x^\mathrm{opt}}) is minimal.
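The auxiliary definitions above translate directly into code. The following NumPy sketch implements \Lambda^{\!\alpha}, f_\mathrm{pen}, T^{\beta}_\mathrm{asy} and T_\mathrm{osz}; the function names are illustrative (not the official COCO identifiers), \hat{x} at x=0 is fixed to 0 (the definition leaves it arbitrary since \mathrm{sign}(0)=0), and D\ge2 is assumed where the exponent divides by D-1:

```python
import numpy as np

def lambda_alpha(alpha, D):
    """Diagonal matrix with lambda_ii = alpha^(0.5*(i-1)/(D-1)), i = 1..D (needs D >= 2)."""
    i = np.arange(D)
    return np.diag(alpha ** (0.5 * i / (D - 1)))

def f_pen(x):
    """Boundary penalty: 100 * sum_i max(0, |x_i| - 5)^2."""
    return 100.0 * np.sum(np.maximum(0.0, np.abs(x) - 5.0) ** 2)

def t_asy(x, beta):
    """T_asy: x_i -> x_i^(1 + beta*(i-1)/(D-1)*sqrt(x_i)) if x_i > 0, identity otherwise."""
    x = np.asarray(x, dtype=float)
    D = len(x)
    i = np.arange(D)
    pos = np.maximum(x, 0.0)                      # avoid sqrt of negatives
    exponent = 1.0 + beta * i / (D - 1) * np.sqrt(pos)
    return np.where(x > 0, pos ** exponent, x)

def t_osz(x):
    """T_osz: sign(x) * exp(xhat + 0.049*(sin(c1*xhat) + sin(c2*xhat))), xhat = log|x|."""
    x = np.asarray(x, dtype=float)
    xhat = np.where(x != 0.0, np.log(np.abs(np.where(x != 0.0, x, 1.0))), 0.0)
    c1 = np.where(x > 0.0, 10.0, 5.5)
    c2 = np.where(x > 0.0, 7.9, 3.1)
    return np.sign(x) * np.exp(xhat + 0.049 * (np.sin(c1 * xhat) + np.sin(c2 * xhat)))
```

Note that t_osz fixes x = ±1 (where \hat{x}=0) and t_asy leaves all non-positive coordinates untouched.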
General Setup
Search Space
All functions are defined and can be evaluated over \mathcal{R}^{D}, while the actual search domain is given as [-5,5]^{D}.
Location of the optimal solution \mathbf{x}^\mathrm{opt} and of f_\mathrm{opt}=f(\mathbf{x}^\mathrm{opt})
All functions have their global optimum in [-5,5]^{D}. The majority of functions have their global optimum in [-4,4]^{D}. The value of f_\mathrm{opt} is drawn from a Cauchy distributed random variable, with roughly 50% of the values between -100 and 100. The value is rounded to two decimal places and clipped to the interval [-1000, 1000]. In the function definitions a transformed variable vector \mathbf{z} is often used instead of the argument \mathbf{x}. The vector \mathbf{z} has its optimum in \mathbf{z^\mathrm{opt}}=\mathbf{0}, if not stated otherwise.
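A sketch of how such an f_\mathrm{opt} could be sampled (the exact generator and seeding of the benchmark may differ; the scale factor of 100 is an assumption chosen so that roughly 50% of values fall in [-100, 100], since a standard Cauchy variate lies in [-1, 1] with probability 1/2):

```python
import numpy as np

def draw_f_opt(rng):
    """Illustrative sampler: Cauchy variate (ratio of independent standard normals),
    scaled by 100, rounded to two decimals, clipped to [-1000, 1000]."""
    value = rng.standard_normal() / (abs(rng.standard_normal()) + 1e-99)
    value = 100.0 * value  # assumed scale: ~50% of values in [-100, 100]
    return float(np.clip(np.round(value, 2), -1000.0, 1000.0))
```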
Boundary Handling
On all functions a penalty boundary handling is applied as given with f^{{}}_\mathrm{pen}.
Linear Transformations
Linear transformations of the search space are applied to derive non-separable functions from separable ones and to control the conditioning of the function.
Non-Linear Transformations and Symmetry Breaking
In order to make relatively simple, but well-understood, functions less regular, non-linear transformations are applied in x- or f-space on some functions. Both transformations T_\mathrm{osz}:\mathcal{R}^n\to\mathcal{R}^n, n\in\{1,D\}, and T^{\beta}_\mathrm{asy}:\mathcal{R}^D\to\mathcal{R}^D are defined coordinate-wise. They are smooth and have, coordinate-wise, a strictly positive derivative. T_\mathrm{osz} is oscillating about the identity, where the oscillation is scale invariant w.r.t. the origin. T^{\beta}_\mathrm{asy} is the identity for negative values. When T^{\beta}_\mathrm{asy} is applied, a portion of 1/2^D of the search space remains untransformed.
Noise Models
In this benchmarking suite three different noise models are used. The first two, f_{\mathrm{GN}} and f_{\mathrm{UN}}, are multiplicative noise models, while the third, f_{\mathrm{CN}}, is an additive noise model. All noise models are applied to a function value f under the assumption that f\ge0. All noise models preserve stochastic dominance between any two solutions and are therefore utility-free.
Gaussian Noise
The Gaussian noise model is scale invariant and defined as f_{\mathrm{GN}}\left(f,\beta\right) = f\times\exp(\beta\,\mathcal{N}(0,1))\enspace. The noise strength is controlled with \beta. The distribution of the noise factor is log-normal, thus no negative noise values can be sampled. For the benchmark functions with moderate noise \beta = 0.01, otherwise \beta = 1. For small values of \beta this noise model resembles f\times(1+\beta\,\mathcal{N}(0,1)).
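The Gaussian model is a one-liner; as a sketch (the helper name f_gn and the rng argument are illustrative, not the official COCO interface):

```python
import numpy as np

def f_gn(f, beta, rng):
    """Multiplicative Gaussian noise: f * exp(beta * N(0,1)).
    The log-normal factor is strictly positive, so the sign of f is preserved."""
    return f * np.exp(beta * rng.standard_normal())
```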
Uniform Noise
The uniform noise model is introduced as a more severe noise model than the Gaussian and is defined as f_{\mathrm{UN}\left({f,\alpha,\beta}\right)} = f\times\mathcal{U}(0,1)^{\beta}\max\left(1, \left(\dfrac{10^9}{f + \epsilon}\right)^{\alpha\,\mathcal{U}(0,1)}\right)\enspace. The noise model uses two random factors. The first factor is in the interval [0,1], uniformly distributed for \beta=1. The second factor, \max(\dots), is \ge1. The parameters \alpha and \beta control the noise strength. For moderate noise \alpha = 0.01\left(0.49 + {1}/{D}\right) and \beta = 0.01, otherwise \alpha = 0.49 + {1}/{D} and \beta = 1. Furthermore, \epsilon is set to 10^{-99} in order to prevent division by zero.
The uniform noise model is not scale invariant. Due to the last factor the noise strength increases with decreasing (positive) value of f. Therefore the noise strength becomes more severe when approaching the optimum.
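The two random factors can be sketched directly from the definition (f_un is an illustrative name; \mathcal{U}(0,1) is drawn twice, independently):

```python
import numpy as np

def f_un(f, alpha, beta, rng, eps=1e-99):
    """Uniform noise: f * U1^beta * max(1, (1e9/(f+eps))^(alpha*U2)),
    with independent U1, U2 ~ U(0,1). The second factor grows as f -> 0+."""
    u1, u2 = rng.random(), rng.random()
    return f * (u1 ** beta) * max(1.0, (1e9 / (f + eps)) ** (alpha * u2))
```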
Cauchy Noise
The Cauchy noise model represents a different type of noise with two important aspects. First, only a comparatively small percentage of function values is disturbed by noise. Second, the noise distribution is heavy-tailed: large outliers occur once in a while, and because they stem from a continuous distribution they cannot be easily diagnosed as outliers. The Cauchy noise model is defined as f_{\mathrm{CN}\left({f,\alpha,p}\right)} = f + \alpha\,\max\left(0, 1000 + \mathbb{I}_{\left\{\mathcal{U}(0,1) < p\right\}}\dfrac{\mathcal{N}(0,1)}{|\mathcal{N}(0,1)|+\epsilon}\right)\enspace, where \alpha defines the noise strength and p determines the frequency of the noise disturbance. In the moderate noise case \alpha = 0.01 and p = 0.05, otherwise \alpha = 1 and p = 0.2. The summand of 1000 was chosen to sample positive and negative “outliers” (as the function value is cut from below, see next paragraph) and \epsilon is set to 10^{-199}.
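A sketch of the Cauchy model (f_cn is an illustrative name; the Cauchy variate is formed as a ratio of two independent standard normals, as in the definition):

```python
import numpy as np

def f_cn(f, alpha, p, rng, eps=1e-199):
    """Additive Cauchy noise: f + alpha * max(0, 1000 + I_{U<p} * N1/(|N2|+eps)).
    Note the constant summand 1000: even without a disturbance the offset
    alpha*1000 is added; the max(0, .) cuts large negative outliers."""
    indicator = 1.0 if rng.random() < p else 0.0
    cauchy = rng.standard_normal() / (abs(rng.standard_normal()) + eps)
    return f + alpha * max(0.0, 1000.0 + indicator * cauchy)
```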
Final Function Value
In order to allow convenient testing for reaching the target function value, in all noise models 1.01\times10^{-8} is added to the noisy function value and, if the input argument f is smaller than 10^{-8}, the undisturbed f is returned:
f_\mathrm{XX}(f,\dots) \gets \begin{cases}f_\mathrm{XX}(f,\dots) + 1.01\times10^{-8}& \text{if~} f\ge10^{-8}\\ f & \text{otherwise} \end{cases}
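This rule wraps any of the three noise models uniformly; a sketch (the wrapper name and calling convention are illustrative):

```python
def with_final_value_rule(noise_model, f, *args):
    """Apply a noise model, but return the undisturbed f below the 1e-8 threshold;
    otherwise add 1.01e-8 to the noisy value."""
    if f < 1e-8:
        return f
    return noise_model(f, *args) + 1.01e-8
```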
Functions with moderate noise
Sphere
f_\mathrm{sphere}(\mathbf{x}) = \|\mathbf{z}\|^2
- \mathbf{z}= \mathbf{x}- \mathbf{x^\mathrm{opt}}
Properties:
Presumably the easiest continuous-domain search problem, given that the volume of the sought solution set is small (i.e. where pure Monte Carlo random search is too expensive).
unimodal
highly symmetric, in particular rotationally invariant
f101: Sphere with moderate Gaussian noise
f_{101}(\mathbf{x}) = f_{\mathrm{GN}}({f_\mathrm{sphere}(\mathbf{x}),0.01}) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f102: Sphere with moderate uniform noise
f_{102}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.01\left(0.49 + \dfrac{1}{D}\right),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f103: Sphere with moderate seldom Cauchy noise
f_{103}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.01,0.05}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
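As an end-to-end illustration, f101 combines the sphere, the moderate Gaussian noise model (\beta=0.01), the final-value rule, the boundary penalty and f_\mathrm{opt}; a self-contained sketch (function names are illustrative, not the official COCO code):

```python
import numpy as np

def f_sphere(x, x_opt):
    """Sphere: ||z||^2 with z = x - x_opt."""
    z = np.asarray(x) - np.asarray(x_opt)
    return float(np.sum(z ** 2))

def f_pen(x):
    """Boundary penalty: 100 * sum_i max(0, |x_i| - 5)^2."""
    return 100.0 * float(np.sum(np.maximum(0.0, np.abs(x) - 5.0) ** 2))

def f101(x, x_opt, f_opt, rng):
    """f101: sphere with moderate Gaussian noise, final-value rule, penalty and f_opt."""
    f = f_sphere(x, x_opt)
    if f < 1e-8:
        noisy = f  # undisturbed below the target threshold
    else:
        noisy = f * np.exp(0.01 * rng.standard_normal()) + 1.01e-8
    return noisy + f_pen(x) + f_opt
```

At the optimum the noise is bypassed entirely, so f101 returns exactly f_\mathrm{opt} there.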
Rosenbrock
f_\mathrm{rosenbrock}(\mathbf{x}) = \sum_{i = 1}^{D-1} 100\,\left(z_i^2 - z_{i+1}\right)^2 + (z_i-1)^2
\mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)(\mathbf{x}- \mathbf{x^\mathrm{opt}}) + 1
\mathbf{z^\mathrm{opt}}=\mathbf{1}
Properties:
So-called banana function due to its 2-D contour lines as a bent ridge (or valley). In the beginning, the prominent first term of the function definition attracts to the point \mathbf{z}=\mathbf{0}. Then, a long bending valley needs to be followed to reach the global optimum. The ridge changes its orientation D-1 times.
- in larger dimensions the function has a local optimum with an attraction volume of about 25%
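The underlying noiseless Rosenbrock value, including the scaling of z, can be sketched as (f_rosenbrock is an illustrative name):

```python
import numpy as np

def f_rosenbrock(x, x_opt):
    """Rosenbrock on z = max(1, sqrt(D)/8) * (x - x_opt) + 1; f = 0 at x = x_opt (z = 1)."""
    x = np.asarray(x, dtype=float)
    D = len(x)
    z = max(1.0, np.sqrt(D) / 8.0) * (x - np.asarray(x_opt)) + 1.0
    return float(np.sum(100.0 * (z[:-1] ** 2 - z[1:]) ** 2 + (z[:-1] - 1.0) ** 2))
```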
f104: Rosenbrock with moderate Gaussian noise
f_{104}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f105: Rosenbrock with moderate uniform noise
f_{105}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01\left(0.49 + \dfrac{1}{D}\right),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f106: Rosenbrock with moderate seldom Cauchy noise
f_{106}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01,0.05}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Functions with severe noise
Sphere
f_\mathrm{sphere}(\mathbf{x}) = \|\mathbf{z}\|^2
- \mathbf{z}= \mathbf{x}- \mathbf{x^\mathrm{opt}}
Properties:
Presumably the easiest continuous-domain search problem, given that the volume of the sought solution set is small (i.e. where pure Monte Carlo random search is too expensive).
unimodal
highly symmetric, in particular rotationally invariant
f107: Sphere with Gaussian noise
f_{107}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{sphere}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f108: Sphere with uniform noise
f_{108}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f109: Sphere with seldom Cauchy noise
f_{109}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{sphere}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Rosenbrock
f_\mathrm{rosenbrock}(\mathbf{x}) = \sum_{i = 1}^{D-1} 100\,\left(z_i^2 - z_{i+1}\right)^2 + (z_i-1)^2
\mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)(\mathbf{x}- \mathbf{x^\mathrm{opt}}) + 1
\mathbf{z^\mathrm{opt}}=\mathbf{1}
Properties:
So-called banana function due to its 2-D contour lines as a bent ridge (or valley). In the beginning, the prominent first term of the function definition attracts to the point \mathbf{z}=\mathbf{0}. Then, a long bending valley needs to be followed to reach the global optimum. The ridge changes its orientation D-1 times.
- in larger dimensions the function has a local optimum with an attraction volume of about 25%
f110: Rosenbrock with Gaussian noise
f_{110}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f111: Rosenbrock with uniform noise
f_{111}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f112: Rosenbrock with seldom Cauchy noise
f_{112}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Step ellipsoid
f_\mathrm{step}(\mathbf{x}) = 0.1 \max\left(|\hat{z}_1|/10^4,\, \sum_{i = 1}^{D} 10^{2\frac{i-1}{D-1}} z_i^2\right)
\hat{\mathbf{z}} = \Lambda^{\!10}\mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}})
\tilde{z}_i = \begin{cases} \lfloor0.5+\hat{z}_i\rfloor & \text{if~} |\hat{z}_i| > 0.5 \\ {\lfloor0.5+10\,\hat{z}_i\rfloor}/{10} & \text{otherwise} \end{cases} for i=1,\dots,D,
denotes the rounding procedure in order to produce the plateaus.
\mathbf{z}= \mathbf{Q}\tilde{\mathbf{z}}
Properties:
The function consists of many plateaus of different sizes. Apart from a small area close to the global optimum, the gradient is zero almost everywhere.
- condition number is about 100
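The plateau-producing rounding of \hat{\mathbf{z}} can be sketched as follows (step_round is an illustrative name; the rotations \mathbf{R} and \mathbf{Q} are omitted):

```python
import numpy as np

def step_round(z_hat):
    """Plateau rounding: round z_hat_i to the nearest integer when |z_hat_i| > 0.5,
    otherwise to the nearest multiple of 0.1, both via floor(0.5 + .)."""
    z_hat = np.asarray(z_hat, dtype=float)
    coarse = np.floor(0.5 + z_hat)               # integer plateaus away from zero
    fine = np.floor(0.5 + 10.0 * z_hat) / 10.0   # finer plateaus near zero
    return np.where(np.abs(z_hat) > 0.5, coarse, fine)
```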
f113: Step ellipsoid with Gaussian noise
f_{113}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{step}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f114: Step ellipsoid with uniform noise
f_{114}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{step}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f115: Step ellipsoid with seldom Cauchy noise
f_{115}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{step}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Ellipsoid
f_\mathrm{ellipsoid}(\mathbf{x}) = \sum_{i = 1}^{D} 10^{4\frac{i-1}{D-1}}z_i^2
- \mathbf{z}= T_\mathrm{osz}(\mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}}))
Properties:
Globally quadratic ill-conditioned function with smooth local irregularities.
- condition number is 10^4
f116: Ellipsoid with Gaussian noise
f_{116}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f117: Ellipsoid with uniform noise
f_{117}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f118: Ellipsoid with seldom Cauchy noise
f_{118}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Different Powers
f_\mathrm{diffpowers}(\mathbf{x}) = \sqrt{\sum_{i = 1}^{D}|z_i|^{2+4\frac{i - 1}{D- 1}}}
- \mathbf{z}= \mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}})
Properties:
Due to the different exponents the sensitivities of the z_i-variables become more and more different when approaching the optimum.
f119: Different Powers with Gaussian noise
f_{119}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f120: Different Powers with uniform noise
f_{120}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f121: Different Powers with seldom Cauchy noise
f_{121}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Highly multi-modal functions with severe noise
Schaffer’s F7
f_\mathrm{schaffer}(\mathbf{x}) = \left(\frac{1}{D- 1}\sum_{i = 1}^{D- 1} \sqrt{s_i} + \sqrt{s_i} \sin^2\!\left(50\,s_i^{1/5}\right)\right)^2
\mathbf{z}= \Lambda^{\!10}\mathbf{Q}\,T^{{0.5}}_\mathrm{asy}(\mathbf{R}(\mathbf{x}-\mathbf{x^\mathrm{opt}}))
s_i = \sqrt{z_i^2 + z_{i+1}^2} for i=1,\dots,D-1
Properties:
A highly multimodal function where frequency and amplitude of the modulation vary.
- conditioning is low
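Given the transformed vector \mathbf{z}, the Schaffer F7 value can be sketched as follows (the transformation chain \Lambda^{\!10}\mathbf{Q}\,T^{0.5}_\mathrm{asy}(\mathbf{R}(\mathbf{x}-\mathbf{x}^\mathrm{opt})) is omitted; f_schaffer_core is an illustrative name):

```python
import numpy as np

def f_schaffer_core(z):
    """Schaffer F7 on z: ((1/(D-1)) * sum_i (sqrt(s_i) + sqrt(s_i)*sin^2(50*s_i^(1/5))))^2,
    with s_i = sqrt(z_i^2 + z_{i+1}^2) for i = 1..D-1."""
    z = np.asarray(z, dtype=float)
    s = np.sqrt(z[:-1] ** 2 + z[1:] ** 2)
    terms = np.sqrt(s) + np.sqrt(s) * np.sin(50.0 * s ** 0.2) ** 2
    return float((np.sum(terms) / (len(z) - 1)) ** 2)
```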
f122: Schaffer’s F7 with Gaussian noise
f_{122}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{schaffer}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f123: Schaffer’s F7 with uniform noise
f_{123}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{schaffer}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f124: Schaffer’s F7 with seldom Cauchy noise
f_{124}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{schaffer}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Composite Griewank-Rosenbrock
f_\mathrm{f8f2}(\mathbf{x}) = \frac{1}{D-1} \sum_{i=1}^{D-1} \left(\frac{s_i}{4000} - \cos(s_i)\right) + 1
\mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)\mathbf{R}\mathbf{x}+ 0.5
s_i = 100\,(z_i^2 - z_{i+1})^2 + (z_i-1)^2 for i=1,\dots,D-1
\mathbf{z^\mathrm{opt}}=\mathbf{1}
Properties:
Resembling the Rosenbrock function in a highly multimodal way.
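Given the transformed vector \mathbf{z}, the composition can be sketched directly from the definitions above: each s_i is a Rosenbrock (F2) term, fed into the Griewank-style (F8) shell (f_f8f2_core is an illustrative name; the linear transformation producing \mathbf{z} is omitted):

```python
import numpy as np

def f_f8f2_core(z):
    """Composite Griewank-Rosenbrock on z:
    s_i = 100*(z_i^2 - z_{i+1})^2 + (z_i - 1)^2 (Rosenbrock term),
    f = (1/(D-1)) * sum_i (s_i/4000 - cos(s_i)) + 1 (Griewank shell)."""
    z = np.asarray(z, dtype=float)
    s = 100.0 * (z[:-1] ** 2 - z[1:]) ** 2 + (z[:-1] - 1.0) ** 2
    return float(np.sum(s / 4000.0 - np.cos(s)) / (len(z) - 1) + 1.0)
```

At \mathbf{z}=\mathbf{1} every s_i vanishes, so the value is (1/(D-1))\sum(0-1)+1 = 0, consistent with \mathbf{z}^\mathrm{opt}=\mathbf{1}.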
f125: Composite Griewank-Rosenbrock with Gaussian noise
f_{125}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{f8f2}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f126: Composite Griewank-Rosenbrock with uniform noise
f_{126}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{f8f2}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f127: Composite Griewank-Rosenbrock with seldom Cauchy noise
f_{127}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{f8f2}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
Gallagher’s Gaussian Peaks, globally rotated
f_\mathrm{gallagher}(\mathbf{x}) = T_\mathrm{osz}\left(10 - \max_{i=1}^{101} w_i \exp\left(-\frac{1}{2D}\, (\mathbf{x}-\mathbf{y}_i)^{\mathrm{T}}\mathbf{R}^{\mathrm{T}} \mathbf{C}_i \mathbf{R}(\mathbf{x}-\mathbf{y}_i) \right)\right)^2
w_i = \begin{cases} 1.1 + 8 \times\dfrac{i-2}{99} & \text{for~} i=2,\dots,101 \\ 10 & \text{for~} i = 1 \end{cases}, three optima have a value larger than 9
\mathbf{C}_i=\Lambda^{\!\alpha_i}/\alpha_i^{1/4} where \Lambda^{\!\alpha_i} is defined as usual, but with randomly permuted diagonal elements. For i=2,\dots,101, \alpha_i is drawn uniformly randomly from the set \left\{1000^{2\frac{j}{99}} ~|~ j = 0,\dots,99\right\} without replacement, and \alpha_i=1000 for i=1.
the local optima \mathbf{y}_i are uniformly drawn from the domain [-4.9,4.9]^D for i=2,\dots,101 and \mathbf{y}_{1}\in[-4,4]^D. The global optimum is at \mathbf{x^\mathrm{opt}}=\mathbf{y}_1.
Properties:
The function consists of 101 optima with position and height being unrelated and randomly chosen.
condition number around the global optimum is about 30
same overall rotation matrix
f128: Gallagher’s Gaussian Peaks 101-me with Gaussian noise
f_{128}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{gallagher}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f129: Gallagher’s Gaussian Peaks 101-me with uniform noise
f_{129}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{gallagher}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
f130: Gallagher’s Gaussian Peaks 101-me with seldom Cauchy noise
f_{130}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{gallagher}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}