Function definitions

Symbols and definitions

The symbols and auxiliary function definitions used in the following are given below. Vectors are typeset in bold and denote column vectors.

\otimes : indicates element-wise multiplication of two D-dimensional vectors, \otimes:\mathcal{R}^D\times\mathcal{R}^D\to\mathcal{R}^D, (\mathbf{x},\mathbf{y})\mapsto\mathrm{diag}(\mathbf{x})\times\mathbf{y}=(x_i\times y_i)_{i=1,\dots,D}

\|.\| : denotes the Euclidean norm, \|\mathbf{x}\|^2=\sum_i x_i^2.

[.] : denotes the nearest integer value

\mathbf{0} : =(0,\dots,0)^{\mathrm{T}} all zero vector

\mathbf{1} : =(1,\dots,1)^{\mathrm{T}} all one vector

\Lambda^{\!\alpha} : is a diagonal matrix in D dimensions with the ith diagonal element as \lambda_{ii} = \alpha^{\frac{1}{2}\frac{i-1}{D-1}}

f_\mathrm{pen} : \mathcal{R}^D\to\mathcal{R}, \mathbf{x}\mapsto 100\sum_{i=1}^D\max(0,|x_i| - 5)^2

\mathbf{1}_-^+ : a D-dimensional vector with entries drawn as -1 or 1, each with equal probability.

\mathbf{Q}, \mathbf{R} : orthogonal (rotation) matrices. For each function instance and each dimension, a single realization of \mathbf{Q} and of \mathbf{R} is used. Rotation matrices are generated from standard normally distributed entries by Gram-Schmidt orthogonalization. The columns and rows of a rotation matrix form an orthonormal basis.

\mathbf{R} : see \mathbf{Q}

T^{\beta}_\mathrm{asy} : \mathcal{R}^D\to\mathcal{R}^D, x_i\mapsto \begin{cases} x_i^{1+\beta \frac{i-1}{D-1}\sqrt{x_i}} & \text{~if~} x_i>0\\ x_i & \text{~otherwise~} \end{cases}, for i=1,\dots,D.

T_\mathrm{osz} : \mathcal{R}^n\to\mathcal{R}^n, for any positive integer n, maps element-wise x\mapsto\mathrm{sign}(x)\exp\left(\hat{x} + 0.049\left(\sin(c_1 \hat{x}) + \sin(c_2 \hat{x})\right)\right) with \hat{x}= \begin{cases} \log(|x|) & \text{if~} x\not=0 \\ \in\mathcal{R}& \text{otherwise} \end{cases}, \mathrm{sign}(x) = \begin{cases} -1 & \text{if~} x < 0 \\ 0 & \text{if~} x = 0 \\ 1 & \text{otherwise} \end{cases}, c_1 = \begin{cases} 10 & \text{if~} x > 0\\ 5.5 & \text{otherwise} \end{cases} and c_2 = \begin{cases} 7.9 & \text{if~} x > 0\\ 3.1 & \text{otherwise} \end{cases}

\mathbf{x}^\mathrm{opt} : optimal solution vector, such that f(\mathbf{x^\mathrm{opt}}) is minimal.
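As an illustration, some of the auxiliary definitions above can be sketched in Python. This is a hypothetical helper module, not part of any official benchmark implementation; the QR-based rotation is a standard equivalent of explicit Gram-Schmidt orthogonalization.

```python
import numpy as np

def f_pen(x):
    """Boundary penalty: 100 * sum_i max(0, |x_i| - 5)^2."""
    x = np.asarray(x, dtype=float)
    return 100.0 * np.sum(np.maximum(0.0, np.abs(x) - 5.0) ** 2)

def random_rotation(D, rng=None):
    """Orthogonal matrix whose columns orthonormalize standard normally
    distributed vectors (np.linalg.qr performs the Gram-Schmidt step)."""
    rng = np.random.default_rng(rng)
    Q, R = np.linalg.qr(rng.standard_normal((D, D)))
    return Q * np.sign(np.diag(R))  # sign fix for a well-defined distribution

def t_asy(x, beta):
    """T_asy^beta: x_i -> x_i^(1 + beta*(i-1)/(D-1)*sqrt(x_i)) for x_i > 0,
    identity for non-positive entries."""
    x = np.asarray(x, dtype=float)
    i = np.arange(x.size)
    expo = 1.0 + beta * i / max(x.size - 1, 1) * np.sqrt(np.maximum(x, 0.0))
    return np.where(x > 0, np.maximum(x, 0.0) ** expo, x)

def t_osz(x):
    """Element-wise oscillation about the identity (scale invariant w.r.t. 0)."""
    x = np.asarray(x, dtype=float)
    # log(|x|) where x != 0; the value at x == 0 is irrelevant since sign(0) = 0
    xhat = np.where(x != 0.0, np.log(np.abs(np.where(x != 0.0, x, 1.0))), 0.0)
    c1 = np.where(x > 0.0, 10.0, 5.5)
    c2 = np.where(x > 0.0, 7.9, 3.1)
    return np.sign(x) * np.exp(xhat + 0.049 * (np.sin(c1 * xhat) + np.sin(c2 * xhat)))
```

Both transformations fix the origin and leave 1 unchanged, which matches their role as "identity-like" irregularity generators.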

General Setup

Search Space

All functions are defined and can be evaluated over \mathcal{R}^{D}, while the actual search domain is given as [-5,5]^{D}.

Location of the optimal \mathbf{x}^\mathrm{opt} and of f_\mathrm{opt}=f(\mathbf{x^\mathrm{opt}})

All functions have their global optimum in [-5,5]^{D}. The majority of functions have their global optimum in [-4,4]^{D}. The value for f_\mathrm{opt} is drawn from a Cauchy-distributed random variable, with roughly 50% of the values between -100 and 100. The value is rounded to two decimal places and clipped to a maximum of 1000 and a minimum of -1000. In the function definitions, a transformed variable vector \mathbf{z} is often used instead of the argument \mathbf{x}. The vector \mathbf{z} attains its optimum at \mathbf{z^\mathrm{opt}}=\mathbf{0}, if not stated otherwise.

Boundary Handling

On all functions, a penalty for boundary handling is applied, as given by f_\mathrm{pen}.

Linear Transformations

Linear transformations of the search space are applied to derive non-separable functions from separable ones and to control the conditioning of the function.

Non-Linear Transformations and Symmetry Breaking

In order to make relatively simple but well-understood functions less regular, non-linear transformations are applied on some functions in x- or f-space. Both transformations, T_\mathrm{osz}:\mathcal{R}^n\to\mathcal{R}^n, n\in\{1,D\}, and T^{\beta}_\mathrm{asy}:\mathcal{R}^D\to\mathcal{R}^D, are defined coordinate-wise. They are smooth and have, coordinate-wise, a strictly positive derivative. T_\mathrm{osz} oscillates about the identity, where the oscillation is scale invariant w.r.t. the origin. T^{\beta}_\mathrm{asy} is the identity for negative values. When T^{\beta}_\mathrm{asy} is applied, a portion of 1/2^D of the search space remains untransformed.

Noise Models

In this benchmarking suite, three different noise models are used. The first two, f_{\mathrm{GN}}(\cdot) and f_{\mathrm{UN}}(\cdot), are multiplicative noise models, while the third, f_{\mathrm{CN}}(\cdot), is an additive noise model. All noise models are applied to a function value f under the assumption that f\ge0. All noise models reveal stochastic dominance between any two solutions and are therefore utility-free.

Gaussian Noise

The Gaussian noise model is scale invariant and defined as f_{\mathrm{GN}}\left(f,\beta\right) = f\times\exp(\beta\,\mathcal{N}(0,1))\enspace. The noise strength is controlled by \beta. The distribution of the noise factor is log-normal, so no negative values can be sampled. For the benchmark functions with moderate noise, \beta = 0.01; otherwise \beta = 1. For small values of \beta, this noise model resembles f\times(1+\beta\,\mathcal{N}(0,1)).
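A minimal sketch of this model (the helper name `f_gn` and the `rng` seeding argument are illustrative, not part of any official implementation):

```python
import numpy as np

def f_gn(f, beta, rng=None):
    """Gaussian noise model: multiply f by the log-normal factor exp(beta*N(0,1))."""
    rng = np.random.default_rng(rng)
    return f * np.exp(beta * rng.standard_normal())
```

For \beta = 0 the factor is exactly 1, and the factor is strictly positive for any \beta, reflecting the log-normal noise distribution.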

Uniform Noise

The uniform noise model is introduced as a more severe noise model than the Gaussian one and is defined as f_{\mathrm{UN}}\left(f,\alpha,\beta\right) = f\times\mathcal{U}(0,1)^{\beta}\max\left(1, \left(\dfrac{10^9}{f + \epsilon}\right)^{\alpha\,\mathcal{U}(0,1)}\right)\enspace. The noise model uses two random factors. The first factor lies in the interval [0,1] and is uniformly distributed for \beta=1. The second factor, \max(\dots), is \ge1. The parameters \alpha and \beta control the noise strength. For moderate noise, \alpha = 0.01\left(0.49 + {1}/{D}\right) and \beta = 0.01; otherwise \alpha = 0.49 + {1}/{D} and \beta = 1. Furthermore, \epsilon is set to 10^{-99} in order to prevent division by zero.

The uniform noise model is not scale invariant. Due to the last factor the noise strength increases with decreasing (positive) value of f. Therefore the noise strength becomes more severe when approaching the optimum.
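The model above can be sketched as follows (hypothetical helper name `f_un`; the two uniform variates are drawn independently, which is an assumption made explicit here):

```python
import numpy as np

def f_un(f, alpha, beta, rng=None):
    """Uniform noise model: f * U1^beta * max(1, (1e9/(f+eps))^(alpha*U2))."""
    rng = np.random.default_rng(rng)
    eps = 1e-99  # prevents division by zero
    u1, u2 = rng.uniform(size=2)
    return f * u1 ** beta * max(1.0, (1e9 / (f + eps)) ** (alpha * u2))
```

For f larger than 10^9 the second factor collapses to 1 and only the damping factor U(0,1)^\beta remains, while for small positive f the second factor blows up, which is exactly the growing noise severity near the optimum described above.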

Cauchy Noise

The Cauchy noise model represents a different type of noise with two important aspects. First, only a comparatively small percentage of function values is disturbed by noise. Second, the noise distribution is comparatively “weird”. Large outliers occur once in a while, and because they stem from a continuous distribution they cannot be easily diagnosed. The Cauchy noise model is defined as f_{\mathrm{CN}}\left(f,\alpha,p\right) = f + \alpha\,\max\left(0, 1000 + \mathbb{I}_{\left\{\mathcal{U}(0,1) < p\right\}}\dfrac{\mathcal{N}(0,1)}{|\mathcal{N}(0,1)|+\epsilon}\right)\enspace, where \alpha defines the noise strength and p determines the frequency of the noise disturbance. In the moderate noise case \alpha = 0.01 and p = 0.05, otherwise \alpha = 1 and p = 0.2. The summand of 1000 was chosen to sample positive and negative “outliers” (as the function value is cut from below, see next paragraph) and \epsilon is set to 10^{-199}.
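A sketch of this model (hypothetical helper name `f_cn`; the ratio of two independent standard normals realizes the Cauchy-distributed disturbance):

```python
import numpy as np

def f_cn(f, alpha, p, rng=None):
    """Cauchy noise model: with probability p, add a Cauchy variate (the ratio
    of two independent standard normals) to the summand 1000, cut at zero."""
    rng = np.random.default_rng(rng)
    eps = 1e-199  # prevents division by zero
    disturbed = 1.0 if rng.uniform() < p else 0.0
    n1, n2 = rng.standard_normal(2)
    return f + alpha * max(0.0, 1000.0 + disturbed * n1 / (abs(n2) + eps))
```

Note that even an undisturbed evaluation is shifted by \alpha\times1000, as written in the definition; only the heavy-tailed variate appears with probability p.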

Final Function Value

To allow convenient testing against the target function value, 1.01\times10^{-8} is added to the function value in all noise models and, if the input argument f is smaller than 10^{-8}, the undisturbed f is returned.

f_\mathrm{XX}(f,\dots) \gets \begin{cases}f_\mathrm{XX}(f,\dots) + 1.01\times10^{-8}& \text{if~} f\ge10^{-8}\\ f & \text{otherwise} \end{cases}
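This rule can be sketched as a wrapper around any of the three noise models (hypothetical helper name `final_value`):

```python
def final_value(noisy_f, f):
    """Final-value rule: return the noisy value shifted by 1.01e-8,
    or the undisturbed f if f is already below the 1e-8 threshold."""
    return noisy_f + 1.01e-8 if f >= 1e-8 else f
```

The shift guarantees that a noisy value can never fall below the target precision by chance, while solutions that already reach 10^{-8} are reported noise-free.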

Functions with moderate noise

Sphere

f_\mathrm{sphere}(\mathbf{x}) = \|\mathbf{z}\|^2

  • \mathbf{z}= \mathbf{x}- \mathbf{x^\mathrm{opt}}

Properties:

Presumably the easiest continuous-domain search problem, given that the volume of the sought solution set is small (i.e., whenever pure Monte Carlo random search would be too expensive).

  • unimodal

  • highly symmetric, in particular rotationally invariant

f101: Sphere with moderate Gaussian noise

f_{101}(\mathbf{x}) = f_{\mathrm{GN}}({f_\mathrm{sphere}(\mathbf{x}),0.01}) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}
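Putting the pieces together, f101 can be sketched end to end. The helper name, and passing `x_opt` and `f_opt` as arguments, are illustrative; in the benchmark these are instance-specific random values. All auxiliary code is inlined to keep the sketch self-contained.

```python
import numpy as np

def f101(x, x_opt, f_opt, rng=None):
    """Sketch of f101: sphere with moderate Gaussian noise (beta = 0.01)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    f = np.sum((x - np.asarray(x_opt, dtype=float)) ** 2)    # f_sphere(x)
    pen = 100.0 * np.sum(np.maximum(0.0, np.abs(x) - 5.0) ** 2)  # f_pen(x)
    if f < 1e-8:                      # final-value rule: undisturbed f
        return f + pen + f_opt
    noisy = f * np.exp(0.01 * rng.standard_normal())         # f_GN(f, 0.01)
    return noisy + 1.01e-8 + pen + f_opt
```

At \mathbf{x}=\mathbf{x^\mathrm{opt}} the sphere value is zero, the final-value rule fires, and the function returns exactly f_\mathrm{opt}.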

f102: Sphere with moderate uniform noise

f_{102}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.01\left(0.49 + \dfrac{1}{D}\right),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f103: Sphere with moderate seldom Cauchy noise

f_{103}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.01,0.05}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Rosenbrock

f_\mathrm{rosenbrock}(\mathbf{x}) = \sum_{i = 1}^{D-1} 100\,\left(z_i^2 - z_{i+1}\right)^2 + (z_i-1)^2

  • \mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)(\mathbf{x}- \mathbf{x^\mathrm{opt}}) + 1

  • \mathbf{z^\mathrm{opt}}=\mathbf{1}

Properties:

So-called banana function, due to its 2-D contour lines resembling a bent ridge (or valley). In the beginning, the prominent first term of the function definition attracts to the point \mathbf{z}=\mathbf{0}. Then, a long bending valley needs to be followed to reach the global optimum. The ridge changes its orientation D-1 times.

  • in larger dimensions the function has a local optimum with an attraction volume of about 25%
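The base function with its scaled shift can be sketched as (hypothetical helper, with `x_opt` as an argument standing in for the instance-specific optimum):

```python
import numpy as np

def f_rosenbrock(x, x_opt):
    """Rosenbrock with z = max(1, sqrt(D)/8) * (x - x_opt) + 1; minimum 0 at x = x_opt."""
    x = np.asarray(x, dtype=float)
    z = max(1.0, np.sqrt(x.size) / 8.0) * (x - np.asarray(x_opt, dtype=float)) + 1.0
    # sum over i = 1..D-1 of 100*(z_i^2 - z_{i+1})^2 + (z_i - 1)^2
    return np.sum(100.0 * (z[:-1] ** 2 - z[1:]) ** 2 + (z[:-1] - 1.0) ** 2)
```

At \mathbf{x}=\mathbf{x^\mathrm{opt}} the transformed vector is \mathbf{z}=\mathbf{1} and both terms of every summand vanish.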

f104: Rosenbrock with moderate Gaussian noise

f_{104}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f105: Rosenbrock with moderate uniform noise

f_{105}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01\left(0.49 + \dfrac{1}{D}\right),0.01}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f106: Rosenbrock with moderate seldom Cauchy noise

f_{106}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.01,0.05}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Functions with severe noise

Sphere

f_\mathrm{sphere}(\mathbf{x}) = \|\mathbf{z}\|^2

  • \mathbf{z}= \mathbf{x}- \mathbf{x^\mathrm{opt}}

Properties:

Presumably the easiest continuous-domain search problem, given that the volume of the sought solution set is small (i.e., whenever pure Monte Carlo random search would be too expensive).

  • unimodal

  • highly symmetric, in particular rotationally invariant

f107: Sphere with Gaussian noise

f_{107}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{sphere}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f108: Sphere with uniform noise

f_{108}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{sphere}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f109: Sphere with seldom Cauchy noise

f_{109}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{sphere}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Rosenbrock

f_\mathrm{rosenbrock}(\mathbf{x}) = \sum_{i = 1}^{D-1} 100\,\left(z_i^2 - z_{i+1}\right)^2 + (z_i-1)^2

  • \mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)(\mathbf{x}- \mathbf{x^\mathrm{opt}}) + 1

  • \mathbf{z^\mathrm{opt}}=\mathbf{1}

Properties:

So-called banana function, due to its 2-D contour lines resembling a bent ridge (or valley). In the beginning, the prominent first term of the function definition attracts to the point \mathbf{z}=\mathbf{0}. Then, a long bending valley needs to be followed to reach the global optimum. The ridge changes its orientation D-1 times.

  • a local optimum with an attraction volume of about 25%

f110: Rosenbrock with Gaussian noise

f_{110}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f111: Rosenbrock with uniform noise

f_{111}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f112: Rosenbrock with seldom Cauchy noise

f_{112}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{rosenbrock}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Step ellipsoid

f_\mathrm{step}(\mathbf{x}) = 0.1 \max\left(|\hat{z}_1|/10^4,\, \sum_{i = 1}^{D} 10^{2\frac{i-1}{D-1}} z_i^2\right)

  • \hat{\mathbf{z}} = \Lambda^{\!10}\mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}})

  • \tilde{z}_i = \begin{cases} \lfloor0.5+\hat{z}_i\rfloor & \text{if~} \hat{z}_i > 0.5 \\ {\lfloor0.5+10\,\hat{z}_i\rfloor}/{10} & \text{otherwise} \end{cases} for i=1,\dots,D,
    denotes the rounding procedure in order to produce the plateaus.

  • \mathbf{z}= \mathbf{Q}\tilde{\mathbf{z}}

Properties:

The function consists of many plateaus of different sizes. Apart from a small area close to the global optimum, the gradient is zero almost everywhere.

  • condition number is about 100
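The plateau-producing rounding of \hat{\mathbf{z}} can be sketched as (hypothetical helper name):

```python
import numpy as np

def round_to_plateaus(z_hat):
    """z~_i: integer rounding for z^_i > 0.5, 0.1-resolution rounding otherwise."""
    z_hat = np.asarray(z_hat, dtype=float)
    return np.where(z_hat > 0.5,
                    np.floor(0.5 + z_hat),
                    np.floor(0.5 + 10.0 * z_hat) / 10.0)
```

The coarse branch rounds to the nearest integer, the fine branch to the nearest multiple of 0.1, which creates the larger and smaller plateaus respectively.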

f113: Step ellipsoid with Gaussian noise

f_{113}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{step}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f114: Step ellipsoid with uniform noise

f_{114}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{step}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f115: Step ellipsoid with seldom Cauchy noise

f_{115}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{step}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Ellipsoid

f_\mathrm{ellipsoid}(\mathbf{x}) = \sum_{i = 1}^{D} 10^{4\frac{i-1}{D-1}}z_i^2

  • \mathbf{z}= T_\mathrm{osz}(\mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}}))

Properties:

A globally quadratic, ill-conditioned function with smooth local irregularities.

  • condition number is 10^4

f116: Ellipsoid with Gaussian noise

f_{116}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f117: Ellipsoid with uniform noise

f_{117}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f118: Ellipsoid with seldom Cauchy noise

f_{118}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{ellipsoid}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Different Powers

f_\mathrm{diffpowers}(\mathbf{x}) = \sqrt{\sum_{i = 1}^{D}|z_i|^{2+4\frac{i - 1}{D- 1}}}

  • \mathbf{z}= \mathbf{R}(\mathbf{x}- \mathbf{x^\mathrm{opt}})

Properties:

Due to the different exponents, the sensitivities of the z_i-variables become increasingly different when approaching the optimum.
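The base function can be sketched as (hypothetical helper; `x_opt` and `R` are passed in for the instance-specific shift and rotation):

```python
import numpy as np

def f_diffpowers(x, x_opt, R):
    """Different Powers: sqrt(sum_i |z_i|^(2 + 4*(i-1)/(D-1))) with z = R(x - x_opt)."""
    z = np.asarray(R) @ (np.asarray(x, dtype=float) - np.asarray(x_opt, dtype=float))
    expo = 2.0 + 4.0 * np.arange(z.size) / max(z.size - 1, 1)
    return np.sqrt(np.sum(np.abs(z) ** expo))
```

The exponent grows linearly from 2 to 6 across coordinates, so near the optimum the last coordinates contribute far less than the first.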

f119: Different Powers with Gaussian noise

f_{119}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f120: Different Powers with uniform noise

f_{120}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f121: Different Powers with seldom Cauchy noise

f_{121}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{diffpowers}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Highly multi-modal functions with severe noise

Schaffer’s F7

f_\mathrm{schaffer}(\mathbf{x}) = \left(\frac{1}{D- 1}\sum_{i = 1}^{D- 1} \sqrt{s_i} + \sqrt{s_i} \sin^2\!\left(50\,s_i^{1/5}\right)\right)^2

  • \mathbf{z}= \Lambda^{\!10}\mathbf{Q}\,T^{{0.5}}_\mathrm{asy}(\mathbf{R}(\mathbf{x}-\mathbf{x^\mathrm{opt}}))

  • s_i = \sqrt{z_i^2 + z_{i+1}^2} for i=1,\dots,D-1

Properties:

A highly multimodal function where frequency and amplitude of the modulation vary.

  • conditioning is low
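The base value, computed from an already-transformed vector \mathbf{z}, can be sketched as (hypothetical helper; the \Lambda, \mathbf{Q}, T_\mathrm{asy}, \mathbf{R} pipeline is assumed to have been applied beforehand):

```python
import numpy as np

def f_schaffer(z):
    """Schaffer F7 value from an already-transformed vector z (minimum 0 at z = 0)."""
    z = np.asarray(z, dtype=float)
    s = np.sqrt(z[:-1] ** 2 + z[1:] ** 2)          # s_i for i = 1..D-1
    # sqrt(s) + sqrt(s)*sin^2(50*s^(1/5)) = sqrt(s)*(1 + sin^2(50*s^(1/5)))
    return np.mean(np.sqrt(s) * (1.0 + np.sin(50.0 * s ** 0.2) ** 2)) ** 2
```

The term 50\,s_i^{1/5} drives the varying modulation frequency, while \sqrt{s_i} scales the amplitude.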

f122: Schaffer’s F7 with Gaussian noise

f_{122}(\mathbf{x}) = f_{\mathrm{GN}}\left(f_\mathrm{schaffer}(\mathbf{x}),1\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f123: Schaffer’s F7 with uniform noise

f_{123}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{schaffer}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f124: Schaffer’s F7 with seldom Cauchy noise

f_{124}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{schaffer}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Composite Griewank-Rosenbrock

f_\mathrm{f8f2}(\mathbf{x}) = \frac{1}{D-1} \sum_{i=1}^{D-1} \left(\frac{s_i}{4000} - \cos(s_i)\right) + 1

  • \mathbf{z}= \max\!\left(1,\frac{\sqrt{D}}{8}\right)\mathbf{R}\mathbf{x}+ 0.5

  • s_i = 100\,(z_i^2 - z_{i+1})^2 + (z_i-1)^2 for i=1,\dots,D-1

  • \mathbf{z^\mathrm{opt}}=\mathbf{1}

Properties:

Resembling the Rosenbrock function in a highly multimodal way.

f125: Composite Griewank-Rosenbrock with Gaussian noise

f_{125}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{f8f2}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f126: Composite Griewank-Rosenbrock with uniform noise

f_{126}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{f8f2}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f127: Composite Griewank-Rosenbrock with seldom Cauchy noise

f_{127}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{f8f2}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

Gallagher’s Gaussian Peaks, globally rotated

f_\mathrm{gallagher}(\mathbf{x}) = T_\mathrm{osz}\left(10 - \max_{i=1}^{101} w_i \exp\left(-\frac{1}{2D}\, (\mathbf{x}-\mathbf{y}_i)^{\mathrm{T}}\mathbf{R}^{\mathrm{T}} \mathbf{C}_i \mathbf{R}(\mathbf{x}-\mathbf{y}_i) \right)\right)^2

  • w_i = \begin{cases} 1.1 + 8 \times\dfrac{i-2}{99} & \text{for~} i=2,\dots,101 \\ 10 & \text{for~} i = 1 \end{cases}, three optima have a value larger than 9

  • \mathbf{C}_i=\Lambda^{\!\alpha_i}/\alpha_i^{1/4} where \Lambda^{\!\alpha_i} is defined as usual, but with randomly permuted diagonal elements. For i=2,\dots,101, \alpha_i is drawn uniformly randomly from the set \left\{1000^{2\frac{j}{99}} ~|~ j = 0,\dots,99\right\} without replacement, and \alpha_i=1000 for i=1.

  • The local optima \mathbf{y}_i are drawn uniformly from the domain [-4.9,4.9]^D for i=2,\dots,101, and \mathbf{y}_{1}\in[-4,4]^D. The global optimum is at \mathbf{x^\mathrm{opt}}=\mathbf{y}_1.

Properties:

The function consists of 101 optima with position and height being unrelated and randomly chosen.

  • condition number around the global optimum is about 30

  • the same rotation matrix \mathbf{R} is used for all peaks (hence “globally rotated”)
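The evaluation of the peak structure can be sketched as follows. This is a hypothetical helper: the instance setup (drawing the \mathbf{y}_i, the \alpha_i, and the diagonal permutations) is omitted, and the peak data are passed in as arguments; the scalar T_\mathrm{osz} is inlined for self-containment.

```python
import numpy as np

def t_osz_scalar(t):
    """Scalar T_osz, as defined in the symbols section."""
    if t == 0.0:
        return 0.0
    that = np.log(abs(t))
    c1, c2 = (10.0, 7.9) if t > 0 else (5.5, 3.1)
    return np.sign(t) * np.exp(that + 0.049 * (np.sin(c1 * that) + np.sin(c2 * that)))

def f_gallagher(x, y, w, C, R):
    """T_osz(10 - max_i w_i exp(-(x-y_i)^T R^T C_i R (x-y_i) / (2D)))^2.
    y: peak positions, w: peak heights, C: diagonal scalings (1-D arrays), R: rotation."""
    x = np.asarray(x, dtype=float)
    D = x.size
    best = 0.0
    for yi, wi, Ci in zip(y, w, C):
        d = np.asarray(R) @ (x - np.asarray(yi, dtype=float))
        best = max(best, wi * np.exp(-d @ (np.asarray(Ci) * d) / (2.0 * D)))
    return t_osz_scalar(10.0 - best) ** 2
```

At the global optimum the highest peak (w_1 = 10) is attained exactly, so the argument of T_\mathrm{osz} is zero and the function value vanishes.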

f128: Gallagher’s Gaussian Peaks 101-me with Gaussian noise

f_{128}(\mathbf{x}) = f_{\mathrm{GN}}\left({f_\mathrm{gallagher}(\mathbf{x}),1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f129: Gallagher’s Gaussian Peaks 101-me with uniform noise

f_{129}(\mathbf{x}) = f_{\mathrm{UN}}\left({f_\mathrm{gallagher}(\mathbf{x}),0.49 + \dfrac{1}{D},1}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}

f130: Gallagher’s Gaussian Peaks 101-me with seldom Cauchy noise

f_{130}(\mathbf{x}) = f_{\mathrm{CN}}\left({f_\mathrm{gallagher}(\mathbf{x}),1,0.2}\right) + f_{\mathrm{pen}}(\mathbf{x}) + f_\mathrm{opt}