List of instances of :py:class:`DataSet`.

This class implements some useful slicing functions.

It also merges the data of DataSet instances that are identical
(according to DataSet.__eq__).
Method __init__: Instantiate self from a list of folder or file names or DataSet instances.
Method processIndexFile: Reads the information on the different runs from an index (.info) file.
Method append: Redefines the append method to check for uniqueness.
Method extend: Extend with elements.
Method pickle: Loop over self to pickle each element.
Method dictByAlg: Returns a dictionary of instances of this class by algorithm.
Method dictByAlgName: Returns a dictionary of instances of this class by algorithm, keyed by data folder name.
Method dictByDim: Returns a dictionary of instances of this class by dimension.
Method dictByFunc: Returns a dictionary of instances of this class by function.
Method dictByDimFunc: Returns a dictionary of instances of this class by dimension and, within each dimension, by function.
Method dictByNoise: Returns a dictionary splitting noisy and non-noisy entries.
Method isBiobjective: Undocumented
Method dictByFuncGroupBiobjective: Returns a dictionary of instances of this class by function group for the bi-objective case.
Method dictByFuncGroupSingleObjective: Returns a dictionary of instances of this class by function group for the single-objective case.
Method dictByFuncGroup: Returns a dictionary of instances of this class by function group.
Method getFuncGroups: Returns a dictionary of function groups.
Method dictByParam: Returns a dictionary of DataSetList by parameter values.
Method info: Display some information on screen.
Method sort: Undocumented
Method run_length_distributions: Returns a dictionary of run length distributions per algorithm and the left envelope rld-array.
Method get_all_data_lines: Returns a list of all data lines in self for each algorithm and a list of the respective computed aRTs.
Method det_best_data: Returns a list of the number smallest evaluations over all data sets in self for each target in target_values.
Method det_best_data_lines: Returns the best data lines over all data sets in self for each target in target_values and an array of the computed scores (ERT if scoring_function == 'ERT').
Method get_sorted_algorithms: Returns a list of the algorithms from self, sorted by minimum loss factor in the ECDF.
Method get_reference_values_hash: Undocumented
def __init__(self, args=[], check_data_type=True):

Instantiate self from a list of folder or file names or DataSet instances.

Exceptions:
    Warning -- unexpected user input
    pickle.UnpicklingError

Parameters:
    args (list): strings being either info file names, folders containing
        info files or pickled data files, or a list of DataSet instances.
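
A minimal usage sketch; the folder path below is hypothetical, and cocopp.load accepts the same kinds of arguments:

import cocopp
dsl = cocopp.load('path/to/exdata-folder')  # hypothetical folder of info files
# or instantiate directly from existing DataSet instances:
dsl2 = cocopp.pproc.DataSetList(list(dsl))
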
def processIndexFile(self, indexFile):
Reads the information on the different runs from an index (.info) file.
def append(self, o, check_data_type=False):
Redefines the append method to check for uniqueness.
def extend(self, o):

Extend with elements.

This method is implemented to prevent inconsistencies, since append was redefined. It could be a source of efficiency issues.

def pickle(self, *args, **kwargs):
Loop over self to pickle each element.
def dictByAlg(self):

Returns a dictionary of instances of this class by algorithm.

The resulting dict uses algId and comment as keys and the corresponding slices as values.
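
For illustration, a minimal sketch of iterating over the slices (dsl is assumed to be a DataSetList as above):

for key, sub_dsl in dsl.dictByAlg().items():
    # key identifies the algorithm (algId and comment),
    # sub_dsl is the corresponding DataSetList slice
    print(key, len(sub_dsl))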

def dictByAlgName(self):

Returns a dictionary of instances of this class by algorithm.

Compared to dictByAlg, this method uses only the data folder as key and the corresponding slices as values.

def dictByDim(self):

Returns a dictionary of instances of this class by dimensions.

Returns a dictionary with dimension as keys and the corresponding slices as values.
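
For example (dimension 20 is only illustrative; the available keys depend on the loaded data):

dsl_20d = dsl.dictByDim()[20]  # all data sets recorded in dimension 20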

def dictByFunc(self):

Returns a dictionary of instances of this class by functions.

Returns a dictionary with the function id as keys and the corresponding slices as values.

def dictByDimFunc(self):

Returns a dictionary of instances of this class by dimensions and for each dimension by function.

Returns a dictionary with dimensions as keys; each value is itself a dictionary with function ids as keys and the corresponding slices as values.

ds = dsl.dictByDimFunc()[40][2]  # data in dimension 40 on F2
def dictByNoise(self):
Returns a dictionary splitting noisy and non-noisy entries.
def isBiobjective(self):
Undocumented
def dictByFuncGroupBiobjective(self):

Returns a dictionary of instances of this class by function groups for the bi-objective case.

The output dictionary has function group names as keys and the corresponding slices as values.

def dictByFuncGroupSingleObjective(self):

Returns a dictionary of instances of this class by function groups for the single-objective case.

The output dictionary has function group names as keys and the corresponding slices as values. Current groups are based on the GECCO-BBOB 2009-2013 function testbeds.

def dictByFuncGroup(self):

Returns a dictionary of instances of this class by function groups.

The output dictionary has function group names as keys and the corresponding slices as values.

def getFuncGroups(self):

Returns a dictionary of function groups.

The output dictionary has function group names as keys and function group descriptions as values.
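
A minimal sketch combining getFuncGroups with dictByFuncGroup (the group names depend on the testbed):

groups = dsl.getFuncGroups()      # {group name: group description}
for name, sub_dsl in dsl.dictByFuncGroup().items():
    print(name, groups.get(name, ''), len(sub_dsl))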

def dictByParam(self, param):
Returns a dictionary of DataSetList by parameter values.
Returns:
    a dictionary with the values of parameter param as keys and the
    corresponding slices of DataSetList as values.
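
For illustration, assuming param names a DataSet attribute such as funcId:

for value, sub_dsl in dsl.dictByParam('funcId').items():
    print(value, len(sub_dsl))  # one DataSetList slice per funcId value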
def info(self, opt=None):
Display some information onscreen.
Parameters:
    opt (string): changes the size of the output, can be 'all' (default) or 'short'
def sort(self, key1='dim', key2='funcId'):
Undocumented
def run_length_distributions(self, dimension, target_values, fun_list=None, reference_data_set_list=None, reference_scoring_function=lambda x: toolsstats.prctile(x, [5])[0], data_per_target=15, flatten_output_dict=True, simulated_restarts=False, bootstrap=False):

Return a dictionary with an entry for each algorithm (or, if flatten_output_dict is True and there is only one algorithm, just that entry's value), together with the left envelope rld-array.

For each algorithm, the entry contains a sorted rld-array of evaluations to reach the targets on all functions in fun_list, or on all functions in self; the list of solved functions; and the list of processed functions. If the sorted rld-array is normalized by the reference score (after sorting), the last entry is the original rld.

Example:

%pylab
dsl = cocopp.load(...)  # a single algorithm
rld = dsl.run_length_distributions(10, [1e-1, 1e-3, 1e-5])
step(rld[0][0], np.linspace(0, 1, len(rld[0][0]),
                            endpoint=True))

TODO: change the interface to always return rld_original and, optionally, the scores to compare with, such that we need to compute rld[0][0] / rld[0][-1] to get the current output?

If reference_data_set_list is not None, evaluations are normalized by the reference data; the data are, however, sorted before normalization.

Parameters:
    simulated_restarts: use simulated trials instead of "raw" evaluations
        from calling DataSet.detEvals. simulated_restarts may be a bool, a
        kwargs dict passed like **simulated_restarts to the method
        DataSet.evals_with_simulated_restarts, or the number of simulated
        trials. By default, the first trial is chosen without replacement.
        That means, if the number of simulated trials equals nbRuns(), the
        result is the same as from DataSet.detEvals, apart from the
        ordering of the data. If bootstrap is active, the number is set to
        nbRuns() and the first trial is chosen with replacement.
    bootstrap: if true, the number of evaluations is bootstrapped within
        the instances/trials or via simulated restarts.
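
A hedged sketch of these documented options (the argument values are illustrative only):

rld = dsl.run_length_distributions(
    10, [1e-1, 1e-3, 1e-5],
    simulated_restarts=True,  # or a number of trials, or a kwargs dict
    bootstrap=True)           # resample within the instances/trials
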
def get_all_data_lines(self, target_value, fct, dim):

Return a list of all data lines in self for each algorithm and a list of the respective computed aRTs.

Example

Get all run lengths of all trials on f1 in 20-D to reach target 1e-7:

data = dsl.get_all_data_lines(1e-7, 1, 20)[0]
flat_data = np.hstack(data)
plot(np.arange(1, 1+len(flat_data)) / len(flat_data),
     sort(flat_data))  # sorted fails on nan
def det_best_data(self, target_values, fct, dim, number=15):

Return a list of the number smallest evaluations over all data sets in self for each target in target_values.

Detail: currently, the minimal observed evaluation is computed instance-wise, and the evaluations of the number "easiest" instances are returned. That is, if number equals the number of instances, the best evaluation for each instance is returned. The smallest number evaluations regardless of instance are also computed, but not returned.
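
For example (function 1 in dimension 10 is illustrative):

best_evals = dsl.det_best_data([1e-1, 1e-3, 1e-5], 1, 10, number=15)
# best_evals[i] holds (up to) 15 smallest evaluation counts for target_values[i]
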

def det_best_data_lines(self, target_values, fct, dim, scoring_function=None):

Return a list of the respective best data lines over all data sets in self for each target in target_values, and an array of the computed scores (ERT if scoring_function == 'ERT').

A data line is the set of evaluations from all (usually 15) runs for a given target value. The score determines which data line is "best".

If scoring_function is None, the best is determined with method detERT. Using scoring_function=lambda x: toolsstats.prctile(x, [5], ignore_nan=False) is another useful alternative.
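
A sketch of the alternative scoring, assuming toolsstats is importable from cocopp and the two documented return values unpack as a pair:

from cocopp import toolsstats
lines, scores = dsl.det_best_data_lines(
    [1e-1, 1e-3], 1, 10,
    scoring_function=lambda x: toolsstats.prctile(x, [5], ignore_nan=False))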

def get_sorted_algorithms(self, dimension, target_values, fun_list=None, reference_dataset_list=None, smallest_evaluation_to_use=3):

Return a list of the algorithms from self, sorted by minimum loss factor in the ECDF.

Best means being within the loss factor of the best algorithm at one or more points of the ECDF over the functions in fun_list, i.e., having minimal distance to the left envelope in the semilogx plot.

target_values gives for each function-dimension pair a list of target values.

TODO: data generation via run_length_distributions and sorting should probably be separated.
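
A hedged call sketch (target_values must provide targets per function-dimension pair, as stated above; the values below are illustrative):

sorted_algos = dsl.get_sorted_algorithms(
    10,                   # dimension
    target_values,        # targets per function-dimension pair
    fun_list=[1, 2, 3])   # restrict the ECDF to these functions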

def get_reference_values_hash(self):
Undocumented