List of instances of DataSet.
This class implements some useful slicing functions.
It also merges the data of DataSet instances that are identical (according to the __eq__ method of DataSet).
Method | __init__ |
Instantiate self from a list of folder- or filenames or DataSet instances. |
Method | append |
Redefines the append method to check for uniqueness. |
Method | by |
Returns a dictionary of DataSetList instances by attr_name . |
Method | det |
return a list of the number smallest evaluations over all data sets in self for each target in target_values. |
Method | det |
return a list of the respective best data lines over all data sets in self for each target in target_values and an array of the computed scores (ERT if scoring_function == 'ERT'). |
Method | dict |
Returns a dictionary of instances of this class by algorithm. |
Method | dict |
Returns a dictionary of instances of this class by algorithm. |
Method | dict |
Returns a dictionary of instances of this class by dimensions. |
Method | dict |
Returns a dictionary of instances of this class by dimensions and for each dimension by function. |
Method | dict |
Returns a dictionary of instances of this class by functions. |
Method | dict |
Returns a dictionary of instances of this class by objective functions (grouping over constraints). |
Method | dict |
Returns a dictionary of instances of this class by function groups. |
Method | dict |
Returns a dictionary of instances of this class by function groups for bi-objective case. |
Method | dict |
Returns a dictionary of instances of this class by function groups for single objective case. |
Method | dict |
Returns a dictionary splitting noisy and non-noisy entries. |
Method | dict |
Returns a dictionary of DataSetList by parameter values. |
Method | extend |
Extend with elements. |
Method | get |
return a list of all data lines in self for each algorithm and a list of the respective computed ERTs. |
Method | get |
Undocumented |
Method | get |
return list of the algorithms from self, sorted by minimum loss factor in the ECDF. |
Method | get |
Returns a dictionary of function groups. |
Method | info |
Display some information onscreen. |
Method | is |
Undocumented |
Method | pickle |
Loop over self to pickle each element. |
Method | process |
Reads in an index (.info?) file information on the different runs. |
Method | run |
return a dictionary with an entry for each algorithm, or for only one algorithm the dictionary value if flatten_output_dict is True, and the left envelope rld-array. |
Method | sort |
Undocumented |
Instance Variable | current |
Undocumented |
Instantiate self from a list of folder- or filenames or DataSet instances.
Exceptions: Warning -- Unexpected user input. pickle.UnpicklingError
Parameters | |
args | Undocumented |
check | Undocumented |
list args | strings being either info file names, folder containing info files or pickled data files, or a list of DataSets. |
Returns a dictionary of DataSetList instances by attr_name.
The attr_name values are the dictionary keys and the corresponding slices (partial lists) are the values.
May in the future replace some of the specific methods; for example, dsl.dictByDim() == dsl.by('dim').
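The grouping idea can be illustrated without the library. The following is a minimal sketch, not the class's implementation: `Record` and `group_by` are hypothetical names, and plain lists stand in for DataSetList slices.

```python
from collections import defaultdict


class Record:
    """Hypothetical stand-in for a DataSet with two attributes."""

    def __init__(self, dim, funcId):
        self.dim = dim
        self.funcId = funcId


def group_by(items, attr_name):
    """Return a dict mapping each attr_name value to the list of matching items."""
    groups = defaultdict(list)
    for item in items:
        groups[getattr(item, attr_name)].append(item)
    return dict(groups)


records = [Record(2, 1), Record(2, 2), Record(10, 1)]
by_dim = group_by(records, 'dim')      # keys are the observed dimensions
by_func = group_by(records, 'funcId')  # same helper, different attribute
```

One generic helper keyed on an attribute name covers what would otherwise need one method per attribute.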
return a list of the number smallest evaluations over all data sets in self for each target in target_values.
Detail: currently, the minimal observed evaluation is computed instance-wise and the number "easiest" instances are returned. That is, if number is the number of instances, the best eval for each instance is returned. The smallest number evaluations regardless of instance are also computed, but not returned.
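The instance-wise selection described here can be sketched as follows. This is an illustration under assumptions, not the method's code: the function name, the input layout (a dict from instance id to evaluation counts), and the use of nan for "target not reached" are all hypothetical.

```python
import numpy as np


def smallest_evals_per_instance(evals_by_instance, number):
    """Return the `number` smallest per-instance minima (nan sorts last)."""
    best = []
    for evals in evals_by_instance.values():
        finite = [e for e in evals if not np.isnan(e)]
        # best observed evaluation count on this instance, nan if never solved
        best.append(min(finite) if finite else np.nan)
    # np.sort places nan entries at the end of the array
    return np.sort(np.asarray(best, dtype=float))[:number]


# made-up data: instance id -> evaluations of each run on that instance
evals_by_instance = {1: [120, 95], 2: [np.nan, 300], 3: [np.nan], 4: [80, 110]}
easiest = smallest_evals_per_instance(evals_by_instance, 2)
all_instances = smallest_evals_per_instance(evals_by_instance, 4)
```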
return a list of the respective best data lines over all data sets in self for each target in target_values and an array of the computed scores (ERT if scoring_function == 'ERT').
A data line is the set of evaluations from all (usually 15) runs for a given target value. The score determines which data line is "best".
If scoring_function is None, the best is determined with method detERT. Using scoring_function=lambda x: toolsstat.prctile(x, [5], ignore_nan=False) is another useful alternative.
TODO: do we want to append equal-instance lines for detEvals?
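To make the scoring concrete, here is a minimal sketch that picks the best data line using the usual ERT definition (sum of evaluations over all runs divided by the number of successful runs). The helper names `ert` and `best_line`, the nan convention for unsuccessful runs, and the separate `maxevals` input are assumptions for illustration, not the library's interface.

```python
import numpy as np


def ert(evals, maxevals):
    """Expected running time: all evaluations spent / number of successes."""
    evals = np.asarray(evals, dtype=float)
    maxevals = np.asarray(maxevals, dtype=float)
    success = ~np.isnan(evals)
    if not success.any():
        return np.inf
    # unsuccessful runs contribute their full budget
    total = evals[success].sum() + maxevals[~success].sum()
    return total / success.sum()


def best_line(lines, maxevals_list, scoring_function=None):
    """Return (best data line, its score); a lower score is better."""
    if scoring_function is None:
        scores = [ert(l, m) for l, m in zip(lines, maxevals_list)]
    else:
        scores = [scoring_function(l) for l in lines]
    i = int(np.argmin(scores))
    return lines[i], scores[i]


# made-up data lines for one target, one line per data set
lines = [[100, np.nan, 200], [150, 160, 170]]
maxevals_list = [[1000, 1000, 1000], [1000, 1000, 1000]]
line, score = best_line(lines, maxevals_list)
```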
Returns a dictionary of instances of this class by algorithm.
The resulting dict uses algId and comment as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by algorithm.
Compared to dictByAlg, this method uses only the data folder as key and the corresponding slices as values.
Returns a dictionary of instances of this class by dimensions.
Returns a dictionary with dimension as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by dimensions and for each dimension by function.
Returns a dictionary with dimension as keys and the corresponding slices as values.
ds = dsl.dictByDimFunc()[40][2] # data in dimension 40 on F2
Returns a dictionary of instances of this class by functions.
Returns a dictionary with the function id as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by objective functions (grouping over constraints).
Should be used only with the constrained test bed.
Returns a dictionary with the function string identifiers as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by function groups.
The output dictionary has function group names as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by function groups for bi-objective case.
The output dictionary has function group names as keys and the corresponding slices as values.
Returns a dictionary of instances of this class by function groups for single objective case.
The output dictionary has function group names as keys and the corresponding slices as values. Current groups are based on the GECCO-BBOB 2009-2013 function testbeds.
Returns a dictionary of DataSetList by parameter values.
Returns | |
a dictionary with values of parameter param as keys and the corresponding slices of DataSetList as values. |
Extend with elements.
This method is implemented to prevent problems since append was superseded. It could be a source of efficiency issues.
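Why extend has to be overridden together with append can be seen in a small sketch. `UniqueList` is a hypothetical class; unlike the documented class, which merges the data of equal instances, it simply skips duplicates.

```python
class UniqueList(list):
    """Hypothetical list that skips items equal to one it already holds."""

    def append(self, item):
        # `in` relies on the items' __eq__; this is a linear scan
        if item not in self:
            super().append(item)

    def extend(self, items):
        # list.extend does not call append, so route through it explicitly;
        # the repeated linear scans make this quadratic overall
        for item in items:
            self.append(item)


ul = UniqueList()
ul.extend([1, 2, 2, 3])
ul.append(1)
```

The per-item membership check is what makes such an extend potentially slow on long lists.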
return a list of all data lines in self for each algorithm and a list of the respective computed ERTs.
Example
Get all run lengths of all trials on f1 in 20-D to reach target 1e-7:
    data = dsl.get_all_data_lines(1e-7, 1, 20)[0]
    flat_data = np.hstack(data)
    plot(np.arange(1, 1+len(flat_data)) / len(flat_data), sort(flat_data))  # sorted fails on nan
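The same empirical distribution can be computed on made-up numbers without loading any data. This sketch assumes that unsuccessful trials are marked with nan and relies on `np.sort` placing nan values last; the variable names are illustrative.

```python
import numpy as np

# made-up run lengths from two data lines; nan marks an unsuccessful trial
data = [np.array([120.0, 80.0, np.nan]), np.array([200.0, 95.0])]

flat_data = np.hstack(data)
x = np.sort(flat_data)  # nan entries end up at the tail
y = np.arange(1, 1 + len(flat_data)) / len(flat_data)

# keep only the solved trials for plotting; the proportions still
# refer to all trials, so the curve tops out below 1
solved = ~np.isnan(x)
ecdf_x, ecdf_y = x[solved], y[solved]
```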
return list of the algorithms from self, sorted by minimum loss factor in the ECDF.
Best means to be within loss of the best algorithm in at least one point of the ECDF from the functions fun_list, i.e. minimal distance to the left envelope in the semilogx plot.
target_values gives for each function-dimension pair a list of target values.
TODO: data generation via run_length_distributions and sorting should probably be separated.
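The ranking criterion can be sketched on toy data. The following is an assumption-laden illustration, not the method's code: it takes one sorted run-length array per algorithm, all of equal length and with finite entries only, forms the left envelope as the pointwise minimum, and sorts by the smallest ratio to that envelope.

```python
import numpy as np


def sort_by_min_loss(rlds):
    """rlds: dict name -> sorted 1-D array of finite run lengths (equal lengths)."""
    names = list(rlds)
    table = np.array([rlds[n] for n in names], dtype=float)
    envelope = table.min(axis=0)  # left envelope: best algorithm at each point
    # smallest factor by which each algorithm is behind the envelope
    losses = dict(zip(names, (table / envelope).min(axis=1)))
    return sorted(names, key=losses.get), losses


# made-up sorted run lengths for three hypothetical algorithms
rlds = {
    'A': [10, 100, 1000],
    'B': [20, 50, 2000],
    'C': [40, 400, 4000],
}
order, losses = sort_by_min_loss(rlds)
```

Here A and B each touch the envelope somewhere, so both have loss factor 1, while C is never closer than a factor of 4.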
Returns a dictionary of function groups.
The output dictionary has function group names as keys and function group descriptions as values.
Display some information onscreen.
Parameters | |
opt | Undocumented |
string opt | changes size of output, can be 'all' (default), 'short' |
return a dictionary with an entry for each algorithm, or for only one algorithm the dictionary value if flatten_output_dict is True, and the left envelope rld-array.
For each algorithm the entry contains a sorted rld-array of evaluations to reach the targets on all functions in func_list or all functions in self, the list of solved functions, the list of processed functions. If the sorted rld-array is normalized by the reference score (after sorting), the last entry is the original rld.
Example:
    %pylab
    dsl = cocopp.load(...)  # a single algorithm
    rld = dsl.run_length_distributions(10, [1e-1, 1e-3, 1e-5])
    step(rld[0][0], np.linspace(0, 1, len(rld[0][0]), endpoint=True))
TODO: change the interface to always return rld_original and optionally the scores to compare with, such that we need to compute rld[0][0] / rld[0][-1] to get the current output?
If reference_data_set_list is not None evaluations are normalized by the reference data, however the data remain to be sorted without normalization.
Parameters | |
dimension | Undocumented |
target | Undocumented |
fun | Undocumented |
reference | Undocumented |
reference | Undocumented |
data | Undocumented |
flatten | Undocumented |
simulated | use simulated trials instead of "raw" evaluations from calling DataSet.detEvals. simulated_restarts may be a bool, or a kwargs dict passed like **simulated_restarts to the method DataSet.evals_with_simulated_restarts, or it may indicate the number of simulated trials. By default, the first trial is chosen without replacement. That means, if the number of simulated trials equals nbRuns(), the result is the same as from DataSet.detEvals, bar the ordering of the data. If bootstrap is active, the number is set to nbRuns() and the first trial is chosen with replacement. |
bootstrap | if bootstrap, the number of evaluations is bootstrapped within the instances/trials or via simulated restarts. |
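The general idea of a simulated restart can be sketched as below. This is a simplified illustration, not DataSet.evals_with_simulated_restarts: the function name and inputs are hypothetical, and every trial, including the first, is drawn with replacement.

```python
import math
import random


def simulated_restart_runlength(evals, maxevals, rng):
    """Draw trials with replacement until a successful one is drawn.

    evals[i] is nan if trial i did not reach the target;
    maxevals[i] is the budget spent by trial i.
    """
    if all(math.isnan(e) for e in evals):
        return math.nan  # no successful trial to end the restart sequence
    total = 0
    while True:
        i = rng.randrange(len(evals))
        if not math.isnan(evals[i]):
            return total + evals[i]
        total += maxevals[i]  # a failed trial costs its whole budget


rng = random.Random(1)  # seeded for reproducibility
evals = [100, math.nan, 250]
maxevals = [1000, 1000, 1000]
samples = [simulated_restart_runlength(evals, maxevals, rng) for _ in range(20)]
```

Each sample is the cost of zero or more failed trials plus one successful trial, which is why simulated run lengths can exceed any single observed run.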