Known subclasses: cocopp.archiving.COCOBBOBBiobjDataArchive, cocopp.archiving.COCOBBOBDataArchive, cocopp.archiving.COCOBBOBNoisyDataArchive

Data archive based on an archive definition file.

This class is not meant to be instantiated directly. Instead, use cocopp.archiving.get to get a class instance. The class needs an archive definition file to begin with, as created with cocopp.archiving.create.

See cocopp.archives or cocopp.archiving.official_archives for the "official" archives.

This class "is" a list (StrList) of names which are relative file names separated with slashes "/". Each name represents the zipped data from a full archived experiment, benchmarking one algorithm on an entire benchmark suite.

The function create serves to create a new user-defined archive from experiment data which can be loaded with get. Other derived classes define other specific (sub)archives.

Using the class

Calling the class instance (alias to find) helps to extract entries matching one or several substrings, e.g. a year or a method. find_indices returns the respective indices instead of the names. print displays both. For example:

>>> import cocopp
>>> cocopp.archives.bbob.find('bfgs')  # doctest:+SKIP
['2009/BFGS_ros_noiseless.tgz',
 '2012/DE-BFGS_voglis_noiseless.tgz',
 '2012/PSO-BFGS_voglis_noiseless.tgz',
 '2014-others/BFGS-scipy-Baudis.tgz',
 '2014-others/L-BFGS-B-scipy-Baudis.tgz'...

To post-process these data call:

>>> cocopp.main(cocopp.archives.bbob.get_all('bfgs'))  # doctest:+SKIP

Method get downloads a single "matching" data set if necessary and returns the absolute data path which can be used with cocopp.main.

Method index is inherited from list and finds the index of the respective name entry in the archive (exact match only).
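The difference between substring search (find, find_indices) and the exact-match index inherited from list can be illustrated without the package. The following is only a minimal sketch on a plain list: the helper functions are hypothetical stand-ins, not the cocopp implementation, and it assumes that matching ignores case (as the 'bfgs' example above suggests) and that every given substring must match.

```python
# Hypothetical stand-in for an archive: a plain list of entry names
# taken from the examples in this section.
names = [
    '2009/BFGS_ros_noiseless.tgz',
    '2009/CMA-ESPLUSSEL_auger_noiseless.tgz',
    '2012/DE-BFGS_voglis_noiseless.tgz',
    '2014-others/BFGS-scipy-Baudis.tgz',
]


def _matches(name, substrs):
    """True if `name` contains every substring (assumption: case is ignored)."""
    return all(s.lower() in name.lower() for s in substrs)


def find(names, *substrs):
    """Return the names that contain all of `substrs`."""
    return [name for name in names if _matches(name, substrs)]


def find_indices(names, *substrs):
    """Return the positions of the matching names instead of the names."""
    return [i for i, name in enumerate(names) if _matches(name, substrs)]


bfgs_2012 = find(names, 'bfgs', '2012')  # both substrings must match
# list.index requires the complete name; a substring raises ValueError
exact_index = names.index('2009/CMA-ESPLUSSEL_auger_noiseless.tgz')
```

With the real archive, the same distinction applies: find accepts fragments such as a year or an author name, whereas index needs the complete entry name.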

cocopp.archives.all contains all experimental data for all test suites.

>>> import cocopp
>>> bbob = cocopp.archives.bbob  # the bbob testbed archive
>>> len(bbob) > 150
True
>>> bbob[:3]  # doctest:+ELLIPSIS,+SKIP,
['2009/...
>>> bbob('2009/bi')[0]  # doctest:+ELLIPSIS,+SKIP,
'...

Get a list with the full pathname of each data set, or None where the data set is not yet downloaded:

>>> [bbob.get(i, remote=False) for i in range(len(bbob))] # doctest:+ELLIPSIS
[...

Find something more specific:

>>> bbob('auger')[0]  # == bbob.find('auger')[0]  # doctest:+SKIP,
'2009/CMA-ESPLUSSEL_auger_noiseless.tgz'

corresponds to cocopp.main('auger!').

>>> bbob.index('2009/CMA-ESPLUSSEL_auger_noiseless.tgz')  # just list.index
5
>>> data_path = bbob.get(bbob(['au', '2009'])[0], remote=False)
>>> assert data_path is None or str(data_path) == data_path

These commands may download data. To avoid this, the option remote=False is given:

>>> ' '.join(bbob.get(i, remote=False) or '' for i in [2, 13, 33])  # can serve as argument to cocopp.main  # doctest:+ELLIPSIS,+SKIP,
'...
>>> bbob.get_all([2, 13, 33], remote=False).as_string  # is the same  # doctest:+ELLIPSIS,+SKIP,
' ...
>>> ' '.join(bbob.get(name, remote=False) or '' for name in [bbob[2], bbob[13], bbob[33]])  # is the same  # doctest:+ELLIPSIS,+SKIP,
'...
>>> ' '.join(bbob.get(name, remote=False) or '' for name in [
...         '2009/BAYEDA_gallagher_noiseless.tgz',
...         '2009/GA_nicolau_noiseless.tgz',
...         '2010/1komma2mirser_brockhoff_noiseless.tar.gz'])  # is the same  # doctest:+ELLIPSIS,+SKIP,
'...

DONE: join with COCODataArchive, to get there:
- DONE upload definition files to official archives
- DONE? use uploaded definition files (see official_archive_locations in _get_remote)
- DONE? replace usages of derived data classes by get
- DONE remove definition list in code of the root class
- DONE review and join classes without default for local path

Method __init__ Argument is a local path to the archive.
Method get_found get full entries of the last find
Method get_all Return a list (StrList) of absolute pathnames,
Method get_first get the first archived data matching all of substrs.
Method get return the full data pathname of substr in the archived data.
Method get_one deprecated, for backwards compatibility only, use get instead
Method get_extended return a list of valid paths.
Method contains return True if (the exact) name or path is in the archive
Method downloaded return list of data set names of locally available data.
Method full_path return full local path of name or any path, idempotent
Method consistency_check_data basic quick consistency check of downloaded data.
Method check_hash raise Exception when hashes disagree or file is missing.
Method update update definition file, either from remote location or from local data.
Method consistency_check_read check/compare against definition file on disk
Method read_definition_file return definition triple list
Static Method is_archive return True if folder contains a COCO archive definition file
Method _download create full local path and download single dataset
Method _name return supposed name of full_path or name without any checks
Method _name_with_check return name of full_path, idempotent.
Method _hash compute hash of name or path
Method _known_hash return known hash or None
Method _url_ return value of _url_ entry in definition_list file or None.
def __init__(self, local_path):

Argument is a local path to the archive.

This class is no longer meant to be used directly; use cocopp.archiving.get instead.

local_path is an archive folder containing a definition file, possibly downloaded with get calling _get_remote from a given url. ~ may refer to the user home folder.

Set _all and self from _all without the _url_ entry. This init does not deal with remote logic; it only reads _url_ from the definition file into the remote_data_path attribute.

Details: Set _all_dict which is a (never used) dictionary generated from _all and self and consists of the keys except for '_url_'.

def get_found(self, remote=True):
get full entries of the last find
def get_all(self, indices=None, remote=True):

Return a list (StrList) of absolute pathnames,

by repeatedly calling get. Elements of the indices list can be an index or a substring that matches one and only one name in the archive. If indices is None, the results from the last call to find are used. Download the data if necessary.

See also get.

def get_first(self, substrs, remote=True):

get the first archived data matching all of substrs.

substrs is a list of substrings.

get_first(substrs, remote) is a shortcut for:

self.find(*substrs)
if self.found:
    return self.get(self.found[0], remote=remote)
return None
def get(self, substr=None, remote=True):

return the full data pathname of substr in the archived data.

Retrieves the data from remote if necessary.

substr can be a substring that matches one and only one name in the data archive or an integer between 0 and len(self).

Raises a ValueError if substr matches several archive entries or none.

If substr is None (default), the first match of the last call to find* or get* is used, like self.found[0].

If remote is True (default), the respective data are downloaded from the remote location if necessary. Otherwise, None is returned for a match whose data are not available locally.
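The rule that substr must select exactly one entry can be sketched as follows. This is a hedged illustration only: resolve_unique is a hypothetical helper, not part of cocopp, it operates on a plain list of names, and it assumes case-insensitive substring matching.

```python
# Hypothetical stand-in for an archive, with names from the examples above.
names = [
    '2009/BAYEDA_gallagher_noiseless.tgz',
    '2009/GA_nicolau_noiseless.tgz',
    '2010/1komma2mirser_brockhoff_noiseless.tar.gz',
]


def resolve_unique(names, substr):
    """Return the single entry of `names` selected by `substr`.

    `substr` is either an integer index or a substring that must match
    one and only one name; otherwise a ValueError is raised.
    """
    if isinstance(substr, int):
        return names[substr]
    matches = [name for name in names if substr.lower() in name.lower()]
    if len(matches) != 1:
        raise ValueError("%r matches %d entries, expected exactly one"
                         % (substr, len(matches)))
    return matches[0]


by_substring = resolve_unique(names, 'gallagher')  # unique match
by_index = resolve_unique(names, 2)                # plain integer index
```

An ambiguous substring such as '2009', or one that matches nothing, raises ValueError in this sketch, which mirrors the behavior described for get.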

def _download(self, name):
create full local path and download single dataset
def get_one(self, *args, **kwargs):
deprecated, for backwards compatibility only, use get instead
def get_extended(self, args, remote=True):

return a list of valid paths.

Elements in args may be a valid path name or a known name from the data archive, or a uniquely matching substring of such a name, or a matching substring with added "!" in which case the first match is taken only (calling self.get_first), or a matching substring with added "*" in which case all matches are taken (calling self.get_all), or a regular expression containing a * and not ending with ! or *, in which case, for example, "bbob/2017.*cma" matches "bbob/2017/DTS-CMA-ES-Pitra.tgz" among others (in a regular expression "." matches any single character and ".*" matches any number >= 0 of characters).
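The suffix conventions described above can be summarized in a small sketch. The function names and the returned labels are hypothetical, the check for valid path names is left out, and the use of a case-insensitive re.search is an assumption (the example pairs the lowercase pattern "cma" with the name "DTS-CMA-ES-Pitra").

```python
import re


def classify(arg):
    """Return a (kind, pattern) pair for one argument (labels are hypothetical).

    The check whether `arg` is an existing path is omitted in this sketch.
    """
    if arg.endswith('!'):
        return 'first', arg[:-1]   # take the first match only
    if arg.endswith('*'):
        return 'all', arg[:-1]     # take all matches
    if '*' in arg:
        return 'regex', arg        # regular expression
    return 'unique', arg           # must match exactly one name


def regex_matches(names, pattern):
    """Return names matched by `pattern` (assumption: case is ignored)."""
    return [name for name in names if re.search(pattern, name, re.IGNORECASE)]


# Illustrative names; the second one is made up for contrast.
example_names = [
    'bbob/2017/DTS-CMA-ES-Pitra.tgz',
    'bbob/2009/BFGS_ros_noiseless.tgz',
]
matched = regex_matches(example_names, 'bbob/2017.*cma')
```

Here classify('auger!') yields the 'first' kind and classify('bfgs*') the 'all' kind, corresponding to the calls to self.get_first and self.get_all mentioned above.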

def _name(self, full_path):
return supposed name of full_path or name without any checks
def contains(self, name):
return True if (the exact) name or path is in the archive
@property
def downloaded(self):

return list of data set names of locally available data.

This is only meaningful for a remote archive.

def full_path(self, name):
return full local path of name or any path, idempotent
def _name_with_check(self, full_path):

return name of full_path, idempotent.

If full_path is not from the data archive, a warning is issued and path separators are replaced with /.
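How a full local path might be mapped back to an archive name can be sketched as below. name_from_path is a hypothetical helper, not the actual method; in particular, the assumption that a warning is issued only for absolute paths outside the archive folder is made here for illustration.

```python
import os
import warnings


def name_from_path(full_path, archive_root):
    """Return the slash-separated name of `full_path` relative to the archive.

    A name that is already relative is returned unchanged apart from the
    separator replacement, which makes the function idempotent.
    """
    root = os.path.abspath(archive_root)
    path = str(full_path)
    if path.startswith(root + os.path.sep):
        path = path[len(root) + 1:]  # strip the archive folder
    elif os.path.isabs(path):
        # assumption: only foreign absolute paths trigger the warning
        warnings.warn("%s is not inside %s" % (path, root))
    return path.replace(os.path.sep, '/')


root = os.path.abspath('archive')  # hypothetical archive folder
full = os.path.join(root, '2009', 'BFGS_ros_noiseless.tgz')
name = name_from_path(full, root)
same_name = name_from_path(name, root)  # applying it again changes nothing
```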

Check that all names are only once in the data archive:

>>> import cocopp
>>> bbob = cocopp.archives.bbob
>>> for name in bbob:
...     assert bbob.count(name) == 1, "%s counted %d times in data archive" % (name, bbob.count(name))
...     assert len(bbob.find(name)) == 1, "%s found %d times" % (name, len(bbob.find(name)))
def consistency_check_data(self):

basic quick consistency check of downloaded data.

return (number_of_checked_data, number_of_all_data)

def check_hash(self, name):

raise Exception when hashes disagree or file is missing.

Raise RuntimeError if the hash is unknown; raise ValueError if the hashes disagree.

def _hash(self, name, hash_function=hashlib.sha256):
compute hash of name or path
def _known_hash(self, name):
return known hash or None
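The interplay of the three hash-related methods can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the cocopp code: file_hash and check_file are hypothetical helpers, and the known hashes are kept in a plain dictionary keyed by file name rather than in a definition file.

```python
import hashlib
import os
import tempfile


def file_hash(path, hash_function=hashlib.sha256):
    """Return the hex digest of the file at `path`, read in chunks."""
    digest = hash_function()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path, known_hashes):
    """Raise RuntimeError for an unknown hash, ValueError for a mismatch.

    A missing file raises the OSError subclass produced by open().
    """
    known = known_hashes.get(os.path.basename(path))
    if known is None:
        raise RuntimeError("no known hash for %s" % path)
    if file_hash(path) != known:
        raise ValueError("hash mismatch for %s" % path)


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'data.tgz')
    with open(path, 'wb') as f:
        f.write(b'example data')
    good = {'data.tgz': hashlib.sha256(b'example data').hexdigest()}
    check_file(path, good)  # passes silently
    try:
        check_file(path, {'data.tgz': '0' * 64})
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True
    try:
        check_file(path, {})
        unknown_detected = False
    except RuntimeError:
        unknown_detected = True
```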
def update(self):

update definition file, either from remote location or from local data.

As remote archives may grow or change, a common use case may be:

>>> import cocopp.archiving as ac
>>> url = 'http://lq-cma.gforge.inria.fr/data-archives/lq-gecco2019'
>>> arch = ac.get(url).update()  # doctest:+SKIP

For updating a local archive use:

create(self.local_data_path)

Details: to update the local definition file from the local data, use create instead. This will, however, remove the remote URL from the definition, so that the remote and the local archive may then differ. create makes a backup of the existing definition file.

def consistency_check_read(self):
check/compare against definition file on disk
def _url_(self, definition_list=None):
return value of _url_ entry in definition_list file or None.
def read_definition_file(self):
return definition triple list
@staticmethod
def is_archive(url_or_folder):
return True if folder contains a COCO archive definition file
API Documentation for cocopp, generated by pydoctor at 2020-01-21 16:27:37.