cwepr.dataset module

Datasets: units containing data and metadata.

The dataset, consisting of the data as well as the corresponding metadata, is one key concept of the ASpecD framework and hence of the cwepr package derived from it. Storing metadata in a structured way is a prerequisite for a semantic understanding within the routines. Furthermore, a history of every processing, analysis, and annotation step is recorded, aiming at a maximum of reproducibility. This is part of how the ASpecD framework, and therefore the cwepr package, try to support good scientific practice.

Therefore, each processing and analysis step of data should always be performed using the respective methods of a dataset, at least as long as it can be performed on a single dataset.
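Why routing every step through the dataset matters can be shown with a minimal sketch. It is not the ASpecD or cwepr implementation: the names ToyDataset, ScaleStep, process, and apply are hypothetical and only illustrate how a dataset method can apply a step and record it in a history at the same time.

```python
import datetime


class ToyDataset:
    """Hypothetical stand-in for a dataset: data, metadata, and a history."""

    def __init__(self, data, metadata=None):
        self.data = list(data)
        self.metadata = dict(metadata or {})
        self.history = []

    def process(self, step):
        """Apply a processing step and record it in the history."""
        self.data = step.apply(self.data)
        self.history.append({
            "step": type(step).__name__,
            "parameters": dict(step.parameters),
            "timestamp": datetime.datetime.now().isoformat(),
        })
        return step


class ScaleStep:
    """Hypothetical processing step multiplying every value by a factor."""

    def __init__(self, factor):
        self.parameters = {"factor": factor}

    def apply(self, data):
        return [value * self.parameters["factor"] for value in data]


dataset = ToyDataset([1.0, 2.0, 3.0], metadata={"sample": "demo"})
dataset.process(ScaleStep(factor=2.0))
```

Because the step is applied through the dataset, the history entry is created as a side effect and cannot be forgotten. Calling `ScaleStep(2.0).apply(dataset.data)` directly would change nothing in the history.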

Datasets

Generally, there are two types of datasets: those containing experimental data and those containing calculated data. Therefore, two corresponding subclasses exist: ExperimentalDataset and CalculatedDataset.

Dataset factory

Particularly in the case of recipe-driven data analysis (cf. aspecd.tasks), there is a need to automatically retrieve datasets using nothing more than a source string that can be, e.g., a path or LOI. This is where the DatasetFactory comes in. It is a factory in the sense of the factory pattern described by the “Gang of Four” in their seminal work, “Design Patterns” (Gamma et al., 1995).

Module documentation

class cwepr.dataset.ExperimentalDataset

Bases: aspecd.dataset.ExperimentalDataset

Set of data uniting all relevant information.

The unity of numerical data and metadata is indispensable for the reproducibility of data. It is achieved by storing all information available for one set of measurement data in a single instance of this class.

class cwepr.dataset.CalculatedDataset

Bases: aspecd.dataset.CalculatedDataset

Entity consisting of calculated data and metadata.

The class is inherited unchanged from ASpecD for ease of use. See the ASpecD documentation of the aspecd.dataset.CalculatedDataset class for details.

class cwepr.dataset.DatasetFactory

Bases: aspecd.dataset.DatasetFactory

Factory for creating dataset objects based on the source provided.

Particularly in the case of recipe-driven data analysis (cf. aspecd.tasks), there is a need to automatically retrieve datasets using nothing more than a source string that can be, e.g., a path or LOI.

The DatasetFactory operates in conjunction with a cwepr.io.factory.DatasetImporterFactory to import the actual dataset. See the respective class documentation for more details.

importer_factory

ImporterFactory instance used for importing datasets

Type

cwepr.io.factory.DatasetImporterFactory
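The interplay between a dataset factory and its importer factory can be sketched as follows. This is a self-contained illustration of the factory pattern under stated assumptions, not the cwepr code: all Toy* classes, the import_into and get_importer methods, and the extension-based importer lookup are hypothetical. Only the importer_factory attribute name is taken from the documentation above.

```python
import os


class ToyDataset:
    """Hypothetical stand-in for a dataset object."""

    def __init__(self):
        self.id = ""
        self.importer_name = ""


class ToyImporter:
    """Hypothetical base importer; a real one would read the source."""

    def __init__(self, source):
        self.source = source

    def import_into(self, dataset):
        # Only record where the data came from and who imported it.
        dataset.id = self.source
        dataset.importer_name = type(self).__name__


class ToyTextImporter(ToyImporter):
    """Hypothetical importer for text files."""


class ToyBinaryImporter(ToyImporter):
    """Hypothetical importer for binary files."""


class ToyImporterFactory:
    """Choose an importer from the file extension (an assumption)."""

    importers = {".txt": ToyTextImporter, ".bin": ToyBinaryImporter}

    def get_importer(self, source):
        extension = os.path.splitext(source)[1].lower()
        try:
            return self.importers[extension](source)
        except KeyError:
            raise ValueError(f"No importer for source {source!r}") from None


class ToyDatasetFactory:
    """Return a filled dataset given nothing but a source string."""

    def __init__(self):
        self.importer_factory = ToyImporterFactory()

    def get_dataset(self, source=""):
        if not source:
            raise ValueError("A source is required")
        dataset = ToyDataset()
        importer = self.importer_factory.get_importer(source)
        importer.import_into(dataset)
        return dataset


factory = ToyDatasetFactory()
dataset = factory.get_dataset(source="measurements/sample.txt")
```

The caller supplies only the source string. Which importer handles it is decided inside the importer factory, so support for a new format can be added there without touching the calling code.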