Usage

This library contains the DSFF class that can:

Behave as a context manager
Support item access: getting data or features returns the related worksheet, while setting an item updates, in order of precedence, a standard XLSX property or a metadata entry contained in the description property
Write data, features and metadata
Convert to the ARFF (for use with the Weka framework) or CSV formats or to a FilelessDataset structure (from the Packing Box)

Modes

The DSFF class can be instantiated with a mode of file operation. It works similarly to Python's built-in open function, but with a reduced set of modes. The following table indicates which operations each mode allows:

Modes	r	r+	w	w+
Read	*	*		*
Write		*	*	*
Create			*	*
Truncate			*	*
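
The table can be read as a mapping from each mode to the set of operations it permits. The following is a minimal sketch of that mapping in plain Python; the names MODE_CAPABILITIES and allows are illustrative assumptions and are not part of the dsff API.

```python
# Illustrative only: mirrors the mode table above, not dsff's internals.
MODE_CAPABILITIES = {
    "r":  {"read"},
    "r+": {"read", "write"},
    "w":  {"write", "create", "truncate"},
    "w+": {"read", "write", "create", "truncate"},
}


def allows(mode, operation):
    """Return True if `mode` permits `operation` according to the table."""
    try:
        return operation in MODE_CAPABILITIES[mode]
    except KeyError:
        raise ValueError(f"unsupported mode: {mode!r}") from None


if __name__ == "__main__":
    # Print each mode with the operations it permits.
    for mode, operations in MODE_CAPABILITIES.items():
        print(mode, sorted(operations))
```

Read this way, "r+" differs from "w+" only in that it neither creates nor truncates the target file.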

Bound methods for conversions

When Read is available, the to_* methods (e.g. to_arff) are bound to the DSFF class. Conversely, when Write is available, the from_* methods (e.g. from_arff) are bound to the DSFF class. Consequently, the modes with "+" have both to_* and from_* methods attached.
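
One way to picture this mode-dependent binding is the toy class below. It is only a sketch under stated assumptions: ModeBoundExample, _to_arff and _from_arff are hypothetical names, the method bodies are placeholders, and nothing here reflects how dsff actually attaches its converters.

```python
import types


def _to_arff(self):
    # Placeholder body: a real exporter would write a file.
    return f"export to ARFF (mode {self.mode})"


def _from_arff(self):
    # Placeholder body: a real importer would read a file.
    return f"import from ARFF (mode {self.mode})"


class ModeBoundExample:
    """Toy stand-in exposing converters depending on the chosen mode."""

    def __init__(self, mode="r"):
        if mode not in ("r", "r+", "w", "w+"):
            raise ValueError(f"unsupported mode: {mode!r}")
        self.mode = mode
        if mode in ("r", "r+", "w+"):
            # Read is available: attach the to_* converter.
            self.to_arff = types.MethodType(_to_arff, self)
        if mode in ("r+", "w", "w+"):
            # Write is available: attach the from_* converter.
            self.from_arff = types.MethodType(_from_arff, self)


if __name__ == "__main__":
    for mode in ("r", "r+", "w", "w+"):
        obj = ModeBoundExample(mode)
        print(mode, hasattr(obj, "to_arff"), hasattr(obj, "from_arff"))
```

In this sketch, calling a converter that the mode does not allow fails with an AttributeError, because the method was never attached.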

The following pictures illustrate the available alternative formats and their applicable modes:

[Figure: Converting from other formats to DSFF]
[Figure: Converting from DSFF to other formats]

Lossy conversions

The following conversions only preserve the data (not the dictionary of features or metadata):

DSFF to ARFF
DSFF to CSV
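
To illustrate why such a conversion is lossy, the sketch below round-trips a hypothetical in-memory dataset (a plain dict, not a DSFF object) through CSV text using Python's standard csv module. The dataset contents and the helper names export_csv and import_csv are assumptions made for this example.

```python
import csv
import io

# Hypothetical dataset with the three parts mentioned above.
dataset = {
    "data": [
        ["name", "size", "label"],
        ["sample-1", "1024", "0"],
        ["sample-2", "2048", "1"],
    ],
    "features": {"size": "file size in bytes"},
    "metadata": {"title": "my-dataset"},
}


def export_csv(ds):
    """Serialize only the data rows; CSV has no slot for the other parts."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(ds["data"])
    return buffer.getvalue()


def import_csv(text):
    """Rebuild a dataset from CSV text; features and metadata come back empty."""
    rows = list(csv.reader(io.StringIO(text)))
    return {"data": rows, "features": {}, "metadata": {}}


restored = import_csv(export_csv(dataset))

if __name__ == "__main__":
    print(restored["data"] == dataset["data"])
    print(restored["features"], restored["metadata"])
```

The data rows survive the round trip unchanged, while the dictionary of features and the metadata have to be kept elsewhere if they are needed later.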

Usage

Creating a DSFF from a FilelessDataset

>>> import dsff
>>> with dsff.DSFF() as f:
...     f.write("/path/to/my-dataset")  # folder of a FilelessDataset (containing data.csv, features.json and metadata.json)
# on leaving the context, ./my-dataset.dsff is created
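
The save-on-exit behaviour described in the comment above follows the general context manager pattern. The following sketch shows that pattern with a hypothetical SaveOnExit class that writes JSON when the with block ends; the class, its content attribute and the JSON output are assumptions for illustration and say nothing about how DSFF stores its data.

```python
import json
import os
import tempfile


class SaveOnExit:
    """Hypothetical stand-in: collects content and writes it to disk on exit."""

    def __init__(self, path):
        self.path = path
        self.content = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Design choice of this sketch: save only if the block succeeded.
        if exc_type is None:
            with open(self.path, "w", encoding="utf-8") as handle:
                json.dump(self.content, handle)
        return False  # never swallow exceptions


def demo():
    """Return whether the file existed inside and after the with block."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "my-dataset.json")
        with SaveOnExit(target) as f:
            f.content["title"] = "my-dataset"
            existed_inside = os.path.exists(target)
        existed_after = os.path.exists(target)
    return existed_inside, existed_after


if __name__ == "__main__":
    print(demo())
```

In this sketch the file appears only once the with block has been left, which matches the timing described for the .dsff file in the example above.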

Creating an ARFF file from a DSFF

>>> import dsff
>>> with dsff.DSFF("my-dataset.dsff") as f:
...     f.to_arff()  # creates ./my-dataset.arff

Creating a CSV file from a DSFF

>>> import dsff
>>> with dsff.DSFF("my-dataset.dsff") as f:
...     f.to_csv()  # creates ./my-dataset.csv

Creating a FilelessDataset from a DSFF

>>> import dsff
>>> with dsff.DSFF("/path/to/my-dataset.dsff") as f:
...     f.to_dataset()  # creates ./[dsff-title] with data.csv, features.json and metadata.json