Usage
This library contains the `DSFF` class that can:
- Behave as a context manager
- Have items got (`data` and `features` return the related worksheets) or set (for setting, in order of precedence, standard XLSX properties and metadata contained in the `description` property)
- Write data, features and metadata
- Convert to the ARFF (for use with the Weka framework) or CSV formats or to a FilelessDataset structure (from the Packing Box)
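The item access described in the list above can be pictured with a toy model. This is a minimal sketch and not the library's implementation: the class `ToyDSFF`, the set `STANDARD_PROPERTIES` and the way metadata is stored are assumptions made for illustration. Only the behavior comes from the text: `data` and `features` return the related worksheets, and standard properties take precedence over metadata when setting.

```python
# Toy model only: names and storage layout are assumptions, not the dsff API.
STANDARD_PROPERTIES = {"title", "creator", "keywords"}  # hypothetical subset


class ToyDSFF:
    def __init__(self):
        self.sheets = {"data": [], "features": []}  # stand-ins for worksheets
        self.properties = {}  # stand-in for standard XLSX properties
        self.metadata = {}    # stand-in for metadata kept in "description"

    def __getitem__(self, name):
        # "data" and "features" return the related worksheet
        if name in self.sheets:
            return self.sheets[name]
        # otherwise look up standard properties first, then metadata
        if name in self.properties:
            return self.properties[name]
        return self.metadata[name]

    def __setitem__(self, name, value):
        # standard properties take precedence over free-form metadata
        if name in STANDARD_PROPERTIES:
            self.properties[name] = value
        else:
            self.metadata[name] = value


f = ToyDSFF()
f["title"] = "my-dataset"  # recognized as a standard property
f["labels"] = 2            # anything else lands in the metadata
print(f.properties, f.metadata, f["data"])
```

Keeping the two stores separate makes the precedence rule explicit: a key is only treated as metadata when it is not a recognized standard property.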
Modes
The `DSFF` class can be instantiated with a file operation mode. It works similarly to Python's native `open` function, but with a reduced set of modes. The following table indicates which operations each mode supports:
Modes | r | r+ | w | w+ |
---|---|---|---|---|
Read | * | * | | * |
Write | | * | * | * |
Create | | | * | * |
Truncate | | | * | * |
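The table can also be read as a lookup from mode to permitted operations. The sketch below encodes it as a plain dictionary, assuming the modes follow the usual semantics of Python's built-in `open`. The names `MODE_CAPABILITIES` and `allows` are hypothetical and are not part of the library.

```python
# Hypothetical encoding of the modes table; not part of the dsff API.
MODE_CAPABILITIES = {
    "r":  {"read"},
    "r+": {"read", "write"},
    "w":  {"write", "create", "truncate"},
    "w+": {"read", "write", "create", "truncate"},
}


def allows(mode, operation):
    """Return True if `operation` is permitted in `mode`."""
    try:
        return operation in MODE_CAPABILITIES[mode]
    except KeyError:
        raise ValueError(f"unsupported mode: {mode!r}") from None


# Print each mode with the operations it permits
for mode in MODE_CAPABILITIES:
    print(mode, sorted(MODE_CAPABILITIES[mode]))
```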
Bound methods for conversions
When Read is available, the `to_*` (e.g. `to_arff`) methods are bound to the DSFF class. Conversely, when Write is available, the `from_*` (e.g. `from_arff`) methods are bound to the DSFF class. As a consequence, the modes with "+" have both `to_*` and `from_*` methods attached.
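One way to picture this mode-dependent binding is the toy class below. It is only a sketch of the general technique, not the library's code: `ToyContainer`, `to_csv` and `from_csv` are stand-ins written for this example, and the methods are attached per instance with `types.MethodType`, which may differ from how DSFF does it internally.

```python
import types


# Toy illustration of mode-dependent method binding; not the dsff implementation.
def to_csv(self):
    return f"exporting {self.name} to CSV"


def from_csv(self):
    return f"importing CSV into {self.name}"


class ToyContainer:
    def __init__(self, name, mode="r"):
        self.name = name
        readable = mode in ("r", "r+", "w+")
        writable = mode in ("r+", "w", "w+")
        if readable:  # Read available: attach the to_* exporter
            self.to_csv = types.MethodType(to_csv, self)
        if writable:  # Write available: attach the from_* importer
            self.from_csv = types.MethodType(from_csv, self)


# Show which methods exist on an instance for each mode
for mode in ("r", "r+", "w", "w+"):
    c = ToyContainer("my-dataset", mode)
    print(mode, hasattr(c, "to_csv"), hasattr(c, "from_csv"))
```

With this design, calling a conversion that the mode does not allow fails early with an `AttributeError` instead of failing partway through a file operation.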
The following pictures illustrate the available alternative formats and their applicable modes:
[Figures: "Converting from other formats to DSFF" and "Converting from DSFF to other formats"]
Lossy conversions
The following conversions only preserve the data (not the dictionary of features or metadata):
Usage
Creating a DSFF from a FilelessDataset
>>> import dsff
>>> with dsff.DSFF() as f:
...     f.write("/path/to/my-dataset")  # folder of a FilelessDataset (containing data.csv, features.json and metadata.json)
# while leaving the context, ./my-dataset.dsff is created
Creating an ARFF file from a DSFF
>>> import dsff
>>> with dsff.DSFF("my-dataset.dsff") as f:
...     f.to_arff()  # creates ./my-dataset.arff
Creating a CSV file from a DSFF
>>> import dsff
>>> with dsff.DSFF("my-dataset.dsff") as f:
...     f.to_csv()  # creates ./my-dataset.csv
Creating a FilelessDataset from a DSFF
>>> import dsff
>>> with dsff.DSFF("/path/to/my-dataset.dsff") as f:
...     f.to_dataset()  # creates ./[dsff-title] with data.csv, features.json and metadata.json