Datasets

A dataset is a small, portable configuration file describing which data belongs together, how it should be interpreted, and how it can be plotted. Datasets are human-readable JSON files with extensions such as:

.ds
.dataset
.json

These files are the backbone of the framework’s reproducible analysis system.

Why datasets matter

Ad-hoc analysis typically scatters paths, parameters, units, and processing logic across multiple scripts. Over time, this leads to:

inconsistent metadata
accidental path mistakes
hard-to-reproduce results
difficulty sharing analyses across a team

A dataset solves this by acting as a single source of truth. It encapsulates:

experiment metadata
device type
datafile grouping
labels and colours
the experiment timestamp
notes and documentation

From this information, the GUI and framework automatically and consistently build the correct:

data readers
processors
plotters
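
One way to picture this device-driven construction is a registry keyed by device name. The sketch below is illustrative only: `Pipeline`, `PIPELINES`, `build_pipeline`, and the three stand-in functions are hypothetical names, not part of the framework, and the only value taken from this page is the "Generic" device name.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

Points = List[Tuple[float, ...]]


class Pipeline(NamedTuple):
    """Hypothetical bundle of the three components chosen for a device."""
    reader: Callable[[str], Points]
    processor: Callable[[Points], Points]
    plotter: Callable[[str, Points], str]


def read_columns(text: str) -> Points:
    """Stand-in reader: parse whitespace-separated x/y pairs, one per line."""
    return [
        tuple(map(float, line.split()[:2]))
        for line in text.splitlines()
        if line.strip()
    ]


def sort_by_x(points: Points) -> Points:
    """Stand-in processor: order the points by their x value."""
    return sorted(points)


def describe(label: str, points: Points) -> str:
    """Stand-in plotter: return a summary string instead of drawing."""
    return f"{label}: {len(points)} points"


# Assumed registry; a real framework may select components differently.
PIPELINES: Dict[str, Pipeline] = {
    "Generic": Pipeline(read_columns, sort_by_x, describe),
}


def build_pipeline(device: str) -> Pipeline:
    """Look up the components registered for the dataset's device."""
    try:
        return PIPELINES[device]
    except KeyError:
        raise ValueError(f"no pipeline registered for device {device!r}") from None
```

Keeping the lookup in one place means the device field of a dataset is the only thing a caller needs in order to obtain a matching reader, processor, and plotter.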

Typical lifecycle of a dataset

1. Creation

Datasets are primarily created using the Data Creation Window in the GUI.
Because the window writes the file itself, this avoids the invalid or incomplete JSON that hand-editing can produce.

In this window the user:

names the dataset
selects the device
sets the experiment date & time
adds one or more datafiles and assigns labels
optionally assigns colours
writes notes or comments
saves the dataset JSON to disk

All required metadata is enforced by the window.
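
For files produced outside the window, a similar check could be approximated in a few lines. This is a minimal sketch under assumptions: the set of required keys is a guess based on the steps listed above (colours and notes are treated as optional), the key names are borrowed from the example JSON on this page, and `missing_fields` is a hypothetical helper.

```python
# Assumed required keys; the window's actual rules are not documented here.
REQUIRED_KEYS = ("name", "device", "experiment_date_time", "filepaths")


def missing_fields(raw: dict) -> list:
    """Return the required keys that are absent or empty in a raw dataset dict."""
    return [key for key in REQUIRED_KEYS if not raw.get(key)]


candidate = {
    "name": "Example Set",
    "device": "Generic",
    "experiment_date_time": "2023.03.30_13.19.54",
    "filepaths": {},  # no datafiles added yet
    "colours": {},
    "notes": "",
}

problems = missing_fields(candidate)
if problems:
    print("Dataset is incomplete, missing:", ", ".join(problems))
```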

2. Storage

A dataset is stored as a compact JSON configuration. Key fields include:

dataset name
creation timestamp
experiment timestamp
device identifier
list of files and their labels
optional colours
notes

A custom encoder and decoder handle the saving and loading of DataSet instances from JSON files on disk.
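
The framework's own encoder and decoder are not shown on this page. As a rough illustration of the general technique, Python's standard `json` module allows a `JSONEncoder` subclass for saving and an `object_hook` function for loading. The `DataSet` field layout, `DataSetEncoder`, and `decode_dataset` below are assumptions made for this sketch, with field names borrowed from the example JSON.

```python
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class DataSet:
    # Hypothetical layout; the real class may hold more or different fields.
    name: str
    device: str
    experiment_date_time: str
    filepaths: dict = field(default_factory=dict)
    colours: dict = field(default_factory=dict)
    notes: str = ""


class DataSetEncoder(json.JSONEncoder):
    """Serialise DataSet instances; defer everything else to the base class."""

    def default(self, o):
        if isinstance(o, DataSet):
            return asdict(o)
        return super().default(o)


def decode_dataset(obj: dict):
    """object_hook: turn the top-level JSON object back into a DataSet."""
    required = {"name", "device", "experiment_date_time"}
    if required.issubset(obj):
        known = {f.name for f in fields(DataSet)}
        # Ignore keys this sketch does not model instead of failing on them.
        return DataSet(**{k: v for k, v in obj.items() if k in known})
    return obj  # nested dicts such as "filepaths" pass through unchanged


original = DataSet(
    name="Example Set",
    device="Generic",
    experiment_date_time="2023.03.30_13.19.54",
    filepaths={"Example IV Curve": "path/to/IV_2023_03_30_13_17_28.txt"},
)
text = json.dumps(original, cls=DataSetEncoder, indent=4)
restored = json.loads(text, object_hook=decode_dataset)
```

Because `object_hook` is called for every JSON object, including nested ones, the decoder checks for the identifying keys before constructing a `DataSet`.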

Example (illustrative only):

{
    "location": null,
    "name": "Example Set",
    "creation_date": "2025.12.02_14.19.54",
    "experiment_date_time": "2023.03.30_13.19.54",
    "device": "Generic",
    "notes": "",
    "console": {},
    "structure_type": "flat",
    "filepaths": {
        "Example IV Curve": "path/to/IV_2023_03_30_13_17_28.txt"
    },
    "colours": {}
}

(This example demonstrates the structure only. Do not use it directly.)
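
To show how such a file might be consumed, the sketch below reads a shortened copy of the example with Python's standard `json` module and converts the two timestamps. The format string is inferred from the example values rather than taken from a specification, and `parse_timestamp` is a hypothetical helper.

```python
import json
from datetime import datetime

# Inferred from values such as "2023.03.30_13.19.54"; treat as an assumption.
TIMESTAMP_FORMAT = "%Y.%m.%d_%H.%M.%S"

EXAMPLE = """
{
    "name": "Example Set",
    "creation_date": "2025.12.02_14.19.54",
    "experiment_date_time": "2023.03.30_13.19.54",
    "device": "Generic",
    "filepaths": {
        "Example IV Curve": "path/to/IV_2023_03_30_13_17_28.txt"
    },
    "colours": {}
}
"""


def parse_timestamp(text: str) -> datetime:
    """Convert a dataset timestamp string into a datetime object."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


raw = json.loads(EXAMPLE)
created = parse_timestamp(raw["creation_date"])
experiment = parse_timestamp(raw["experiment_date_time"])

# Each entry in "filepaths" maps a display label to a file path.
for label, path in raw["filepaths"].items():
    print(f"{label} -> {path}")
```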

3. Execution

Once a dataset is loaded in the main GUI:

Its files are automatically listed
The user selects a subset of files to plot
The device determines which plots are valid
Plot options appear immediately
The run happens in a background worker thread
The reader, processor, and plotter cooperate automatically
The result is displayed to the user
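
The steps above can be sketched with the standard library alone. The GUI toolkit and the framework's worker class are not named on this page, so this example uses `concurrent.futures.ThreadPoolExecutor` as a stand-in; `run_selection`, the three placeholder callables, and the second file entry are hypothetical, and the reader returns dummy values instead of opening a file.

```python
from concurrent.futures import ThreadPoolExecutor


def run_selection(filepaths, selected, read, process, plot):
    """Read, process, and plot only the labels the user selected."""
    results = {}
    for label in selected:
        data = process(read(filepaths[label]))
        results[label] = plot(label, data)
    return results


filepaths = {
    "Example IV Curve": "path/to/IV_2023_03_30_13_17_28.txt",
    "Unselected Curve": "path/to/another_file.txt",  # hypothetical entry
}


def read(path):
    # Placeholder values only; a real reader would parse the file at `path`.
    return [3.0, 1.0, 2.0]


def process(data):
    return sorted(data)


def plot(label, data):
    return f"{label}: {len(data)} points"


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(
        run_selection, filepaths, ["Example IV Curve"], read, process, plot
    )
    # A GUI would keep handling events here and pick up the result
    # through a callback or signal instead of blocking.
    results = future.result()
```

Running the work off the main thread is what keeps a GUI responsive while files are read and processed.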

TL;DR

Datasets provide a clean, structured, reproducible way to define:

what the data is
where it lives
how it should be interpreted
which device and plots apply
how files relate to one another
what metadata is needed for processing

Together with the Data Creation Window, datasets transform analysis workflows from scattered scripts into a declarative, maintainable, and shareable system.