qualia_core.dataset.MNIST module
- class qualia_core.dataset.MNIST.IDXType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: IntEnum
List of possible data types of an IDX file.
- UINT8 = 8
- INT8 = 9
- INT16 = 11
- INT32 = 12
- FLOAT32 = 13
- FLOAT64 = 14
- class qualia_core.dataset.MNIST.IDXMagicNumber[source]
Bases: BigEndianStructure
Magic number of IDX file format.
Header of 4 bytes:
- First 2 bytes are always 0
- 3rd byte is the data type, one of IDXType
- 4th byte is the number of dimensions that follow the magic number

- dtype
Structure/Union member
- n_dims
Structure/Union member
- null
Structure/Union member
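The 4-byte header layout described above can be sketched with a `ctypes` big-endian structure. This is only an illustration: the class name `MagicNumberSketch` is hypothetical, and the field widths are an assumption derived from the byte layout in the description, not the library's actual definition.

```python
import ctypes


class MagicNumberSketch(ctypes.BigEndianStructure):
    """Hypothetical stand-in mirroring the documented 4-byte IDX header."""

    _fields_ = [
        ("null", ctypes.c_uint16),   # first 2 bytes, always 0 (assumed width)
        ("dtype", ctypes.c_uint8),   # 3rd byte: data type code (see IDXType)
        ("n_dims", ctypes.c_uint8),  # 4th byte: number of dimensions
    ]


# Example header: unsigned 8-bit data (code 8) with 3 dimensions,
# as used by an MNIST-style image file.
header = bytes([0x00, 0x00, 0x08, 0x03])
magic = MagicNumberSketch.from_buffer_copy(header)

print(magic.null, magic.dtype, magic.n_dims)
```

A structure like this lets the header fields be read by name instead of by byte offset.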
- class qualia_core.dataset.MNIST.MNISTBase(path: str = '', dtype: str = 'float32')[source]
Bases: RawDataset
Base class for MNIST-style datasets (MNIST and Fashion-MNIST).
This class provides common functionality for loading and processing datasets that use the IDX file format. Both MNIST and Fashion-MNIST share the same:
- File format (IDX)
- Image dimensions (28x28 pixels)
- Number of classes (10)
- Dataset sizes (60,000 training, 10,000 test)
The IDX file format is a simple format for vectors and multidimensional matrices of various numerical types. The files are organized as:
- magic number (4 bytes) identifying data type and dimensions
- dimension sizes (4 bytes each)
- data in row-major order
Initialize an MNIST-style dataset.
- Parameters:
path – Directory containing the IDX files
dtype – Data type to convert images to
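The IDX file organization described above (magic number, dimension sizes, row-major data) can be illustrated with a small standard-library parser. This is a minimal sketch, not the loader used by `MNISTBase`: the names `parse_idx` and `IDX_STRUCT_CODES` are hypothetical, and it assumes dimension sizes and data are stored big-endian.

```python
import struct

# Hypothetical mapping from IDX type codes (the IDXType values)
# to struct format characters.
IDX_STRUCT_CODES = {8: "B", 9: "b", 11: "h", 12: "i", 13: "f", 14: "d"}


def parse_idx(buf):
    """Parse an in-memory IDX buffer into (dims, flat list of values)."""
    # Magic number: 2 zero bytes, 1 type byte, 1 dimension-count byte.
    zero, dtype, n_dims = struct.unpack_from(">HBB", buf, 0)
    if zero != 0:
        raise ValueError("not an IDX buffer: first two bytes must be 0")
    # One 4-byte unsigned size per dimension (assumed big-endian).
    dims = struct.unpack_from(f">{n_dims}I", buf, 4)
    count = 1
    for d in dims:
        count *= d
    # Data follows the dimension sizes, in row-major order.
    offset = 4 + 4 * n_dims
    values = struct.unpack_from(f">{count}{IDX_STRUCT_CODES[dtype]}", buf, offset)
    return dims, list(values)


# Build a tiny 2x3 unsigned-byte matrix in IDX layout and parse it back.
sample = bytes([0, 0, 8, 2]) + struct.pack(">II", 2, 3) + bytes(range(6))
dims, values = parse_idx(sample)
print(dims, values)
```

For an MNIST-style image file, the same logic would yield three dimensions (image count, 28, 28); a real loader would typically hand the data section to an array library rather than a Python list.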
- class qualia_core.dataset.MNIST.MNIST(path: str = '', dtype: str = 'float32')[source]
Bases: MNISTBase
Original MNIST handwritten digits dataset.
The MNIST database contains 70,000 grayscale images of handwritten digits (0-9). Each image is 28x28 pixels, with the digit centered in the frame to reduce the preprocessing needed before training.
Dataset split:
- 60,000 training images
- 10,000 test images
Labels:
- 0-9: Corresponding digits
Initialize an MNIST-style dataset.
- Parameters:
path – Directory containing the IDX files
dtype – Data type to convert images to
- class qualia_core.dataset.MNIST.FashionMNIST(path: str = '', dtype: str = 'float32')[source]
Bases: MNISTBase
Fashion MNIST clothing dataset.
A drop-in replacement for MNIST, containing 70,000 grayscale images of clothing items. Each image is 28x28 pixels, following the same format as original MNIST.
Dataset split:
- 60,000 training images
- 10,000 test images
Labels:
- 0: T-shirt/top
- 1: Trouser
- 2: Pullover
- 3: Dress
- 4: Coat
- 5: Sandal
- 6: Shirt
- 7: Sneaker
- 8: Bag
- 9: Ankle boot
Initialize an MNIST-style dataset.
- Parameters:
path – Directory containing the IDX files
dtype – Data type to convert images to
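When inspecting predictions, the Fashion-MNIST label table above is often easier to use as a lookup. The sketch below is illustrative only: `FASHION_MNIST_LABELS` and `label_name` are hypothetical names and are not part of `qualia_core`.

```python
# Hypothetical lookup built from the Fashion-MNIST label table above.
FASHION_MNIST_LABELS = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}


def label_name(index):
    """Return the clothing category for a class index in the range 0-9."""
    return FASHION_MNIST_LABELS[index]


# Translate a few class indices into readable names.
predicted = [9, 0, 7]
names = [label_name(i) for i in predicted]
print(names)
```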