# Getting Started with Qualia ## Introduction This guide will help you get up and running with Qualia, a modular system for machine learning experimentation and deployment. We will show you a very simple installation and exemple. If you want more detail about the different part we advice you to seek the answers in the documentation : - [Installation](Installation) - [Usage](../UserGuide/Usage) - [ConfigurationFile](../UserGuide/ConfigurationFile) - [Components](../UserGuide/Components) ## Prerequisites Before you begin, ensure you have: - Python (version 3.9-3.12, recommended: 3.12) - Git (for cloning repositories) - Basic familiarity with command line operations ## Installation Options Please use one of the installation procedure listed in {doc}`Installation` or {doc}`UsingDocker`. ## Tutorial: Your First Qualia Project Let's create a practical example using the UCI Human Activity Recognition (HAR) dataset. ### 1. Project Setup First, create a new project directory: ```bash mkdir qualia-project cd qualia-project ``` ### 2. Dataset Preparation 1. Download the UCI HAR dataset from [UCI HAR](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones) 2. Extract into `data/` directory ``` qualia-project/ ├── data/ │ └── UCI HAR Dataset/ # Place extracted dataset here │ ├── train/ │ └── test/ └── config.toml # We'll create this next ``` ### 3. Configuration Create `config.toml` with the following content: ```toml [bench] name = "UCI-HAR_CNN_example" seed = 2 first_run = 1 last_run = 1 [learningframework] kind = "PyTorch" [dataset] kind = "UCI_HAR" params.variant = "raw" params.path = "data/UCI HAR Dataset/" # Data preprocessing steps [[preprocessing]] kind = "DatamodelConverter" [[preprocessing]] kind = "Class2BinMatrix" # Model configuration [model_template] kind = "CNN" epochs = 10 batch_size = 32 params.batch_norm = true [model_template.optimizer] kind = "Adam" params.lr = 0.001 # CNN architecture [[model]] name = "uci-har_cnn_simple" params.filters = [5, 5] # Two conv layers with 5 filters each params.kernel_sizes = [2, 2] # 2x2 kernels params.pool_sizes = [2, 0] # Pooling after first conv layer # Optional: Deployment configuration [deploy] target = 'Linux' converter.kind = 'QualiaCodeGen' quantize = ['float32'] optimize = [''] limit = 50 ``` ### 4. Running the Experiment Execute these commands in order: ```bash # 1. Preprocess the dataset qualia config.toml preprocess_data # 2. Train the model qualia config.toml train # Optional: Deploy and evaluate qualia config.toml prepare_deploy qualia config.toml deploy_and_evaluate ``` ### Understanding the Output - Processed dataset: `out/data//` - Model weights: `out/learningmodel/` - Training logs: `log//learningmodel/` - Deployment files: `out//` - Evaluation results: `log//evaluate/` ### Qualia's commands #### Data Preprocessing ```bash qualia config.toml preprocess_data ``` Loads UCI_HAR dataset specified in `[dataset]` section: ```toml [dataset] kind = "UCI_HAR" params.variant = "raw" params.path = "data/UCI HAR Dataset/" ``` Applies preprocessing steps from `[[preprocessing]]`: 1. DatamodelConverter: Converts activities and subjects into data/label arrays 2. Class2BinMatrix: Converts class numbers to one-hot encoding ```toml [[preprocessing]] kind = "DatamodelConverter" [[preprocessing]] kind = "Class2BinMatrix" ``` Exports preprocessed NumPy arrays to `out/data/UCI_HAR_raw/`: - `train_data.npy`, `train_labels.npy` - `test_data.npy`, `test_labels.npy` #### Model Training ```bash qualia config.toml train ``` Uses PyTorch framework defined in `[learningframework]`: ```toml [learningframework] kind = "PyTorch" ``` Trains model from `[[model]]` using shared CNN configuration: ```toml [model_template] kind = "CNN" epochs = 10 batch_size = 32 [[model]] name = "uci-har_cnn_simple" params.filters = [5, 5] # Two conv layers params.kernel_sizes = [2, 2] # Two conv layers with kernel of length 2 params.pool_sizes = [2, 0] # First layer pooling params.dims = 1 # 1D convolutions ``` Saves trained model in `out/learningmodel/`. #### Deployment Preparation ```bash qualia config.toml prepare_deploy ``` Converts trained model to embedded format using `[deploy]` settings: ```toml [deploy] target = 'Linux' # Host CPU converter.kind = 'QualiaCodeGen' # C code generator quantize = ['float32'] # Precision optimize = [''] # Optimizations ``` Generates C code in `out/qualia_codegen` and compiles for Linux host in `out/deploy/Linux`.