Csmart-Digit Training
User Guide for Csmart Training Software
Introduction
Csmart Digit AI Training is a command-line tool for training AI models for coffee classification tasks. It supports multiple model architectures and integrates with MLflow for experiment tracking. This guide walks you through installation, training, testing, explainability analysis, model export, and inference.
1. Installation
1.1. Install Miniconda
Miniconda is a lightweight version of Anaconda that helps manage Python environments efficiently. Download and install it from:
1.2. Create a Virtual Environment
Before installing dependencies, create an isolated virtual environment using Conda:
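Using the environment name and Python version listed in the Summary of Commands:

```shell
conda create -n csmart-training python=3.10
```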
Activate the virtual environment:
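Activation uses the same environment name:

```shell
conda activate csmart-training
```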
1.3. Install Requirements
Once inside the environment, install the necessary dependencies:
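As listed in the Summary of Commands:

```shell
pip install -r requirements.txt
```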
1.4. Install the Project
Install the project in editable mode to allow local modifications:
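From the project root:

```shell
pip install -e .
```

Editable mode means changes to the source code take effect without reinstalling.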
1.5. Install PyTorch with GPU Support
Install PyTorch with CUDA support by following the instructions at:
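As a rough example, a CUDA-enabled install typically looks like the following; the exact command, package set, and CUDA version depend on your system, so use the selector on the PyTorch website rather than copying this verbatim:

```shell
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```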
2. Running the MLflow Server (Optional)
MLflow is used to track experiments and save model artifacts. If you do not want to use an MLflow server, set uri: ./mlruns in config.yaml to log to the local filesystem instead.
To start the MLflow server, run:
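The full invocation, taken from the Summary of Commands, uses a local SQLite backend store and a local artifact root:

```shell
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifacts --host 127.0.0.1 --port 5000
```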
Access the MLflow UI by opening http://127.0.0.1:5000 (the host and port passed to mlflow server) in your browser.
Why MLflow?
The MLflow server is useful for:
Logging model metrics, parameters, and hyperparameters.
Storing checkpoints for model recovery.
Providing a UI to analyze training behavior.
3. Training a Model
3.1. Training with a Real Dataset
To train with an actual dataset, run:
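Using the multiclass coffee dataset named in the Summary of Commands:

```shell
python src/train.py dataset_name=coffee_multiclass
```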
3.2. Changing the Base Model
To specify a different model architecture, pass base_model_name as a command-line argument:
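For example, to train with the segformer-b5 backbone:

```shell
python src/train.py dataset_name=coffee_multiclass base_model_name=segformer-b5
```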
Alternatively, you can modify the config.yaml file directly.
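The corresponding entry in config.yaml would look something like the following minimal sketch; only the base_model_name key is documented in this guide, and any surrounding keys in your config file should be left as they are:

```yaml
base_model_name: segformer-b5
```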
3.3. Available Models
Supported architectures include:
resnet101
resnet18
resnet34
resnet50
resnext50_32x4d
wide_resnet101_2
convnext_base
convnext_large
vit_base_patch16_224
swin_base_patch4_window7_224
resnext101_64x4d
fused_network
efficientnetb0
segformer-b0
segformer-b1
segformer-b2
segformer-b3
segformer-b4
segformer-b5
The highest accuracy was achieved with convnext_large and segformer-b5.
3.4. Where Are the Trained Files Stored?
At the end of training, all relevant files are stored in MLflow artifacts under training_data.
Stored artifacts include:
Metrics
Checkpoints
TensorBoard logs
Training visualizations
4. Testing a Model
After training, run the test script to compute additional evaluation metrics:
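Here {path_to_ckpt_file} is a placeholder for the checkpoint produced during training:

```shell
python src/test.py checkpoint={path_to_ckpt_file}
```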
4.1. Where Are Test Results Stored?
Test results are saved in MLflow artifacts under test_data. Each test run generates a new folder, so multiple test runs can be performed and saved separately.
5. Explainable AI (XAI) Analysis
To understand how the model makes decisions, use xai_analysis.py. This script applies Explainable AI (XAI) techniques such as GradCAM and Gradient SHAP.
Run with:
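As with testing, {path_to_ckpt_file} is a placeholder for a trained checkpoint:

```shell
python src/xai_analysis.py checkpoint={path_to_ckpt_file}
```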
5.1. Where Are XAI Results Stored?
XAI visualizations are stored in MLflow artifacts under xai_data, organized by timestamps.
6. Exporting the Model
Once the model achieves the desired performance, export it to ONNX format for inference:
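The export script takes the same checkpoint placeholder as the test and XAI scripts:

```shell
python src/export.py checkpoint={path_to_ckpt_file}
```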
7. Running Predictions with an Exported Model
After exporting, run inference with the ONNX model:
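Here {path_to_onnx_file} is a placeholder for the ONNX file produced by the export step:

```shell
python src/predict.py onnx_weights={path_to_onnx_file}
```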
Summary of Commands
| Task | Command |
| --- | --- |
| Create virtual environment | `conda create -n csmart-training python=3.10` |
| Activate virtual environment | `conda activate csmart-training` |
| Install dependencies | `pip install -r requirements.txt` |
| Install project | `pip install -e .` |
| Install PyTorch with GPU support (optional) | Follow the official PyTorch installation instructions |
| Start MLflow server | `mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifacts --host 127.0.0.1 --port 5000` |
| Train a model with real dataset | `python src/train.py dataset_name=coffee_multiclass` |
| Train a model with a specific architecture | `python src/train.py dataset_name=coffee_multiclass base_model_name=segformer-b5` |
| Modify base model inside config file | Edit `config.yaml` and change `base_model_name` |
| Test the trained model | `python src/test.py checkpoint={path_to_ckpt_file}` |
| Run XAI analysis | `python src/xai_analysis.py checkpoint={path_to_ckpt_file}` |
| Export model to ONNX | `python src/export.py checkpoint={path_to_ckpt_file}` |
| Run inference using ONNX model | `python src/predict.py onnx_weights={path_to_onnx_file}` |
🚀 Now you’re ready to train and deploy your AI models efficiently!