Csmart-Digit Training
User Guide for Csmart Training Software
Last updated
Csmart-Digit AI Training is a command-line tool for training AI models for coffee-classification tasks. It supports multiple architectures and integrates with MLflow for experiment tracking. This guide walks you through installation, training, testing, explainability, exporting, and inference.
Miniconda is a lightweight distribution of Anaconda that makes it easy to manage isolated Python environments. Download and install it from the official Miniconda website.
Before installing dependencies, create an isolated virtual environment using Conda:
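For example, to create an environment named csmart-training with Python 3.10:

```shell
conda create -n csmart-training python=3.10
```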
Activate the virtual environment:
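For the environment created above:

```shell
conda activate csmart-training
```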
Once inside the environment, install the necessary dependencies:
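From the project root (where requirements.txt lives):

```shell
pip install -r requirements.txt
```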
Install the project in editable mode to allow local modifications:
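From the project root:

```shell
pip install -e .
```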
Optionally, install PyTorch with CUDA support by following the official installation instructions on the PyTorch website.
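The exact command depends on your CUDA version; as an illustrative example (verify the current command for your setup on the PyTorch website), a CUDA 12.1 build can be installed with:

```shell
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```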
MLflow is used to track experiments and save model artifacts. If you do not want to run an MLflow server, set uri: ./mlruns in config.yaml so that runs are logged to a local directory instead.
To start the MLflow server, run:
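For example, with a local SQLite backend store and a local artifact directory:

```shell
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifacts --host 127.0.0.1 --port 5000
```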
Access the MLflow UI by opening http://127.0.0.1:5000 in your browser.
The MLflow server is useful for:
Logging model metrics, parameters, and hyperparameters.
Storing checkpoints for model recovery.
Providing a UI to analyze training behavior.
To train with an actual dataset, run:
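For example, with the coffee_multiclass dataset:

```shell
python src/train.py dataset_name=coffee_multiclass
```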
To specify a different model architecture, pass base_model_name as a command-line argument:
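For example, to train with segformer-b5:

```shell
python src/train.py dataset_name=coffee_multiclass base_model_name=segformer-b5
```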
Alternatively, you can modify the config.yaml file directly.
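A minimal sketch of the relevant config.yaml entries; only base_model_name, dataset_name, and uri appear in this guide, and any other keys in your file will differ:

```yaml
# Illustrative excerpt only; your config.yaml may contain additional keys.
base_model_name: segformer-b5
dataset_name: coffee_multiclass
uri: ./mlruns
```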
Supported architectures include:
resnet101
resnet18
resnet34
resnet50
resnext50_32x4d
wide_resnet101_2
convnext_base
convnext_large
vit_base_patch16_224
swin_base_patch4_window7_224
resnext101_64x4d
fused_network
efficientnetb0
segformer-b0
segformer-b1
segformer-b2
segformer-b3
segformer-b4
segformer-b5
The highest accuracies were achieved with convnext_large and segformer-b5.
At the end of training, all relevant files are stored in the MLflow artifacts under training_data. Stored artifacts include:
Metrics
Checkpoints
TensorBoard logs
Training visualizations
After training, run the test script to compute additional evaluation metrics:
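Pass the checkpoint produced during training:

```shell
python src/test.py checkpoint={path_to_ckpt_file}
```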
Test results are saved in the MLflow artifacts under test_data. Each test run generates a new folder, so multiple test runs can be performed and stored separately.
To understand how the model makes decisions, use xai_analysis.py, which applies Explainable AI (XAI) techniques such as GradCAM and Gradient SHAP.
Run with:
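For example:

```shell
python src/xai_analysis.py checkpoint={path_to_ckpt_file}
```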
XAI visualizations are stored in the MLflow artifacts under xai_data, organized by timestamp.
Once the model achieves the desired performance, export it to ONNX format for inference:
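For example:

```shell
python src/export.py checkpoint={path_to_ckpt_file}
```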
After exporting, run inference with the ONNX model:
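For example:

```shell
python src/predict.py onnx_weights={path_to_onnx_file}
```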
Quick command reference

Create the virtual environment:
conda create -n csmart-training python=3.10

Activate the virtual environment:
conda activate csmart-training

Install dependencies:
pip install -r requirements.txt

Install the project in editable mode:
pip install -e .

Install PyTorch with GPU support (optional): follow the instructions on the PyTorch website.

Start the MLflow server:
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifacts --host 127.0.0.1 --port 5000

Train a model on a real dataset:
python src/train.py dataset_name=coffee_multiclass

Train a model with a specific architecture:
python src/train.py dataset_name=coffee_multiclass base_model_name=segformer-b5

Change the base model in the config file: edit config.yaml and set base_model_name.

Test the trained model:
python src/test.py checkpoint={path_to_ckpt_file}

Run XAI analysis:
python src/xai_analysis.py checkpoint={path_to_ckpt_file}

Export the model to ONNX:
python src/export.py checkpoint={path_to_ckpt_file}

Run inference with the ONNX model:
python src/predict.py onnx_weights={path_to_onnx_file}
🚀 Now you’re ready to train and deploy your AI models efficiently!