Outlier Detection
Identifying and removing anomalous images from your dataset using Isolation Forest
The Outlier Detection screen uses the Isolation Forest algorithm to identify anomalous images within a class folder. Outlier images — those that deviate significantly from the rest of the class — can degrade model performance if included in training. This tool helps you find and separate them before training begins.
The algorithm works by extracting feature embeddings from a trained checkpoint and then applying statistical anomaly detection. Images flagged as outliers are moved to a separate subfolder for your review.
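Conceptually, the pipeline can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the application's actual script: the embeddings below are synthetic stand-ins for features extracted from the checkpoint.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for per-image feature embeddings, e.g. activations
# taken from an internal layer of the trained checkpoint
# (shape: n_images x embedding_dim).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))
embeddings[:5] += 6.0  # simulate 5 images that sit far from the class cluster

# contamination corresponds to the slider value (expected outlier fraction).
forest = IsolationForest(contamination=0.05, random_state=0)
labels = forest.fit_predict(embeddings)  # -1 = outlier, 1 = inlier

outlier_indices = np.flatnonzero(labels == -1)
print("flagged image indices:", outlier_indices.tolist())
```

The images whose embeddings are flagged with -1 are the ones the tool moves into the outlier subfolder for review.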
Prerequisites
An open project.
A class folder containing images of a single class to analyze.
A trained checkpoint (.ckpt file) from a previous training session.
Outlier detection requires a .ckpt checkpoint file. ONNX models (.onnx) are not supported for this operation because the algorithm needs access to the model's internal feature layers.
Step-by-Step Walkthrough
1. Select the Class Folder
Click Select next to the first field and choose a folder containing images of a single class. The detected image count is displayed after selection.
Run outlier detection on one class at a time. For example, select Dataset/Split/V00/train/OK/ to check for outliers in the OK class.
2. Select the Model Checkpoint
Click Select next to the model field and choose a .ckpt file from a previous training session. The application validates that the file has a supported extension.
If you select an .onnx file by mistake, an orange warning box appears explaining that only .ckpt files are supported.
3. Adjust the Contamination Factor
The contamination slider controls the expected proportion of outliers in the dataset. It ranges from 0.01 (1%) to 0.50 (50%), with a step size of 0.01.
A low value (0.01–0.05) assumes very few outliers and only flags the most extreme anomalies.
A high value (0.10–0.30) assumes more outliers and flags images more aggressively.
After adjusting the slider, click Save to persist the value. If you navigate away with unsaved changes, a confirmation dialog asks whether to save or discard.
Start with a low contamination factor (e.g., 0.05) and review the results. If too many valid images are flagged, lower the value. If obvious outliers are missed, increase it.
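To see how the slider value translates into flagged counts, here is a small scikit-learn experiment on synthetic data (illustrative only; random features stand in for real image embeddings):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
features = rng.normal(size=(200, 32))  # stand-in for image embeddings

n_flagged = {}
for contamination in (0.05, 0.20):
    labels = IsolationForest(
        contamination=contamination, random_state=0
    ).fit_predict(features)
    n_flagged[contamination] = int((labels == -1).sum())
    print(f"contamination={contamination:.2f} -> "
          f"{n_flagged[contamination]} of 200 images flagged")
```

Roughly contamination × n images are flagged, which is why a high setting on a clean dataset mostly produces false positives.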
4. Run Outlier Detection
Click Run Outlier Detection to start. The application automatically saves any pending contamination changes, then launches the Python outlier detection script.
Progress is streamed to the log monitor on the right panel. When complete, the results folder opens automatically in your file explorer, showing the separated outlier images.
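The streaming behavior can be approximated in a few lines: launch the script as a subprocess and forward its stdout line by line. This is a generic sketch, not the application's own launcher; the trivial inline script is just a stand-in for the detection script.

```python
import subprocess
import sys

def stream_script(cmd):
    """Run a command and yield its output line by line as it is produced."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for line in proc.stdout:
        yield line.rstrip("\n")
    proc.wait()

# Example: stream the output of a trivial stand-in script.
lines = list(
    stream_script([sys.executable, "-c", "print('step 1'); print('step 2')"])
)
for line in lines:
    print(line)
```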
Stopping the Process
Click Stop to cancel a running detection. While the process runs, the sidebar is locked to prevent navigating away mid-execution.
Reviewing Results
After detection completes, review the flagged images:
The outlier images are moved to a separate subfolder within the class folder.
Open the folder and visually inspect each image.
Restore false positives — move images that were incorrectly flagged back to the main class folder.
Confirm true outliers — leave genuinely anomalous images in the outlier folder (they will be excluded from training).
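Restoring a false positive is just a file move. A helper like the following could script it (a sketch; "outliers" is a hypothetical subfolder name here — use whatever name the tool actually created inside your class folder):

```python
import shutil
import tempfile
from pathlib import Path

def restore_image(class_dir: Path, outlier_subdir: str, filename: str) -> Path:
    """Move a wrongly flagged image back from the outlier subfolder
    to the main class folder."""
    src = class_dir / outlier_subdir / filename
    dst = class_dir / filename
    shutil.move(str(src), str(dst))
    return dst

# Demo on a throwaway directory tree ("outliers" is a hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    class_dir = Path(tmp) / "OK"
    (class_dir / "outliers").mkdir(parents=True)
    (class_dir / "outliers" / "img_001.png").write_bytes(b"")
    restored = restore_image(class_dir, "outliers", "img_001.png")
    print("restored:", restored.name)
```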
Configuration Persistence
The contamination value and model path are stored in the project's config.yaml file. The contamination is saved under outlier_detection.contamination and the model path under preclassification.model_path (shared with the pre-classification screen).
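Under those keys, the relevant part of config.yaml might look like this (values illustrative):

```yaml
outlier_detection:
  contamination: 0.05                  # value set by the slider
preclassification:
  model_path: Checkpoints/model.ckpt   # illustrative path; shared with the pre-classification screen
```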
Troubleshooting
Warning about ONNX model
Cause: An .onnx file was selected instead of a .ckpt file.
Fix: Choose a .ckpt checkpoint file from your Checkpoints folder.

Too many images flagged
Cause: The contamination factor is too high.
Fix: Lower the contamination slider and run again.

No outliers found
Cause: The contamination factor is too low, or the dataset is very clean.
Fix: Increase the contamination factor slightly; the dataset may also genuinely have no outliers.

Process fails immediately
Cause: The Python runtime is not installed.
Fix: Install the runtime from Hardware Settings.