Outlier Detection

Identifying and removing anomalous images from your dataset using Isolation Forest

The Outlier Detection screen uses the Isolation Forest algorithm to identify anomalous images within a class folder. Outlier images — those that deviate significantly from the rest of the class — can degrade model performance if included in training. This tool helps you find and separate them before training begins.

The tool extracts feature embeddings from your images using a trained checkpoint, then applies the Isolation Forest algorithm to those embeddings to flag statistical anomalies. Images flagged as outliers are moved to a separate subfolder for your review.
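The overall pipeline can be sketched with scikit-learn's `IsolationForest`. This is illustrative only: in the application, the embeddings come from the selected .ckpt model, whereas here random vectors stand in for 100 images.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Stand-in embeddings: one 512-d feature vector per image.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 512))
embeddings[:5] += 8.0  # shift 5 images far from the class cluster

# contamination = expected fraction of outliers (the slider value).
forest = IsolationForest(contamination=0.05, random_state=0)
labels = forest.fit_predict(embeddings)  # -1 = outlier, 1 = inlier

outlier_indices = np.where(labels == -1)[0]
print(outlier_indices)
```

With `contamination=0.05`, the forest thresholds its anomaly scores so that roughly 5% of the images are flagged; the application then moves the corresponding files into the outlier subfolder.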

Prerequisites

  • An open project.

  • A class folder containing images of a single class to analyze.

  • A trained checkpoint (.ckpt file) from a previous training session.

Step-by-Step Walkthrough

1. Select the Class Folder

Click Select next to the first field and choose a folder containing images of a single class. The detected image count is displayed after selection.

Run outlier detection on one class at a time. For example, select Dataset/Split/V00/train/OK/ to check for outliers in the OK class.

2. Select the Model Checkpoint

Click Select next to the model field and choose a .ckpt file from a previous training session. The application validates that the file has a supported extension.

If you select an .onnx file by mistake, an orange warning box appears explaining that only .ckpt files are supported.
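The extension check is straightforward; a sketch of equivalent logic (the function name is hypothetical, not the application's actual API):

```python
from pathlib import Path
from typing import Optional

def validate_checkpoint(path: str) -> Optional[str]:
    """Hypothetical helper mirroring the app's validation.

    Returns a warning message for unsupported files, or None if the
    file is an accepted .ckpt checkpoint.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".ckpt":
        return None
    if suffix == ".onnx":
        return "Only .ckpt files are supported; .onnx models cannot be used here."
    return f"Unsupported file type: {suffix or '(none)'}"
```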

3. Adjust the Contamination Factor

The contamination slider controls the expected proportion of outliers in the dataset. It ranges from 0.01 (1%) to 0.50 (50%), with a step size of 0.01.

  • A low value (0.01–0.05) assumes very few outliers and only flags the most extreme anomalies.

  • A high value (0.10–0.30) assumes more outliers and flags images more aggressively.

After adjusting the slider, click Save to persist the value. If you navigate away with unsaved changes, a confirmation dialog asks whether to save or discard.

Start with a low contamination factor (e.g., 0.05) and review the results. If too many valid images are flagged, lower the value. If obvious outliers are missed, increase it.
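As a rough rule, the number of flagged images is about contamination × image count, which helps when picking a starting value:

```python
def expected_flagged(n_images: int, contamination: float) -> int:
    """Approximate number of images Isolation Forest will flag.

    The algorithm thresholds its anomaly scores at the contamination
    percentile, so the flagged count tracks this estimate closely.
    """
    return round(n_images * contamination)

print(expected_flagged(1000, 0.05))  # about 50 of 1000 images
```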

4. Run Outlier Detection

Click Run Outlier Detection to start. The application automatically saves any pending contamination changes, then launches the Python outlier detection script.

Progress is streamed to the log monitor on the right panel. When complete, the results folder opens automatically in your file explorer, showing the separated outlier images.

Stopping the Process

Click Stop to cancel. The sidebar is locked during execution to prevent navigation.

Reviewing Results

After detection completes, review the flagged images:

  1. The outlier images are moved to a separate subfolder within the class folder.

  2. Open the folder and visually inspect each image.

  3. Restore false positives — move images that were incorrectly flagged back to the main class folder.

  4. Confirm true outliers — leave genuinely anomalous images in the outlier folder (they will be excluded from training).
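Restoring a false positive is just moving the file back into the class folder. A minimal sketch, assuming a flat layout where the outlier subfolder sits inside the class folder (check the folder name your version actually creates):

```python
from pathlib import Path
import shutil

def restore_image(class_dir: str, outlier_subdir: str, filename: str) -> None:
    """Move a wrongly flagged image from the outlier subfolder back
    into the main class folder. Folder names are illustrative."""
    src = Path(class_dir) / outlier_subdir / filename
    dst = Path(class_dir) / filename
    shutil.move(str(src), str(dst))
```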

Configuration Persistence

The contamination value and model path are stored in the project's config.yaml file. The contamination is saved under outlier_detection.contamination and the model path under preclassification.model_path (shared with the pre-classification screen).
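The resulting entries in config.yaml look roughly like this (the path and value are examples, not defaults):

```yaml
outlier_detection:
  contamination: 0.05                  # slider value, 0.01-0.50
preclassification:
  model_path: Checkpoints/model.ckpt   # shared with the pre-classification screen
```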

Troubleshooting

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| Warning about ONNX model | Selected an .onnx file instead of .ckpt | Choose a .ckpt checkpoint file from your Checkpoints folder |
| Too many images flagged | Contamination factor is too high | Lower the contamination slider and run again |
| No outliers found | Contamination factor is too low, or the dataset is very clean | Increase the contamination factor slightly; the dataset may genuinely contain no outliers |
| Process fails immediately | Python runtime is not installed | Install the runtime from Hardware Settings |
