Image Search

Finding visually similar images across datasets using color histograms or neural embeddings

The Image Search screen helps you find visually similar images within a dataset. This is useful for locating duplicates, verifying class consistency, or finding images that closely resemble a reference set. You provide a folder of reference images, a folder to search through, and the application copies matching images to a destination folder.

Two comparison methods are available: HSV Cosine Similarity (fast, color-based) and Checkpoint Embeddings (slower, semantically aware using a trained model).

Prerequisites

  • An open project.

  • A folder of reference images representing what you want to find.

  • A folder of images to search through (this can be the reference folder itself or a different folder).

  • For the checkpoint method: a trained .ckpt model file.

Step-by-Step Walkthrough

1. Select the Reference Images Folder

Click Select next to the first field and choose the folder containing your reference images. These are the images the application will try to find matches for. After selection, the detected image count is displayed.

2. Select the Search Folder

Click Select next to the second field and choose the folder to search through. This folder is scanned for images similar to your references. The image count is displayed after selection.
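The image count shown after selecting a folder can be approximated outside the application. The sketch below is an assumption, not the application's own logic: the extension list is hypothetical, and it counts only files directly inside the folder, without descending into subfolders.

```python
from pathlib import Path

# Hypothetical extension list; the formats the application actually detects
# are not documented on this page.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def count_images(folder):
    """Count files directly inside `folder` whose extension looks like an image."""
    return sum(
        1
        for entry in Path(folder).iterdir()
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS
    )
```

If your own count differs from the one displayed, the application may recognize other formats or scan subfolders.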

3. Select the Destination Folder

Click Select next to the third field to choose where matched images will be copied. If the destination already contains files, the application prompts you with a confirmation dialog before proceeding.
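To see in advance whether the confirmation dialog is likely to appear, you can check the destination yourself. This is a minimal sketch that assumes "contains files" means the folder has at least one entry of any kind. `destination_has_files` is a hypothetical helper, not part of the application.

```python
from pathlib import Path

def destination_has_files(folder):
    """Return True if `folder` exists and already contains at least one entry."""
    path = Path(folder)
    return path.is_dir() and any(path.iterdir())
```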

4. Choose a Comparison Method

Select the comparison approach from the dropdown:

| Method | Speed | How It Works | Best For |
| --- | --- | --- | --- |
| HSV Cosine Similarity | Fast | Compares color histograms in HSV color space | Finding visually similar images by color and tone |
| Checkpoint Embeddings (.ckpt) | Slower | Extracts neural network feature vectors and compares them | Finding semantically similar images using a trained model |

If you select Checkpoint Embeddings, an additional field appears where you must select a .ckpt model file.
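To illustrate the idea behind the HSV method, the sketch below builds a normalized HSV color histogram from a list of RGB pixels and compares two histograms by cosine distance (1 minus cosine similarity). It uses only the Python standard library. The bin counts, the function names, and the choice of cosine distance as the exact measure are assumptions made for illustration. The application's own script may differ in all of them.

```python
import colorsys
import math

def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Build a normalized, flattened H/S/V histogram from (r, g, b) tuples in 0-255.

    The bin counts are an assumption chosen for illustration.
    """
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # Treat an empty histogram as maximally distant.
    return 1.0 - dot / (norm_a * norm_b)
```

Because histogram entries are never negative, this distance always falls between 0 and 1. Two identical images give a distance of 0, while a pure red image and a pure blue image land in different bins and give a distance of 1. The same `cosine_distance` function could in principle be applied to embedding vectors extracted from a model, although embeddings can contain negative values, in which case the distance may exceed 1.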

5. Configure Matching Parameters

Distance Threshold — Controls how strict the matching is. Values range from 0 to 1, with lower values requiring closer matches. The step size is 0.05.

  • HSV method: recommended threshold around 0.10–0.30

  • Checkpoint method: recommended threshold around 0.50–0.80

Top K — Limits the maximum number of matches per reference image. Set to 0 for unlimited matches. Use a specific value if you only want the closest N matches.
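The interaction between the two parameters can be sketched as follows: for one reference image, keep the candidates whose distance is within the threshold, sort them from closest to farthest, then cut the list to Top K unless Top K is 0. The function name, the input format, and the inclusive comparison (`<=`) are assumptions for illustration.

```python
def select_matches(distances, threshold, top_k=0):
    """Pick matches for one reference image.

    distances: dict mapping candidate file name -> distance to the reference.
    threshold: maximum distance to accept (treated as inclusive here, an assumption).
    top_k:     maximum number of matches to keep; 0 means unlimited.
    """
    kept = [(name, d) for name, d in distances.items() if d <= threshold]
    kept.sort(key=lambda item: item[1])  # closest matches first
    return kept if top_k == 0 else kept[:top_k]


# Hypothetical distances for four candidate images.
distances = {"a.jpg": 0.05, "b.jpg": 0.40, "c.jpg": 0.20, "d.jpg": 0.10}
unlimited = select_matches(distances, threshold=0.30)
closest_two = select_matches(distances, threshold=0.30, top_k=2)
```

With these sample values, `unlimited` holds `a.jpg`, `d.jpg`, and `c.jpg` in that order, and `closest_two` holds only `a.jpg` and `d.jpg`. Raising the threshold to 0.40 or above would also admit `b.jpg`.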

Click Find Similar to begin. The application validates all paths, launches the Python similarity search script, and streams progress to the log monitor.

When the search completes successfully, a green results box displays the total number of matches found and images copied. The destination folder opens automatically in your file explorer.

Click Stop at any time to cancel. The sidebar is locked during execution to prevent accidental navigation.

Troubleshooting

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| No matches found | Threshold is too strict | Increase the distance threshold value |
| Too many false matches | Threshold is too loose | Decrease the threshold, or switch to the checkpoint method for a more semantic comparison |
| Checkpoint method fails | Invalid or incompatible model file | Ensure the .ckpt file is from a completed training session |
