Simple Image Compare Tool

Simple image comparison tool that detects color and face similarities using CLIP embeddings (default) and color matching (separate optional mode). The tool now supports multiple embedding models:

CLIP (default): 512D embeddings, high zero-shot performance
SigLIP: 768D or 1024D embeddings, excellent retrieval performance
ALIGN: 640D embeddings, high accuracy for retrieval
FLAVA: 768D embeddings, good for complex reasoning
X-VLM: 256D embeddings, efficient for region-text tasks - requires local copy of X-VLM
LAION: 1024D embeddings, high-quality visual-language understanding - based on CLIP ViT-H/14 architecture

Each model offers different tradeoffs between accuracy, speed, and resource usage. The default CLIP model provides a good balance for most use cases.
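
Because each model produces embeddings of a different size, embeddings are only comparable when they come from the same model. The sketch below illustrates that constraint with a cosine similarity function. Cosine similarity is a common way to compare embeddings and is an assumption here, not a confirmed detail of this tool; the function name and the toy vectors are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two embedding vectors of equal length."""
    if len(a) != len(b):
        # e.g. a 512D CLIP vector cannot be compared with a 768D FLAVA vector
        raise ValueError(f"Embedding sizes differ: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as unrelated
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3D vectors standing in for real embeddings
image_a = [0.2, 0.9, 0.1]
image_b = [0.25, 0.85, 0.05]
similarity = cosine_similarity(image_a, image_b)
```

Switching models therefore means regenerating embeddings for the whole directory before results are meaningful.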

Image and Video Browser

The UI can be used as a media file browser. The following features are available that your OS default photo application may not have:

Auto-resize images to fill the screen
Auto-refresh directory files
Slideshow (customizable)
Optionally play and compare video files and other media; the comparison typically uses the first image found in the file.
Quicker and smoother transitions between images
Faster load time for directories with many images in some cases
Faster load times when switching between sort types
Go to file by string search or by index (1-based)
Mark groups of files to enable quick transitions and comparisons
Mark favorite media and access them quickly via the Favorites window
Move, copy, and delete marked file groups without overwriting system clipboard
Revert and modify historical file action changes
Quickly find directories via recent directory picker window
Stores session info about seen directories (useful for directories with many images)
Can be set up to run on user-defined list of files in place of a directory
Extension with sd-runner for image generation
Extension with refacdir for file operations
Find related images and prompts from embedded Stable Diffusion workflows
Sort files by related images and prompts
View raw image metadata
Content filtering of images and videos based on their text encoding similarity (automatically hide, move to a directory, delete, etc.)
Create PDFs from marked files with customizable quality and compression options
Password protection system for sensitive operations with configurable session timeouts

For image files, zoom and drag functionality is available in both browsing mode as well as when viewing grouped media after a comparison has been run.

Note that, depending on your configuration, videos, GIFs, PDFs, SVGs, and HTML files may not be included. To include them, open the filetype configuration window with Ctrl+J and turn them on.

Favorites Window

You can mark any media file (image, video, etc.) as a favorite and access all favorites quickly using the Favorites window (Ctrl+F). This is especially useful when working with directories containing many files, as it allows you to keep persistent preferred items easily accessible for future searches and actions.

Directory Notes

The Directory Notes feature allows you to maintain persistent notes and marked files for individual directories. You can add notes to specific files, mark files for later reference, and export or import your notes and marked files as text or JSON files. This is separate from the runtime marked files used for moving files, making it useful for long-term organization and documentation of your media collections.

Prevalidation Rules

The tool includes a flexible prevalidation system that can automatically process media before they're shown to the user. This is useful for:

Automatically skipping, hiding, or deleting unwanted media
Moving or copying media to specific directories based on content
Filtering media using CLIP embeddings, H5 image classifiers, PyTorch image classifiers, or prompt string detection
Setting up rules that apply to specific directories

Prevalidation rules can be configured with:

Multiple validation types enabled simultaneously (OR logic - any type can trigger the action)
Positive and negative text prompts shared across embedding and prompt validation
Custom thresholds for embedding-based matching
Different actions (skip, hide, notify, move, copy, delete, add mark)
Directory-specific rules
H5 model-based classification rules
PyTorch model-based classification rules (supports .pth, .pt, .safetensors, and .bin formats)

This feature is particularly useful for maintaining clean media collections and automating local content filtering, but it can be disabled at any time if desired. You can find an example H5 classifier that is known to work here.
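
The OR logic described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the `PrevalidationRule` class, the check helpers, and the `media` dictionary keys are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PrevalidationRule:
    """Hypothetical rule: one action plus any number of validation checks."""
    name: str
    action: str  # e.g. "skip", "hide", "notify", "move", "copy", "delete"
    checks: List[Callable[[dict], bool]] = field(default_factory=list)

    def triggers(self, media: dict) -> bool:
        # OR logic: any single check returning True triggers the action
        return any(check(media) for check in self.checks)


def embedding_check(threshold: float) -> Callable[[dict], bool]:
    """Trigger when a precomputed similarity score meets the threshold."""
    return lambda media: media.get("embedding_similarity", 0.0) >= threshold


def prompt_check(text: str) -> Callable[[dict], bool]:
    """Trigger when the text appears in the media's embedded prompt."""
    return lambda media: text.lower() in media.get("prompt", "").lower()


def first_action(rules: List[PrevalidationRule], media: dict) -> Optional[str]:
    """Return the action of the first rule that triggers, or None."""
    for rule in rules:
        if rule.triggers(media):
            return rule.action
    return None


rule = PrevalidationRule("hide_cats", "hide", [embedding_check(0.3), prompt_check("cat")])
```

Here a file with a low embedding score is still hidden if its prompt contains "cat", because either check alone is enough.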

Usage

Clone this repository and ensure Python 3 and the required packages are installed from requirements.txt.

Run app.py to start the UI, or provide the location of the directory containing images for comparison to compare_embeddings.py or compare.py at runtime.

Useful for detecting duplicates or finding associations between large unstructured sets of image files. File management controls are available after the image analysis has completed.

Individual images can be passed to search against the full image data set by passing flag --search with the path of the search file, or setting a search file in the UI before running comparison.

The color matching compare mode is faster than embedding comparison but less robust. In the group comparison case, every image must be compared to every other image, so the time complexity is $\mathcal{O}(n^2)$. For large image sets, enable the store_checkpoints config setting to cache progress so that you can close the application and later pick up where you left off. Ensure no files are added to or removed from the comparison directory before restarting a compare.
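
To make the quadratic growth and the idea of checkpointing concrete, here is a small sketch. It is an assumption-laden illustration only: `pair_count`, `compare_all`, the JSON checkpoint layout, and the `max_pairs` parameter are hypothetical and do not describe how store_checkpoints is implemented.

```python
import itertools
import json
import os


def pair_count(n):
    """Number of unordered image pairs for n images: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def compare_all(files, compare, checkpoint_path, max_pairs=None):
    """Compare every pair of files, saving results so a later call can resume.

    max_pairs simulates closing the application partway through a run.
    """
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        # Only reuse the checkpoint if the file list is unchanged
        if saved["files"] == files:
            done = saved["results"]
    processed = 0
    for a, b in itertools.combinations(files, 2):
        key = a + "|" + b  # illustrative key format
        if key in done:
            continue  # already compared in an earlier run
        if max_pairs is not None and processed >= max_pairs:
            break
        done[key] = compare(a, b)
        processed += 1
    with open(checkpoint_path, "w") as f:
        json.dump({"files": files, "results": done}, f)
    return done
```

For 1,000 images, `pair_count(1000)` gives 499,500 comparisons, which is why resuming from a checkpoint matters for large directories. The file-list check in the sketch reflects the warning above about not changing the directory between runs.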

When using embedding compare modes, you can search your images by text - both positive and negative. Commas will break the texts to search into multiple parts, to be combined in a final set of results. If there is a good embedding signal for the search texts it will likely return the images you are looking for. It will take a while to load the first time as embeddings need to be generated. If a list of preset text searches is defined in your config JSON, you can cycle between them with the dedicated shortcut found below.

If a search image is set simultaneously with search text, its embedding will be factored into the search at a weight equal to a single search text part.
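
One way to picture the comma splitting and equal weighting is the sketch below. The scoring formula (mean of positive similarities minus mean of negative similarities) is an assumption made for illustration, as are the function names; the tool's actual combination logic may differ.

```python
def split_search_text(text):
    """Split a search string on commas into trimmed, non-empty parts."""
    return [part.strip() for part in text.split(",") if part.strip()]


def combined_score(positive_sims, negative_sims, image_sim=None):
    """Combine per-part similarity scores for one candidate image.

    positive_sims / negative_sims: similarity of the candidate to each
    positive / negative text part. image_sim: similarity to the search
    image, if one is set.
    """
    sims = list(positive_sims)
    if image_sim is not None:
        # The search image counts the same as a single text part
        sims.append(image_sim)
    positive = sum(sims) / len(sims) if sims else 0.0
    negative = sum(negative_sims) / len(negative_sims) if negative_sims else 0.0
    return positive - negative


parts = split_search_text("red car, sunset")
```

With two text parts and a search image, each of the three signals contributes one third of the positive score in this sketch.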

Configuration

locale supports any of the following locales:

en (English)
de (Deutsch)
fr (Français)
es (Español)
it (Italiano)
pt (Português)
ru (Русский)
ja (日本語)
ko (한국어)
zh (中文)

clip_model defines the CLIP model to use for generating CLIP embeddings.

image_types defines the allowed file extensions for gathering image files, while video_types defines the allowed file extensions for gathering video files. Video types are only valid if the enable_videos setting is enabled.

file_check_interval_seconds defines the interval between auto-updates to identify recent file changes.

slideshow_interval_seconds defines the interval between slideshow transitions.

sort_by defines the default image browsing sort setting upon starting the application.

trash_folder defines the target folder for image deletion. If not set, deletion will send the image to your system's default trash folder.

enable_prevalidations enables the prevalidation system. When enabled, prevalidation rules will be applied to media before they are shown.

image_classifier_h5_models defines a list of image classifier models (H5 or PyTorch) that can be used for prevalidation rules. Each model should specify:

model_name: A unique name for the model
model_location: Path to the model file (.h5 for TensorFlow/Keras, or .pth/.pt/.safetensors/.bin for PyTorch)
model_categories: List of categories the model can classify
backend: "auto" (detected from file extension), "hdf5"/"tensorflow" for H5 models, or "pytorch" for PyTorch models
use_hub_keras_layers: Whether to use Keras hub layers (H5 models only)
Additional PyTorch-specific parameters: model_architecture, weights_only, device, input_shape, etc.
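
As a rough illustration, the entries could be structured as shown below, written as a Python dictionary and serialized to JSON. The field names come from the list above; the model names, paths, and categories are placeholders. `detect_backend` is a hypothetical helper that mimics the "auto" behavior described above and is not the tool's own function; returning "hdf5" rather than "tensorflow" for .h5 files is an arbitrary choice for the example.

```python
import json
import os

# Placeholder values throughout; only the key names come from the README
config_fragment = {
    "image_classifier_h5_models": [
        {
            "model_name": "example_h5_classifier",
            "model_location": "/path/to/model.h5",
            "model_categories": ["category_a", "category_b"],
            "backend": "auto",
            "use_hub_keras_layers": False,
        },
        {
            "model_name": "example_pytorch_classifier",
            "model_location": "/path/to/model.safetensors",
            "model_categories": ["category_a", "category_b"],
            "backend": "pytorch",
        },
    ]
}


def detect_backend(model_location):
    """Guess the backend from the file extension (hypothetical helper)."""
    extension = os.path.splitext(model_location)[1].lower()
    if extension == ".h5":
        return "hdf5"
    if extension in {".pth", ".pt", ".safetensors", ".bin"}:
        return "pytorch"
    raise ValueError(f"Unrecognized model file extension: {extension!r}")


config_json = json.dumps(config_fragment, indent=2)
```

The resulting `config_json` string shows the shape such a fragment would take inside the config JSON.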

If the sd_prompt_reader_loc config setting is pointing to your local copy of stable-diffusion-prompt-reader then opening image details for an image with a stable diffusion prompt will give prompt information found in the image.

tag_suggestions_file should point to a JSON list that provides suggested tags for images for easy access in adding tags, if desired.

file_path_json_path should be set to the path of the file path JSON if the use_file_path_json setting is set to true.

text_embedding_search_presets_exclusive makes the search results returned by preset search texts exclusive of each other, to categorize images more accurately. Note that since some text embeddings have a much stronger signal than others, clustering on those searches can occur.

store_checkpoints will cache a group comparison process at certain checkpoints for later restart.

Key and Mouse Bindings

While the UI elements support normal usage in most cases, there are many bindings that enable extended functionality, mostly to minimize UI content unrelated to image viewers.

Press Shift+H to open up a help window with all key bindings. A directory with images must be set before most of the bindings will have any effect. The group bindings are only functional in GROUP mode after a comparison has been run.

Move Marks Window

This window helps with efficient filing of file marks.

When the move marks window is open -- with or without GUI -- marks can be moved to a target directory by pressing the Enter key, or with the GUI elements if visible. After pressing the Enter key, a number of things can occur:

If no target directories have been set, a folder picker window will open to set a new directory.

If a marks action has been run previously, simply pressing Enter without a filter set will use the directory last used for the move or copy action.

If target directories have been set and a filter is set, the move or copy operation will use the first target directory in the filtered list.

If the Shift key is pressed along with Enter, the files will be copied instead of moved.

If the Control key is pressed, any previously marked directories will be ignored and a folder picker window will open to set a new target directory.

If the Alt key is pressed, the penultimate mark target directory will be used as the target directory. This is useful when you want to successively copy files to one directory and then move them to another, without having to re-filter each time.

Simply typing letters while the mark window is open will filter the list of mark target directories, even if the GUI is not present. The backspace key will delete letters from the filter. You can scroll through the list of saved target directories using arrow keys.
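
The type-to-filter behavior can be pictured with the sketch below. It assumes a case-insensitive substring match on each directory's basename, which is a guess; the `DirectoryFilter` class and its method names are hypothetical and not taken from the tool.

```python
import os


class DirectoryFilter:
    """Hypothetical model of the typed filter over saved target directories."""

    def __init__(self, directories):
        self.directories = list(directories)
        self.text = ""

    def type_char(self, char):
        # Each typed letter extends the filter text
        self.text += char

    def backspace(self):
        # Backspace removes the last letter, if any
        self.text = self.text[:-1]

    def filtered(self):
        """Return directories whose basename contains the filter text."""
        if not self.text:
            return list(self.directories)
        needle = self.text.lower()
        return [
            d for d in self.directories
            if needle in os.path.basename(d.rstrip("/\\")).lower()
        ]


targets = DirectoryFilter(["/media/cats", "/media/cars", "/media/dogs"])
for letter in "cat":
    targets.type_char(letter)
```

After typing "cat", only the first directory remains in the filtered list; pressing backspace once widens the match to both "cats" and "cars".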

To bypass the move marks window, use the Ctrl+R or Ctrl+E shortcuts to immediately run the previous and penultimate actions, respectively, on the current selection. You can also use number keys or Ctrl+T as hotkeys for persistent marks actions. To see the full list of file action hotkeys and their current settings, open the hotkey actions window by pressing Ctrl+H on the marks window.

Ctrl+Z will undo the previous file marks move or copy action. If an earlier action needs to be reversed or modified, open the file actions window to verify the action in the history list and reverse it via the UI.

File Actions Window

The file actions window displays a certain number of completed actions, as defined in the config JSON. Similar to the move marks window, typing will add to a text filter that filters the actions by the target directory basenames.

In this window, the media from a previous file action can be viewed, and the action can be reversed or modified if desired.

Limitations

NOTE - It is not currently possible to undo or modify a delete action. However, unless the delete folder is explicitly set to null in the config, deleted items will likely be saved in a trash folder before being fully removed.

This is a simple app primarily meant for personal use but could be adapted for more intensive use cases.

The face similarity measure in particular is very crude and only compares the number of faces in each image, so it is off by default. At a future time more complex face comparison logic may be added, but for now the embedding comparison is helpful in matching faces.
