
Labeling Neural Representations with Inverse Recognition

Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

Inverse Recognition (INVERT) is a method designed to enhance our understanding of the representations learned by Deep Neural Networks (DNNs). It aims to bridge the gap between these complex, hierarchical data representations and human-understandable concepts. Unlike existing global explainability methods, INVERT is more scalable and less reliant on resources such as segmentation masks. It also offers an interpretable metric that measures the alignment between the representation and its explanation, providing a degree of statistical significance.


You can install it via pip as shown below:

! git clone https://github.com/lapalap/invert.git --quiet
! pip install git+file:///content/invert --quiet


You can get started with the following Colab notebook.



Version 1.2

This release adds a major speed-up path for INVERT via HDF5-based, batched explanations, ideal for large layers and datasets.

Highlights

  • New appendable HDF5 dataset: invertH5Dataset
    • Stores a single dataset Y of shape [N, D] (float32), where each row is a flattened activation vector.
    • Simple streaming API: .update(batch) appends batched activations (PyTorch or NumPy); activations are flattened automatically.
    • Designed for high-throughput writes with chunking; great for long inference runs.
  • New fast explainer: Invert.explain_from_h5(...)
    • Reads activations directly from an HDF5 file (dataset Y) and performs fully vectorized explanation search.
    • Computes AUROC for many concepts and features at once using rank-based formulas with proper tie handling.
    • Efficient feature batching (feature_batch_size) and no per-candidate loops: conjunctions/disjunctions use inclusion–exclusion algebra for sums and counts.
    • Supports the same knobs as the classic API: L, B, limit_search, min_fraction, max_fraction, mode, memorize_states.
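The rank-based AUROC computation and the inclusion-exclusion bookkeeping can be sketched in plain NumPy. This is a simplified illustration of the underlying idea, not the library's actual implementation; `average_ranks`, `auroc_many_concepts`, and `or_statistics` are hypothetical names:

```python
import numpy as np

def average_ranks(x):
    """Ranks 1..N, with tied values assigned the average of their ranks."""
    sorter = np.argsort(x, kind="mergesort")
    inv = np.empty_like(sorter)
    inv[sorter] = np.arange(len(x))
    xs = x[sorter]
    obs = np.r_[True, xs[1:] != xs[:-1]]        # start of each tie group
    dense = obs.cumsum()[inv]                   # dense rank per element
    bounds = np.r_[np.nonzero(obs)[0], len(x)]  # tie-group boundaries
    return 0.5 * (bounds[dense] + bounds[dense - 1] + 1)

def auroc_many_concepts(acts, concepts):
    """AUROC of one feature's activations against every binary concept at once.

    acts:     [N] float activations of a single feature
    concepts: [N, K] binary concept indicator matrix
    Uses the Mann-Whitney identity:
        AUROC = (R_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg),
    where R_pos is the rank sum over positive samples.
    """
    n = len(acts)
    r = average_ranks(acts)            # tie-aware ranks, computed once
    n_pos = concepts.sum(axis=0)       # [K]
    n_neg = n - n_pos
    rank_sums = r @ concepts           # [K] rank sums in one matmul, no loops
    return (rank_sums - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def or_statistics(n_a, n_b, n_and, rs_a, rs_b, rs_and):
    """Count and rank sum of the disjunction (a OR b) via inclusion-exclusion,
    so no new indicator column has to be materialized."""
    return n_a + n_b - n_and, rs_a + rs_b - rs_and
```

Because the ranks are computed once per feature and the per-concept statistics reduce to a matrix product, candidate conjunctions and disjunctions can be scored from the sums and counts of their parts alone.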

What’s improved

  • Speed & scalability: Activations from large layers and datasets can be labeled significantly faster by avoiding repeated per-neuron passes and computing metrics in bulk.
  • Memory efficiency: Processes features in column-batches; no need to load the entire [N, D] activation matrix into GPU memory at once.
  • Backwards compatible: The existing explain_representation(...) API is unchanged and still works with a single activation vector per unit.
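The column-batch access pattern amounts to streaming vertical slabs of the [N, D] matrix. A minimal sketch (illustrative only; `iter_feature_batches` is a hypothetical helper, not part of the package):

```python
import numpy as np

def iter_feature_batches(Y, batch_size=512):
    """Yield (start, end, block) column slabs of a 2-D array-like Y of shape [N, D].

    Works with an in-memory ndarray or an h5py dataset; in the HDF5 case each
    Y[:, start:end] slice reads only that slab from disk, so the full [N, D]
    matrix is never resident in memory at once.
    """
    n_features = Y.shape[1]
    for start in range(0, n_features, batch_size):
        end = min(start + batch_size, n_features)
        yield start, end, Y[:, start:end]
```

With files produced by invertH5Dataset, Y would be the "Y" dataset opened via h5py, and the explainer's feature_batch_size parameter plays the role of batch_size here.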

Minimal usage

Collect activations once (streaming) and save to HDF5:

from invert_h5_dataset import invertH5Dataset

# Create (or append to) an H5 file with a single dataset "Y" of shape [N, D]
ds = invertH5Dataset("runs/acts.h5", d_dim=D, mode="a")  # D: flattened activation dimensionality

for imgs in loader:
    with torch.no_grad():
        acts = model(imgs.to(device))   # [B, ..., D]
        ds.update(acts)                 # appends B rows; flattens automatically

ds.close()

Run the fast explainer:

from invert.explainer import Invert

inv = Invert(device="cuda")
inv.load_concept_labels(labels_path="labels.pt",
                        description_path="labels.json")

results = inv.explain_from_h5(
    h5_path="runs/acts.h5",
    L=3, B=5,
    limit_search=None,
    min_fraction=0.0,
    max_fraction=0.5,
    mode="positive",
    memorize_states=False,
    feature_batch_size=512,   # tune for your memory
)

# results: Dict[int, List[Explanation]] keyed by feature (unit) index

Version 1.1

This update brings several enhancements to our software:

  • Improved Interface: We’ve introduced a fresh, user-friendly interface that enhances your experience.
  • Speed Boost: Explanation generation is now faster, allowing you to get insights more swiftly.
  • Memory Optimization: We’ve fine-tuned memory usage for better efficiency.

Specifically:

  • The Phi class now supports sympytorch, streamlining the explanation generation process.
  • Our new interface enables the explain_representation method to work seamlessly with individual activation tensors from a single unit.


Citation

@article{bykov2024labeling,
  title={Labeling neural representations with inverse recognition},
  author={Bykov, Kirill and Kopf, Laura and Nakajima, Shinichi and Kloft, Marius and H{\"o}hne, Marina},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

About

Official GitHub for the paper "Labeling Neural Representations with Inverse Recognition"
