
GitHub repository for DORA: Data-agnOstic Representation Analysis paper. DORA allows to find outlier representations in Deep Neural Networks.


DORA: Data-agnOstic Representation Analysis

A toolkit to explore the Representation Spaces of Deep Neural Networks


Data-agnOstic Representation Analysis (DORA) is an automatic framework for inspecting the representation space of Deep Neural Networks for infected neurons (i.e. neurons that represent spurious or artifactual concepts). Independent of data, for any given deep learning model, DORA automatically detects anomalous representations that bear a high risk of having learned unintended spurious concepts deviating from the desired decision-making policy. Infected representations found by DORA can also be used as artifact detectors when applied to any given dataset, which further allows automatic detection and subsequent cleaning of infected data points.

With DORA, users can investigate networks for the presence of artifactual representations. As an example, DORA was able to find a cluster of unintended (spurious) Chinese-character detectors in representations from standard ImageNet-trained networks.
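
To make the idea of an "outlier representation" concrete, here is a minimal, dependency-free sketch. It is not DORA's implementation: the distance (cosine), the scoring rule (mean distance to the k nearest neighbours), the function names, and the toy encodings are all assumptions chosen for illustration. Each row stands for the encoding of one neuron, and the neuron whose encoding lies far from all the others gets the highest score.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def knn_outlier_scores(encodings, k=2):
    """Score each vector by its mean distance to its k nearest neighbours."""
    scores = []
    for i, u in enumerate(encodings):
        dists = sorted(
            cosine_distance(u, v) for j, v in enumerate(encodings) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical toy encodings: one row per neuron.
encodings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [1.0, 0.0, 0.1],
    [0.8, 0.1, 0.1],
    [0.0, 0.1, 1.0],  # points in a different direction from the rest
]

scores = knn_outlier_scores(encodings, k=2)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous)
```

In this toy setting the last row receives the largest score, because even its nearest neighbours are far away. A flagged neuron is only a candidate: it still needs to be inspected before it is treated as infected.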


You can install it via pip as shown below:

pip install git+https://github.com/lapalap/dora.git

You can get started either with the Colab notebook or locally as shown below:

Let's start by analysing some neurons from the pre-trained resnet18:

import torch
import torchvision.models as models
import torchvision.transforms as transforms

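# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Indices of the neurons to analyse: 100 through 199.
neuron_indices = list(range(100, 200))

# Load an ImageNet-pretrained ResNet-18 in evaluation mode.
# Note: recent torchvision releases deprecate `pretrained=True` in favour of the `weights=` argument.
model = models.resnet18(pretrained=True).eval().to(device)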
my_transforms = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)

Then use DORA to generate synthetic activation-maximization signals and collect their encodings on the same layer ✨

from dora import Dora
from dora.objectives import ChannelObjective

d = Dora(model=model, image_transforms=my_transforms, device=device)

d.generate_signals(
    neuron_idx=neuron_indices,
    layer=model.avgpool,
    objective_fn=ChannelObjective(),
    lr=18e-3,
    width=224,
    height=224,
    iters=90,
    experiment_name="model.avgpool",
    overwrite_experiment=True,  # still reuses what already exists if the generation params are the same
)
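
For readers unfamiliar with activation maximization, the sketch below shows the underlying idea on a toy problem: start from small random noise and repeatedly nudge the input along the gradient of a neuron's activation. It is a standalone illustration, not DORA's code. The one-unit `tanh` "neuron", its weights, and the helper names are hypothetical, and reading `lr` and `iters` in the call above as step size and number of optimization steps is an assumption based on their names.

```python
import math
import random


def neuron(x, w):
    """Toy 'neuron': tanh of a weighted sum of the input."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def neuron_grad(x, w):
    """Gradient of the toy neuron's activation with respect to the input x."""
    a = neuron(x, w)
    return [(1.0 - a * a) * wi for wi in w]


def maximize_activation(w, lr=0.05, iters=90, seed=0):
    """Gradient ascent on the input, starting from small random noise."""
    rng = random.Random(seed)
    x = [rng.uniform(-0.01, 0.01) for _ in w]
    for _ in range(iters):
        grad = neuron_grad(x, w)
        # Step uphill: move the input in the direction that raises the activation.
        x = [xi + lr * gi for xi, gi in zip(x, grad)]
    return x


w = [0.5, -1.0, 2.0]   # hypothetical fixed weights of the toy neuron
x0 = [0.0, 0.0, 0.0]   # a neutral reference input
signal = maximize_activation(w)
print(neuron(x0, w), neuron(signal, w))
```

The optimized input ends up aligned with the neuron's weights, so its activation is much higher than that of the neutral input. On a real network the same loop would run over image pixels with automatic differentiation, usually with extra regularization to keep the result interpretable.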

@article{bykov2022dora,
  title={DORA: Exploring outlier representations in Deep Neural Networks},
  author={Bykov, Kirill and Deb, Mayukh and Grinwald, Dennis and M{\"u}ller, Klaus-Robert and H{\"o}hne, Marina M-C},
  journal={arXiv preprint arXiv:2206.04530},
  year={2022}
}

This project is licensed under the terms of the GNU GPL v3.0 license. See LICENSE for more details.
