Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. Abdelkareem, A.O. et al. (2025)
spOT-NMF is a Python package for unsupervised deconvolution and discovery of gene programs in spatial transcriptomics. It integrates Optimal Transport (OT) into a non-negative matrix factorization (NMF) framework, enabling robust topic modeling, high-resolution spatial deconvolution, and rich biological annotation.
This package supports the analyses in: spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics — bioRxiv (2025). DOI: 10.1101/2025.08.02.668292
- OT-NMF Deconvolution: Reference-free topic modeling with OT-regularized NMF.
- HVG Selection: Flexible, batch-aware highly variable gene selection.
- Biological Annotation: Automated enrichment and gene-set overlap of inferred programs.
- Spatial Visualization: Publication-quality spatial plots for topic/program usage.
- Scalable & Modular: Built for large datasets and multi-sample workflows.
- CLI & Python API: Run from the command line or import in notebooks.
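To make the OT ingredient concrete, the sketch below computes an entropy-regularized transport plan between two small histograms using Sinkhorn iterations. It is a generic, pure-Python illustration of optimal transport and not spOT-NMF's implementation; the function name `sinkhorn_plan`, the toy cost matrix, and the values of `eps` and `n_iter` are assumptions made for this example only.

```python
import math

def sinkhorn_plan(a, b, cost, eps=0.5, n_iter=500):
    """Entropy-regularized OT plan between histograms a and b (illustrative only)."""
    n, m = len(a), len(b)
    # Gibbs kernel: a smaller eps concentrates mass on low-cost pairs
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so the plan's row sums match a
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match b
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan: diag(u) * K * diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example: move mass between two 3-bin histograms with |i - j| ground cost
a = [0.5, 0.3, 0.2]
b = [0.2, 0.2, 0.6]
cost = [[abs(i - j) for j in range(3)] for i in range(3)]

plan = sinkhorn_plan(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(3)]
```

After convergence, the row sums of `plan` reproduce `a` and the column sums reproduce `b`, which is the marginal constraint that defines a transport plan.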
- Install PyTorch (CPU or CUDA) for your platform (see pytorch.org). Examples:
# CPU-only
pip install torch --index-url https://download.pytorch.org/whl/cpu
# CUDA 11.8 (Linux/Windows with NVIDIA GPUs)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
- Install spOT-NMF:
pip install spot-nmf
- Verify the CLI:
spotnmf --help

Conda users:
conda create -n spotnmf python=3.12
conda activate spotnmf
# install torch as above, then:
pip install spot-nmf
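After installation, it can help to confirm which PyTorch build is active in the environment. The following is a minimal sketch that relies only on standard PyTorch attributes (`torch.__version__`, `torch.cuda.is_available()`); `describe_torch` is a hypothetical helper name and is not part of spOT-NMF.

```python
import importlib.util

def describe_torch():
    """Return a one-line summary of the local PyTorch install."""
    # Check for the package without importing it, so a missing install is not an error
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"torch {torch.__version__} (device: {device})"

summary = describe_torch()
print(summary)
```

If the summary reports `cpu` on a machine with an NVIDIA GPU, the CPU-only wheel was probably installed and the CUDA index URL should be used instead.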
Full pipeline (deconvolution → annotation → spatial plots):
spotnmf spotnmf \
--sample_name SAMPLE1 \
--adata_path ./data/sample1.h5ad \
--results_dir ./results \
--k 5

Other commands:
spotnmf deconvolve --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --results_dir ./results --k 5
spotnmf plot --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --results_dir ./results
spotnmf annotate --sample_name SAMPLE1 --results_dir ./results --genome GRCh38
spotnmf network --sample_name SAMPLE1 --results_dir ./results --usage_threshold 0 --n_bins 1000 --edge_threshold 0.199

from pathlib import Path

import spotnmf as spot
# === Configuration === #
DATA_PATH = Path("data/test_data/dataset10_adata_spatial.h5ad")
RESULTS_DIR = Path(r"/data/test_results/")
SAMPLE_NAME = "TestSample"
GENOME = "mm10"
# === Read Data === #
adata = spot.io.read_adata(
    data_path=DATA_PATH,
    data_mode="h5ad",
)
# === Model Parameters === #
model_params = {
    "lr": 0.001,             # Learning rate
    "h": 0.01,               # H regularization
    "w": 0.01,               # W regularization
    "eps": 0.05,             # Epsilon
    "normalize_rows": True,
}
# === Run Factorization === #
results = spot.cli.run_experiment(
    adata_spatial=adata,
    k=5,                     # Number of topics (factorization rank)
    sample_name=SAMPLE_NAME,
    results_dir=str(RESULTS_DIR),
    genome=GENOME,
    annotate=False,
    plot=False,
    network=False,
    is_visium=True,
    model_params=model_params,
)
# === Annotate Programs === #
spot.cli.annotate_programs(
    results_dir=str(RESULTS_DIR),
    sample_name=SAMPLE_NAME,
    genome=GENOME,
)

| Command | Description |
|---|---|
| spotnmf | Full pipeline: deconvolution → annotation → spatial plotting |
| deconvolve | Run OT-NMF and save results |
| plot | Visualize spatial topic/program usage |
| annotate | Enrich and annotate gene programs |
| network | Visualize niche networks based on topic interactions |

Run spotnmf <command> --help for per-command options.
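For multi-sample workflows, the `deconvolve` command can be scripted from Python. The sketch below only assembles and prints the calls by default, using the flags that appear in this README (`--sample_name`, `--adata_path`, `--results_dir`, `--k`); the helper names `build_deconvolve_cmd` and `run_all`, the second sample, and its path are hypothetical.

```python
import shlex
import subprocess

def build_deconvolve_cmd(sample_name, adata_path, results_dir, k):
    """Assemble a `spotnmf deconvolve` call from the documented flags."""
    return [
        "spotnmf", "deconvolve",
        "--sample_name", sample_name,
        "--adata_path", str(adata_path),
        "--results_dir", str(results_dir),
        "--k", str(k),
    ]

# Hypothetical sample sheet: sample name -> path to its .h5ad file
samples = {
    "SAMPLE1": "./data/sample1.h5ad",
    "SAMPLE2": "./data/sample2.h5ad",
}

commands = [
    build_deconvolve_cmd(name, path, "./results", k=5)
    for name, path in samples.items()
]

def run_all(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing sample
            subprocess.run(cmd, check=True)

run_all(commands)  # dry run: prints the commands without executing them
```

Calling `run_all(commands, dry_run=False)` would execute the samples one after another; keeping the dry run as the default makes it easy to inspect the exact command lines first.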
- topics_per_spot_{sample}.csv: topic/program usage per spot
- genescores_per_topic_{sample}.csv: gene scores per topic
- ranked_genescores_{sample}.csv: ranked marker genes per topic
- Pathway enrichment and gene-set overlap tables
- Spatial plots & QC visualizations
- Network plots of topic–topic interactions
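A common first step with the usage table is to assign each spot its dominant topic. The sketch below does this with the standard library on a small synthetic table. The layout is an assumption that should be checked against a real `topics_per_spot_{sample}.csv`: a header row, spot identifiers in the first column, and one usage column per topic. The function name `dominant_topic_per_spot` and the example values are hypothetical.

```python
import csv
import io

def dominant_topic_per_spot(handle):
    """Map each spot ID to the topic column with the highest usage."""
    reader = csv.reader(handle)
    header = next(reader)
    topics = header[1:]  # assumes column 0 holds the spot identifier
    result = {}
    for row in reader:
        usages = [float(x) for x in row[1:]]
        # Index of the largest usage value in this row
        best = max(range(len(usages)), key=usages.__getitem__)
        result[row[0]] = topics[best]
    return result

# Synthetic stand-in for topics_per_spot_{sample}.csv (layout is assumed)
example = io.StringIO(
    "spot,topic_1,topic_2,topic_3\n"
    "AAAC-1,0.70,0.20,0.10\n"
    "AAAG-1,0.05,0.15,0.80\n"
)
dominant = dominant_topic_per_spot(example)
print(dominant)
```

For a real results file, the same function can be given a file handle opened with `open(path, newline="")`.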
The main branch provides the reusable software package.
The original Jupyter notebooks used to reproduce manuscript figures are maintained in the manuscript branch:
git fetch origin
git checkout manuscript

Notebooks are in:
scripts/manuscript_notebooks/
Use manuscript to regenerate paper figures; use main for running the package on your data.
Please cite:
Abdelkareem, A.O., Gill, G.S., Manoharan, V.T., Verhey, T.B., & Morrissy, A.S. spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. bioRxiv (2025). https://doi.org/10.1101/2025.08.02.668292
@article{abdelkareem2025spotnmf,
  title   = {spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics},
  author  = {Abdelkareem, Aly O. and Gill, Gurveer S. and Manoharan, Varsha Thoppey and Verhey, Theodore B. and Morrissy, A. Sorana},
  journal = {bioRxiv},
  year    = {2025},
  doi     = {10.1101/2025.08.02.668292},
  url     = {https://www.biorxiv.org/content/10.1101/2025.08.02.668292v1},
  note    = {Preprint}
}

We welcome ideas, bug reports, and feature requests. Please open a GitHub Issue: https://github.com/MorrissyLab/spOT-NMF/issues
GPL-3.0. See LICENSE for details.
Questions or need help? Open an Issue: https://github.com/MorrissyLab/spOT-NMF/issues