CodonMoE is a Python package that implements Adaptive Mixture of Codon Reformative Experts (CodonMoE and CodonMoE-pro) for mRNA analyses.
We include four public mRNA datasets, all bundled as CSVs in datasets/. Each file shares the same schema:

- Sequence: RNA sequence (A, U, C, G)
- Value: real-valued target
- Dataset: dataset identifier
- Split: train/valid/test

| Dataset | File |
|---|---|
| MLOS | datasets/MLOS.csv |
| Tc-Riboswitches | datasets/Tc-Riboswitches.csv |
| mRFP Expression | datasets/mRFP_Expression.csv |
| CoV Vaccine Degradation | datasets/CoV_Vaccine_Degradation.csv |
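
The shared schema can be read with Python's standard `csv` module. The following is a minimal sketch, assuming the header row uses exactly the column names Sequence, Value, Dataset, and Split. The helper `load_splits` and the three sample rows are made up for illustration and are not part of the package or the datasets.

```python
import csv
import io
from collections import defaultdict

RNA_ALPHABET = set("AUCG")


def load_splits(handle):
    """Group (sequence, value) pairs by the Split column of an open CSV file."""
    splits = defaultdict(list)
    for row in csv.DictReader(handle):
        seq = row["Sequence"].strip().upper()
        # The schema says sequences are RNA, so reject anything outside A/U/C/G.
        if not set(seq) <= RNA_ALPHABET:
            raise ValueError(f"unexpected character in sequence: {seq!r}")
        splits[row["Split"]].append((seq, float(row["Value"])))
    return dict(splits)


# In-memory stand-in with invented rows, so the sketch runs without the repo.
SAMPLE = """Sequence,Value,Dataset,Split
AUGGCUUAA,0.5,demo,train
AUGCCCUGA,1.25,demo,valid
AUGAAAUAG,-0.75,demo,test
"""

splits = load_splits(io.StringIO(SAMPLE))
print({name: len(rows) for name, rows in splits.items()})
```

For a bundled file, the same function could be called on `open("datasets/MLOS.csv", newline="")`.
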

Install from source in a fresh conda environment:

conda create -n codonmoe python=3.9 -y
conda activate codonmoe
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y
git clone https://github.com/Kingsford-Group/codonmoe.git
cd codonmoe
pip install -r requirements.txt
pip install -e .

CodonMoE(input_dim, num_experts=4, dropout_rate=0.1)

Parameters:

- input_dim: Dimension of the input features
- num_experts: Number of expert networks in the Mixture of Experts
- dropout_rate: Dropout rate for regularization

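
To make the role of num_experts concrete, here is a generic mixture-of-experts forward pass in plain Python: a gate scores each expert, the scores are normalized with a softmax, and the expert outputs are blended with those weights. This is a sketch of the general technique only. `TinyMoE`, its linear experts, and its random weights are assumptions for illustration, dropout is omitted, and nothing here is taken from the CodonMoE implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class TinyMoE:
    """Hypothetical dense MoE: every expert runs, outputs are gate-weighted."""

    def __init__(self, input_dim, num_experts=4, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.gate = rand_matrix(num_experts, input_dim)
        self.experts = [rand_matrix(input_dim, input_dim) for _ in range(num_experts)]

    def forward(self, x):
        weights = softmax(linear(self.gate, x))          # one weight per expert
        outputs = [linear(expert, x) for expert in self.experts]
        mixed = [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))
        ]
        return mixed, weights


moe = TinyMoE(input_dim=3, num_experts=4)
mixed, weights = moe.forward([0.2, -0.1, 0.7])
print(len(mixed), len(weights))
```

The output keeps the input dimension, and the gate weights sum to one, so adding experts changes capacity without changing the output shape.
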
CodonMoEPro(
input_dim,
num_experts=4,
kernel_num=100,
kernel_sizes=(3, 4, 5),
dropout_rate=0.1,
)

Parameters:

- input_dim: Dimension of the input features
- num_experts: Number of expert networks in the Mixture of Experts
- kernel_num: Number of convolutional kernels per kernel size
- kernel_sizes: Tuple of kernel sizes for multi-scale Codon Convolution
- dropout_rate: Dropout rate for regularization

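CodonMoEPro is an enhanced version of CodonMoE. Its constructor adds kernel_num and kernel_sizes, which configure the multi-scale Codon Convolution.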
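
The interplay of kernel_sizes and kernel_num can be illustrated with a plain-Python multi-scale 1D convolution. The sketch below assumes a TextCNN-style layout, in which each kernel's response is max-pooled over positions to give one feature per kernel. Whether CodonMoEPro pools this way is not stated here, so treat the pooling step, the single-channel input, and the helper names as assumptions.

```python
import random


def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def make_kernel_bank(kernel_sizes=(3, 4, 5), kernel_num=100, seed=0):
    """Hypothetical helper: `kernel_num` random, untrained kernels per size."""
    rng = random.Random(seed)
    return {
        size: [[rng.uniform(-1, 1) for _ in range(size)] for _ in range(kernel_num)]
        for size in kernel_sizes
    }


def multi_scale_features(signal, kernel_bank):
    """Max-pool each kernel's response over positions: one feature per kernel."""
    features = []
    for kernels in kernel_bank.values():
        for kernel in kernels:
            features.append(max(conv1d_valid(signal, kernel)))
    return features


# One scalar per position for brevity; a real input would have input_dim channels.
signal = [1.0, 0.0, 2.0, -1.0, 3.0, 0.5]
bank = make_kernel_bank()
features = multi_scale_features(signal, bank)
print(len(features))  # kernel_num * len(kernel_sizes)
```

Under this assumption the default arguments yield 100 × 3 = 300 pooled features, and the input must be at least as long as the largest kernel size.
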
mRNAModel(base_model, codon_moe)

Parameters:

- base_model: The base model to be integrated with CodonMoE
- codon_moe: The Codon Mixture of Experts model

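
The text above only says that mRNAModel takes a base model and a CodonMoE module; how the two are wired together is not described. One plausible reading is simple composition, where the base model produces per-nucleotide features and the codon module turns them into a prediction. The sketch below shows that pattern with stand-in callables. `ComposedModel`, `toy_base_model`, and `toy_codon_head` are hypothetical names, and the GC-fraction score is an arbitrary placeholder, not the package's behavior.

```python
NUCLEOTIDES = "AUCG"


def toy_base_model(sequence):
    """Stand-in encoder: one one-hot vector (order A, U, C, G) per nucleotide."""
    return [[1.0 if n == c else 0.0 for c in NUCLEOTIDES] for n in sequence]


def toy_codon_head(hidden):
    """Stand-in head: GC fraction of each codon, averaged over all codons."""
    scores = []
    for i in range(0, len(hidden) - 2, 3):   # step through complete codons
        codon = hidden[i:i + 3]
        gc = sum(vec[2] + vec[3] for vec in codon)
        scores.append(gc / 3)
    return sum(scores) / len(scores)


class ComposedModel:
    """Hypothetical wrapper: feed the base model's output to the codon module."""

    def __init__(self, base_model, codon_moe):
        self.base_model = base_model
        self.codon_moe = codon_moe

    def __call__(self, sequence):
        return self.codon_moe(self.base_model(sequence))


model = ComposedModel(toy_base_model, toy_codon_head)
score = model("AUGGCUUAA")
print(round(score, 3))
```

Keeping the two parts as separate constructor arguments means either one can be swapped without touching the other.
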
To run the unit tests:

python -m unittest tests/test_codon_moe.py

If you find this repository useful, please consider citing our paper: CodonMoE: DNA Language Models for mRNA Analyses.

@article{du2025codonmoe,
title={CodonMoE: DNA Language Models for mRNA Analyses},
author={Du, Shiyi and Liang, Litian and Li, Jiayi and Kingsford, Carl},
journal={arXiv preprint arXiv:2508.04739},
year={2025}
}