Description
Describe the bug
I am trying to build a SnpEff database for a custom genome (S. japonicus, fission yeast, assembly SJ5). I have the genome.fa, cds.fa and protein.fa files ready, have checked that their IDs match the transcript IDs, and have added the genome to the snpEff.config file. When I build the database (java command below), many warnings about transcripts not being found are displayed (see below). The sanity checks for CDS and protein sequences also fail, with error rates above 35-37%.
I also tried to reproduce the database-building example for the human genome (https://pcingola.github.io/SnpEff/snpeff/build_db/#example-building-the-human-genome-database) and obtained a similar result (transcripts not found).
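For reference, the ID check mentioned above can be sketched roughly as follows. This is only an illustration, not SnpEff's own matching logic: it assumes the FASTA ID is the first whitespace-delimited token after `>` and that the GTF stores IDs as `transcript_id "..."` in the ninth column. The function names are hypothetical, and the comparison is plain string equality, so it will not catch differences that SnpEff may handle internally.

```python
import re

# Matches the transcript_id attribute in a GTF attribute column.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def fasta_ids(lines):
    """Collect sequence IDs: the first token after '>' on each header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gtf_transcript_ids(lines):
    """Collect transcript_id values from the attribute column of a GTF."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        match = _TRANSCRIPT_ID.search(cols[8])
        if match:
            ids.add(match.group(1))
    return ids


def compare_ids(fasta_lines, gtf_lines):
    """Report IDs present in only one of the two inputs (exact matches only)."""
    fa = fasta_ids(fasta_lines)
    gtf = gtf_transcript_ids(gtf_lines)
    return {
        "only_in_fasta": sorted(fa - gtf),
        "only_in_gtf": sorted(gtf - fa),
    }
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `compare_ids(open("cds.fa"), open("genes.gtf"))`. Two empty lists in the result mean the IDs agree as exact strings.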
To Reproduce
- SnpEff version: 5.3a (build 2025-09-02 10:24)
- Genome version: SJ5
- SnpEff full command line:
sudo java -jar snpEff.jar build -gtf22 -v SJ5 2>&1 | tee SJ5.build
- Output / Error message:
SJ5.build.txt
Expected behavior
I should be able to build a database with fewer than 3% errors.
Data
Annotation files (cds.fa, genes.gtf, protein.fa):
annotation.zip
Genome:
SJ5.zip
Additional context