crashes after "checkpoint extract_barcodes" #92

@IlinaMargo

Good afternoon! I previously ran the tests and everything worked fine.

Now, I'm trying to run BLR on my own data — a flax genome (Linum usitatissimum, ASM22429v2) with stLFR reads. However, I'm running into issues, and I'm not sure how to debug them since there’s no clear error message.

I installed and ran it as follows:
git clone https://github.com/AfshinLab/BLR.git
conda install -c conda-forge conda-lock
conda activate blr
pip install .
conda create --name blr --file environment.linux-64.lock
blr init --reads1=/media/eternus1/projects/milina/genotrophs/raw_data/20DNA/20DNA.clean_R1.fastq.gz -l stlfr /media/eternus1/projects/milina/genotrophs/stLFR/blr_output/
blr config --set genome_reference /media/eternus1/projects/milina/reference/GCA_000224295.2_ASM22429v2_genomic.fa

I also updated blr.yaml to include the chromosome names from the reference:
phasing_contigs: CP027619.1,CP027626.1,CP027627.1,CP027628.1,CP027629.1,CP027630.1,CP027631.1,CP027632.1,CP027633.1,CP027620.1,CP027621.1,CP027622.1,CP027623.1,CP027624.1,CP027625.1

The pipeline runs up to about 40% and then seems to stall without reporting an actual error. Here is the end of the Snakemake log (/media/eternus1/projects/milina/genotrophs/stLFR/blr_output/.snakemake/log/2025-04-16T162136.645977.snakemake.log):

samtools index final.barcodes.bam final.barcodes.bam.bai
[Thu Apr 17 10:36:26 2025]
Finished job 615.
621 of 628 steps (99%) done
Select jobs to execute...

[Thu Apr 17 10:36:26 2025]
checkpoint extract_barcodes:
input: mtglink.gfa, final.barcodes.bam, final.barcodes.bam.bai
output: mtglink_tmp/read_subsampling_pre
log: mtglink_tmp/read_subsampling_pre.log
jobid: 617
Downstream jobs will be updated after completion.

python -c "import sys; print('.'.join(map(str, sys.version_info[:2])))"
Activating conda environment: /home/milina/common_conda_envs/39858c45c61b8084a22c2330556deb97
python /media/eternus1/projects/milina/genotrophs/stLFR/blr_output/.snakemake/scripts/tmpenjhiz6x.extract_barcodes.py
Activating conda environment: /home/milina/common_conda_envs/39858c45c61b8084a22c2330556deb97
Updating job aggregate_extracts.
Updating job mtglink.
[Thu Apr 17 10:37:23 2025]
Finished job 617.
622 of 1557 steps (40%) done
Complete log: /media/eternus1/projects/milina/genotrophs/stLFR/blr_output/.snakemake/log/2025-04-16T162136.645977.snakemake.log
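
In the excerpt above, the drop from 99% to 40% coincides with the step total changing from 628 to 1557 right after the checkpoint, so at least part of it may be new jobs being added rather than progress being lost. A minimal sketch for pulling these progress lines out of a Snakemake log follows; the function names are hypothetical, and the regular expression assumes the `N of M steps (P%) done` format seen in this log.

```python
import re

# Matches progress lines such as "622 of 1557 steps (40%) done".
# Assumption: the log uses exactly this wording, as in the excerpt above.
PROGRESS = re.compile(r"^(\d+) of (\d+) steps \((\d+)%\) done")


def progress_totals(log_text):
    """Return (done, total) pairs for every progress line in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = PROGRESS.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def total_grew(pairs):
    """True if the planned step total increased between consecutive progress lines."""
    return any(later[1] > earlier[1] for earlier, later in zip(pairs, pairs[1:]))


# The two progress lines from the log excerpt in this issue.
excerpt = """\
621 of 628 steps (99%) done
622 of 1557 steps (40%) done
"""
pairs = progress_totals(excerpt)
print(pairs)
print(total_grew(pairs))
```

If the total grew, the lower percentage by itself does not indicate a failure; it does not explain why the run ended after job 617, though.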

There's no error message in the logs, which makes it hard for me to understand what’s going wrong or whether it’s just slow.
Any ideas on what might be happening, or how I could better debug this?
My output directory also contains some log files, but none of them report an error. Three of them contain no information at all: ./final.samtools_stats.txt.log, ./initialmapping_nobc.bam.sorting.log and ./manta.log
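
To check the log files in the output directory systematically, something like the following could be used. It is only a rough sketch: `summarize_logs` and the keyword list are hypothetical choices, not part of BLR or Snakemake, and it assumes the logs are plain-text files ending in `.log`.

```python
from pathlib import Path

# Assumption: these substrings are a reasonable first guess at failure markers.
KEYWORDS = ("error", "exception", "traceback", "killed")


def summarize_logs(directory, pattern="*.log"):
    """Return {path: (size_in_bytes, suspicious_lines)} for log files under directory."""
    summary = {}
    for path in Path(directory).rglob(pattern):
        if not path.is_file():
            continue
        # Undecodable bytes are replaced so a binary fragment does not abort the scan.
        text = path.read_text(errors="replace")
        hits = [
            line
            for line in text.splitlines()
            if any(keyword in line.lower() for keyword in KEYWORDS)
        ]
        summary[path] = (path.stat().st_size, hits)
    return summary


def empty_logs(summary):
    """Return the sorted paths of log files that are zero bytes long."""
    return sorted(path for path, (size, _) in summary.items() if size == 0)

# Example usage (the path is a placeholder for the BLR output directory):
# summary = summarize_logs("blr_output")
# print(empty_logs(summary))
```

An empty log is not necessarily a problem, since some tools write nothing when they succeed, so the keyword hits and the most recently modified files are probably the better place to start.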

System information:
OS: Ubuntu 22.04.4 LTS
Kernel: 5.15.0-78-generic
CPU: Intel(R) Xeon(R) Platinum 8268 @ 2.90GHz — 192 threads (4 sockets × 24 cores × 2 threads)
RAM: 2.9 TiB total, ~2.6 TiB available
Python version: 3.6.13
Snakemake version: 6.4.1
Conda version: 24.11.3
pip version: 21.2.4

Thanks so much for your help!

Best,
Margo Ilina
