Skip to content

Conversation

Copy link

Copilot AI commented Aug 21, 2025

This PR implements a comprehensive analysis pipeline that automates the complete workflow for CRISPR screening data analysis, addressing the need for an integrated solution that orchestrates all three main analysis steps.

Problem Statement

Previously, users had to manually execute each step of the CRISPR screening analysis workflow:

  1. FASTQ processing (using screenpro guidecounter)
  2. Phenotype calculation (programmatic use of phenoscore module)
  3. Data visualization (manual plotting with plotting module)

This fragmented approach made it difficult for users to run complete analyses and required deep knowledge of the individual modules.

Solution

New AnalysisPipeline Class

Introduced a comprehensive AnalysisPipeline class in screenpro/pipeline.py that:

  • Orchestrates the complete workflow: Seamlessly integrates FASTQ processing, phenotype calculation, and visualization
  • Supports both library types: Works with single-guide and dual-guide designs
  • Provides flexible execution modes: Run individual steps or the complete pipeline
  • Handles preprocessing: Includes normalization, filtering, and quality control steps
  • Manages output structure: Creates organized directories for counts, phenotypes, and plots

CLI Integration

Added a new screenpro pipeline command that supports:

Configuration file mode:

screenpro pipeline -c config.json -o results/

Command-line parameter mode:

screenpro pipeline --single-guide-design \
  -l library.tsv -f fastq/ -s "ctrl_rep1,ctrl_rep2,treat_rep1,treat_rep2" \
  --comparisons "ctrl:treat" -o results/

Python API

Users can now run complete analyses programmatically:

import screenpro as scp

# Create and run pipeline
pipeline = scp.AnalysisPipeline(
    cas_type='cas9', 
    library_type='single_guide_design',
    output_dir='results/'
)

config = {
    'fastq_processing': {
        'library_path': 'library.tsv',
        'fastq_dir': 'fastq/',
        'samples': ['sample1', 'sample2']
    },
    'phenotype_calculation': {
        'comparisons': [{'cond_ref': 'control', 'cond_test': 'treated'}]
    }
}

pipeline.run_complete_pipeline(config)

Key Features

  • Configuration flexibility: Support for JSON/YAML config files and command-line parameters
  • Metadata integration: Automatic sample metadata handling
  • Preprocessing pipeline: Built-in normalization, filtering, and quality control
  • Error handling: Comprehensive validation with informative error messages
  • Progress reporting: Colored output showing analysis progress
  • Structured output: Organized results in counts/, phenotypes/, and plots/ directories

Files Added/Modified

  • New: screenpro/pipeline.py - Complete pipeline implementation
  • Modified: screenpro/__init__.py - Added pipeline imports
  • Modified: screenpro/main.py - Added CLI command integration
  • New: tests/test_pipeline.py - Pipeline functionality tests
  • New: example_config.json - Example configuration file
  • New: PIPELINE.md - Comprehensive usage documentation

Benefits

  1. Simplified workflow: Single command execution for complete analysis
  2. Improved reproducibility: Configuration files ensure consistent analyses
  3. Better usability: Accessible to both computational and experimental researchers
  4. Maintained flexibility: Preserves all existing functionality while adding automation
  5. Future extensibility: Clean architecture for adding new analysis methods

This implementation significantly enhances ScreenPro2's usability by providing an automated, end-to-end analysis solution while maintaining the framework's flexibility and power.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

@abearab abearab closed this Aug 21, 2025
@abearab abearab deleted the copilot/fix-86d4fa5c-5024-47f4-bdd1-8235857ace4c branch August 21, 2025 00:19
Copilot AI restored the copilot/fix-86d4fa5c-5024-47f4-bdd1-8235857ace4c branch August 21, 2025 00:20
Copilot AI changed the title [WIP] Write a analysis pipeline Implement complete analysis pipeline for CRISPR screening data Aug 21, 2025
Copilot AI requested a review from abearab August 21, 2025 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants