Anatalyst: The Analysis Catalyst - A Modular Pipeline built for Single-cell RNA-seq Analysis
Anatalyst is a flexible, modular Python pipeline designed to facilitate routine analytical workflows that leverage existing Python and R programs. Module support currently focuses on single-cell RNA sequencing and is built on top of Scanpy, providing a customizable workflow for common single-cell analysis tasks.
Key Features
- Modular Architecture: Each analysis step is encapsulated in a module that can be included or excluded as needed
- Configurable Pipeline: Simple YAML configuration to customize pipeline parameters
- Checkpoint System: Save and resume pipeline execution from checkpoints
- R Integration: Seamless integration of R tools (such as SoupX) for specialized analyses
- Reproducible Analysis: Detailed reports with visualizations and parameter settings
- Framework Flexibility: New modules can be custom-built and inserted into the pipeline to allow for any type of sequential analysis
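To make the modular and checkpoint ideas above concrete, here is a minimal sketch using only the Python standard library. It is not Anatalyst's actual API: the `Module`, `LoadData`, `FilterCells`, and `Pipeline` names, the JSON checkpoint files, and the toy count values are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


class Module:
    """Hypothetical base class: one analysis step with a name and a run method."""

    name = "base"

    def run(self, data: dict) -> dict:
        raise NotImplementedError


class LoadData(Module):
    name = "load_data"

    def run(self, data: dict) -> dict:
        # Stand-in for reading an .h5 file: three cells with made-up total counts.
        return {**data, "counts": [120, 4500, 30]}


class FilterCells(Module):
    name = "filter_cells"

    def __init__(self, min_counts: int = 100):
        self.min_counts = min_counts

    def run(self, data: dict) -> dict:
        # Keep only cells whose total count reaches the threshold.
        kept = [c for c in data["counts"] if c >= self.min_counts]
        return {**data, "counts": kept}


class Pipeline:
    """Runs modules in order and writes one JSON checkpoint per completed module."""

    def __init__(self, modules, checkpoint_dir):
        self.modules = list(modules)
        self.checkpoint_dir = Path(checkpoint_dir)

    def _checkpoint(self, module: Module) -> Path:
        return self.checkpoint_dir / f"{module.name}.json"

    def run(self, resume: bool = True):
        data, executed, start = {}, [], 0
        if resume:
            # Resume after the latest module that already has a checkpoint.
            for i in reversed(range(len(self.modules))):
                path = self._checkpoint(self.modules[i])
                if path.exists():
                    data = json.loads(path.read_text())
                    start = i + 1
                    break
        for module in self.modules[start:]:
            data = module.run(data)
            self._checkpoint(module).write_text(json.dumps(data))
            executed.append(module.name)
        return data, executed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pipeline = Pipeline([LoadData(), FilterCells(min_counts=100)], tmp)
        print(pipeline.run())  # first call runs both modules
        print(pipeline.run())  # second call resumes from the last checkpoint
```

In this sketch, excluding a step means leaving its module out of the list passed to `Pipeline`, and a custom step is any subclass of `Module` with its own `run` method.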
Workflow Overview
Anatalyst provides a comprehensive workflow for single-cell analysis:
- Data Loading: Import aligned 10x Genomics single-cell data from an .h5 file
- Quality Control: Calculate QC metrics and visualize distributions
- Ambient RNA Removal: Remove background RNA contamination using SoupX
- Doublet Detection: Identify and flag potential cell doublets
- Cell Filtering: Filter out low-quality cells and outliers
- Pearson Normalization: Normalize data using Pearson residuals
- Dimensionality Reduction: PCA, UMAP, and t-SNE for visualization and analysis
- Report Generation: Create comprehensive HTML reports with key figures
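The workflow steps listed above could be declared in a configuration and resolved into an execution plan roughly as follows. This is a sketch under assumptions, not the real configuration schema: the module names, the `enabled` and `params` keys, the parameter values, and the `build_plan` helper are all hypothetical. The configuration is written as the Python dict that a YAML file might parse into.

```python
# Hypothetical configuration mirroring the workflow steps, in pipeline order.
CONFIG = {
    "modules": [
        {"name": "data_loading", "params": {"file_path": "sample.h5"}},
        {"name": "qc_metrics"},
        {"name": "ambient_rna_removal", "enabled": False},  # step switched off
        {"name": "doublet_detection"},
        {"name": "filtering", "params": {"min_genes": 200}},
        {"name": "pearson_normalization"},
        {"name": "dim_reduction", "params": {"n_pcs": 30}},
        {"name": "report_generator"},
    ]
}

# Names the (hypothetical) pipeline knows how to run.
KNOWN_MODULES = {
    "data_loading",
    "qc_metrics",
    "ambient_rna_removal",
    "doublet_detection",
    "filtering",
    "pearson_normalization",
    "dim_reduction",
    "report_generator",
}


def build_plan(config: dict) -> list:
    """Return (module name, params) pairs for enabled modules, in config order."""
    plan = []
    for entry in config["modules"]:
        name = entry["name"]
        if name not in KNOWN_MODULES:
            raise ValueError(f"Unknown module: {name}")
        # Modules are treated as enabled unless the config says otherwise.
        if entry.get("enabled", True):
            plan.append((name, entry.get("params", {})))
    return plan


if __name__ == "__main__":
    for name, params in build_plan(CONFIG):
        print(name, params)
```

Keeping the step order in the configuration rather than in code is one way to let a step be skipped or reordered without editing the pipeline itself.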
Quick Start
TBD. The expected approach is to pull a Docker container with the pipeline already installed, then mount a local directory for input/output and, possibly, for inserting extra modules.
# Pull the container image (placeholder name)
docker pull something-or-other
# Run a pipeline with a configuration file
python -m sc_pipeline.scripts.run_pipeline --config my_config.yaml
# This command will likely change; it may simply become the way the container is launched.
# Possibly these arguments will be passed to docker compose via environment variables, with a new entrypoint.sh file.
Check out the Getting Started guide for more detailed instructions.