API reference

Top-level

capellini.config

Configuration dataclass for the CAPELLINI pipeline.

capellini.pipeline

Top-level pipeline orchestrator.

Stages

capellini.stages.preflight

Pre-flight stage: folder initialization and optional fresh-start cleanup.

capellini.stages.dada2

DADA2 stage: run DADA2_Pipe.R and move the generated FASTA.

capellini.stages.ncbi_mapping

NCBI mapping stage: download taxonomy names and assign real NCBI taxids.

capellini.stages.mmseqs2

MMseqs2 stage: 16S reference, easy-search, and three-layer NCBI/GCA assignment.

capellini.stages.spacepharer

SpacePHARER stage: spacer extraction, DB creation, prediction, and statistics.

capellini.stages.procs

ProCs stage: bacterial/viral protein extraction, clustering, and presence/absence (PA) matrix.

capellini.stages.network

Network stage: common abundance, shrinkage, raw/smoothed CRISPR, residual X*.

Utilities

capellini.utils.io

I/O helpers: file reading, writing, downloading, subprocess execution.
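
To illustrate the kind of subprocess helper this module description refers to, here is a minimal sketch built on the standard library's `subprocess.run`. The name `run_command` and its signature are hypothetical and are not taken from `capellini.utils.io`; the actual helper may differ.

```python
import subprocess
import sys


def run_command(args, cwd=None):
    """Run an external command and return its standard output as text.

    Hypothetical helper for illustration only; not the capellini API.
    """
    result = subprocess.run(args, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface stderr so a failing pipeline stage is easy to diagnose.
        raise RuntimeError(
            f"Command {args!r} failed with exit code {result.returncode}:\n"
            f"{result.stderr}"
        )
    return result.stdout


if __name__ == "__main__":
    # Use the current Python interpreter as a stand-in for an external tool.
    print(run_command([sys.executable, "-c", "print('ok')"]).strip())
```

Passing the command as a list rather than a shell string avoids quoting problems when paths contain spaces.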

capellini.utils.taxonomy

Taxonomy helpers: NCBI name lookup, index sanitization, bacteria taxonomy cleaning.

capellini.utils.transforms

Numerical transformations: centered log-ratio (CLR), row normalization, shrinkage correlation.
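
As a rough illustration of the three transformations named above, the following standard-library sketch shows one common form of each. The function names, the pseudocount, and the fixed shrinkage weight `lam` are assumptions made for this example. They are not the signatures of `capellini.utils.transforms`, which may, for instance, estimate the shrinkage intensity from the data.

```python
import math


def clr(row, pseudocount=1.0):
    """Centered log-ratio: log of each value minus the mean log of the row."""
    logs = [math.log(v + pseudocount) for v in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are returned as zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total for v in row] if total else [0.0 for _ in row])
    return out


def shrink_correlation(corr, lam=0.2):
    """Shrink a correlation matrix toward the identity with fixed weight lam.

    Assumption: linear shrinkage (1 - lam) * R + lam * I with a user-supplied
    lam, chosen here only to keep the example short.
    """
    n = len(corr)
    return [
        [(1.0 - lam) * corr[i][j] + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    print(clr([10, 20, 30]))
    print(row_normalize([[1, 3], [0, 0]]))
    print(shrink_correlation([[1.0, 0.5], [0.5, 1.0]]))
```

The CLR values of a row always sum to zero, and shrinking toward the identity leaves the diagonal at 1 while pulling off-diagonal correlations toward 0.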

capellini.utils.network_utils

Network-level utilities: residual message passing, CRISPR smoothing, taxonomy kernels, abundance helpers.