4 From FASTQ to TME — runall
4.1 How runall passes options
runall defines a small set of top-level options (e.g., --mode/--outdir/--fastq/--threads/--batch_size). Any unrecognized options are forwarded to the corresponding sub-steps. This keeps runall flexible as sub-commands evolve.
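The forwarding behaviour described above can be sketched with the standard library's argparse. This is only an illustration of the pattern, not iobrpy's actual implementation: the function name split_runall_args and the parser name runall-sketch are hypothetical, and the defaults are assumptions.

```python
import argparse


def split_runall_args(argv):
    """Separate top-level options from options destined for sub-steps."""
    # allow_abbrev=False keeps a forwarded flag from being mistaken for an
    # abbreviation of a top-level one.
    parser = argparse.ArgumentParser(prog="runall-sketch", allow_abbrev=False)
    parser.add_argument("--mode", choices=["salmon", "star"], required=True)
    parser.add_argument("--outdir", required=True)
    parser.add_argument("--fastq", required=True)
    parser.add_argument("--threads", type=int, default=1)      # assumed default
    parser.add_argument("--batch_size", type=int, default=1)   # assumed default
    # parse_known_args returns (namespace, leftovers); the leftovers keep
    # their original order and can be handed to the sub-steps unchanged.
    return parser.parse_known_args(argv)


top, passthrough = split_runall_args([
    "--mode", "salmon", "--outdir", "out", "--fastq", "fq", "--threads", "8",
    "--index", "/path/to/salmon/index", "--project", "MyProj",
])
print(top.mode, top.threads)  # salmon 8
print(passthrough)  # ['--index', '/path/to/salmon/index', '--project', 'MyProj']
```

With this pattern, a new sub-step option needs no change to the top-level parser, which is the flexibility the paragraph above refers to.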
Below are two fully wired workflows handled by iobrpy runall.
4.2 Salmon mode
iobrpy runall \
--mode salmon \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/salmon/index" \
--project MyProj
4.3 STAR mode
iobrpy runall \
--mode star \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/star/index" \
--project MyProj
4.4 Option legend for the runall examples
4.4.1 Common options
- --mode {salmon|star}: Select backend (Salmon quant vs. STAR align+count)
- --outdir <DIR>: Root output directory (creates the standardized layout)
- --fastq <DIR>: Raw FASTQ directory, forwarded to fastq_qc --path1_fastq
- --threads <INT> / --batch_size <INT>: Global concurrency / batching
- --resume: Skip steps whose outputs already exist
- --dry_run: Print planned commands without executing
4.4.2 Salmon-only
- --index <DIR>: Salmon index for batch_salmon
- --project <STR>: Prefix for merged outputs in merge_salmon
- --return_feature {symbol|ENSG|ENST}: Output gene ID type in prepare_salmon
- --remove_version: Strip version suffix in prepare_salmon
4.4.3 STAR-only
- --index <DIR>: STAR genomeDir for batch_star_count
- --project <STR>: Prefix for merged counts in merge_star_count
- --idtype {ensembl|entrez|symbol|mgi}: Gene ID type for count2tpm
- --org {hsa|mmus}: Organism for count2tpm
- --remove_version: Strip version suffix before count2tpm
4.4.4 Signature scoring
- --method {integration|pca|zscore|ssgsea}: Scoring method for calculate_sig_score
- --signature <set>: Which signature set to use (all, etc.)
- --mini_gene_count <INT>: Minimum genes per signature
- --adjust_eset: Extra filtering after log transform
4.4.5 Deconvolution
- --perm <INT> / --QN {true|false}: CIBERSORT permutations / quantile normalization
- --platform <STR>: ESTIMATE platform
- --features HUGO_symbols: MCPcounter feature type
- --arrays, --tumor, --scale_mrna: quanTIseq options
- --reference {TRef|BRef|both}: EPIC reference profile
4.4.6 Ligand–receptor
- --data_type {tpm|count}: Input matrix type for LR_cal
- --id_type {symbol|ensembl|...}: Gene ID type for LR_cal
- --verbose: Verbose logging
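The legend above shows that the same flag (for example --index or --project) is consumed by a different sub-step depending on --mode. A small lookup table transcribed from the legend makes this explicit. The names OPTION_CONSUMERS and consumer_of are hypothetical helpers for illustration and are not part of iobrpy.

```python
# Mode-specific option consumers, transcribed from the legend above.
# STAR's --remove_version is left out because the legend only says it is
# applied "before count2tpm", without naming the step that consumes it.
OPTION_CONSUMERS = {
    "salmon": {
        "--index": "batch_salmon",
        "--project": "merge_salmon",
        "--return_feature": "prepare_salmon",
        "--remove_version": "prepare_salmon",
    },
    "star": {
        "--index": "batch_star_count",
        "--project": "merge_star_count",
        "--idtype": "count2tpm",
        "--org": "count2tpm",
    },
}


def consumer_of(option, mode):
    """Return the sub-step listed for `option` in `mode`, or None if unlisted."""
    return OPTION_CONSUMERS[mode].get(option)


print(consumer_of("--index", "salmon"))  # batch_salmon
print(consumer_of("--index", "star"))    # batch_star_count
```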
4.5 Expected layout
# Salmon mode:
/path/to/outdir
|-- 01-qc
| |-- <sample>_1.fastq.gz
| |-- <sample>_2.fastq.gz
| |-- <sample>_fastp.html
| |-- <sample>_fastp.json
| |-- <sample>.task.complete
| `-- multiqc_report
|   `-- multiqc_fastp_report.html
|-- 02-salmon
| |-- <sample>
| | `-- quant.sf
| |-- MyProj_salmon_count.tsv.gz
| `-- MyProj_salmon_tpm.tsv.gz
|-- 03-tpm
| |-- prepare_salmon.csv
| `-- tpm_matrix.csv
|-- 04-signatures
| `-- calculate_sig_score.csv
|-- 05-tme
| |-- cibersort_results.csv
| |-- epic_results.csv
| |-- quantiseq_results.csv
| |-- IPS_results.csv
| |-- estimate_results.csv
| |-- mcpcounter_results.csv
| `-- deconvo_merged.csv
`-- 06-LR_cal
  `-- lr_cal.csv
# STAR mode:
/path/to/outdir
|-- 01-qc
| |-- <sample>_1.fastq.gz
| |-- <sample>_2.fastq.gz
| |-- <sample>_fastp.html
| |-- <sample>_fastp.json
| |-- <sample>.task.complete
| `-- multiqc_report
|   `-- multiqc_fastp_report.html
|-- 02-star
| |-- <sample>/
| |-- <sample>__STARgenome/
| |-- <sample>__STARpass1/
| |-- <sample>_STARtmp/
| |-- <sample>_Aligned.sortedByCoord.out.bam
| |-- <sample>_Log.final.out
| |-- <sample>_Log.out
| |-- <sample>_Log.progress.out
| |-- <sample>_ReadsPerGene.out.tab
| |-- <sample>_SJ.out.tab
| |-- <sample>.task.complete
| |-- .batch_star_count.done
| |-- .merge_star_count.done
| `-- MyProj.STAR.count.tsv.gz
|-- 03-tpm
| |-- count2tpm.csv
| `-- tpm_matrix.csv
|-- 04-signatures
| `-- calculate_sig_score.csv
|-- 05-tme
| |-- cibersort_results.csv
| |-- epic_results.csv
| |-- quantiseq_results.csv
| |-- IPS_results.csv
| |-- estimate_results.csv
| |-- mcpcounter_results.csv
| `-- deconvo_merged.csv
`-- 06-LR_cal
  `-- lr_cal.csv
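After a run, it can be handy to check an output directory against the layout above. The following is a minimal sketch that only looks at the step directories and the final per-step tables shown in the trees. The helper name missing_outputs is hypothetical, and the demo builds a throwaway directory rather than touching real results.

```python
import tempfile
from pathlib import Path

# Step directories and key files taken from the layout above.
COMMON_DIRS = ["01-qc", "03-tpm", "04-signatures", "05-tme", "06-LR_cal"]
KEY_FILES = [
    "03-tpm/tpm_matrix.csv",
    "04-signatures/calculate_sig_score.csv",
    "05-tme/deconvo_merged.csv",
    "06-LR_cal/lr_cal.csv",
]


def missing_outputs(outdir, mode):
    """Return the expected relative paths that do not exist under outdir."""
    root = Path(outdir)
    quant_dir = "02-salmon" if mode == "salmon" else "02-star"
    expected = COMMON_DIRS + [quant_dir] + KEY_FILES
    return sorted(rel for rel in expected if not (root / rel).exists())


# Demo on a temporary directory that mimics a partially finished Salmon run.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["01-qc", "02-salmon", "03-tpm"]:
        (root / name).mkdir()
    (root / "03-tpm" / "tpm_matrix.csv").write_text("placeholder\n")
    for rel in missing_outputs(root, "salmon"):
        print("missing:", rel)
```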
4.6 Output Reference
4.6.1 Standard layout (produced by iobrpy runall)
- 01-qc/: fastp outputs; a resume flag .fastq_qc.done is written when the step completes.
- 02-salmon/ or 02-star/: quantification/alignment outputs plus merged matrices; resume flags such as .batch_salmon.done, .merge_salmon.done, or .merge_star_count.done.
- 03-tpm/: unified TPM matrix tpm_matrix.csv. In Salmon mode it comes from prepare_salmon; in STAR mode it comes from count2tpm.
- 04-signatures/: signature scoring results (file: calculate_sig_score.csv).
- 05-tme/: deconvolution outputs from multiple methods, plus deconvo_merged.csv.
- 06-LR_cal/: ligand–receptor results in lr_cal.csv.
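The resume flags make it possible to see which early steps have finished without opening any result file. The sketch below reports the flags named in this section and in the STAR layout. It assumes each flag sits directly inside its step directory (the STAR tree shows this for 02-star; for 01-qc and 02-salmon it is an assumption), and finished_steps is a hypothetical helper, not an iobrpy function.

```python
import tempfile
from pathlib import Path

# Flag names come from this document; their location inside each step
# directory is assumed for 01-qc and 02-salmon.
RESUME_FLAGS = {
    "salmon": {
        "fastq_qc": "01-qc/.fastq_qc.done",
        "batch_salmon": "02-salmon/.batch_salmon.done",
        "merge_salmon": "02-salmon/.merge_salmon.done",
    },
    "star": {
        "fastq_qc": "01-qc/.fastq_qc.done",
        "batch_star_count": "02-star/.batch_star_count.done",
        "merge_star_count": "02-star/.merge_star_count.done",
    },
}


def finished_steps(outdir, mode):
    """Map each flagged step to True if its .done file exists."""
    root = Path(outdir)
    return {step: (root / rel).is_file() for step, rel in RESUME_FLAGS[mode].items()}


# Demo: only the QC step has completed.
with tempfile.TemporaryDirectory() as tmp:
    qc = Path(tmp) / "01-qc"
    qc.mkdir()
    (qc / ".fastq_qc.done").touch()
    print(finished_steps(tmp, "star"))
    # {'fastq_qc': True, 'batch_star_count': False, 'merge_star_count': False}
```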
4.6.2 Salmon mode (02-salmon/)
- Per-sample Salmon folders containing quant.sf (from batch_salmon). A .batch_salmon.done flag is written after completion.
- Merged matrices (from merge_salmon): <PROJECT>_salmon_tpm.tsv[.gz] and <PROJECT>_salmon_count.tsv[.gz]. A .merge_salmon.done flag is written after completion.
- 03-tpm/prepare_salmon.csv: cleaned genes × samples TPM matrix produced by prepare_salmon (default --return_feature symbol unless overridden).
- 03-tpm/tpm_matrix.csv: log2(x+1) matrix produced by log2_eset from prepare_salmon.csv.
4.6.3 STAR mode (02-star/)
- Per-sample STAR outputs (BAM, logs, *_ReadsPerGene.out.tab, etc.).
- Merged counts (from merge_star_count): <PROJECT>.STAR.count.tsv.gz. A .merge_star_count.done flag is written after completion.
- 03-tpm/count2tpm.csv: TPM matrix produced by count2tpm from the merged STAR ReadsPerGene count matrix.
- 03-tpm/tpm_matrix.csv: log2(x+1) matrix produced by log2_eset from count2tpm.csv.
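In both modes, tpm_matrix.csv is described as a log2(x+1) transform of the TPM matrix. The sketch below shows that transform on a tiny in-memory genes × samples table. It illustrates the arithmetic only and is not the log2_eset implementation; the gene names and TPM values are made up.

```python
import math


def log2_plus_one(matrix):
    """Apply log2(x + 1) to every value of a {gene: [tpm per sample]} table."""
    return {
        gene: [math.log2(value + 1.0) for value in values]
        for gene, values in matrix.items()
    }


# Made-up TPM values chosen so the results are whole numbers.
tpm = {"GENE_A": [0.0, 3.0, 7.0], "GENE_B": [1.0, 15.0, 0.0]}
print(log2_plus_one(tpm))
# {'GENE_A': [0.0, 2.0, 3.0], 'GENE_B': [1.0, 4.0, 0.0]}
```

The +1 offset keeps zero-expression genes at 0 instead of producing an undefined log2(0).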
4.6.4 Signatures (04-signatures/)
- calculate_sig_score.csv: per-sample pathway/signature scores. Columns correspond to the selected signature set and method (integration, pca, zscore, or ssgsea).
4.6.5 Deconvolution (05-tme/)
Each method writes a single table named <method>_results.csv:
- cibersort_results.csv: columns suffixed with _CIBERSORT. Note whether --perm and --QN were used.
- quantiseq_results.csv: quanTIseq fractions. Document the chosen --method {lsei|hampel|huber|bisquare} and flags like --arrays, --tumor, --scale_mrna, --signame.
- epic_results.csv: EPIC fractions; record the reference profile used (--reference {TRef|BRef|both}).
- estimate_results.csv: ESTIMATE immune/stromal/purity scores; columns suffixed _estimate.
- mcpcounter_results.csv: MCPcounter scores; columns suffixed _MCPcounter.
- IPS_results.csv: IPS sub-scores and total score.
Merged table: deconvo_merged.csv is produced by runall after all deconvolution methods finish. It normalizes the sample index to a column named ID and outer-joins the per-method tables on that ID.
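To make the outer join concrete, here is a minimal standard-library sketch. It assumes each method's table has already been read into row dictionaries with an ID key. The helper outer_join_on_id, the column names, and the values are made up for illustration, and this is not the code runall uses.

```python
def outer_join_on_id(tables):
    """Outer-join several lists of row dicts on their 'ID' key.

    Samples missing from a table get None in that table's columns.
    """
    columns = ["ID"]
    merged = {}
    for rows in tables:
        for row in rows:
            record = merged.setdefault(row["ID"], {"ID": row["ID"]})
            for name, value in row.items():
                if name not in columns:
                    columns.append(name)
                record[name] = value
    return [
        {name: merged[sample].get(name) for name in columns}
        for sample in sorted(merged)
    ]


# Made-up rows; only the column suffixes follow the conventions listed above.
cibersort = [
    {"ID": "S1", "T_cells_CD8_CIBERSORT": 0.12},
    {"ID": "S2", "T_cells_CD8_CIBERSORT": 0.30},
]
estimate = [
    {"ID": "S2", "ImmuneScore_estimate": 1500.0},
    {"ID": "S3", "ImmuneScore_estimate": 900.0},
]
for row in outer_join_on_id([cibersort, estimate]):
    print(row)
```

Because the join is an outer join, S1 and S3 both appear in the result even though each is present in only one input table. With pandas, DataFrame.merge with how="outer" and on="ID" expresses the same operation.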
4.6.6 Ligand–receptor (06-LR_cal/)
- lr_cal.csv: ligand–receptor scoring table from LR_cal. Record the --data_type {count|tpm} and the --id_type you used.