7 Examples
7.1 From downloading data to TME
7.1.1 Data download
7.1.1.1 Prepare the SRR list
- Retrieve the SRR accessions for PRJNA1161405 from NCBI SRA: https://www.ncbi.nlm.nih.gov/sra/?term=PRJNA1161405.
- Save the accessions (one per line) into PRJNA1161405.txt and upload it to: path/to/PRJNA1161405/.
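A malformed list tends to surface only once a download fails, so it can be worth checking the file first. The following is a minimal sketch, not part of SRA Toolkit or iobrpy: `validate_accessions` is a hypothetical helper, and the `SRR` + digits pattern is an assumption that fits the runs used in this example.

```shell
# Hypothetical helper: check that an accession list has one SRR ID per line.
# Prints problems to stderr and returns non-zero if any are found.
validate_accessions() {
  local list="$1" status=0
  local bad dups
  # Drop Windows line endings and blank lines, then keep anything that is not SRR + digits
  bad="$(tr -d '\r' < "$list" | grep -v '^$' | grep -Ev '^SRR[0-9]+$' || true)"
  if [ -n "$bad" ]; then
    printf 'Malformed line(s):\n%s\n' "$bad" >&2
    status=1
  fi
  # Accessions that appear more than once
  dups="$(tr -d '\r' < "$list" | grep -v '^$' | sort | uniq -d)"
  if [ -n "$dups" ]; then
    printf 'Duplicate accession(s):\n%s\n' "$dups" >&2
    status=1
  fi
  return "$status"
}

# Usage:
#   validate_accessions PRJNA1161405.txt && echo "list looks OK"
```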
7.1.1.2 High-speed download with prefetch
Requires the SRA Toolkit to be installed and on your PATH.
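One way to confirm this before starting a long download is to look the tools up with `command -v`. This is a small sketch; `missing_tools` is a hypothetical helper, and the tool names passed to it are the ones used in this section (`pigz` is only needed for the compression step).

```shell
# Print the name of every requested tool that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage:
#   missing="$(missing_tools prefetch fasterq-dump pigz)"
#   [ -z "$missing" ] || { echo "Not on PATH: $missing" >&2; exit 1; }
```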
# (Optional) load/activate your environment
# module load sra-tools # or: conda activate sra-tools
cd path/to/PRJNA1161405
# Download all SRR runs listed in PRJNA1161405.txt into the current directory
prefetch -O ./ --option-file PRJNA1161405.txt
7.1.1.3 Convert .sra to FASTQ with fasterq-dump
This loop finds each run directory produced by prefetch and converts the .sra file to paired FASTQ files.
folder="path/to/PRJNA1161405"
cd "${folder}"
# After cd, match the run directories relative to the current directory
for dir in SRR*; do
if [[ -d "${dir}" ]]; then
dir_name="$(basename "${dir}")"
input_file="${dir}/${dir_name}.sra"
# -3 (--split-3): write mates to _1/_2 files and unpaired reads to a separate file, -p: show progress, -e 64: threads, -O .: output to current dir
fasterq-dump -3 "${input_file}" -p -e 64 -O .
fi
done
7.1.1.4 Multi-thread compression with pigz
Compress all .fastq files in the folder using 8 threads.
cd path/to/PRJNA1161405
for file in SRR*.fastq; do
if [ -f "$file" ]; then
pigz -p 8 "$file"
fi
done
7.1.1.5 (Optional) Direct downloads from ENA FTP with curl
If you prefer pulling FASTQ files directly from ENA:
#!/usr/bin/env bash
set -euo pipefail
# Normal samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/063/SRR35344563/SRR35344563_1.fastq.gz -o SRR35344563_GSM8516765_Normal4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/063/SRR35344563/SRR35344563_2.fastq.gz -o SRR35344563_GSM8516765_Normal4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/061/SRR35344561/SRR35344561_1.fastq.gz -o SRR35344561_GSM8516763_Normal2_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/061/SRR35344561/SRR35344561_2.fastq.gz -o SRR35344561_GSM8516763_Normal2_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/060/SRR35344560/SRR35344560_1.fastq.gz -o SRR35344560_GSM8516762_Normal1_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/060/SRR35344560/SRR35344560_2.fastq.gz -o SRR35344560_GSM8516762_Normal1_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/062/SRR35344562/SRR35344562_1.fastq.gz -o SRR35344562_GSM8516764_Normal3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/062/SRR35344562/SRR35344562_2.fastq.gz -o SRR35344562_GSM8516764_Normal3_Homo_sapiens_RNA-Seq_2.fastq.gz
# HCC samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/068/SRR35344568/SRR35344568_1.fastq.gz -o SRR35344568_GSM8516770_HCC3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/068/SRR35344568/SRR35344568_2.fastq.gz -o SRR35344568_GSM8516770_HCC3_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/069/SRR35344569/SRR35344569_1.fastq.gz -o SRR35344569_GSM8516771_HCC4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/069/SRR35344569/SRR35344569_2.fastq.gz -o SRR35344569_GSM8516771_HCC4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/070/SRR35344570/SRR35344570_1.fastq.gz -o SRR35344570_GSM8516772_HCC5_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/070/SRR35344570/SRR35344570_2.fastq.gz -o SRR35344570_GSM8516772_HCC5_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/071/SRR35344571/SRR35344571_1.fastq.gz -o SRR35344571_GSM8516773_HCC6_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/071/SRR35344571/SRR35344571_2.fastq.gz -o SRR35344571_GSM8516773_HCC6_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/072/SRR35344572/SRR35344572_1.fastq.gz -o SRR35344572_GSM8516774_HCC7_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/072/SRR35344572/SRR35344572_2.fastq.gz -o SRR35344572_GSM8516774_HCC7_Homo_sapiens_RNA-Seq_2.fastq.gz
# CLD samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/073/SRR35344573/SRR35344573_1.fastq.gz -o SRR35344573_GSM8516775_CLD1_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/073/SRR35344573/SRR35344573_2.fastq.gz -o SRR35344573_GSM8516775_CLD1_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/074/SRR35344574/SRR35344574_1.fastq.gz -o SRR35344574_GSM8516776_CLD2_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/074/SRR35344574/SRR35344574_2.fastq.gz -o SRR35344574_GSM8516776_CLD2_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/075/SRR35344575/SRR35344575_1.fastq.gz -o SRR35344575_GSM8516777_CLD3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/075/SRR35344575/SRR35344575_2.fastq.gz -o SRR35344575_GSM8516777_CLD3_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/076/SRR35344576/SRR35344576_1.fastq.gz -o SRR35344576_GSM8516778_CLD4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/076/SRR35344576/SRR35344576_2.fastq.gz -o SRR35344576_GSM8516778_CLD4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/077/SRR35344577/SRR35344577_1.fastq.gz -o SRR35344577_GSM8516779_CLD5_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/077/SRR35344577/SRR35344577_2.fastq.gz -o SRR35344577_GSM8516779_CLD5_Homo_sapiens_RNA-Seq_2.fastq.gz
7.1.2 From FASTQ to TME - runall
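Before launching runall on paired-end data, it can help to confirm that every sample in the FASTQ folder has both mates. This is a minimal sketch, assuming the `<sample>_1.fastq.gz` / `<sample>_2.fastq.gz` naming produced by the download steps above; `check_pairs` is a hypothetical helper, not part of iobrpy.

```shell
# Report FASTQ files whose mate is missing; returns non-zero if any are found.
# Assumes <sample>_1.fastq.gz / <sample>_2.fastq.gz naming.
check_pairs() {
  local dir="$1" r1 r2 status=0
  for r1 in "$dir"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue            # glob matched nothing
    r2="${r1%_1.fastq.gz}_2.fastq.gz"
    if [ ! -f "$r2" ]; then
      echo "Missing mate for: $r1" >&2
      status=1
    fi
  done
  for r2 in "$dir"/*_2.fastq.gz; do
    [ -e "$r2" ] || continue
    r1="${r2%_2.fastq.gz}_1.fastq.gz"
    if [ ! -f "$r1" ]; then
      echo "Missing mate for: $r2" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_pairs /path/to/fastq && echo "all samples are paired"
```

An empty folder passes this check, so it is best used after confirming that the downloads finished.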
7.1.2.1 Salmon mode
iobrpy runall \
--mode salmon \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/salmon/index" \
--project SRR
7.1.2.2 STAR mode
iobrpy runall \
--mode star \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/star/index" \
--project SRR
7.2 TPM conversion
This page shows four common entry points to a TPM matrix and the final log2(x+1) transform you should apply after each path.
Quick rule of thumb
- Raw counts → TPM: use count2tpm.
- Salmon quant → TPM: use prepare_salmon.
- Gene-expression tables (e.g., arrays) → gene-level matrix: use anno_eset to map/aggregate to symbols.
- Mouse → Human: use mouse2human_eset to map symbols.
- After any of the above, run log2_eset.
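For orientation, the arithmetic behind a counts → TPM conversion followed by log2(x+1) can be sketched in a few lines of awk. This is the textbook formula (count divided by gene length, rescaled so that the sample sums to one million), not iobrpy's implementation, which may differ in details such as how gene lengths are obtained. The three-column input layout and the `counts_to_tpm` name are assumptions made for illustration.

```shell
# Textbook counts -> TPM -> log2(TPM + 1) for a single sample.
# Input (hypothetical layout): tab-separated gene, length_bp, count; no header.
# Output: gene, TPM, log2(TPM + 1). Assumes at least one non-zero count.
counts_to_tpm() {
  awk -F'\t' '
    {
      gene[NR] = $1
      rate[NR] = $3 / $2          # reads per base of gene length
      total += rate[NR]
    }
    END {
      for (i = 1; i <= NR; i++) {
        tpm = rate[i] / total * 1e6
        printf "%s\t%.1f\t%.4f\n", gene[i], tpm, log(tpm + 1) / log(2)
      }
    }
  ' "$1"
}

# Usage:
#   counts_to_tpm gene_length_count.tsv
```

The length normalization happens before the rescaling, which is why a long gene and a short gene with the same count end up with different TPM values.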
7.2.1 From count matrix to TPM
# 1) counts → TPM
iobrpy count2tpm \
-i MyProj.STAR.count.tsv.gz \
-o TPM_matrix.csv \
--idtype ensembl \
--org hsa \
--remove_version
# (Optional) Add effective transcript lengths if available:
# --effLength_csv efflen.csv --id id --length eff_length --gene_symbol symbol
# 2) TPM → log2(x+1)
iobrpy log2_eset \
-i TPM_matrix.csv \
-o TPM_matrix.log2.csv
7.2.2 From Salmon matrix to TPM
# 1) Salmon TPM (gene/transcript) → cleaned gene-level TPM
iobrpy prepare_salmon \
-i MyProj_salmon_tpm.tsv.gz \
-o TPM_matrix.csv \
--return_feature symbol \
--remove_version
# 2) TPM → log2(x+1)
iobrpy log2_eset \
-i TPM_matrix.csv \
-o TPM_matrix.log2.csv
7.2.3 From gene-expression matrix to gene-level matrix with annotation (anno_eset)
Use when your input is an expression table that needs ID mapping / de-duplication (e.g., microarray probes → symbols, or TPM tables with mixed identifiers).
# Map/aggregate to symbols using a built-in annotation set
iobrpy anno_eset \
-i expression_matrix.csv \
-o expression_anno.csv \
--annotation anno_grch38 \
--symbol symbol \
--probe id \
--method mean \
--remove_version
# Alternative platform example:
# iobrpy anno_eset -i expression_matrix.csv -o expression_anno.csv \
# --annotation anno_hug133plus2 --symbol symbol --probe id --method mean
# If your input was already TPM-like, finish with log2(x+1)
iobrpy log2_eset \
-i expression_anno.csv \
-o expression_anno.log2.csv
7.2.4 Mouse → Human gene conversion (mouse2human_eset)
Two common modes:
# Matrix mode: rows = mouse gene symbols, columns = samples
iobrpy mouse2human_eset \
-i mouse_matrix.tsv \
-o human_matrix.tsv \
--is_matrix \
--verbose
# Table mode: has a symbol column (e.g., SYMBOL); will de-duplicate then map
iobrpy mouse2human_eset \
-i mouse_table.csv \
-o human_matrix.csv \
--column_of_symbol SYMBOL \
--verbose
# log2(x+1) after mapping (shown here for the matrix-mode output)
iobrpy log2_eset \
-i human_matrix.tsv \
-o human_matrix.log2.tsv
7.3 From TPM to TME
This page takes a TPM matrix and runs downstream TME analyses: signature scoring, immune deconvolution (multiple methods), clustering, and ligand–receptor scoring.
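Since every step here starts from TPM_matrix.csv, a quick look at per-sample totals can catch a matrix that is not on the TPM scale, for example one that was already log-transformed. In an unfiltered TPM matrix each sample sums to about 1e6; totals can be lower after gene filtering or ID mapping. The sketch below assumes genes in rows, samples in columns, a header row, and no quoted commas; `column_sums` is a hypothetical helper, not an iobrpy command.

```shell
# Print the total of each sample column of a CSV matrix.
# Assumes: first row = header, first column = gene ID, no quoted commas.
column_sums() {
  awk -F',' '
    NR == 1 { n = NF; for (j = 2; j <= n; j++) name[j] = $j; next }
    { for (j = 2; j <= n; j++) sum[j] += $j }
    END { for (j = 2; j <= n; j++) printf "%s\t%.0f\n", name[j], sum[j] }
  ' "$1"
}

# Usage:
#   column_sums TPM_matrix.csv
```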
7.3.1 Inputs
- TPM matrix: TPM_matrix.csv
- (Optional) log2 transform: if desired, apply:
iobrpy log2_eset \
-i TPM_matrix.csv \
-o TPM_matrix.log2.csv
7.3.2 All-in-one TME profiling - tme_profile
tme_profile wraps the following functions into one command:
- Signature scoring → calculate_sig_score
- Immune deconvolution (six methods) → cibersort, IPS, estimate, mcpcounter, quantiseq, epic
- Ligand–receptor scoring → LR_cal
- It also merges the deconvolution outputs into a single table
Not included: deside and any clustering (tme_cluster, nmf).
Tip: You can either run each function step by step (see the sections below for individual commands and options), or use tme_profile to execute the full chain in one go.
7.3.2.1 Minimal usage
iobrpy tme_profile \
-i TPM_matrix.csv \
-o out/tme \
--threads 1
7.3.3 Immune deconvolution
Choose one or several methods below; each writes one result file.
7.3.3.1 CIBERSORT
iobrpy cibersort \
-i TPM_matrix.csv \
-o cibersort.csv \
--perm 100 \
--QN True \
--absolute False \
--abs_method sig.score \
--threads 1
7.3.3.2 quanTIseq
iobrpy quantiseq \
-i TPM_matrix.csv \
-o quantiseq.csv \
--signame TIL10 \
--method lsei \
--tumor \
--arrays \
--scale_mrna
7.3.3.3 EPIC
iobrpy epic \
-i TPM_matrix.csv \
-o epic.csv \
--reference TRef
7.3.3.4 ESTIMATE
iobrpy estimate \
-i TPM_matrix.csv \
-o estimate.csv \
--platform affymetrix
7.3.3.5 MCPcounter
iobrpy mcpcounter \
-i TPM_matrix.csv \
-o mcpcounter.csv \
--features HUGO_symbols
7.3.3.6 IPS
iobrpy IPS \
-i TPM_matrix.csv \
-o IPS.csv
7.3.3.7 DeSide
iobrpy deside \
--model_dir path/to/your/DeSide_model \
-i TPM_matrix.csv \
-o deside.csv \
--result_dir path/to/your/plot/folder \
--exp_type TPM \
--scaling_by_constant \
--transpose \
--print_info
7.3.4 TME clustering
You can cluster samples by cell fractions or signature scores.
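The clustering commands below select their input columns with a range such as `--features 1:22`, so it helps to see where each column sits in the result file first. The sketch prints the 1-based position of every header field in a CSV; `list_columns` is a hypothetical helper. Whether `--features` counts the ID column is not stated here, so treat the mapping between these positions and the range as something to verify against the iobrpy help output.

```shell
# Print "position<TAB>name" for each column in the header of a CSV file.
# Assumes a simple header without quoted commas.
list_columns() {
  head -n 1 "$1" | tr -d '\r' | tr ',' '\n' | awk '{ printf "%d\t%s\n", NR, $0 }'
}

# Usage:
#   list_columns cibersort.csv
```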
7.3.4.1 k-means with KL index auto-k (recommended)
iobrpy tme_cluster \
-i cibersort.csv \
-o tme_cluster.csv \
--features 1:22 \
--id "ID" \
--min_nc 2 \
--max_nc 5 \
--print_result \
--scale
7.3.4.2 NMF clustering (auto-k, excluding k=2)
iobrpy nmf \
-i cibersort.csv \
-o path/to/your/result/folder \
--kmax 10 \
--features 1:22 \
--skip_k_2
7.3.5 Ligand–receptor scoring
Compute bulk ligand–receptor interaction scores from TPM:
iobrpy LR_cal \
-i TPM_matrix.csv \
-o LR_score.csv \
--data_type "tpm" \
--id_type "symbol" \
--cancer_type pancan \
--verbose