7 Examples

7.1 From downloading data to TME

7.1.1 Data download

7.1.1.1 Prepare the SRR list
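
The next step reads run accessions from PRJNA1161405.txt. However that list is produced, stray whitespace, Windows line endings, or header lines can cause failed downloads. A minimal sketch of a clean-up step, assuming one accession per line; the helper name clean_run_list and the input file name are hypothetical:

```shell
# Keep only well-formed run accessions (SRR/ERR/DRR followed by digits).
# Strips carriage returns and surrounding whitespace, drops duplicates.
clean_run_list() {
  tr -d '\r' < "$1" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -E '^[SED]RR[0-9]+$' \
    | sort -u
}

# Usage (raw_accessions.txt is a placeholder name):
#   clean_run_list raw_accessions.txt > PRJNA1161405.txt
```

The output is sorted, so the download order may differ from the order in the raw list.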

7.1.1.2 High-speed download with prefetch

Requires the SRA Toolkit (prefetch, fasterq-dump) installed and on your PATH.

# (Optional) load/activate your environment
# module load sra-tools            # or: conda activate sra-tools

cd path/to/PRJNA1161405
# Download all SRR runs listed in PRJNA1161405.txt into the current directory
prefetch -O ./ --option-file PRJNA1161405.txt
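
Large downloads sometimes stop partway through, so it can help to check which runs actually arrived before converting. The sketch below assumes prefetch wrote each run to <ACC>/<ACC>.sra, the layout the conversion loop in 7.1.1.3 relies on; some SRA Toolkit versions may use a different extension, and the helper name missing_runs is hypothetical:

```shell
# Print accessions from a list whose <dir>/<ACC>/<ACC>.sra is missing or empty.
# The body runs in a subshell, so its variables do not leak into the caller.
missing_runs() (
  list=$1
  dir=${2:-.}
  while IFS= read -r acc || [ -n "$acc" ]; do
    [ -n "$acc" ] || continue                      # skip blank lines
    [ -s "$dir/$acc/$acc.sra" ] || printf '%s\n' "$acc"
  done < "$list"
)

# Usage:
#   missing_runs PRJNA1161405.txt . > retry.txt
#   prefetch -O ./ --option-file retry.txt
```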

7.1.1.3 Convert .sra to FASTQ with fasterq-dump

This loop finds each run directory produced by prefetch and converts the .sra file to paired FASTQ files.

folder="path/to/PRJNA1161405/"
cd path/to/PRJNA1161405

for dir in "${folder}"SRR*; do
  if [[ -d "${dir}" ]]; then
    dir_name="$(basename "${dir}")"
    input_file="${dir}/${dir_name}.sra"
    # -3 (--split-3): write mates to _1/_2 files and unpaired reads to a separate file,
    # -p: show progress, -e 64: threads, -O . : output to current dir
    fasterq-dump -3 "${input_file}" -p -e 64 -O .
  fi
done
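
After conversion, a quick consistency check is to confirm that both mate files exist and contain the same number of reads. This is a sketch that assumes standard four-line FASTQ records and the <run>_1.fastq / <run>_2.fastq naming; the helper names are hypothetical:

```shell
# Number of reads in an uncompressed FASTQ file (4 lines per record).
fastq_reads() {
  echo $(( $(wc -l < "$1") / 4 ))
}

# Succeeds only if both mate files exist and hold the same number of reads.
# Argument: the run prefix, e.g. "SRR35344563" or "path/to/SRR35344563".
mates_match() (
  r1="$1_1.fastq"
  r2="$1_2.fastq"
  [ -f "$r1" ] && [ -f "$r2" ] || exit 1
  [ "$(fastq_reads "$r1")" -eq "$(fastq_reads "$r2")" ]
)

# Usage:
#   for r1 in SRR*_1.fastq; do
#     run=${r1%_1.fastq}
#     mates_match "$run" || echo "check $run"
#   done
```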

7.1.1.4 Multi-thread compression with pigz

Compress all .fastq files in the folder using 8 threads.

cd path/to/PRJNA1161405
for file in SRR*.fastq; do
  if [ -f "$file" ]; then
    pigz -p 8 "$file"   # options go before the file name; pigz handles arguments in order
  fi
done
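
Once compression finishes, the archives can be tested without decompressing them to disk. A minimal sketch using gzip -t (pigz -t should behave the same way); the helper name corrupt_gz is hypothetical:

```shell
# Print the names of .gz files that fail the gzip integrity test.
corrupt_gz() {
  for f in "$@"; do
    gzip -t "$f" 2>/dev/null || printf '%s\n' "$f"
  done
}

# Usage:
#   corrupt_gz SRR*.fastq.gz
```

An empty result means every file passed; any name printed is worth re-compressing or re-downloading.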

7.1.1.5 (Optional) Direct downloads from ENA FTP with curl

If you prefer pulling FASTQ files directly from ENA:

#!/usr/bin/env bash
set -euo pipefail

# Normal samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/063/SRR35344563/SRR35344563_1.fastq.gz -o SRR35344563_GSM8516765_Normal4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/063/SRR35344563/SRR35344563_2.fastq.gz -o SRR35344563_GSM8516765_Normal4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/061/SRR35344561/SRR35344561_1.fastq.gz -o SRR35344561_GSM8516763_Normal2_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/061/SRR35344561/SRR35344561_2.fastq.gz -o SRR35344561_GSM8516763_Normal2_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/060/SRR35344560/SRR35344560_1.fastq.gz -o SRR35344560_GSM8516762_Normal1_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/060/SRR35344560/SRR35344560_2.fastq.gz -o SRR35344560_GSM8516762_Normal1_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/062/SRR35344562/SRR35344562_1.fastq.gz -o SRR35344562_GSM8516764_Normal3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/062/SRR35344562/SRR35344562_2.fastq.gz -o SRR35344562_GSM8516764_Normal3_Homo_sapiens_RNA-Seq_2.fastq.gz

# HCC samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/068/SRR35344568/SRR35344568_1.fastq.gz -o SRR35344568_GSM8516770_HCC3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/068/SRR35344568/SRR35344568_2.fastq.gz -o SRR35344568_GSM8516770_HCC3_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/069/SRR35344569/SRR35344569_1.fastq.gz -o SRR35344569_GSM8516771_HCC4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/069/SRR35344569/SRR35344569_2.fastq.gz -o SRR35344569_GSM8516771_HCC4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/070/SRR35344570/SRR35344570_1.fastq.gz -o SRR35344570_GSM8516772_HCC5_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/070/SRR35344570/SRR35344570_2.fastq.gz -o SRR35344570_GSM8516772_HCC5_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/071/SRR35344571/SRR35344571_1.fastq.gz -o SRR35344571_GSM8516773_HCC6_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/071/SRR35344571/SRR35344571_2.fastq.gz -o SRR35344571_GSM8516773_HCC6_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/072/SRR35344572/SRR35344572_1.fastq.gz -o SRR35344572_GSM8516774_HCC7_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/072/SRR35344572/SRR35344572_2.fastq.gz -o SRR35344572_GSM8516774_HCC7_Homo_sapiens_RNA-Seq_2.fastq.gz

# CLD samples
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/073/SRR35344573/SRR35344573_1.fastq.gz -o SRR35344573_GSM8516775_CLD1_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/073/SRR35344573/SRR35344573_2.fastq.gz -o SRR35344573_GSM8516775_CLD1_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/074/SRR35344574/SRR35344574_1.fastq.gz -o SRR35344574_GSM8516776_CLD2_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/074/SRR35344574/SRR35344574_2.fastq.gz -o SRR35344574_GSM8516776_CLD2_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/075/SRR35344575/SRR35344575_1.fastq.gz -o SRR35344575_GSM8516777_CLD3_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/075/SRR35344575/SRR35344575_2.fastq.gz -o SRR35344575_GSM8516777_CLD3_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/076/SRR35344576/SRR35344576_1.fastq.gz -o SRR35344576_GSM8516778_CLD4_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/076/SRR35344576/SRR35344576_2.fastq.gz -o SRR35344576_GSM8516778_CLD4_Homo_sapiens_RNA-Seq_2.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/077/SRR35344577/SRR35344577_1.fastq.gz -o SRR35344577_GSM8516779_CLD5_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR353/077/SRR35344577/SRR35344577_2.fastq.gz -o SRR35344577_GSM8516779_CLD5_Homo_sapiens_RNA-Seq_2.fastq.gz
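
The URLs above follow a regular pattern: the first six characters of the accession, then a zero-padded subdirectory built from the digits after the ninth character, then the accession itself. The sketch below derives the URL from an accession so the list does not have to be typed by hand. It reflects the ENA directory layout as I understand it; only the 11-character case is confirmed by the URLs in this section, and the helper name ena_fastq_url is hypothetical:

```shell
# Build the ENA FTP URL for one mate (1 or 2) of a paired-end run.
ena_fastq_url() (
  acc=$1
  mate=$2
  prefix=$(printf '%s' "$acc" | cut -c1-6)
  case ${#acc} in
    9)  sub="" ;;                                            # e.g. SRR123456
    10) sub="00$(printf '%s' "$acc" | cut -c10)/" ;;          # e.g. SRR1234567
    11) sub="0$(printf '%s' "$acc" | cut -c10-11)/" ;;        # e.g. SRR35344563
    12) sub="$(printf '%s' "$acc" | cut -c10-12)/" ;;
    *)  echo "unsupported accession: $acc" >&2; exit 1 ;;
  esac
  printf 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/%s/%s%s/%s_%s.fastq.gz\n' \
    "$prefix" "$sub" "$acc" "$acc" "$mate"
)

# Usage:
#   for acc in SRR35344560 SRR35344561; do
#     for mate in 1 2; do
#       curl -L "$(ena_fastq_url "$acc" "$mate")" -o "${acc}_${mate}.fastq.gz"
#     done
#   done
```

Unlike the explicit list, this loop names the output files by accession only and drops the GSM and sample labels.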

7.1.2 From FASTQ to TME - runall

7.1.2.1 Salmon mode

iobrpy runall \
  --mode salmon \
  --outdir "/path/to/outdir" \
  --fastq "/path/to/fastq" \
  --threads 8 \
  --batch_size 1 \
  --index "/path/to/salmon/index" \
  --project SRR

7.1.2.2 STAR mode

iobrpy runall \
  --mode star \
  --outdir "/path/to/outdir" \
  --fastq "/path/to/fastq" \
  --threads 8 \
  --batch_size 1 \
  --index "/path/to/star/index" \
  --project SRR
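
Both modes read paired-end files from the --fastq directory, so a missing mate is worth catching before a long run starts. A small pre-flight sketch, assuming the files are named <run>_1.fastq.gz and <run>_2.fastq.gz as in the download steps; the naming runall actually expects is not stated here, and the helper name unpaired_fastq is hypothetical:

```shell
# List *_1.fastq.gz files in a directory that have no matching *_2.fastq.gz.
unpaired_fastq() (
  dir=${1:-.}
  for r1 in "$dir"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue                 # glob matched nothing
    r2="${r1%_1.fastq.gz}_2.fastq.gz"
    [ -e "$r2" ] || printf '%s\n' "$r1"
  done
)

# Usage:
#   unpaired_fastq /path/to/fastq
```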

7.2 TPM conversion

This page shows four common entry points to a TPM matrix and the final log2(x+1) transform you should apply after each path.

Quick rule of thumb
- Raw counts → TPM: use count2tpm.
- Salmon quant → TPM: use prepare_salmon.
- Gene-expression tables (e.g., arrays) → gene-level matrix: use anno_eset to map/aggregate to symbols.
- Mouse → Human: use mouse2human_eset to map symbols.
- After any of the above, run log2_eset.


7.2.1 From count matrix to TPM

# 1) counts → TPM
iobrpy count2tpm \
  -i MyProj.STAR.count.tsv.gz \
  -o TPM_matrix.csv \
  --idtype ensembl \
  --org hsa \
  --remove_version
# (Optional) Add effective transcript lengths if available:
#   --effLength_csv efflen.csv --id id --length eff_length --gene_symbol symbol
# 2) TPM → log2(x+1)
iobrpy log2_eset \
  -i TPM_matrix.csv \
  -o TPM_matrix.log2.csv
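
For intuition, TPM divides each count by the gene length in kilobases and then rescales so the values for a sample sum to one million. The sketch below applies that standard definition to a single sample. It is not a description of how count2tpm works internally; the three-column input layout and the helper name counts_to_tpm are assumptions made for illustration:

```shell
# TPM for one sample from a headerless TSV: gene <TAB> length_bp <TAB> count.
counts_to_tpm() {
  awk -F '\t' '
    {
      gene[NR] = $1
      # reads per kilobase; genes with no usable length contribute 0
      rate[NR] = ($2 > 0 ? $3 / ($2 / 1000) : 0)
      total += rate[NR]
    }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s\t%.2f\n", gene[i], (total > 0 ? rate[i] / total * 1e6 : 0)
    }' "$1"
}

# Usage (toy.tsv is a placeholder):
#   counts_to_tpm toy.tsv
```

With two genes of 1000 bp and 2000 bp carrying 10 and 20 reads, both have the same per-kilobase rate and therefore the same TPM.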

7.2.2 From Salmon matrix to TPM

# 1) Salmon TPM (gene/transcript) → cleaned gene-level TPM
iobrpy prepare_salmon \
  -i MyProj_salmon_tpm.tsv.gz \
  -o TPM_matrix.csv \
  --return_feature symbol \
  --remove_version
# 2) TPM → log2(x+1)
iobrpy log2_eset \
  -i TPM_matrix.csv \
  -o TPM_matrix.log2.csv
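
A quick sanity check on a TPM matrix (before the log2 step) is to sum each sample column. The sketch assumes TPM_matrix.csv is comma-separated with genes in rows, sample names in the header, and no quoted commas; the helper name tpm_column_sums is hypothetical:

```shell
# Print "<sample> <TAB> <column sum>" for every sample column of a CSV matrix.
tpm_column_sums() {
  awk -F ',' '
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
    { for (i = 2; i <= n; i++) sum[i] += $i }
    END { for (i = 2; i <= n; i++) printf "%s\t%.0f\n", name[i], sum[i] }
  ' "$1"
}

# Usage:
#   tpm_column_sums TPM_matrix.csv
```

Sums somewhat below 1,000,000 are expected when transcripts or unmapped identifiers were dropped; sums that are far off suggest the table holds counts or already log-transformed values.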

7.2.3 From gene-expression matrix to gene-level matrix with annotation (anno_eset)

Use when your input is an expression table that needs ID mapping / de-duplication (e.g., microarray probes → symbols, or TPM tables with mixed identifiers).

# Map/aggregate to symbols using a built-in annotation set
iobrpy anno_eset \
  -i expression_matrix.csv \
  -o expression_anno.csv \
  --annotation anno_grch38 \
  --symbol symbol \
  --probe id \
  --method mean \
  --remove_version
# Alternative platform example:
# iobrpy anno_eset -i expression_matrix.csv -o expression_anno.csv \
#   --annotation anno_hug133plus2 --symbol symbol --probe id --method mean
# If your input was already TPM-like, finish with log2(x+1)
iobrpy log2_eset \
  -i expression_anno.csv \
  -o expression_anno.log2.csv
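
To make the idea of mean aggregation concrete: when several probes or IDs map to the same symbol, their rows are averaged into one. The sketch below shows that operation on a small CSV. It only illustrates the concept and is not anno_eset's implementation; the input layout (symbol in the first column, no quoted commas) and the helper name mean_by_symbol are assumptions:

```shell
# Average rows that share the same symbol (first column) in a CSV with a header.
# Symbols are printed in order of first appearance.
mean_by_symbol() {
  awk -F ',' -v OFS=',' '
    NR == 1 { print; n = NF; next }
    {
      if (!($1 in count)) order[++k] = $1
      count[$1]++
      for (i = 2; i <= n; i++) sum[$1, i] += $i
    }
    END {
      for (j = 1; j <= k; j++) {
        s = order[j]
        line = s
        for (i = 2; i <= n; i++) line = line OFS (sum[s, i] / count[s])
        print line
      }
    }' "$1"
}

# Usage (probes_with_symbols.csv is a placeholder):
#   mean_by_symbol probes_with_symbols.csv
```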

7.2.4 Mouse → Human gene conversion (mouse2human_eset)

Two common modes:

# Matrix mode: rows = mouse gene symbols, columns = samples
iobrpy mouse2human_eset \
  -i mouse_matrix.tsv \
  -o human_matrix.tsv \
  --is_matrix \
  --verbose
# Table mode: has a symbol column (e.g., SYMBOL); will de-duplicate then map
iobrpy mouse2human_eset \
  -i mouse_table.csv \
  -o human_matrix.csv \
  --column_of_symbol SYMBOL \
  --verbose
# log2(x+1) after mapping (shown for the matrix-mode output; for table mode use human_matrix.csv)
iobrpy log2_eset \
  -i human_matrix.tsv \
  -o human_matrix.log2.tsv

7.3 From TPM to TME

This page takes a TPM matrix and runs downstream TME analyses: signature scoring, immune deconvolution (multiple methods), clustering, and ligand–receptor scoring.


7.3.1 Inputs

  • TPM matrix: TPM_matrix.csv
  • (Optional) log2 transform: if desired, apply:
iobrpy log2_eset \
  -i TPM_matrix.csv \
  -o TPM_matrix.log2.csv

7.3.2 All-in-one TME profiling - tme_profile

  • tme_profile wraps the following functions into one command:
    • Signature scoring: calculate_sig_score
    • Immune deconvolution (six methods): cibersort, IPS, estimate, mcpcounter, quantiseq, epic
    • Ligand–receptor scoring: LR_cal
    • It also merges the deconvolution outputs into a single table

Not included: deside and any clustering (tme_cluster, nmf).
Tip: You can either run each function step-by-step (see the sections below for individual commands and options), or use tme_profile to execute the full chain in one go.

7.3.2.1 Minimal usage

iobrpy tme_profile \
  -i TPM_matrix.csv \
  -o out/tme \
  --threads 1

7.3.3 Immune deconvolution

Choose one or several methods below; each writes one result file.

7.3.3.1 CIBERSORT

iobrpy cibersort \
  -i TPM_matrix.csv \
  -o cibersort.csv \
  --perm 100 \
  --QN True \
  --absolute False \
  --abs_method sig.score \
  --threads 1

7.3.3.2 quanTIseq

iobrpy quantiseq \
  -i TPM_matrix.csv \
  -o quantiseq.csv \
  --signame TIL10 \
  --method lsei \
  --tumor \
  --arrays \
  --scale_mrna

7.3.3.3 EPIC

iobrpy epic \
  -i TPM_matrix.csv \
  -o epic.csv \
  --reference TRef

7.3.3.4 ESTIMATE

iobrpy estimate \
  -i TPM_matrix.csv \
  -o estimate.csv \
  --platform affymetrix

7.3.3.5 MCPcounter

iobrpy mcpcounter \
  -i TPM_matrix.csv \
  -o mcpcounter.csv \
  --features HUGO_symbols

7.3.3.6 IPS

iobrpy IPS \
  -i TPM_matrix.csv \
  -o IPS.csv

7.3.3.7 DeSide

iobrpy deside \
  --model_dir path/to/your/DeSide_model \
  -i TPM_matrix.csv \
  -o deside.csv \
  --result_dir path/to/your/plot/folder \
  --exp_type TPM \
  --scaling_by_constant \
  --transpose \
  --print_info

7.3.4 TME clustering

You can cluster samples by cell fractions or signature scores.

7.3.4.2 NMF clustering (auto-k, excluding k=2)

iobrpy nmf \
  -i cibersort.csv \
  -o path/to/your/result/folder \
  --kmax 10 \
  --features 1:22 \
  --skip_k_2
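
The --features 1:22 argument selects a range of columns from the input table. Before choosing a range, it can help to list the header with positions. This sketch assumes a comma-separated file with no quoted commas; whether nmf counts from 0 or 1, and whether it counts the sample ID column, is not stated here, so compare the listing against the tool's own help. The helper name list_columns is hypothetical:

```shell
# Print each header column of a CSV with its 1-based position.
list_columns() {
  head -n 1 "$1" | tr ',' '\n' | awk '{ printf "%d\t%s\n", NR, $0 }'
}

# Usage:
#   list_columns cibersort.csv
```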

7.3.5 Ligand–receptor scoring

Compute bulk ligand–receptor interaction scores from TPM:

iobrpy LR_cal \
  -i TPM_matrix.csv \
  -o LR_score.csv \
  --data_type "tpm" \
  --id_type "symbol" \
  --cancer_type pancan \
  --verbose