Chapter 9 TME and genomic interaction

9.1 Loading packages

library(IOBR)

9.2 Genomic data preparation

In this section, we use the MAF data of the TCGA-STAD cohort as an example dataset. The dataset is available from several sources; two ways to obtain it are shown in Section 9.6. The analysis below uses a mutation matrix, `mut`, derived from this MAF data.

9.3 Identifying Mutations Associated with TME

The microenvironmental signature scores derived from the TCGA-STAD expression matrix (`tcga_stad_sig`) are merged with the mutation matrix by sample ID. The Cuzick or Wilcoxon test is then used to identify mutated genes associated with a microenvironmental factor. CD_8_T_effector is the target variable in this example.

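# Load the TME signature scores of the TCGA-STAD cohort shipped with IOBR
data("tcga_stad_sig", package = "IOBR")
# `mut` is the mutation matrix derived from the MAF data (see Section 9.6)
res <- find_mutations(mutation_matrix     = mut,
                      signature_matrix    = tcga_stad_sig,
                      id_signature_matrix = "ID",
                      signature           = "CD_8_T_effector",
                      palette             = "nrc",
                      min_mut_freq        = 0.01,
                      plot                = TRUE,
                      jitter              = TRUE,
                      point.alpha         = 0.25)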
## [1] ">>>> Result of Cuzick Test"
##             p.value  names statistic adjust_pvalue
## PIK3CA 6.279087e-10 PIK3CA  6.183261  3.139543e-07
## PLXNA4 8.196539e-05 PLXNA4  3.938580  2.049135e-02
## DMD    3.514677e-04    DMD  3.574075  3.882570e-02
## SPEG   3.665757e-04   SPEG  3.563047  3.882570e-02
## AHNAK2 3.882570e-04 AHNAK2  3.547940  3.882570e-02
## TCHH   6.571311e-04   TCHH  3.406867  5.476093e-02
## ABCC9  1.043516e-03  ABCC9  3.278524  7.453684e-02
## ANK3   1.470480e-03   ANK3  3.180447  7.713409e-02
## WDFY3  1.690248e-03  WDFY3  3.139867  7.713409e-02
## LRP1   1.762002e-03   LRP1  3.127666  7.713409e-02
## [1] ">>> Result of Wilcoxon test (top 10)"
##             p.value  names statistic adjust_pvalue
## PIK3CA 3.045532e-11 PIK3CA      4006  1.522766e-08
## TCHH   2.585594e-05   TCHH      2929  6.463985e-03
## PLXNA4 5.257043e-05 PLXNA4      2904  7.754311e-03
## LRP1   6.203449e-05   LRP1      2330  7.754311e-03
## RNF213 1.066555e-04 RNF213      3827  1.066555e-02
## SPEG   1.809850e-04   SPEG      1854  1.439761e-02
## WDFY3  2.420595e-04  WDFY3      2626  1.439761e-02
## DMD    2.430706e-04    DMD      4995  1.439761e-02
## ANK3   2.591569e-04   ANK3      3845  1.439761e-02
## AHNAK2 3.456706e-04 AHNAK2      4828  1.728353e-02
## All mutation types: mut.
## Warning: You defined `cell_fun` for a heatmap with more than 100 rows or
## columns, which might be very slow to draw. Consider to use the
## vectorized version `layer_fun`.
## All mutation types: mut.
## Warning: You defined `cell_fun` for a heatmap with more than 100 rows or
## columns, which might be very slow to draw. Consider to use the
## vectorized version `layer_fun`.
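To make the merge-and-test step concrete, here is a minimal base-R sketch on simulated data. It is not the implementation of find_mutations: the objects `sig_toy`, `mut_toy`, and the gene names are hypothetical, and only the Wilcoxon branch is illustrated. The Benjamini-Hochberg adjustment is an assumption, chosen because it appears consistent with the `adjust_pvalue` column in the output above.

```r
set.seed(1)
ids <- paste0("S", 1:40)

# Hypothetical signature table: one row per sample, an ID column plus a score
sig_toy <- data.frame(ID = ids, CD_8_T_effector = rnorm(40))

# Hypothetical binary mutation matrix: samples in rows, genes in columns
mut_toy <- matrix(rbinom(40 * 3, 1, 0.3), nrow = 40,
                  dimnames = list(ids, c("geneA", "geneB", "geneC")))

# Merge the signature scores and the mutation status on the sample ID
merged <- merge(sig_toy,
                data.frame(ID = rownames(mut_toy), mut_toy),
                by = "ID")

# Wilcoxon rank-sum test per gene: mutated (1) versus wild-type (0) samples
genes <- colnames(mut_toy)
p <- sapply(genes, function(g) {
  score <- merged$CD_8_T_effector
  wilcox.test(score[merged[[g]] == 1], score[merged[[g]] == 0])$p.value
})

# Collect the results, adjust for multiple testing (assumed: BH), and rank
res_toy <- data.frame(names         = genes,
                      p.value       = p,
                      adjust_pvalue = p.adjust(p, method = "BH"))
res_toy <- res_toy[order(res_toy$p.value), ]
print(res_toy)
```

The Cuzick test reported first in the output is a trend test across ordered groups, so it is the relevant choice when mutation status has more than two ordered levels; the two-group comparison sketched here corresponds to the Wilcoxon table.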

9.4 OncoPrint of the result

Figure 9.1: OncoPrint

9.5 Boxplot of top 10 mutated genes

Figure 9.2: Top 10 mutated genes

9.6 Other methods to obtain genomic data

9.6.1 Using TCGAmutations

As its name suggests, the R package TCGAmutations provides pre-compiled, curated somatic mutations from 33 TCGA cohorts, along with relevant clinical information for all sequenced samples. It can be installed in much the same way as TCGAbiolinks:

if (!requireNamespace("TCGAmutations", quietly = TRUE))
  BiocManager::install("PoisonAlien/TCGAmutations")

It’s quite simple to use:

maf = TCGAmutations::tcga_load(study = "STAD")
# Change `source` argument to Firehose for MAF files from Broad Firehose
# maf = TCGAmutations::tcga_load(study = "STAD", source = "Firehose")

9.6.2 Using maftools

If you use the R package maftools, you can access and load the data in a similar way, since TCGAmutations and maftools share the same author.

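# The `repo` argument can be changed from "github" to "gitee",
# which may be faster in mainland China
maftools::tcgaAvailable(repo = "github")
# Assign the result so that the MAF object can be used downstream
maf <- maftools::tcgaLoad(study = "STAD", repo = "github")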
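Before transforming a loaded MAF object, it can help to check what it contains. The sketch below is only an illustration: to avoid a download it reads the small `tcga_laml.maf.gz` example file shipped with maftools, not the STAD cohort, and it skips the check entirely when maftools is not installed. The object names `laml` and `gene_summary` are placeholders.

```r
# Run the check only when maftools is available
has_maftools <- requireNamespace("maftools", quietly = TRUE)

gene_summary <- if (has_maftools) {
  # Example MAF file bundled with maftools (TCGA LAML, not STAD)
  laml_path <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")
  laml <- maftools::read.maf(maf = laml_path, verbose = FALSE)

  # Per-gene counts of each variant classification
  maftools::getGeneSummary(laml)
} else {
  NULL
}

if (!is.null(gene_summary)) print(head(gene_summary))
```

The same accessor can be applied to the object returned by `tcgaLoad()`, since it is also a MAF object.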

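MAF data transformation

To prepare the data for downstream analysis, we extract the SNV data from the MAF object and transform it into a non-negative matrix.

mut_list <- make_mut_matrix(maf = maf, isTCGA = TRUE, category = "multi")
# Assumption: with category = "multi" the returned list holds one matrix per
# mutation category, with the SNV matrix stored in the `snp` element
mut <- mut_list$snp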
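To illustrate what a non-negative sample-by-gene matrix looks like, here is a minimal base-R sketch on a hypothetical toy table. It is not the implementation of make_mut_matrix; it only assumes the standard MAF columns `Tumor_Sample_Barcode`, `Hugo_Symbol`, and `Variant_Type`, with SNVs labelled "SNP".

```r
# Hypothetical long-format MAF-like table: one row per variant call
maf_toy <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3", "S3"),
  Hugo_Symbol          = c("TP53", "PIK3CA", "TP53", "TP53", "TP53", "ARID1A"),
  Variant_Type         = c("SNP", "SNP", "SNP", "SNP", "DEL", "SNP"),
  stringsAsFactors     = FALSE
)

# Keep single-nucleotide variants only
snv <- maf_toy[maf_toy$Variant_Type == "SNP", ]

# Count SNVs per sample (rows) and gene (columns)
mut_toy <- unclass(table(snv$Tumor_Sample_Barcode, snv$Hugo_Symbol))

# Optional binary version: 1 if the gene carries at least one SNV
mut_bin <- (mut_toy > 0) * 1

print(mut_toy)
```

Counts keep information about multiple hits in the same gene, while the binary version is sufficient for a mutated versus wild-type comparison.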

9.7 References

Wang et al. (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. Journal of Open Source Software, 4(40), 1627. https://doi.org/10.21105/joss.01627

Gu, Z. (2022). Complex Heatmap Visualization. iMeta.

Mayakonda et al. (2018). Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Research.