Skip to contents

Performs comprehensive correlation analysis between phenotype data and feature data, supporting both continuous and categorical phenotypes. Filters features based on statistical significance and generates publication-ready visualizations including box plots, heatmaps, and correlation plots.

Usage

iobr_cor_plot(
  pdata_group,
  id1 = "ID",
  feature_data,
  id2 = "ID",
  target = NULL,
  group = "group3",
  is_target_continuous = TRUE,
  padj_cutoff = 1,
  index = 1,
  category = "signature",
  signature_group = NULL,
  ProjectID = "TCGA",
  palette_box = "nrc",
  cols_box = NULL,
  palette_corplot = "pheatmap",
  palette_heatmap = 2,
  feature_limit = 26,
  character_limit = 60,
  show_heatmap_col_name = FALSE,
  show_col = FALSE,
  show_plot = FALSE,
  path = NULL,
  discrete_x = 20,
  discrete_width = 20,
  show_palettes = FALSE,
  fig.type = "pdf"
)

Arguments

pdata_group

Data frame containing phenotype data with an identifier column.

id1

Character string specifying the column name in `pdata_group` serving as the sample identifier. Default is `"ID"`.

feature_data

Data frame containing feature data with corresponding identifiers.

id2

Character string specifying the column name in `feature_data` serving as the sample identifier. Default is `"ID"`.

target

Character string specifying the target variable column name for continuous analysis. Default is `NULL`.

group

Character string specifying the grouping variable name for categorical analysis. Default is `"group3"`.

is_target_continuous

Logical indicating whether the target variable is continuous, which affects grouping strategy. Default is `TRUE`.

padj_cutoff

Numeric value specifying the adjusted p-value cutoff for filtering features. Default is `1`.

index

Numeric index used for ordering output file names. Default is `1`.

category

Character string specifying the data category: `"signature"` or `"gene"`.

signature_group

List specifying the grouping variable for signatures. Options include `"sig_group"` for signature grouping or `"signature_collection"`/`"signature_tme"` for gene grouping.

ProjectID

Character string specifying the project identifier for file naming.

palette_box

Character string or integer specifying the color palette for box plots. Default is `"nrc"`.

cols_box

Character vector of specific colors for box plots. Default is `NULL`.

palette_corplot

Character string or integer specifying the color palette for correlation plots. Default is `"pheatmap"`.

palette_heatmap

Integer specifying the color palette index for heatmaps. Default is `2`.

feature_limit

Integer specifying the maximum number of features to display. Default is `26`.

character_limit

Integer specifying the maximum number of characters for variable labels. Default is `60`.

show_heatmap_col_name

Logical indicating whether to display column names on heatmaps. Default is `FALSE`.

show_col

Logical indicating whether to display color codes for palettes. Default is `FALSE`.

show_plot

Logical indicating whether to display plots. Default is `FALSE`.

path

Character string specifying the directory path for saving output files. Default is `NULL`.

discrete_x

Numeric threshold for character length beyond which labels are discretized. Default is `20`.

discrete_width

Numeric value specifying the width for label wrapping in plots. Default is `20`.

show_palettes

Logical indicating whether to display color palettes. Default is `FALSE`.

fig.type

Character string specifying the format for saving figures (`"pdf"`, `"png"`, etc.). Default is `"pdf"`.

Value

Depending on configuration, returns ggplot2 objects (box plots, heatmaps, correlation plots) and/or a data frame containing statistical analysis results.

Author

Dongqiang Zeng

Examples

set.seed(123)

pdata_group <- data.frame(
  ID = 1:100,
  phenotype_score = rnorm(100)
)

feature_data <- data.frame(
  ID = 1:100,
  Feature1 = rnorm(100),
  Feature2 = rnorm(100),
  Feature3 = rnorm(100)
)

sig_group_example <- list(
  signature = c("Feature1", "Feature2", "Feature3")
)

results <- iobr_cor_plot(
  pdata_group = pdata_group,
  feature_data = feature_data,
  id1 = "ID",
  id2 = "ID",
  target = "phenotype_score",
  is_target_continuous = TRUE,
  category = "signature",
  signature_group = sig_group_example,
  show_plot = FALSE,
  path = tempdir()
)
#>  Processing signature: "signature"
#> tidyHeatmap says: (once per session) from release 1.7.0 the scaling is set to "none" by default. Please use scale = "row", "column" or "both" to apply scaling
#> tidyHeatmap says: If you use tidyHeatmap for scientific research, please cite: Mangiola, S. and Papenfuss, A.T., 2020. 'tidyHeatmap: an R package for modular heatmap production based on tidy principles.' Journal of Open Source Software. doi:10.21105/joss.02472.
#> This message is displayed once per session.
#> Warning: `when()` was deprecated in purrr 1.0.0.
#>  Please use `if` instead.
#>  The deprecated feature was likely used in the tidyHeatmap package.
#>   Please report the issue at
#>   <https://github.com/stemangiola/tidyHeatmap/issues>.
#>  Heatmap3 palettes: pheatmap, peach, blues, virids, reds, normal

#>  Computing spearman correlation for 3 features
#>  Correlation analysis complete

print(results)
#> # A tibble: 3 × 6
#>   sig_names p.value statistic p.adj log10pvalue stars
#>   <chr>       <dbl>     <dbl> <dbl>       <dbl> <fct>
#> 1 Feature2    0.225  -0.122   0.493      0.647  "+"  
#> 2 Feature3    0.328  -0.0987  0.493      0.484  "+"  
#> 3 Feature1    0.969   0.00390 0.969      0.0136 ""