Perform over-representation analysis — do

do_ora() performs over-representation analysis (ORA) using the clusterProfiler package.

Usage

do_ora(
  protein_list,
  database = c("GO", "Reactome"),
  background = NULL,
  pval_lim = 0.05
)

Arguments

protein_list: A character vector of protein names to test for over-representation.
database: The database to use for the ORA, either "GO" or "Reactome".
background: A character vector of background proteins (genes). Defaults to NULL, meaning no background is provided.
pval_lim: The p-value threshold below which a term is considered significant. Defaults to 0.05.
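
The example output further down reports p-values "adjusted by 'BH'" with a cutoff of 0.05. As a rough illustration of what such a threshold does, the base R sketch below adjusts a handful of made-up p-values with the Benjamini-Hochberg method and keeps the terms below the cutoff. The term names and p-values are hypothetical, and the sketch assumes that pval_lim plays the role of this cutoff; it is not the code do_ora() runs internally.

```r
# Hypothetical raw p-values for five terms (illustrative values only)
pvals <- c(term_a = 0.0004, term_b = 0.012, term_c = 0.04,
           term_d = 0.2, term_e = 0.6)

# Benjamini-Hochberg adjustment, the method named in the example output
padj <- p.adjust(pvals, method = "BH")

# Assumed meaning of pval_lim: cutoff applied to the adjusted p-values
pval_lim <- 0.05
significant <- names(pvals)[padj < pval_lim]

print(round(padj, 4))
print(significant)
```

In this toy case term_c has a raw p-value below 0.05 but no longer passes after adjustment, which is why the number of reported terms can be smaller than a raw p-value filter would suggest.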

Value

A list containing the results of the ORA.

Details

When database = "GO", the ontology used is "BP" (Biological Process). When working with Olink data, it is recommended to provide a protein list as background; if no background is given, the function prints a message to that effect.
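
To see why the background matters, the sketch below computes a one-sided hypergeometric p-value, the standard test behind ORA, for the same overlap against two different backgrounds. The helper ora_pvalue() is a hypothetical name introduced here, the genome-wide counts are taken from the first row of the example output, and the panel counts (1463 measured proteins, 150 of them annotated to the term) are invented purely for illustration.

```r
# Hypothetical helper: P(X >= k) under the hypergeometric distribution.
# k: query proteins annotated to the term
# n: size of the query list
# K: background proteins annotated to the term
# N: size of the background
ora_pvalue <- function(k, n, K, N) {
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)
}

# Counts from the first row of the example output (GeneRatio 7/21, BgRatio 396/18888)
p_genome <- ora_pvalue(k = 7, n = 21, K = 396, N = 18888)

# Invented counts for a targeted panel used as background (assumption)
p_panel <- ora_pvalue(k = 7, n = 21, K = 150, N = 1463)

print(c(genome_wide = p_genome, panel = p_panel))
```

With a targeted panel, the term is already common among the measured proteins, so the same overlap is much less surprising than it appears against a genome-wide background.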

Examples

# Perform Differential Expression Analysis
control <- c("BRC", "CLL", "CRC", "CVX", "ENDC", "GLIOM", "LUNGC", "LYMPH", "MYEL", "OVC", "PRC")
de_res <- do_limma(example_data,
                   example_metadata,
                   case = "AML",
                   control = control,
                   wide = FALSE)
#> Comparing AML with BRC, CLL, CRC, CVX, ENDC, GLIOM, LUNGC, LYMPH, MYEL, OVC, PRC.

# Extract the up-regulated proteins for AML
sig_up_proteins_aml <- de_res$de_results |>
  dplyr::filter(sig == "significant up") |>
  dplyr::pull(Assay)

# Perform ORA with GO database
do_ora(sig_up_proteins_aml, database = "GO")
#> No background provided. When working with Olink data it is recommended to use background.
#> 'select()' returned 1:1 mapping between keys and columns
#> #
#> # over-representation test
#> #
#> #...@organism 	 Homo sapiens 
#> #...@ontology 	 BP 
#> #...@keytype 	 ENTREZID 
#> #...@gene 	 chr [1:21] "566" "100" "328" "54518" "9289" "9048" "285" "181" "51129" ...
#> #...pvalues adjusted by 'BH' with cutoff <0.05 
#> #...78 enriched terms found
#> 'data.frame':	78 obs. of  12 variables:
#>  $ ID            : chr  "GO:0050900" "GO:0050926" "GO:0045785" "GO:0001666" ...
#>  $ Description   : chr  "leukocyte migration" "regulation of positive chemotaxis" "positive regulation of cell adhesion" "response to hypoxia" ...
#>  $ GeneRatio     : chr  "7/21" "3/21" "6/21" "5/21" ...
#>  $ BgRatio       : chr  "396/18888" "26/18888" "485/18888" "313/18888" ...
#>  $ RichFactor    : num  0.0177 0.1154 0.0124 0.016 0.0153 ...
#>  $ FoldEnrichment: num  15.9 103.8 11.1 14.4 13.8 ...
#>  $ zScore        : num  10 17.5 7.54 7.96 7.76 ...
#>  $ pvalue        : num  1.52e-07 3.03e-06 1.09e-05 1.98e-05 2.44e-05 ...
#>  $ p.adjust      : num  0.000163 0.001616 0.003863 0.005215 0.005215 ...
#>  $ qvalue        : num  9.74e-05 9.68e-04 2.31e-03 3.12e-03 3.12e-03 ...
#>  $ geneID        : chr  "566/100/9048/25/2683/199/30817" "566/9048/285" "566/100/54518/9289/25/199" "100/285/51129/405/1386" ...
#>  $ Count         : int  7 3 6 5 5 5 5 5 3 5 ...
#> #...Citation
#>  T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
#>  clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
#>  The Innovation. 2021, 2(3):100141 
#>
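
The GeneRatio and BgRatio columns in the output above are stored as character strings such as "7/21". A minimal base R sketch for turning them into numbers and recomputing the fold enrichment is shown below. The data frame is typed in by hand from the first four rows printed above, and ratio_to_num() is a hypothetical helper name, not part of the package.

```r
# First four rows of the printed result, copied by hand
res <- data.frame(
  ID        = c("GO:0050900", "GO:0050926", "GO:0045785", "GO:0001666"),
  GeneRatio = c("7/21", "3/21", "6/21", "5/21"),
  BgRatio   = c("396/18888", "26/18888", "485/18888", "313/18888"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert "a/b" strings to the numeric value a / b
ratio_to_num <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

# Fold enrichment = proportion in the query / proportion in the background
res$fold <- ratio_to_num(res$GeneRatio) / ratio_to_num(res$BgRatio)

print(round(res$fold, 1))
```

Rounded to one decimal, these values agree with the FoldEnrichment column printed above (15.9, 103.8, 11.1, 14.4).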