hd_correlate() calculates the correlation matrix of the input dataset.
Arguments
- x
A numeric vector, matrix or tibble.
- y
A numeric vector, matrix or tibble with dimensions compatible with x. Default is NULL.
- use
A character string. The method to use for computing correlations in the presence of missing values. Default is "pairwise.complete.obs". Other options are "everything", "all.obs", "complete.obs", or "na.or.complete".
- method
A character string. The correlation method to use. Default is "pearson". Other options are "kendall" or "spearman".
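To illustrate how the choice of method can change the result, here is a minimal sketch that calls stats::cor() directly rather than hd_correlate(). It assumes that hd_correlate() forwards method to cor(), which the Details section suggests but does not state outright. The vectors x and y are made up for illustration.

```r
# Sketch with base R only; hd_correlate() itself is not called here.
# x and y are illustrative data: y is monotone in x but not linear.
x <- 1:10
y <- x^3

# Pearson measures linear association; Spearman and Kendall are rank-based.
pearson  <- stats::cor(x, y, method = "pearson")
spearman <- stats::cor(x, y, method = "spearman")
kendall  <- stats::cor(x, y, method = "kendall")

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Because y increases strictly with x, both rank-based coefficients equal 1, while the Pearson coefficient falls below 1 since the relationship is not linear.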
Details
You can read more about the method for computing covariances in the presence of missing values and the coefficient that is calculated in the documentation of the cor() function in the stats package.
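The effect of the use options can be seen with a small sketch against stats::cor() itself. The matrix m below is made-up data with two missing values, and the assumption (not confirmed here) is that hd_correlate() passes use through to cor() unchanged.

```r
# Sketch with base R only; m is illustrative data, not package data.
m <- cbind(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, NA, 8, 10),
  c = c(5, 3, 4, NA, 1)
)

# "everything": any pair of columns involving an NA yields NA.
everything <- stats::cor(m, use = "everything")

# "complete.obs": rows containing any NA are dropped before computing.
complete <- stats::cor(m, use = "complete.obs")

# "pairwise.complete.obs": each pair of columns uses all rows
# where both of those two columns are observed.
pairwise <- stats::cor(m, use = "pairwise.complete.obs")

print(everything["a", "b"])
print(complete["a", "c"])
print(pairwise["a", "c"])
```

With "complete.obs" only rows 1, 2 and 5 are used for every pair, whereas "pairwise.complete.obs" uses rows 1, 2, 3 and 5 for columns a and c, so the two options give different coefficients for that pair.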
Examples
# Correlate features in a dataset (column wise)
dat <- example_data |>
  dplyr::select(DAid, Assay, NPX) |>
  tidyr::pivot_wider(names_from = "Assay", values_from = "NPX") |>
  dplyr::select(-DAid)
hd_correlate(dat)[1:5, 1:5] # Subset of the correlation matrix
#>        AARSD1  ABL1 ACAA1  ACAN ACE2
#> AARSD1   1.00  0.47  0.19 -0.06 0.04
#> ABL1     0.47  1.00  0.46 -0.01 0.13
#> ACAA1    0.19  0.46  1.00  0.03 0.32
#> ACAN    -0.06 -0.01  0.03  1.00 0.07
#> ACE2     0.04  0.13  0.32  0.07 1.00
# Correlate 2 vectors
vec1 <- c(1, 2, 3, 4, 5)
vec2 <- c(5, 4, 3, 2, 1)
hd_correlate(vec1, vec2)
#> [1] -1