Over-representation of correlation analysis (ORCA): a method for identifying associations between variable sets.

Title: Over-representation of correlation analysis (ORCA): a method for identifying associations between variable sets.
Publication Type: Journal Article
Year of Publication: 2015
Authors: Pomyen Y, Segura M, Ebbels TMD, Keun HC
Date Published: 01/2015
Keywords: Alkylating Agents; Computational Biology; Data Interpretation, Statistical; Databases, Factual; Datasets as Topic; Enzyme Inhibitors; Gene Expression Profiling; Genomics; Humans; MicroRNAs; Molecular Sequence Annotation; Neoplasms; Tumor Cells, Cultured; Tumor Markers, Biological

MOTIVATION: When analyzing biological data, it is often important to interpret the correlation structure that exists between variables. Such correlations may reveal patterns of co-regulation that are indicative of biochemical pathways or common mechanisms of response to a related set of treatments. However, analyses of correlations are usually conducted either by subjective interpretation of the univariate covariance matrix or by applying multivariate modeling techniques, neither of which takes prior biological knowledge into account. Over-representation analysis (ORA) is a simple method for objectively deciding whether a set of variables of known or suspected biological relevance, such as a gene set or pathway, is more prevalent in a set of variables of interest than expected by chance. However, ORA is usually applied to a set of variables that differentiate a single experimental variable, and it does not take correlations between variables into account.
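Classical ORA is commonly framed as an upper-tail hypergeometric test. The following is a minimal sketch under that assumption, not the authors' code; the function name `ora_p_value` and its parameter names are hypothetical and chosen only for illustration.

```python
from math import comb


def ora_p_value(universe_size, set_size, selected_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of variables measured in total
    set_size:      number of variables in the annotated set (e.g. a pathway)
    selected_size: number of variables of interest
    overlap:       number of variables of interest that fall in the annotated set
    """
    total = comb(universe_size, selected_size)
    upper = min(set_size, selected_size)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative numbers only: 20 variables measured, 5 in the annotated set,
# 5 variables of interest, 3 of which belong to the annotated set.
p = ora_p_value(20, 5, 5, 3)
print(f"ORA p-value: {p:.4f}")
```

A small p-value indicates that the annotated set appears among the variables of interest more often than random selection would explain. In practice a multiple-testing correction is usually applied when many sets are tested.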

RESULTS: Over-representation of correlation analysis (ORCA) is a novel combination of ORA and correlation analysis that provides a means to test whether more associations exist between two specific groups of variables than expected by chance. The method is exemplified by application to drug sensitivity and microRNA expression data from a panel of cancer cell lines (NCI60). ORCA highlighted a previously reported correlation between sensitivity to alkylating anticancer agents and topoisomerase inhibitors. We also used this approach to validate microRNA clusters predicted by mRNA correlations. These observations suggest that ORCA has the potential to reveal novel insights from these data that are not readily apparent using classical ORA.
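One way to read the idea of testing for an excess of associations between two groups is to apply the hypergeometric test to pairs of variables rather than to single variables. The sketch below follows that reading and is not the authors' R implementation. Several choices are assumptions made for illustration: the universe is taken to be all pairs among the measured variables, an association is declared by a fixed cutoff on the absolute Pearson correlation (`r_cutoff`), and all function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def hypergeom_upper_tail(universe, marked, drawn, observed):
    """P(X >= observed) for a hypergeometric draw without replacement."""
    tail = sum(
        comb(marked, k) * comb(universe - marked, drawn - k)
        for k in range(observed, min(marked, drawn) + 1)
    )
    return tail / comb(universe, drawn)


def orca_sketch(data, group_a, group_b, r_cutoff=0.9):
    """Test for an excess of strong correlations between two disjoint groups.

    data: dict mapping variable name -> list of measurements across samples.
    Returns (overlap, n_between, n_associated, p_value).
    """
    all_pairs = list(combinations(sorted(data), 2))
    # Assumption: a pair counts as associated when |r| reaches the cutoff.
    associated = {
        (u, v) for u, v in all_pairs
        if abs(pearson(data[u], data[v])) >= r_cutoff
    }
    # Pairs with one member in each group.
    between = {
        (u, v) for u, v in all_pairs
        if (u in group_a and v in group_b) or (u in group_b and v in group_a)
    }
    overlap = len(associated & between)
    p_value = hypergeom_upper_tail(
        len(all_pairs), len(between), len(associated), overlap
    )
    return overlap, len(between), len(associated), p_value


# Toy data, invented purely for illustration (5 samples, 6 variables).
toy = {
    "a1": [1, 2, 3, 4, 5],
    "a2": [2, 4, 6, 8, 10],
    "b1": [5, 4, 3, 2, 1],
    "b2": [1, 2, 3, 4, 6],
    "c1": [3, 1, 4, 1, 5],
    "c2": [2, 7, 1, 8, 3],
}
overlap, n_between, n_associated, p_value = orca_sketch(toy, {"a1", "a2"}, {"b1", "b2"})
print(overlap, n_between, n_associated, round(p_value, 4))
```

Counting pairs instead of single variables is what separates this from classical ORA: the "variables of interest" become the associated pairs, and the "annotated set" becomes the pairs that link the two groups. A real analysis would more likely define associations through significance testing with multiple-testing control than through a fixed correlation cutoff.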

AVAILABILITY AND IMPLEMENTATION: The R code of the method is available at https://github.com/ORCABioinfo/ORCAcode.

Alternate Journal: Bioinformatics
PubMed ID: 25183485