In the LINCS data science research webinar held on December 20, 2016, Yongqun "Oliver" He, DVM, PhD, of the University of Michigan Medical School (DCIC External Data Science Research Project), presented "Cell Line Ontology-based Standardization, Integration and Analysis of LINCS Cell Lines".
A biomedical ontology is a human- and computer-interpretable set of terms and relations that represent entities in a specific biomedical domain and how they relate to each other. Ontologies have played a critical role in standardizing, exchanging, and integrating biomedical data, metadata, and knowledge, as well as in inferring new knowledge.
Cell lines have been widely used in biomedical research. As a member of the Open Biological/Biomedical Ontologies (OBO) Foundry library ontologies, the community-based Cell Line Ontology (CLO) covers the domain of cell lines. Co-developed by many ontology development groups and societies, CLO has established consensus definitions of cell line-specific terms such as ‘cell line’, ‘cell line cell’, and ‘cell line culturing’. A community-agreed CLO cell line design pattern has also been established and used for CLO to represent nearly 40,000 cell lines from various resources such as ATCC, HyperCLDB, Coriell Cell lines, and Japan Riken cell lines. CLO has also been used in different applications, for example, in the modeling of cell line cell-vaccine/pathogen interactions.
Cell lines are crucial for studying molecular signatures and pathways, and are widely used in the NIH Library of Integrated Network-based Cellular Signatures (LINCS) project. To better serve the LINCS research community, we generated a CLO subset/view (LINCS-CLOview) of LINCS cell lines. The LINCS-CLOview includes 1,097 LINCS cell lines. For each cell line, the CLO subset includes its cell type, original tissue/organ/organism information, and associated disease, as well as how these entities are related. In total, 121 diseases, including three benign neoplasms (e.g., breast fibrocystic disease associated with MCF10A and MCF 10F cell lines) and 118 various types of cancers, were found and laid out using the hierarchical structure of the Disease Ontology (DOID). These LINCS cell lines are also associated with various human or mouse tissue and organ types, represented by 131 UBERON ontology terms. Forty-three cell types, represented by the Cell Type Ontology (CL), are associated with LINCS cell lines. The information in the CLO LINCS subset can be easily queried using SPARQL scripts, and it can also be used to enhance the performance of website queries of cell line-related information.
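To make the idea of querying such a subset concrete, the following is a minimal sketch in plain Python of the triple-pattern matching that a SPARQL query performs. It is not the LINCS-CLOview itself and does not use real CLO, DOID, UBERON, or CL identifiers: the predicate names (`associated_with_disease`, `is_a`, `derives_from_tissue`) and the helper functions are hypothetical stand-ins, and only the link between MCF10A, MCF 10F, and breast fibrocystic disease (a benign neoplasm) is taken from the text above.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Toy triples. Predicate names are hypothetical, not real CLO relation IRIs.
TRIPLES: List[Triple] = [
    # Disease associations stated in the abstract.
    ("MCF10A", "associated_with_disease", "breast fibrocystic disease"),
    ("MCF 10F", "associated_with_disease", "breast fibrocystic disease"),
    # One-step simplification of the disease hierarchy.
    ("breast fibrocystic disease", "is_a", "benign neoplasm"),
    # Illustrative tissue links, not copied from the LINCS-CLOview.
    ("MCF10A", "derives_from_tissue", "breast"),
    ("MCF 10F", "derives_from_tissue", "breast"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cell_lines_for_disease_class(triples: List[Triple],
                                 disease_class: str) -> List[str]:
    """Find cell lines whose associated disease is a direct child of disease_class."""
    # Step 1: diseases that are direct subclasses of the requested class.
    diseases = {s for s, _, _ in match(triples, p="is_a", o=disease_class)}
    # Step 2: cell lines associated with any of those diseases.
    hits = {s for s, _, o in match(triples, p="associated_with_disease")
            if o in diseases}
    return sorted(hits)


if __name__ == "__main__":
    print(cell_lines_for_disease_class(TRIPLES, "benign neoplasm"))
```

In an actual SPARQL script, the two steps would typically be written as two joined triple patterns over the ontology's real IRIs, and a transitive subclass path could replace the single `is_a` hop used here.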
The next stage of CLO-related research will also be discussed. For example, we are currently developing a CLO branch to represent various stem cell lines. CLO can also be further developed to model and analyze the data and knowledge of cell line gene transfection, cell signature markers, drug-induced molecular interaction pathways, and other cell line phenotypes. Overall, CLO can be used as an ontological basis and extensible platform to support integrative and systematic cell line cell-based research.
BD2K-LINCS DCIC: http://lincs-dcic.org
LINCS Project: http://www.lincsproject.org/