COSMIC cancer database
COSMIC is an online database of somatically acquired mutations found in human cancer. Somatic mutations are those that occur in non-germline cells and are not inherited by children. COSMIC, an acronym of Catalogue Of Somatic Mutations In Cancer, curates data from papers in the scientific literature and large-scale experimental screens from the Cancer Genome Project at the Sanger Institute. The database is freely available without restriction via its website.

Creation and history

The COSMIC database was designed to collect and display information on somatic mutations in cancer. It was launched in 2004 with data from just four genes, HRAS, KRAS2, NRAS and BRAF, all of which are known to be somatically mutated in cancer. Since its creation, the database has expanded rapidly. By 2005 COSMIC contained 529 genes screened from 115,327 tumours, describing 20,981 mutations. By August 2009 it contained information from 1.5 million experiments, encompassing 13,423 genes in almost 370,000 tumours and describing over 90,000 mutations. COSMIC version 48, released in July 2010, incorporated mutation data for p53 in collaboration with the International Agency for Research on Cancer. It also provided updated gene co-ordinates for the most recent human reference genome builds. This release included data from over 2.76 million experiments on over half a million tumours, documenting 141,212 mutations in total.

The website focuses on presenting complex phenotype-specific mutation data graphically. Data is taken from selected genes, initially those in the Cancer Gene Census, and from literature searches of PubMed.

Process

Data can be accessed by selecting a gene or a cancer tissue type (phenotype), using either the browse-by-features option or the search box. Results show summary information with mutation counts and frequencies. The gene summary page provides a mutation spectrum map and external resources; the phenotype (tissue) summary page provides lists of mutated genes.
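The two summary views described above can be sketched in a few lines of Python. This is only an illustration of the idea of counting mutated samples and dividing by screened samples; the records, field names and function names are all hypothetical and are not taken from COSMIC or its website.

```python
from collections import defaultdict, namedtuple

# Hypothetical toy records, invented for illustration only (not COSMIC data).
Screen = namedtuple("Screen", ["sample", "gene", "tissue", "mutated"])

SCREENS = [
    Screen("S1", "BRAF", "skin", True),
    Screen("S2", "BRAF", "skin", False),
    Screen("S3", "BRAF", "large intestine", True),
    Screen("S4", "KRAS", "large intestine", True),
    Screen("S5", "KRAS", "large intestine", False),
    Screen("S6", "KRAS", "lung", False),
    Screen("S7", "CDKN2A", "skin", True),
    Screen("S8", "CDKN2A", "lung", False),
]


def _summarise(records, group_by):
    """Return {group: (mutated count, screened count, frequency)}."""
    screened = defaultdict(int)
    mutated = defaultdict(int)
    for rec in records:
        key = getattr(rec, group_by)
        screened[key] += 1
        if rec.mutated:
            mutated[key] += 1
    return {k: (mutated[k], screened[k], mutated[k] / screened[k]) for k in screened}


def gene_summary(records, gene):
    """Gene view: mutation counts and frequencies for one gene, split by tissue."""
    return _summarise([r for r in records if r.gene == gene], "tissue")


def tissue_summary(records, tissue):
    """Tissue (phenotype) view: mutation counts and frequencies per gene in one tissue."""
    return _summarise([r for r in records if r.tissue == tissue], "gene")


if __name__ == "__main__":
    print(gene_summary(SCREENS, "BRAF"))
    print(tissue_summary(SCREENS, "large intestine"))
```

Here a frequency is simply the number of mutated samples divided by the number of samples screened for that gene and tissue, which is one plausible reading of the "mutation counts and frequencies" shown on the summary pages.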

Examples

The figure shows the CDKN2A gene, a tumour suppressor that leads to cancer when it is inactivated.

Contents

The COSMIC database contains thousands of somatic mutations that are implicated in the development of cancer. The database collects information from two major sources. First, mutations in known cancer genes are collected from the literature; the genes that undergo manual curation are identified by their presence in the Cancer Gene Census. Second, data is collected from whole-genome resequencing studies of cancer samples undertaken by the Cancer Genome Project. For example, Campbell and colleagues used next-generation sequencing to examine samples from two individuals with lung cancer, which led to the identification of 103 somatic DNA rearrangements.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 