Scientists at the National Cancer Institute (NCI) have created a dataset of cancer-specific genetic coding variants that includes six billion data points linking drugs with the changes in the genome.
This source of big data, reported to be the most extensive cancer pharmacology database in the world, provides an opportunity for data mining in cancer drug discovery, and could help researchers understand why patients’ responses to cancer drugs vary and how tumours become resistant. It also has the potential to advance the quest for personalised medicine.
The team carried out whole-exome sequencing of the 60 cell lines in the NCI-60 human cancer cell line panel and catalogued the genetic coding variants, including type I variants (found in the normal population) and type II variants (cancer-specific). The NCI-60 cell lines have been studied extensively and represent nine tissues of origin: breast, ovary, prostate, colon, lung, kidney, brain, blood, and skin. The researchers then used algorithms to test whether the type II variants could predict the cells’ sensitivity to 103 approved anticancer drugs and 207 drugs in development.
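The article does not say which algorithms the team used. As a rough illustration only, the general idea of linking a variant to drug response can be sketched by correlating the presence of a variant across cell lines with each drug's sensitivity score. Everything in the sketch below is hypothetical: the cell line names, the drug names, the scores, and the choice of a simple Pearson correlation are assumptions made for illustration and are not taken from the NCI-60 dataset or from the team's method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation, so no measurable association
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, NOT taken from the NCI-60 dataset.
# 1 = the cell line carries the variant, 0 = it does not.
variant_present = {
    "line_A": 1, "line_B": 1, "line_C": 0,
    "line_D": 0, "line_E": 1, "line_F": 0,
}

# Made-up sensitivity scores per drug (higher = more sensitive).
drug_sensitivity = {
    "drug_X": {"line_A": 7.9, "line_B": 8.1, "line_C": 5.0,
               "line_D": 5.2, "line_E": 8.0, "line_F": 4.9},
    "drug_Y": {"line_A": 6.0, "line_B": 5.1, "line_C": 5.9,
               "line_D": 5.0, "line_E": 5.5, "line_F": 5.6},
}


def rank_drugs(variant, sensitivity):
    """Rank drugs by how strongly sensitivity tracks variant presence."""
    lines = sorted(variant)
    flags = [variant[line] for line in lines]
    scores = {
        drug: pearson(flags, [values[line] for line in lines])
        for drug, values in sensitivity.items()
    }
    # Strongest association (positive or negative) first.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_drugs(variant_present, drug_sensitivity)
for drug, r in ranked:
    print(f"{drug}: r = {r:+.2f}")
```

In this toy example, the cell lines carrying the variant are markedly more sensitive to the hypothetical drug_X than those without it, so drug_X ranks first, while drug_Y shows almost no association. A real analysis across hundreds of drugs and thousands of variants would also need to correct for multiple testing and for confounders such as tissue of origin.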
James Doroshow, the director of the NCI’s Division of Cancer Treatment and Diagnosis, said in an interview in Shots that thousands of drugs have been screened using the NCI-60 panel to see their impact on the cells. The researchers have also analysed 5000 different combinations of approved drugs to see if they can find drugs that work well together.
In an interview with the American Association for Cancer Research, Yves Pommier, chief of the Laboratory of Molecular Pharmacology at the NCI in Bethesda, explained that the team is making the dataset public.
“Opening this extensive data set to researchers will expand our knowledge and understanding of tumorigenesis, as more and more cancer-related gene aberrations are discovered,” Pommier said in the interview. “This comes at a great time, because genomic medicine is becoming a reality, and I am very hopeful this valuable information will change the way we use drugs for precision medicine.”
While not involved in this study, GenoKey focuses on data mining and analytics, and has used a combination of genetic and clinical data, along with combinatorial analysis, to find links between genetic changes and bipolar disorder. The NCI’s dataset of cancer-specific genetic coding variants, and GenoKey’s work, show the power of using big data and healthcare analytics in medical and biopharma research, particularly when combining genetic and clinical data.