General Biology Seminar
There is a convergence of scale between technologies for biological data generation, such as genome sequencing and imaging, and technologies for data analysis, such as cloud computing and supercomputing. This convergence is creating a new paradigm for discovery in biology, in which computing over vast amounts of data can lead to novel, testable hypotheses. Like previous phase shifts in biology, such as the development of genetics in the early 20th century and the advent of molecular biology in the late 20th century, "Big Data Biology" in the 21st century will succeed only if it leads to deeper insights into biological mechanisms and helps generate useful knowledge for biomedicine and agriculture. I will show several examples of specific predictions we have made about, and insights we have gained into, the genetics of cancers and other complex diseases by mining data from thousands of genomes, tens of thousands of images, and more than 100 million patient records. I will show how this approach has allowed us to pinpoint molecules for further genetic and biochemical study and to identify potential therapeutic interventions.