Computing and Mathematical Sciences Colloquium
Annenberg 105
Coding Techniques for Emerging DNA-Based Storage Systems
Professor Olgica Milenkovic,
ECE Department,
University of Illinois, Urbana-Champaign
Despite the many advances in traditional data recording techniques, the surge of Big Data platforms and energy conservation issues have posed new challenges for the storage community in identifying extremely high-volume, non-volatile, and durable recording media. The potential for using macromolecules for ultra-dense storage was recognized as early as the 1960s, when the celebrated physicist Richard Feynman outlined his vision for nanotechnology in the talk "There's Plenty of Room at the Bottom." Among known macromolecules, DNA is unique in that it lends itself to implementations of non-volatile recording media of outstanding integrity (one can still recover the DNA of species extinct for more than 70,000 years) and extremely high storage capacity (a human cell, with a mass of roughly 3 pg, hosts DNA encoding 6.4 GB of information). Building upon the rapid growth of biotechnology systems for DNA synthesis and sequencing, we developed and implemented a new DNA-based rewritable, random-access memory. Our system is based on DNA editing combined with constrained and error-control coding techniques that ensure data reliability and specificity and sensitivity of access while providing exceptionally high data storage capacity. The coding methods used range from traditional prefix-synchronized codes to newly introduced profile codes. As a proof of concept, we encoded in DNA parts of the Wikipedia pages of six universities in the USA, selected specific content blocks, and edited portions of the text at various positions within the blocks.
Joint work with Han Mao Kia, Jian Ma, Hussein Tabatabaei Yazdi, Yongbo Yuan, and Huimin Zhao.
For more information, please contact Carmen Nemer-Sirois by phone at (626) 395-4561 or by email at carmens@caltech.edu.