I am an Assistant Professor in the Department of Bio and Brain Engineering and Graduate School of Engineering Biology at KAIST. Previously, I had the opportunity to work very closely with molecular biologists in V. Narry Kim’s lab at Seoul National University. In 2016, I received my Ph.D. in Computer Science at Princeton University working with Olga G. Troyanskaya. As an undergraduate, I majored in both Computer Science and Mathematics at the University of Texas at Austin where I was first introduced to computational biology by Tandy Warnow.
Please read the Featured Spotlight for more about my journey as a computational biologist, my advice to undergraduates and graduate students, and why I stayed in academia.
If you are interested in working with me, please feel free to contact me with a brief overview of your background and research interests.
PhD in Computer Science, 2016
Princeton University
BSc in Computer Science, 2010
The University of Texas at Austin
BSc in Mathematics, 2010
The University of Texas at Austin
I never teach my pupils; I only attempt to provide the conditions in which they can learn.
– Albert Einstein
The advent of massive open online courses has changed the way we look at education and challenges traditional views on the role of instructors. Simple transfer of knowledge is no longer the rate-limiting step in educating the next generation. Instead, knowledge is now accessible to anyone with a computer, tablet, or mobile phone connected to the internet. I have benefited tremendously from these initiatives, but they have also forced me to reevaluate my pedagogical values. This led me to my three foundations of instruction and mentorship: construction, selection, and interaction. These foundations are the basis of the courses below.
The science of today is the technology of tomorrow.
– Barbara McClintock
Biology is not random, just largely unknown. There is an almost infinite number of possible interactions, yet only a sparse handful constitute a complex living system. To narrow down this vast search space, massive amounts of biological data are being generated to capture snapshots or snippets of the functional genome, multicellular heterogeneity, and complex human diseases. In this effort, bioinformatics algorithms play a key role in interpreting these large data collections and elucidating the underlying principles, at both the molecular and system levels.
The Young Laboratory at KAIST draws upon ideas from data science, applied statistics, and machine learning to tackle fundamental questions in quantitative biology. We incorporate problem-specific knowledge into the behavior of our algorithms to address the challenge of underspecification in modern machine learning methods. One of our primary objectives is to complete the human gene regulatory network. Specifically, we aim to map the missing regulatory axes of functional RNAs in terms of RNA modification, RNA structure, and protein-RNA interaction.
Only 2% of the human genome consists of protein-coding genes. The remaining 98% is non-coding and thought to encode the regulatory information for gene expression. Interpreting this non-coding region is thus key to understanding the functional genome and its implications for complex diseases. To tackle this, we take advantage of biological data generated from breakthroughs in chemical biology and bioengineering such as short- and long-read sequencing, oligosynthesis, chemical probing, and click chemistry.
In particular, we focus on elements of the genome that are transcribed into functional RNAs. Advances in biochemical and high-throughput techniques provide strong evidence that 74.7% of the human genome undergoes transcription, thus highlighting the importance of RNA research in functional genomics. The technology-specific computational tools built in our lab offer the means towards integrative genomics and functional interpretation. Our goal is to achieve this at single-nucleotide resolution across transcription, processing, modification, translation, decay, and other stages of the RNA life cycle.
It’s an exciting time to work in modern biology and bioengineering. Innovations in high-throughput techniques such as single-cell sequencing and spatial transcriptomics provide the means to extract deeper molecular insights into organismal development, immunology, and cancer biology. Existing computational models are being challenged and improved, bringing us closer to unraveling the complexity of biological systems.
The algorithmic task here is to address inherent computational and statistical challenges in handling data generated from each high-throughput technology. This could be anything from applying the right data normalization to correcting batch effects for data integration. The key is to incorporate biology-specific knowledge into the design of computational tools, statistical models, and neural architectures. In light of this, our lab is committed to the development of these tailored methods, which we then use to extract quantitative principles underlying the biological data.
RNA therapeutics, genome editing, and artificial organoids represent just a few examples in biological engineering that are changing the way we approach human biology. However, these endeavors are often combinatorial optimization problems that hold near-infinite potential but are intractable for brute-force algorithms. In RNA engineering, for example, there are more than 10^60 possible 100-nucleotide sequences (4^100 is about 1.6 x 10^60) with varying degrees of functionality. To put this into perspective, the estimated number of atoms on Earth is only about 10^50. This simple mathematical exercise shows the limits of relying solely on high-throughput screening for sequence design and optimization.
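The sequence-count arithmetic above can be checked with a short sketch. It assumes a four-letter RNA alphabet (A, C, G, U) and takes the atoms-on-Earth figure from the text as an order-of-magnitude estimate; the function and constant names are illustrative only.

```python
import math

ALPHABET_SIZE = 4            # assumption: standard RNA alphabet A, C, G, U
SEQUENCE_LENGTH = 100        # sequence length used in the text
LOG10_ATOMS_ON_EARTH = 50    # order-of-magnitude estimate quoted in the text


def log10_sequence_count(length, alphabet_size=ALPHABET_SIZE):
    """Return log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet_size)


# Python integers are arbitrary precision, so the exact count is available too.
exact_count = ALPHABET_SIZE ** SEQUENCE_LENGTH
log10_count = log10_sequence_count(SEQUENCE_LENGTH)

# Orders of magnitude by which the sequence space exceeds the atom estimate.
excess_orders = log10_count - LOG10_ATOMS_ON_EARTH

print(f"log10(number of sequences) = {log10_count:.2f}")
print(f"digits in the exact count  = {len(str(exact_count))}")
print(f"orders of magnitude above the atom estimate = {excess_orders:.2f}")
```

Working in log10 keeps the comparison readable: the sequence space exceeds the atom estimate by roughly ten orders of magnitude, which is why exhaustive screening is not a realistic design strategy.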
Our approach involves building powerful search algorithms and intelligent systems to navigate this vast combinatorial space. Inspired by works in translational medicine, we leverage meaningful insights and principles from molecular biology and functional genomics for the computational optimization of bioengineering. We are interested in accelerating a wide range of applications including next-generation molecular devices for diagnosis and automatic systems for synthetic biology.
We may have all come on different ships, but we’re in the same boat now.
– Martin Luther King, Jr.
*equal contributions #corresponding author
The essence of strategy is choosing what not to do.
– Michael Porter
I confess I do not know why, but looking at the stars always makes me dream.
– Vincent Van Gogh
Here is the list of places I’d like to get my specific cup of coffee.