Decoding the human genome with interpretable deep learning

The function of much of the 3 billion letters in the human genome remains to be understood. Advances in DNA sequencing technology have generated enormous amounts of data, yet we still lack the tools to extract the rules of how the genome works. Deep learning holds great potential for decoding the genome, in particular because DNA sequences are digital in nature and deep learning can handle large data sets. However, as in many other applications, the limited interpretability of deep learning models hampers their ability to help us understand the genome. We are developing deep learning architectures embedded with the principles of gene regulation, and we will leverage millions of existing whole-genome measurements of gene activity to learn a mechanistic model of gene regulation in human cells.

One selected candidate will receive a stipend via the DSI Scholars program. The amount is subject to available funding.

Faculty Advisor

Professor: Xuebing Wu
Department/School: Department of Medicine and Department of Systems Biology
Location: 3960 Broadway, Lasker building Room 430, New York, NY 10032
We study the mechanisms of mammalian mRNA regulation using integrated experimental and computational approaches. We are also exploring the therapeutic potential of CRISPR-based mRNA targeting for treating human diseases.

Project Timeline

Earliest starting date: 10/15/2019
End date: 8/31/2020
Number of hours per week of research expected during Fall 2019: ~5

Candidate requirements

Skill sets: Python; basic understanding of machine learning
Student eligibility: freshman, sophomore, junior, senior, master’s
International students on F1 or J1 visa: eligible

Columbia Data Science Institute (DSI) Scholars Program