Predicting gene expression from sequencing in Alzheimer’s Disease

The goal of this project is to evaluate algorithms that predict gene expression directly from sequencing data. With large-scale sequencing data available from the Alzheimer’s Disease Sequencing Project (ADSP) and recent progress in machine learning methods, it is now possible to model long-range interactions in the DNA sequence to infer intermediate phenotypes such as gene expression. First, we will test Enformer, a deep learning method that integrates long-range interactions in the genome (such as promoter-enhancer interactions) to predict gene expression from sequence. Using RNA-sequencing data available for a small number of samples (e.g., the ROSMAP cohort), we will optimize the algorithm to improve prediction accuracy. Second, we will use the inferred expression in the ADSP cohorts to test for association with Alzheimer’s Disease and related endophenotypes. Finally, we will incorporate datasets that become available in the future, such as cell-type-specific ATAC-seq and disease-specific gene expression, to retrain the models and improve gene expression prediction directly from sequencing data.

Selected candidate(s) can receive a stipend directly from the faculty advisor. This is not a guarantee of payment, and the total amount is subject to available funding.

Faculty Advisor

Professor: Badri Vardarajan
Center/Lab:
Location: 630 W 168th Street, 19th Floor, New York, NY 10032
Bioinformatics and computational multi-omics analysis in Alzheimer’s Disease

Project Timeline

Earliest starting date: 9/15/21
End date: 5/31/22
Number of hours per week of research expected during Fall 2021: ~20

Candidate requirements

Skill sets: Machine Learning, Bioinformatics, Data Science
Student eligibility: ~~freshman~~, ~~sophomore~~, ~~junior~~, senior, master’s
International students on F1 or J1 visa: eligible
Academic Credit Possible: Yes
Additional comments: Knowledge of bioinformatics and deep learning is a plus

Columbia Data Science Institute (DSI) Scholars Program