DNA sequence reads from a community of microbial genomes are currently processed without considering sequence variants. This project involves building a pipeline to process billions of such short reads: identifying the closest strains they might belong to, assembling them into specific clones, calling their variants, and analyzing how these bacterial strains change across sampling points.

This project is sponsored by the DSI Center for Health Analytics.

Faculty Advisor

  • Professor Itsik Pe’er
  • Department/School: Computer Science/SEAS
  • Location: Computer Science Building 506

Project timeline

  • Start date: 05/15/2018
  • End date: 08/15/2018
  • Number of hours per week of research expected: 40

Candidate requirements

  • Skill sets: Python; Unix; probability & statistics.
  • Student eligibility (as of Spring 2018): freshman, sophomore, junior, senior, master’s
  • International students on F1 or J1 visa: eligible