In this project we’ll be expanding on an existing family of supervised topic models. These models extend LDA to document collections in which additional labels or values of interest are observed for each document. More specifically, one goal of this project is to use additional document-level data, such as regulatory discretion, to develop better data modelling tools.

This project is eligible for a stipend, with matching funds from the faculty advisor and the Data Science Institute. This is not a guarantee of payment, and the total amount is subject to available funding.

Faculty Advisor

  • Professor: Sharyn O’Halloran
  • Center/Lab: Political Science / Data Analytics and Quantitative Analysis (DAQA) at SIPA
  • Location: 727 IAB
  • The research team is currently working on applying NLP and ML models to explain financial market regulation, trade policy, and environmental policy.

Project Timeline

  • Earliest starting date: 6/1/2023
  • End date: 8/15/2023
  • Number of hours per week of research expected during Spring-Summer 2023: ~20

Candidate requirements

  • Skill sets: Programming experience with Python. We are seeking students who have prior exposure to machine learning and natural language processing through a course or a research project. The ideal candidate would have prior working experience with topic models. Experience working with large datasets is a plus.
  • Student eligibility: freshman, sophomore, junior, senior, master’s
  • International students on F1 or J1 visa: eligible
  • Academic Credit Possible: Yes
  • Additional comments: Students are expected to commit to the project and meet the expected deliverables. Students must also attend regular weekly meetings and provide updates on their work.