Advances in natural language processing (NLP) and machine learning (ML) make it possible to analyze large, complex bodies of text, opening new avenues for studying intricate processes, such as government regulation of financial markets, at a scale unimaginable even a few years ago. This project develops scalable NLP and ML algorithms (classification, clustering, and ranking methods) that automatically classify laws into codes or labels, rank feature sets by use case, and induce the best structured representation of sentences for various types of computational analysis.
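
As a rough illustration of what classifying legal text into labels can look like, the sketch below trains a small multinomial Naive Bayes classifier on a handful of made-up provisions. The project description does not say which algorithms are used, so Naive Bayes is only a stand-in here, and the label names (`capital_requirements`, `disclosure`), the example sentences, and the class name `NaiveBayesTextClassifier` are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each label."""
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, n_docs in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data: invented provisions with invented labels.
TRAIN = [
    ("Banks must hold minimum capital reserves against risk-weighted assets",
     "capital_requirements"),
    ("The agency shall set capital adequacy ratios for depository institutions",
     "capital_requirements"),
    ("Issuers must disclose material risks to investors in quarterly reports",
     "disclosure"),
    ("Public companies shall file audited financial statements and disclose executive pay",
     "disclosure"),
]

texts, labels = zip(*TRAIN)
clf = NaiveBayesTextClassifier().fit(texts, labels)

print(clf.predict("Institutions are required to maintain capital buffers"))
print(clf.predict("Companies must disclose financial statements to investors"))
```

A real pipeline would need far more labeled text, a held-out evaluation set, and probably richer features than raw word counts; the point here is only to show the shape of the labeling step.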

The objective of the research is to provide standardized coding labels for policies, helping regulators better understand how key policy features impact financial markets.

This project is currently NOT accepting applications.

Faculty Advisors

  • Professor Sharyn O’Halloran
  • Department/School: SIPA/Political Science/SPS
  • Location: Data Science Institute Center for Financial & Business Analytics

Project timeline

  • Start date: 06/01/2018
  • End date: 08/15/2018
  • Number of hours per week of research expected: 25

Candidate requirements

  • Skill sets: Language parser, R, Python, database management and statistical analysis.
  • Student eligibility (as of Spring 2018): freshman, sophomore, junior, senior, master’s
  • International students on F1 or J1 visa: eligible