In this project we’ll be expanding on an existing family of supervised topic models. These models extend LDA to document collections in which each document comes with additional observed labels or values of interest. More specifically, one goal of this project is to use additional document-level data, such as regulatory discretion, to develop better data modelling tools.


I’m currently working, on loan, for NTIA on the BEAD (Broadband Equity, Access and Deployment) program, a roughly $40 billion project to deploy high-speed internet to all or most locations that currently lack access. We have a public and semi-public data set that lists every home and business in the United States, as well as broadband deployments and government grants. The project will answer questions such as: What will it cost to deploy fiber? Where are community anchor institutions located? What locations are already being subsidized? Which locations without service are in high-poverty areas?


Prediction markets have been used to forecast outcomes of research interest using market mechanisms. A decentralized prediction market, Augur, has been created on a blockchain for betting purposes. An alternative approach to prediction markets has been proposed in Dalal et al. This project develops a new hybrid model for a centralized and decentralized collaborative prediction market that can be used to elicit the opinions of university researchers on socially important issues. Specifically, the project uses a Django- and Ethereum-based platform to develop a smart contract and an ERC-20 compliant token that let researchers participate in the new market. The smart contract is being developed in Solidity and JavaScript. The corresponding frontend and backend use Django and Python on the AWS cloud. The project will require developing and experimenting with new Automated Market Makers of the kind used in DeFi.



Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program engages and supports undergraduate and master’s students in data science research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY