This project is the first comprehensive examination of African North Americans who crossed one of the U.S.-Canada borders, in either direction, after the Underground Railroad era, in the generation alive roughly 1865-1930. It analyzes census and other records to match individuals and families across the decades, despite changes or ambiguities in their names, ages, “color,” birthplace, or other details.
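As a rough illustration of how records might be matched despite spelling and age discrepancies, the sketch below flags two census entries as a candidate pair when their names are similar and their implied birth years agree within a tolerance. The field names, thresholds, and sample records are all hypothetical and are not drawn from the project's data or methods.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_candidate_match(rec_a, rec_b, name_threshold=0.8, year_tolerance=3):
    """Flag two records as a possible match (thresholds are assumptions).

    Each record is a dict with hypothetical keys: name, age, census_year.
    Reported ages are often imprecise, so implied birth years are compared
    with a tolerance rather than for equality.
    """
    birth_a = rec_a["census_year"] - rec_a["age"]
    birth_b = rec_b["census_year"] - rec_b["age"]
    names_close = name_similarity(rec_a["name"], rec_b["name"]) >= name_threshold
    years_close = abs(birth_a - birth_b) <= year_tolerance
    return names_close and years_close


# Invented records for illustration only.
us_1880 = {"name": "Mary Johnson", "age": 25, "census_year": 1880}
ca_1891 = {"name": "Mary Johnston", "age": 35, "census_year": 1891}
other = {"name": "Samuel Brown", "age": 60, "census_year": 1891}

match_found = is_candidate_match(us_1880, ca_1891)
false_match = is_candidate_match(us_1880, other)
```

A real linkage effort would likely add further evidence, such as birthplace or household members, before accepting a pair.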

Genome-wide CRISPR lethality screens show broad variability in cellular fitness phenotypes across cancer. We postulate that genes with overlapping functions should deliver similar responses, enabling functional annotation of uncharacterized genes. Here we will build a network connecting genes based on the similarity of their knockout phenotypes, benchmark this network using protein interaction databases and functional transcriptomics, and leverage network analyses to identify mutational and transcriptional modulators of functional complexes.
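One simple way to picture such a network is to connect two genes when their knockout fitness scores are strongly correlated across cell lines. The sketch below uses Pearson correlation with an arbitrary cutoff; the similarity measure, the threshold, and the gene names and scores are all assumptions made for illustration, not the project's actual pipeline.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no similarity signal
    return cov / (sx * sy)


def build_network(scores, threshold=0.8):
    """Return edges (gene1, gene2, r) for gene pairs with r >= threshold.

    `scores` maps a gene name to its fitness scores, one per cell line,
    with cell lines in the same order for every gene.
    """
    edges = []
    for g1, g2 in combinations(sorted(scores), 2):
        r = pearson(scores[g1], scores[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges


# Invented fitness scores across four hypothetical cell lines.
toy_scores = {
    "GENE_A": [-1.0, -0.2, -0.9, 0.1],
    "GENE_B": [-0.9, -0.1, -1.0, 0.0],
    "GENE_C": [0.1, -1.0, 0.0, -0.8],
}
network = build_network(toy_scores)
```

In this toy data only GENE_A and GENE_B share a similar profile, so they are the only pair joined by an edge.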

A wealth of evidence for the automaticity of perceptual organization processes points toward the existence of a global-to-local processing bias in early perceptual stages. Global features are encoded and spontaneously reported during early conscious vision, resulting in the perception of coherent objects prior to identifying detailed information. Yet experiments that presented illusory figures below the perceptual threshold, to study the reliance of perceptual organization on visual awareness, have produced conflicting findings, leaving open the question of how global features interact during figure perception. The present study will examine the interaction between symmetry and perceptual completion under conditions of restricted awareness.

This project has a two-fold aim. First, we seek to determine what makes an idea seem novel versus ordinary and whether there is an ideal mix of the two. Second, building on these findings, we will build a generative model that suggests tweaks to an idea that enhance its perceived creativity and appeal. We will pursue these two aims using 69K recipes and reviews from allrecipes.com. We will use an NLP approach to extract important features from each recipe, such as ingredients, preparation instructions, and review content.

Prediction markets have been used to forecast outcomes of research interest using market mechanisms (see https://www.nature.com/news/the-power-of-prediction-markets-1.20820). A decentralized prediction market, Augur, has been created on a blockchain for betting purposes (see https://www.augur.net/). An alternative approach to prediction markets has been proposed in Dalal et al. (https://www.sciencedirect.com/science/article/abs/pii/S0040162511000734). This project proposes to develop a new model for a decentralized prediction market that can be used to elicit the opinions of university researchers on socially important issues. Specifically, the project will use an Ethereum-based platform to develop a smart contract and an ERC-20 compliant token for researchers to participate in the new market.

The federal government spends billions of dollars a year supporting rural broadband (internet access), subsidizing build-out in low-density areas that do not have broadband (unserved areas). However, it is not clear whether the rural areas most in need are receiving a fair share of the funding. Using a very large dataset of broadband availability, census data, and recent auction results, the project will analyze whether unserved areas with high racial diversity or lower median income are receiving a fair share of funding. Depending on team size, we will also attempt to create a shareable master dataset, building on OpenStreetMap and other sources, that provides key data points for census units.

Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program engages and supports undergraduate and master’s students in data science research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY