New York Presbyterian/Columbia University Irving Medical Center (NYP/CUIMC) serves a high number of racial/ethnic minority and low-income patients. In this project, we will create a data repository of all patients who have completed a universal screen in a clinical encounter for social determinants of health, including food insecurity. The scholar will handle large datasets extracted from the medical record for database creation and data visualizations. The dataset will include patient demographics, food security, and clinical outcomes. This data resource will allow the scholar to partner with researchers to examine predictors of food insecurity, clinical courses, and health outcomes among a large population of patients, including a time period prior to the COVID-19 surge in New York City. The project will be co-mentored by members of the University-wide Food Systems Network, a novel collaboration of researchers at the Medical Center, Earth Institute, SIPA, and Teachers College.


A better approximation of the age of an NYC building can improve how the building is assigned to a structural type, which captures the type of construction and the relevant building code in effect. Mapping the age and type of buildings would help NYC DOB and the City on a number of fronts, including enabling NYC DOB to enforce building and construction safety more effectively and to evaluate risk when adjacent or nearby subsurface construction is proposed. Furthermore, a more precise characterization of NYC buildings will improve the City’s efforts to craft policies aimed at energy efficiency (TWG) as it drives toward 80% GHG reductions by 2050 (80X50) and to determine the natural disaster vulnerability of its building stock (HAZUS).


The State regulates the generation, recycling, and reuse of Construction and Demolition Waste (CDW) and collects all data on it. There is no city source of CDW data. For the City to innovate on CDW policy by leveraging its capital program to close material loops, which would yield both environmental and financial sustainability benefits, the single most important step is understanding where CDW goes from demolition through recycling.


In collaboration with DDC, the Microsoft AI team has developed a predictive machine learning model that forecasts the monthly distribution of cash flow for DDC’s active projects. DDC intends to operationalize this model and possibly integrate it into its dashboards. DDC is seeking a data scientist to collaborate on operationalizing the model: DDC would prepare the visuals, and the data scientist would assist with operationalizing the machine learning components.


Rights CoLab is working with the Sustainability Accounting Standards Board (SASB) to develop and define a strengthened set of disclosure standards that investors can use to persuade companies to improve labor rights for both direct employees and workers in their supply chains. The project has two components: a data science project and an Independent Advisory Group. Our coalition of labor experts, data scientists, and SASB partners is focused on improving social disclosure standards that drive real gains in human rights.


DEP uses near real-time water quality data to guide its operations (i.e., the selection and routing of water) to achieve optimum quality for consumers. Historical data is used to evaluate the effectiveness of watershed protection programs, and model predictions of future water quality are used to understand potential impacts to the water supply under different infrastructure and climate scenarios.


Memory is a basic function of the brain that enables us to use past experiences to serve the present and future on a daily basis. Memory function is often disrupted in neurological and psychiatric diseases, such as Alzheimer’s disease and posttraumatic stress disorder. To understand the molecular mechanism of memory storage, we will focus on DNA methylation, a chemical modification of the genome that is hypothesized to play a critical role in memory. We have identified thousands of DNA methylation changes at numerous genomic loci that occurred during the formation of fear and reward memory in the mouse brain. We will develop new computational tools to analyze these changes in DNA methylation and search for common sequence features of these genomic loci. The results of this project will lead to a systematic understanding of the principles governing the function and regulation of DNA methylation in memory, and will pave the way for new therapeutic strategies for diseases involving memory defects.



Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program engages and supports undergraduate and master’s students in participating in data science-related research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY