NYC DDC has initiated a machine learning project to develop a predictive model for estimating the cost of projects and work items. Using the latest techniques in machine learning and advanced statistics, NYC DDC aims to develop a model that predicts the cost of future and active projects and construction work items in different phases of the project lifecycle, based on historical data. DDC has partnered with Microsoft, which is providing guidance and making tools available for the proof-of-concept development. DDC is seeking the assistance of a data scientist from the Town and Gown program to develop the model.

The spread of COVID-19 has led to unprecedented and ongoing changes to daily life, including shelter-in-place orders, widespread closing of businesses and schools, and work-from-home and school-from-home at previously unknown levels. These changes in behavior are placing extraordinary demands on the Internet. This project will measure the Internet’s ability to meet these demands, including how its performance compares before, during, and after the peak of COVID-19; whether the amount of change varies between areas heavily impacted by COVID-19 and those less impacted; and whether and how large networks adapt. To provide this rich understanding, this project will combine multiple Internet-scale datasets that offer complementary views of how responses to COVID-19 have impacted the Internet and how networks have reacted. Measuring the network impact of COVID-19 will illuminate the Internet’s strengths and weak points and is a crucial step towards improving the Internet’s future resilience in the face of pandemics, natural disasters, large-scale conflict, and terrorist attacks.

A translational medical informatics project is available to identify risk factors associated with head and neck cancer and lung cancer in electronic medical records. Tasks include data extraction, data curation, and establishing and maintaining a database of biospecimens and patients' characteristics. Statistical analysis and modeling will be done to identify clinical characteristics and risk factors that are associated with aggressive forms of tumors. Training and mentorship will be provided. Prospective candidates should have strong communication skills, a willingness to work in a highly collaborative environment, and excellent time management and organizational skills.

Single-cell sequencing has generated unprecedented insight into the cellular complexity of normal and diseased organs. We are interested in using this technique to understand the mechanisms of eye development, disease, and regeneration. We would also like to compare the transcriptomic signatures of mouse models and human tissues. This project involves analysis of large amounts of data from single-cell sequencing. It requires an understanding of statistical analysis and proficiency in programming.

The proposed project would focus on analyzing quantitative data from a 4-year NIMH-funded study entitled “Integrating evidence-based depression treatment in primary care: Tuberculosis (TB) in Brazil as a model” (PI: Sweetland, K01MH104514). The aim of the study was to assess whether social network analysis could be used to leverage the receptivity and connectivity of TB providers in a Brazilian public health system in a way that could accelerate the adoption (implementation) and diffusion (dissemination) of an evidence-based treatment for depression in a primary care network. Baseline receptivity was operationalized via six brief quantitative scales measuring mental health literacy, work self-efficacy, organizational climate, attitudes towards evidence-based practices, organizational readiness to change, and individual innovation thresholds. Connectivity was assessed by asking TB providers with whom they discuss difficult TB cases, to whom they give advice, and from whom they receive advice. Baseline receptivity and connectivity data were used to identify 3 pilot sites in which to train primary care providers to deliver evidence-based depression treatment for one year.

The goal of this project is to mitigate the risks of commuting for Columbia employees when they have to return to work after the state's “on pause” order ends from May 15. While Columbia has been preparing to ramp up labs with an emphasis on social distancing within campus, higher risks could arise when employees have to commute between home and campus. It is estimated that approximately one in five residents in NYC might have been infected by COVID-19. Fearing exposure to the coronavirus, commuters have shifted from transit to individual cars or bikes, leading to a significant drop in subway ridership, more speeding tickets, surging bike traffic, and more crashes with cyclist injuries. On the other hand, low-income people of color, who have been hit the hardest by the coronavirus, could be in a more disadvantaged position after the state “un-pause”, because: (1) they lack access to travel modes other than public transit; (2) they usually live far from their workplace to find affordable accommodation and have to commute a long way; and (3) many of them work night shifts, but most transportation options have been shut down at night since the pandemic began. This project aims to address the travel safety and equity concerns of essential workers and provide a responsible and safe transportation solution for the Columbia community.

Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program engages and supports undergraduate and master's students participating in data science-related research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY