Background: The Human Health Exposure Analysis Resource (HHEAR) provides investigators access to laboratory and statistical analyses aimed at incorporating and expanding environmental exposures within their research. To benefit the broader scientific community, the HHEAR Data Repository houses deidentified epidemiologic and biomarker data from all studies accepted into the HHEAR program. To date, 46 studies including data from over 36,000 individuals are part of the HHEAR program. Of those studies, 26 have deposited their data in the HHEAR Data Repository. More information on HHEAR can be found at hheardatacenter.mssm.edu.


Background: The central dogma of biology stipulates that DNA is transcribed into RNA, which is translated into proteins, which carry out functions around the cell. However, as time passes, we are discovering more and more exceptions to this dogma. One such exception is small non-coding RNAs (sRNAs): these short fragments of RNA don’t get translated into proteins; instead, they fold into small structures and carry out many key and catalytic functions in the bacterial cell. sRNAs are uniquely versatile: they can interact with both protein and nucleic acid targets, are responsible for bacterial responses to environmental stimuli, and can serve as virulence mechanisms.


Scholars and practitioners alike have stressed that gender serves as a ‘symbolic glue’ for the mobilization of illiberal causes. Recent attacks on the “gender academy,” perpetrated in contexts of rising illiberalism, have taken a variety of forms, from the de-legitimation of gender programs to their outright closure, from the marginalization of scholars and researchers to their physical and psychological endangerment. As a crisis mitigation strategy, the Women and Gender in Global Affairs network at the Institute for the Study of Human Rights, working with other partners including CoreWoman, is exploring the development of an Early-Warning System to provide information, resources, and support to gender scholars who may face illiberal attacks. After an initial preparatory phase, the project will involve a test pilot in two to three Latin American countries. Possible candidates include Mexico, Brazil, Colombia, Peru, Nicaragua, Bolivia, and Venezuela. DSI student involvement will include a range of activities such as:


CRISPR/Cas13 is a programmable RNA-targeting system with significant therapeutic potential. However, methods for designing highly specific CRISPR/Cas13 systems are lacking. We have generated a large data set with high-throughput genomic assays, and a previous DSI scholar has developed a transformer-based model that is capable of predicting targeting specificity from RNA sequences. We are looking for a motivated student with a strong deep learning background and a basic understanding of molecular biology to improve the model and publish the results.


Students will design and contribute new features to the AI Model Share MLOps Platform. Projects will give students first-hand experience working with and developing MLOps tools, including model deployment, continuous model improvement, and ML analytics. Individualized projects will allow students to 1) integrate advanced deep learning models into our system, 2) work on ML model replication tools, 3) integrate new ML dashboards into our toolkit, and more.


Research on Lyme disease shows that clinically meaningful improvement is very hard to identify in chronic patients, whose symptoms tend to wax and wane. Our team developed a diagnostic tool, the General Symptom Questionnaire (Fallon et al., 2019, PMID: 31867334), and gathered a large amount of data on patients who attended our center for research and/or treatment. The purpose of the proposed study is to analyze the existing data to find clinically meaningful cut-offs on the scale that can tell clinicians whether a patient has improved. If you are interested in psychometrics and want to contribute to understanding how the chronic disease evolves, this is the project for you.


The aim of this project is to use artificial intelligence (AI) to extract valuable information from unstructured eye movements of highly skilled domain experts, in particular those of expert clinicians as they perform complex diagnostic decision-making tasks. Such eye-movement data is rich in patterns that can be deciphered using the power of unsupervised machine learning algorithms (such as k-nearest neighbor/hierarchical clustering and principal components analysis) or unsupervised deep learning algorithms (such as deep generative models, autoencoders, and long short-term memory autoencoders for sequence data). Furthermore, as novices transform into experts, patterns embedded in their eye movements (time spent on regions of interest vs. time spent on surgical equipment) may offer a valuable tool for extracting features that pinpoint the critical mechanisms (‘eureka moments’) behind expert decision-making. The primary objectives of this project are (1) to collect eye movements of novice and expert ophthalmologists as they view medical images during eye-disease diagnoses using benchtop-based, head-mounted, or Virtual Reality embedded eye trackers (Eyelink 1000, Pupil Labs Core, or HTC Vive Pro, respectively) and (2) to apply unsupervised machine learning/deep learning approaches to extract meaningful information from this data. Features to be extracted from this data include but are not limited to: fixation duration and fixation count in regions of interest, fixation order, saccade velocity, and pupil diameter. This data collection and data analytics project will enable extraction of the most relevant features for task-oriented training of future AI-based disease diagnosis systems.
Capturing eye movements, and thereby the underlying visual decision-making mechanisms behind an expert’s knowledge that are not otherwise quantifiable, will allow us to mimic these mechanisms in AI systems, potentially improving their diagnostic accuracy and interpretability for future clinical applications.
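As a rough illustration of the simplest features listed above (fixation count, total fixation duration, and fixation order per region of interest), the following sketch may help. It is not the project's pipeline: the record layout `(x, y, duration_ms)`, the rectangular ROI format, the ROI names, and the helper names `roi_of` and `summarize_fixations` are all assumptions made for this example.

```python
from collections import defaultdict


def roi_of(x, y, rois):
    """Return the name of the first ROI whose rectangle contains (x, y), else None."""
    for name, (x_min, y_min, x_max, y_max) in rois.items():
        if x_min <= x <= x_max and y_min <= y <= y_max:
            return name
    return None


def summarize_fixations(fixations, rois):
    """Compute per-ROI fixation count, total duration (ms), and ROI visit order.

    fixations: iterable of (x, y, duration_ms) tuples (hypothetical layout).
    rois: dict mapping ROI name -> (x_min, y_min, x_max, y_max) in pixels.
    """
    counts = defaultdict(int)
    durations = defaultdict(float)
    order = []
    for x, y, duration_ms in fixations:
        name = roi_of(x, y, rois)
        if name is None:
            continue  # fixation landed outside every ROI; ignored here
        counts[name] += 1
        durations[name] += duration_ms
        # Record a visit only when gaze enters a different ROI than the last one.
        if not order or order[-1] != name:
            order.append(name)
    return dict(counts), dict(durations), order


# Made-up example data, for illustration only.
rois = {
    "optic_disc": (100, 100, 200, 200),
    "macula": (300, 100, 400, 200),
}
fixations = [
    (150, 150, 220.0),
    (160, 140, 180.0),
    (350, 150, 300.0),
    (10, 10, 90.0),
    (120, 190, 110.0),
]

counts, durations, order = summarize_fixations(fixations, rois)
print(counts)
print(durations)
print(order)
```

Per-ROI summaries like these could then be stacked into a feature matrix and passed to a clustering or dimensionality-reduction method of the kind the project describes.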



Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program aims to engage and support undergraduate and master's students participating in data science-related research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY