Background: The central dogma of biology stipulates that DNA is transcribed into RNA, which is translated into proteins, which carry out functions around the cell. However, as time passes, we are discovering more and more exceptions to this dogma. One such exception is small non-coding RNAs (sRNAs): these short fragments of RNA don’t get translated into proteins; instead, they fold into small structures and carry out many key functions, including catalytic ones, in the bacterial cell. sRNAs are uniquely versatile: they can interact with both protein and nucleic acid targets, are responsible for bacterial responses to environmental stimuli, and can serve as virulence mechanisms.


Students will design and contribute new features to the AI Model Share MLOps Platform. Projects will give students first-hand experience working with and developing MLOps tools, including model deployment, continuous model improvement, and ML analytics. Individualized projects will allow students to 1) integrate advanced deep learning models into our system, 2) work on ML model replication tools, 3) integrate new ML dashboards into our toolkit, and more.


Research in Lyme disease shows that it is very hard to identify clinically meaningful improvement in chronic patients, whose symptoms tend to wax and wane. Our team developed a diagnostic tool, the General Symptom Questionnaire (Fallon et al., 2019, PMID: 31867334), and gathered a large amount of data on patients who attended our center for research and/or treatment. The purpose of the proposed study is to analyze the existing data to find clinically meaningful cut-offs on the scale that can tell clinicians whether a patient has improved. If you are interested in psychometrics and want to contribute to the understanding of how this chronic disease evolves, this is the project for you.


The aim of this project is to use artificial intelligence (AI) to extract valuable information from unstructured eye movements of highly skilled domain experts, in particular those of expert clinicians as they perform complex diagnostic decision-making tasks. Such eye-movement data is rich in patterns that can be deciphered using unsupervised machine learning algorithms (such as k-nearest neighbor/hierarchical clustering and principal components analysis) or unsupervised deep learning algorithms (such as deep generative models, autoencoders, and long short-term memory autoencoders for sequence data). Furthermore, as novices transform into experts, patterns embedded in their eye movements (time spent on regions of interest vs. time spent on surgical equipment) may offer a valuable tool for extracting features that pinpoint the critical mechanisms (‘eureka moments’) behind expert decision-making. The primary objectives of this project are (1) to collect eye movements of novice and expert ophthalmologists as they view medical images during eye-disease diagnoses using benchtop-based, head-mounted, or virtual reality-embedded eye trackers (Eyelink 1000, Pupil Labs Core, or HTC Vive Pro, respectively) and (2) to apply unsupervised machine learning/deep learning approaches to extract meaningful information from this data. Features to be extracted from this data include but are not limited to: fixation duration and fixation count in regions of interest, fixation order, saccade velocity, and pupil diameter. This data collection and data analytics project will enable extraction of the most relevant features for task-oriented training of future AI-based disease diagnosis systems.
Capturing eye movements, and thereby the underlying visual decision-making mechanisms behind an expert’s knowledge that are not otherwise quantifiable, will allow us to mimic these mechanisms in AI systems, potentially improving their diagnostic accuracy and interpretability for future clinical applications.
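To make the region-of-interest features named above more concrete, here is a minimal Python sketch of how fixation count, total fixation duration, and fixation order might be computed from a list of fixations. It is an illustration only: the `Fixation` and `Region` types, the rectangular regions, and the `roi_features` function are hypothetical names introduced for this example, not part of any eye-tracker API or of the project's actual pipeline.

```python
from collections import defaultdict
from typing import NamedTuple


class Fixation(NamedTuple):
    """One fixation (hypothetical record layout, assumed for this sketch)."""
    x: float            # horizontal gaze position in image pixels
    y: float            # vertical gaze position in image pixels
    duration_ms: float  # how long the gaze stayed at this position


class Region(NamedTuple):
    """A rectangular region of interest (assumed shape, for simplicity)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx: Fixation) -> bool:
        return self.left <= fx.x <= self.right and self.top <= fx.y <= self.bottom


def roi_features(fixations, regions):
    """Return (fixation count per region, total duration per region, visit order)."""
    counts = defaultdict(int)
    durations = defaultdict(float)
    order = []
    for fx in fixations:
        for region in regions:
            if region.contains(fx):
                counts[region.name] += 1
                durations[region.name] += fx.duration_ms
                # Record the region only when the gaze moves to a new one.
                if not order or order[-1] != region.name:
                    order.append(region.name)
                break  # assign each fixation to at most one region
    return dict(counts), dict(durations), order
```

Fixations that fall outside every region are simply ignored here; a real analysis would likely track them separately. The resulting per-region vectors are the kind of input that clustering or dimensionality-reduction methods could then operate on.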


Clinicians place orders for patients in the electronic health record (EHR). There is currently no internal mechanism to detect medical errors in EHR systems. Identifying and monitoring medical errors has relied on voluntary reporting and chart review, methods that are subject to substantial self-reporting bias. To quantify the magnitude of wrong-patient errors, I developed and validated the Wrong-Patient Retract-and-Reorder (RAR) measure. The Wrong-Patient RAR measure overcomes the limitations of voluntary reporting by using an electronic query to objectively detect wrong-patient orders in EHR data. Whereas previous data indicated an average of 9 wrong-patient medication errors per hospital per year based on voluntary reporting,2 the RAR measure identified 5,246 wrong-patient orders in a large healthcare system in 1 year. The vastly greater volume of errors detected provides insights into the epidemiology of wrong-patient orders, informs targeted intervention strategies, and yields sufficient numbers of events to power health IT safety intervention studies.
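As a rough illustration of how an electronic query of this kind might work, the Python sketch below flags retract-and-reorder patterns in a list of orders. It rests on assumptions that are not stated in the text above: that an event means an order retracted shortly after being placed and then re-placed by the same clinician for a different patient, and that both time windows are 10 minutes. The `Order` record and the `find_rar_events` function are hypothetical names for this example and do not describe the validated measure's actual implementation.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Order(NamedTuple):
    """One EHR order (hypothetical record layout, assumed for this sketch)."""
    clinician: str
    patient: str
    item: str
    placed: datetime
    retracted: Optional[datetime] = None  # None if the order was never retracted


def find_rar_events(orders, window=timedelta(minutes=10)):
    """Return (retracted_order, reorder) pairs matching the assumed pattern.

    Assumed pattern: an order is retracted within `window` of being placed,
    and the same clinician places the same item for a different patient
    within `window` after the retraction.
    """
    events = []
    for first in orders:
        if first.retracted is None:
            continue
        if first.retracted - first.placed > window:
            continue  # retracted too late to count under the assumed rule
        for second in orders:
            if (second.clinician == first.clinician
                    and second.item == first.item
                    and second.patient != first.patient
                    and first.retracted <= second.placed <= first.retracted + window):
                events.append((first, second))
                break  # count each retracted order at most once
    return events
```

The nested loop keeps the logic easy to read; on real EHR volumes the same rule would more plausibly be expressed as a database query or a sorted, per-clinician scan.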



Columbia Data Science Institute (DSI) Scholars Program

The DSI Scholars Program engages and supports undergraduate and master's students participating in data science-related research with Columbia faculty. The program’s unique enrichment activities will foster a learning and collaborative community in data science at Columbia.

Columbia University DSI

New York, NY