Natural Language Processing within the CONCERN Project
The CONCERN project aims to develop models and tools that quantify clinician concern about patient deterioration in the inpatient setting for use in early warning scores. We have discovered and validated several measurable signals of clinician concern within the Electronic Health Record (EHR) and have demonstrated that our approach identified patients at risk of deterioration earlier than other methods, which focus only on physiological data. One of our approaches leverages the documentation, in the narrative text of nursing notes, of certain concepts that are consistent with concern about a patient. However, this narrative free text is not easily accessible: it is often mixed with structured or templated text, and its layout varies across note types. The steps to be performed are:
- Learn EHR data models, particularly related to the structure and types of narrative notes.
- Develop an algorithm to detect and extract narrative (unstructured) text from within various note types that include both structured and unstructured clinical data.
- Employ natural language methods to extract entities identified by subject matter experts and develop a classification model to determine instances of concepts within the narrative text.
This project utilizes the SQL and Python programming languages.
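The second and third steps above can be illustrated with a minimal Python sketch. It is not the project's algorithm: the regular expression for templated lines, the four-word threshold, the concern terms, and the sample note are all hypothetical and synthetic, chosen only to show the general idea of filtering out "Label: value" fields and checkbox items before searching the remaining free text for expert-defined terms.

```python
from __future__ import annotations

import re

# Hypothetical pattern for templated content: short "Label: value" fields
# and checkbox items such as "[x] ...". Real note types will differ.
TEMPLATE_LINE = re.compile(r"^\s*(?:[A-Za-z][\w /]{0,30}:\s*\S.*|\[[ xX]\].*)$")

# Hypothetical concern terms; in practice these would come from
# subject matter experts, not from this list.
CONCERN_TERMS = ("worried", "concerned", "notified md", "will continue to monitor")


def extract_narrative(note: str) -> list[str]:
    """Return lines that look like free text rather than templated fields."""
    narrative = []
    for line in note.splitlines():
        stripped = line.strip()
        if not stripped or TEMPLATE_LINE.match(stripped):
            continue
        # Keep only lines with at least four words; shorter fragments are skipped.
        if len(stripped.split()) >= 4:
            narrative.append(stripped)
    return narrative


def find_concern_terms(lines: list[str]) -> list[tuple[str, str]]:
    """Return (term, line) pairs for each concern term found in a line."""
    hits = []
    for line in lines:
        lowered = line.lower()
        for term in CONCERN_TERMS:
            if term in lowered:
                hits.append((term, line))
    return hits


# Synthetic example note, not real patient data.
SAMPLE_NOTE = """\
Temp: 37.2 C
Pain score: 3/10
[x] Fall precautions in place
Patient appears more lethargic than this morning, notified MD at bedside.
Family is concerned about decreased appetite since yesterday.
"""

narrative_lines = extract_narrative(SAMPLE_NOTE)
concern_hits = find_concern_terms(narrative_lines)
for term, line in concern_hits:
    print(f"{term!r} found in: {line}")
```

A rule-based filter like this misclassifies narrative sentences that happen to begin with a short phrase followed by a colon, and plain substring matching ignores negation and context. That gap is one reason the third step calls for a classification model rather than term lookup alone.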
Selected candidate(s) may receive a stipend directly from the faculty advisor. This is not a guarantee of payment, and the total amount is subject to available funding.
Faculty Advisor
- Professor: Sarah Rossetti / Kenrick Cato
- Department/School: Department of Biomedical Informatics / School of Nursing
- Location: PH-20
- Dr. Rossetti’s research focuses on identifying and intervening on system-level weaknesses, particularly those related to poor communication and care coordination, that increase patient risk for harm within our healthcare system. Her work applies computational tools to mine and extract value from electronic health record (EHR) data and leverages user-centered design of patient-centered and collaborative decision support tools. The mission of the Optimizing with Applied Clinical Informatics Models and Methods (OPTACIMM) Center is to use informatics to advance healthcare delivery and outcomes for individuals and communities.
Project Timeline
- Earliest starting date: 3/1/2021
- End date: 12/31/2021
- Number of hours per week of research expected during Spring 2021: ~6
- Number of hours per week of research expected during Summer 2021: ~20-40
Candidate requirements
- Skill sets: Familiarity with Python and Jupyter Notebooks
- Student eligibility: freshman, sophomore, junior, senior, master’s
- International students on F1 or J1 visa: eligible
- Academic Credit Possible: No