Skip to content

GenAI Genesis project. Won the Google & HBSU Best Healthcare AI Hack award: https://devpost.com/software/imagehr

License

Notifications You must be signed in to change notification settings

Toyakki/ImagEHR

Repository files navigation

imagEHR

This is an AI application that automates the extraction and mapping of multi-modal clinical data to industry standards to streamline research, trial, and regulatory use. By the power of AI we empower healthcare professionals by reducing repetitive, time-consuming tasks and letting them focus on what matters most "Patients". The problem is the manual mapping of unstructured clinical data wastes valuable time and delays patient care and drug development. Our solution lies in the Semi-automation of this process which can reduce delays, boost accuracy, and free clinical teams to focus on saving lives.

Test it out live at https://imagehr-k7kg.onrender.com (note: some inference features are unavailable due to free hosting limits)

Check it out on DevPost: https://devpost.com/software/imagehr

Table of Contents:

How to Use imagEHR

Install PyTorch https://pytorch.org/get-started/locally/

SETUP:

pip install -r requirements.txt

PULL SUBMODULES:

git submodule update --init --remote --recursive

ENVIRONMENT: Copy .env.example into .env and fill in your API keys, etc.

Run the Flask Server:

python main.py

Code references

Application Inner Workings

Automates data extraction of EHR (text) and X-ray (image) data, and then infers how it can be mapped to industry standards for downstream research, trial, and regulatory use.

Note: There are several types of clinical data that can be used to build a patient profile (eg. pharmacokinetic, laboratory, etc.) For simplicity, this application was developed using only EHR (text) and X-ray (image) data as raw inputs for a patient’s profile.

• Cohere to extract meaningful insights from EHR • Pre-trained model to extract meaningful data from X-ray images • Cohere to predict what SDTM domains the data can be mapped to • Cohere to evaluate how raw, extracted data from EHR + X-ray images map to CDISC industry standards under each SDTM domain • Output .csv file of mapped data extractions in industry-ready format

Note: Physicians and clinical programmers can also manually add or remove any domain or variable using their judgement.

Challenges we ran into

The SDTM domains and CDISC terminology (controlled vs. non-controlled terminology) was extremely complex and hard to navigate. We had to teach ourselves how this mapping process worked. Access to real-world clinical data was limited. For example, the EHR sandbox data we used was low quality and there were not many good alternatives.

Our accomplishments

• Low latency (< 5 minutes) throughout the automation process. • Incorporating multi-modal data analyses to improve workflows.

Lessons learned

• How to work together as a team with our different backgrounds in biological science and computer sciences. • Walks help to refresh the mind, body, and spirit after working for consecutive hours.

Next steps

• Incorporation of additional data modalities (eg. pharmacokinetic, oncology, genetics, laboratory, etc.) • Using a more accurate vision or multi-modal model & incorporating physician feedback • Integration with FHIR-based hospital systems to enable real-time data ingestion and seamless interoperability with EHR platforms

References

About

GenAI Genesis project. Won the Google & HBSU Best Healthcare AI Hack award: https://devpost.com/software/imagehr

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •