CLIP-CRD

CLIP-CRD is a tool for automating report diagnosis on chest radiographs; CRD stands for "Chest Radiography Diagnosis". I developed this project during my internship at Systech Datasoft Limited, with the goal of building a multi-modal model that leverages the CLIP model's semantic image-text representations to generate detailed descriptions of X-ray images. By combining OpenAI's CLIP with GPT-2 and following the system designed by CLIP_prefix_caption, we trained a model that achieves 54.98% test accuracy on chest radiography images, outperforming the baseline CLIP model.

Features

Multi-Modal Understanding – Integrates both image and text data to generate meaningful and accurate radiology reports.
Deep Learning-Powered – Leverages state-of-the-art models like CLIP and GPT-2 for improved diagnostic insights.
Performance Boost – Delivers 54.98% test accuracy, surpassing the vanilla CLIP model in prediction tasks.

Getting Started

Follow these steps to set up and run the project on your local machine.

1. Clone the Repository

Run the following command to clone the repository:

git clone git@github.com:Yash-Haque/CLIP-CRD.git
cd CLIP-CRD

2. Set Up Your Environment

Make sure you have Python installed. If you're using Conda, activate your environment:

conda activate CLIP-CRD

If you're using Python’s built-in virtual environment:

python -m venv venv
source venv/bin/activate  # On Mac/Linux
venv\Scripts\activate  # On Windows

3. Install Dependencies

Install the required packages:

pip install -r requirements.txt

4. Prepare the Data

Create a folder for storing data:

mkdir data

Download the OpenI dataset from this link and unzip it into the data/ folder.
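
The unzip step can also be scripted. The following is a minimal sketch, assuming the download is a single zip archive; the file name openi.zip and the helper extract_dataset are hypothetical and not part of this repository:

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, data_dir="data"):
    """Extract a zip archive into data_dir and return the archive's member names."""
    target = Path(data_dir)
    # Create data/ if it does not exist yet (same effect as `mkdir data`).
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # "openi.zip" is a placeholder; use the name of the file you downloaded.
    archive = Path("openi.zip")
    if archive.exists():
        names = extract_dataset(archive)
        print(f"Extracted {len(names)} entries into data/")
    else:
        print(f"{archive} not found; download the dataset first.")
```

Run it from the project root so that the files land in the same data/ folder created above.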

5. Run the Preprocessing Script

Navigate to the scripts/ folder:

cd scripts

Run the preprocessing script:

./parse_openi.sh

6. Run the Main Script

Once preprocessing is complete, execute the main script:

./run.sh

This will process the dataset, generate CLIP embeddings, and store the outputs in the outputs/ directory.

Notes

If you run into issues with missing dependencies, make sure you have PyTorch and OpenAI's CLIP installed.
If config.py isn't found, make sure sys.path includes the parent directory (..), i.e. the project root.
The scripts create missing folders automatically, so there is no need to create outputs/ manually.
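
The first two notes can be checked from Python. This is a sketch under assumptions: torch and clip are taken to be the import names for PyTorch and OpenAI's CLIP, the calling script is assumed to sit one level below the project root (for example in scripts/), and the helpers add_project_root and missing_modules are hypothetical names, not part of this repository:

```python
import importlib.util
import sys
from pathlib import Path


def add_project_root(script_path):
    """Put the parent of the script's folder (the project root) on sys.path."""
    root = str(Path(script_path).resolve().parent.parent)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


def missing_modules(names=("torch", "clip")):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumes this file sits in scripts/, so the root is one level up.
    print("Project root:", add_project_root(sys.argv[0]))
    print("Missing modules:", missing_modules() or "none")
```

If a module is reported missing, reinstall the requirements; once the project root is on sys.path, `import config` should resolve.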
