This project combines computer vision and natural language processing to identify animals from images, then enriches the identification with detailed information from Wikipedia. It's a practical exploration of modern AI capabilities wrapped in a simple, user-friendly interface.
The system operates through three core components:
- Vision Recognition: Uses OpenAI's multimodal capabilities to identify animals in uploaded images
- Knowledge Retrieval: Leverages LlamaIndex to fetch and process relevant Wikipedia articles
- Interactive Interface: Presents findings through a clean Gradio UI
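At a high level, a single request flows through these three components in order. The sketch below is purely illustrative: the helper names `classify_image` and `get_animal_info` are assumptions made for this README, not the project's actual API.

```python
# Illustrative flow only; helper names are assumptions, not the project's actual API.
import gradio as gr
from classifier import classify_image   # vision: image file -> animal name
from agent import get_animal_info       # retrieval: animal name -> Wikipedia-based summary

def identify_and_describe(image_path: str) -> str:
    animal = classify_image(image_path)   # 1. vision recognition
    summary = get_animal_info(animal)     # 2. knowledge retrieval
    return f"## {animal}\n\n{summary}"

# 3. interactive interface (Gradio serves on http://127.0.0.1:7860 by default)
gr.Interface(fn=identify_and_describe,
             inputs=gr.Image(type="filepath"),
             outputs=gr.Markdown()).launch()
```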
You'll need the following on your development machine:
- Python 3.12+ (earlier versions might work but weren't tested)
- pip (for package management)
- virtualenv (for isolated environments)
# Clone this repository
git clone <repository-url>  # replace with this repository's URL
# Navigate to project directory
cd animals-classifier
# Set up virtual environment
python -m venv venv
# Activate environment
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set your OpenAI API key
export OPENAI_API_KEY="sk-..." # On Windows: set OPENAI_API_KEY=sk-...
# Launch the application
python main.py
Once running, open your browser and navigate to http://127.0.0.1:7860 to interact with the classifier.
The project is structured around three main Python modules:
- `agent.py`: Implements the Wikipedia interaction layer using LlamaIndex
- `classifier.py`: Handles image processing and OpenAI API integration
- `main.py`: Provides the Gradio interface and application entry point
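For orientation, here is a rough sketch of what the two backend modules might contain. This is not the repository's actual code: the model name, the prompt, and the LlamaIndex import paths (which differ between versions) are all assumptions.

```python
# classifier.py (sketch): image file -> animal name via OpenAI's vision-capable chat API
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_image(image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Name the animal in this image (common name only)."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()


# agent.py (sketch): animal name -> summary grounded in the Wikipedia article
from llama_index.core import VectorStoreIndex
from llama_index.readers.wikipedia import WikipediaReader

def get_animal_info(animal: str) -> str:
    docs = WikipediaReader().load_data(pages=[animal])   # fetch the article
    index = VectorStoreIndex.from_documents(docs)        # index it for retrieval
    question = f"Summarize the {animal}: habitat, diet, and notable traits."
    return str(index.as_query_engine().query(question))
```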
The `images` directory contains sample animal images from Unsplash (free license) for testing the classifier's capabilities.
The project leverages several powerful technologies:
- OpenAI: For multimodal vision and language capabilities
- LlamaIndex: For structured retrieval from Wikipedia
- Gradio: For rapid interface development
- Pillow: For image processing
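The list above doesn't spell out how Pillow fits in; in pipelines like this it is typically used to downscale and re-encode uploads before they are sent to the vision API, keeping request payloads small. The helper below is a hypothetical example of that kind of preprocessing, not necessarily what `classifier.py` does.

```python
# Hypothetical preprocessing helper: illustrates Pillow's likely role, not the project's actual code.
import base64
import io

from PIL import Image

def encode_for_vision(image_path: str, max_side: int = 1024) -> str:
    """Downscale to at most max_side pixels per side and return a base64-encoded JPEG."""
    img = Image.open(image_path).convert("RGB")
    img.thumbnail((max_side, max_side))          # in place; preserves aspect ratio
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=85)
    return base64.b64encode(buf.getvalue()).decode("utf-8")
```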
Potential enhancements to consider:
- Caching previously processed animals to reduce API costs (a minimal sketch follows this list)
- Expanding beyond Wikipedia to other knowledge sources
- Adding capability to compare multiple animals
- Implementing offline recognition for common species
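As an example of the caching idea above, one minimal approach is to memoize lookups by animal name in a small JSON file, so repeated uploads of the same species reuse the earlier summary instead of triggering new API calls. The sketch below is one possible design, not part of the project; `fetch` stands in for whatever function produces the summary (for instance, the agent's Wikipedia lookup).

```python
# Hypothetical on-disk cache for animal summaries: a possible enhancement, not project code.
import json
from pathlib import Path
from typing import Callable

def cached_lookup(animal: str,
                  fetch: Callable[[str], str],
                  cache_file: Path = Path("animal_cache.json")) -> str:
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    key = animal.strip().lower()              # "Red Panda" and "red panda" share one entry
    if key not in cache:
        cache[key] = fetch(animal)            # only call the expensive pipeline on a miss
        cache_file.write_text(json.dumps(cache, indent=2))
    return cache[key]
```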
This project represents a practical intersection of multiple AI capabilities: image recognition, knowledge retrieval, and natural language processing. While relatively simple in scope, it demonstrates how powerful AI interfaces can be created with surprisingly little code.