This repository contains both a curated dialogue dataset and the tools used to generate it. The project aims to improve AI-human interactions through structured, empathetic conversation patterns.
- Dataset: A collection of dialogues demonstrating the React-Respond-Reflect framework
- Generation Tools: Python scripts for creating and processing dialogue data
- Training Pipeline: Scripts for fine-tuning models on the RRR framework
- Deployment Tools: Docker and API integration for serving trained models
- Streaming API: Real-time text generation with the RRR format
The dataset contains dialogues between users and a virtual human, where each response follows a three-part structure:
- React: Physical/emotional reactions expressed through actions and body language
- Respond: The actual verbal response to the user
- Reflect: Internal thoughts and analysis of the conversation
Each dialogue is stored in the following JSON structure:

```json
{
  "conversation_id": "unique_id",
  "messages": [
    {
      "role": "user",
      "content": "user message"
    },
    {
      "role": "assistant",
      "content": "virtual human response with react/respond/reflect tags"
    }
  ],
  "num_turns": "number of back-and-forth exchanges"
}
```
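As a quick sanity check, a record in this format can be loaded and validated with a few lines of Python (a minimal sketch; the file name is hypothetical and the field names follow the schema above):

```python
import json

# Hypothetical file holding a single dialogue record in the schema above.
with open("dialogue_example.json") as f:
    record = json.load(f)

# Basic checks against the schema.
assert isinstance(record["messages"], list)
assert all(m["role"] in {"user", "assistant"} for m in record["messages"])
print(record["conversation_id"], "-", record["num_turns"], "turns")
```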
The dialogues cover topics such as:

- Work-related stress and challenges
- Personal development and growth
- Technical learning and coding
- Time management and productivity
- Interpersonal relationships
- Mental health and wellbeing
The repository includes the following generation and curation tools:

- `seed_dialogues_generate_dataset.py`
  - Generates dialogues using GPT-4-mini (see the sketch after this list)
  - Batch processing with progress tracking
  - Temperature-based randomization
  - Automatic validation
- `seed_dialogues_convert_to_hf.py`
  - Converts to HuggingFace format
  - Generates dataset statistics
  - Handles dataset publishing
- `seed_dialogues_validate_tags.py`
  - Validates XML-style tags
  - Fixes formatting issues
  - Provides detailed reporting
- `seed_dialogues_save_curated.py`
  - Handles manual curation workflow
  - Creates automatic backups
  - Preserves dialogue structure
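As a rough illustration of the generation step, the sketch below batches dialogue prompts through the OpenAI chat completions API with temperature-based randomization. The model name, system prompt, and topic list are placeholders rather than the script's actual configuration:

```python
import random
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TOPICS = ["work-related stress", "time management", "technical learning"]  # placeholder topics
SYSTEM_PROMPT = (
    "You are a virtual human. Answer every user message with "
    "<react>...</react>, <respond>...</respond>, and <reflect>...</reflect> sections."
)

dialogues = []
for topic in TOPICS:
    # Temperature-based randomization: vary sampling settings per dialogue.
    temperature = random.uniform(0.7, 1.0)
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=temperature,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Start a dialogue about {topic}."},
        ],
    )
    dialogues.append(completion.choices[0].message.content)
```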
The `rrr_train.py` script provides a complete pipeline for fine-tuning models on the RRR framework (see the setup sketch after this feature list):
- Uses Unsloth for efficient training
- Supports LoRA fine-tuning
- Implements ChatML format
- Validates RRR format in outputs
- Optimized for consumer GPUs
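The snippet below sketches what an Unsloth + LoRA setup of this kind typically looks like; the base model name, LoRA rank, and target modules are illustrative rather than the exact values used in `rrr_train.py`:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit to fit consumer GPUs (model name is illustrative).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistralai/Mistral-7B-Instruct-v0.3",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules are common defaults,
# not necessarily the script's configuration.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```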
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure environment:

  ```bash
  cp .env.example .env
  # Add your API keys:
  # - OPENAI_API_KEY: For dialogue generation
  # - HF_TOKEN: For HuggingFace upload
  ```

- Run tools:

  ```bash
  # Generate dialogues
  python seed_dialogues_generate_dataset.py

  # Convert to HuggingFace format
  python seed_dialogues_convert_to_hf.py

  # Validate tags
  python seed_dialogues_validate_tags.py

  # Save curated dialogues
  python seed_dialogues_save_curated.py

  # Train the model
  python rrr_train.py
  ```
We've fine-tuned a Mistral-7B model to follow the React-Respond-Reflect framework, creating a conversational AI that provides structured, empathetic responses.
- Format Adherence: Consistently follows the three-part structure
- Reasoning Quality: Demonstrates thoughtful internal processing
- Response Quality: Provides helpful, contextually appropriate answers
- Reflection Depth: Shows self-awareness and conversation analysis
- Fast Response: Generates complete responses in ~3 seconds
Example of generating a response with the fine-tuned model:

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="rrr_model",  # Local path or HuggingFace repo
    max_seq_length=2048,
    load_in_4bit=True,
)

# Apply chat template
tokenizer = get_chat_template(
    tokenizer,
    chat_template="chatml",
    map_eos_token=True,
)

# Prepare for inference
FastLanguageModel.for_inference(model)

# Generate a response
messages = [{"role": "user", "content": "I'm feeling anxious about my job interview tomorrow. Any advice?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=2048,
    temperature=0.7,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
```
The dataset can also be loaded directly from the HuggingFace Hub:

```python
from datasets import load_dataset

dataset = load_dataset("leonvanbokhorst/react-respond-reflect-dialogues-v2")
```
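Each record follows the schema shown earlier. A quick way to inspect one (a sketch assuming the default `train` split and the field names from the schema above):

```python
example = dataset["train"][0]
print(example["conversation_id"], "-", example["num_turns"], "turns")
for message in example["messages"]:
    print(f"{message['role']}: {message['content'][:80]}")
```

Typical uses include: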
- Training conversational AI models
- Studying empathetic response patterns
- Analyzing structured dialogue frameworks
- Developing emotional intelligence in chatbots
We've conducted comprehensive benchmarking of our fine-tuned React-Respond-Reflect model against a baseline model (Mistral-7B). The results demonstrate significant improvements in format compliance, response quality, and processing speed.
| Metric | Fine-tuned Model | Baseline Model |
|---|---|---|
| Format Compliance | 100.0% | 0.0% |
| Reasoning Quality | 7.1/10 | 0.0/10 |
| Response Quality | 6.7/10 | 0.0/10 |
| Reflection Depth | 7.1/10 | 0.0/10 |
| Response Time | 2.7s | 45.7s |
Reference-based NLP metrics for the fine-tuned model:

| Metric | Score |
|---|---|
| BLEU | 0.044 |
| ROUGE-1 | 0.236 |
| ROUGE-2 | 0.053 |
| ROUGE-L | 0.179 |
| METEOR | 0.168 |
| BERTScore Precision | 0.234 |
| BERTScore Recall | 0.152 |
| BERTScore F1 | 0.194 |
| Semantic Similarity | 0.378 |
The fine-tuned model shows varying performance across different prompt categories:
| Category | Format Compliance | Reasoning | Response | Reflection | BLEU | ROUGE-1 | METEOR | Semantic Similarity |
|---|---|---|---|---|---|---|---|---|
| General | 100.0% | 8.0/10 | 8.0/10 | 7.6/10 | 0.054 | 0.279 | 0.205 | 0.435 |
| Emotional Support | 100.0% | 8.0/10 | 8.0/10 | 7.2/10 | 0.063 | 0.278 | 0.205 | 0.401 |
| Practical Advice | 100.0% | 7.3/10 | 6.4/10 | 7.4/10 | 0.038 | 0.210 | 0.156 | 0.420 |
| Philosophical | 100.0% | 7.2/10 | 6.9/10 | 7.6/10 | 0.043 | 0.240 | 0.161 | 0.358 |
| Adversarial | 100.0% | 4.8/10 | 3.8/10 | 5.6/10 | 0.020 | 0.173 | 0.109 | 0.262 |
The model performs best on general queries and emotional support, while maintaining good performance on philosophical and practical advice prompts. As expected, adversarial prompts designed to challenge the model show lower performance metrics, though the model still maintains format compliance.
The benchmark evaluates model performance across several dimensions:
- Format Compliance: Percentage of responses that correctly follow the React-Respond-Reflect format.
- Quality Metrics: Human-evaluated scores (0-10) for reasoning quality, response quality, and reflection depth.
- Response Time: Average time to generate a complete response.
- NLP Metrics: BLEU, ROUGE, METEOR, BERTScore, and semantic similarity compared to reference responses.
- Category Performance: Analysis across different prompt types (general, emotional support, practical advice, philosophical, adversarial).
The benchmark uses 50 diverse prompts across categories, comparing the fine-tuned model against a baseline Mistral-7B model without fine-tuning.
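For reference, semantic similarity of the kind reported above can be computed along these lines (an illustrative sketch using `sentence-transformers`; the benchmark's actual implementation and embedding model may differ):

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative embedding model; the benchmark's choice may differ.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

generated = "It's normal to feel nervous; try rehearsing out loud and breathing slowly."
reference = "Practice the talk a few times and take deep breaths before you start."

embeddings = embedder.encode([generated, reference], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()  # cosine similarity
print(f"semantic similarity: {similarity:.3f}")
```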
Contributions should follow these conventions (see the example after this list):

- Follow PEP 8 style guide
- Use type hints (PEP 484)
- Add Google-style docstrings
- Run validation before committing
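For example, a contribution following these conventions might look like this (an illustrative function, not part of the codebase):

```python
def count_turns(messages: list[dict[str, str]]) -> int:
    """Count back-and-forth exchanges in a dialogue.

    Args:
        messages: Dialogue messages, each with "role" and "content" keys.

    Returns:
        The number of user messages, i.e. back-and-forth exchanges.
    """
    return sum(1 for message in messages if message["role"] == "user")
```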
If you use the dataset, please cite it as:

```bibtex
@dataset{react_respond_reflect_dialogues,
  author    = {van Bokhorst, Leon},
  title     = {React-Respond-Reflect Dialogues Dataset},
  year      = {2025},
  publisher = {HuggingFace},
  version   = {2.0},
  url       = {https://huggingface.co/datasets/leonvanbokhorst/react-respond-reflect-dialogues-v2}
}
```
This repository includes a comprehensive deployment solution for LoRA fine-tuned models using FastAPI. This provides an efficient, scalable API for serving your model in production applications.
- Merged Model Serving: Efficiently serve merged LoRA models
- Streaming Support: Real-time token-by-token generation
- Performance Monitoring: Built-in metrics for tracking inference performance
- Docker Ready: Containerized deployment with Docker and docker-compose
- API Documentation: Auto-generated FastAPI documentation
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/react-respond-reflect.git
  cd react-respond-reflect
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Merge your LoRA model with the base model:

  ```bash
  ./merge_and_build.sh --base-model mistralai/Mistral-7B-Instruct-v0.3 --adapter ./rrr_model --output-path ./rrr_model_merged
  ```

- Build and run with Docker Compose:

  ```bash
  ./rebuild_and_restart.sh
  ```

- Check the API status:

  ```bash
  curl http://localhost:7000/health
  ```
`GET /health`

Response:

```json
{
  "status": "ok",
  "model_loaded": true
}
```
`POST /generate`

Request body:

```json
{
  "prompt": "I'm feeling stressed about my upcoming presentation. Can you help?",
  "max_new_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 40,
  "repetition_penalty": 1.1,
  "stream": false
}
```

Response:

```json
{
  "generated_text": "<react>*sits with a calm, supportive posture*</react>\n<respond>It's completely normal to feel stressed about presentations. Try these steps: 1) Practice your presentation multiple times, 2) Visualize success, 3) Take deep breaths before starting, 4) Remember that the audience wants you to succeed. You've got this!</respond>\n<reflect>Their stress about the presentation suggests they might benefit from both practical advice and emotional reassurance. Offering concrete steps helps make the task manageable.</reflect>",
  "metadata": {
    "input_tokens": 58,
    "output_tokens": 86,
    "inference_time_seconds": 2.78,
    "tokens_per_second": 30.91
  },
  "error": null
}
```
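A minimal Python client for this endpoint could look like the following (host and port match the quick start above; the generation parameters are illustrative):

```python
import requests

payload = {
    "prompt": "I'm feeling stressed about my upcoming presentation. Can you help?",
    "max_new_tokens": 512,
    "temperature": 0.7,
    "stream": False,
}

response = requests.post("http://localhost:7000/generate", json=payload, timeout=120)
response.raise_for_status()

data = response.json()
print(data["generated_text"])
print(data["metadata"]["tokens_per_second"], "tokens/s")
```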
For real-time token-by-token generation, set `"stream": true` in your request:
```json
{
  "prompt": "Tell me about the benefits of meditation",
  "temperature": 0.5,
  "stream": true
}
```
The API will return a stream of JSON objects, each containing a token and a finished flag:
{"token":"<", "finished":false}
{"token":"react", "finished":false}
{"token":">", "finished":false}
// ... more tokens ...
{"token":"", "finished":true, "metadata":{"input_tokens":48, "output_tokens":82, "inference_time_seconds":3.47, "tokens_per_second":23.64}}
Use the provided test scripts to exercise the API:

```bash
# Test basic generation
python test_api.py --prompt "I need help with my presentation delivery. I get very nervous."

# Test streaming generation
python test_streaming.py --prompt "Tell me about the benefits of meditation" --temperature 0.5
```
The API includes built-in validation to ensure responses follow the React-Respond-Reflect format:
- Responses must include `<react>`, `<respond>`, and `<reflect>` tags
- Tags must appear in the correct order
- Each tag must have a corresponding closing tag
- The content between tags must be non-empty
This validation ensures consistent, high-quality responses that follow the RRR framework.
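A minimal sketch of such a check (regex-based and illustrative, not the API's actual implementation):

```python
import re

RRR_PATTERN = re.compile(
    r"<react>(.+?)</react>\s*<respond>(.+?)</respond>\s*<reflect>(.+?)</reflect>",
    re.DOTALL,
)

def is_valid_rrr(text: str) -> bool:
    """Return True if text has non-empty react/respond/reflect tags in order."""
    match = RRR_PATTERN.search(text)
    return bool(match) and all(part.strip() for part in match.groups())
```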
This project is licensed under the MIT License - see the LICENSE file for details.