Voice-to-Action App

Author: Yasharth Bajpai

Date: February 21, 2025


Project Overview

The Voice-to-Action App is a React Native application that allows users to record audio, transcribe it into text, and extract actionable insights such as meeting details, key points, decisions, and follow-up tasks. The app integrates with Google Speech-to-Text API for transcription and Perplexity AI for advanced analysis of the transcript.

This project is designed to streamline meeting management by providing users with:

  • A transcript of the recording.
  • Structured meeting summaries.
  • Actionable to-do lists.
  • Email integration for sharing meeting details.

Features

Audio Recording

  • Record audio directly within the app using the device's microphone.
  • Supports both iOS and Android platforms.
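A minimal sketch of what recording with Expo AV usually looks like (the helper names are illustrative rather than the app's actual code, and the preset constant differs slightly between expo-av versions):

    import { Audio } from 'expo-av';

    // Ask for microphone permission and begin a new recording.
    export async function startRecording() {
      await Audio.requestPermissionsAsync();
      await Audio.setAudioModeAsync({
        allowsRecordingIOS: true,
        playsInSilentModeIOS: true,
      });
      const { recording } = await Audio.Recording.createAsync(
        Audio.RecordingOptionsPresets.HIGH_QUALITY
      );
      return recording;
    }

    // Stop the recording and return the local URI of the audio file.
    export async function stopRecording(recording) {
      await recording.stopAndUnloadAsync();
      return recording.getURI();
    }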

Speech-to-Text Transcription

  • Converts recorded audio into text using the Google Speech-to-Text API.
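On the backend, a call to the Google Cloud client library looks roughly like this; the encoding and sample rate below are assumptions and must match the audio the app actually uploads:

    const fs = require('fs');
    const speech = require('@google-cloud/speech');

    // Uses the credentials file referenced in the installation steps below.
    const client = new speech.SpeechClient({ keyFilename: './myjson.json' });

    async function transcribe(audioPath) {
      const audioBytes = fs.readFileSync(audioPath).toString('base64');
      const [response] = await client.recognize({
        audio: { content: audioBytes },
        config: {
          encoding: 'LINEAR16',   // assumption: must match the recorded format
          sampleRateHertz: 16000, // assumption
          languageCode: 'en-US',
        },
      });
      return response.results
        .map((result) => result.alternatives[0].transcript)
        .join('\n');
    }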

AI-Powered Insights

  • Extracts structured data from transcripts using Perplexity AI, including:
    • Meeting details (date, time, participants).
    • Key discussion points.
    • Decisions made.
    • Action items with assignees and deadlines.
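Perplexity exposes an OpenAI-compatible chat completions endpoint, so the analysis step can be sketched as follows; the model name and prompt wording are assumptions, not the project's actual transcriptionService.js code:

    const axios = require('axios');

    // Replace with your own key, as described in the installation steps.
    const perplexityApikey = 'YOUR_PERPLEXITY_API_KEY';

    // Ask Perplexity to turn the raw transcript into structured meeting data.
    async function analyzeTranscript(transcript) {
      const { data } = await axios.post(
        'https://api.perplexity.ai/chat/completions',
        {
          model: 'sonar', // assumption: check the current Perplexity model list
          messages: [
            {
              role: 'system',
              content:
                'Extract meeting details (date, time, participants), key points, ' +
                'decisions, and action items with assignees and deadlines. Respond as JSON.',
            },
            { role: 'user', content: transcript },
          ],
        },
        { headers: { Authorization: `Bearer ${perplexityApikey}` } }
      );
      return data.choices[0].message.content;
    }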

Email Sharing

  • Compose and send meeting summaries via email directly from the app.
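Email sharing with React Native's Linking API can be as simple as opening a mailto: URL; the helper below is an illustrative sketch rather than the app's exact code:

    import { Linking } from 'react-native';

    // Open the device's mail client pre-filled with the meeting summary.
    export function shareByEmail(subject, body) {
      const url =
        `mailto:?subject=${encodeURIComponent(subject)}&body=${encodeURIComponent(body)}`;
      return Linking.openURL(url);
    }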

Editable Meeting Details

  • Modify meeting details and tasks through an intuitive edit modal.
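A rough sketch of such an edit modal using react-native-modal (the component name, props, and fields are assumptions for illustration; the real App.js may structure this differently):

    import React, { useState } from 'react';
    import { View, TextInput, Button } from 'react-native';
    import Modal from 'react-native-modal';

    export default function EditMeetingModal({ isVisible, meeting, onSave, onCancel }) {
      const [title, setTitle] = useState(meeting?.title ?? '');
      return (
        <Modal isVisible={isVisible} onBackdropPress={onCancel}>
          <View style={{ backgroundColor: '#1e1e1e', padding: 16, borderRadius: 8 }}>
            <TextInput
              value={title}
              onChangeText={setTitle}
              placeholder="Meeting title"
              placeholderTextColor="#888"
              style={{ color: '#fff', marginBottom: 12 }}
            />
            <Button title="Save" onPress={() => onSave({ ...meeting, title })} />
          </View>
        </Modal>
      );
    }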

Dark-Themed UI

  • A visually appealing dark mode design for better user experience.
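The dark theme comes down to a shared StyleSheet; the palette below is an illustrative example, not the actual values in styles.js:

    import { StyleSheet } from 'react-native';

    // Example dark palette shared across screens.
    export default StyleSheet.create({
      container: { flex: 1, backgroundColor: '#121212', padding: 16 },
      text: { color: '#e0e0e0', fontSize: 16 },
      button: { backgroundColor: '#1f6feb', borderRadius: 8, padding: 12 },
    });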

Demo Video

This video gives a hands-on demo of how the project works:
Watch Demo


Code Walkthrough

This video walks through the backend and frontend code line by line:
Watch Code Walkthrough


Technologies Used

Frontend

  • React Native
  • Expo AV (for audio recording)
  • Axios (for backend communication)
  • React Native Modal (for modals)
  • React Native Linking (for email integration)

Backend

  • Node.js with Express.js
  • Google Speech-to-Text API
  • Perplexity AI API

Installation Guide

Prerequisites

  1. Install Node.js and npm.
  2. Install Expo CLI for running the React Native app.
  3. Set up a Google Cloud account and enable the Speech-to-Text API.
  4. Obtain an API key from Perplexity AI.

Steps

Backend Setup

  1. Clone the repository and navigate to the backend folder:
    git clone https://github.com/yasharthbajpai/speech-to-text.git
    cd backend
    
  2. Install dependencies:
    npm install
    
  3. Add your Google Cloud credentials JSON file (myjson.json) to the backend folder.
  4. Replace perplexityApikey in transcriptionService.js with your Perplexity AI API key.
  5. Start the server:
    node index.js
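
Once the server is running, it needs an endpoint the app can post audio to. A minimal sketch of index.js is shown below; the route name, port, and use of multer for the upload are assumptions, not necessarily how the project wires it up:

    // index.js — illustrative sketch only
    const express = require('express');
    const multer = require('multer');
    const { transcribe, analyzeTranscript } = require('./transcriptionService');

    const app = express();
    const upload = multer({ dest: 'uploads/' }); // assumption: multipart upload via multer

    app.post('/transcribe', upload.single('audio'), async (req, res) => {
      try {
        const transcript = await transcribe(req.file.path);
        const insights = await analyzeTranscript(transcript); // API key lives in transcriptionService.js
        res.json({ transcript, insights });
      } catch (err) {
        res.status(500).json({ error: err.message });
      }
    });

    app.listen(3000, () => console.log('Backend listening on port 3000'));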
    

Frontend Setup

  1. Navigate to the frontend folder:
    cd frontend
    
  2. Install dependencies:
    npm install
    
  3. Update the myip variable in App.js with your local machine's IP address (e.g., 192.168.x.x).
  4. Start the Expo development server:
    expo start
    
  5. Scan the QR code from Expo on your mobile device or run it on an emulator.
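
On the frontend, the recorded file is typically sent to the backend with Axios as multipart form data. The sketch below assumes the route and port from the backend sketch above; the file name and MIME type are also assumptions:

    import axios from 'axios';

    const myip = '192.168.x.x'; // set to your machine's IP, as described in step 3

    // Upload a recorded audio file and return the transcript plus insights.
    export async function uploadRecording(uri) {
      const formData = new FormData();
      formData.append('audio', {
        uri,
        name: 'recording.m4a', // assumption: Expo AV's high-quality preset records .m4a
        type: 'audio/m4a',
      });
      const { data } = await axios.post(`http://${myip}:3000/transcribe`, formData, {
        headers: { 'Content-Type': 'multipart/form-data' },
      });
      return data; // e.g. { transcript, insights }
    }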

Usage Instructions

  1. Launch the app on your mobile device or emulator.
  2. Tap "🎤 Start Recording" to begin recording audio.
  3. Tap "⏹️ Stop Recording" to stop recording and process the audio.
  4. View the generated transcript and structured insights (meeting details, tasks, etc.).
  5. Edit any details by tapping the edit icon ✏️.
  6. Share meeting summaries via email by tapping the email icon 📧.

Folder Structure

project/
├── backend/
│   ├── index.js                # Main server file
│   ├── transcriptionService.js # Handles transcription and AI analysis
│   └── myjson.json             # Google Cloud credentials file (not included in repo)
├── frontend/
│   ├── App.js                  # Main React Native app file
│   ├── styles.js               # Styling for UI components
│   └── assets/                 # App assets (if any)
└── README.md                   # Documentation file

Screenshots

Home Screen | Transcript & Meeting Details

Future Enhancements

  1. Add support for multiple languages in transcription.
  2. Integrate calendar APIs (e.g., Google Calendar) for automatic event creation.
  3. Enable offline transcription using local models.

Contact

For any questions or feedback, feel free to reach out:
Yasharth Bajpai
Email: yasharthbajpai0103@gmail.com

