YT Bot using RAG

Apr 15, 2025 · 2 min read

🎥 YouTube RAG Assistant

A Streamlit-based app that lets you ask questions or get a summary of any YouTube video using its transcript, powered by RAG (Retrieval-Augmented Generation).


🚀 Features

  • 🔗 Paste a YouTube URL and automatically extract the transcript
  • 🌐 Supports multi-language transcripts (auto-translates to English for Q&A)
  • 🤖 Choose between:
    • OpenAI (GPT)
    • Gemini (Google)
    • Ollama (local LLMs like llama3)
  • 💬 Ask custom questions or auto-summarize the entire video
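
Extracting a transcript starts with pulling the video ID out of whatever URL the user pastes. The snippet below is a minimal sketch of that step using only the standard library. The function name `extract_video_id` and the set of URL shapes it accepts are assumptions for illustration; the repo's `transcript_loader.py` may handle this differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host in {"youtube.com", "m.youtube.com", "music.youtube.com"}:
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embed, Shorts and live links: /embed/<id>, /shorts/<id>, /live/<id>
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
            return parts[1]

    # Not a recognised YouTube URL (this includes URLs without a scheme).
    return None
```

Returning `None` instead of raising lets the UI show a friendly "please paste a valid YouTube link" message before any network call is made.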

🔧 Setup Instructions

1. Clone the repo & install requirements

git clone https://github.com/Chaganti-Reddy/YT_RAG.git
cd YT_RAG
pip install -r requirements.txt

2. Run the app

streamlit run app.py

Model Support

Currently, only the following models are supported:

| Model  | Needs API Key | Notes |
|--------|---------------|-------|
| OpenAI | ✅ Yes | Use GPT-3.5 or GPT-4 |
| Gemini | ✅ Yes | Uses the raw API (faster than the LangChain wrapper) |
| Ollama | ❌ No | Requires Ollama installed locally and a model pulled (ollama run llama3) |
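
To make the API key requirements above concrete, here is a small sketch of how a model switcher could validate the user's choice before building a client. Everything in it is hypothetical: the `Backend` dataclass, the `resolve_backend` function, and the environment variable names are assumptions, not the contents of the repo's `model_selector.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Backend:
    name: str
    needs_api_key: bool
    env_var: Optional[str]  # assumed environment variable name, if any


# Mirrors the support table; the env var names are assumptions.
BACKENDS = {
    "openai": Backend("OpenAI", True, "OPENAI_API_KEY"),
    "gemini": Backend("Gemini", True, "GEMINI_API_KEY"),
    "ollama": Backend("Ollama", False, None),
}


def resolve_backend(choice: str, env: Mapping[str, str] = os.environ) -> Backend:
    """Look up a backend by name and check that its API key is present."""
    key = choice.strip().lower()
    if key not in BACKENDS:
        raise ValueError(f"Unsupported model: {choice!r}")
    backend = BACKENDS[key]
    # Only hosted backends need a key; Ollama runs locally.
    if backend.needs_api_key and not env.get(backend.env_var or "", ""):
        raise RuntimeError(f"{backend.name} needs an API key in {backend.env_var}")
    return backend
```

Passing `env` explicitly keeps the check easy to test, for example `resolve_backend("ollama", env={})` succeeds while `resolve_backend("gemini", env={})` raises.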

Folder Structure

YT_RAG/
├── app.py                ← Streamlit UI
├── query_engine.py       ← QA chain logic
├── model_selector.py     ← Model switcher
├── retriever_utils.py    ← Vector store builder
├── transcript_loader.py  ← Transcript & translator
├── gemini_direct.py      ← Direct Gemini API call
├── requirements.txt
└── README.md
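
The files above follow the usual RAG flow: split the transcript into chunks, retrieve the chunks most relevant to the question, and hand them to the model as context. The sketch below illustrates that flow with the standard library only. It swaps in bag-of-words cosine similarity where a real vector store would use embeddings, and all function names (`chunk_text`, `retrieve`, `build_prompt`) are hypothetical rather than taken from the repo.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word-based chunks that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def _vectorize(text: str) -> Counter:
    # Stand-in for an embedding model: lowercase token counts.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks most similar to the question."""
    q = _vectorize(question)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the prompt that would be sent to the chosen model."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the transcript excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Overlapping chunks reduce the chance that a sentence relevant to the question is cut in half at a chunk boundary; the sizes used here are placeholders to tune per video.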

Tips

  1. Ask your questions in English; transcripts in other languages are translated to English automatically.
  2. Gemini API quotas are low on the free tier; fall back to Ollama if needed.
  3. Works great with videos that have auto-generated subtitles.
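
The fallback advice above can also be automated. This is a rough sketch, not something the app is known to do: `ask_with_fallback` is a hypothetical helper that tries each backend in order and moves on when one fails (for example on a quota error). Each backend is modelled as a plain callable, so no real API is assumed.

```python
from typing import Callable, List, Sequence, Tuple

AskFn = Callable[[str], str]


def ask_with_fallback(
    question: str, backends: Sequence[Tuple[str, AskFn]]
) -> Tuple[str, str]:
    """Try each (name, ask) pair in order; return (name, answer) of the first success."""
    errors: List[str] = []
    for name, ask in backends:
        try:
            return name, ask(question)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All backends failed: " + "; ".join(errors))
```

In use, the list would hold the preferred hosted model first and a local Ollama call last, for example `ask_with_fallback(q, [("gemini", ask_gemini), ("ollama", ask_ollama)])`, where both callables are placeholders you would supply.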