AI-Tutor is a modular educational assistant that leverages advanced LLMs and agentic AI workflows to help students learn science and technology. It integrates LangChain for LLM orchestration, LangGraph for agent execution, LangSmith for monitoring and analytics, FAISS for vector-based retrieval, and Gradio for a user-friendly web interface. Student
# AI Tutor (Nemo)

---

## Project Overview

**AI Tutor (Nemo)** is an intelligent, interactive tutor built for **science and technology students**. It combines **Large Language Models (LLMs)** with **agentic workflows** to provide two core capabilities:

- **PDF Tutoring (RAG)**: Answers questions from uploaded textbooks/notes (retrieval mode).
- **Video Summarization**: Transcribes and summarizes uploaded video files or YouTube links.

This system is designed to **simulate a personalized AI tutor**: patient, accurate, and adapted to a student's multi-modal learning materials.

---

## Demo Video Summarizer (new feature)

https://github.com/user-attachments/assets/4e5c3307-2cc4-4eee-8eae-d26be04b7322

## Demo PDF Tutor (old feature)

https://github.com/user-attachments/assets/87b4eacf-5509-4cf5-afe1-36eec3fa2b2b

---

## Key Features

### PDF Tutoring (RAG)

- **LLM-powered tutoring** using Google Generative AI (Gemini).
- **Context-aware agents** with **LangChain** and **LangGraph** to manage reasoning steps.
- **PDF knowledge ingestion**: Extracts content, splits it, and stores embeddings in **FAISS** for retrieval.
- **Math rendering with LaTeX**: All formulas are displayed cleanly in Mark
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.