FastAPI Backend for a Conversational Agent using Cohere, (Azure) OpenAI, Langchain & Langgraph and Qdrant as VectorDB

[](https://github.com/astral-sh/ruff)
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
[](https://github.com/j178/prek)
# Conversational RAG Agent
This is a Rest-Backend for a Conversational Agent, that allows you to embed Documents, search for them using Semantic Search, to QA based on Documents and do document processing with Large Language Models.

## Table of Contents
- [Conversational RAG Agent](#conversational-rag-agent)
- [Table of Contents](#table-of-contents)
- [LLMs and Backend Providers](#llms-and-backend-providers)
- [Quickstart](#quickstart)
- [Project Description](#project-description)
- [What is RAG?](#what-is-rag)
- [Tracing](#tracing)
- [Semantic Search](#semantic-search)
- [Hybrid Search](#hybrid-search)
- [Architecture](#architecture)
- [Installation \& Development Backend](#installation--development-backend)
- [Load Demo Data](#load-demo-data)
- [Development Frontend](#development-frontend)
- [Qdrant API Key](#qdrant-api-key)
- [Testing the API](#testing-the-api)
- [Star History](#star-history)
## LLMs and Backend Providers
I have decided to stop creating different services for different provider and switching to LiteLLM which allows to use basically every provider you want.
Some providers i would recommend are:
- [Cohere](https://cohere.com/) Awesome models and great free tier.
- [Ollama](https://ollama.com/) If you want to keep your data your data.
- [Google AI Studio](https:/Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real-time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.