agenticAI_pipeline

Author: bujo-eayn

bujo-eayn June 10, 2025

A modular multi-agent AI system that performs deep scientific research using a supervisor-worker architecture. It combines foundational and specialized language models to reason, plan, and execute tasks for document and chart analysis in scientific domains.

Agentic Document Intelligence with GPT-4 & SmolDocling

This project is an agentic AI pipeline that uses GPT-4 as the primary agent and integrates SmolDocling as a tool for in-depth analysis of uploaded documents. Users can upload documents in various formats, and the pipeline extracts structured content automatically, using an evaluation and feedback loop to check the result.

🚀 Features

- Upload documents in multiple formats (PDF, Word, etc.)
- Automatic conversion to PDF if needed
- Dual extraction using GPT-4 and SmolDocling
- Evaluation of extracted content using BLEU, overlap, and Jaccard similarity
- Iterative feedback to SmolDocling to improve accuracy
- Final structured output in Word and PDF formats
- User prompt execution on final extracted document
- Streamlit UI for ease of use
- Dockerized for simple deployment
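
The README names BLEU, overlap, and Jaccard similarity as the evaluation metrics but does not say how they are computed. The following is a minimal sketch under assumptions, not the project's actual `evaluate.py`: tokenization is naive whitespace splitting, the overlap ratio is taken to be the overlap coefficient (shared tokens divided by the smaller token set), and `bleu` is a simplified single-reference BLEU over unigrams and bigrams without smoothing. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    return text.lower().split()


def jaccard(a: str, b: str) -> float:
    # |A ∩ B| / |A ∪ B| over the sets of distinct tokens.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def overlap_ratio(a: str, b: str) -> float:
    # Assumed definition: shared tokens divided by the smaller token set.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def ngrams(tokens: list[str], n: int) -> Counter:
    # Count every contiguous n-gram in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference: str, candidate: str, max_n: int = 2) -> float:
    # Simplified BLEU: geometric mean of clipped n-gram precisions
    # (n = 1..max_n) multiplied by the brevity penalty. No smoothing,
    # so any n-gram order with zero matches yields a score of 0.0.
    ref, cand = tokenize(reference), tokenize(candidate)
    if not ref or not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)
```

Under these definitions, `jaccard("a b c", "b c d")` shares two of four distinct tokens and returns 0.5. For production use, a maintained BLEU implementation with smoothing would likely be a better choice than this hand-rolled version.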

🧠 Pipeline Overview

1. Upload Document: The user uploads a file and provides a prompt.
2. Preprocessing: The document is converted to PDF (if not already).
3. GPT-4 Extraction: GPT-4 extracts text, tables, images, and structural elements.
4. SmolDocling Extraction: A static prompt is sent to the SmolDocling backend for extraction.
5. Evaluation: The GPT-4 and SmolDocling outputs are compared using:
   - Textual overlap ratio
   - BLEU score
   - Jaccard similarity
6. Consistency Check:
   - ✅ If consistent: build the final document and apply the user prompt.
   - ❌ If inconsistent: identify the differences and retry SmolDocling with feedback.
7. Final Output: Assemble and export the cleaned, structured document.
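
The consistency check and retry step can be sketched as a small loop. This is an illustration under assumptions rather than the project's actual `retry_node.py`: the function name `extract_with_feedback`, the `smoldocling_extract` callable, the 0.8 threshold, and the limit of three attempts are all hypothetical, and Jaccard similarity alone stands in for the full set of metrics.

```python
from typing import Callable, Optional


def jaccard(a: str, b: str) -> float:
    # Token-set Jaccard similarity, used here as the only consistency signal.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def extract_with_feedback(
    gpt_text: str,
    smoldocling_extract: Callable[[Optional[str]], str],
    threshold: float = 0.8,   # hypothetical consistency threshold
    max_attempts: int = 3,    # hypothetical retry budget
) -> tuple[str, bool, int]:
    """Return (smoldocling_text, is_consistent, attempts_used)."""
    feedback: Optional[str] = None
    smol_text = ""
    for attempt in range(1, max_attempts + 1):
        # The first call carries no feedback; later calls carry the diff.
        smol_text = smoldocling_extract(feedback)
        if jaccard(gpt_text, smol_text) >= threshold:
            return smol_text, True, attempt
        # Identify differences: tokens GPT-4 found that SmolDocling did not.
        missing = sorted(
            set(gpt_text.lower().split()) - set(smol_text.lower().split())
        )
        feedback = "Possibly missed content: " + ", ".join(missing)
    # Retry budget exhausted: return the last output, flagged as inconsistent.
    return smol_text, False, max_attempts
```

Passing the extractor in as a callable keeps the loop testable without a running SmolDocling backend: a stub that returns fixed strings is enough to exercise both the consistent and the inconsistent branch.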

🗂️ Folder Structure

.
├── app.py                  # Streamlit UI
├── Dockerfile              # Docker configuration
├── requirements.txt
│
├── graph/                  # LangGraph logic
│   ├── graph_builder.py
│   └── nodes/              # Nodes in the pipeline
│       ├── user_input.py
│       ├── preprocess_doc.py
│       ├── gpt_extract.py
│       ├── smoldocling_call.py
│       ├── evaluate.py
│       ├── retry_node.py
│       ├── fina
