# PuppetMaster 🤖
A powerful microservice for web automation, scraping, and data processing, integrating Puppeteer for browser control and Crawl4AI for advanced crawling and AI-powered extraction.
[<img src="https://devin.ai/assets/askdeepwiki.png" alt="Ask https://DeepWiki.com" height="20"/>](https://deepwiki.com/mzazakeith/PuppetMaster)
## Features
- **Puppeteer Core:**
  - 🌐 Headless browser automation with Puppeteer and Chromium
  - 🖱️ Standard browser interactions: navigate, click, type, scroll, select
  - 🖼️ Screenshot generation (full page or element)
  - 📄 PDF generation
  - ⚙️ Custom JavaScript evaluation
- **Crawl4AI Integration:**
  - 🕷️ Advanced crawling strategies (schema-based, LLM-driven)
  - 🧩 Flexible data extraction (CSS, XPath, LLM)
  - 🧠 Dynamic schema generation using LLMs
  - ✅ Content verification
  - 🔗 Deep link crawling
  - ⏳ Element waiting and filtering
  - 📄 PDF text extraction
  - 📝 Webpage to Markdown conversion
  - 🌐 Webpage to PDF conversion (via Crawl4AI)
- **System:**
  - 🔄 Bull queue system for robust job management (separate queues for Puppeteer & Crawl4AI)
  - 📊 MongoDB for job persistence, status tracking, and results storage
  - 💾 Local file storage for generated assets (screenshots, PDFs, Markdown files)
  - 📈 API endpoints for job management and queue monitoring
## Key Technologies
* **Backend:** Node.js, Express.js
* **Web Automation:** Puppeteer
* **Crawling & AI:** Python, FastAPI, Crawl4AI
* **Job Queue:** BullMQ, Redis
* **Database:** MongoDB (with Mongoose)
* **Languages:** JavaScript, Python
## Installation
### Prerequisites
- Node.js (v18 or later recommended)
- npm or yarn
- Python (v3.8 or later recommended)
- pip
- MongoDB (local instance or Atlas)
- Redis (local instance or cloud provider)
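Before continuing, it can help to confirm which of the prerequisites above are already on your `PATH`. The following is a minimal sketch, not part of the project itself: the helper name `check_tool` is hypothetical, and the command names (`python3`, `mongod`, `redis-server`) are assumptions that vary by platform. If you use MongoDB Atlas or a hosted Redis, the corresponding local binaries are not needed and a "not found" result is expected.

```bash
#!/usr/bin/env bash
# Report which prerequisite tools are on PATH and print their versions.
# Command names are assumptions; adjust them for your platform
# (e.g. "python" instead of "python3", "yarn" instead of "npm").

check_tool() {
  local tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    # Show only the first line of the tool's version output.
    printf '%-14s %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
  else
    printf '%-14s %s\n' "$tool" "not found"
  fi
}

for tool in node npm python3 pip mongod redis-server; do
  check_tool "$tool"
done
```

Compare the reported versions against the recommendations above (Node.js v18 or later, Python v3.8 or later).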
### Setup
1. **Clone the repository:**
```bash
git clone <repository-url>
cd PuppetMaster
```
2. **Install Node.js dependencies:**
```bash
npm install
# or
yarn install
```