Project that scape all github repositories, read them, analyse with an AI and return you on a discord webhook project that could interest you.
# GitHub Scraping AI
AI-powered GitHub project discovery. Automatically finds interesting repositories matching your interests and sends daily notifications to Discord.
## Features
- **Smart Filtering**: Uses LLM to evaluate repositories against your natural language interests
- **Keyword Search**: Filter GitHub search with custom keywords (OR logic)
- **Fork Exclusion**: Automatically filters out forked repositories
- **Multi-Provider Support**: Works with OpenAI, Anthropic (Gemini), or Google (Gemini)
- **Async & Concurrent**: Fetches READMEs concurrently for faster processing
- **Daily Discovery**: Fetches top starred repos from the last 24 hours
- **Discord Notifications**: Rich embeds with repo details and AI reasoning
- **Rejected Repos Log**: Logs rejected repos with reasons for prompt fine-tuning
- **Deduplication**: Tracks seen repos to avoid duplicates
- **Dry Run Mode**: Test without sending to Discord
## How It Works
```
GitHub API → Fetch top 1000 new repos → LLM evaluates each against your prompt → Discord notification
```
1. Queries GitHub for repositories created in the last 24h, sorted by stars
2. Fetches README excerpts for context
3. Sends each repo + your interests to an LLM for evaluation
4. Posts matching repos to Discord with the AI's reasoning
5. Caches seen repos to prevent duplicates
## Quick Start
### Prerequisites
- Python 3.13+
- [uv](https://docs.astral.sh/uv/) package manager
- API key for one of: OpenAI, Anthropic, or Google
- GitHub personal access token
- Discord webhook URL
### Installation
```bash
# Clone the repository
git clone https://github.com/yourusername/github-scraping-agent-ai.git
cd github-scraping-agent-ai
# Install dependencies
uv sync
```
### Configuration
1. **Copy example files:**
```bash
cp config.example.json config.json
cp prompt.example.md prompt.md
```
2. **Edit `config.json`** with your credentials:
```json
{
"github": {
"token": "ghp_your_github_token",
"keyGoogle's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real-time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.