A fully autonomous AI agent and Python pipeline that uses large language models (LLMs) such as Gemini to generate lesson content, produce educational videos, and upload them to YouTube automatically.
# Gemini YouTube Automation

The project includes a GitHub Actions workflow that runs daily at 7:00 AM UTC. It:

- Generates lesson scripts using Gemini.
- Produces long-form and short YouTube videos.
- Uploads them automatically with appropriate thumbnails and metadata.

## Project Structure

```text
gemini-youtube-automation/
├── .github/
│   └── workflows/
│       └── main.yml        # GitHub Actions workflow configuration
├── src/                    # Source directory for Python modules
│   ├── __init__.py         # Initializes the 'src' package
│   ├── generator.py        # Code for generating content and video
│   └── uploader.py         # Code for uploading to YouTube
├── .gitignore              # Files and directories to ignore in version control
├── content_plan.json       # Contains topics for moving forward.
├── main.py                 # Main entry point to run the application
└── requirements.txt        # List of Python packages needed
```

## Setup Instructions

1. **Clone the repository:**

       git clone https://github.com/ChaituRajSagar/gemini-youtube-automation.git
       cd gemini-youtube-automation

2. **Install dependencies:** Make sure you have Python installed, then run:

       pip install -r requirements.txt

3. **Configure YouTube API:** Follow the [YouTube API documentation](https://developers.google.com/youtube/v3) to set up your API credentials and update the necessary configurations in `uploader.py`.

## Usage

To run the application, execute the following command:

    python main.py

This will initiate the content generation and upload process.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or features.

## 📊 Daily Production Infographic

Here's a visual summary of the bot's daily performance and workflow:

[Infographic image]

## License

This project is licensed under the MIT License. See the LICENSE file for details.
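
## Example: Pipeline Flow

The generate, produce, upload sequence described under Usage can be pictured roughly as follows. This is a minimal sketch, not the repository's actual `main.py`: the plan format (a list of entries with `title` and `status` keys), the helper names `next_pending` and `run_once`, and the three injected callables are all assumptions made for illustration. The real Gemini and YouTube calls are replaced by stand-in functions so the example runs without credentials.

```python
import json
from typing import Callable, Optional


def next_pending(plan: list) -> Optional[dict]:
    """Return the first plan entry not yet marked as uploaded, or None."""
    for entry in plan:
        if entry.get("status") != "uploaded":
            return entry
    return None


def run_once(
    plan: list,
    generate_script: Callable[[str], str],
    produce_video: Callable[[str], str],
    upload_video: Callable[[str, str], str],
) -> Optional[str]:
    """Process one pending topic: script -> video file -> upload.

    Returns the uploaded video's ID, or None when nothing is pending.
    The plan entry is updated in place so the next run skips it.
    """
    entry = next_pending(plan)
    if entry is None:
        return None
    script = generate_script(entry["title"])            # e.g. an LLM call
    video_path = produce_video(script)                  # e.g. render to a file
    video_id = upload_video(video_path, entry["title"])  # e.g. a YouTube upload
    entry["status"] = "uploaded"
    entry["video_id"] = video_id
    return video_id


if __name__ == "__main__":
    # Hypothetical plan contents; the real content_plan.json schema may differ.
    plan = json.loads('[{"title": "What is an LLM?"}, {"title": "Prompt basics"}]')

    # Stand-in implementations so the sketch runs offline.
    video_id = run_once(
        plan,
        generate_script=lambda title: f"Script for {title}",
        produce_video=lambda script: "lesson.mp4",
        upload_video=lambda path, title: "demo-video-id",
    )
    print(video_id)
    print(json.dumps(plan, indent=2))
```

Passing the three steps in as callables keeps the orchestration testable: a scheduled run would supply functions backed by `generator.py` and `uploader.py`, while a local dry run can supply stubs like the ones above.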
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real-time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.