A step by step implementation of building an AI agent that plays 3d shooting game
<!-- omit in toc --> # AI Agent that plays 3D shooting game [](https://www.python.org/downloads/release/python-3100/) [](https://www.together.ai/) [](https://gemini.com) [](https://mistral.ai) [](https://openai.com/) [](https://medium.com/@fareedkhandev/creating-an-ai-agent-that-play-first-person-shooting-game-c4996168ed65) AI agents are now also becoming part of games. The way they interact with the environment, take actions, and make decisions all depends on proper thinking and proper usage of LLMs/multimodals. So, to learn how AI agents work in a gaming environment: 1. First, we are going to **create a first-person 3D action shooting game** using the Python Ursina engine. 2. Then we will create a **multi-AI agent** based on sub-agents using LLM multimodals, like **Gemini V3** for text and **Mistral 3.1** for vision capability, combined together to form a powerful multi-agent system that will play the game for us and win it. Take a look at how our AI agent (Gemini + Mistral) is playing our own 3D action game, failing at the first attempt but winning after the second attempt.   We will code everything from scratch, all the way from building the 3D game engine to creating the AI agent using an LLM and a vision-language model. <!-- omit in t
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real-time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.