browser-control

Name: browser-control
Author: dhamaniasad

dhamaniasad April 12, 2025

An agentic AI Chrome extension that automates browser interactions from natural language instructions. Uses Gemini; coded by AI.

Browser Control Agent

A Chrome extension that enables AI-powered browser automation through natural language commands.

Overview

Browser Control Agent is a Chrome extension that allows you to automate web-based tasks using natural language instructions. Simply tell the agent what you want to accomplish (e.g., "find 10 properties to stay in Chiang Mai in September"), and it will intelligently navigate websites and perform actions on your behalf.

Key Features

- Natural Language Control: Interact with your browser using everyday language
- Multimodal Understanding: The agent processes both visual and textual content to understand web pages
- Adaptive Automation: Handles dynamic web content by re-checking the page after every action instead of following a fixed script
- Privacy-Focused: The agent logic runs client-side within your browser; page text and screenshots are still sent to the Gemini API for analysis

How It Works

1. You provide a high-level goal through the side panel chat interface
2. The AI agent analyzes the current webpage using text and screenshots
3. It decides on the appropriate action (clicking, typing, scrolling, etc.)
4. After executing the action, it re-evaluates the page state and determines the next step
5. This loop continues until your goal is achieved
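
The steps above amount to an observe, decide, act loop. The following is a minimal TypeScript sketch of that loop, not the extension's actual code: the `Action` shape, the `AgentDeps` interface, the `runAgent` function, and the `maxSteps` safety limit are all hypothetical names introduced here for illustration.

```typescript
// Hypothetical action vocabulary; the README only names clicking, typing, and scrolling.
type Action =
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "scroll"; deltaY: number }
  | { kind: "done"; summary: string };

// What the agent sees at each step: page text plus a screenshot.
interface PageState {
  url: string;
  text: string;
  screenshotBase64: string;
}

// Injected dependencies keep the loop independent of Chrome and Gemini (assumed design).
interface AgentDeps {
  observe(): Promise<PageState>;
  decide(goal: string, state: PageState, history: Action[]): Promise<Action>;
  execute(action: Action): Promise<void>;
}

interface AgentResult {
  completed: boolean;
  steps: Action[];
}

// Observe, decide, act, and repeat until the model reports "done"
// or the step budget (an assumption, not stated in the README) runs out.
async function runAgent(
  goal: string,
  deps: AgentDeps,
  maxSteps = 20
): Promise<AgentResult> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const state = await deps.observe();
    const action = await deps.decide(goal, state, history);
    history.push(action);
    if (action.kind === "done") {
      return { completed: true, steps: history };
    }
    await deps.execute(action);
  }
  return { completed: false, steps: history };
}
```

Passing `observe`, `decide`, and `execute` in as dependencies means the loop can be exercised with stubs, without a browser or an API key. A step limit of some kind is a common guard against an agent that never reaches its goal.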

Technology

Browser Control Agent uses Google's Gemini 1.5 multimodal language model to interpret page text and screenshots and to choose the next action.
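
To illustrate what a multimodal request might look like, here is a rough TypeScript sketch that builds a body for the Gemini API's `generateContent` REST method, combining page text with a PNG screenshot. The prompt wording, the helper names (`buildEndpoint`, `buildRequest`, `extractText`), and the choice of model variant are assumptions; the README does not say how the extension builds its requests or which Gemini 1.5 variant it calls.

```typescript
// One part of a request: either text or inline base64 image data.
interface GeminiPart {
  text?: string;
  inline_data?: { mime_type: string; data: string };
}

interface GeminiRequest {
  contents: { role: "user"; parts: GeminiPart[] }[];
}

// Only the response fields this sketch reads.
interface GeminiResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

const GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta/models";

// Build the generateContent URL for a given model name.
function buildEndpoint(model: string): string {
  return `${GEMINI_BASE}/${encodeURIComponent(model)}:generateContent`;
}

// Combine the goal, the page text, and a screenshot into one user turn.
// The prompt text is illustrative only.
function buildRequest(
  goal: string,
  pageText: string,
  screenshotPngBase64: string
): GeminiRequest {
  const prompt =
    `Goal: ${goal}\n\nVisible page text:\n${pageText}\n\n` +
    "Reply with the single next action.";
  return {
    contents: [
      {
        role: "user",
        parts: [
          { text: prompt },
          { inline_data: { mime_type: "image/png", data: screenshotPngBase64 } },
        ],
      },
    ],
  };
}

// Concatenate the text parts of the first candidate, or return "" if none.
function extractText(response: GeminiResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}
```

The body would be sent as JSON in a POST to `buildEndpoint("gemini-1.5-flash")` (the model name is an assumed example), with the API key supplied in the `x-goog-api-key` header.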

Installation

1. Clone this repository
2. Install dependencies: npm install
3. Build the extension: npm run build
4. Load the extension in Chrome:
   - Go to chrome://extensions/
   - Enable "Developer mode"
   - Click "Load unpacked" and select the dist folder

Usage

1. Click the Browser Control Agent icon in your Chrome toolbar to open the side panel
2. Open the options page from the side panel and enter your Gemini API key
3. Navigate to the website where you want to automate a task
4. Enter your goal in the chat interface and watch the agent work through it
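
As an illustration of the API key step, the sketch below shows one way an options page could validate and persist a key. It is an assumption-laden example rather than the extension's code: `KeyValueStore`, `MemoryStore`, `saveApiKey`, `loadApiKey`, and the `geminiApiKey` slot name are all hypothetical. In a real extension the store would likely be backed by `chrome.storage.local`; an in-memory stand-in is used here so the example runs outside Chrome.

```typescript
// Minimal async key-value interface (hypothetical); an extension could
// implement it on top of chrome.storage.local.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without the Chrome APIs.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  async get(key: string): Promise<string | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

// Hypothetical name of the slot the key is stored under.
const API_KEY_SLOT = "geminiApiKey";

// Trim the input, reject empty keys, then persist.
async function saveApiKey(store: KeyValueStore, rawKey: string): Promise<void> {
  const key = rawKey.trim();
  if (key.length === 0) {
    throw new Error("API key must not be empty");
  }
  await store.set(API_KEY_SLOT, key);
}

// Load the saved key, failing loudly if the options page was never filled in.
async function loadApiKey(store: KeyValueStore): Promise<string> {
  const key = await store.get(API_KEY_SLOT);
  if (!key) {
    throw new Error("No API key saved; open the options page first");
  }
  return key;
}
```

Failing early when no key is saved lets the side panel point the user back to the options page before any request is attempted.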

License
