Agentic Browser

Documentation

INDEX

This page is the unified index for the Agentic Browser documentation. It consolidates content from both the new topical organization and the detailed numbered guides.

Quick Navigation#

Core Documentation#

System Components#

Backend Architecture#

  • API Server - FastAPI router-service-tool architecture

    • API endpoints and route definitions

    • Service layer business logic

    • Tool layer for external integrations

Agent System#

  • AI Agent System - LangGraph-based agent orchestration

    • React agent for conversational AI

    • Browser Use agent for script generation

    • Tool system and orchestration

    • Context management and state handling

Browser Extension#

  • Browser Extension - WebExtensions-based UI and automation

    • Extension architecture and messaging

    • Background and content scripts

    • Side panel UI components

    • Agent execution engine

    • WebSocket communication

    • Authentication system

MCP Server#

  • MCP Server - Model Context Protocol implementation

    • Tool definitions and standardization

    • LLM integration through MCP

    • Integration with external clients

Features & Integrations#

Service Integrations#

  • Service Integrations - External service connectivity

    • Gmail integration for email operations

    • Google Calendar integration for scheduling

    • GitHub integration for repository analysis

    • YouTube integration for video processing

    • Website analysis and content extraction

    • Academic portal integration

Data & Models#

  • Data Models and Schemas - API contracts and schemas

    • Request and response models

    • Service integration data structures

    • Agent communication formats

    • Validation and error handling

Configuration & Operations#

Reference#


Documentation Structure#

This consolidated documentation is organized into two parallel structures:

Topical Organization (Primary)#

The documentation is primarily organized by topic/component, making it easy to find information about specific features:

  • Overview and getting started materials at the top level

  • Component-specific sections in dedicated folders

  • Detailed guides and reference material within each component

Historical Reference (Numbered Sections)#

The original numbered documentation structure is preserved for reference:

  • Section 1: Overview and system architecture

  • Section 2: Installation and getting started

  • Section 3: Python backend API

  • Section 4: Agent intelligence system

  • Section 5: Browser extension

  • Section 6: Data models and API contracts

Both structures provide the same information; choose whichever navigation style works best for you.


Getting Started Paths#

For First-Time Users#

  1. Start with Project Overview to understand the system

  2. Follow Getting Started for installation and setup

  3. Explore System Architecture to understand how components interact

For Developers#

  1. Review Development Guidelines for the contribution process

  2. Study the relevant component documentation (API Server, AI Agent System, Browser Extension, MCP Server)

  3. Reference Data Models and Schemas for API contracts

For Operators#

  1. Read Configuration Management to set up your environment

  2. Review Deployment and Operations for production setup

  3. Use Troubleshooting and FAQ for common issues

For Integration#

  1. Review MCP Server for protocol-level integration

  2. Study Service Integrations for available features

  3. Check Data Models and Schemas for API contracts


Key Concepts#

Model-Agnostic Design#

The system supports multiple LLM providers (Google Gemini, OpenAI, Anthropic, Ollama, DeepSeek, OpenRouter) through a unified abstraction layer. Users supply their own API keys via environment variables (BYOK - Bring Your Own Keys).
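As a rough illustration of the BYOK idea, the sketch below resolves a provider name to an API key read from the environment. The environment variable names, the `ProviderConfig` type, and the `resolve_provider` helper are assumptions made for this example, not the project's actual configuration; Ollama is assumed here to run locally without a key.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical mapping from provider name to the environment variable
# that holds its key. These names are illustrative assumptions.
PROVIDER_ENV_VARS = {
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "ollama": None,  # assumed to be a local runtime that needs no key
}


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    api_key: Optional[str]


def resolve_provider(name: str, env: Optional[Mapping[str, str]] = None) -> ProviderConfig:
    """Look up the user-supplied key for a provider, failing loudly if it is missing."""
    env = os.environ if env is None else env
    key = name.lower()
    if key not in PROVIDER_ENV_VARS:
        raise ValueError(f"Unsupported provider: {name!r}")
    var = PROVIDER_ENV_VARS[key]
    if var is None:
        return ProviderConfig(key, None)
    api_key = env.get(var)
    if not api_key:
        raise RuntimeError(f"Set {var} to use the {key!r} provider")
    return ProviderConfig(key, api_key)
```

Callers would pass the resulting configuration to whichever provider adapter they select, so the rest of the code never reads provider-specific variables directly.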

Layered Architecture#

  • Frontend: Browser extension with background and content scripts

  • Backend: FastAPI server with service-oriented architecture

  • Agent Runtime: LangGraph-based agent orchestration

  • LLM Layer: Model-agnostic provider adapters

  • Safety Layer: Guardrails, logging, and user consent

Tool System#

The system provides 11+ specialized tools for web automation, content processing, and external service integration. Tools are dynamically constructed based on context and user authentication.
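One way to picture dynamic tool construction is a factory that inspects the request context and only returns tools the user is authenticated for. The `Tool` and `RequestContext` types, the field names, and the tool names below are hypothetical and chosen only for illustration; the real tool set and its wiring may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]


@dataclass
class RequestContext:
    # Hypothetical context fields; the real system may track different state.
    google_token: Optional[str] = None
    github_token: Optional[str] = None


def build_tools(ctx: RequestContext) -> List[Tool]:
    """Return only the tools that are usable for this request."""
    # Tools that need no credentials are always available.
    tools = [Tool("website_analysis", lambda query: f"analyze: {query}")]
    # Google-backed tools appear only when the user has connected an account.
    if ctx.google_token:
        tools.append(Tool("gmail", lambda query: f"gmail: {query}"))
        tools.append(Tool("calendar", lambda query: f"calendar: {query}"))
    # Likewise for GitHub.
    if ctx.github_token:
        tools.append(Tool("github", lambda query: f"github: {query}"))
    return tools
```

Building the list per request keeps the agent from being offered a tool it cannot actually call, which avoids failures deep inside a run.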

Declarative Action System#

Browser automation is achieved through JSON-based action plans generated by the LLM, ensuring safety and transparency in automated actions.
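A declarative plan can be checked before anything runs, which is where the safety and transparency come from. The following is a minimal sketch of that idea, assuming a plan is a JSON array of steps with an `action` field; the action names, parameter names, and the `parse_action_plan` helper are hypothetical and do not describe the project's actual schema.

```python
import json
from typing import Any, Dict, List

# Hypothetical allowlist: action name -> exact set of required parameters.
ALLOWED_ACTIONS = {
    "navigate": {"url"},
    "click": {"selector"},
    "type": {"selector", "text"},
}


def parse_action_plan(raw: str) -> List[Dict[str, Any]]:
    """Parse an LLM-produced plan and reject anything outside the allowlist."""
    plan = json.loads(raw)
    if not isinstance(plan, list):
        raise ValueError("Action plan must be a JSON array")
    for index, step in enumerate(plan):
        if not isinstance(step, dict):
            raise ValueError(f"Step {index}: expected an object")
        action = step.get("action")
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            raise ValueError(f"Step {index}: action {action!r} is not allowed")
        params = set(step) - {"action"}
        if params != ALLOWED_ACTIONS[action]:
            raise ValueError(f"Step {index}: unexpected parameters for {action!r}")
    return plan
```

Because the plan is plain data, it can also be logged or shown to the user for consent before the extension executes any step.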


Technology Stack#

  • Language: Python 3.12+ (backend), TypeScript/React (frontend)

  • Agent Framework: LangChain, LangGraph

  • Web Framework: FastAPI, Uvicorn

  • Extension Framework: WXT (Web eXtension Tooling)

  • Browser APIs: WebExtensions API

  • Communication: MCP (Model Context Protocol), WebSocket, HTTP/REST

  • Content Processing: BeautifulSoup, html2text, yt-dlp

  • Security: python-dotenv, pycryptodome


Common Tasks#

Setting Up the Development Environment#

See: Getting Started and Development Guidelines

Configuring API Keys and Environment#

See: Configuration Management

Adding a New Service Integration#

See: Service Integrations and Development Guidelines

Understanding Agent Behavior#

See: AI Agent System and Prompts and Prompt Engineering

Debugging and Troubleshooting#

See: Troubleshooting and FAQ and System Architecture

Deploying to Production#

See: Deployment and Operations

Integrating with External Tools#

See: MCP Server and Data Models and Schemas


Contributing#

Before contributing, please review:

  1. Development Guidelines - Contribution process and standards

  2. Testing Strategy - Testing requirements

  3. Security Considerations - Security best practices


Version History#

This consolidated documentation combines:

  • New Documentation: Topical organization with component-based structure

  • Previous Documentation: Detailed numbered guides with implementation specifics

Both sources have been merged to provide comprehensive coverage of all aspects of the Agentic Browser system.


Additional Resources#


Last Updated: 2026

Documentation Status: Consolidated and Unified