INDEX
Welcome to the Agentic Browser documentation. This page is the unified index for all documentation, consolidating content from both the new topical organization and the detailed numbered guides.
Core Documentation#
Project Overview - System architecture, core components, and capabilities
Getting Started - Installation, setup, and quick start guide
System Components#
Backend Architecture#
API Server - FastAPI router-service-tool architecture
API endpoints and route definitions
Service layer business logic
Tool layer for external integrations
Agent System#
AI Agent System - LangGraph-based agent orchestration
React agent for conversational AI
Browser Use agent for script generation
Tool system and orchestration
Context management and state handling
Browser Extension#
Browser Extension - WebExtensions-based UI and automation
Extension architecture and messaging
Background and content scripts
Side panel UI components
Agent execution engine
WebSocket communication
Authentication system
MCP Server#
MCP Server - Model Context Protocol implementation
Tool definitions and standardization
LLM integration through MCP
Integration with external clients
Features & Integrations#
Service Integrations#
Service Integrations - External service connectivity
Gmail integration for email operations
Google Calendar integration for scheduling
GitHub integration for repository analysis
YouTube integration for video processing
Website analysis and content extraction
Academic portal integration
Data & Models#
Data Models and Schemas - API contracts and schemas
Request and response models
Service integration data structures
Agent communication formats
Validation and error handling
Configuration & Operations#
Configuration Management - Environment setup and configuration
Deployment and Operations - Production deployment
Development Guidelines - Contribution and development practices
Testing Strategy - Testing approach and utilities
Reference#
System Architecture - Detailed system design
Tool System - Tool definitions and framework
Prompts and Prompt Engineering - LLM prompt strategies
Security Considerations - Security best practices
Troubleshooting and FAQ - Common issues and solutions
This consolidated documentation is organized into two parallel structures:
Topical Organization (Primary)#
The documentation is primarily organized by topic/component, making it easy to find information about specific features:
Overview and getting started materials at the top level
Component-specific sections in dedicated folders
Detailed guides and reference material within each component
Historical Reference (Numbered Sections)#
The original numbered documentation structure is preserved for reference:
Section 1: Overview and system architecture
Section 2: Installation and getting started
Section 3: Python backend API
Section 4: Agent intelligence system
Section 5: Browser extension
Section 6: Data models and API contracts
Both structures provide the same information, so choose whichever navigation style works best for you.
For First-Time Users#
Start with Project Overview to understand the system
Follow Getting Started for installation and setup
Explore System Architecture to understand how components interact
For Developers#
Review Development Guidelines for the contribution process
Study relevant component documentation:
Backend: API Server and AI Agent System
Frontend: Browser Extension
Consult Data Models and Schemas for API contracts
For Operators#
Read Configuration Management to set up your environment
Review Deployment and Operations for production setup
Use Troubleshooting and FAQ for common issues
For Integration#
Review MCP Server for protocol-level integration
Study Service Integrations for available features
Check Data Models and Schemas for API contracts
Model-Agnostic Design#
The system supports multiple LLM providers (Google Gemini, OpenAI, Anthropic, Ollama, DeepSeek, OpenRouter) through a unified abstraction layer. Users supply their own API keys via environment variables, an approach known as BYOK (Bring Your Own Keys).
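The BYOK idea can be illustrated with a minimal sketch. It is not the project's actual abstraction layer: the environment variable names, the `ProviderConfig` class, and the `resolve_provider` function are all hypothetical, and the assumption that Ollama needs no key is ours.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical environment variable names; the project's real names may differ.
PROVIDER_KEY_VARS = {
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "ollama": None,  # assumption: a local Ollama instance needs no key
}


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    api_key: Optional[str]


def resolve_provider(name: str, env: Optional[Mapping[str, str]] = None) -> ProviderConfig:
    """Look up the user's own key for a provider from the environment."""
    env = os.environ if env is None else env
    key = name.lower()
    if key not in PROVIDER_KEY_VARS:
        raise ValueError(f"unknown provider: {name}")
    var = PROVIDER_KEY_VARS[key]
    if var is None:
        return ProviderConfig(key, None)
    api_key = env.get(var)
    if not api_key:
        raise RuntimeError(f"set {var} to use the {key} provider")
    return ProviderConfig(key, api_key)
```

Calling `resolve_provider("openai")` reads the key from the process environment. Passing a plain dictionary as `env` makes the lookup easy to test without touching real credentials.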
Layered Architecture#
Frontend: Browser extension with background and content scripts
Backend: FastAPI server with service-oriented architecture
Agent Runtime: LangGraph-based agent orchestration
LLM Layer: Model-agnostic provider adapters
Safety Layer: Guardrails, logging, and user consent
Tool System#
The system provides 11+ specialized tools for web automation, content processing, and external service integration. Tools are dynamically constructed based on context and user authentication.
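As a rough illustration of dynamic tool construction, the sketch below builds a tool list from the current page and the services the user has authenticated. The `RequestContext` and `Tool` types, the tool names, and the selection rules are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class RequestContext:
    # Hypothetical context fields, for illustration only.
    page_url: str = ""
    authenticated_services: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[str], str]


def build_tools(ctx: RequestContext) -> List[Tool]:
    """Return only the tools that make sense for this request."""
    # A tool that is always available in this sketch.
    tools = [Tool("website_analysis", lambda q: f"analyze {ctx.page_url}: {q}")]
    # Service tools are added only when the user has authenticated.
    if "gmail" in ctx.authenticated_services:
        tools.append(Tool("gmail", lambda q: f"gmail: {q}"))
    if "calendar" in ctx.authenticated_services:
        tools.append(Tool("calendar", lambda q: f"calendar: {q}"))
    # A context-dependent tool, offered only on YouTube pages.
    if "youtube.com" in ctx.page_url:
        tools.append(Tool("youtube", lambda q: f"youtube: {q}"))
    return tools
```

Building the list per request means the agent is never offered a tool it cannot use, such as a Gmail tool for a user who has not connected Gmail.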
Declarative Action System#
Browser automation is driven by JSON-based action plans generated by the LLM. Because a plan is declarative data rather than executable code, it can be inspected before it runs, which supports safety and transparency in automated actions.
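One way to check a JSON action plan before execution is sketched below. The action names, the required fields, and the `parse_action_plan` function are assumptions made for illustration; the project's real action schema is documented elsewhere and may look quite different.

```python
import json
from typing import Any, Dict, List

# Hypothetical action schema: each action type and the fields it requires.
ALLOWED_ACTIONS = {
    "navigate": {"url"},
    "click": {"selector"},
    "type": {"selector", "text"},
}


def parse_action_plan(raw: str) -> List[Dict[str, Any]]:
    """Parse an LLM-produced plan and reject anything outside the allowlist."""
    plan = json.loads(raw)
    if not isinstance(plan, list):
        raise ValueError("plan must be a JSON array of actions")
    for i, action in enumerate(plan):
        if not isinstance(action, dict):
            raise ValueError(f"action {i} must be an object")
        kind = action.get("action")
        if not isinstance(kind, str) or kind not in ALLOWED_ACTIONS:
            raise ValueError(f"action {i}: unsupported type {kind!r}")
        missing = ALLOWED_ACTIONS[kind] - set(action)
        if missing:
            raise ValueError(f"action {i}: missing fields {sorted(missing)}")
    return plan
```

A plan such as `[{"action": "navigate", "url": "https://example.com"}]` passes validation, while an unknown action type or a missing field raises `ValueError` before anything touches the browser.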
Technology Stack#
Language: Python 3.12+ (backend), TypeScript/React (frontend)
Agent Framework: LangChain, LangGraph
Web Framework: FastAPI, Uvicorn
Extension Framework: WXT (Web eXtension Tooling)
Browser APIs: WebExtensions API
Communication: MCP (Model Context Protocol), WebSocket, HTTP/REST
Content Processing: BeautifulSoup, html2text, yt-dlp
Security: python-dotenv, pycryptodome
Setting Up the Development Environment#
See: Getting Started and Development Guidelines
Configuring API Keys and Environment#
See: Configuration Management
Adding a New Service Integration#
See: Service Integrations and Development Guidelines
Understanding Agent Behavior#
See: AI Agent System and Prompts and Prompt Engineering
Debugging and Troubleshooting#
See: Troubleshooting and FAQ and System Architecture
Deploying to Production#
See: Deployment and Operations
Integrating with External Tools#
See: MCP Server and Data Models and Schemas
Before contributing, please review:
Development Guidelines - Contribution process and standards
Testing Strategy - Testing requirements
Security Considerations - Security best practices
This consolidated documentation combines:
New Documentation: Topical organization with component-based structure
Previous Documentation: Detailed numbered guides with implementation specifics
Both sources have been merged to provide comprehensive coverage of all aspects of the Agentic Browser system.
GitHub Repository: https://github.com/tashifkhan/agentic-browser
Issue Tracker: https://github.com/tashifkhan/agentic-browser/issues
Discussions: https://github.com/tashifkhan/agentic-browser/discussions
Last Updated: 2026
Documentation Status: Consolidated and Unified