User Guide ========== This comprehensive guide covers all of PyBA's features and configuration options. .. contents:: :local: :depth: 3 Engine Configuration -------------------- The ``Engine`` class is the main entry point for PyBA. Here are all available options: .. code-block:: python from pyba import Engine, Database engine = Engine( # LLM Provider (choose one) openai_api_key="sk-...", # OpenAI API key gemini_api_key="...", # Google Gemini API key vertexai_project_id="...", # VertexAI project ID vertexai_server_location="...", # VertexAI region (required with project_id) # Model Override model_name=None, # Override the default model (default: None, uses provider default) # Browser Settings headless=True, # Run without visible browser (default: True) handle_dependencies=False, # Auto-install Playwright deps (default: False) # Logging & Tracing use_logger=False, # Print actions to console (default: False) enable_tracing=False, # Generate trace.zip files (default: False) trace_save_directory=None, # Where to save traces (default: /tmp/pyba/) # Execution Limits max_depth=20, # Max actions per run (default from config) # Stealth use_random=False, # Random mouse/scroll jitters (default: False) # Database database=None, # Database instance for logging # Resource Optimization low_memory=False, # Enable low memory mode (default: False) # Secrets secrets=None, # Add a password manager instance for automated login ) Step-by-Step Configuration -------------------------- The ``Step`` class provides interactive, step-by-step control over a persistent browser: .. code-block:: python from pyba import Step, Database step = Step( # LLM Provider (choose one) openai_api_key="sk-...", # OpenAI API key gemini_api_key="...", # Google Gemini API key vertexai_project_id="...", # VertexAI project ID vertexai_server_location="...", # VertexAI region # Model Override model_name=None, # Override the default model (default: None, uses provider default) # Browser Settings headless=False, # Always visible by default handle_dependencies=False, # Auto-install Playwright deps # Logging & Tracing use_logger=False, # Print actions to console enable_tracing=False, # Generate trace.zip files trace_save_directory=None, # Where to save traces # Step-specific max_actions_per_step=5, # Max actions per instruction (default: 5) get_output=False, # Ask the model for a summary when a step completes (default: False) # Stealth use_random=False, # Random mouse/scroll jitters # Database database=None, # Database instance for logging # Resource Optimization low_memory=False, # Enable low memory mode (default: False) # Secrets secrets=None, # Add a password manager instance for automated login ) **The Step lifecycle:** .. code-block:: python # 1. Launch the browser await step.start(automated_login_sites=["instagram"]) # Optional login sites # 2. Feed instructions one at a time await step.step("Go to my Instagram profile") await step.step("Click on the first post") output = await step.step("Extract the caption and like count") # 3. Shut down await step.stop() Running Tasks ------------- The ``run()`` Method ^^^^^^^^^^^^^^^^^^^^ The main method for executing browser automation tasks: .. code-block:: python result = engine.sync_run( prompt="Your natural language instruction", automated_login_sites=["instagram", "gmail"], # Optional extraction_format=MyPydanticModel # Optional ) **Parameters:** - ``prompt`` (required): Natural language description of what you want to do - ``automated_login_sites``: List of sites to auto-login (credentials from env vars) - ``extraction_format``: Pydantic model defining the output structure Sync vs Async ^^^^^^^^^^^^^ .. code-block:: python # Synchronous (blocking) result = engine.sync_run(prompt="...") # Asynchronous import asyncio async def main(): result = await engine.run(prompt="...") asyncio.run(main()) Automated Logins ---------------- PyBA includes pre-built login handlers for common platforms. Credentials are stored securely in environment variables. Setting Up Credentials ^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: bash # Instagram export instagram_username=your_username export instagram_password=your_password # Facebook export facebook_username=your_email export facebook_password=your_password # Gmail export gmail_username=your_email export gmail_password=your_password Using Auto-Login ^^^^^^^^^^^^^^^^ .. code-block:: python result = engine.sync_run( prompt="Go to my Instagram profile and count my followers", automated_login_sites=["instagram"] ) .. note:: Credentials are **never** sent to the LLM. The login scripts use hardcoded selectors and execute locally. Secrets and Password Manager ---------------------------- For more complex setups, you can plug a password manager or secret store into PyBA so that all required credentials are loaded into the environment **before** any login happens. At a high level: - **You implement** a small class that knows how to fetch your secrets. - **PyBA calls** a ``resolve() -> dict[str, str]`` method on that class. - **The result** is applied to ``os.environ`` and then consumed by the login scripts (``instagram``, ``facebook``, ``gmail``, etc.). PasswordManager Protocol ^^^^^^^^^^^^^^^^^^^^^^^^ PyBA defines a minimal protocol that your password manager must follow: .. code-block:: python from pyba.utils.structure import PasswordManager class MyPasswordManager(PasswordManager): def resolve(self) -> dict[str, str]: # Pull values from your real secret store here # (1Password, Vault, AWS Secrets Manager, local file, etc.) return { "instagram_username": "my-username", "instagram_password": "super-secret-password", } The contract is: - ``resolve()`` must be callable with **no arguments** - ``resolve()`` must return a ``dict[str, str]`` - Keys are environment variable names (for example ``instagram_username``) - Values are the corresponding secrets If ``resolve()`` returns anything other than a ``dict[str, str]``, PyBA will raise a :class:`TypeError`. Passing the password manager to engines ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ All engines accept a ``secrets`` argument which should be an **instance** of your password manager: .. code-block:: python from pyba import Engine password_manager = MyPasswordManager() engine = Engine( openai_api_key="sk-...", secrets=password_manager, ) result = engine.sync_run( prompt="Go to my Instagram profile and get the first 5 post captions", automated_login_sites=["instagram"], ) Behind the scenes: - ``Engine``, ``DFS``, ``BFS`` and ``Step`` all forward ``secrets`` into a shared base class. - The base engine calls :func:`pyba.utils.common.extract_secrets` which: - Validates that the object has a ``resolve`` method - Calls ``resolve()`` and checks that the return type is a ``dict[str, str]`` - Sets each key/value pair into ``os.environ`` Custom password manager examples ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You can map any backing store into environment variables. For example, reading from a JSON file: .. code-block:: python import json from pathlib import Path from pyba.utils.structure import PasswordManager class JsonPasswordManager(PasswordManager): def __init__(self, path: str): self._path = Path(path) def resolve(self) -> dict[str, str]: data = json.loads(self._path.read_text()) # Expecting a flat key/value mapping in the JSON file return {str(k): str(v) for k, v in data.items()} # Usage secrets = JsonPasswordManager("secrets.json") engine = Engine(openai_api_key="sk-...", secrets=secrets) Error cases and CannotResolveError ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you accidentally pass the **class** instead of an **instance** (for example ``secrets=MyPasswordManager`` instead of ``secrets=MyPasswordManager()``), or if your password manager requires positional arguments that you did not provide, PyBA will raise :class:`pyba.utils.exceptions.CannotResolveError`. Common mistakes: - ``secrets=MyPasswordManager`` (missing ``()``) → ``CannotResolveError`` - ``resolve()`` defined with parameters (for example ``def resolve(self, profile: str)``) → ``CannotResolveError`` - ``resolve()`` returns a list or other type → ``TypeError: resolve() must return dict[str, str]`` When you hit this error, ensure that: - You are passing an **instance** of your password manager into the engine - ``resolve()`` takes no arguments and returns a plain dict of string keys and values 2FA Handling ^^^^^^^^^^^^ If a site requires 2FA, PyBA will pause and wait for you to complete it manually. The automation resumes once you're logged in. Tracing & Debugging ------------------- PyBA generates Playwright trace files for debugging failed automations. Enabling Tracing ^^^^^^^^^^^^^^^^ .. code-block:: python engine = Engine( openai_api_key="...", enable_tracing=True, trace_save_directory="/path/to/traces" # Optional, defaults to /tmp/pyba/ ) engine.sync_run(prompt="Your task") Traces are saved as ``_trace.zip``. Viewing Traces ^^^^^^^^^^^^^^ Open traces with Playwright's Trace Viewer: .. code-block:: bash # Online viewer # Upload your trace.zip to https://trace.playwright.dev/ # Local viewer npx playwright show-trace /path/to/trace.zip The trace shows: - Every action taken - Network requests - DOM snapshots - Console logs - Screenshots at each step Stealth Mode ------------ For sites with bot detection, enable stealth features: .. code-block:: python engine = Engine( openai_api_key="...", use_random=True, # Enable random movements headless=False # Some sites detect headless browsers ) What ``use_random=True`` Does ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - **Random mouse movements**: Simulates natural cursor movement during waits - **Scroll jitters**: Adds slight randomness to scrolling - **Human-like delays**: Varies timing between actions .. note:: Stealth mode slightly slows down automation but significantly improves success rate on protected sites. Error Handling -------------- PyBA has built-in retry logic with exponential backoff for API errors. Automatic Retries ^^^^^^^^^^^^^^^^^ When an LLM API call fails (rate limits, timeouts), PyBA automatically: 1. Waits with exponential backoff (1s, 2s, 4s, 8s... up to 60s max) 2. Adds random jitter to prevent thundering herd 3. Retries the request Action Failures ^^^^^^^^^^^^^^^ When a browser action fails (element not found, timeout), PyBA: 1. Captures the failure reason 2. Re-extracts the current DOM 3. Asks the LLM for an alternative action 4. Logs the failure to the database (if configured) Logging ------- Enable verbose logging to see what PyBA is doing: .. code-block:: python engine = Engine( openai_api_key="...", use_logger=True ) Log output includes: - ``[INFO]`` General information and progress - ``[ACTION]`` Browser actions being performed - ``[SUCCESS]`` Completed operations - ``[WARNING]`` Non-fatal issues - ``[ERROR]`` Failures and exceptions Configuration File ------------------ PyBA reads defaults from ``pyba/config.yaml``. You can override these with constructor arguments. Key configuration sections: .. code-block:: yaml main_engine_configs: headless_mode: true handle_dependencies: false use_logger: false enable_tracing: false max_iteration_steps: 20 max_depth: 10 # For DFS/BFS modes max_breadth: 5 # For BFS mode openai: provider: "openai" model: "gpt-4o" vertexai: provider: "vertexai" model: "gemini-2.0-flash" gemini: provider: "gemini" model: "gemini-2.5-pro" Best Practices -------------- Writing Good Prompts ^^^^^^^^^^^^^^^^^^^^ **Be specific:** .. code-block:: python # Good "Go to amazon.com, search for 'wireless headphones', sort by price low to high, and get the name and price of the first 3 results" # Too vague "Find cheap headphones" **Break complex tasks into steps:** .. code-block:: python # Good "Go to GitHub, navigate to the trending page, filter by Python language, and get the names of the top 5 repositories" # Unclear order "Get trending Python repos from GitHub" Low Memory Mode ^^^^^^^^^^^^^^^^ For resource-constrained environments (CI servers, containers, low-spec machines), enable low memory mode to reduce browser resource usage: .. code-block:: python engine = Engine( openai_api_key="...", low_memory=True ) **What low memory mode does:** *Python-side optimizations (~120MB saved at idle):* - Skips loading ``oxymouse`` (and its dependencies ``numpy``, ``scipy``), saving ~46MB of RAM per process - Lazy-loads LLM provider libraries — only the chosen provider (OpenAI or Gemini) is loaded, saving ~64-73MB - Reduces SQLAlchemy connection pool from 50 to 5 connections *Chromium-side flags (container stability, not RAM):* - ``--disable-dev-shm-usage`` — Uses /tmp instead of /dev/shm (prevents OOM in Docker containers) - ``--disable-gpu`` — Disables GPU compositing (not needed for headless) - ``--disable-background-networking`` — Stops background network requests (updates, safe browsing) - ``--disable-extensions`` — No browser extensions loaded - ``--disable-sync`` — Disables Chrome profile sync - ``--disable-features=Translate,BackForwardCache`` — Disables page translation and back/forward page caching - ``--mute-audio`` — Mutes all audio output - ``--disable-lcd-text`` — Disables subpixel text rendering - Sets device scale factor to 1 .. note:: The Chromium flags do not measurably reduce browser RSS. Chromium memory is dominated by DOM complexity, JavaScript execution, and network data — not by these flags. The flags improve stability in containerized environments (especially ``--disable-dev-shm-usage``). .. note:: ``low_memory=True`` and ``use_random=True`` cannot be used together. Random mouse/scroll movements require ``oxymouse``, which low memory mode excludes to save RAM. Low memory mode is available on all engine classes: ``Engine``, ``Step``, ``DFS``, and ``BFS``. .. code-block:: python from pyba import Step, DFS, BFS, Database # Step mode with low memory step = Step(openai_api_key="...", low_memory=True) # DFS/BFS with low memory db = Database(engine="sqlite", name="/tmp/pyba.db") dfs = DFS(openai_api_key="...", database=db, low_memory=True) bfs = BFS(openai_api_key="...", database=db, low_memory=True) .. seealso:: See :doc:`performance` for benchmarking scripts, measured results, and performance tips. Common Issues ^^^^^^^^^^^^^ **"Element not found"** - The page may still be loading—try increasing timeouts - The selector may have changed—enable tracing to debug **"Rate limit exceeded"** - PyBA handles this automatically with retries - Consider using a different LLM provider **"Login failed"** - Check your environment variables are set correctly - The site may have changed their login flow