You’ve sent a run, it came back COMPLETED, but the result is empty or wrong. Or maybe it outright FAILED. Before you start rewriting your goal, check whether the site is blocking you — bot detection is the most common cause of silent failures, and the fix is usually two lines of code.
This guide walks through the full process: confirm the problem, apply the right configuration, and tune your goal so the agent behaves more like a human.
Examples use the Python SDK. The same parameters work across all SDKs and the REST API — see API Reference for TypeScript and cURL equivalents.
Setup
Set your API key as an environment variable so you don’t have to pass it explicitly:
```bash
export TINYFISH_API_KEY="your-api-key"
```
Step 1: Confirm Anti-Bot Is the Problem
Don’t assume. Sites can fail for lots of reasons — slow JavaScript, unexpected layout changes, ambiguous goals. Anti-bot has specific fingerprints. Look for them first.
Get the streaming URL and watch the browser
Every run produces a streaming_url — a live browser preview you can open while the run is happening, or replay afterward. This is the fastest way to see exactly what the agent encountered.
Use agent.stream() to capture it as soon as it’s available:
```python
from tinyfish import TinyFish, CompleteEvent

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://example.com/products",
    on_streaming_url=lambda e: print(f"Watch live: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            print("Status:", event.status)
            print("Result:", event.result_json)
```
The on_progress callback shows each step the agent took — if it got stuck on a challenge page, you’ll see it stop there.
If you already started a run with agent.queue(), retrieve the streaming URL from the run object:
```python
run = client.runs.get("run_abc123")
print(run.streaming_url)  # open this in your browser
```
What to look for in the browser preview
Open streaming_url in your browser. What you see tells you what happened:
| What you see | Likely cause |
| --- | --- |
| Cloudflare challenge / “Checking your browser” | Cloudflare bot detection |
| DataDome popup or redirect | DataDome protection |
| Blank page or infinite spinner | IP-based block or JS fingerprinting |
| CAPTCHA (reCAPTCHA, hCaptcha) | CAPTCHA gate — cannot be solved automatically |
| “Access Denied” or 403 page | IP or User-Agent block |
| Login page when you expected content | Session-based bot detection |
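The table above can be codified as a small triage helper. This is an illustrative sketch, not part of the SDK; it just maps the visible text of a stuck page to the likely cause from the table:

```python
def triage_block_cause(page_text: str) -> str:
    """Map the visible text of a stuck page to a likely cause (per the table above)."""
    t = page_text.lower()
    if "checking your browser" in t or "cloudflare" in t:
        return "cloudflare"
    if "datadome" in t:
        return "datadome"
    if "recaptcha" in t or "hcaptcha" in t or "captcha" in t:
        return "captcha"
    if "access denied" in t or "403" in t:
        return "ip_or_user_agent_block"
    if "log in" in t or "sign in" in t:
        return "session_based_detection"
    if not t.strip():
        return "ip_or_fingerprint_block"  # blank page / infinite spinner
    return "unknown"
```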
Check the result — COMPLETED doesn’t mean it worked
A COMPLETED status only means the run finished. The result field tells you whether the agent actually got what you asked for.
```python
from tinyfish import TinyFish, RunStatus, CompleteEvent

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://example.com/products",
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED and event.result_json:
                # Anti-bot shows up here as null fields or explicit failure flags
                result = event.result_json
                if result.get("status") == "failure" or not any(result.values()):
                    print("Blocked — result is empty despite COMPLETED status")
            elif event.status == RunStatus.FAILED:
                print("Run failed:", event.error.message if event.error else "unknown")
```
Anti-bot signatures in the result:
- All fields are null or empty arrays AND the streaming view shows the target content was never loaded
- result.reason mentions “access denied”, “blocked”, or “could not find”
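Those signatures can be checked programmatically. A minimal sketch (a hypothetical helper, assuming the result is a plain dict like result_json above):

```python
from typing import Optional

def looks_blocked(result: Optional[dict]) -> bool:
    """Heuristic: does a COMPLETED result actually look like an anti-bot block?"""
    if not result:
        return True  # no result at all
    if result.get("status") == "failure":
        return True  # explicit failure flag
    reason = str(result.get("reason", "")).lower()
    if any(k in reason for k in ("access denied", "blocked", "could not find")):
        return True
    # Every field null/empty is the classic silent-block signature
    return not any(v for v in result.values())
```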
If the streaming view shows a challenge page and the result is empty or a failure — you’ve confirmed anti-bot. Move to Step 2.
Step 2: Enable Stealth Mode and Proxy
Apply both together. Stealth changes the browser fingerprint; the proxy changes the IP. Sites that use anti-bot services correlate both signals — changing only one often isn’t enough.
Switch to stealth browser
```python
from tinyfish import TinyFish, BrowserProfile

client = TinyFish()

response = client.agent.run(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,  # was BrowserProfile.LITE or omitted
)
```
BrowserProfile.STEALTH is a modified browser with anti-detection techniques. The default (BrowserProfile.LITE) is faster but doesn’t include these measures.
Add a proxy
```python
from tinyfish import TinyFish, BrowserProfile, ProxyConfig, ProxyCountryCode

client = TinyFish()

response = client.agent.run(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(
        enabled=True,
        country_code=ProxyCountryCode.US,  # match the site's expected audience
    ),
)
```
Choosing a country: Pick the country where the site’s primary users are. Available values:
| Enum | Country |
| --- | --- |
| ProxyCountryCode.US | United States |
| ProxyCountryCode.GB | United Kingdom |
| ProxyCountryCode.CA | Canada |
| ProxyCountryCode.DE | Germany |
| ProxyCountryCode.FR | France |
| ProxyCountryCode.JP | Japan |
| ProxyCountryCode.AU | Australia |
Verify what proxy was actually used
After a run, browser_config on the run object confirms what was applied:
```python
run = client.runs.get("run_abc123")
print(run.browser_config.proxy_enabled)       # True/False
print(run.browser_config.proxy_country_code)  # "US" or None
```
Full example with both applied
```python
from tinyfish import TinyFish, BrowserProfile, ProxyConfig, ProxyCountryCode, CompleteEvent, RunStatus

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US),
    on_streaming_url=lambda e: print(f"Watch: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED:
                print("Result:", event.result_json)
            else:
                print("Failed:", event.error.message if event.error else "unknown")
```
Watch the streaming view again after this change. If the actual page loads instead of a challenge screen — you’re through. Move to Step 3 to make the run more reliable at scale.
TinyFish cannot solve CAPTCHAs (reCAPTCHA, hCaptcha, etc.). The configurations above — stealth mode, proxies, and human-like goal patterns — reduce the likelihood of CAPTCHAs being triggered, but if a site serves one, it’s a hard limit for now. We’re actively working on expanding our anti-detection capabilities.
Step 3: Guide the Agent to Behave More Like a Human
Stealth and proxy get you past the door. But some sites layer behavioral analysis on top of fingerprinting — they watch for robotic patterns like instant form submissions, missing cookie consent dismissals, or zero mouse dwell time. Your goal controls a lot of this behavior.
Handle cookie and consent banners
Bot detection systems often look at whether a user interacted with a consent banner before the main content. Always dismiss it explicitly:
```python
goal = """
Close any cookie consent or GDPR banner that appears before doing anything else.
Then extract the product name, current price, and availability status.
Return as JSON: { "name": string, "price": number, "available": boolean }
"""
```
Add deliberate pauses at suspicious checkpoints
Sites with aggressive behavioral detection (checkout pages, login flows) flag runs that move too fast:
```python
goal = """
1. Wait for the page to fully load before interacting with anything.
2. Close any cookie banner.
3. Wait for the banner to disappear before proceeding.
4. Scroll down to view the pricing section.
5. Wait for the pricing section to fully render, then extract all plan names and monthly prices.
Return as JSON array: [{ "plan": string, "price_monthly": number }]
"""
```
Describe elements visually, not by selector
Sites sometimes rotate IDs and class names deliberately to break automation. Visual descriptions are more resilient:
```python
# Fragile — may be intentionally changed by the site
goal = "Click the button with id='add-to-cart-btn'"

# Resilient — describes what a human would see
goal = "Click the blue 'Add to Cart' button directly below the product price"
```
Use numbered steps for multi-step flows
For login flows or multi-page workflows, numbered steps give the agent explicit decision points rather than leaving it to guess:
```python
goal = """
1. Wait for the page to fully load (spinner should disappear).
2. If a cookie consent banner is visible, click 'Accept' or 'Accept All'.
3. Locate the search bar at the top of the page and type "running shoes".
4. Wait for autocomplete suggestions to appear, then press Enter.
5. Wait for results to load.
6. Extract the first 10 results: product name, price, and product URL.
Stop after 10 results. Do not paginate.
Return as JSON array.
"""
```
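If you generate goals programmatically, a tiny builder keeps the numbered-step convention consistent. This helper is purely illustrative; the goal is still just a string the SDK receives:

```python
def numbered_goal(steps, output_spec):
    """Join steps into a numbered goal string, ending with the output format."""
    lines = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n" + output_spec

goal = numbered_goal(
    [
        "Wait for the page to fully load (spinner should disappear).",
        "If a cookie consent banner is visible, click 'Accept' or 'Accept All'.",
        "Extract the first 10 results: product name, price, and product URL.",
    ],
    "Return as JSON array.",
)
```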
Add explicit fallback instructions
Protected sites sometimes show intermediate pages (challenge passed, now redirecting). Tell the agent how to handle them:
```python
goal = """
Extract the product price from this page.
If a loading screen or redirect page appears, wait for it to complete before extracting.
If an 'Access Denied' page appears, return { "error": "access_denied" }.
If the price shows 'Contact Us', return { "price": null, "contact_required": true }.
Return: { "price": number or null, "currency": string }
"""
```
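On the consuming side, those explicit fallback shapes are easy to branch on. A sketch (the field names match the goal above; the helper itself is hypothetical):

```python
def interpret_price_result(result: dict) -> str:
    """Turn the goal's agreed-upon result shapes into a single outcome label."""
    if result.get("error") == "access_denied":
        return "blocked"
    if result.get("contact_required"):
        return "contact_sales"
    if result.get("price") is not None:
        return "priced"
    return "no_price"
```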
Putting It All Together
A complete hardened run for a protected site:
```python
from tinyfish import (
    TinyFish,
    BrowserProfile,
    ProxyConfig,
    ProxyCountryCode,
    CompleteEvent,
    RunStatus,
)

client = TinyFish()

with client.agent.stream(
    url="https://protected-site.com/pricing",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US),
    goal="""
    1. Wait for the page to fully load.
    2. Close any cookie consent or GDPR banner that appears.
    3. Wait 1 second before proceeding.
    4. Locate the pricing section — it typically shows plan names in a grid or table.
    5. For each plan, extract: plan name, monthly price, and annual price if shown.
    If a Cloudflare or security check page appears, wait for it to complete automatically.
    If you see an 'Access Denied' or CAPTCHA page, return { "error": "blocked" }.
    Do not click any purchase or checkout buttons.
    Return as JSON array:
    [{ "plan": "Pro", "monthly_price": 49, "annual_price": 39 }]
    """,
    on_streaming_url=lambda e: print(f"Watch run: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED:
                print("Result:", event.result_json)
            else:
                print("Failed:", event.error.message if event.error else "unknown")
```
Decision Tree
```
Run returned empty or wrong result?
│
├── Open streaming_url (from on_streaming_url callback or runs.get())
│   ├── Challenge / "Checking your browser" page → Anti-bot confirmed
│   ├── Access Denied / 403 → Anti-bot confirmed
│   ├── Blank page → Likely anti-bot (fingerprint-based)
│   └── Page loaded but result wrong → Goal issue, not anti-bot
│
└── Anti-bot confirmed?
    ├── Add browser_profile=BrowserProfile.STEALTH
    ├── Add proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US)
    ├── Re-run and watch stream again
    │   ├── Page loads → Add goal hardening (Step 3) for reliability at scale
    │   └── Still blocked → Site likely requires CAPTCHA (hard limit)
    └── Done
```
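The same tree can be wired into an automated retry loop. A sketch (the signal strings are just labels for what you observed in the streaming view, not SDK values):

```python
def next_action(page_signal: str, result_ok: bool) -> str:
    """Codify the decision tree: what the stream showed -> what to do next."""
    if page_signal in ("challenge", "access_denied", "blank"):
        return "enable_stealth_and_proxy"  # anti-bot confirmed (or likely)
    if page_signal == "captcha":
        return "hard_limit"                # CAPTCHAs cannot be solved
    if not result_ok:
        return "refine_goal"               # page loaded; the goal is the problem
    return "done"
```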
Creative Solutions and Iteration
When a site is actively defended, the most effective approach is often to rethink the workflow rather than force the original one.
Watch yourself do it first. Before writing your goal, navigate to the target site yourself and think through what a human would actually do. Use that as your script. The streaming view is also useful here — watch a run or two to understand exactly what the agent encounters before committing to a final goal.
Start at the front door. Linking directly to a filtered search results page or a deep URL can look robotic. Starting at target.com and navigating to your destination — searching for a product, clicking through a category — often succeeds where a direct deep link fails.
Go to the source. If the formatted data you need lives behind anti-bot and paywalls, ask whether the underlying raw data is available elsewhere. Aggregator sites are often heavily protected; their primary sources may not be. Synthesizing from multiple simpler sources is frequently more reliable than fighting for one complex one.
Check for a public API or feed first. Some sites that actively block scraping also publish APIs, RSS feeds, or sitemaps. Five minutes checking saves a lot of iteration.
Keep dwell time low. The longer an agent stays on a site, the higher the likelihood of detection. Balancing human-like navigation with speed matters — break large workflows into focused, smaller tasks that can be handled by multiple agents running in parallel. You get both the human-like pacing and the throughput. Scale is one of TinyFish’s superpowers.
Time your runs intentionally. Anti-bot systems are sensitive to traffic volume. Running during off-peak hours for your target site (for US-based sites, late morning to early afternoon PST often works well) can reduce the likelihood of triggering rate-based challenges. If you’re testing a new workflow, start with a single run during a quiet period before scaling up.
Vary your entry points at scale. If you’re running hundreds of batch jobs against the same site, uniform traffic patterns can themselves become a fingerprint — even across different IPs. Mixing up how runs navigate to their destination (some via homepage search, some via category pages, some direct) makes the aggregate traffic look more organic. Runs have an approximate 5-minute timeout, so this also naturally encourages breaking complex workflows into smaller, parallelizable pieces.
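One way to vary entry points across a batch (illustrative only; the navigation phrasings are example goal fragments, not SDK features):

```python
import random

ENTRY_PATTERNS = [
    "Start at {home}, use the site search to find '{query}', and open the best match.",
    "Start at {home}, browse to the relevant category, and locate '{query}'.",
    "Go directly to {url}.",
]

def entry_instruction(home: str, url: str, query: str) -> str:
    """Pick a random navigation preamble so batch traffic doesn't look uniform."""
    return random.choice(ENTRY_PATTERNS).format(home=home, url=url, query=query)
```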
Manage concurrency intentionally. TinyFish does not throttle runs by domain — if you enqueue a large batch against the same site, they will fire in parallel up to your account’s concurrency limit. For sensitive sites, consider staggering your jobs in your own queuing logic rather than enqueuing everything at once.
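A minimal way to stagger a batch in your own code, assuming enqueue is whatever function submits one run (for example, a wrapper around agent.queue()):

```python
import random
import time

def staggered_submit(jobs, enqueue, base_delay=5.0, jitter=3.0):
    """Submit jobs one at a time with randomized gaps instead of all at once."""
    for job in jobs:
        enqueue(job)
        # Randomized spacing avoids a uniform burst pattern against one domain
        time.sleep(base_delay + random.uniform(0, jitter))
```

For large batches against the same domain, even a few seconds of jitter between submissions breaks up the burst pattern that rate-based challenges key on.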
What’s Coming
TinyFish continuously improves browser behavior and anti-detection performance across the web. If a site blocked you on a previous project, it’s worth trying again — the same run that failed before may work without any changes on your end.
Authenticated sessions (in beta): TinyFish is adding a first-class Auth tool for logging into sites as part of a run. Beyond unlocking gated content, authenticated sessions naturally bypass many anti-bot measures — logged-in users are treated very differently by most protection systems. Contact us to request early access.
Need Help?
If you’re stuck on a specific site, share the URL and your current configuration with us — we can often diagnose the issue quickly.