You’ve sent a run, it came back COMPLETED, but the result is empty or wrong. Or maybe it outright FAILED. Before you start rewriting your goal, check whether the site is blocking you — bot detection is the most common cause of silent failures, and the fix is usually two lines of code.
This guide walks through the full process: confirm the problem, apply the right configuration, and tune your goal so the agent behaves more like a human.
Examples use the Python SDK. The same parameters work across all SDKs and the REST API — see API Reference for TypeScript and cURL equivalents.
Setup
Set your API key as an environment variable so you don’t have to pass it explicitly:
```bash
export TINYFISH_API_KEY="your-api-key"
```
Step 1: Confirm Anti-Bot Is the Problem
Don’t assume. Sites can fail for lots of reasons — slow JavaScript, unexpected layout changes, ambiguous goals. Anti-bot has specific fingerprints. Look for them first.
Get the streaming URL and watch the browser
Every run produces a streaming_url — a live browser preview you can open while the run is happening, or replay afterward. This is the fastest way to see exactly what the agent encountered.
Use agent.stream() to capture it as soon as it’s available:
```python
from tinyfish import TinyFish, CompleteEvent

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://example.com/products",
    on_streaming_url=lambda e: print(f"Watch live: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            print("Status:", event.status)
            print("Result:", event.result_json)
```
The on_progress callback shows each step the agent took — if it got stuck on a challenge page, you’ll see it stop there.
If you already started a run with agent.queue(), retrieve the streaming URL from the run object:
```python
run = client.runs.get("run_abc123")
print(run.streaming_url)  # open this in your browser
```
What to look for in the browser preview
Open streaming_url in your browser. What you see tells you what happened:
| What you see | Likely cause |
| --- | --- |
| Cloudflare challenge / “Checking your browser” | Cloudflare bot detection |
| DataDome popup or redirect | DataDome protection |
| Blank page or infinite spinner | IP-based block or JS fingerprinting |
| CAPTCHA (reCAPTCHA, hCaptcha) | CAPTCHA gate — cannot be solved automatically |
| “Access Denied” or 403 page | IP or User-Agent block |
| Login page when you expected content | Session-based bot detection |
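The table above can be codified as a small triage helper. This is an illustrative sketch, not part of the SDK; it just maps the visible text of a stuck page to the likely cause from the table:

```python
def triage_block_cause(page_text: str) -> str:
    """Map the visible text of a stuck page to a likely cause (per the table above)."""
    t = page_text.lower()
    if "checking your browser" in t or "cloudflare" in t:
        return "cloudflare"
    if "datadome" in t:
        return "datadome"
    if "recaptcha" in t or "hcaptcha" in t or "captcha" in t:
        return "captcha"
    if "access denied" in t or "403" in t:
        return "ip_or_user_agent_block"
    if "log in" in t or "sign in" in t:
        return "session_based_detection"
    if not t.strip():
        return "ip_or_fingerprint_block"  # blank page / infinite spinner
    return "unknown"
```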
Check the result — COMPLETED doesn’t mean it worked
A COMPLETED status only means the run finished. The result field tells you whether the agent actually got what you asked for.
```python
from tinyfish import TinyFish, RunStatus, CompleteEvent

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://example.com/products",
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED and event.result_json:
                # Anti-bot shows up here as null fields or explicit failure flags
                result = event.result_json
                if result.get("status") == "failure" or not any(result.values()):
                    print("Blocked — result is empty despite COMPLETED status")
            elif event.status == RunStatus.FAILED:
                print("Run failed:", event.error.message if event.error else "unknown")
```
Anti-bot signatures in the result:
- All fields are null or empty arrays AND the streaming view shows the target content was never loaded
- result.reason mentions “access denied”, “blocked”, or “could not find”
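Those signatures can be checked programmatically. A minimal sketch (a hypothetical helper, assuming the result is a plain dict like result_json above):

```python
from typing import Optional

def looks_blocked(result: Optional[dict]) -> bool:
    """Heuristic: does a COMPLETED result actually look like an anti-bot block?"""
    if not result:
        return True  # no result at all
    if result.get("status") == "failure":
        return True  # explicit failure flag
    reason = str(result.get("reason", "")).lower()
    if any(k in reason for k in ("access denied", "blocked", "could not find")):
        return True
    # Every field null/empty is the classic silent-block signature
    return not any(v for v in result.values())
```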
If the streaming view shows a challenge page and the result is empty or a failure — you’ve confirmed anti-bot. Move to Step 2.
Step 2: Enable Stealth Mode and Proxy
Apply both together. Stealth changes the browser fingerprint; the proxy changes the IP. Sites that use anti-bot services correlate both signals — changing only one often isn’t enough.
Switch to stealth browser
```python
from tinyfish import TinyFish, BrowserProfile

client = TinyFish()

response = client.agent.run(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,  # was BrowserProfile.LITE or omitted
)
```
BrowserProfile.STEALTH is a modified browser with anti-detection techniques. The default (BrowserProfile.LITE) is faster but doesn’t include these measures.
Add a proxy
```python
from tinyfish import TinyFish, BrowserProfile, ProxyConfig, ProxyCountryCode

client = TinyFish()

response = client.agent.run(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(
        enabled=True,
        country_code=ProxyCountryCode.US,  # match the site's expected audience
    ),
)
```
Choosing a country: Pick the country where the site’s primary users are. Available values:
| Enum | Country |
| --- | --- |
| ProxyCountryCode.US | United States |
| ProxyCountryCode.GB | United Kingdom |
| ProxyCountryCode.CA | Canada |
| ProxyCountryCode.DE | Germany |
| ProxyCountryCode.FR | France |
| ProxyCountryCode.JP | Japan |
| ProxyCountryCode.AU | Australia |
Verify what proxy was actually used
After a run, browser_config on the run object confirms what was applied:
```python
run = client.runs.get("run_abc123")
print(run.browser_config.proxy_enabled)       # True/False
print(run.browser_config.proxy_country_code)  # "US" or None
```
Full example with both applied
```python
from tinyfish import TinyFish, BrowserProfile, ProxyConfig, ProxyCountryCode, CompleteEvent, RunStatus

client = TinyFish()

with client.agent.stream(
    goal="Extract the product name and price",
    url="https://protected-site.com/products",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US),
    on_streaming_url=lambda e: print(f"Watch: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED:
                print("Result:", event.result_json)
            else:
                print("Failed:", event.error.message if event.error else "unknown")
```
Watch the streaming view again after this change. If the actual page loads instead of a challenge screen — you’re through. Move to Step 3 to make the run more reliable at scale.
TinyFish cannot solve CAPTCHAs (reCAPTCHA, hCaptcha, etc.). The configurations above — stealth mode, proxies, and human-like goal patterns — reduce the likelihood of CAPTCHAs being triggered, but if a site serves one, it’s a hard limit for now. We’re actively working on expanding our anti-detection capabilities.
Step 3: Guide the Agent to Behave More Like a Human
Stealth and proxy get you past the door. But some sites layer behavioral analysis on top of fingerprinting — they watch for robotic patterns like instant form submissions, missing cookie consent dismissals, or zero mouse dwell time. Your goal controls a lot of this behavior.
Handle cookie and consent banners
Bot detection systems often look at whether a user interacted with a consent banner before the main content. Always dismiss it explicitly:
```python
goal = """
Close any cookie consent or GDPR banner that appears before doing anything else.
Then extract the product name, current price, and availability status.
Return as JSON: { "name": string, "price": number, "available": boolean }
"""
```
Add deliberate pauses at suspicious checkpoints
Sites with aggressive behavioral detection (checkout pages, login flows) flag runs that move too fast:
```python
goal = """
1. Wait for the page to fully load before interacting with anything.
2. Close any cookie banner.
3. Wait for the banner to disappear before proceeding.
4. Scroll down to view the pricing section.
5. Wait for the pricing section to fully render, then extract all plan names and monthly prices.
Return as JSON array: [{ "plan": string, "price_monthly": number }]
"""
```
Describe elements visually, not by selector
Sites sometimes rotate IDs and class names deliberately to break automation. Visual descriptions are more resilient:
```python
# Fragile — may be intentionally changed by the site
goal = "Click the button with id='add-to-cart-btn'"

# Resilient — describes what a human would see
goal = "Click the blue 'Add to Cart' button directly below the product price"
```
Use numbered steps for multi-step flows
For login flows or multi-page workflows, numbered steps give the agent explicit decision points rather than leaving it to guess:
```python
goal = """
1. Wait for the page to fully load (spinner should disappear).
2. If a cookie consent banner is visible, click 'Accept' or 'Accept All'.
3. Locate the search bar at the top of the page and type "running shoes".
4. Wait for autocomplete suggestions to appear, then press Enter.
5. Wait for results to load.
6. Extract the first 10 results: product name, price, and product URL.
Stop after 10 results. Do not paginate.
Return as JSON array.
"""
```
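If you generate goals programmatically, a tiny builder keeps the numbered-step convention consistent. This helper is purely illustrative; the goal is still just a string the SDK receives:

```python
def numbered_goal(steps, output_spec):
    """Join steps into a numbered goal string, ending with the output format."""
    lines = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n" + output_spec

goal = numbered_goal(
    [
        "Wait for the page to fully load (spinner should disappear).",
        "If a cookie consent banner is visible, click 'Accept' or 'Accept All'.",
        "Extract the first 10 results: product name, price, and product URL.",
    ],
    "Return as JSON array.",
)
```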
Add explicit fallback instructions
Protected sites sometimes show intermediate pages (challenge passed, now redirecting). Tell the agent how to handle them:
```python
goal = """
Extract the product price from this page.
If a loading screen or redirect page appears, wait for it to complete before extracting.
If an 'Access Denied' page appears, return { "error": "access_denied" }.
If the price shows 'Contact Us', return { "price": null, "contact_required": true }.
Return: { "price": number or null, "currency": string }
"""
```
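On the consuming side, those explicit fallback shapes are easy to branch on. A sketch (the field names match the goal above; the helper itself is hypothetical):

```python
def interpret_price_result(result: dict) -> str:
    """Turn the goal's agreed-upon result shapes into a single outcome label."""
    if result.get("error") == "access_denied":
        return "blocked"
    if result.get("contact_required"):
        return "contact_sales"
    if result.get("price") is not None:
        return "priced"
    return "no_price"
```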
Putting It All Together
A complete hardened run for a protected site:
```python
from tinyfish import (
    TinyFish,
    BrowserProfile,
    ProxyConfig,
    ProxyCountryCode,
    CompleteEvent,
    RunStatus,
)

client = TinyFish()

with client.agent.stream(
    url="https://protected-site.com/pricing",
    browser_profile=BrowserProfile.STEALTH,
    proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US),
    goal="""
    1. Wait for the page to fully load.
    2. Close any cookie consent or GDPR banner that appears.
    3. Wait 1 second before proceeding.
    4. Locate the pricing section — it typically shows plan names in a grid or table.
    5. For each plan, extract: plan name, monthly price, and annual price if shown.
    If a Cloudflare or security check page appears, wait for it to complete automatically.
    If you see an 'Access Denied' or CAPTCHA page, return { "error": "blocked" }.
    Do not click any purchase or checkout buttons.
    Return as JSON array:
    [{ "plan": "Pro", "monthly_price": 49, "annual_price": 39 }]
    """,
    on_streaming_url=lambda e: print(f"Watch run: {e.streaming_url}"),
    on_progress=lambda e: print(f"  > {e.purpose}"),
) as stream:
    for event in stream:
        if isinstance(event, CompleteEvent):
            if event.status == RunStatus.COMPLETED:
                print("Result:", event.result_json)
            else:
                print("Failed:", event.error.message if event.error else "unknown")
```
Decision Tree
```
Run returned empty or wrong result?
│
├── Open streaming_url (from on_streaming_url callback or runs.get())
│   ├── Challenge / "Checking your browser" page → Anti-bot confirmed
│   ├── Access Denied / 403 → Anti-bot confirmed
│   ├── Blank page → Likely anti-bot (fingerprint-based)
│   └── Page loaded but result wrong → Goal issue, not anti-bot
│
└── Anti-bot confirmed?
    ├── Add browser_profile=BrowserProfile.STEALTH
    ├── Add proxy_config=ProxyConfig(enabled=True, country_code=ProxyCountryCode.US)
    ├── Re-run and watch stream again
    │   ├── Page loads → Add goal hardening (Step 3) for reliability at scale
    │   └── Still blocked → Site likely requires CAPTCHA (hard limit)
    └── Done
```
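The same tree can be wired into an automated retry loop. A sketch (the signal strings are just labels for what you observed in the streaming view, not SDK values):

```python
def next_action(page_signal: str, result_ok: bool) -> str:
    """Codify the decision tree: what the stream showed -> what to do next."""
    if page_signal in ("challenge", "access_denied", "blank"):
        return "enable_stealth_and_proxy"  # anti-bot confirmed (or likely)
    if page_signal == "captcha":
        return "hard_limit"                # CAPTCHAs cannot be solved
    if not result_ok:
        return "refine_goal"               # page loaded; the goal is the problem
    return "done"
```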
Creative Solutions and Iteration
When a site is actively defended, the most effective approach is often to rethink the workflow rather than force the original one.
Watch yourself do it first. Before writing your goal, navigate to the target site yourself and think through what a human would actually do. Use that as your script. The streaming view is also useful here — watch a run or two to understand exactly what the agent encounters before committing to a final goal.
Start at the front door. Linking directly to a filtered search results page or a deep URL can look robotic. Starting at target.com and navigating to your destination — searching for a product, clicking through a category — often succeeds where a direct deep link fails.
Go to the source. If the formatted data you need lives behind anti-bot and paywalls, ask whether the underlying raw data is available elsewhere. Aggregator sites are often heavily protected; their primary sources may not be. Synthesizing from multiple simpler sources is frequently more reliable than fighting for one complex one.
Check for a public API or feed first. Some sites that actively block scraping also publish APIs, RSS feeds, or sitemaps. Five minutes checking saves a lot of iteration.
Keep dwell time low. The longer an agent stays on a site, the higher the likelihood of detection. Balancing human-like navigation with speed matters — break large workflows into focused, smaller tasks that can be handled by multiple agents running in parallel. You get both the human-like pacing and the throughput. Scale is one of TinyFish’s superpowers.
Time your runs intentionally. Anti-bot systems are sensitive to traffic volume. Running during off-peak hours for your target site (for US-based sites, late morning to early afternoon PST often works well) can reduce the likelihood of triggering rate-based challenges. If you’re testing a new workflow, start with a single run during a quiet period before scaling up.
Vary your entry points at scale. If you’re running hundreds of batch jobs against the same site, uniform traffic patterns can themselves become a fingerprint — even across different IPs. Mixing up how runs navigate to their destination (some via homepage search, some via category pages, some direct) makes the aggregate traffic look more organic. Runs have an approximate 5-minute timeout, so this also naturally encourages breaking complex workflows into smaller, parallelizable pieces.
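One way to vary entry points across a batch (illustrative only; the navigation phrasings are example goal fragments, not SDK features):

```python
import random

ENTRY_PATTERNS = [
    "Start at {home}, use the site search to find '{query}', and open the best match.",
    "Start at {home}, browse to the relevant category, and locate '{query}'.",
    "Go directly to {url}.",
]

def entry_instruction(home: str, url: str, query: str) -> str:
    """Pick a random navigation preamble so batch traffic doesn't look uniform."""
    return random.choice(ENTRY_PATTERNS).format(home=home, url=url, query=query)
```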
Manage concurrency intentionally. TinyFish does not throttle runs by domain — if you enqueue a large batch against the same site, they will fire in parallel up to your account’s concurrency limit. For sensitive sites, consider staggering your jobs in your own queuing logic rather than enqueuing everything at once.
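A minimal way to stagger a batch in your own code, assuming enqueue is whatever function submits one run (for example, a wrapper around agent.queue()):

```python
import random
import time

def staggered_submit(jobs, enqueue, base_delay=5.0, jitter=3.0):
    """Submit jobs one at a time with randomized gaps instead of all at once."""
    for job in jobs:
        enqueue(job)
        # Randomized spacing avoids a uniform burst pattern against one domain
        time.sleep(base_delay + random.uniform(0, jitter))
```

For large batches against the same domain, even a few seconds of jitter between submissions breaks up the burst pattern that rate-based challenges key on.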
What’s Coming
TinyFish continuously improves browser behavior and anti-detection performance across the web. If a site blocked you on a previous project, it’s worth trying again — the same run that failed before may work without any changes on your end.
Authenticated sessions (in beta): TinyFish is adding a first-class Auth tool for logging into sites as part of a run. Beyond unlocking gated content, authenticated sessions naturally bypass many anti-bot measures — logged-in users are treated very differently by most protection systems. Contact us to request early access.
Need Help?
If you’re stuck on a specific site, share the URL and your current configuration with us — we can often diagnose the issue quickly.