# Fetch API Reference

> Complete reference for the Fetch API endpoint

## Endpoint

```
POST https://api.fetch.tinyfish.ai
```

All requests require an `X-API-Key` header. See [Authentication](/authentication).

***

## Request

```json
{
  "urls": ["https://example.com"],
  "format": "html",
  "links": false,
  "image_links": false,
  "ttl": 3600,
  "per_url_timeout_ms": 45000
}
```

### Parameters

<ParamField body="urls" type="string[]" required>
  URLs to fetch and extract. Maximum 10 URLs per request.

  All URLs must use `http` or `https`. Private IP addresses, localhost, and cloud metadata endpoints are rejected.
</ParamField>

<ParamField body="format" type="string" default="markdown">
  Output format for the `text` field in each result. One of:

  * `html` — semantic HTML
  * `markdown` — clean Markdown, recommended for LLMs (default)
  * `json` — structured document tree
</ParamField>

<ParamField body="links" type="boolean" default="false">
  When `true`, include all `<a href>` URLs found on the page in the `links` field.
</ParamField>

<ParamField body="image_links" type="boolean" default="false">
  When `true`, include all `<img src>` URLs found on the page in the `image_links` field.
</ParamField>

<ParamField body="ttl" type="integer" default="omitted">
  Cache freshness tolerance in seconds.

  * Omit `ttl` to accept any cached entry.
  * Set `ttl` to `0` when you want a live fetch.
  * Set `ttl` to a positive integer to accept cached entries younger than that many seconds.
</ParamField>
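
The three `ttl` rules above reduce to a single comparison. The sketch below models them for illustration only; `cache_entry_acceptable` is a hypothetical helper, not part of the API or the SDK, and the service's actual cache logic may differ.

```python
from typing import Optional


def cache_entry_acceptable(age_seconds: float, ttl: Optional[int]) -> bool:
    """Model the documented ttl rules (illustrative, not the service's code)."""
    if ttl is None:
        # ttl omitted: any cached entry is accepted, however old.
        return True
    # ttl == 0 can never be satisfied, so the fetch is always live.
    # ttl > 0 accepts entries younger than ttl seconds.
    return age_seconds < ttl
```

For example, with `ttl=3600` an entry cached 10 minutes ago is accepted, while one cached 2 hours ago triggers a fresh fetch.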

<ParamField body="per_url_timeout_ms" type="integer" default="omitted">
  Per-URL wall-clock timeout budget in milliseconds. Must be between `1` and `110000`.

  If a URL exceeds this budget, that URL returns a `timeout` error in `errors[]` while other URLs in the same request can still complete.
</ParamField>
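
Checking the documented limits on the client side can avoid a `400` round trip. The following is a minimal sketch, assuming you build the request body yourself; `build_fetch_payload` and the `output_format` argument name are hypothetical, and the server remains the authority on what it accepts (for example, it also rejects private IPs, which this sketch does not check).

```python
from typing import Any, Dict, Optional, Sequence
from urllib.parse import urlsplit

MAX_URLS = 10                      # documented per-request maximum
FORMATS = {"html", "markdown", "json"}
MAX_TIMEOUT_MS = 110_000           # documented upper bound


def build_fetch_payload(
    urls: Sequence[str],
    output_format: str = "markdown",
    links: bool = False,
    image_links: bool = False,
    ttl: Optional[int] = None,
    per_url_timeout_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body, enforcing the limits listed in Parameters."""
    if not 1 <= len(urls) <= MAX_URLS:
        raise ValueError(f"expected 1 to {MAX_URLS} URLs, got {len(urls)}")
    for url in urls:
        if urlsplit(url).scheme not in ("http", "https"):
            raise ValueError(f"URL must use http or https: {url!r}")
    if output_format not in FORMATS:
        raise ValueError(f"unknown format: {output_format!r}")
    if per_url_timeout_ms is not None and not 1 <= per_url_timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError("per_url_timeout_ms must be between 1 and 110000")

    payload: Dict[str, Any] = {
        "urls": list(urls),
        "format": output_format,
        "links": links,
        "image_links": image_links,
    }
    # Optional fields are left out entirely when not set, matching "omitted".
    if ttl is not None:
        payload["ttl"] = ttl
    if per_url_timeout_ms is not None:
        payload["per_url_timeout_ms"] = per_url_timeout_ms
    return payload
```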

***

## Response

```json
{
  "results": [...],
  "errors": [...]
}
```

### `results[]`

One entry per successfully fetched URL.

<ResponseField name="url" type="string">
  The original requested URL.
</ResponseField>

<ResponseField name="final_url" type="string">
  The URL after any redirects. May differ from `url`.
</ResponseField>

<ResponseField name="title" type="string | null">
  Page title, preferring `og:title` over `<title>`. `null` if not found.
</ResponseField>

<ResponseField name="description" type="string | null">
  Meta description, preferring `og:description` over `<meta name="description">`. `null` if not found.
</ResponseField>

<ResponseField name="language" type="string | null">
  Detected page language (e.g. `"en"`). `null` if undetectable.
</ResponseField>

<ResponseField name="author" type="string | null">
  Author from meta tags. `null` if not found.
</ResponseField>

<ResponseField name="published_date" type="string | null">
  Publication date, if detectable. `null` if not found.
</ResponseField>

<ResponseField name="text" type="string | object">
  Extracted page content. Format depends on the `format` request parameter:

  * `string` when `format` is `"html"` or `"markdown"`
  * `object` (document tree) when `format` is `"json"`
</ResponseField>

<ResponseField name="links" type="string[]">
  All `<a href>` URLs on the page, resolved to absolute URLs. Only present when `links: true` was requested.
</ResponseField>

<ResponseField name="image_links" type="string[]">
  All `<img src>` URLs on the page, resolved to absolute URLs. Only present when `image_links: true` was requested.
</ResponseField>

<ResponseField name="latency_ms" type="number | null">
  Time to fetch and extract this URL, in milliseconds. `null` if unavailable.
</ResponseField>

<ResponseField name="format" type="string">
  Output format used for the `text` field. Echoes the request `format` parameter (`"markdown"`, `"html"`, or `"json"`).
</ResponseField>

<Note>
  Fields that could not be extracted (`title`, `description`, `language`, `author`, `published_date`) are returned as `null`.
</Note>
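
Because `text` can be a string or an object and several metadata fields can be `null`, client code should branch on both. The sketch below works on one parsed `results[]` entry as a plain dict; `summarize_result` is a hypothetical helper, and the shape of the `json` document tree is not specified here, so the tree is only serialized, never inspected.

```python
import json
from typing import Any, Dict


def summarize_result(result: Dict[str, Any], preview_chars: int = 100) -> str:
    """Return a one-line summary of a results[] entry."""
    # title is null when it could not be extracted.
    title = result.get("title") or "(untitled)"
    text = result.get("text")
    if isinstance(text, str):
        # format was "html" or "markdown".
        preview = text[:preview_chars]
    else:
        # format was "json": serialize the tree rather than assume its layout.
        preview = json.dumps(text, ensure_ascii=False)[:preview_chars]
    return f"{title}: {preview}"
```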

### `errors[]`

One entry per URL that could not be fetched. The array is always present and may be empty. A failure for one URL does not affect the other URLs in the batch.

<ResponseField name="url" type="string">
  The URL that failed.
</ResponseField>

<ResponseField name="error" type="string">
  Structured error code identifying the failure type. One of: `target_http_error`, `page_not_found`, `target_unreachable`, `timeout`, `bot_blocked`, `empty_content`, `invalid_url`, `invalid_redirect_url`, `proxy_error`.
</ResponseField>

<ResponseField name="status" type="number (optional)">
  Upstream HTTP status code. Present when `error` is `target_http_error` or `page_not_found`.
</ResponseField>
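
Since a single `200` response can carry both successes and failures, callers usually split the two arrays before processing. A minimal sketch over the parsed JSON body follows; `split_response` and `describe_failure` are hypothetical helpers. Note that keying by `url` collapses duplicates if the same URL was requested twice.

```python
from typing import Any, Dict, Tuple

Entry = Dict[str, Any]


def split_response(body: Dict[str, Any]) -> Tuple[Dict[str, Entry], Dict[str, Entry]]:
    """Index results[] and errors[] by the originally requested URL."""
    pages = {entry["url"]: entry for entry in body.get("results", [])}
    failures = {entry["url"]: entry for entry in body.get("errors", [])}
    return pages, failures


def describe_failure(entry: Entry) -> str:
    """Format an errors[] entry; status is only set for some error codes."""
    status = entry.get("status")
    suffix = f" (HTTP {status})" if status is not None else ""
    return f"{entry['url']}: {entry['error']}{suffix}"
```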

## SDK Methods

<CodeGroup>
  ```python Python
  from tinyfish import TinyFish

  client = TinyFish()
  result = client.fetch.get_contents(
      urls=["https://www.tinyfish.ai/"],
      format="markdown",
  )
  for page in result.results:
      print(page.title, "→", page.text[:100])
  ```

  ```typescript TypeScript
  import { TinyFish } from "@tiny-fish/sdk";

  const client = new TinyFish();
  const result = await client.fetch.getContents({
    urls: ["https://www.tinyfish.ai/"],
    format: "markdown",
  });
  result.results.forEach((page) => console.log(page.title, "→", page.text.slice(0, 100)));
  ```
</CodeGroup>

***

## Error Codes

HTTP-level errors apply to the entire request.

| Status | Meaning                                                                          |
| ------ | -------------------------------------------------------------------------------- |
| `400`  | Invalid request — missing `urls`, too many URLs (max 10), or bad parameter value |
| `401`  | Missing or invalid API key                                                       |
| `429`  | Rate limit exceeded                                                              |
| `500`  | Internal server error                                                            |

Per-URL errors appear in `errors[]` alongside a `200` response. The `error` field is one of these codes:

| Error code             | `status` field                       | Meaning                                                                 |
| ---------------------- | ------------------------------------ | ----------------------------------------------------------------------- |
| `target_http_error`    | HTTP status code (e.g. `403`, `500`) | Target server returned a non-2xx HTTP response other than 404/410       |
| `page_not_found`       | `404` or `410`                       | Target URL does not exist (HTTP 404 Not Found or 410 Gone)              |
| `target_unreachable`   | —                                    | Connection refused, TLS failure, DNS failure, or other network error    |
| `timeout`              | —                                    | Page did not finish loading within its per-URL timeout budget           |
| `bot_blocked`          | —                                    | Site returned a bot-protection challenge (Cloudflare, Incapsula)        |
| `empty_content`        | —                                    | Browser returned HTML but no extractable text found                     |
| `invalid_url`          | —                                    | URL rejected before fetch (private IP, invalid scheme, disallowed host) |
| `invalid_redirect_url` | —                                    | Redirect target rejected before fetch (private IP or disallowed host)   |
| `proxy_error`          | —                                    | Proxy tunnel failed — site may be reachable directly                    |

<Note>
  Per-URL fetch failures are **not** HTTP errors. They appear as entries in `errors[]` alongside a `200` response.
</Note>
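
One way to act on per-URL error codes is to decide which failures are worth a second attempt. The classification below is an assumption made for illustration, not documented API behavior: it treats `timeout`, `target_unreachable`, `proxy_error`, and upstream 5xx responses as possibly transient, and everything else as permanent. `urls_to_retry` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List

# Assumption: these codes may succeed on a later attempt.
POSSIBLY_TRANSIENT = frozenset({"timeout", "target_unreachable", "proxy_error"})


def urls_to_retry(errors: Iterable[Dict[str, Any]]) -> List[str]:
    """Pick the URLs from errors[] that look worth retrying."""
    retry: List[str] = []
    for entry in errors:
        code = entry["error"]
        status = entry.get("status")
        if code in POSSIBLY_TRANSIENT:
            retry.append(entry["url"])
        elif code == "target_http_error" and status is not None and status >= 500:
            # Assumption: upstream 5xx is a server-side fault that may clear.
            retry.append(entry["url"])
    return retry
```

Codes such as `invalid_url` or `page_not_found` are skipped because resending the same URL is unlikely to change the outcome.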

<Note>
  Each URL has a **110-second backend timeout**. If the page doesn't respond within 110 seconds, that URL returns a `timeout` error in `errors[]` while the rest of the batch continues. Requests are also subject to a **120-second CDN ceiling** for the full batch. Set your client-side timeout to at least **150 seconds** to receive CDN timeout errors cleanly.
</Note>
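
The timeout guidance above can be applied to a raw HTTP call with the standard library alone. This is a sketch under stated assumptions: the `Content-Type: application/json` header is assumed rather than documented here, and `make_fetch_request` and `send` are hypothetical helpers, not SDK functions.

```python
import json
import urllib.request
from typing import Any, Dict

FETCH_ENDPOINT = "https://api.fetch.tinyfish.ai"
CLIENT_TIMEOUT_SECONDS = 150  # above the 120-second CDN ceiling


def make_fetch_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        FETCH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumed, see lead-in
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request; urlopen raises HTTPError for 4xx and 5xx statuses."""
    with urllib.request.urlopen(request, timeout=CLIENT_TIMEOUT_SECONDS) as response:
        return json.load(response)
```

Keeping request construction separate from `send` makes the request easy to inspect or test without network access.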

***

## Supported Content Types

| Content Type      | Behavior                                                           |
| ----------------- | ------------------------------------------------------------------ |
| HTML              | Full text extraction with formatting                               |
| PDF               | Text content extracted                                             |
| JSON              | Raw JSON returned as text                                          |
| Plain text        | Full text returned                                                 |
| Images (PNG, JPG) | Not supported — returns an error indicating no extractable content |

***

## Usage Endpoint

Retrieve a paginated history of your fetch operations.

```
GET https://api.fetch.tinyfish.ai/usage
```

All requests require an `X-API-Key` header. See [Authentication](/authentication).

### Query Parameters

<ParamField query="start_after" type="string (ISO 8601)">
  Filter results created after this timestamp. Example: `2026-01-01T00:00:00Z`
</ParamField>

<ParamField query="end_before" type="string (ISO 8601)">
  Filter results created before this timestamp. Example: `2026-02-01T00:00:00Z`
</ParamField>

<ParamField query="status" type="string">
  Filter by result status. One of: `completed`, `failed`.
</ParamField>

<ParamField query="limit" type="integer" default="100">
  Maximum number of items per page. Range: 1-1000.
</ParamField>

<ParamField query="page" type="integer" default="1">
  Page number for pagination.
</ParamField>

### Response

```json
{
  "items": [...],
  "total": 42,
  "limit": 100,
  "page": 1,
  "total_pages": 1,
  "has_more": false
}
```
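
To walk the full history, request successive pages until `has_more` is `false`. The sketch below assumes `has_more` is `true` whenever a later page exists; `usage_url` and `iter_usage_items` are hypothetical helpers, and `get_page` is a caller-supplied function that performs the authenticated GET and returns the parsed body.

```python
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode

USAGE_ENDPOINT = "https://api.fetch.tinyfish.ai/usage"


def usage_url(
    page: int = 1,
    limit: int = 100,
    status: Optional[str] = None,
    start_after: Optional[str] = None,
    end_before: Optional[str] = None,
) -> str:
    """Build the usage URL, leaving out filters that are not set."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    params: Dict[str, Any] = {"page": page, "limit": limit}
    if status is not None:
        params["status"] = status
    if start_after is not None:
        params["start_after"] = start_after
    if end_before is not None:
        params["end_before"] = end_before
    return f"{USAGE_ENDPOINT}?{urlencode(params)}"


def iter_usage_items(get_page: Callable[[int], Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every item, fetching pages until has_more is false."""
    page = 1
    while True:
        body = get_page(page)
        yield from body["items"]
        if not body["has_more"]:
            break
        page += 1
```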

### `items[]`

<ResponseField name="id" type="string">
  Unique identifier for the fetch result.
</ResponseField>

<ResponseField name="url" type="string">
  The original requested URL.
</ResponseField>

<ResponseField name="final_url" type="string">
  The URL after any redirects.
</ResponseField>

<ResponseField name="title" type="string | null">
  Page title, if detected.
</ResponseField>

<ResponseField name="description" type="string | null">
  Meta description, if detected.
</ResponseField>

<ResponseField name="language" type="string | null">
  Detected page language (e.g. `"en"`).
</ResponseField>

<ResponseField name="author" type="string | null">
  Page author, if detected.
</ResponseField>

<ResponseField name="published_date" type="string | null">
  Published date, if detected.
</ResponseField>

<ResponseField name="format" type="string">
  The format used for extraction: `markdown`, `html`, or `json`.
</ResponseField>

<ResponseField name="status" type="string">
  Result status: `completed` or `failed`.
</ResponseField>

<ResponseField name="request_origin" type="string">
  Where the request originated: `api`, `cli`, `python-sdk`, `js-sdk`, `mcp`, etc.
</ResponseField>

<ResponseField name="request_id" type="string | null">
  The request ID that grouped this URL with others in a batch.
</ResponseField>

<ResponseField name="text_length" type="integer | null">
  Length of the extracted text content in characters. The full text is not included in usage responses.
</ResponseField>

<ResponseField name="links_count" type="integer">
  Number of links found on the page.
</ResponseField>

<ResponseField name="image_links_count" type="integer">
  Number of image links found on the page.
</ResponseField>

<ResponseField name="latency_ms" type="number | null">
  Time taken to fetch and extract the page, in milliseconds.
</ResponseField>

<ResponseField name="created_at" type="string (ISO 8601)">
  Timestamp when the fetch was executed.
</ResponseField>

<ResponseField name="error" type="string | null">
  Error message if the fetch failed. `null` for successful fetches.
</ResponseField>

### Error Codes

| Status | Meaning                    |
| ------ | -------------------------- |
| `400`  | Invalid query parameters   |
| `401`  | Missing or invalid API key |
| `500`  | Internal server error      |

***

## Rate Limits

Limits apply per API key, measured in URLs per minute across all requests.

| Plan          | URLs / minute |
| ------------- | ------------- |
| Free          | 150           |
| Pay As You Go | 150           |
| Starter       | 300           |
| Pro           | 600           |

When the limit is exceeded, the API returns `HTTP 429`.
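
Large URL lists have to respect both the 10-URL request maximum and the per-minute limit. The sketch below plans batches against fixed one-minute windows; how the service actually measures a minute is not documented here, so treat the plan as an approximation and still handle `429`. `schedule_batches` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple

MAX_URLS_PER_REQUEST = 10  # documented per-request maximum


def schedule_batches(
    urls: Sequence[str], urls_per_minute: int
) -> List[Tuple[int, List[str]]]:
    """Return (delay_seconds, batch) pairs that stay within the plan limit."""
    if urls_per_minute < 1:
        raise ValueError("urls_per_minute must be at least 1")
    batch_size = min(MAX_URLS_PER_REQUEST, urls_per_minute)
    schedule: List[Tuple[int, List[str]]] = []
    window = 0  # index of the current one-minute window
    used = 0    # URLs already assigned to that window
    for start in range(0, len(urls), batch_size):
        batch = list(urls[start:start + batch_size])
        if used + len(batch) > urls_per_minute:
            # This batch would exceed the limit, so move it to the next minute.
            window += 1
            used = 0
        used += len(batch)
        schedule.append((window * 60, batch))
    return schedule
```

With the Free plan's 150 URLs per minute, 200 URLs become 20 batches: the first 15 are scheduled immediately and the remaining 5 after a 60-second delay.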

***

## Billing

Fetch does not use credits.

***

## Related

<CardGroup cols={2}>
  <Card title="Fetch Overview" icon="bolt" href="/fetch-api">
    First request, response shape, and product routing
  </Card>

  <Card title="Authentication" icon="key" href="/authentication">
    API key setup and troubleshooting
  </Card>

  <Card title="Error Codes" icon="triangle-exclamation" href="/error-codes">
    Full list of API error codes
  </Card>
</CardGroup>
