AnchorGrid Developer Docs

POST /v1/specs/parse/document

Queues TOC / spec structure extraction for a PDF already stored in ag_documents. The document is the one you uploaded with POST /v1/documents; this endpoint does not take a second upload. The job model is toc-parser. The worker reads up to 40 pages of text and parses TOC lines in toc_parser/task.py.

Send the X-API-Key and Content-Type: application/json headers. The call is asynchronous: it returns 202 immediately with a new job id, and you then poll GET /v1/jobs/{job_id}. This path shares the tier RPM limiter with the other job-submit POSTs. Credits: CreditCost.SPEC, 1 credit per parse job (402 when a free account's lifetime credits are used up, 429 when a paid account's monthly credits are exhausted). Optional webhook_url: delivery from the worker requires tier developer, pro, or enterprise.

Some schema docstrings still reference POST /v1/specs/parse; that path does not exist. Use /v1/specs/parse/document (required for content-extract).

toc-parser · Async · 202 · 1 credit / job · ≤ 40 pages text

Request

This endpoint requires a document_id. Upload your PDF first with POST /v1/documents.

Body type: ParseSpecByDocumentRequest.

document_id (required)

string (UUID)

Document in ag_documents for your account; must not be missing, wrong account, or expired.

webhook_url

string

Optional. Worker delivery is honored only on the developer, pro, and enterprise tiers.

Code examples

curl -X POST https://api.anchorgrid.ai/v1/specs/parse/document \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "550e8400-e29b-41d4-a716-446655440000"
  }'
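
For callers working in Python rather than curl, the same request can be sketched with the standard library. This is a minimal illustration, not an official client: the helper names build_parse_request and submit_parse_job are hypothetical, and the example.com webhook address in the usage note is a placeholder. Only the path, headers, and body fields come from this page.

```python
import json
import urllib.request

API_BASE = "https://api.anchorgrid.ai"


def build_parse_request(api_key, document_id, webhook_url=None):
    """Build (but do not send) the POST /v1/specs/parse/document request.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {"document_id": document_id}
    if webhook_url is not None:
        # webhook_url is optional, so it is left out of the body when unused.
        body["webhook_url"] = webhook_url
    return urllib.request.Request(
        url=f"{API_BASE}/v1/specs/parse/document",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit_parse_job(request):
    """Send the request and return the decoded 202 body as a dict.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling submit_parse_job(build_parse_request("<your-api-key>", "550e8400-e29b-41d4-a716-446655440000", webhook_url="https://example.com/hook")) would submit the job. Building the request separately from sending it makes the headers and body easy to inspect before any credit is spent.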

Response — 202 Accepted

Shape: ParseJobQueued. Poll until status is complete or failed.

job_id

string (UUID)

Poll identifier.

status

string

queued on this response.

estimated_processing_seconds

integer

Hint only (e.g. 15).

poll_url

string

Path only — prepend https://api.anchorgrid.ai.
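
The polling step described above can be sketched as follows. The function names, the 2-second interval, and the attempt cap are assumptions for illustration, not documented values. fetch_job is a caller-supplied function that performs the authenticated GET and returns the decoded job JSON, which keeps the loop independent of any particular HTTP library.

```python
import time

API_BASE = "https://api.anchorgrid.ai"
# The two terminal statuses named on this page.
TERMINAL_STATUSES = {"complete", "failed"}


def resolve_poll_url(poll_url):
    """poll_url is path-only, so prepend the API base."""
    return API_BASE + poll_url


def poll_job(fetch_job, poll_url, interval_seconds=2.0, max_attempts=30,
             sleep=time.sleep):
    """Poll until the job status is complete or failed.

    fetch_job(url) must return the job as a dict (hypothetical callable
    supplied by the caller). interval_seconds and max_attempts are
    illustrative defaults, not values from the API.
    """
    url = resolve_poll_url(poll_url)
    for attempt in range(max_attempts):
        job = fetch_job(url)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job at {url} not finished after {max_attempts} polls")
```

Since estimated_processing_seconds is only a hint, the sketch uses a fixed interval with an attempt cap rather than trusting the estimate as a deadline.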

Result shape

Returned when status === "complete" and model === "toc-parser". The division/section shape comes from TOCParser.parse_toc_structure: each division has division_code, division_title, and sections[], and each section has a 6-digit section_code and a section_title. GET /v1/jobs does not post-filter toc-parser results.

toc_found

boolean

false if no TOC detected.

division_count

integer

Typically 0 when toc_found is false.

section_count

integer

Typically 0 when toc_found is false.

divisions

array

Typically [] when toc_found is false.

toc_found_reason

string

Optional; may be set when toc_found is false, e.g. nothing structured was found in the first 40 pages.

model_version

string

e.g. toc-parser-v1.0.0

processing_time_ms

integer

Wall time.
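
As a rough illustration of consuming this result, the sketch below flattens divisions and sections into rows. It takes the result object directly, because where that object sits inside the job response is not specified here. The list_sections helper is hypothetical, and skipping codes that are not 6 digits is a defensive choice of this sketch, not documented API behavior.

```python
import re

# Section codes are described as 6 digits.
SECTION_CODE = re.compile(r"\d{6}")


def list_sections(result):
    """Return (division_code, section_code, section_title) tuples.

    Hypothetical helper. Returns [] when no TOC was detected.
    """
    if not result.get("toc_found"):
        return []
    rows = []
    for division in result.get("divisions", []):
        for section in division.get("sections", []):
            code = section["section_code"]
            if not SECTION_CODE.fullmatch(code):
                # Defensive: ignore anything that is not exactly 6 digits.
                continue
            rows.append((division["division_code"], code,
                         section["section_title"]))
    return rows
```

The section codes collected this way are the kind of values a follow-up content-extract call would use.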

Credits & rate limits

Cost

1 credit / job (SPEC)

Rate limit

Tier RPM (job-submit bucket)

Errors

The details below are what the handler raises; Intelligence may still genericize the JSON body.

401

Invalid or missing API key.

422

Bad JSON / invalid document_id.

404

Document missing, wrong account, or expired (DOCUMENT_NOT_FOUND).

402 / 429

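Credit limit: 402 when a free account's lifetime credits are used up, 429 when a paid account's monthly credits are exhausted.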

429

RPM limit (middleware).

Worker failures surface as a job with status failed and an error_code such as URL_UNREACHABLE, PDF_CORRUPT, or INTERNAL_ERROR; poll GET /v1/jobs/{job_id} to see them.

See Errors for HTTP exception mapping.
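
One way a client might branch on these outcomes is sketched below. The category labels and both function names are hypothetical. Reading error_code from the top level of the job object is also an assumption, since the job response shape is not shown on this page. 429 is deliberately left as one ambiguous category because both the RPM limiter and paid-credit exhaustion return it.

```python
# Submit-time HTTP statuses from this page, mapped to illustrative labels.
SUBMIT_STATUS_MEANING = {
    202: "queued",
    401: "invalid_or_missing_api_key",
    402: "free_credits_exhausted",
    404: "document_not_found",       # DOCUMENT_NOT_FOUND
    422: "invalid_request",          # bad JSON or invalid document_id
    429: "rate_or_credit_limit",     # RPM limit or paid monthly credits
}


def describe_submit_status(status_code):
    """Map a submit response status to a label; hypothetical helper."""
    return SUBMIT_STATUS_MEANING.get(status_code, "unexpected")


def worker_error_code(job):
    """Return error_code for a failed job, else None.

    Assumes error_code is a top-level field of the polled job object.
    """
    if job.get("status") == "failed":
        return job.get("error_code")
    return None
```

Submit-time errors arrive as HTTP statuses on the POST, while worker failures only appear later on the polled job, so the two are handled by separate functions.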

Typical flow

Continue with POST /v1/specs/content-extract using the completed TOC job_id and section or division codes.

Response Preview

202 Accepted

{
  "job_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "status": "queued",
  "estimated_processing_seconds": 15,
  "poll_url": "/v1/jobs/7c9e6679-7425-40de-944b-e07fc1f90ae7"
}