Document Retrieval

Topograph gives you access to official company documents (trade register extracts, financial statements, articles of association, UBO certificates, and more) directly from government registers across 30+ countries.

For live document coverage, country-specific availability, and pricing, see Coverage and pricing.

Key things to know before you start:

The document list is dynamic. It depends on what the register actually holds for each company. A company may have 10 financial statements or none.
Document types are a fixed enum. Every document belongs to one of 11 categories (financial_statement, uncertified_trade_register_extract, etc.). See Document types.
Retrieval is asynchronous because we query real government registers, not a cache. Use webhooks or polling.
Documents are sorted newest first within each category. This is guaranteed.
Document IDs are opaque. Always discover them via availableDocuments before requesting a download. IDs are generally stable across consecutive requests but we do not guarantee long-term stability, so do not store them permanently.
Prices are shown upfront. Each document in availableDocuments includes a price in credits. You are only charged when you request the download.
Documents arrive progressively. A company’s documents may come from multiple underlying sources (e.g., trade register + financial filings authority). Results are delivered incrementally via webhooks or polling. You don’t need to wait for the full list: act as soon as the document you need appears.

How it works

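Document retrieval follows a two-step process:

Step 1: List available documents. Request the availableDocuments data point to see which documents the register holds for the company.
Step 2: Request specific documents. Pass the IDs you need in the documents array to retrieve their download URLs.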

Always request availableDocuments first to get valid document IDs. Document IDs are opaque identifiers (typically UUIDs) specific to each company. While they are generally stable short-term, do not hardcode them or rely on long-term persistence.

Progressive delivery

For many countries, Topograph queries multiple underlying sources to build the full document list. For example, France combines the INPI register (for trade register extracts and articles of association) and the JOAFE (for official publications of associations). This means availableDocuments results arrive incrementally. Each webhook (or polling response) may contain more documents than the previous one. You don’t need to wait for the availableDocuments status to reach succeeded before acting: if the document you need is already in the list, you can request it immediately.

Build your integration to process documents as they arrive rather than waiting for the full list. This is especially useful for slow countries where the complete list can take minutes.
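As a minimal sketch of processing documents as they arrive, the Python helper below yields each document the first time it shows up across successive deliveries. It assumes every webhook or polling response carries the list accumulated so far; the function name and the sample payloads are hypothetical, not part of the Topograph API.

```python
from typing import Iterable, Iterator


def newly_arrived(snapshots: Iterable[list[dict]]) -> Iterator[dict]:
    """Yield each document once, the first time its id appears in a snapshot."""
    seen: set[str] = set()
    for snapshot in snapshots:
        for doc in snapshot:
            if doc["id"] not in seen:
                seen.add(doc["id"])
                yield doc


# Two hypothetical deliveries: the second repeats the first document and adds one.
first = [{"id": "a", "type": "uncertified_trade_register_extract"}]
second = first + [{"id": "b", "type": "official_publication"}]

arrived_ids = [doc["id"] for doc in newly_arrived([first, second])]
print(arrived_ids)
```

Deduplicating on id keeps a handler from acting twice on a document that appears in more than one delivery.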

Step 1: List available documents

Request the availableDocuments data point to discover which documents the register holds for a given company:

curl --request POST \
  --url https://api.topograph.co/v2/company \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '{
    "countryCode": "FR",
    "id": "928020932",
    "dataPoints": ["availableDocuments"]
  }'

The response includes an availableDocuments array. Each entry describes a document:

{
  "availableDocuments": [
    {
      "id": "1c932de4-4610-5506-b48d-4e62529d58e8",
      "type": "financial_statement",
      "name": "Bilan annuel 2024",
      "description": "Annual financial statement for fiscal year 2024",
      "format": "PDF",
      "date": "2024-12-31T00:00:00Z",
      "price": 0,
      "estimatedDeliverySeconds": 15
    },
    {
      "id": "a7f3b2c1-9e8d-4a5f-b6c0-d1e2f3a4b5c6",
      "type": "uncertified_trade_register_extract",
      "name": "Certificat INPI",
      "description": "Uncertified trade register extract from INPI",
      "format": "PDF",
      "price": 0,
      "estimatedDeliverySeconds": 5
    }
  ]
}
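To illustrate working with this response, here is a small Python sketch that picks the most recent entry of a given type. It relies only on the id, type, and date fields shown above, and treats entries without a date (like the second one) as oldest; the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

# Sort key used for entries that carry no date, so they rank as oldest.
_UNDATED = datetime.min.replace(tzinfo=timezone.utc)


def _parse_date(doc: dict) -> datetime:
    raw = doc.get("date")
    if raw is None:
        return _UNDATED
    # Swap the trailing "Z" for an explicit offset so older Pythons can parse it.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def newest_of_type(available_documents: list[dict], doc_type: str) -> Optional[dict]:
    """Return the most recent document of doc_type, or None if there is none."""
    candidates = [d for d in available_documents if d.get("type") == doc_type]
    return max(candidates, key=_parse_date, default=None)


sample = [
    {"id": "1c932de4-4610-5506-b48d-4e62529d58e8", "type": "financial_statement",
     "date": "2024-12-31T00:00:00Z", "price": 0},
    {"id": "a7f3b2c1-9e8d-4a5f-b6c0-d1e2f3a4b5c6",
     "type": "uncertified_trade_register_extract", "price": 0},
]

latest = newest_of_type(sample, "financial_statement")
print(latest["id"] if latest else "no financial statement listed")
```

Returning None rather than raising reflects the fact that a company may have no documents of a given type.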

Step 2: Request specific documents

Use the id values from step 1 to request the documents you need:

curl --request POST \
  --url https://api.topograph.co/v2/company \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '{
    "countryCode": "FR",
    "id": "928020932",
    "documents": [
      "1c932de4-4610-5506-b48d-4e62529d58e8"
    ]
  }'

Each requested document appears in the documents object of the response, grouped by category (see Document categories below). The document includes a temporary download URL.

You can combine both steps in a single request by including both dataPoints: ["availableDocuments"] and documents: ["doc-id"]. The available documents list will arrive first, and the requested documents will follow.
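The three request shapes (list only, download only, combined) differ only in which keys the JSON body carries. A hedged Python sketch of a body builder follows; the function and its parameters are illustrative, while the key names countryCode, id, dataPoints, and documents are taken from the curl examples above.

```python
import json
from typing import Sequence


def build_company_request(
    country_code: str,
    company_id: str,
    list_available: bool = False,
    document_ids: Sequence[str] = (),
) -> dict:
    """Build the JSON body for POST /v2/company as shown in the curl examples."""
    body: dict = {"countryCode": country_code, "id": company_id}
    if list_available:
        body["dataPoints"] = ["availableDocuments"]
    if document_ids:
        body["documents"] = list(document_ids)
    return body


# Combined request: list the available documents and download one in the same call.
combined = build_company_request(
    "FR",
    "928020932",
    list_available=True,
    document_ids=["1c932de4-4610-5506-b48d-4e62529d58e8"],
)
print(json.dumps(combined, indent=2))
```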

Document types

Every document in availableDocuments has a type field indicating its category. These types are a fixed enum:

Type	Description
`financial_statement`	Annual accounts, balance sheets, income statements
`uncertified_trade_register_extract`	Standard (uncertified) extract from the trade register
`certified_trade_register_extract`	Officially certified extract (when available)
`ubo_extract`	Ultimate beneficial owners certificate
`status`	Articles of association / company statutes
`trade_register_history`	Historical trade register modifications
`official_publication`	Official gazette publications (e.g., BODACC in France)
`annual_return`	Annual return filings (e.g., UK confirmation statement)
`certificate_of_good_standing`	Certificate confirming the company is active and compliant
`unknown`	Document whose type could not be classified
`other`	Documents that don’t fit other categories

While the type enum is fixed, the list of documents for a given company is dynamic. A company may have zero financial statements or ten. It depends entirely on what the register holds. This is why the availableDocuments step exists.
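Because the type values are fixed while the document list is not, a client can model the former as an enum and simply count the latter. The Python sketch below mirrors the table above; the class and function names are hypothetical.

```python
from collections import Counter
from enum import Enum


class DocumentType(str, Enum):
    """The type values listed in the table above."""
    FINANCIAL_STATEMENT = "financial_statement"
    UNCERTIFIED_TRADE_REGISTER_EXTRACT = "uncertified_trade_register_extract"
    CERTIFIED_TRADE_REGISTER_EXTRACT = "certified_trade_register_extract"
    UBO_EXTRACT = "ubo_extract"
    STATUS = "status"  # articles of association / company statutes
    TRADE_REGISTER_HISTORY = "trade_register_history"
    OFFICIAL_PUBLICATION = "official_publication"
    ANNUAL_RETURN = "annual_return"
    CERTIFICATE_OF_GOOD_STANDING = "certificate_of_good_standing"
    UNKNOWN = "unknown"
    OTHER = "other"


def count_by_type(available_documents: list[dict]) -> Counter:
    """Count listed documents per type; raises ValueError on an unlisted type."""
    return Counter(DocumentType(d["type"]) for d in available_documents)


counts = count_by_type([
    {"type": "financial_statement"},
    {"type": "financial_statement"},
    {"type": "other"},
])
print(counts[DocumentType.FINANCIAL_STATEMENT], counts[DocumentType.UBO_EXTRACT])
```

A Counter returns 0 for types that are absent, which matches the case of a company with no documents in a category.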

Document categories

When you download documents, the response groups them by category under the documents object:

Field	Type	Description
`tradeRegisterExtract`	Single document	Latest uncertified trade register extract
`certifiedTradeRegisterExtract`	Single document	Latest certified trade register extract
`financialStatements`	Array	Financial statements, sorted newest first
`articlesOfAssociation`	Array	Articles of association / statutes, sorted newest first
`ultimateBeneficialOwnersCertificate`	Single document	Latest UBO certificate
`officialPublications`	Array	Official publications, sorted newest first
`annualReturns`	Array	Annual returns, sorted newest first
`certificateOfGoodStanding`	Single document	Latest certificate of good standing
`otherDocuments`	Array	Other documents
`lastFiscalYearFinancialStatement`	Single document	Shortcut to the most recent financial statement

Within each array category, documents are always sorted by date descending (most recent first). This ordering is guaranteed.
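Since some categories hold a single document and others an array, client code often wants one uniform list. The Python sketch below flattens a documents object under the assumption that single-document fields are JSON objects (or null when absent) and array fields are lists; the inner name field and the sample values are hypothetical. It skips lastFiscalYearFinancialStatement because the table describes it as a shortcut to a document already present in financialStatements.

```python
# Shortcut field that repeats an entry of financialStatements (see table above).
SHORTCUT_FIELDS = {"lastFiscalYearFinancialStatement"}


def flatten_documents(documents: dict) -> list[tuple[str, dict]]:
    """Return (category_field, document) pairs, preserving the order within arrays."""
    flat: list[tuple[str, dict]] = []
    for field, value in documents.items():
        if field in SHORTCUT_FIELDS or value is None:
            continue
        items = value if isinstance(value, list) else [value]
        flat.extend((field, item) for item in items)
    return flat


sample_documents = {
    "tradeRegisterExtract": {"name": "extract"},
    "financialStatements": [{"name": "fs-2024"}, {"name": "fs-2023"}],
    "lastFiscalYearFinancialStatement": {"name": "fs-2024"},
    "annualReturns": [],
    "certificateOfGoodStanding": None,
}

for field, doc in flatten_documents(sample_documents):
    print(field, doc["name"])
```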

Document IDs

Document IDs are opaque identifiers, typically UUIDs like 1c932de4-4610-5506-b48d-4e62529d58e8. They are:

Generated when you request availableDocuments for a company
Generally stable across consecutive requests for the same company, so you can reuse them short-term
Not guaranteed to be stable long-term. If you need a document days or weeks later, call availableDocuments again to get fresh IDs

The correct flow is always: list available documents, then pick the IDs you need.

Downloading documents

Document download links are signed URLs with a 15-minute lifespan. To handle this:

Download immediately after receiving the URL, or
Refresh the URL by calling GET /v2/company/{requestId}. This generates fresh signed URLs at no additional cost.

curl --request GET \
  --url https://api.topograph.co/v2/company/253299d1-e8d0-4268-945b-f175f98bc114 \
  --header 'x-api-key: <api-key>'

Re-fetching a request by requestId is free and will not be billed again. Store your request IDs to refresh download links as needed.
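One way to decide between downloading and refreshing is to track when a URL was received. The Python sketch below assumes the 15-minute lifespan starts when your client receives the response, which the source does not state, so it applies a safety margin; the names and the margin value are illustrative.

```python
from datetime import datetime, timedelta, timezone

SIGNED_URL_LIFESPAN = timedelta(minutes=15)


def needs_refresh(
    received_at: datetime,
    now: datetime,
    safety_margin: timedelta = timedelta(minutes=1),
) -> bool:
    """True when the signed URL is expired or within safety_margin of expiring."""
    return now - received_at >= SIGNED_URL_LIFESPAN - safety_margin


received = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(needs_refresh(received, received + timedelta(minutes=5)))
print(needs_refresh(received, received + timedelta(minutes=14)))
```

When needs_refresh returns True, re-fetch the request by its requestId to obtain fresh URLs before downloading.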

Document pricing

Each document in availableDocuments includes a price field giving its cost in credits. Many documents are free (price = 0), while others (typically certified extracts or documents from paid registers) have a cost. Because the price is part of the availableDocuments response, you can check it before purchasing. Credits are charged only when you actually request the document for download.

Some documents are included at no extra cost when the matching data block is requested in the same call (or linked via 24-hour deduplication). Coverage is country-specific; see GET /v2/pricing and Coverage and pricing. For details on blocks, deduplication, and budgets, see Pricing & Caching.
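As a rough illustration of checking cost before requesting downloads, the Python sketch below sums the price field for a set of selected IDs. It does not model documents that come at no extra cost with a matching data block, so treat the result as an upper bound; the function name and sample data are hypothetical.

```python
def total_price(available_documents: list[dict], selected_ids: list[str]) -> int:
    """Sum the listed credit prices for selected_ids; raise if an id is not listed."""
    by_id = {doc["id"]: doc for doc in available_documents}
    missing = [doc_id for doc_id in selected_ids if doc_id not in by_id]
    if missing:
        raise KeyError(f"IDs not present in availableDocuments: {missing}")
    return sum(by_id[doc_id]["price"] for doc_id in selected_ids)


listing = [
    {"id": "doc-free", "type": "financial_statement", "price": 0},
    {"id": "doc-paid", "type": "certified_trade_register_extract", "price": 3},
]

cost = total_price(listing, ["doc-free", "doc-paid"])
print(cost)
```

Raising on an unlisted ID also enforces the rule of discovering IDs through availableDocuments before requesting them.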

Automatic PDF conversion

Some registers publish documents in non-PDF formats. Topograph automatically converts them to PDF. Supported source formats:

Images: JPG, JPEG, PNG, TIFF, TIF
Spreadsheets: CSV, XLS, XLSX
Documents: DOC, DOCX, RTF, TXT, XPS
Presentations: PPT, PPTX

When conversion is needed, the document status stays in_progress until processing completes (typically ~20 seconds). Once done, the document includes:

url: the original file from the register
pdfUrl: the PDF generated by Topograph

If the source is already a PDF, both fields point to the same file.
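A client that always wants a PDF can prefer pdfUrl and fall back to url. The Python sketch below assumes the status is read separately and passed in, and that url and pdfUrl are keys on the document object; the helper names are hypothetical, and the format set is copied from the list above.

```python
from typing import Optional

# Source formats the section above lists as converted to PDF.
CONVERTIBLE_FORMATS = {
    "JPG", "JPEG", "PNG", "TIFF", "TIF",
    "CSV", "XLS", "XLSX",
    "DOC", "DOCX", "RTF", "TXT", "XPS",
    "PPT", "PPTX",
}


def needs_conversion(source_format: str) -> bool:
    """True when the source format is one of the listed non-PDF formats."""
    return source_format.upper() in CONVERTIBLE_FORMATS


def pick_download_url(document: dict, status: str) -> Optional[str]:
    """Return the PDF link if ready, else the original link, else None."""
    if status == "in_progress":
        return None  # still being fetched or converted
    return document.get("pdfUrl") or document.get("url")


doc = {"url": "https://example.invalid/file.docx", "pdfUrl": "https://example.invalid/file.pdf"}
print(needs_conversion("docx"), pick_download_url(doc, "succeeded"))
```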

AI-powered financial data extraction

Financial statements are automatically parsed using AI to extract structured data: revenue, assets, liabilities, ratios, and more. This happens during document post-processing with no additional configuration. For the full data model and available fields, see Financial Data Extraction.

Performance expectations

Document retrieval speed depends on the country and the register being queried:

Fast countries (France, Finland, Estonia…): available documents listed in seconds, documents downloaded in under a minute
Medium countries (Belgium, Netherlands, Ireland…): 10-30 seconds for available documents, 1-2 minutes for downloads
Slow countries (Germany, Spain…): available documents can take several minutes due to slow government registers

If you have tight response time requirements, request availableDocuments as a separate step and handle the wait asynchronously. Use webhooks rather than polling for the best experience with slower countries.

For country-specific details on available documents and expected response times, browse the Country Guides.

Best practices

Always start with availableDocuments: This ensures you have valid, current document IDs and know what’s available before purchasing.
Store request IDs: Keep the requestId to refresh download URLs for free within 24 hours.
Monitor document status: Check request.dataStatus.documents[docId].status. Documents may be in_progress while being fetched or converted.
Handle timeouts gracefully: For slower registers, use webhooks or increase your polling timeout rather than abandoning the request.
Check prices before downloading: Inspect the price field in availableDocuments to avoid unexpected charges.

For a broader picture of how document retrieval fits into the company data flow, see the Verification Data guide.
