This service provides access to document artifacts stored in the document store. Use this service to download PDF files, XML files and individual page files.


The content delivery service (cds) is an HTTP service that provides access to information stored in the document store. Depending on the publishing office, we store a variety of artifacts: some come directly from the office itself, others are generated during the import process.
It is possible to download a single artifact, all artifacts of a single document, or all artifacts of a list of documents. Multiple artifacts or documents are downloaded as ZIP archives.

Information Required

The only information required to access the cds is a valid document ID (such as DE.000112014003467.A5). Accessing the endpoint of a document directly will return a list, in JSON format, of all artifacts we currently hold for that particular document.


Authentication is required to use the cds: customers need an account for the proxy. Please request credentials by email if you need them.

Naming Conventions

To allow uniform access across the various supported publishing offices, a naming convention is used for common artifacts. For example, the PDF is always called DOCUMENT.PDF, the XML is always called DOCUMENT.XML, and the individual page files are called PAGEnnnn. This conforms to the standard DEPAROM naming conventions. A DEPAROM client is not required to use the cds.

Artifacts Stored in cds

Currently, the cds holds the following artifacts:

  • Office XML
    The XML source from the publishing office, containing at minimum the bibliographic data. For some offices, full text is available. See table below.
  • Office PDF
    The PDF file from the publishing office.
  • PAGEnnnn files
    Each page as single page TIFF in CCITT format (single bit, black and white) at 300 DPI. Generally rendered from the Office PDF.
  • Embedded Images
    Images and drawings in TIFF format. Naming convention depends on publishing office. Used in combination with the XML.
  • mtc JSON
    The document XML in JSON format. The artifact name is mtc.json.
  • mtc Simple JSON
    A simplified JSON format that complies more closely with DEPAROM. For most uses, the mtc JSON format is more appropriate. The artifact name is mtc.simple.json.
  • mtc Artifacts JSON
    A file in JSON format containing a list of all available artifacts. The file is called artifacts.json. It is similar in usage to what is returned by the List Artifacts endpoint. This file also contains structural information about which document sections are on which pages.


General Error Responses

The HTTP API can return the following general error codes. Other responses are listed per endpoint further down.

HTTP Code | Reason | Comments
404 | Document Not Found | Returned if a document is requested that is not in the store or if the document ID is malformed. Also returned for non-existent artifacts.
500 | Internal Server Error | Indicates that something went wrong during the request. If errors persist, please contact mtc.

Service Endpoints

BASE url
  • List Artifacts: GET /cds/:docid
    Response: JSON. Status: 200 OK.
    Lists the available artifacts of a document. :docid is a document ID.
  • Download Artifact: GET /cds/:docid/:artifact
    Response: depends on the MIME type of the artifact. Status: 200 OK.
    Downloads the artifact directly. :artifact is the name of the artifact as listed in the artifacts.json file or by the "List Artifacts" endpoint.
  • Download all Artifacts: GET /cds/zip/:docid
    Response: a ZIP stream containing all artifacts of a document. Status: 200 OK.
    Downloads all available artifacts of a document as a ZIP file. The filename is generated and has a "cds-" prefix.
  • Bulk Download: POST /cds/zip
    Request: must contain a JSON body describing the documents to download. A filter can be used to specify documents to download.
    Response: a ZIP stream. Status: 200 OK.
    Downloads all artifacts of all requested documents as a single ZIP file. The artifacts of each document are stored in a separate folder. If a document is not available, it will be missing from the ZIP archive; no error code is returned in this case. You may want to use the "Check Availability" endpoint to identify missing documents.
  • Check Availability: POST /cds/check
    Response: a JSON body containing a list of the docids that are missing in the store. Status: 200 OK.
    Checks whether the requested docids are available in the store.
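The endpoint paths above can be turned into URLs with a few small helpers. The following is only a sketch: the base URL is a placeholder (the real one is not given here), the function names are made up for illustration, and authentication is left out because this page does not describe how credentials are passed to the proxy.

```python
from urllib.parse import quote

# Hypothetical placeholder; replace with the real base URL of the service.
BASE_URL = "https://cds.example.invalid"


def list_artifacts_url(docid):
    """URL for List Artifacts: GET /cds/:docid"""
    return f"{BASE_URL}/cds/{quote(docid, safe='')}"


def artifact_url(docid, artifact):
    """URL for Download Artifact: GET /cds/:docid/:artifact"""
    return f"{BASE_URL}/cds/{quote(docid, safe='')}/{quote(artifact, safe='')}"


def zip_url(docid):
    """URL for Download all Artifacts: GET /cds/zip/:docid"""
    return f"{BASE_URL}/cds/zip/{quote(docid, safe='')}"


def bulk_zip_url():
    """URL for Bulk Download: POST /cds/zip"""
    return f"{BASE_URL}/cds/zip"


def check_url():
    """URL for Check Availability: POST /cds/check"""
    return f"{BASE_URL}/cds/check"
```

The resulting URLs can be passed to any HTTP client, for example urllib.request.urlopen for the GET endpoints, once the credentials have been added in whatever way the proxy expects.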

JSON Formats

Response from List Artifacts

Field | Format | Usage / Comments
artifacts | JSON array of strings | Contains the file names within the requested directory
container | String | Actual container name in the store
docid | String | docid without revision suffix


    {
        "container": "DE.000112014003467.A5",
        "docid": "DE.000112014003467.A5",
        "artifacts": [ ... ]
    }

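A List Artifacts response can be processed with any JSON parser. The sketch below separates the page files from the other artifacts. The sample response is illustrative only: its artifact names follow the naming conventions described earlier, and the four-digit, zero-padded page numbers are an assumption. The function name split_artifacts is hypothetical.

```python
import json

# Illustrative response only, not real service output. The page file names
# assume that "nnnn" is a zero-padded, four-digit page number.
SAMPLE_RESPONSE = """
{
    "container": "DE.000112014003467.A5",
    "docid": "DE.000112014003467.A5",
    "artifacts": ["DOCUMENT.PDF", "DOCUMENT.XML", "PAGE0001", "PAGE0002", "mtc.json"]
}
"""


def split_artifacts(response_text):
    """Return (page_files, other_files) from a List Artifacts response."""
    listing = json.loads(response_text)
    names = listing.get("artifacts", [])
    # Page files are recognised by the PAGE prefix of the naming convention.
    pages = sorted(n for n in names if n.startswith("PAGE"))
    others = [n for n in names if not n.startswith("PAGE")]
    return pages, others


pages, others = split_artifacts(SAMPLE_RESPONSE)
```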

Request JSON for Bulk Download

Field | Format | Usage / Comments
docids | JSON array of strings | List of docids. All documents will be added to the ZIP file.
filter | JSON array of strings | List of filters; wildcards (*) are supported. The filter section is optional.


    {
        "docids": [ ... ],
        "filter": [ ... ]
    }

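As a rough sketch of a Bulk Download round trip, the request body can be built with a JSON library and the returned ZIP grouped by folder. Several things here are assumptions: the function names are made up, the filter pattern in the usage note is a hypothetical example (this page does not say what a filter is matched against), and the folder of each document is taken to be the first path component of the ZIP entry.

```python
import io
import json
import zipfile
from collections import defaultdict


def bulk_download_body(docids, filters=None):
    """Build the JSON request body for POST /cds/zip as UTF-8 bytes."""
    body = {"docids": list(docids)}
    if filters:
        # The filter section is optional, so it is only added when given.
        body["filter"] = list(filters)
    return json.dumps(body).encode("utf-8")


def group_by_folder(zip_bytes):
    """Map each top-level folder in the ZIP to the file paths it contains."""
    groups = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            folder, sep, rest = name.partition("/")
            if not sep:
                # File stored at the top level, outside any folder.
                folder, rest = "", name
            groups[folder].append(rest)
    return dict(groups)
```

For example, bulk_download_body(["DE.000112014003467.A5"], ["DOCUMENT.*"]) yields a body with both fields, while omitting the second argument yields a body with docids only.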
MTC Artifacts JSON

Field | Format | Usage / Comments
id | JSON String | ID of the corresponding document.
artifacts | List of JSON Strings | List of available artifacts, as returned by the List Artifacts endpoint.
sections | List of JSON Dictionaries | In each dictionary, the field section holds the section name, and the fields start and end denote the start and end page numbers of that section.

The following sections are supported:

  • Title
    The title page(s) of the document

  • Abstract
    The pages that contain the abstract. Usually the same as Title. Sometimes missing, depending on the data source.

  • Drawing
    The pages that contain drawings. Is not always present, for example if the document has no drawings.

  • Claim
    The pages that contain the claims. Is not always present.

  • Description
    The pages that contain the description. Is not always present.

The presence of a Claim or Description section does not necessarily mean that the XML document is full text.


    {
        "id": "DE.000112014003467.A5",
        "artifacts": [ ... ],
        "sections": [
            {
                "section": "Title",
                "start": 1,
                "end": 1
            }
        ]
    }

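The section information in artifacts.json can be used to find the pages that belong to a given section. The sketch below assumes that start and end are 1-based and inclusive, which the Title example (start 1, end 1) suggests but this page does not state outright. The Drawing entry in the sample data and the function name pages_for_section are made up for illustration.

```python
from typing import List

# Illustrative data only; the Drawing entry is a made-up example.
SAMPLE_ARTIFACTS_JSON = {
    "id": "DE.000112014003467.A5",
    "artifacts": [],
    "sections": [
        {"section": "Title", "start": 1, "end": 1},
        {"section": "Drawing", "start": 2, "end": 3},
    ],
}


def pages_for_section(artifacts_json: dict, section: str) -> List[int]:
    """Return the page numbers covered by the named section.

    Assumes start and end are inclusive. Returns an empty list when the
    section is not present, since not every section is always available.
    """
    pages: List[int] = []
    for entry in artifacts_json.get("sections", []):
        if entry.get("section") == section:
            pages.extend(range(entry["start"], entry["end"] + 1))
    return pages
```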

Response JSON for Check Availability

FieldFormatUsage / Comments
docids | JSON array of strings | List of docids that are missing in the store


    { "docids": [ ... ] }