Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
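As a minimal sketch of the header format described above, the helper below builds the header as a Python dictionary. The function name `bearer_header` is hypothetical, and the header name `Authorization` is assumed from the standard HTTP bearer scheme rather than stated on this page.

```python
from typing import Dict


def bearer_header(token: str) -> Dict[str, str]:
    """Build an HTTP header of the form 'Bearer <token>'."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    # Header name assumed from the standard HTTP bearer scheme.
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.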
Body
The content to extract and process into a document. This can be plaintext or a URL to a website, a PDF, an image, or a video.
Plaintext: Any plaintext format
URL: A URL to a website, PDF, image, or video
For URLs, the content type is detected automatically from the format of the URL's response.
"This is a detailed article about machine learning concepts..."
Optional tag used to group this document into a container. This can be an ID for your user, a project ID, or any other identifier you wish to use to group documents.
"user_123"
(DEPRECATED: Use containerTag instead) Optional tags this document should be containerized by. This can be an ID for your user, a project ID, or any other identifier you wish to use to group documents.
["user_123", "project_123"]
Optional custom ID of the document. This could be an ID from your database that will uniquely identify this document.
"mem_abc123"
Optional file type override to force specific processing behavior. Valid values: text, pdf, tweet, google_doc, google_slide, google_sheet, image, video, notion_doc, webpage, onedrive
"pdf"
Required when fileType is 'image' or 'video'. Specifies the exact MIME type to use (e.g., 'image/png', 'image/jpeg', 'video/mp4', 'video/webm')
"image/png"
Optional metadata for the document. Use it to store any additional information you need about the document. Metadata can be used to filter documents. Keys must be strings and are case-sensitive. Values can be strings, numbers, or booleans. Nested objects are not allowed.
{
"category": "technology",
"isPublic": true,
"readingTime": 5,
"source": "web",
"tag_1": "ai",
"tag_2": "machine-learning"
}
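The constraints on the file type override, the MIME type, and the metadata described above can be checked on the client before a request is sent. The sketch below is one possible way to do that, not the service's own validation. The function names `check_file_type` and `check_metadata` are hypothetical, and the sketch assumes that "numbers" means Python `int` or `float` and that lists and `None` are not accepted as metadata values.

```python
from typing import Any, Dict, Optional

# Values listed in the file type override description above.
VALID_FILE_TYPES = {
    "text", "pdf", "tweet", "google_doc", "google_slide", "google_sheet",
    "image", "video", "notion_doc", "webpage", "onedrive",
}
# File types for which an explicit MIME type is required.
MIME_REQUIRED_FOR = {"image", "video"}


def check_file_type(file_type: Optional[str], mime_type: Optional[str] = None) -> None:
    """Raise ValueError if the override breaks the documented rules."""
    if file_type is None:
        return  # the override is optional
    if file_type not in VALID_FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type!r}")
    if file_type in MIME_REQUIRED_FOR and not mime_type:
        raise ValueError(f"a MIME type is required when file type is {file_type!r}")


def check_metadata(metadata: Dict[Any, Any]) -> None:
    """Raise TypeError unless metadata is a flat map of string keys to scalars."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
        # bool is a subclass of int, so booleans pass this check as well.
        # Assumption: dicts, lists, and None are all rejected.
        if not isinstance(value, (str, int, float)):
            raise TypeError(
                f"metadata value for {key!r} must be a string, number, or boolean"
            )
```

For example, `check_file_type("image", "image/png")` and `check_metadata({"category": "technology", "readingTime": 5})` both pass silently, while `check_file_type("image")` raises `ValueError` and `check_metadata({"author": {"name": "x"}})` raises `TypeError`. Because keys are case-sensitive, `"category"` and `"Category"` are treated as two distinct keys.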