Track your documents through the processing pipeline to provide better user experiences and handle edge cases.

Processing Pipeline

Documents are converted to memories in a sequence of stages, each serving a specific purpose:
  • Queued: Document is waiting in the processing queue
  • Extracting: Content is being extracted (OCR for images, transcription for videos)
  • Chunking: Content is broken into optimal, searchable pieces
  • Embedding: Each chunk is converted to vector representations
  • Indexing: Vectors are added to the search index
  • Done: Document is fully processed and searchable
Processing time varies by content type. Plain text processes in seconds, while a 10-minute video might take 2-3 minutes.
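For client code, the stages above can be modeled as an ordered list, which makes it straightforward to derive a progress value for a UI. This is a minimal sketch: the names `PIPELINE_STAGES`, `isTerminal`, and `stageProgress` are illustrative and not part of the Supermemory SDK, and the lowercase status strings plus the extra `failed` state are taken from the Status Values table later on this page.

```typescript
// Stage order as listed above. "failed" is handled separately because it is
// an error state, not a step in the pipeline.
const PIPELINE_STAGES = [
  "queued",
  "extracting",
  "chunking",
  "embedding",
  "indexing",
  "done",
] as const;

type PipelineStage = (typeof PIPELINE_STAGES)[number];
type DocumentStatus = PipelineStage | "failed";

// Hypothetical helper: true once no further status changes are expected.
function isTerminal(status: DocumentStatus): boolean {
  return status === "done" || status === "failed";
}

// Hypothetical helper: fraction of the pipeline completed (0 to 1).
// Returns null for "failed", where a progress value is not meaningful.
function stageProgress(status: DocumentStatus): number | null {
  if (status === "failed") return null;
  return PIPELINE_STAGES.indexOf(status) / (PIPELINE_STAGES.length - 1);
}
```

The fraction counts stages rather than elapsed time, so it is best treated as a coarse indicator.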

Processing Documents

Monitor all documents currently being processed across your account.

GET /v3/documents/processing

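// Direct API call (not in SDK)
// SUPERMEMORY_API_KEY is assumed to hold your API key
const response = await fetch('https://api.supermemory.ai/v3/documents/processing', {
  headers: {
    'Authorization': `Bearer ${SUPERMEMORY_API_KEY}`
  }
});

// Surface HTTP errors instead of trying to parse an error body as a document list
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const processing = await response.json();
console.log(`${processing.documents.length} documents processing`);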

Response Format

{
  "documents": [
    {
      "id": "doc_abc123",
      "status": "extracting",
      "created_at": "2024-01-15T10:30:00Z",
      "updated_at": "2024-01-15T10:30:15Z",
      "container_tags": ["research"],
      "metadata": {
        "source": "upload",
        "filename": "report.pdf"
      }
    },
    {
      "id": "doc_def456",
      "status": "chunking",
      "created_at": "2024-01-15T10:29:00Z",
      "updated_at": "2024-01-15T10:30:00Z",
      "container_tags": ["articles"],
      "metadata": {
        "source": "url",
        "url": "https://example.com/article"
      }
    }
  ],
  "total": 2
}
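
Once parsed, this response can be summarized for a dashboard or checked for documents that appear stuck. The sketch below assumes only the fields shown in the sample response above; the interface names and the helpers `countByStatus` and `findStalled` are hypothetical, and what counts as "stalled" is an application-level choice rather than something the API defines.

```typescript
// Shapes inferred from the sample response above (not an official type definition).
interface ProcessingDocument {
  id: string;
  status: string;
  created_at: string;
  updated_at: string;
  container_tags: string[];
  metadata: Record<string, unknown>;
}

interface ProcessingResponse {
  documents: ProcessingDocument[];
  total: number;
}

// Hypothetical helper: number of documents in each status.
function countByStatus(response: ProcessingResponse): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const doc of response.documents) {
    counts[doc.status] = (counts[doc.status] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical helper: ids of documents whose last update is older than thresholdMs.
function findStalled(
  response: ProcessingResponse,
  now: Date,
  thresholdMs: number
): string[] {
  return response.documents
    .filter((doc) => now.getTime() - Date.parse(doc.updated_at) > thresholdMs)
    .map((doc) => doc.id);
}
```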

Individual Documents

Track the processing status of a specific document.

GET /v3/documents/{id}

// `let`, not `const`: the variable is reassigned on every poll
let memory = await client.memories.get("doc_abc123");

console.log(`Status: ${memory.status}`);

// Poll until the document reaches a terminal state ('done' or 'failed')
while (memory.status !== 'done' && memory.status !== 'failed') {
  await new Promise(r => setTimeout(r, 2000));
  memory = await client.memories.get("doc_abc123");
  console.log(`Status: ${memory.status}`);
}

Response Format

{
  "id": "doc_abc123",
  "status": "done",
  "content": "The original content...",
  "container_tags": ["research"],
  "metadata": {
    "source": "upload",
    "filename": "report.pdf"
  },
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:31:00Z"
}

Status Values

Status | Description | Typical Duration
queued | Waiting to be processed | < 5 seconds
extracting | Extracting content from source | 5-30 seconds
chunking | Breaking into searchable pieces | 5-15 seconds
embedding | Creating vector representations | 10-30 seconds
indexing | Adding to search index | 5-10 seconds
done | Fully processed and searchable | -
failed | Processing failed | -
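
The typical durations above can be summed into a rough "time remaining" hint. The sketch below uses the upper bound of each range, so a document in `chunking` gets 15 + 30 + 10 = 55 seconds. The names `TYPICAL_MAX_SECONDS` and `estimateRemainingSeconds` are hypothetical, and the result is only a typical-case figure: large files and videos can take considerably longer.

```typescript
// Upper bounds (in seconds) from the "Typical Duration" column, in pipeline order.
const TYPICAL_MAX_SECONDS: Array<[string, number]> = [
  ["queued", 5],
  ["extracting", 30],
  ["chunking", 15],
  ["embedding", 30],
  ["indexing", 10],
];

// Hypothetical helper: sums the upper bounds of the current and remaining stages.
// Returns 0 for "done" and null for "failed" or any unrecognized status.
function estimateRemainingSeconds(status: string): number | null {
  if (status === "done") return 0;
  const index = TYPICAL_MAX_SECONDS.findIndex(([name]) => name === status);
  if (index === -1) return null;
  return TYPICAL_MAX_SECONDS.slice(index).reduce(
    (sum, [, seconds]) => sum + seconds,
    0
  );
}
```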

Polling Best Practices

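When polling for status updates, cap the total wait time and stop as soon as the document reaches a terminal state (done or failed):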
async function waitForProcessing(documentId: string, maxWaitMs = 300000) {
  const startTime = Date.now();
  const pollInterval = 2000; // 2 seconds

  while (Date.now() - startTime < maxWaitMs) {
    const doc = await client.memories.get(documentId);

    if (doc.status === 'done') {
      return doc;
    }

    if (doc.status === 'failed') {
      throw new Error(`Processing failed for ${documentId}`);
    }

    await new Promise(r => setTimeout(r, pollInterval));
  }

  throw new Error(`Timeout waiting for ${documentId}`);
}
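
The fixed 2-second interval above is simple and works well for short jobs. For longer jobs, one option is to grow the interval between polls so fewer requests are made. This is a sketch of that idea, not a documented recommendation: `pollDelayMs`, `pollSchedule`, the doubling factor, and the 15-second cap are all assumptions.

```typescript
// Hypothetical helper: delay before poll number `attempt` (0-based),
// doubling from baseMs and capped at maxMs. Assumes baseMs > 0.
function pollDelayMs(attempt: number, baseMs = 2000, maxMs = 15000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Hypothetical helper: the full list of delays that fits within maxWaitMs.
// The final delay is shortened so the total never exceeds maxWaitMs.
function pollSchedule(maxWaitMs: number, baseMs = 2000, maxMs = 15000): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  for (let attempt = 0; elapsed < maxWaitMs; attempt++) {
    const delay = Math.min(pollDelayMs(attempt, baseMs, maxMs), maxWaitMs - elapsed);
    delays.push(delay);
    elapsed += delay;
  }
  return delays;
}
```

With a 60-second budget, `pollSchedule(60000)` yields 7 waits, compared with 30 at a fixed 2-second interval. The trade-off is that completion may be noticed up to 15 seconds late.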

Batch Processing

For multiple documents, track them efficiently:
async function trackBatch(documentIds: string[]) {
  const statuses = new Map();

  // Initial check
  for (const id of documentIds) {
    const doc = await client.memories.get(id);
    statuses.set(id, doc.status);
  }

  // Poll until all done
  while ([...statuses.values()].some(s => s !== 'done' && s !== 'failed')) {
    await new Promise(r => setTimeout(r, 5000)); // 5 second interval for batch

    for (const id of documentIds) {
      if (statuses.get(id) !== 'done' && statuses.get(id) !== 'failed') {
        const doc = await client.memories.get(id);
        statuses.set(id, doc.status);
      }
    }

    // Log progress
    const done = [...statuses.values()].filter(s => s === 'done').length;
    console.log(`Progress: ${done}/${documentIds.length} complete`);
  }

  return statuses;
}
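
The progress log above counts only completed documents, so failures are easy to miss. The map returned by `trackBatch` can be reduced to a fuller summary, as in this sketch; `BatchSummary` and `summarizeBatch` are hypothetical names, not SDK exports.

```typescript
interface BatchSummary {
  total: number;
  done: number;
  failed: number;
  pending: number; // any status other than 'done' or 'failed'
  failedIds: string[];
}

// Hypothetical helper: summarizes a Map of document id -> status.
function summarizeBatch(statuses: Map<string, string>): BatchSummary {
  const summary: BatchSummary = {
    total: statuses.size,
    done: 0,
    failed: 0,
    pending: 0,
    failedIds: [],
  };
  for (const [id, status] of statuses) {
    if (status === "done") {
      summary.done++;
    } else if (status === "failed") {
      summary.failed++;
      summary.failedIds.push(id);
    } else {
      summary.pending++;
    }
  }
  return summary;
}
```

The `failedIds` list gives callers something concrete to retry or report to the user.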

Error Handling

Handle processing failures gracefully:
async function addWithRetry(content: string, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const { id } = await client.memories.add({ content });

    try {
      const result = await waitForProcessing(id);
      return result;
    } catch (error) {
      console.error(`Attempt ${attempt} failed:`, error);

      if (attempt === maxRetries) {
        throw error;
      }

      // Exponential backoff
      await new Promise(r => setTimeout(r, 1000 * Math.pow(2, attempt)));
    }
  }
}
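
Note that `addWithRetry` resubmits the content on every caught error, including a timeout, where the original document may still be processing. One way to tell the two cases apart is to throw distinct error types from the polling helper and retry only on an explicit failure. This is a sketch under that assumption; `ProcessingFailedError`, `ProcessingTimeoutError`, and `shouldRetry` are hypothetical names.

```typescript
// Thrown when a document reports the 'failed' status.
class ProcessingFailedError extends Error {
  readonly documentId: string;
  constructor(documentId: string) {
    super(`Processing failed for ${documentId}`);
    this.name = "ProcessingFailedError";
    this.documentId = documentId;
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Thrown when the wait budget runs out before a terminal status is seen.
class ProcessingTimeoutError extends Error {
  readonly documentId: string;
  constructor(documentId: string) {
    super(`Timeout waiting for ${documentId}`);
    this.name = "ProcessingTimeoutError";
    this.documentId = documentId;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical policy: resubmit only after an explicit failure. After a
// timeout the document may still finish, so resubmitting could duplicate it.
function shouldRetry(error: unknown): boolean {
  return error instanceof ProcessingFailedError;
}
```

In the retry loop, a `catch` block could then rethrow immediately when `shouldRetry(error)` is false.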

Processing Times by Content Type

Documents: Created near instantly (200-500ms).
Memories: Take longer, because Supermemory builds a memory graph from the content using semantic analysis and contextual understanding.
Content Type | Memory Processing Time | Notes
Plain Text | 5-10 seconds | Fastest processing
Markdown | 5-10 seconds | Similar to plain text
PDF (< 10 pages) | 15-30 seconds | OCR if needed
PDF (> 100 pages) | 1-3 minutes | Depends on complexity
Images | 10-20 seconds | OCR processing
YouTube Videos | 1-2 min per 10 min video | Transcription required
Web Pages | 10-20 seconds | Content extraction
Google Docs | 10-15 seconds | API extraction
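
These figures can inform the timeout passed to a polling helper such as `waitForProcessing`. The sketch below takes the upper bound of each range and multiplies it by a safety factor. The content-type keys, the function name `suggestedTimeoutMs`, the factor of 2, and the 180-second fallback are all assumptions for illustration.

```typescript
// Upper bounds (in seconds) from the table above. The keys are hypothetical
// labels; the table gives no figure for PDFs between 10 and 100 pages.
const MAX_PROCESSING_SECONDS: Record<string, number> = {
  plain_text: 10,
  markdown: 10,
  pdf_small: 30, // fewer than 10 pages
  pdf_large: 180, // more than 100 pages
  image: 20,
  web_page: 20,
  google_doc: 15,
};

// Hypothetical helper: suggested polling timeout in milliseconds.
// Videos scale with length: up to 2 minutes per started 10 minutes of video.
function suggestedTimeoutMs(
  contentType: string,
  videoMinutes = 0,
  safetyFactor = 2
): number {
  const seconds =
    contentType === "youtube_video"
      ? Math.ceil(videoMinutes / 10) * 120
      : (MAX_PROCESSING_SECONDS[contentType] ?? 180); // fallback for unknown types
  return seconds * safetyFactor * 1000;
}
```

For example, `suggestedTimeoutMs("plain_text")` gives 20000 ms, and a 25-minute video gives 720000 ms (3 blocks of 10 minutes, 120 seconds each, doubled).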
Pro Tip: Use the processing status endpoint to provide real-time feedback to users, especially for larger documents or batch uploads.