Sources represent uploaded documents in the platform. Each source goes through an asynchronous extraction pipeline that converts files into structured derivatives (page texts, markdown, per-page images). Sources are the raw material for knowledge bases: once extracted, their content can be chunked and indexed for semantic search.
Common Patterns
The typical flow is: upload a file (POST /api/sources/upload), poll for completion (GET /api/sources/{id} until extraction_status is ‘extracted’ or ‘attention_required’), then retrieve the extracted text (GET /api/sources/{id}/page-texts). For files already in project storage, use import-from-storage. For web pages, use import-url. To swap extraction backends after the fact, POST to /api/sources/{id}/reextract with a new extraction_model.
GET /api/sources
List all sources with an optional status filter.
Filter by extraction_status. One of: pending, extracting, extracted, attention_required, failed, cancelled.
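As a rough illustration of the status filter, the sketch below builds the list URL and rejects values outside the documented set. The query parameter name `status`, the helper name `list_sources_url`, and the base URL are assumptions made for the example; the source only says the filter applies to extraction_status.

```python
from typing import Optional
from urllib.parse import urlencode

# The six extraction_status values listed for this endpoint.
EXTRACTION_STATUSES = {
    "pending", "extracting", "extracted",
    "attention_required", "failed", "cancelled",
}


def list_sources_url(base_url: str, status: Optional[str] = None) -> str:
    """Return the URL for GET /api/sources, optionally filtered by status."""
    url = base_url.rstrip("/") + "/api/sources"
    if status is None:
        return url
    if status not in EXTRACTION_STATUSES:
        raise ValueError(f"unknown extraction_status: {status!r}")
    # ASSUMPTION: the filter is sent as a query parameter named "status".
    return url + "?" + urlencode({"status": status})
```

Validating on the client side keeps a typo such as `extrated` from silently returning an unfiltered or empty list, depending on how the server treats unknown values.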
POST /api/sources/upload
Upload a file for extraction. Accepts PDF, DOCX, PPTX, XLSX, images (PNG/JPG/WebP/GIF/TIFF), and plain text. Uses multipart/form-data. Optional fields: name (display name), metadata (JSON string, preserved through indexing), extraction_model (PDF only; one of auto, mistral, paddleocr, lighton, opendataloader, fitz, pdfplumber).
POST /api/sources/import-from-storage
Import a file already in project storage as a source.
POST /api/sources/import-url
Import content from web URLs. mode=‘urls’ imports a fixed list of URLs, mode=‘crawl’ crawls outward from a seed URL, and mode=‘sitemap’ parses an XML sitemap. Requires a Firecrawl API key to be configured in project settings.
GET /api/sources/{id}
Get source details including extraction status.
{id}: Source ID
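The first two steps of the flow described under Common Patterns (upload, then poll this endpoint) might look roughly like the sketch below. It only builds the multipart body and runs the polling loop; the HTTP call itself is passed in as `get_source`, a hypothetical callable that performs GET /api/sources/{id} and returns the parsed response as a dict. The file part name `file`, the JSON response shape, and treating ‘failed’ and ‘cancelled’ as stop conditions are assumptions, not documented behavior.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple

# The source names 'extracted' and 'attention_required' as the states to poll
# for; also stopping on 'failed' and 'cancelled' is an assumption of this sketch.
TERMINAL_STATUSES = {"extracted", "attention_required", "failed", "cancelled"}


def encode_upload(filename: str, content: bytes,
                  fields: Optional[Dict[str, str]] = None,
                  file_field: str = "file") -> Tuple[bytes, str]:
    """Build a multipart/form-data body; returns (body, content_type).

    `file_field` is a hypothetical name for the file part. `fields` carries the
    optional text fields (name, metadata as a JSON string, extraction_model).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for key, value in (fields or {}).items():
        chunks.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{key}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    chunks.append(header + content + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def wait_for_extraction(get_source: Callable[[str], dict], source_id: str,
                        interval: float = 2.0, timeout: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until extraction_status reaches a terminal value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        source = get_source(source_id)
        if source.get("extraction_status") in TERMINAL_STATUSES:
            return source
        if time.monotonic() >= deadline:
            raise TimeoutError(f"source {source_id} did not finish extracting")
        sleep(interval)
```

Passing the transport and the sleep function in as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.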
GET /api/sources/{id}/page-texts
Get extracted text content organized by page.
{id}: Source ID
Specific page number
PATCH /api/sources/{id}
Update a source’s display name or metadata.
{id}: Source ID
POST /api/sources/{id}/reextract
Re-run extraction on an existing source, optionally with a different extraction_model.
{id}: Source ID
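A minimal sketch of preparing a re-extraction request follows. It assumes the request body is JSON and that the extraction_model values accepted here are the same ones listed for the upload endpoint; neither point is confirmed by this page. The helper name `reextract_request` is hypothetical.

```python
import json
from typing import Optional, Tuple

# Values listed for extraction_model on POST /api/sources/upload.
# ASSUMPTION: reextract accepts the same set.
EXTRACTION_MODELS = (
    "auto", "mistral", "paddleocr", "lighton",
    "opendataloader", "fitz", "pdfplumber",
)


def reextract_request(source_id: str,
                      extraction_model: Optional[str] = None) -> Tuple[str, str, bytes]:
    """Return (method, path, body) for a re-extraction call."""
    path = f"/api/sources/{source_id}/reextract"
    if extraction_model is None:
        # No model given: re-run with whatever the server chooses.
        return "POST", path, b""
    if extraction_model not in EXTRACTION_MODELS:
        raise ValueError(f"unknown extraction_model: {extraction_model!r}")
    # ASSUMPTION: the model is sent as a JSON object in the request body.
    body = json.dumps({"extraction_model": extraction_model}).encode("utf-8")
    return "POST", path, body
```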
POST /api/sources/{id}/cancel
Cancel an in-progress extraction. Sets extraction_status to ‘cancelled’.
{id}: Source ID
GET /api/sources/{id}/download
Download the original uploaded file (as stored in project storage).
{id}: Source ID
GET /api/sources/{id}/derivatives/{type}/download
Download a derivative artifact. type is one of: markdown, text, page_text, image. For per-page types (page_text, image) pass index=N (0-based) in the query string.
{id}: Source ID
{type}: Derivative type: markdown, text, page_text, or image
index: 0-based index for per-page derivatives (page_text, image)
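The rules above for type and index can be captured in a small URL builder. This is a sketch only: the helper name `derivative_download_url` and the base URL are made up for the example, and refusing an index for whole-document types is a choice of this sketch rather than documented server behavior.

```python
from typing import Optional
from urllib.parse import quote, urlencode

PER_PAGE_TYPES = {"page_text", "image"}
DERIVATIVE_TYPES = {"markdown", "text"} | PER_PAGE_TYPES


def derivative_download_url(base_url: str, source_id: str, kind: str,
                            index: Optional[int] = None) -> str:
    """Build the download URL for one derivative of a source."""
    if kind not in DERIVATIVE_TYPES:
        raise ValueError(f"unknown derivative type: {kind!r}")
    if kind in PER_PAGE_TYPES:
        # Per-page derivatives are addressed by a 0-based index.
        if index is None or index < 0:
            raise ValueError(f"{kind} requires a 0-based index")
    elif index is not None:
        # Sketch-level choice: reject an index that would have no meaning.
        raise ValueError(f"{kind} is not a per-page derivative")
    root = base_url.rstrip("/")
    safe_id = quote(source_id, safe="")
    url = f"{root}/api/sources/{safe_id}/derivatives/{kind}/download"
    if index is not None:
        url += "?" + urlencode({"index": index})
    return url
```

For example, the first page image of a source would be requested with `derivative_download_url(base, source_id, "image", 0)`.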
DELETE /api/sources/{id}
Delete a source and its associated storage files (original + derivatives).
{id}: Source ID
Error Responses
| Status | Code | Description |
|---|---|---|
| 400 | invalid_file | The uploaded file type is not supported or the file is corrupted |
| 404 | source_not_found | No source exists with the given ID |
| 413 | file_too_large | The uploaded file exceeds the maximum allowed size |
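One way a client might surface the errors in the table is to map the HTTP status to the documented code. The sketch below keys on the status alone because the shape of the error response body is not described here; the `SourceApiError` class and the `unknown_error` fallback code are hypothetical names introduced for illustration.

```python
from typing import Optional

# Status -> (code, description), copied from the table above.
DOCUMENTED_ERRORS = {
    400: ("invalid_file", "The uploaded file type is not supported or the file is corrupted"),
    404: ("source_not_found", "No source exists with the given ID"),
    413: ("file_too_large", "The uploaded file exceeds the maximum allowed size"),
}


class SourceApiError(Exception):
    """Hypothetical exception type carrying the status and documented code."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code


def error_for_status(status: int) -> Optional[SourceApiError]:
    """Return an error for a non-2xx status, or None on success."""
    if 200 <= status < 300:
        return None
    # "unknown_error" is a placeholder for statuses the table does not list.
    code, message = DOCUMENTED_ERRORS.get(
        status, ("unknown_error", "Undocumented error status")
    )
    return SourceApiError(status, code, message)
```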