# API Reference

## POST /v1/files/upload

Technical reference for the file upload endpoint. Upload and index one or more files into a dataset.

```http
Content-Type: multipart/form-data
```
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| datasetId | string | Yes* | - | Dataset to upload files to (*optional with frontend token) |
| file | file | Yes | - | File(s) to upload (can include multiple) |
| metadata | string | No | {} | JSON string of per-file metadata |
| chunkSize | number | No | 300 | Chunk size in tokens |
| chunkOverlap | number | No | 20 | Overlap between chunks |
Documents (max 100 MB):

- .pdf - PDF documents
- .docx - Word documents
- .xlsx, .xls - Excel spreadsheets
- .pptx, .ppt - PowerPoint presentations
- .csv - CSV files
- .md - Markdown files
- .json - JSON files
- .txt - Plain text files

Media (max 2 GB, auto-transcribed):

- .mp3 - Audio files
- .wav - Audio files
- .mp4 - Video files (audio extracted)

Web scraper (coming soon):

- Web - Web URL
- Sitemap - Sitemap URL

| Type | Max Size | Error |
|---|---|---|
| Documents | 100 MB | FILE_TOO_LARGE |
| Media | 2 GB | FILE_TOO_LARGE |
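These limits can be checked client-side before uploading. A minimal sketch; the thresholds come from the table above, while the helper names (`maxSizeFor`, `checkSize`) and the media-extension grouping are illustrative assumptions based on the supported-format list:

```javascript
// Size limits from the table above: documents 100 MB, media 2 GB.
const DOC_LIMIT = 100 * 1024 * 1024;
const MEDIA_LIMIT = 2 * 1024 * 1024 * 1024;
const MEDIA_EXTS = new Set(['mp3', 'wav', 'mp4']);

function maxSizeFor(filename) {
  const ext = filename.split('.').pop().toLowerCase();
  return MEDIA_EXTS.has(ext) ? MEDIA_LIMIT : DOC_LIMIT;
}

function checkSize(filename, sizeBytes) {
  // Mirrors the server's FILE_TOO_LARGE error code.
  return sizeBytes <= maxSizeFor(filename)
    ? { ok: true }
    : { ok: false, error: 'FILE_TOO_LARGE' };
}
```

Rejecting oversized files in the browser avoids wasting bandwidth on an upload the server will refuse anyway.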
Chunking is controlled by two optional parameters: chunkSize (tokens per chunk, default 300) and chunkOverlap (tokens shared between consecutive chunks, default 20).
The metadata field must be a JSON string (not object).
Metadata lookup order for each file:
1. metadata[originalFileName]
2. metadata[fileId]
3. metadata[index] (0-based)

```json
{
  "report.pdf": {
    "userId": "user_123",
    "department": "finance",
    "year": 2025
  },
  "0": {
    "priority": "high"
  }
}
```
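The lookup order above can be reproduced client-side to predict which metadata entry a given file will receive. The `resolveMeta` helper below is hypothetical, not part of the API; it only encodes the documented order:

```javascript
// Resolve per-file metadata using the documented lookup order:
// originalFileName first, then fileId, then the 0-based index.
function resolveMeta(metadata, file, index) {
  return (
    metadata[file.originalName] ??
    metadata[file.fileId] ??
    metadata[String(index)] ??
    null
  );
}
```

For example, with the metadata object shown above, `report.pdf` matches by name and any other first file falls through to the `"0"` index entry.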
```bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-dataset" \
  -F "file=@document.pdf"
```
```bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-dataset" \
  -F 'metadata={"document.pdf":{"userId":"user_123","department":"legal"}}' \
  -F "file=@document.pdf"
```
```bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-dataset" \
  -F "chunkSize=500" \
  -F "chunkOverlap=50" \
  -F "file=@document.pdf"
```
```bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-dataset" \
  -F "file=@file1.pdf" \
  -F "file=@file2.pdf" \
  -F "file=@file3.pdf"
```
```javascript
const formData = new FormData();
const file = fileInput.files[0];
formData.append('datasetId', 'my-dataset');
formData.append('file', file);

// Optional metadata, keyed by the original filename
const metadata = {
  [file.name]: {
    userId: 'user_123',
    uploadedAt: new Date().toISOString()
  }
};
formData.append('metadata', JSON.stringify(metadata));

// Optional chunking
formData.append('chunkSize', '400');
formData.append('chunkOverlap', '40');

const response = await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});
const result = await response.json();
```
```python
import requests

files = {'file': open('document.pdf', 'rb')}
data = {
    'datasetId': 'my-dataset',
    'metadata': '{"document.pdf":{"userId":"user_123"}}'
}
headers = {'Authorization': f'Bearer {api_key}'}

response = requests.post(
    'https://api.easyrag.com/v1/files/upload',
    headers=headers,
    data=data,
    files=files
)
result = response.json()
```
```json
{
  "success": true,
  "message": "Files processed and indexed successfully!",
  "files": [
    {
      "customerId": "user_abc123",
      "datasetId": "my-dataset",
      "fileId": "f7a3b2c1-4d5e-6f7g-8h9i-0j1k2l3m4n5o",
      "filePath": "customers/user_abc123/datasets/my-dataset/f7a3b2c1-document.pdf",
      "originalName": "document.pdf",
      "mimeType": "application/pdf",
      "size": 245678,
      "loaderId": "pdf_loader_xyz",
      "created": "2024-12-13T10:30:00.000Z",
      "extension": ".pdf",
      "transcriptionText": null,
      "transcriptionSrt": null,
      "extraMeta": {
        "userId": "user_123",
        "department": "legal"
      }
    }
  ],
  "billed": {
    "fileCount": 1,
    "uploadUnits": 1
  }
}
```
| Field | Type | Description |
|---|---|---|
| success | boolean | Always true on success |
| message | string | Success message |
| files | array | Array of uploaded file objects |
| files[].fileId | string | Unique file identifier |
| files[].originalName | string | Original filename |
| files[].datasetId | string | Dataset containing the file |
| files[].customerId | string | Customer who owns the file |
| files[].filePath | string | Storage path |
| files[].mimeType | string | File MIME type |
| files[].size | number | File size in bytes |
| files[].loaderId | string | EmbedJS loader ID |
| files[].created | string | ISO 8601 timestamp |
| files[].extension | string | File extension |
| files[].transcriptionText | string \| null | Transcription text (media only) |
| files[].transcriptionSrt | array \| null | SRT subtitles (media only) |
| files[].extraMeta | object \| null | Custom metadata |
| billed.fileCount | number | Number of files uploaded |
| billed.uploadUnits | number | Credits charged (file count × 10) |
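The billing relationship in the last row is simple to compute up front. A minimal sketch; `uploadCost` is an illustrative helper, not an API call:

```javascript
// uploadUnits is file count × 10, per the response-field table above,
// so the cost of a batch can be estimated before uploading.
function uploadCost(fileCount) {
  return { fileCount, uploadUnits: fileCount * 10 };
}
```

This lets a client compare the estimate against its available credit balance before starting a large batch upload.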
For audio/video files, transcription fields are populated:
```json
{
  "files": [
    {
      "fileId": "a1b2c3d4",
      "originalName": "podcast.mp3",
      "extension": ".mp3",
      "transcriptionText": "Welcome to episode 47...",
      "transcriptionSrt": [
        {
          "id": "1",
          "startTime": "00:00:00,000",
          "endTime": "00:00:03,500",
          "text": "Welcome to episode 47"
        }
      ]
    }
  ]
}
```
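If you need a standard .srt subtitle file from this response, the entries can be serialized directly. A sketch assuming the entry shape shown in the example above (`id`, `startTime`, `endTime`, `text`); `toSrt` is a hypothetical helper:

```javascript
// Serialize transcriptionSrt entries into standard SRT text:
// a numeric cue id, a "start --> end" timing line, the cue text,
// and a blank line between cues.
function toSrt(entries) {
  return entries
    .map(e => `${e.id}\n${e.startTime} --> ${e.endTime}\n${e.text}\n`)
    .join('\n');
}
```

The output can be saved as podcast.srt and loaded by most video players alongside the original media file.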
Missing datasetId
```json
{ "error": "datasetId is required via token, body or form" }
```
No files provided
```json
{ "error": "At least one file is required" }
```
Invalid chunk size
```json
{ "error": "chunkSize must be a positive integer" }
```
File too large
```json
{ "error": "FILE_TOO_LARGE", "message": "File exceeds maximum size of 100MB" }
```
Unsupported format
```json
{ "error": "Unsupported file format: .zip" }
```
Invalid metadata JSON
```json
{ "error": "metadata must be valid JSON string" }
```
Missing authentication

```json
{ "error": "Missing API key or token" }
```
Insufficient credits

```json
{
  "error": "INSUFFICIENT_CREDITS",
  "message": "You are out of credits. Please top up to continue.",
  "details": { "required": 10, "available": 5 }
}
```
Dataset mismatch with token
```json
{ "error": "datasetId mismatch between token and request" }
```
Request too large

```json
{ "error": "Request entity too large" }
```
Rate limit exceeded

```json
{
  "error": "RATE_LIMIT_EXCEEDED",
  "message": "Too many requests. Please try again later.",
  "retryAfter": 60
}
```
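The retryAfter value in the rate-limit error reports seconds to wait before retrying. A minimal sketch of honoring it; the helper names (`retryDelayMs`, `uploadWith429Handling`) and the 60-second fallback are illustrative assumptions:

```javascript
// Convert the retryAfter field (seconds) into a wait in milliseconds,
// falling back to 60 s when the field is absent.
function retryDelayMs(errorBody) {
  return (errorBody.retryAfter ?? 60) * 1000;
}

// Retry a single time after the reported delay on a 429 response.
// doUpload is any function that performs the fetch and returns a Response.
async function uploadWith429Handling(doUpload) {
  const res = await doUpload();
  if (res.status === 429) {
    const body = await res.json();
    await new Promise(resolve => setTimeout(resolve, retryDelayMs(body)));
    return doUpload();
  }
  return res;
}
```

Waiting the server-reported interval, rather than retrying immediately, keeps a client from compounding the rate-limit violation.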
Note: Media processing adds ~1-2 minutes per hour of audio.
```javascript
function validateFile(file) {
  const maxSize = 100 * 1024 * 1024; // 100 MB document limit
  const allowed = [
    'pdf', 'docx', 'xlsx', 'xls', 'pptx', 'ppt',
    'csv', 'md', 'json', 'txt', 'mp3', 'wav', 'mp4'
  ];
  const ext = file.name.split('.').pop().toLowerCase();
  if (file.size > maxSize) {
    return { valid: false, error: 'File too large' };
  }
  if (!allowed.includes(ext)) {
    return { valid: false, error: 'Unsupported file type' };
  }
  return { valid: true };
}
```
```javascript
// Large chunks for documents with long sections
formData.append('chunkSize', '500');
formData.append('chunkOverlap', '50');

// Small chunks for precise matching
formData.append('chunkSize', '200');
formData.append('chunkOverlap', '20');
```
```javascript
const metadata = {
  [file.name]: {
    userId: currentUserId,
    uploadedAt: new Date().toISOString(),
    department: userDepartment,
    fileType: file.type,
    sizeBytes: file.size
  }
};
```
```javascript
const xhr = new XMLHttpRequest();
xhr.upload.onprogress = (e) => {
  if (e.lengthComputable) {
    const percent = (e.loaded / e.total) * 100;
    updateProgress(percent);
  }
};
xhr.open('POST', 'https://api.easyrag.com/v1/files/upload');
xhr.setRequestHeader('Authorization', `Bearer ${apiKey}`);
xhr.send(formData);
```
```javascript
async function uploadWithRetry(formData, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch('https://api.easyrag.com/v1/files/upload', {
        method: 'POST',
        headers: { 'Authorization': `Bearer ${apiKey}` },
        body: formData
      });
      if (response.ok) {
        return await response.json();
      }
      if (response.status === 500 && i < maxRetries - 1) {
        await new Promise(r => setTimeout(r, 1000 * (i + 1)));
        continue;
      }
      throw new Error(`Upload failed: ${response.status}`);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
}
```
- GET /v1/files - List uploaded files
- GET /v1/files/:fileId - Get file details
- DELETE /v1/files/:fileId - Delete file