Uploading Files

Upload and index your documents

Learn how to upload documents to EasyRAG and make them searchable.

Overview

When you upload a file to EasyRAG:

  1. Parsing - File content is extracted (text, tables, etc.)
  2. Chunking - Content is split into smaller pieces
  3. Embedding - Each chunk is converted to a vector
  4. Indexing - Vectors are stored for fast searching

Once the upload request returns successfully, the content is indexed and you can search and query it right away.

Supported File Types

Documents

Format      Extension      Max Size
PDF         .pdf           100 MB
Word        .doc, .docx    100 MB
Excel       .xlsx, .xls    100 MB
PowerPoint  .pptx, .ppt    100 MB
CSV         .csv           100 MB
Markdown    .md            100 MB
JSON        .json          100 MB
TXT         .txt           100 MB

Media (Auto-Transcribed)

Format  Extension    Max Size
Audio   .mp3, .wav   2 GB
Video   .mp4         2 GB

Note: Audio and video files are automatically transcribed using Whisper, and the resulting transcript is then indexed for search.

Basic Upload

Using cURL

bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-documents" \
  -F "file=@document.pdf"

Using JavaScript

javascript
const formData = new FormData();
formData.append('datasetId', 'my-documents');
formData.append('file', fileInput.files[0]);

const response = await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

const result = await response.json();
console.log('Uploaded:', result.files[0].originalName);

Response

json
{
  "success": true,
  "message": "Files processed and indexed successfully!",
  "files": [
    {
      "fileId": "f7a3b2c1-4d5e-6f7g",
      "originalName": "document.pdf",
      "datasetId": "my-documents",
      "size": 245678,
      "created": "2024-12-13T10:30:00.000Z"
    }
  ],
  "billed": {
    "fileCount": 1,
    "uploadUnits": 1
  }
}
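
As a rough sketch of how this response might be consumed, the helper below collects the file IDs, original names, and billed units from a payload shaped like the sample above. The function name `summarizeUpload` is hypothetical; only the field names are taken from the sample response.

```javascript
// Hypothetical helper: summarizes an upload response shaped like the sample above.
function summarizeUpload(result) {
  if (!result || result.success !== true) {
    throw new Error('Upload did not report success');
  }
  const files = Array.isArray(result.files) ? result.files : [];
  return {
    fileIds: files.map((f) => f.fileId),
    names: files.map((f) => f.originalName),
    // `billed` is read defensively in case it is absent from the payload
    uploadUnits: result.billed ? result.billed.uploadUnits : 0
  };
}
```

Keeping the returned file IDs is useful if you later want to refer to a specific file, for example as a metadata key.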

Upload Multiple Files

You can upload multiple files in a single request:

javascript
const formData = new FormData();
formData.append('datasetId', 'my-documents');

// Add multiple files
formData.append('file', file1);
formData.append('file', file2);
formData.append('file', file3);

const response = await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

Cost: 1 credit per file (3 credits for 3 files)
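
When the number of files is not known in advance, the repeated `file` field can be built in a loop. The sketch below assumes a runtime with the standard `FormData` and `Blob` globals (browsers, or a recent Node.js); the helper name `buildUploadForm` is hypothetical.

```javascript
// Hypothetical helper: builds one multipart body for any number of files.
// `files` is an array of File or Blob objects.
function buildUploadForm(datasetId, files) {
  const formData = new FormData();
  formData.append('datasetId', datasetId);
  for (const file of files) {
    // Repeating the 'file' field sends several files in one request
    formData.append('file', file);
  }
  return formData;
}
```

The returned object can be passed as the `body` of the `fetch` call in place of a hand-built `formData`.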

Chunking Options

Control how documents are split into chunks:

Default Behavior

  • Chunk Size: 300 tokens
  • Chunk Overlap: 20 tokens
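
To make the two parameters concrete, here is a minimal sketch of overlapping windows over an array of tokens. It is illustrative only: this guide does not describe EasyRAG's actual tokenizer or splitting strategy, and `chunkTokens` is a hypothetical name.

```javascript
// Illustrative only: splits an array of tokens into overlapping windows.
// This is an assumed sliding-window model, not EasyRAG's documented algorithm.
function chunkTokens(tokens, chunkSize = 300, chunkOverlap = 20) {
  if (chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be smaller than chunkSize');
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) break; // last window reached the end
  }
  return chunks;
}
```

Under this model, each chunk repeats the last `chunkOverlap` tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.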

Custom Chunking

javascript
const formData = new FormData();
formData.append('datasetId', 'my-documents');
formData.append('file', file);
formData.append('chunkSize', '500');   // Larger chunks
formData.append('chunkOverlap', '50'); // More overlap

await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

When to Adjust Chunking

Use larger chunks (400-600 tokens) when:

  • Documents have long, coherent sections
  • You want more context per result
  • Your queries are complex

Use smaller chunks (200-300 tokens) when:

  • Documents have short, distinct sections
  • You want precise matching
  • Your queries are specific

Increase overlap (40-60 tokens) when:

  • Important information spans multiple chunks
  • You want better context continuity
  • Precision matters more than speed
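
The trade-off can be estimated with simple arithmetic. The sketch below assumes a plain sliding window in which each chunk advances by `chunkSize - chunkOverlap` tokens; that model and the name `estimateChunkCount` are assumptions for illustration, not documented EasyRAG behavior.

```javascript
// Rough estimate of how many chunks a document of `totalTokens` tokens produces,
// assuming a simple sliding window (an assumption, not documented behavior).
function estimateChunkCount(totalTokens, chunkSize = 300, chunkOverlap = 20) {
  if (chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be smaller than chunkSize');
  }
  if (totalTokens <= 0) return 0;
  if (totalTokens <= chunkSize) return 1;
  // Each additional chunk covers (chunkSize - chunkOverlap) new tokens
  return Math.ceil((totalTokens - chunkOverlap) / (chunkSize - chunkOverlap));
}
```

For a 10,000-token document under this model, the defaults give 36 chunks, `chunkSize = 500` with `chunkOverlap = 50` gives 23, and the default size with `chunkOverlap = 60` gives 42. Larger chunks mean fewer vectors to store and search, while more overlap means more.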

Adding Metadata

Attach custom metadata to your files for filtering later:

javascript
const formData = new FormData();
formData.append('datasetId', 'company-docs');
formData.append('file', policyFile);

// Metadata as JSON string
const metadata = {
  'hr-policy.pdf': {
    department: 'HR',
    category: 'policy',
    year: 2024,
    confidential: false
  }
};
formData.append('metadata', JSON.stringify(metadata));

await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

Metadata Matching

Metadata can be matched by:

  1. Filename: metadata['hr-policy.pdf']
  2. File ID: metadata['f7a3b2c1-4d5e']
  3. Index: metadata['0'] (first file in upload)

javascript
const metadata = {
  // Match by filename
  'report.pdf': { type: 'financial' },

  // Match by index (useful for multiple uploads)
  '0': { priority: 'high' },
  '1': { priority: 'medium' }
};
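
For multi-file uploads, index keys can be generated rather than written by hand. The following sketch builds an index-keyed metadata object from the same array of files that is appended to the form; `metadataByIndex` and the `describe` callback are hypothetical names, and it assumes the indices follow the order in which files are appended.

```javascript
// Hypothetical helper: builds a metadata object keyed by upload index ('0', '1', ...).
// `describe(file, index)` returns the metadata for one file.
function metadataByIndex(files, describe) {
  const metadata = {};
  files.forEach((file, index) => {
    metadata[String(index)] = describe(file, index);
  });
  return metadata;
}
```

The result can be serialized with `JSON.stringify` and appended as the `metadata` field, as in the upload example above.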

Later, you can filter by this metadata:

javascript
await search('company-docs', 'vacation policy', {
  filters: [
    { key: 'department', match: { value: 'HR' } },
    { key: 'year', match: { value: 2024 } }
  ]
});
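
If filters usually come from a plain object of key/value pairs, a small converter keeps call sites short. This is a sketch only: `toFilters` is a hypothetical helper, and it covers just the exact-match `{ key, match: { value } }` shape shown in the example above.

```javascript
// Hypothetical helper: turns { department: 'HR', year: 2024 } into the
// exact-match filter array shape used in the example above.
function toFilters(criteria) {
  return Object.entries(criteria).map(([key, value]) => ({
    key,
    match: { value }
  }));
}
```

For instance, `toFilters({ department: 'HR', year: 2024 })` produces the same two filters as the hand-written array.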

Transcription (Audio/Video)

When you upload audio or video files:

  1. The media is sent to Whisper for transcription
  2. Transcription text is indexed for search
  3. Both text and SRT subtitles are saved

javascript
// Upload a podcast episode
const formData = new FormData();
formData.append('datasetId', 'podcast-episodes');
formData.append('file', audioFile); // .mp3 or .wav

const response = await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

const result = await response.json();

// Access transcription
const file = result.files[0];
console.log('Transcription:', file.transcriptionText);
console.log('Subtitles:', file.transcriptionSrt); // Array of SRT entries

Note: Transcription adds processing time (typically 1-2 minutes per hour of audio).
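
To set expectations in a UI, the figure in the note can be turned into a rough range. The helper below is a hypothetical sketch that simply scales the "1-2 minutes per hour" guideline; actual processing time may vary.

```javascript
// Rough guide only: scales the "1-2 minutes per hour of audio" figure from the note.
function estimateTranscriptionMinutes(durationSeconds) {
  const hours = durationSeconds / 3600;
  return { min: hours * 1, max: hours * 2 };
}
```

For a 90-minute recording, `estimateTranscriptionMinutes(5400)` returns `{ min: 1.5, max: 3 }`.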

Error Handling

File Too Large

json
{
  "error": "FILE_TOO_LARGE",
  "message": "File exceeds maximum size of 100MB"
}

Unsupported Format

json
{
  "error": "Unsupported file format: .exe"
}

Insufficient Credits

json
{
  "error": "INSUFFICIENT_CREDITS",
  "message": "You are out of credits. Please top up to continue."
}
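
The sample payloads above come in two shapes: an error code with a `message`, and a bare `error` string. A small mapper can normalize both into user-facing text. This is a sketch under that assumption; `describeUploadError` is a hypothetical name and the wording of the messages is illustrative.

```javascript
// Hypothetical helper: turns an error payload into a message for the user.
// Handles both shapes shown above: { error: CODE, message } and { error: text }.
function describeUploadError(payload) {
  const code = payload && payload.error;
  switch (code) {
    case 'FILE_TOO_LARGE':
      return 'The file is too large. Please choose a smaller file.';
    case 'INSUFFICIENT_CREDITS':
      return 'You are out of credits. Please top up to continue.';
    default:
      // Fall back to whatever text the payload carries
      return (payload && (payload.message || payload.error)) || 'Upload failed.';
  }
}
```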

Complete Error Handling Example

javascript
async function uploadWithErrorHandling(file) {
  // Check file size: audio and video may be up to 2 GB, documents up to 100 MB
  const isMedia = file.type.startsWith('video') || file.type.startsWith('audio');
  const maxSize = isMedia ? 2 * 1024 * 1024 * 1024 : 100 * 1024 * 1024;
  if (file.size > maxSize) {
    alert('File too large!');
    return;
  }

  // Check file type against the supported extensions
  const allowed = [
    '.pdf', '.doc', '.docx', '.xlsx', '.xls', '.pptx', '.ppt',
    '.csv', '.md', '.json', '.txt', '.mp3', '.wav', '.mp4'
  ];
  const ext = '.' + file.name.split('.').pop().toLowerCase();
  if (!allowed.includes(ext)) {
    alert('Unsupported file type!');
    return;
  }

  const formData = new FormData();
  formData.append('datasetId', 'my-dataset');
  formData.append('file', file);

  try {
    const response = await fetch('https://api.easyrag.com/v1/files/upload', {
      method: 'POST',
      headers: { 'Authorization': `Bearer ${apiKey}` },
      body: formData
    });

    if (!response.ok) {
      const error = await response.json();
      if (error.error === 'INSUFFICIENT_CREDITS') {
        alert('Out of credits! Please top up in the dashboard.');
      } else {
        // Some errors carry only an `error` field, so fall back to it
        alert(`Upload failed: ${error.message ?? error.error}`);
      }
      return;
    }

    const result = await response.json();
    console.log('Success:', result);
  } catch (error) {
    console.error('Upload error:', error);
    alert('Network error. Please try again.');
  }
}

Best Practices

1. Validate Files Client-Side

Check file size and type before uploading:

javascript
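function validateFile(file) {
  const MB = 1024 * 1024;
  // Documents are limited to 100 MB; audio and video to 2 GB
  const documents = ['pdf', 'doc', 'docx', 'xlsx', 'xls', 'pptx', 'ppt', 'csv', 'md', 'json', 'txt'];
  const media = ['mp3', 'wav', 'mp4'];
  const ext = file.name.split('.').pop().toLowerCase();

  if (!documents.includes(ext) && !media.includes(ext)) {
    return { valid: false, error: 'Unsupported file type' };
  }

  const maxSize = media.includes(ext) ? 2048 * MB : 100 * MB;
  if (file.size > maxSize) {
    return { valid: false, error: 'File too large' };
  }

  return { valid: true };
}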

2. Use Meaningful Dataset Names

javascript
// ✅ Good: Descriptive dataset names
const userDataset = `user-${userId}-documents`;
const policyDataset = 'company-policies';
const supportDataset = 'support-docs-2024';

// ❌ Bad: Generic names
const genericDataset = 'dataset1';
const vagueDataset = 'files';

3. Add Useful Metadata

javascript
// ✅ Good: Metadata for filtering
const usefulMetadata = {
  'contract.pdf': {
    type: 'contract',
    client: 'Acme Corp',
    signedDate: '2024-01-15',
    value: 50000,
    department: 'sales'
  }
};

// ❌ Bad: No metadata or useless metadata
const uselessMetadata = {
  'contract.pdf': {
    uploaded: true,
    format: 'pdf' // Already known from filename
  }
};

4. Handle Upload State

javascript
// Show loading state
setUploading(true);

try {
  await uploadFile(file);
  // Show success message
  showToast('File uploaded successfully!');
} catch (error) {
  // Show error message
  showToast('Upload failed. Please try again.');
} finally {
  setUploading(false);
}

Billing

  • Cost: 1 credit per file
  • Bulk uploads: Each file costs 1 credit (10 files = 10 credits)
  • Failed uploads: Not charged
  • Re-uploads: Charged again (uploads don't replace existing files)
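
Because every file in a request costs one credit, a quick pre-flight check can warn users before a large batch is sent. The sketch below is hypothetical: `checkCredits` is an illustrative name, and `availableCredits` is assumed to be supplied by the caller, since this guide does not describe how to look up a balance.

```javascript
// Hypothetical pre-flight check based on the "1 credit per file" rule above.
// `availableCredits` must come from the caller; no balance lookup is shown here.
function checkCredits(fileCount, availableCredits) {
  const required = fileCount; // 1 credit per file, including re-uploads
  return {
    required,
    ok: availableCredits >= required,
    shortfall: Math.max(0, required - availableCredits)
  };
}
```

For a batch of 10 files with 7 credits left, `checkCredits(10, 7)` returns `{ required: 10, ok: false, shortfall: 3 }`.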
