Documentation
Everything you need to integrate Vetra's powerful multi-modal AI infrastructure into your production applications.
Introduction
Vetra provides a unified API for accessing the world's most advanced AI models. Whether you need text generation, image creation, or complex video synthesis, our infrastructure handles the heavy lifting.
Unified Access
One API key for our massive array of models from Claude, OpenAI, DeepSeek, Google, and more.
Enterprise Scale
99.99% uptime guarantee and global low-latency edge deployment.
Authentication
Secure your requests using standard Bearer token authentication. Your API key should be included in the Authorization header.
Authorization: Bearer YOUR_API_KEY
Base URL
All API requests should be made to our primary global endpoint:
https://vetraai.vercel.app/v1
Infinite Memory Recall
Every chat session across all models features automatic, persistent long-term memory. Your AI agents can remember user preferences, past interactions, and complex context across days, weeks, or months.
Zero Latency
Instant recall integrated directly into the inference stream.
User Isolation
Cryptographically isolated memory silos per end-user.
Auto-Sync
Every response is automatically indexed for future recall.
Activating Memory
Memory is enabled by default on the /v1/chat/completions endpoint. Simply provide a unique x-user-id header to start building a persistent cognitive profile for your user.
// Memory is AUTOMATIC when you provide an x-user-id header
const response = await fetch('https://vetraai.vercel.app/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
    'x-user-id': 'customer_unique_id_99' // <--- This activates isolated memory!
  },
  body: JSON.stringify({
    model: 'openai',
    messages: [{ role: 'user', content: 'My favorite coffee is an Oat Milk Latte.' }]
  })
});
How it works
When a request includes x-user-id, Vetra associates it with the same end-user, so context-aware features such as memory work consistently across sessions.
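As a rough sketch of how one x-user-id ties separate sessions together, the helper below builds the request for a chat completion. The names buildChatRequest and askFollowUp are hypothetical and not part of Vetra's API. The endpoint, headers, and body fields are taken from the example above.

```javascript
// Hypothetical helper (not part of Vetra's API): builds the URL and fetch
// options for a chat completion tied to one end-user.
const BASE_URL = 'https://vetraai.vercel.app/v1';

function buildChatRequest(apiKey, userId, content, model = 'openai') {
  return {
    url: `${BASE_URL}/chat/completions`,
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        'x-user-id': userId // send the same ID on every request for this end-user
      },
      body: JSON.stringify({ model, messages: [{ role: 'user', content }] })
    }
  };
}

// A later session reuses the same x-user-id, so the request is associated
// with the same end-user as the earlier one.
async function askFollowUp(apiKey) {
  const { url, options } = buildChatRequest(
    apiKey, 'customer_unique_id_99', 'What coffee should I order?');
  const response = await fetch(url, options);
  return response.json();
}
```

Keeping the ID in one place makes it harder to accidentally send different identifiers for the same end-user, which would split their context across profiles.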
Chat Completions
Generate text completions using industry-leading models like GPT-4o, Claude 3.5 Sonnet, DeepSeek V3, and more.
/v1/chat/completions
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use (e.g., "gpt-4o", "claude-3-5-sonnet") |
| messages | array | Yes | Array of objects with role and content |
| stream | boolean | No | Enable Server-Sent Events (SSE) streaming |
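To illustrate the stream parameter, here is a minimal sketch of reading a streamed response. It assumes the stream follows the OpenAI-style SSE convention, with one "data: <json>" line per event and a final "data: [DONE]" line. The documentation above only states that SSE is used, so check that framing against real responses. parseSseChunk and streamChat are hypothetical names.

```javascript
// Assumption: each event is a line of the form "data: <json>" and the stream
// ends with "data: [DONE]". Only the use of SSE itself is documented.
function parseSseChunk(chunk) {
  const payloads = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank and non-data lines
    const data = trimmed.slice('data:'.length).trim();
    if (data === '[DONE]') break; // assumed end-of-stream sentinel
    payloads.push(JSON.parse(data));
  }
  return payloads;
}

// Hypothetical usage: reads the response body incrementally and passes each
// parsed event to onEvent.
async function streamChat(apiKey, messages, onEvent) {
  const response = await fetch('https://vetraai.vercel.app/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model: 'openai', messages, stream: true })
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    // Parse only complete lines; keep the partial tail for the next read.
    parseSseChunk(buffer.slice(0, lastNewline)).forEach(onEvent);
    buffer = buffer.slice(lastNewline + 1);
  }
}
```

Buffering up to the last newline matters because a network read can end in the middle of a line, and parsing a partial JSON payload would throw.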
Example Usage
curl -X POST https://vetraai.vercel.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
  }'
Image Generation
Create high-quality images from text descriptions using the OpenAI standard format.
/v1/images/generations
Example Usage
const response = await fetch('https://vetraai.vercel.app/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: "Cyberpunk city with neon lights",
    model: "flux",
    response_format: "url"
  })
});
const { data } = await response.json();
const imageUrl = data[0].url;
Note: For direct binary responses, you can still use the legacy /api/image endpoint.
Video Generation
Generate cinematic videos from text prompts.
/v1/video/generations
Example Usage
const response = await fetch('https://vetraai.vercel.app/v1/video/generations', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: "A cat playing with a red ball",
    model: "veo"
  })
});
const { data } = await response.json();
const videoUrl = data[0].url;
Note: For direct binary responses, you can still use the legacy /api/video endpoint.
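Both generation examples above read the media URL from data[0].url. As a small sketch, the hypothetical helper below (extractMediaUrl is not part of Vetra's API) checks for that shape before using it, so an unexpected or error response fails with a clear message instead of a TypeError. The shape of Vetra's error responses is not documented here, so the check is deliberately generic.

```javascript
// Hypothetical helper: returns data[0].url from a parsed generation response,
// or throws if the response does not have that shape.
function extractMediaUrl(json) {
  const first = json && Array.isArray(json.data) ? json.data[0] : undefined;
  if (!first || typeof first.url !== 'string') {
    throw new Error('Response did not contain data[0].url');
  }
  return first.url;
}
```

It can replace the destructuring in either example, for instance `const videoUrl = extractMediaUrl(await response.json());`.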
Custom Headers
Control how Vetra identifies your users and manages their data isolation.
| Header | Description |
|---|---|
| x-user-id | A unique identifier for your end-user. Used to isolate request context per user. |
| X-App-Source | Automatically set to "Vetra-API" for API requests. Can be used for custom tracking. |
Rate Limits
To ensure fair usage and protect our upstream providers, Vetra enforces daily rate limits based on your plan.
| User Type | Chat | Image | Video |
|---|---|---|---|
| Free Plan | 100 / day | 10 / day | 5 / day |
| Pro Plan | 1000 / day | 50 / day | 5 / day |
| Ultra Plan | 2500 / day | 100 / day | 10 / day |
*Default limits for Pro plans. If you require higher throughput, join our Discord for custom quota adjustments.
Headers
Rate limit info is returned in the X-RateLimit-Remaining and X-RateLimit-Reset headers of every response.
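A minimal sketch of reading these headers from a response follows. readRateLimit is a hypothetical name. The documentation does not say whether X-RateLimit-Reset is a timestamp or a number of seconds, so the sketch returns it uninterpreted.

```javascript
// Reads the rate limit headers from any object with a get(name) method,
// such as the `headers` property of a fetch Response.
// Assumption: X-RateLimit-Remaining is a numeric string. The format of
// X-RateLimit-Reset is unspecified, so its raw value is returned as-is.
function readRateLimit(headers) {
  const remaining = headers.get('X-RateLimit-Remaining');
  const reset = headers.get('X-RateLimit-Reset');
  return {
    remaining: remaining == null ? null : Number(remaining),
    reset: reset == null ? null : reset
  };
}
```

With fetch, this could be called as `readRateLimit(response.headers)` to decide whether to pause before sending the next request.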
Data Handling & Transparency
We believe in full transparency regarding how your data is handled. Vetra acts as a stateless router—we do not store your chat logs or generated content on our own servers. Instead, we propagate requests to specialized upstream providers.
How Routing Works
When you make a request, Vetra performs the following steps:
- Authentication: Verifies your API key or session.
- User Identification: Determines the user ID for storage isolation (see below).
- Provider Selection: Dynamically routes the request to our upstream infrastructure.
- Header Propagation: Passes custom headers downstream so providers can handle data isolation and tracking.
User Identification Chain
To ensure your users' data (like memory or images) is kept separate, we use a strict priority chain to identify the requester:
1. Client-Provided Header: If you pass an x-user-id header, it takes absolute priority.
2. API Key Owner: If no header is present, the request is tied to the account that owns the API key.
3. Session User: For website requests, we use the authenticated Kinde session user.
4. Anonymous IP: If all else fails, requests are tied to the requester's IP address.
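The priority chain above can be sketched as a simple fallthrough. This illustrates the documented order only and is not Vetra's server code. The function name and every field on the `ctx` object are hypothetical.

```javascript
// Illustrative only: mirrors the documented priority order.
// All field names on `ctx` are hypothetical.
function resolveUserId(ctx) {
  if (ctx.headerUserId) return { source: 'header', id: ctx.headerUserId };    // 1. x-user-id header
  if (ctx.apiKeyOwnerId) return { source: 'api-key', id: ctx.apiKeyOwnerId }; // 2. API key owner
  if (ctx.sessionUserId) return { source: 'session', id: ctx.sessionUserId }; // 3. session user
  return { source: 'ip', id: ctx.ip };                                        // 4. anonymous IP
}
```

One practical consequence of this order: server-side integrations that omit x-user-id share a single identity (the API key owner) across all of their end-users, so the header should be set whenever per-user isolation matters.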
Where is data stored?
Data is stored at the edge provider level:
- Memory: Context handling depends on your configured backend features and user ID strategy.
- Images/Video: Generated media may be temporarily cached by our upstream providers for retrieval.
- Logs: Vetra only logs metadata (request count, model used) for billing and rate-limiting purposes.
Need more help? Join our community or contact support.