Developer Portal

Documentation

Everything you need to integrate Vetra's powerful multi-modal AI infrastructure into your production applications.

Introduction

Vetra provides a unified API for accessing the world's most advanced AI models. Whether you need text generation, image creation, or complex video synthesis, our infrastructure handles the heavy lifting.

Unified Access

One API key for models from Anthropic, OpenAI, DeepSeek, Google, and more.

Enterprise Scale

99.99% uptime guarantee and global low-latency edge deployment.

Authentication

Secure your requests using standard Bearer token authentication. Your API key should be included in the Authorization header.

HTTP Header
Authorization: Bearer YOUR_API_KEY
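
As a minimal sketch, the header above can be built once and reused for every request. The helper name authHeaders and the VETRA_API_KEY environment variable are hypothetical, not part of any official Vetra SDK, and the default value assumes a Node.js runtime.

```javascript
// Hypothetical helper: builds the Bearer header described above.
// ASSUMPTION: the key is kept in an environment variable named VETRA_API_KEY
// so that it is not hard-coded in source files.
function authHeaders(apiKey = process.env.VETRA_API_KEY) {
  if (!apiKey) {
    throw new Error('Missing API key');
  }
  return {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  };
}
```

Passing the key explicitly, as in authHeaders('YOUR_API_KEY'), skips the environment lookup.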

Base URL

All API requests should be made to our primary global endpoint:

Endpoint
https://vetraai.vercel.app/v1
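
One way to keep paths consistent is a small helper that joins the base URL and a route. The name endpoint is hypothetical. The sketch uses plain string concatenation on purpose: resolving a path with a leading slash against the base via new URL('/chat/completions', base) would discard the /v1 prefix.

```javascript
const VETRA_BASE_URL = 'https://vetraai.vercel.app/v1';

// Hypothetical helper: joins the base URL and a route such as 'chat/completions'.
// Leading slashes are stripped so both 'x' and '/x' produce the same result.
function endpoint(path) {
  return `${VETRA_BASE_URL}/${path.replace(/^\/+/, '')}`;
}
```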

Infinite Memory Recall

Chat sessions across all models support automatic, persistent recall. Your AI agents can remember user preferences, past interactions, and complex context across days, weeks, or months.

Zero Latency

Instant recall integrated directly into the inference stream.

User Isolation

Cryptographically isolated memory silos per end-user.

Auto-Sync

Every response is automatically indexed for future recall.

Activating Memory

Memory is enabled by default on the /v1/chat/completions endpoint. Send a unique x-user-id header with each request to start building a persistent profile for that end-user.

// Memory is AUTOMATIC when you provide an x-user-id
const response = await fetch('https://vetraai.vercel.app/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
    'x-user-id': 'customer_unique_id_99' // <--- This activates isolated memory!
  },
  body: JSON.stringify({
    model: 'openai',
    messages: [{ role: 'user', content: 'My favorite coffee is an Oat Milk Latte.' }]
  })
});

How it works

When a request includes x-user-id, Vetra can associate it with other requests from the same end-user, so context-aware features work consistently across sessions.
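
To illustrate, the sketch below sends two separate requests that share one x-user-id. The helper names buildChatRequest and askAsUser are hypothetical. Whether the second reply actually draws on the first message depends on the memory behavior described above, so no particular output is assumed here.

```javascript
const CHAT_URL = 'https://vetraai.vercel.app/v1/chat/completions';

// Hypothetical helper: builds the fetch arguments for one chat message
// sent on behalf of a given end-user.
function buildChatRequest(apiKey, userId, content, model = 'openai') {
  return {
    url: CHAT_URL,
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        'x-user-id': userId // the same ID on every request for this end-user
      },
      body: JSON.stringify({ model, messages: [{ role: 'user', content }] })
    }
  };
}

// Sends one message and returns the parsed JSON body.
async function askAsUser(apiKey, userId, content) {
  const { url, options } = buildChatRequest(apiKey, userId, content);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage (not executed here): two independent requests, one shared user ID.
// await askAsUser(key, 'customer_unique_id_99', 'My favorite coffee is an Oat Milk Latte.');
// await askAsUser(key, 'customer_unique_id_99', 'What coffee should I order?');
```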

Chat Completions

Multiple Providers

Generate text completions using industry-leading models like GPT-4o, Claude 3.5 Sonnet, DeepSeek V3, and more.

POST /v1/chat/completions

Request Body

Parameter   Type      Required   Description
model       string    Yes        The model ID to use (e.g., "gpt-4o", "claude-3-5-sonnet")
messages    array     Yes        Array of objects with role and content
stream      boolean   No         Enable Server-Sent Events (SSE) streaming
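
The parameter table can be turned into a small client-side check that runs before a request is sent. This is only a sketch: validateChatBody is a hypothetical helper, and the server may apply stricter or different rules than the ones listed in the table.

```javascript
// Hypothetical helper: checks a request body against the parameter table.
// Returns a list of problems; an empty list means the body looks well-formed.
function validateChatBody(body) {
  const errors = [];
  if (typeof body?.model !== 'string' || body.model.length === 0) {
    errors.push('model is required and must be a string');
  }
  if (!Array.isArray(body?.messages)) {
    errors.push('messages is required and must be an array');
  } else {
    const wellFormed = body.messages.every(
      (m) => m !== null && typeof m === 'object' && 'role' in m && 'content' in m
    );
    if (!wellFormed) {
      errors.push('each message must be an object with role and content');
    }
  }
  if (body?.stream !== undefined && typeof body.stream !== 'boolean') {
    errors.push('stream must be a boolean when provided');
  }
  return errors;
}
```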

Example Usage

curl -X POST https://vetraai.vercel.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
  }'
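
When stream is set to true, the response arrives as Server-Sent Events. The sketch below reads such a stream incrementally. It assumes an OpenAI-style chunk format (lines of the form data: {...} carrying choices[0].delta.content, ending with data: [DONE]); that format is an assumption, not something this page confirms, and the function names are hypothetical.

```javascript
// Extracts text fragments from a block of complete SSE lines.
// ASSUMPTION: OpenAI-style chunks, as described in the paragraph above.
function extractStreamText(sseText) {
  const parts = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice(5).trim();
    if (payload === '' || payload === '[DONE]') continue;
    try {
      const text = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (typeof text === 'string') parts.push(text);
    } catch {
      // Skip payloads that are not valid JSON.
    }
  }
  return parts;
}

// Streams a chat completion and calls onText for each text fragment.
async function streamChat(apiKey, messages, onText) {
  const response = await fetch('https://vetraai.vercel.app/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model: 'openai', messages, stream: true })
  });
  if (!response.ok || !response.body) {
    throw new Error(`Stream request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Only parse up to the last newline; keep any partial line for the next read.
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    extractStreamText(buffer.slice(0, lastNewline)).forEach(onText);
    buffer = buffer.slice(lastNewline + 1);
  }
  extractStreamText(buffer).forEach(onText);
}
```

Buffering up to the last newline matters because a network chunk can end in the middle of a line.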

Image Generation

Create high-quality images from text descriptions using the OpenAI standard format.

POST /v1/images/generations

Example Usage

const response = await fetch('https://vetraai.vercel.app/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: "Cyberpunk city with neon lights",
    model: "flux",
    response_format: "url"
  })
});

const { data } = await response.json();
const imageUrl = data[0].url;

Note: For direct binary responses, you can still use the legacy /api/image endpoint.

Video Generation

Generate cinematic videos from text prompts.

POST /v1/video/generations

Example Usage

const response = await fetch('https://vetraai.vercel.app/v1/video/generations', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: "A cat playing with a red ball",
    model: "veo"
  })
});

const { data } = await response.json();
const videoUrl = data[0].url;

Note: For direct binary responses, you can still use the legacy /api/video endpoint.
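
The image and video examples both read data[0].url without checking the response. A slightly more defensive sketch is shown below. It assumes the response shape implied by those examples, { data: [{ url }] }, and the helper names firstMediaUrl and generateMedia are hypothetical.

```javascript
// Returns the first URL from a generation response, or throws a clear error.
// ASSUMPTION: the body has the shape { data: [{ url: '...' }] }.
function firstMediaUrl(payload) {
  const url = payload?.data?.[0]?.url;
  if (typeof url !== 'string' || url.length === 0) {
    throw new Error('Response did not contain data[0].url');
  }
  return url;
}

// kind is 'images' or 'video', matching the two generation endpoints.
async function generateMedia(kind, body, apiKey) {
  const response = await fetch(`https://vetraai.vercel.app/v1/${kind}/generations`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  });
  if (!response.ok) {
    throw new Error(`${kind} generation failed with status ${response.status}`);
  }
  return firstMediaUrl(await response.json());
}

// Usage (not executed here):
// const videoUrl = await generateMedia('video', { prompt: 'A cat playing with a red ball', model: 'veo' }, key);
```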

Custom Headers

Control how Vetra identifies your users and manages their data isolation.

Header         Description
x-user-id      A unique identifier for your end-user. Used to isolate request context per user.
X-App-Source   Automatically set to "Vetra-API" for API requests. Can be used for custom tracking.

Rate Limits

To ensure fair usage and protect our upstream providers, Vetra enforces daily rate limits based on your plan.

Plan         Chat         Image       Video
Free Plan    100 / day    10 / day    5 / day
Pro Plan     1000 / day   50 / day    5 / day
Ultra Plan   2500 / day   100 / day   10 / day

*Default limits for Pro plans. If you require higher throughput, join our Discord for custom quota adjustments.

Headers

Rate limit info is returned in the X-RateLimit-Remaining and X-RateLimit-Reset headers of every response.
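
A client can read these headers after each response and pause when the quota runs out. The sketch below is a hypothetical helper. It treats X-RateLimit-Remaining as an integer count, which is an assumption, and returns X-RateLimit-Reset unparsed because its format (timestamp or seconds) is not specified here.

```javascript
// Hypothetical helper: reads rate limit info from a fetch Response's headers
// (or any object exposing a get(name) method).
function readRateLimit(headers) {
  const raw = headers.get('X-RateLimit-Remaining');
  const parsed = raw == null ? null : Number.parseInt(raw, 10);
  const remaining = parsed === null || Number.isNaN(parsed) ? null : parsed;
  return {
    remaining,                               // null when missing or not a number
    reset: headers.get('X-RateLimit-Reset'), // raw value; format not documented here
    exhausted: remaining === 0
  };
}

// Usage (not executed here):
// const info = readRateLimit(response.headers);
// if (info.exhausted) console.warn('Quota used up; reset value:', info.reset);
```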

Data Handling & Transparency

We believe in full transparency regarding how your data is handled. Vetra acts as a stateless router: we do not store your chat logs or generated content on our own servers. Instead, we forward requests to specialized upstream providers.

How Routing Works

When you make a request, Vetra performs the following steps:

  1. Authentication: Verifies your API key or session.
  2. User Identification: Determines the user ID for storage isolation (see below).
  3. Provider Selection: Dynamically routes the request to our upstream infrastructure.
  4. Header Propagation: Passes custom headers downstream so providers can handle data isolation and tracking.

User Identification Chain

To ensure your users' data (like memory or images) is kept separate, we use a strict priority chain to identify the requester:

  1. Client-Provided Header: If you pass an x-user-id header, it takes absolute priority.
  2. API Key Owner: If no header is present, the request is tied to the account that owns the API key.
  3. Session User: For website requests, we use the authenticated Kinde session user.
  4. Anonymous IP: If all else fails, requests are tied to the requester's IP address.
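
The priority chain above can be sketched as a simple fallback function. This is an illustration of the ordering only, not Vetra's actual implementation; the function name resolveUserId and the field names on its argument are hypothetical.

```javascript
// Illustrative sketch of the priority chain: the first available
// identifier wins, in the order listed above.
function resolveUserId({ headerUserId, apiKeyOwnerId, sessionUserId, ip } = {}) {
  if (headerUserId) return { source: 'header', id: headerUserId };
  if (apiKeyOwnerId) return { source: 'api-key', id: apiKeyOwnerId };
  if (sessionUserId) return { source: 'session', id: sessionUserId };
  if (ip) return { source: 'ip', id: ip };
  return null; // nothing to identify the requester by
}
```

Because the header takes priority, a request sent with an API key and an x-user-id is attributed to the header value rather than to the key owner.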

Where is data stored?

Data is stored at the edge provider level:

  • Memory: Context handling depends on your configured backend features and user ID strategy.
  • Images/Video: Generated media may be temporarily cached by our upstream providers for retrieval.
  • Logs: Vetra only logs metadata (request count, model used) for billing and rate-limiting purposes.

Need more help? Join our community or contact support.