
AI Model Access

Access 400+ AI models through a single API — GPT-4, Claude, Gemini, Llama, and more — without managing separate accounts or API keys.

import { withSapiom } from "@sapiom/axios";
import axios from "axios";

const client = withSapiom(axios.create(), {
  apiKey: process.env.SAPIOM_API_KEY,
});

const { data } = await client.post(
  "https://openrouter.services.sapiom.ai/v1/chat/completions",
  {
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Hello, world!" }],
    max_tokens: 100,
  }
);

console.log(data.choices[0].message.content);

Sapiom routes requests through OpenRouter, which provides unified access to AI models from all major providers. The SDK handles payment negotiation automatically — you pay per token based on the model you use.

The API is OpenAI-compatible, so you can use familiar patterns:

  • Standard chat completions format
  • System, user, and assistant message roles
  • Temperature, top_p, and other generation parameters
  • Function/tool calling (on supported models)
  • JSON mode for structured outputs
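
The last item follows the same OpenAI-compatible shape. A minimal sketch of a JSON-mode request body, assuming the `client` from the snippet above (model support for JSON mode varies):

```typescript
// Sketch: a chat-completions request body using JSON mode via response_format.
// The body shape is OpenAI-compatible; the specific prompt is illustrative.
const jsonModeRequest = {
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "system", content: "Reply with a single JSON object only." },
    { role: "user", content: "List three primary colors." },
  ],
  max_tokens: 200,
  response_format: { type: "json_object" },
};

// const { data } = await client.post(
//   "https://openrouter.services.sapiom.ai/v1/chat/completions",
//   jsonModeRequest
// );
// JSON.parse(data.choices[0].message.content); // JSON mode makes this parse cleanly
```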

You authorize a maximum cost based on max_tokens, but only pay for actual tokens generated.

Powered by OpenRouter. OpenRouter aggregates 400+ models from OpenAI, Anthropic, Google, Meta, Mistral, and others.

Endpoint: POST https://openrouter.services.sapiom.ai/v1/chat/completions

Create a chat completion using any supported model.

Parameter          Type      Required  Description
model              string    Yes       Full model name with provider (e.g., openai/gpt-4o-mini)
messages           array     Yes       Array of message objects with role and content
max_tokens         number    Yes       Maximum tokens to generate (required for cost calculation)
temperature        number    No        Sampling temperature (0-2)
top_p              number    No        Nucleus sampling (0-1)
frequency_penalty  number    No        Frequency penalty (-2 to 2)
presence_penalty   number    No        Presence penalty (-2 to 2)
stop               string[]  No        Stop sequences
tools              array     No        Tool definitions for function calling
response_format    object    No        { "type": "json_object" } for JSON mode
Example request:

{
  "model": "openai/gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is TypeScript?" }
  ],
  "max_tokens": 500,
  "temperature": 0.7
}
Example response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "TypeScript is a strongly typed programming language..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
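
In practice you usually only need a few fields from this response. A small sketch of pulling them out (the interface below is our own minimal typing of the OpenAI-compatible shape shown above, not an official SDK type):

```typescript
// Minimal typing of the fields used here; the real response has more.
interface ChatCompletion {
  choices: { message: { role: string; content: string }; finish_reason: string }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Extract the assistant text, whether the reply was cut off, and billed tokens.
function summarize(data: ChatCompletion) {
  return {
    reply: data.choices[0].message.content,
    truncated: data.choices[0].finish_reason === "length", // hit max_tokens
    billedTokens: data.usage.total_tokens,
  };
}
```

A `finish_reason` of "length" means the model ran out of `max_tokens`; raise it if you need longer answers.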

Endpoint: POST https://openrouter.services.sapiom.ai/v1/chat/completions/price

Get a price estimate without executing the request.

const { data } = await client.post(
  "https://openrouter.services.sapiom.ai/v1/chat/completions/price",
  {
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
    max_tokens: 100,
  }
);

console.log(`Maximum cost: ${data.price}`);

Response:

{
  "price": "$0.000025",
  "currency": "USD",
  "model": "openai/gpt-4o-mini",
  "estimatedInputTokens": 10,
  "maxOutputTokens": 100,
  "scheme": "upto"
}
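
One common use of this endpoint is a pre-flight budget check before executing the real request. A sketch, assuming the `price` string format shown above (the helper names are ours):

```typescript
// Turn the gateway's formatted price string (e.g. "$0.000025") into a number.
function parsePrice(price: string): number {
  return Number(price.replace(/[^0-9.]/g, ""));
}

// True if the quoted maximum cost fits within a caller-chosen USD budget.
function withinBudget(price: string, budgetUsd: number): boolean {
  return parsePrice(price) <= budgetUsd;
}

// const { data: quote } = await client.post(
//   "https://openrouter.services.sapiom.ai/v1/chat/completions/price",
//   requestBody
// );
// if (withinBudget(quote.price, 0.01)) {
//   // safe to execute the real /chat/completions request
// }
```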

OpenRouter provides access to 400+ models. Here are some popular options:

Provider   Models
OpenAI     openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-4-turbo
Anthropic  anthropic/claude-3.5-sonnet, anthropic/claude-3-opus
Google     google/gemini-pro, google/gemini-flash
Meta       meta-llama/llama-3.1-405b, meta-llama/llama-3.2-90b
Mistral    mistralai/mistral-large, mistralai/mixtral-8x7b

See the OpenRouter Models page for the full list and pricing.

import { withSapiom } from "@sapiom/axios";
import axios from "axios";

const client = withSapiom(axios.create(), {
  apiKey: process.env.SAPIOM_API_KEY,
});

const baseUrl = "https://openrouter.services.sapiom.ai/v1";

async function chat(userMessage: string) {
  const { data } = await client.post(`${baseUrl}/chat/completions`, {
    model: "openai/gpt-4o-mini",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userMessage },
    ],
    max_tokens: 500,
    temperature: 0.7,
  });
  return data.choices[0].message.content;
}

// Usage
const response = await chat("Explain quantum computing in simple terms");
console.log(response);
Code  Description
400   Invalid request — check model name and parameters
402   Payment required — ensure you’re using the Sapiom SDK
404   Model not found — use full model name with provider prefix
429   Rate limit exceeded
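
With axios, these arrive as rejected promises carrying the HTTP status. A sketch of handling them (the message strings paraphrase the table above; `describeError` is our own helper, not part of the SDK):

```typescript
// Map gateway status codes to actionable messages for logging or retry logic.
function describeError(status: number): string {
  switch (status) {
    case 400: return "Invalid request: check model name and parameters";
    case 402: return "Payment required: ensure you're using the Sapiom SDK";
    case 404: return "Model not found: use full model name with provider prefix";
    case 429: return "Rate limit exceeded: retry with backoff";
    default: return `Unexpected error (HTTP ${status})`;
  }
}

// try {
//   await client.post(`${baseUrl}/chat/completions`, body);
// } catch (err) {
//   if (axios.isAxiosError(err) && err.response) {
//     console.error(describeError(err.response.status));
//   }
// }
```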

Missing max_tokens: The gateway requires max_tokens to calculate maximum cost:

// Wrong — missing max_tokens
{ model: "openai/gpt-4o-mini", messages: [...] }
// Correct
{ model: "openai/gpt-4o-mini", messages: [...], max_tokens: 100 }

Model not found: Use the full model name with provider prefix:

// Wrong
{ model: "gpt-4o-mini" }
// Correct
{ model: "openai/gpt-4o-mini" }
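
A cheap way to catch the bare-name mistake before it hits the gateway is a guard like this (the helper is illustrative, not part of the SDK):

```typescript
// Reject model names that lack the provider prefix, suggesting the likely fix.
function requireProviderPrefix(model: string): string {
  if (!model.includes("/")) {
    throw new Error(
      `Model "${model}" must include a provider prefix, e.g. "openai/${model}"`
    );
  }
  return model;
}
```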

Pricing varies by model and is based on token usage:

  • Input tokens: Cost per 1M tokens (varies by model)
  • Output tokens: Cost per 1M tokens (varies by model)

The gateway uses an “upto” payment scheme — you authorize maximum cost based on max_tokens, but only pay for tokens actually generated.
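
The "upto" math can be sketched in a few lines. The per-token rates below are illustrative examples only; real rates vary by model:

```typescript
// Example rates: $0.15 / 1M input tokens, $0.60 / 1M output tokens (illustrative).
const inputRatePerToken = 0.15 / 1_000_000;
const outputRatePerToken = 0.60 / 1_000_000;

// Authorized maximum: assumes every one of max_tokens is generated.
function maxCost(inputTokens: number, maxTokens: number): number {
  return inputTokens * inputRatePerToken + maxTokens * outputRatePerToken;
}

// Actual charge: based on the completion_tokens the model really produced.
function actualCost(inputTokens: number, completionTokens: number): number {
  return inputTokens * inputRatePerToken + completionTokens * outputRatePerToken;
}
```

So for the earlier example (25 prompt tokens, `max_tokens: 500`, 150 tokens generated), you authorize the 500-token maximum but are charged only for the 150 tokens produced.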

See OpenRouter Pricing for per-model rates.