# AI Model Access

Access 400+ AI models through a single API — GPT-4, Claude, Gemini, Llama, and more — without managing separate accounts or API keys.

## Quick Example

**Axios:**
```typescript
import { withSapiom } from "@sapiom/axios";
import axios from "axios";

const client = withSapiom(axios.create(), {
  apiKey: process.env.SAPIOM_API_KEY,
});

const { data } = await client.post(
  "https://openrouter.services.sapiom.ai/v1/chat/completions",
  {
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Hello, world!" }],
    max_tokens: 100,
  }
);

console.log(data.choices[0].message.content);
```
  
**Fetch:**
```typescript
import { createFetch } from "@sapiom/fetch";

const fetch = createFetch({
  apiKey: process.env.SAPIOM_API_KEY,
});

const response = await fetch(
  "https://openrouter.services.sapiom.ai/v1/chat/completions",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "openai/gpt-4o-mini",
      messages: [{ role: "user", content: "Hello, world!" }],
      max_tokens: 100,
    }),
  }
);

const data = await response.json();
console.log(data.choices[0].message.content);
```
  
## How It Works

Sapiom routes requests through [OpenRouter](https://openrouter.ai), which provides unified access to AI models from all major providers. The SDK handles payment negotiation automatically — you pay per token based on the model you use.

The API is OpenAI-compatible, so you can use familiar patterns:

- Standard chat completions format
- System, user, and assistant message roles
- Temperature, top_p, and other generation parameters
- Function/tool calling (on supported models)
- JSON mode for structured outputs

You authorize a maximum cost based on `max_tokens`, but only pay for actual tokens generated.
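As a minimal sketch of the OpenAI-compatible features listed above, the snippet below builds a request body that combines a system message, a generation parameter, and JSON mode. The `buildJsonModeRequest` helper and the `ChatMessage` type are hypothetical names introduced for illustration; only the field names come from this page.

```typescript
// Hypothetical helper types; field names follow the OpenAI-compatible format.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Builds a chat completion body that asks for a JSON object in the reply.
function buildJsonModeRequest(question: string, maxTokens: number) {
  const messages: ChatMessage[] = [
    // With JSON mode, the prompt itself should also ask for JSON output.
    { role: "system", content: "Reply with a JSON object." },
    { role: "user", content: question },
  ];
  return {
    model: "openai/gpt-4o-mini",
    messages,
    max_tokens: maxTokens,
    temperature: 0.2,
    response_format: { type: "json_object" as const },
  };
}

const body = buildJsonModeRequest("List three primary colors.", 200);
console.log(JSON.stringify(body, null, 2));
```

The resulting object can be passed as the request body to either client shown in the Quick Example.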

## Provider

Powered by [OpenRouter](https://openrouter.ai). OpenRouter aggregates 400+ models from OpenAI, Anthropic, Google, Meta, Mistral, and others.

## API Reference

### Chat Completions

**Endpoint:** `POST https://openrouter.services.sapiom.ai/v1/chat/completions`

Create a chat completion using any supported model.

#### Request

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Full model name with provider (e.g., `openai/gpt-4o-mini`) |
| `messages` | array | Yes | Array of message objects with `role` and `content` |
| `max_tokens` | number | Yes | Maximum tokens to generate (required for cost calculation) |
| `temperature` | number | No | Sampling temperature (0-2) |
| `top_p` | number | No | Nucleus sampling (0-1) |
| `frequency_penalty` | number | No | Frequency penalty (-2 to 2) |
| `presence_penalty` | number | No | Presence penalty (-2 to 2) |
| `stop` | string[] | No | Stop sequences |
| `tools` | array | No | Tool definitions for function calling |
| `response_format` | object | No | `{ "type": "json_object" }` for JSON mode |

```json
{
  "model": "openai/gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is TypeScript?" }
  ],
  "max_tokens": 500,
  "temperature": 0.7
}
```
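The parameter table can be transcribed into a type plus a small range check, which may catch some invalid requests before they are sent. This is a sketch only: `ChatCompletionRequest` and `validateRequest` are illustrative names, not part of the Sapiom SDK, and the "positive integer" rule for `max_tokens` is an assumption rather than documented gateway behavior.

```typescript
// Request shape transcribed from the parameter table above (illustrative names).
interface ChatCompletionRequest {
  model: string;
  messages: { role: string; content: string }[];
  max_tokens: number;
  temperature?: number;
  top_p?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  stop?: string[];
  tools?: unknown[];
  response_format?: { type: "json_object" };
}

// Returns a list of problems; an empty list means the documented ranges hold.
function validateRequest(req: ChatCompletionRequest): string[] {
  const problems: string[] = [];
  const inRange = (value: number | undefined, min: number, max: number): boolean =>
    value === undefined || (value >= min && value <= max);

  // Assumption: max_tokens should be a positive integer.
  if (!Number.isInteger(req.max_tokens) || req.max_tokens <= 0) {
    problems.push("max_tokens must be a positive integer");
  }
  if (!inRange(req.temperature, 0, 2)) problems.push("temperature must be between 0 and 2");
  if (!inRange(req.top_p, 0, 1)) problems.push("top_p must be between 0 and 1");
  if (!inRange(req.frequency_penalty, -2, 2)) problems.push("frequency_penalty must be between -2 and 2");
  if (!inRange(req.presence_penalty, -2, 2)) problems.push("presence_penalty must be between -2 and 2");
  return problems;
}

// Usage: temperature 2.5 is outside the documented 0-2 range.
const found = validateRequest({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "What is TypeScript?" }],
  max_tokens: 500,
  temperature: 2.5,
});
console.log(found);
```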

#### Response

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "TypeScript is a strongly typed programming language..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
```
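A typed view of this response makes it easier to read the reply and token usage safely. The sketch below mirrors the fields in the sample response; `ChatCompletionResponse` and `summarize` are hypothetical names. It relies on the OpenAI convention that a `finish_reason` of `"length"` means the output was cut off at `max_tokens`.

```typescript
// Response shape mirroring the sample above (illustrative type name).
interface ChatCompletionResponse {
  id: string;
  object: string;
  created: number;
  model: string;
  choices: {
    index: number;
    message: { role: string; content: string };
    finish_reason: string;
  }[];
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

// Extracts the first reply, whether it was cut off, and the tokens used.
function summarize(res: ChatCompletionResponse) {
  const first = res.choices[0];
  return {
    text: first ? first.message.content : "",
    truncated: first ? first.finish_reason === "length" : false,
    tokensUsed: res.usage.total_tokens,
  };
}

// Usage with the sample response from this section.
const sample: ChatCompletionResponse = {
  id: "chatcmpl-abc123",
  object: "chat.completion",
  created: 1700000000,
  model: "openai/gpt-4o-mini",
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: "TypeScript is a strongly typed programming language...",
      },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 25, completion_tokens: 150, total_tokens: 175 },
};

const summary = summarize(sample);
console.log(summary.text, summary.truncated, summary.tokensUsed);
```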

### Price Estimate

**Endpoint:** `POST https://openrouter.services.sapiom.ai/v1/chat/completions/price`

Get a price estimate without executing the request.

```typescript
const { data } = await client.post(
  "https://openrouter.services.sapiom.ai/v1/chat/completions/price",
  {
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
    max_tokens: 100,
  }
);

console.log(`Maximum cost: ${data.price}`);
```

Response:

```json
{
  "price": "$0.000025",
  "currency": "USD",
  "model": "openai/gpt-4o-mini",
  "estimatedInputTokens": 10,
  "maxOutputTokens": 100,
  "scheme": "upto"
}
```
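One possible use of the estimate is a budget check before sending the real request. The sketch below assumes the `price` field is always a `$`-prefixed decimal string, as in the sample response; `PriceEstimate`, `priceToNumber`, and `withinBudget` are hypothetical names, not SDK functions.

```typescript
// Shape of the price estimate response shown above (illustrative type name).
interface PriceEstimate {
  price: string;
  currency: string;
  model: string;
  estimatedInputTokens: number;
  maxOutputTokens: number;
  scheme: string;
}

// Converts a price string such as "$0.000025" to a number.
// Assumption: the string is a "$"-prefixed decimal, as in the sample.
function priceToNumber(estimate: PriceEstimate): number {
  const value = Number(estimate.price.replace(/^\$/, ""));
  if (Number.isNaN(value)) {
    throw new Error(`Unparseable price: ${estimate.price}`);
  }
  return value;
}

// True when the maximum cost fits within the caller's budget in USD.
function withinBudget(estimate: PriceEstimate, budgetUsd: number): boolean {
  return priceToNumber(estimate) <= budgetUsd;
}

// Usage with the sample response from this section.
const estimate: PriceEstimate = {
  price: "$0.000025",
  currency: "USD",
  model: "openai/gpt-4o-mini",
  estimatedInputTokens: 10,
  maxOutputTokens: 100,
  scheme: "upto",
};

console.log(withinBudget(estimate, 0.001));
```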

## Supported Models

OpenRouter provides access to 400+ models. Here are some popular options:

| Provider | Models |
|----------|--------|
| OpenAI | `openai/gpt-4o`, `openai/gpt-4o-mini`, `openai/gpt-4-turbo` |
| Anthropic | `anthropic/claude-3.5-sonnet`, `anthropic/claude-3-opus` |
| Google | `google/gemini-pro`, `google/gemini-flash` |
| Meta | `meta-llama/llama-3.1-405b`, `meta-llama/llama-3.2-90b` |
| Mistral | `mistralai/mistral-large`, `mistralai/mixtral-8x7b` |

See the [OpenRouter Models](https://openrouter.ai/docs#models) page for the full list and pricing.
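Model IDs in the table follow a `provider/model` pattern. As a small sketch, the hypothetical `parseModelId` helper below splits an ID at the first slash and returns `null` when the provider prefix is missing; it checks the format only and does not confirm that the model exists.

```typescript
// Splits "provider/model" into its parts; returns null if the format is wrong.
function parseModelId(id: string): { provider: string; name: string } | null {
  const slash = id.indexOf("/");
  // Reject IDs with no slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) return null;
  return { provider: id.slice(0, slash), name: id.slice(slash + 1) };
}

console.log(parseModelId("anthropic/claude-3.5-sonnet"));
console.log(parseModelId("gpt-4o-mini"));
```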

## Complete Example

**Axios:**
```typescript
import { withSapiom } from "@sapiom/axios";
import axios from "axios";

const client = withSapiom(axios.create(), {
  apiKey: process.env.SAPIOM_API_KEY,
});

const baseUrl = "https://openrouter.services.sapiom.ai/v1";

async function chat(userMessage: string) {
  const { data } = await client.post(`${baseUrl}/chat/completions`, {
    model: "openai/gpt-4o-mini",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userMessage },
    ],
    max_tokens: 500,
    temperature: 0.7,
  });

  return data.choices[0].message.content;
}

// Usage
const response = await chat("Explain quantum computing in simple terms");
console.log(response);
```
  
**Fetch:**
```typescript
import { createFetch } from "@sapiom/fetch";

const fetch = createFetch({
  apiKey: process.env.SAPIOM_API_KEY,
});

const baseUrl = "https://openrouter.services.sapiom.ai/v1";

async function chat(userMessage: string) {
  const response = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "openai/gpt-4o-mini",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: userMessage },
      ],
      max_tokens: 500,
      temperature: 0.7,
    }),
  });

  const data = await response.json();
  return data.choices[0].message.content;
}

// Usage
const response = await chat("Explain quantum computing in simple terms");
console.log(response);
```
  
## Troubleshooting

### Error Codes

| Code | Description |
|------|-------------|
| 400 | Invalid request — check model name and parameters |
| 402 | Payment required — ensure you're using the Sapiom SDK |
| 404 | Model not found — use full model name with provider prefix |
| 429 | Rate limit exceeded |
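
The table above can be turned into a lookup for friendlier error messages. This is a sketch with hypothetical names (`ERROR_HINTS`, `describeStatus`); the hint text is copied from the table, and the fallback message for other status codes is an assumption.

```typescript
// Hints copied from the error code table above.
const ERROR_HINTS: Record<number, string> = {
  400: "Invalid request: check model name and parameters",
  402: "Payment required: ensure you're using the Sapiom SDK",
  404: "Model not found: use full model name with provider prefix",
  429: "Rate limit exceeded",
};

// Maps an HTTP status code to a hint, with a generic fallback.
function describeStatus(status: number): string {
  return ERROR_HINTS[status] ?? `Unexpected status ${status}`;
}

console.log(describeStatus(402));
console.log(describeStatus(500));
```

With a Fetch-style client the status code is available as `response.status`. Axios rejects non-2xx responses by default and exposes the code as `error.response.status`, assuming the Sapiom wrappers leave that default behavior unchanged.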

### Common Issues

**Missing `max_tokens`:** The gateway requires `max_tokens` to calculate maximum cost:

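```typescript
const messages = [{ role: "user", content: "Hello!" }];

// Wrong: max_tokens is missing, so the maximum cost cannot be calculated
const rejected = { model: "openai/gpt-4o-mini", messages };

// Correct
const accepted = { model: "openai/gpt-4o-mini", messages, max_tokens: 100 };
```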

**Model not found:** Use the full model name with provider prefix:

```typescript
// Wrong: no provider prefix
const rejected = { model: "gpt-4o-mini" };

// Correct: provider prefix included
const accepted = { model: "openai/gpt-4o-mini" };
```

## Pricing

Pricing varies by model and is based on token usage:

- **Input tokens**: Cost per 1M tokens (varies by model)
- **Output tokens**: Cost per 1M tokens (varies by model)

The gateway uses an "upto" payment scheme — you authorize maximum cost based on `max_tokens`, but only pay for tokens actually generated.
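
To make the "upto" scheme concrete, here is a worked sketch. It assumes the authorized maximum is the input cost plus `max_tokens` billed at the output rate, and that the final charge uses the tokens actually generated; the exact gateway formula is not documented here. The rates are made-up round numbers, not real prices, and `tokenCost` is a hypothetical helper.

```typescript
// Per-million-token rates in USD. These values are invented for illustration.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Cost in USD for a given number of input and output tokens.
function tokenCost(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (inputTokens * rates.inputPerMillion + outputTokens * rates.outputPerMillion) / 1e6;
}

const rates: Rates = { inputPerMillion: 1.0, outputPerMillion: 2.0 };
const promptTokens = 25;
const maxTokens = 500; // requested ceiling on output tokens
const completionTokens = 150; // tokens the model actually generated

// Authorized up front: priced as if all max_tokens were generated.
const authorized = tokenCost(promptTokens, maxTokens, rates);
// Charged afterwards: priced on the tokens actually generated.
const charged = tokenCost(promptTokens, completionTokens, rates);

console.log(`authorized up to $${authorized.toFixed(6)}, charged $${charged.toFixed(6)}`);
```

Under these assumptions, lowering `max_tokens` reduces the amount authorized up front but does not change the final charge unless the output is cut short.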

See [OpenRouter Pricing](https://openrouter.ai/docs#models) for per-model rates.

> **Using Python?** See [Service Proxy](/md/service-proxy.md) for REST API access without the Node.js SDK.