Setup Guides

Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.

Setup Guides

Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.

Select your provider below to see the setup steps.

Odeus supports two ways to connect Gemini models:

* **Google Vertex AI**: uses service account credentials. Best for enterprise setups with GCP infrastructure.
* **Google AI Studio**: uses a simple API key. Easier to set up.

## Option 1: Google Vertex AI

### Google Cloud Setup

1. Enable the Vertex AI API in your [Google Cloud Platform](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

2. Go to "Service Accounts" in the Google Cloud Console IAM Settings.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-1.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=bc82aab0b2ac95f5ac0f4f00b1871063" alt="Go to Service Accounts in the sidebar" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-1.png" /&gt;

3. Click on "Create Service Account".

4. Give the Service Account a name.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-3.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=8383dd25d7fbb015ba4d1b3fe7869609" alt="Give the Service Account a name" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-3.png" /&gt;

5. Assign the "Vertex AI User" Role.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-4.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=f5541283131a285885b8f7479cf9470f" alt="Assign the Vertex AI User Role" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-4.png" /&gt;

6. Create the Service Account.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-5.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c07540a6fbc57e613cd4fe9798742c09" alt="Create the Service Account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-5.png" /&gt;

7. You are brought back to the Service Account overview.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-6.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=58e96da270d580a87c7a181aeabfe0f0" alt="Service Account overview" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-6.png" /&gt;

8. On the overview page, click on "Manage keys".

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-7.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c38a552aedad87452a48038c60779beb" alt="Click Manage keys on the service account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-7.png" /&gt;

9. Create a new JSON key.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-8.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=9f7ad3c0eb79f7a6ae19b2ea5b8193e0" alt="Select Create new key from the dropdown" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-8.png" /&gt;

10. Download and open the JSON file.

&lt;img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-9.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=cfdd5794e12deb32d3625201f558af68" alt="Download the key and open the JSON file" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-9.png" /&gt;

### Odeus Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Use the prebuilt Odeus config or set up manually. Set the SDK to **Google Vertex**.

> When you select the Google Vertex SDK, the UI relabels the fields: "Base URL" becomes **Service Account Email** and "API Key" becomes **Service Account Private Key**.

3. Fill in the connection fields:
   * **Service Account Email**: paste the `client_email` value from your JSON key file (e.g. `[email protected]`)
   * **Service Account Private Key**: paste the `private_key` value from your JSON key file (including `-----BEGIN PRIVATE KEY-----` and `-----END PRIVATE KEY-----`)
   * **Region**: your Vertex AI region (e.g. `europe-west3`, `us-central1`). This determines which Vertex AI endpoint is used.
   * **Model ID**: the model ID from the Vertex portal (e.g. `gemini-2.5-flash`, `gemini-2.5-pro`)

4. Click **Test & continue**, then click **Save model** after the test passes.

> The GCP project ID is automatically extracted from your service account email. You don't need to enter it separately.

## Option 2: Google AI Studio

1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
2. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**. Select **Google AI Studio** as the SDK.
3. Paste your API key and set the Model ID.
4. Click **Test & continue**, then click **Save model** after the test passes.

## Imagen (Image Generation)

Follow the Vertex AI setup above, but set the model type to **Image Generation** and use an Imagen model ID (e.g. `imagen-4.0-generate-001`).



AWS Bedrock allows you to access models like Claude through your own AWS infrastructure with enterprise-grade security and compliance.

**Prerequisites:**

1. An AWS account with Bedrock access enabled
2. IAM credentials with Bedrock permissions
3. Model access enabled in your AWS Bedrock console
4. Admin access to your Odeus workspace

## AWS Setup

### 1. Enable Model Access

1. Go to the [AWS Bedrock Console](https://console.aws.amazon.com/bedrock).
2. Navigate to **Model access** in the left sidebar.
3. Click **Manage model access** and enable the models you need.
4. Wait for access to be granted (this may take a few minutes).

### 2. Create IAM Credentials

1. Go to the [AWS IAM Console](https://console.aws.amazon.com/iam).
2. Navigate to **Users** and click **Create user**.
3. Give the user a descriptive name (e.g. `odeus-bedrock-access`).
4. Attach the `AmazonBedrockFullAccess` policy, or create a custom policy with minimum required permissions:

```json theme={null}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": "*"
        }
    ]
}
```

5. Go to the user's **Security credentials** tab, click **Create access key**, select **Third-party service**, and save both the Access Key ID and Secret Access Key.

> The Secret Access Key is only shown once. Store it securely before closing the dialog.

## Odeus Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.

2. Select **Bedrock** as the SDK.

3. Fill in the connection fields:
   * **Access Key ID**: your AWS Access Key ID
   * **Secret Access Key**: your AWS Secret Access Key
   * **Region**: your AWS region (e.g. `us-east-1`, `eu-central-1`)
   * **Model ID**: use the Bedrock model identifier (see below)
   * **Context Size**: set according to the model (see the [model configuration tables](/en/admin/byok/recommended-models#model-specific-configuration))

4. Click **Test & continue**, then click **Save model** after the test passes.

## Model IDs

| Provider  | Format                     | Example                                  |
| --------- | -------------------------- | ---------------------------------------- |
| Anthropic | `anthropic.{model-name}`   | `anthropic.claude-sonnet-4-6`            |
| Meta      | `meta.{model-name}-v1:0`   | `meta.llama4-maverick-17b-instruct-v1:0` |
| Amazon    | `amazon.{model-name}-v1:0` | `amazon.nova-pro-v1:0`                   |

> Check the [AWS Bedrock supported models page](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for exact model IDs.

## Cross-Region Inference Profiles

Prefix the model ID with a geographic code to route across regions automatically:

| Prefix    | Scope                  |
| --------- | ---------------------- |
| `us.`     | US regions             |
| `eu.`     | European regions       |
| `global.` | All commercial regions |
| `apac.`   | Asia-Pacific regions   |

Example: `eu.anthropic.claude-sonnet-4-6`

Check the [inference profiles documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) for available profiles per model.

## Supported Regions

* **US East (N. Virginia)**: `us-east-1`
* **US West (Oregon)**: `us-west-2`
* **EU (Frankfurt)**: `eu-central-1`
* **EU (Ireland)**: `eu-west-1`
* **EU (Paris)**: `eu-west-3`
* **Asia Pacific (Tokyo)**: `ap-northeast-1`
* **Asia Pacific (Sydney)**: `ap-southeast-2`

## Network Configuration

If your organization uses network allowlisting, add `bedrock.REGION.amazonaws.com` to your allowlist (replace `REGION` with your AWS region, e.g. `us-east-1`).

## Troubleshooting

**"Access Denied" errors**: verify IAM permissions and that model access is enabled in the Bedrock console.

**Model not available**: confirm the model is enabled in your AWS Bedrock model access settings and available in your selected region.

**Authentication failures**: double-check that Access Key ID and Secret Access Key fields contain the correct values and that the region setting matches your Bedrock region.

**Slow responses or timeouts**: consider using a region closer to your users. Check the [AWS Service Health Dashboard](https://health.aws.amazon.com/health/status) for any ongoing issues. Verify your AWS account has sufficient quotas for the model.



Mistral models connect via the Mistral API or via Azure (for Azure-hosted Mistral).

**Prerequisites:**

1. A Mistral account at [console.mistral.ai](https://console.mistral.ai)
2. An API key from the Mistral platform
3. Admin access to your Odeus workspace

## Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.

2. Fill in the connection fields:
   * **SDK**: select **Mistral**
   * **Base URL**: leave empty to use the default (`https://api.mistral.ai/v1`) or specify a custom endpoint
   * **Model ID**: use the official model identifier (see below)
   * **API Key**: paste your Mistral API key

3. Click **Test & continue**, then click **Save model** after the test passes.

## Model IDs

| Model ID               | Use case                                                                |
| ---------------------- | ----------------------------------------------------------------------- |
| `mistral-large-latest` | Flagship model — complex reasoning, multilingual, instruction following |
| `codestral-latest`     | Code-specialized — code generation, completion, and technical tasks     |
| `mistral-small-latest` | Fast and cost-effective — good for everyday tasks                       |

> Check [Mistral's model documentation](https://docs.mistral.ai/getting-started/models/models_overview/) for the full list of available models.

## Using Mistral from Azure

If you're using Mistral models hosted on Azure (via Azure AI Models-as-a-Service), you still need to select **"Mistral"** as the SDK in Odeus. The SDK selection refers to the API format, not the hosting provider.

> When configuring Azure-hosted Mistral models:

  * Set the **Hosting provider** to Azure
  * Set the **SDK** to "Mistral" (not Azure OpenAI)
  * Use your Azure endpoint as the Base URL
  * Use your Azure API key

## Configuration Notes

* Mistral models support tool calling natively.
* The default API endpoint `https://api.mistral.ai/v1` is used automatically when no custom Base URL is provided.
* Mistral models are known for strong multilingual capabilities, particularly in European languages.

## Troubleshooting

**Model not responding**: verify your API key is valid and that you have sufficient credits in your Mistral account. Ensure the model ID matches exactly (case-sensitive).

**Authentication errors with Azure**: double-check that you're using "Mistral" as the SDK, not "Azure OpenAI". Verify your Azure endpoint URL is correct and accessible, and that your Azure API key has the necessary permissions.

**Slow responses**: larger models may take longer for complex reasoning tasks. Consider using a smaller model for faster responses on simpler tasks.



DeepSeek models connect via the DeepSeek API. All models are hosted in the US region.

**Prerequisites:**

1. A DeepSeek account with an [API key](https://platform.deepseek.com/api_keys)
2. Admin access to your Odeus workspace

## Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.

2. Fill in the connection fields:
   * **SDK**: select **DeepSeek**
   * **Base URL**: `https://api.deepseek.com/v1`
   * **Model ID**: see table below
   * **API Key**: paste your DeepSeek API key
   * **Region**: US

3. For reasoning models (R1), enable **Always show reasoning** to surface the model's thinking in the UI.

4. Click **Test & continue**, then click **Save model** after the test passes.

## Model IDs

| Model ID            | Type                                                                            |
| ------------------- | ------------------------------------------------------------------------------- |
| `deepseek-reasoner` | Reasoning model (R1 series) — excels at step-by-step problem solving and coding |
| `deepseek-chat`     | General-purpose model (V3 series) — fast responses, good for everyday tasks     |

> Check [DeepSeek's API docs](https://api-docs.deepseek.com/quick_start/pricing) for the latest available models.

## Configuration Notes

* DeepSeek models are hosted in the US region only.
* DeepSeek R1 is a reasoning model — enable **Always show reasoning** to see its reasoning steps in the UI. DeepSeek models do not support image analysis.
* The base URL must include the `/v1` path: `https://api.deepseek.com/v1`.

## Troubleshooting

**Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).

**Slow responses**: DeepSeek R1 (reasoning model) may take longer due to its step-by-step reasoning process. Consider using DeepSeek V3 for faster responses on simpler tasks.



Perplexity's Sonar models combine LLM capabilities with real-time web search. All models are hosted in the US region.

**Prerequisites:**

1. A Perplexity account at [perplexity.ai](https://www.perplexity.ai)
2. An API key from your [Perplexity API settings](https://www.perplexity.ai/settings/api)
3. Admin access to your Odeus workspace

## Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.

2. Fill in the connection fields:
   * **SDK**: select **Perplexity**
   * **Base URL**: leave empty to use the default (`https://api.perplexity.ai`)
   * **Model ID**: see table below
   * **API Key**: paste your Perplexity API key
   * **Region**: US

3. Click **Test & continue**, then click **Save model** after the test passes.

## Model IDs

| Model ID              | Type                                                                     |
| --------------------- | ------------------------------------------------------------------------ |
| `sonar-pro`           | Advanced search-augmented generation — detailed responses with citations |
| `sonar`               | Fast search-augmented responses — good for general-purpose queries       |
| `sonar-reasoning-pro` | Deep analysis with search — multi-step reasoning with citations          |
| `sonar-reasoning`     | Reasoning with search augmentation                                       |

> Check [Perplexity's model documentation](https://docs.perplexity.ai/guides/model-cards) for the full list of available models.

## Configuration Notes

* Perplexity models include built-in web search capabilities, so they always have access to current information.
* The API endpoint `https://api.perplexity.ai` is automatically used when no custom Base URL is provided.
* Sonar Pro models provide more detailed responses with better source citations.
* Reasoning variants are best for complex analytical tasks that benefit from step-by-step thinking.
* Perplexity models do not support image analysis.

## Troubleshooting

**Missing citations**: Perplexity models include citations automatically when web search is used. If citations are missing, the model answered from its base knowledge rather than a web search.

**Slow responses**: Perplexity models perform web searches, which adds latency. Sonar (non-Pro) variants are faster than Pro versions. For time-sensitive tasks without search needs, consider using a different model.

**Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).



Use this for any API that follows the OpenAI spec — including vLLM, LiteLLM, Ollama, and self-hosted models.

Many LLM inference solutions implement the OpenAI API specification as a standard interface. This means they accept requests and return responses in the same format as OpenAI's API, making them interchangeable from an integration perspective.

Common OpenAI-compatible solutions:

* **vLLM**: high-throughput inference server for large language models
* **LiteLLM**: proxy server providing a unified interface to 100+ LLM providers
* **Ollama**: run large language models locally
* **Text Generation Inference (TGI)**: Hugging Face's inference server
* **LocalAI**: self-hosted, OpenAI-compatible API
* **Custom deployments**: any service implementing the OpenAI chat completions API

**Prerequisites:**

1. A running OpenAI-compatible inference endpoint accessible over HTTPS
2. The base URL of your endpoint
3. The model ID/name as configured in your inference server
4. An API key (if your endpoint requires authentication)
5. Admin access to your Odeus workspace

## Setup

1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.

2. Fill in the connection fields:
   * **SDK**: select **OpenAI Compatible**
   * **Base URL**: your endpoint URL (e.g. `https://your-server.com/v1`). Required.
   * **Model ID**: the exact model identifier as configured in your inference server
   * **API Key**: your authentication key, or leave empty if not required
   * **Context Size**: the context window size of your model in tokens

3. Click **Test & continue**, then click **Save model** after the test passes.

> Your endpoint must be publicly accessible over HTTPS. Odeus blocks requests to private IPs, localhost, and internal hostnames. Contact [[email protected]](mailto:[email protected]) if you need to connect to an internal endpoint.

## Example Configurations

| Server        | Base URL                     | Model ID                                                                |
| ------------- | ---------------------------- | ----------------------------------------------------------------------- |
| vLLM          | `https://your-server.com/v1` | Model name from vLLM startup (e.g. `meta-llama/Llama-3.1-70B-Instruct`) |
| LiteLLM proxy | `https://your-litellm.com`   | Alias from your LiteLLM config                                          |
| Ollama        | `https://your-ollama.com/v1` | Name from `ollama list` (e.g. `llama3.1`)                               |

> For Azure OpenAI, use the dedicated **Azure** SDK instead. It handles API versioning and deployment-based URL routing automatically.

## Common Use Cases

* **Data privacy**: run models on your own infrastructure so prompts and responses stay within your network.
* **Cost optimization**: running open-source models on your own hardware can significantly reduce costs for high-volume use cases.
* **Custom fine-tuned models**: deploy models fine-tuned for specific tasks or domains with vLLM or similar servers.
* **Multi-provider abstraction**: use LiteLLM as a proxy to route requests to different providers from a single interface.

## Troubleshooting

**Connection refused or timeout**: verify the endpoint is accessible from external servers over HTTPS. Check that your firewall allows incoming connections. Ensure your inference server is running and healthy.

**Authentication errors**: verify your API key and check if your endpoint expects a specific `Bearer` token format.

**Model not found**: ensure the Model ID matches exactly what your inference server expects (case-sensitive). Verify the model is loaded and available on your server.

**Responses are cut off**: check the max output tokens setting in Odeus and verify your inference server's generation length limits.

**Slow responses**: check your server's available GPU memory and compute resources. Consider using quantized model versions for faster inference. Monitor your server's queue length and scaling configuration.

**Incompatible API format**: not all "OpenAI-compatible" servers implement the full API specification. Verify your server supports the `/v1/chat/completions` endpoint and check if your server requires specific API version headers.