Setup Guides
Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.
Setup Guides
Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.
Select your provider below to see the setup steps.
Odeus supports two ways to connect Gemini models:
* **Google Vertex AI**: uses service account credentials. Best for enterprise setups with GCP infrastructure.
* **Google AI Studio**: uses a simple API key. Easier to set up.
## Option 1: Google Vertex AI
### Google Cloud Setup
1. Enable the Vertex AI API in your [Google Cloud Platform](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
2. Go to "Service Accounts" in the Google Cloud Console IAM Settings.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-1.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=bc82aab0b2ac95f5ac0f4f00b1871063" alt="Go to Service Accounts in the sidebar" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-1.png" />
3. Click on "Create Service Account".
4. Give the Service Account a name.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-3.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=8383dd25d7fbb015ba4d1b3fe7869609" alt="Give the Service Account a name" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-3.png" />
5. Assign the "Vertex AI User" Role.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-4.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=f5541283131a285885b8f7479cf9470f" alt="Assign the Vertex AI User Role" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-4.png" />
6. Create the Service Account.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-5.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c07540a6fbc57e613cd4fe9798742c09" alt="Create the Service Account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-5.png" />
7. You are brought back to the Service Account overview.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-6.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=58e96da270d580a87c7a181aeabfe0f0" alt="Service Account overview" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-6.png" />
8. On the overview page, click on "Manage keys".
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-7.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c38a552aedad87452a48038c60779beb" alt="Click Manage keys on the service account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-7.png" />
9. Create a new JSON key.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-8.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=9f7ad3c0eb79f7a6ae19b2ea5b8193e0" alt="Select Create new key from the dropdown" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-8.png" />
10. Download and open the JSON file.
<img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-9.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=cfdd5794e12deb32d3625201f558af68" alt="Download the key and open the JSON file" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-9.png" />
### Odeus Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Use the prebuilt Odeus config or set up manually. Set the SDK to **Google Vertex**.
> When you select the Google Vertex SDK, the UI relabels the fields: "Base URL" becomes **Service Account Email** and "API Key" becomes **Service Account Private Key**.
3. Fill in the connection fields:
* **Service Account Email**: paste the `client_email` value from your JSON key file (e.g. `[email protected]`)
* **Service Account Private Key**: paste the `private_key` value from your JSON key file (including `-----BEGIN PRIVATE KEY-----` and `-----END PRIVATE KEY-----`)
* **Region**: your Vertex AI region (e.g. `europe-west3`, `us-central1`). This determines which Vertex AI endpoint is used.
* **Model ID**: the model ID from the Vertex portal (e.g. `gemini-2.5-flash`, `gemini-2.5-pro`)
4. Click **Test & continue**, then click **Save model** after the test passes.
> The GCP project ID is automatically extracted from your service account email. You don't need to enter it separately.
## Option 2: Google AI Studio
1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
2. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**. Select **Google AI Studio** as the SDK.
3. Paste your API key and set the Model ID.
4. Click **Test & continue**, then click **Save model** after the test passes.
## Imagen (Image Generation)
Follow the Vertex AI setup above, but set the model type to **Image Generation** and use an Imagen model ID (e.g. `imagen-4.0-generate-001`).
AWS Bedrock allows you to access models like Claude through your own AWS infrastructure with enterprise-grade security and compliance.
**Prerequisites:**
1. An AWS account with Bedrock access enabled
2. IAM credentials with Bedrock permissions
3. Model access enabled in your AWS Bedrock console
4. Admin access to your Odeus workspace
## AWS Setup
### 1. Enable Model Access
1. Go to the [AWS Bedrock Console](https://console.aws.amazon.com/bedrock).
2. Navigate to **Model access** in the left sidebar.
3. Click **Manage model access** and enable the models you need.
4. Wait for access to be granted (this may take a few minutes).
### 2. Create IAM Credentials
1. Go to the [AWS IAM Console](https://console.aws.amazon.com/iam).
2. Navigate to **Users** and click **Create user**.
3. Give the user a descriptive name (e.g. `odeus-bedrock-access`).
4. Attach the `AmazonBedrockFullAccess` policy, or create a custom policy with minimum required permissions:
```json theme={null}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream"
],
"Resource": "*"
}
]
}
```
5. Go to the user's **Security credentials** tab, click **Create access key**, select **Third-party service**, and save both the Access Key ID and Secret Access Key.
> The Secret Access Key is only shown once. Store it securely before closing the dialog.
## Odeus Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Select **Bedrock** as the SDK.
3. Fill in the connection fields:
* **Access Key ID**: your AWS Access Key ID
* **Secret Access Key**: your AWS Secret Access Key
* **Region**: your AWS region (e.g. `us-east-1`, `eu-central-1`)
* **Model ID**: use the Bedrock model identifier (see below)
* **Context Size**: set according to the model (see the [model configuration tables](/en/admin/byok/recommended-models#model-specific-configuration))
4. Click **Test & continue**, then click **Save model** after the test passes.
## Model IDs
| Provider | Format | Example |
| --------- | -------------------------- | ---------------------------------------- |
| Anthropic | `anthropic.{model-name}` | `anthropic.claude-sonnet-4-6` |
| Meta | `meta.{model-name}-v1:0` | `meta.llama4-maverick-17b-instruct-v1:0` |
| Amazon | `amazon.{model-name}-v1:0` | `amazon.nova-pro-v1:0` |
> Check the [AWS Bedrock supported models page](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for exact model IDs.
## Cross-Region Inference Profiles
Prefix the model ID with a geographic code to route across regions automatically:
| Prefix | Scope |
| --------- | ---------------------- |
| `us.` | US regions |
| `eu.` | European regions |
| `global.` | All commercial regions |
| `apac.` | Asia-Pacific regions |
Example: `eu.anthropic.claude-sonnet-4-6`
Check the [inference profiles documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) for available profiles per model.
## Supported Regions
* **US East (N. Virginia)**: `us-east-1`
* **US West (Oregon)**: `us-west-2`
* **EU (Frankfurt)**: `eu-central-1`
* **EU (Ireland)**: `eu-west-1`
* **EU (Paris)**: `eu-west-3`
* **Asia Pacific (Tokyo)**: `ap-northeast-1`
* **Asia Pacific (Sydney)**: `ap-southeast-2`
## Network Configuration
If your organization uses network allowlisting, add `bedrock.REGION.amazonaws.com` to your allowlist (replace `REGION` with your AWS region, e.g. `us-east-1`).
## Troubleshooting
**"Access Denied" errors**: verify IAM permissions and that model access is enabled in the Bedrock console.
**Model not available**: confirm the model is enabled in your AWS Bedrock model access settings and available in your selected region.
**Authentication failures**: double-check that Access Key ID and Secret Access Key fields contain the correct values and that the region setting matches your Bedrock region.
**Slow responses or timeouts**: consider using a region closer to your users. Check the [AWS Service Health Dashboard](https://health.aws.amazon.com/health/status) for any ongoing issues. Verify your AWS account has sufficient quotas for the model.
Mistral models connect via the Mistral API or via Azure (for Azure-hosted Mistral).
**Prerequisites:**
1. A Mistral account at [console.mistral.ai](https://console.mistral.ai)
2. An API key from the Mistral platform
3. Admin access to your Odeus workspace
## Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Fill in the connection fields:
* **SDK**: select **Mistral**
* **Base URL**: leave empty to use the default (`https://api.mistral.ai/v1`) or specify a custom endpoint
* **Model ID**: use the official model identifier (see below)
* **API Key**: paste your Mistral API key
3. Click **Test & continue**, then click **Save model** after the test passes.
## Model IDs
| Model ID | Use case |
| ---------------------- | ----------------------------------------------------------------------- |
| `mistral-large-latest` | Flagship model — complex reasoning, multilingual, instruction following |
| `codestral-latest` | Code-specialized — code generation, completion, and technical tasks |
| `mistral-small-latest` | Fast and cost-effective — good for everyday tasks |
> Check [Mistral's model documentation](https://docs.mistral.ai/getting-started/models/models_overview/) for the full list of available models.
## Using Mistral from Azure
If you're using Mistral models hosted on Azure (via Azure AI Models-as-a-Service), you still need to select **"Mistral"** as the SDK in Odeus. The SDK selection refers to the API format, not the hosting provider.
> When configuring Azure-hosted Mistral models:
* Set the **Hosting provider** to Azure * Set the **SDK** to "Mistral" (not Azure OpenAI) * Use your Azure endpoint as the Base URL * Use your Azure API key
## Configuration Notes
* Mistral models support tool calling natively.
* The default API endpoint `https://api.mistral.ai/v1` is used automatically when no custom Base URL is provided.
* Mistral models are known for strong multilingual capabilities, particularly in European languages.
## Troubleshooting
**Model not responding**: verify your API key is valid and that you have sufficient credits in your Mistral account. Ensure the model ID matches exactly (case-sensitive).
**Authentication errors with Azure**: double-check that you're using "Mistral" as the SDK, not "Azure OpenAI". Verify your Azure endpoint URL is correct and accessible, and that your Azure API key has the necessary permissions.
**Slow responses**: larger models may take longer for complex reasoning tasks. Consider using a smaller model for faster responses on simpler tasks.
DeepSeek models connect via the DeepSeek API. All models are hosted in the US region.
**Prerequisites:**
1. A DeepSeek account with an [API key](https://platform.deepseek.com/api_keys)
2. Admin access to your Odeus workspace
## Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Fill in the connection fields:
* **SDK**: select **DeepSeek**
* **Base URL**: `https://api.deepseek.com/v1`
* **Model ID**: see table below
* **API Key**: paste your DeepSeek API key
* **Region**: US
3. For reasoning models (R1), enable **Always show reasoning** to surface the model's thinking in the UI.
4. Click **Test & continue**, then click **Save model** after the test passes.
## Model IDs
| Model ID | Type |
| ------------------- | ------------------------------------------------------------------------------- |
| `deepseek-reasoner` | Reasoning model (R1 series) — excels at step-by-step problem solving and coding |
| `deepseek-chat` | General-purpose model (V3 series) — fast responses, good for everyday tasks |
> Check [DeepSeek's API docs](https://api-docs.deepseek.com/quick_start/pricing) for the latest available models.
## Configuration Notes
* DeepSeek models are hosted in the US region only.
* DeepSeek R1 is a reasoning model — enable **Always show reasoning** to see its reasoning steps in the UI. DeepSeek models do not support image analysis.
* The base URL must include the `/v1` path: `https://api.deepseek.com/v1`.
## Troubleshooting
**Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).
**Slow responses**: DeepSeek R1 (reasoning model) may take longer due to its step-by-step reasoning process. Consider using DeepSeek V3 for faster responses on simpler tasks.
Perplexity's Sonar models combine LLM capabilities with real-time web search. All models are hosted in the US region.
**Prerequisites:**
1. A Perplexity account at [perplexity.ai](https://www.perplexity.ai)
2. An API key from your [Perplexity API settings](https://www.perplexity.ai/settings/api)
3. Admin access to your Odeus workspace
## Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Fill in the connection fields:
* **SDK**: select **Perplexity**
* **Base URL**: leave empty to use the default (`https://api.perplexity.ai`)
* **Model ID**: see table below
* **API Key**: paste your Perplexity API key
* **Region**: US
3. Click **Test & continue**, then click **Save model** after the test passes.
## Model IDs
| Model ID | Type |
| --------------------- | ------------------------------------------------------------------------ |
| `sonar-pro` | Advanced search-augmented generation — detailed responses with citations |
| `sonar` | Fast search-augmented responses — good for general-purpose queries |
| `sonar-reasoning-pro` | Deep analysis with search — multi-step reasoning with citations |
| `sonar-reasoning` | Reasoning with search augmentation |
> Check [Perplexity's model documentation](https://docs.perplexity.ai/guides/model-cards) for the full list of available models.
## Configuration Notes
* Perplexity models include built-in web search capabilities, so they always have access to current information.
* The API endpoint `https://api.perplexity.ai` is automatically used when no custom Base URL is provided.
* Sonar Pro models provide more detailed responses with better source citations.
* Reasoning variants are best for complex analytical tasks that benefit from step-by-step thinking.
* Perplexity models do not support image analysis.
## Troubleshooting
**Missing citations**: Perplexity models include citations automatically when web search is used. If citations are missing, the model answered from its base knowledge rather than a web search.
**Slow responses**: Perplexity models perform web searches, which adds latency. Sonar (non-Pro) variants are faster than Pro versions. For time-sensitive tasks without search needs, consider using a different model.
**Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).
Use this for any API that follows the OpenAI spec — including vLLM, LiteLLM, Ollama, and self-hosted models.
Many LLM inference solutions implement the OpenAI API specification as a standard interface. This means they accept requests and return responses in the same format as OpenAI's API, making them interchangeable from an integration perspective.
Common OpenAI-compatible solutions:
* **vLLM**: high-throughput inference server for large language models
* **LiteLLM**: proxy server providing a unified interface to 100+ LLM providers
* **Ollama**: run large language models locally
* **Text Generation Inference (TGI)**: Hugging Face's inference server
* **LocalAI**: self-hosted, OpenAI-compatible API
* **Custom deployments**: any service implementing the OpenAI chat completions API
**Prerequisites:**
1. A running OpenAI-compatible inference endpoint accessible over HTTPS
2. The base URL of your endpoint
3. The model ID/name as configured in your inference server
4. An API key (if your endpoint requires authentication)
5. Admin access to your Odeus workspace
## Setup
1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
2. Fill in the connection fields:
* **SDK**: select **OpenAI Compatible**
* **Base URL**: your endpoint URL (e.g. `https://your-server.com/v1`). Required.
* **Model ID**: the exact model identifier as configured in your inference server
* **API Key**: your authentication key, or leave empty if not required
* **Context Size**: the context window size of your model in tokens
3. Click **Test & continue**, then click **Save model** after the test passes.
> Your endpoint must be publicly accessible over HTTPS. Odeus blocks requests to private IPs, localhost, and internal hostnames. Contact [[email protected]](mailto:[email protected]) if you need to connect to an internal endpoint.
## Example Configurations
| Server | Base URL | Model ID |
| ------------- | ---------------------------- | ----------------------------------------------------------------------- |
| vLLM | `https://your-server.com/v1` | Model name from vLLM startup (e.g. `meta-llama/Llama-3.1-70B-Instruct`) |
| LiteLLM proxy | `https://your-litellm.com` | Alias from your LiteLLM config |
| Ollama | `https://your-ollama.com/v1` | Name from `ollama list` (e.g. `llama3.1`) |
> For Azure OpenAI, use the dedicated **Azure** SDK instead. It handles API versioning and deployment-based URL routing automatically.
## Common Use Cases
* **Data privacy**: run models on your own infrastructure so prompts and responses stay within your network.
* **Cost optimization**: running open-source models on your own hardware can significantly reduce costs for high-volume use cases.
* **Custom fine-tuned models**: deploy models fine-tuned for specific tasks or domains with vLLM or similar servers.
* **Multi-provider abstraction**: use LiteLLM as a proxy to route requests to different providers from a single interface.
## Troubleshooting
**Connection refused or timeout**: verify the endpoint is accessible from external servers over HTTPS. Check that your firewall allows incoming connections. Ensure your inference server is running and healthy.
**Authentication errors**: verify your API key and check if your endpoint expects a specific `Bearer` token format.
**Model not found**: ensure the Model ID matches exactly what your inference server expects (case-sensitive). Verify the model is loaded and available on your server.
**Responses are cut off**: check the max output tokens setting in Odeus and verify your inference server's generation length limits.
**Slow responses**: check your server's available GPU memory and compute resources. Consider using quantized model versions for faster inference. Monitor your server's queue length and scaling configuration.
**Incompatible API format**: not all "OpenAI-compatible" servers implement the full API specification. Verify your server supports the `/v1/chat/completions` endpoint and check if your server requires specific API version headers.