We use cookies to enhance your experience and measure how the site performs. Choose "Essential Only" to disable analytics. Read our Privacy Policy.

    Odeus Docs

    Setup Guides

    Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.

    Setup Guides

    Step-by-step guides for connecting models from Google Vertex AI, AWS Bedrock, Mistral, DeepSeek, Perplexity, and OpenAI-compatible endpoints.

    Select your provider below to see the setup steps.

    Odeus supports two ways to connect Gemini models:
    
    * **Google Vertex AI**: uses service account credentials. Best for enterprise setups with GCP infrastructure.
    * **Google AI Studio**: uses a simple API key. Easier to set up.
    
    ## Option 1: Google Vertex AI
    
    ### Google Cloud Setup
    
    1. Enable the Vertex AI API in your [Google Cloud Platform](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).
    
    2. Go to "Service Accounts" in the Google Cloud Console IAM Settings.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-1.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=bc82aab0b2ac95f5ac0f4f00b1871063" alt="Go to Service Accounts in the sidebar" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-1.png" />
    
    3. Click on "Create Service Account".
    
    4. Give the Service Account a name.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-3.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=8383dd25d7fbb015ba4d1b3fe7869609" alt="Give the Service Account a name" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-3.png" />
    
    5. Assign the "Vertex AI User" Role.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-4.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=f5541283131a285885b8f7479cf9470f" alt="Assign the Vertex AI User Role" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-4.png" />
    
    6. Create the Service Account.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-5.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c07540a6fbc57e613cd4fe9798742c09" alt="Create the Service Account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-5.png" />
    
    7. You are brought back to the Service Account overview.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-6.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=58e96da270d580a87c7a181aeabfe0f0" alt="Service Account overview" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-6.png" />
    
    8. On the overview page, click on "Manage keys".
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-7.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=c38a552aedad87452a48038c60779beb" alt="Click Manage keys on the service account" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-7.png" />
    
    9. Create a new JSON key.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-8.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=9f7ad3c0eb79f7a6ae19b2ea5b8193e0" alt="Select Create new key from the dropdown" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-8.png" />
    
    10. Download and open the JSON file.
    
    <img src="https://mintcdn.com/odeus-34/hDlYFjN4znXbKeFA/images/vertex-9.png?fit=max&auto=format&n=hDlYFjN4znXbKeFA&q=85&s=cfdd5794e12deb32d3625201f558af68" alt="Download the key and open the JSON file" style={{borderRadius: '6px'}} width="2404" height="1674" data-path="images/vertex-9.png" />
    
    ### Odeus Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    2. Use the prebuilt Odeus config or set up manually. Set the SDK to **Google Vertex**.
    
    > When you select the Google Vertex SDK, the UI relabels the fields: "Base URL" becomes **Service Account Email** and "API Key" becomes **Service Account Private Key**.
    
    3. Fill in the connection fields:
       * **Service Account Email**: paste the `client_email` value from your JSON key file (e.g. `[email protected]`)
       * **Service Account Private Key**: paste the `private_key` value from your JSON key file (including `-----BEGIN PRIVATE KEY-----` and `-----END PRIVATE KEY-----`)
       * **Region**: your Vertex AI region (e.g. `europe-west3`, `us-central1`). This determines which Vertex AI endpoint is used.
       * **Model ID**: the model ID from the Vertex portal (e.g. `gemini-2.5-flash`, `gemini-2.5-pro`)
    
    4. Click **Test & continue**, then click **Save model** after the test passes.
    
    > The GCP project ID is automatically extracted from your service account email. You don't need to enter it separately.
    
    ## Option 2: Google AI Studio
    
    1. Get an API key from [Google AI Studio](https://aistudio.google.com/apikey).
    2. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**. Select **Google AI Studio** as the SDK.
    3. Paste your API key and set the Model ID.
    4. Click **Test & continue**, then click **Save model** after the test passes.
    
    ## Imagen (Image Generation)
    
    Follow the Vertex AI setup above, but set the model type to **Image Generation** and use an Imagen model ID (e.g. `imagen-4.0-generate-001`).
    
    
    
    AWS Bedrock allows you to access models like Claude through your own AWS infrastructure with enterprise-grade security and compliance.
    
    **Prerequisites:**
    
    1. An AWS account with Bedrock access enabled
    2. IAM credentials with Bedrock permissions
    3. Model access enabled in your AWS Bedrock console
    4. Admin access to your Odeus workspace
    
    ## AWS Setup
    
    ### 1. Enable Model Access
    
    1. Go to the [AWS Bedrock Console](https://console.aws.amazon.com/bedrock).
    2. Navigate to **Model access** in the left sidebar.
    3. Click **Manage model access** and enable the models you need.
    4. Wait for access to be granted (this may take a few minutes).
    
    ### 2. Create IAM Credentials
    
    1. Go to the [AWS IAM Console](https://console.aws.amazon.com/iam).
    2. Navigate to **Users** and click **Create user**.
    3. Give the user a descriptive name (e.g. `odeus-bedrock-access`).
    4. Attach the `AmazonBedrockFullAccess` policy, or create a custom policy with minimum required permissions:
    
    ```json theme={null}
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream"
                ],
                "Resource": "*"
            }
        ]
    }
    ```
    
    5. Go to the user's **Security credentials** tab, click **Create access key**, select **Third-party service**, and save both the Access Key ID and Secret Access Key.
    
    > The Secret Access Key is only shown once. Store it securely before closing the dialog.
    
    ## Odeus Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    
    2. Select **Bedrock** as the SDK.
    
    3. Fill in the connection fields:
       * **Access Key ID**: your AWS Access Key ID
       * **Secret Access Key**: your AWS Secret Access Key
       * **Region**: your AWS region (e.g. `us-east-1`, `eu-central-1`)
       * **Model ID**: use the Bedrock model identifier (see below)
       * **Context Size**: set according to the model (see the [model configuration tables](/en/admin/byok/recommended-models#model-specific-configuration))
    
    4. Click **Test & continue**, then click **Save model** after the test passes.
    
    ## Model IDs
    
    | Provider  | Format                     | Example                                  |
    | --------- | -------------------------- | ---------------------------------------- |
    | Anthropic | `anthropic.{model-name}`   | `anthropic.claude-sonnet-4-6`            |
    | Meta      | `meta.{model-name}-v1:0`   | `meta.llama4-maverick-17b-instruct-v1:0` |
    | Amazon    | `amazon.{model-name}-v1:0` | `amazon.nova-pro-v1:0`                   |
    
    > Check the [AWS Bedrock supported models page](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for exact model IDs.
    
    ## Cross-Region Inference Profiles
    
    Prefix the model ID with a geographic code to route across regions automatically:
    
    | Prefix    | Scope                  |
    | --------- | ---------------------- |
    | `us.`     | US regions             |
    | `eu.`     | European regions       |
    | `global.` | All commercial regions |
    | `apac.`   | Asia-Pacific regions   |
    
    Example: `eu.anthropic.claude-sonnet-4-6`
    
    Check the [inference profiles documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) for available profiles per model.
    
    ## Supported Regions
    
    * **US East (N. Virginia)**: `us-east-1`
    * **US West (Oregon)**: `us-west-2`
    * **EU (Frankfurt)**: `eu-central-1`
    * **EU (Ireland)**: `eu-west-1`
    * **EU (Paris)**: `eu-west-3`
    * **Asia Pacific (Tokyo)**: `ap-northeast-1`
    * **Asia Pacific (Sydney)**: `ap-southeast-2`
    
    ## Network Configuration
    
    If your organization uses network allowlisting, add `bedrock.REGION.amazonaws.com` to your allowlist (replace `REGION` with your AWS region, e.g. `us-east-1`).
    
    ## Troubleshooting
    
    **"Access Denied" errors**: verify IAM permissions and that model access is enabled in the Bedrock console.
    
    **Model not available**: confirm the model is enabled in your AWS Bedrock model access settings and available in your selected region.
    
    **Authentication failures**: double-check that Access Key ID and Secret Access Key fields contain the correct values and that the region setting matches your Bedrock region.
    
    **Slow responses or timeouts**: consider using a region closer to your users. Check the [AWS Service Health Dashboard](https://health.aws.amazon.com/health/status) for any ongoing issues. Verify your AWS account has sufficient quotas for the model.
    
    
    
    Mistral models connect via the Mistral API or via Azure (for Azure-hosted Mistral).
    
    **Prerequisites:**
    
    1. A Mistral account at [console.mistral.ai](https://console.mistral.ai)
    2. An API key from the Mistral platform
    3. Admin access to your Odeus workspace
    
    ## Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    
    2. Fill in the connection fields:
       * **SDK**: select **Mistral**
       * **Base URL**: leave empty to use the default (`https://api.mistral.ai/v1`) or specify a custom endpoint
       * **Model ID**: use the official model identifier (see below)
       * **API Key**: paste your Mistral API key
    
    3. Click **Test & continue**, then click **Save model** after the test passes.
    
    ## Model IDs
    
    | Model ID               | Use case                                                                |
    | ---------------------- | ----------------------------------------------------------------------- |
    | `mistral-large-latest` | Flagship model — complex reasoning, multilingual, instruction following |
    | `codestral-latest`     | Code-specialized — code generation, completion, and technical tasks     |
    | `mistral-small-latest` | Fast and cost-effective — good for everyday tasks                       |
    
    > Check [Mistral's model documentation](https://docs.mistral.ai/getting-started/models/models_overview/) for the full list of available models.
    
    ## Using Mistral from Azure
    
    If you're using Mistral models hosted on Azure (via Azure AI Models-as-a-Service), you still need to select **"Mistral"** as the SDK in Odeus. The SDK selection refers to the API format, not the hosting provider.
    
    > When configuring Azure-hosted Mistral models:
    
      * Set the **Hosting provider** to Azure
      * Set the **SDK** to "Mistral" (not Azure OpenAI)
      * Use your Azure endpoint as the Base URL
      * Use your Azure API key
    
    ## Configuration Notes
    
    * Mistral models support tool calling natively.
    * The default API endpoint `https://api.mistral.ai/v1` is used automatically when no custom Base URL is provided.
    * Mistral models are known for strong multilingual capabilities, particularly in European languages.
    
    ## Troubleshooting
    
    **Model not responding**: verify your API key is valid and that you have sufficient credits in your Mistral account. Ensure the model ID matches exactly (case-sensitive).
    
    **Authentication errors with Azure**: double-check that you're using "Mistral" as the SDK, not "Azure OpenAI". Verify your Azure endpoint URL is correct and accessible, and that your Azure API key has the necessary permissions.
    
    **Slow responses**: larger models may take longer for complex reasoning tasks. Consider using a smaller model for faster responses on simpler tasks.
    
    
    
    DeepSeek models connect via the DeepSeek API. All models are hosted in the US region.
    
    **Prerequisites:**
    
    1. A DeepSeek account with an [API key](https://platform.deepseek.com/api_keys)
    2. Admin access to your Odeus workspace
    
    ## Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    
    2. Fill in the connection fields:
       * **SDK**: select **DeepSeek**
       * **Base URL**: `https://api.deepseek.com/v1`
       * **Model ID**: see table below
       * **API Key**: paste your DeepSeek API key
       * **Region**: US
    
    3. For reasoning models (R1), enable **Always show reasoning** to surface the model's thinking in the UI.
    
    4. Click **Test & continue**, then click **Save model** after the test passes.
    
    ## Model IDs
    
    | Model ID            | Type                                                                            |
    | ------------------- | ------------------------------------------------------------------------------- |
    | `deepseek-reasoner` | Reasoning model (R1 series) — excels at step-by-step problem solving and coding |
    | `deepseek-chat`     | General-purpose model (V3 series) — fast responses, good for everyday tasks     |
    
    > Check [DeepSeek's API docs](https://api-docs.deepseek.com/quick_start/pricing) for the latest available models.
    
    ## Configuration Notes
    
    * DeepSeek models are hosted in the US region only.
    * DeepSeek R1 is a reasoning model — enable **Always show reasoning** to see its reasoning steps in the UI. DeepSeek models do not support image analysis.
    * The base URL must include the `/v1` path: `https://api.deepseek.com/v1`.
    
    ## Troubleshooting
    
    **Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).
    
    **Slow responses**: DeepSeek R1 (reasoning model) may take longer due to its step-by-step reasoning process. Consider using DeepSeek V3 for faster responses on simpler tasks.
    
    
    
    Perplexity's Sonar models combine LLM capabilities with real-time web search. All models are hosted in the US region.
    
    **Prerequisites:**
    
    1. A Perplexity account at [perplexity.ai](https://www.perplexity.ai)
    2. An API key from your [Perplexity API settings](https://www.perplexity.ai/settings/api)
    3. Admin access to your Odeus workspace
    
    ## Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    
    2. Fill in the connection fields:
       * **SDK**: select **Perplexity**
       * **Base URL**: leave empty to use the default (`https://api.perplexity.ai`)
       * **Model ID**: see table below
       * **API Key**: paste your Perplexity API key
       * **Region**: US
    
    3. Click **Test & continue**, then click **Save model** after the test passes.
    
    ## Model IDs
    
    | Model ID              | Type                                                                     |
    | --------------------- | ------------------------------------------------------------------------ |
    | `sonar-pro`           | Advanced search-augmented generation — detailed responses with citations |
    | `sonar`               | Fast search-augmented responses — good for general-purpose queries       |
    | `sonar-reasoning-pro` | Deep analysis with search — multi-step reasoning with citations          |
    | `sonar-reasoning`     | Reasoning with search augmentation                                       |
    
    > Check [Perplexity's model documentation](https://docs.perplexity.ai/guides/model-cards) for the full list of available models.
    
    ## Configuration Notes
    
    * Perplexity models include built-in web search capabilities, so they always have access to current information.
    * The API endpoint `https://api.perplexity.ai` is automatically used when no custom Base URL is provided.
    * Sonar Pro models provide more detailed responses with better source citations.
    * Reasoning variants are best for complex analytical tasks that benefit from step-by-step thinking.
    * Perplexity models do not support image analysis.
    
    ## Troubleshooting
    
    **Missing citations**: Perplexity models include citations automatically when web search is used. If citations are missing, the model answered from its base knowledge rather than a web search.
    
    **Slow responses**: Perplexity models perform web searches, which adds latency. Sonar (non-Pro) variants are faster than Pro versions. For time-sensitive tasks without search needs, consider using a different model.
    
    **Model not responding**: verify your API key is valid and that you have sufficient credits. Ensure the model ID matches exactly (case-sensitive).
    
    
    
    Use this for any API that follows the OpenAI spec — including vLLM, LiteLLM, Ollama, and self-hosted models.
    
    Many LLM inference solutions implement the OpenAI API specification as a standard interface. This means they accept requests and return responses in the same format as OpenAI's API, making them interchangeable from an integration perspective.
    
    Common OpenAI-compatible solutions:
    
    * **vLLM**: high-throughput inference server for large language models
    * **LiteLLM**: proxy server providing a unified interface to 100+ LLM providers
    * **Ollama**: run large language models locally
    * **Text Generation Inference (TGI)**: Hugging Face's inference server
    * **LocalAI**: self-hosted, OpenAI-compatible API
    * **Custom deployments**: any service implementing the OpenAI chat completions API
    
    **Prerequisites:**
    
    1. A running OpenAI-compatible inference endpoint accessible over HTTPS
    2. The base URL of your endpoint
    3. The model ID/name as configured in your inference server
    4. An API key (if your endpoint requires authentication)
    5. Admin access to your Odeus workspace
    
    ## Setup
    
    1. Go to [Workspace Settings -> Models](https://app.odeus.ai/settings/workspace/models) and click **Add custom model**.
    
    2. Fill in the connection fields:
       * **SDK**: select **OpenAI Compatible**
       * **Base URL**: your endpoint URL (e.g. `https://your-server.com/v1`). Required.
       * **Model ID**: the exact model identifier as configured in your inference server
       * **API Key**: your authentication key, or leave empty if not required
       * **Context Size**: the context window size of your model in tokens
    
    3. Click **Test & continue**, then click **Save model** after the test passes.
    
    > Your endpoint must be publicly accessible over HTTPS. Odeus blocks requests to private IPs, localhost, and internal hostnames. Contact [[email protected]](mailto:[email protected]) if you need to connect to an internal endpoint.
    
    ## Example Configurations
    
    | Server        | Base URL                     | Model ID                                                                |
    | ------------- | ---------------------------- | ----------------------------------------------------------------------- |
    | vLLM          | `https://your-server.com/v1` | Model name from vLLM startup (e.g. `meta-llama/Llama-3.1-70B-Instruct`) |
    | LiteLLM proxy | `https://your-litellm.com`   | Alias from your LiteLLM config                                          |
    | Ollama        | `https://your-ollama.com/v1` | Name from `ollama list` (e.g. `llama3.1`)                               |
    
    > For Azure OpenAI, use the dedicated **Azure** SDK instead. It handles API versioning and deployment-based URL routing automatically.
    
    ## Common Use Cases
    
    * **Data privacy**: run models on your own infrastructure so prompts and responses stay within your network.
    * **Cost optimization**: running open-source models on your own hardware can significantly reduce costs for high-volume use cases.
    * **Custom fine-tuned models**: deploy models fine-tuned for specific tasks or domains with vLLM or similar servers.
    * **Multi-provider abstraction**: use LiteLLM as a proxy to route requests to different providers from a single interface.
    
    ## Troubleshooting
    
    **Connection refused or timeout**: verify the endpoint is accessible from external servers over HTTPS. Check that your firewall allows incoming connections. Ensure your inference server is running and healthy.
    
    **Authentication errors**: verify your API key and check if your endpoint expects a specific `Bearer` token format.
    
    **Model not found**: ensure the Model ID matches exactly what your inference server expects (case-sensitive). Verify the model is loaded and available on your server.
    
    **Responses are cut off**: check the max output tokens setting in Odeus and verify your inference server's generation length limits.
    
    **Slow responses**: check your server's available GPU memory and compute resources. Consider using quantized model versions for faster inference. Monitor your server's queue length and scaling configuration.
    
    **Incompatible API format**: not all "OpenAI-compatible" servers implement the full API specification. Verify your server supports the `/v1/chat/completions` endpoint and check if your server requires specific API version headers.