Cody Gateway

Sourcegraph Cody Gateway powers the default "provider": "sourcegraph" Cody completions and embeddings for Sourcegraph Enterprise customers on Sourcegraph 5.1 or later, both on-prem and in Sourcegraph Cloud. It supports a variety of upstream LLM providers, such as Anthropic and OpenAI, with rate limits, quotas, and model availability tied to Sourcegraph Enterprise product subscriptions. Sourcegraph App users with accounts will also be able to use Sourcegraph Cody Gateway.

In general, we have two Cody Gateway deployments running:

  • - for production usage
  • - for development and testing



See Cody Gateway: 1-pager for a recent overview of Cody Gateway architecture and details.

Service images

Source code for Cody Gateway is in sourcegraph/sourcegraph/cmd/cody-gateway. The image is built the same way as any other Sourcegraph service, i.e. with the insiders tag and the standard main-branch tags.

Local development

For local development, refer to How to set up Cody Gateway locally.


Deployment

Cody Gateway infrastructure is defined in Terraform in sourcegraph/infrastructure/cody-gateway/envs, corresponding to each of the long-running Cody Gateway instances:

To get access to most resources, you’ll need to request infrastructure access.

Infrastructure access

The following Entitle requests can be used to get access to Cody Gateway infrastructure:


See above for links to each of these resources for each deployment. In all cases, you’ll need to request infrastructure access.


Alerting

We have several tiers of alerting for each Cody Gateway instance to help notify engineers if something has gone wrong:

  1. Error reporting
    1. Sentry: All error-level application logs with errors attached, such as:
      1. Internal or background errors
      2. 5xx response details
    2. GCP Error Reporting: All GCP-generated events, such as:
      1. Cloud Run instance panics or failure to start
      2. Unable to route request to a Cloud Run instance (e.g. if no instance is available)
  2. Metrics alerting
    1. GCP Alerting Policies: Policies provisioned through Terraform, covering facets such as:
      1. Cloud Run service health: startup latency, CPU utilization, memory utilization, instance count, request latency, etc.
      2. Cloud Redis service health: CPU utilization, memory utilization, etc.

All alerts from all environments currently go to #alerts-cody-gateway.


Dashboards

Each deployment’s Cloud Run metrics overview page provides basic observability into the service out-of-the-box from Cloud Run, such as instance count and resource utilization. Similarly, we depend on out-of-the-box dashboards and metrics from Managed Redis as well.

Cody Gateway also pushes a few custom metrics via OpenTelemetry - hand-made dashboards for the prod instance are available here, including our concurrent upstream requests graph.


Traces

Each instance also collects and exports traces to Google Cloud Trace. Common ways of approaching traces:

  • Each HTTP request trace can be correlated with the corresponding originating trace through a span link - this gives you the trace ID that you can use to find the corresponding trace in the sourcegraph-dev project.
    • Note: For now, ignore the automatically generated traces (component: AppServer), as those currently aren’t attached to our application spans.
  • Log entries and Sentry error events will generally have trace IDs attached to them, which can be used to find the corresponding trace in Cloud Trace.
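For example, given a trace ID pulled from a log entry or Sentry event, you can jump straight to Cloud Trace by constructing the console URL. This is only a sketch - the project name below is illustrative, not the real deployment project:

```shell
# Build a Cloud Trace deep link from a trace ID.
# PROJECT is a placeholder - substitute the deployment's actual GCP project.
TRACE_ID="4bf92f3577b34da6a3ce929d0e0e4736"
PROJECT="cody-gateway-prod"
echo "https://console.cloud.google.com/traces/list?project=${PROJECT}&tid=${TRACE_ID}"
```

The same URL shape works for traces in the sourcegraph-dev project when following a span link back to the originating trace.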

Cody Gateway traces can give you tons of useful information about a request, including:

  • Rate limit information (concurrent requests, consumed quota, current consumption, reason for rejection, and more)
  • What happened in requests to upstream providers
  • Full BigQuery event that was collected
  • Whether a quota usage notification was sent

See above for links to Cloud Trace.

Tracing from a Sourcegraph instance

Cody Gateway is configured to accept all incoming trace contexts as its parent. For Sourcegraph instances in GCP, this means that simply collecting a trace for a request will automatically link up with any corresponding trace in Cloud Trace for Cody Gateway when a trace is viewed in GCP.

The trace that Cody Gateway collects is returned in x-trace and x-span headers on all responses - these are also set as attributes (cody-gateway.x-trace and cody-gateway.x-span respectively) on one of the outgoing spans in Sourcegraph. These can be used to find the Cody Gateway side of the trace in Cloud Trace.
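As a sketch of pulling those headers out of a response (the header values below are made up - in practice you would capture real headers with something like curl -sD - -o /dev/null against the gateway):

```shell
# Simulated response headers from a Cody Gateway request (values are made up)
headers='HTTP/2 200
x-trace: 4bf92f3577b34da6a3ce929d0e0e4736
x-span: 00f067aa0ba902b7'

# Extract the trace and span IDs to look up in Cloud Trace
trace=$(printf '%s\n' "$headers" | awk -F': ' '$1 == "x-trace" {print $2}')
span=$(printf '%s\n' "$headers" | awk -F': ' '$1 == "x-span" {print $2}')
echo "trace=$trace span=$span"
```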


Rollouts

To roll out a new Cody Gateway build:

  • Make a PR that updates cody-gateway/envs/prod/cloudrun/ to point to the new build. The image must be in the standard main-branch tag format e.g. 218287_2023-05-10_5.0-5bd03cd18e71.
  • Go to the “Deploy revision” page of the Cloud Run service and click “Deploy” without changing any configuration - this will redeploy the service with the latest cody-gateway:insiders image.
    • This will also happen whenever a Terraform change happens to the cloudrun module.
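If you’re unsure whether a tag is in the right shape before opening the PR, a quick sanity check can help. The regex below is an approximation of the build-number/date/version/commit pattern shown above, not an official specification:

```shell
# Rough shape of a main-branch tag: <build number>_<YYYY-MM-DD>_<version>-<commit sha>
tag="218287_2023-05-10_5.0-5bd03cd18e71"
if printf '%s\n' "$tag" | grep -Eq '^[0-9]+_[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]+\.[0-9]+-[0-9a-f]+$'; then
  echo "looks like a main-branch tag"
else
  echo "unexpected tag format"
fi
```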

To configure alerting, some initial setup outside of the Terraform module is required, as Terraform modules may not be available or configured:

  1. In Sentry, under each Project, set up an alert rule for all issues that sends notifications to the desired channels.
  2. In GCP Error Reporting, set up alerting through the notification channel(s) provisioned through Terraform.

Usage events

Usage data is collected for a variety of events going through Cody Gateway and sent to BigQuery. BigQuery data can be found in the events table of the following datasets:

See internal/codygateway for the list of event types that are currently tracked.

Data can be queried directly in BigQuery tables above (requires infrastructure access), or in Redash by querying the table for production events. Some sample Redash queries you can use or fork and edit:

A simple overview can also be seen in each product subscription’s licenses page - see Using Cody Gateway: Analyzing usage.
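As a starting point for ad-hoc analysis, a query along these lines counts events per type over the last week. The dataset, table, and column names here are assumptions for illustration - check the actual table schema in BigQuery first:

```shell
# Sample usage query; dataset/table/column names are illustrative, not authoritative
cat > /tmp/cody_gateway_events.sql <<'SQL'
SELECT
  name,
  COUNT(*) AS event_count
FROM `my-project.cody_gateway.events`  -- replace with the real project and dataset
WHERE created_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY name
ORDER BY event_count DESC
SQL

# Run it with the bq CLI (requires BigQuery access):
# bq query --nouse_legacy_sql < /tmp/cody_gateway_events.sql
```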


Redis

A Google managed Redis instance is provisioned alongside each Cody Gateway deployment to handle caching and rate limiting. The Cody Gateway Cloud Run service connects to this Redis instance via a VPC network - the Redis instance is not directly accessible over the public internet, so you cannot connect to it locally like the Redis that is bundled with Sourcegraph deployments.

To connect to the Redis instance locally for investigative or debugging purposes:

In sourcegraph/infrastructure, update cody-gateway/envs/$ENVIRONMENT/cloudrun/ to set:

module "cloudrun" {
  # ...
  deploy_network_compute_instance = true
}

Create a PR and merge it for Terraform Cloud to apply.

Once applied, you can SSH via an IAP tunnel into the compute instance provisioned in the same VPC as the Cloud Run service and Redis instance:

export PROJECT=""
export REDIS_HOST="" # find the address of Redis instance in GCP Console
gcloud compute ssh cody-gateway-network-connector --project=$PROJECT --zone=us-central1-c --tunnel-through-iap -- -N -L 6378:$REDIS_HOST:6378

In another terminal, you can then connect to Redis locally on port 6378:

export REDISCLI_AUTH="" # find the auth string of Redis instance in GCP Console
redis-cli -p 6378 --tls --insecure

When done, make sure to set deploy_network_compute_instance = false again.

Service accounts

Cody Gateway gets its access through standard user accounts that are configured with feature flags to enable special access to GraphQL queries and mutations related to product subscriptions.

The current accounts are as follows:

  • llm-proxy-readonly - this account is the default one provisioned for cody-gateway instances, and should have read-only access to product subscriptions.
  • llm-proxy - this account should have read and write access on cody-gateway-related resources. This is primarily used for Sourcegraph Cloud integration, where we need to be able to manage cody-gateway access for product subscriptions.

More details for each account are available in the 1password entries linked above.