This document details the significant changes in Managed Instance v2.0 compared to the previous version.
Unless we explicitly call it out, you may assume things are unchanged.
Learn more from:
- Cloud v2 MVP shortlist
- Cloud v2 migration and decision docs
- Cloud v2 Orchestration docs
- Cloud v2 diagrams
- Cloud v2 remaining work
- Cloud v2 SOC2 working docs
The largest architectural change is the move from a standalone VM to GKE. Learn more from our Cloud v2 diagrams.
The Postgres database now uses a single Cloud SQL instance, which is a fully managed GCP service. It provides fully automated daily backups with point-in-time recovery, retained for 7 days. We also take an on-demand backup prior to each upgrade to provide a fallback plan for unanticipated events.
All services of a Cloud instance run on a dedicated GKE cluster. We use Backup for GKE to provide fully automated daily backups with retention set to 90 days. The backup includes all production disks and application state. Additionally, a backup is always taken prior to an upgrade or other major operation.
Deployment artifacts are stored in a centralized GitHub repository, sourcegraph/cloud.
Each environment is namespaced under environments/$env. A centralized repo makes sharing global configuration much easier compared to having multiple repos.
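As a minimal sketch of the namespacing convention, the helper below maps an environment name to its directory inside a sourcegraph/cloud checkout. The function name `env_dir` is hypothetical, and the sketch assumes that `dev` and `prod` (the two environments described in this document) are the only valid names.

```shell
# Hypothetical helper, not part of mi2: print the directory that holds
# an environment's configuration inside the sourcegraph/cloud repo.
# Assumption: only the dev and prod environments exist.
env_dir() {
  case "$1" in
    dev|prod) printf 'environments/%s\n' "$1" ;;
    *) printf 'unknown environment: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

For example, `cd "$(env_dir dev)"` from the root of the repo would enter the dev environment's directory.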
Learn more from diagram
This is our internal development environment. All dev deployments should be short-lived and should always be torn down when they are no longer needed.
All engineering teammates are allowed to create instances and run experiments in the dev environment. Access is generally unrestricted.
Access to the prod environment is restricted and follows our access policy.
This is our production environment, consisting of internal and customer instances. All prod deployments are long-lived.
Below is a list of long-lived internal instances:
Internal instances are created for various testing purposes:
- testing changes prior to the monthly upgrade on customer instances. When a new release is made available, the Cloud team follows the managed instances upgrade tracker (created prior to each monthly upgrade) to proceed with the upgrade process.
- testing significant operational changes prior to applying to customer instances
- long-lived instances for product teams to test important product changes, e.g. scaletesting.
All customer instances are considered part of the
prod environment and all changes applied to these customers should be well-tested in the
dev environment and internal instances.
The following processes only apply to Cloud v2.0:
- Create a Managed Instance
- Restore a Managed Instance
- Upgrade a Managed Instance
- Delete a Managed Instance
- Disaster Recovery process for a Managed Instance
Below are the bare minimum prerequisites for working with Cloud instances:
- sourcegraph/cloud: deployment repo where we persist all Cloud instances config and deployment artifacts
- sourcegraph/controller: mi2 - cloud controller source code
- Install mi2 by following sourcegraph/controller#installation
- Sufficient access to GCP projects, see below FAQ to learn how to request access.
Let’s walk through the process of accessing a Cloud instance:
gh repo clone sourcegraph/cloud
Then you can start running various mi2 commands to work with a specific Cloud instance (the current instance is inferred from the current working directory).
# start a proxy to the database instance
mi2 instance db proxy
Learn more from the mi2 CLI reference for detailed usage and examples.
We use Entitle to provide time-bound access to GCP infrastructure for both the production and development environments.
Use the slash command in Slack: type /access_request in any chat window and hit enter. Fill out the following values:
- Search permission: One of
Cloud V2 Dev Access,
Cloud V2 Prod Access
- Permission duration: Request the minimum amount of time you need
- Add justification: Add a note explaining why access is needed
Mention @security-support in #cloud for immediate attention if the request is time sensitive. If the request is related to an ongoing incident, please page the Cloud on-call engineer using OpsGenie.
Learn more from Request access to Cloud instances UI
There are two ways
If you are not sure about the environment of an instance, search on s2:
repo:^github\.com/sourcegraph/cloud$ file:config.yaml <insert customer name or domain name as keyword to filter>
INSTANCE_ID is the value of
If you know the slug of the instance, run the command below at the root of the sourcegraph/cloud deployment repo to retrieve the instance ID:
mi2 instance get -e $ENVIRONMENT --slug $CUSTOMER | jq -r '.metadata.name'
Learn more from CLI reference.
Run below command to retrieve the credentials and configure the proper
mi2 instance workon -exec
Then run the usual kubectl commands to interact with the cluster. Additionally, you can always use the GKE UI in the GCP Console if you prefer.
In v2, we use the mi2 CLI to dynamically generate the cdktf stacks for each module.
From the cloud repo, run the following:
mi2 workflow run -e $ENVIRONMENT -exec -exec.concurrency 4 generate-cdktf
Commit the changes and open a pull request.
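If stacks need regenerating for more than one environment before committing, the command above can be looped. This is a minimal sketch: the function name `generate_cdktf_all` is hypothetical, the flags are copied verbatim from the command above, and it assumes mi2 exits with a non-zero status on failure.

```shell
# Hypothetical wrapper, not part of mi2: regenerate the cdktf stacks
# for each environment given as an argument, stopping at the first failure.
# Assumption: mi2 returns a non-zero exit status when generation fails.
generate_cdktf_all() {
  for env in "$@"; do
    mi2 workflow run -e "$env" -exec -exec.concurrency 4 generate-cdktf || return 1
  done
}
```

For example, `generate_cdktf_all dev prod` would run the workflow for dev first and only continue to prod if it succeeds.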
The following modules have auto-apply enabled, hence when they’re changed, no action is required once they are merged
For other modules, we recommend the process below.
# retrieve status of the plan
# make sure to run `--help` to learn more about different output format options
mi2 instance tfc check $module_name
# confirm the plan and apply it
mi2 instance tfc confirm
We will add more step-by-step instructions in the future.
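The check-then-confirm process above can be sketched as a small wrapper that forces a human to read the plan before it is applied. The function name `tfc_review_and_apply` and the interactive prompt are hypothetical; only the two mi2 commands come from this document. The sketch assumes it runs from an instance directory (so mi2 can infer the instance) and that `mi2 instance tfc check` exits non-zero on failure.

```shell
# Hypothetical wrapper, not part of mi2: review a module's plan, then
# apply it only after explicit confirmation.
tfc_review_and_apply() {
  module_name="$1"
  if [ -z "$module_name" ]; then
    echo "usage: tfc_review_and_apply <module_name>" >&2
    return 2
  fi

  # Show the plan status first; stop if the check itself fails.
  # Assumption: a failed check returns a non-zero exit status.
  mi2 instance tfc check "$module_name" || return 1

  # Ask before applying so the plan is always read by a human.
  printf 'Confirm and apply this plan? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) mi2 instance tfc confirm ;;
    *) echo "plan not confirmed" ;;
  esac
}
```

Defaulting to "no" on any answer other than `y` keeps an accidental Enter keypress from applying a change.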
Depending on the complexity and blast radius of the change, you may consider sampling the plan outputs of a few instances, then using the mi2 workflow command to apply the change across all instances at once.
You can also use the mi2 workflow command to aggregate the raw plan output of all instances and run precise checks on it to ensure the plan output is exactly what you expect.
To use the fork in GitHub Actions, modify the setup-mi2 action to reference the fork and pin it to a specific commit, branch, or tag.
- name: setup mi2 tooling
  uses: ./.github/actions/setup-mi2
  with:
    # Add a comment explaining why a fork is required
    # cdktf-version: 0.13.3
    cdktf-repository: sourcegraph/terraform-cdk
    cdktf-ref: fix/tfc-planned-status
Use the fork locally:
gh repo clone sourcegraph/terraform-cdk
cd terraform-cdk
yarn install
yarn build
# in your shell config file or within the terminal session
alias cdktfl=/abspath-to-terraform-cdk-repo/packages/cdktf-cli/bundle/bin/cdktf
Then replace all cdktf commands with cdktfl.
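Bash does not expand aliases in non-interactive shells by default, so the alias above will not work inside scripts. A shell function is one possible alternative. This is a minimal sketch: the variable `CDKTF_FORK_DIR` is hypothetical and is assumed to point at your local clone of sourcegraph/terraform-cdk after `yarn build` has completed.

```shell
# Hypothetical alternative to the alias: run the locally built cdktf
# from the fork. CDKTF_FORK_DIR is an assumed variable, not something
# mi2 or cdktf reads on its own.
cdktfl() {
  if [ -z "${CDKTF_FORK_DIR:-}" ]; then
    echo "CDKTF_FORK_DIR is not set" >&2
    return 1
  fi
  "$CDKTF_FORK_DIR/packages/cdktf-cli/bundle/bin/cdktf" "$@"
}
```

With `CDKTF_FORK_DIR` exported, `cdktfl synth` would invoke the fork's binary with the same arguments you would pass to cdktf.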