# MI v1.1 on-prem data migration
This page describes the current process for performing a full data migration of an on-prem Sourcegraph instance to an MI v1.1 Cloud instance.
## Requirements
To qualify for a data migration, the customer must:
- have a Sourcegraph instance on v3.20.0 or later
- use databases on Postgres 11 or later
- not have on-disk database encryption enabled
- have the latest release of `src`
- have direct database access
- have a site-admin access token for their instance
An operator must:
- have the latest build of `src` installed
- have the `gcloud` CLI installed
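As a quick sanity check before starting, the operator can verify some of these requirements from the command line - a minimal sketch, assuming direct `psql` access to the customer's databases (`$PGDSN` is a placeholder for the customer's primary database connection string):

```sh
# src reports its own version; it should match the latest src release.
src version

# Postgres must be 11 or later - $PGDSN is a hypothetical placeholder for the
# customer's primary database connection string.
psql "$PGDSN" -c 'SHOW server_version;'
```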
## Process
### Prepare instance
The customer should add a non-dismissible site notice to their instance in global settings:
```json
{
  "notices": [
    {
      "dismissible": false,
      "location": "top",
      "message": "🚨 A Sourcegraph instance migration is underway - changes to configuration might not be persisted, and performance may be affected, until the migration is finalized."
    }
  ]
}
```
### Create snapshot contents
#### Databases
The customer should first be asked to create `pg_dump` exports of their Sourcegraph databases. Template commands for various configurations can be generated with `src snapshot databases`:
```text
$ src snapshot databases --help
'src snapshot databases' generates commands to export Sourcegraph database dumps.
Note that these commands are intended for use as reference - you may need to adjust the commands for your deployment.

USAGE
  src [-v] snapshot databases <pg_dump|docker|kubectl> [--targets=<docker|k8s|"targets.yaml">]

TARGETS FILES
  Predefined targets are available based on default Sourcegraph configurations ('docker', 'k8s').
  Custom targets configuration can be provided in YAML format with '--targets=target.yaml', e.g.

    primary:
      target: ...   # the DSN of the database deployment, e.g. in docker, the name of the database container
      dbname: ...   # name of database
      username: ... # username for database access
      password: ... # password for database access - only include password if it is non-sensitive
    codeintel:
      # same as above
    codeinsights:
      # same as above

  See the pgdump.Targets type for more details.
```
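As an illustration, a targets file for a default docker-compose deployment might look roughly like the following - the container names, database names, and usernames are assumptions based on common Sourcegraph docker-compose defaults, so verify every value against the customer's actual deployment:

```sh
# Hypothetical targets file for a docker-compose deployment - illustrative only.
cat > targets.yaml <<'EOF'
primary:
  target: pgsql          # database container name
  dbname: sg
  username: sg
codeintel:
  target: codeintel-db
  dbname: sg
  username: sg
codeinsights:
  target: codeinsights-db
  dbname: postgres
  username: postgres
EOF
src snapshot databases docker --targets=targets.yaml
```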
Each of the generated commands must be run to completion to generate a database dump for each database. The output is as follows:
```text
src-snapshot/primary.sql
src-snapshot/codeintel.sql
src-snapshot/codeinsights.sql
```
For custom or complex database setups, the operator will decide how best to proceed in collaboration with IE/CSE/etc. The end goal is to generate the above database dumps in a format aligned with the output of `src snapshot databases pg_dump` (the plain `pg_dump` commands).
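For reference, the generated plain `pg_dump` commands have roughly the following shape - the host, port, credentials, and flags shown here are illustrative, and the output of `src snapshot databases pg_dump` is authoritative:

```sh
# Illustrative only - connection details are placeholders.
mkdir -p src-snapshot
pg_dump --host=127.0.0.1 --port=5432 --username=sg --dbname=sg --no-owner > src-snapshot/primary.sql
```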
#### Instance summary
A snapshot summary is used to run acceptance tests post-migration. The customer should create one with `src snapshot summary` - note that a site-admin access token is required:
```sh
src login # configure credentials for the instance
src snapshot summary
```
This will generate a JSON file at `src-snapshot/summary.json`. See `src snapshot summary --help` for more details.
### Create migration resources
First, the operator must create an instance with the configuration for the desired final Cloud instance and freeze it:

```sh
mi ssh-exec 'cd /deployment/docker-compose && docker-compose down'
```
In the `cloud-data-migrations` repository:

- Copy the `template/` directory, naming it after the customer
- For `project/`:
  - Fill out all `$CUSTOMER` variables and set all unset variables in `terraform.tfvars` as documented (see the sketch after this list)
  - Commit and push your changes
  - Create a Terraform Cloud workspace for the directory and apply
- Then do the same for `resources/`, using the outputs of `project/`
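One mechanical way to handle the fill-out step - the directory and customer name `acme` here are hypothetical:

```sh
# 'acme' is an illustrative customer name; adjust for the real directory.
# GNU sed shown - on macOS use `sed -i ''`.
grep -rl '\$CUSTOMER' acme/ | xargs sed -i 's/\$CUSTOMER/acme/g'
```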
Once `resources/` has been applied, you should have outputs for a GCP bucket and a GCP service account with write-only access to it. Create a 1Password share entry with these outputs:

- `snapshot_bucket_name`
- `writer_service_account_key`
Outputs can also be retrieved from the Terraform state of `resources/`:
```sh
cd resources/
terraform init
# Bucket name
terraform output -json | jq -e -r .snapshot_bucket_name.value
# Credentials, sent to file
terraform output -json | jq -e -r .writer_service_account_key.value > credential.json
```
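Before sharing, the credential can be sanity-checked - a sketch, assuming `gsutil` is installed and `$BUCKET` holds the `snapshot_bucket_name` output; note this switches your active `gcloud` account:

```sh
gcloud auth activate-service-account --key-file=credential.json
# The service account is write-only by design, so a successful upload is the
# signal - listing or deleting objects may be denied.
echo test | gsutil cp - "gs://$BUCKET/write-test.txt"
```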
### Upload snapshot contents
If the steps to create snapshot contents were followed correctly, running `src snapshot upload` with the appropriate bucket and credentials will find the snapshot contents and upload them to the configured bucket:

```sh
src snapshot upload -bucket=$BUCKET -credentials=$CREDENTIALS_FILE
```
Once the customer has indicated the upload succeeded, validate the contents of the bucket to ensure everything is there:
```text
primary.sql
codeintel.sql
codeinsights.sql
summary.json
```
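For example, with operator credentials (which, unlike the customer's write-only account, can list the bucket):

```sh
gsutil ls "gs://$BUCKET/"
```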
Audit logs are generated for bucket access in the project's logs, under log entries with `@type: "type.googleapis.com/google.cloud.audit.AuditLog"`.
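They can be inspected with `gcloud logging read` - `$MIGRATION_PROJECT` here is a placeholder for the migration resources project:

```sh
gcloud logging read \
  'protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"' \
  --project="$MIGRATION_PROJECT" --limit=10
```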
### Reset databases
First, prepare the Cloud database for import:
```sh
gcloud components install cloud_sql_proxy
```
Get the following data from the Cloud v1.1 instance created in `deploy-sourcegraph-managed`:
```sh
# from GCP project
export TARGET_INSTANCE_PROJECT="sourcegraph-managed-migration"
export TARGET_INSTANCE_DB="main-47cc60a2"
# in deployment dir
export INSTANCE_ADMIN_PASSWORD=$(terraform show -json | jq -r '.values.root_module.child_modules[].resources[] | select(.address == "module.managed_instance.random_password.db_main_admin_password") | .values.result')
# from migration resources output
export DB_DUMP_BUCKET=""
```
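Optionally, the admin credentials can be verified by connecting through the proxy - a sketch; the connection name is looked up live, and the admin username (`dbadmin` below) is an assumption to replace with the instance's actual admin user:

```sh
# Look up the instance connection name.
CONNECTION_NAME=$(gcloud sql instances describe "$TARGET_INSTANCE_DB" \
  --project="$TARGET_INSTANCE_PROJECT" --format='value(connectionName)')
# Run the proxy locally on port 5433.
cloud_sql_proxy -instances="$CONNECTION_NAME"=tcp:5433 &
# 'dbadmin' is a placeholder - use the instance's real admin username.
psql "host=127.0.0.1 port=5433 user=dbadmin password=$INSTANCE_ADMIN_PASSWORD dbname=postgres" -c 'SELECT version();'
```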
Then in `cloud-data-migrations`, drop all database contents:

```sh
cmd/cdi/recreate_dbs.sh
```
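The script is authoritative, but conceptually it amounts to dropping and recreating each target database so imports start from empty - a sketch, with `$ADMIN_DSN` as a placeholder for an admin connection through the proxy:

```sh
# Database names match the import targets below; DROP DATABASE cannot run
# inside a transaction block, so each statement is issued separately.
for db in pgsql codeintel-db codeinsights-db; do
  psql "$ADMIN_DSN" -c "DROP DATABASE IF EXISTS \"$db\""
  psql "$ADMIN_DSN" -c "CREATE DATABASE \"$db\""
done
```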
### Import databases
Ensure databases have been reset. Then, one by one, import each database from the bucket the customer has uploaded to:
```sh
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_INSTANCE_DB gs://$DB_DUMP_BUCKET/primary.sql --database=pgsql
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_INSTANCE_DB gs://$DB_DUMP_BUCKET/codeintel.sql --database=codeintel-db
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_INSTANCE_DB gs://$DB_DUMP_BUCKET/codeinsights.sql --database=codeinsights-db
```
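Each import runs as a Cloud SQL operation; if a command returns before the data is visible, check the operation status:

```sh
gcloud sql operations list --instance="$TARGET_INSTANCE_DB" \
  --project="$TARGET_INSTANCE_PROJECT" --limit=5
```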
### Upgrade databases
Start the database proxy on the instance:
```sh
mi ssh-exec 'cd /deployment/docker-compose && docker-compose up -d cloud-sql-proxy'
```
If the imported version is fewer than two versions behind Cloud, you should be able to simply run the migrator:

```sh
mi ssh-exec 'cd /deployment/docker-compose && docker-compose up migrator'
```
Otherwise, you may need to run a multi-version upgrade:

```sh
mi ssh-exec 'cd /deployment/docker-compose && docker run --env-file .env --network docker-compose_sourcegraph sourcegraph/migrator:$TO upgrade -from=$FROM -to=$TO'
```
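For example, with illustrative versions (set `FROM` to the customer's source release and `TO` to the Cloud target; note the double quotes so the variables expand locally before the command is sent to the instance):

```sh
# Versions below are illustrative only.
FROM=3.30.0 TO=4.2.0
mi ssh-exec "cd /deployment/docker-compose && docker run --env-file .env --network docker-compose_sourcegraph sourcegraph/migrator:$TO upgrade -from=$FROM -to=$TO"
```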
### Spin up instance
If all upgrades succeed, spin up the instance:
```sh
mi ssh-exec 'cd /deployment/docker-compose && docker-compose up -d'
```
Sync configuration:
```sh
mi init-instance
```
Run a health check:
```sh
mi check --executors
```
Run an acceptance test using the `summary.json` downloaded from the snapshot bucket.
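First fetch the summary the customer uploaded (operator credentials; `$BUCKET` as before):

```sh
gsutil cp "gs://$BUCKET/summary.json" ./summary.json
```

Then run the test: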
```sh
src login # to the instance
src snapshot test -snapshot-summary="./summary.json"
```
Remove the migration notice that was added previously.