The recovery time objective (RTO) defines the maximum disruption we have agreed is allowable. We currently define this as 4 hours across all public-facing sites, for example about.sourcegraph.com, sourcegraph.com, and sourcegraph.com/search. This does not currently include managed services, which have their RTO and RPO defined separately.
The recovery point objective (RPO) defines the point to which we can restore after an incident. For the primary Sourcegraph Cloud Postgres database, the agreed value is 1 hour. For other services we define a maximum recovery point of 24 hours. For example, LSIF data and gitserver persistent disks have daily snapshots taken.
Engineers working on business-critical applications on Sourcegraph should ensure that they are able to meet these requirements. If a service cannot meet them, please bring it to the attention of the distribution team so that we can document that service here.
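As a rough illustration of what meeting these requirements means in practice, the sketch below checks a backup age and an outage duration against the 4-hour, 1-hour, and 24-hour targets stated above. The service names (`postgres-primary`, `gitserver`) and the helper functions are hypothetical and exist only for this example; they do not describe any existing Sourcegraph tooling.

```python
from datetime import datetime, timedelta, timezone

# Targets taken from this page.
RTO = timedelta(hours=4)            # maximum allowable disruption
DEFAULT_RPO = timedelta(hours=24)   # services backed by daily snapshots

# Hypothetical service key for the primary Postgres database.
RPO_OVERRIDES = {
    "postgres-primary": timedelta(hours=1),
}


def rpo_for(service: str) -> timedelta:
    """Return the recovery point objective that applies to a service."""
    return RPO_OVERRIDES.get(service, DEFAULT_RPO)


def meets_rpo(service: str, last_backup: datetime, now: datetime) -> bool:
    """True if restoring from last_backup would lose no more data than the RPO allows."""
    return now - last_backup <= rpo_for(service)


def meets_rto(outage_start: datetime, restored_at: datetime) -> bool:
    """True if service was restored within the recovery time objective."""
    return restored_at - outage_start <= RTO


now = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)

# A 90-minute-old database backup exceeds the 1-hour RPO.
print(meets_rpo("postgres-primary", now - timedelta(minutes=90), now))  # False

# A 20-hour-old disk snapshot is within the 24-hour RPO.
print(meets_rpo("gitserver", now - timedelta(hours=20), now))  # True

# An outage resolved after 3 hours is within the 4-hour RTO.
print(meets_rto(now, now + timedelta(hours=3)))  # True
```

A check along these lines could run against real backup timestamps as part of monitoring, so that a service drifting outside its objective is noticed before an incident rather than during one.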
See RFC 281 for further details.