How repo-updater works
Sourcegraph mirrors repositories from code hosts. Code hosts may be SaaS products, such as GitHub or AWS CodeCommit, or local installations that are private to a customer’s environment. The repo-updater service schedules repository synchronization activities using gitserver and any configured code hosts.
The repo-updater instance exposes an HTTP server as its primary interface. This interface allows clients to schedule synchronization requests for the following:
- Code host
- Repository permission
Although the majority of Git operations are issued directly to gitserver, clones and fetches are routed through repo-updater to ensure that code host limits and other concerns are respected.
As noted earlier, there are a variety of code hosts that Sourcegraph can integrate with. The Source interface abstracts these code host communication details. For example, listing GitHub repositories is handled differently than listing GitLab repositories.
The service’s key data structure is a priority queue of repository updates. It implements both heap.Interface and sort.Interface, and behaves as follows:
- Updates are sorted using simple heuristics based on repository metadata
- Queue positions can be modified in response to explicit requests
- Priority levels can be set for permissions and authorization updates
- Updates are handled via background worker jobs
- The external_service_sync_jobs_with_next_sync_at view provides insights into the priority queue’s activities and current depth
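To make the queue behavior above concrete, here is a minimal sketch of a priority queue built on Go's `container/heap`. It is not the repo-updater implementation: the `repoUpdate` and `updateQueue` types, their fields, and the ordering rule (higher priority first, then insertion order) are assumptions made for illustration. Because `heap.Interface` embeds `sort.Interface`, a type that satisfies the former also satisfies the latter.

```go
package main

import (
	"container/heap"
	"fmt"
)

// repoUpdate is a hypothetical queued update for one repository.
type repoUpdate struct {
	Repo     string
	Priority int // higher values are dequeued first
	Seq      int // insertion order, used as a tie-breaker
	index    int // position in the heap, maintained by Swap/Push/Pop
}

// updateQueue implements heap.Interface (and therefore sort.Interface).
type updateQueue struct {
	items  []*repoUpdate
	byRepo map[string]*repoUpdate
	seq    int
}

func newUpdateQueue() *updateQueue {
	return &updateQueue{byRepo: map[string]*repoUpdate{}}
}

func (q *updateQueue) Len() int { return len(q.items) }

func (q *updateQueue) Less(i, j int) bool {
	a, b := q.items[i], q.items[j]
	if a.Priority != b.Priority {
		return a.Priority > b.Priority
	}
	return a.Seq < b.Seq
}

func (q *updateQueue) Swap(i, j int) {
	q.items[i], q.items[j] = q.items[j], q.items[i]
	q.items[i].index = i
	q.items[j].index = j
}

func (q *updateQueue) Push(x any) {
	u := x.(*repoUpdate)
	u.index = len(q.items)
	q.items = append(q.items, u)
}

func (q *updateQueue) Pop() any {
	n := len(q.items)
	u := q.items[n-1]
	q.items[n-1] = nil
	q.items = q.items[:n-1]
	u.index = -1
	return u
}

// enqueue adds a repository, or raises the priority of one that is
// already queued (modelling an explicit request to update it sooner).
func (q *updateQueue) enqueue(repo string, priority int) {
	if u, ok := q.byRepo[repo]; ok {
		if priority > u.Priority {
			u.Priority = priority
			heap.Fix(q, u.index)
		}
		return
	}
	q.seq++
	u := &repoUpdate{Repo: repo, Priority: priority, Seq: q.seq}
	q.byRepo[repo] = u
	heap.Push(q, u)
}

// dequeue removes and returns the next repository to update.
func (q *updateQueue) dequeue() (string, bool) {
	if q.Len() == 0 {
		return "", false
	}
	u := heap.Pop(q).(*repoUpdate)
	delete(q.byRepo, u.Repo)
	return u.Repo, true
}

func main() {
	q := newUpdateQueue()
	q.enqueue("a", 0)
	q.enqueue("b", 0)
	q.enqueue("c", 0)
	q.enqueue("c", 1) // an explicit request moves "c" to the front
	for {
		repo, ok := q.dequeue()
		if !ok {
			break
		}
		fmt.Println(repo) // prints c, then a, then b
	}
}
```

Storing each item's heap index lets `heap.Fix` reposition an already queued update in logarithmic time instead of rebuilding the queue.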
There is exactly one instance of repo-updater running, by design. This allows us to:
- Avoid expensive coordination issues
- Respect the aforementioned code host limits
Before repo-updater can begin accepting work, it needs to check that the following services are running and responsive to pings:
- frontend - implemented by the internal API client
- gitserver instances - implemented by the gitserver client
See “How gitserver works: Production instances” for more information.
When repo-updater is running in sourcegraph.com mode, it verifies that certain code hosts (specifically GitHub and GitLab) are properly configured. This is a requirement for us to be able to automatically add repositories from those code hosts when users browse to them.
We track a variety of metrics in repo-updater that you’ll want to familiarize yourself with. For example: