The Data & Analytics team uses two primary tools for dashboards and reporting: Looker and Amplitude. Other tooling is used across the org (e.g., Google Analytics and Google Sheets), but these two tools house the majority of the enterprise reporting content owned by D&A.
Looker is a business intelligence tool used for standard enterprise reporting and ad hoc reporting and analysis. All Sourcegraph employees can have View access to Looker. Some groups have the ability to create reports/explore datasets. If you need help getting access to Looker, please reach out to #ask-it-tech-ops.
March 2023 update
Please be advised that on March 31, 2023 we transitioned to a new project in Looker in order to improve performance, deprecate outdated content, add new content, and give existing content a refresh. As a result, a lot of content was changed and reorganized, and links to content from the old project have broken. Please see the FAQ for more details about these changes and how to navigate the new project. We’ve also created a video demo to help you find old content in the new project.
Looker is essentially the frontend interface for our BigQuery data warehouse. The data available in BigQuery/Looker is evolving all the time, but here are a few of the primary data sources:
- Pings data from free, on-prem instances, customer managed/on-prem instances, and Sourcegraph.com
- Eventlogger’s event-stream usage data for sourcegraph.com and about.sourcegraph.com (for customer event-stream data, please see Amplitude)
- Sourcegraph app data
- Salesforce data, like information about accounts, opportunities, leads, etc.
To find content you may be interested in, we recommend subscribing to “boards” that are relevant to you. Boards are collections of dashboards and looks related to a specific team or initiative. We’ve curated these boards to make it easy for you to browse content for the teams/initiatives you’re interested in.
A few of these boards are listed below, but our boards are evolving every day, so we recommend browsing our boards in Looker directly. You can do this by going to the menu (from wherever you are in Looker, click the three lines in the top left to go to the menu!), then clicking the “+” next to “Boards” in the left navigation menu and selecting “Browse all boards.”
- AE/CE team board: This board contains a variety of dashboards and looks that might be relevant to AEs and CEs, mostly pertaining to customer/prospect product usage (daily active users, batch changes activity, support tickets, etc)
- SDR team board: This board contains dashboards that are relevant to SDRs / anyone doing prospecting, such as a dashboard that enables you to see if any target accounts have existing Sourcegraph instances
- Technical advisor team board: This board contains content that TAs will find particularly useful, such as detailed information about product usage on customer instances, customer support tickets, seat consumption data, etc.
- Customer support team board: This board contains dashboards that might be helpful for debugging customer issues. It contains data about deployment types, code hosts, dependency versions, etc.
- Finance team board: This board contains content that is used in regular finance reporting or for specific finance team initiatives
- Marketing team board: This board contains dashboards and content that the marketing team will find particularly useful, such as campaign performance overviews and sourcegraph.com web traffic data
We don’t have boards for every team yet, but we’re working on it!
- Product-led growth board: Contains data on PLG efforts, such as Sourcegraph App KPIs
- Instance configuration board: Contains data on things like instance deployment methods, Postgres dependency versions, customer telemetry status, etc. We recommend engineers subscribe to this board
- Sourcegraph.com traffic analytics board: Contains web traffic data for Sourcegraph websites (e.g., pageviews, sessions, clicks)
- Company-wide metrics board (https://sourcegraph.looker.com/boards/58): This board contains data for company metrics that are top-of-mind, such as DAUs, DAU/MAU ratio, seat consumption ratio, and how many customers are on the most recent version. We recommend everyone add this board! You never know, what you learn may come in handy during Sourcegraph trivia at Merge :)
It depends on the dataset, but most of our data pipelines/transformations run somewhere between once an hour and once a day, so the data you see in Looker should generally be close to real time. However, we do have a few datasets (specifically customer telemetry status, and linking a customer to their instance identifiers) that are currently updated manually, and therefore a bit less frequently.
If you suspect you’re looking at stale data on a dashboard or look, be sure to:
- Check the filters: most of our charts are filtered by date, often with filters such as “the last complete week” or “the last complete day.” If the data you’re seeing is not filtered to the timeframe you need, you may have to adjust the filters.
- Clear the cache: Sometimes Looker will cache the results of a query to avoid querying the database multiple times. If you think you’re seeing stale data on a dashboard, select the gear button in the top right corner and choose “clear cache and refresh” from the drop-down menu options.
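To make the date filters mentioned above concrete, here is a minimal Python sketch of what “the last complete week” and “the last complete day” mean as date ranges. It assumes weeks run Monday through Sunday; your Looker instance may be configured differently, and the function names are purely illustrative, not part of Looker.

```python
from datetime import date, timedelta
from typing import Tuple


def last_complete_week(today: date) -> Tuple[date, date]:
    """Return (start, end) of the most recent fully elapsed week.

    Assumption: weeks run Monday through Sunday.
    """
    # Monday of the week that contains `today`.
    start_of_this_week = today - timedelta(days=today.weekday())
    start = start_of_this_week - timedelta(days=7)
    end = start_of_this_week - timedelta(days=1)
    return start, end


def last_complete_day(today: date) -> date:
    """Return the most recent fully elapsed day (yesterday)."""
    return today - timedelta(days=1)


if __name__ == "__main__":
    print(last_complete_week(date.today()))
    print(last_complete_day(date.today()))
```

Under these assumptions, a chart filtered to “the last complete week” will not include data from the current, partially elapsed week, which is a common reason a chart can look stale even though the underlying data is fresh.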
We’ve created a few training videos to help you navigate our Looker project:
- Finding content in the new Looker project
- Navigating our Looker project
- Using our dashboards
- Using our looks
- “Exploring” our datasets - note that “View” users do not have access to explore content
- Creating custom fields - note that “View” users do not have access to explore content
Looker also puts out guides and training videos that may be useful if you’re looking for a more general tutorial about using the tool. Here are a few that we like in particular:
- Almost all of our Looker content is filterable by account name, installer email, and (where relevant) date. To find a customer instance, we recommend filtering by account name. However, to find a prospect’s instance or an instance belonging to a free user, you should filter using the installer_email. If you don’t know a prospect’s installer_email, you can look it up in Salesforce (Account > Instance > unique_server_id). This is necessary because we don’t yet automatically associate all instances with a Salesforce account (this functionality is coming soon).
- Most of our dashboards/looks filter to Account type = Customer by default. To search for a prospect or free instance, change the filter to Account type = Free.
- To find content you may be interested in, we recommend subscribing to “boards” that are relevant to you. Boards are collections of dashboards and looks. You can subscribe to boards by going to your Looker homepage (from wherever you are in Looker, click the logo in the top left corner to go to your homepage!), then clicking the “+” next to “Boards” in the left navigation menu and selecting “Browse all boards.”
- Many (though not all) dashboards offer the ability to “drill-down” into data points by clicking on values. For example, on this dashboard, you can click any data point in the top chart to see a full list of the data that comprises the data point.
- Many dashboards and charts offer the ability to navigate to other dashboards by clicking on an installer_email. For example, on this dashboard you can click on any account name in the last chart and then select “Single-Instance Overview” from the drop-down menu to navigate to a dashboard that will give you more detailed usage data for that account.
- Keep track of the content you like by “favoriting” it! You can favorite content by clicking the “heart” icon next to the name of any dashboard or look. View favorited content by clicking “Favorites” in the left-hand menu
- For best practices for writing LookML, adding new tables to our project, creating content, etc. please refer to our best practices
Below you’ll find links to Looker reports containing frequently requested data that many teams find useful. This doesn’t represent all the reports available, but should give you a good idea of the types of reports that can be found in Looker.
Company level metrics
If you’re looking for high-level data about how our user base interacts with our product generally, below are links to charts for some commonly requested data points. Unless otherwise noted, these charts include data for both free and paying users.
- Monthly users by instance type
- Instances by deployment type
- Average seat consumption (MAU/Total Seats) - customers only
- Technical health score
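As a rough illustration of the ratio metrics named on this page, the sketch below computes seat consumption (MAU divided by total seats, as defined in the list above) and DAU/MAU for a single customer. The function names and sample numbers are hypothetical, and the exact aggregation used in the Looker charts (for example, how the average across customers is taken) is not specified here.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is not positive."""
    if denominator <= 0:
        return None
    return numerator / denominator


def seat_consumption(mau: int, total_seats: int) -> Optional[float]:
    """Seat consumption = monthly active users / total purchased seats."""
    return safe_ratio(mau, total_seats)


def dau_mau(dau: int, mau: int) -> Optional[float]:
    """DAU/MAU = daily active users / monthly active users."""
    return safe_ratio(dau, mau)


if __name__ == "__main__":
    # Hypothetical numbers for one customer.
    print(seat_consumption(mau=150, total_seats=200))
    print(dau_mau(dau=30, mau=120))
```

Returning None for a zero denominator is a design choice for this sketch: it keeps customers with no seats or no monthly users from producing division errors or misleading zeros.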
Customer-level metrics
If you’re looking for product usage data for one or more specific customers, below are links to charts for some commonly requested data points. Please note that if you want to see a full overview of a specific instance’s usage, the dashboards in the “Full instance overview” section may be more useful.
- MAUs by customer
- Total user accounts by customer (monthly)
- DAU/MAU by customer
- Customers by Sourcegraph version
- Deployment type by customer (ex. kubernetes, docker-compose)
- Hosting type by customer (ex. self-hosted, managed instance)
- Telemetry status by customer
- Code hosts by customer
- Total repos by customer
- NPS score by customer
- Customer support metrics by customer
Full instance overview
If you’re looking for a full breakdown of usage on a server or for a particular customer, below are links to a few widely used dashboards that contain a variety of usage metrics.
- Single-instance dashboard: This dashboard is much more comprehensive than the customer account overview, but not every metric on here will be important/relevant for every user/customer. You can also find free and prospect instances on this dashboard
- Technical health dashboard: This dashboard contains the account’s technical health score and the metrics that are included as a part of that score.
Feature specific dashboards and charts
- Code intelligence - all customers
- Code intelligence - single customer
- Batch changes - all customers
- Batch changes - single customer
- Code insights - all customers
- Code insights - single customer
- Search - all customers
- Search - single customer
- App Performance KPIs
Related useful links
- placeholder for user metrics definitions
- placeholder for sources of truth Includes some definitions and links to data points owned/managed by Sales and Finance (ex. ARR)
Amplitude is a business intelligence tool specifically for product analytics, unlike Looker, which visualizes many different types of data. Amplitude is where we visualize user-level, event-stream data for both customers and for Sourcegraph.com.
Amplitude contains data from some managed instance customers, and from dotcom.
Request access from #ask-it-tech-ops
Amplitude is best used to perform product-specific analyses and to better understand the user journey, and is therefore often most useful for designers or product managers with an understanding of those types of analyses.
Redash is a data analysis tool that lets power users of our data query our data warehouse directly using SQL.
Redash allows you to:
- Query our data tables & create saved queries that can be easily revisited and shared with your colleagues.
- Have queries auto-refresh, so whenever you go back to a query, it’ll have the most up-to-date data
- Create parameterized queries whose parameters you can change each time you run them (example here)
- Create simple charts/visualizations (example here)
Redash is connected to our BigQuery data warehouse, so you’ll be able to query any table that lives there. There are a lot of tables in our data warehouse and many may not be relevant. Here are the ones we’d suggest you use Redash to query. If you need data that doesn’t live in one of these tables, you may want to reach out to our team so we can point you in the right direction.
- dotcom_events.events: dotcom click data
- dotcom_events.events_usage: managed instance data
- sourcegraph_analytics.update_checks: pings
- dotcom_events.cody: all Cody-specific event data
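To give a feel for the kind of one-off question Redash is suited to, here is a minimal Python sketch that assembles a BigQuery SQL query you could paste into Redash. Only the table name comes from the list above; the column name `timestamp` (and its being a TIMESTAMP column) is an assumption for illustration, so check the real schema in BigQuery before running anything.

```python
import re

# Table name taken from the list above.
EVENTS_TABLE = "dotcom_events.events"
# Assumption: a TIMESTAMP column with this name exists; verify against the real schema.
ASSUMED_TIMESTAMP_COLUMN = "timestamp"

# Accepts plain identifiers, optionally dotted (dataset.table).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def daily_event_count_sql(table: str, timestamp_column: str, days: int) -> str:
    """Build a query that counts events per day over the last `days` days."""
    if not _IDENTIFIER.match(table) or not _IDENTIFIER.match(timestamp_column):
        raise ValueError("table and column must be plain identifiers")
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        f"SELECT DATE({timestamp_column}) AS day, COUNT(*) AS events\n"
        f"FROM `{table}`\n"
        f"WHERE {timestamp_column} >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)\n"
        "GROUP BY day\n"
        "ORDER BY day"
    )


if __name__ == "__main__":
    print(daily_event_count_sql(EVENTS_TABLE, ASSUMED_TIMESTAMP_COLUMN, 7))
```

Filtering on a recent time window, as this sketch does, is usually a sensible default for event tables, since it limits how much data the query scans.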
Redash is really only useful for those who know where our data lives and have a strong understanding of SQL. If that is you, then this tool can enable you to answer one-off questions and create simple charts. All other users will be better served by Looker and Amplitude.
We’ve configured Redash to authenticate you with your Sourcegraph Google account, so no need to request access. Just click here!
Here’s a quick Loom overview to get you started.