OKR Plan

In support of our product/engineering objective (see all OKRs here) to make cloud and enterprise successful at massive scale, one way we will measure our success in achieving this goal is for the Customer Support team to maintain 100% support issue resolution within 7 days while requiring help (filing a #rfh GitHub issue) on only 10% of issues (measured weekly, looking at the last 30 days). To accomplish this, we will…

1 | 🚫 | Warren | Add src debug command to src-cli
2 | 🚫 | All CS | Make at least 45 doc updates/additions across the team
3 | | Giselle | Retro all tickets that resulted in a #rfh for Distribution and Core App
4 | | Adeola | Create cheat sheets of what logs are most needed in certain situations
5 | | Beatrix | Make the command generator customer-facing and scalable
6 | | Michael | Create a database type solution to make it easy and reliable for application engineers to learn from past tickets
7 | 🚫 | Alex | Streamline key steps in CS workflow
8 | 🚫 | Carl | 5 folks complete Kubernetes certification
9 | 🚫 | Virginia | Implement retro practice for all tickets that take longer than X days to solve
10 | 🚫 | Virginia | Provide enablement in how to navigate difficult conversations with customers
11 | | Adeola | CS Onboarding updates V3

Task details

1 src debug command to src-cli

  • Workgroup: Warren, Tomas
  • Details: This command will create an archive (zip file) with the information we need most often in troubleshooting (values, logs, etc.) so that we can ask for one thing and get the majority (if not all) of the information we need while troubleshooting. We’ll additionally need a way for customers to send us this file (it will probably be too big for Slack). This is an MVP in accordance with the observability RFC. Future plans involve incorporating Grafana snapshots and Jaeger tracing into this tool. You can see the code in the src-cli src debugger branch.
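
To make the "one archive with everything we usually ask for" idea concrete, here is a minimal sketch of bundling troubleshooting files into a single zip. It is an illustration only and not the src-cli implementation (that lives in the branch mentioned above): the function name `build_debug_archive`, the directory layout, and the file patterns are all hypothetical.

```python
import zipfile
from pathlib import Path


def build_debug_archive(source_dir, archive_path, patterns=("*.log", "*.yaml")):
    """Bundle files under source_dir that match patterns into one zip.

    Hypothetical helper for illustration; returns the archive member names.
    """
    source = Path(source_dir)
    # Collect matches first (de-duplicated and sorted) so the archive order is stable.
    paths = sorted(
        {p for pattern in patterns for p in source.rglob(pattern) if p.is_file()}
    )
    added = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store paths relative to source_dir so the archive unpacks cleanly.
            name = path.relative_to(source).as_posix()
            archive.write(path, arcname=name)
            added.append(name)
    return added
```

Returning the list of member names makes it easy to tell the customer (and ourselves) exactly what the file contains before it is shared.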

2 Doc updates

  • Workgroup: n/a – all CS
  • Details: Make at least 45 doc updates/additions across the team. These can be tied to cases or not. If not, be sure to link to the PR here:

3 Retro key tickets

4 What logs, when cheat sheets

  • Workgroup: Adeola, Amber, Stompy
  • Channel: #wg-cse-debug
  • Details: Create a simpler, more streamlined troubleshooting process by outlining common customer issues, highlighting which services and logs relate to each issue, and generalizing initial troubleshooting steps.

5 Command generator

6 CSE “database”

  • Workgroup: Michael, Jason, Ben, Gabe, Warren
  • Channel: #wg-post-aux
  • Details: Build a guide/pool/database of all resolved tickets, tagged with specific keywords that make it easy to identify what the troubleshooting steps cover, especially for frequent or complex cases we can refer back to for faster customer resolution. Well-documented case notes (outlining the thought process and the steps toward resolution) would go a long way toward achieving this.
    • A place for documenting known historic bugs indexed to versions (thinking an md file in our GitHub page); I don’t think the changelog or the upgrade pages on Docs are sufficient for this.
    • Ensure we have a framework in place that accounts for data integrity and keeps customer-sensitive information from being exposed.
    • It will be interesting to assess the pros/cons of the solution being customer-facing or not, and/or what can be customer-facing vs. not.

7 Streamline key steps in CSE workflow

  • Workgroup: Alex, Carl, Warren
  • Channel: #wg-cse-automation
  • Details: How can we improve the experience of writing an issue summary for our Slack thread, for Zendesk, and for GitHub? Could we make this more DRY? Easier to search?

8 Kubernetes certification

  • Workgroup: Carl, Beatrix, Stompy, Ben, Gabe
  • Channel: #wg-cse-k8-training
  • Details: 5 folks on the team will complete certification and make a recommendation as to whether this should be required for the rest of the team.

9 Long running issue retros

10 Hard convo enablement

  • Workgroup: Virginia
  • Details: Provide enablement in how to navigate difficult conversations with customers (delivering hard-to-hear news, keeping calls productive and on time, etc.).

11 CSE Onboarding updates V3

Progress update

Progress update on how we are tracking toward our OKR can be found here.
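
For illustration, the two numbers in the OKR (the share of issues resolved within 7 days and the share that needed a #rfh GitHub issue, over the last 30 days) could be computed roughly as follows. This is a minimal sketch and not the team's actual tracking: the `Ticket` record, its field names, and the choice to count tickets by their resolution date are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical record; these fields are assumptions, not a real Zendesk schema.
    opened: date
    resolved: Optional[date]  # None while the ticket is still open
    needed_rfh: bool          # True if a #rfh GitHub issue was filed


def okr_snapshot(tickets, as_of, window_days=30, target_days=7):
    """Return (share resolved within target_days, share needing #rfh).

    Only tickets resolved in the trailing window ending at as_of are counted
    (an assumption). Returns None when no tickets fall in the window.
    """
    start = as_of - timedelta(days=window_days)
    recent = [
        t for t in tickets
        if t.resolved is not None and start < t.resolved <= as_of
    ]
    if not recent:
        return None
    on_time = sum((t.resolved - t.opened).days <= target_days for t in recent)
    rfh = sum(t.needed_rfh for t in recent)
    return on_time / len(recent), rfh / len(recent)
```

Running this once a week with `as_of` set to the current date would give the weekly, trailing-30-day view described in the OKR. Note that tickets still open past 7 days are not counted under this assumption, so a real tracker would likely need to handle them separately.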

Final summary

During , we averaged a resolution time of 7.5 days and engaged engineering 13% of the time. While we did not meet our OKR, that does not mean we failed. This quarter has given us more clarity on where we need to invest in and , as well as completing the foundation of the team. Additionally, most of the projects we did to help us realize our OKR were only completed within the last week of the quarter, meaning our results from this quarter came largely without the benefit these projects will bring.

Given that we finished onboarding 10 new members of the team and started onboarding 3 new managers, our performance for is nothing to scoff at. It is performance we can be proud of. We may not be where we want to be just yet, but we are so close and it’s ours to have in . Additionally, we made 29 doc updates. While that is not the 45 we thought might be possible, it’s more than double the 11 updates we made the quarter prior.

Here are a few things the team will be able to use in to continue working toward our definitions of support and also realize our Q4 OKR:

  1. A cheat sheet of what logs are most needed in certain situations
  2. A customer-facing and scalable command generator app
  3. Onboarding improvements
  4. And the pièce de résistance, a database of resolved tickets using the power of Sourcegraph (aka an entire new use case for our product!)

Finally, not related to our OKRs or definitions of success, also saw two members of the team accept offers to move into dev teams. You know the support team is respected when moves like this start happening.

It’s been another good quarter with lots of growing, learning, and getting there.