The groups within this stage are:
Using GitLab, you automatically get broad and deep insight into the health of your deployment.
We provide a robust monitoring solution to give GitLab users insight into the performance and availability of their deployments and alert them to problems as soon as they arise. We provide data that is easy to digest and to relate to other features in GitLab. With every piece of the DevOps lifecycle integrated into GitLab, we have a unique opportunity to closely tie our monitoring features to all of the other pieces of the DevOps flow.
We work collaboratively and transparently and we will contribute as much of our work as possible back to the open source community.
The monitoring team is responsible for:
This team maps to Monitor Stage.
We use the Geekbot Slack plugin to automate our async standup, following the guidelines outlined in the Geekbot commands guide. Answers are concise and focused on top-priority items. All question prompts are optional and are answered only when the information should be surfaced to the team:
Are you facing any blockers requiring action from others?
Why do we ask this question?
What do we hope to achieve?
What are you aiming to accomplish by the end of the week?
We want to understand how our daily actions drive us toward our weekly goals. This question provides broader context for our daily work, and it also helps us hold ourselves accountable for keeping our tasks, issues, and merge requests properly scoped. This answer may stay the same for a week, which would mean things are progressing on schedule. Alternatively, seeing this answer change throughout the week is also okay. Maybe we got sidetracked helping someone get unblocked. Maybe new blockers came up. The intention is not to have to justify our actions, but to keep a running record of how our work is progressing or evolving.
What will be your primary focus for today?
This question is aimed at the most impactful task for the day. We aren't trying to account for the entire day's worth of work. Highlighting only a primary task keeps our answers concise and provides insight into each team member's most important priority. This doesn't necessarily mean sharing the task that will take the most time. We focus on results over input. Typically this will mean highlighting the task that is most impactful in closing the gap between today and our end-of-week goal(s).
Any personal tidbits you'd like to share?
This question is intentionally open-ended. You might want to share how you feel, a personal anecdote, or a funny joke, or simply let the team know that you will have limited availability that afternoon. All of these answers are welcome.
Every other week we have a Monitor Stage Demo Hour for engineering and design demos by members of the Monitor Stage group. Demos are voluntary and on a sign-up basis.
There is also an optional Monitor Social Hour meeting every week. This call has no agenda and alternates times every other week to be more inclusive of team members in different time zones.
We follow the same retrospective process as the rest of the engineering department, which can be found here.
To encourage a more iterative retrospective process, we create a new retrospective issue at the beginning of each milestone, using the Monitor retrospective template. We leave this issue open for the duration of the milestone so any team member can add feedback as it happens instead of waiting until the end of the milestone.
Just like the rest of the company, we use PTO Ninja to track when team members are traveling, attending conferences, and taking time off. The easiest way to see who has upcoming PTO is to run the /ninja whosout command in the #g_monitor_standup Slack channel. This will show you the upcoming PTO for everyone in that channel.
Not everyone in the Monitor stage has a background that resonates with our primary user personas:
In this program, engineers are expected to devote one entire week to shadowing SREs. There is no expectation for the engineer to complete their assigned issues during this time. Engineers are added to PagerDuty and follow the existing SRE shadow format of interning, scaled down to a shorter duration of one week. Although SREs are typically on call for multiple days at a time, shadows are only expected to shadow during their regular business hours. This can be set as a preference in PagerDuty.
Engineers interested in the program should notify their respective frontend/backend engineering managers. Managers should collaborate in the #monitor-sre-shadow Slack channel to determine an optimal schedule, then create an access request for PagerDuty and assign it to the SRE manager. We are currently limited to a maximum of 2 shadows per release so that we do not overload the SRE team.
Alumni of the program are encouraged to add themselves to this list and document/link to the observations/outcomes they were able to share with the wider team.
|Tristan Read||My week shadowing a GitLab Site Reliability Engineer|