CMO: Establish credibility and thought leadership with Enterprise Buyers delivering on pipeline generation plan through the development and activation of integrated marketing and sales development campaigns:
CMO: Define category strategy, positioning and messaging and plan for activation across the company. => Done.
CMO: Develop and document messaging framework => Draft completed.
PMM: Develop and roll out updated pitch and analyst decks => Drafts completed.
PMM: Develop Action Plan for Q1/Q2 activation of the strategy => Done.
PMM: Develop GTM Strategy for EEU => Done.
Enable Sales with decks and Early Adopter Program => Done.
CMO: Continue website redesign iteration to support our awareness and lead generation objectives, accounting for distinct audiences. => IA work initiated. Multiple pages redesigned, /resources 75% complete and /kubernetes launched.
MSD: achieve target in new inbound opportunity SQL $ => Achieved 90% of target.
MSD: achieve target in new outbound opportunity SQL $ => Achieved 72% of target.
MSD: achieve new opportunity volume target in strategic Sales segment accounts => Achieved 137% of target.
CMO: Build out Product Marketing function, including hiring and on-boarding three people, updating objectives, process and handbook, and developing cross-functional alignment. => Partially completed with IC role. Not achieved on hiring additional roles.
PMM: Evolve and deliver updated AE pitch deck messaging and incorporate Sales feedback => see individual items below
PMM: Update current “toolchain DevOps” to EE pitch Deck => 85% complete.
PMM: Create CE to EE “get a demo” Deck => complete.
PMM: Create CE to EE Pitch Deck => Deferred to Q1.
PMM: Create SVN to EE pitch Deck => Deferred to Q1.
PMM: Develop a website Information Architecture (IA) to roll out in 2018 => 33% complete.
PMM: Enhance ROI section of website including adding interstitial to promote individual calculators => Deferred to Q1.
PMM: Work with content team to deliver 5 customer case studies or customer-centric blog posts. => 75%. 5 Case Studies in Queue. Customer interviews complete. Drafts in progress.
PMM: Evolve EEP vs EES vs CE differentiation messaging and optimize website experience for product sections => Complete. Product pages updated.
Generate joint PM/Eng plan to update issue taxonomy (labels) to aid prioritization and kickoff implementation => 30% done, MR started, need to socialize and implement next quarter
Support Lead
85% Premium Support SLA (up from 68% last quarter) => 83%
+5% MoM on all other SLAs => October: Average SLA: 52%. December Average SLA: 60%. 2% MoM Average improvement.
One hire through active sourcing => Not hit, but Augie is sourcing EMEA Support Engineers for us!
Support Blog post every other week to use as recruiting tool => Done!
Frontend (AC) Lead
Write 60 unit tests to resolve test debt in Q4 (evenly distributed throughout the team) => 602% of 120 for both teams (Overall 723 new tests)
Crush 140 bugs this quarter (evenly distributed through the team) => 130% (366 Frontend Bugs closed)
Improve codebase by making modules ready for webpack by moving it to our new coding standards (#38869) => 93% (Will be completed with 10.5)
Improve performance by making library updates and webpack bundle optimizations (#39072) => 55%
Finish conversion from inline icons to SVG Icons to improve performance => 80% done (Will be completed with 10.5)
Frontend (DC) Lead
Write 60 unit tests to resolve test debt in Q4 (evenly distributed throughout the team) => 602% of 120 (Overall 723 new tests)
Crush 140 bugs this quarter (evenly distributed through the team) => 130% (366 Frontend Bugs closed)
Refactor the MR discussion in Vue to decrease load times, and increase performance/usability => 70%, Will ship in 10.5
Remove global namespaces, to enable webpack code splitting, which improves performance (#38869) => 93% (Will be completed with 10.5)
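The frontend percentages above can be reproduced from the raw counts. This is a rough Python sketch under two assumptions. First, each target was doubled to cover both frontend teams: 60 × 2 = 120 tests is stated in the result, while 140 × 2 = 280 bugs is inferred here and not stated. Second, the reported figures were truncated rather than rounded. The helper name `achievement` is illustrative only.

```python
def achievement(actual: int, target: int) -> int:
    """Percentage of target reached, truncated to a whole number."""
    return 100 * actual // target

TEAMS = 2  # assumption: targets were doubled to cover both frontend teams

tests_target = 60 * TEAMS   # 120, as stated in the result
bugs_target = 140 * TEAMS   # 280, inferred, not stated in the report

print(achievement(723, tests_target))  # 602
print(achievement(366, bugs_target))   # 130
```

Measured against a single team's target of 140 bugs, 366 closed bugs would be 261%, which is why the doubled target is the more plausible reading of the reported 130%.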
Director of Backend
Author a demo script for use throughout the quarter => 70% complete, script still varied week to week
Expose EE and CE code coverage metrics => 100% complete, see http://gitlab-org.gitlab.io/gitlab-ce/coverage-ruby and http://gitlab-org.gitlab.io/gitlab-ee/coverage-ruby
Assist 1 top tier customer switch to GitLab and ensure P1 bugs/issues get fixed => 80% complete, customer switch was successful, but some issues remain
Distribution Lead
Establish baseline metric for install time/ease and come up with a plan to achieve and maintain it => 10% complete
Decrease build times from 60 minutes to 30 minutes => Done
Create integration test for Mattermost => Done
Platform Lead
Identify 1 sub-standard area of the code base and raise local unit test coverage up to project level => No area was identified and no unit test coverage was raised
Write integration test for backup/restore => Not scheduled, and not done
Make GitLab QA test LDAP => Not scheduled, and not done
Resolve or schedule all priority 1 & 2 Platform issues (and groom performance issues) => Out of the 23 AP1, AP2, SP1, SP2, SL1 and SL2 issues that existed at the beginning of the quarter, 12 are resolved, 7 are scheduled, and 4 are currently unscheduled. 19/23 resolved or scheduled is 83% done!
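The Platform figures can be re-derived from the issue counts. This is a minimal Python sketch using only the counts given in the result above (12 resolved, 7 scheduled, 4 unscheduled out of 23). The function name `share` is illustrative only.

```python
def share(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)

resolved, scheduled, unscheduled = 12, 7, 4
total = resolved + scheduled + unscheduled  # 23 issues at the start of the quarter

print(share(resolved, total))               # 52
print(share(scheduled, total))              # 30
print(share(unscheduled, total))            # 17
print(share(resolved + scheduled, total))   # 83
```

Because each share is rounded separately, the three parts sum to 99 rather than 100, while the combined "resolved or scheduled" share rounds up to 83.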
CI/CD Lead
Add 1 integration for runners => Done.
Resolve or schedule all priority 1 & 2 CI/CD issues (and groom performance issues) => 33% resolved (6/16 for P1, 8/26 for P2), 19% scheduled (5/10 for P1, 3/18 for P2)
Reduce amount of system failures to less than 0.1% => As of 5th Dec the failure rate was 0.29% (https://gitlab.com/gitlab-com/infrastructure/issues/3349)
Improve cost efficiency of CI jobs processing for GitLab.com and GitLab Inc. => we process all jobs on DO and Google, since Google billing is more favorable we are more cost effective
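The CI/CD priority-issue result mixes two sets of denominators, so a short worked example may help. This is a Python sketch that assumes both headline percentages are measured against all 42 P1 and P2 issues, and that the "scheduled" denominators (10 and 18) are the issues left after resolving. The helper name `pct` is illustrative only.

```python
def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# (resolved, total) per priority, as reported
resolved = {"P1": (6, 16), "P2": (8, 26)}
# (scheduled, remaining unresolved) per priority, as reported
scheduled = {"P1": (5, 10), "P2": (3, 18)}

total_issues = sum(total for _, total in resolved.values())
resolved_count = sum(done for done, _ in resolved.values())
scheduled_count = sum(done for done, _ in scheduled.values())

# Check the assumption: "remaining" is what was left after resolving.
for priority, (done, total) in resolved.items():
    assert scheduled[priority][1] == total - done

print(pct(resolved_count, total_issues))   # 33
print(pct(scheduled_count, total_issues))  # 19
```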
Discussion Lead
Write integration test for squash/rebase => MR in progress, but not merged: gitlab-org/gitlab-ce!15964. Blocked by the branches MR.
Resolve or schedule all Priority 1 & 2 Discussion issues (and groom performance issues) => All AP1, AP2, SL1, SL2, and SP1 issues are either scheduled or done. However, we did not address the SP2 issues, many of which are feature proposals, not bugs. By category, let's call that 5/6 addressed, or 83% complete.
Prometheus Lead
Reach parity with Prometheus metrics for Unicorn, Sidekiq, and gitlab-shell and Deprecate InfluxDB => 70% complete, Unicorn and Sidekiq metrics shipped, gitlab-shell and Deprecate InfluxDB moved to Q1.
Make Grafana dashboards for all Prometheus data available and easy to install for GitLab instances => 50% complete. Dashboards created, but they need polish and documentation.
Identify 1 sub-standard area of the code base and raise local unit test coverage up to project level => Done, prometheus-client-mmap now has better testing and coverage.
Resolve or schedule all Priority 1 & 2 Prometheus issues (and groom performance issues) => Done, prometheus-client-mmap performance improvements shipped.
Geo performant at GitLab.com scale => 30% done: a full HA testbed was built, but it hasn't been pushed to the limits yet; marking as only 30% since there may be more unknown unknowns to deal with here.
Manual failover robust in Geo as first step to Disaster Recovery => 80% done: manual failover demo-ed and documented, but still incomplete.
Director of Quality
Document what quality means at GitLab on an about page => Done. See High Level Goals here
Communicate standard in 3 different ways to internal team => 33% Complete. Only covered in handbook.
Make issue/board scheme change recommendation to allow us to better mine backlog for quality metrics => 0% complete. Not started due to other priorities
Initiate a project to make quality metrics and charts self-service => Done. See gitlab-insights project
Initiate a project to allow for UI testing of the web application locally and on CI => Done. GitLab QA does this, however the work to make it production ready continues
Improve at least the 5 longest spec files by at least 30% => 0% complete. Not started due to other priorities.
Investigate code with less than 60% tests coverage and add tests for at least the 5 most critical files => 0% complete. Not started due to other priorities.
100% of Git operations on GitLab.com go through Gitaly (Gitaly v1.0) => 70% complete
Demo Gitaly fast-failure during a file-server outage => Done
Generate a project plan for the GCP migration and get approved by EJ and Sid => Done
Execute milestone 1 of the GCP migration plan by Dec 15 => Done
Database Lead
Demo restore time < 1 hour => postponed until the GCP migration has been completed
Solve 30% of the schema issues identified by Crunchy => 6.7% done (2 out of 30)
Database Uptime 99.99% measured in Prometheus => Done!
SQL timing under 100ms for Issue, MR, project dashboard, and CI pages measured in Prometheus => Improving the 99th percentile has proven to be very difficult, but progress is being made.
Director of Security
Strong security for SaaS and on-premise product. Top 10 actions from risk assessment done and actions for top 10 risks started. => 50%, will finish next quarter.
HackerOne bug bounty program. Implemented and bounties awarded. => Done.
Security policies for cloud services and cloud migrations. Policy published and enacted. => Done.
CMO: Build trust of, and preference for GitLab among software developers
CMO: Hire Director, DevRel/Developer Relations => Deferred to Q1.
CA: Grow community and increase community engagement. Increase number of new contributors by 10%, increase number of total contributions per release by 5% and increase number of Twitter mentions of GitLab by 10%. => Not achieved.
PMM: Support field marketing at AWS re:Invent & KubeCon with booth decks and training => Done.
MSD: $600K in self serve revenue. => 98% achievement.
MSD: Grow followers by 20% through proactive sharing of useful and interesting information across our social channels. => 40% achievement of target.
MSD: Grow number of opt-in subscribers to our newsletter by 20%. => Achieved. Grew 47.84%
CMO: Generate more company and product awareness including increasing lead over Bitbucket in Google Trends => Achieved. GitLab = 65; Bitbucket = 60.
MSD: Implement SEO/PPC program to drive increase in number of free trials by 20% compared to last quarter, increase number of contact sales requests by 22% compared to last quarter, increase amount of traffic to about.gitlab.com by 9% compared to last quarter => 34% increase in trial request leads; 21% increase in contact request leads; 7% growth QoQ on about.gitlab.com.
CMO: PR - October Announcements - 10.0, Series C, Wave, CLA => Done!
CMO: AR - v10 briefing sweep for targeted analysts => Meetings secured for Q1.
Objective 3: Great team.
CFO: Improve team productivity
Analytics: Data and Analytics vision and plan signed off by executive team
Analytics: Real time analytics implemented providing visual representation of the metrics sheet
Legal: Create plan for implementing Global Data Protection and Data Privacy Plan
Controller: Reduce time to close from 10 days to 9 days.
Accounting Manager: Identify and add to the handbook two new accounting policies.
Accounting Manager: Create monthly process for BvsA analysis with department budget owners.
CCO: Create an Exceptional Corporate Culture / Delight Our Employees
Launch training for making employment decisions based on the GitLab Values. Launch by November 15th - Moved to Q1 2018
Launch a short, quarterly Employee Pulse Survey. Strive for 80% completion. Completed, 69.5% completion.
Analyze and make recommendations based off of New Hire Survey and Pulse surveys which will drive future KRs. Have at least 3 areas to improve each quarter. Ideally, we will also have 3 areas to celebrate. Completed 1/3, other two moved to Q1 2018.
Revise the format of the Morning Team Calls to allow for better participation and sharing. Strive for 80% participation. Completed.
Improve use of the GitLab Incentives by 15%. /handbook/incentives/.
Discretionary Bonus: 0% change from Q3 2017 to Q4 2017, but 50% increase from Q2 2017 to Q3 2017.
Referral Bonus: 40% increase for hires in Q4.
Tinggly: Quadrupled the number of awards granted.
Iterate on the Performance Review process with at least two changes initiated by end of year. - moved to Q1 2018.
CCO: Grow Our Team With A-Players
KR: Socialize and grow participation in our Diversity Referral bonuses by 10% (measurement should be made in January as many hires in December don’t start until January, with awareness that the actual bonuses aren’t paid out for 3 months) - 0% increase.
More sourced recruiting. 20% of total hires - 4.9% of total hires in Q4.
Ensure candidates are being interviewed for a fit to our Values as well as ability to do the job, through Manager Training and Follow-up by People Ops - moved to Q1 2018.
Hire Recruiting Director - Completed.
90% of all candidates will be advanced through the pipeline within 7 business days in each phase, maximum. - Average time in each stage for all candidates: 11.57 days.
CCO: Make All of Our Managers More Effective and Successful
Provide consistent training to managers on how to manage effectively. Success will mean that there are at least 15 live trainings a year in addition to curated online trainings - moved to Q1 2018.
Ensure every manager is doing regular 1-on-1 meetings with 2-way feedback. Measure will be seen in Employee Pulse survey, with at least 90% of employees indicating they have received feedback from their manager in the last month. - 83.33% agree someone at work has talked to them about their progress.
Hire People Business Partners to partner with managers to operate as leadership coaches, performance management advisors, talent scouts, and Culture/Values evangelists. Goal of 2 hires. - Completed.
VPE: Build the best, global, diverse engineering, design, and support teams in the developer platform industry
Revise hiring plan for Q1 2018 based on Q4 financials and product ambitions => Done
Launch 2018 Q1 department OKRs before EOQ4 2017 => Done
Hire an additional Director of Engineering => Job posted, pipeline looks decent, but hire not made
Hire a production engineer => Job posted, pipeline looks decent, but hire not made
Support: Grow the support team to better comply with SLAs and cover gitlab.com cases
Hire a Services Support Manager => Done, Starting in Feb.
Hire a support specialist
Hire an EMEA support engineer => Hired AMER as needed.
Hire an EMEA/AMER support engineer => Done.
Hire an AMER support engineer => Done, Starting Jan 8th.
UX: Increase the profile of GitLab design and grow the team
Hire 2 UX designers => Incomplete. Strong pipeline and process in place.
Hire a junior UX researcher => 100% Complete.
Frontend (DC): Hire 3 front end developers => Done
Frontend (AC): Hire 2 front end developers => Done
VP of Scaling: Hire an Engineering Manager for the Geo team => Not done
Distribution
Hire 2 developers => Done.
Hire a senior developer => Not done, pipeline for senior is weak.
CI/CD
Hire 3 developers => 33% complete
Hire 2 senior developers => Not done
Discussion: Hire 2 developers
Gitaly: Hire a developer => Not done.
Database: Hire a database specialist => Done! Starting end of January 2018
Director of Quality
Hire a test automation lead => 20% Complete. Process has been defined and we are screening candidates
Hire 3 test automation engineers => 20% Complete. Process has been defined and we are screening candidates
Director of Security
Hire Security Engineer(s) => Done, two hires starting in January.
Hire a Security Specialist Developer => Not a security team hire, but a product developer (SAST).
Retrospective
VPE
GOOD
Anecdotally, Engineering had been setting promises and achieving ~20-30% of them in past quarters. It's important to set aggressive but achievable goals that both motivate the team and accurately represent our bandwidth to the rest of the organization. Otherwise, the wrong business decisions are made. My goal was to raise our achievement somewhat, which I think we've done. Eventually, I would like to be regularly hitting 70-80% of our OKRs, and more when things go spectacularly. But it will take time to dial this in, because we do not want to encourage sandbagging.
The focus on hiring velocity and quality improved dramatically. Teams such as design, frontend, and support did a great job, meeting their plan, raising the bar higher, and making the process more efficient.
We gave engineering teams many goals that were fully under their control, such as resolving test debt, improving code quality, and improving the hygiene of their backlog, and most delivered on these
We got 2018 Q1 OKRs drafted before the end of the quarter, which helps with adoption
We delivered Geo
Two revisions of the hiring plan were delivered to the board
We kicked off the GCP migration project and delivered a milestone
BAD
These goals were set in my 2nd week at GitLab, so some missed the mark (which was known to be likely to happen); I know much more for Q1
Some teams lagged behind in hiring, only getting their vacancies up in late November after being pressed
My own hires (some of which were inherited from infrastructure) were not made
The late start and holiday season made it very unlikely that some of our hires would be made (but it was important to capture them on the record, rather than leave them off)
My goal to enhance our process meandered through the quarter. It started as an effort to introduce an estimation process and eventually became a taxonomy change
My time spent with the infrastructure team took the place of several other things I hoped to do; unfortunate, but the right call
TRY
Anticipate the holiday season next year and front-load hires in Q3
Push for all vacancies to be posted in the first full business week of each quarter
Assign each team a goal to deliver 100% of commitments for releases (and find a way to measure it)
Find a way to incentivize hiring in efficient regions
Assign each applicable team a goal to merge a certain amount of contributions from the community
Director of Backend
GOOD
We made Geo Generally Available after a frenetic development effort
We hired a number of strong backend developers who were able to contribute from Day 1
We successfully migrated a Tier 1 customer to GitLab
We shipped major features, such as GPG-signed keys, with the help of the community
We expanded the use of GitLab QA, adding a Mattermost integration test, and caught a number of regressions
BAD
We underestimated the amount of work required to make hashed storage production-ready
The shuffling of people to Geo, while helpful for Geo, slowed down other teams
A lot of migrations and features caused unplanned downtime with GitLab.com
Prometheus metrics got close to running in production, but we still had to turn it off on GitLab.com
Our customers are still experiencing performance issues, particularly with API access
We broke LDAP logins (again) in 9.5.0 and still do not have an integration test for this
TRY
Make adding integration testing a priority instead of an afterthought (starting with Geo)
Get more team members involved with identifying significant bugs in Sentry
Improve overall security release process by defining roles and expectations of release managers, developers, and security team
Increase team productivity by scheduling pairing sessions with different developers
UX Design Manager
GOOD
We documented the majority of existing and UX Ready design patterns in GitLab
Establishing a pattern library for designers has sped up the design cycle significantly. Designers can quickly put together an entire UI and know it contains the latest standards. This will ensure consistency across the team and application
As a side-effect, the pattern library has brought to light major usability improvements the UX team has defined but not been successful in getting implemented. We will focus on pushing these improvements into the app
We hired a Jr. UX Researcher
We completed phase one of the UI Repository with the help of FE
We succeeded in publishing three blog posts related to UX vision and implementation. The response from the design community has been positive and we look forward to establishing GitLab as an Open Source Design authority
BAD
We were not successful in hiring the two UX Designers we need in spite of our efforts
We were not successful in closing the loop on UX standards and guidelines. The goal was to deprecate the existing UX Guide in favor of design.gitlab.com but review of standards took longer than anticipated
Not accounting for holidays and vacations in planning led to some missed deliveries
TRY
Continue collaborating with FE on UI Repository and UX Backlog cleanup
Push harder for significant iterative UX improvements in each release
Anticipate holiday season's influence on ability to deliver
Staff Developer, Database
GOOD
Despite the large OKR we managed to solve a lot of performance issues.
We improved the team workflow by using issue boards more actively and by having a weekly database meeting.
We managed to add health / uptime monitoring to Prometheus / Grafana, allowing us to see how the database health changes over time. This is based on the number of alerts sent out, not the uptime of the database.
We managed to hire a 3rd database specialist.
We rewrote the GitHub importer from scratch, resulting in much better performance.
We wrote a (popular) blog post about scaling the database: </2017/10/02/scaling-the-gitlab-database/>
We managed to optimise retrieving CI pipeline statuses, which used to execute very slow SQL queries.
BAD
We added far too much work to the Q4 OKR, resulting in us only being able to complete a small portion of the planned work.
We didn't take the summit into account when planning the OKR.
One database specialist was unavailable for a few weeks due to having to move to a different apartment. This led to a reduction in productivity of the team as a whole.
There were too many issues that required the help of others; some of these were not worked on for several weeks.
We estimated we'd be able to complete 30 schema issues, but only ended up completing two of them.
10.3 had a few bad migrations causing trouble.
TRY
Schedule more well defined issues for an OKR so we can actually solve them.
Make it harder to introduce performance problems (planned for Q1 of 2018).
Delegate more work to the other teams so database specialists don't have to do so much on their own.
Director of Security
GOOD
Established HackerOne private paid bug bounty program and monetarily awarded researcher.
Hired two security team members, in Application Security and Security Automation.
Drafted and published GCP Security Guidelines.
Created Security Vision for 2018 and beyond, including hiring plan.
BAD
Security became a team of one during Q4, and that impacted our ability to deliver on some OKRs.
My goal to deliver on all Top 10 Security Risks Assessment was superseded by needing to spend significant time transitioning non security-impacting tasks to other teams.
Security Scanner PM role was assigned to me, without a lot of context. Ultimately, that became challenging, because the scope was much larger than anticipated. So, we made the decision to have this role transitioned. My goal is to be more engaged once the security team is larger.
TRY
Find a methodology to analyze recurring security vulnerability types, and work towards mitigating clusters of vulnerabilities.
Work towards creating more automation of security tasks, to scale our team.
Continue to provide cross-functional security guidance, but through issues and MRs much more frequently, now that workflow fluency is established.
Engineering Manager, Platform Backend
GOOD
It's hard to determine after the fact what percentage of deliverables we actually manage to ship each release, because issues that slip have their milestone adjusted to a future release, instead of keeping the milestone of the release they were originally scheduled for. My perception, however, is that we've shipped about 75% of deliverables each release, with that number closer to 70 for 10.2 (because of the summit in October) and 10.4 (because of the holidays in December), and closer to 80 for 10.3 when we had 4 uninterrupted weeks of work.
We resolved 52% of priority 1 & 2 Platform issues, and scheduled another 30%.
We added 3 developers to the team, including 2 seniors.
BAD
We lost 5 of the 10 people we started the quarter with to other teams (4 to Geo and 1 to CI/CD), including 2 of the new team members, obviously affecting our ability to get stuff done significantly.
We didn't add GitLab QA tests for either backup/restore or LDAP.
We didn't identify a sub-standard area of the code base and raise local unit test coverage up to project level.
Circuit breakers are done, but are not enabled in production yet. See this infrastructure issue for more information.
We only resolved 52% of priority 1 & 2 Platform issues. We scheduled another 30%, but 17% was left untouched. (Numbers don't add up to 100% because of rounding.)
We didn't manage to resolve any significant tech debt or make progress on engineering tasks without immediate user-facing benefit, like shipping a first iteration of a GraphQL API, or migrating to Rails 5.
TRY
Adjust OKRs during the quarter if circumstances (like team capacity) change significantly.
Create well defined issues for OKRs to make it harder to lose track of them.
More proactively keep an eye on SP1, SP2, AP1, AP2, SL1 and SL2 issues.
Proactively schedule and allocate time for tech debt and "pure" engineering tasks. This will become easier as the team gains people again.
Engineering Manager, Discussion Backend
GOOD
No major features slipped.
All AP1 and AP2 issues are done or scheduled.
We added one senior developer to the team.
BAD
We didn't look at SP2 issues at all.
The GitLab QA tests we wrote aren't done, because they got overtaken by AP issues.
Some developers ended up with too many issues to work on, because others were blocked and became unblocked.
We didn't get any closer to migrating to Rails 5.
Solving AP1 and AP2 issues often means writing migrations, which then can have a bad production impact on GitLab.com, and we didn't do a good job of catching those early.
Migrating uploads to object storage was more complicated than expected.
Some developers ended up working on a lot of OKR-focused issues, others worked on very few.
TRY
Ensuring that we reduce the bug backlog by having every developer work on the backlog.
Keeping better track of important issues (like SP2 issues), perhaps with an explicit monthly refinement of that list.
Expressing targets in OKRs as a count of issues to solve, with the backlog size recorded for reference, to make future retros easier.
Being more conservative about adding new issues when someone has issues in progress at the end of the release.
Spreading OKR-related issues better among the team.
Frontend Engineering Manager
GOOD
We had good results for our targeted OKRs, especially with adding a lot of new Karma tests and crushing bugs
Overall good results on our deliverables
We closed 5 great frontend hires and fulfilled our hiring plan
We made constant progress on our pure engineering tasks and performance improvements (Modules for Webpack, Libraries, SVGs, etc.)
We reduced the number of regressions produced per release
BAD
Our big refactorings/restructurings got way too big, which led to review and merging problems and then to slipped cut-off times
There are still way too many merges around the 7th, which led to conflicts, broken versions, failing masters, etc.
Library updates had a 50:50 chance of being either easy or hugely complex
Sometimes technical topics were rushed, with too many tackled at a time, which led to confusion in the team
Our CSS debt is growing release by release
We were not able to push forward the next big frontend performance topics which need intra team collaboration (images and gzip) except the CDN topic
TRY
Establish our new team structure, which will let us drive technical improvements forward fast and make us far more productive in the long run (for example, through stable Vue components), while also allowing more flexible planning of release cycles across the different product areas
Unify scheduling plans across all product areas and establish good predictability of our velocity
Make everyone clearly aware of our OKRs and our overall progress on them
Solve one big technical change at a time with a clear and communicated plan
Integrate 5 new team members with good onboarding and be ahead of our hiring plan so hires are made according to our plan
Continue working closely with UX not only on deliverables but also on broader topics like the component library, SVGs and more
Better documentation and tooling to support our overall team size and the overall frontend development workflow
Communicate clearly and share insights from our side when a deliverable is becoming problematic, may slip, needs more attention, etc.
CI/CD Lead
GOOD
We seem to be able to ship around 75-85% of our Deliverables every month, and we are starting to notice less overload with issues
We are continually investing in performance/scalability fixes every month, and we see big improvements in resiliency of CI/CD infrastructure, especially from Database perspective
We manage to solve a significant amount of Technical Debt every month
We started holding our own CI/CD Retrospective about one week before the company's; it helped us voice and address our problems and contribute better to the company-wide one
We are continuing to improve our monitoring capabilities by having end-to-end monitoring of all CI infrastructure run by GitLab Inc.
We are able to deal with abuse on GitLab.com and react fast, so it no longer affects the stability of CI
We are aware of the cost of maintenance/change and always account for it in our architectural choices, by forcing ourselves to implement not an MEP (engineering product) but an MMP (maintainable product)
BAD
We only solved 33% of priority issues. We scheduled only 19% of them to be resolved. Most of the rest are hard or impossible to solve without a conclusive Product decision, as they are feature requests
We didn't manage to meet our hiring expectations
We started monitoring some of our OKRs only at the end of Q4, which resulted in some of them not being completed
We struggle with a lack of large-scale database knowledge, which results in slower velocity on shipping some of the changes
TRY
Make OKRs focus better on the achievable goals for the team
More proactively keep an eye on SP1, SP2, AP1, AP2, SL1 and SL2 issues
Help develop and work closely with the database specialists to ensure that we can improve the velocity of database architectural changes
Support Engineering Manager
GOOD
With the help and support of the VPE, we hit 4/5 of our aggressive hiring goals this quarter.
With the increase in team size and skills, we saw dramatic improvements in our SLAs.
We limited our OKRs to focused, attainable OKRs, which absolutely led to our success.
BAD
We are still struggling to hire in EMEA. It’s been our hardest challenge.
Support SLAs need to be at 100% ASAP, but it will take time to train + hire to meet demands.
GitLab.com Support Experience is not where it should be.
TRY
New Accountability processes in support to avoid SLA Breaches
Using Active Sourcing targeting EMEA to fill our hiring needs
Focus on training around new complexities (HA/Kubernetes) in preparation for future demand.
Quality & Edge
GOOD
We automated the CE->EE merges.
We automated the triaging for several projects (gitlab-ce, gitlab-runner, gitlab-qa).
We increased the Branches page speed by 2x.
We increased our contributions to QA; the team is ready to be productive on this matter for Q1 2018.
We shipped 2 new RuboCop cops.
We contributed extensively to the documentation.
BAD
Two developers (almost half of the team) were release managers in October/November, so we got fewer things merged during these months.
We tend to just work on issues that we triage or that we find in reviews, instead of balancing that kind of work with the OKRs.
Some work was done toward unwritten OKRs, such as "CTO: Less effort to merge CE into EE. 10 times less efforts to merge CE to EE".
This objective was definitely in Edge's scope but we should've replaced some other minor OKRs with this one to reflect the reality of where the focus was.
One team member was responsible for the CE->EE daily merge, which can take a significant amount of time depending on the conflicts and/or CI failures, and which obviously gives less time to achieve the OKRs we've set.
Developers spend time reviewing community merge requests, which obviously gives less time to achieve the OKRs we've set.
Documentation improvements and new RuboCop cops weren't part of the OKRs, but are still important.
As we don't have Product telling us "what to ship", we tend to contribute a lot of small changes but I have the feeling we lack "big" achievements because of that.
This can lead to less motivation, as we have less "big" things to celebrate.
TRY
Ensuring that at the end of each week, everyone on the team has made progress toward an OKR.
Ensuring that community merge requests are being triaged and reviewed (or assigned) in a timely fashion.
Engineering Manager, Distribution
GOOD
We are delivering around 90% of scheduled items every month
The team is increasing their Kubernetes expertise
Solid progress has been made with the Cloud Native charts, contributing to one of the highest company priorities while keeping customer requirements in mind
We keep delivering stable package releases; blocking problems related to the package are addressed within one patch release
We invested time in consolidating our package-building infrastructure, which gave us better visibility into infrastructure costs
We were able to help out with non-team tasks such as GitLab QA
The number of actionable user-reported issues has stagnated
Improved the team communication by establishing a team-specific issue tracker and process
We considerably improved the HA installation experience
We delivered a PG HA solution, used on GitLab.com at scale
Improved user experience by informing of package deprecations and providing better error messages, which has also prevented issues internal to the company
Continued regularly decreasing technical debt
Continued with team training sessions
BAD
We did not have time to start working on the OKR for measuring installation time
The hiring pipeline for a Senior-level developer is poor
We still get tasks added last minute when dependencies or production readiness are not considered during feature planning
Significant time is used by the team on legacy projects
The issue queue is not decreasing
Some of the major tasks that the team has focused on were not described in the OKRs
TRY
Establish better tooling that would allow more visibility both inside and outside the team
Automate tasks that are external team dependencies
Review legacy projects for automation or deprecation
Gitaly Lead
GOOD
A major Gitaly bug was resolved.
Good progress was made at the end of the quarter on acceptance testing.
BAD
By several measures, the Gitaly team made less progress than in the previous quarter.
For example:
Q4: 113 vs. Q3: 179 merge requests accepted on Gitaly repo.
Q4: 202 vs. Q3: 329 issues closed on the Gitaly repo.
These of course are not necessarily indicators of progress, but the differences should be investigated further.
Both of these bugs were in underlying libraries and there was little we could do to mitigate them.
The former was fixed by a change to the Go runtime
The latter was a gRPC bug, which we worked around by rolling back to an old version
Gitaly lead role reduced to 20% of Andrew's time
Distractions around team changes (Jacob possibly leaving team, Andrew's new split role)
Recruiting continued until January for a position that had been closed since December
Very little uptake in Gitaly open recruiting position, probably due to confusing title
After the title was changed from "Backend Developer, Gitaly" to "Backend Developer, Ruby and Go", there was a surge in candidates.
TRY
Better communication around head-count updates.
Gitaly should get Product representation. Headcount loss was a result of a decision made at a meeting of the Product Team, at which Gitaly was not represented.
(Too late for Gitaly, but…) For future large migration projects: better upfront analysis and planning of scope.
In future, do not use the team name as part of the recruiting position title: try to stick to required skills and experience if possible.
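The merge request and issue counts quoted in the BAD list above can be expressed as relative changes from Q3 to Q4. The following is a minimal sketch of that arithmetic only; the helper name `relative_change` is an assumption for illustration, and the four counts are the ones stated in the list.

```python
def relative_change(previous, current):
    """Return the change from previous to current as a fraction of previous."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous


# Counts quoted in the BAD list (Q3 value first, then Q4 value).
merge_requests = relative_change(179, 113)
issues_closed = relative_change(329, 202)

print(f"Merge requests accepted: {merge_requests:.1%}")
print(f"Issues closed: {issues_closed:.1%}")
```

Both figures come out as drops of a bit more than a third, which supports investigating the difference even though these counts are not necessarily indicators of progress.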
Engineering Manager, Geo
GOOD
GitLab Geo was shipped as Generally Available in 10.2.
To get Geo to GA, the team ramped up very quickly in October. Shifting
priorities can be a stressor for people, teams, and the organization at large,
but in this case it went remarkably smoothly with a lot of support from everyone involved.
With guidance of the VPE, the Geo team started doing weekly demos, which were
very valuable to the development process and are now a standard practice on the team.
Other teams, notably Solutions Architects, also demoed Geo, and this "external"
set of eyes on the product also contributed to better documentation, UX improvements, etc.
BAD
There was significant slippage from milestone to milestone. This is due to (i)
focusing on too many objectives per milestone and (ii) recovering from the push
to get to GA in 10.2 which involved committing features well beyond the feature
freeze windows in 10.2, and then also in 10.3.
Discovered that hashed storage had been rolled out in 10.0 as GA when in fact
it can be considered alpha or beta at the moment. This had knock-on effects for
how we plan(ned) to use Geo for the GCP Migration; see the ongoing discussion
on the fate of hashed storage.
Not "bad" per se, but careful tracking showed that our estimates for calendar
days spent on issues are at about 50% of actual.
TRY
Correcting for the points listed in "bad" by not allowing exceptions to
feature freeze in 10.4, and committing to a single focus in 10.5.
Better estimation of calendar days spent by (i) allowing 0.5 days as the smallest
unit of time, (ii) reducing the scope of any issue slated to take more than 5 days,
and (iii) using past data to set more realistic targets for following milestones.
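The estimation points in the last TRY item could be sketched roughly as follows. This is an illustration under stated assumptions, not the Geo team's actual tooling: the function names, the rounding-up rule, and the sample history are hypothetical. Only the 0.5-day unit and the 5-day threshold come from the list above.

```python
import math

MIN_UNIT_DAYS = 0.5   # smallest unit of time, from point (i)
MAX_ISSUE_DAYS = 5.0  # issues above this get their scope reduced, from point (ii)


def correction_factor(history):
    """Ratio of actual to estimated days over past issues.

    history is a list of (estimated_days, actual_days) pairs (hypothetical data).
    """
    total_estimated = sum(estimated for estimated, _ in history)
    total_actual = sum(actual for _, actual in history)
    if total_estimated <= 0:
        raise ValueError("history needs a positive total estimate")
    return total_actual / total_estimated


def adjusted_estimate(raw_days, factor):
    """Scale a raw estimate by the factor and round up to the next half day."""
    scaled = raw_days * factor
    return max(MIN_UNIT_DAYS, math.ceil(scaled / MIN_UNIT_DAYS) * MIN_UNIT_DAYS)


def needs_scope_reduction(days):
    """True if an issue is expected to take more than the 5-day threshold."""
    return days > MAX_ISSUE_DAYS


# Assumed sample history in which estimates were about 50% of actual.
history = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
factor = correction_factor(history)
estimate = adjusted_estimate(3.0, factor)
print(factor, estimate, needs_scope_reduction(estimate))
```

With this sample history the factor is 2.0, so a raw 3-day estimate becomes 6 days and would be flagged for scope reduction.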
Prometheus Lead
GOOD
We shipped major performance improvements to the Ruby Prometheus client library.
Prometheus 2.0 work helped production greatly scale to handle additional metrics load as we expand what is possible to collect.
BAD
We are very short staffed.
There was some slippage due to miscommunication between frontend and backend development.
TRY
Hiring is a priority for Q1.
We are working on improving communication by bringing up blocking issues between frontend and backend more than once a week, and making sure we communicate clearly about these blockages.