Site Reliability Engineers (SREs) take on problems that require both development and operations expertise. For example, an SRE may solve distributed computing or concurrency problems that affect both our application and our infrastructure. An SRE works closely within a team of developers to make sure that the service or feature set being developed reaches its target metrics for availability and latency, and that the solutions are scalable and reliable once deployed to production on GitLab.com.
To join the team as an SRE, candidates must qualify both as a developer and as a production engineer.
- Work with developers to make their service or set of features ("service" for brevity) reliable.
- Contribute modular, well-tested, and maintainable code
- Write production-ready code with little assistance
- Write complex code that can scale with a significant number of users
- Fix performance issues on GitLab.com using our existing tools, improve those tools where needed, and provide guidance to others.
- Develop monitoring and alerting to measure and improve the availability and scalability of the service on GitLab.com.
- Manage the infrastructure related to the service.
- Radiate knowledge to the infrastructure team about the service, and radiate knowledge of the service's infrastructure and reliability to the rest of the development team.
- Together with other SREs and Production Engineers, design, build, and maintain core infrastructure pieces that allow GitLab to scale to support hundreds of thousands of concurrent users.
- Identify parts of the system that do not scale, provide immediate palliative measures, and drive long-term resolution of the underlying problems.
- Participate in on-call rotation to respond to GitLab.com availability incidents, and use your on-call rotation to prevent pages from ever happening.
- Document every action so your learnings turn into repeatable actions and then into automation.
- Debug application and production issues across services and levels of the stack.
- Ship every solution into the GitLab CE and EE packages as a default.
- You can reason about software, algorithms, and performance from a high level.
- You have experience thinking about systems: edge cases, failure modes, behaviors, and specific implementations.
- You have worked with distributed systems and have a solid understanding of how modern web stacks are built, and why.
- You are passionate about open source.
- You have worked on a production-level Ruby application, preferably using Rails.
- You know how to write your own Ruby gem using TDD techniques.
- You know your way around Linux and the Unix shell.
- Strong written communication skills
- Experience with Docker, Nginx, Go, and Kubernetes is a plus
- Experience with online community development is a plus
- Self-motivated with strong organizational skills
- You share our values, and work in accordance with those values.
- A technical interview is part of the hiring process for this position.
Please note that if we are actively hiring for a position, you will see it listed on our jobs page, where all of our current openings are advertised. To apply, please click on the name of the role you are interested in, which will take you to our applicant tracking system (ATS), Lever.
Avoid the confidence gap; you do not have to match all the listed requirements exactly to apply. Our hiring process is described in more detail in our hiring handbook.
GitLab Inc. is a company based on the GitLab open-source project. GitLab is a community project to which over 1,000 people worldwide have contributed. We are an active participant in this community, trying to serve its needs and lead by example. We have one vision: everyone can contribute to all digital content, and our mission is to change all creative work from read-only to read-write so that everyone can contribute.
We value results, transparency, sharing, freedom, efficiency, frugality, collaboration, directness, kindness, diversity, boring solutions, and quirkiness. If these values match your personality, work ethic, and personal goals, we encourage you to visit our primer to learn more. Open source is our culture, our way of life, our story, and what makes us truly unique.
Top 10 reasons to work for GitLab:
- Work with helpful, kind, motivated, and talented people.
- Work remotely so you have no commute and are free to travel and move.
- Have flexible work hours so you are there for other people and free to plan the day how you like.
- Everyone works remotely, but you don't feel remote. We don't have a head office, so you're not in a satellite office.
- Work on open source software so you can interact with a large community and can show your work.
- Work on a product you use every day: we drink our own wine.
- Work on a product used by lots of people who care about what you do.
- As a company we contribute more than we take; most of our work is released as the open source GitLab CE.
- Focused on results, not on long hours, so that you can have a life and don't burn out.
- Open internal processes: know what you're getting into and be assured we're thoughtful and effective.
See our culture page for more!
Work remotely from anywhere in the world. Curious to see what that looks like? Check out our remote manifesto.