Updates to de-identifying Service Usage Data

Oct 8, 2021 · 1 min read
Tanuki

GitLab has been working on a process that intentionally limits our own ability to identify individual users from Service Usage Data, in order to better protect user privacy. Earlier this year, we solicited input on plans to de-identify GitLab’s Service Usage Data. We are now ready to move forward with a new system that de-identifies SaaS usage data before it enters GitLab’s internal analytics environment.

What isn’t changing?

The service usage data policy for SaaS and Self-Managed remains unchanged.

What is changing?

With user privacy in mind, we are building a pseudonymization process to run against our SaaS service usage data.

We have determined that we no longer need fully identifiable information in our analytics environment, so we are pursuing this approach, which will result in better privacy for GitLab users.

We’ll apply a one-way hash or another transformation to directly identifiable data. This means that the data will be hashed at the collection layer, before it is sent to our analytics environment.
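To illustrate the general idea of hashing at the collection layer, here is a minimal Python sketch. It is not GitLab’s implementation: the choice of a keyed HMAC-SHA256, the field names in `IDENTIFYING_FIELDS`, and the helper names `pseudonymize_value` and `pseudonymize_event` are all assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical set of directly identifiable fields (an assumption for this
# sketch, not GitLab's actual list).
IDENTIFYING_FIELDS = {"user_id", "username", "email"}


def pseudonymize_value(value, key: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256, hex-encoded) of a value."""
    return hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_event(event: dict, key: bytes) -> dict:
    """Return a copy of the event with identifying fields replaced by hashes.

    Non-identifying fields and missing (None) values pass through unchanged.
    """
    return {
        name: pseudonymize_value(value, key)
        if name in IDENTIFYING_FIELDS and value is not None
        else value
        for name, value in event.items()
    }


# Example usage with made-up data and a placeholder key.
key = b"example-secret-kept-outside-analytics"
event = {"user_id": 42, "email": "user@example.com", "action": "push"}
safe_event = pseudonymize_event(event, key)
```

Because the same input and key always produce the same hash, events from one user can still be grouped together for analysis, while the original identifier cannot be read back from the hashed value. In this sketch, the key would need to stay outside the analytics environment for that property to hold.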

You can read more details about our pseudonymization solution here. Once the solution is in place, we will update the data we collect so that it conforms to it. You can read more about what data we collect and its de-identification state here.

Timeline and implementation

We’re planning to roll out these changes in October 2021. Keep an eye on our Product Intelligence roadmap to monitor our progress. Once the rollout is complete, we’ll update this blog post with the final status.

More information

You can find more information in our privacy policy. Further details on how service usage data is used for product improvement can be found on our Product Direction page. See also GitLab’s analytics environment and the SaaS Data Collection Catalog.

Please share your questions on the community forum.
