The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
Last updated: 2021-09-22
The Geo-replication category helps distributed developer teams be more productive. With a single GitLab instance, working with large repositories can take a long time for developers located in different geographies. Geo-replication provides an easily configurable, read-only mirror (we call it a Geo site) of a GitLab installation that is complete, accurate, verifiable and efficient. This is valuable because using Geo reduces the time it takes to fetch and clone repositories, which increases developer productivity.
Please reach out to Nick Nguyen, Acting Product Manager for the Geo group (Email) if you'd like to provide feedback or ask any questions related to this product category.
This strategy is a work in progress, and everyone can contribute. Please comment and contribute in the linked issues and epics on this page. Sharing your feedback directly on GitLab.com is the best way to contribute to our strategy and vision.
Geo-replication requires a significant investment to be configured by systems administrators, but it allows users in different locations to perform Git read operations against a nearby Geo site. Write requests are transparently proxied to the primary site. Geo replicates around 83% of the data generated by GitLab, and replication of a number of data types can be disabled if required.
Our goal for Geo-replication is to offer the same experience to users, regardless of their location. In the future, we want our users to be able to configure Geo within minutes - not hours. We envision Geo-replication to be fully transparent to users. This means that a developer should not need to actively decide to use Geo, or select the right Geo site - GitLab should be able to determine what Geo site should be used to provide the best user experience. For systems administrators, it should be simple to add, configure and remove new sites.
For more information on how we use personas and roles at GitLab, please click here.
Using a Geo site to overcome UX issues (e.g. latency) requires additional configuration for software developers, which is cumbersome. Using the secondary Web interface is a worse user experience than using the primary. A software developer needs to switch between a primary and secondary frequently, which can be highly confusing and frustrating.
We plan to automatically choose the best Geo site. This means that Geo will forward any requests from a secondary to a primary unless the user experience can be significantly improved by using the secondary. This will likely result in the deprecation of the read-only web interface because requests will be proxied from a secondary to a primary.
We are investigating a proof of concept that would allow Geo to proxy any requests to the primary site unless a secondary site can significantly improve the user experience.
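The routing rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Geo's implementation: the `Request` type, the `is_git_read` flag, the `secondary_in_sync` check and the returned site labels are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request description; not a GitLab type."""
    path: str
    is_git_read: bool  # e.g. a clone or fetch


def choose_site(request: Request, secondary_in_sync: bool) -> str:
    """Return which site should serve the request.

    Assumed rule: serve from the secondary only when it can clearly help
    (a Git read against an up-to-date copy); proxy everything else to
    the primary.
    """
    if request.is_git_read and secondary_in_sync:
        return "secondary"
    return "primary"
```

In this sketch, a clone against an up-to-date secondary stays local, while web requests and reads against a lagging secondary are forwarded to the primary. What counts as "significantly improved" in practice is left open by the text above.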
Setting up Geo is highly manual and cumbersome, especially in high-availability configurations. Simplifying the installation and configuration of Geo for single and multi-node sites will remove a pain point for administrators and help drive adoption.
To save bandwidth and resources, an administrator may want to selectively enable and disable Geo replication for certain types of data. Currently, this is not possible unless a data type is released behind a feature flag, and this is not the case for all data types. We want to provide administrators an easy way to enable or disable replication by data type in the Geo Administrator UI.
It is currently possible for systems administrators to get a basic overview of the Geo status using the Geo Web UI. However, administrators would like easier access to more in-depth Geo metrics such as the time it takes to mirror a commit. We want to define and implement key metrics that allow administrators to better monitor their Geo installations and publish the metrics to a preconfigured Grafana dashboard.
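As an illustration of one such metric, the time it takes to mirror a commit could be derived from two timestamps: when the commit landed on the primary and when it became available on the secondary. The sketch below assumes those timestamps are already collected somewhere; the function names and the sample format are hypothetical and do not correspond to an existing Geo API.

```python
from datetime import datetime, timedelta
from statistics import median


def mirror_lag(committed_at: datetime, replicated_at: datetime) -> timedelta:
    """Time between a commit reaching the primary and the secondary."""
    if replicated_at < committed_at:
        raise ValueError("replication cannot finish before the commit exists")
    return replicated_at - committed_at


def median_lag_seconds(samples: list[tuple[datetime, datetime]]) -> float:
    """Median mirror lag in seconds over (committed_at, replicated_at) pairs."""
    return median(mirror_lag(c, r).total_seconds() for c, r in samples)
```

A median is used here rather than a mean so that a single slow sync does not dominate the figure; a dashboard would likely show high percentiles as well.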
Some customers would like to use their Geo sites for CI/CD. For example, a customer located in Europe with offices in Australia may want certain pipelines to clone from the Australia secondary. Another possible use case for Geo could be to work as a "mirror" that lowers the pressure automated builds put on the primary site. At the moment, using a Geo secondary for CI/CD is not well-documented, may require complex workarounds, and does not guarantee that repositories and other data are up-to-date.
For Geo-replication, only a subset of data may need to be replicated, but Geo sites currently require spinning up the entire GitLab stack even when less would be sufficient. Additionally, systems administrators can select a subset of data via selective sync, but their selection may not match what users actually need.
We are investigating an advanced caching mode with the following properties:
We are currently not planning on moving away from PostgreSQL as a backend database in favour of e.g. CockroachDB or Google Spanner. This has implications for writable Geo sites, but for now we will continue to support PostgreSQL.
Geo secondary sites are read-only. Customer feedback has indicated a desire for active-active Git replication. With the availability of Gitaly Cluster, we may start investigating writable Geo sites at some point in FY23.
This category is currently at the Viable maturity level, and our next maturity target is Complete (see our definitions of maturity levels).
You can track the work that will move the category to
The top competitors for Geo-replication are
| Feature | GitHub | Azure DevOps | Bitbucket Smart Mirroring | GitLab |
|---|---|---|---|---|
| Mirror Docker registries | ❌ | N/A | ❌ | ✅ |
| LFS and file upload support | ✅ | N/A | ✅ | ✅ |
✅ Fully available ⚠️ Partially available ❌ Not available N/A No information available