This is a collection of best practices gathered from working with customers on each stage of the SDLC. The list is not meant to be exhaustive; it gives the SA a few good pointers when going into a meeting on a specific stage or feature.
|1. Elasticsearch - Advanced Search||Enabling Elasticsearch for faster and more accurate searching of artefacts in GitLab|
Search is an important part of our everyday lives. From navigation apps that guide us to nearby restaurants to translating words that are not in our native language, search needs to be fast, accurate and flexible so that we get the most relevant results possible.
By integrating with Elasticsearch, GitLab can leverage the Lucene library to provide advanced search functionality for its users.
Elasticsearch uses clustering to distribute search, indexing and other tasks across nodes to achieve high-performance search. Unless it is deployed in a test environment, an Elasticsearch cluster should typically have at least 3 master-eligible nodes, so that a quorum can still be established if one of them fails.
As the cluster grows, more nodes can be added to support more concurrent users or to improve resiliency.
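The reasoning behind the 3-node minimum can be illustrated with a little arithmetic. The sketch below assumes a simple majority rule (more than half of the master-eligible nodes must agree); the function names are hypothetical and are not part of any Elasticsearch API.

```python
def quorum_size(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes (assumed majority rule)."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - quorum_size(master_eligible_nodes)


# Compare a few cluster sizes: nodes, quorum, failures tolerated.
for nodes in (1, 2, 3, 5):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

Under this rule a 2-node cluster tolerates no failures, while a 3-node cluster tolerates one, which is why 3 is usually quoted as the practical minimum outside of test environments.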
Elasticsearch can be installed on most platforms. In line with our recommendation that customers adopt Kubernetes, Elasticsearch also has an official Kubernetes operator and Docker image that simplify deployment and help the cluster scale out quickly as search traffic demands.
Question: Do you recommend running Elasticsearch on the same host as GitLab?
Answer: No. Elasticsearch consumes memory and file descriptors beyond what is allocated to the JVM heap (e.g. results caching, aggregations, etc.). Running it on the same host as GitLab can therefore cause resource contention between the two systems and make the whole setup unstable.
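To make the contention concrete, here is a rough sizing sketch. It assumes the commonly cited rule of thumb that the JVM heap should take no more than about half of the memory available to Elasticsearch, capped at roughly 31 GB, with the remainder left to the operating system. The function name, the cap and the 48 GB reserved for GitLab are illustrative assumptions, not measured figures.

```python
HEAP_CAP_GB = 31.0      # assumed upper bound for the JVM heap
HEAP_FRACTION = 0.5     # assumed share of available RAM given to the heap


def suggested_heap_gb(host_ram_gb: float, reserved_for_other_services_gb: float = 0.0) -> float:
    """Rule-of-thumb heap size once other services on the host are accounted for."""
    available = host_ram_gb - reserved_for_other_services_gb
    if available <= 0:
        raise ValueError("no memory left for Elasticsearch on this host")
    return min(available * HEAP_FRACTION, HEAP_CAP_GB)


# Dedicated 64 GB host versus the same host shared with GitLab
# (48 GB reserved for GitLab is a hypothetical figure).
print(suggested_heap_gb(64))
print(suggested_heap_gb(64, reserved_for_other_services_gb=48))
```

On the shared host, both the heap and the memory left over for everything outside the heap shrink sharply, which is the contention the answer above warns about.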
Question: Are Elasticsearch operations transactional?
Answer: No, not out of the box. Elasticsearch was not designed to be ACID compliant.