Multi-node upgrades with downtime
DETAILS: Tier: Free, Premium, Ultimate Offering: GitLab Self-Managed
While you can upgrade a multi-node GitLab deployment with zero downtime, there are a number of constraints. In particular, you can upgrade to only one minor release at a time, for example, from 14.6 to 14.7, then to 14.8, etc.
If you want to upgrade to more than one minor release at a time (for example, from 14.6 to 14.9), you must take your GitLab instance offline, which means downtime. Before starting this process, check the version-specific upgrade instructions relevant to your upgrade path.
For a single-node installation, you only need to upgrade the GitLab package.
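To decide which procedure applies, you can compare the current and target versions. The following is a minimal sketch, not an official tool: the function names `minor_jump` and `upgrade_kind` are hypothetical, and it assumes versions of the form `MAJOR.MINOR` or `MAJOR.MINOR.PATCH`. It does not model upgrades across a major version boundary, so it only flags them for manual review.

```shell
#!/bin/sh
# Sketch only: classify an upgrade path by how many minor releases it spans.

minor_jump() {
  # Print the difference between the minor components of two versions,
  # for example "14.6" and "14.9" give 3.
  from_minor=${1#*.}; from_minor=${from_minor%%.*}
  to_minor=${2#*.};   to_minor=${to_minor%%.*}
  echo $(( to_minor - from_minor ))
}

upgrade_kind() {
  from_major=${1%%.*}
  to_major=${2%%.*}
  if [ "$from_major" != "$to_major" ]; then
    # Not modelled here: check the version-specific instructions.
    echo "major-change"
  elif [ "$(minor_jump "$1" "$2")" -le 1 ]; then
    # At most one minor release: zero-downtime upgrade may be an option.
    echo "single-minor"
  else
    # More than one minor release: use the procedure with downtime.
    echo "multi-minor"
  fi
}

upgrade_kind 14.6 14.9   # prints: multi-minor
```

A result of `multi-minor` corresponds to the case described above, where the instance must be taken offline.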
The process for upgrading a number of components of a multi-node GitLab installation is the same as for zero-downtime upgrades. The differences relate to the servers running Rails (Puma/Sidekiq) and the order of events.
At a high level, the process is:
- Shut down the GitLab application.
- Upgrade your Consul servers.
- Upgrade the other back-end components:
  - Gitaly, Rails PostgreSQL, Redis, PgBouncer: these can be upgraded in any order.
  - If you use PostgreSQL or Redis from your cloud platform and upgrades are required, substitute the instructions for Omnibus GitLab with your cloud provider's instructions.
- Upgrade the GitLab application (Sidekiq, Puma) and start the application up.
Stop writes to the database
Before upgrading, you must stop writes to the database. The process differs depending on your reference architecture.
::Tabs
:::TabTitle Linux package
Shut down Puma and Sidekiq on all servers running these processes:
sudo gitlab-ctl stop sidekiq
sudo gitlab-ctl stop puma
:::TabTitle Cloud Native Hybrid
For Cloud Native Hybrid environments:
- Note the current number of replicas for database clients for subsequent restart:
kubectl get deploy -n <namespace> -l 'release=<helm release name>,app in (prometheus,webservice,sidekiq)' -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.replicas}{"\n"}{end}'
- Stop the clients of the database:
kubectl scale deploy -n <namespace> -l 'release=<helm release name>,app in (prometheus,webservice,sidekiq)' --replicas=0
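Because the replica counts are needed again after the upgrade, it can help to write them to a file instead of noting them by hand. This is a minimal sketch under stated assumptions: the function name `save_replicas`, the `KUBECTL` override variable, and the output file path are all hypothetical. It combines both label conditions into a single selector.

```shell
#!/bin/sh
# Sketch only: record the replica count of each database client deployment.

# Command used to talk to the cluster; overridable for dry runs.
KUBECTL=${KUBECTL:-kubectl}

save_replicas() {
  # $1 = namespace, $2 = Helm release name, $3 = output file.
  # Writes one "<deployment><TAB><replicas>" line per deployment.
  $KUBECTL get deploy -n "$1" \
    -l "release=$2,app in (prometheus,webservice,sidekiq)" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.replicas}{"\n"}{end}' \
    > "$3"
}
```

For example, `save_replicas gitlab gitlab-release /tmp/replicas.tsv` would store the counts before the deployments are scaled to zero (namespace, release name, and path are placeholders).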
::EndTabs
Upgrade the Consul nodes
Consult the Consul documentation for the complete instructions.
In summary:
- Check the Consul nodes are all healthy.
- Upgrade the GitLab package on all your Consul servers.
- Restart all GitLab services one node at a time:
  sudo gitlab-ctl restart
If your Consul cluster processes are not on their own servers, and are shared with another service such as Redis HA or Patroni, follow these principles when upgrading those servers:
- Do not restart services on more than one server at a time.
- Check the Consul cluster is healthy before upgrading or restarting services.
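The health check can be scripted so that it gates each restart. The sketch below assumes the Consul binary bundled with the Linux package lives at `/opt/gitlab/embedded/bin/consul` and that `consul members` prints a header row followed by one row per node with the status in the third column. The function names and the `CONSUL` override variable are hypothetical. You might need to run it with elevated privileges.

```shell
#!/bin/sh
# Sketch only: treat the cluster as healthy when every member is "alive".

# Path to the Consul binary; overridable for testing.
CONSUL=${CONSUL:-/opt/gitlab/embedded/bin/consul}

consul_unhealthy_members() {
  # Reads `consul members` output on stdin, skips the header row, and
  # prints the name of every node whose Status column is not "alive".
  awk 'NR > 1 && $3 != "alive" { print $1 }'
}

consul_cluster_healthy() {
  # Fail if the command itself fails, so an unreachable agent is not
  # mistaken for a healthy cluster.
  members=$($CONSUL members) || return 1
  [ -z "$(printf '%s\n' "$members" | consul_unhealthy_members)" ]
}
```

A restart loop could then call `consul_cluster_healthy || exit 1` before touching the next node.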
Upgrade the Gitaly nodes (Praefect / Gitaly Cluster)
If you're running Gitaly Cluster, follow the zero-downtime process for Gitaly Cluster.
If you are using Amazon Machine Images (AMIs) on AWS, you can either upgrade the Gitaly nodes through the AMI process, or upgrade the package itself:
- If you're using Elastic network interfaces (ENI), you can upgrade through the AMI process. With ENI, you can keep the private DNS names through AMI instance changes, which is crucial for Gitaly to work.
- If you're not using ENI, you must upgrade Gitaly using the GitLab package. This is because Gitaly Cluster tracks replicas of Git repositories by the server hostname, and a redeployment using AMIs issues the nodes with new hostnames. Even though the storage is the same, Gitaly Cluster does not work when the hostnames change.
The Praefect nodes, however, can be upgraded by using an AMI redeployment process:
- The AMI redeployment process must include gitlab-ctl reconfigure. Set praefect['auto_migrate'] = false on the AMI so all nodes get this. This prevents reconfigure from automatically running database migrations.
- The first node to be redeployed with the upgraded image should be your deploy node.
- After it's deployed, set praefect['auto_migrate'] = true in gitlab.rb and apply with gitlab-ctl reconfigure. This runs the database migrations.
- Redeploy your other Praefect nodes.
Upgrade the Gitaly nodes not part of Gitaly Cluster
For Gitaly servers that are not part of Gitaly Cluster, upgrade the GitLab package.
If you have multiple Gitaly shards or have multiple load-balanced Gitaly nodes using NFS, it doesn't matter in which order you upgrade the Gitaly servers.
Upgrade the PostgreSQL nodes
For non-clustered PostgreSQL servers:
- Upgrade the GitLab package.
- The upgrade process does not restart PostgreSQL when the binaries are upgraded. Restart to load the new version:
  sudo gitlab-ctl restart
Upgrade the Patroni node
Patroni is used to achieve high availability with PostgreSQL.
If a PostgreSQL major version upgrade is required, follow the major version process.
For all other upgrades, upgrade all the replicas first. After they're upgraded, the cluster fails over from the leader to one of the upgraded replicas. This way only one failover is needed and, when it completes, the new leader is already running the upgraded version.
Follow this process:
- Identify the leader and replica nodes, and verify that the cluster is healthy. Run on a database node:
  sudo gitlab-ctl patroni members
- Upgrade the GitLab package on one of the replica nodes.
- Restart to load the new version:
  sudo gitlab-ctl restart
- Repeat these steps for the other replica: upgrade, restart, health check.
- Upgrade the leader node following the same package upgrade as the replicas.
- Restart all services on the leader node to load the new version, and also trigger a cluster failover:
  sudo gitlab-ctl restart
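Identifying the leader in the first step can be automated by parsing the member table. The sketch below assumes `gitlab-ctl patroni members` prints a `|`-delimited table with the columns Member, Host, Role, and State in that order, and that the leader's Role is the literal string `Leader`. Column layout and role labels can differ between Patroni versions, so treat this as illustrative. The function name `patroni_leader` is hypothetical.

```shell
#!/bin/sh
# Sketch only: extract the leader's member name from the Patroni table.

patroni_leader() {
  # Reads the table on stdin. With "|" as the separator, field 2 is the
  # Member column and field 4 is the Role column. Border and title rows
  # contain no "|" and therefore never match.
  awk -F'|' '
    {
      role = $4
      gsub(/^[ \t]+|[ \t]+$/, "", role)
    }
    role == "Leader" {
      name = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      print name
    }
  '
}
```

Usage on a database node: `sudo gitlab-ctl patroni members | patroni_leader`. Every other member in the table is a replica to upgrade before the leader.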
Upgrade the PgBouncer nodes
If you run PgBouncer on your Rails (application) nodes, PgBouncer is upgraded as part of the application server upgrade.
Otherwise, upgrade the GitLab package on the PgBouncer nodes.
Upgrade the Redis node
Upgrade a standalone Redis server by upgrading the GitLab package.
Upgrade Redis HA (using Sentinel)
DETAILS: Tier: Premium, Ultimate Offering: GitLab Self-Managed
Follow the zero-downtime instructions for upgrading your Redis HA cluster.
Upgrade the Rails components
::Tabs
:::TabTitle Linux package
All the Puma and Sidekiq processes were previously shut down. On each node:
- Ensure /etc/gitlab/skip-auto-reconfigure does not exist.
- Check that Puma and Sidekiq are shut down:
  ps -ef | grep -E 'puma: | puma | sidekiq '
Select one node that runs Puma. This is your deploy node, and is responsible for running all database migrations. On the deploy node:
- Ensure the server is configured to permit regular migrations. Check that /etc/gitlab/gitlab.rb does not contain gitlab_rails['auto_migrate'] = false. Either set gitlab_rails['auto_migrate'] = true explicitly or omit it for the default behavior (true).
- If you're using PgBouncer:
  You must bypass PgBouncer and connect directly to PostgreSQL before running migrations.
  Rails uses an advisory lock when attempting to run a migration to prevent concurrent migrations from running on the same database. These locks are not shared across transactions, resulting in ActiveRecord::ConcurrentMigrationError and other issues when running database migrations using PgBouncer in transaction pooling mode.
  - If you're running Patroni, find the leader node. Run on a database node:
    sudo gitlab-ctl patroni members
  - Update gitlab.rb on the deploy node. Change gitlab_rails['db_host'] and gitlab_rails['db_port'] to either:
    - The host and port for your database server (non-clustered PostgreSQL).
    - The host and port for your cluster leader if you're running Patroni.
  - Apply the changes:
    sudo gitlab-ctl reconfigure
- Upgrade the GitLab package.
- If you modified gitlab.rb on the deploy node to bypass PgBouncer:
  - Update gitlab.rb on the deploy node. Change gitlab_rails['db_host'] and gitlab_rails['db_port'] back to your PgBouncer settings.
  - Apply the changes:
    sudo gitlab-ctl reconfigure
- To ensure all services are running the upgraded version, and (if applicable) accessing the database using PgBouncer, restart all services on the deploy node:
  sudo gitlab-ctl restart
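The first check on the deploy node (that migrations are not disabled) can be expressed as a small test against the configuration file. This is a minimal sketch: the function name `auto_migrate_disabled` is hypothetical, and it only looks for an uncommented line that assigns the literal value `false`, so it would miss a value set indirectly (for example, through a Ruby variable).

```shell
#!/bin/sh
# Sketch only: detect whether gitlab.rb disables automatic migrations.

auto_migrate_disabled() {
  # $1 = path to gitlab.rb.
  # Succeeds (exit 0) if an uncommented line sets
  # gitlab_rails['auto_migrate'] to false; commented-out lines are ignored
  # because the pattern is anchored to the start of the line.
  grep -Eq "^[[:space:]]*gitlab_rails\['auto_migrate'\][[:space:]]*=[[:space:]]*false" "$1"
}
```

On the deploy node, a pre-flight step could be `auto_migrate_disabled /etc/gitlab/gitlab.rb && echo "migrations are disabled on this node"`.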
Next, upgrade all the other Puma and Sidekiq nodes. The setting gitlab_rails['auto_migrate'] can be set to anything in gitlab.rb on these nodes.
They can be upgraded in parallel:
- Upgrade the GitLab package.
- Ensure all services are restarted:
  sudo gitlab-ctl restart
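Because the remaining nodes do not depend on each other, the same command can be dispatched to all of them at once. The sketch below assumes the nodes are reachable over SSH. The function name `run_in_parallel`, the `REMOTE` override variable, and the host names in the usage note are hypothetical. It runs one background job per host and reports failure if any job fails.

```shell
#!/bin/sh
# Sketch only: run one command on several nodes concurrently.

# Program used to reach a node; overridable for dry runs.
REMOTE=${REMOTE:-ssh}

run_in_parallel() {
  # $1 = command to run on each node; remaining arguments = host names.
  remote_cmd=$1
  shift
  pids=""
  for host in "$@"; do
    $REMOTE "$host" "$remote_cmd" &
    pids="$pids $!"
  done
  # Wait for each job individually so a failure on any node is noticed.
  failed=0
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

For example, `run_in_parallel "sudo gitlab-ctl restart" rails-2.example.com rails-3.example.com` restarts services on two nodes after their packages are upgraded.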
:::TabTitle Cloud Native Hybrid
Now that all stateful components are upgraded, follow the GitLab chart upgrade steps to upgrade the stateless components (Webservice, Sidekiq, other supporting services).
After you perform the GitLab chart upgrade, resume the database clients:
kubectl scale deploy -lapp=sidekiq,release=<helm release name> -n <namespace> --replicas=<value>
kubectl scale deploy -lapp=webservice,release=<helm release name> -n <namespace> --replicas=<value>
kubectl scale deploy -lapp=prometheus,release=<helm release name> -n <namespace> --replicas=<value>
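If the replica counts were recorded in a file before the upgrade, they can be restored in a loop instead of typing each value. This sketch assumes a file with one `<deployment><TAB><replicas>` line per deployment, which is the format the earlier jsonpath query produces. The function name `restore_replicas`, the `KUBECTL` override variable, and the file path are hypothetical.

```shell
#!/bin/sh
# Sketch only: scale each deployment back to its recorded replica count.

# Command used to talk to the cluster; overridable for dry runs.
KUBECTL=${KUBECTL:-kubectl}

restore_replicas() {
  # $1 = namespace, $2 = file with "<deployment><TAB><replicas>" lines.
  while IFS="$(printf '\t')" read -r name replicas; do
    # Skip blank or incomplete lines.
    [ -n "$name" ] && [ -n "$replicas" ] || continue
    # Redirect stdin so the command cannot consume the rest of the file.
    $KUBECTL scale deploy "$name" -n "$1" --replicas="$replicas" < /dev/null
  done < "$2"
}
```

Setting `KUBECTL=echo` before calling `restore_replicas <namespace> /tmp/replicas.tsv` prints the commands without running them, which is a cheap way to review the values first.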
::EndTabs