Backup and upgrade your EKS cluster from version 1.17 to 1.18 with Velero and eksctl.
For compatibility and product-maturity reasons, I usually keep my EKS clusters one version behind the latest release.
AWS introduced 1.19 support for EKS, so it’s time to upgrade my cluster to 1.18.
Before proceeding with the upgrade it is always good to read the release notes for the new version and to take a backup. I will never tire of emphasizing the importance of a backup in case something goes wrong.
Tools used:
an EKS cluster (it seems obvious to me) :-)
kubectl
AWS cli
eksctl (version 0.38.0 or above)
Velero
KubePug — Pre UpGrade (optional)
Velero
Velero is an open source tool to safely back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes. With Velero you can back up the entire cluster or individual namespaces, or filter objects using labels. It can also be used to migrate on-prem Kubernetes workloads to the cloud and to assist with cluster upgrades.
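As a sketch of that flexibility, here are a few ways a backup can be scoped with the velero CLI (the namespace names and label are illustrative):

```shell
# Back up the entire cluster
velero backup create full-backup

# Back up only selected namespaces (names are illustrative)
velero backup create staging-backup --include-namespaces staging,monitoring

# Back up only objects matching a label selector
velero backup create nginx-backup --selector app=nginx
```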
Velero uses an AWS S3 bucket to store EKS cluster backups, so the first thing to do is create the bucket that will contain our backup. For this we can use the console or the AWS CLI:
aws s3api create-bucket \
--acl private \
--bucket velero-bucket-eks-backup \
--region eu-west-1 \
--create-bucket-configuration LocationConstraint=eu-west-1
Now let’s create an IAM user for Velero:
aws iam create-user --user-name velero
Create a file called velero-policy.json and paste in the following policy, to grant the previously created IAM user the necessary permissions:
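The policy below follows the one documented for the Velero AWS plugin; it grants EC2 snapshot permissions plus access to our bucket. Make sure the bucket ARN matches the bucket you created (velero-bucket-eks-backup in this example):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": ["arn:aws:s3:::velero-bucket-eks-backup/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::velero-bucket-eks-backup"]
    }
  ]
}
```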
Attach the policy to the velero IAM user with the command:
aws iam put-user-policy \
--user-name velero \
--policy-name velero \
--policy-document file://velero-policy.json
Create and export the user’s access key:
aws iam create-access-key \
--user-name velero > velero-access-key.json
Let’s create a file called velero-credentials and paste in the access key and secret key created previously:
[default]
aws_access_key_id=AKIAXXX111222BBB444
aws_secret_access_key=V123AaaB456cCc789ddD012EeE345ffF678
Install Velero by downloading the latest release available from the repository (here the darwin build for macOS; on Linux pick the linux-amd64 archive instead):
wget https://github.com/vmware-tanzu/velero/releases/download/v1.5.3/velero-v1.5.3-darwin-amd64.tar.gz
Extract the archive and move the velero binary to /usr/local/bin:
tar -xvf velero-v1.5.3-darwin-amd64.tar.gz
cd velero-v1.5.3-darwin-amd64
sudo mv velero /usr/local/bin
Verify that it is correctly installed:
velero version
In my case the command also returns the server version, since Velero was previously installed on the cluster.
Let’s proceed with the installation of velero on EKS:
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.1.0 \
--bucket velero-bucket-eks-backup \
--backup-location-config region=eu-west-1 \
--snapshot-location-config region=eu-west-1 \
--secret-file ./velero-credentials \
--velero-pod-cpu-limit 2 \
--velero-pod-mem-limit 4G
Check that the installation was successful:
kubectl get all -n velero

NAME                          READY   STATUS    RESTARTS   AGE
pod/velero-54bdf48d57-hzmhz   1/1     Running   0          4d

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero   1/1     1            1           4d

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-54bdf48d57   1         1         1       4d
Optional step
Check for deprecated APIs and any breaking changes using KubePug — Pre UpGrade.
Download and install the latest version available:
wget https://github.com/rikatz/kubepug/releases/download/v1.1.3/kubepug_darwin_amd64.tar.gz
tar -xvf kubepug_darwin_amd64.tar.gz
cd kubepug_darwin_amd64
sudo mv kubepug /usr/local/bin
Verify the correct installation:
kubepug --version
Let’s check the cluster status for deprecated API using the EKS Kubernetes released version (1.18.9 in this case):
kubepug --k8s-version=v1.18.9
Once we have verified that our deployments are compatible with the new version’s APIs, we can continue.
Let’s backup
Let’s proceed to make the backup:
velero backup create staging-backup
We will see that a folder called “backups” is created in the bucket, containing the data of the various backups; in our case it holds a folder called staging-backup.
Snapshots of any PVCs present will also be taken automatically.
Run describe and logs on the backup to check for errors or a failed status:
velero backup describe staging-backup
velero backup logs staging-backup
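Should the upgrade go wrong, restoring from this backup is a single command. As a sketch (the restore and schedule names are arbitrary):

```shell
# Restore everything from the backup taken above
velero restore create staging-restore --from-backup staging-backup

# Check the restore status
velero restore describe staging-restore

# Optionally, keep daily backups going forward (cron syntax, 2 AM)
velero schedule create daily-backup --schedule "0 2 * * *"
```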
Once we have backed up everything, we can proceed with the cluster upgrade. This procedure requires eksctl version 0.38.0 or later. You can check your version with the following command:
eksctl version
If you don’t have eksctl already installed, you can install it with the following command:
brew install weaveworks/tap/eksctl
If eksctl is already installed, run the following command to upgrade:
brew upgrade eksctl && brew link --overwrite eksctl
Let’s do the cluster upgrade:
eksctl upgrade cluster --name cicd-staging --version=1.18
After verifying all the changes that will be made we can launch:
eksctl upgrade cluster --name cicd-staging --version=1.18 --approve
Let’s wait for the upgrade to finish and then proceed to upgrade the nodegroups:
eksctl get nodegroups --cluster cicd-staging
eksctl upgrade nodegroup --name=cicd-staging-nodes --cluster=cicd-staging --kubernetes-version=1.18
Once the upgrade of the nodegroups has been completed, let’s update the default add-ons:
eksctl utils update-kube-proxy --cluster=cicd-staging
eksctl utils update-aws-node --cluster=cicd-staging
eksctl utils update-coredns --cluster=cicd-staging
In this case too, we will have to append --approve to each command to actually execute it.
Once this operation is completed, our cluster will be running version 1.18 and the nodegroups will be on the latest available AMI release.
After all the necessary checks, you can delete the bucket and the snapshots created during the backup (although personally I recommend keeping them for at least a week, to be safe).
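When you do decide to clean up, the bucket must be emptied before it can be deleted. A sketch with the names used above (the snapshot ID is a placeholder; look up the real IDs of the Velero-created snapshots in the EC2 console):

```shell
# Empty the bucket, then delete it
aws s3 rm s3://velero-bucket-eks-backup --recursive
aws s3api delete-bucket \
  --bucket velero-bucket-eks-backup \
  --region eu-west-1

# Delete an EBS snapshot created by Velero (placeholder ID)
aws ec2 delete-snapshot --snapshot-id snap-0123456789abcdef0
```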