Feature Update February 2023
Mid-winter is fast approaching, meaning it’s nearly time to start thinking about spring again! But here at Catalogic all we’ve been thinking about lately is adding more features to CloudCasa. We were thrilled to hear that we’ve been named a leader and outperformer in GigaOm’s recently released Radar for Kubernetes Data Protection Report, but we have no intention of resting on our laurels!
We’ve billed this feature update as a minor release, but it still brings with it a pretty extensive list of new features and improvements. Many are centered around extending our cloud integration capabilities. Expect to see these extended even more in future updates.
Backup and restore of EKS VPC and subnet configuration
There is more to building a Kubernetes cluster in the cloud than just creating the cluster itself, so there is more to backup and restore of cloud-based Kubernetes clusters than just backing up and restoring the cluster! In this update, we’ve extended our backup and restore support for EKS to include VPCs and subnets. Now CloudCasa is able to back up VPCs and subnet configurations associated with EKS clusters and restore them, if you choose to do so. During the backup of an EKS cluster, the system now discovers and records the VPC associated with the EKS cluster and all the subnets within it. During restore, you will be presented with options to create a new VPC and subnets with the same configuration as the ones that were backed up (or a different configuration).
Backup and restore of AKS VNet and subnet configuration
Similar functionality has been added for Azure AKS. AKS supports two different types of networking:
kubenet - creates a new VNet for your cluster using default values
Azure CNI - allows clusters to use a new or existing VNet with customizable addresses. Application pods are connected directly to the VNet, which allows for native integration with VNet features.
CloudCasa can now create AKS clusters with either type of networking on restore. It also supports backup of all of the network configuration data related to both.
When an AKS cluster using Azure CNI networking is found, CloudCasa discovers the Azure VNet used by the cluster and its associated subnets. When a backup of the cluster begins, it backs up the configuration of the cluster, resource group, and Azure VNets and subnets. If the Azure VNet is located in a different resource group than the cluster, the configuration of the virtual network’s resource group will be backed up as well.
On the flip side, CloudCasa also supports restore of the Azure VNets and subnets. You can specify which subnets you want to restore. And since AKS admins can specify different subnets for different node pools in the same cluster, the system provides an option to specify which of the restored subnets should be used for each node pool.
A new ARM template version is necessary to support auto-discovery, backup, and restore of Azure VNets and subnets. Users with Azure accounts previously configured in CloudCasa should apply the new template.
Backup and restore of GKE VPC and subnet configuration
Similar functionality has been added for GKE as well. Before this update, CloudCasa supported only inventory of GCP VPC Networks and Subnets. You had the option to select the VPC and subnet in the restore screen when creating a new GKE cluster, but if the resources were deleted or didn’t exist in the target account, you would need to recreate them manually. Now CloudCasa supports backup and restore of the VPC and subnets associated with a GKE cluster. These are backed up automatically whenever you run a GKE backup.
When creating a cluster on restore, the cluster subnet will be restored by default. You can optionally specify other subnets not associated with the cluster to be restored as well.
Support for creating EKS private clusters
AWS EKS private clusters (i.e. clusters with no outbound Internet access) can now be created on restore via AWS PrivateLink. AWS PrivateLink allows you to communicate privately with a CloudCasa VPC endpoint from your own VPC, without the traffic traversing the public Internet. To restore a private EKS cluster, you must select a VPC that has been configured to communicate with CloudCasa via PrivateLink when creating the restore definition. Previously CloudCasa could back up private clusters, but it could only create public clusters. Note that AWS PrivateLink support is only available with premium service plans.
Support for EKS, AKS, and GKE add-ons
EKS, AKS, and GKE add-ons allow you to easily deploy and maintain workloads from the Kubernetes community, the cloud provider, or third-party vendors. Since add-ons install Kubernetes resources, CloudCasa previously had the ability to protect these workloads and restore them if necessary. However, some add-ons, such as the AWS EBS CSI add-on, require an IAM role with specific permissions or other configuration in order to function properly. This means that if, for example, an EKS cluster had the EBS CSI add-on installed and was restored to a different AWS account, it would not function properly until the admin manually updated the IAM role configured for it.
CloudCasa now retains the list of add-ons and their versions for each cloud cluster, and presents them during restore so that you can select which to restore and update their configuration when necessary.
New Log Analytics option for Azure accounts
Users are now presented with an option for “AKS Log Analytics” support when running the ARM template while adding an Azure account. If this option is not enabled, CloudCasa will disable the Azure Log Analytics add-on when creating an Azure AKS cluster during a restore. The option is enabled by default for newly added Azure accounts. Users with Azure accounts that were previously configured in CloudCasa will need to re-deploy the ARM template for each account if they want Log Analytics support enabled. Accounts that need the new template applied will be flagged with attention icons in the Configuration/Cloud Accounts page. A reminder alert will also be generated.
GKE non-CSI Google Persistent Disks support
We have introduced automatic support for snapshotting non-CSI PVs using Google Persistent Disks (GPD) on GKE. This support is made available via an option in the new version of the GCP Deployment Template, since it requires additional permissions. When you enable the option, the template deploys an additional custom role and service account on top of the normally required permissions, and grants the primary service account permission to generate the necessary temporary credentials.
During the inventory process, CloudCasa now discovers GCE Persistent Disk resources in the zones your clusters are located in. During the backup operation, it checks if the cluster has any GPD PVs that were provisioned using the non-CSI driver. If it does, it looks up the discovered GCE Persistent Disk, generates the appropriate temporary credentials for it, and passes them to the agent in order to create a snapshot. Snapshot status can be found in the “PV Details (snap)” tab on the job activity page.
GCP accounts that were previously configured in CloudCasa will need to be updated with the new version of the Deployment Template in order to support this feature. Accounts that need the new template applied will be flagged with attention icons in the Configuration/Cloud Accounts page.
Previously, a manual configuration method was available for supporting non-CSI GPD PVs, and it can still be used if you prefer it.
Added support for Block mode PVs
With this update we have added backup and restore support for block mode PVs (i.e. those with volumeMode set to “Block”). From the user’s perspective, everything should look the same as for PVs with volume mode Filesystem. This addition is aimed primarily at supporting KubeVirt, which frequently makes use of block mode PVs, but it has other applications as well.
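To make the distinction concrete, here is a minimal Python sketch of how a PVC manifest can be classified by volume mode. It relies only on the standard Kubernetes behavior that a missing volumeMode field means Filesystem. The function names and the sample PVC (name, size) are hypothetical and are not part of CloudCasa.

```python
def volume_mode(pvc_manifest: dict) -> str:
    """Return the volume mode of a PVC manifest, applying the Kubernetes default."""
    # Kubernetes treats a missing spec.volumeMode as "Filesystem".
    return pvc_manifest.get("spec", {}).get("volumeMode", "Filesystem")


def is_block_mode(pvc_manifest: dict) -> bool:
    """True if the PVC requests a raw block device rather than a filesystem."""
    return volume_mode(pvc_manifest) == "Block"


# Hypothetical PVC for a KubeVirt VM disk; the name and size are illustrative.
vm_disk_pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "vm-disk-0"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "volumeMode": "Block",
        "resources": {"requests": {"storage": "20Gi"}},
    },
}

print(is_block_mode(vm_disk_pvc))
```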
With CloudCasa’s data deduplication features, only changed blocks will be backed up. However, since Kubernetes does not support generating a list of blocks changed between snapshots (i.e. “dirty blocks”), the whole block PV must be read at each backup to discover which blocks changed. This read will generally be very fast for blocks that were never written.
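The general idea of finding changed blocks without a dirty-block list can be sketched in a few lines of Python. This is an illustration of the technique only, not CloudCasa’s implementation: the 4 KiB block size, the use of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib
import io
from typing import BinaryIO

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(volume: BinaryIO, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Read the whole volume and return one digest per block."""
    digests = []
    while True:
        block = volume.read(block_size)
        if not block:
            break
        digests.append(hashlib.sha256(block).digest())
    return digests


def changed_blocks(previous: list[bytes], current: list[bytes]) -> list[int]:
    """Return indexes of blocks that are new or differ from the previous backup."""
    return [
        i
        for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]


# Two in-memory "volumes" standing in for successive snapshots of a block PV.
first = io.BytesIO(b"A" * BLOCK_SIZE + b"\0" * BLOCK_SIZE + b"C" * BLOCK_SIZE)
second = io.BytesIO(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE)

print(changed_blocks(block_digests(first), block_digests(second)))  # [1]
```

Every block is read on each pass, but only the blocks whose digests differ would need to be stored.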
Added support for efficient restores of sparse files
We have made a change to CloudCasa to support more efficient restores of sparse files. Sparse files are files with “holes” created by seeking past the end of the file before writing. Systems interpret these holes as nulls when the file is read, but do not actually store the nulls on disk, allowing sparse files to consume much less storage space than they otherwise would.
When restoring, CloudCasa will now automatically create sparse files when it detects runs of nulls a block or more in length. This is especially advantageous when restoring block mode PVs and other disk image files, which often contain many null blocks. This change is aimed primarily at better supporting KubeVirt environments, but is useful for other applications as well.
Note that because of the de-duplicating storage scheme used by CloudCasa, backup of sparse files was already performed efficiently.
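For readers curious about how a sparse restore works in general, the Python sketch below copies a stream to a file and seeks over all-zero blocks instead of writing them. It is a simplified illustration under assumed values (a 4 KiB block size and a hypothetical write_sparse function), not the code CloudCasa uses. Whether the skipped ranges are actually left unallocated depends on the target file system.

```python
import io
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def write_sparse(src, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst_path, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    with open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block:
                # Skip ahead without writing; the file system may leave a hole here.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # If the data ended in a hole, extend the file to its full length.
        dst.truncate(dst.tell())


# A disk-image-like payload: data, a long run of nulls, then more data.
data = b"x" * BLOCK_SIZE + bytes(BLOCK_SIZE * 256) + b"y" * BLOCK_SIZE

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "restored.img")
    write_sparse(io.BytesIO(data), path)
    with open(path, "rb") as restored:
        round_trip_ok = restored.read() == data

print(round_trip_ok)
```

The restored file reads back identically to the source, since holes are returned as nulls on read.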
Added overwrite option for Kubernetes restores
We’ve added an option to allow overwriting of existing Kubernetes resources during a restore. Previously, existing resources would not be overwritten by a restore, and this is still the default behavior. This can be a very useful feature, but enable it with caution!
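The difference between the default behavior and the overwrite option can be modeled with a short Python sketch. The ResourceKey type, the plan_restore function, and the sample resources are hypothetical and only illustrate the semantics described above; they do not reflect CloudCasa’s internal logic.

```python
from typing import NamedTuple


class ResourceKey(NamedTuple):
    """Hypothetical identity of a Kubernetes resource for this example."""
    kind: str
    namespace: str
    name: str


def plan_restore(backed_up, existing, overwrite=False):
    """Split backed-up resources into create, overwrite, and skip lists."""
    plan = {"create": [], "overwrite": [], "skip": []}
    for key in backed_up:
        if key not in existing:
            plan["create"].append(key)
        elif overwrite:
            plan["overwrite"].append(key)
        else:
            # Default behavior: leave resources that already exist untouched.
            plan["skip"].append(key)
    return plan


backed_up = [
    ResourceKey("ConfigMap", "shop", "settings"),
    ResourceKey("Deployment", "shop", "web"),
]
existing = {ResourceKey("ConfigMap", "shop", "settings")}

default_plan = plan_restore(backed_up, existing)
overwrite_plan = plan_restore(backed_up, existing, overwrite=True)
print(default_plan["skip"], overwrite_plan["overwrite"])
```

In both cases the missing Deployment is created; only the handling of the existing ConfigMap differs.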
New implementation of Security scan
CloudCasa’s Kubernetes security scans have changed significantly with this update, due mainly to the phase-out of Starboard, an open source project that CloudCasa previously relied on. The new implementation now depends on Trivy for all cluster security scanning.
The most obvious result of this change is that when viewing cluster scan results, you will see different tabs and categories (2 tabs instead of 4). The assortment of checks performed has also changed, although overall coverage is similar. More checks will be added over time.
Due to the change, the CloudCasa agent needs to be updated to the latest version in order for new scans to run. If you have automatic updates for your agents disabled and haven’t yet updated them manually, scan jobs will fail until you do.
Added ability to select storage classes from a drop-down during restore
CloudCasa has provided the ability to remap storage classes on restore for quite a while, but with this update we’ve made it easier by providing available storage class options in a drop-down. This works whether you are restoring to an existing cluster or creating a new one. For an existing cluster, the selections will reflect the storage classes available on the cluster. If you are creating a cluster, the selections will reflect the classes supported by the cloud provider.
The order of information input in the restore wizard has been changed slightly so that the necessary data is available at this step to determine which storage classes are available.
Remember that storage class remapping is only available for restores from copy backups, not for snapshot restores.
Other changes
Pause and resume schedule options for all backup jobs - The pause and resume schedule functions are now available for all scheduled backup jobs, both copy and snapshot. See the Protection/Backups page. We apologize that this feature wasn’t available for all jobs previously.
Added storage location data to recovery point displays - Often, when performing a recovery, it’s important to be able to quickly determine which recovery points are available in a specific location. To make this easy, we’ve now added a location column to all tables listing recovery points (e.g. Protection/Recovery Points). We’ve also added the ability to sort and filter these views by storage location.
Restore wizard re-organization - Sub-steps have been added to the cluster creation step in the restore wizard in order to better organize all the options now available for cluster creation.
Time zone column added in the policies list - The policy listing under Configuration/Policies now has a time zone column so that you can easily see what time zone the policy uses.
Added application hook information to job activity - When application hooks are configured for a job, job activity now shows when they are running and their status.
Activity log improvements - We’ve again made numerous activity log message improvements, including improved warnings about resources not overwritten on restore. See the new overwrite option mentioned above if you want to overwrite resources!
Run cloud inventory job during EKS, AKS, and GKE backups - Cloud inventory jobs are now run at the start of each EKS, AKS, and GKE backup rather than just periodically. This prevents recently added native volumes or other cloud resources from being missed by the backup.
Resource type options limited to appropriate types - When selecting resources by type for a restore, cluster-scoped resource types will now only be displayed if a full cluster restore (i.e. not specific namespaces) has been selected.
UI no longer lists too-small VM sizes for GKE - VM sizes too small to support CloudCasa are no longer listed as options when creating a GKE cluster during a restore. These would be unlikely choices for Kubernetes worker nodes anyway.
As usual, many bug fixes and improvements to performance, reliability, and usability are also included in this update.
CloudFormation Stack, ARM Template, & GCP Deployment Template updates
Our CloudFormation stack used for AWS accounts, our ARM template used for Azure accounts, and our Deployment Template used for GCP accounts have all been updated in this release in order to support the new features. You’ll need to apply the new versions to any previously configured cloud accounts in order to take advantage of these features. You can see which accounts need to be updated by going to the Configuration/Cloud Accounts page. Accounts needing updates will be flagged with an attention icon. Just click on the icon to begin the update process.
Kubernetes agent updates
In this update we’ve again made several changes to our Kubernetes agent to add features, improve performance, and fix bugs. However, manual updates shouldn’t normally be necessary anymore because of the automatic agent update feature. If you have automatic updates disabled for any of your agents, you should update them manually as soon as possible.
Notes
With some browsers you may need to restart, hit Control-F5, and/or clear the cache to make sure you have the latest version of the CloudCasa web app when first logging in after the update.
As always, we want to hear your feedback on new features! You can contact us via the user forum, using the support chat feature, or by sending email to support@cloudcasa.io.