Backup to S3 fails with a generic error
Summary
Backups to an S3 bucket can sometimes fail with the following error in the job activity:
Failed to validate object store access permissions. Error: "S3 Error: RequestError | send request failed"
Note
This article assumes that the error occurs with user-configured S3 storage. To debug issues with CloudCasa storage, please contact CloudCasa support.
Details
This error is very generic, and several different causes can produce it, such as:
The S3 endpoint DNS name cannot be resolved.
Firewalls closing connections to the S3 provider.
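The DNS cause can be checked with a quick name lookup from inside the cluster. The following is a minimal sketch, not a CloudCasa tool: the function names "endpoint_host" and "check_dns" are hypothetical, and it assumes that "bash" and "getent" are available in the container where it is run, which may not be true for every image. It does not handle IPv6 literal endpoints.

```shell
#!/usr/bin/env bash

# Hypothetical helper: extract the host part of an endpoint URL.
endpoint_host() {
  local url="$1"
  url="${url#*://}"   # drop the scheme, if present
  url="${url%%/*}"    # drop any path
  url="${url%%:*}"    # drop any port
  printf '%s\n' "$url"
}

# Hypothetical helper: report whether the endpoint host resolves
# from wherever this function is run.
check_dns() {
  local host
  host="$(endpoint_host "$1")"
  if getent hosts "$host" >/dev/null 2>&1; then
    echo "DNS OK: $host"
  else
    echo "DNS FAILED: $host"
    return 1
  fi
}
```

For example, "check_dns <ENDPOINT>" prints "DNS FAILED" followed by the host name when the name cannot be resolved.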
To debug the problem, exec into the “kubeagent-” pod in the cloudcasa-io namespace and run the following command:
curl [-k] -v <ENDPOINT>
Use the “-k” option when the S3 server uses a self-signed certificate. “ENDPOINT” is the endpoint you configured in the user object store.
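The exec step can also be scripted from a workstation. This is a sketch under assumptions: the function names are hypothetical, the first pod whose name starts with “kubeagent-” is taken to be the right one, and “kubectl exec” is left to pick the pod's default container. The “--max-time” option is added so that a silently dropped connection does not hang the check.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first pod in cloudcasa-io whose
# name starts with "kubeagent-" (in the form "pod/<name>").
find_kubeagent_pod() {
  kubectl -n cloudcasa-io get pods -o name | grep '/kubeagent-' | head -n 1
}

# Hypothetical helper: run the curl check from inside that pod.
# Add -k to the curl options for a self-signed certificate.
curl_from_kubeagent() {
  local endpoint="$1" pod
  pod="$(find_kubeagent_pod)"
  if [ -z "$pod" ]; then
    echo "no kubeagent- pod found in cloudcasa-io" >&2
    return 1
  fi
  kubectl -n cloudcasa-io exec "$pod" -- curl -v --max-time 15 "$endpoint"
}
```

Run it as "curl_from_kubeagent <ENDPOINT>".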
If “curl” doesn’t reveal the issue, you can use the “aws” CLI, which is available in the Docker image “amazon/aws-cli”. In air-gapped environments, the image needs to be copied to a local registry first. Here is one way to run the CLI and check S3 connectivity:
kubectl -n cloudcasa-io run -it --rm awscli --image=amazon/aws-cli --env="AWS_ENDPOINT_URL=" --env="AWS_ACCESS_KEY_ID=" --env="AWS_SECRET_ACCESS_KEY=" --command -- aws --debug --no-verify-ssl s3 ls
Fill in the env values before running the command. If there are no connectivity issues, you should see a list of buckets.
If you see the following message, it implies that a firewall closed the connection:
Connection was closed before we received a valid response from endpoint URL: <ENDPOINT>.
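Because “--debug” produces a lot of output, it can help to save the output to a file and search it for this message. The sketch below assumes the output was captured with something like "aws --debug --no-verify-ssl s3 ls > s3-check.log 2>&1"; the function name and the log file name are hypothetical, and only the message quoted above is recognized.

```shell
#!/usr/bin/env bash

# Hypothetical helper: look for the "connection closed" message in a
# saved copy of the aws CLI output.
classify_s3_check() {
  local log="$1"
  if grep -q 'Connection was closed before we received a valid response' "$log"; then
    echo "connection-closed"   # consistent with a firewall closing the connection
  else
    echo "unclassified"        # inspect the log manually
  fi
}
```

Calling "classify_s3_check s3-check.log" prints "connection-closed" when the message is present in the file.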
Note that the “kubectl run” command above starts a new pod every time it is run. A slight variation is to run “bash” in the same image and then run the “aws” command from the shell, like so:
kubectl -n cloudcasa-io run ...... --command -- bash
# aws --debug --no-verify-ssl s3 ls
This allows the pod to be inspected and the command to be run multiple times from the same pod. For example, if a firewall is closing the connection, you can check which node the pod is running on and add that node’s source IP to the firewall allow list. You can then run the command again to verify that the connection now goes through.
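One way to look up the node and its addresses is sketched below. The function names are hypothetical, and the pod name is passed in as an argument. The addresses reported by Kubernetes are not necessarily the source IP that the firewall sees, for example when traffic leaves the cluster through NAT or an egress gateway, so treat the result as a starting point.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name of the node a pod is scheduled on.
pod_node() {
  kubectl -n cloudcasa-io get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Hypothetical helper: print the addresses Kubernetes reports for a node.
node_addresses() {
  kubectl get node "$1" -o jsonpath='{.status.addresses[*].address}'
}
```

For a pod named “awscli”, "node_addresses "$(pod_node awscli)"" prints the candidate addresses, run from a second terminal while the pod is still alive.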