CNE AWS Ready: Troubleshooting

Liferay Cloud Native Experience (CNE) deployments involve multiple components, including AWS infrastructure, Kubernetes, GitOps workflows, and Terraform configurations. Issues can occur if resources become inconsistent or are modified outside the expected workflow.

The sections below explain how to resolve common problems when deploying or managing CNE AWS Ready environments.

Terraform Errors After Deleting AWS Resources Manually

Infrastructure managed through infrastructure-as-code (IaC) tools such as Terraform should remain the source of truth and control for the resources it creates. When resources are modified or deleted manually in the AWS console, the IaC state becomes inconsistent with the actual infrastructure.

When this happens, Terraform may fail because it still expects those resources to exist.

For example, if an EKS cluster is deleted manually, Terraform may return an error like this:

Error: Get "http://localhost/apis/storage.k8s.io/v1/storageclasses/gp3": dial tcp [::1]:80: connect: connection refused

with kubernetes_storage_class_v1.gp3_storage_class,
on eks.tf line 203, in resource "kubernetes_storage_class_v1" "gp3_storage_class":
203: resource "kubernetes_storage_class_v1" "gp3_storage_class"
...

This occurs because the Terraform state still contains references to Kubernetes resources that no longer exist.

To resolve this, navigate to the EKS Terraform directory:

cd liferay-cne-bootstrap/cloud/terraform/aws/eks

Authenticate with AWS:

aws sso login

Attempt to destroy the infrastructure:

terraform destroy -var-file="../../../cloud/scripts/global_terraform.tfvars"

If the command succeeds, the issue is resolved.

If the command fails, check whether Terraform still tracks Kubernetes resources:

terraform state list | grep kubernetes

If resources appear, remove them from the Terraform state:

terraform state rm [resource_name]
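When several Kubernetes entries remain, removing them one at a time is tedious. The following is a minimal sketch, not part of the CNE tooling, that loops over the output of terraform state list and removes each matching address. The function names and the DRY_RUN variable are assumptions introduced for illustration. Run it from the Terraform directory you are cleaning up.

```shell
# Sketch only: function names and DRY_RUN are illustrative, not part of the CNE tooling.

# Print the state addresses that contain "kubernetes" (empty output if none).
list_k8s_state_entries() {
  terraform state list | grep kubernetes || true
}

# Remove each matching address from the state.
# With DRY_RUN=1, print the commands instead of running them.
remove_k8s_state_entries() {
  list_k8s_state_entries | while IFS= read -r address; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "terraform state rm $address"
    else
      # Redirect stdin so the command cannot consume the remaining addresses.
      terraform state rm "$address" < /dev/null
    fi
  done
}
```

Setting DRY_RUN=1 before calling remove_k8s_state_entries lets you review the list of commands first. Unset it and call the function again to apply the removals.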

Repeat the same process in the following directories:

bootstrap/cloud/terraform/aws/gitops/platform

bootstrap/cloud/terraform/aws/gitops/resources

After removing the invalid state entries, re-run terraform destroy. It should now complete successfully.
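To see at a glance which directories still track Kubernetes resources, a small helper along these lines may be useful. It is a sketch under the assumption that each directory you pass is an initialized Terraform working directory. The function name is hypothetical.

```shell
# Sketch only: the function name is illustrative. Pass the Terraform
# directories to inspect as arguments.

# For each directory, print how many state entries mention "kubernetes".
count_k8s_state_entries() {
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "$dir: directory not found" >&2
      continue
    fi
    # The command substitution runs in a subshell, so the caller's
    # working directory is left unchanged.
    count="$(cd "$dir" && terraform state list | grep -c kubernetes)"
    echo "$dir: ${count:-0} Kubernetes entries in state"
  done
}
```

Call it with the EKS directory and the two GitOps directories listed above as arguments. Any directory that reports a non-zero count still needs the terraform state rm step.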

Grafana Workspace Provisioning Fails

During bootstrap, AWS Managed Grafana (AMG) provisioning can fail with errors similar to:

Error: creating Grafana Workspace: operation error grafana: CreateWorkspace, ...
lookup grafana.<region>.amazonaws.com ... no such host

This is an intermittent issue and may depend on the selected AWS region.

Re-run the bootstrap script:

./cloud/scripts/setup_aws.sh ./config.json ./cloud/scripts/versions_aws.tfvars
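Because the failure is intermittent, you may want to retry a few times with a pause between attempts. The following generic retry helper is a sketch, not part of the bootstrap tooling. The function name, attempt count, and delay are assumptions you should adjust.

```shell
# Sketch only: retry_command, its arguments, and the delay are illustrative.

# Usage: retry_command <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry_command() {
  max_attempts="$1"
  delay_seconds="$2"
  shift 2
  attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay_seconds}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}
```

For example, `retry_command 3 60 ./cloud/scripts/setup_aws.sh ./config.json ./cloud/scripts/versions_aws.tfvars` would run the bootstrap script up to three times, waiting 60 seconds between attempts.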

If the issue persists, disable observability in your configuration:
```
"observability_config": {
   "enabled": false
}
```

Disabling observability skips Grafana provisioning and allows the bootstrap process to complete.

Bootstrap Script Requires Two Arguments

Recent versions of the bootstrap script require two arguments.

If you see an error like:

Usage: ./setup_aws.sh <configuration-json-file> <versions-tfvars-file>

Run the script with both required files:

./cloud/scripts/setup_aws.sh ./config.json ./cloud/scripts/versions_aws.tfvars

config.json defines the deployment configuration.
versions_aws.tfvars defines infrastructure component versions.
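A quick pre-flight check can catch a missing file before the script runs. The helper below is a sketch. Its name is hypothetical, and it only verifies that both paths exist and are readable, not that their contents are valid.

```shell
# Sketch only: check_bootstrap_files is an illustrative helper, not part of setup_aws.sh.

# Usage: check_bootstrap_files <configuration-json-file> <versions-tfvars-file>
# Returns 0 if exactly two arguments are given and both are readable files.
check_bootstrap_files() {
  if [ "$#" -ne 2 ]; then
    echo "Expected 2 arguments, got $#" >&2
    return 1
  fi
  for file in "$1" "$2"; do
    if [ ! -f "$file" ] || [ ! -r "$file" ]; then
      echo "Missing or unreadable file: $file" >&2
      return 1
    fi
  done
  return 0
}
```

You could chain it with the bootstrap command, for example `check_bootstrap_files ./config.json ./cloud/scripts/versions_aws.tfvars && ./cloud/scripts/setup_aws.sh ./config.json ./cloud/scripts/versions_aws.tfvars`.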

Partial Infrastructure Provisioning After Failure

If the bootstrap process fails, some AWS resources may still be created.

This can result in incomplete infrastructure, Terraform state inconsistencies, and subsequent bootstrap failures.

Re-run the bootstrap script to resume provisioning:

./cloud/scripts/setup_aws.sh ./config.json ./cloud/scripts/versions_aws.tfvars

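If issues persist, clean up the partially created environment before retrying, for example by running terraform destroy as described in Terraform Errors After Deleting AWS Resources Manually.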

Cannot Access Argo CD

If you cannot access Argo CD at http://localhost:8080, the bootstrap process may not have completed successfully.

Argo CD becomes available only after infrastructure provisioning completes, Kubernetes resources are deployed, and GitOps synchronization finishes.

Verify the bootstrap script completed without errors.
If the script failed, resolve the error and re-run the bootstrap process.
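To confirm whether anything is listening at the Argo CD address, you can probe it with curl. The following is a minimal sketch. The function name is hypothetical, and the exact status code Argo CD returns (for example, a redirect) may vary with your setup.

```shell
# Sketch only: argocd_http_status is an illustrative helper.

# Print the HTTP status code returned by the given URL
# (default: http://localhost:8080). curl reports 000 if it cannot connect.
argocd_http_status() {
  url="${1:-http://localhost:8080}"
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$url" || true
}
```

A result of 000 means the connection failed, which is consistent with an incomplete bootstrap. Any other status code indicates that a service is responding at that address.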

Liferay Pods Stuck in NotReady After OpenSearch Init Container Fails

After bootstrap, Liferay pods in your CNE AWS Ready cluster may remain in a NotReady state because the liferay-install-opensearch-modules init container fails.

The init container downloads the OpenSearch connector modules from a public mirror at pod startup. It fails when any of these conditions apply:

The mirror is unreachable (network policy or air-gapped environment)
The DXP image version has no matching OpenSearch connector published
The download URL is blocked or misconfigured

In each case, the container logs an error like:

Unable to download OpenSearch modules for product version [version] from [url].
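To confirm that this is the failure you are seeing, you can search the init container's logs for that message. The helper below is a sketch. Its name and arguments are hypothetical, and it assumes the container name in the pod matches the name used in this section, so pass a different name as the third argument if yours differs.

```shell
# Sketch only: the function name and its arguments are illustrative.

# Usage: opensearch_init_error <namespace> <pod-name> [container-name]
# Prints the download error lines from the init container's logs, if any.
opensearch_init_error() {
  # Assumption: the container name defaults to the one named in this section.
  container="${3:-liferay-install-opensearch-modules}"
  kubectl logs "$2" --namespace "$1" --container "$container" \
    | grep "Unable to download OpenSearch modules" || true
}
```

If the function prints a line, the product version and URL in that line show which download failed. Empty output suggests the pod is failing for a different reason.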

To start Liferay without the OpenSearch modules, disable the init container in your override liferay.yaml. The container is keyed as x-liferay-install-opensearch-modules under customInitContainers:

liferay-default:
   customInitContainers:
      x-liferay-install-opensearch-modules: null

When you skip the init container, Liferay falls back to the sidecar Elasticsearch bundled with the image. Liferay supports the sidecar for development only — not production. See Using the Sidecar or Embedded Elasticsearch.

This opt-out is a stopgap. For a supported production search engine, deploy Elasticsearch — see Using Elasticsearch.
