Issue 1: Pods fail to come up with CreateContainerConfigError.

Error Message:
failed to prepare subPath for volumeMount "dit-diyprod-rel-a-logs" of container "cluster-manager".
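
To confirm which pods are affected, a small helper along these lines may be useful. It is only a sketch: the function name pods_with_config_error is hypothetical, and it assumes the default column order of kubectl get pods (NAME, READY, STATUS, ...).

```shell
# List pods in a namespace whose STATUS column is CreateContainerConfigError.
# Assumes the default `kubectl get pods` column order (NAME READY STATUS ...).
pods_with_config_error() {
  kubectl get pods -n "$1" --no-headers \
    | awk '$3 == "CreateContainerConfigError" { print $1 }'
}

# Usage (values in angle brackets are placeholders):
#   pods_with_config_error <namespace>
#   kubectl describe pod <pod-name> -n <namespace>
```

The Events section printed by kubectl describe pod is where a subPath message like the one above normally appears.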


Solution:

Step 1: Scale down all Deployments and StatefulSets.

kubectl scale deploy --replicas=0 -n <namespace> --all

kubectl scale sts --replicas=0 -n <namespace> --all
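
Because the scale-down commands above set every workload to zero, it can help to record the current replica counts before running them. The following is a minimal sketch, not a product feature: the function name save_replicas, the output file, and its "<kind> <name> <replicas>" line format are all assumptions made for illustration.

```shell
# Write one "<kind> <name> <replicas>" line per Deployment and StatefulSet.
# Run this BEFORE scaling down; the file name is a placeholder.
save_replicas() {
  ns="$1"
  out="$2"
  : > "$out"   # start with an empty file
  for kind in deploy sts; do
    kubectl get "$kind" -n "$ns" \
      -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.replicas}{"\n"}{end}' \
      | while read -r name replicas; do
          [ -n "$name" ] || continue
          echo "$kind $name $replicas" >> "$out"
        done
  done
}

# Usage: save_replicas <namespace> replicas.txt
```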


Step 2: Restart the Node Pool on AKS.

  1. Open the Kubernetes service cluster page in the Azure portal.

  2. Go to Settings > Node pools.

  3. Identify the node pool on which the Infoworks pods are running.

  4. Click the Stop button in the node pool Overview.

  5. Once the node pool has stopped, click the Start button to start it again.
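
The same restart can likely be done from the command line instead of the portal. The sketch below assumes a recent Azure CLI that provides az aks nodepool stop and az aks nodepool start, and an active az login session. The function names and all argument values are placeholders.

```shell
# Show which AKS node pool each node belongs to (AKS sets the `agentpool` node label).
# Combine with `kubectl get pods -n <namespace> -o wide` to see which nodes host the pods.
show_node_pools() {
  kubectl get nodes -L agentpool
}

# Stop one node pool, then start it again once the stop has succeeded.
restart_node_pool() {
  rg="$1"
  cluster="$2"
  pool="$3"
  az aks nodepool stop --resource-group "$rg" --cluster-name "$cluster" --nodepool-name "$pool" \
    && az aks nodepool start --resource-group "$rg" --cluster-name "$cluster" --nodepool-name "$pool"
}

# Usage: restart_node_pool <resource-group> <cluster-name> <nodepool-name>
```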


Step 3: Scale up all Deployments and StatefulSets.

kubectl scale deploy --replicas=2 -n <namespace> --all

kubectl scale sts --replicas=3 -n <namespace> --all


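Note: The replica counts shown above (2 and 3) are examples. With --all, every Deployment or StatefulSet in the namespace gets the same count, so set each one back to the replica count it was originally configured with.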
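
If the original replica counts were written to a file before Step 1, one "<kind> <name> <replicas>" entry per line, each workload can be restored to its own count rather than a single value. This is a sketch under that assumption: the file format and the function name restore_replicas are hypothetical.

```shell
# Scale each workload back to the replica count recorded in the file.
# Expected line format (an assumption): "<kind> <name> <replicas>", e.g. "deploy my-app 2".
restore_replicas() {
  ns="$1"
  file="$2"
  while read -r kind name replicas; do
    [ -n "$name" ] || continue
    # </dev/null keeps kubectl from consuming the file being read by the loop
    kubectl scale "$kind" "$name" --replicas="$replicas" -n "$ns" </dev/null
  done < "$file"
}

# Usage: restore_replicas <namespace> replicas.txt
```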


----------------------------------------------------------------------------------------------------------------------


Issue 2: Infoworks pods such as cluster-manager and dt fail to come up with connection timeout errors.


Root cause: 

Further investigation shows that the DNS name does not resolve from within the pods, even though it resolves from a browser.


Solution:

  1. Check Private DNS Zone Configuration

    • Ensure that the necessary IP records are added correctly in the Private DNS zone associated with your AKS cluster.

  2. Verify DNS resolution from the pods

    • Execute a DNS lookup from within a pod to confirm DNS resolution:
      kubectl exec -it <pod-name> -n <namespace> -- nslookup <dns-name>
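
The two checks above can be sketched as shell helpers. Several things here are assumptions: the function names are hypothetical, the temporary pod uses the public busybox:1.36 image (which the cluster must be able to pull), and the Private DNS zone is assumed to be reachable through the Azure CLI with the resource group and zone name supplied by the caller.

```shell
# Resolve a name from inside the cluster using a throwaway pod.
# Useful when the application image does not ship nslookup.
dns_lookup_from_cluster() {
  ns="$1"
  name="$2"
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.36 -n "$ns" \
    -- nslookup "$name"
}

# List the A records in a Private DNS zone and the virtual networks linked to it.
# A zone that is not linked to the cluster's virtual network is not visible to the pods.
list_private_dns_zone() {
  rg="$1"
  zone="$2"
  az network private-dns record-set a list --resource-group "$rg" --zone-name "$zone" --output table
  az network private-dns link vnet list --resource-group "$rg" --zone-name "$zone" --output table
}

# Usage:
#   dns_lookup_from_cluster <namespace> <dns-name>
#   list_private_dns_zone <resource-group> <zone-name>
```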