The script below monitors an EMR cluster for active job drivers, that is, steps whose names start with "ClusterJobDriver_". At any given time, only one "ClusterJobDriver_" step should be in the RUNNING state.

The script counts the active "ClusterJobDriver_" steps and sends a notification by email or to an AWS SNS topic when the count is greater than 1.

  • Configure the AWS region and the EMR cluster ID in the script.
  • The script assumes the AWS CLI is already set up with the required authentication and that jq is installed.
  • All job drivers launched for ingestion or pipeline jobs have names that start with "ClusterJobDriver_".


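# Set the AWS region and EMR cluster ID (replace the placeholder values)
AWS_REGION="your_aws_region"
EMR_CLUSTER_ID="your_emr_cluster_id"
# Recipient address, used only if the mailx alert is enabled
ALERT_EMAIL="your_alert_email"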

# Get the EMR steps in the RUNNING state whose names start with "ClusterJobDriver_".
# The jq filter is wrapped in [...] so that the result is a single JSON array;
# without it, jq emits a stream of objects and 'length' would count the keys of each object.
running_steps=$(aws emr list-steps --cluster-id "$EMR_CLUSTER_ID" --region "$AWS_REGION" --output json | jq '[.Steps[] | select(.Name | startswith("ClusterJobDriver_")) | select(.Status.State == "RUNNING")]')

# Count the running job driver steps
running_count=$(echo "$running_steps" | jq 'length')

# Display the count of running job driver steps
echo "Running job driver count: $running_count"

# Check if the running count is more than 1
if [[ $running_count -gt 1 ]]; then
  # Alert subject
  alert_subject="More than 1 job driver found to be in running state on Cluster $EMR_CLUSTER_ID"

  # Alert body
  alert_body="Please investigate the running job drivers on EMR Cluster $EMR_CLUSTER_ID. There are $running_count job drivers currently running."
  # Uncomment the following line to send email using mailx
  # (ALERT_EMAIL must be set to the recipient address)
  # echo "$alert_body" | mailx -s "$alert_subject" "$ALERT_EMAIL"
  # Uncomment the following lines to send an AWS SNS notification
  # sns_topic_arn="your_sns_topic_arn"
  # aws sns publish --region "$AWS_REGION" --topic-arn "$sns_topic_arn" --subject "$alert_subject" --message "$alert_body"
fi
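
The counting step can be checked without a live cluster. The sketch below is not part of the original script: it wraps the jq filter in a hypothetical helper, count_running_drivers, and feeds it a made-up list-steps payload. The step names and states in the sample are illustrative assumptions only, and jq is assumed to be installed.

```shell
# Hypothetical helper (not in the original script): reads `aws emr list-steps`
# JSON on stdin and prints the number of RUNNING steps whose names start with
# "ClusterJobDriver_". The [...] wrapper collects the matches into one array
# so that `length` returns the number of matching steps.
count_running_drivers() {
  jq '[.Steps[]
       | select(.Name | startswith("ClusterJobDriver_"))
       | select(.Status.State == "RUNNING")]
      | length'
}

# Made-up sample payload, for illustration only.
sample_steps='{"Steps":[
  {"Name":"ClusterJobDriver_ingest","Status":{"State":"RUNNING"}},
  {"Name":"ClusterJobDriver_pipeline","Status":{"State":"RUNNING"}},
  {"Name":"ClusterJobDriver_old","Status":{"State":"COMPLETED"}},
  {"Name":"OtherStep","Status":{"State":"RUNNING"}}
]}'

sample_count=$(printf '%s' "$sample_steps" | count_running_drivers)
echo "Running job driver count: $sample_count"
```

With this sample, two steps match both conditions, so a count above 1 would trigger the alert branch. Against a real cluster, the output of aws emr list-steps would be piped into the helper in place of the sample payload.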