
How Do I Schedule AWS EMR Serverless Jobs Using AWS Scheduler?

This guide shows how to schedule AWS EMR Serverless jobs with AWS Scheduler (Amazon EventBridge Scheduler) using Pulumi. The goal is to automate job execution by creating a schedule that starts a job run on an EMR Serverless application at specified intervals. It covers creating the EMR Serverless application, the IAM permissions the scheduler needs, and the schedule itself.

Key Points

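  • AWS EMR Serverless Application: A serverless runtime for big data workloads such as Spark and Hive jobs. Jobs are submitted to the application as job runs.
  • AWS Scheduler: Amazon EventBridge Scheduler (the aws.scheduler module in Pulumi), a service that invokes a target on a cron, rate, or one-time schedule.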

Steps

  1. Create an EMR Serverless Application:

    • Define the application with necessary configurations such as the type of application, EMR release version, and capacity settings.
  2. Set Up AWS Scheduler:

    • Create a schedule that uses a cron expression to start a job run on the EMR Serverless application.
    • Configure an IAM role that the scheduler assumes, with permission to start job runs. The job run also needs an execution role that EMR Serverless can assume.
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Create an EMR Serverless Application
const emrApp = new aws.emrserverless.Application("emrApp", {
    name: "my-emr-serverless-app",
    type: "SPARK", // Specify the type of application (e.g., SPARK, HIVE)
    releaseLabel: "emr-6.6.0", // EMR release version (EMR Serverless requires emr-6.6.0 or later)
    maximumCapacity: {
        cpu: "4 vCPU",
        memory: "16 GB",
    },
    initialCapacities: [{
        initialCapacityType: "Driver", // Spark worker types are "Driver" and "Executor"
        initialCapacityConfig: {
            workerCount: 1,
            workerConfiguration: {
                cpu: "2 vCPU",
                memory: "8 GB",
            },
        },
    }],
});

// Create an IAM Role for the Scheduler to trigger the EMR application
const schedulerRole = new aws.iam.Role("schedulerRole", {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({ Service: "scheduler.amazonaws.com" }),
});

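// Attach the necessary policies to the role
const schedulerRolePolicy = new aws.iam.RolePolicy("schedulerRolePolicy", {
    role: schedulerRole.id,
    policy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [
            {
                Effect: "Allow",
                // EMR Serverless actions use the "emr-serverless" prefix, not "emr"
                Action: ["emr-serverless:StartJobRun"],
                Resource: "*", // Adjust the resource as needed, e.g. the application ARN
            },
            {
                // StartJobRun passes the job's execution role to EMR Serverless
                Effect: "Allow",
                Action: ["iam:PassRole"],
                Resource: "*", // Narrow this to the execution role's ARN
                Condition: {
                    StringLike: { "iam:PassedToService": "emr-serverless.amazonaws.com" },
                },
            },
        ],
    }),
});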

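// Create the execution role that the job itself runs as. EMR Serverless assumes this role,
// so it must trust emr-serverless.amazonaws.com; the scheduler role cannot be reused here.
// Grant it access to the script and log locations in S3 as needed.
const jobRole = new aws.iam.Role("emrJobRole", {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({ Service: "emr-serverless.amazonaws.com" }),
});

// Create an AWS Scheduler Schedule
const schedule = new aws.scheduler.Schedule("emrSchedule", {
    name: "my-emr-schedule",
    scheduleExpression: "cron(0 12 * * ? *)", // Every day at 12:00 UTC
    flexibleTimeWindow: {
        mode: "OFF",
    },
    target: {
        // Universal target that calls the EMR Serverless StartJobRun API.
        // The service name in this ARN is an assumption; verify it against the Scheduler docs.
        arn: "arn:aws:scheduler:::aws-sdk:emrserverless:startJobRun",
        roleArn: schedulerRole.arn, // Role the scheduler assumes to make the call
        // pulumi.jsonStringify resolves Output values before serializing;
        // plain JSON.stringify would not. Keys are PascalCase here on the assumption
        // that universal targets expect the SDK parameter names.
        input: pulumi.jsonStringify({
            ApplicationId: emrApp.id, // ID (not ARN) of the EMR Serverless application
            // Idempotency token; assumed to be filled per invocation by the scheduler
            ClientToken: "<aws.scheduler.execution-id>",
            ExecutionRoleArn: jobRole.arn, // Role the job runs as
            Name: "my-emr-job",
            JobDriver: {
                SparkSubmit: {
                    EntryPoint: "s3://my-bucket/my-script.py", // Replace with your script location
                },
            },
            ConfigurationOverrides: {
                MonitoringConfiguration: {
                    S3MonitoringConfiguration: {
                        LogUri: "s3://my-bucket/logs/", // Replace with your log location
                    },
                },
            },
        }),
    },
});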
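
The scheduler role's policy above is attached with a wildcard resource and a note to adjust it. As a rough sketch of what narrowing it could look like, the helper below builds a policy document limited to one application and one execution role. The function name `schedulerPolicy` and both parameters are hypothetical, and it assumes that `emr-serverless:StartJobRun` is authorized against the application ARN and that the scheduler role needs `iam:PassRole` on the job's execution role; check both against the IAM documentation for your setup.

```typescript
// Minimal shape of an IAM policy document (only the fields used in this sketch).
interface PolicyStatement {
    Effect: "Allow" | "Deny";
    Action: string[];
    Resource: string | string[];
    Condition?: Record<string, Record<string, string>>;
}

interface PolicyDocument {
    Version: "2012-10-17";
    Statement: PolicyStatement[];
}

// Hypothetical helper: lets the scheduler start job runs on a single application
// and pass a single execution role to EMR Serverless, instead of Resource "*".
function schedulerPolicy(applicationArn: string, jobRoleArn: string): PolicyDocument {
    return {
        Version: "2012-10-17",
        Statement: [
            {
                Effect: "Allow",
                Action: ["emr-serverless:StartJobRun"],
                Resource: applicationArn,
            },
            {
                Effect: "Allow",
                Action: ["iam:PassRole"],
                Resource: jobRoleArn,
                Condition: {
                    // Only allow the role to be passed to EMR Serverless
                    StringLike: { "iam:PassedToService": "emr-serverless.amazonaws.com" },
                },
            },
        ],
    };
}

// Example with placeholder ARNs (not real resources)
const examplePolicy = schedulerPolicy(
    "arn:aws:emr-serverless:us-east-1:123456789012:/applications/example-app-id",
    "arn:aws:iam::123456789012:role/example-job-role",
);
console.log(JSON.stringify(examplePolicy, null, 2));
```

In the Pulumi program, both ARNs are outputs (for example `emrApp.arn`), so one option is to call the helper inside `pulumi.all([...]).apply(...)` and pass the serialized result to the `policy` property of the `RolePolicy`.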

Summary

This guide created an AWS EMR Serverless application and an AWS Scheduler schedule that starts a job run on it at a fixed interval. A cron expression defines when the job runs, and an IAM role gives the scheduler permission to start it. With both defined in Pulumi, the schedule and its permissions are managed in the same program as the application.
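
The cron expression used for the schedule has six fields: minutes, hours, day-of-month, month, day-of-week, and year, with `?` in whichever of the two day fields is left unspecified. As a small illustration, the hypothetical helpers below (`dailyCron` and `weeklyCron` are not part of Pulumi or the AWS SDK) build such expressions from plain values and reject out-of-range input. This is a sketch that covers only two common patterns, not the full cron syntax.

```typescript
// Three-letter weekday names accepted in the day-of-week field.
const WEEKDAYS = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"];

// Validates an hour/minute pair and throws on out-of-range values.
function checkTime(hour: number, minute: number): void {
    if (!Number.isInteger(hour) || hour < 0 || hour > 23) {
        throw new RangeError(`hour must be an integer from 0 to 23, got ${hour}`);
    }
    if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
        throw new RangeError(`minute must be an integer from 0 to 59, got ${minute}`);
    }
}

// Hypothetical helper: every day at the given time.
// Field order: minutes hours day-of-month month day-of-week year.
function dailyCron(hour: number, minute: number = 0): string {
    checkTime(hour, minute);
    return `cron(${minute} ${hour} * * ? *)`;
}

// Hypothetical helper: selected weekdays at the given time.
// Day-of-month becomes "?" because day-of-week is specified.
function weeklyCron(days: string[], hour: number, minute: number = 0): string {
    checkTime(hour, minute);
    if (days.length === 0) {
        throw new RangeError("at least one weekday is required");
    }
    for (const day of days) {
        if (!WEEKDAYS.includes(day)) {
            throw new RangeError(`unknown weekday: ${day}`);
        }
    }
    return `cron(${minute} ${hour} ? * ${days.join(",")} *)`;
}

console.log(dailyCron(12));                      // the schedule used in this guide
console.log(weeklyCron(["MON", "FRI"], 6, 30));  // Mondays and Fridays at 06:30
```

The result can be passed as `scheduleExpression`. Times are evaluated in UTC unless the schedule also sets `scheduleExpressionTimezone`.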
