Introduction
AWS Lambda functions are great for doing small units of work. And now that they support dotnet core 2.0, they’re my go-to choice for queue-based workloads. However, some tasks will definitely take longer than the 5 minute limit on Lambda functions. For example, if you have a daily batch job that is going to take 30 minutes, or an hour, how are you going to set up that process? You could set up a stand-alone EC2 instance, create a cron job to trigger it, etc. But then you’re having to manage infrastructure, and I’d rather let Amazon worry about that.
FARGATE
Amazon has had docker support for a while, which gets close to solving our server problem. Initially you needed to create EC2 instances and attach them to the Elastic Container Service. More recently they’ve launched FARGATE, which is a serverless way to run docker: you just specify the requirements of your task, and FARGATE manages the infrastructure for you. A typical case is that you have a docker-enabled website that you’ve built, and you let FARGATE host it without ever having to provision an EC2 instance.
But you can also define a task to run in FARGATE. For example, you can create a task definition that points at a dockerized dotnet console app that does your batch process. (In another post I will go over the various components of setting up a docker-based dotnet core console app and deploying it to ECS.) What you can’t do (as of the time of this writing) is automatically set up a scheduled task to invoke your task definition. You can do this if you use EC2 instances in your cluster, but when possible, I try to keep my platform serverless.
Use Lambda to execute a task
I suspect that some time in the future AWS will enable scheduled tasks for FARGATE, but in the meantime we’re on our own for setting this up. I did this with a very simple Lambda whose purpose is to start a task that I had already defined in my task definition.
/// <summary>
/// Using json that is passed in as options, invokes the target task as requested.
/// </summary>
public async Task FunctionHandler(Options options, ILambdaContext context)
{
    // Input safety checks.
    if (options == null)
        throw new ArgumentNullException(nameof(options));
    if (String.IsNullOrWhiteSpace(options.Cluster))
        throw new ArgumentOutOfRangeException("options.Cluster");
    if (String.IsNullOrWhiteSpace(options.TaskDefinitionArn))
        throw new ArgumentOutOfRangeException("options.TaskDefinitionArn");
    if (options.Subnets == null || options.Subnets.Count() == 0)
        throw new ArgumentOutOfRangeException("options.Subnets");

    Console.WriteLine($"Launching task {options.TaskDefinitionArn}");

    // Start the task on the FARGATE cluster; the Lambda is done once the task has been started.
    var results = await AmazonECS.RunTaskAsync(new AwsModel.RunTaskRequest()
    {
        Cluster = options.Cluster,
        LaunchType = LaunchType.FARGATE,
        TaskDefinition = options.TaskDefinitionArn,
        Count = options.Count,
        NetworkConfiguration = new AwsModel.NetworkConfiguration()
        {
            AwsvpcConfiguration = new AwsModel.AwsVpcConfiguration()
            {
                Subnets = options.Subnets,
                AssignPublicIp = AssignPublicIp.ENABLED,
            },
        }
    });

    Console.WriteLine($"Task {options.TaskDefinitionArn} started with {results.HttpStatusCode}");
}
Most of the code is just input safety checks. What’s going on here is that we’re getting a json string for options that is automatically deserialized into our Options class. Then, using those options, we can directly invoke the task. The task will start and continue to run, but the Lambda function will have completed its job of starting it.
Here’s what the Options class looks like:
public class Options
{
    /// <summary>
    /// Target fargate cluster
    /// </summary>
    /// <value>The cluster.</value>
    public string Cluster { get; set; }

    /// <summary>
    /// Fully qualified name of a task that is already defined.
    /// </summary>
    /// <value>The task definition arn.</value>
    public string TaskDefinitionArn { get; set; }

    /// <summary>
    /// How many to invoke
    /// </summary>
    /// <value>The count.</value>
    public int Count { get; set; }

    /// <summary>
    /// Subnets to use.
    /// </summary>
    /// <value>The subnets.</value>
    public List<string> Subnets { get; set; }
}
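The handler references a few things that aren’t shown above, namely the AmazonECS client and the AwsModel alias. Here’s a rough sketch of the surrounding boilerplate I’m assuming; the namespace and class name are placeholders, and your own project layout and client setup may differ:

// Assumed surrounding boilerplate for the handler above; the namespace and
// class name are placeholders, but the AmazonECS field and AwsModel alias
// line up with what the handler references.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Amazon.ECS;
using Amazon.Lambda.Core;
using AwsModel = Amazon.ECS.Model;

// Tells Lambda to deserialize the incoming json payload into the handler's Options parameter.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]

namespace LaunchFargateTask
{
    public class Function
    {
        // ECS client used to call RunTaskAsync; it picks up credentials and region
        // from the Lambda execution environment.
        private readonly IAmazonECS AmazonECS = new AmazonECSClient();

        // The FunctionHandler method and Options class from above go here.
    }
}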
There are a couple of ways I could have configured the Lambda function (environment variables are another good option), but I wanted the flexibility to have one Lambda (“launch-fargate-task”) and have it trigger any task that I specify.
Using CloudWatch to Schedule
There are actually a couple of ways to set this up, but I like using the CloudWatch UI. When you go to CloudWatch, click on the Rules link and then click the button to create a new rule.
Here, for example, I can create a rule that runs once per day. On the left side of the screen we can set up one or more triggers.
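If you’d rather type the schedule directly, the rule is ultimately just a CloudWatch schedule expression; a once-per-day rule can be written as either a rate or a cron expression (the cron example below fires at 06:00 UTC and is just an illustration):

rate(1 day)
cron(0 6 * * ? *)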
For the target, you select Lambda function and then select launch-fargate-task (assuming that’s the name of the function you deployed). What makes this so flexible is that we can select “Constant” for the input. If you provide valid json, that json gets deserialized into our Options object from above, so you can specify exactly which task definition to run. Note that you have to specify the subnets to use as well. That information is the same information you’d see in the UI if you manually ran a task definition. For example:
I am sure that there is also a way to programmatically get that information within the Lambda itself. But I opted to keep my Lambda very focused and decided to pass it in as configuration values.
So with that information, the full json object passed in will look something like:
{ "Cluster":"default", "TaskName":"Your Name", "TaskDefinitionArn":"arn:aws:ecs:us-east-1:########:task-definition/task-name:##", "Count":1, "Subnets":[ "subnet-db123", "subnet-db456" ] }
(I just used a compact version of that when specifying the input parameters).
If it’s set up correctly, your Lambda details screen will look something like:
This was just a test Lambda, so the security is pretty wide open.
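If you want to tighten it up, the function’s role mostly needs permission to run the task (plus iam:PassRole if your task definition uses task or execution roles, along with the usual CloudWatch Logs permissions for logging). A rough sketch of such a policy, which you’d want to scope down to your own ARNs rather than “*”, might look like:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ecs:RunTask",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*"
        }
    ]
}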
But there you go! You can now run scheduled tasks to invoke long-running FARGATE processes.