jarv.org

Building cmdchallenge using Lambda and API Gateway in the AWS free-tier with Docker and Go

Have you ever thought about building a side project for fun without spending a lot on hosting? This post might be for you. With the most tech-buzzwordy title I could conjure up, here is a quick overview of how cmdchallenge.com is built. The site is a simple side-project web application that executes shell commands remotely in a Docker container in AWS. The front end gives the feeling of a normal terminal, but underneath it sends whatever commands you give it to an EC2 instance, where they run inside a Docker container.

The source code for most of it is on GitHub, including a tiny command executor written in Go, the challenge definitions, and a test harness.

The following AWS services are used for the site:

In addition to these, AWS Certificate Manager (ACM) and Route53 were used. For everything above you can keep costs close to zero in AWS. There is no free tier for Route53 (sad panda), but it’s about 50 cents a month for a single zone.

Features

Deployment tools (simple and boring)

With these tools the following automated steps are taken to deploy the site:

Architecture diagram:

There are two public entry points for the site. One is the main website, which is static and served from S3. The other is the API Gateway at api.cmdchallenge.com, which is also fronted by CloudFront so that it can use a certificate from ACM and cache requests.


  api.cmdchallenge.com         cmdchallenge.com
  ********************         ****************
+---------------------+    +---------------------+
|      Cloudfront     |    |      Cloudfront     |
+---------------------+    +---------------------+
           |                          |
+---------------------+         +-----------+
|    API Gateway      |         | s3 bucket |
+---------------------+         +-----------+
           |
  +-----------------+
  |                 |
  | Lambda Function |    +----------+
  |                 |--- |          |
  +-----------------+   \| DynamoDB |
           |             |          |
   +--------------+      +----------+
   | EC2 t2.micro |
   |   (coreos)   |
   +--------------+

One nice thing about using AWS serverless components is that a single t2.micro instance ended up being fine for handling all of the load, even at peak. See the sections on caching and performance below.

Here is what happens when a command is submitted in the cmdchallenge.com terminal:

The challenges are expressed in a single YAML file. Here is an example of one challenge:


  - slug: hello_world
    version: 4
    author: cmdchallenge
    description: |
      Print "hello world".
      Hint: There are many ways to print text on
      the command line, one way is with the 'echo'
      command.

      Try it below and good luck!      
    example: echo 'hello world'
    expected_output:
      lines:
        - 'hello world'

Interested in coming up with your own? You can submit your own challenge with a pull request. Your challenge will be added to the user-contributed section of the site.
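
To make the `expected_output` field in the challenge definition concrete, here is a minimal sketch of how a submitted command's output could be compared against it. This is an illustration only, not the site's actual checker: the function name `output_matches` is hypothetical, the challenge is written as a Python dict that mirrors the YAML above, and the exact-match comparison is an assumption.

```python
# Hypothetical checker; the real test harness may compare output differently.
def output_matches(challenge: dict, output: str) -> bool:
    """Return True when the command output equals the expected lines."""
    expected = challenge["expected_output"]["lines"]
    # Drop trailing newlines, then compare line by line.
    actual = output.rstrip("\n").split("\n")
    return actual == expected


# Mirrors the hello_world challenge from the YAML definition.
hello_world = {
    "slug": "hello_world",
    "version": 4,
    "example": "echo 'hello world'",
    "expected_output": {"lines": ["hello world"]},
}

print(output_matches(hello_world, "hello world\n"))  # True
print(output_matches(hello_world, "goodbye\n"))      # False
```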

Caching:

You may notice that when you run echo hello world on the hello world challenge, it returns almost immediately. As shown above, there are two layers of cache, one at CloudFront and one at DynamoDB, to reduce the number of command executions in the Docker container. API Gateway can provide caching, but it costs money. I worked around this by putting CloudFront in front of it, though this is only possible with HTTP GETs. With CloudFront in front, every response from Lambda sets the cache-control header to a very long cache lifetime. The version of the challenge as well as a global cache-buster param is passed in with each request, so we never have to worry about returning a response from a stale challenge.
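
The two cache layers can be sketched roughly as follows. This is a simplified model, not the real Lambda code: the query parameter names (`slug`, `version`, `cb`, `cmd`), the URL path, the one-year `max-age`, and the in-memory dict standing in for the DynamoDB table are all assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlencode

CACHE_BUSTER = "1"  # hypothetical global cache-buster value
LONG_CACHE = "public, max-age=31536000"  # assumed "very long" lifetime (one year)

_table = {}  # stands in for the DynamoDB table in this sketch


def request_url(slug, version, cmd):
    """Build the GET URL. Version and cache buster are part of the query
    string, so changing either one produces a new CloudFront cache entry."""
    query = urlencode(
        {"slug": slug, "version": version, "cb": CACHE_BUSTER, "cmd": cmd}
    )
    return "https://api.cmdchallenge.com/?" + query


def cache_key(slug, version, cmd):
    """Key for the second cache layer, derived from the same inputs."""
    raw = "\x00".join([slug, str(version), CACHE_BUSTER, cmd])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def handle(slug, version, cmd, run_in_container):
    """Return a cached result if one exists, else run the command once."""
    key = cache_key(slug, version, cmd)
    if key not in _table:
        # Cache miss: this is the only path that reaches the Docker host.
        _table[key] = run_in_container(cmd)
    return {
        "statusCode": 200,
        "headers": {"Cache-Control": LONG_CACHE},
        "body": _table[key],
    }
```

Calling `handle` twice with the same slug, version, and command invokes `run_in_container` only once. Bumping the challenge version changes both the URL and the key, so the old cached entries are simply never looked up again.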

Performance:

If you are wondering how well this would scale with a lot of traffic: the Lambda function currently dispatches commands to a random host in a statically configured list of EC2 instances, which makes it pretty easy to add more capacity. So far it seems to be operating fine with a single t2.micro EC2 instance handling all command requests that are not cached.
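
Random dispatch over a static host list is about as simple as load balancing gets. A minimal sketch, assuming placeholder addresses (`DOCKER_HOSTS` and `pick_host` are hypothetical names, not taken from the real configuration):

```python
import random

# Hypothetical addresses; adding capacity means appending another host here.
DOCKER_HOSTS = ["10.0.0.10", "10.0.0.11"]


def pick_host(hosts=DOCKER_HOSTS, rng=random):
    """Pick one host uniformly at random from the static list."""
    if not hosts:
        raise ValueError("no Docker hosts configured")
    return rng.choice(hosts)


print(pick_host())
```

There is no health checking or load awareness in this approach, which is a reasonable trade-off while most requests are answered from cache.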

This wouldn’t be possible without caching, and the caching at CloudFront also lets most commands return fairly quickly.

Wrapping up

If you like the site, please follow @thecmdchallenge on Twitter, or if you have suggestions, drop me a mail at [email protected].

Thanks!