The CDK Book

A while back, a notification popped up in the community-led cdk.dev Slack. This is where thousands of teams from across the globe are helping each other with anything related to the CDK, the Cloud Development Kit. The private message said, “A group of us are teaming up to write a book about the AWS CDK, and we would love for you to write the foreword.”

It’s been over four years since we started the CDK journey. We set out to improve the experience of developing cloud applications, but the thing that I’m truly humbled by is the community that formed organically around this mission. A community that formed not only around the AWS CDK as a specific product, but also around the CDK as a programming model for creating abstractions by defining desired state through software. This programming model, which we sometimes refer to as the Construct Programming Model (CPM), is used today to define applications on Kubernetes with CDK8s, Terraform with CDKTF, Azure with Armkit, and even things like complete development experiences with Projen.

Our community is ready for a book, and I’m honored to write this foreword and share my perspective. To start with, I’d like to take you back in time and tell you the origin story of the AWS CDK.

In early 2016, I joined a group of Amazon engineers who were exploring hardware utilization in one of the most important and large-scale services behind Amazon.com—Amazon Product Search. One piece of this solution required processing, in real time, all activity from Amazon.com to determine which products were being purchased. Amazon was in the midst of a company-wide program to enable internal teams to use AWS to its full extent. Back then, many of our services were deployed in a centrally managed AWS account within a single network. This new project gave us an opportunity to use some of the latest AWS building blocks. The service had to serve Amazon-scale production traffic across multiple marketplaces from day one, so we devised an architecture that heavily relied on AWS Lambda, Amazon Kinesis, and Amazon DynamoDB to be able to process millions of events per day with minimal operational costs.

We kicked off a two-pizza team (an Amazon term for a small, focused, highly cooperative team) with the responsibility to deliver this service. Given its reliance on many AWS resources, and our need to deploy the service across development, integration, and production environments, we decided to use AWS CloudFormation to provision our resources. It wasn’t reasonable to manually configure all these resources through the AWS Management Console, and we wanted to capture our resource configuration in source control so we could incrementally iterate on it through our normal development process.

We started simple with a bunch of AWS Lambda functions, but as the project progressed, our architecture evolved, and we used more and more AWS resources and capabilities. It became increasingly hard to maintain our CloudFormation templates. Although the CloudFormation YAML files represented the configuration of our resources, they weren’t designed to capture the ideas behind our architecture, such as logical units, relationships, best practices, and repeating patterns. We found ourselves copying and pasting too much. We constantly forgot to update certain values across multiple resources and templates, and we struggled to find good ways to validate our code with tests before it was deployed.

As software engineers, we’ve been solving this exact set of problems using object-oriented programming languages since the late 1960s. In computer programs, logic can be expressed with imperative code, conditions, and loops. Abstract ideas can be modeled through object-oriented primitives, and best practices and patterns can be shared and reused through libraries and package managers. We wanted the best of both worlds—we wanted to be able to define our infrastructure using modern programming languages, but still provision our cloud resources through a declarative desired-state engine that took care of updating our infrastructure in a safe and deterministic way.

We experimented with some existing tools for generating CloudFormation from code. There were a few open-source projects like Troposphere and GoFormation that generated CloudFormation, but they lacked one basic ingredient—composability. Composability is the key to software abstraction because it allows solving problems by breaking them down into smaller problems. If we wanted to simplify the cloud, we needed to be able to create composable, reusable abstractions.

Furthermore, we wanted a solution in a programming language that our engineers would be comfortable using, and that would allow us to take advantage of the investment we already had in our development environment and processes. We didn’t want to introduce a new toolchain into our build process, figure out the best IDE setup, and learn new programming patterns and idioms. We also wanted to use the same programming language to natively connect between our infrastructure code and our runtime code. We realized this was particularly important for serverless applications like ours because our AWS Lambda handlers directly interacted with many infrastructure resources. More and more we realized that infrastructure and runtime code are two sides of the same coin, and we wanted to develop, test, and release them together.

We created a library called “Cloudstruct.” At the heart of this library were composable primitives we named constructs. Constructs were simple object-oriented classes that represented cloud building blocks and could be combined to form higher-level constructs. At the base level, each AWS resource was a construct, so we had a construct for “Amazon S3 Bucket,” “AWS Lambda Function,” “Amazon DynamoDB Table,” and so forth. Then, we could combine these together to represent logical units in our system. So, we had an “ingestion” construct that represented our incoming data pipeline, a “publisher” construct that took care of publishing results to the downstream search index, and a nice little “dynamo scanner” construct that performed a daily full, concurrent scan of our Amazon DynamoDB table. When a Cloudstruct program was executed, it would synthesize a set of CloudFormation templates that we could examine and deploy to our accounts.
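
As a rough illustration of that idea, the sketch below models constructs as plain classes that attach themselves to a parent and can be walked to produce a CloudFormation-shaped template. It is not Cloudstruct's or the AWS CDK's code: the `Construct`, `Resource`, and `Ingestion` classes, the `logicalId` naming scheme, and the resource properties are all assumptions made for the example, and the generated resources omit properties a deployable template would need.

```typescript
// Hypothetical, minimal model of the construct idea.
// None of these classes are the Cloudstruct or AWS CDK API.
type ResourceDef = { Type: string; Properties: Record<string, unknown> };
type Template = { Resources: Record<string, ResourceDef> };

class Construct {
  readonly children: Construct[] = [];
  readonly scope: Construct | undefined;
  readonly id: string;

  constructor(scope: Construct | undefined, id: string) {
    this.scope = scope;
    this.id = id;
    if (scope) scope.children.push(this);
  }

  // Position in the tree, e.g. "App/Ingestion/Table".
  get path(): string {
    return this.scope ? `${this.scope.path}/${this.id}` : this.id;
  }

  // Template key derived from the path, e.g. "AppIngestionTable".
  get logicalId(): string {
    return this.path.split("/").join("");
  }

  // Leaf constructs override this to emit one resource.
  protected toResource(): ResourceDef | undefined {
    return undefined;
  }

  // Walk the tree and collect every resource into a single template.
  synth(template: Template = { Resources: {} }): Template {
    const resource = this.toResource();
    if (resource) template.Resources[this.logicalId] = resource;
    for (const child of this.children) child.synth(template);
    return template;
  }
}

// Base level: one construct per cloud resource.
class Resource extends Construct {
  private readonly type: string;
  private readonly props: Record<string, unknown>;

  constructor(scope: Construct, id: string, type: string, props: Record<string, unknown> = {}) {
    super(scope, id);
    this.type = type;
    this.props = props;
  }

  protected toResource(): ResourceDef {
    return { Type: this.type, Properties: this.props };
  }
}

// Higher level: a logical unit composed from base-level constructs.
class Ingestion extends Construct {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    const table = new Resource(this, "Table", "AWS::DynamoDB::Table");
    new Resource(this, "Stream", "AWS::Kinesis::Stream", { ShardCount: 1 });
    new Resource(this, "Handler", "AWS::Lambda::Function", {
      Environment: { Variables: { TABLE_NAME: { Ref: table.logicalId } } },
    });
  }
}

// The same unit can be instantiated more than once without copy and paste.
const app = new Construct(undefined, "App");
new Ingestion(app, "Ingestion");
new Ingestion(app, "Backfill");
const template = app.synth();
console.log(JSON.stringify(template, null, 2));
```

Each `Ingestion` instance contributes its own table, stream, and function, and the reference from the function to its table is computed rather than typed by hand, which is the kind of repeated value that is easy to get wrong in hand-written templates.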

As soon as we started to migrate our application to Cloudstruct, everything clicked into place. This paradigm felt right, and solutions emerged naturally. A few months later, after we shipped our service to production, we demonstrated Cloudstruct to AWS leadership along with an internal “press release” document that described a vision for releasing it as a public AWS product in multiple programming languages. It was an easy pitch. We just needed to show examples from constructs we created for our service alongside the CloudFormation templates that they generated. Comparing a few easy-to-understand lines of code against hundreds of lines of YAML was all it took to get initial funding to build the AWS Cloud Development Kit.

And here we are, four years later. You are holding a book that is a practitioner’s guide to using the AWS Cloud Development Kit. It offers guidance, best practices, and advice written by prominent members of the CDK community who have been involved with the project from its pre-1.0 days. This book builds on experience from real production use cases across dozens of different types of customers. It’s based on learnings, failures, and respectful discussions across the growing CDK community.

I see the CDK community and ecosystem as our real success. However, even with millions of CDK stacks deployed, adoption from across the industry, three major CDK products (AWS CDK, CDK for Terraform, and CDK for Kubernetes), and hundreds of construct libraries listed in Construct Hub, I truly think we are just getting started. Building and operating cloud applications is still too complex: Developers and operators are still required to have a deep understanding of the low-level pieces. Managing the end-to-end development experience still involves stitching together dozens of tools and services every time, and it is still almost impossible to reuse construct-based abstractions across different provisioning domains.

As I look forward in the waning days of 2021, I’d like to share my perspective on how the CDK could continue to improve the experience of developing and operating cloud applications. I believe that we will see evolution in four different directions: more L3 constructs (up), cross-domain interoperability (down), using constructs as “meta-IDEs” (left), and runtime representation (right).

Up: I believe we are going to see more and more truly high-level abstractions, or as we sometimes call them in CDK parlance, L3 constructs. The CDK offers a powerful programming model for simplifying the cloud, but so far most of the simplification has been up-leveling the API experience for individual AWS resources (what we call L2s). The “S3 Bucket” construct offers a rich, intent-based API for buckets, but it still requires users to understand what a bucket is. On the other hand, a “Static Website” construct offers a higher-level mental model: As long as users can wrap their heads around the concept of a static website, they don’t need to care about the underlying implementation. It shouldn’t matter if behind the scenes there is an Amazon S3 bucket and Amazon CloudFront distributions or Amazon API Gateway and AWS Lambda functions. Opportunities for new high-level mental models like this are abundant: microservice frameworks, big-data analytics, compliance patterns, regulatory constraints, line-of-business applications, and machine learning pipelines. We can already find many useful examples in the Construct Hub today, but I believe this is just the tip of the iceberg. There are still many high-level ideas across our industry waiting for you to codify them through constructs.
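
To make the “Static Website” point concrete, here is a small sketch of what such a high-level construct could look like from the outside. Everything in it is hypothetical: `StaticWebsite`, `StaticWebsiteProps`, and the `Backend` switch are invented for illustration and are not AWS CDK or Construct Hub APIs. The `backend` parameter exists only so both implementations can be shown side by side; in a real L3 construct that choice would stay internal.

```typescript
// Hypothetical sketch of a high-level ("L3") construct.
// None of these names are AWS CDK APIs.
type ResourceDef = { Type: string };

// The whole mental model a user needs: a domain and some content.
interface StaticWebsiteProps {
  domainName: string;
  contentDir: string;
}

// Illustration-only switch between two possible implementations.
type Backend = "bucket-and-cdn" | "api-and-functions";

class StaticWebsite {
  readonly url: string;
  readonly contentDir: string;
  readonly resources: ResourceDef[];

  constructor(props: StaticWebsiteProps, backend: Backend = "bucket-and-cdn") {
    this.url = `https://${props.domainName}`;
    this.contentDir = props.contentDir;
    // The underlying resources differ, but nothing above this line does.
    this.resources =
      backend === "bucket-and-cdn"
        ? [{ Type: "AWS::S3::Bucket" }, { Type: "AWS::CloudFront::Distribution" }]
        : [{ Type: "AWS::ApiGateway::RestApi" }, { Type: "AWS::Lambda::Function" }];
  }
}

const props: StaticWebsiteProps = { domainName: "example.com", contentDir: "./site" };
const siteA = new StaticWebsite(props);
const siteB = new StaticWebsite(props, "api-and-functions");

// Same user-facing surface, different implementation underneath.
console.log(siteA.url === siteB.url);
console.log(siteA.resources.map((r) => r.Type), siteB.resources.map((r) => r.Type));
```

The user-facing props and the resulting `url` are identical in both cases; only the list of underlying resources changes.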

Down: There is still strong coupling between constructs and the underlying provisioning mechanism. To maximize the investment in high-level abstractions, we will need cross-domain interoperability. For example, users will be able to define infrastructure using AWS L2 constructs within their CDK for Terraform or CDK for Kubernetes applications. This means that construct-based abstractions will no longer be confined to a specific provisioning domain, and they will be more broadly usable and valuable for more users. There is already initial work in this space, such as AWS CDK L2 support in CDKTF and CDK8s support in AWS CDK, but I believe we will see more of this decoupling and increased choice happening in the coming years.

Left: The construct programming model can be used to codify full development experiences. Constructs are not limited to synthesizing infrastructure declarations. They can also generate artifacts such as compiler and test setups, release pipelines, dependency upgrade policies, issue workflows, and editor settings. This means that we can use constructs to simplify development experiences. We’ve been exploring this direction with some very interesting results through the Projen incubation project. One can think of this approach as a “meta-IDE”: a programming model for producing integrated development experiences. As these ideas mature, we will be able to address more and more development experience challenges through a common modeling space. Being able to create constructs that encompass cloud infrastructure, application logic, and developer ergonomics can open up whole new possibilities in software development.

Right: Construct-based abstractions are leaky at runtime. Today, constructs are primarily a design-time abstraction: they hide complexity when writing code. But what happens after a construct is deployed? The abstraction is lost, and operators are left to deal with a bunch of resources. If a developer uses the “Static Website” construct in their app, they are not required to know how static websites are implemented. However, after their website is deployed, it stops being a website and becomes an Amazon S3 bucket, Amazon CloudFront distribution, and Amazon Route 53 hosted zone. The abstraction does not carry over. Solving this challenge is critical. If we want to raise the abstraction level, we must find ways to interact with these abstractions after they are deployed. We are exploring a few ideas on how to retain the fidelity of the construct tree after an app is deployed, and I’m excited by the experiences we will be able to offer with this information.
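
One way to picture what retaining the construct tree could enable is the sketch below. It assumes, purely for illustration, that every deployed resource carries the tree path of the construct that created it; the `DeployedResource` shape, the `groupByConstruct` helper, and the sample identifiers are hypothetical and do not describe an existing AWS or CDK feature.

```typescript
// Hypothetical sketch: recover construct-level groupings from a flat
// list of deployed resources, assuming each one records its tree path.
interface DeployedResource {
  physicalId: string;
  type: string;
  constructPath: string; // e.g. "App/Website/Bucket"
}

// Group resources by the first `depth` segments of their construct path.
function groupByConstruct(
  resources: DeployedResource[],
  depth: number
): Map<string, DeployedResource[]> {
  const groups = new Map<string, DeployedResource[]>();
  for (const resource of resources) {
    const key = resource.constructPath.split("/").slice(0, depth).join("/");
    const group = groups.get(key) || [];
    group.push(resource);
    groups.set(key, group);
  }
  return groups;
}

// What an operator sees today: a bunch of unrelated-looking resources.
const deployed: DeployedResource[] = [
  { physicalId: "bucket-1", type: "AWS::S3::Bucket", constructPath: "App/Website/Bucket" },
  { physicalId: "dist-1", type: "AWS::CloudFront::Distribution", constructPath: "App/Website/Cdn" },
  { physicalId: "zone-1", type: "AWS::Route53::HostedZone", constructPath: "App/Website/Zone" },
  { physicalId: "table-1", type: "AWS::DynamoDB::Table", constructPath: "App/Orders/Table" },
];

// With the path retained, the "Website" abstraction can be reassembled.
const byConstruct = groupByConstruct(deployed, 2);
for (const [path, members] of byConstruct) {
  console.log(path, members.map((m) => m.type));
}
```

Grouping at depth 2 puts the bucket, distribution, and hosted zone back under a single “App/Website” entry, which is the level at which the developer originally reasoned about them.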

I would like to take this opportunity to thank our users, open-source contributors, and the amazing CDK team at AWS. I am proud, inspired, and humbled by this inclusive and welcoming community, which shares a relentless passion for moving the cloud industry forward through delightful development experiences. I also want to thank the authors for publishing this book. I couldn’t imagine a more suitable group to collaborate on such an endeavor.

At Amazon we like to say, “It’s always day one!” It definitely feels like that with the CDK today, even though, as this book makes clear, we’ve come a long way. There are substantial challenges ahead, but there are also exciting opportunities to change how software is created through simplification and abstraction. I am excited that you decided to join the ride, and I hope this book will become a valuable tool in your journey to build better software.

The foreword from The CDK Book by Elad Ben-Israel