An illustrated guide to Amazon VPCs


In this section, I talk about why VPCs were invented and how they work. This is critical to understand because almost everything you do in AWS will happen inside a VPC. If you don't understand VPCs, it will be difficult to understand any of the other networking concepts.

If you're reading this, maybe you have one of these

and you just found out that to put your app on AWS, you need all of this:

complex diagram of a VPC, subnet, IGW, etc

And you have no idea what VPCs, subnets and so on are.

I'll help you learn about all those pieces. A little about me: I'm a long-tailed duck, and I run a business called Blackhatberry, selling phones to hackers. Now let's get started.

This is the story of VPCs (Virtual Private Clouds), our first big topic. Many moons (and suns) ago, some AWS engineers were sitting in a room. They had a serious issue.

"Guys, let's talk business. Why aren't more companies moving to AWS?" they said.

"Maybe because all instances run in a single shared network, which means users can access each other's instances, and see each other's data," someone said.

"Maybe because it's hard for them to move their existing servers to AWS, because of IP address conflicts," someone else said.

"Wait… what are IP address conflicts?"

"And existing servers? Shouldn't they be moving to our servers?"

This is the first reason people weren't switching to AWS. Here's what I mean by IP address conflicts. I own a bunch of servers for Blackhatberry. One of them has the IP address `172.98.0.1`. Now, my neighbor also has a server for her business. She loves my IP address. "Ah, 172.98.0.1, what a beautiful destination," she says. So she copies my address! Now we both have servers with the same IP address!

Sidebar: On a Mac, you can find your local IP address using `ipconfig getifaddr en1`. That works when `en1` is your Wi-Fi interface; on some Macs the Wi-Fi interface is `en0` instead.

Now you're thinking, "So what?" And actually... you're totally right. Even though our servers have the same IP address, they are on different networks, so it's not an issue.

But here comes trouble. Because we both want to get on AWS. But if we have two servers with the same IP address on AWS, that's a problem!

two machines in AWS with the same IP address and the text uh-oh

Bam: IP address conflict. Every server in a network needs to have a unique IP address for the same reason that every house in a city needs to have a unique address. Otherwise, if someone has a package, they wouldn't know which house to deliver it to.
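If you like seeing ideas as code, here's a toy Python model of the problem. It's only a sketch: the `Network` class and the owner names are made up for illustration, and real networks don't report conflicts this politely. It shows that the duplicate address is harmless on two separate networks and a problem on one shared network.

```python
class Network:
    """Toy model: a network where each IP address can belong to only one server."""

    def __init__(self):
        self.servers = {}  # IP address -> owner of the server at that address

    def add_server(self, ip, owner):
        if ip in self.servers:
            raise ValueError(
                f"IP address conflict: {ip} is already used by {self.servers[ip]}"
            )
        self.servers[ip] = owner


# On separate networks, the duplicate address is harmless.
my_network = Network()
neighbor_network = Network()
my_network.add_server("172.98.0.1", "duck")
neighbor_network.add_server("172.98.0.1", "neighbor")

# On one shared network, the second server collides with the first.
shared = Network()
shared.add_server("172.98.0.1", "duck")
try:
    shared.add_server("172.98.0.1", "neighbor")
    conflict = None
except ValueError as err:
    conflict = str(err)

print(conflict)
```

The two separate networks each accept the address without complaint; only the shared one raises an error.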

Now I know what you're thinking. Why don't you just get new servers in AWS? Why connect your on-prem ("on premises") servers to AWS? Isn't the whole point that you're moving to AWS?

And sure, we can afford to do that, but there are companies with dozens of on-prem servers. Migrating everything to AWS is simply not an option for them. They need a way to create new servers in AWS while also connecting their existing on-prem servers to the same network. And this does not work when everyone is part of the same network, because of IP address conflicts.

This was a huge problem for AWS! I mean, see how serious these engineers look:

This IP conflict issue meant people with on-prem servers had no easy way to gradually move to AWS. Think of all the potential customers they were losing!

I'm giving you this background so you can understand why VPCs were invented. IP address conflicts weren't the only issue. In AWS, everyone's servers used to be on the same network, which meant if you were careless, it was easy for anyone to connect to your server and look at all kinds of sensitive data!

A drawing of many servers in a circle, with the caption "everybody's servers on the same network!"

For both these reasons, Amazon needed to give each customer their own private network, instead of having them all on the same shared network. And so VPCs were born.

So there are two problems we're trying to solve:

IP address conflicts
The fact that users can access each other's instances because they're in one big shared network.

Remember: duplicate IPs were totally fine when my network and my neighbor's network were separate. What Amazon needed was a way to give each person their own private network, but inside AWS. That way, they could bring their IP addresses with them, and they wouldn't conflict with anyone else's IP addresses.

Maybe you're wondering, "Why can't we just change the IP addresses so all machines have a unique IP address?" Well, in networking, a lot of setup is built around specific IP addresses (I'll get to exactly which parts later), so changing them all would be a lot of work in practice.

Separate networks would also solve the security problem.

By the way, why am I spending so long on VPCs? Isn't this post about putting one of these

on one of these?

Two reasons.

Because everything we will build happens in a VPC, so it's the starting point for things.
Because a VPC is not something you can see, and I like to visualize my internet architecture. Other people visualize it in a way that's really confusing for me, and I want to make it less confusing for you.

I have read guides (such as the AWS docs) where people visualize a VPC like this. First, they'll say, "Oh yeah, I created a new VPC with four subnets inside it, in two availability zones".

And they'll draw an image that looks like this:

A drawing of two availability zones inside a region with a VPC overlaid on both availability zones.

But less pretty, obviously. This is what theirs looks like:

A VPC with an internet gateway and subnets in three Availability Zones.

(Taken from the AWS VPC docs)

Now,

a region is a place you can go to,
and an availability zone has data centers you can walk inside, that hold lots of servers.

Both of those are physical places. But what is the VPC? Is it a big tarp that sits on top of the data centers? Is it a dark fog? Is it a general feeling of unease that blankets the region, as all the data centers play Radiohead's "Fitter, Happier" on repeat?

…what is it?

We've talked about why AWS needed VPCs, and the idea behind VPCs, but how are they implemented? How do they actually work?

Your instances in AWS always run inside a VPC. But in real life, of course, your instances are just running on servers in AWS data centers.

You may have many instances running on many different servers. How does Amazon connect these instances, across different servers, into their own private network? It uses something called the mapping service.

Suppose I have an instance A on server 1, and I want to talk to another instance B on server 2.

A picture of two servers with an instance on each

All instance A knows is the IP address for instance B. It hits the mapping service with that IP address. The mapping service then checks what VPC instance A is in, finds the instance with that IP in that VPC, and forwards the request to it.

A picture of two servers with an instance on each and a box representing a mapping service in the middle.

That "in that VPC" part is important. The mapping service lets my neighbor and me both have instances on AWS with the same IP. When I hit that IP address, the mapping service connects me to the instance in my VPC, and when my neighbor hits that IP address, the mapping service connects her to the instance in her VPC.

A picture of two VPCs inside AWS, both with servers that have the same IP address.

The mapping service is what ensures that we can never connect to each other's instances. Through the mapping service, all my instances are connected together, and they can have any IP address I want, because it's like they're namespaced to me. The mapping service is what creates the private network inside AWS for me.
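Here's a minimal Python sketch of that idea, assuming the mapping service behaves like a lookup table keyed by VPC and IP address. This is a toy model, not how AWS actually implements it; the `MappingService` class, the VPC names, the server names, and the `10.0.0.5` address are all hypothetical.

```python
class MappingService:
    """Toy model of a lookup table keyed by (VPC, IP address)."""

    def __init__(self):
        self.table = {}  # (vpc_id, ip) -> physical server hosting that instance

    def register(self, vpc_id, ip, server):
        self.table[(vpc_id, ip)] = server

    def resolve(self, source_vpc_id, destination_ip):
        # The lookup only ever searches the sender's own VPC.
        return self.table.get((source_vpc_id, destination_ip))


mapping = MappingService()
mapping.register("vpc-duck", "10.0.0.5", "server-2")
mapping.register("vpc-neighbor", "10.0.0.5", "server-7")  # same IP, different VPC

mine = mapping.resolve("vpc-duck", "10.0.0.5")
hers = mapping.resolve("vpc-neighbor", "10.0.0.5")
missing = mapping.resolve("vpc-duck", "10.0.0.9")  # no such instance in my VPC

print(mine, hers, missing)
```

Because the VPC is part of the key, the same IP address leads to a different server depending on who is asking, and there's no key I can use to reach my neighbor's instance.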

So when you think VPC, picture a service that connects all these instances together.

A drawing of two servers running a bunch of instances, all connected to each other

Going back to this image, we can now understand what it means:

A picture of two availability zones in a region with a VPC box overlaid on both availability zones

That VPC box just represents the scope of the mapping service. A mapping service can connect EC2 instances that are on servers in different availability zones, which is why the VPC is drawn across both availability zones. But the mapping service can't connect instances in different regions, so the VPC doesn't span regions.
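The scoping rule can be written down as a tiny Python check. Again, this is a simplified sketch under my own assumptions: the `Instance` class, the `same_private_network` function, and the VPC names are hypothetical, and the rule only captures what the diagram shows (one VPC, one region, many availability zones).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    region: str
    availability_zone: str
    vpc_id: str


def same_private_network(a, b):
    """Toy rule: two instances share a private network only if they are in the
    same VPC, and a VPC lives in exactly one region. The availability zone
    doesn't matter."""
    return a.region == b.region and a.vpc_id == b.vpc_id


a = Instance("A", "us-east-1", "us-east-1a", "vpc-duck")
b = Instance("B", "us-east-1", "us-east-1b", "vpc-duck")     # other AZ, same VPC
c = Instance("C", "eu-west-1", "eu-west-1a", "vpc-duck-eu")  # other region, other VPC

across_zones = same_private_network(a, b)
across_regions = same_private_network(a, c)

print(across_zones, across_regions)
```

Instances A and B sit in different availability zones but share a VPC, so they're on the same private network. Instance C is in another region, so it has to live in a different VPC.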

Today everything happens in VPCs. Your instances are always in a VPC. Everyone gets a default VPC (one per region) when they open an AWS account. We no longer have the issue where users can access each other's instances, and we don't have the IP collision issue either.

So we find out that they don't play "Fitter, Happier" in the data centers after all. Maybe it's just a recording of Jeff Bezos singing "Money" by Pink Floyd.

drawing of Jeff Bezos singing "Money" by Pink Floyd

Throughout this guide, I'll show you how to create AWS resources using Terraform. I find Terraform easier to follow than point-and-click on the AWS console, because you can just copy the code and run it.

Here's the Terraform code to create a VPC:

resource "aws_vpc" "main" { cidr_block = "10.0.0.0/16" }

In any Terraform configuration, you'll also need a couple of boilerplate blocks, `terraform` and `provider`. The full code listing is here. You can apply this code to create a new VPC. Ignore the `cidr_block` part for now; I'll discuss CIDR in more detail in a future section.

In AWS, every customer has their own private network, called a VPC.
Without private networks, we run into IP address collisions.
Without private networks, everyone is on the same network, which is really bad for security.
VPCs are implemented using the mapping service.

Chapter 2: subnets