Day Two Cloud 097: Azure Cloud Networking Essentials

Episode 97

On today’s episode we peel back the covers on networking in Azure. We get details on the ability to inspect packets in Azure and the use of third-party virtual appliances, the role of network security groups, the design implications of how Azure spans its availability zones, and the use of network peering and private links to connect to the cloud.

We also get into options for SSL termination, Virtual WAN, point-to-point VPNs, and Azure’s VPN gateway.

Our guest is Pierre Roman, Sr Cloud Ops Advocate at Microsoft. This is not a sponsored episode.

Pierre’s cloud networking takeaway: Did you think about it first?

Show Links:

@wiredcanuck – Pierre Roman on Twitter

Step-By-Step: Connect your AWS and Azure environments with a VPN tunnel – Updated – Argon Systems

Microsoft Docs Library – Microsoft

IT OpsTalk – Discord

IT OpsTalk – YouTube



[00:00:03.610] – Ned
Welcome to Day Two Cloud. Today’s topic is what’s going on with Azure networking these days, and we’ve got a great guest, Pierre Roman. He’s the senior cloud ops advocate at Microsoft and he has been doing networking for, I think he said, almost 30 years or maybe more than 30 years. So the man knows his way around a packet. What stood out to you, Ethan?

[00:00:26.380] – Ethan
Pierre does indeed know his way around a packet and he’s got a good sense of humor. He’s thoughtful. He is not shy about saying what’s on his mind either. Now, despite all that stuff that he knows, Ned, I did manage to stump him with a question he did not know the answer to. And you’re going to have to wait until the end of the show to find out what that was. But I don’t think I was proud of myself. It’s like, oh, wow, there’s something the guy doesn’t actually know off the top of his head, because he was amazing.

[00:00:49.880] – Ned
Yeah, absolutely fantastic, so enjoy the conversation with Pierre Roman from Microsoft. Well, Pierre, thank you so much for joining us today. Let’s get right into it. You’re a fellow human and we want to know a little bit more about you. So what would you say you do around here?

[00:01:09.140] – Pierre
What do I do around here? Well, first of all, I’ve been in IT for well, over 30 years, OK, aging myself a little.

[00:01:19.470] – Ned
It’s all right.

[00:01:20.310] – Pierre
It started with, like, deploying Novell, I think it was one point seven.

[00:01:27.200] – Ethan
I thought I was an old guy with Novell 3.11. One point seven? Wow. OK.

[00:01:31.830] – Pierre
Yeah. Where you had to recompile the server kernel every time you wanted to change the network interface card, because there were no drivers; it was built into the kernel. Wow. That was fun. So I’ve seen a lot of the changes over the years, from back then, when we were actually having to do every little nitty gritty detail of all of the networking, to now just saying connect from here to here and go. Which I love, by the way.

[00:02:06.410] – Ethan
But there’s a little more to it than that Pierre, but, you know, I know what you’re saying.

[00:02:09.500] – Pierre
Actually, the way I look at networking, whether it’s in the cloud or on prem, it’s mostly about: did you think about it first? Like when you know what your network’s going to look like or what you need to connect where and how, then the rest is just laying cables, whether they’re virtual cables or whether they’re physical cables. But the implementation is not as hard as the architecture.

[00:02:40.470] – Ned
Hmm. Right. Right. But when you’re implementing that architecture, there’s a lot of foot guns involved.

[00:02:46.400] – Pierre
Yeah. And we’ll talk about it a little bit later. Sometimes the tools are better on one side than they are on the other. And sometimes the tools are basically not complete or partially there. So I always have to figure it out.

[00:03:06.890] – Ned
Yeah. Well, we wanted to have you on the show to talk about what’s going on with networking in Microsoft Azure. And maybe we could start with, since networking is fundamentally just moving packets from point A to point B and hopefully they get there. Sometimes we want to inspect those packets and see what’s inside them, what’s going on in Azure when it comes to packet inspection.

[00:03:31.590] – Pierre
So there’s a few things with packet inspection. A lot of people I talk to say, oh, I don’t need to set up anything specific or appliances or anything like that because I’ve got network security groups, and that’s when I typically go, oh, wait, wait, wait, wait.

[00:03:52.480] Because if we know what the OSI layers are, we know that basically layer three to layer seven is where we play, and that’s where we have to inspect. What the payloads are, to and from where, over which protocol, and what’s inside are all important when you’re trying to really secure your environment. So I always say, first of all, NSGs are great to filter traffic, but they are not appropriate, in my own opinion, for inspecting traffic.

[00:04:28.140] – Ned
Right. Because they’re just a filter that says allow traffic from source to destination on these ports or deny that traffic. It’s just a very basic allow or deny list.

[00:04:39.760] – Pierre
There’s five parameters: the source address, source port, destination address, destination port, and allow or deny. That’s it.

[00:04:52.650] – Pierre
So that’s all the control you have.
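
Editor’s note: the filter Pierre describes can be pictured with a small Python sketch. It models only the fields he lists, plus a priority number for ordering, and every name in it (`Rule`, `evaluate`, the sample addresses) is a hypothetical illustration rather than an Azure API. Real NSG rules carry additional fields such as protocol and direction, which are left out here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    priority: int           # lower number is evaluated first (assumption of this sketch)
    source: str             # CIDR prefix, e.g. "10.0.1.0/24"
    source_port: str        # a port number as text, or "*" for any
    destination: str        # CIDR prefix
    destination_port: str   # a port number as text, or "*" for any
    action: str             # "Allow" or "Deny"


def port_matches(pattern: str, port: int) -> bool:
    """A rule port matches if it is the wildcard or the exact port."""
    return pattern == "*" or int(pattern) == port


def evaluate(rules, src, src_port, dst, dst_port, default="Deny"):
    """Return the action of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if (ip_address(src) in ip_network(rule.source)
                and port_matches(rule.source_port, src_port)
                and ip_address(dst) in ip_network(rule.destination)
                and port_matches(rule.destination_port, dst_port)):
            return rule.action
    return default  # nothing matched


# Hypothetical rule set: allow HTTPS from one subnet, deny everything else.
rules = [
    Rule(100, "10.0.1.0/24", "*", "10.0.2.0/24", "443", "Allow"),
    Rule(200, "0.0.0.0/0", "*", "10.0.2.0/24", "*", "Deny"),
]

print(evaluate(rules, "10.0.1.5", 50000, "10.0.2.10", 443))
print(evaluate(rules, "10.0.1.5", 50000, "10.0.2.10", 22))
```

Nothing in this decision looks at the payload, which is the point being made: the filter sees addresses and ports, not what is inside the packet.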

[00:04:55.700] – Ned
OK, so if I want to do something deeper than that, if I actually want to do packet inspection and make some more intelligent decisions, what is the solution or solutions that I could do that with?

[00:05:06.590] – Pierre
Well, there’s multiple things here, because when we’re talking about packet inspection, the question is: do you want to do packet inspection for troubleshooting an issue, or do you want to do packet inspection for securing your environment?

[00:05:20.150] If you want to do it for securing your environment, then go to either the Azure Firewall or any of the other firewalls that we have, like Sophos, Palo Alto, Check Point, Juniper, Qualys. There is an army of partners out there that have virtual appliances. And I know a lot of companies that I’ve spoken with that have been around for a long time. And one in particular I talked to not too long ago, they’ve always been on Check Point.

[00:05:49.520] So internally they’ve always been on Check Point; that comes from before the era of the cloud. Their people know it. They trust it. They’re used to it. So talking to them, they’re like, OK, I don’t know, should we really go to Azure? We don’t really know what it is. It won’t interface with our own tools. Well, why don’t you just deploy the Check Point appliance?

[00:06:12.380] – Ned
Like, one of the challenges of deploying those network virtual appliances is getting high availability to work properly, making sure you’re filtering all your traffic. And if you have more than one Vnet, do you have to drop two of those appliances in every Vnet? Or can you have like a hub and spoke? So what’s the guidance on how you can effectively deploy one of those NVA pairs?

[00:06:35.030] – Pierre
Now you’re getting into it, because depending on who you talk to, if you ask 50 people, you’re going to get one hundred answers.

[00:06:47.420] Because there really is no one right way. Yes, there are bad ways of doing this, but there are varying degrees of right ways of doing it, because it all depends on your capacity to pay, and it all depends on what you’re trying to accomplish. Remember earlier when we talked about architecting your network properly? Are you filtering all of the traffic, or funneling all the traffic, sorry, to a single point coming in and out of your entire network?

[00:07:29.850] And if so, then great, because then you put a pair there and you’re done: a load balancer in front, a load balancer behind, and then you’re good to go. Now look at the Azure Firewall, and I’ll take that one as an example, because it’s not really an appliance. Most of those appliances from our partners are virtual machines that are hardened, and they’ve got their own special sauce on them, and they’re deployed as a black box.

[00:08:00.120] So this gets deployed. You have an interface to manage it, but you can’t go into the guts of that machine and change things. The Azure Firewall is a cloud service, so it’s not necessarily one VM. So there is some redundancy that’s already built into that.

[00:08:24.880] – Ethan
In other words, I’m relying on Azure to provide some resilience in that service without me having to set up a dual pair.

[00:08:30.910] – Pierre
Exactly, yeah. So Azure Firewall is a cloud service that already has some of that redundancy and reliability built into it. Now, am I saying that one firewall can never take a nosedive? It possibly could, but then we go back to another kind of conversation, which is probably not the focus of our topics here today, which is: how many nines do you want and how many nines can you afford?

[00:09:09.180] – Ethan
Yes, that’s. All of those nines, none of them come for free for sure.

[00:09:14.110] – Pierre
I’ve had so many conversations with companies, and when we’re talking about high availability and networking, they’re saying, oh, we absolutely want five nines. And I’m like, you realize that five nines is less than four minutes of downtime per year, unplanned downtime per year. And at four nines I think it is like an hour and a half. Roughly that, yeah. So those millions of dollars you’re going to pay to support five nines, are they worth an hour and a half?
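
Editor’s note: the downtime figures above are quoted from memory in conversation. A quick calculation, assuming a 365-day year and treating availability as a simple fraction of total time, puts five nines at a little over five minutes per year and four nines at a little under an hour. The function name below is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines.

    Three nines means 99.9% availability, i.e. an unavailable
    fraction of 10**-3 of the year.
    """
    unavailable_fraction = 10 ** -nines
    return MINUTES_PER_YEAR * unavailable_fraction


for n in (3, 4, 5):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: {minutes:.2f} minutes of downtime per year")
```

The argument in the conversation is unchanged by the exact numbers: each extra nine buys back a shrinking amount of downtime, so the question is whether that last slice is worth what it costs.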

[00:09:54.050] – Ethan
There are a few companies where it actually might be. But most people go they get that budget line item and go, oh, oh, it turns out we’re more tolerant than we thought we were.

[00:10:03.440] – Pierre
Exactly. Like if you’re running a nuclear silo or a hospital where people might die, then yes, spend the money. If it means that you’re going to lose two orders on your e-commerce site for a grand total of fourteen dollars and ninety eight cents plus tax, well, maybe they can wait an hour to place their order.

[00:10:29.600] – Ethan
Pierre, can I do SSL, not SSL offload, but SSL decrypt. I don’t know why. I just completely forgot the word, but will it break into the middle of my SSL sessions so that I can deeply inspect that traffic?

[00:10:42.830] – Pierre
Yes and no. You can tell it. You can do SSL termination.

[00:10:48.350] – Ethan
That’s what I mean. Thank you. Yes. Termination.

[00:10:50.600] – Pierre
Yeah, you can do SSL termination or you can let it pass through. It’s configurable. But if you’re running a workload behind your firewall, and let’s say it’s a web front end where SSL terminates, there are other solutions that we have that are more suited for that than just a straight firewall. You’re looking at Front Door, for example, which has a firewall built in, but it’s at the application level.

[00:11:26.900] – Ethan
It’s a WAF?

[00:11:28.400] – Pierre
Yeah. Yeah, there’s WAF built into it. It will do termination or pass through depending on what you want. Some of it is based on DNS in terms of how it will handle the traffic. Other parts are based on the internals of the packets. So, to throw my own company under the bus here.

[00:11:49.500] – Ned
Oh boy.

[00:11:50.330] – Pierre
I’m hoping I’m not going to lose my job over this, but I find that we have a ton of different services that all do almost the same thing, and we call them something very different. And sometimes there’s a bit of confusion as to what I should get and what I should use. Do I get Traffic Manager or Firewall or Load Balancer or Internal Load Balancer or Application Gateway with WAF or WAF or Front Door? It gets to a point where you’re like, which one do I pick?

[00:12:28.580] – Ethan
If only Azure was the only cloud service with that problem.

[00:12:32.330] – Pierre
I agree. I agree. At least we call our products, in most cases, something that’s a little bit descriptive of what they do. I’ve never been a big fan of our marketing, that’s like a Windows Server 2015 enterprise version for Web, like, the name gets this long, but at least you kind of know what it’s about when you read it.

[00:12:59.010] – Ned
Yeah, yeah. With the number of services that already exist, it’s really hard to keep track of all of them. And if you could at least make the name descriptive so I can go, oh, OK, I get what that thing’s doing. By calling it Azure Firewall, like, I’ve got a pretty good idea what it’s doing. You brought up availability a little bit earlier and I wanted to get into that because one of the things that I noticed is availability zones have been rolling out to all the regions. I don’t think all of them have them yet.

[00:13:29.160] – Pierre
But now we are committed. We are committed to have every region in the world, which I think we’re a little over 60 now. I don’t keep track of the number because it changes so often. But we’re around 60-ish regions worldwide. We’re committed that by the end of calendar year 2021, every region is going to have availability zones.

[00:13:57.150] – Ned
OK, that’s exciting. One of the big distinctions that I think I tripped over when I first tried using AWS is that AWS subnets are by availability zone, whereas on Azure subnets span. Well, there weren’t availability zones before, but now that there are, a subnet can span availability zones, which is very nice. That makes my life easier sometimes. But is there a way to pin a subnet to an availability zone? Is that a thing that I would ever want to do? I’m just curious what some of the new design and architecture impacts are from having availability zones available.

[00:14:34.030] – Pierre
Now, that one’s a little fuzzy. The last time we talked you kind of mentioned it, and I tried to get one of the engineers and the product manager to have a meeting with me. I am still waiting for that meeting. But my view on that is it doesn’t really matter to me.

[00:14:57.150] Because availability zones are either areas of a data center or separate data centers. In some regions, like if I’m looking at Canada East, there is a single giant data center, but that data center is actually carved into three or more separate pieces. It’s almost like three or four different data centers inside the same physical location. And when we say physical location, it’s like three square miles of land that we’ve just plopped a pad on.

[00:15:36.830] So inside that one data center, you may have three availability zones because availability zones are a logical and physical separation of network, cooling, electricity and everything that’s basically physical.

[00:15:51.290] There’s also a logical separation to ensure that, for example, storage gets replicated across availability zones. Or if you are deploying a scale set or a cluster, let’s call it a cluster, you deploy it across zones and have nodes in different availability zones. So if one zone goes down, the other two keep going.
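
Editor’s note: the idea of spreading a cluster across zones so that losing one zone leaves the rest running can be sketched in a few lines of Python. This is a toy model, not how Azure places anything; the zone labels, node names, and both functions are hypothetical.

```python
def place_nodes(node_names, zones):
    """Assign nodes to zones round-robin so they are spread evenly."""
    return {name: zones[i % len(zones)] for i, name in enumerate(node_names)}


def survivors(placement, failed_zone):
    """Nodes still running after a single zone is lost."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


zones = ["1", "2", "3"]                       # three availability zones
nodes = [f"node-{i}" for i in range(6)]       # a six-node cluster
placement = place_nodes(nodes, zones)

for zone in zones:
    alive = survivors(placement, zone)
    print(f"zone {zone} down -> {len(alive)} of {len(nodes)} nodes still up: {alive}")
```

With six nodes over three zones, any single zone failure leaves four nodes running, which is the "other two keep going" outcome described above.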

[00:16:13.940] – Ethan
So again, not geographic separation, but system separation. You were talking about power and cooling and so on. So you can have a fire in one part of the data center, and the other parts of the data center are fine.

[00:16:25.760] – Pierre
Yes. Yeah. And we have both situations. In a very large region, where we have multiple data centers, you can have them physically separated, connected by super low latency links between those zones to allow for replication and so on to take place. So in some regions we do have three completely separate physical data centers on three different pads, potentially on either side of the road or in a different part of the city, because if you can, you want to build each one on a different electrical grid. Unless you’re in Texas, where you’re screwed wherever you are.

[00:17:09.500] I’m sorry. I’m sorry.

[00:17:11.090] I just had to make that joke. But in some cases where we have smaller data centers and regions that are just growing, which for us is the case in Canada, with Canada East and Canada Central, it becomes a physical separation but not a geographic separation, if that makes sense. Like, there’s network, electricity, cooling, fire suppression and all that good stuff in three separate self-contained units. So if one goes down, it doesn’t affect the others. If the commercial power goes down, there are three sets of generators and power banks and so on.

[00:17:53.120] So that’s the redundancy. So you look at that and you say, OK, well, my subnetwork, my Vnet, is spanning zones. So if one zone goes down, it’s only going to take down the workloads that are sitting in there. But that’s fine if you’ve architected your workload properly, which goes back to the beginning of our conversation, when I said the mental exercise of organizing the workload that your network is going to support is the most important part, as opposed to laying the pipes.

[00:18:27.530] If you’ve got a node in one availability zone and a node in another availability zone, node A is going to disappear because that data center is, hopefully not in flames.

[00:18:39.230] – Ned
But something’s going on with it.

[00:18:42.260] – Pierre
Something’s going on with it. Right. And in some cases it is physical. Last year we had a data center where the electrical got struck by lightning, or the cooling got struck by lightning, in one section, and the data center kept going. But the rest of what we had couldn’t support it. And it got to a point where we had to do a graceful shutdown of the data center so as not to lose everything.

[00:19:09.320] But if you were in availability zone two or zone three, you’re OK, you’re good. And so if your subnet goes across zones, I don’t see, and I’m willing for you to challenge me on that, I don’t see the big problem, OK?

[00:19:31.720] – Ned
Yeah, it was just one of those things when I made the move between clouds. I was like, oh, that’s different. Oh, but now it’s available. But does that impact the architecture? And it sounds like you should be taking advantage of availability zones when it comes to your workload placement, but it’s less of a consideration when it comes to the Vnet and the subnet distribution. Yeah, that’s OK.

[00:19:52.720] – Pierre
In Azure, basically our approach is that a virtual network is not an end in itself. It’s only there to support something else. So if you architect that something else properly, then your Vnet is just going to support it.

[00:20:10.750] – Ned
Right. Getting into Vnets a little bit more. When I first started working on Azure, I started creating Vnet peerings because that’s just a natural thing that happens. And, you know, as you add more and you try to create a full mesh, that can quickly spin out of control.

[00:20:27.010] – Pierre
Yeah. You end up with a spaghetti plate.

[00:20:29.140] – Ned
Exactly. And sometimes we don’t want the spaghetti plate. I mean, I like that for dinner, but maybe not for my network. Is there anything new that’s been introduced that can help with reducing the number of peering connections while still maintaining that full connectivity between Vnets.

[00:20:45.040] – Pierre
Oh, network peering, in my view, is only there to avoid having to deploy VPN gateways to connect every other network that you have. I’ve always seen cloud computing or cloud networking as an extension of the physical networking that we’ve been dealing with for years. The concepts are pretty much the same. The implementation of them is different. So look at, I’d say, a physical environment. Think of a company X that has offices in multiple buildings or multiple cities, but they all need to access the same central HQ for the HR database or the corporate workload that needs to happen.

[00:21:37.980] And you have those individual networks that are supporting branch offices, and then you have to connect them somehow to that head office. That’s the same thing when you end up in the cloud. So you have your Vnet in Region A to support a workload. And personally, when I design or help design infrastructure to support workloads, I like to group everything that has the same life cycle, so that it follows the application lifecycle. So if it’s supporting workload A, and workload A is being turned off because we’ve migrated to something else, all of that goes into the same resource group, because resource groups are just logical containers.

[00:22:31.580] And then I have my virtual network to support that workload. And that’s it. I don’t have a virtual network to support multiple workloads, because then it becomes too complicated. When you try to filter, you end up going from point A to point B having to go through a hundred different NSGs. And if your connection breaks, which one is it? Is it the one at the NIC? Is it the one at the VM? Is it the one on the subnet? Is it the one on the other subnet? There’s too many points in between that can affect it.

[00:23:08.590] – Ethan
Is there a design concept where if you take a bunch of Vnets that you need to talk and you don’t want the spaghetti plate, you make a hub and spoke topology out of it? Something like that?

[00:23:17.410] – Pierre
Yeah, that’s what we have. That’s what Virtual WAN is.

[00:23:22.970] – Ned
Oh interesting. When I was first introduced to the concept of Virtual WAN, it sounded like more of an SD WAN play, where, oh, I’ve got these SD WAN appliances in my branch offices and I want them all to hook into Azure, so I’ll use Virtual WAN. But you’re saying that’s one use case, and it could also apply to Vnets. How does that work with the Vnets then? Does it use the peering connection, or do you have to set up VPN gateways on every Vnet?

[00:23:47.620] – Pierre
So Virtual WAN is basically the architecture, the hub and spoke you just mentioned. OK, that’s what its main function is. And you can connect VPN devices to it. And I’m looking at my list to make sure I don’t forget any. So, point-to-site VPN: your customers, no, not your customers, your people are working at home now because of this age of the human malware. And if you don’t have VPN concentrators everywhere, you have to get them to connect to the Virtual WAN, and then from there they can connect to every other resource that is connected to that Virtual WAN.

[00:24:26.980] So whether it’s a VPN or other SD WAN device or software defined networking device. Users, whether they’re on Azure VPN, OpenVPN, basically any IKEv2 client type. Of course there are some that are not supported, but we’re not going to jump into that because then I will definitely lose my job. ExpressRoute circuits, or virtual networks in either a peered or VPN Gateway connected capacity. So you end up with that hub and spoke, as opposed to having Vnet A connected to Vnet B, and Vnet B connected to Vnet C, and Vnet C connected to B and A, and then you end up with that spaghetti.

[00:25:09.530] And every time you deploy a new site or virtual network or workload that has its own Vnet, it’s, oh, I need to talk to this database that’s over there, so we have to connect to this. Oh, and because of the monitoring, I have to connect to this. And so for every one you deploy, you end up having to put in like ten more connections. And most of those connections are bidirectional. So you have to set ten connections at 20 different end points.
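
Editor’s note: the "spaghetti plate" arithmetic can be made concrete. The sketch below counts links for a full mesh of N networks versus a hub and spoke where each network connects once to a central hub. It assumes every link is configured on both ends, as Pierre describes; the function names are hypothetical.

```python
def full_mesh_links(n: int) -> int:
    """Links needed so every one of n networks connects directly to every other."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n: int) -> int:
    """Links needed when each of n networks connects once to a central hub."""
    return n


def endpoints_to_configure(links: int) -> int:
    """Each link is set up on both sides (assumption: bidirectional config)."""
    return 2 * links


for n in (3, 5, 10, 11):
    mesh = full_mesh_links(n)
    hub = hub_and_spoke_links(n)
    print(f"{n:>2} networks: full mesh {mesh:>2} links "
          f"({endpoints_to_configure(mesh)} endpoints), "
          f"hub and spoke {hub:>2} links ({endpoints_to_configure(hub)} endpoints)")
```

Going from 10 to 11 fully meshed networks adds ten new links configured at twenty endpoints, which matches the figure in the conversation; with a hub, the eleventh network adds one link.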

[00:25:38.050] – Ned
So, yeah, you go with that hub and spoke type architecture. Is the Virtual WAN limited to one region, or can you plumb in stuff from other regions?

[00:25:47.210] – Pierre
It’s global. It basically uses our own backbone.

[00:25:54.310] – Pierre
So the private fiber that we own, and in most cases have laid down ourselves, connects all of our data centers worldwide. When you’re looking at Virtual WAN, you are using that fiber and not public utilities or the public Internet. So there’s a bit more resiliency there because it’s under our control. We’re not relying on AT&T and Bell Canada, and I don’t know what the rest of the Americas or Europe are using in terms of telcos, but you know what I mean. We don’t have to worry about utility Internet connections. It’s our own fiber, and Virtual WAN is on that fiber.

[00:26:39.980] – Ned
Interesting. So if I had an ExpressRoute circuit in the US and one over in Europe, I could use both of those ExpressRoutes for my local offices there to hook into Virtual WAN. And then I’m riding the Microsoft backbone across the Atlantic. I don’t have to worry about the vagaries of the Internet and Internet providers.

[00:27:03.320] – Ethan
Which is not the way all the cloud providers work. Ned, as I understand it, depends on which cloud provider. Everybody’s got fiber all over the world. But not everybody wants you on that fiber all the time, though. They’ll maybe punt you to the Internet quickly if they can.

[00:27:17.200] – Ned
Some of them will make you pay for that privilege quite dearly.

[00:27:21.590] – Pierre
And you are paying for it. Virtual WAN is not a free product.

[00:27:26.600] – Ned
Right. OK, so there is a cost associated with it.

[00:27:29.960] – Pierre
So in Azure networking, there’s always a cost for egress unless it’s within a region. If it’s not leaving the region, like copying from server A to server B within the same region, you’re good. OK, copying from server A to server B in a different region, then you’re paying egress.

[00:27:53.950] – Ned
OK, that makes sense.

[00:27:55.970] – Pierre
Basically, everything coming in is free. Everything coming out you pay for.

[00:28:04.670] – Ned
You brought up VPN gateways, and I think you knew this question was coming. When I go to spin up a VPN gateway, it takes like 30 minutes. That seems like a really long time to spin up a virtual machine. So if you have any insight, could you provide a little insight into what’s going on behind the covers that’s taking that long?

[00:28:25.820] – Pierre
OK, that is one of those topics that I told you before we started recording that I was basically told to be very careful in how I approach. I will say mea culpa, mea culpa, in terms of me representing Microsoft in this particular situation. It’s not the best story. And it’s all based on the automation to deploy that machine. Because the gateway is not a cloud service; that’s literally a virtual appliance that gets deployed.

[00:29:10.030] It’s deploying a base OS and then over top, it installs the bits that it needs. So it’s basically built in, I don’t know if I can call it that, but it’s almost like a pipeline where you say deploy this. So it says, OK, well, I need that OS and then I need this on top of the OS. And I can’t really get into the details as to which OS it is, because apparently there is some confusion as to whether it’s a Linux back end or a Windows back end.

[00:29:38.200] And I was not able to find out. I know the last time we talked, we kind of had that: well, I thought it was this, and I thought it was that. I haven’t had up to date confirmation as to what we’re actually running at the back end right now. But yeah, so it deploys the OS, it deploys the modules that it needs on it, it actually builds the rules and everything, and then sets up the configuration. Yes, if you need to change one thing, it basically tears it down and restarts it, so it’s always like 30 to 40 minutes deploying it. What I was able to ascertain is that we haven’t had too many complaints from enterprise customers that are using it, because this is not typically something you tear down and bring back often. When we’re doing it in the lab, or you’re trying to set up a demo,

[00:30:32.150] yes, it’s very annoying. And you’re right. Your CI/CD pipeline or your ARM template or your PowerShell or CLI script says, I’m going to deploy this environment for my development team to be able to use, and then it stalls on that 30 minutes of building the gateway.

[00:30:53.730] – Ned
Right. That’s exactly the scenario that I was running into, because I do a lot of demos. If that’s one of the things I have to deploy, I’m like, all right, well, I can’t do that demo live. I have to either have a warmed up environment or have it prerecorded, just because I know that’s going to take longer. But you’re right, in a regular environment you would build that once and then you wouldn’t really do much with it except create connections.

[00:31:17.330] – Ethan
In a regular environment Ned, you could just go get a cup of coffee and relax because you need to relax, buddy.

[00:31:23.420] – Ned
But I’m special. I want it now.

[00:31:29.480] – Pierre
Special how? It’s like my wife keeps telling me I’m funny, and I keep asking her: am I funny strange or funny ha ha?

[00:31:36.020] – Ned
Does she say yes?

[00:31:38.450] – Pierre
Pretty much.

[00:31:41.120] – Ned
I’ve had a similar conversation. Oh dear. Another thing that we talked about previously was connecting VPN Gateway in Microsoft Azure to a VPN gateway in AWS. Because what I’ve encountered is there’s this first mover problem where each of them wants the other one to initiate the connection. And so neither of them ever does. And you said you might have gotten that working. Can you?

[00:32:05.010] – Pierre
I got it. I got it working. And I’ll send you the link to the article I wrote, and that was in early twenty nineteen, so things might have changed since. Well, I’d done it a few years before using a Windows Server 2012 R2 with RRAS as an edge device in my virtual network to connect to the AWS appliance and establish a connection that way. In early twenty nineteen, I got it to work. The problem that they have is, as you mentioned, like the first responder syndrome, meaning Azure can either respond or initiate the call or the connection.

[00:32:52.210] AWS can only respond to it.

[00:32:56.800] – Ned
So you gotta let the Azure side know you got to make the call first.

[00:33:00.760] – Pierre
So Azure always has to initiate the connection. So if for some reason the AWS side thinks it’s not connected, but the Azure side still shows as connected, Azure is not going to try to reconnect. As far as it knows, it’s already connected. And the AWS side is not going to say, hey, I’m down, or the connection’s down for some reason, and it’s not going to try to reinitiate it. So it’s just going to sit there. And I’ve seen this in production, where you look at one side and it says connected, you look at the other side and it’s not connected. It’s one of those things where we’re all using “standards”.

[00:33:48.100] – Pierre
With massive air quotes, but we’re implementing them with kind of like our own little special sauce on top sometimes. The other thing is we use IKEv1 for policy based and IKEv2 for route based. So when you’re looking at the VPN, there’s two different types of VPN connections, route based and policy based, and AWS only supports IKEv1. Or that’s the last time I looked, so things might have changed. I haven’t had a lot of time in the last few months to actually revisit it. But now that you mention it, maybe I should try to redo my work to see if it works again.

[00:34:34.210] – Ned
Yeah, I also noticed the same thing about v1 being the only one supported by the AWS side, and I always thought that was a little strange, but it was the choice they made.

[00:34:43.830] – Ethan
You’d think v2 would be supported just for the efficiency of it.

[00:34:51.140] – Pierre
And there’s also a mismatch in the phase lifetime. The phase two lifetime, for example, is 3,600 seconds for policy based and twenty seven... and I’m reading my notes here, because I know typically I fubar the numbers and then somebody says, oh, that number was wrong. Anyway, 27,000 seconds for route based. AWS only has one setting, 3,600. So when the settings are off, the handshake gets a little wonky, and then sometimes it connects and sometimes it doesn’t.

[00:35:28.020] – Ethan
In theory, that shouldn’t stop the tunnel from coming up, because I think the way the standards read for IPsec, you should pick the lower, whichever one tightens the scope. I think that’s the way that’s supposed to happen. But as you say, “standards”, air quotes, you know. Yeah.
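
The rule Ethan describes can be sketched in a few lines of Python. This is only an illustration, assuming each side proposes a phase two lifetime and the tunnel adopts the shorter one. The values are the ones Pierre reads from his notes, treated here as seconds, and the function names are hypothetical. It does not model how either vendor actually negotiates.

```python
# Phase two (IPsec SA) lifetimes in seconds, as quoted in the conversation.
AZURE_POLICY_BASED = 3_600
AZURE_ROUTE_BASED = 27_000
AWS_LIFETIME = 3_600


def negotiated_lifetime(local_seconds: int, peer_seconds: int) -> int:
    """Hypothetical helper: adopt the shorter of the two proposed lifetimes."""
    return min(local_seconds, peer_seconds)


def lifetimes_match(local_seconds: int, peer_seconds: int) -> bool:
    """True when both ends are configured with the same lifetime."""
    return local_seconds == peer_seconds


if __name__ == "__main__":
    for name, azure in [("policy based", AZURE_POLICY_BASED),
                        ("route based", AZURE_ROUTE_BASED)]:
        status = "match" if lifetimes_match(azure, AWS_LIFETIME) else "mismatch"
        print(name, status, negotiated_lifetime(azure, AWS_LIFETIME))
```

Under that assumption the route based case is the one that mismatches, and picking the lower value would settle on the shorter lifetime rather than failing the tunnel.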

[00:35:46.960] – Pierre
So yeah, it’s getting better. It becomes a question of what the use case is for connecting those two, right? You always have the option to deploy a virtual appliance or a virtual machine like you did to get the connectivity going if you find it’s too onerous to do it the other way. I just thought it was neat that you actually got it working, because I struggled to do that.

[00:36:16.410] – Ethan
The use case is IPsec, air quotes “standards”. It’s there. We should be able to do this. Right? Why do I have to drop a virtual appliance into both of my environments and pay for the cost of those and so on? If you’re not already invested in SD-WAN, let’s say, and you don’t want to do that, or you don’t want to get into a cloud as a service vendor, that kind of a relationship, then I should just be able to nail up a tunnel. I can do that. Right? You know, yeah. Maybe not, right.

[00:36:44.220] – Pierre
Or everybody that says, why don’t you just use something open source? Like, everybody supports OpenVPN.

[00:36:52.920] – Ned
OK, maybe. Still got to set up an appliance, though. Yeah.

[00:36:58.200] – Pierre
Yeah, but if your native appliance supports or is based on OpenVPN, for example, and the AWS one is also based on OpenVPN, then there shouldn’t be a problem with both of them connecting. Shouldn’t be a problem.

[00:37:15.160] – Ned
Right, assuming the same version and you’ve got the same control over configuration settings. Yeah. Oh, it’s always a mess. The other thing that I wanted to talk to you about is IPv6, because I think we’ve all noticed that the public address space for IPv4 is starting to look a little sparse.

[00:37:34.930] – Pierre
The last block was allocated, I think, if I’m not mistaken, in March 2018.

[00:37:41.950] – Ned
At least for the United States, or is it out to all the different providers, but they haven’t allocated them all?

[00:37:49.060] – Pierre
The providers have not allocated them all. But the international body that manages that has actually allocated its last block of IPs to the service providers. So there are no more to be doled out once the service providers run out.

[00:38:05.110] – Ned
Right. So IPv6 is looking a little attractive, like maybe that’s something we should move towards. What can I do with IPv6 in Azure today?

[00:38:14.870] – Pierre
OK, this is a good story and a not so good story. A good story is IPv6 is a foundational. Did I pronounce that properly?

[00:38:23.360] – Ned
A fundational?

[00:38:25.870] – Pierre
I’m, I’m French. OK, so I put the emphasis on the wrong syllable all the time.

[00:38:31.060] – Ethan
There’s fundamental and foundational. So I guess it depends which one you want.

[00:38:34.510] – Pierre
Actually it’s both.

[00:38:36.410] – Ned
Well there we go.

[00:38:37.990] – Pierre
Because the original Azure fabric was built with IPv6.

[00:38:42.580] – Ethan
Which isn’t surprising to me, just knowing Microsoft’s history with IPv6. In fact, if you’re a Packet Pushers podcast network listener, go to IPv6 Buzz. There was a whole interview with Microsoft Internal on how they did v6. And as it rolled out over the last couple of decades, Microsoft’s been a leader in v6 implementation. So yeah.

[00:39:00.820] – Pierre
So it’s part of the fabric. It’s always been there. That’s the good story. The bad story is it hasn’t been implemented in every service that’s sitting on top of the fabric, or in the tools where you can use it. For example, you can’t go into the portal and say, for that virtual network, I want to add an IPv6 range, I want to use this IP address space for this virtual network, and then tell the NICs, oh, you now have an IPv6 stack, get it from that range.

[00:39:39.520] You can’t do it in the portal. You can do it in PowerShell. You can do it with ARM templates, you can do it with Azure CLI and then they will show up in the portal. But there’s no way of doing it in the portal, which for a lot of people is where they start when they’re learning the technology.
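
As a rough illustration of the ARM template route Pierre mentions, the Python sketch below builds a dual stack virtual network resource as a dictionary and prints it as JSON. It is a sketch under assumptions, not a tested deployment: the prefixes are made-up examples (`2001:db8::/32` is the documentation range), the `apiVersion` value is an assumption to check against current Azure documentation, and the function name is hypothetical.

```python
import ipaddress
import json

# Hypothetical example prefixes; 2001:db8::/32 is reserved for documentation.
IPV4_SPACE = "10.0.0.0/16"
IPV6_SPACE = "2001:db8:1234::/48"


def dual_stack_vnet(name: str, location: str) -> dict:
    """Build an ARM-style virtual network resource with IPv4 and IPv6 ranges."""
    v4 = ipaddress.ip_network(IPV4_SPACE)
    v6 = ipaddress.ip_network(IPV6_SPACE)
    # Use the first /24 and the first /64 as the subnet's two prefixes.
    subnet_v4 = next(v4.subnets(new_prefix=24))
    subnet_v6 = next(v6.subnets(new_prefix=64))
    return {
        "type": "Microsoft.Network/virtualNetworks",
        "apiVersion": "2020-11-01",  # assumed; verify the current API version
        "name": name,
        "location": location,
        "properties": {
            "addressSpace": {"addressPrefixes": [str(v4), str(v6)]},
            "subnets": [{
                "name": "default",
                "properties": {
                    "addressPrefixes": [str(subnet_v4), str(subnet_v6)],
                },
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(dual_stack_vnet("example-vnet", "eastus"), indent=2))
```

The point of the sketch is only that the IPv6 range sits next to the IPv4 range in the same address space list, which is the part the portal did not expose at the time of the conversation.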

Very few are actually going to say, oh, I want to learn all about Azure, but I’m never going to look at the portal. I’m just going to dive into PowerShell, Azure CLI and REST APIs. That is like a poke in the eye with a needle. I’m just being facetious here, because this is the way I learn. I do it once in the portal so I can understand how things fit together, and then I go, hmmm. All right. So if I have to do this again, now I know the order in which things need to happen. Now I’ve seen it happen. I see it running.

[00:40:31.300] I have a reference architecture I can go to when I write my PowerShell, CLI, ARM template, or Bicep now, and then I figure out how to automate it so I don’t have to sit in front of it and click fifty times on some progress bars and radio buttons and checkboxes.

And I think it’s the way most people learn.

[00:40:55.900] – Ned
That’s certainly the way that I go about it. Whenever I’m deploying a new thing, I do it in the portal first so I can see the architecture. And like I said, you can also get it to render an ARM template for you, so you can kind of see what the underlying values are that it uses and then go and script it out using your tool of choice.

[00:41:11.950] – Pierre
Except that I find that the export function of any resource in ARM, I think we have some work to do on that. It puts out a lot of stuff that you basically end up having to massage, for lack of a better word.

[00:41:29.740] – Ned
It’s an overabundance of detail and specificity and you need to carve that stuff out and just take, ok these are the actual settings I need to put in and Azure will figure out the rest.

[00:41:39.760] – Pierre
Exactly, since.

[00:41:41.170] – Ned
So IPv6, I’m assuming it’s supported inside of VNets.

And then there’s some of the other services that support IPv6. Are all the core ones supported, like Azure VMs, storage, maybe some of the database services?

[00:41:58.870] – Pierre
Don’t quote me on that, but I believe Storage, Azure SQL, VNets, VMs. And there’s a couple of others that I know for sure are fully implemented, and you can use them now, except, like I said, you need to use PowerShell, CLI or an ARM template to enable it. But considering we are getting to that point where, three years ago, we said get off IPv4, get off IPv4, the sky is falling, the sky is falling. And then nobody did and the sky didn’t fall.

[00:42:34.770] – Ethan
Yeah, yeah. And there’s a lot of things going on there. Part of it’s gray market IPv4, where people have found out, I have way more than I needed. It turns out they’re super valuable. I’d like to sell them. So that’s the thing that’s going on. If I have my own IPv6, though, that’s been allocated to me, provider independent address space, it’s quote unquote my v6. Can I bring that to my Azure environment?

[00:42:55.850] – Pierre
Oh, that’s a very good question and that’s a question I do not have an answer to. So if you don’t mind, I’m going to write that down.

[00:43:03.260] – Ethan
I think in some environments you can and in some you can’t. You’re just going to get it. The cloud providers are going to carve off v6 for you because they’ve got tons of it and it’s not a big deal. But still, there are those shops that are going to want to maintain their own if they can.

[00:43:15.710] – Pierre
I saw not too long ago, actually, an analogy of IPv6, where, for example, in an Azure virtual network, when you allocate an address space, it’s always a slash sixty four, and a slash sixty four has enough addresses, and the calculation works, so that every person on earth would have like a billion devices that they could each address individually without running out.
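
The arithmetic behind that analogy is easy to check with Python’s standard `ipaddress` module. The world population figure below is a rough assumption added for illustration, not a number from the episode, and the prefix is a documentation range.

```python
import ipaddress

WORLD_POPULATION = 8_000_000_000  # rough assumption, not from the episode

subnet = ipaddress.ip_network("2001:db8::/64")  # documentation prefix
total_addresses = subnet.num_addresses           # 2**64 addresses in a /64
per_person = total_addresses // WORLD_POPULATION

print(f"addresses in a /64: {total_addresses:,}")
print(f"addresses per person: {per_person:,}")
```

With that population estimate, a single slash sixty four works out to a couple of billion addresses per person, which is in line with the "billion devices each" figure.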

[00:43:45.300] – Ethan
Yeah, I don’t know what the ratio is, but it’s something insane like that. Network engineers get hung up on that, where it’s like, I don’t want to assign a slash sixty four. It seems like such a waste of addresses. Yeah. But you’ve got so much to play with, it doesn’t matter. Think about subnetting, keeping yourself sane as you’re trying to manage your address plan, et cetera. Even if that means putting a slash sixty four on a point to point link between routers, and people really get hung up on that one. But that is best practice, as I understand it. Slash sixty four on links is fine. Yeah.

[00:44:15.600] – Pierre
Yeah. And we’re far away from the days where I had to calculate variable subnet masks for, like, the heartbeat in between network clusters.

So that if I knew I had five nodes in my cluster, I would have a subnet that had no more than, let’s say, twelve addresses, because I don’t want anybody else to connect to that. So I want to control... like, gone are those days.

[00:44:44.240] – Ned
I’m pretty sure that was on the CCNA or something, where you have this number of devices on the network and what size subnet mask would you use, and you had to calculate it in your head. I’m so glad I don’t. That’s why they invented subnet calculators online.
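
The exam question Ned describes reduces to a short calculation. Below is a minimal Python sketch, assuming the classic IPv4 rule of two reserved addresses per subnet (network and broadcast); cloud platforms may reserve more, so treat the result as a lower bound there. The function names are hypothetical.

```python
import ipaddress


def smallest_prefix_for_hosts(hosts: int) -> int:
    """Return the longest IPv4 prefix length that still fits `hosts` devices."""
    if hosts < 1:
        raise ValueError("need at least one host")
    needed = hosts + 2  # add the network and broadcast addresses
    host_bits = (needed - 1).bit_length()  # smallest n with 2**n >= needed
    if host_bits > 32:
        raise ValueError("too many hosts for an IPv4 subnet")
    return 32 - host_bits


def netmask_for_hosts(hosts: int) -> str:
    """Return the dotted-decimal mask for the prefix chosen above."""
    prefix = smallest_prefix_for_hosts(hosts)
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix}").netmask)


if __name__ == "__main__":
    for count in (5, 12, 254):
        print(count, smallest_prefix_for_hosts(count), netmask_for_hosts(count))
```

For the five node cluster from the conversation, this picks a slash twenty nine, which leaves six usable addresses.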

[00:45:00.200] – Pierre
Yeah, no. And it gets even worse when you say you have multiple subnets that each have blah blah blah, and you want to know, like, what’s the supernet address and the variable subnet mask you’re going to use. It’s the same address space, but.
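
The supernet question can be answered with `ipaddress.collapse_addresses` from the Python standard library. The four subnets below are made-up examples, and the helper name is hypothetical.

```python
import ipaddress


def summarize(cidrs):
    """Collapse a list of CIDR strings into the fewest covering networks."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    # Four contiguous /24 subnets (hypothetical example values).
    example = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
    for supernet in summarize(example):
        print(supernet, ipaddress.ip_network(supernet).netmask)
```

Four contiguous slash twenty fours that start on the right boundary collapse into one slash twenty two; subnets that are not contiguous simply stay separate in the result.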

[00:45:15.800] – Ethan
Just convert it to binary it’s easy. Come on, you guys.

[00:45:19.950] – Ned
Convert to binary? Get out, you get out.

[00:45:23.300] – Ethan
That’s what I used to do. I would sit and convert the octets from decimal to binary and do the math that way. Then I couldn’t get it wrong.
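
Ethan’s by-hand method, written out as a small Python sketch: convert each octet to binary, then AND the address with the mask to get the network address. The address and mask are arbitrary example values and the function names are hypothetical.

```python
def to_binary(dotted: str) -> str:
    """Render a dotted-decimal IPv4 value as four 8-bit binary octets."""
    return ".".join(f"{int(octet):08b}" for octet in dotted.split("."))


def network_address(address: str, mask: str) -> str:
    """Bitwise AND each address octet with the matching mask octet."""
    pairs = zip(address.split("."), mask.split("."))
    return ".".join(str(int(a) & int(m)) for a, m in pairs)


if __name__ == "__main__":
    address, mask = "192.168.10.77", "255.255.255.240"
    print(to_binary(address))
    print(to_binary(mask))
    print(network_address(address, mask))
```

Lining up the two binary strings makes it visible which bits the mask keeps, which is why the method is hard to get wrong.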

[00:45:32.310] – Pierre
What button do I use to mute him?

[00:45:36.440] – Ned
Oh he’s the host. There’s nothing we could do.

[00:45:38.270] – Pierre
Oh God. OK now. Yeah, but you’re right, it’s different. And so with IPv6, the good part of the story is it’s built into the foundation, it’s available to a lot of services, but not all the services. And it’s slowly coming and slowly being kind of revived. In the original version, which I think was called Red Dog, the internal project name of Azure.

[00:46:08.610] – Ned
OK, that might have been it.

[00:46:11.250] – Pierre
Which at the time was only PaaS, no IaaS, so virtual network was not a thing. So it really didn’t care, because you didn’t get to pick your network. You were just running a PaaS service and the network was completely obfuscated for you. But once we started in the IaaS environment, and people started setting up their own virtual networks and connecting with on-prem and doing the thing, and considering that everything that’s internal is all private IP addresses from, like, private class Bs and class Cs, and you can never run out of private, because you put a class B in your subnets in your virtual network and it’s like you’ve got a metric ton of... I’m looking to see. Yeah.

[00:47:02.380] – Ethan
If you’re listening to this, Pierre just bleeped himself. We didn’t even have to do that in the edit.

[00:47:06.540] – Ned
That was impressive.

[00:47:08.820] – Pierre
But you have enough addresses to do everything, and then you really have to be conscious of what you’re allowing out and what you’re allowing in. And for that, we have pools of addresses that you can use. So IPv6 hasn’t been this burning demand from our customers, because we’re at the point where IPv4 is still doing the job and doing it well.

It’s going to become a problem now because of all of these IoT devices that are IPv6 based. And now there is... what’s the word I’m looking for? We’re almost at the tipping point where, oh, now we have to get this done, because there’s a pent-up demand. Because really, Azure is a business. So we have to choose whether we put our engineering resources into setting up the GUI for IPv6 or into building the next service that is in demand.

[00:48:10.440] – Ethan
I see it as a global market problem here, where it’s not as much that IPv4 can or can’t do the job as there are some markets that are emerging that will only be IPv6. And so if you want to connect to those, then you need to offer your services on v6.

[00:48:28.260] – Pierre
And currently in Microsoft, if you’re building a new infrastructure or even an old infrastructure because you can add that in without any problems is running a dual stack.

[00:48:38.970] Yeah, and virtual network and IaaS is completely capable of doing that. So all your VMs... the thing I would say too is, with IPv6, there’s no such thing as private IPv6 or public IPv6. Right, it’s IPv6. So when you start allocating IPv6 addresses to your machines, now you have to think about that. You don’t have that little question mark like when you were deploying that machine originally, where you’d say, do you want to allow a public IP address for this machine?

[00:49:12.420] And you go ooh. No, I don’t. Cause this is going to be internal. Now, they’re all IPv6 or they’re all going to have IPv6, which means they’re all internal and external.

So now how do you stop that? How do you do this? Now we get back into the firewall and we get back into that load balancing discussion, where you segregate all traffic through a controlled end point.

[00:49:40.710] – Ethan
If every device has a globally routable IPv6 address, then, right, your firewall policies become quite crucial.
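
To make the contrast concrete, here is a small Python sketch using the standard `ipaddress` module to check whether an address is globally routable. The two addresses are arbitrary examples: a typical private IPv4 address of the kind used inside a virtual network, and a well-known public IPv6 resolver address picked only for illustration. The function name is hypothetical.

```python
import ipaddress


def routing_scope(address: str) -> str:
    """Classify an address as globally routable or not."""
    ip = ipaddress.ip_address(address)
    return "global" if ip.is_global else "not globally routable"


if __name__ == "__main__":
    for example in ("10.0.0.4", "2606:4700:4700::1111"):
        print(example, routing_scope(example))
```

A private IPv4 address is unreachable from the internet by construction, while a global IPv6 address depends entirely on firewall policy and routing to keep it internal.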

[00:49:49.770] – Pierre
Exactly. And your routing, so that everything goes out not to the default router, but to the default appliance that’s going out.

[00:49:58.560] – Ned
All right. Well, this has been a far reaching and very invigorating conversation, Pierre. If folks want to know more about you or they want some follow up tips for getting into some of the topics we talked about, where should they go? Where should they look?

[00:50:14.850] – Pierre
Number one, docs dot com, or docs at Microsoft, is basically where all of our documentation is. So if you go in there, there’s a section for networking, and in it is everything I’ve talked about; there’s an article for it. If you have problems finding it, you can reach me @wiredcanuck, which is my Twitter address. We also have a Discord server that’s set up for the community that’s completely open with a permanent invite.

[00:50:49.610] I’ll send you the address for that too, and you can put it in the description. We have our blog on ITOpsTalk dot com and our YouTube videos on YouTube dot com slash ITOpsTalk. You can leave comments, you can connect with us through all of these, and I can help you and point you in the right direction. My DMs are open.

[00:51:13.890] – Ned
Awesome, thank you so much, Pierre Roman, for joining us today on Day Two Cloud. And hey, listener out there, virtual high fives to you for tuning in. If you’ve got suggestions for future shows, we’d love to hear them. You can hit either of us up on Twitter at Day Two Cloud show. Or you can fill out the form on my fancy website, Ned in the cloud dot com. Did you know that Packet Pushers has a weekly newsletter? It’s called Human Infrastructure Magazine. You’re the human, and it is loaded with the best stuff that we found on the Internet, plus our own feature articles and commentary.

[00:51:46.350] It is free and it doesn’t suck. So that’s good. You can get the next issue via Packet Pushers dot net slash newsletter. Until next time, just remember, cloud is what happens while IT is making other plans.
