Day Two Cloud 085: Hosting Your Infrastructure Code In The Cloud

Let’s say you’re an IT infrastructure professional with increasing cloud responsibilities. You’re doing as much automation as you can, which is actually quite a lot. In fact, you’ve got an army of scripts, playbooks, and plans you’ve been running from a Linux VM sitting on an old VMware server in the corner. Which works. But you’ve been wondering…

Is there a better way to host this code in the cloud era? Or would we be confusing “better” with merely “different”?

Here to help us sort out hosting code in the cloud is Calvin Hendryx-Parker. Calvin is the co-founder and CTO of Six Feet Up, a Python web application development company.

We discuss:

  • Infrastructure as Code (IaC)
  • Reasons for moving infrastructure code and scripts to the cloud
  • Getting your code ready to host in the cloud
  • Serverless and IaC
  • Hosting options
  • Security and access control issues
  • Cost concerns
  • More

Show Notes:

@calvinhp – Calvin Hendryx-Parker on Twitter

Terraform Cloud – Terraform

Python on Google Cloud – Google

Get started with Azure PowerShell – Microsoft

Python Web Conference – March 22 – 26



[00:00:06.000] – Ethan
Welcome to Day Two Cloud, and I hope you brought your swim fins, because we’re going to the deep end of the pool right away when the show starts up.

[00:00:14.460] I’ll tell you what we’ve got: Calvin Hendryx-Parker is joining us today.

[00:00:18.240] And Calvin knows a lot about a lot.

[00:00:21.390] And we probe his mind on dealing with infrastructure as code, but from the standpoint of how you actually manage the code that deals with the infrastructure as code. Do you host it in the cloud? And then what does that look like? And this isn’t... I feel like Luke Skywalker, Ned, in the final trilogy there.

[00:00:41.940] At some point he says this is not going to go the way you think it’s going to go. That’s how I felt this whole conversation.

[00:00:49.860] – Ned
Oh, geez. I feel at least partly responsible for that, because, like you said, within like ten minutes, maybe five minutes, of the conversation, we already, like, dove down into the deep end. Calvin and I are getting into it with details around Terraform and using Lambda to deploy stuff. And you’re like, whoa, whoa, whoa.

[00:01:07.840] Back the truck up a little bit, buddy.

[00:01:10.290] – Ethan
But it was spectacular. I absolutely loved it. And actually, we’re hoping to have Calvin back for more shows. And as soon as you listen to him explain things and give you his perspective, you’re going to understand why. So enjoy this conversation with Calvin Hendryx-Parker. Calvin is the co-founder and CTO of Six Feet Up. It’s a Python web application development company. And great stuff coming at you. Well, Calvin, welcome to Day Two Cloud. And okay, man, so I hit you up kind of cold on this conversation, where the whole idea was, I want to know about hosting my infrastructure as code management programming material up in the cloud.

[00:01:51.260] Is that a thing that you should do? And I came across your name, and I said, hey, Calvin, here’s an idea that I have. And you said, yes, I’d love to come on Day Two Cloud and have this conversation with you. So why were you so enthusiastic, and what is your background? What do you do that made programming and hosting your code somewhere other than on an Ubuntu server sitting in your data center interesting to you?

[00:02:15.080] – Calvin
So this really does hit home with a lot of areas that we deal with on a day-to-day basis. So I’m co-founder and CTO of a company, and we do Python and cloud consulting. And so I run into so many people who are wanting to just, like, lift and shift their code from maybe something internal, sitting on a VM or bare metal, into a cloud someplace. It’s much better if we can talk about how they can do that in a cloud native way. And so I’ve gotten really, really passionate over the past couple of years about how to get people to think more cloud natively about these kinds of things.

[00:02:47.720] Serverless, containers, and leveraging all the cloud native tools. Now, I am heavily involved with, like, the AWS community; I actually run the local user group here in Indianapolis. And so it’s definitely a subject I have a lot of passion for and I care deeply about. And I want to see people make the right kinds of decisions.

[00:03:06.450] – Ethan
OK, so when you say lift and shift and moving things to the cloud, this is a context here for me personally where it’s like, OK, when I have my scripts living on a server somewhere, I feel in control. It works. It’s nearby. It’s right there. It’s a thing I crafted. Look, it’s a file, it sits there, and then I invoke the interpreter and I run it, if it’s a Python script, let’s say. But what you just said about cloud native and moving it, you’re making it sound like there really are good reasons that I should consider moving this stuff to the cloud.

[00:03:37.980] It’s not just something the cool kids are doing. We have reasons. Is that fair? To put it that way?

[00:03:43.620] – Calvin
I’d say there are definitely reasons. I used to really be into finely crafted bespoke servers that I felt like I grew from the ground up, but those turn into giant organic messes that will not allow your code to run in any place other than that one server. And it’s kind of the pets versus cattle discussion. Like, if I’ve got this server I care a lot about, and stuff’s been installed by hand over time, how will you ever get that same code to run in another spot?

[00:04:14.410] If you’ve got dependencies that need specific versions and you’re not tracking all that, and you’ve just been fiddling with the code here and there because of, you know, your current whim for what you want that code to do, you’re going to be in trouble when you need to do something else with it, or if that server somehow goes away and doesn’t exist anymore on the face of this planet. How do you take that to the next step? How do you actually provision your code in a new place when you’ve got to put it in a new region or a new data center or someplace else?

[00:04:44.020] So there’s more than just, like, the cloud specifically. There’s all the other practices that kind of lead up to: how do I make sure my code is ready to deploy into the cloud? How do I adhere to those best practices that are not going to leave me stranded? The person you need to think about the most is yourself six months to 18 months from now. You really want to be nice to that person. And how do you be nice to that person, other than making sure you’re following those practices and getting things crafted in a way that you kind of just don’t think about it?

[00:05:15.220] Like, I can just put my code up there. It can run. I think actually the cloud providers are even making this simpler, because you’re kind of talking about a couple of different kinds of code. Like, you mentioned scripts. You’ve got some kind of things that provision and do some work for you, and kind of one-off type things. I mean, the new cloud shells. Well, Google’s is not new, but Amazon’s is relatively very new.

[00:05:35.950] And they’re doing some really interesting things. They’re almost free for that kind of operation. But if you really want to turn that process into something that’s even more reproducible, you’re packaging it as a container to be able to deploy. Well, now you can deploy containers as serverless, which means that you’re not even managing servers anymore. I don’t worry about security patches. I don’t worry about how do I get this onto a server? How do I get that server punched through a firewall someplace?

[00:06:01.180] I just know that I need to get this script, or this thing, packaged into an image. It can now run in any place where a container can run.

[00:06:09.610] – Ned
Hmm, you know, when I think of containers, I think of developers and not infrastructure folks like me, like IT ops people. I thought of containers as the technology for developers when they need to run their job applications stuff. That’s not something I need to worry about.

[00:06:24.880] But what you’re saying makes a lot of sense to me, because I think about how much work it takes for me to set up a Python environment, or to even set up an environment to run Terraform or the Azure CLI. And what version am I on? Am I on the most up to date version? And it sounds like you’re advocating for a scenario where you have a container that gets updated on a regular cadence. Is that something you’d want to manage yourself or is that something you’re comfortable letting the cloud vendors handle the management of that container?

[00:06:58.480] – Calvin
I mean, ultimately, what’s nice about containers is that, whether I’m a developer or an infrastructure person, I am developing the script that’s going to go in that container. If you think about it, like, if I just wanted to develop a cron job that ran once a day and is going to be a batch job, if I can package that into an image and deploy that to any number of cloud providers, or even deploy it serverless, where I’m not dealing with a server, there’s a huge benefit there. And I can actually now run, like, a full continuous integration environment where I can test and run it in some environment where it’s very easy to deploy, you know, put it through a pipeline.

[00:07:34.090] And now, instead of running a cron job on an EC2 server that I’ve got to manage, I can simply run that as a Lambda that just runs once a day, or on whatever schedule I use as the event that drives that Lambda, and it’s basically free. To get down into that kind of cost stuff, too: I’m not paying to run a server 24/7 when it’s running a script once a day.
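
As a rough illustration of the cron-to-Lambda move Calvin describes, here is a minimal sketch of a Python handler meant to be driven by a scheduled event. The function name `run_daily_job` and its return value are hypothetical stand-ins for whatever the old cron script did; the only parts taken from the platform are the `handler(event, context)` calling convention and the ISO 8601 `time` field that scheduled events carry.

```python
import json
from datetime import datetime, timezone


def run_daily_job(now):
    """Hypothetical placeholder for the work the old cron job did."""
    return {"ran_at": now.isoformat(), "status": "ok"}


def handler(event, context):
    """Entry point the Lambda runtime calls for each scheduled event."""
    # Scheduled events include a "time" field such as "2021-02-10T06:00:00Z".
    # Fall back to the current time so the function can also be run by hand.
    raw_time = event.get("time")
    if raw_time:
        now = datetime.fromisoformat(raw_time.replace("Z", "+00:00"))
    else:
        now = datetime.now(timezone.utc)
    result = run_daily_job(now)
    print(json.dumps(result))  # stdout ends up in the function's logs
    return result
```

Because the handler is an ordinary function, it can be exercised locally with a fake event, for example `handler({"time": "2021-02-10T06:00:00Z"}, None)`, before it ever goes through a pipeline.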

[00:07:57.970] – Ned
Right. That is the sort of situation where serverless makes a lot of sense, because I’ve heard from various people that if you’re running a pretty constant workload that doesn’t vary a lot,

[00:08:10.390] but you’re always running a certain amount of CPU, then going serverless might not make sense financially, because serverless is really meant for that more bursty type of business, or something that runs periodically but doesn’t need that constant compute presence. Is that your experience, or...?
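
The trade-off Ned raises can be sketched as simple arithmetic. The rates below are made-up placeholders chosen only to show the shape of the comparison; they are not real prices from any provider, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # common approximation of one month


def always_on_monthly_cost(hourly_rate):
    """Cost of a server that runs all month, whether busy or idle."""
    return hourly_rate * HOURS_PER_MONTH


def serverless_monthly_cost(invocations, seconds_each, memory_gb, rate_per_gb_second):
    """Cost when you pay only for memory-seconds actually consumed."""
    return invocations * seconds_each * memory_gb * rate_per_gb_second


# Illustrative placeholder rates, NOT real prices.
SERVER_HOURLY_RATE = 0.02
RATE_PER_GB_SECOND = 0.00002

server = always_on_monthly_cost(SERVER_HOURLY_RATE)

# A one-minute script that runs once a day on 0.5 GB of memory.
daily_script = serverless_monthly_cost(30, 60, 0.5, RATE_PER_GB_SECOND)

# The same memory footprint kept busy for the entire month.
constant_load = serverless_monthly_cost(1, HOURS_PER_MONTH * 3600, 0.5, RATE_PER_GB_SECOND)

print(f"always-on server:         {server:.2f}")
print(f"serverless, daily script: {daily_script:.2f}")
print(f"serverless, constant load: {constant_load:.2f}")
```

With these placeholder numbers the once-a-day script costs far less on a pay-per-use basis, while the constant workload costs more than the always-on server. Where the crossover sits depends entirely on the real rates and the real workload.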

[00:08:27.310] – Calvin
It depends. I mean, with everything there comes that “it depends” kind of clause, but I think that it can make sense even for workloads that have fairly consistent traffic. But it really makes the most sense when you definitely have that spiky traffic, where you’ve got something that you need to scale out horizontally at a moment’s notice. With things like Lambda on AWS, you don’t even think about it.

[00:08:51.280] There are obviously some limits, which you can get raised, and you kind of figure out what size your box is for when you deploy those kinds of things. But you also don’t want to deploy a Lambda that uses the new maximum sizes that are available, like the fifteen minute or better runtime. Plus, I think the new sizes are 16 gig of RAM; like, they’re getting some pretty big Lambda sizes that are available, but those come with a pretty hefty price tag.

[00:09:17.020] So you could definitely run yourself into a cost problem real quick if you weren’t paying attention to, like, sizing your instance of Lambda correctly to run whatever your task is. And maybe you’re better suited to run in something more like Fargate or some kind of a serverless container environment like Kubernetes. I’m not a huge fan of Kubernetes necessarily. I really like the Fargate aspect of less for me to think about. Amazon does more of that work for me, which kind of keeps it in the serverless mindset.

[00:09:45.810] I think those make more sense for long-running, very consistent workloads. And now those scale with autoscaling groups as well. So you do have that flexibility. But I think you’ve got to take into consideration all the various metrics and dimensions that your workload has as to where it’s going to fit best. Now, the one nice thing is, if you want to run in either Lambda or Fargate or Kubernetes or any of these technologies, containers and images are all compatible there.

[00:10:13.600] As long as you run in some kind of an image-compatible environment, you can package yourself up locally, develop and use Docker and Docker Compose locally, and then deploy someplace where there’s no Docker. They’re using Kubernetes, which doesn’t have the Docker runtime anymore, or using Fargate, which doesn’t necessarily need Docker. It just runs the packaged-up container, the image in a container. They’ve made so many mistakes in how they named container registries, because they’re technically registries of images.

[00:10:44.150] – Ned
Sorry, it should be image registries, right?

[00:10:47.350] – Calvin
It should be an image registry, technically.

[00:10:49.790] – Ned
Yes. Yeah. Oh, there’s a lot to unpack there. If I remember correctly, and just for the listeners and also for my mental remembrance: Fargate is just running containers as a service, but you don’t need to manage any sort of container orchestrator on the back end, whereas Kubernetes is also containers, or pods, but you actually have to interact with the Kubernetes orchestrator.

[00:11:17.740] Is that sort of the difference between Fargate and Kubernetes?

[00:11:21.960] – Calvin
I mean, Fargate does do task orchestration for you. I feel like you’re less involved; it’s a little more hands off. Kubernetes has a lot more flexibility, I mean, a lot more rope to hang yourself with when it comes to sophistication and customizability. Like, you can tell it about different kinds of machines and kinds of containers that are available for it to run the workloads in, where Fargate is kind of a little more like, oh, I’ve got a task

[00:11:48.820] and this container, about this size, make it go. And then what I like about how we use Fargate: we develop applications, we launch them into Fargate, but we’ll also use right alongside it, you know, the other AWS tools and services: RDS for a database, or ElastiCache for Redis, and CloudFront for a CDN. We still use an ALB for a load balancer in front of it. And so it all works together, being kind of Amazon tools, but what I also like about it is I’m not locked into specifically using Amazon’s orchestration or deployment tools.

[00:12:24.070] For example, we use Terraform to deploy all these pieces. And so it’s well supported in Terraform to define your tasks, define the sizes, define the scaling properties. And then I just push that into our pipeline, and then the pipeline, the CodeBuild pipeline, does all the fiddling of the bits for us whenever I deploy that.

[00:12:44.800] – Ethan
I love that we jumped into the deep end of the pool like this. It’s like, I just got there, and it’s ten feet to the bottom.

[00:12:52.210] I’m going in. Here we go.

[00:12:54.110] – Calvin
When you’re talking to a developer who’s been an infrastructure person, or an infrastructure person who is a developer... I’m not sure which direction that goes.

[00:13:01.450] – Ethan
Well, in a lot of ways, everybody in infrastructure is becoming a developer at some point or another. And for me, I started way back in the day as a developer. And, you know, my career’s mostly been in infrastructure, so it feels like it’s all coming back around again. Ned, as you said a minute ago, there’s so much to unpack from what we’ve been talking about here in the deep end of the pool, right at the start of the show. One thing you mentioned, Calvin, was Lambda. If I’m going to use Lambda to execute a task for me, there’s time limits and there’s resource limits for the function that I am running. Let’s say my function, the thing I’m trying to do, is something tied to infrastructure as code.

[00:13:43.330] It’s provisioning. There’s a lot of async communications that can happen there. When is Lambda a good fit for that sort of a task, and when is it not?

[00:13:55.180] – Calvin
There’s a lot of options when it comes to using serverless and Lambdas. There’s Step Functions and lots of kinds of state machines you can actually piece together. There were some really good talks this year at re:Invent about those specifically, from some pretty big names. Like, I sat through one from the Lego Group, and they talked about how they use all these serverless bits together like that.

[00:14:16.150] And when it comes to infrastructure provisioning, if you’ve got things that are asynchronous, I mean, you’re able to throw those up, kick them off as a quick Lambda that fires and kind of forgets and then throws out into queues. And then you’ve got fan-out functions and step functions that pick up from there and can do the little bits of work, all, you know, either in their own series or in parallel. You get a lot of options to speed up your process, because maybe you were doing something in series before, like all serially, like do this, do this, this, this.

[00:14:47.650] Now you can say, well, there aren’t any real dependencies between those things all going on simultaneously. I can now kick it off and actually have them all fire simultaneously and do them in parallel, so you can actually speed up the processing. And we’ve had customers speed up: they had a process that ran on a bare metal server and took four and a half days to do a machine learning model build. Now, using serverless, we were actually able to get that down to about 90 seconds, which is a heavily parallel operation.

[00:15:14.330] But because we could launch hundreds or thousands of Lambdas simultaneously, the total cost of running the operation greatly decreased, because there were no bottlenecks standing in the way. We weren’t just waiting on idle operations; we were fully maximizing the CPU throughput of the operation, kind of removing any blocking processes.
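
The serial-versus-fan-out idea can be sketched on a single machine. This is only a local analogy: a thread pool stands in for many Lambdas firing at once, and `process_chunk` is a hypothetical unit of work with no dependencies on the other chunks.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk_id):
    """Hypothetical independent unit of work (stand-in for one function call)."""
    time.sleep(0.05)  # simulate I/O or compute time
    return chunk_id * chunk_id


def run_serial(chunk_ids):
    """Do this, then this, then this: total time grows with the number of chunks."""
    return [process_chunk(i) for i in chunk_ids]


def run_fanned_out(chunk_ids, workers=8):
    """Fire independent chunks at the same time; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunk_ids))


if __name__ == "__main__":
    ids = list(range(8))
    start = time.perf_counter()
    serial = run_serial(ids)
    middle = time.perf_counter()
    fanned = run_fanned_out(ids)
    end = time.perf_counter()
    print(f"serial: {middle - start:.2f}s, fanned out: {end - middle:.2f}s")
```

Both versions produce the same results; only the wall-clock time changes, and only because the chunks really are independent of one another.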

[00:15:35.020] – Ethan
Well, it’s interesting, because this comes back to that whole lift and shift problem versus cloud native. If you lift and shift, you’re probably bottlenecking yourself. You’re not thinking in a cloudy way. And what you just described is if I’ve written a script that is sequential: it does a thing, and then it does a thing, and it does a thing.

[00:15:52.270] – Calvin
Right. And you can fit that better to the various profiles of your Lambdas. Like, you may not need to spin up a Lambda with a ton of memory to do a very quick, like, kick-off script; you can use the smallest size Lambda for that, obviously the least expensive Lambda. I think a lot of people think they’re going to lift and shift into the cloud, but maybe they don’t think this through. They see the cloud as expensive because the only way they’ve ever thought about the cloud is: if I lifted what I have in my data center and put it in the cloud, it’s going to cost more.

[00:16:18.250] And that’s absolutely true. It is going to cost more, because you’re not thinking about how do I actually take advantage of streamlining, you know, really using exactly the resources I need and no more, and being able to elastically scale those resources up and down as I need. So it’s going to require some thought process change. Some applications can be deployed in a more cloud native way. Some applications may need to be refactored to support some of this operation.

[00:16:46.150] So that’s why it takes a little more thought to go to a fully cloud native way of thinking and deploying your code. But you’ll benefit greatly from it. I mean, with the speed with which you can release new features into a cloud native type of a deployment, as opposed to the traditional monolith application running on bare metal, you go from weekend conference bridge release sessions, parties, to “we release hundreds of times a day” in some cases.

[00:17:17.370] – Ethan
But it is a change in how you think about provisioning and how you do tasks. Your skill as a developer needs to go up here, because you’re rethinking your procedures, your processes. If you’re the kind of person that maps how you think into your code, maybe you need to make some of that go away and learn from a developer’s perspective how to actually create these tasks and have the computer do the work for you.

[00:17:48.130] – Calvin

[00:17:48.730] I also claim your developer happiness will go up, along with your marketability as a developer or an infrastructure person.

[00:17:58.780] You’re going to generally be happier and more marketable, more valuable to the company.

[00:18:02.890] Now, the company will be happy because in the end they should save money. They should have a happy developer. They should be paying less for infrastructure, a lot less for infrastructure, if they go cloud native and are able to release and get features out to their customers at a more rapid pace. They also should, hopefully, improve the code quality.

[00:18:21.850] I mean, anytime you get a chance to refactor and go in and look deeper at your code, you’re able to put in unit tests, functional tests, end-to-end tests, and run those in the CI environment, so that every time any developer commits code, you now can get a red or green: is the code going to pass? You know, hopefully eliminating defects reaching production.

[00:18:46.850] – Ned
One of the things that might help is taking a more declarative approach, because I know, like, when I first started learning infrastructure as code, the first thing I encountered was CloudFormation, not Terraform, though everybody would think that it was Terraform first. CloudFormation was my first entree into infrastructure as code, and it made me really rethink the way I was provisioning infrastructure, because before that I would write a PowerShell script that would, OK, go do this and then do this and then do this. And it was very sequential and very imperative. And then when I moved to declarative, it was like, OK, you don’t have to worry about how it’s done. Yeah. You just have to tell it what you want.

[00:19:26.600] Is that sort of the best way to go, to just find a declarative model that puts the onus on something else to figure out how to chop up the dependency graph and how to parallelize everything? Or is that something you need to go through mentally yourself for deployment?
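
To make “chop up the dependency graph and parallelize everything” concrete, here is a small sketch using Python’s standard `graphlib` module (Python 3.9+). The resource names are hypothetical, and this is only an illustration of the general idea, not a description of how any particular tool works internally.

```python
from graphlib import TopologicalSorter

# Hypothetical resources: each key maps to the set of resources it depends on.
dependencies = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "database": {"subnet", "security_group"},
    "app_service": {"subnet", "security_group", "database"},
}


def plan_waves(deps):
    """Group resources into waves; everything in one wave can be built in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all dependencies already satisfied
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as built, unlocking its dependents
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(plan_waves(dependencies), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Here the subnet and security group land in the same wave because neither depends on the other, which is exactly the kind of ordering an imperative script forces you to work out by hand.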

[00:19:42.830] – Calvin
So I kind of started similar to you. I mean, not CloudFormation. But for our infrastructure as code operations, we were doing SaltStack. I’m telling you, there was the cloud library built into it. But it’s not so much that you declare what you want the end result to be, kind of like you do with Terraform. There were still kind of sequential operations, like this thing depends on this thing, depends on this thing.

[00:20:07.370] That depends on this thing. And these have to run in the right order or stop and error out. And you don’t end up with a kind of a half built infrastructure unless you put in place all the bits to roll it back. The big issue we had with that is it’s very fragile. You have to maintain your own calls, all those API calls. I mean, with AWS and a lot of the other cloud providers, it’s like walking on quicksand, how fast they can change their underlying APIs.

[00:20:33.890] And they’re obviously all the time rolling out new infrastructure and new services and new three-letter acronyms that you can wondrously deploy into their clouds. And if you’ve just got a PowerShell script that sat for six months, it’s already out of date. Good luck if it runs. I’m a Python guy, and the same thing’s true for Python and, like, the Boto APIs. Boto’s nice and it does wrap a lot of the stuff out of the way, but it still moves forward at a very rapid pace.

[00:21:01.850] Whereas if I model something in, say, Terraform, where I say declaratively I want these infrastructure components, and I want you to go compare what’s out there with what I’ve got, and you figure out the API calls to make all that magic happen, what I’ve developed, I feel, is a little less fragile, because I’m not making the calls directly. I’m relying on that tool to do the work for me. I kind of model the world I want to live in.

[00:21:25.910] And then I just have to make sure I keep that model up to date with what Terraform does. And Terraform itself moves a little slower.

[00:21:32.330] But Terraform does a really good job of keeping up with all the enhancements to each of the cloud providers. And I think they do that... they’ve kind of siloed off each provider inside of Terraform, with each of the writers kind of responsible for theirs, to make it easier for them to keep up to date with all the moving changes. So you kind of get that abstraction, a nice abstraction between what I want my cloud provider to build and what’s going to do the building in the cloud provider.
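
The “compare what’s out there with what I’ve got” step can be illustrated with a toy planner. This is a deliberately simplified sketch with hypothetical resource names and settings; a real tool tracks far more (state files, providers, dependencies), and nothing here reflects Terraform’s actual implementation.

```python
def plan(desired, actual):
    """Compare a desired model with what exists and list the actions needed."""
    actions = []
    # In the model but not deployed yet: create it.
    for name in sorted(desired.keys() - actual.keys()):
        actions.append(("create", name))
    # Present in both: update only if the settings differ.
    for name in sorted(desired.keys() & actual.keys()):
        if desired[name] != actual[name]:
            actions.append(("update", name))
    # Deployed but no longer in the model: delete it.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


# Hypothetical resources and settings, for illustration only.
desired = {
    "load_balancer": {"type": "alb"},
    "cache": {"engine": "redis", "nodes": 2},
    "cdn": {"forward_headers": ["Authorization"]},
}
actual = {
    "load_balancer": {"type": "alb"},
    "cache": {"engine": "redis", "nodes": 1},
    "old_vm": {"size": "small"},
}

if __name__ == "__main__":
    for action, name in plan(desired, actual):
        print(action, name)
```

The author of the model never writes the create, update, or delete calls; they only edit `desired`, and the planner works out what has to change.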

[00:21:58.520] – Ethan
There’s a point to make here, though, about what it is that you’re provisioning as far as that dependency tree goes. So most of my focus is on networking and network automation. When you are working on hardware, something physical in your data center, well, the order of operations can matter a whole lot, just depending on what it is you’re trying to get done: the nature of that change, whether there’s live traffic going through the network or not, whether you’ve got an out-of-band network to work with or not, which is sadly not a given.

[00:22:30.390] And you can really shoot yourself in the foot if you’re not thinking very carefully about those dependencies. And we can fail at scale, because you were talking about how the tools can get things done very fast. Oh, yeah. Yeah. You could really blow up a lot of stuff in a hurry. So there’s a concern. Again, I just want to make the point that it depends on what you’re building and the nature of it, how those dependencies look and how much you can rely on a tool to do all of that stuff for you.

[00:23:02.280] – Calvin
I think once you get above the physical hardware layer, using Terraform makes sense, no matter whether you’re in the cloud or in your own data center. Yeah, like, I think I would still do that, because I don’t want to finely craft my switch configuration. I don’t want to finely craft a router configuration. I don’t want to deal with, like, firewall rules, because it is, like you said, very easy for a human to kind of mess that up.

[00:23:25.100] Terraform gives you a nice ability to kind of test and check that. You can even write test suites for the Terraform states to make sure that it’s going to do what you expect this thing to do before you even go out and run it. And so I think there’s a lot of benefit there. Plus, there’s another benefit alongside this, which is keeping code in a code repository, having an ability to code review and maybe have a pull request process, putting in place actual software development lifecycle processes that I think network people, maybe infrastructure people like us, hadn’t used in the past. We benefit greatly from that process

[00:23:58.940] now, in terms of quality and not releasing regressions or defects into our environments, whether it’s infrastructure or software.

[00:24:06.800] – Ethan
When you’re provisioning something, let’s stick with Terraform. Let’s say we’re going to use Terraform to provision a thing, and my Terraform process is running in public cloud. Am I using it there in public cloud to only provision things in public cloud, or can I use Terraform to provision something that is in my physical data center, given that there’s a network path there? Let’s assume that. Would you...

[00:24:29.940] – Calvin
Yeah. I mean, you absolutely have the ability to do both. I mean, it works in all the major public cloud providers and some small niche ones, and, you know, there’s support for VMware and, like, on-premises data center technologies as well. I’m not using it much for on-premises. I’ll tell you honestly, we left the last physical piece of hardware we ever managed probably about two years ago. And we started out as a company who had servers and racks and closets, you know, data centers with cages.

[00:24:59.630] And I used to be totally in love with all the blinky lights, but no more. That doesn’t drive my passion anymore.

[00:25:07.460] – Ethan
More and more people I’m talking to are like you in this regard. They’re over the blinky lights for sure. It’s got to be virtual. If it’s networking, it’s got to be a virtual network function of some sort. If it’s, you know, services and so on, why put it on bare metal? No, no.

[00:25:25.820] I’ve got to stand it up in the cloud because of course I will. But let’s say I’m in this hybrid cloud environment, Calvin, and, you know, where I run Terraform tends to be up in the cloud. Architecturally, as you think about it, would you run Terraform from public cloud to provision something on premises, or would you stand up Terraform on premises to build that thing that’s on premises?

[00:25:52.050] – Calvin
I mean, I really don’t know if it matters too much. I mean, for me, I mean, our typical workflow right now for deploying infrastructure into the public cloud is yet to have some at least some place to maintain that state. As long as there’s a shared state which has numerous technologies behind Terraform that allow for sharing that state amongst most people, most people who are working on a project, they can run it from their local laptop as long as their laptop has VPN access or API access to whatever cloud you’re provisioning into, whether it’s in a data center or in a public cloud, it doesn’t matter because it’s going to compare the current save state, the current state of what’s in the cloud to what you’re proposing for changes and give you back a check saying, is this what you really expected to do?

[00:26:36.180] Was this really what you meant to deploy up into that cloud? And you can also then run your testing against that as well. We I most have I’m running it locally from my laptop, gives kind of developers a little more power, like certain pieces of infrastructure, like kind of the global platform that all of our apps may sit on, maybe controlled by one repository of terraform code that each developer who is working on a specific project now has the terraform files, the state files in their repository for that project because there’s going to be application changes because we’re we’re living in a cloud native world.

[00:27:09.600] There’s going to be application changes that are going to be tied to potentially infrastructure changes. Like I may need to tune the Redis cluster, I may need a tune cloud front headers and a change I make to the application that can be tied directly to changing cloud front headers or allowing certain ones through to the application. And I want those to be an atomic commit so that if something were to go wrong, like, say, the CI failed or I do a bluegreen deploy, I start seeing failures.

[00:27:34.860] I can easily roll out that change very quickly and get the infrastructure change right alongside the application change. And I think we’re living in a world now where we want to we want to put out lots of small changes and be able to monitor accurately and get the metrics off of our application to see whether we’re getting success or failure or some percentage of them. And it’s going to really be about how fast can I roll back and I want to roll forward fast. But if I can roll back a change, like just as quick as I rolled it out, then there’s no harm.

[00:28:07.230] I mean, as long as you’ve got good monitoring in place, you’ve got observability, you’ve got some kind of traceability, you can kind of detect where a failure is actually happening and draw a line directly to a commit in your code repository. That one atomic change gets rolled back out or gets fixed quickly and redeployed, because you’re deploying, say, hundreds of times a day.
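
As a rough illustration of the compare step Calvin describes (saved state versus proposed changes, then a check before anything is applied), here is a minimal Python sketch. It is not how Terraform is implemented. The function `plan_changes`, the resource names, and the dictionary shapes are all hypothetical, chosen only to show the idea of diffing the current state against a proposal.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical shapes: a "state" maps a resource name to its settings.
Resource = Dict[str, Any]
State = Dict[str, Resource]


def plan_changes(current: State, proposed: State) -> List[Tuple[str, str]]:
    """Return (action, resource_name) pairs that would move current to proposed."""
    actions: List[Tuple[str, str]] = []
    # Anything in the proposal that is new or different needs a create or update.
    for name in sorted(proposed):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != proposed[name]:
            actions.append(("update", name))
    # Anything that exists today but is absent from the proposal would be deleted.
    for name in sorted(current):
        if name not in proposed:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    # Made-up example data; a reviewer would read this output before applying.
    current = {
        "redis_cluster": {"node_count": 2},
        "cdn": {"forwarded_headers": ["Host"]},
        "old_queue": {"visibility_timeout": 30},
    }
    proposed = {
        "redis_cluster": {"node_count": 3},
        "cdn": {"forwarded_headers": ["Host"]},
        "websocket_service": {"desired_tasks": 2},
    }
    for action, name in plan_changes(current, proposed):
        print(f"{action}: {name}")
```

The point of the sketch is the review step: nothing is changed until someone has looked at the list of actions and confirmed it matches what they meant to deploy.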

[00:28:27.100] – Ethan
It’s like you haven’t lived, man. Don’t you want to plan a huge change over the course of, you know, three months, have it go through the change control process, get approved, come into the data center at one in the morning and sit down and start grinding away on that change that doesn’t go well? And then it’s six o’clock and you haven’t slept, but you kind of got it mostly done and it’s OK. And you didn’t have to run that.

[00:28:49.050] Don’t you want to live that life?

[00:28:50.820] – Calvin
No, no, I’m telling you, I’m done. I mean, I’ve lived there. I get it. I’ve had really interesting conversations. You know, our company’s a smaller consulting company, but we do kind of higher-end application work like this, and we always wonder what the big boys do. So it was an opportunity to talk to folks who work on large Python applications like Instagram, for example, a giant Django Python application. And it’s interesting to hear how they manage that. And it’s very fast.

[00:29:18.720] I mean, there’s hundreds and hundreds of releases per day in their CI environment. The changes are fairly decoupled, which makes it easy to roll them back out. And that really inspires me to strive toward that, even for smaller projects, because the bigger the change, the harder it falls.
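
Calvin’s point about small releases that are quick to roll back, judged by the success or failure percentage coming off the application, could be sketched roughly as follows. This is a minimal sketch, not anyone’s production logic: `should_roll_back`, the 5 percent threshold, and the minimum request count are assumptions made up for illustration.

```python
def should_roll_back(
    successes: int,
    failures: int,
    max_failure_rate: float = 0.05,  # assumed threshold, tune per application
    min_requests: int = 100,         # assumed minimum sample before judging
) -> bool:
    """Decide whether a freshly released change should be rolled back."""
    total = successes + failures
    if total < min_requests:
        # Too little traffic to tell a real regression from noise.
        return False
    return failures / total > max_failure_rate


if __name__ == "__main__":
    # Made-up numbers for a release that is clearly failing.
    print(should_roll_back(successes=900, failures=100))
```

A check like this only helps when rolling back is as cheap as rolling out, which is the argument for keeping each change small.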

[00:29:37.280] – Ned
Speaking of coupled and lots of changes and that high change rate environment, I think you kind of alluded to this before, but I want to pick it apart a little bit more.

[00:29:46.370] To what degree do you combine the code to build your infrastructure with the code for your application? Should they be in separate repositories or should they be in different directories of the same repo? Would you use the same pipeline to deploy them, or are they too different? Like, that’s the thing I’ve struggled with: to what degree are the two coupled together versus keeping them somewhat separate?

[00:30:09.350] – Calvin
More and more, they’re getting more coupled together. I mean, we’re building software that is decoupled, like, say, more microservice-type software, where I may deploy the websockets part of the app.

[00:30:20.360] I may deploy the API part of the app. I may deploy separately the front end, which is maybe the React portion of the application, if we’re talking about a web application here. I keep those pieces decoupled. But inside of each of those, the front end or the back end, there is coupling of the infrastructure to the code, so that I can control, from a developer standpoint, more granularly how my application behaves, because, like I said, I want to be able to tune it and take advantage of these cloud features.

[00:30:51.470] But I don’t want to, you know, put in a ticket with the operations team and wait for them to do their work. I really know what I need. All of our developers are empowered to understand what they need from the application, and the operations team is obviously still watching and tuning, tracing, monitoring and looking for performance issues. And they’re contributing into the code repositories right alongside the developers. There’s no throwing over a wall anymore. I feel like we’re really living in a world where you’ve got teams of specialized people, maybe SREs, who are working on making sure that the performance is good.

[00:31:26.300] They’re dealing with the real world interactions with your code. But the developers themselves need to understand the infrastructure as well, so that they can take advantage of the cloud and fully cloud native components, you know, be able to leverage infrastructure as code to gain them functionality in the application so that they don’t have to write it. I mean, why do I want to reinvent wheels that may already exist?

[00:31:50.490] – Ethan
I think a little bit of what you said was probably the kinds of companies you’re working with that are very forward thinking are like that.

[00:31:56.360] I think they’re still throwing things over the wall, depending on the organization.

[00:32:00.890] – Calvin
I’m sure that they exist. And I’m going to start campaigning really hard against some of these enterprise companies who just refuse to move toward these more modern processes.

[00:32:10.910] – Ned
I like that we avoided the word DevOps. Good job, everybody. We all get a gold star. That is something I’ve thought about a lot, as to what degree they should be coupled. And I think your example of if I’m provisioning an app and there’s infrastructure that is solely dedicated to that, maybe it’s some Lambda functions, maybe it’s an ElastiCache, maybe it’s some back end serverless database, then that should probably live with the rest of the code, because it’s dedicated to just that one app and that makes it easier to debug and test.

[00:32:44.960] But if it’s infrastructure that has to support multiple applications now, that’s maybe like a shared services repository or something like that. Is that a pattern that you’ve seen?

[00:32:55.340] – Calvin
Yeah. For us, I’m just going to give you how we’re doing it right now for some of our Amazon projects.

[00:32:59.990] We have, you know, multiple accounts in our organization, and we’ll have a single repository that basically does all the organizational structure, the sub-accounts and all the shared consolidated billing. It basically deploys and provisions the account container that you may deploy your one app into. And then we have the project. If there’s one single application we’re deploying for a specific customer into that account, that code, that Terraform state, will live inside the repository for that specific application. And the big benefit there is that during code review, all the source control commits are together.

[00:33:40.580] So I can quickly find when an issue happened. It’s all chronological. I’ve got all the commits tied together, keeping that atomic-ness to the commit. I mean, as long as you’ve got developers who are following, again, good practices. Like all the changes, maybe they worked on a branch, but they did the right thing and squashed it down into a PR, a pull request, and brought it into the application as part of the release.

[00:34:03.900] Again, you want to keep these changes small, nimble, quick, and you get more value out to the customers faster that way. And then you can trace back when there are issues a lot faster, directly to the commit that caused the problem. And then you get the infrastructure piece and the application piece right in one spot.

[00:34:22.220] – Ned
OK. So the other thing that I think about is when you are doing the actual deployment, if it’s from like a developer’s laptop, but it seems like it’s more and more going to be from a pipeline, like where does that pipeline get permission to deploy into these accounts?

[00:34:38.630] How long does it have that permission for? How do you manage those credentials for deployment? I mean, before if you are using a bare metal server, you probably just gave it some keys to use and you’re like, well, it knows how to SSH into things and deploy stuff and we’re good.

[00:34:54.860] But that doesn’t fly, you know.

[00:34:58.250] – Calvin
No, and we depend on the cloud technology. When we’re inside of it, we’re using SSM, I can’t remember all these, Systems Manager Parameter Store. There’s so many acronyms these days, I can’t remember what all of them are anymore. But this Parameter Store can securely store the secrets you need, for example, for the infrastructure pieces, so that locally I’ll be using my AWS credentials, my own token, which I store securely in 1Password, to be able to access it.

[00:35:34.820] So I have 1Password put it into my environment temporarily when I need it. Then I can assume a role in another account, depending on what project I’m working on. So I don’t have to store like ten keys, one for each project. I have one: me. I elevate my privileges for that specific project when I’m working on it and releasing. Now that release process may typically be pushing code into a code repository that kicks off a CodeBuild or Bitbucket-like CI-type job.

[00:36:06.110] Based on the results of the success or failure there, it’ll typically release an image into a container registry. Now you can have it listen for that event, which is nice. With all these clouds, you have the ability to listen for various events happening in the infrastructure and react to them with Lambda or whatever the case may be, and then it can do the release process. Nowhere along the line do I store any secrets in these containers, which is another nice thing: you can feed the secrets into the containers with environment variables.

[00:36:34.550] So, I mean, leverage this 100 percent: always environment variables. Parameter stores are perfectly compatible with the Terraform tasks. You can just tell it to go grab this key, that key, that key out of the parameter store when you launch in this environment, and when you launch to production, it’s a whole separate set of keys. Which means you can now provide developers with dev environments that don’t have the same passwords as production.

[00:36:58.590] Now you can kind of keep production, I mean, as needed, right, on a need-to-know basis. If you don’t need to know the production database password, there’s no reason for you to have it. So you kind of get plausible deniability. I think as a developer that’s a benefit. I’m going to run the same code, whether it’s on my local machine or in production. But now I can’t mess up production accidentally. OK, how many times have you heard of someone having the production database password on their laptop, connecting, thinking they were working locally, migrating database tables and just causing an utter mess?

[00:37:31.430] That happens, and it’s not pretty.
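
The application side of what Calvin describes (secrets arrive as environment variables, each environment gets its own set, nothing is baked into the container) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: `load_secrets`, `parameter_path`, `MissingSecretError`, the `/app/environment/key` naming scheme, and the variable names are all hypothetical. How the values get from a parameter store into the environment is left to the deployment tooling and is not shown.

```python
import os
from typing import Dict, Iterable, Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected into the environment."""


def parameter_path(app: str, environment: str, key: str) -> str:
    """Build a per-environment parameter name (an assumed naming convention)."""
    return f"/{app}/{environment}/{key}"


def load_secrets(
    required: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Read required secrets from environment variables, failing loudly if absent."""
    names = list(required)
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Report only the names, never the values.
        raise MissingSecretError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in names}


if __name__ == "__main__":
    # Dev and production resolve to different parameters, so a developer
    # never needs the production value to run the same code locally.
    print(parameter_path("shop", "dev", "db_password"))
    print(parameter_path("shop", "production", "db_password"))

    fake_env = {"DB_PASSWORD": "dev-only-value"}  # stand-in for os.environ
    secrets = load_secrets(["DB_PASSWORD"], env=fake_env)
    print(sorted(secrets))  # print names only
```

Because the code only ever asks the environment for a name, the same image can run on a laptop or in production, and which database it touches depends entirely on which set of values was injected.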

[00:37:36.290] – Ned
It’s happened to some of us who might be on this episode right now. I swear that was the development files here that I tested my robocopy script on. Oh, no, it wasn’t. Yeah, oh no.

[00:37:51.050] – Calvin
Now you can use IAM roles and privileges. You can give the containers specific roles. You can have your developers having different roles. You have each environment having different roles. And so you get a nice separation of concerns. And obviously if you’ve got apps that have sensitive data, you may have audit kinds of concerns where you need to ensure that people can’t access data without there being a trail, a log of who accessed the data, when and where.
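
To make the separation of concerns concrete, here is a toy Python model of roles per container, developer, and environment, with a trail of every access decision. It is not AWS IAM and does not call any AWS API. The role names, permission strings, `AuditLog`, and `check_access` are hypothetical, invented only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set

# Hypothetical roles and permissions, one set per kind of principal.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "dev-developer": {"dev:read", "dev:write"},
    "prod-app-container": {"prod:read"},
    "prod-operator": {"prod:read", "prod:write"},
}


@dataclass
class AuditLog:
    """In-memory trail of who asked for what, when, and the outcome."""

    entries: List[Dict[str, object]] = field(default_factory=list)

    def record(self, who: str, role: str, permission: str, allowed: bool) -> None:
        self.entries.append(
            {
                "when": datetime.now(timezone.utc).isoformat(),
                "who": who,
                "role": role,
                "permission": permission,
                "allowed": allowed,
            }
        )


def check_access(who: str, role: str, permission: str, log: AuditLog) -> bool:
    """Allow only what the role grants, and log every attempt either way."""
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    log.record(who, role, permission, allowed)
    return allowed


if __name__ == "__main__":
    log = AuditLog()
    print(check_access("alice", "dev-developer", "prod:read", log))
    print(check_access("web-task", "prod-app-container", "prod:read", log))
    print(len(log.entries))
```

Denied attempts are recorded as well as granted ones, since an audit trail is mostly useful when it also shows who tried to reach data they should not have.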

[00:38:17.570] – Ned
You know, I think about all these services that we just take for granted in the cloud, having like a robust IAM service, having a robust serverless service, where you could just say, hey, give me a container, give me a virtual machine, I just need it for a little bit. And then I think about what most people have running on premises. And it makes me a little sad.

[00:38:37.010] – Calvin
Sad and worried.

[00:38:40.280] – Ned
This is total pontification. You can pass on this if you want, but do you think we’re ever going to reach a point where the cloud-like services we’re used to in the cloud make it back down to on premises? Is that happening? Going to happen?

[00:38:54.530] – Calvin
Has to happen. I think it has to happen.

[00:38:56.500] – Ned
It has to happen.

[00:38:58.050] – Calvin
We just can’t sit around with code on old servers anymore. There’s just too many entryways into that code, to those servers, from a security standpoint. I don’t think companies can afford to continue down this path, where they would be like, well, if I don’t touch it, it’s like it doesn’t exist. If I don’t look at it, it’s like it didn’t happen.

[00:39:20.350] There’s going to be too much liability. I can’t imagine a business, even the smallest kind of business, that isn’t going to have personally identifying information, a customer CRM-type database, a single application that some intern spun up over the summer that happened to be collecting data online. And they’ve got stuff in a place where it’s accessible to the Internet. Those kinds of things just cannot fly if we hope to move the whole world forward technologically, because people won’t stand for it.

[00:39:54.370] You’re going to be sued out of existence. So either you will move forward or you will not exist.

[00:40:00.530] – Ned
I think that gets back to your point, Ethan, where you’re asking about where should this stuff run from? Should it run from the cloud, should it run on premises. And I’m kind of coming down on the side of it should probably run in the cloud on a managed service of some kind to lower your security exposure and your liability.

[00:40:17.060] – Ethan
So that leads into a question I wanted to ask you, Calvin. There are companies listening to this. Some of them are like, yeah, man, we’re doing all this and we’re well on the way. And there’s some big companies going, I don’t know half of what they’re even talking about or how to get started, and it sounds really complex. And in fairness, it is. There’s version control, which we’ve barely mentioned, but that’s an important part of this discussion, and the pipelines and the testing that’s going on, let alone creating the artifacts to begin with and moving this all along.

[00:40:45.440] So for someone who’s more just trying to get going with this approach to managing their code, even if it’s infrastructure as code, is there a packaged solution that does a lot of this for them, maybe streamlines the process as opposed to I need to use this service to do this and this other thing does this and I’m going to kind of cobble it all together.

[00:41:04.010] – Calvin
I mean, I think there’s some nice tooling out there that you can either use or buy into. Look at whoever you’re using already for source control, like if you’re using GitHub, Bitbucket, GitLab, Azure DevOps. All of those tools I just mentioned have really brought a lot of this software development lifecycle, all this workflow, into their tool to make it easy for you to just follow along at home when it comes to best practices.

[00:41:32.120] I need a wiki, I need documentation, I need issue management. I need, you know, a Kanban board or some means of moving things through a release process. They all typically include build pipelines, container registries, all the technologies needed to do these best practices. It’s kind of baked into a lot of these things. So I would recommend, instead of trying to piece together twenty different technologies, one for each of these different concepts, try to find something you’re already using in your internal infrastructure. Look at your code repository to start with, because they’ve already got some tools there for you.

[00:42:10.520] I’m not going to go have a different CI system if there’s one built into these. Now, I say that, and at the same time, we don’t do that all the time, because sometimes you realize you’ve outgrown a piece of it, a portion of it, or you want to keep a little separation of concerns because maybe you are a little scared of just one public cloud provider. Maybe you do want to have your data backed up, too.

[00:42:35.660] So maybe you’re not ready to have Amazon hold all the things. Like, I’d like to have a copy of it in, say, Backblaze, because it’s different data centers, a different organization, kind of a separation of concerns there. And if something were to happen, and again, granted, their services are very, very reliable, but every once in a while there’s a key fumble on S3 indexes. It happens from time to time. No data was lost, it just took a while to get to it, but you’d have your data in another service.

[00:43:06.970] You can easily deploy into another cloud or deploy on premise. How long is it going to take you to order a server and get it racked up? You’re still talking weeks. That doesn’t work. That’s not how this world works anymore. But being able to get all those tools into one spot and just leverage an existing ecosystem of those tools, whether it’s the Atlassian ecosystem or whether it’s, you know, the AWS ecosystem, because they’ve got CodeBuild and Code, Code...

[00:43:34.420] This CodeStar, I think they really combined it all into one just called CodeStar, ’cause it’s like Code, five different things.

[00:43:39.860] – Ned
Did they? Thank goodness. Because that was so confusing. I was like, yeah, it’s code something.

[00:43:46.180] – Calvin
Code, code thing, code things, CodeStar. If you just look up CodeStar, it’ll show you all the services. But yeah, it is nice, integrated tools. It’s easy going. Now, getting started is still a little harder, because, great, I got all the things, whew.

[00:44:01.840] How do I know how to Dockerize? How do I make an image out of my application so I can deploy it? Like, how do I take advantage of some of these more modern cloud native bits of it? But you just got to start with those foundational pieces first. You’ve got to have a code repository. You’ve got to have documentation and you’ve got to have issue tracking. You’ve got to have all these things available to you. If you don’t have those foundations in place, like issue tracking, wikis and things, it’s impossible to go the next step.

[00:44:27.700] Don’t even try. You really need to have even good communication processes. Like, do you have a Slack channel where people can actually talk to one another? Do you have a way of sharing, you know, screen sharing and coding together and pair programming? Can you write unit tests? Do you have a way to run the tests? Do you understand what CI is? That’s a good place to start, because that’s going to lead you into all the other cloud native niceties.
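
For anyone unsure what “can you write unit tests, do you have a way to run the tests” looks like in practice, here is about the smallest useful example in Python. The function under test, `forwarded_headers`, is hypothetical and only loosely inspired by the header allow-list idea mentioned earlier in the conversation. The `test_` functions use plain assertions, which a runner such as pytest collects by name, and a CI job would typically run them on every push.

```python
from typing import Iterable, List


def forwarded_headers(requested: Iterable[str], allowed: Iterable[str]) -> List[str]:
    """Return the requested headers that are on the allow-list, keeping order.

    Header names are compared case-insensitively.
    """
    allowed_lower = {name.lower() for name in allowed}
    return [name for name in requested if name.lower() in allowed_lower]


def test_allowed_header_passes_through() -> None:
    assert forwarded_headers(["Host"], ["host"]) == ["Host"]


def test_unlisted_header_is_dropped() -> None:
    assert forwarded_headers(["Host", "X-Debug"], ["Host"]) == ["Host"]


def test_empty_allow_list_forwards_nothing() -> None:
    assert forwarded_headers(["Host"], []) == []
```

Once tests like these exist and run automatically, a failed run becomes the signal that stops a bad change before it is released.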

[00:44:53.830] – Ethan
Calvin, I feel like we started the outline of about thirteen or fifty or thirty podcasts to discuss here and dive into a lot of this more deeply. You touched on so many things, but for this show, I think we’ve covered a lot, and enough for today. Now, as I was researching who would be a good guest to talk about this, your name came up because of various presentations you’ve done and so on. Would you tell folks how they can follow you and find additional material that they can consume of yours?

[00:45:27.340] – Calvin
Sure. On Twitter, as CalvinHP. Probably all my presentations are on YouTube from various conferences that I’ve given presentations at, and I also like running conferences. So if you are interested in joining in the fun, the Python Web conference is actually coming up here in March, March twenty second I think to the twenty sixth. You can check that out at I would love to have you all join us. And because there’s going to be a track actually about cloud, a track about data and AI, and a track about app dev.

[00:46:01.870] Oh. And a track about culture. So if you’re interested in developer culture, more the human side of what we do, come check us out. And that’s going to be really great conference. I’m super excited about that.

[00:46:14.170] – Ethan
It’s a virtual conference, I assume.

[00:46:15.990] – Calvin
Yeah, it’s a virtual conference, actually. The Python Web Conference has always been a virtual conference, since twenty nineteen, before virtual conferences were even hip and cool. So I try to stay cutting edge on all things technology.

[00:46:28.440] – Ethan
Very good. Now, you said you’re running this event, Calvin.

[00:46:35.010] – Calvin
Yeah, yeah. Six Feet Up, our company, is one of the organizing companies for the Python Web Conf. This will be our third year and we’re super excited about it. The speaker lineup is just fantastic. We have an amazing, diverse group of speakers. It’s not your standard tech group of speakers. I think you’ll be pretty impressed with the caliber of folks you can listen to and interact with. That’s another thing we really focus on. We built a whole application using cloud native technologies to host this conference because we really felt like there was a need for a better way to do virtual conferences.

[00:47:13.510] And so we actually built our own app, we use Fargate. We use all the technology that I talked about during the podcast, actually to deploy this application.

[00:47:22.720] – Ethan
Calvin, thank you for spending time with us. Again, folks listening, this is Calvin Hendryx-Parker, and again, Calvin, thank you for chatting with us. If we can get more of your time, we’ll have you back to dive into some more things. Maybe we’ll deep dive on something like how do you get that app containerised and deployed? You know, how do you leverage Lambda in a way that makes sense for the way Lambda was designed to be used, and so on.

[00:47:46.760] So, again, thanks very much. And if you’re out there listening because I mean, if you’re hearing this, then you’re out there listening, right? Hey, Virtual High five. Thank you for tuning in.

[00:47:54.460] If you got suggestions for future shows, more things that you want to hear us chat about, we would love to hear your ideas. You can hit us up on Twitter. We are at DayTwoCloudShow or fill out the form on Ned’s fancy website,

[00:48:08.620] Now, this is the Day Two Cloud podcast, which is part of the Packet Pushers Podcast Network, and the Packet Pushers offer a free weekly newsletter called Human Infrastructure Magazine. HIM is loaded with the very best stuff that we found on the Internet each week, a lot of engineering oriented content, plus our own feature articles and commentary. It’s free. It doesn’t suck. We don’t sell your information or anything like that.

[00:48:27.580] It’s just genuinely something we’re doing for the community to share good information, expose blogs that maybe you’ve never heard of before that you’d like to subscribe to and so on. And you can get the next issue absolutely free And until then, just remember, cloud is what happens while IT is making other plans.

Episode 85