Day Two Cloud 111: Infrastructure As Software With Kris Nóva

Episode 111

Kris Nóva, Senior Principal Software Engineer at Twilio, claims that managing infrastructure using tools like Terraform isn’t that far away from just writing your own code to do the job yourself. Kris joins co-hosts Ned Bellavance and Ethan Banks to challenge the notion that ops folks can’t become developers. Kris says they can. What’s more, Kris thinks they should become devs. Give this episode a listen to understand why she feels that way.

Sponsor: CBT Nuggets

CBT Nuggets is IT training for IT professionals and anyone looking to build IT skills. If you want to make fully operational your networking, security, cloud, automation, or DevOps battle station visit


[00:00:00.000] – Ethan
[AD] Sponsor: CBT Nuggets is IT training for IT professionals and anyone looking to build IT skills. If you want to make fully operational your networking, cloud security, automation or DevOps battlestation, visit CBTNuggets.com/cloud. That’s CBTNuggets.com/cloud.[/AD]

[00:00:23.730] – Ned
Welcome to Day Two Cloud. Today’s topic is infrastructure as software. And why you, yes you, listener right there, sitting right there listening to my voice: you’re a programmer. You might not think you’re a programmer, but we are here to change your mind. Our guest today is Kris Nóva. She is the senior principal software engineer at Twilio, and she has some really interesting thoughts about what the difference is between configuration management and writing software. And it’s not as different as you might think, right Ethan?

[00:00:53.460] – Ethan
Oh, basically, it’s not different from her point of view. And we’re going to get into that. Ned, for me, I feel at this moment that my mind has been changed based on the arguments that Kris made, saying that, hey, why make this big dividing line as an ops person between these special tools you use and domain-specific languages, such as you might encounter with Terraform, let’s say, and being an actual developer with a general purpose language? Because it’s not that different. And I walked away fairly convinced, Ned.

[00:01:24.270] – Ned
Yes, it was very convincing and just very thought-provoking. So enjoy this episode with Kris Nóva.

[00:01:31.340] – Ned
Well, Kris, welcome to Day Two Cloud. And what we’re here to talk about today is infrastructure as code and maybe infrastructure as software. But let’s level set for people who might be familiar with infrastructure as code, because everybody kind of has a different idea of what that is, what it means. So could you define how you think of infrastructure as code?

[00:01:51.960] – Kris
Oh, wow. Starting off with easy questions. Yeah. Let’s talk infrastructure as code. Yeah. So infrastructure, I’ll do this real quick. We’ll do a breadth-first approach with this. Infrastructure, to me, is everything below what I would say an application team consumes, right? It’s always been that point in solving technical problems where you go, this should just be solved, and as an application engineer, I shouldn’t have to deal with it. And of course, that line means a lot of different things to a lot of different people.

[00:02:22.280] – Kris
And different teams have different concerns. But if we can make the assumption that that’s infrastructure, infrastructure as code is just bundling that up in a way that you can reproduce it right? And I look at this reproducibility element as kind of like the first big lesson in the infrastructure industry, if you will, for the past ten plus years, and we had gotten to a point where we needed to create infrastructure, needed to manage it, and we needed to be able to repeat it, and we needed to do that well.

[00:02:51.430] – Kris
And therefore, I think we started to naturally go down this path of, well, how do we capture that? How do we represent that? What does that look like? We should probably start writing some of this stuff down, and, oh, hey, we’re writing it down, and now we can start actually responding to the things we’re writing down. And this is starting to sound an awful lot like a configuration language. Enter Puppet, Chef, Terraform, Ansible, Salt, the whole nine yards there. And to me, that’s infrastructure as code.

[00:03:18.020] – Kris
Right. Let’s write this down. Let’s define what it is, let’s express what it is, let’s declare it, and then let’s write some software on the back end to bring that to life.

[00:03:27.070] – Ned
Right, one of the things that I’ve struggled with a little bit is the differentiation between infrastructure as code and configuration management, because in my mind, they do two different jobs, but there’s definitely some bleed over in terms of the tools and also kind of what they do, who has which responsibility. How do you feel about the differences between IAC and config management?

[00:03:50.620] – Kris
I mean, I’m jaded. Right. Like, I’m old. I’ve been doing this for a while, so I feel like my response here, always, I always want to break things down into computer science primitives. So I would also want to hear your thoughts on this because I have my thoughts, and I’m certainly happy to share them. But before we get into that, can I flip it around? Because I want to hear on your end, what does configuration management mean to you? And how is it different for you? Because I want to see if that jibes with what I have kind of stewing in the back of my brain.

[00:04:21.010] – Ned
Right. The tables have turned; the microphone’s in front of me now.

[00:04:25.660] – Ned
Yeah. When I think about infrastructure as code, I generally think about immutable infrastructure or as close to immutable as we can get. If you need to make a change, you’re going to destroy the infrastructure that’s there and rebuild it using the code. Whereas config management means it’s sort of a long running resource that needs to be mutated over time. And so your config management tool is going to be running on that thing, or at least checking in with it and going, do you still match the way I want you to be?

[00:04:53.670] – Ned
No. Okay. I’m going to revert you, or I have a new config I want to push down to you because there’s a new requirement from the application or something. So if I put it into tool sets, like, for me, Terraform handles the infrastructure as code, whereas something like Ansible or Puppet would be more config management.

[00:05:10.100] – Kris
Yeah. So I’m glad I asked. Cause this is why I want to get really crisp about which words we’re using for what here, because these words mean different things to me. So when I say infrastructure as software, it sounds like we’re almost saying the same thing as infrastructure, sorry, as config management, except I’m probably taking it a step further and just drawing more of, like, an intellectual or cultural line around how we approach some of it. So, like, I had mentioned repeatability with, like, IaC, with infrastructure as code, which, again, this goes back to immutability and whether it’s immutable or not, whether you can change it or not. I think that’s beside the point.

[00:05:47.650] – Kris
I think what it is is we could recreate it from the same place. That’s really the value I see with the first big infrastructure as code. And, yeah, there’s great things about making your infrastructure in such a way that you can’t change it or you can’t SSH into it or you can’t do anything. There’s a lot. There’s some good patterns there. It’s also really annoying if you need to change something. And I know I can SSH into all 100 nodes at the same time, and we only need 80 percent of them to work.

[00:06:15.360] – Kris
And production is on fire, and I just need to change a zero to a one, spinning up all new infrastructure, that’s hard, right? And there’s a lot of complexity with that. So I think there’s a trade-off there.

[00:06:27.660] – Ethan
Well, it feels like we’re making a distinction where there maybe isn’t one, because if you deploy infrastructure as code, there is a configuration that is implicit in that that is happening. There is the config that’s going on, but Ned going back to your definition. It feels like you’re making a distinction between here’s an object that has a configuration, I send it out into the world, and if it doesn’t do what I need, I’m going to destroy it and make another one that’s got a different configuration as opposed to I’m going to change the thing that’s out there and tweak its configuration, and then that’s configuration management to you.

[00:06:57.910] – Ethan
Is that how you’re drawing the line?

[00:06:59.920] – Ned
Yeah. That’s where I’ve drawn the line before, but it sounds like I might be wrong about that. So I’m curious to hear what Kris was about to bring up in terms of how you’re thinking about config management.

[00:07:11.640] – Kris
So when I think of config management, I hate to say it, but I think of C++ templates. I think of Helm charts. I think of, you know, Terraform Turing complete configuration languages, because I think of, like, yes, we need something that’s got to reconcile that state over time, which is really a two part process. So first observing the state of the world and then reconciling it. But I think when I think config management, I think it’s the tools, and it’s the idea that, like, we, we express what we want, but then we want to be able to tweak it for different places.

[00:07:45.900] – Kris
And there’s some sort of management system in place that allows us to basically render those tweaks and bring them to life on a conditional basis. So this is why I started finally, just like, throwing my hands in the air in drawing the distinction between infra as code and infra as software, which is like Config is great but we need to start understanding that there’s a whole world of how we reconcile that config that we can play with as well. And then I said that that’s the software component, and we can either try to standardize that, or we can start to just finally just accept the fact that we’re an engineering team and we probably ought to be building or at least contributing to a substantial part of whatever that software is.

[00:08:28.140] – Ned
I think I sort of dealt with that a little bit back in the early days of PowerShell Desired State Configuration. It was a tool where you could write your own modules for desired state, and you had to create a get, a set, and a test component for each module. And then you had to script out how it checked for the state and then compared it against desired state and then ran a set of things to change what it just checked to match what the desired state was. And that sounds like that’s more software than it is just straight-up configuration.
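
The get, test, and set split Ned describes can be sketched in a few lines of Go. This is a minimal illustration, not the actual PowerShell DSC API: the `Resource` interface, the `hostnameResource` type, and the `Apply` function are all hypothetical names, and the "system" here is just an in-memory string so the sketch runs anywhere.

```go
package main

import "fmt"

// Resource mirrors the get/test/set shape described above.
// The interface is illustrative, not the PowerShell DSC API.
type Resource interface {
	Get() (string, error) // observe the current state
	Test() (bool, error)  // does current state match desired state?
	Set() error           // change the system to match desired state
}

// hostnameResource is a hypothetical resource backed by an in-memory
// value; a real one would query and change the operating system.
type hostnameResource struct {
	desired string
	current *string
}

func (h hostnameResource) Get() (string, error) { return *h.current, nil }

func (h hostnameResource) Test() (bool, error) {
	got, err := h.Get()
	if err != nil {
		return false, err
	}
	return got == h.desired, nil
}

func (h hostnameResource) Set() error {
	*h.current = h.desired
	return nil
}

// Apply calls Set only when Test reports drift, and returns
// whether it changed anything.
func Apply(r Resource) (bool, error) {
	ok, err := r.Test()
	if err != nil {
		return false, err
	}
	if ok {
		return false, nil
	}
	if err := r.Set(); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	current := "old-host"
	r := hostnameResource{desired: "web-01", current: &current}

	changed, _ := Apply(r)
	fmt.Println("first run changed:", changed, "state:", current)

	changed, _ = Apply(r)
	fmt.Println("second run changed:", changed, "state:", current)
}
```

The second `Apply` call finds nothing to do, which is the idempotence that makes this pattern feel more like software than like a static config file.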

[00:09:03.080] – Kris
Yeah. And that’s the point. Right? It’s, like, I feel like the config is so tangential to all of this, so ancillary. In fact, I want less config. Right? Like, I want on and off, that’s your config, and I want the software to be responsible for doing all of these if statements and all of these conditional comparisons and understanding what this means to us. I think the problem is when we start expressing things in config, it’s a slippery slope to go from “we have a few fields” to

[00:09:29.650] – Kris
“we have a thousand fields” to “we have a thousand Turing complete fields” to “we’re now building applications in config land.” And that, to me, violates all of this in the first place, because it’s like, that’s why we have software. So I don’t know. I feel like I’m a bit egregious when I say things like standardizing infrastructure management was probably a necessary evil, but ultimately it was one of the biggest disservices we could have done as an industry.

[00:09:53.180] – Ethan
It feels like you don’t want to have to care about the infrastructure thing. The infrastructure and the config are like one thing that we deal with: we define it, we create it, and in the virtual world, in the cloud world, that makes perfect sense. Now, if I go old school and talk about network stuff, something that was a six-figure spend that lives there forever, then you’re in the config management world forever with that physical, expensive object. In the cloud world, what is the point of thinking that way?

[00:10:19.190] – Ethan
That’s how I’m interpreting what you’re saying, Kris.

[00:10:22.620] – Kris
And to be clear, I work at Twilio now, and I’m a senior principal engineer on our bare-metal Kube team, where we have three, I think, /22s, I want to say, and we deal with BGP routes, and we’re in bare-metal network Kubernetes land. We’re PXE booting servers and we’re dropping nodes, and we’re having to write software to reconcile that. I’m opening iDRAC consoles in my browser every once in a while. It’s old school, right? It’s the real deal. And in that world, it’s totally like there is this big conflict of interest when it comes to how we manage these big stateful workloads versus how we do things in the cloud.

[00:11:05.210] – Kris
And ultimately, if you look at the cloud, that’s your abstraction, right? I’ll take a tool like kops because I worked on it, and I see you have up-up on your license plate there. So I’m assuming you also worked on kops, but we took all these paradigms in the cloud, we abstracted them. And I feel like we did a really good job of writing a meaningful abstraction that did one thing and did it well. Whereas I look at things like Terraform that just rewrites the EC2 object in the form of a provisioner and says, Great.

[00:11:32.960] – Kris
Now you have another abstraction. That’s a one to one abstraction between what we have in the cloud. And so I just don’t know if there’s any value in that. I don’t know if bringing it from an HTTP API to a Go struct is really introducing all that much more value as a management tool. So anyway, I think about this stuff a lot.

[00:11:53.620] – Ned
So one term you brought up a few times was Turing complete, and that was something that I saw in the blog post that sparked this whole conversation. And as someone who has a CS degree, I totally understand what Turing Complete means. But maybe you could explain to the audience for a little bit.

[00:12:12.720] – Kris
Yeah, absolutely. I love this term because, to me, we’ve alluded to this, like, where do you draw the mental boundary? This is it. For me, this is the line in the sand, right? Turing complete just means that you have a computer in the most primitive form of what we, as humans, can think of as a computer, which means you give it a set of instructions and it can compute those instructions. And really, everything in computer science can be boiled down to pretty much three basic principles, which are the ability to have a concept of memory, to remember things; the ability to iterate over things; and the ability to have a single logical switch.

[00:12:50.540] – Kris
And with those three things, you can build recursion. You can build iteration, you can build computer systems. And that’s what a computer is in its most primitive form. And the whole point there is that one of those three is a logical switch. The moment you introduce a logical switch into your config, you’re no longer in the config business. You’re in the application building business. And that is a super slippery slope to get down, because we all see it. We all want to have an if statement that says, if production, TLS; if not production, don’t TLS. We want to do that.

[00:13:29.970] – Kris
That’s what we do. We’re humans. Right. I’m just saying it’s like, I talk about this children’s book, If You Give a Moose a Muffin. Right. Like, if you introduce a production switch for TLS, you might as well introduce a production switch for this other thing. And if you’re going to do it in production, you might introduce a switch for dev and for stage. And then you start introducing application switches. And then you start introducing application switches based on what cloud they’re running in, and for what team they’re running, for what version you want to run.

[00:13:57.100] – Kris
And then before you know it, you’re back to software land. But you’ve done it all in configuration management land.
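
One way to picture the alternative Kris is pointing at: keep the config down to plain facts and put the logical switch in code, where it can be reviewed and unit tested like any other software. The sketch below is only an illustration under that assumption; `Config`, `Settings`, and `Resolve` are hypothetical names, and the single rule is the "if production, TLS" example from the conversation.

```go
package main

import "fmt"

// Config is deliberately tiny: it states facts and contains no conditionals.
type Config struct {
	Environment string // for example "production", "stage", or "dev"
}

// Settings is what the software derives from the config.
type Settings struct {
	TLS bool
}

// Resolve holds the logical switch in code instead of in a template
// or config file: TLS is on in production and off everywhere else.
func Resolve(c Config) Settings {
	return Settings{TLS: c.Environment == "production"}
}

func main() {
	for _, env := range []string{"production", "stage", "dev"} {
		s := Resolve(Config{Environment: env})
		fmt.Printf("%s tls=%v\n", env, s.TLS)
	}
}
```

Each new switch added later becomes another line in `Resolve` with its own test case, rather than another conditional field in the config.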

[00:14:02.280] – Ethan
Your tone suggests that this slippery slope, adding all these conditions, is a negative or a bad thing. Is that what you’re trying to say? Or are you just making the point that this is the road you’re going down if you do this?

[00:14:12.540] – Kris
It’s just making the point that this is the road you’re going down if you do it. And again, let’s go back to C++ templates. Right. We wrote C++ templates to abstract out configuration to make our life easier. All I need to say is a three-letter word: Lua. That’s all I need to say for anybody to understand that you can start writing Turing complete applications there. And that’s why I draw the line in the sand. It’s like, if it is a Turing complete system, it’s a programming language.

[00:14:41.230] – Kris
If it’s a programming language, we should treat it as a programming language. And there’s reasons. As programmers, we do things like, we have feature work, we test our code. There’s a lifetime, generation upon generation, of programming language constructs and how you manage teams of engineers around programming languages. And I feel like we throw all that out the window when we start dealing in Turing complete config management. So I feel like this is, like I told you, politics would come up. I feel like configuration management is analogous to late stage capitalism.

[00:15:14.610] – Kris
This is just where we’re going to end up. Right, like, this is just what happens when you start putting if statements in your config. You’re going to end up with an application, whether you call it that and think of it as that or not.

[00:15:26.990] – Ned
Okay. So if you could draw, like, your ideal config that didn’t have any of that stuff in it, what would be in the config? And then what would be in the software that implements that config?

[00:15:40.180] – Kris
So if you look at the Kubernetes Cluster API project, we spend a lot of time talking about you know, if we wanted to create a holistic Kubernetes API that spanned multiple clouds, what does that look like? And we looked at CNI. We looked at container storage. We looked at a lot of these successful projects or paradigms in Kube. And what we found out is that there’s kind of two main patterns you either have a very empty set of configuration, or you have a very verbose set of configuration with a lot of switches.

[00:16:11.960] – Kris
And there’s obviously trade-offs to each of these. But when you look at having a static config, we found out that it actually just makes more sense to just declare very, very small amounts of common logic, like, what Kubernetes version do you want? And then actually, what we found out is that trying to pull too much of that into the config was a bit of an anti-pattern. And really, what we’re saying here is we actually just need to change the piece of software that reconciles that config in different environments.

[00:16:42.400] – Kris
So it turns out that we just wanted to say, run Kubernetes 1.15, but we wanted that 1.15 to mean different things in different clouds. And that’s the big paradigm switch: we actually don’t put the logic of what each cloud should do in the config, we just replace the controller that reconciles that logic in different environments. And when we went through that exercise, we actually found out that we really don’t need to declare all too much. We actually are now writing software in Go and writing unit tests, and we now have a lifetime of small micro projects.

[00:17:22.150] – Kris
This is the AWS controller. This is the Azure controller. This is the VMware controller. And each of these just interprets a very slim set of config differently. And then we started to find out that there’s actually an AWS public topology controller, where we do public subnetting. And then we do the AWS private topology, where it’s all private and locked down, and you need a bastion server to get into the subnets. And then all of a sudden, it was like, oh, and these two controllers can share some common libraries, but they’re really intrinsically different.

[00:17:50.680] – Kris
And we discovered this was more of a software engineering problem than it was a config management problem, which is why when I hear things like config management, my brain goes C++ templates, Helm charts, putting all this cloud specific stuff into all the config. And I’m like, I don’t think that’s what we actually need here. I don’t think that’s what we actually want to do. It’s what we want to do, but it’s not what we should do.
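
The shape Kris describes, a slim config interpreted by swappable controllers, might look roughly like this in Go. It is a sketch under stated assumptions, not the Cluster API types: `ClusterSpec`, `Controller`, the two AWS controllers, and `sharedSteps` are hypothetical, and each controller only returns the list of actions it would take.

```go
package main

import "fmt"

// ClusterSpec is the slim, cloud-agnostic config: just a version.
type ClusterSpec struct {
	KubernetesVersion string
}

// Controller reconciles a ClusterSpec for one environment.
// Hypothetical interface, not the Cluster API.
type Controller interface {
	Reconcile(spec ClusterSpec) []string // actions it would take
}

type awsPublicController struct{}
type awsPrivateController struct{}

// sharedSteps stands in for the common library both controllers reuse.
func sharedSteps(spec ClusterSpec) []string {
	return []string{"install kubernetes " + spec.KubernetesVersion}
}

func (awsPublicController) Reconcile(spec ClusterSpec) []string {
	steps := []string{"create public subnets"}
	return append(steps, sharedSteps(spec)...)
}

func (awsPrivateController) Reconcile(spec ClusterSpec) []string {
	steps := []string{"create private subnets", "create bastion host"}
	return append(steps, sharedSteps(spec)...)
}

func main() {
	spec := ClusterSpec{KubernetesVersion: "1.15"}
	controllers := []struct {
		name string
		c    Controller
	}{
		{"aws-public", awsPublicController{}},
		{"aws-private", awsPrivateController{}},
	}
	for _, entry := range controllers {
		fmt.Println(entry.name, entry.c.Reconcile(spec))
	}
}
```

The same `ClusterSpec` means different things depending on which controller reconciles it, so the environment-specific logic lives in testable code rather than in extra config fields.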

[00:18:12.980] – Ned
Right so, to me, if I’m not writing that software, I’m just consuming the software. So I have a config. I want to use cluster API to create a cluster in AWS, not having those fields available in config is somewhat limiting to me as the consumer, because now I’m reliant on a software developer to write that software to implement that portion of the config, because I can’t be more specific about it. Do you see that as a big limitation, or is that a feature, not a bug?

[00:18:42.110] – Kris
All of the above, right? Like, I think in a perfect world, I’m challenging the idea that infrastructure management teams, infrastructure operators should also be software engineers. That’s new. That’s hard. There’s a lot that goes into that. That’s a very loaded statement. I also think that there’s something to be said about if we are in a world where we have generic config and we are in a world where we have use case specific config. Let’s just put things where they go, right? Like, there’s nothing wrong with pulling out EC2 instance sizes into the AWS specific component here and making that a flag that you consume.

[00:19:21.550] – Kris
I think where you run into trouble is where you try to over-standardize, and you try to say, let’s give T-shirt sizes to all the clouds, and let’s have a small translate to an EC2 large, and then you’re getting into these weird anti-patterns. But why is a small a large? And you’re trying to make all these assumptions. And I feel like Kubernetes really has given us a lot of value in abstracting things, but in a weird way has probably over-abstracted a lot and put a lot of complexity between what you’re actually trying to do and what the software is actually going to be doing for you.

[00:19:53.960] – Ned
Right, you said something really important to me there, which is we’re asking infrastructure operators right now to also be software engineers, and I know that we’ve heard that from the audience. They felt that pain of I had my way of operating things before, but now you’re telling me I need to know how to write full blown Go code, or I need to learn the entire HCL language for Terraform. What would you think of being the ideal way that infrastructure operations should handle things and what should be the domain of software engineers?

[00:20:28.160] – Kris
I mean, I think in my mind when I look at an infrastructure engineer, they’re like the Navy SEAL of the engineering org. Not only do they have to do the same thing applications engineers need to do, but they need to do it underwater with people shooting at them, at midnight with scuba gear on. And I kind of feel like that’s really what we’re saying here: we need to go above and beyond, and that’s hard. That’s challenging. And I don’t think we’re doing anything too terribly different than what we’re doing today.

[00:21:01.710] – Kris
When you think about it, I wrote this project called NAML, which stands for Not Another Markup Language, which takes a one to one comparison of a Go file and puts it up line by line against a YAML file. As it turns out, trivial syntax is really the only big differentiator between a Kubernetes deployment YAML and a Kubernetes deployment Go. And I don’t really think that for anyone who is approaching some of this configuration management and is learning how to do HashiCorp config languages or learning Chef or learning Puppet or any of these,

[00:21:44.680] – Kris
I don’t think that’s any harder than learning how to write an if statement in Go, or Python or JavaScript, right? It’s really not that big of a difference. And I guess my point is, if you start using Puppet config language, what you’re doing is you’re now a Puppet engineer, you’re a Puppet software engineer, and you’ve learned Puppet. And I think in your career it’s wiser to learn a programming language.
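
To make the "trivial syntax" comparison concrete, here is a small Go sketch with the roughly equivalent YAML shown in a comment. The types are local stand-ins written for this example; they loosely echo the shape of a Kubernetes Deployment but are not the `k8s.io/api` types and not NAML’s API, and the names and image tag are illustrative only.

```go
package main

import "fmt"

// Container and Deployment are stand-in types local to this sketch.
// They are NOT the k8s.io/api types or anything from NAML.
type Container struct {
	Name  string
	Image string
}

type Deployment struct {
	Name       string
	Replicas   int
	Containers []Container
}

// webDeployment declares the same data a YAML file would:
//
//	name: web
//	replicas: 3
//	containers:
//	  - name: nginx
//	    image: nginx:1.21
func webDeployment() Deployment {
	return Deployment{
		Name:     "web",
		Replicas: 3,
		Containers: []Container{
			{Name: "nginx", Image: "nginx:1.21"},
		},
	}
}

func main() {
	fmt.Printf("%+v\n", webDeployment())
}
```

Line for line the two forms carry the same information; the Go version additionally gets a compiler, a type checker, and the option of an ordinary if statement when one is genuinely needed.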

[00:22:11.630] – Ethan
Okay, yeah. You’re advocating for learning a programming language, because you can get the same kind of a job done. We have, if we’re not software developers, maybe a fear of something like Go or C, because as soon as you drop into that, if you’ve not seen that before, it seems like an awful lot to get your head around and rather cryptic what’s going on. Whereas a domain-specific language might be a little more friendly to an operator, feel a little easier to get your head around.

[00:22:34.870] – Ethan
But, Kris, I agree with you under the hood. The fundamental constructs are the same. You’ve got operators, you’ve got conditional statements, you’ve got looping and iteration, and those things are common everywhere, no matter what language you look at. So it feels like that’s the argument you’re making, since fundamentally it’s really the same. And once you get your head around the crypticness of something and the way you express yourself in a given language, learn the more generic programming language you can use everywhere. Because why wouldn’t you do that? It’s more applicable to your career.

[00:23:07.350] – Kris
I also think that honestly, if anybody here has ever written a Helm chart before, I think you have to learn more to write a Helm chart than you do to just write a Go program with a client-go implementation. I feel like 90% of writing any Kubernetes YAML, Helm aside, is learning what it means in Kubernetes. Then we also have to learn what state functions are and a secondary syntax, and we can’t really test it. We can’t really compile it. It’s not going to yell at us and tell us what line number we messed up on.

[00:23:43.600] – Kris
I just think that at some point we need to just look at what we’re doing, and we just need to ask ourselves, are we introducing complexity because we’re afraid of something or introducing complexity because we need it? And I’m just not convinced that anybody who’s writing any of these configuration management languages couldn’t also be equally as effective in just a regular old programming language.

[00:24:05.820] – Ned
I think, and Ethan, you kind of pinpointed this as well, there is a fear there if I’m coming from an operations background. I’m used to bash scripting or PowerShell scripting or whatever. I’m comfortable with that level of things, but I’ve also worked a lot in config files. I’m always on some sort of server opening up a config file, changing some values; that feels familiar and comfortable to me. And also, if I’m working in the cloud, I’ve used ARM templates or CloudFormation, so that feels kind of normal to me.

[00:24:33.500] – Ned
So it’s a natural progression to go that way, but then you miss out on all the tooling that exists for generalized purpose programming languages.

[00:24:43.420] – Ethan
Well, but, Ned, to follow a point you were saying, it feels familiar, too. I would say it’s scoped. It’s more narrow in definition, so it’s less intimidating as well. Whereas with a general purpose programming language, Kris, to get your head stuck into that in the beginning, just that initial learning curve can be a little overwhelming. What do I do with this thing? It’s like being handed paint brushes and a blank canvas: make a pretty picture. How do I do that, exactly?

[00:25:10.420] – Kris
So I completely agree. And what I look at is, let’s look at the culture behind application or traditional software engineers and operations engineers. I’m a girl, but I’m an ops guy. This is what I do. I approach problems and I can solve them quickly because I can think in systems. I can be reactive. And honestly, I like the serotonin hit that I get with operations, and every ops person, every, finger quotes, ops guy that I’ve met is kind of the same way. They can look at something and say, I get it, and that’s the value that we have in a world where that’s rewarded.

[00:25:55.360] – Kris
That’s why we’re paying people. That’s why they’re here. That’s what makes them effective, their ability to see something and say, I get it and then change it and learn it. I can’t tell you how many operations people I’ve seen in our production system teach themselves a kernel paradigm within five minutes, just because they can just see how it works and watching them piece it all together. And so anyway, in a world where you’re only looking at config, that’s the only place you’re ever going to learn from.

[00:26:21.130] – Kris
So I think that us as a community, as infrastructure management, we probably have a great responsibility to share those patterns of saying like, it’s a Go file or it’s a Ruby file or Python file instead of proprietary configuration management. And I think that we’ll just see that people start to pick it up more and more. In a world where we have this blank canvas and paintbrushes, and we’re surrounded by people who are able to learn things very quickly. Yeah, I would just like to just start seeing more involvement with the operational paradigm of how we function involved with the engineering element of how we build feature driven work.

[00:27:03.800] – Kris
And I think that’s, that’s hard. Like I said, this is a loaded statement.

[00:27:08.150] – Ethan
I think it’s easier if you’re living in the cloud world, Kris, to go the route that you’re talking about. It’s harder if you’re dealing with traditional infrastructure. If you’re still dealing with metal on prem, vendors don’t send you down the direction you’re describing. I think if you’re in an AWS or Azure, et cetera, world, then it is much easier to go down the road you’re saying. Ned, cloud development kits that we recorded about in recent weeks here?

[00:27:31.280] – Ned
Yeah. The fact that if you want to use Go or Python or whatever, there is a development kit specifically for AWS and Azure and the other major clouds, or you can just interact directly with their API, if you prefer to do that. With the on-prem stuff, a lot of the time they don’t expose an API to you, and you have to use whatever their weird proprietary UI is, or God forbid, you have to load up a Java applet on this specific version to configure your SAN array, not that that’s ever happened to me.

[00:28:03.060] – Ned
If someone’s thinking about all right, I’m doing some infrastructure as code now, but I’d like to start making this change over to infrastructure as software. Where could they get started? How would they begin bridging that gap?

[00:28:19.160] – Kris
I think this goes back to the idea that operational folks are natural learners and they’re natural observers. So I think there’s something to be said about having a good example here, and I think that Puppet is a great example of writing the software to reconcile configuration. I think there’s a ton of examples in Kubernetes, if you think about it. If you set load balancer equal to type, or Service is equal to type LoadBalancer, gosh, I can’t believe I just screwed that up. But you expect, especially in the cloud, that there’s some concept of a load balancer to be created.

[00:28:55.480] – Kris
In a bare metal world you have to figure that out on your own, which is probably intrinsically a lot harder to do. But I mean, that’s an example of using software to mutate infrastructure and to represent infrastructure in a one to one way, and that’s bundled in an application abstraction. So I think a lot of it’s going to be understanding that there’s going to be some change. And I think there is a little bit of a there’s a dance here. You have to understand when it makes sense and when it doesn’t.
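A small sketch of the Kubernetes example Kris mentions, written as ordinary Python rather than hand-edited YAML. The field names follow the standard Kubernetes Service manifest; the function name and the example values ("web-frontend", "web", port 80) are hypothetical and only here for illustration.

```python
import json


def load_balancer_service(name: str, app_label: str, port: int) -> dict:
    """Build a Kubernetes Service manifest of type LoadBalancer as plain data."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # In a cloud environment, this type asks the provider for a load balancer.
            "type": "LoadBalancer",
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    # "web-frontend" and "web" are hypothetical names, not from the episode.
    print(json.dumps(load_balancer_service("web-frontend", "web", 80), indent=2))
```

Because the manifest is just a dictionary, the usual language features (functions, loops, tests) apply to it directly.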

[00:29:22.360] – Kris
And I think for me, the Turing complete line in the sand is like my first big fire alarm that goes off. Whenever I start seeing a lot of Turing complete constructs baked in, in ways that are coupled with some sort of proprietary tooling, that gets a little worrisome for me. And then other than that, I think it’s just finding those use cases where it’s an easy place to get started. I’ve found out that building a platform in which you can add use cases and features to is actually more valuable than actually solving a use case in the first place.

[00:29:57.760] – Kris
Getting a tool that you roll out to all your systems that you can add a feature to is actually substantially more effective than finding that first feature, because once the tool is there, it’s relatively easy and it’s not a lot of work or investment to lose context and switch what you’re doing to just add a yeah, let’s go configure the switch. Let’s flip out a node, let’s go change the file system. Let’s reformat this hard drive. Whatever. Having a place is more important than having the feature in my mind.

[00:30:26.400] – Kris
So I would say that’s a good starting point as well.
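One way to picture the "place to put features" Kris describes is a tiny registry that a tool ships with, so that adding a feature means adding one function. This is only a sketch under assumptions: the feature names, the `run` helper and the placeholder bodies are hypothetical, and a real tool would talk to actual devices.

```python
from typing import Callable, Dict

# Registry of features the tool knows about; each name maps to a plain function.
FEATURES: Dict[str, Callable[[str], str]] = {}


def feature(name: str):
    """Decorator that registers a function as a named feature of the tool."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        FEATURES[name] = func
        return func
    return register


@feature("configure-switch")
def configure_switch(host: str) -> str:
    # Placeholder: a real implementation would talk to the device here.
    return f"configured switch on {host}"


@feature("reformat-disk")
def reformat_disk(host: str) -> str:
    # Placeholder: a real implementation would reformat a drive here.
    return f"reformatted disk on {host}"


def run(name: str, host: str) -> str:
    """Look up a feature by name and run it against one host."""
    if name not in FEATURES:
        raise KeyError(f"unknown feature: {name}")
    return FEATURES[name](host)


if __name__ == "__main__":
    print(run("configure-switch", "node-01"))
```

Once the registry and the rollout exist, the next feature is one more decorated function rather than a new tool.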

[00:30:29.540] – Ethan
[AD] We pause the episode for a bit of training talk. Training with CBT Nuggets. If you’re a Day Two Cloud listener, and you are, you’re listening to it right now, then you’re probably the sort of person who likes to keep up your skills, as am I. Now, here’s the thing about cloud as I’ve dug into it over the last few years. It’s the same as on-prem, but different. The networking is the same, but different due to all these operational constraints you don’t expect. And just when you have your favorite way to set up your cloud environment, the cloud provider changes things or offers a new service that makes you rethink what you’ve already built.

[00:31:01.520] – Ethan
So how do you keep up with this? Training, and this is an ad for a training company. So what did you think I was going to say? Obviously, training, and not just because sponsor CBT Nuggets wants your business, but also because training is how I’ve kept up with emerging technology over the decades. I believe in the power of smart instructors telling me all about the new tech so that I can walk into a conference room as a consultant or a project lead and confidently position a technology to business stakeholders and financial decision makers.

[00:31:28.560] – Ethan
So you want to be smarter about cloud? CBT Nuggets has a lot of offerings for you, from absolute beginner material to courses covering AWS, Azure and Google Cloud skills. Let’s say you want to go narrow on a specific topic. Okay, well, there’s a two-hour course on Azure Security. Maybe you want to go big wide. Alright, there’s a 42-hour AWS Certified SysOps Administrator course and lots more cloud training offerings in the CBT Nuggets catalog. I gave you just a couple of examples to whet your appetite.

[00:31:59.410] – Ethan
In fact, CBT Nuggets is adding 40 hours of new content every week, and they help you master your studies with available virtual labs and accountability coaching. Interested? Of course you are. So satisfy your curious mind by visiting CBT Nuggets com Cloud and figure out if CBT Nuggets will work for your training with their seven-day free trial. Go do it. CBT Nuggets com Cloud for seven days free. That’s CBT Nuggets com Cloud. And now back to the podcast I so rudely interrupted. [/AD]

[00:32:33.830] – Ethan
There’s an element of this that we’ve skipped over Kris. That is, if you’ve never been a programmer before, do you think there’s any kind of a hurdle worth mentioning about things like IDEs, getting your head around libraries and just those fundamentals or is that. You keep saying Ops folks are they just absorb all this knowledge. They’re just so smart, a whole bunch of them. And so we don’t need to even worry about that.

[00:32:57.220] – Kris
I mean, I’m not trying to say Ops folks are able to just learn everything. I’m saying that’s how they learn, at least in my experience. Right? I got into my day job, I didn’t go to college, I didn’t have a degree, I read a few books, and I watched, right? I watched people. I paid attention, I learned, I listened. And I feel like most Ops folks, this is like a blanket statement, so big asterisk here, I feel like most Ops folks at some point in their careers find value in watching, observing and learning, and I think that’s just part of the industry.

[00:33:31.420] – Kris
Right? Like, as operations folks we’re rewarded for being able to solve problems quickly. So yeah, learning a programming language, it’s a lot of work, right? I put it on par with learning Vim or Emacs or Linux. It’s a lifetime of work, and programming languages change. And you really have to, like, accept the fact that this is going to be part of what you’re doing. I just keep going back to that. I don’t think there’s that much of a difference between some of the configuration management tools that I see and porting that over to a proper programming language.

[00:34:09.100] – Kris
So I would say that if you’re at the point where you’re lightning fast at writing Puppet config, and a Puppet config and an if statement is no big deal for you, and you’ve learned some of these lessons and you’ve been running it, and you know the good, the bad and the ugly of Puppet, chances are it’s not going to be that big of a job for you to get into a programming language. So if anything, this is just more of me trying to push folks into accepting the fact and letting them know that it’s not as big and as scary as it may seem.

[00:34:36.830] – Kris
Programming language documentation is written for programmers. And yeah. Anyway, I’ll shut up here and let you guys respond.

[00:34:44.970] – Ned
I think it’s really interesting that the point you made was if you’re using a config management software that already has conditional statements in it, you’re already programming. It may be like training wheels programming, but you’re already programming. And actually, I mean, at least in my experience, because of the fact that we don’t think of it as software, the tooling that surrounds it just isn’t there. So if you want to do unit testing on Terraform, you have to go to Go to do that, because that just does not exist in the HCL language.
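To illustrate Ned’s point about tooling, here is a sketch of the same kind of conditional logic that often lives inside a config template, written as a plain Python function with tests next to it. The function name, package names and OS families are made up for the example; the test functions use bare `assert` so a runner such as pytest could discover them.

```python
def packages_for(os_family: str, enable_tls: bool) -> list:
    """Pick packages to install: the kind of conditional often buried in config."""
    packages = ["nginx"]
    if os_family == "debian":
        packages.append("apt-transport-https")
    if enable_tls:
        packages.append("certbot")
    return packages


def test_debian_with_tls():
    # Both conditions apply, so both extra packages are added in order.
    assert packages_for("debian", True) == [
        "nginx",
        "apt-transport-https",
        "certbot",
    ]


def test_other_os_without_tls():
    # Neither condition applies, so only the base package remains.
    assert packages_for("redhat", False) == ["nginx"]
```

The logic is no more complicated than what a template would hold; the difference is that a debugger, a linter and a test runner all understand it.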

[00:35:19.650] – Ned
Same thing with Helm. Like, you brought that up. And, of course, I had flashbacks of writing a Helm chart with logic and then trying to debug it. And that was one of the more painful experiences that I’ve ever had because it was something stupid that I missed. But of course, the Linter doesn’t pick it up, and there’s no easy way to debug it. So you’re just stuck going “Ugh why?” So in your ideal world would it be more that you would write software in something like Go or Python that would then produce the config that can be ingested by something else?

[00:35:54.450] – Ned
Is that sort of the workflow that you advocate for?

[00:35:57.460] – Kris
There’s a tool out there. I think it’s called CDK that came out of Amazon. That kind of does this. I’ll reserve my thoughts and opinions and feelings on that tool for another day. I’ve never used it. I’ve just read the README. I personally would not build a tool that goes from programming language to config to software to production or to the system, whatever that system may be. Again, that’s four hops to get to mutating the server. I think what I’m really advocating for here is, if you’re in a world where you’re writing Helm charts, you’re doing conditional things and you’re iterating, and you’re doing these Turing complete constructs, all of these tools are good.

[00:36:40.780] – Kris
I just am challenging that they’re never going to be Go good. They’re never going to be Python good. They’re never going to be Ruby good. And I think that’s because that’s not what they set out to do. So I’m not trying to say there was anything wrong with them or that there’s anything that they could do better. I’m just saying that they were built for a different reason. Programming languages were built to be generic and have unit tests that give people everything they need to build Turing complete software for whatever their use case may be.

[00:37:10.280] – Kris
And it’s more of a raw resource than it is, like a custom resource. So, yeah. I mean, I just think that’s got to be a natural jump after we get into this configuration management bit, as engineers.

[00:37:25.110] – Ethan
Do you think it matters which language an operator would pick? Because you’ve thrown around Python, Ruby. I think Java’s come up. Go has definitely come up, but these aren’t interchangeable. Each of these languages is actually rather different when you get into them and how they’re structured, how they express themselves, community support, and so on.

[00:37:46.340] – Kris
So, there’s two answers. There’s a technical answer, and there’s the human answer. The human answer is pick the one you’re most familiar with, in a world where we have people who learn through osmosis and are successful at looking at the state of systems and teaching themselves, like, common truths about these systems and applying those truths later for different reasons. If you’re in a world where you’re doing, if you’re like, what is it, Puppet’s written in Ruby, I believe, right?

[00:38:10.680] – Ned
At least originally it was Ruby. I don’t know if that’s still the case, but yeah.

[00:38:14.650] – Kris
Let’s say we have a tool like Puppet that is written in Ruby. I’m sure there’s going to be a lot of constructs of the Ruby programming language that trickle into that tool, which means chances are you’re already probably pretty familiar with them, and you just need to sit down and realize that these constructs have a name and you’re actually probably pretty opinionated about them. And have a lot of thoughts and feelings about some of them. Some of them you might like, some of them you might not like.

[00:38:40.440] – Kris
And I think, as a human, part of going through and learning these languages is developing your opinions. It’s going to be a part of it. Don’t completely start in left field. If there’s already something you’re familiar with, I think it makes sense to continue down that path. And if you look at, let’s take Helm again, which I think we’re all familiar with, like Helm unit testing or Helm chart testing, which is written in Go, like there’s your hook into Go. Right? You’re struggling with finding where your line number is that you’ve messed up, interpolating your YAML, and then you start writing a test to fix it.

[00:39:11.690] – Kris
Well, guess what, you’re writing a Go unit test, and I hate to say it, but writing a Go unit test for text template and YAML interpolation at runtime is probably not the best introduction to Go there is out there. And I’m just trying to tell people, give it a shot. Give Go a shot, give Ruby a shot. Give Python a shot.

[00:39:30.940] – Kris
And I think you’ll be surprised how much of it you’ll enjoy and how much of it will make sense to you.

[00:39:35.840] – Ned
Yeah, if I could share how that resonates with me, I do a lot of work in Terraform, and Terraform’s written in Go. And when I started working on some of the functions or using some of the functions that exist in Terraform, I was like, I really wish this other function existed. And I started doing a little digging into the source code, and I was like, wait, they’re just reimplementing all of the Go functions, but they’re exposing them to the HCL language, so all those functions exist.

[00:40:03.670] – Ned
It’s just a matter of that one doesn’t exist in Terraform yet just, you know, port it over and expose it through HCL. And you’re good to go. And I was like, oh, I bet I could do this if I just look at an example of how they did it once and then do it for the function that I actually want. And it worked without me knowing a whole lot about Go, because, like you said, I kind of picked it up a little bit just from being steeped in it and picking up the hints that were in HCL.

[00:40:28.440] – Kris
A rose by any other name would smell as sweet, William Shakespeare. An if statement is an if statement, no matter where you happen to see it. Right? A string to lower string to upper split string, a rose is a rose, we can wrap it up into an abstraction and call it configuration management language or you can remove that layer of complexity. And that’s what I’m advocating for here. If you’re splitting strings and you’re looping over things and you’re building complex logical systems. Sweetie, you’re a programmer. Welcome to the club, right?
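Kris’s point about splitting strings, lowercasing and looping can be shown in a few lines of a general purpose language. This is just an illustrative sketch; the function name and the sample hostnames are hypothetical.

```python
def normalize_hostnames(raw: str) -> list:
    """Split a comma-separated string, lowercase each entry, and drop blanks."""
    hosts = []
    for part in raw.split(","):       # split string
        name = part.strip().lower()   # string to lower
        if name:                      # an if statement is an if statement
            hosts.append(name)
    return hosts


if __name__ == "__main__":
    print(normalize_hostnames("Web01, DB01,,cache01 "))
    # prints ['web01', 'db01', 'cache01']
```

The same three operations show up in most configuration languages under different names, which is the "rose by any other name" argument.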

[00:41:04.280] – Ethan
Kris, you gave the answer I was hoping that you were going to give, because my philosophy is kind of the same thing. I spent most of my time in Python because as it happens, I deal with a lot of networking gear. And there happened to be a lot of libraries written in Python that support the networking world. And so that’s been the path of least resistance for me. The fastest way to get things done doesn’t mean I wouldn’t switch over to something else, because, as you say, once you start looking under the hood, everything gets pretty similar.

[00:41:31.780] – Ethan
Ultimately, with the way it’s structured and the way it works, you just got to get your head around the syntax and, certainly, some languages are a bit peculiar in how they do things. But once you get your head around it, it’s kind of the same.

[00:41:45.900] – Kris
Yeah. I think there’s a reason that we see a lot of these common functions across programming language libraries. There’s a reason we have the concept of logging and standard out and string manipulation, if your string contains. These are all there, and the syntax is different. And Python is better than some languages at dealing with sets of strings or sets of ints and things like that. But ultimately, as you start thinking about problems and what you’re dealing with, with a system that’s in front of you, I think there’s a natural evolution to programming.

[00:42:20.320] – Kris
And I think it’s really easy to miss the boat, so to speak. When, as an operations engineer, as an infrastructure engineer, you’re brought up in this config management, onesy twosy scripts to get the job done, it’s really easy to miss the point where you go, I’m actually a full fledged software engineer. And once you kind of miss that turning point, I found that folks actually kind of double down on configuration management, and that can come back to hurt people, or not necessarily hurt people, but it could come back to introduce a lot more complexity than it was intended to, right?

[00:42:53.060] – Ned
It’s the complexity that really kills, it’s adding complexity and abstractions when you don’t need them. And I think that’s what you’re talking about. Where if I write software that then creates a text file, which is then ingested by a different piece of software to create the resource, right, there’s so many places where that linkage can break or screw up or be misconstrued between the two. Wouldn’t it be easier if it was just, I write some software that configures a resource, and I’m done?
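The "software that configures a resource directly" idea can be sketched as a small reconcile function that compares desired state with actual state and applies the difference, with no intermediate text file. Everything here is an assumption for illustration: the in-memory `actual` dictionary stands in for a real provider API, and the resource names are hypothetical.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Mutate `actual` in place until it matches `desired`; return actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actual[name] = dict(spec)      # stand-in for a "create" API call
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actual[name] = dict(spec)      # stand-in for an "update" API call
            actions.append(f"update {name}")
    for name in list(actual):
        if name not in desired:
            del actual[name]               # stand-in for a "delete" API call
            actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    # Hypothetical resources; a real tool would read `actual` from the provider.
    desired = {"lb-web": {"port": 443}, "lb-api": {"port": 8443}}
    actual = {"lb-web": {"port": 80}, "lb-old": {"port": 22}}
    print(reconcile(desired, actual))
    # prints ['update lb-web', 'create lb-api', 'delete lb-old']
```

Running it a second time against the same state returns an empty list, which is the property that makes this style safe to repeat.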

[00:43:17.050] – Kris
This is like, I tweeted this the other day. I think we’ve all seen the never ending USB adapter that goes from a serial port to a PS/2 port, and then you have these twelve things, and then you have a USB plug at the end, and then, ironically, usually that serial port is right next to a USB port. And it’s like, at some point you just need to realize that you’ve gotten to this very complicated system by only doing the right thing. Year after year, you went to the server room and added the adapter you needed.

[00:43:46.600] – Kris
And then all of a sudden you’re left with this USB port. And at some point somebody just needs to walk up to the server and rip the old adapters out and just go straight into the server. And that’s kind of what I’m saying here. At some point, we need to stop and realize that it’s possible to get a lot of complexity by always doing the right thing. And at some point, if you continue to do things the way you’re doing, you’re just going to be going through that same endless chain of USB complexity that you don’t necessarily need.

[00:44:11.720] – Ned
I like that it’s not anyone’s fault. It’s not like people did things wrong along the way. It’s just you’re trying to do the right thing. That was the best option available at the time. Maybe now it’s time to reassess it.

[00:44:25.290] – Kris
Yes, and when you reassess, like, if you come up, take a breath, do what you need to do, take a week off work, whatever. A lot of the things you care about, I’ve found out, are natural learning points in a programming language. Like, you want to learn how to send an HTTP request. You want to learn how to create a resource in AWS, you want to learn how to mutate a string, or you want to learn how to do these basic paradigms. These are there at your fingertips, and it’s surprising how natural a lot of this can come with just a little bit of time and patience.

[00:44:57.860] – Ned
I think to a certain degree, at least this has been my personal experience, because I did not start out as a programmer, and I’m not a programmer by trade, I don’t want to call myself a developer. You know, there’s this imposter syndrome I feel by saying, “Oh you know as a developer” because there’s this voice in the back of my head that goes, you’re not a developer. You just write PowerShell. It’s kind of like you have to tell that little voice that it’s wrong. Do you think that’s something that’s standing in the way of Ops engineers today? They think to themselves,

[00:45:32.910] – Ned
But I’m not a developer, right?

[00:45:34.700] – Kris
Absolutely. Which is why I’m over here yelling about infrastructure as software, because I do think it comes from within. I feel like the Lorax. I speak for the trees. I’m over here telling people you are a developer. You are writing software. Software means Turing complete systems, and you’re doing that. Your system changes at runtime. If this is set to true, or if this is set to false, that is software. That’s what you’re doing. And I think that I can’t encourage people enough to just accept the fact that they are, whether they want to be or not,

[00:46:09.540] – Kris
Software engineers, and I want to welcome them with open arms. Welcome to the party. I mean, it sucks here. Don’t get me wrong. It’s not fun, but welcome at the very least. Like, if we’re gonna be mad about software, let’s be mad about software together.

[00:46:27.800] – Ned
Welcome to the party. It’s awful, but at least we can all have an awful experience together. I think it could be a better experience if you’re bringing more people into the conversation, right?

[00:46:37.320] – Kris

[00:46:38.100] – Ned
Yeah. Awesome. Well, this has been. Go ahead. Go ahead. Sorry.

[00:46:42.380] – Kris
No, I was just gonna say it’s always easier to think that there’s this magical point in the future that is gonna make everything better. I was thinking about this the other day. I’m getting ready to move my house and all around me in the house. It’s just the couches behind me, the Heptio pillows, this microphone. This is all a reflection of things that my past self said. If I can only get that, the world’s gonna be better, and it’s got to be better when I get there.

[00:47:06.080] – Kris
And at some point, you just need to stop and enjoy it, and you need to let the past version of yourself kind of enjoy that. These are all things that you worked hard to get. And what I’m trying to say is there’s a big, big hurdle in moving from configuration management to software engineering. And I think that starts with the name, of course, and the fact that it’s a very confusing paradigm. But more so, there’s this inability to think, to take a look around you and actually appreciate what you’ve done.

[00:47:32.770] – Kris
But like, one way of saying it is, I wrote Puppet professionally for three years. Another way of saying it is, I wrote software systems that manage production enterprise servers and was able to iterate quickly and solve problems for business. And I can’t advocate for folks to be on the latter of those two enough.

[00:47:53.070] – Ned
Right, right. The second sounds so much more impressive, but they’re functionally equivalent. And I don’t know, it’s a different way of thinking of things. And I really appreciate the way that you put it. It’s given me a little more, maybe a kick in the butt to actually go and learn Go, because I’ve been meaning to forever, and it’s just been, well, but I know enough to get by. It’s, like, well, maybe it’s worth crossing that hurdle or getting over that hurdle to learn a general purpose programming language because my life will be better at the end.

[00:48:26.200] – Kris
I also think that the programming, the engineers also are a little guilty of this as well. Right? If you join any engineering org or team or project or group or whatever, there’s a constant theme being, we’re understaffed, we need people. We don’t have enough good senior engineers. We need help. We need help. We need help. For someone who goes around saying that a lot, we sure do put a lot of hurdles between going from Turing complete configuration management to writing software to manage infrastructure. And it’s kind of like I’m over here just like, but these people know how to do what we need, right?

[00:49:05.060] – Kris
They know how to write if statements. They understand what a server is. They probably know more about routing and networking than you do, bro. Let’s let them come and we’re all working on the same team here. And so I would certainly say that if you’re a software engineer listening to this welcome your newly found configuration management brethren to the infrastructure as software society for lack of a better term, but also as an infrastructure management person. You know, it’s really not that big of a gap and believe in yourself and trust yourself because this is what you’re doing.

[00:49:41.950] – Kris
You are an engineer and this is what you’re good at and you just need to call it something else so that the engineers can hear what you’re saying and all you’re really doing is just delivering the same message on their terms.

[00:49:54.950] – Ned
Awesome. Well I gotta say, this has been a fascinating conversation. I’ve really enjoyed it. Are there any key takeaways beyond what you’ve already said that you’d like for the audience to hear?

[00:50:07.400] – Kris
Key takeaways? Don’t listen to me. I’m crazy, right? I have these thoughts for myself because this is just the path that I’ve gone down. So I definitely think that in general the tech industry is subject to this whole, like, I feel like this is like Western medicine. Like take a pill and you’re cured, right? Like what’s the solution? Where’s my antibiotic for my team that I could just take once a day for six days and then my problems are magically gone? I don’t think this is that type.

[00:50:41.190] – Kris
We’re not going to see that type of solution here. I think this is more of a cultural problem, and I think this is more of a, how do we start bringing these two, what were traditionally tightly coupled yet very culturally different paradigms, into one harmonious working space. I think Kubernetes did a good job at that, we started to tiptoe into infrastructure management from software land. And I think we can go further. We can keep pushing on this and we can keep bringing this together as a holistic paradigm.

[00:51:11.590] – Kris
And so that would be my big key takeaway, is really thinking about this as an opportunity to get involved and an opportunity for you to grow. And to me, that’s more important to how we’re looking at the culture and the problem than, you know, it doesn’t matter if you’re using Terraform or Puppet or Chef. What matters is that, like, we’re solving problems as a team and that we’re approaching this in ways of harmony instead of trying to keep these things separated for some unknown reason.

[00:51:39.130] – Ned
Awesome. That was a great key takeaway. Alright, if folks want to follow you, are you active on social media? Do you have a blog you’d like to promote?

[00:51:49.300] – Kris
Yeah, Twitter com slash Kris Nova, or my blog is nivenly dot com. They all point to each other. So if you find me anywhere, you should be able to find enough links to the other things. I stream on Twitch. I do everything from work on the Linux kernel to infrastructure management. I work at Twilio. My day job is managing enterprise servers at scale with Kubernetes. Yeah, I don’t know. I’m around. It’s not too terribly hard to find me.

[00:52:16.630] – Ned
Awesome. We will include links in the show notes. Well, Kris Nova thank you so much for being a guest today on Day two Cloud.

[00:52:22.860] – Kris
Thank you.

[00:52:23.560] – Ned
And hey, listeners out there, virtual high fives to you for tuning in. If you have suggestions for future shows, you know, we’d love to hear them. You can hit either of us up on Twitter at Day two Cloud show, or you can fill out the form on my fancy website, Ned in the Cloud com. Hey, speaking of my fancy website, I recently launched a totally redesigned version of my site. It’s easier to navigate, more visually appealing, because somebody else did it, and more performant. So definitely check it out and let me know what you think. That’s Ned in the Cloud com. Until next time,

[00:52:54.430] – Ned
Just remember, Cloud is what happens while IT is making other plans.
