Day Two Cloud 100: Get To Know Crossplane: An Infrastructure Control Plane For K8s

Episode 100

Today’s Day Two Cloud is an introduction to Crossplane. What is it? The official definition says Crossplane is “…an open source Kubernetes add-on that enables platform teams to assemble infrastructure from multiple vendors, and expose higher level self-service APIs for application teams to consume, without having to write any code.”

In other words, Crossplane plugs into Kubernetes to serve as a control plane that can run across multiple private and public clouds. It allows infrastructure teams to compose infrastructure with all the required policies, permissions, and guardrails, while also providing APIs for developer self-service.

Crossplane is a sandbox initiative in the Cloud Native Computing Foundation (CNCF), meaning it’s an experiment that, if it attracts enough love, attention, and support, can one day become a full CNCF project.

Our guest and guide to Crossplane is Dan Mangum. Dan is a Senior Software Engineer at Upbound, as well as a Crossplane maintainer and a Kubernetes SIG Release Tech Lead.

We discuss:

  • What Crossplane does and why it might be useful
  • LLVM as a metaphor to understand Crossplane
  • Wait, what is LLVM?
  • How Crossplane compares to something like Terraform
  • The benefits and tradeoffs of another layer of abstraction/complexity
  • Where and how to play with Crossplane
  • More

Sponsor: CBT Nuggets

CBT Nuggets is IT training for IT professionals and anyone looking to build IT skills. If you want to make fully operational your networking, security, cloud, automation, or DevOps battle station visit

Show Links:

The Crossplane Blog – – Daniel Mangum’s blog

@hasheddan – Daniel Mangum on Twitter

Crossplane Videos – YouTube

Is Crossplane the Infrastructure LLVM? –

Crossplane Slack

Daniel Mangum on LinkedIn



[00:00:01.250] – Ethan
Sponsor: CBT Nuggets is IT training for IT professionals and anyone looking to build IT skills. If you want to make fully operational your networking, cloud, security, automation, or DevOps battle station, visit cbtnuggets.com/cloud. That’s cbtnuggets.com/cloud.

[00:00:21.840] – Ned
Welcome to Day Two Cloud. On today’s show, we’ve got Dan Mangum from Upbound talking to us about LLVM and Crossplane. And if you’ve never heard of either of those things, do not despair, dear listener. We are going to get into both of those, and especially into Crossplane and how it automates and creates a platform for different cloud providers. It sounds confusing. It can be a little confusing, but I think it’s also kind of awesome. What do you think, Ethan?

[00:00:51.480] – Ethan
Ned, I have so many things I’m thinking right now. So this show, this is one of the fastest, minute-wise. I couldn’t believe it. Forty-five minutes had come and gone. I looked over and it was like 15 minutes, and I looked over at the recording software again and it’s like we’re coming up on an hour. Where did the time go? Crossplane, I mean, you mentioned LLVM, and Crossplane gets us into this world of abstraction. It’s another magical abstraction where you don’t want to have to care what cloud your workload is running in because Crossplane does it for you, but you can control it by writing all of your own definitions about how all that’s supposed to work, which makes a dev’s life easier.

[00:01:30.030] – Ethan
But it’s so much more than that. I think we were talking after the show about it as a Rube Goldberg machine. If that’s K8s, you’ve got another Rube Goldberg machine here. But there’s no reason to fear, listener. This is the kind of show where, if you’re into this stuff, Daniel goes so deep that you’re going to have to listen to the show twice or maybe three times. I was just here, Ned, and I don’t think I got the entire show into my head on the first pass, and I was really paying attention hard.

[00:01:55.470] – Ned
Yeah. I mean, you know, they say fear is the mind killer. You must not fear, and do not fear this episode with Dan Mangum, Senior Software Engineer at Upbound, Crossplane maintainer, and Kubernetes SIG Release Tech Lead. Enjoy the show. Daniel, welcome to Day Two Cloud. Let’s dive right into the conversation: getting started with Crossplane. Now, Crossplane is something that I’ve been hearing a lot of buzz about. And somehow it led me to a blog post you wrote called Is Crossplane the Infrastructure LLVM?

[00:02:26.160] – Ned
And after reading the entire post, I’m not going to lie. My brain whimpered and shut down for a little while and took a little nap. But then I woke up and I got a chance to try again with you, the human who wrote this post. So let’s start with the acronym that’s in that post, LLVM. What does it stand for and what does it mean?

[00:02:47.040] – Daniel
Yeah, absolutely. Well, first of all, thanks for having me on the show. Always excited to chat with folks in the space. LLVM stands for low level virtual machine, although that name itself is a bit of a misnomer. It can often be confusing for folks who are learning about the project. But essentially I describe it as a compiler toolchain framework. The structure of it is that you have front ends and back ends, and then a variety of kind of optimization and mid-level representations.

[00:03:14.370] – Daniel
And those sit in the middle. And the idea, right, is that you can build a front end, which would be something like a programming language. So Rust is an example of that. C and C++ have compilers that are based on LLVM, etc. And then the back ends, right, target different hardware architectures, so x86, Arm, RISC-V, et cetera. And the benefit of having a toolchain be able to do this is that you don’t have to write, for each permutation of front end and back end, a compiler that emits the correct instructions for that particular architecture.

[00:03:47.730] – Daniel
And there’s a lot of different components. LLVM is an extremely large project, one of the largest open source projects and most successful. So we won’t get too far into that. But the idea, right, is that you present an interface that folks can interact with and design a new programming language for themselves, if they like, without having to worry about making sure that it emits the right kind of lower level concepts.
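
As a rough illustration of the front end and back end split Daniel describes, here is a minimal Python sketch. It is not LLVM’s API: the toy languages, the list-of-strings “IR”, and every function name are hypothetical, chosen only to show why a shared middle layer means writing M front ends plus N back ends instead of M times N compilers.

```python
from typing import Callable, Dict, List

# Toy intermediate representation (hypothetical): a list of abstract operations.
IR = List[str]


def lower_toy_c(src: str) -> IR:
    """Hypothetical front end: turns each token into an abstract op."""
    return [f"op:{tok}" for tok in src.split()]


def lower_toy_rust(src: str) -> IR:
    """A second hypothetical front end that normalizes case before lowering."""
    return [f"op:{tok.lower()}" for tok in src.split()]


def make_back_end(prefix: str) -> Callable[[IR], str]:
    """Builds a hypothetical back end that 'emits' text tagged for one target."""
    def emit(ir: IR) -> str:
        return "; ".join(f"{prefix}<{op}>" for op in ir)
    return emit


# Each front end and each back end is written once, against the shared IR.
FRONT_ENDS: Dict[str, Callable[[str], IR]] = {
    "toy-c": lower_toy_c,
    "toy-rust": lower_toy_rust,
}
BACK_ENDS: Dict[str, Callable[[IR], str]] = {
    "x86": make_back_end("x86"),
    "arm": make_back_end("arm"),
    "riscv": make_back_end("rv"),
}


def compile_source(lang: str, target: str, src: str) -> str:
    """Any front end can be paired with any back end through the IR."""
    return BACK_ENDS[target](FRONT_ENDS[lang](src))


def compilers_needed(n_front: int, n_back: int, shared_ir: bool) -> int:
    """Pieces to write: M + N with a shared IR, M * N without one."""
    return n_front + n_back if shared_ir else n_front * n_back
```

With two toy languages and three targets, the shared layer needs five pieces rather than six full compilers, and the gap widens quickly as either side grows.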

[00:04:09.360] – Ned
OK, so it’s sitting between the hardware and the programming language you want to use. It sort of reminds me, and correct me if I’m wrong, of the Java virtual machine, where the idea, albeit it didn’t always work great, was that you wrote the Java code once, and then the JVM was written for each hardware platform, and magic happened and your Java code just ran. Is that an apropos comparison, or is there a little more to it than that?

[00:04:34.890] – Daniel
It’s definitely similar. I believe that LLVM can actually target the JVM at this point. So the JVM idea there is that you put a common runtime on all hardware and then you target that runtime. Whereas traditionally, and like I said, I believe you can target the JVM at this point with LLVM, LLVM is used to emit specific machine code. So it’s kind of just changing the level at which you present the abstraction, the common architecture there.

[00:05:03.620] – Ethan
So, Daniel, forgive this old computer science major for asking this one, but it sounds to me like what a compiler did back in the day. If I could write C code, I would just feed it to a compiler that would turn out machine code on whatever platform I was writing that code on.

[00:05:17.580] – Daniel
Yep, that’s exactly right. And really what it’s doing is allowing us, every time we have a new language or want to build a new compiler for an existing language, not to have to reimplement all of the low level optimization parts of emitting machine code from that. So that’s exactly right. If you’ve ever used Clang as your C compiler or C++ compiler, that’s built on LLVM, and you could write your own C compiler on it or build a higher level language if you like.

[00:05:45.420] – Ned
OK, I think I get what LLVM’s purpose is, and it’s so you don’t have to reinvent the wheel every time. That makes total sense to me. Now you made the comparison between LLVM and Crossplane. So what drew you to making that comparison between the two?

[00:06:03.720] – Daniel
Well, first of all, I’d say that LLVM is wildly successful. So that’s the first comparison, of course, although it really does follow a similar model. Right. So let’s take kind of those different concepts we just talked about with LLVM. So we have the hardware or the instruction set architecture that we’re targeting as the back end of LLVM. With Crossplane, that can be really any API, but typically it targets cloud providers. So being able to create an RDS instance on AWS or a Cloud SQL instance on GCP or any other variety of infrastructure, and it doesn’t have to target cloud providers.

[00:06:41.580] – Daniel
We have providers that point at Slack or Twitter, order you a pizza or something like that. Basically anything that’s exposed via an API is on the table. So that’s kind of the analogous back end for LLVM. For the front end, think of a programmer, right, being the consumer of a compiler. So you present them with an interface and they interact with it. With Crossplane, the consumer would be developers within an organization, and the compiler authors in this case would be platform builders.

[00:07:09.120] – Daniel
Right. So there’s been a big kind of trend towards having platform teams or infrastructure teams within organizations, where folks essentially have a group that manages all of their cloud infrastructure or on-prem infrastructure, and then has a system, to varying levels of success, of offering that infrastructure to different development teams. Right. So the worst end of that spectrum may be you send the platform team an email and say, hey, I need a database or I need a VM or something like that.

[00:07:37.950] – Daniel
And then they hopefully don’t email you back the SSH credentials or something like that. At the kind of optimal end of that, the platform team is building a platform that developers can interact with and self-service on. Right. So the platform team puts restrictions in place and offers abstractions that fit within those restrictions and policies that allow developers to self-service on that infrastructure. So, you know, that might look like taking something like the AWS API and presenting a more Heroku like experience, like a higher level abstraction to developers.

[00:08:12.660] – Daniel
So Crossplane is what sits in the middle and also provides those back ends and says, we give you the ability to, within your organization, build a platform with policy and mapping of higher level concepts to lower level concepts and present that to your development teams and allow them to self-service on infrastructure.

[00:08:32.630] – Ethan
Bwwaaa… OK, but it’s such a moving target, though. All the things that you’re abstracting are changing fairly steadily. So how does Crossplane keep up?

[00:08:41.850] – Daniel
So for some history on the project, when it first started, the solution to providing these abstractions was actually trying to define a common standard across all cloud providers. And I can see from your faces that you know how that ended.

[00:08:56.960] – Ned
I know how well that goes.

[00:08:58.390] – Daniel
Exactly. Exactly. So that obviously was not a solution. It was useful to demonstrate the ability of kind of having this multi-cloud control plane. But it wasn’t really useful for end users, because with any meaningful usage, they would run into some edge case where the abstraction was leaky. So instead, the Crossplane team and the Crossplane community said, how do you provide value here? You enable people to build the abstractions and define those mappings to the low level concepts.

[00:09:26.340] – Daniel
Right. So we have these folks that are platform team members, which once again are kind of analogous to someone who would build a compiler, and they know within their organization, right, when a developer creates a database, should that create an RDS instance in us-east-1 on AWS, or should that create a Cloud SQL instance in US West on GCP? And what should the parameters be for that, depending on the environment it’s provisioned from and all of that?

[00:09:53.630] – Daniel
So instead we moved towards a model that we call composition, which basically allows you to take these granular managed resources, which are those things that represent resources on AWS, GCP, et cetera, and compose them into abstractions and define what those mappings are from, let’s say, a database to an RDS instance, and what level of configuration you’re actually giving to developers. And you can package that up, and we can get more into the details on how that works.

[00:10:22.520] – Ned
OK, so if I can just read back to you or say back to you what I’m hearing, it sounds like we don’t want to present the developers with the raw API or the raw interface to the various clouds that you might want to deploy stuff in. That’s too much. You instead want to pull back, put some controls around what they’re actually able to deploy, and also provide a bit of an abstraction for them so that they don’t shoot themselves in the foot when they’re trying to deploy this thing.

[00:10:48.860] – Ned
So you can have some sane defaults built into that layer. You can determine, no, you can’t launch the gargantuan database that costs ten thousand dollars an hour unless you get special approval or something. That’s not something that you as a developer have access to. Is that about right, what you’re going for with the Crossplane solution?

[00:11:09.650] – Daniel
Yeah, that’s an awesome summary. And one thing I would add to that is you’re not locked into kind of a single level of abstraction. So within a single organization, you may say, you know, this development team really knows how databases work. Right. And they might want something that’s more akin to actually interacting with the RDS API directly. So we’re going to expose a lot of that configuration to them. And they may have a higher level of permission or trust within the organization.

[00:11:34.790] – Daniel
But this other team, let’s say it’s a marketing team or something, that’s not folks that are interacting with infrastructure on a regular basis. We may just say, you know, there’s one field and it’s an enum and it’s small, medium, large, etc., and we decide all of those defaults for them. So you can have those differing levels of abstraction within the same kind of Crossplane offering there.
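
To make the composition idea concrete, here is a minimal Python sketch of mapping one small developer-facing abstraction onto provider-specific settings. It is an illustration under assumptions, not Crossplane’s actual schema: the `DatabaseClaim` type, the composition names, the field names, and the specific instance classes and GCP region are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict


class Size(Enum):
    """The single knob exposed to developers in this sketch."""
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


@dataclass(frozen=True)
class DatabaseClaim:
    """Hypothetical developer-facing abstraction: just a name and a size."""
    name: str
    size: Size


# Hypothetical mappings written by the platform team. Each one satisfies the
# same abstraction with a different provider, region, and set of defaults.
COMPOSITIONS: Dict[str, Dict[str, Any]] = {
    "aws-prod": {
        "kind": "RDSInstance",
        "region": "us-east-1",
        "instance_class": {
            Size.SMALL: "db.t3.small",
            Size.MEDIUM: "db.t3.medium",
            Size.LARGE: "db.t3.large",
        },
    },
    "gcp-dev": {
        "kind": "CloudSQLInstance",
        "region": "us-west1",  # illustrative region, assumed for the example
        "instance_class": {
            Size.SMALL: "db-custom-1-3840",
            Size.MEDIUM: "db-custom-2-7680",
            Size.LARGE: "db-custom-4-15360",
        },
    },
}


def compose(claim: DatabaseClaim, composition: str) -> Dict[str, Any]:
    """Expand the abstract claim into one granular, provider-specific resource."""
    mapping = COMPOSITIONS[composition]
    return {
        "kind": mapping["kind"],
        "name": claim.name,
        "region": mapping["region"],
        "instanceClass": mapping["instance_class"][claim.size],
    }
```

The developer only ever writes `DatabaseClaim("orders", Size.SMALL)`. Which cloud, region, and instance class that turns into is decided by whichever mapping the platform team selects.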

[00:11:57.810] – Ethan
OK, so there’s a subtle distinction here. Am I using Crossplane as an organization, where I’m a developer that’s writing to Crossplane directly, as opposed to making a tool I want some developer to use, where I might use Crossplane to help me build that tool that I then hand to the developer?

[00:12:15.740] – Daniel
Yeah. So I’d say both of those things are actually correct and maybe, maybe we should get into how Crossplane is actually architected and its relationship with Kubernetes if you are interested in going down that road.

[00:12:28.490] – Ned
Yeah, I think that’s probably an appropriate time to bring in what’s actually in this thing, because so far it’s a magic black box that abstracts components. So let’s dig into what’s actually in that magic box.

[00:12:40.730] – Daniel
Absolutely. So I’ll start off with kind of a very brief summary of how Kubernetes works in terms of its extension mechanisms, because I know the audience of this podcast is some sophisticated folks that are already aware of many of these things. But essentially, you’re likely familiar with Kubernetes as a container orchestration platform or framework. So it exposes things like pods and deployments, right, that basically make it easier to put workloads across a set of nodes and manage those as kind of a single operating system.

[00:13:10.400] – Daniel
So that was the initial impetus for creating Kubernetes. However, with all these different APIs like pods, deployments, and services, which abstract things like processes on a machine or some sort of networking primitive or that sort of thing, they realized that those primitives are useful, but there are many other ones that could fit into the same kind of system. So I think in the blog post that you were referencing earlier, I refer to Kubernetes not as a container orchestrator, but as a distributed systems framework. So it’s kind of evolved to this point where it goes beyond those workloads.

[00:13:44.920] – Ned
Yeah, we actually we had Kelsey Hightower on the podcast oh geez, that was almost a year ago now, I think. And he brought up something very similar where, yes, Kubernetes is ostensibly there to schedule containers. But really there’s so much more to it than that. It can be a platform for you to schedule just about anything. And it’s extensible. So it sounds like that’s what Crossplane is taking advantage of.

[00:14:09.490] – Daniel
That’s exactly right. And definitely take whatever Kelsey Hightower said and use that instead of whatever I provide here. But it’s the same idea. Right. And the way it does this is Kubernetes has an API server where these different abstractions, if you will, are represented as objects. So you may have heard of the Kubernetes resource model, which is basically a way of defining APIs to have a spec and a status. So the spec is the desired state, and the status is the current state.

[00:14:36.490] – Daniel
And then there’s a set of reconciliation loops, called controllers, that are frequently packaged into deployments and are essentially constantly driving that status to meet that spec. Right. So we want the current state and the desired state to look the same. So that works with workloads and services and that sort of thing, but it also works with other things. So Crossplane’s kind of innovation in the beginning, right, was to represent these cloud provider infrastructure resources as objects in the Kubernetes API, which becomes really useful for a number of reasons.

[00:15:09.220] – Daniel
And this is kind of the base layer of how Crossplane works. These different providers, which are the back end for Crossplane, they say, we’ll represent all of AWS’ resource types in the Kubernetes API, which means that we can have the information that’s encompassed about their status, and that can be referenced from things like a workload that’s running, right. So if you have your database and your workload represented in the same API, then you can start to have a lot of synergies between referencing information about that database and consuming that from a workload.
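
The spec, status, and reconciliation loop described above can be sketched in a few lines of Python. This is a simplified illustration, not a Kubernetes or Crossplane controller: `Resource`, `FakeCloud`, and `reconcile` are hypothetical names, and the in-memory dictionary stands in for a real cloud provider API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Resource:
    """An object with a desired state (spec) and a last-observed state (status)."""
    spec: Dict[str, Any]
    status: Dict[str, Any] = field(default_factory=dict)


class FakeCloud:
    """Hypothetical stand-in for an external API such as a cloud provider."""

    def __init__(self) -> None:
        self.actual: Dict[str, Dict[str, Any]] = {}

    def observe(self, name: str) -> Dict[str, Any]:
        """Return a copy of what currently exists for this name."""
        return dict(self.actual.get(name, {}))

    def apply(self, name: str, desired: Dict[str, Any]) -> None:
        """Create or update the external resource to match the desired state."""
        self.actual[name] = dict(desired)


def reconcile(name: str, resource: Resource, cloud: FakeCloud) -> bool:
    """One pass of the loop: observe, compare, correct, then record status.

    Returns True if the external resource had to be changed.
    """
    observed = cloud.observe(name)
    changed = observed != resource.spec
    if changed:
        cloud.apply(name, resource.spec)
        observed = cloud.observe(name)
    resource.status = {"observed": observed, "synced": observed == resource.spec}
    return changed
```

A real controller would run this repeatedly, so a manual change made outside the system shows up as drift on the next pass and gets driven back to the spec.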

[00:15:42.550] – Ned
OK, so you’re representing the AWS RDS instance, using, is it a custom resource definition? Is that essentially you’re creating one of those and then you have a controller that makes sure that RDS matches the desired state? Is that basically what you’re implementing?

[00:15:58.870] – Daniel
That’s exactly correct. So a provider, in Crossplane parlance, is a set of CRDs and controllers to reconcile them that get installed as a bundle.

[00:16:08.890] – Ned
There are a lot of resources in AWS. I mean, like a lot. A lot. So is there an impact to having that many CRDs in a single Kubernetes cluster? Would you load the whole provider or can you be more specific than that, say I only want the database related ones or something?

[00:16:25.120] – Daniel
Well, you can be more specific than that. Right now, the way the kind of canonical Crossplane community-maintained AWS provider works is it brings all of those resources, which is quite a number of CRDs and controllers. However, there is some nice caching and things like that to make sure that it doesn’t create a lot of overhead to have a bunch of CRDs you’re not using there. However, anyone can very easily go and write their own provider, or they could break pieces off of the AWS provider.

[00:16:54.790] – Daniel
Install those as more granular pieces. And Crossplane actually has a package manager that installs all of these things. At some point in the future, we may add to that package manager the ability to say, you know, I just want this set of CRDs from this provider, just to make the interface a little better so you don’t have, you know, API sprawl within your cluster.

[00:17:14.110] – Ned
So we’ve got CRDs and that represents different resources in the cloud. What else does Crossplane implement?

[00:17:20.770] – Daniel
So a lot of folks come to it just for that, right. They say, oh, I just want to be able to represent my infrastructure as Kubernetes objects. And like I said, that has a lot of benefits with referencing from workloads and services and that sort of thing. However, that doesn’t really get into any platform building. Right. You’re essentially just bringing kind of infrastructure as code, or infrastructure as data, as I’m sure Kelsey said when he was on the show, to the Kubernetes control plane, which definitely has its benefits.

[00:17:47.620] – Daniel
Right. It’s different from something like Terraform or Pulumi in that it’s not a one-off run, right? We’re constantly observing that infrastructure and letting you know about its status. So there are some benefits of referencing that directly. However, if you’re only using Crossplane at that level, you’re really not taking advantage of the real benefits it provides from a platform building perspective. Harkening back to LLVM again, it would kind of be like if you were still writing, you know, to the ISA, right.

[00:18:17.170] – Daniel
For the different architectures that you are targeting, you’re not really creating higher level abstractions, even though you are giving a consistent framework to go to target different backends.

[00:18:26.740] – Ned
Something important you mentioned is that in a way it’s similar to Terraform, and we’ve done a couple shows on Terraform. When I was reading through your post and also some of the Crossplane docs, it reminded me a lot of Terraform at the beginning, where, OK, I’ve got a set of providers and those providers allow me to hook into the different cloud APIs and create resources. And the whole thing is managed by Terraform. And there’s a state file, and I was like, this is very similar. But I feel like something critical you mentioned is it’s not just that; there’s more that Crossplane has to offer.

[00:18:57.430] – Ned
So can you expand on that?

[00:18:59.230] – Daniel
Yeah, absolutely. So we already mentioned the kind of constant reconciliation aspect of it. A useful parallel here, to explore the further levels of Crossplane and compare and contrast with Terraform, is thinking about Terraform modules, right. So Terraform also offers a kind of abstraction mechanism to be able to say, I want to take these granular resources from a Terraform provider and present them to an end user or a developer with a scoped down set of inputs. Essentially, that’s kind of how Crossplane’s composition model works.

[00:19:30.490] – Daniel
However, this is all existing in a Kubernetes cluster, right? So these objects are things that exist themselves. And Crossplane’s composition model works in a way where you actually define some other custom resources that say this is kind of the schema for the abstraction that I want, and these are the different mappings that satisfy it. So you can have multiple mappings for a single abstraction. So you could have a GCP mapping for a database and an AWS mapping, or a dev and a prod one, etc.

[00:19:58.450] – Daniel
And then those actually create new CRDs, which then developers can interact with. Right. So it’s CRDs all the way down. But essentially, one major difference is this, and I would encourage folks to go to the Crossplane blog, which is blog.crossplane.io. It actually has a post on there that goes through a comparison with Terraform that will go into this in greater detail. But an important distinction is that we actually have objects that represent this higher level abstraction that persist.

[00:20:27.190] – Daniel
Right. So with infrastructure as code tools, typically you present a higher level of abstraction and you can imagine the inputs of that kind of like flowing out and the output being these granular resources and the higher level of abstraction kind of disappears when it all gets compiled out essentially. In Crossplane, you’re creating an instance of the higher level of abstraction and that’s all you ever interact with. And that gets mapped to these granular resources and relevant status gets propagated back up to that abstraction.

[00:20:57.640] – Daniel
So you just have kind of one way to interact with with those granular resources as a developer.

[00:21:02.830] – Ethan
Does that include observability then, Daniel? So I would know because I’m interacting with that higher level Crossplane representation of what’s happening underneath status utilization and, you know, these kind of things.

[00:21:14.410] – Daniel
So just like how you define how the inputs to that abstraction get filtered down to the granular resources, you also define what status components of those granular resources come back up and how they get mixed and matched and combined together to present a status that makes sense to users. Right. So you could say reflected in the status of the abstraction is the actual state of all the underlying resources. So let’s say I’m a developer, and in my team-one namespace I create a database.

[00:21:41.890] – Daniel
I could, as the platform builder, say that I’m going to propagate in the status of that database object: RDS instance ready and healthy, DB subnet group ready and healthy. I could propagate all of those granular resources that compose the higher level of abstraction back up. Or I could just say I just want to give a blanket thing, right. This developer wants a database. They’re really agnostic to what components are under that. I just need to let them know if it’s ready for them to consume or not.

[00:22:07.210] – Daniel
And a big part of that that we haven’t really touched on is providing connection details. Right, because there’s information that’s required for the developer’s workload to be able to connect to these underlying resources. And sometimes you have to mix and match connection details to give them an interface that they can interact with.
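
The two choices Daniel describes, detailed per-component status versus a single blanket “ready” flag, plus mixing connection details from several underlying resources, could look something like the following Python sketch. The function names, resource names, and field names are hypothetical and are not Crossplane’s API.

```python
from typing import Any, Dict, Tuple


def summarize_status(composed: Dict[str, Dict[str, Any]], detailed: bool) -> Dict[str, Any]:
    """Roll the status of granular resources up into the abstraction's status.

    The abstraction is ready only if at least one resource exists and all of
    them report ready. With detailed=True, per-component readiness is exposed.
    """
    ready = bool(composed) and all(r.get("ready", False) for r in composed.values())
    status: Dict[str, Any] = {"ready": ready}
    if detailed:
        status["components"] = {
            name: r.get("ready", False) for name, r in composed.items()
        }
    return status


def connection_details(
    composed: Dict[str, Dict[str, Any]],
    wanted: Dict[str, Tuple[str, str]],
) -> Dict[str, Any]:
    """Build one connection record by picking fields from different resources.

    `wanted` maps an output key to a (resource name, field name) pair.
    """
    return {key: composed[res][fld] for key, (res, fld) in wanted.items()}
```

For example, a platform builder might expose only `{"ready": False}` to one team while the database is still provisioning, and give another team the per-component breakdown, using the same underlying data.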

[00:22:24.970] – Ethan
Is there an opportunity then to leverage Crossplane for cost-based decisions about where I put a workload? Maybe I just interact with Crossplane, and Crossplane can make a decision about where to spin up something based on some kind of a dollar cost model?

[00:22:39.970] – Daniel
That is absolutely correct. And once you basically have all these things in a common API, you can do cost modeling, you can do workload bursting, that sort of thing. You can make all these sorts of decisions because everything is represented in a common framework. And you have kind of a very flexible extension point where you can add new controllers, right, to interact with these different resources and manipulate how they get provisioned. So we have some of those aspects built into what we call core Crossplane, but we like to make it a very modular system.

[00:23:11.470] – Daniel
Right. So if you’re doing cost-based analysis, you might add your own controllers to do that rather than us enforcing a way that you do that, etc.

[00:23:20.020] – Ned
So you could write your own controller that implements your own logic for how you want it to be reconciled going forward. And I think the thing that jumps out to me, and the big, big difference between something like Terraform and what Crossplane is doing, is that continuous reconciliation loop where it is trying to take the spec and make the status match the spec. And that has such a large impact, because the only way that I’ve seen that before is you try to do some sort of drift detection over time.

[00:23:49.250] – Ned
Hey, did something change about the infrastructure since I last deployed it? Maybe you have a pipeline that tries to resolve it. But it’s not this continuous loop. It’s something you have to create yourself and maintain yourself. So for me, that’s the thing that jumps out the most: that reconciliation.

[00:24:04.860] – Daniel
Yeah, that’s definitely really huge, and it’s interesting to see how different organizations and different folks receive that sort of operational style, because for a lot of folks, especially if they’re in kind of a larger and more legacy organization, their immediate reaction is that it’s really scary, right, for something to be constantly operating on their infrastructure. Some folks view it as a feature that if you go into the AWS console and change something, Crossplane is going to change it back to your source of truth.

[00:24:33.030] – Daniel
And other folks see that as very scary. And so with that, right, we need to accommodate different organizations on their journey and operational maturity and give them different options for how often things are reconciled, how they’re reconciled, if there need to be some checks performed before they actually take remediation action, and that sort of thing. So there’s all different components there to making that work for different folks. But once again, having that flexible system allows for introducing that type of logic.

[00:25:04.800] – Ned
I like the point that you made there that not every organization is ready to go in this full blown, constant deployment technology, that they might want to take a step back and say, OK, I want to put the stuff in, but I want some manual controls in there. And maybe those manual controls are very granular and it only runs the reconciliation once a day or once every couple of days. But then you build confidence in the system and you can start removing those controls as you feel more confident that you can trust the system to do the right thing.

[00:25:36.170] – Daniel

[00:25:37.070] – Ethan
Daniel, back in that context of reconciliation, we started having that conversation in the context of Kubernetes. Do I have to run Kubernetes and then Crossplane inside with all the CRDs to get this functionality? Or is Crossplane something I can run outside of a Kubernetes environment?

[00:25:53.180] – Daniel
So technically, I would say, yes, you do need to run it with Kubernetes. And instead of trying to break away from Kubernetes as our distributed systems framework for Crossplane, we’re actually trying to embrace some of the upstream Kubernetes discussions around how the API server is implemented. So folks that are familiar with Kubernetes are likely familiar with tools like kind, which is Kubernetes in Docker and is used for testing, or stripped-down distributions like K3s. And then, KubeCon was last week, and there was a keynote talking about KCP, and we had a community meeting for it yesterday. KCP is basically saying we need to take the Kubernetes API and make the components more modular, meaning that you could have a Kubernetes API server that behaved maybe more like something like SQLite, where you can run a very minimal process alongside your application or embed it directly into your application.

[00:26:51.800] – Daniel
So, for instance, you could see a future where we have a very minimal Kubernetes API server actually embedded into the Crossplane binary, which gives you kind of like a single daemon that you could be running on your machine or something like that.

[00:27:05.340] – Ned
OK, I had not heard about that. That’s really interesting to see how that progresses, because it does seem like the secret sauce of Kubernetes really is the API server. And you don’t necessarily need a giant cluster with a whole bunch of nodes to have that particular functionality. And that’s the thing that Crossplane is using, right? It’s not so much dependent on the other components.

[00:27:24.810] – Daniel
Exactly. And there’s varying levels of consumption of the API. This is very, very nascent information, because I’m actually referencing the first community meeting that happened for KCP, which was yesterday, so your listeners are getting, you know, the newest information here. But one of the things that we talked about in that meeting was the varying levels of consumption of the Kubernetes API, and then the difference between user-facing APIs and controller-facing APIs.

[00:27:56.190] – Daniel
So there’s some Kubernetes operators or controllers that really just need the API server, right. There’s no workloads. They’re not consuming any part of Kubernetes other than offering new CRDs and reconciling them. Crossplane is a little further to the other side of the spectrum in that we do spin up new workloads. Right. So those providers that you install into Crossplane, into your Kubernetes cluster, they have to run somewhere. Right. And we do that by going through the Crossplane package manager and spinning up new deployments and pods.

[00:28:26.250] – Daniel
Right. And running those controllers there. So we do still require some components of the Kubernetes API server and we need the kubelet and things like that. But those can be stripped down for our use case because, you know, when users are coming to Crossplane, they’re not creating pods or deployments to interact with Crossplane. They’re telling Crossplane, please install this provider, and Crossplane, on their behalf, is going to go and manage those pods and deployments. So there’s all different sorts of consumption models for the Kubernetes API server.

[00:28:54.690] – Daniel
And I think efforts are moving towards a way where it’s more modular so that all folks can kind of benefit from it.

[00:29:01.750] – Ned
What does the footprint look like for a typical Crossplane installation when you have all those providers? How much space are all of those deployments and pods taking up within a cluster?

[00:29:12.640] – Daniel
So it really depends. We have lots and lots of providers, and they have differing numbers of CRDs and, in correspondence with that, differing numbers of controllers. And then there’s the number of actual instances of CRDs that you’re creating. So how many RDS instances are represented is actually more indicative of the consumption of these providers, because if there’s no instances that exist, those controllers are essentially sitting dormant. Right. So really, when you start to get into large scale, which you could theoretically continue to scale forever.

[00:29:45.460] – Daniel
Right. And Kubernetes, once again, is giving us a nice way to scale the underlying infrastructure for this. But we have a number of consumers of Crossplane that actually implement kind of a layer two cloud provider on top of it. Right. So they’re creating kind of a generic platform and offering that actually as a product. And those folks will have thousands and thousands of resources, you know, on AWS or GCP or some combination of multiple cloud providers represented there.

[00:30:13.390] – Daniel
And at that point, right, you have lots of reconciliation loops firing and you may want replicas of your providers in high availability. And once again, Kubernetes provides a lot of those primitives to us so we can kind of take advantage of that. So that’s one of the things when folks are like, are you all thinking about maybe just moving off of Kubernetes? Right. Because you’re not using all of these APIs. That’s why we’re engaging upstream.

[00:30:34.030] – Daniel
We’d rather say we’d like to modify the Kubernetes we consume based on the use case rather than trying to build our own distributed systems framework that’s going to try and accomplish a lot of the same things.
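The footprint point above, that provider cost tracks the number of resource instances rather than the number of installed CRDs, can be illustrated with a minimal Python sketch of a reconciliation pass. This is not Crossplane code. `ManagedResource`, `reconcile`, and `run_once` are hypothetical names invented for this illustration, and a real controller would call a cloud provider API where the sketch only copies a dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ManagedResource:
    """Hypothetical stand-in for one instance of a provider CRD (say, one RDS instance)."""
    name: str
    desired: dict
    observed: dict = field(default_factory=dict)


def reconcile(resource: ManagedResource) -> bool:
    """Move observed state toward desired state. Returns True if something changed."""
    if resource.observed == resource.desired:
        return False
    # A real controller would call the cloud provider's API at this point.
    resource.observed = dict(resource.desired)
    return True


def run_once(resources: list[ManagedResource]) -> int:
    """One pass over every instance. Returns the number of reconcile attempts."""
    attempts = 0
    for resource in resources:
        reconcile(resource)
        attempts += 1
    return attempts


if __name__ == "__main__":
    # With no instances the controller has nothing to do, however many CRDs exist.
    print(run_once([]))
    fleet = [ManagedResource(f"db-{i}", {"size": "small"}) for i in range(3)]
    print(run_once(fleet))
```

With an empty list the pass does no work, which is the "sitting dormant" case. The work per pass grows with the length of the list, which is why thousands of resources call for the replicas and scaling primitives mentioned above.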

[00:30:45.190] – Ethan
[AD] We paused the episode for a bit of training talk, training with CBT Nuggets. If you’re a Day Two Cloud listener, and you are, you’re listening to it right now, then you’re probably the sort of person who likes to keep up your skills, as am I. Now, here’s the thing about cloud as I’ve dug into it over the last few years. It’s the same as on-prem, but different. The networking is the same, but different due to all these operational constraints you don’t expect.

[00:31:09.250] – Ethan
And just when you have your favorite way to set up your cloud environment, the cloud provider changes things or offers a new service that makes you rethink what you’ve already built. So how do you keep up with this? Training. And this is an ad for a training company, so what did you think I was going to say? Obviously training, and not just because sponsor CBT Nuggets wants your business, but also because training is how I’ve kept up with emerging technology over the decades.

[00:31:30.790] – Ethan
I believe in the power of smart instructors telling me all about the new tech, so that I can walk into a conference room as a consultant or project lead and confidently position a technology to business stakeholders and financial decision makers. So you want to be smarter about cloud? CBT Nuggets has a lot of offerings for you, from absolute beginner material to courses covering AWS, Azure and Google cloud skills. Let’s say you want to go narrow on a specific topic. OK, well, there’s a two hour course on Azure security.

[00:32:00.640] – Ethan
Maybe you want to go big wide. All righty. There’s a forty two hour AWS certified SysOps administrator course and lots more cloud training offerings in the CBT Nuggets catalog. I gave you just a couple of examples to whet your appetite. In fact, CBT Nuggets is adding forty hours of new content every week, and they help you master your studies with available virtual labs and accountability coaching. Interested? Of course you are, so satisfy your curious mind by visiting CBT Nuggets dotcom cloud and figure out if CBT Nuggets will work for your training with their seven-day free trial.

[00:32:36.970] – Ethan
Just go do it. CBT Nuggets dotcom cloud for seven days free. That’s CBT Nuggets dotcom cloud. And now back to the podcast I so rudely interrupted. [/AD]

[00:32:49.890] – Ned
I mean, the whole point that kicked off this podcast is don’t reinvent the wheel, use what already exists. So you’re using Kubernetes to manage this. I did have, like, a very specific question, because it sparked in my mind when you were talking about the resources that are delivered to the end consumer, and then the management of those resources, and the abstraction of those resources down to the actual resources in the cloud provider. How do you split that up with namespaces so that my developer doesn’t just go in and alter the underlying root resource as opposed to interacting with the abstraction I’ve given them?

[00:33:29.010] – Daniel
I’m glad you brought this up because this is a very important and potentially controversial design decision of Crossplane. So I’m trying to bring the spice to this podcast. But first of all, I’ll say there are some similar or competing projects with Crossplane. So one of them would be Google Config Connector, which is Google’s kind of closed source controllers for reconciling some of their infrastructure types, and ACK, which is Amazon Controllers for Kubernetes, which is a similar kind of thing for AWS.

[00:33:57.450] – Daniel
It’s open source. And we actually share some code generation with them. And we have a good relationship with those projects because we really have different goals. And one of those goals is manifested in the design of Crossplane in that all of the granular managed resources that are brought by providers are actually cluster scoped. Your RDS instance, your S3 bucket, all of those are going to exist at the cluster scope. And the reasoning for that is we see those as resources that the platform team or cluster admins should be interacting with.

[00:34:27.810] – Daniel
Right. Those are the granular resources. And you could choose to create an abstraction on those that is a one-to-one representation. But our abstractions can be offered at the cluster scope or the namespace scope. So when you define a new type that maps to these granular resources at the cluster scope, you can expose that at the namespace scope. So you can say folks that have access to the team one namespace can create the database abstraction there, and that actually spits out resources at the cluster scope and they’re reconciled there.

[00:34:56.490] – Daniel
And that has a number of important qualities. Number one, for folks that have interacted with any cloud provider, especially if you’ve interacted with multiple cloud providers, you know that the permissioning model across cloud providers, the IAM and all of that, differs wildly. If you’re using one of them, it’s extremely difficult to manage. Yeah, but if you’re using multiple, it’s nearly impossible. Right. So what we don’t want to do is follow a model like Terraform. Right.

[00:35:25.890] – Daniel
Where to run a terraform plan and apply, you need to have the credentials. You yourself, who’s running the terraform plan and apply, need to have the credentials either on your local machine, or you need to SSH into a box that has credentials, or assume them somehow, so that you can create all of those granular resources that are spread out. Crossplane takes a different model. Since there are all of these different permissioning models, we want to standardize on Kubernetes RBAC as the way that we do permissioning with Crossplane.

[00:35:54.750] – Daniel
So what we say is that you give folks, developers within a namespace, the ability to create abstractions. They’re never given credentials to create anything on AWS or GCP or anything like that. You define the abstraction, you define the policy and the permissions required for mapping that to underlying AWS resources. And then you give a controller credentials, and you can give it different sets of credentials and scope that based on who’s provisioning it and that sort of thing.

[00:36:23.040] – Daniel
But that controller is responsible for executing that on the developer’s behalf. What that means is that all permissioning is done at the Kubernetes RBAC level and has namespace isolation and that sort of thing, which is an extremely important quality of Crossplane that enables kind of this whole platform building and self-service model.
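The split described here, namespaced abstractions for developers and cluster-scoped managed resources owned by a credentialed controller, can be sketched in a few lines of Python. This is a toy model and not the Crossplane or Kubernetes API. The `Cluster` class, its fields, the user and namespace names, and the `RDSInstance` label are all assumptions made up for the illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of the permission flow: RBAC check first, then the controller acts."""
    # (user, namespace) pairs allowed to create the database abstraction.
    rbac: set[tuple[str, str]] = field(default_factory=set)
    # Cluster-scoped managed resources. Only the controller writes here.
    managed: dict[str, dict] = field(default_factory=dict)
    # Cloud credentials live with the controller, never with developers (placeholder value).
    controller_credentials: str = field(default="hypothetical-secret", repr=False)

    def create_claim(self, user: str, namespace: str, name: str) -> str:
        """Developer-facing call: checked against RBAC only, no cloud credentials involved."""
        if (user, namespace) not in self.rbac:
            raise PermissionError(f"{user} may not create databases in {namespace}")
        return self._provision(namespace, name)

    def _provision(self, namespace: str, name: str) -> str:
        """Controller-side step: uses its own credentials to create the granular resource."""
        resource_name = f"{namespace}-{name}"
        self.managed[resource_name] = {"kind": "RDSInstance", "claimedFrom": namespace}
        return resource_name


if __name__ == "__main__":
    cluster = Cluster(rbac={("alice", "team-one")})
    print(cluster.create_claim("alice", "team-one", "orders-db"))
```

The developer-facing method never touches `controller_credentials`. The only thing a developer needs is an RBAC entry for the namespace, which is the standardization on Kubernetes RBAC described above.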

[00:36:42.510] – Ned
Right, right. If I could pick that apart a little bit. So I as the developer, have access to a namespace and this abstraction when I go to create whatever that abstraction is, that’s going to get processed by the cluster level stuff and that’s going to look up and see, OK, what AWS account, let’s say we’re using AWS, what AWS account should this get deployed in? And it already has its own credentials to create that resource. And then it’ll return maybe the database access information.

[00:37:10.920] – Ned
And that’s it to me as the developer and that’s all I ever see. So I’m not dealing with AWS credentials. I don’t know what they are. And that’s better for me because then I can’t get hacked and give those away. And it’s better for the whole design process because somebody else can manage those credentials and rotate them as needed and all that kind of stuff. And that’s really I think I don’t know why Ethan was laughing when you talked about all the different permission models across the clouds.

[00:37:36.890] – Ned
The reason I was laughing is because I just did a demo that deployed resources using Terraform across GCP, AWS and Azure and interacting with all three platforms and all the different information you had to give. And the fact that it was wildly different for each one was incredibly frustrating. So thank you.

[00:38:01.070] – Ethan
It does end up putting a lot of trust, I guess, into Crossplane that you guys are getting all the permissions right with the abstraction layers. But it is a blessing to do if you can get that right, exactly what you’re saying. It’s just the getting it right across multi-cloud is. Well, that’s why I’m laughing anyway.

[00:38:22.460] – Daniel
Yeah, absolutely. And to kind of touch on what you said, yes. There is a lot of onus on Crossplane to make sure that, you know, when you define a mapping that that happens correctly. Right. When you compose the resources that they’re rendered out correctly. However. Right. The platform operator within an organization is still, as Ned was alluding to there, creating those credentials and saying who can assume those credentials right when the controller is going to use them and that sort of thing.

[00:38:46.580] – Daniel
So you still really lock down how credentials are used and that sort of thing. So definitely from a platform team and infrastructure operator perspective, we recommend that folks have good knowledge of how the underlying IAM structures work.

[00:39:02.850] – Ethan
Daniel, I think I know the answer to this question, but I’m going to ask it anyway. So we’ve talked about K8s and it’s going modular and you could take advantage of that, maybe come up with some kind of a simpler setup. Well, could Crossplane interact with a different sort of an orchestrator, say, HashiCorp Nomad? Something like that.

[00:39:21.980] – Daniel
Yeah, it definitely could. Right now, we are pretty tied to the Kubernetes API, right, and that has some benefits because, and to be honest, I’m not nearly as familiar with Nomad as I am with Kubernetes, but the extension model is really the thing that folks are familiar with and what we take advantage of. So that is really the blocker rather than the actual workload scheduling and that sort of thing. The CRD model of Kubernetes is really what Crossplane is leveraging.

[00:39:51.590] – Daniel
You could imagine to your point, right, that you could take advantage of that extension model. But behind the scenes, in terms of just like scheduling the providers to run in their reconciliation loops and that sort of thing, you could have a very stripped down Kubernetes API server, then leverage something like Nomad to actually run your workloads. That being said, most folks don’t want to go down that path, but if they were using Nomad already or a different orchestrator, that could be a potential path for them.

[00:40:17.910] – Ethan
OK, well, let me ask the question in a different way, then. The way the Crossplane project is structured, if there was community demand, we want to make this work with Nomad, is that plausible? If the community were to put the effort in and contribute the code, it could actually happen that way?

[00:40:33.800] – Daniel
I would say it’s plausible. I won’t say that it would be easy by any means because there does need to be a Kubernetes API server somewhere, even if it was an extremely stripped down, minimal one.

[00:40:43.280] – Ethan
OK, OK, another Crossplane question, going kind of back to the root value propositions we’ve been talking through. Dude, there is a lot of complexity going on here. Any time you introduce a heavy-duty abstraction like this, there’s so much magic that is presented to you. You get to interact with the abstraction. Now, that makes things easier and more uniform, but then the complexity is kind of hidden. And on the other hand, there’s all this complexity being added at the same time. And so that’s a tradeoff you need to decide you’re going to take on if you integrate, in this case, Crossplane with your Kubernetes infrastructure. If I’m thinking about this, should I think about it like my application is going to benefit from Crossplane, or that as an organization, the way we’re structured, we’re going to benefit from Crossplane? How do I decide that I should use it, or that it’s really not the right thing to take on?

[00:41:37.180] – Daniel
Yeah, so. So I think there’s kind of two questions there. Right. How can I reduce that abstraction and kind of get a grip on how I adopt this? And then second, what are the cases where it would be worth it? So allow me to reduce complexity by adding more here.

[00:41:53.500] – Daniel
Crossplane has, and I’ve alluded to this, a package manager, right? I mean, a lot of times folks say, why does Crossplane have a package manager? Why don’t you just install the provider back ends with Helm or something like that? Crossplane takes an opinionated approach to how packaging works, and it also offers two different levels of packaging, one of those being providers and the other one configuration packages. So we’ve been talking about providers, but the packaging format is actually very similar for both of these types of packages.

[00:42:23.740] – Daniel
Instead of using something like Helm, we actually package our providers into OCI images themselves. So a provider is an OCI image with a set of CRDs just in a single YAML file. So it’s basically a single stream of YAML, as well as a manifest file that basically says this is where you can get the image, which is another OCI image, for running these controllers. And these are basically the CRDs that I need to own as a provider.

[00:42:52.060] – Daniel
So when you say, Crossplane, please install this provider, it fetches that YAML stream, reads it through and says, I’m going to install these providers, I’m going to start up the controllers and I’m going to make this provider the owner of these CRDs. So the benefit that gives you is Crossplane can do things like roll back and roll forward CRD versions and controllers that match them. It also means that you can’t install two providers that manage the same CRD without explicitly saying so.

[00:43:19.760] – Daniel
So you don’t want to have one AWS provider managing an RDS instance and another one doing the same and them fighting over it or something like that. So there’s a lot of things we do to make that a happy path. But really what I’m getting to, to make this an easier experience, is the configuration packages. So I’ve talked about these ways that you define new abstract types and then the mappings. Those happen through two resources. An XRD, which is kind of a Crossplane variation of a CRD, is how you define the new schema for an abstract type and offer it to different namespaces.

[00:43:52.240] – Daniel
A composition is how you say for an instance of this XRD that gets created, spit out these granular resources. Right. So that’s where you may have a composition for AWS and GCP and Azure or whatever for the same abstract type. What you can also do is these configuration packages allow you to bundle up these abstractions and also declare dependencies on provider packages. So to give kind of a concrete example of what this looks like, you can build a single OCI image that includes a manifest that says, this is my platform for the Ned and Ethan org.

[00:44:30.520] – Daniel
And you have a single abstract type in there defined by an XRD that is a database. And you have three compositions for that XRD. One of them says create an AWS RDS instance. One of them says create a GCP cloud SQL instance. And one of them says create an Azure SQL instance. You package that up and you say in your manifest, I depend on provider AWS, provider GCP and provider Azure. And you have a single unit that any time you install Crossplane from now on, you can say, please install this configuration.

[00:45:03.280] – Daniel
And what Crossplane will do is it will find compatible versions of all of the providers, install those, make sure all the CRDs are present, install your abstractions, and then make them available to folks in the appropriate namespaces. So what we’re really doing at that point is bundling a platform definition into an OCI image that can be pushed to Docker Hub or Upbound’s registry or any other OCI conformant registry. And that means that you can also have a marketplace, right.

[00:45:31.600] – Daniel
Where a startup could come along and say, I want to consume these abstractions that some other larger organization has already defined. So in the future, we envision folks coming along and basically saying, oh, I need a friendly interface to AWS, I’m just going to go ahead and install this friendly AWS package, and it’s going to make that available to me and give me abstractions that are tried and true from some other organization or something like that. And then I can build on those and modify and create my own dependency tree of abstractions, which really gives you the ability to kind of have that immediate Heroku experience or layer two cloud experience without actually even defining your own abstractions on it.

[00:46:14.860] – Daniel
And we hope, right, the intention there is that, number one, folks can have reproducible platform environments. So you can do that across dev, staging, and prod. But also the barrier to entry is drastically reduced. Right. Because to start using Crossplane, you just have to be able to do a one click install basically, and then create this abstract type and you’re off to the races without even having to really understand how AWS or GCP is happening behind the scenes.
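The configuration package example walked through above (one abstract database type, three compositions, three provider dependencies) can be modeled with a short Python sketch. Real Crossplane packages are YAML inside OCI images, so this is only an illustration of the relationships. The `Composition` and `Configuration` classes, the `install` and `render` helpers, and the package and provider name strings are hypothetical, and version resolution is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Composition:
    abstract_type: str  # the XRD-defined type this composition satisfies
    provider: str       # provider package that supplies the granular resource
    renders: str        # granular resource this composition creates


@dataclass(frozen=True)
class Configuration:
    name: str
    depends_on: tuple[str, ...]
    compositions: tuple[Composition, ...]


# Illustrative platform package mirroring the spoken example (names are made up).
PLATFORM = Configuration(
    name="ned-and-ethan-platform",
    depends_on=("provider-aws", "provider-gcp", "provider-azure"),
    compositions=(
        Composition("Database", "provider-aws", "RDS instance"),
        Composition("Database", "provider-gcp", "Cloud SQL instance"),
        Composition("Database", "provider-azure", "Azure SQL instance"),
    ),
)


def install(config: Configuration, installed: set[str]) -> set[str]:
    """Return the provider set after adding any missing dependencies."""
    return installed | set(config.depends_on)


def render(config: Configuration, abstract_type: str, provider: str) -> str:
    """Pick the composition for one abstract type on one provider."""
    for composition in config.compositions:
        if composition.abstract_type == abstract_type and composition.provider == provider:
            return composition.renders
    raise LookupError(f"no composition for {abstract_type} on {provider}")


if __name__ == "__main__":
    print(sorted(install(PLATFORM, set())))
    print(render(PLATFORM, "Database", "provider-gcp"))
```

The developer-facing type stays `Database` in every case. Only the chosen composition decides which cloud resource is created, which is what lets one package serve dev, staging, and prod or several clouds.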

[00:46:40.780] – Ethan
You’re making it sound like getting ramped up on Crossplane is easier than maybe digging into the nuances of the various cloud providers.

[00:46:48.320] – Daniel
Absolutely, and, you know, if you are designing a robust platform for a large organization, obviously you’re going to eventually need to understand why you need to give your service account these credentials to create this abstract type. Eventually you’re going to need to understand that. However, we have, and this will be rolling out soon, certification for providers and configuration packages and that sort of thing, where we say this is kind of a known good package. And if you install this in your cluster and provide these credentials, we can provide documentation on how this abstraction is happening.

[00:47:23.900] – Daniel
So you don’t have to actually implement it. Right. But you get to benefit from this kind of platform definition from other folks, whether it be organizations that are offering Crossplane as a service or consuming Crossplane and sharing in open source.

[00:47:37.820] – Ethan
This is interesting because you made a point here that just hit me. You say that at some point someone in your org’s got to know what’s going on under the hood, what Crossplane is manipulating. But the interesting part here, not everyone does. Devs don’t. They just care about Crossplane. Sure. Someone’s got to understand what’s being abstracted so you can troubleshoot things when they go wrong and make sure you’re setting up your definitions and your abstractions correctly, the way you want, that’s most suitable.

[00:48:04.610] – Ethan
But you’ve taken a burden off of part of your organization because they’ve only got one interface to work with.

[00:48:12.580] – Daniel
Exactly, and, you know, someone absolutely should understand those abstractions, but maybe not right at the beginning. You know, maybe you’re a startup and you need to leverage cloud infrastructure and you don’t want to be locked into, I keep harping on Heroku, let’s say we’ll use another one like Render, if you have heard of Render. It’s kind of a similar type of thing where they offer, you know, these really abstract resources. You don’t want to be locked into that because, you know, in the future you’re going to need some of the complexity that something like AWS offers.

[00:48:39.670] – Daniel
Right. You’re going to need custom solutions. So you don’t want to be locked into using these specific APIs for a layer two cloud provider. So instead, you install these abstractions in your cluster and maybe for the beginning, when you’re doing your PoC or MVP or something like that, you’re using these abstractions. You’re not exactly sure how they’re mapping behind the scenes. You’re not exactly sure how the credentials are being assumed and that sort of thing.

[00:49:03.790] – Daniel
But you’ve put yourself in a position that when you want to understand that, you don’t have to change the APIs that you’re using. Right. And you can continue down that path. So, yes, it definitely always will provide that ease of burden on developers. But I’m personally, and this isn’t necessarily a mission of the Crossplane project directly, I’m personally really excited about how it’s going to empower smaller companies to have infrastructure and control planes that look like large cloud providers.

[00:49:30.760] – Daniel
Right. That look like large companies that consume a lot of infrastructure. We’re kind of trying to democratize that and make that available to everyone.

[00:49:37.850] – Ned
Right. This reminds me a lot of what MSPs want to do. They want to be able to offer this kind of interface, but it’s really hard to set up. And now you’ve provided, in a sense, a beginner’s kit to getting it started and then they can customize it however they want. They could throw a web interface in the front so that their clients can consume it as more of, you know, a UI, as opposed to just throwing YAML at the problem.

[00:50:02.020] – Ned
But it does give them that startup kit. The last question I had, and this ties neatly into what you’re just talking about, if someone does want to get started, can they just run this thing using Kind on their laptop or do they need to use a cluster up in AWS? What’s the easiest path to get started with Crossplane and try out some of these features?

[00:50:22.090] – Daniel
Absolutely. So definitely in our getting started guide we recommend using Kind, but any conformant Kubernetes distribution will work. Right? So if your organization requires that you always run on AWS or maybe you’re using kubeadm and running it on EC2 instances, whatever you want, you can install Crossplane in there. I will also say, and warning, this mentions my employer, which is Upbound, so I know you all will edit this out afterwards. So I’ll just go ahead and say it.

[00:50:50.020] – Daniel
Crossplane is an open source CNCF project under the Apache 2 license. Right. So Upbound is a potential distributor of Crossplane. There can be others as well, just like Kubernetes. Upbound, for instance, offers a hosted Crossplane offering where instead of giving you kind of like the whole Kubernetes cluster, we just give you the Crossplane API and give you kind of like a one-click interface. So if someone wants to do that, you’re allowed to do that.

[00:51:15.250] – Daniel
As I mentioned in my blog post, I will be very honest about the benefits of using proprietary solutions or not. But if that is of interest, that’s an option.

[00:51:24.400] – Ned
No, no. We appreciate that. And we’ll leave it in. Daniel, that’s fine. You’re allowed to mention your employer.

[00:51:31.090] – Daniel
Very kind, now, I’m going to get a bonus from the marketing folks at our company.

[00:51:33.880] – Ned
So there we go. There you go. Well, if folks want to know more about you or know more about Crossplane, what are some good places they should look on the interwebs?

[00:51:43.240] – Daniel
Absolutely. So we obviously have the Crossplane website and documentation, which is Crossplane dot io. We have a YouTube channel where there’s a lot of different talks. We actually recently had KubeCon EU twenty twenty one. Last week we had a day zero event for Crossplane called Crossplane Community Day, and we’ve had a number of those in the past as well. So we have a smattering of talks of folks talking about it, whether it be end users or the founders of Kubernetes. We have a roundtable with the founders of Kubernetes where they’re talking about this kind of control plane future and moving beyond container orchestration.

[00:52:16.210] – Daniel
Kelsey moderates that, of course. And then I also do a live stream where we have different CNCF projects on and talk about the benefits of standardizing on the Kubernetes API and show demos of using something like Open Policy Agent alongside Crossplane and things like that. So you can go to the Crossplane channel on YouTube. You’re also welcome to join us at Slack dot Crossplane dot io, which is our workspace where we are extremely active. We get feedback all the time.

[00:52:45.310] – Daniel
A number of the Crossplane maintainers are across different time zones, so we have the benefit of being very responsive there. And then the last thing I’d say, if you’d like to reach out to me directly, I am happy to answer any Twitter DMs or any other sorts of emails or anything like that. I’m everywhere at HashedDan, and so feel free to reach out on Twitter or something like that, or danielmangum dot com, that has all of my information.

[00:53:10.570] – Ned
All right, awesome. We will include all those links in the show notes. So, you know listener don’t worry about scribbling any of that down, but if you just want to go to find him on Twitter, it’s HashedDan you can find him there. And I think all the relevant links can spring from there as well. So that’s an easy one to remember. Well, Daniel, so thank you so much for being a guest today on Day Two Cloud.

[00:53:34.220] – Ned
It’s been it’s been a heck of a conversation. I really enjoyed it.

[00:53:38.030] – Daniel
Absolutely. Well, I appreciate you all having me on, and big fan of the show. So definitely an honor to be here. And lots of other guests that you all’ve had in the past that I don’t hold a candle to. So I’m very honored to be here and I’ve enjoyed the conversation as well.

[00:53:52.220] – Ned
Awesome. Thank you so much. And hey, listener, virtual high fives to you for tuning in. If you got suggestions for future shows, you know, we want to hear about those suggestions. You can hit either of us up on Twitter at Day Two Cloud show. Or you can fill out the form on my fancy website, Ned in the cloud dot com. Did you know that Packet Pushers has a weekly newsletter? It’s called Human Infrastructure Magazine. You’re the infrastructure.

[00:54:17.450] – Ned
And it’s filled with the best stuff that we found on the Internet, plus our own feature articles and commentary. It’s free and it doesn’t suck. That’s good. So you can get the next issue, if you’d like, via packet pushers dot net newsletter. Until next time, just remember, cloud is what happens while IT is making other plans.
