
Day Two Cloud 179: Will CXL Make Composable Infrastructure Real?

Episode 179

Play episode

On today’s Day Two Cloud podcast we talk about Compute Express Link (CXL), a technology for composable infrastructure. The idea is to take all the peripherals in a system—memory, network cards, graphical processing units, and so on—and put them on a bus outside the chassis to share them among multiple hosts. Is this the dream of composable infrastructure coming true?

We get into what CXL is, how it works, and why you might want to use it with Craig Rodgers, Solutions Architect at Camlin Group; and Chris Hayner, Lead Consultant at HMC Technology.

We discuss:

  • What CXL is and where it sits in the hardware stack
  • PCIe and CXL
  • How CXL compares to existing technologies like Optane PMEM and NVMe
  • Where are we in the development of the CXL specification?
  • What’s included in versions 1.1, 2.0, and 3.0?
  • What type of devices would you probably find on a CXL bus (if that’s even the right term)?
  • Security issues
  • More

Sponsor: CDN77

Why should you care about CDN77? To retain those 17 out of 20 people who click away due to buffering. CDN77 is a global Content Delivery Network (CDN) optimized for video and backed by skilled 24/7 support. Go to cdn77.com/packet-pushers to get your free, unlimited trial.

Show Links:

About CXL –

@CraigRodgersMs – Craig Rodgers on Twitter

Utilizing Tech podcast on CXL.

Chris Hayner’s blog

@hayner80 – Chris Hayner on Twitter

Hear Ned and Chris hosting the Chaos Lever podcast



[00:00:00.170] – Ethan
Why should you care about CDN77? To retain those 17 out of 20 people who click away due to buffering. CDN77 is a global content delivery network optimized for video and backed by skilled twenty-four-seven support. Visit cdn77.com/packet-pushers to get your free, unlimited trial.

[00:00:27.450] – Ned
Welcome to Day Two Cloud, and today we're talking about CXL, this revolutionary new technology that's going to take the world by storm and completely change the way everything works. Right, Ethan?

[00:00:38.510] – Ethan
I mean, kind of.

[00:00:40.380] – Ethan
Because if we can take all the peripherals that are in a system, the memory and network cards and graphical processing units, and stick them on a bus that’s outside the chassis and share them amongst multiple hosts, wouldn’t that change everything? I think it kind of would.

[00:00:54.370] – Ned
I think it might. It's the dream of composable infrastructure that I've heard about for so many years; it could actually finally become a thing. And we've got two people who are deeply steeped in the technology: Craig Rodgers, he's a solutions architect at the Camlin Group, and Chris Hayner, he's a lead consultant at HMC Technology. They've been doing the research and taking the briefings, so they're going to tell us all about CXL. Well, Craig and Chris, welcome to the show. Glad you both could make it. Today we're going to be talking about CXL, and so probably we should start at the ground level here. Craig, can you briefly describe to me what CXL is and where it sits in the overall hardware stack?

[00:01:37.610] – Craig
Okay. CXL is an evolution of something everybody's already very familiar with: the PCI Express bus. So basically it's PCI Express-level connectivity, but it's going to allow that to extend out beyond the chassis of a computer, much the same way Thunderbolt has. Thunderbolt has been that level of external connectivity, and the reason it got performance was that it was an extension of that PCI Express bus. So CXL is just the new way of letting us plug in peripherals and components outside of the chassis.

[00:02:19.380] – Ned
Interesting. So what’s the differentiation between PCIe itself and CXL? Because it seems like they’re almost one and the same.

[00:02:27.870] – Craig
They are. If you imagine, PCIe was probably the foundation for CXL, in that it provided a form factor and slots, it provided power specifications, it provided bus connectivity up to processors; it provided that foundational layer. CXL is driving more, what would be the term, more capabilities from that PCIe bus that's already there.

[00:02:59.610] – Ned
Okay, so got you. It's taking advantage of the hardware that PCIe provides, but it's bringing its own operations and capabilities with it, outside the chassis.

[00:03:11.940] – Ethan
Craig, you said that a couple of times. What do you mean by that?

[00:03:18.670] – Craig
I'm sure we've all architected server solutions here at some point. So whenever you're architecting a solution, you take a step back and look: how much compute do I need, how much storage, how much RAM, how much SSD, how much spinning rust if you're still using that, how much network throughput? And then you arrive at a conclusion of numbers, and you would work out the total number of hosts you need that are sensible. And then you would have to take a step forward again and carve the requirements into a highly available solution. So you'd be putting this amount of memory in hosts per the guidelines, you'd be putting in these number of CPU cores, you'd be putting in these number of GPUs per server if you were doing VDI or something. What CXL is hoping to achieve is to allow you to stay back and architect with compute nodes, and architect with trays of RAM, trays of SSDs, trays of AI modules, GPUs, et cetera. So what's in an individual server shouldn't matter as much, because you're looking at that whole pool of resources rather than at the level of an individual 1U, 2U, or 4U server.

[00:04:39.370] – Craig
They want you to look at a rack scale, and potentially multi-rack.

[00:04:45.130] – Ethan
So CXL outside the chassis really means we’re moving towards that world of composable architecture.

[00:04:51.070] – Craig
Exactly, yeah, we're moving towards that now, but it's going to take time until we get there. Our initial benefits around CXL are mostly around being able to compose the amount of memory available to a server.

[00:05:06.110] – Ned
So if we're talking about memory and storage, how does CXL compare to existing technologies that are already out there, something like Intel's Optane or NVMe connectivity? Chris, what are the differences there?

[00:05:26.620] – Chris
It differs in certain ways in terms of memory utilization and cost-benefit analysis. Optane was an amazing first try; I think it was a product that was a bit ahead of its time and just never got enough utilization in the market for it to take off. What CXL does, by using the existing platform of PCIe, is use commodity hardware, and in fact they're trying to do commodity hardware across as many manufacturers as they possibly can, whereas Optane was kind of Intel's project. Intel, incidentally, was the founding member of the CXL consortium all the way back in 2019, but it has since grown to include basically every company you've ever heard of.

[00:06:06.080] – Ethan
Yeah, there's a lot of people in the CXL Consortium now. I forget how many companies, but I think we're into dozens that have signed on, and other groups that had similar technologies, Gen-Z comes to mind, have donated their intellectual property into the CXL Consortium.

[00:06:23.100] – Ned
So speaking of that consortium, I guess they’ve rallied around a specification. Where are we in the development of that CXL specification?

[00:06:36.030] – Craig
What is it?

[00:06:36.750] – Chris
CXL has a couple of major versions that are important to people. The ones that matter really are 1.0, 1.1, 2.0, and then the recently released and, at this point, deeply theoretical 3.0. But they have a roadmap for CXL that goes out well further than that. I believe it goes out to either 5.0 or 6.0. Am I right, Craig?

[00:06:58.970] – Craig
Yeah, six, maybe seven. You'll be doing silicon photonics, all sorts there.

[00:07:07.530] – Ned
So really pushing capabilities there, or at least they’re trying to extrapolate out what the potential capabilities of the spec might be.

[00:07:16.010] – Ethan
Yeah, that aligns with what I've been reading. We've got 1.0, 1.1, 2.0, and 3.0, and I am amused at your choice of words, that 3.0 is deeply theoretical, but the PowerPoint slides we've seen so far are phenomenal. They are some gorgeous illustrations of what we're going to be able to do with memory pooling and PCIe switches and the like. But I think they're more than deeply theoretical. I mean, as we said, it is a published specification that is going to enable some really cool things that we're on track for. And there's been some testing done with some of this deeply theoretical specification and so on. Isn't that right, Craig?

[00:07:54.710] – Craig
For sure, yeah. And the companies that have been longer-standing members of the CXL Consortium have obviously all been working together to agree on what that specification is actually going to be. And these days, FPGA technology is allowing them to really rapidly develop new products. There are CXL switches out there at the minute, which will be arriving on the market soon, where the ASIC doesn't even exist right now, but it's running quick enough in FPGA to perform the tasks that are needed. That's letting them have a really good, fast development cycle, and that's why they're working on stuff now two generations ahead of the one that isn't even out yet.

[00:08:40.790] – Ethan
Can we delineate the differences between 1.1, 2.0, and 3.0, as those are the major specs we keep seeing?

[00:08:48.810] – Craig
1.1 is mostly inside the chassis. 2.0 is going to give you external switching capability, and 3.0 is going to let you plug in everything, all sorts: GPUs, RAM, disks, whatever you want.

[00:09:04.850] – Ned
What type of devices are we talking about that we would typically find on a CXL bus? I think you've sort of indicated that a bit already, but obviously memory. But would you put other device types on there, and sort of how are they interacting with the CPU?

[00:09:18.990] – Chris
CXL defines, in their terminology, three types of device: Type One, Type Two, Type Three. Type One is a standard device, and it's referred to as an accelerator. So that's your GPUs or a NIC or even external processing units as they become available. Type Three, all the way on the other side, is a memory buffer. So that's a massive memory expansion, and it's really focused just on that. And Type Two is in between; that kind of has a little bit of both. So when Craig was talking about what CXL is kind of like, I really like the fact that he used the concept of Thunderbolt. CXL 1.1 allows types of expansion like that that are unique to the system itself. Meaning, if you have a Thunderbolt connection, you can connect an external GPU and use that for workloads; right now that's something that you can buy at Best Buy tomorrow. CXL is going to do the same thing, and they're going to do it over PCIe with CXL 1.1. So that's a Type One device in 1.1. They can also do Type Three devices, which is just in-box memory expansion, plugging directly into the PCIe 5 bus with an expander card or a memory accelerator that will then communicate over PCIe to a CPU that is compatible with this CXL technology.
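
Chris's three device classes can be captured in a small sketch, purely as a reader's aid. The names and descriptions below paraphrase the conversation, not the CXL specification text itself:

```python
from enum import Enum

class CXLDeviceType(Enum):
    """CXL device classes as described in the conversation (paraphrased)."""
    TYPE_1 = "accelerator (GPU, NIC, processing unit)"   # caching device
    TYPE_2 = "a bit of both: accelerator plus memory"    # in between
    TYPE_3 = "memory buffer / expansion"                 # pure memory expander

def describe(dev: CXLDeviceType) -> str:
    # Illustrative helper for printing a device class
    return f"{dev.name}: {dev.value}"
```

So an in-box memory expander card, in this terminology, would be `CXLDeviceType.TYPE_3`.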

[00:10:34.160] – Ned
Okay, so that means that the CPU itself needs to support accessing the memory that's over on the PCIe bus, as opposed to the local attached memory that's on its NUMA node, or whatever the correct terminology would be. I'm sure you'll correct me, Chris.

[00:10:51.790] – Chris
No, you got it. And actually it's an interesting distinction, because from what I understand, and we're definitely going to have to promote Craig's CXL podcast as much as possible, the most recent episode talked about the difference between the memory from an expander being shown as a simple device, or, with memory management software overlaying it, as a virtual device that has more functionality and capability. But in either case, the way that I understand it is that in 1.1 the memory shows up as a new NUMA node that just doesn't have any CPUs in it. So you're still inside the frame, you're still getting motherboard-level speed connectivity to DDR5 memory. The only cost in terms of latency is a NUMA hop.

[00:11:39.230] – Ned
Okay, so it will introduce latency, but we're still working with the same physical type of memory. So the same chips I would pop directly into the system board, in my regular DIMM slots, those same chips are going to go in this expander card, but there's just going to be additional latency and maybe a little less throughput because of where it sits.

[00:12:01.590] – Chris
That is my understanding. And with this stuff that's inside of the chassis, what's exciting is that the amount of latency we're talking about is something we've already done system design around. If you've ever installed something on VMware, you understand the NUMA boundary and how to design a system to take advantage of memory inside of the NUMA node in an appropriate way. And that's where I think some of the more advanced memory management software is going to be super interesting, because the memory can then be aggregated and pooled, and the memory management software can then say: well, this system is running at a really high rate, it's going to get the local memory first. This one is a development system, it's going to get primarily the CXL-connected memory on its own NUMA node first.
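
As a rough illustration of the kind of placement policy Chris is imagining, here's a minimal sketch. Everything in it is invented for illustration (the function name, the priority labels, the tier names); real memory-management software would work at page granularity with live telemetry, not a one-shot decision:

```python
# Hypothetical placement policy: hot workloads get node-local DRAM first;
# lower-priority workloads are steered to the CPU-less NUMA node backed by
# a CXL memory expander, which costs roughly one extra "NUMA hop".

def place_allocation(priority: str, local_free_gb: float,
                     cxl_free_gb: float, request_gb: float) -> str:
    """Return which memory tier a request should land on."""
    if priority == "high" and local_free_gb >= request_gb:
        return "local-dram"   # lowest latency: same NUMA node as the CPU
    if cxl_free_gb >= request_gb:
        return "cxl-pool"     # one extra NUMA hop of latency
    if local_free_gb >= request_gb:
        return "local-dram"   # fall back if the CXL pool is exhausted
    return "deny"             # nothing fits
```

For example, a high-priority system with local headroom lands on local DRAM, while a development system with the same request is steered to the CXL pool.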

[00:12:48.870] – Ethan
This gets more interesting when we get into the architectures that involve a PCIe switch. A PCIe switch sitting in between, say, the memory and the host that's consuming that memory adds yet another latency hop, meaning certain workloads are going to be fine with that additional latency, but others maybe not. And so there are more design considerations for what your CXL bus looks like and whether or not you include a PCIe switch, depending on what the workload demands are and the speed required to access the memory being consumed.

[00:13:19.630] – Chris
Yeah, that's exactly right. And that's where you get into CXL 2.0. And this one I haven't looked as much into. Craig, do you have use cases or examples around 2.0 that make sense?

[00:13:29.760] – Craig
Yeah, for sure. A couple of things that I want to clarify on there. With CXL, it's not exactly like NUMA, because you're not using the QPI link, say, between processors. There's obviously going to be additional latency in a hop, but we're talking multiple orders of magnitude faster than, say, a network. We're not in milliseconds, we're in nanoseconds; some manufacturers are saying they can swap memory between devices in 100 nanoseconds. Something that quick. It's really quick. Jumping to 2.0 and taking it outside the chassis: the external connectivity brings about a whole new wave of challenges, many of which I don't think we've even thought of yet. One big one is how are we going to secure the memory in that? How do we know that that memory is allocated to that, say, host or whatever? At some point they may abstract it and present it straight into a virtual machine; we don't know when we'll have that capability. It's a virtual Linux device, as you were touching on there, Chris, and there's a couple of different ways it can be presented to a machine. So it's going to open up a new wave of challenges around control and orchestration.

[00:14:55.380] – Craig
How do we control what host gets access to what RAM?
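
Craig's "orders of magnitude" claim holds up with back-of-envelope arithmetic. Using the ~100-nanosecond device-to-device figure he cites, and assuming a typical ~0.5 ms datacenter network round trip (my placeholder, not a number from the episode):

```python
cxl_hop_ns = 100          # Craig: some vendors claim ~100 ns to swap memory over CXL
network_rtt_ns = 500_000  # assumed ~0.5 ms round trip over a datacenter network

ratio = network_rtt_ns / cxl_hop_ns           # how many times faster the CXL hop is
orders_of_magnitude = len(str(int(ratio))) - 1
print(f"CXL hop is roughly {int(ratio)}x faster, about "
      f"{orders_of_magnitude} orders of magnitude")
```

With these placeholder numbers, the CXL hop comes out several thousand times faster than the network round trip, which is why "multiple orders of magnitude" is a fair characterization.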

[00:14:59.490] – Ethan
Yeah, I had a briefing from Broadcom on this, and they got into this specific topic: these are challenges that have not been nailed down yet, exactly how all of that is going to be dealt with. If you have multiple hosts on a shared PCIe bus, or the CXL bus through a switch, that can hit the same memory, in theory that's a problem. And so you have to allocate the memory to one host or the other. How do you do that? That's a resource management challenge, and it isn't known how that's going to be solved yet, as I understand it. And then you used the word security. Craig, were you getting at: if there's yet another actor that's plugged in there that's like, hey, hi, I'd like to get access to this memory location, and then they just walk in and access it and see what they can pull? Is it that kind of a thing?

[00:15:46.080] – Chris
For sure.

[00:15:46.450] – Craig
What are the biggest attack surfaces we have at the minute? VMware, say. If you get access to VMware, you've got access to all the workloads underneath. There's going to be something that composes CXL moving forward. The company that wins might not even exist right now, but something is going to have to orchestrate that and compose it, and if it's composable, it's exploitable. If you have the ability to allocate that, and allocate who has access to that, that's an attack surface that's going to have to be controlled.

[00:16:19.230] – Chris
It's one we have experience with. It's not unlike if you accidentally gave administrator-level control to your blade chassis fabric: that person, in a similar way as with VMware, could do whatever they wanted with whatever is connected. So CXL is going to be connected, and it's going to have to be secured in the same exact way.

[00:16:37.730] – Craig
So instead of a chassis manager, it will be a rack manager.

[00:16:40.360] – Ned
Yeah, probably. This is not entirely unlike when we introduced storage arrays, and you had a shared pool of disks, and then you had an operating system dedicated to the storage array that was going to hand out permissions and access to those various LUNs that were carved out of the storage array. And you did have the capability in some cases to give read-only access to some machines, while one machine would get read-write access to that LUN. That definitely raises some security concerns, because if the wrong machine gets read access, it could have access to potentially sensitive data. And if you think there's sensitive data on disk, oh, wait until you get to memory, because memory is usually assumed to be dedicated to a single thread. And so processes do tend to put really sensitive information in memory, assuming that it is secure. What if it's not? I'm not asking for a solution. I'm just picking holes in potential problems we're going to have to deal with down the line.
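
Ned's storage-array analogy maps naturally onto a small access-control model. The sketch below is hypothetical; CXL's actual fabric-management interfaces are still being defined, so the class and method names here are mine, not any real API:

```python
# Hypothetical LUN-style access control transplanted to a shared CXL memory
# region: one host gets read-write, others read-only or nothing at all.

class SharedRegion:
    def __init__(self) -> None:
        self.acl: dict[str, str] = {}   # host name -> "rw" or "ro"

    def grant(self, host: str, mode: str) -> None:
        if mode not in ("rw", "ro"):
            raise ValueError("mode must be 'rw' or 'ro'")
        self.acl[host] = mode

    def can_write(self, host: str) -> bool:
        return self.acl.get(host) == "rw"

    def can_read(self, host: str) -> bool:
        return host in self.acl
```

The security worry in the conversation is exactly this table: whoever controls the ACL controls which host can see, or scribble on, potentially sensitive memory.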

[00:17:44.950] – Craig
We have to assume that they already have something in mind. Companies like AMD and Intel are baking even software-level security into the hardware of the chip. With the people who are on board in the consortium, security will have been considered. Our first market is most likely to be the hyperscalers. Given the CXL 1.1 feature of RAM composability, they stand to gain the most commercially in optimizing the use of RAM; over 50% of the server cost is memory. So it makes sense to take the biggest thing and try and get operational efficiencies there. But what I have not seen is huge amounts of detail on how they're going to secure it properly yet. Hopefully we'll find out soon.

[00:18:34.370] – Ethan
We've been talking a lot about memory here, because that does seem to be most of the noise. As you watch CXL presentations, they talk a lot about memory and memory use cases that are going to be enabled by CXL. What about other peripherals? I think, Chris, you might have mentioned NICs, network interface cards, being on a CXL bus of some sort. So are there other use cases for peripherals like NICs that are interesting that CXL is going to enable?

[00:18:56.630] – Chris
They vary based on things that actually exist and the hypothetical. But things like NICs and GPUs are on the roadmap as obvious low-hanging fruit. Any component you can think of that was part of a single standalone server could conceivably be put on the PCIe 5 or 6 bus using CXL. Some of the things that are in between, the Type Two devices that I was talking about, are particularly interesting and are a little bit more in the future. One example would be a GPU that has additional memory attached to it that could be shared intermittently between systems as it's needed. So let's fast-forward ten years. You're in, I don't know, a pod of five or six systems that are connected via CXL. Not everybody is going to be running a full GPU workload at the same time. So you could have a situation where you kind of claim the GPU; man, I'm thinking now in mainframe terms. You claim it for a period of time, it's dedicated to your workload, and then when you're done it's released back into the pool for someone else to claim. So that's kind of an example of a product that already exists.

[00:20:01.080] – Chris
Everybody understands the GPU.

[00:20:03.380] – Chris
To Craig's point, memory is really expensive. Being able to reuse it means you save a lot of money, and TCO benefits massively.

[00:20:11.770] – Ethan
It's not sitting there in a slot taking up server real estate when you don't need it most of the time. You have it on a bus; it's community property. Anybody can get in there and use it. It's a time-sharing kind of thing.

[00:20:23.850] – Chris
Yeah, and that's full CXL 3.0, because at that point we're talking full fabric connections and multi-layer switching. Also, PCIe 6.0 doubles the bandwidth of PCIe 5.0, which is going to make a lot of this technically and bandwidth-wise practical.

[00:20:41.070] – Craig
There's another aspect there regarding memory that this is going to give us. We've had tiers of storage for decades, but we've never had tiers of RAM. If you had a DDR3 system, you had DDR3 RAM; same with DDR4 and DDR5. CXL is going to let us use tiers of memory, because now there's going to be direct-attached, fast-connected RAM plugged into a DIMM on the motherboard, and there's going to be potentially a huge pool, four times the size, behind that on CXL. It has not-dissimilar latencies, but it won't have the same throughput, and it doesn't have to be DDR5. It could be DDR4. It could be DDR3. Do all of your instances in AWS, all your EC2 instances, need DDR5 RAM, or do they maybe only need DDR3? They might not need the fastest RAM available, which is what DDR5 is right now.
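
To put a toy number on why tiering matters, here's a blended-cost comparison. The per-gigabyte prices are placeholders I made up; only the structure, a small fast local DDR5 tier in front of a large cheaper DDR4 CXL pool, comes from Craig's point:

```python
# Toy cost model: local DDR5 plus a DDR4 pool behind CXL.
# Prices per GB are invented placeholders, not real quotes.

def blended_cost(local_gb: int, pool_gb: int,
                 ddr5_per_gb: float = 8.0, ddr4_per_gb: float = 3.0) -> float:
    """Total memory cost for a mix of local DDR5 and pooled DDR4."""
    return local_gb * ddr5_per_gb + pool_gb * ddr4_per_gb

all_ddr5 = blended_cost(1024, 0)    # 1 TB, everything node-local DDR5
tiered = blended_cost(256, 768)     # 256 GB local DDR5 + 768 GB CXL DDR4
print(f"all-DDR5: {all_ddr5:.0f}  tiered: {tiered:.0f}")
```

With these made-up prices, the tiered configuration comes in at roughly half the cost for the same total capacity, which is the kind of arithmetic that makes the idea attractive when memory is over half the server bill.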

[00:21:40.350] – Ned
Right. When you said AWS, that triggered a thought in terms of how the hyperscalers are going to enjoy this, because the way that they decide to buy hardware and the way that they size their instances is driven by the limitations per box. I can put X amount of memory in a box, I can have X number of CPUs, and so I will carve it up thus and such that I get the maximum amount of utility out of each metal box as I sell it off as instances. And if I want to have GPU-enabled virtual machines, then each of the hosts needs to have that GPU, which can then be time-shared out to the individual virtual machines in that box. This completely changes that model for the hyperscalers. They can now buy individual components and then compose them together as needed to service whatever instance type and size you want.

[00:22:34.110] – Craig
And there's a lot of workloads that go hugely vertical: SAP HANA, HPC workloads. There's a lot of workloads that will benefit from huge amounts of RAM. Virtualization is a gray area; I don't know how they're going to implement that yet. It's obviously going to come, but it's at least going to give the hyperscalers the option to provide these huge amounts of RAM in machines, and they might only have a couple of those in a rack with other workloads. It's going to give them a lot of options. Operating at that kind of scale is difficult, but I'm sure they'll have teams of very smart people on it.

[00:23:28.640] – Ethan
Let's pause the podcast for a bit. Research suggests that 17 out of 20 people will click away due to buffering or stalling, and I am definitely one of those 17. There's lots of stuff to watch out there, and there's no reason to wait around. If your company delivers online media, consider CDN77. They are a globally distributed content delivery network, and they're optimized for video on demand as well as live video. CDN77 is not some newcomer to the scene. They are used today by many popular sites and apps, including Udemy, ESL Gaming, live sports, and various social media platforms. And that makes sense to me. CDN77 has scale. They have a massive network with distribution points all over the globe and plenty of redundancy. While that means you shouldn't have problems, what happens when you do need tech support? CDN77 offers twenty-four-seven support staffed by a team of engineers. No chatbots, no tickets getting routed around queues while no one actually does anything. Just no-nonsense dedication to your issue, to get your online media back to 100%. To prove that CDN77 will work for your content delivery, visit cdn77.com/packet-pushers to get a free trial with no duration or traffic limits. That's cdn77.com/packet-pushers.

[00:24:49.290] – Ethan
For a free trial you can push hard for serious proof-of-concept testing: cdn77.com/packet-pushers. And now back to this week's episode.

[00:25:00.050] – Ethan
Does the CXL bus enable communications without the CPU being in the middle? Can devices on the CXL bus talk to each other directly?

[00:25:08.770] – Craig
Not yet. 3.0, I think.

[00:25:13.560] – Chris
Yeah, 3.0 and 3.0-plus. They actually have things on the roadmap where you have micro-CPUs or chiplets or things like that at the edge of whatever your compute platform is that will communicate back to the main server only when necessary, which should speed things up, with much smaller, much more efficient, single-purpose CPUs that are connected over CXL.

[00:25:34.890] – Craig
And even existing CXL switches that are in play already have 256 lanes of PCI Express that they can process. The fourth-generation Intel Xeon Scalable processor is only giving you 64, but the switch can process 256. So if you have devices behind that switch that can talk to each other, technically it's four times quicker than what would have gone through a single CPU.

[00:26:00.850] – Ned
That's interesting. It builds a different sort of client-server architecture, and they're not even servers. It's just little devices talking to each other. This GPU needs a little more memory; it's just going to grab a slice of memory from the memory shelf. It doesn't have to talk to the CPU about that, it just grabs some more memory for its process and away it goes for a while. That's very different. I'm curious what this does to network traffic, because in my mind, if I have all these different servers and they're all connected to the PCIe switching fabric, then I don't need to have NICs riding a traditional Ethernet necessarily. Or I could put the Ethernet over the PCI Express bus. I'm just spinning here. But it does change the way that individual servers and virtual machines might talk to each other using the networking stack.

[00:26:53.490] – Craig
I think IPUs, infrastructure processing units, are already offloading a lot of that, so it doesn't go back over the PCIe bus. You want to try and free up the PCIe bus to deliver as much throughput as possible for the traffic that needs it; you don't want to saturate the bus with network traffic if you don't have to. IPUs will move that out behind the switch, so if they can go straight out to the network and not interfere and not take up bandwidth on the PCIe bus, that could be good. But you're right, it's going to transform networking.

[00:27:29.710] – Ethan
That scenario you described was almost like host-to-host traffic, though. Were you suggesting that you could have host-to-host comms, like, I don't know, RPC calls between microservices, running across the PCIe bus?

[00:27:40.770] – Ned
Potentially. I mean, why run it out of a network card and hit the Ethernet, and whatever layer one you have there for your Ethernet traffic, when you can just ride the PCI Express bus with your DPUs or whatever? Now we're using a lot of acronyms.

[00:27:58.230] – Chris
Yeah, that’s interesting. I think a lot of the separation will maintain itself if for no other reason than it’s going to be team based and security based. Not unlike how we have out of band networking in a completely separate Ethernet port than production networking. There’s no reason to think that that kind of separation wouldn’t still exist even if there was an opportunity to communicate much more quickly through the PCI bus. That would be for private traffic.

[00:28:24.690] – Chris
So once again you would have your Oracle RAC nodes talking to each other through PCIe, but talking to the Internet through Ethernet, is my suspicion as to how it would end up.

[00:28:32.450] – Craig
I think a lot of these new products are probably not going to hit the market as the cheapest option; new products rarely are, and they're going to have to get that money back. So I imagine, certainly short and near term, networking equipment will cost a lot less. But when the options are there, somebody will use it.

[00:28:53.770] – Ethan
Another question, kind of related: does memory access over CXL, in probably a 2.0 or 3.0 iteration, obviate the need for RDMA or InfiniBand technologies? Maybe.

[00:29:09.550] – Craig
InfiniBand has always been five years ahead of Ethernet in terms of latency, but at what cost? It's going to come down to money again. InfiniBand has always been the lowest-latency choice, but Ethernet is prevalent.

[00:29:30.370] – Chris
InfiniBand is like the opposite of cold fusion. In five years, everyone's totally going to be using InfiniBand.

[00:29:40.410] – Ned
So we’ve mentioned the fact that some of this hardware doesn’t exist yet. In fact, a lot of it doesn’t. What actual hardware can I buy today that implements CXL? Can I buy anything?

[00:29:52.110] – Craig
You can buy a PCI Express expander card that lets you plug in four DIMMs and provide half a terabyte of RAM through a PCI Express slot.

[00:30:03.650] – Ethan
That would just cost money.

[00:30:05.810] – Chris
You will of course need a CPU and a system that supports CXL.

[00:30:10.530] – Ned
Okay, so that raises the larger question, is there a CPU and a system board that currently supports CXL?

[00:30:18.230] – Craig
If you are friendly with Intel or AMD, yes, and they will lend you an engineering sample. But until we get fourth-generation Xeon Scalable, Sapphire Rapids, or AMD's equivalent, there's nothing on the market right now.

[00:30:34.570] – Ethan
This is a hardware issue. This isn't just: I can take my 2012-era HP Z800 workstation in my basement that's running ESXi 7 right now and upgrade that bad boy to CXL. No, ain't going to happen. Got to have the board, got to have the chips. Boo. But also, yeah, I want the speed.

[00:30:54.190] – Chris
It looks like AMD is going to win the race. They claim that variants of Genoa are going to be releasing in actually just about a week, on November 10. But, shoot, what's it called? Sapphire Rapids, and Ampere Computing has been working on an Arm chip. Both of them look like they're going to release in, quote, early 2023. Did I mention that CXL is a time traveler?

[00:31:23.030] – Ned
We're just going to throw those motherboards in the DeLorean and send them back. Okay, so that's good to know, that the hardware is not way off in the future; it's coming out very soon. And they developed the system boards with the CXL specification in mind.

[00:31:42.350] – Craig
Kudos to them. They held back the architecture until the CXL spec was locked in. They delayed it to make sure they could deliver what everybody was doing with CXL, and Intel has also hit some roadblocks around fixing bugs, really, on Sapphire Rapids. They spent a fortune. I think every time they spun up a new wafer design, it cost them something like 10 million just to come up with the new die, and they've had to do that twelve times to fix Sapphire Rapids. So 120 million just to spin up new wafers until they were like, yeah, we're happy to put that out. At least they did it; they could have just shipped it, right? I'm happy there was QA, nine-figure QA, but...

[00:32:29.470] – Ned
Yeah, I wouldn't want to be that QA engineer who's like, no, go spend another 10 million.

[00:32:38.130] – Craig
In the next three months. You know, if they're three months behind, six months behind, in the grand scheme of things, at least they did it. Another interesting point, though: PCIe 3 came out when, 2013 or '14? And we've been using it since then. And PCI Express 4 only just really came out with Ice Lake at the start of the year, and we're already moving to PCIe 5. The cadence, the frequency with which we upgrade PCI Express, and hence CXL buses, is going to have to increase to get this out to the market properly and get people the functionality they want. If they did it every two years instead of every four to eight, that could get us to CXL 3 within five years.

[00:33:31.060] – Ethan
I thought that was the cadence. I thought that we were trying to get to CXL 3 by the 2025 timeframe. Yeah, we're late 2022 as.

[00:33:38.250] – Craig
We record this, which coincidentally is when all of the fabs they're building are on their way to coming online.

[00:33:47.110] – Craig
What are the chances?

[00:33:48.950] – Ethan
All right, so we have had a good bit of pie in the sky in this discussion, because we've all been watching these CXL presentations where it's the future and that's what we're marching towards, and this is coming, and so on. But what can we do today? Do we have use cases either today or coming very soon? What are the early use cases that people in the industry are excited about with CXL?

[00:34:11.490] – Chris
All the early use cases, and by that I mean realistically getting into the hands of companies to use in production: AMD realistically said three to five years before this stuff becomes worldwide. I mean, there's the supply chain to consider; there's the amount of time and proof of concept that has to go into it before you see this stuff at Best Buy. But to Craig's point, there are products that actually exist. We saw a couple of them at the OCP Global Summit, where you could see memory acceleration in a system happening in real life. That was from a company called Astera Labs, but they're not the only one. So they definitely are trickling out to the market right now. We're in a very cautious place, I think, with these vendors, because the ones that I looked at all basically said, contact us for more information.

[00:35:00.430] – Craig
Yeah, you know, Optane has been turbulent this year, we'll call it. But it's given us a lot of the technology solutions that we needed to be able to do tiered memory, purely because it gave us access to a different tier of RAM. Granted, it was over the same bus, but it got presented as a device. So a lot of the engineering challenges around tiers of RAM have already been sorted. You have companies out there, like MemVerge, that are already providing optimizations around RAM. And if you're running, say, a spot instance in EC2 and you're worried that instance is going to get reclaimed, but it's got a huge workload on it, you can take snapshots of RAM, and if that spot instance gets reclaimed, restore that snapshot, the active machine state, to a new spot instance on identical hardware. So a lot of the engineering required to do it already exists. It just needs to be tweaked and adopted.
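
Craig's spot-instance example can be sketched in miniature. This is a toy stand-in, not any vendor's actual API: real products snapshot the full memory image of a process or VM, while this just serializes application state with the standard library.

```python
import pickle

def snapshot_ram(state: dict) -> bytes:
    """Serialize an application's working state (a stand-in for a
    full memory-image snapshot taken before spot reclamation)."""
    return pickle.dumps(state)

def restore_ram(blob: bytes) -> dict:
    """Rehydrate the saved state on a replacement instance with
    identical hardware."""
    return pickle.loads(blob)

# A long-running workload accumulates in-memory state...
state = {"epoch": 42, "model_weights": [0.1, 0.2, 0.3]}
blob = snapshot_ram(state)       # spot reclaim warning arrives
restored = restore_ram(blob)     # ...restored on the new instance
print(restored["epoch"])         # → 42
```

The point of the sketch is only the shape of the workflow: capture before reclamation, restore elsewhere, carry on.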

[00:36:05.650] – Ned
Yeah, okay, that makes sense. But I guess the other big thing about Optane is it was persistent. So it wasn't just that it was another tier of memory; it was also persistent memory. It could be either. Okay, we're not going to get either with the CXL memory expanders we've been talking about, because those are just regular DDR4 or DDR5 DIMMs, right?

[00:36:26.650] – Craig
Yeah, but I don't see Optane going away, because if you look at the latency chart, if you go from spinning disk to SSD, you reduce an order of magnitude. If you go from SSD to NVMe, you reduce an order of magnitude. If you go from NVMe to Optane persistent memory, you reduce an order of magnitude. And then RAM was another. At the minute, it slotted really nicely into the latency chart there, and in terms of latency, I don't see that performance of persistent memory going away. It's a massive step backwards if we go back to NVMe; we're ten times slower than what we were with Optane. Granted, Optane didn't get anywhere near as much adoption, but Micron and Intel have a huge stock of it. And there are Optane memory products coming out that aren't even on the market yet. You know, the 300 series is still to come out, it's not even out yet, but they've already announced that generation of Optane, and there's still a whole lot of servers that you can plug it into.
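
Craig's latency ladder can be illustrated with placeholder numbers. The nanosecond figures below are chosen purely to mirror his "one order of magnitude per tier" description; they are not measured latencies for any real product.

```python
import math

# Illustrative latencies in nanoseconds -- placeholder values only,
# spaced exactly one decade apart to match the description above.
tiers = [
    ("spinning disk", 10_000_000),
    ("sata ssd",       1_000_000),
    ("nvme ssd",         100_000),
    ("optane pmem",       10_000),
    ("dram",               1_000),
]

for (slow_name, slow_ns), (fast_name, fast_ns) in zip(tiers, tiers[1:]):
    orders = math.log10(slow_ns / fast_ns)
    print(f"{slow_name} -> {fast_name}: {orders:.0f} order of magnitude")
```

Dropping Optane from the ladder and falling back to NVMe is the "ten times slower" step Craig is describing.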

[00:37:29.870] – Ned
I'd feel a little odd buying a server with Optane, knowing that it's an end-of-life product.

[00:37:37.970] – Craig
Built up so much stock, microns stopped building it down piece ago. They have so much stock there, I think they wrote off half a billion. But there’s a number of diet there. I wouldn’t want to bat either, though.

[00:37:50.390] – Ethan
From a use case perspective, an early use case perspective, is there a play for CXL with AI and ML workloads?

[00:37:57.150] – Chris
I think that's exactly where the Type 2 devices come into play as well. We talked a little bit about it with GPUs, which is kind of shorthand for AI/ML processing anyway, right? So whatever type of chiplet you want to talk about can be put into an accelerator and attached, or assigned temporarily or otherwise, to a system with CXL. So I think a couple of things like that will come out sooner rather than later. They make an obvious use case. Not every system needs 16 neural-network-type processors in it. And if you've bought a system and all of a sudden you want that functionality, this is a great way to connect it directly without having to buy a brand new system.

[00:38:33.490] – Craig
Chiplets are exciting too. Even the high-bandwidth memory, have you seen that, HBM? It's ludicrously fast RAM, way faster than system RAM. And say you don't need a huge amount of cores, but your workload will benefit from very fast RAM. So imagine four chiplets on a CPU's silicon. You might say, I'm happy enough for 64 gig of RAM to be enough for my workload if it's really fast, and the rest is two chiplets of CPU cores and one chiplet of GPU cores. You're going to have that mix-and-match functionality. So the number of SKUs coming out of Intel and AMD in future is going to be phenomenal, just the sheer combinations, the choices that you'll have at a chiplet level. It's massively increasing their SKU complexity there.

[00:39:25.820] – Chris
Yeah, and that's really the dream: to have an entire rack or an entire row of CXL hardware that you can configure at will, update one device or one module or one node at a time, modify on the fly, and have everything running over a PCIe 6 bus. Except for Ethernet, of course.

[00:39:45.050] – Ethan
It makes upgrading servers, upgrading all the components, interesting. If this RAM is too slow, we're going to upgrade the shelf of RAM today: pull it out of the pool and re-add it to the pool as a faster thing. Same with any of the components that would be in a system attached to the bus. We're pulling this stuff out today and plugging it back into the CXL bus, and now it's easier to do those replacements. There are a lot of assumptions about abstractions and how you're doing your workloads and all the rest, but it becomes feasible. You don't have to pull out an entire RU or RUs' worth of servers as a single compute node where everything works together in the box; you can upgrade individual components to meet your needs.
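
The pull-from-the-pool upgrade Ethan describes can be sketched as a toy resource pool. The class, device IDs, and descriptions here are hypothetical, just to show the shape of the idea: devices come and go from a shared pool without touching the hosts that consume them.

```python
class DevicePool:
    """Minimal model of a composable-infrastructure device pool."""

    def __init__(self):
        self.available = {}  # device_id -> description

    def add(self, device_id: str, description: str) -> None:
        self.available[device_id] = description

    def remove(self, device_id: str) -> str:
        """Pull a device out of the pool, e.g. for an upgrade."""
        return self.available.pop(device_id)

pool = DevicePool()
pool.add("mem-shelf-1", "DDR4 expander, 512 GB")

# "Upgrade the shelf of RAM": pull it from the pool...
old = pool.remove("mem-shelf-1")
# ...and re-add it as a faster device; hosts keep drawing from the pool.
pool.add("mem-shelf-1", "DDR5 expander, 512 GB")
print(pool.available["mem-shelf-1"])  # → DDR5 expander, 512 GB
```

A real fabric manager would also handle coherency, hot-plug, and host assignment, but the pool-of-parts abstraction is the core of the composability pitch.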

[00:40:32.170] – Craig
Another nice point about CXL, now that the spec has been ratified, is you can make all your hardware backward compatible. So you could release a CXL 3 device now that meets the known CXL 3 standard, use it at 1.1, and as long as the vendor will provide hardware support in that device, it might do for years to come in terms of expansion. Obviously the hardware would run at PCI Express 5 speeds now, but functionality-wise, it could be capable in the future. I can't remember, it may have been Steven Glasser from Nvidia who said maybe there's an overhead in providing that backwards compatibility. Not all vendors may do it, and it might not make sense for all components that we plug in, but it's an option.

[00:41:29.450] – Ned
The thing that I keep thinking about is the operating systems that are going to be necessary to handle this composable architecture, kind of like how we had to invent the hypervisor to deal with virtual machines. And Chris, you had mentioned earlier the concept of, wow, this sounds an awful lot like we're building a mainframe. So are we going to see a resurgence of a giant mainframe-style operating system that oversees multiple racks of CXL gear and just composes things for workloads? It's a thought that occurred to me. What does an operating system look like when the hardware capabilities have changed so drastically?

[00:42:13.050] – Chris
Right. You also hit on an important point for the present, which is that the operating systems have to support the CXL protocol. There are three major subsets. We don't have to get into the details of it, but it's CXL.io, CXL.cache, and CXL.mem. These are protocols that exist under the CXL banner. Current versions of Linux already support it. Like I said, we've seen demos; all the demos are done with Linux. I can't imagine why. I'm not positive where Microsoft is, to be honest with you, whether or not they support it right now. But they're going to need to support those three protocols in the short term to make any of these devices functional, or at least maximally functional. When we talk about CXL 3, CXL 4, and full composability, I completely agree with you. I think there's going to be a revolution in the way that operating systems, or perhaps I should restate that as the way that hypervisors, are written. Because it's not Linux that needs to change, necessarily; it's ESXi that needs to change.
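
The three sub-protocols Chris names map onto the device types the CXL specification defines: Type 1 accelerators (a coherent cache but no host-managed memory) use CXL.io and CXL.cache, Type 2 accelerators with their own memory use all three, and Type 3 memory expanders use CXL.io and CXL.mem. A minimal sketch of that table; the helper function is just an illustrative way to query it:

```python
# Device types per the CXL specification, mapped to the sub-protocols
# each one uses. CXL.io is the baseline every device speaks.
DEVICE_TYPES = {
    "type1": {"cxl.io", "cxl.cache"},             # e.g. a coherent SmartNIC
    "type2": {"cxl.io", "cxl.cache", "cxl.mem"},  # e.g. GPU/accelerator with memory
    "type3": {"cxl.io", "cxl.mem"},               # e.g. a memory expander
}

def protocols_for(device_type: str) -> set:
    """Return the set of CXL sub-protocols a device type uses."""
    return DEVICE_TYPES[device_type]

print(sorted(protocols_for("type3")))  # → ['cxl.io', 'cxl.mem']
```

This is why a memory expander is the simplest device to support first: the OS only has to handle CXL.io plus CXL.mem, with no cache-coherency traffic from the device side.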

[00:43:14.900] – Ethan
I was just thinking about that, because if you guys have followed the story of Project Monterey, allowing direct hardware access from a VM through ESXi to some unusual piece of hardware like a DPU, let's say, that's been tough, hard work for the hardware vendors and VMware to make all that go. This just sounds like another significant challenge to me.

[00:43:39.190] – Craig
You mentioned a vendor that is strangely absent from the CXL consortium page. You’re not going to see VMware there anywhere at the moment.

[00:43:54.010] – Ned
That’s interesting because they are going to need to grapple with it at some point.

[00:43:59.070] – Craig
But they've been very supportive of NVMe over TCP, NVMe over Fabrics. Are they going their own composable route? I don't know. I don't know if anybody does at the minute, but they're not signed up yet.

[00:44:22.130] – Ethan
So, interesting prognostication and speculation we can do here. Broadcom has acquired VMware. Broadcom is all over this CXL thing, and has all the chips in the pipeline that are going to be a part of the chain. So maybe that eases the transition and helps grease the skids for VMware to get their hypervisor on board and enable VMs to leverage CXL-attached hardware. Maybe. I just made that up, everybody. Nothing official; I heard it from nobody. I'm literally just thinking out loud.

[00:44:55.790] – Craig
The current architects out there working on data center platforms outside of, let's say, public cloud, anybody looking at multi-cloud or hybrid cloud where VMware specifically fits, are going to look for the VMware logo and go, all right, VMware's on board, I know VMware are adopting it. And obviously in a year or two they're going to be under the Broadcom umbrella. So I don't know; it'll be interesting to see what happens. All right.

[00:45:19.850] – Ned
Well, gents, this has been fascinating, and it's certainly got my mind churning on CXL and what the future possibilities look like. If folks are hungry for more CXL, where would you point them? Is there a cannon that's just shooting off CXL information for them to subscribe to, Craig?

[00:45:37.440] – Craig
Coincidentally, yes, there is. So I am currently a co-host on a podcast. You'll find it at Utilizing Tech, and this season it's Utilizing CXL. And all we are going to do is talk about CXL, once a week, for the year ahead. We've recorded a number of episodes already. Some of the guests that Stephen has lined up are fantastic, and I hope you're as excited to listen to them as I am to speak to these people and try and find out more.

[00:46:18.700] – Ned
Yeah, so I’ve listened to two episodes already and that got me excited for this conversation.

[00:46:24.290] – Ned
Yeah. If folks want to keep listening for more, we'll include the link in the show notes. Chris, is there somewhere you'd point people for your own insane ramblings?

[00:46:33.550] – Chris
My insane ramblings are on my blog, and I also co-host a podcast with one Ned Bellavance called Chaos Lever, which I am sure is going to touch on CXL from time to time, from here on, forever.

[00:46:48.050] – Ned
We definitely don’t have that in the content pipeline at all. Cool. Craig, what about you? Where can folks find more just from you?

[00:46:56.640] – Craig
Just me, @CraigRodgersMs on Twitter, and also at craigrodgers.co.uk. And I'm pretty frequent on LinkedIn; it's Craig Rodgers.

[00:47:06.540] – Ned
Awesome. Well, Craig Rodgers and Chris Hayner, thank you for joining us today on Day Two Cloud. And hey, virtual high fives to you for tuning in, dear listener. If you have suggestions for future shows, we'd love to hear about them. You can hit either of us up on Twitter at Day Two Cloud Show, or go to the Day Two Cloud website, where we have a whole link just for you to click on and fill out a little form. We'll scurry off and do some research and make an episode happen. So, you know, click the link, do the thing. Hey, if you're a vendor out there and you've got a way cool cloud product, you might want to share that product with our audience of IT professionals. You can do that by becoming a Day Two Cloud sponsor. You'll reach several thousand listeners, all of whom have problems to solve. Maybe your product fixes their problems; we'll never know unless you tell them about your amazing solution. You can find out more at packetpushers.net/sponsorship. Till next time, just remember: cloud is what happens while IT is making other plans.
