Day Two Cloud 188: Out-Of-Band Management And Infrastructure Automation With ZPE Systems (Sponsored)

Episode 188

Welcome to Day Two Cloud! In today’s episode, out of band management for your infrastructure. Only, not simply OOB. No, no. Think instead of the device supplying you the out of band connectivity also being the device that’s the box you run your automation tooling on.

Our sponsor is ZPE Systems, and we talk through out-of-band management network design. If your idea of out-of-band management is a jump box and some terminal servers, there’s a lot more to the story when you bring automation tooling into the picture.

Our guests are Rene Neumann, Director of Solution Engineering at ZPE Systems; and Frank Basso, EVP of Operations at Vapor IO. Rene knows how all the ZPE gear works, and Frank is a ZPE customer who will chat about Vapor IO’s use of ZPE gear in their edge compute sites.

We discuss:

  • ZPE Systems’ approach to out-of-band management
  • What ZPE Systems means by automation infrastructure
  • What a real-world deployment looks like
  • How Vapor IO is using ZPE Systems
  • More


  1. Automation should be implemented through a dedicated (control) network
  2. Automation needs (short explanation of infrastructure parts needed)
  3. The same platform should be used for other projects, including security

Show Links:


Rene Neumann on LinkedIn

Frank Basso on LinkedIn



[00:00:04.050] – Ethan
Welcome to Day Two Cloud. In today’s episode: out of band management for your infrastructure. Only, not simply out of band. No. Think of the device supplying you the out of band connectivity as also being the box you run your automation tooling on. Our sponsor today is ZPE Systems, so we’re going to talk through out of band management network design. If your idea of out of band management is a jump box and some terminal servers, there is a lot more to the story when you bring automation tooling into the picture. Our guests are Rene Neumann, Director of Solution Engineering at ZPE Systems, and Frank Basso, EVP of Operations at Vapor IO, aka that operations guy. Rene knows all about how the ZPE gear works, and Frank is a ZPE customer who’s going to chat about Vapor IO’s use of ZPE gear in their edge compute sites. So let’s kick off the conversation with Rene. Rene, I think ZPE Systems is new to the Day Two Cloud audience, so you’ve got to start from the beginning here. Who is ZPE Systems and what do you folks do?
[00:01:03.670] – Rene
So, ZPE Systems: we are a US company, around ten years old, and we have been in the out of band business since then. That is really our core. So think about traditional out of band: console, IPMI, KVM, all that operations, security, management. That is exactly where we came from, and that is still what we do. It’s our core business, really. But we do a good bit more, because essentially we use out of band as a starting point, and then we go from there to package the whole thing up using DevOps and NetOps concepts to really provide an automation infrastructure platform for our customers.
[00:01:44.850] – Ethan
Okay, we’re definitely going to dig into the DevOps component. But you guys have been around for ten years. Rene, how come I haven’t heard of you? I’ve done lots of out of band and KVM and so on over the years, but ZPE hasn’t come across my radar. Am I just not paying attention?
[00:01:58.850] – Rene
Maybe? No. So we really grew in the US with really large customers. That is where we’re coming from. That is our core business. So we have multi-thousand-node deployments, and over the last couple of years we have really started to break into the enterprise space. So I would say in the enterprise space we are still relatively new, and that is probably the reason why you haven’t heard of us. And there hasn’t really been that much of a buzz either.
[00:02:23.140] – Ethan
No, that’s exactly fair. If you’re just breaking into the enterprise space now and starting to do market penetration there, that’s where I’ve spent most of my time as a network engineer and data center ops kind of a human. I haven’t worked at a hyperscaler or for a cloud provider, that kind of thing, where you guys seem to have got your start. Now, ZPE talks a lot about automation infrastructure, that the platform you make is for automation infrastructure. What does that mean? Because you’re not competing with Ansible; that’s not what you do. So what do you guys mean by automation infrastructure?
[00:02:56.230] – Rene
Correct. So the whole concept of automation infrastructure is really coming from customers like Frank and others whom we have met over the last couple of years. If you look at automation or orchestration, most people just implement the tool, whatever that is. Could be Ansible, could be Gluware, you name them. And then typically you just automate your infrastructure: your routers, your switches, whatever you might have in your environment. But you don’t really think about the infrastructure which needs to sit in between to get the automation running. There are not many customers who think about what happens if their automation goes wrong, let’s put it that way. And that is what the automation infrastructure is all about. It’s about the tools which you need to have in place to ensure your automation works, can reach your end devices, and that you can recover.
[00:03:45.120] – Ethan
So if I’ve got a server that is the platform I tend to run my Terraform from, you’re saying that’s what people aren’t putting enough thought into?
[00:03:53.210] – Rene
It’s more between that server and your final infrastructure. If you just look at cloud, then the cloud providers would typically provide that infrastructure. If you look at more of a hybrid cloud environment, where you have infrastructure yourself which you maybe want to end up like a cloud environment, then that environment sits somewhere in a data center, in an edge environment, in a closet, wherever that might be, right? So that is exactly where you need all those different tools as well.
[00:04:26.340] – Ethan
Got it. Okay, so you’re stepping in there to fill that void, to make sure I’ve got a more sure way to reach my gear, and in fact giving me a place to run my tools. Would I run Terraform or Ansible on my ZPE box?
[00:04:46.330] – Rene
Correct. That is exactly what we are doing. So let’s be honest, that automation infrastructure concept isn’t really new. People are doing it already today, right? But what most people are doing is they have a small little NUC or some form of a jump box somewhere. They might send out smart hands if something goes wrong. They might have a small Linux box with their file storage, with tcpdump, with a local automation tool, whatever that might be, already running on each site. Then maybe an LTE connection somewhere; you might have some other form of WAN connectivity like MPLS or anything like it. Most of that is already there, but in most cases it’s cobbled together from whatever you have lying around, or you buy it from scratch.
[00:05:37.730] – Ethan
Got it. Because I got excited about the automation infrastructure, maybe I missed some of the most key elements of what your platform actually does, which is the connectivity. And as I was digging around trying to understand the ways in which your box can provide me with out of band connectivity. There’s a lot there. There’s, of course, wired and wireless and so on. But I’ll let you tell us that, Rene.
[00:05:58.220] – Rene
If we look at that automation infrastructure, we are looking at three main core building blocks. One is that out of band piece, which is not just console, but IP connectivity through SSH, Telnet, RDP, VNC, web UI, APIs, IPMI. So that entire management network layer, essentially, going down. And there we can provide full connectivity. We have console ports, we have normal Ethernet ports, SFP interfaces, whatever you might have in your infrastructure. On the WAN side, then, we offer a full range of connectivity as well, starting from MPLS, over normal Ethernet and fiber connectivity, to LTE connectivity with multiple LTE modems if you want to fail over between modems, 5G, Wi-Fi, you name it. Essentially, we have the connectivity built into our boxes.
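The failover idea Rene describes, several uplinks with a fallback from one to the next, can be sketched in a few lines. This is only an illustration under stated assumptions: the link names and the fixed priority order are hypothetical, and nothing here reflects how ZPE's software actually selects an uplink.

```python
from __future__ import annotations

# Hypothetical uplink names, ordered from most to least preferred.
LINK_PRIORITY = ["fiber", "mpls", "lte-modem-1", "lte-modem-2"]


def pick_uplink(link_up: dict[str, bool],
                priority: list[str] = LINK_PRIORITY) -> str | None:
    """Return the most preferred uplink that is currently up, or None."""
    for name in priority:
        # A link missing from the status map is treated as down.
        if link_up.get(name, False):
            return name
    return None


if __name__ == "__main__":
    # Both wired paths are down, so the first LTE modem is chosen.
    status = {"fiber": False, "mpls": False,
              "lte-modem-1": True, "lte-modem-2": True}
    print(pick_uplink(status))
```

A static priority list is the simplest possible policy; a real deployment would also need health checks to decide when a link counts as up.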
[00:06:55.270] – Ethan
So I’ve got multiple ways that I can connect into the box into the ZPE platform. Meaning that if my main network goes down because some kind of a change went bad, I’ve got one or more backdoors that get me onto the ZPE box so that I can fix what’s broken.
[00:07:12.490] – Rene
Exactly. Essentially, what we can provide to our customers is really a dedicated out of band WAN. My boss actually came up with the term OOBI-WAN, for out of band infrastructure WAN connectivity.
[00:07:27.630] – Ethan
Sure. Because if it’s all 5G or LTE connected, I’ve got this out of band wide area network of ZPE boxes that I can use to manage the entirety of my infrastructure. So, in theory, things can go really bad on my production network, because of reasons, and I’ve still got full connectivity into the ZPE out of band SD-WAN, if you want to use that term.
[00:07:48.810] – Rene
That is exactly what we’re providing.
[00:07:51.230] – Ethan
Now, you said just one little detail I wanted to pick out. You mentioned, oh, it could be MPLS. Does that mean your box actually speaks MPLS, or that I could just wire it into, like, an MPLS network?
[00:08:01.090] – Rene
No, we actually talk MPLS as well.
[00:08:04.810] – Ethan
[00:08:06.650] – Rene
We even have a full routing stack in our boxes. So the best one, and I think, Frank, you came up with that analogy: think of us more like a server dedicated to management and operations than an appliance which just does out of band. So a full-blown server, essentially, for our customers, with all its capabilities.
[00:08:27.150] – Ethan
Got it. Okay. So this is a big fancy box, then, it feels like. I also picked up SFPs, so that I can pop in whatever fiber connectivity I require. Is there some speed limitation there? Well, I’m not going to be putting a 400 gig optic into my ZPE box, but I could go one or ten gig, I assume.
[00:08:50.950] – Rene
Yes. So we fully support SFP+ interfaces, up to ten gig. That is what we see in most of our customer environments. We have a couple, like Frank, who are asking for higher speeds, so we are currently evaluating that as well. But for WAN connectivity you insert a long-distance SFP, and then we can go long distance. Or for your normal LAN connectivity, you just put your one or ten gig interface in.
[00:09:18.290] – Ethan
All right, so there’s one thing that some people are going to be a little bit hung up on, and it’s this. If I’m used to running my tools on my laptop, and all I really care about is just being able to connect over the out of band WAN, let’s say, to my ZPE box, what does running the tools on the ZPE box actually give me?
[00:09:36.070] – Rene
The main point then is really around automation. If you look at automation, what you are trying to do is to automate your day to day jobs, right? That means you might need to have a file server. You might need to have some other small little tools. Let’s say you want to do a firmware update on a router, or you need to recover a router. You might need to have a local TFTP server in that environment. Those are all tools which we can run directly, locally, on the box. You can’t really run a TFTP server locally on your laptop and then go down through a VPN, right? Those are the ones where we are starting off, and then customers really take it wherever they want to. The big benefit is that you have everything in one single place. So from a corporate perspective, you have more control over it than just running it off your laptop.
[00:10:29.350] – Ethan
You really cinched it for me as soon as you made the point. There are certain things you can’t do on your laptop, or that would be hard to do from your laptop, just because of distance, latency, the practicality of it, maybe the hodgepodge of spaghetti tunnels that you might be using to get from your laptop into the infrastructure, talk to the ZPE box, and then be able to manage it. Once you’ve got that string, wire and spaghetti up, maybe it’s okay, maybe it’s a bit fragile. Maybe the only thing you’ve got going is an SSH connection, and there are a lot of hoops you’d have to jump through to make all of your tooling work. But if all I need to do is connect to the ZPE box, and all my tools are there and I can run them locally, I have a performance benefit and a reliability benefit that’s immediately obvious at that point. And so then there might just be an operational mindset change for people who are used to doing things off their laptop or in some other way, to get the full benefit of going with the ZPE solution. Not that I think that would be that hard, because it’s an easy sell, but still.
[00:11:36.050] – Rene
Yeah, I’m not even sure if it’s that much of a change. If you think about jump boxes, they’ve been doing exactly the same thing for the last ten years, right? What we are providing is the same capabilities as a jump box. We even have a couple of customers who deploy extra jump boxes as VMs on top of our appliances.
[00:11:53.330] – Ethan
[00:11:53.860] – Rene
Just because they feel more comfortable using a Windows jump box than a Linux system.
[00:11:59.000] – Ethan
So then my ZPE would actually be an interesting win for zero touch provisioning too. I could stand those boxes up, have them point to the ZPE server, and then off they go and begin the provisioning process.
[00:12:12.440] – Rene
Yeah, that is actually one of the core features which most of our larger-environment customers are using us for. They’re using us as the first device in a remote location, that initial seed, which essentially then bootstraps the rest of the environment. So they typically just buy a unit from us, ship it on site, and provision it remotely, and then from there, all the rest of the equipment and infrastructure gets bootstrapped from our device.
[00:12:47.450] – Ethan
Got it. Okay, it’s all coming together for me now here. I think a good way to cement this part of the conversation, Rene, would be to walk us through what a typical ZPE deployment in the real world looks like. That’ll bring all the components together and I think bring together this part of the conversation we’ve had so far.
[00:13:05.870] – Rene
So, typical deployments can start really small, because I don’t want to downplay our out of band offering. We have traditional out of band customers who just buy our console switches or deploy our management software somewhere in their data center. They might start off with one or two devices and just do out of band, traditional break/fix. And then from there you can grow to what customers like Frank and others are doing, where they’re really using all the capabilities on the same platform. The important thing here is you don’t need to buy any other hardware. It’s all built into our OS. So whatever hardware or appliance you buy, they all have the same capabilities. It’s up to the customer to use them or not to use them. So a typical deployment for large environments, then, looks like this: you buy one or two appliances for each site, depending on how much redundancy you want. You build your OOBI-WAN connectivity. We have our own solution with a SaaS-based ZPE Cloud offering, but customers can just deploy their own OOBI-WAN. And then from there you connect up your end devices, you manage them, you deploy the tools you need, and you’re ready to go.
[00:14:18.950] – Ethan
Now, the ZPE platform I buy, you mentioned it comes with all the capabilities, whether you use them all or not. So it sounds like there are no license levels or anything I need to be concerned about. Is that right?
[00:14:28.740] – Rene
On the licensing side, we are relatively lightweight, I would say. There’s one for a large number of managed devices. That is more if you go into a virtual environment and you want to manage thousands of IPMI devices or anything like it; for most customers, that doesn’t really apply. And then for more advanced features like the virtualization, we have one per-box license, but that’s it.
[00:14:54.800] – Ethan
Okay, are there different models then, where I pick and choose which ZPE box I want based on number of connections I need or what radios I might want in it, that sort of thing?
[00:15:04.530] – Rene
Yes. We start off from really small, around about the size of your mobile phone, up to 1U size, modular, which is the Net SR. That is actually the one which Vapor is using. And you typically decide based on what connectivity you need. Most of the devices are upgradable in terms of CPU, memory, and the number of LTE modems you want to have in them. But the physical number of serial ports or network interfaces, that typically defines which appliance you go with.
[00:15:36.970] – Ethan
That makes sense. Now, you mentioned as small as a mobile phone. What is that device? I’m curious.
[00:15:44.570] – Rene
So we call it the Mini SR, and it has a full stack: it has LTE, has Wi-Fi, has network interfaces. You just plug it in and you’re ready to go.
[00:15:53.780] – Ethan
Okay, hold that over to the camera. Now, you guys listening, can’t see this thing, but Rene is actually showing me. And yeah, it’s roughly the size of a mobile phone. It would fit in the palm of your hand. It’s got antennas bristling out of either side, it’s got multiple interfaces on this thing and it’s got a big heat sink on top. So, yeah, you can position that just about anywhere.
[00:16:13.690] – Frank
And Rene, you need to tell everyone that it’s hardened. Right? So that environmentally, that thing, you can put it in a cabinet out in the middle of the desert and it’s good to go.
[00:16:22.640] – Rene
Yeah. So security for us is quite important. All of our devices provide security from the hardware level up to the OS, and most of them are hardened in terms of temperature range and everything else as well. The Mini SR you can run off a battery. I use that for shows: I just plug it into the battery and it runs for two days straight.
[00:16:45.820] – Ethan
Now, the big ones that I can fit in a rack, you said, are modular. So what does that mean? Is expansion the point of the modules, or is it because I want this sort of a radio and not that sort of a radio, and so I pick that module?
[00:17:03.350] – Rene
No, the Net SR is fully modular. It’s a 1U box, has five slots, and you pick and choose. The chassis itself comes with two SFP+ interfaces, two Ethernet interfaces, a couple of USB ports, and then the CPU, memory, and everything else in it. So it’s ready to go. We have a couple of customers who use it as a management appliance because they don’t want to stand up a VM; they just use the bare box as a management appliance. But the real power comes from the wide range of different modules we have, starting from serial, over USB, standard Ethernet, SFP, one gig, ten gig, storage, LTE, 5G. There’s probably more. And compute. So literally, you just pick and choose from the range of cards, you slot them in, and you’re ready to go.
[00:17:58.990] – Ethan
Compute? Why would I care that much about Compute one way or the other? Wouldn’t it just be CPU? Why do I care? You guys are going to put a big enough CPU in, I would imagine so. Why would I be making choices there?
[00:18:14.110] – Rene
So the choices then really come from this: where most customers start off is they typically come to us with an out of band question, and that’s where we start. And then over time, and Frank is still mad at me that we didn’t mention a couple of the features in the beginning, they’re starting to develop new features or new use cases for the appliances in their own environment. And then they’re starting to say, hang on a second, I can replace that box, and I can replace that box. In an environment where space is really a challenge, you might want to have just a white box server. We can put your own OS on it and run whatever you want on it. And that is the reason why we have the compute card.
[00:18:58.280] – Ethan
If I’m thinking outside the box here, what else can I do with this 1U of rack space? If I put more compute into this thing, I can run anything I want on it that maybe is management related, but doesn’t even have to be, necessarily.
[00:19:13.010] – Frank
[00:19:13.540] – Ethan
So I can load more compute power into there for those purposes. Can I run containers on it? Yes, of course I can. Okay.
[00:19:26.790] – Rene
So you can run containers and full blown VMs. You pick and choose.
[00:19:31.270] – Ethan
Okay, containers or full-blown VMs. All right. So, Frank, I want to bring you into the conversation here. We’ve talked a lot about the platform. You are a consumer of ZPE. You guys are using it for Vapor IO at your Kinetic Grid edge compute. I don’t know if I got all your branding right. Frank, just tell us how you’re using the box.
[00:19:53.470] – Frank
That’s okay. Our CMO, Matthew Farrell, will probably flog me after this because I didn’t get it right either. That’s okay. Yeah, Vapor IO. Imagine we’re a regional data center, colocation, and network operator. We call it the Kinetic Grid, but that’s how it kind of translates. If you think about it, we take a large centralized data center and break it apart into multiple buildings across a metro landscape. For instance, we have seven locations in Chicago, and you tie it all together with fiber, which we manage, not own, but we light our own dark fiber between the facilities so we can get a predictable performance result for our clients. And all of those facilities act as one. So if you had four cabinets in a central data center and you picked them up and spread them across the landscape, they would think that they’re in the same room together, because the network is transparent to them, the experience is transparent to them. And we kind of contextualize or operationalize all of our locations and provide that via our Synse platform so customers can take actions. I know that was a big mouthful, but our platform basically exposes all of the network telemetry, all the way down to, hey, what are the optical power levels on this link?
[00:21:15.300] – Frank
If you want to know those things, or the temperature of the batteries that are charging in the UPS, we kind of open the kimono and share all that data with our clients so they can make an informed decision on whether they should have workloads running there, or whether they should be moving their workloads if the building is going into distress. Because these facilities range from smart poles to street furniture to micro modular data centers, to telecom shelters, to small 20 and 30 rack data centers. That’s kind of our operational size. So some of these facilities, like the street furniture, there’s no generation there, but you may be running a critical smart city function there that you may need to move if something goes bad. And to do that, our platform has APIs that you can constantly poll to gather that data and make that informed orchestration decision.
[00:22:13.010] – Ethan
So with all the Kinetic Grid sites that you’ve got, you mentioned that the network feels like one big network to the boxes that are being hosted by your customers there. What is the interconnection between the data centers?
[00:22:25.170] – Frank
We build and maintain and run our own network, so it’s predictable. Each link between each site is planned to be one millisecond or less, and we have to have distinct control and functions at every single point. We call it edge native. So each facility can run independently of the others, but we cluster them together, in a way. Clustering is a bad word, but they act as one through our platform. But those decisions, and this is a good segue for this discussion: you need to run all of those decisions somewhere, run that software somewhere, and collect all that data somewhere, and we need it to be at the site itself. So we actually run it on all the ZPE boxes, right? Rene said I was pissed at him. I was just like, what? Wait, I don’t have to have a bunch of 1U white box servers out there? I can run things on this box too?
[00:23:21.090] – Ethan
Well, yeah. So say you guys are going to bring a new Kinetic Grid edge compute site up. What are the issues you face? What does that deployment process look like?
[00:23:28.980] – Frank
Most of our systems and things that we deploy are pre-built in a factory. They’re micro modular data centers. They range anywhere from about two cabinets to ten cabinets. They’re not that large, but they do have high density or two cabinets, 20 compute available space, meaning, think of it as about the size of an F-150. It’s got two slots in it to put in their server cabinets. Or customers can bring their own cabinets. They can use one of our third parties, like cloud providers, that may be installed in there already. And that thing comes on a truck, and it has a pad, and the crane is sitting there waiting for it. And so when that unit shows up on a truck, it gets picked up by the crane, it gets set on the pad, it gets bolted down, we connect the power, swing in the fiber, and we turn the unit on. Now what happens when you turn it on? Well, it used to be I’d have network engineers and facilities engineers standing around, in a very expensive way, say in the middle of the Las Vegas desert, waiting to provision it in the middle of a dust storm.
[00:24:35.810] – Frank
It’s like, oh well, that’s no fun for anybody. So with the ZPE being our out of band box, as soon as we apply power, that comes online, and so the cellular link comes up, because in our configuration we choose cellular as one of the cards that can slot in. As soon as it comes online, since it’s pre-built and pre-configured at the factory, the engineers can instantly remote in and see the status of the systems.
[00:24:59.290] – Ethan
Okay, so there’s an important detail here. So you crane-drop this thing onto the pad, you plug in the fiber to the right spot, and the ZPE box has got enough there to do what, so that you can connect to it? It’s got a phone home capability that’s been pre-provisioned somehow?
[00:25:14.490] – Frank
Yeah, so I guess I’ll back up a little bit. The way we configure our ZPE is we have multiple cards within the Nodegrid system. So we have a serial console, we have Ethernet for out of band management, we have cellular, and we also have a storage module there, so we can collect data and place it on there. We also get the virtualization Docker license on it, so we can run containers there. So when the unit spins up, the most fundamental thing is it comes online, it phones home, it says, hey, I’m online. And if something happens, say the phone provider, not that a phone provider would ever mess something up, but say the IP address on the unit shifted since the last time it was powered on, then it phones home and tells us its new IP address, and we jump onto the console. And there’s our out of band experience. We now have serial consoles to all of the network equipment at the site. And we also have out of band Ethernet management to all those devices. So if one of them lost its config, we could fix that. Or we can just do a power-on test, and watch the power-on tests, get to the PDUs to cycle power and things like that, and just do a static test of the site as a first step when we bring it online.
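The phone-home behavior Frank describes, where a unit checks in on power-up and reports a changed IP address, can be sketched from the receiving side. This is a hypothetical illustration only: the `PhoneHomeRegistry` class, the device ID, and the status strings are assumptions for the example, not Vapor IO's or ZPE's actual software, and the addresses are documentation-range placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneHomeRegistry:
    """Remembers the last IP address each device reported when it phoned home."""
    last_seen_ip: dict = field(default_factory=dict)

    def check_in(self, device_id: str, ip: str) -> str:
        """Record a check-in and describe what changed since the previous one."""
        previous = self.last_seen_ip.get(device_id)
        self.last_seen_ip[device_id] = ip
        if previous is None:
            return "new"          # first time this device has phoned home
        if previous != ip:
            return "ip-changed"   # e.g. the carrier handed out a new address
        return "unchanged"


if __name__ == "__main__":
    registry = PhoneHomeRegistry()
    print(registry.check_in("site-a-oob", "198.51.100.10"))
    print(registry.check_in("site-a-oob", "198.51.100.77"))
```

Because the device initiates the connection, the central side never needs to know the cellular address in advance; it only has to record whatever address the latest check-in came from.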
[00:26:36.540] – Ethan
Now, with some boxes like this, like some SD-WAN boxes, the provider will work with the customer so that the SD-WAN node will ship from the factory pre-provisioned to come up and be adopted into their SD-WAN cloud. Is it like that with the ZPE box? Or is it that you get it at headquarters somewhere, someone builds the thing out, stages it, and then it gets put into the two or ten racks that are going to get dropped onto the pad?
[00:27:10.410] – Frank
That’s a great question. So we do, in a way. We don’t currently use ZPE’s cloud service. We kind of have our own automation. We’re thankful that we have more than 25 software engineers on our team, and so we have our own software shop. And so we were able to write part of our glue and automation so that when the box comes on, it kind of phones home and tells us what it’s doing. As for the pre-configuration, it used to be that we had a cellular modem with a card, and then we had an out of band router and an out of band switch and an out of band serial console. And then we had a compute node, and sometimes a laptop, because it needed USB connections to things at the site. We had all that giant stack, and it took a long time, meaning a day or two, to get all that configured and working right. And it took a different class of engineer to do that. And now with the ZPE, we have a lab in Chandler, Arizona, where we stage everything before it ships out to sites. Now it’s entry-level engineers and technicians that pull the unit out of the box, put it on the bench, turn it on, well, put the SIM card in it, turn it on, register the SIM, and that’s it.
[00:28:30.870] – Frank
The box is then online, available for the engineers to jump in from remote and provision the box. Because now we have a network pathway to the box before it even turns up an interface to the lab network to participate there. The cellular interface is up, and that gives us basically a console to this box, which is great.
[00:28:52.170] – Ethan
Yeah, okay, but you just mentioned something else that was interesting there. There is a ZPE cloud service. So Rene, you want to mention what that is? Is that where you’d be pre provisioning the box for the customer or what is that?
[00:29:07.470] – Rene
So ZPE Cloud is, for our customers, an out of band SaaS service that really fulfills two functions. One is zero touch provisioning. All of our appliances get pre-enrolled to our cloud. The services routers by default call home as soon as they turn up. So you can really take a unit, ship it anywhere in the world, and the unit will phone home. And then from there you can claim it, and you can then push a configuration stub, or you can fully configure it from the cloud. And after that you can actually use it as an out of band interface as well. You don’t need to stand up any VPNs or anything like it; it’s fully encrypted through TLS tunnels. So you just jump onto our cloud, you open up a web session or console session, and you’re directly on the end device.
[00:29:58.680] – Ethan
That does feel like some of the SD-WAN equipment deployment models, where you’ll ship it, it’ll come from the factory pre-set to talk to you as a tenant, and then you see it show up in your list when it phones home. And then you can adopt it, push policy to it, and begin making it a useful participant in the SD-WAN cloud. Not that we need to get too far off on that, but Frank, for you guys it was just easier to, as you said, kind of roll your own.
[00:30:28.750] – Frank
We did, because that service was new when we started using the appliances, and now we’re taking a second look at it. But we had asked ZPE to go in a slightly different direction for us. And this is what’s cool about ZPE: they’re flexible and they take feature requests seriously. We use Salt for network automation on the back end, and we asked for two things. One, we wanted to run Salt proxy as a microservice, as a Docker container on the box, so we could reach the local devices to gather and execute and do things like that. But we also wanted to manage the ZPE box itself through our network automation. And so we were talking to Rene and we said, hey, it’d be great to manage it via Salt natively. And so they came out with a Salt execution module for the Nodegrid, and we gathered it up and got the first version. We were the first guinea pigs for it. But the great part was, and the one thing I want to say about ZPE is, if they say it’s going to do something, it does that. They’re a very solid engineering group. And so the first version did everything they said it was going to do, and we really didn’t have a problem with it.
[00:31:47.280] – Frank
And now we can run those configurations. You could use the cloud, or if you have an automation framework that you’re already using, or if you’re using Salt, you could configure the Nodegrid boxes with your native automation platform as well, which is what we do now.
[00:32:03.840] – Ethan
If you’re listening to this and you’re going, wait a minute, so Frank’s applauding ZPE engineering because the box does, or the feature does, what they say it’s going to do? I’ve been in this industry for a long time. That is not to be taken for granted. They can say it does something and then there’s all kinds of asterisks and caveats and limitations and, well, we’ll get to that other part in the next version, and it’s months and months of lead time before the thing actually does what they say. So Frank is actually giving high praise here for ZPE engineering.
[00:32:33.230] – Frank
And for those out there who are listening that know me, they know that is exactly that. I’m usually the person who beats up vendors the hardest, as most operations folks do. Vendors are constantly disappointing. Our partnership with ZPE has been really excellent from day one. And when we were faced with this decision, we had this huge other stack. Just to touch on this: we were looking at it, and we had to replace it. It was just way too costly. It had to come online without a technician or an engineer in the field, just a technician in the field. And we were having all kinds of problems with our existing solution and scaling it. Right. Because you want these things to be kind of lights up, power up, and operate lights out. And so we’re looking at other vendors and they were trying, they’re trying their best, and they were writing custom code and their professional services folks were scrambling to try to make it work for us. And then one of our engineers, well, actually our Director of Network Engineering, he said, hey, there’s this company I worked with before, we should give them a try.
[00:33:48.810] – Frank
And I said, okay, great. And as soon as we turned the box on, we’re like, oh yeah, we’re starting to check all the boxes in our list. And the easy button came out a couple of times and we went, okay. And we went with ZPE. And we actually deployed. We weren’t planning on it, but for consistency and reproducibility of operations, we went system-wide. So we replaced our out of band over a six-month period across our entire data center footprint nationwide. And we’ve even deployed them internationally as well. And it’s just our go-to. We use it in everything. And the smaller units we’re going to start using in smart poles we’re putting out there around different metros.
[00:34:30.710] – Ethan
Frank, you mentioned the complexity of the old management stack that you had, which I read the white paper that you folks published on this, that was five units, if I remember, each unit being some physical piece of hardware that did something specific within your stack, is that right?
[00:34:48.240] – Frank
Yeah, it was a modem, an out-of-band router, a switch, a serial console, and then a compute node. So we’re talking five, almost six RU of management net. Just to give an example, we make these VEM units, our Vapor Edge Modules, and our VEM 20, the smallest unit, 20 kW, is two slots. We have twelve RU of rack space for our own use. That’s it. There’s an end cap where the network racks fold down. You can see a picture of it in the white paper, but going from using a whole rack on the end to using one RU to do the same thing and more, it was a huge win for us. Because on those, especially when you’re doing smaller spaces, or even if you’re talking about that corporate IT office, remote office or something like that, and you’re an enterprise, you’re heat constrained, you’re power constrained, and also you’ve got to put a small box somewhere. You can’t afford to put in all that big stuff and send a cable diagram out for someone who doesn’t understand it to plug it in. Or in this case, it may be that we’re deploying and the guys who are deploying it are electricians and riggers from crane companies, a different class of tech in the field.
[00:36:20.440] – Frank
And they’re not network engineers. They don’t understand what even optics are. They’re like, oh, where do I plug the cord in? They’re just a different class of technician in the field. And as you grow and expand, you can’t send a network engineer to every location. Just no company can afford to do that. So it just made it a lot easier to deploy. Having one box that has a huge amount of functionality to it just made a lot of sense for us.
[00:36:46.840] – Ethan
My application was different, but I used to bring remote offices online for an enterprise, and I’d have to color code everything because I was shipping them like a WAN optimization box and some kind of a firewall and an edge router and whatever else. And that stack had to go together in a particular way, and colored tape got me pretty far. Sure, the two pink ones go together, that kind of thing.
[00:37:08.670] – Frank
In a diagram, we label both ends of every cable and the device, and still they get plugged in the wrong place.
[00:37:17.450] – Ethan
Yeah, for sure.
[00:37:18.840] – Frank
Can’t blame the folks who are trying their best out there.
[00:37:22.510] – Ethan
One other weird point I wanted to bring up, Frank, as I’ve run into this myself, is you guys have some SCADA requirements out there, right?
[00:37:28.900] – Frank
Oh, yeah. So that’s a cornerstone of our platform, right? So we actually operate a full SCADA system locally in each unit. So temperature, humidity, airflow, differential pressures, door locks, all the different points of data: fan operation, compressor temperatures, all that kind of stuff that you imagine would run from an HVAC cooling system and everything around a facility. And we need to collect all that data in real time. And then it needs to be running not only locally to make decisions locally, but then it also has to go from that kind of data puddle, it’s not even a pond, to the region, which is kind of the pond, and then the central cloud, which is a lake. But we needed a better way to collect that. So currently we have a system from a company, and they’ve end-of-lifed the product that we were leveraging. And so now we’re taking the new platform, not only our Sense platform, which does probably 90% of this, but the last bits will be running on the ZPE box at some of our sites for testing. And so they run as a container and we’re collecting everything locally on the container and those things need to come online immediately.
[00:38:55.440] – Frank
We need to know when you power on this box. One of the other things is the cooling plant should just come online, or if it doesn’t, you’re like, oh, something went wrong. Okay, it’s not even talking. Well, we have a serial connection to it. Like, what is the PLC saying to us? It’s saying, oh, you need to press the reset button or reboot me because it didn’t come up. But we can do that through the out-of-band management, which is really very handy: to run that software locally and also have the physical connectivity in the same box. So we’re going to one place to do all of those things.
[00:39:30.470] – Ethan
Yeah, it’s not the scenario where the ZPE box is talking SCADA natively, but it sort of is, because you’ve got the container option or the compute option, so you can run whatever software you want. You can make the thing talk SCADA effectively and talk over to the other box that is providing you that information, and you’ve still got the one point to go to, the one out-of-band place to go to, and you can still control everything.
[00:39:58.350] – Frank
Yeah, and I have a couple of points there too. One is it’s not only out of band, but it’s also in band. Right. So the way we cable this up is all the management Ethernet interfaces and console ports are all plumbed into the ZPE unit and then that’s connected to our production network, and we do use the full routing stack so the ZPE talks to the rest of our network and devices. And also I’ll mention, for all of those out there, our management plane is 100% IPv6 native and the ZPE box supports everything we need in the IPv6 world. So when we’re starting up and we don’t have any fiber spans up yet or enabled, we can come in from out of band and we’re on the ZPE unit and we have a certain experience. Well, what’s great is as soon as we bring the fiber spans up and we now have in band available, we still connect to the ZPE and we have the same user experience. So whether we’re out of band or coming in over the wire in band, through the same box, we have the same look and feel.
[00:41:06.610] – Frank
We don’t have to get retrained. We don’t have two things that we’re working on. So that was a really big one too. It’s that one user experience, one thing to train the technicians with, and consistency, so it lowers our total administrative overhead. That was a big deal for us.
[00:41:23.450] – Ethan
There’s one other note that I picked up, I think from the white paper, which is you’re using your ZPE box in your CI/CD pipeline. It’s integrated in there somehow. How are you using that in your pipeline?
[00:41:37.710] – Frank
Yes. Since the ZPE box can run containers, we’re running our network automation there. Kind of think of it as agents and/or Salt proxies. So we can kick off and trigger network events that then execute locally, or collect data locally, or manipulate a configuration or whatever it may be, remotely onto the ZPE unit. So it runs edge native. Those decisions and those things, instead of coming from the central cloud and relying on everything in between, East Wenatchee all the way to Las Vegas or Atlanta or Dallas or Chicago or Pittsburgh, wherever locations are, Barcelona, wherever it is to go, relying on everything in between to work perfectly to make one decision, make one change, we connect directly into the ZPE unit and we run that orchestration automation locally. And so whatever it is, a customer says, hey, I want to build an EVPN-VXLAN connection between Barcelona and Las Vegas for some reason, and they want to do that. It runs locally on each ZPE, connects to the network infrastructure, the Juniper boxes or whatever it may be talking to. And then they all report back through that connectivity path, back to the main Vapor network automation service, and report back that everything’s good to go.
[00:43:09.790] – Frank
Or if something happened, something happened, there’s an exception. But those proxies and those services run natively there and we’re going system-wide with that type of automation. It’s super cool.
[00:43:25.410] – Ethan
That’s very cool. So Frank, as you’ve been describing all this, one point to make here is that there’s a lot I can do with this box. It feels like I can do almost anything with this box, Frank. As a network engineer, I can use its native functionality. I can augment it with the functionality I want if it doesn’t already have it. Is that a fair assessment?
[00:43:47.170] – Frank
I think it’s a really good way to describe it. It’s the old analogy: it’s a Swiss Army knife for a network engineer. So it is a multi-tool; it’s not one-size-fits-most, like a bad-fitting hat or something. You have this platform that is expandable. So we configure the physical hardware, we select the expansion cards we want to put in it that fit our environment. Whether you want RJ45 console or you want USB console, you could do all kinds of things with it. So you can select the hardware that fits your environment and then the software is extensible. So whether you use their native configuration tool or, as we do, use Salt and it comes in from the outside using our automation platform, we can do those things and then we can run services on top of it that augment the box, that leverage the box being physically connected to everything. And it is a platform. So this is one of those things that you can bend to your own ways, and, I don’t know about you, but every company I’ve been in does everything their own way. We all do the air quotes: we do operational best practices, industry best practices.
[00:45:03.470] – Frank
Well, those best practices are unique to every operation. So the good news is this box is that Swiss Army knife that allows us to do all these functions.
[00:45:13.640] – Ethan
I actually whine about everybody doing everything their own way because it’s the bane of automation and standardization across the industry, because we’re all kind of doing it ourselves.
[00:45:22.390] – Frank
It’s goodness.
[00:45:24.380] – Ethan
Yeah. Rene, back to you. This has been a great conversation about ZPE Systems. If you would, leave people with some takeaways, some highlights from the conversation that you think are important things that should stand out to everybody.
[00:45:39.370] – Rene
Yeah, I guess for us from ZPE Systems, we really want to offer our customers an automation infrastructure platform, and that means for some customers it’s just out of band, it’s just your traditional out-of-band console server in your data center or in your branch office or anything like it. But really what I want customers to take away is that it should really be your starting point across your automation infrastructure, wherever that takes you on your automation journey. Who knows where it leads us? Could be Salt, could be Ansible, it could be, who knows, something else in five years which I can’t really think about yet. So that is probably the biggest takeaway: start where you are feeling comfortable and then grow from there.
[00:46:26.330] – Frank
Very good.
[00:46:27.200] – Ethan
And if people want to know more about ZPE Systems, Rene, where would you send them?
[00:46:30.450] – Rene
So you can find more information about us, and thanks to you, on Packet Pushers you will find the white paper from Frank. On it, you will find some more information about the blueprint. The blueprint, which we wrote, is really open, so it’s not dedicated to ZPE Systems. Have a read through it, maybe you get some ideas from it, maybe you agree, maybe you disagree. And then from there just reach out to me and say you were wrong.
[00:47:03.350] – Ethan
So Rene, if people want to reach out to you and tell you that you’re wrong, are you on social media? How can people get a hold of you?
[00:47:09.530] – Rene
Probably the best way is via LinkedIn. I’m not great on social media; I’m on a couple, but not really that active.
[00:47:16.500] – Ethan
Okay. And Frank, over to you as far as social. Are you out there active anywhere?
[00:47:22.270] – Frank
Oh, yeah, I mean, Vapor has quite the platform. Myself, I’m on LinkedIn. I think I’m just Frank Basso on LinkedIn. So I’m always out there willing to connect with new folks, a member of lots of groups. I’m at the typical places, like if you’re a NANOG person, I’m usually there.
[00:47:42.370] – Ethan
Oh, I might have missed you in Atlanta. Did you make Atlanta?
[00:47:45.150] – Frank
I didn’t make that one. It was too much of a hike for me on this one, but I’ll be at the next for sure.
[00:47:51.290] – Ethan
Okay, great. Well, thanks to both of you for appearing on Day Two Cloud, and if you’re still listening, virtual high fives to you for tuning in. You are an awesome human. If you have suggestions for future shows, vendors you want us to try to get on as a sponsor so you can hear from them, let us know. Ned Bellavance, who is our normal co-host, couldn’t make the show today because of a schedule conflict. But he and I monitor @DayTwoCloudShow on Twitter. Or if you’re not a Twitter person, go up to daytwocloud.io and fill out the topic request form. Now maybe you’re a vendor and you’ve got a way cool cloud product you want to share with our audience of IT professionals. You too can become a Day Two Cloud sponsor. Just like ZPE Systems, you’re going to reach several thousand savvy IT professionals, all of whom have problems to solve, and maybe your product fixes their problem. We’ll never know unless you tell them about your amazing solution. Find out sponsorship. And until then, just remember, cloud is what happens while IT is making other plans.
