
Day Two Cloud 134: Simplifying Infrastructure Access With StrongDM (Sponsored)


Today’s Day Two Cloud is a sponsored episode with StrongDM. StrongDM focuses on infrastructure access; that is, helping engineers and IT professionals get access to databases, servers, Kubernetes clusters, switches, Web apps, and more from a desktop or laptop.

StrongDM takes a proxy approach to the challenge of access and authentication. It uses a local client that can run on a Mac, Windows, or Linux device; a gateway to mediate access; and an administration layer for setting policies and permissions and auditing access.

Our guest is Justin McCarthy, StrongDM’s co-founder and CTO.

We discuss:

  • The problems StrongDM aims to solve
  • How it differs from other access and authentication systems
  • Integrations with directories and identity stores
  • The ability to audit and play back events
  • Pros and cons of the proxy model
  • Use cases
  • More

Show Links:

StrongDM.com/packetpushers

StrongDM on YouTube

StrongDM Blog

@strongdm – StrongDM on Twitter

StrongDM on LinkedIn

Transcript:

[00:00:04.250] – Ethan
Welcome to Day Two Cloud. We've got a brand new sponsor here today: StrongDM. StrongDM is in the world of authentication, but not just authentication to anything, because there are lots of solutions like that. This fits a really interesting niche, doesn't it, Ned?

[00:00:19.200] – Ned
It really does. They are really focused on empowering people to access their infrastructure. So engineers, IT professionals, folks like us that need access to their Kubernetes cluster or their database or SSH access into a machine, that’s what they’re focused on. And they want to make it convenient to grant that access and use that access on your desktop or laptop.

[00:00:41.790] – Ethan
But this is not just a simple little proxy, like a bastion host or jump box kind of thing. It is so far beyond that. You've got to listen to this show. Our guest is Justin McCarthy. He's the co-founder and CTO, the one who got all passionate and said, this is a problem I have to solve. And you're going to hear it in his voice as he gets right down in the weeds with us. Please enjoy this conversation with our sponsor StrongDM and Justin McCarthy. Justin, welcome to Day Two Cloud. Man, it's the first time you've been on the show, first time StrongDM has been a sponsor. We are delighted to have you. In a sentence, would you tell us who you are and what you do?

[00:01:20.100] – Justin
Sure. My name is Justin. I'm the co-founder and CTO here at StrongDM. So as you can imagine, what I do is think all about building a product that hopefully people get to use and love every day.

[00:01:32.000] – Ethan
Then tell us about the product. Give us the elevator pitch. Describe StrongDM as a company and what you guys are doing.

[00:01:39.580] – Justin
Okay, sure thing. So StrongDM the product is all about access. But really the audience that’s doing that access is always somehow a technical audience. So we are an access product for that technical audience, getting access primarily to infrastructure. So this is your data engineers, your software engineers, your DevOps folks that need access to that cluster, that server, that database every day. That’s exactly the audience. And that’s what we do. And that’s what our product has done for a whole bunch of years now.

[00:02:10.730] – Ethan
So that's applications, but also hardware, like a Cisco switch?

[00:02:16.080] – Justin
Really, whatever. If you go and you have a conversation with a member of your tech staff and you say, what kind of systems do you need access to to get your job done? It's those systems. So if the answer is a Cisco switch, then yes.

[00:02:29.140] – Ned
Okay, so this is not a solution to get people access to Office 365. This is to get into the switch, to get into the cluster, to get into the infrastructure that runs applications.

[00:02:39.000] – Justin
Yeah, and I'll say, if all you need to do your job on a day-to-day basis is maybe access to your web-based CRM, maybe access to your email, your company is probably a StrongDM customer, but you as an individual might not be a licensed user.

[00:02:51.810] – Ned
So what made you go out and found this company and build this product? Because that’s not a small thing to go out and do. So what was the main drive behind that?

[00:03:01.540] – Justin
There are a lot of sort of classic responses to a question like that. And for me, it was definitely the classic scratching the itch. So I was that person throughout my career in startups that was often faced with a pretty challenging ask of, hey, can I get access to that cluster? Can I get access to that server? Can I get production access? I need to fix a bug, I need to inspect something, I need to do a release. And all of that's true; you have those reasons, you have those needs. But every time you say yes to one of those questions, you take on a little bit of risk. And every time you say no, you frustrate a person and a department and an initiative. And so being right at the nexus of that, saying yes and saying no, that was it for me. That was the reason why I could see that a product like this needed to exist.

[00:03:52.830] – Ned
So you mentioned risk. What do you think of as the risk behind granting someone access?

[00:04:00.030] – Justin
The security surface area, and the likely avenues for a data leak or an availability problem. We spend a lot of time as architects and as designers thinking about, for example, the surface area of the application, but we spend much less time thinking about the surface area of, essentially, your staff. Right? So the sum of all the technical staff in your company actually creates this quite large surface, where a consistent level of training and a consistent level of practices all need to contribute to a really safe operation of any production system, any production data. Okay, so that's the element of risk. Ned, you seem like a trustworthy person. I've seen you on the command prompt. You're very diligent. And yet it seems like if I don't give Ned access, maybe I reduce my chances of an unexpected error of some type.

[00:04:56.670] – Ethan
Which isn't to say, Justin, that we didn't have authentication for infrastructure management before. I mean, there are a lot of solutions. There's not a solution set that solves all of this. Is that how you differentiate StrongDM? Now I have one solution, as opposed to the twelve ways I was accessing infrastructure before.

[00:05:15.590] – Justin
Yeah, I think we've seen this a number of times as new generations of products come online. There was always a way to get into the data center. It just used to be called, well, at one point it was called a physical key. Right?

[00:05:29.170] – Justin
And then it was maybe a key card, and then it was maybe biometric. Right? So there's always been an answer to how do we grant access, how do we secure that access? What I'll say is that in the era we're in now, a couple of things have changed. They really changed the game, and in our view they actually require a product that specifically addresses the convenience of that access. If you don't have a convenient, consistent way to say yes and to say no to "can I have access to that," then you're inevitably going to get folks working around the system.

[00:06:08.390] – Justin
You're going to get, whether you want to call it shadow IT or the server instance that we don't know about in the adjacent cloud account, one way or another, you're going to get folks doing what they need to accomplish their task. Right? So actually, what we found is that by focusing on the convenience of that experience of requesting access, receiving access, and using access to the infrastructure, you end up getting a much clearer view of your total surface area of access grants that are in place, and a much clearer view of how that access is happening every day.

[00:06:52.270] – Ned
Okay. So would you term it as an access broker, where you’ve got the things that need to be accessed on one side, the people that need to access it on the other side? And then I’m assuming you hook into some sort of identity system or multiple identity systems to give people a way to authenticate and verify that they are who they say they are.

[00:07:10.390] – Justin
So for sure, you could use the word broker. And in fact, in the technical implementation of the product, it is a proxy. Okay.

[00:07:16.610] – Ned
Okay.

[00:07:17.430] – Justin
The other thing that's happening, just as you pointed out: absolutely, we are coordinating with what we think of as the upstream identity providers. Because you already have Active Directory that you're happy with, you already have Okta that you're happy with, you have an existing SSO in place that's a great working source of identity. And so we're going to continue to use that. What's happening within the proxy, though, is translating those identities, which are really easy to understand when you're thinking in Active Directory terms or something like that, into a specific credential or username on a Redis cluster, and how that's different on a legacy Sybase database, and how that's different on a Kubernetes cluster. Because those downstream resources you're trying to access share so little in common in their technical implementation, we serve as a bridge to make it feel like one thing. Okay? So in one gesture, you're granting access to Sybase, Redshift, Snowflake, and Kubernetes, which isn't really possible unless you have that unifying broker layer.

[00:08:19.810] – Ned
Interesting. I see where you're going with this. If I live in a pure Windows Active Directory world, it's great because all my machines are domain joined and they all trust that identity. So if I want to give someone access, I can just do it through AD, I'm good to go. But then you point out something like, oh, but you also want SSH into that Linux box. How do I wire those things together? Am I going to install LDAP authentication on every Linux box in my organization? Probably not.

[00:08:46.210] – Justin
Yeah. And that's all been possible since the beginning of time. You could stitch all of this together. That's always been possible. I think what's inevitable, once you have a unified way of adding things, is that you're going to add them, and then you get to see a unified list of exactly all those servers I need access to every day. And I know that I am receiving access, and that access is being logged in a very uniform way, even across a very diverse set of systems.

[00:09:18.850] – Ethan
Based on what you said, Justin, I think I know the answer to this one, but I want to ask it anyway. How does StrongDM relate to a CASB solution? Would it be complementary or competitive?

[00:09:27.320] – Justin
Sure. I would generally say complementary, unless you are really underutilizing your CASB. So if you're just using it for one or two use cases, then surely you can find a way to run those one or two use cases through StrongDM as well. But generally complementary. A lot of CASBs seem to be focused substantially on full-fledged SaaS applications, right? And many times those are web applications. We are absolutely a web proxy, among other protocol types. But when you look inside the use cases our customers come to us with day in, day out, it's much closer to those cases where you need direct access to some server or cluster or database.

[00:10:05.370] – Ned
Okay, that makes a lot of sense. Now, one thing that occurs to me is because you’re kind of sitting in the middle of all these interactions, you’re also keeping a log of all these interactions. And I got to imagine if someone wants to audit access and what’s going on, you might be a good point for that.

[00:10:22.910] – Justin
That was also, of course, a key part of the product. Once the architecture was clear, that we needed to understand these protocols essentially at the wire level in order to deliver that convenience of being able to connect and authenticate consistently, then, once we're in that position on the network, once we're in that position in terms of understanding the protocol, recording a log of what's happening became quite possible, and in fact necessary to give you the confidence that you're saying yes with a sense of safety. And I'll say, the recording of activities is something we actually dogfood regularly internally. So there will be, for example, session recordings in SSH, where we might show one member of the team how to do something from a recording made by another member of the team. Even just traceability, when did that release precisely hit the wire, being able to go back and correlate something in case you had some sort of snag with your build system, having that second layer of evidence that says here's exactly when this step or this change took place, it's just really available at your fingertips in the observability side of our product.

[00:11:33.980] – Ethan
Oh, yeah. We're going to have to get into the architecture more and talk about the proxy functionality, because you just highlighted something there that wasn't actually obvious. When you say, oh, StrongDM sits in the middle and we can log, my gut reaction was: right, logging authentication. So-and-so just authenticated to whatever the resource is, done, that's the log entry. Oh, no, we're talking about logging the full activity stream, who did what and when. So full accounting of the session that's going on is what you're talking about, Justin, right?

[00:12:02.060] – Justin
Yeah, that's exactly right. So Ethan SSH'd into server Foo, then ran this GET on the Redis cluster, then ran this query over in Snowflake, and then used Windows Remote Desktop Protocol to jump in and run SQL Server Management Studio on that server over there. So all of those events are semantically enriched with all the times, all the to and from. But then also, for those session-oriented protocols like RDP, you're watching a pixel-perfect playback of that session.

[00:12:33.520] – Ned
Oh, really? Wow.

[00:12:36.120] – Justin
Yeah, of course, 2022. What else did you expect?

[00:12:40.510] – Ned
Not that. When I think about being able to capture all that, you're generally talking about something that's using TLS to connect between the client and whatever that destination is. So somehow you're sitting in the middle, decrypting that traffic and logging it without triggering any alarms on the endpoint the person is connecting to. That sounds like a complicated feat. How did you do that?

[00:13:06.240] – Justin
We're not violating any laws of physics here. We are triggering alarms. So you do have to have an approach for trusting the CAs that are involved. Right? You do have to have an approach for distributing those out to the workstations, so that the workstations are going to accept the TLS certificates that are involved. But yeah, we are a man in the middle. Just like many of the other man-in-the-middle technologies that are maybe focused on web traffic, we're just focused on infrastructure traffic. But yeah, the connection is absolutely TLS from the point of origin, from your Tableau client into the StrongDM proxy, and then out the other end.

[00:13:39.310] – Ethan
Okay, so considering how deeply you can touch the data stream going between the person managing and whatever's being managed, it sounds like that might help me with regulatory stuff like PCI, HIPAA, SOX, all depending on the use case. Is that fair to say?

[00:13:53.080] – Justin
Yeah, I would say there's a substantial bias in our customer base toward folks that, by one regime or another, have compliance obligations. In other words, the more important your data is to protect, and the more obligated you are to display evidence of that, the more valuable a product like ours becomes. So really, just pick any of the acronyms, and we've got customers meeting those obligations through our product.

[00:14:22.730] – Ned
Is that something you’re natively flagging in the logs, or is it more they’re then taking the logs, exporting them to some other analysis tool to find those compliance violations?

[00:14:32.760] – Justin
It's somewhat both. I would say it's very common, and actually our first duty is to emit logs that are semantically enriched, that capture and explain, instead of just describing a byte stream. They say: this is Ned interacting with a MongoDB, running this type of query. Right?

[00:14:50.540] – Justin
So we're turning something that would otherwise just be "this IP address to that other IP address, and this many bytes flowed through" into something you can actually have a conversation about with an auditor. Okay? A lot of the phrasing of conducting an audit, though, is in terms of request and response. It's "show me what Ned did on this day." Right? It's less proactive in terms of catching violations of some policy you might write. That said, that stuff typically happens more in the security team, outside the context of a compliance audit: responding in real time to the fact that our proxy is emitting this event that says Ned is doing this. If you know that, for whatever reason, Ned should have access to this machine but shouldn't be doing that, that's a case where many of our customers do react in real time, and then through our APIs will pause or shut down that session.

[00:15:45.530] – Ned
Okay, so, yeah, they have some other automation that is watching that and flags an event to shut down that session. Kill Ned’s connection. He should not be querying all the Social Security numbers of everyone at the company. That just seems wrong.

[00:15:58.090] – Justin
Yeah, you can go into a penalty box for a while. You can discuss it and get your access reapproved. But, yeah, that’s exactly the kind of scenario that our customers rely on day to day.
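To make the contrast concrete, the difference between a raw flow record and the kind of semantically enriched event Justin describes might look something like the sketch below. All field names and values here are invented for illustration; this is not StrongDM's actual log schema.

```python
# A raw flow record: IPs and byte counts, hard to discuss with an auditor.
raw_flow = {
    "src_ip": "10.0.1.15",
    "dst_ip": "10.0.2.40",
    "bytes": 4096,
}

# A semantically enriched event: who, what resource, which operation, when.
enriched_event = {
    "user": "ethan",                  # resolved from the upstream IdP
    "resource": "redis-prod",         # logical resource name, not an IP
    "protocol": "redis",
    "command": "GET session:1234",    # the actual operation performed
    "timestamp": "2022-03-01T14:22:05Z",
}

def describe(event):
    """Render an enriched event as an auditor-friendly sentence."""
    return (f"{event['user']} ran '{event['command']}' "
            f"on {event['resource']} at {event['timestamp']}")

print(describe(enriched_event))
```

An automation layer watching a stream of events like this is what makes the "kill Ned's session" reaction possible: the event already names the user and the operation, so a policy check is a simple predicate.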

[00:16:10.750] – Ned
You’ve mentioned a bunch of different possible technologies to connect to, Redis, Mongo, Kubernetes, Windows RDP. Is there something that StrongDM won’t talk to, something that I might need to authenticate to that it just isn’t compatible with?

[00:16:25.800] – Justin
We haven't found one yet. Really, anything that's on a network that you can remotely access. We have a lot of primitives in our system for deeply discovering and understanding what wire protocol is in use. Okay? And we've done that for so many types of systems at this point that it's fairly second nature to add another. So we're adding new protocols every month.

[00:16:51.060] – Ethan
Whether or not it's baked in today, it's pretty easy to add new ones if you've got a customer that says, I've got a weird one. Come on, help us out, StrongDM.

[00:16:57.870] – Justin
We love weird ones, because at this point it's just a new challenge. And actually there are very few that we encounter. So I would love to hear about a weird one.

[00:17:05.850] – Ned
Just thinking about OT stuff, like SCADA networks, and the fact that those are not designed with any kind of security or access control in mind. I guess you can talk to those.

[00:17:17.150] – Justin
Yeah, we want to hear about all the weird ones. So anyone that’s got an idea for a wire protocol that we may not have seen yet, please let me know.

[00:17:27.290] – Ethan
Well, we've got to talk about the architecture now, Justin. I was doing some reading getting ready for the show. There's a client on the one side, there's a proxy, and I don't know what all else there is. Walk us through the main components of the StrongDM architecture.

[00:17:41.640] – Justin
Sure. So your first intuition about the architecture might be agents, because there is an architecture that solves for some of this stuff, for generalizing access, that relies heavily on installing agents on the target systems: an agent on your Linux box, an agent on the machine that's running your database. And I just want to say, that's not our architecture. Instead, from the point of view of the target systems, all of the traffic originates from the proxy. So if you ran a who, or you were looking at the connections into a MySQL database, what you would notice is that the IP address listed is the IP address of whatever proxy is connecting to that resource. Okay? So there's nothing installed on the MySQL box. There's nothing installed in the Kubernetes cluster or on the Linux or Windows host.

[00:18:26.720] – Ethan
Thank you. Thank you very much for that.

[00:18:28.870] – Justin
Yes. And I will say that means trying out the product is as simple as running a single instance of the proxy in your VPC or in your environment. On the user side, because this is a people-first, user-oriented product, it is important, not just for the technical ingress but also for the user experience, that we have a full-fledged graphical client.

[00:18:59.210] – Ethan
As opposed to like a web browser and jump box.

[00:19:01.930] – Justin
As opposed to a web browser, yeah. And by acknowledging the reality that very few of us are tweaking our Cassandra cluster configuration from our phone. So as much as I'd love to design mobile-first, a lot of these use cases are just full desktop use cases. Okay? Because we've embraced that reality, we get all the things that come along with being a full desktop client. We get to push notifications to the desktop in the native desktop way. We get to rely on OS and workstation locking. We get to rely on OS and workstation secret storage. So there are a lot of really cool things we get just by embracing the reality of being a client.

[00:19:45.470] – Justin
It also provides the ingress that we need, so that the Tableau client, as far as it's concerned, is just talking to a database, apparently on the loopback interface. So it's apparently just running on localhost.

[00:19:57.470] – Ethan
So the client, I fire that up, and it talks to some kind of centralized StrongDM brain?

[00:20:05.590] – Justin
Yeah, exactly. So there is a control plane that coordinates the network of proxies. Okay? And it coordinates all the clients that are running out there. And that control plane is super important for, for example, the distribution and maintenance of trust. Right? So if you want to run your session through the mesh of proxies, then who you are, the authenticated session that you're beginning, all of the key material involved in that needs to be distributed, so that by the time you dial and hit the proxy, it knows what certificates you're going to be dialing with. And it knows not just in general, but in the specific: this is exactly Ethan, right now, from this workstation. And so all of that is coordinated through the control plane.

[00:20:48.910] – Ethan
The client is then talking to the proxy via what? Am I tunneling? Is it a TLS session, and all the management sessions underneath are tunneled through that?

[00:21:00.250] – Justin
Yeah. So if you look at it in Wireshark, you will see a single TLS-armored session. Okay? So you'll see one TCP connection. You'll look on one side of your screen and notice that you have a ton of queries going, and some Kubernetes commands running, and you're running some Ansible, and all of that is going through. But on the Wireshark side, you'll just see a single TCP connection. So all of those individual sessions and TCP connections are being multiplexed inside of one sort of thick link over to the proxy.
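The multiplexing Justin describes, many logical sessions riding one TCP/TLS connection, can be pictured as length-prefixed frames tagged with a stream ID. This is a toy illustration of the general technique, not StrongDM's actual wire format:

```python
import struct

def write_frame(buf, stream_id, payload):
    """Append one frame: 4-byte stream ID, 2-byte length, then payload."""
    buf.extend(struct.pack(">IH", stream_id, len(payload)))
    buf.extend(payload)

def read_frames(buf):
    """Demultiplex a byte stream back into (stream_id, payload) pairs."""
    offset, frames = 0, []
    while offset < len(buf):
        stream_id, length = struct.unpack_from(">IH", buf, offset)
        offset += 6
        frames.append((stream_id, bytes(buf[offset:offset + length])))
        offset += length
    return frames

# Two sessions (a SQL session and a kubectl session) interleaved on one
# "connection" -- the single byte stream a packet capture would show.
wire = bytearray()
write_frame(wire, 1, b"SELECT 1")
write_frame(wire, 2, b"kubectl get pods")
write_frame(wire, 1, b"SELECT 2")
```

In a real implementation the framed stream would itself be wrapped in TLS, which is why Wireshark sees only one opaque TCP connection.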

[00:21:30.110] – Ethan
Now, you did say network of proxies, and my heart fluttered a little bit. What did you mean by that?

[00:21:36.110] – Justin
Okay. Just like our product has to be simple to use for end users, it has to be simple to deploy. And part of that is being able to accomplish really any network topology you can think of, with as few moving parts as possible. So what that necessitates in our product is that our proxies form a mesh network.

[00:21:58.910] – Ned
What do you mean by "they form a mesh network"? Is it that they form a mesh of all the proxies in one data center, or are all of the proxies everywhere in some kind of shared mesh?

[00:22:09.710] – Justin
All of the proxies within your organization as a customer. Right. And so what you would do, really, this is simple, you can just do it on a piece of paper: you sketch out the geographic regions you're in, you sketch out the virtual networks you have, however you segregated and segmented those subnets. And then as long as you can draw a line, and they're bidirectional lines, because it's a mesh, as long as you can draw a line from the workstation to the target resource, then you're good. And if you can't draw a line, then you put a proxy in there. Okay.

[00:22:38.120] – Ned
So just to give an example, see if I have this clear. If I have a VPC that has no peering connections and I want to connect to a resource in there, I have to put a proxy in there, and then I need a proxy that I can get to that can bounce to that proxy, or I need to be able to reach that proxy directly.

[00:22:55.550] – Justin
Correct.

[00:22:55.930] – Ethan
You're saying you don't have to get to that proxy, you have to get to a proxy. I'm thinking of this as an infrastructure management overlay: as long as I can hit one of the boxes, I can then jump from it to wherever in the mesh, in the management overlay, and eventually I'm going to get sent out the other side to the resource I'm trying to manage.

[00:23:14.680] – Justin
Yeah, exactly. I think of when we had physical machines racked, and there was the management network and the management NIC, right? It is that, virtually, for sure.
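The paper exercise Justin walks through, draw lines between network segments and add a proxy wherever no line exists, is essentially graph reachability. A minimal sketch, with an invented topology, might look like this:

```python
from collections import deque

# Nodes are the workstation, proxies, and target resources; edges are the
# bidirectional "lines" you can draw between segments. The topology below
# is made up for illustration.
edges = {
    "workstation": {"gateway-proxy"},
    "gateway-proxy": {"workstation", "vpc-a-proxy"},
    "vpc-a-proxy": {"gateway-proxy", "mysql-prod"},
    "mysql-prod": {"vpc-a-proxy"},
    "isolated-db": set(),  # a VPC with no peering: unreachable until proxied
}

def reachable(start, target):
    """Breadth-first search: can you draw a path of lines from start to target?"""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Here `reachable("workstation", "isolated-db")` is false, which is exactly the "can't draw a line" case: the fix is to add a proxy node (and its edges) inside that VPC.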

[00:23:26.020] – Ned
Okay, so I think I have an idea of what the proxy architecture looks like. And I think we'll have to dive deeper into this in a future episode, because there's a lot riding on that. But I'm curious, what do these proxies look like? I would assume there's a virtual instance. Do you also sell a physical box that can be the proxy, or is it always deployed as some sort of virtual machine or something like that?

[00:23:47.520] – Justin
It is, 100% of the time, only available in a 48U titanium case.

[00:23:53.490] – Ned
Excellent.

[00:23:54.830] – Ethan
Perfect.

[00:23:55.330] – Ned
I really want to roll that into an AWS data center and be like...

[00:23:58.060] – Ethan
Yeah, I’m getting two for the home.

[00:24:00.710] – Justin
That's the only way we sell our software. No, of course, it's the opposite of that. Our team is a team of Gophers, so everything's written in Go. And it means that we are able to compile, for whatever CPU architecture or whatever operating system, a single binary with no other dependencies. And so that means that really, any way you would run a single user-space binary is how you run the StrongDM proxies. So throw it in systemd, throw it in a container, make a virtual machine dedicated to it. It doesn't really matter. It's just a single software process.

[00:24:34.750] – Ethan
You’re not shipping me a physical box, then at all?

[00:24:37.910] – Justin
No.

[00:24:39.410] – Ned
Right. But if I want to run this on my Raspberry Pi, I very well could, because you already have a binary compiled for the Arm processor and whatever architecture I’m running it on.

[00:24:51.310] – Justin
Of course. Yes. Again, it’s 2022.

[00:24:54.530] – Ned
You say that, but you’d be shocked at how often that is not the case.

[00:24:59.570] – Justin
Well, there's a pretty Arm-heavy desktop operating system out there these days, too. So, yeah, Arm everywhere is an important property for sure.

[00:25:09.620] – Ned
Okay. So I want to talk about the client portion of things a little bit more because let’s say I’m working on a Kubernetes cluster. I’m going through StrongDM to do that. What does that look like at the command line when I’m running kubectl commands? How am I addressing that remote cluster and making sure it uses the proxy?

[00:25:29.030] – Justin
Our job is never done unless we have an answer to that question that feels idiomatic for that particular use case. Okay? So what feels idiomatic for a Microsoft SQL Server user is different from what feels idiomatic for a kubectl user. And I'm a "cuddle"; I say kube-cuddle. All right, so when I'm on the command line and I type kubectl, the answer is that our client has interacted with the kubeconfig and augmented it to include the proxy versions of all the systems you need to access.

[00:26:02.210] – Ned
So it’s transparent.

[00:26:03.380] – Justin
It’s absolutely transparent, yes. Wow.

[00:26:05.640] – Ned
And so if I were a SQL DBA similar sort of situation, if I have connections to a bunch of different SQL servers.

[00:26:12.680] – Justin
Yeah, exactly. That sugar of the last mile is an important part of the job. It differs; we have different levels of sugaring for each protocol. We've spent a lot of time sugaring SSH and kubectl, and maybe less time on some other protocols. But being able to say "this feels natural," that's a key driver we always have in our design.

[00:26:35.270] – Ethan
Because effectively, as the operator using the client, I'm sending commands, and I'm not talking to Kubernetes, I'm talking to the proxy, and then the sugaring happens to massage that so it looks right when it actually hits the Kubernetes control node.

[00:26:55.610] – Justin
Yeah, exactly right.
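As a rough picture of what that transparent kubectl experience implies, the client would end up writing a kubeconfig entry whose server URL points at the local loopback listener, which then forwards the traffic through the proxy mesh. This is a hypothetical fragment; the cluster names and port are invented, and the real augmentation StrongDM performs may differ:

```yaml
# Hypothetical kubeconfig fragment: the cluster's server URL points at the
# local client on the loopback interface, so kubectl works unchanged.
apiVersion: v1
kind: Config
clusters:
  - name: prod-cluster-strongdm
    cluster:
      server: https://127.0.0.1:8443
contexts:
  - name: prod-cluster-strongdm
    context:
      cluster: prod-cluster-strongdm
      user: strongdm-user
current-context: prod-cluster-strongdm
```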

[00:26:57.380] – Ned
What does that look like from a DNS perspective? Are you taking control of my local DNS and the way it resolves, to make sure that I'm hitting the proxy and not the server I want to talk to?

[00:27:11.690] – Justin
That's a great question. Actually, for today, the answer is no, we don't take over DNS yet. It ends up just being a port numbering that you essentially coordinate with your team. So you essentially resolve names to port numbers on the loopback interface, but they stay consistent across your team.

[00:27:36.410] – Ethan
Okay, so I'm using my native tooling, an SSH client, let's say, that I know and love. I hit that loopback address on that custom port, which hits the client sitting on my box, which then takes it, shoots it through the tunnel to the nearest proxy, and across the mesh if necessary, and then it finally hits the destination. Interesting.

[00:27:59.280] – Justin
You got it. Okay.

[00:28:02.880] – Ned
And I assume I have to fire up the client first. Or maybe that’s something that just launches when I log in every day.

[00:28:08.720] – Justin
Yeah, it depends on, again, how much sugar we've got. So, for example, if you're using a web application, it's very easy to know what the URL is, and all operating systems respond to "open this URL" with the right default browser behavior. So a web application is an example of a client that's very easy to open. You click on it in the StrongDM client and it just opens.

[00:28:33.630] – Ethan
So if I'm the StrongDM administrator, I'm actually configuring this thing, and I assume I'm provisioning policies to send down to the client. Am I the one that controls what the localhost and port number mapping is?

[00:28:46.960] – Justin
Yes, you do. And typically you're going to think that through, and you're going to sort of lay all that out, and then maybe effect that through your Terraform or essentially whatever other automation you have in place. So it typically ends up being a little bit of design, and then it's manifested through some automation.

[00:29:07.290] – Ethan
I think you might have answered my next question. So if I'm that StrongDM administrator trying to bring the system up for the first time, and I've got a bunch of users to bring on board, what is that onboarding process like?

[00:29:16.950] – Justin
Sure. It all depends on how happy you are with your upstream identity systems. If you're super happy, that system supports the SCIM provisioning protocol, and you're happy with how your groups and roles are organized, then essentially all of that is going to be synchronized down to the product. Okay.

[00:29:31.500] – Justin
And then in the product, you take those groups and you say the data science team can access any of these resources that are tagged with data science, or something like that.
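The tag-based grant Justin describes can be sketched roughly like this. This is not StrongDM's actual API, just the idea of matching a synced group against resource tags; the resource names and tags are invented.

```python
# Hypothetical inventory: each resource carries a set of tags.
resources = [
    {"name": "warehouse-db", "tags": {"data-science"}},
    {"name": "billing-db",   "tags": {"finance"}},
    {"name": "notebooks",    "tags": {"data-science", "ml"}},
]

def accessible(group_tag: str, resources: list) -> list:
    """Resources a group may reach: those carrying the group's tag."""
    return [r["name"] for r in resources if group_tag in r["tags"]]

print(accessible("data-science", resources))  # ['warehouse-db', 'notebooks']
```

The appeal of the tag model is that newly provisioned resources inherit the right access as soon as they're tagged, without per-user grants.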

[00:29:40.680] – Ned
I have not heard anyone say the SCIM protocol in a long time. Can you expand a little bit on what that is for the listeners, because they might not be familiar with it?

[00:29:48.800] – Justin
Sure. So there's a set of protocols related to authentication and authorization out there. And interestingly, as an industry, I will say no one can claim we're done in terms of those standards yet, but there's at least one that does a pretty good job of at least enumerating who the users are in your organization, and providing a way to synchronize which users exist or are expected to exist, and what roles and what groups. That's the SCIM protocol. So it is a way for identity providers to push down the existence of Ethan, and the fact that Ethan is a member of the data science team, into any provider, really. It's not universally adopted, but it's a pretty good structured way to get a population of users populated into a system.
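For listeners who want to see what SCIM looks like on the wire, here's a minimal sketch of SCIM 2.0 resources using the schema URNs defined in RFC 7643. The user and group values are invented; a real identity provider would push payloads like these down to the service it's provisioning.

```python
import json

# Minimal SCIM 2.0 User resource (RFC 7643 core schema); values invented.
user = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "userName": "ethan@example.com",
    "name": {"givenName": "Ethan", "familyName": "Example"},
    "active": True,
}

# Group membership typically rides on a SCIM Group resource.
group = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
    "displayName": "data-science",
    "members": [{"value": user["userName"]}],
}

print(json.dumps(group, indent=2))
```

The identity provider POSTs and PATCHes these resources at the receiving service, which is how "Ethan exists and is on the data science team" gets synchronized downstream.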

[00:30:38.740] – Ned
Yeah, I encountered it when I was working on Azure AD single sign-on, getting it integrated with some of the third-party applications they were trying to support. If an application had SCIM support, it was really easy. And if it didn't, it was harder. Not great.

[00:30:53.550] – Justin
Yeah. So we're the former, the one where it's real easy.

[00:30:57.210] – Ethan
Oh, well, Justin, let's paint an ugly scenario, though. I've got a hodgepodge of users scattered across different, I don't know, it could be as bad as spreadsheets, Justin. What are my options then to get on board?

[00:31:12.010] – Justin
That's not uncommon. I think the reality is that there are very few cases where everything is clean at the beginning. So I'll say in that case you have the whole automation spectrum. If you just have a CSV, you're going to use our command line tools, which are totally happy to ingest CSV or JSON. Okay, so that's one end of the automation spectrum. The next hop in the spectrum is you pick your language of choice and import using our official SDKs, which will essentially run those imports against our API. And then the sort of final form is one of those really high-level orchestrators of the SDKs, the canonical example being Terraform. With the native Terraform provider, you would enumerate all your users right in there, run terraform apply, and you'd be off to the races.
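The first rung of that spectrum, CSV ingestion, can be sketched like this. The column layout is hypothetical; the point is simply that even spreadsheet-grade data parses into structured records that a CLI or SDK could then push to an API.

```python
import csv
import io

# Hypothetical export of users from a spreadsheet; columns are invented.
raw = """email,role
alice@example.com,data-science
bob@example.com,sre
"""

# Parse into dict records, the shape an import tool would consume.
users = list(csv.DictReader(io.StringIO(raw)))

for u in users:
    print(u["email"], "->", u["role"])
```

From here, the next rungs are the same records fed through an official SDK against the API, and finally declared in a Terraform configuration and applied.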

[00:32:07.750] – Ethan
There is a native Terraform provider for StrongDM. That's what you just said, yes? Also that. Okay.

[00:32:15.970] – Ned
You've got my Terraform spidey sense tingling there. I think everybody knows that I'm a fan. Because StrongDM is super important to having access to my systems, what happens if a proxy crashes? Am I just dead in the water, unable to get to those systems until it comes back up?

[00:32:36.620] – Justin
Thankfully, this is one of the other benefits of being able to control the client. Because we control the desktop client, we actually control how that mesh network is switched. Right. So you can achieve high availability in your circuits through to the target resources without, for example, an explicit load balancer, because the load balancing is happening between the client and whatever proxy nodes it's connected to.

[00:33:05.420] – Ned
So I can just deploy two of the proxies in my VPC or whatever it is, I don't have to put an ELB in front of them or anything, and they'll handle the load balancing connectivity?

[00:33:16.770] – Justin
Yes, correct. And I wouldn't say can, I would say must: you must deploy more than one. We're definitely going to encourage and nudge you to deploy quite a few, so that you're happy with things like Availability Zone distribution; you want to know that you're well distributed across those sorts of partitions. The one caveat I'll offer is that there are protocols that can't tolerate a TCP connection reset. Right. For a lot of database protocols, for example, if your query is flowing through path A and path A needs to be rebooted or is lost, well, your next query will flow through path B a millisecond later. But that won't be the same query, that won't be the same TCP connection. So our mesh network is awesomely resilient; it just isn't able to carry that TCP connection over in real time with no drop.

[00:34:11.950] – Ethan
Right, I see what you're saying. Don't rely on it as a load balancer with the same features and stateful mirroring that you might get in certain very fancy high-availability clusters and such.

[00:34:23.340] – Justin
Yeah, in those very rare, very fancy cases. Of course, if you're using a stateless protocol like HTTP, you would never notice when a node was restarted or crashed for whatever reason.

[00:34:33.540] – Ethan
Justin, I can see SysOps people and SecOps people looking at the StrongDM box and going, I want to own that. So how does the separation of duties typically work? What do you typically see with folks that have adopted StrongDM?

[00:34:48.100] – Justin
Sure. And I'm going to verbally describe the architecture diagram for a second here; I'm going to tell a story that hopefully draws a picture. We talk in terms of the horizontal direction of flow and the vertical direction of flow. Vertically, we're communicating with the control plane, which is recording events and broadcasting policies. Okay, so that's the part that our team is responsible for maintaining. The customer's team is responsible for everything in the horizontal direction: the client itself on the workstation, the flow into that proxy, and then the distribution of all of those proxies to map your network topology. Okay. There's also the matter of all the logs being generated, which have every pixel of the Windows remote desktop session and every keystroke of the kubectl exec into the cluster. All of those things being emitted, all of that evidence, all those logs. As you can imagine, there's sensitive stuff in there, right? And so your team is responsible for figuring out how you're going to route those sensitive captures into your SIEM, into your log aggregator, and into your long-term storage systems, so that you can collect those logs and have them for future review, for forensics, et cetera.

[00:36:10.950] – Ned
Is there something built into the StrongDM system that will scrub some of the information out before it even gets to the point where I would emit it as a log?

[00:36:18.540] – Justin
In our ecosystem, we have a sidecar container called the log export container, which is essentially a custom-rolled configuration of Fluentd. And within Fluentd, there's a sanitization option that we support customers in using. So if you can precisely identify something you never want to appear in the log, that's the path you would take.
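As a stand-in for the Fluentd sanitization Justin mentions, here's a minimal Python sketch of the general idea: precisely identify a pattern that must never appear in a log, and redact it before the record is emitted. The pattern and record here are invented for illustration.

```python
import re

# Hypothetical pattern that must never reach the SIEM or log archive.
SECRET = re.compile(r"password=\S+")

def scrub(record: str) -> str:
    """Redact matching tokens from a log record before it is emitted."""
    return SECRET.sub("password=[REDACTED]", record)

print(scrub("login attempt user=ned password=tacos123"))
```

In the real pipeline, an equivalent rule lives in the Fluentd configuration, so redaction happens in the export path before records reach downstream storage.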

[00:36:41.270] – Ned
I never want my lunch order to be in clear text.

[00:36:44.450] – Justin
Yeah, it’s embarrassing.

[00:36:46.950] – Ned
No one needs to know about my horrible Taco addiction if they didn’t already know about that.

[00:36:54.350] – Justin
You said Taco, but I see there you’ve got a bag of cotton candy. You’ve got empty bags of cotton candy all around.

[00:36:59.420] – Ned
Shh, we don't do video on this podcast. So the other concern I might have is when you put a proxy in front of things and you've got a lot of traffic flowing through that proxy, it could become a potential bottleneck for getting access to those resources. It's like one big link that you've got, and if it's all going through one thing, that's a problem. So how are you dealing with potential bandwidth issues?

[00:37:27.410] – Justin
It's absolutely true that it is a potential bottleneck. We already talked about the availability side of it, how you get highly available access to those resources, so let's talk for a second about the performance side. Of course, you talk about performance in terms of throughput and latency. I'll address latency first, and then we'll talk about throughput. Regarding latency, the way you should think of any proxy is: roughly add up the legs of the path at the speed of light, and then add a little overhead. Okay.

[00:37:55.310] – Justin
And that's true for us as well, because we're a mesh. If you're flowing from Austin over to Mumbai and then back to Virginia, that's a lot of speed of light. So if your route is suboptimal, there's nothing to be done about it; it's physics. Keeping your route and your path tight is the important part in terms of latency. The other thing that I think is important to note is that this is a people-first product: a high-frequency trading algorithm would definitely notice a fraction of a millisecond; Ned won't. Okay, so that's the other thing to consider regarding latency. In terms of throughput, that's another case where the implicit load balancing is just happening among the available deployed proxies. Because the load is spread out, if you get a hotspot, that load is just going to flow elsewhere. So theoretically, you could hit throughput limits, but you'd have to push really, really hard. And because the proxy has been designed to scale with CPU, the remedy is always just to add more cores. Okay.

[00:39:02.160] – Justin
So you don’t have to think about memory. You don’t have to think about disk. Just add more cores.

[00:39:06.480] – Ned
The answer is either add more cores or add another proxy that has some cores in it.

[00:39:10.550] – Justin
You got it exactly right.

[00:39:12.450] – Ethan
Justin, I love it when founders who have their fingers deeply in the code base and product design come on the show to talk like this. This has been super cool, and we've left so many things on the table that we want to get into. The other good news is that StrongDM is going to be back later in the year with more shows, so we can get into it. I really want to get into the infrastructure as code conversation, the API and the SDK, all that kind of stuff we can work with on the StrongDM side. But for now, this has been a fantastic introduction to StrongDM. If people are listening to this and want to know more, where would you recommend they go?

[00:39:50.550] – Justin
I encourage everybody to just check out our website and sign up for a free trial. It's, of course, StrongDM dot com slash packet pushers. You can sign up for a free trial, you can ask for a demo, all that stuff. StrongDM dot com slash packet pushers.

[00:40:07.110] – Ethan
Thank you, Justin. Now, Justin, are you a social guy? Are you on LinkedIn or Twitter or any of those things where people can harass you? I mean, come up to you and politely ask you questions.

[00:40:17.970] – Justin
I know this may be unconventional, but I will encourage people to email me if that’s okay.

[00:40:23.130] – Ethan
If you wish to share your email address, by all means.

[00:40:25.510] – Justin
Yeah, Justin at StrongDM, so that's the first way to get me, and then you can also find me on the Twitters, @builtbyjustin.

[00:40:35.610] – Ethan
Thank you very much, Justin McCarthy, CTO and co-founder of StrongDM, for joining us today. And our thanks to StrongDM for sponsoring today's episode. Ned and I have families to feed up here in the cloud, and our sponsors help us do exactly that. Virtual high fives to you for tuning in and listening all the way to the end. If you talk to the folks at StrongDM, again, that's strongdm.com slash packetpushers, would you be sure to let them know that you heard about them on Day Two Cloud, part of the Packet Pushers podcast network? We would appreciate that. And if you have suggestions for future shows, vendors you'd like to have come on and sponsor the show, et cetera, Ned and I would love to hear from you. So hit either of us up on Twitter; we are monitoring at Day Two Cloud Show. Or if you're not a Twitter person, go to Ned's fancy website, Nedinthecloud.com, and hit his contact form there to let us know. Now if you like engineering shows like this and you'd like even more, visit packetpushers.net slash subscribe. Links to all of our podcasts, newsletters, and our websites are listed there.

[00:41:28.660] – Ethan
It’s all nerdy content designed for your professional career development. And until then, just remember, cloud is what happens while IT is making other plans.
