EP 6 — Allan Swanepoel: How Automation Can Help Developers Think of Security as an Actuator
This modern SDLC has really exacerbated the fractured relationship between developers and security. Often security is frustrated that developers cannot deliver on their laundry list of asks, and in turn, developers are sick of the legacy application security ways that slow down progress.
To scale at the speed of DevOps, organizations have to eliminate this friction and improve the relationship between developers and security.
Our guest today is Allan Swanepoel and during this episode, he’ll teach us exactly how we can do that by bringing the power of automation to your application security program. Allan has a deep understanding of both sides of this issue — for many years he was on the development side before moving over to security after observing the lack of automation that existed in security workflows and processes.
Topics discussed in this episode:
- Why organizations need to embrace a policy-driven prioritization approach to managing security.
- Why eliminating the friction between developers and security begins with culture.
- How security teams can get developers to adopt and use security tools.
- Why organizations hiring security engineers only to have them handling things like Jira tickets is a tremendous waste of talent and resources.
- How to build an automation mindset within your security team.
- How security teams can balance automating key workflows with the normal day to day fires.
- Security lessons from Allan’s time focused on infrastructure-as-code and infrastructure automation.
Lessons from integrating third party library scanning in DevOps workflow – AppSecUSA 2018 (Keynote that Harshil referenced in the episode).
Allan: Hey Harshil, how's it going today?
Harshil: It's going fantastic, and I am super excited to have you here. If you're a regular listener to this podcast, you already know that I like bringing people who come from non-security backgrounds but are now doing a lot of amazing work within the space of security. And Allan, I would love to hear a little bit about your background. Where did you start your career and just really briefly, how did you get into security?
Allan: Yeah, sure. So I'm one of those people that fit into your non-traditional mold. I hail more from the infrastructure side of things, where I was an infrastructure engineer for many years, and fell into security by starting to build security products. And that just kind of hit the spotlight for me on where there's a big need for work to be continued. And also as DevOps then started forming, how we can take DevOps and implement security in DevOps, effectively building DevSecOps.
Harshil: Right. And you spent a lot of time migrating legacy infrastructure, re-architecting them to be Kubernetes and cloud-native. So is that where the intersection with security came in?
Allan: Yeah, definitely. Especially with the advent of cloud. A lot of the traditional methods, traditional firewalls, the traditional controls, couldn't necessarily be adjusted to cloud on day one. And you needed to be more aware of how the cloud was built and how the cloud was working to be able to translate those models more effectively.
Harshil: Right. So the one thing that I'm always fascinated about is this: you were doing large-scale infrastructure work, Kubernetes environment architecture and implementation. What got you interested in security?
Allan: The automation side of it was really lacking in my personal view. You always find that when you're working with other security teams that you've got all of these policies, all of these procedures. And sometimes if you look at an org and you interpret it, it might feel like a lot of red tape that you need to go through to get something from your laptop into production. And this was becoming, in that sense, a very big blocker. So one of the things that really interested me has always been how do we take these fast-moving agile worlds and apply them to security, that security can move forward at the same speed as your application development, without becoming an inhibitor to the organization?
Harshil: Okay, so let me get this right. So you are seeing that okay, all of these other sides of the business from infrastructure and development, they were already automated, they were moving faster, and you pointed out to those bunch of security people and said, “Hey guys, you need to do better”. Is that why you joined security to help them out?
Allan: Basically, yeah.
Harshil: That's amazing. I love it. So tell me a little bit more about that. I know you're a big fan of automation, I know you've got a ton of experience doing that on the other sides of the world. What about security? Like, what are the things that you've seen that can be automated but most people are not doing that automation in security?
Allan: You know, the one thing that really pains me, and I see this almost on a day to day basis, is you go out as an org and you hire a bunch of security engineers. And these are intelligent people, these are well-trained people, CISSP trained, OSCP trained. And what do you have them do? You have them sitting and creating Jira tickets. And in my mind, this is such a waste of intelligent and useful resources, that we really need to enable these highly skilled engineers to be able to use their time more effectively. And the only way to do that is through automation. Now, yes, maybe these security engineers are skilled enough to do more work, but they might not necessarily have the infrastructure skills to automate the work that they're doing. And that's kind of where I come in and I take a look and I look at how do I make you take this job that takes you half an hour every day to go through these bunch of Jira tickets and automate that so that's a five-minute process.
Harshil: Right. So the challenge that most people run into is not debating whether automation is the right choice or not. I'm pretty sure everyone would agree that we should automate as much as possible. The challenge is, in my opinion, twofold, which is a lot of the security teams don't have the right skill set or the bandwidth to invest in automation. And the second piece is there's just so much to automate, like, how do you prioritize that, right? So let me ask you the first question first, which is if you look at a typical security team that doesn't have a lot of people, and a lot of resources, how do you get that automation mindset? Do you suggest hiring people dedicated to this, or do you suggest training folks existing in the team to learn some of those automation things, or what's your suggestion there?
Allan: Yeah, I love the teaching method because teaching someone how to automate gives you essentially two things: Number one, they start to think more independently, which is basically the crux of the problem. Like, if they can think about how they can automate their own workflow, and in the DevOps world, there's this mentality of you're essentially working yourself out of your own job through automation. And if we start looking at applying that to security engineers and in the security skillset, it's not that you're actually going to run out of a job, it's that you can just accomplish so much more in the 8 hours of a day that you spend in front working with the tools that you have. So where you maybe spend one day a week just triaging incidents, it's something that's automated. And just the highlights of those incidents is something you can go through in 15-20 minutes.
Harshil: Right. And I think it's also getting much easier with the very easy access to cloud environments. Like, it's so much easier now to just write a Lambda function or spin up a container and just get it deployed in AWS super quickly, right? Now the access to build that automation, along with open-source frameworks that are available, combined with the ease of access to cloud platforms, it's much easier to put those automation pieces in place. So the next question comes into the picture, which is like, okay great, now we have some people who can do some sort of automation within the security team. How do you find time and resources to do that? Because as we know, most security teams are not fully staffed. They always are looking for more people. So how do you think about prioritizing automation versus running after fires that are just burning every day?
Allan: With the smaller teams, and yes, this doesn't necessarily scale very well outside of the org, but with a smaller team, it's very easy for me to go to a person and say, “Let's do a shadow session, I want to see what you're doing, what your tasks are on a given day. What's the biggest issue that you feel is taking up the most time, that you wish you could get off your plate so that you could be freed up to do something else?”. And that's basically where I start. For someone that's a project manager, it might be automating your 15-day SLA responses. For someone that's an incident response agent, it might be automating the aggregation of different tools so that you can have a single pane of glass view on what the different tools are accomplishing: your software composition analysis, your vulnerability assessment, all of these are different tools. So now you've got 10-15 dashboards to log into and you really just want an overview to see what's actually going on.
Harshil: Right. Yeah, that's a good point. I really like that tactical approach of looking into what exactly you do on a day to day basis. Let's do an analysis of what can be automated out of that. I think that might be a much better approach as compared to looking at the future ideal state and saying “Hey, let's automate this other thing that you may or may not be spending a lot of time on”. So looking at what the team does on a day to day basis and picking those tasks for automation makes a lot of sense. I think one of the challenges that we ran into in our previous lives was that in a smaller team, I personally had one engineer who was working on automation, but after a point when that person left, that knowledge went with that person. So a lot of times I've seen that happen again and again at several different companies, especially in smaller teams where one or two individuals are focused on writing automation. And as soon as they leave, the whole thing just falls apart because it's not maintained, and it's not documented correctly, and there's not enough resiliency in the knowledge within the team as well.
Allan: Yeah. And again, this is where the hand holding at least for a direct one on one approach really helps because you're teaching this person how they can think about automation and work to enable that engineer to consider their own automation stories as well. Now, I'm not the only one who wants to build this utopia. This utopia has to be something that everybody wants to build. And that's why it's different for every organization. And it's different for every engineer that you're going to work with, because each of these engineers have a different thing that they want to have automated, that they don’t want to deal with in their lives.
Harshil: Right. So let's switch gears a little bit and carrying the theme of automation, you came from the infrastructure as code and infrastructure automation background. Are there any lessons that you may want to share? Or maybe not lessons, but things around security implications, because if you're managing your infrastructure in a legacy environment versus managing everything in a cloud-native infrastructure as code world, what are some of the security implications that are interesting from your perspective, especially since you're coming from an infrastructure background?
Allan: Yeah. The thing that strikes me the most is how people seem to either forget or ignore the basics. And by this, I mean just because you can now run your API on a Docker container in a Kubernetes cluster, that does not imply that it's more secure than when it was running on a piece of tin sitting in a rack. There's still tin underneath the hood that's basically holding the container, the Kubernetes environment. So are you making sure that your hardware is still running a hardened OS? Are you making sure that if you're using a hypervisor that your hypervisor access is hardened, that your undercloud or overcloud networks are secure and isolated? Some of these topics that are really almost so commonplace today sometimes almost seem to be forgotten and left behind.
Harshil: Right. Yeah. And one of the common themes that I've seen is how security teams are still stuck sort of in the legacy mindset. I've seen so many people, they're operating in a Kubernetes native environment, but they still run network scans every day within their environment, and before they know it they have hundreds of thousands of issues that they need to resolve. But really, if you really think about it in a cloud-native environment that's managed with infrastructure as code, it's mostly immutable infrastructure where you don't really need thousands and thousands of items to go after. You need to go chase the underlying base images or the golden images or the artifacts that are stored in the registry to really fix the problem. But for some reason, a lot of the security teams still function in the way of “I'm going to detect things in the production environment, and I'm going to give you a list of thousands of IP addresses”. So I don't know if you've seen that as well. Or is it just me?
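Editor's note: the point about chasing base images instead of IP addresses can be sketched as a small grouping step. This is only an illustration under assumptions: the finding fields (`ip`, `image`, `cve`), the image names, and the placeholder CVE labels are hypothetical and do not reflect the output of any particular scanner.

```python
from collections import defaultdict


def group_by_image(findings):
    """Collapse per-IP findings into one row per (image, cve) pair."""
    grouped = defaultdict(set)
    for finding in findings:
        grouped[(finding["image"], finding["cve"])].add(finding["ip"])
    # Put the fixes that clear the most hosts first.
    return sorted(
        ((image, cve, len(ips)) for (image, cve), ips in grouped.items()),
        key=lambda row: (-row[2], row[0], row[1]),
    )


# Hypothetical scan output: four IP-level findings, two underlying images.
findings = [
    {"ip": "10.0.0.1", "image": "base-python:3.11", "cve": "CVE-EXAMPLE-1"},
    {"ip": "10.0.0.2", "image": "base-python:3.11", "cve": "CVE-EXAMPLE-1"},
    {"ip": "10.0.0.3", "image": "base-python:3.11", "cve": "CVE-EXAMPLE-1"},
    {"ip": "10.0.0.4", "image": "base-node:20", "cve": "CVE-EXAMPLE-2"},
]

for image, cve, hosts in group_by_image(findings):
    print(f"{image}: fix {cve} once to clear {hosts} host(s)")
```

With immutable infrastructure, the four IP-level rows reduce to two image-level fixes, which is closer to the unit of work an infrastructure team actually acts on.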
Allan: Yeah, no, I've definitely seen that. And unfortunately, I think that's probably going to be the hardest barrier to break. The tooling that exists doesn't necessarily support immutable infrastructure. When I'm scanning a service, it's still scanning an IP address. There's no real tool that says, “Oh, this is actually a pod”, and that, looking at your Kubernetes infrastructure, tells you this is the container, or this is the host that it's running on, or that goes down a more cloud-native route of identifying what you're actually scanning. And I think a lot of these challenges are just with the tools that are available. And one of the areas where security is still lagging today is not necessarily having something that's cloud-native to scan. And also personally, I don't subscribe to “something should be scanned in a production environment”. If you've got insecurities in your production environment, it's way too late.
Harshil: Yeah. So then let me ask you this, now if you are the person or the team that's managing the cloud environment, the production infrastructure, and I as a security person, my job is to find out what my bugs are, what my vulnerabilities are, non-compliance things within the environment, various different things, right? So how should I work with you? Is there a particular time, or is there a particular system where I should communicate security information to you that makes your job much easier as an infrastructure architect?
Allan: Yes. I think the shift-left mentality definitely needs to apply, and this needs to be as close to the time of creation as humanly possible. You know security should, in all form and in practice, be part of doing the code review as you're building up your infrastructure as code templates, whether you're using Terraform, or Ansible, or CloudFormation, or whatever tooling you're using to build up your infrastructure. I strongly think that if you can emulate your environment ahead of the actual creation of it, it leads to a higher success rate in the security automation, because then you can also track your changes through your code review cycle. But again, this also means that there's a change required from the security environment that you need a lot more developer trained security engineers than you actually need security trained engineers. So you almost need to shift how your engineers are looking at the problem, and they need to be able to communicate to the developers and cloud engineers on those levels for the environment enablement.
Harshil: Right. So tell me a little bit more detail about it. So let's say I'm the security team, right? And I have the shiny new tool that does Terraform scans and Kubernetes scans, and I can integrate that with GitHub, and it will look at my code and it will flag all the findings. But I am the security team, I'm not the one writing all the code. So that's an inherent challenge, right? Because the security team who owns these tools and assessment systems, they are not writing the actual code. But the people who are writing this code, who are building out this infrastructure, they might not even know that these systems of security assessments exist. And if you are writing your own infrastructure in a relatively larger organization, there's no central choke point, there's no central point of funneling everything through this one single piece. So how do you bridge those two things together, where security owns the security tools and the other teams across the organization are writing their own infrastructure? How do you connect those two things together?
Allan: So most of these tools that exist allow you to implement some form of tagging in it. So once you've identified the area or the team that you're working with, you could either go and tag these repos through your CI/CD pipeline. So certain tags from your repo gets published into your cloud environment so you know who the owner is or who the owning team is - and I'm saying this from a security point of view. So the security always knows, “Hey, this is the team that's building this. This is the owner or the team manager potentially having an email alias in there”. And that way when you're looking at it from a cloud environment, you immediately know who to contact, how to get hold of them, and potentially even what repo, and where this repo is located, where this infrastructure gets pulled from.
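Editor's note: the tagging idea Allan describes can be sketched as a check that every cloud resource carries ownership tags published from its repo. This is a minimal sketch, not any vendor's API: the tag names (`owner_team`, `owner_email`, `source_repo`) and the example values are assumptions chosen for illustration.

```python
# Hypothetical tag names; each org would pick its own convention.
REQUIRED_TAGS = ("owner_team", "owner_email", "source_repo")


def missing_ownership_tags(resource_tags):
    """Return the required ownership tags that are absent or empty."""
    return [tag for tag in REQUIRED_TAGS if not resource_tags.get(tag)]


def contact_for(resource_tags):
    """Return who to contact about a resource, or raise if it is untagged."""
    missing = missing_ownership_tags(resource_tags)
    if missing:
        raise ValueError(f"resource is missing ownership tags: {missing}")
    return {tag: resource_tags[tag] for tag in REQUIRED_TAGS}


# Example: tags as they might be published by a CI/CD pipeline.
tags = {
    "owner_team": "payments",
    "owner_email": "payments-team@example.com",
    "source_repo": "example-org/payments-infra",
    "environment": "prod",
}
print(contact_for(tags))
```

Running a check like `missing_ownership_tags` in the pipeline itself is one way to make sure untagged infrastructure never reaches the cloud environment in the first place.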
Harshil: Yeah, I love that idea of having the organization tag assets and artifacts so there's traceability and there's attribution to who built what. I guess the other question is, if you or the security team is running all these tools and consuming and seeing all these alerts, the developers that are writing this infrastructure, they probably don't even know that these things exist, right? So how do you bring the right set of information in front of the developers who are actually building out this infrastructure, and writing the code?
Allan: Again, from a security point of view, you have to work with your developers and the development teams constantly. The product managers of these teams know who the security engineers are. Within the teams, you may have some security champions, again, working closely with the rest of the security team, or even developers who have shown an interest in leaning towards the more secure side of adopting a more secure style in the actual team. All of these projects, all of these programs need to be existent and enabled within your org.
Harshil: Right. So what I'm hearing is the fundamental piece is to establish those relationships with the different Dev teams, whether it's with the engineering leadership, or building out a security champions program, or identifying people, engineers, developers who are passionate about security. So finding out who your points of contacts are, establishing those relationships so there is a mutually beneficial relationship between security and engineering. I think that's what I'm getting out of it.
Allan: Yes, definitely. There's two ways you can do this. Cultural adoption is always the easiest way. So if you drive the right culture, this is definitely the preferred way for adoption. You're going to get a lot higher resistance if you try and shove it down someone's throat and make it a thou shalt governance rule. “Thou shalt have thine container scanned” as the 11th law of said org does not really scale very well with engineers, but with the right culture and the right adoption, and having these programs in place, working with the teams, enables the teams to think of security as an enabler and not an inhibitor.
Harshil: Yeah. One story I can share is in one of my previous roles, I was talking to one of the developers, Dev leads, and I was asking him like, “Hey, why does the team not look at all these security things that we spend so much time and effort in finding out?”. And he said, “My team doesn't even know that these things exist”. They have a certain way of looking at their test results, the bugs from a quality performance reliability perspective. They have a certain set of tooling that they use on a day to day basis, but if security alerts live in a completely siloed different system, they're not going to go and look for it. So what we did as an experiment was we integrated security tools, and just showing them the results early on in their own system. So running security tests as a part of Jenkins pipelines, running it as a part of your GitHub pull request checks, and just showing that visibility brought about quite a bit of change within the developers, because now they could see that, “Hey, these are the things that maybe we should fix”. Now, granted that there's no world where they fix every single thing, there were still discussions and debates about it, but at least it got security front and center into what they look at on a day to day basis, and what they think about, what they care about. So that was one very simple trick for us to make awareness, drive awareness about security.
Allan: Yeah, that definitely works. And that definitely scales very well. I think the one big challenge from that is there's not just one tool to rule them all anymore. You've got SAST, you've got DAST, you've got IAST, you've got dependency license scanning. This can be one tool, it can be three tools, it can be five tools. So it's becoming more and more complex to say, “Okay, well these are the five or six dashboards that you need to go to look at how your tools are performing”. And again, this is where you're losing the interest. So the real benefit is in being able to have this integration, like you said, and present all of these findings in a single pane of glass.
Harshil: Right. And prioritized based on what's relevant to the business, right? So policy driven prioritization.
Harshil: Yeah, that's exactly what we are building at Tromzo, by the way.
So I know you're also very interested and passionate about this open source project called Security Scorecard by OpenSSF. Tell me a little bit more about that project.
Allan: Sure. So again, if we just take a step back to what I just said a minute ago, now you've got all of these tools and all of these tools are raising different risks. Now, in each of these tools, any one of these risks can potentially be a circuit breaker to your build. So let's take software composition analysis. Let's say there's a new finding of a high vulnerability in one of your dependencies. On its own, it could now break your build as a circuit breaker. And whether it should or should not break your build is a part of your business policy that you want to exercise. But how do we quantify this risk, along with the SAST scan or the DAST scan, or the infrastructure testing or the end-to-end testing? How do we build a full card that says for this specific project, these are all of my risks, and give me an aggregated score. And I'd rather then say, “Okay, let's make the circuit breaker decision on the aggregated score than on just a single item being flagged as a potential high risk”. And again, this is where I think something like what is more commonly known as the Google Scorecard project comes in, is that it advocates for these types of decisions where you have more of an aggregate scoring solution as an overall instead of just a singular item having a make or break, or circuit breaker decision within your CI/CD pipeline.
Harshil: Right. And I love the idea that when you define those different policies that combine the results into an aggregated score, it's also very transparent, right? So it runs as a part of your CI pipeline and developers can look at it like, “Hey, my score is, I don't know, nine out of ten or four out of ten”. And the threshold is seven, anything below a seven, you're not going to be able to build, you're not going to be able to release to production. So what is it that needs to be done to get my score above a seven or above an eight? And this goes back to our earlier point of security folks focusing on very high value, high leverage work of defining what policies are appropriate for the organization, and leave the rest of the manual work to automation where the automation system aggregates the results of those policies, combines them into a single actionable score. And then you build a pipeline or the delivery pipeline just makes a decision based on that piece.
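Editor's note: the aggregated-score gate discussed here can be sketched in a few lines. This is an illustration under assumptions, not how the Scorecard project computes its results: the tool names, the 0 to 10 scale per tool, the weights, and the weighted-average formula are all hypothetical. Only the threshold of seven comes from the conversation.

```python
def aggregate_score(tool_scores, weights):
    """Weighted average of per-tool scores, each assumed to be on a 0-10 scale."""
    total_weight = sum(weights[tool] for tool in tool_scores)
    if total_weight == 0:
        raise ValueError("no weighted tools reported a score")
    weighted_sum = sum(tool_scores[tool] * weights[tool] for tool in tool_scores)
    return weighted_sum / total_weight


def gate(tool_scores, weights, threshold=7.0):
    """Return (passed, score): the circuit-breaker decision and the score behind it."""
    score = aggregate_score(tool_scores, weights)
    return score >= threshold, round(score, 2)


# Hypothetical policy: dependency and static analysis count double.
weights = {"sca": 2, "sast": 2, "dast": 1}

# One weak tool result does not break the build on its own...
print(gate({"sca": 5, "sast": 9, "dast": 8}, weights))
# ...but it does once the aggregate drops below the threshold.
print(gate({"sca": 4, "sast": 9, "dast": 8}, weights))
```

Because the decision is a single comparison against a published threshold, a developer can see both the number and what would move it, which is the transparency Harshil is describing.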
Allan: Yeah. And what's nice is because you're using your CI/CD pipeline and you're building your policy as code, this code can reside within your repo. It's maybe one or two additional files, it's not going to kill the world in storage. But the second thing is now this policy becomes auditable by anyone, you can see if this policy has been changed, if those changes are accepted or not. And you could even now have automated alerting based on this policy being changed. So we have the example, like you said, if your score is below a seven. So maybe I just want to alert if someone goes and changes my average score; that should be an alert. Why is the new average score being lowered down to four? Who made this change? Whereas something like the repo owner or the project owner might not necessarily need an alert, because during a project's lifecycle, it may have different owners, people may definitely still leave and join the org. So you’ve got this governance where you can also articulate exactly what you want to have alerting on.
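Editor's note: selective alerting on policy changes might look like the sketch below. It assumes the policy is a simple mapping stored in the repo; the field names (`min_score`, `owner`) and the choice of which fields trigger an alert are hypothetical.

```python
# Fields whose change should notify security. Ownership is deliberately
# left out, since owners change over a project's lifecycle.
ALERT_ON = {"min_score"}


def policy_alerts(old_policy, new_policy):
    """Compare two versions of a policy and describe the alert-worthy changes."""
    alerts = []
    for field in sorted(ALERT_ON):
        before, after = old_policy.get(field), new_policy.get(field)
        if before != after:
            alerts.append(f"{field} changed from {before} to {after}")
    return alerts


old = {"min_score": 7, "owner": "alice"}
new = {"min_score": 4, "owner": "bob"}

print(policy_alerts(old, new))
```

In practice the two versions would come from the repo history, so the same code review cycle that tracks infrastructure changes also answers "who made this change?".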
Harshil: Right. Yeah, and I'm a big fan of making things simple. A few years ago, I had given a talk at one of the security conferences about how Apple Watch is so amazing. So a lot of people who are into fitness, they use Apple Watch or Fitbit or some sort of tracker, right? And being able to see those three rings on the Apple Watch that tell you, have you met your daily goal or not? Are you under your goal? Are you above your goal? That is such an easy, simple way to set a goal for yourself, make sure you're meeting that goal or exceeding that goal, and be aware of the risks that you might be introducing. I think that's exactly what Security Scorecard does, which is it removes a lot of this complexity about security. Because let's be real, developers don't have the time to learn and understand nine or ten different security tools. They want security to be much more simple, much more actionable. So something like this Google Scorecard makes it very easy to understand, very easy and deterministic, so that it's not nebulous anymore. If this, then that, and maybe this, maybe that. It's none of those discussions. It's very obvious, here's a binary decision, you either do it or you don't do it. And either you're allowed to deploy or you're not allowed to deploy.
Allan: Yeah. And another one of the big benefits that comes out of that: each of these nine or ten different tools produces a ton of alerts. So if your team is getting these alerts, you're quickly going to run into alert fatigue. You're constantly going to be chasing alerts. And again, that’s not a scalable solution. By introducing this policy-driven system, you've got one system, all of your data is aggregated into one system, and you're centralizing your alerts to only be generated through one system whereby there's one source, there's one pane of glass, there's one set of alerts to look at. And that's really the biggest benefit that I can see coming out of a solution like this where you really want to have just the highlights. This repo didn’t make the cut; here are things you can do to get this project there. You didn't have a SAST scan, or your SAST scan had these findings, and this brought your score down to a six. So these very quick fixes then enable the team to very quickly turn around and say, “Hey, these are the two, three things that we need to fix, and that will move our score from a six to an eight”.
Harshil: Yeah, I can almost see sort of a remediation campaign-style operation, right? So typically a lot of security teams focus on certain types of bugs or certain classes of bugs and work with the engineering organization on, “Hey, let's resolve these types of things in the next quarter, in the next six months”, or whatever. So you can modify your scores and policies to reflect what the organizational priorities are, and that just drives action across the engineering organization.
Allan: Yeah, definitely. And another thing that I've continuously seen is, again for your smaller teams and your younger, let's call it startup, hypergrowth level companies, they might not have a mature security posture today. They're only doing maybe the scanning. So as they introduce a new tool, of course there's going to be findings. Of course, there's going to be a lot of risk added to your risk card. This is expected. But again, from a security point of view, how do we turn this around and turn it into an enabler? We don't want to grind engineering to a halt by, “Hey, here are these new 5000 risks that you never knew about, but we just added this tool that now tells you about them”. This is not the intent. The intent is let's scan for a new baseline, let's reset the baseline, and because this is now a policy-driven architecture, we can manage the policy effectively and still work on enabling the engineering teams to do what they do best. One of the examples that I like to compare this with is, if you've ever heard of Amazon's interviews, they have a Bar Raiser program, and the Bar Raiser is supposed to see, in that interview, how you as an engineer meet or raise the bar, the existing status quo of the engineering team that you're interviewing for. And the same should now be applied for software. How is this review process that we're going through raising the bar for the repo, for the project, for the org?
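Editor's note: resetting the baseline when a new tool is introduced can be sketched as a set difference. This is a minimal sketch under assumptions: findings are represented by hypothetical stable identifiers, and the policy is taken to be "only findings that are not in the accepted baseline count against a change".

```python
def new_findings(current, baseline):
    """Return findings that are not part of the accepted baseline."""
    return sorted(set(current) - set(baseline))


def raises_the_bar(current, baseline):
    """A change passes if it introduces nothing beyond the baseline."""
    return not new_findings(current, baseline)


# Hypothetical identifiers captured on the day the tool was first enabled.
baseline = {"finding-001", "finding-002", "finding-003"}

# A later scan: one old finding fixed, one new finding introduced.
current = {"finding-001", "finding-003", "finding-104"}

print(new_findings(current, baseline))
print(raises_the_bar(current, baseline))
```

The existing backlog stays visible for remediation campaigns, but only regressions block the pipeline, so the bar moves up in small increments instead of stopping engineering on day one.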
Harshil: I love it. Raising the bar in small increments. I love that approach.
Allan, unfortunately, this is all the time we have for our recording today. I really appreciate all the insights you shared. Thank you so much for being a guest here.
Allan: It's only been my pleasure, and I look forward to speaking to you real soon, Harshil.
Harshil: Fantastic. Thank you.
Thanks for listening to the Future of Application Security. If you've enjoyed this episode or you are new to the show, I'd love to have you subscribe wherever you get your podcasts, so you don't miss any episode. And if you like the podcast, I'd be grateful if you can leave us a review on Apple Podcasts. Thank you for listening.