In the information security labor market, perhaps no other job title has the allure of the threat hunter. While ordinary infosec job seekers want to be chosen for the red team or the blue team, threat hunter roles seem to operate on a different plane where challenges can be fast and furious.
Devon Kerr, principal threat researcher at Endgame Inc., a cybersecurity company headquartered in Arlington, Va., started his career in the ’90s in IT operations working primarily as a systems administrator before transitioning to network engineering. In the early 2000s, when he was working for a company that suffered a security issue, Kerr discovered he had a knack for investigation, and that knack led him to his current role.
As principal threat researcher at Endgame, Kerr has not only done the job but also co-written a book on it, The Endgame Guide to Threat Hunting, with Paul Ewing, senior threat researcher at Endgame. In it, they lay out the threat hunting process.
And while having technical skills can help the hunter, successful threat hunters don’t always start out as techies, according to Kerr. In this Q&A, Kerr speaks about what it takes to become a threat hunter, which skills and tools are needed, and how they can be applied.
Editor’s note: This interview was edited for length and clarity.
Who can be a threat hunter? Can anyone do it, or is threat hunting more of a specialist role?
Devon Kerr: Threat hunting as a discipline doesn’t really require any specialized skills. Most of the knowledge that you need to be an effective threat hunter — which I know is a loaded term — is something that you can achieve on the job. To get started, you need data, some sort of tooling and a critical thinking approach to identifying things that are unusual given an environment.
The first thing is, what is threat hunting? Threat hunting is often mystified by the industry, and I think that’s intentional. I think that there are business reasons why the industry would want threat hunting to seem like a prestige class of incident response or analysts — and it’s really not.
Threat hunting has always existed. We’ve called it different names; it used to just be called monitoring and detection. That was threat hunting.
Before we had technologies that had rules and could give us a reactive approach, maybe a more passive or lazy approach to detection, we had people who were looking for these unusual things. We have simply rebranded monitoring and detection with this new term, which I think is a reflection that most technology now is a reaction to some stimulus, and then it’s not facilitating folks to go out and just ask questions about the environment.
From a threat hunter perspective, there are folks all over in different roles expected to support threat hunting. Some of those folks will be incident responders or they might be malware analysts or they might inhabit both an IT and a security role like I once did. Then you’ve got folks in the leadership chain who manage those organizations and who also may step in to do threat hunting from time to time.
I don’t believe that any one group of people is responsible for owning the knowledge domain. I have supported investigations over many years where database administrators, office assistants, electricians, heating and air conditioning engineers have stepped in to support this role. By and large, those individuals are just as successful as anyone else if they’ve got the right visibility into the environment.
How do those people go about the threat hunting process? And what should they be looking for?
Kerr: Asking questions typically begins with simple anomalies: Is this even normal? And then working from there to a notion of frequency: How common is it? What’s the prevalence of this thing that I’m questioning in the environment? Is it only on one system, which makes it seem more suspicious, or is it on every system, which may preclude this being marked as suspicious?
It might mean we want to whitelist that as benign, even if we don’t know what it’s doing. Something widespread could be considered benign.
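The prevalence question Kerr describes can be illustrated with a short Python sketch. It assumes a hypothetical list of (hostname, artifact) observations, which could come from any endpoint collection tool; the hostnames, paths and the 25% threshold are made up for illustration and are not taken from the interview.

```python
from collections import defaultdict

# Hypothetical inventory: (hostname, artifact) pairs, e.g. a binary path
# observed on each host. The data is invented for illustration only.
observations = [
    ("ws-01", r"C:\Windows\System32\svchost.exe"),
    ("ws-02", r"C:\Windows\System32\svchost.exe"),
    ("ws-03", r"C:\Windows\System32\svchost.exe"),
    ("ws-04", r"C:\Windows\System32\svchost.exe"),
    ("ws-02", r"C:\Users\Public\updater.exe"),
]


def prevalence(observations):
    """Map each artifact to the fraction of hosts on which it was seen."""
    hosts_by_artifact = defaultdict(set)
    all_hosts = set()
    for host, artifact in observations:
        hosts_by_artifact[artifact].add(host)
        all_hosts.add(host)
    return {a: len(h) / len(all_hosts) for a, h in hosts_by_artifact.items()}


def rare_artifacts(observations, threshold=0.25):
    """Return artifacts seen on at most `threshold` of hosts, rarest first."""
    scores = prevalence(observations)
    rare = [(a, s) for a, s in scores.items() if s <= threshold]
    return sorted(rare, key=lambda pair: pair[1])


for artifact, share in rare_artifacts(observations):
    print(f"{share:.0%} of hosts: {artifact}")
```

Anything this returns is only a candidate for a closer look. Rarity alone does not make an artifact malicious, and widespread presence does not prove it is benign.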
When adversaries are active in an environment, they leave behind evidence, and that’s typically the type of thing we’re looking for. But threat hunting has a number of other objectives that we don’t generally talk about, like developing awareness of the environment: knowing what’s normal and what’s not, and using that knowledge to identify things related to security, auditing and compliance. For example, how many systems do we have that have not yet been patched against this particular CVE or, in the case of Windows, updated with this particular patch release?
And, to a degree, we find that hunting also addresses questions of internal user misbehavior, which is not exactly the same thing as an adversary doing something malicious. A good example of internal user misbehavior might be if you have someone in accounting who was granted administrative rights on a system, and maybe they run some commands in an experimental way that is not related to their role and would be considered unexpected for their account. We might discover that through a type of targeted threat hunting.
We put out a practitioner’s guide to threat hunting meant to be used as a desk reference. [We wanted to give] the practitioner some very specific things that they can do.
We’re not redefining threat hunting; we’re not attempting to prescribe actions or functions, but really trying to help folks understand that it’s not black magic and it’s entirely within their reach, and really hammer home a point that you don’t find what you don’t look for. Organizations are very reliant on technology to give them all the answers; sometimes, we have to leverage technology to find the answers ourselves, and that’s one of the calls to action I think that we attempt to hit in this particular volume.
Some of the things that we go over in this book include what a hunt team looks like, how you can assemble hunt teams from maybe unexpected groups of personnel, how you develop some awareness of what’s in your environment, and where you begin if you have no capability whatsoever.
And, from my experience, those are some of the biggest hurdles that organizations are trying to overcome: Where do I start? What’s square one? And once I have become familiar with square one, where do I go from there?
What kind of data do you start with? And where is the best place to start looking for data, as well as the tooling to get started?
Kerr: We can break down each one of these into their specialized areas. When it comes to data, a lot of organizations look at MITRE’s ATT&CK [Adversarial Tactics, Techniques and Common Knowledge], and one of the things ATT&CK provides is a list of data sources for every single technique.
MITRE, to a degree, was a little biased when they created ATT&CK. Most of their data sources are considered process-based, so when you launch an application on a given operating system, all of the data and metadata associated with that would fall into process-type behavior.
It’s not necessarily limited to just knowing this user wants this application. It’s also things like: Does that application use the network? Does it have specific modules, libraries or plug-ins that support different functionality? But, ultimately, all these things allow us to ask questions about a given environment.
In our BSidesCharm presentation [at the regional Security BSides conference held in Baltimore] in April, Roberto [Rodriguez, adversary detection researcher at SpecterOps] and I discovered that if you look across all of the ATT&CK matrix and you just break down techniques based on common data sources, the majority of those are going to be process-based, and we do have a top 10 list that we released with that presentation.
But MITRE also correlates each of these techniques to one or more threat groups. If you cross-reference the list by threat group and look at the intersection of common techniques and bad actors, at least those we know about, you start to see a reinforcement of that top 10 list. That top 10 list is a great place to start, and the majority of technologies that people deploy for this purpose cover all of those use cases.
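The kind of breakdown Kerr describes, counting techniques per data source and then weighting by threat group, can be sketched in a few lines of Python. The records below are placeholders in an assumed shape, not real ATT&CK content, and the function name is hypothetical; a real analysis would load the published ATT&CK data instead.

```python
from collections import Counter

# Toy stand-in for ATT&CK-style records. Technique names, data sources and
# group labels are placeholders, not real ATT&CK content.
techniques = [
    {"name": "technique-a", "data_sources": ["process", "registry"],
     "groups": ["group-1", "group-2"]},
    {"name": "technique-b", "data_sources": ["process", "network"],
     "groups": ["group-1"]},
    {"name": "technique-c", "data_sources": ["file"], "groups": []},
    {"name": "technique-d", "data_sources": ["process"],
     "groups": ["group-2", "group-3"]},
]


def rank_data_sources(techniques, weight_by_groups=False):
    """Count how many techniques each data source helps detect.

    With weight_by_groups=True, each technique counts once per threat group
    known to use it, so sources tied to widely used techniques rise.
    """
    counts = Counter()
    for technique in techniques:
        weight = len(technique["groups"]) if weight_by_groups else 1
        for source in technique["data_sources"]:
            counts[source] += weight
    return counts.most_common()


print(rank_data_sources(techniques))
print(rank_data_sources(techniques, weight_by_groups=True))
```

Comparing the two rankings shows whether the group-weighted view reinforces the plain count, which is the pattern Kerr reports for the real data.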
Some of the common ones that are not process-related would be things like modification of the registry, general use of the network, or whether a given binary is packed or not, and each one of these things can be extracted through tooling.
I am a threat-agnostic practitioner, so I tend to ignore the fact that a Russian group or a Chinese group or some other group has used a technique. But for those who might be in a specific industry that is targeted by those groups or are just looking for a way to get into threat hunting without losing political capital with leadership, cross-referencing these threat actors is one place that we see organizations start.
From a technology perspective, though, you can get most of this data with free solutions.
On Windows, Sysmon exposes almost all of the top 10 sources of data; it’s part of [Windows] Sysinternals, which is a Microsoft acquisition. If you’re looking at open source solutions, there’s osquery, which was a Facebook project that was open sourced.
But, ultimately, what these things do is they gather the evidence you need and they forward it to some centralized location. That could be a commercial solution or it could be a noncommercial solution. For a lot of organizations, they’re going to be looking for an enterprise commercial solution that does some of the heavy lifting for them, and that’s one of the ways that Endgame has positioned itself in the marketplace.
If we’re talking about just raw capability, most of these open source solutions solve that problem; they give you the ability to aggregate. Free solutions primarily lack usability. They also tend to be a little light when it comes to analytics, requiring the practitioner to do a little bit more of that. And that is the way we see folks using what is commonly referred to as the hypothesis-driven model, where you look at ATT&CK [and] you pick a technique — a good example might be accessibility features.
Accessibility features include things like the sticky keys exploit. When you hit shift five times on a Windows computer, it pops up an accessibility prompt asking if you would like to enable some of these usability features for people who may be vision-impaired. If you alter a specific registry key, that same trigger gives you a sort of poor man’s backdoor into the system.
Well, if we look at that one technique, we can decide that’s a registry modification, that’s a process modification. We can pull that data into a simple query. We’re going to assume that no registry keys in the environment show evidence of this particular exploit and we just prove ourselves wrong. That’s the entire approach.
We make sure that we have full coverage of our Windows fleet, we document the percentage of systems that we’ve covered and we look backward in time, not just at what’s present today. If we have data over, say, the last 30, 60 or 90 days, we can be convinced that this has not happened in the recent past. And if we’re really comfortable with the way that we’ve analyzed this data (and in this case, it’s a very black-and-white type of analytic), we might even automate that collection and most of the analysis so that, ultimately, what the user winds up with is just a yes-or-no question: Is this expected or not?
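As a rough illustration of that hypothesis-driven hunt, the Python sketch below checks registry events for one well-known variant of the accessibility-features technique: a Debugger value set under Image File Execution Options for a binary such as sethc.exe. The event shape (host, time, target_object) is an assumption, loosely modeled on what a registry value-set event from a tool like Sysmon could be normalized into. The fleet list, the sample events and the way coverage is measured are all hypothetical.

```python
from datetime import datetime, timedelta

# Registry path used by the Image File Execution Options (IFEO) variant of
# the technique. A "Debugger" value under one of these binaries launches
# another program in its place.
IFEO = (r"\SOFTWARE\Microsoft\Windows NT\CurrentVersion"
        r"\Image File Execution Options") + "\\"
ACCESSIBILITY_BINARIES = ("sethc.exe", "utilman.exe", "osk.exe",
                          "magnify.exe", "narrator.exe")


def is_suspicious(target_object):
    """True if a registry path sets a Debugger for an accessibility binary."""
    path = target_object.lower()
    if IFEO.lower() not in path or not path.endswith("\\debugger"):
        return False
    return any(f"\\{name}\\" in path for name in ACCESSIBILITY_BINARIES)


def hunt(events, fleet, now, lookback_days=90):
    """Test the hypothesis 'no host shows this registry change'."""
    cutoff = now - timedelta(days=lookback_days)
    recent = [e for e in events if e["time"] >= cutoff]
    # Simplification: a host counts as covered if it reported any event
    # inside the lookback window.
    reporting = {e["host"] for e in recent}
    hits = [e for e in recent if is_suspicious(e["target_object"])]
    return {
        "coverage": len(reporting & set(fleet)) / len(fleet),
        "hits": hits,
        "hypothesis_holds": not hits,
    }


# Hypothetical fleet and events, invented for illustration.
now = datetime(2024, 1, 31)
fleet = ["ws-01", "ws-02", "ws-03", "ws-04"]
events = [
    {"host": "ws-01", "time": datetime(2024, 1, 30),
     "target_object": r"HKLM\SOFTWARE\Vendor\App\Setting"},
    {"host": "ws-02", "time": datetime(2024, 1, 12),
     "target_object": r"HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion"
                      r"\Image File Execution Options\sethc.exe\Debugger"},
    {"host": "ws-03", "time": datetime(2023, 6, 1),
     "target_object": r"HKLM\SOFTWARE\Vendor\App\Setting"},
]
result = hunt(events, fleet, now)
print(result["coverage"], result["hypothesis_holds"])
```

The result carries the coverage figure alongside the yes-or-no answer because a clean result over half the fleet means less than a clean result over all of it.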
If it’s not, you can proceed to incident response. And if it is, then you move on to whatever is next on your list. Most organizations have a daily/weekly/monthly/quarterly approach to threat hunting where different techniques are covered at different points in time over this annual cycle.
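One simple way to picture that cadence is a catalog of hunts, each with an interval, checked against the date it last ran. The Python sketch below is only an assumption about how such a schedule could be tracked; the hunt names and intervals are hypothetical and not from the interview.

```python
from datetime import date

# Hypothetical hunt catalog: hunt name -> cadence in days.
HUNTS = {
    "accessibility-features registry check": 1,
    "unsigned binaries in user-writable paths": 7,
    "new services across servers": 30,
    "full autorun review": 90,
}


def hunts_due(today, last_run, hunts=HUNTS):
    """Return hunts whose cadence has elapsed since their last run.

    Hunts with no recorded run are always due.
    """
    due = []
    for name, every_days in hunts.items():
        previous = last_run.get(name)
        if previous is None or (today - previous).days >= every_days:
            due.append(name)
    return due


# Illustrative run history.
last_run = {
    "accessibility-features registry check": date(2024, 1, 30),
    "unsigned binaries in user-writable paths": date(2024, 1, 28),
    "new services across servers": date(2023, 12, 15),
}
for name in hunts_due(date(2024, 1, 31), last_run):
    print("due:", name)
```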
You covered data and tooling, but how does the critical thinking approach fit in?
Kerr: It begins with data and tooling, and the third thing is a critical thinking approach. You have to be a critical thinker.
How people problem-solve, I think, lends itself directly to threat hunting because it’s anomaly detection. To a degree, the environment informs you.
If you ask a question related to one of these techniques and you find it’s present on every system, that frequency is much too high to be malicious. And if you go through and you find it on only one or two systems, triaging those systems and doing a little bit of a deeper dive generally helps you to understand whether it’s malicious or not.
I would say that the majority of techniques that organizations look for in the shorter timeframes, daily or weekly, tend to be more black and white and involve less analysis or time. And those are the type of tasks that are well-suited for people in an IT type of role, whether they’re splitting time between IT and security or they’re, strictly speaking, systems administrators or engineers. I’ve had great success getting those folks up to speed doing these shorter-term daily, weekly and even to a degree biweekly or monthly types of hunts.