How to Execute During Incident Response: OODA for DFIR 2020
We’re at the final post in our OODA and endpoint triage series, where we’ve been talking about using the OODA loop during the investigative process. It’s time to execute (or “Act”) on the decision that we made so that we can answer our investigative questions about the endpoint or server during incident response.
We’ll recap what has been covered thus far and dive into the steps associated with acting.
Let’s quickly review what has happened so far:
- In the Fall, we did a 9-part “Intro to IR” series focused on breaking up big digital investigation questions (IQs) into smaller investigative questions. It focused on topics like answering “Is there evidence of malware persistence mechanisms?” or “Are there suspicious user accounts?”
- This OODA series is about making the numerous decisions that need to occur in that process
- OODA is a decision-making model that is natural to many people but also formalized to help optimize processes
- It consists of four phases that continue to repeat and provide feedback. The idea is to iterate through them faster than your adversary so that you make decisions before they do and answer your IQs quickly.
What is OODA again?
OODA is an acronym with four repeating phases:
- Observe: Observe the situation using a variety of sources that are available. You will never be able to collect all data before making your first decision. In a previous post, we talked about endpoint and network visibility
- Orient: Use your experience and training to make sense of what you are seeing so that you can answer your IQs. Make hypotheses about the answers. This was the focus of an earlier post, and it talked about data overload, confidence values, and alternative hypotheses
- Decide: Based on your goals, orientation, and timing needs, decide on what action to take and when. Actions could include testing a hypothesis about the current state (i.e. get more data) or trying to change the predictions about what will happen next (i.e. thwart a bad guy). We previously looked at how to make that decision; you’ll need to consider your confidence levels and the risk of the action
- Act: Finally, the next step is to act based on the previous decision.
The impact of the action (and of every other stage) should be observed and fed back so that a new orientation can be formed, followed by another decision and action.
Here is a graphic focused on endpoint triage with details that we’ve covered so far:
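To make the feedback cycle concrete, here is a minimal Python sketch of the four phases running as a loop. Everything in it is an assumption made for illustration: the `LoopState` class, the fixed 0.3 confidence added per observation, and the 0.8 threshold are hypothetical and are not part of the OODA model itself.

```python
from dataclasses import dataclass, field


@dataclass
class LoopState:
    """Hypothetical container for what the responder knows so far."""
    observations: list = field(default_factory=list)
    confidence: float = 0.0  # confidence in the current hypothesis, 0.0 to 1.0


def observe(state, new_data):
    # Observe: add whatever data is available now; it is never complete.
    state.observations.append(new_data)
    return state


def orient(state):
    # Orient: placeholder scoring where each observation adds 0.3 confidence.
    state.confidence = min(1.0, 0.3 * len(state.observations))
    return state


def decide(state, threshold=0.8):
    # Decide: keep collecting data until confidence crosses the assumed threshold.
    return "remediate" if state.confidence >= threshold else "get_more_data"


def run_loop(data_sources, threshold=0.8):
    """Iterate Observe -> Orient -> Decide -> Act until a non-collection action is chosen."""
    state = LoopState()
    actions = []
    for data in data_sources:
        state = orient(observe(state, data))
        action = decide(state, threshold)
        actions.append(action)  # Act: this sketch only records the chosen action
        if action != "get_more_data":
            break
    return actions
```

In a real investigation the scoring in `orient` would come from your experience, training, and tools rather than a fixed increment. The point of the sketch is only that each action produces new observations that feed the next pass.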
How to Execute During Incident Response
The mechanics of acting depend a lot on what you have for infrastructure and tools. So, this discussion is going to stay at a fairly high level.
There are several categories of possible actions, including:
- Get More Host or Network Data: Perhaps no conclusions can be drawn yet with high confidence and the action is to get more data via more collections or searches. The techniques and requirements are the same that we outlined in the “Orient” article and require you to have sufficient endpoint visibility
- Research: Sometimes it’s not bits and bytes that you need, but rather it’s more information about a threat, what a given computer is used for, or what role a user has in an organization. This process depends largely on your organization, how device management is performed, and what access you have to company roles
- Isolate: If you believe the system poses a threat to the enterprise, you may choose to isolate it from the network. This can be done either by disabling its network adapters or unplugging it, or by blocking network traffic to and from it. Your ability to do this depends on your remote access to the endpoint and the types of network gear you use. Of course, once you isolate it, you may not be able to get more data from it (but you should have already considered that during the Decide phase)
- Remediate: At various points in the process, you may want to take action to take capabilities away from the attacker. These can include changing passwords on accounts that have been compromised or removing malware from systems. Some organizations will try to surgically remove malware and hope they’ve gotten it all. Other places will simply rebuild a new computer from scratch. Of course, remediation only works when all vectors are blocked. If malware is removed but the attacker still has credentials, they can reinstall it. If the attacker loses an account but still has malware on a system, they may be able to get new password hashes.
There honestly isn’t a lot to talk about here with respect to acting because you should have already evaluated the feasibility and impacts of the action during the previous “Decide” phase.
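The point that remediation only works when every vector is blocked can be sketched as a simple set comparison. This is a hedged illustration, not a feature of any particular tool: the function names and the `"kind:target"` foothold labels are hypothetical.

```python
def remaining_footholds(attacker_footholds, remediated):
    """Return the footholds the attacker still holds after remediation."""
    return sorted(set(attacker_footholds) - set(remediated))


def remediation_complete(attacker_footholds, remediated):
    # Remediation holds only if nothing is left for the attacker to rebuild from.
    return not remaining_footholds(attacker_footholds, remediated)


# Hypothetical incident: malware on one host plus one stolen account.
footholds = ["malware:host1", "credentials:alice"]

# Removing only the malware leaves the stolen credentials in play.
print(remaining_footholds(footholds, ["malware:host1"]))
print(remediation_complete(footholds, ["malware:host1", "credentials:alice"]))
```

The check is only as good as the list of known footholds, which is why the earlier Observe and Orient phases matter before you act.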
Now, you just need to do it.
How to Execute During Incident Response: Automation and Planning
Given that the original goal of OODA was speed, so that fighter pilots could engage faster than their adversary, it’s hard not to think about automation and planning for the various stages.
- Observe: Fast observations come from:
- Technology: Having the visibility tools that quickly collect or search endpoint and network data, access to threat intelligence, and past data
- Training: Knowing what data is important to observe based on previous attacks
- Documentation: Knowing what data exists in your environment that you are able to observe
- Orient: Fast orientation comes from:
- Technology: Having automated tools to score data and integrate with threat intelligence
- Training: Knowing what suspicious and bad data look like and what often happens during attacks
- Documentation: Knowing what is normal in your environment so that you can consider alternative hypotheses in your organization
- Decide: Fast decisions come from:
- Technology: Having tools that give you the largest number of possible actions, including getting more data or remediation
- Training: Knowing what actions are possible and their risks
- Documentation: Having documented procedures (i.e. playbooks) that outline what responders are expected to do in common situations. This allows them to quickly act on that playbook without needing to consider alternatives
- Act: Fast actions come from:
- Technology: Same as above for “Observe” and “Decide.” You need to have infrastructure that gives you visibility and remediation capabilities
- Training: Knowing how to perform the actions
- Documentation: Knowing what approvals are needed for different kinds of actions (such as isolating a computer).
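As a rough illustration of how documented playbooks and approvals speed up acting, the sketch below stores each action with its steps and the approval it needs, so a responder can check at a glance whether they can proceed. The playbook contents, role names, and `plan_action` function are all hypothetical; your organization's procedures and approvers will differ.

```python
# Hypothetical playbook: each action lists the approval it needs (or None) and its steps.
PLAYBOOK = {
    "get_more_data": {
        "approval": None,
        "steps": ["run an additional collection", "search other endpoints"],
    },
    "isolate": {
        "approval": "it_manager",
        "steps": ["block network traffic to and from the host"],
    },
    "remediate": {
        "approval": "system_owner",
        "steps": ["reset compromised passwords", "remove malware or rebuild the host"],
    },
}


def plan_action(action, approvals_granted):
    """Return the steps for an action, or what approval is still missing."""
    entry = PLAYBOOK.get(action)
    if entry is None:
        raise KeyError(f"no playbook entry for action: {action}")
    needed = entry["approval"]
    if needed is not None and needed not in approvals_granted:
        return {"action": action, "status": "blocked", "needs": needed}
    return {"action": action, "status": "ready", "steps": entry["steps"]}
```

For example, `plan_action("isolate", set())` reports that approval from the assumed `it_manager` role is still needed, while `plan_action("get_more_data", set())` is ready immediately.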
Incident response relies on lots of technology and tools, but training and documentation are also critical to making sure you can execute quickly and efficiently.
We’ve completed the series on using the OODA process during an endpoint investigation and we hope that it has helped as a way to frame and organize your process. Training becomes just as critical as having the right visibility and analysis software to ensure the process is fast and thorough.
If you need better endpoint visibility and more automation in endpoint analysis, try out the free Cyber Triage evaluation. It’s an agentless solution that automates the collection and analysis of data.