Like security and IT professionals worldwide, Alert Logic was kept busy dealing with the fallout from the discovery of CVE-2021-44228, dubbed Log4Shell, caused by flaws in the Apache Log4j logging library. Now that the dust is beginning to settle, we can start to reflect on the work done. Our coverage so far includes:
- Our emerging threat Knowledge Base Article (KBA) on our website outlines our response on December 10
- A breakdown of the three attack phases involved in successful exploitation
- The valuable work our threat intelligence teams conducted to create new (and identify existing) coverage for the Log4Shell exploit and derivatives
This blog post outlines what we did with the intelligence highlighted above and expands on the actions we took to help our vulnerable and targeted customers respond to Log4Shell, as well as the intra-team synergy that facilitated our response.
Threat Research and Hunting — Casting a Wide Net
We assign analysts, security researchers, and data scientists (collectively referred to as threat hunters) to threat hunting tasks daily. When an emerging threat is discovered, these teams are the first line of defense, because human flexibility and agility let them move more quickly than our content or analytical engines. They shifted their attention to Log4j as soon as it was announced by LunaSec.
Before our researchers and content creators worked on targeted, automated detection, they deployed new wideband log and IDS telemetry signatures and pointed to suitable existing signatures that could catch the Log4j exploits.
This step is essential in enabling our analysts and data scientists to effectively hunt for compromise in our massive threat data lake. Think of the data lake as a giant haystack — the researchers organize the hay into bales and then point out which bales are likely to contain the needle. It is still a difficult and time-consuming task, but when a threat first emerges, you must cast your net wide in order to maximize the probability of catching the exploits and the inevitable derivatives.
Our Threat Hunters reviewed the telemetry manually, identifying failed and successful attempts. Each was fed back to the Researchers and Security Content Creators to improve on our telemetry signatures, reducing the noise and creating more targeted detections.
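The difference between a wideband telemetry signature and a targeted detection can be illustrated with a minimal Python sketch. The two patterns and the classify function below are hypothetical and written only for this example; they are not Alert Logic's actual signatures. The wide pattern flags any lookup expression that looks like it could be building a JNDI string, including common obfuscations, while the targeted pattern matches only the plain exploit form.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Hypothetical patterns for illustration only (assumptions, not production signatures).
# WIDE: any "${...}" expression that mentions jndi or a lookup commonly used to hide it.
WIDE = re.compile(r"\$\{[^}]*(?:jndi|lower|upper|env|::-)", re.IGNORECASE)
# TARGETED: the plain, unobfuscated exploit string with a known protocol.
TARGETED = re.compile(r"\$\{jndi:(?:ldaps?|rmi|dns|iiop)://", re.IGNORECASE)


def classify(log_line: str) -> Optional[str]:
    """Return "targeted", "wide", or None for a single log line."""
    decoded = unquote(log_line)  # attackers often URL-encode the payload
    if TARGETED.search(decoded):
        return "targeted"
    if WIDE.search(decoded):
        return "wide"  # noisy hit that a human hunter should review
    return None
```

Lines that only match the wide pattern are the ones a hunter reviews by hand. Each confirmed hit or false positive can then be used to tighten the patterns, which is the feedback loop described above.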
Hunting dashboards were created to track all the relevant data and shared across the teams so work could continue 24/7.
All IoCs, IoAs, and other useful identifiers were uploaded to our threat tracking database, which holds the indicators from all campaigns and emerging threats. This allows us to track threat actors/groups via activity clusters, so that when one of their TTPs changes, we can still identify them from the additional indicators or TTPs observed.
Good examples would be the attacker infrastructure or favored TTPs used in other stages of the kill chain. In this case, the initial exploit was new, but subsequent steps were not. Tracking this information means we know exactly where to look next, which speeds up analysis and reduces both our mean time to detect and our customers’ mean time to respond.
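As a rough illustration of how shared indicators can tie new activity back to a known cluster, here is a minimal Python sketch. The IndicatorStore class, the cluster name, and the indicator values are all hypothetical; the source does not describe how the threat tracking database is actually built.

```python
from collections import defaultdict


class IndicatorStore:
    """Toy in-memory index from indicator value to activity cluster names."""

    def __init__(self):
        self._clusters_by_indicator = defaultdict(set)

    def add(self, cluster, indicators):
        # Record that each indicator has been seen with this cluster.
        for indicator in indicators:
            self._clusters_by_indicator[indicator].add(cluster)

    def match(self, observed):
        # Map each known cluster to the observed indicators it shares.
        hits = defaultdict(set)
        for indicator in observed:
            for cluster in self._clusters_by_indicator.get(indicator, ()):
                hits[cluster].add(indicator)
        return dict(hits)


store = IndicatorStore()
# Hypothetical cluster with a known C2 address and a known payload hash.
store.add("cluster-A", {"203.0.113.7", "payload-sha256-placeholder"})

# The exploit string is new, but the C2 address has been seen before.
result = store.match({"new-exploit-string", "203.0.113.7"})
```

Here the new exploit string matches nothing, yet the reused infrastructure still links the incident to the hypothetical cluster-A, which tells the analyst which follow-on behavior to look for first.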
Our unique insight into 4000+ customer environments means we have a rich database of unique threat data we can combine with public domain intelligence and proprietary threat intelligence created and curated by our Global Threat Intelligence teams.
Responding to Successful Exploitation
Our analysts escalated incidents, with attached evidence and detailed mitigation and remediation steps, via customers’ preferred notification channels (email, Microsoft Teams, PagerDuty, ServiceNow, etc.).
As compromise was observed, each incident was accompanied by a phone call from the analyst who identified the threat. In addition to receiving the escalated incident report, responders were alerted verbally and walked through the escalation, ensuring response actions were understood and could be executed immediately.
In some scenarios, attackers had gained access via the Log4j exploit and had begun their next actions in the installation and C2 phases. Our analysts hunt down the kill chain to identify the full scope of the compromise and include this in the report and remediation steps. The data tracked in the database means we often know the favored next steps for specific groups, and this information helps us quickly identify whether any further compromise has occurred.
Our analysts used file integrity monitoring (FIM) data. As standard, FIM monitors security-sensitive directories where attackers often embed persistence mechanisms, such as scheduled tasks. This allows us to direct customers to the exact files the attacker modified, which could have re-established the attacker’s access to the machine even after the initial vulnerability was patched.
Our unified console means our analysts and our customers can see threat data, exposure data, and asset metadata together. This unified view allows us to easily correlate across data types, as well as data sources, facilitating comprehensive and speedy analysis and response.
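The general idea behind file integrity monitoring can be sketched in a few lines of Python. This is a simplified baseline-and-compare illustration under our own assumptions, not a description of how any particular FIM product works; the snapshot and diff function names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            relative = str(path.relative_to(root))
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff(baseline, current):
    """Report files added, removed, or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            p for p in baseline.keys() & current.keys()
            if baseline[p] != current[p]
        ),
    }
```

Taking a snapshot of a monitored directory before and after an incident, then calling diff on the two, yields the exact list of files to hand to the responder.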
Responding when Exploitation Hasn’t Been Successful …Yet
Beyond looking for compromise, we also looked in log and IDS data to see if our customers had any products that included the Log4j module and met the other precursors for vulnerability.
Although these customers had not yet been exploited, we reached out to them as well, via an incident escalation and an analyst phone call. Given the severity of the threat, we treated these cases with the same urgency and escalated them at high/critical severity. Analysts explained the nature of the vulnerability and, although no exploitation had been observed, the urgent need to patch or mitigate.
Some customers were in change freezes or could not patch immediately for other reasons. These customers were advised to apply the mitigation of setting ‘formatMsgNoLookups’ to true (it is false by default).
This would buy them some time to patch, but our analysts explained that it should not be seen as an alternative to patching and that they should patch at the next possible opportunity.
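For reference, the setting is exposed as the Java system property log4j2.formatMsgNoLookups (passed as -Dlog4j2.formatMsgNoLookups=true) and as the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS, and it is honored by Log4j 2.10 and later. The Python sketch below shows one way to check whether a given JVM launch has the mitigation applied. The lookups_disabled helper is hypothetical and simplified: it does not handle a property that is supplied more than once.

```python
PROPERTY = "log4j2.formatMsgNoLookups"
ENV_VAR = "LOG4J_FORMAT_MSG_NO_LOOKUPS"


def lookups_disabled(jvm_args, env):
    """Return True if the mitigation flag is set via JVM argument or environment."""
    prefix = f"-D{PROPERTY}="
    for arg in jvm_args:
        # Match the exact property name; compare the value case-insensitively.
        if arg.startswith(prefix) and arg[len(prefix):].strip().lower() == "true":
            return True
    return env.get(ENV_VAR, "").strip().lower() == "true"


# Example: a launch command with the mitigation applied.
mitigated = lookups_disabled(
    ["-Xmx1g", "-Dlog4j2.formatMsgNoLookups=true", "-jar", "app.jar"], {}
)
```

A check like this only confirms that the temporary mitigation is in place; it says nothing about whether the underlying library has been patched.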
Creating Content for Scanning and Automated Detection
The actions taken by our Threat Hunters are essential in providing quick response and detections; however, anything that can be effectively automated should be. Machines are faster than humans, and humans are better used on higher value, adaptive tasks.
So, while the Threat Hunters were manually alerting exploited and vulnerable customers, their output was fed to the content team to create automated detections. Because we have an established threat hunting process, our security content creators can hand telemetry signatures to the hunters while they themselves build content with high enough fidelity to alert automatically with minimal false positives.
It takes time to responsibly create automated detections that limit false positives and, more importantly, minimize false negatives. Having a manual Threat Hunting process serves as a backstop to automated detection and allows us to test and review proposed analytics while our customers are still covered.
We continue to assess the fidelity of deployed analytics and will create automated detections under the 15-minute SLA once we have enough data to be confident in the false positive and false negative rates. At that point, they will be fully enabled.
In order to better detect vulnerable systems in our customer base, numerous vulnerability assessment methods were created and deployed to our scanners. On the 12th of December we rolled out scanning content for our vulnerability scanners.
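One widely used way to find candidate systems is to look inside Java archives for the JndiLookup class that Log4j 2 ships in log4j-core. The Python sketch below illustrates that idea only; it is not the scanning content we deployed, and the function names are hypothetical. It does not unpack archives nested inside other archives, and the presence of the class alone does not prove a system is exploitable, since patched releases still contain it.

```python
import zipfile
from pathlib import Path

# Path of the JNDI lookup class inside log4j-core archives.
JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
ARCHIVE_SUFFIXES = {".jar", ".war", ".ear"}


def contains_jndi_lookup(archive_path):
    """Return True if the archive has JndiLookup.class at the expected path."""
    try:
        with zipfile.ZipFile(archive_path) as archive:
            return JNDI_LOOKUP in archive.namelist()
    except zipfile.BadZipFile:
        return False  # not a readable archive; skip it


def find_candidates(root):
    """List Java archives under root that bundle the JNDI lookup class."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in ARCHIVE_SUFFIXES:
            if contains_jndi_lookup(path):
                hits.append(path)
    return hits
```

Each hit is a candidate for review: the next step is to confirm the bundled Log4j version and whether a patch or mitigation has already been applied.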
Continuing the Hunt
Our researchers and hunters will continue to look for derivatives of the known exploits in anticipation of attackers developing new methods of exploiting the vulnerability to evade existing mitigations or detections.
Our researchers have included Log4j in our existing Java-based project, where we deploy hunting telemetry in anticipation of the next emerging threat. This project has paid off before: when we dealt with the Confluence vulnerability CVE-2019-3396 in 2019, we developed and deployed many general-purpose signatures designed to detect OGNL attacks, and those older signatures caught CVE-2021-26084 in 2021.
It is initiatives like this that meant we had existing hunting telemetry to rely on as soon as Log4j hit our radar.
For Customers with Vulnerable Systems Outside their Alert Logic Protection Scope
Because of the ubiquity of this vulnerability, its severity (10/10), and the likelihood that many of our customers are unaware of Java-based products that bundle Log4j, we sent communications to our entire customer and partner base.
This communication included background information and instructions on how to identify whether you are vulnerable, how to patch, and how to mitigate if patching immediately is not possible. We also explained that the mitigation should only be temporary and that the true remediation is to patch.