Monday, 18 April 2022

Auditing Protected Lsass (RunAsPPL) Access using Sysmon

Auditing Lsass access with Sysmon is one of the key settings blue teams use to detect suspicious handle requests and catch behaviour like Mimikatz. It's also well known that a lot of legitimate programs (including native Microsoft services) request process access handles (including PROCESS_VM_READ), which gets very noisy in large-scale deployments.

One exception that comes to mind: what if some of the machines in production have RunAsPPL enabled? RunAsPPL is designed to prevent OpenProcess calls from processes with a lower signature level from accessing Lsass. Thus the volume of Sysmon Event ID 10 events with TargetImage set to lsass.exe shouldn't be a problem, as it's limited to what the Lsa PPL protection allows:

If the SourceProcess is not a protected process, then the only allowed access rights (from a high/system integrity level) are:

  • PROCESS_QUERY_LIMITED_INFORMATION (0x1000)
  • PROCESS_SET_LIMITED_INFORMATION (0x2000)
  • PROCESS_TERMINATE (0x0001)
  • PROCESS_SUSPEND_RESUME (0x0800)
Excluding QUERY_LIMITED_INFORMATION (suspend/terminate is a bit abnormal for lsass anyway), this should be good enough to limit the noise to a SourceImage that is either a protected process (with a signature level >= PsProtectedSignerLsa) or a user-mode process leveraging a kernel-mode driver, such as Process Explorer via PROCEXP152.sys or Process Hacker via kph.sys; those cases should be limited (no major impact on noise, and they can be excluded with minimal effort).
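As a quick sanity check, that allowed-rights constraint can be expressed as a bitmask test on the GrantedAccess field of a Sysmon Event ID 10. A minimal sketch (the constants are the standard winnt.h process access rights):

```python
# Decode a Sysmon Event ID 10 GrantedAccess mask and check whether it stays
# within the rights a PPL-protected Lsass grants to non-protected callers.
# Constants are the documented PROCESS_* access rights from winnt.h.
PROCESS_TERMINATE                 = 0x0001
PROCESS_SUSPEND_RESUME            = 0x0800
PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
PROCESS_SET_LIMITED_INFORMATION   = 0x2000

# Rights a non-protected process can still obtain on a PPL-protected lsass.
PPL_ALLOWED_MASK = (PROCESS_TERMINATE | PROCESS_SUSPEND_RESUME |
                    PROCESS_QUERY_LIMITED_INFORMATION |
                    PROCESS_SET_LIMITED_INFORMATION)

def exceeds_ppl_allowed(granted_access: int) -> bool:
    """True if the mask contains any right a PPL Lsass should have denied."""
    return bool(granted_access & ~PPL_ALLOWED_MASK)

# Typical benign query: QUERY_LIMITED_INFORMATION only.
print(exceeds_ppl_allowed(0x1000))   # False
# Mimikatz-style VM_READ | QUERY_INFORMATION (0x0410) -> suspicious.
print(exceeds_ppl_allowed(0x0410))   # True
```

Anything for which this returns True on a RunAsPPL host falls into the cases discussed below.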

To confirm this behavior, I created a generic Sysmon config to audit all process access to the PPL-protected Lsass:
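A minimal config along these lines would be (the schema version is an assumption; adjust it to your installed Sysmon):

```xml
<Sysmon schemaversion="4.70">
  <EventFiltering>
    <!-- Audit every process that opens a handle to lsass.exe -->
    <ProcessAccess onmatch="include">
      <TargetImage condition="image">lsass.exe</TargetImage>
    </ProcessAccess>
  </EventFiltering>
</Sysmon>
```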

ProcExp leveraging PROCEXP.SYS

As suspected, almost 90% of the volume is related to query_limited_info/set_limited_info; the rest is either malicious or related to a user-mode process that leverages a kernel-mode component:

As can be seen above, even protected processes with a signature level > Lsa-Light (e.g. services.exe) usually don't request a GrantedAccess beyond query_limited_info (even though technically they can get full access). This leads to the following idea: if we receive a Sysmon Event ID 10 with TargetImage set to lsass, the granted access allows reading memory, and RunAsPPL is enabled on that machine, then there is a high chance it is (A) a PPL bypass, (B) a legitimate non-protected user-mode process leveraging a kernel driver (risky too if not controlled, as it can be abused by stealing lsass handles), or (C) a protected process with a signature level >= Lsa-Light (which should be limited).

The question that remains is: how can an analyst differentiate between a Sysmon Event ID 10 (Process Access) coming from an Lsa-protected process and one coming from a normal process?

An ideal scenario would be for Sysmon to add an enrichment to Event ID 10 (at least when the target image is lsass, though it could also be useful for auditing PP tampering in general), something like:

A dirty way of doing it is to create a scheduled task (via GPO) that triggers on a Sysmon Event ID 10 with lsass access (custom XML filter), then runs a command to query the RunAsPPL registry value (I know it can be deleted as a bypass with no effect on the running PPL, but that's an unlikely scenario and can be audited via Sysmon Event ID 12) and writes the result to the event log (every time lsass is accessed); we can then ingest and correlate that event with any process access:
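A custom trigger filter for such a scheduled task could look like the following (standard event-channel XPath syntax; the lsass path assumes the default install location):

```xml
<QueryList>
  <Query Id="0" Path="Microsoft-Windows-Sysmon/Operational">
    <!-- Fire only on Sysmon Event ID 10 targeting lsass.exe -->
    <Select Path="Microsoft-Windows-Sysmon/Operational">
      *[System[(EventID=10)]]
      and
      *[EventData[Data[@Name='TargetImage']='C:\Windows\system32\lsass.exe']]
    </Select>
  </Query>
</QueryList>
```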

Correlating the Lsa RunAsPPL status with Sysmon Process Access events enables us to differentiate between what can be considered a regular false positive (e.g. svchost.exe accessing Lsass with full access rights) and what could be a serious alert, such as a PPL bypass, or a potential bypass vector such as a third-party unprotected process leveraging a kernel driver to obtain access to protected processes.
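Put together, the resulting triage logic can be sketched as follows (the verdict labels and the source_is_protected flag are illustrative assumptions, not Sysmon fields):

```python
# Sketch of the correlation: classify a Sysmon Event ID 10 on lsass once we
# know (from the scheduled-task event) whether RunAsPPL is enabled.
# PPL_ALLOWED_MASK = terminate | suspend/resume | (query|set)_limited_info.
PPL_ALLOWED_MASK = 0x0001 | 0x0800 | 0x1000 | 0x2000

def triage_lsass_access(granted_access: int, runas_ppl: bool,
                        source_is_protected: bool) -> str:
    """Rough verdict for an lsass ProcessAccess event."""
    if granted_access & ~PPL_ALLOWED_MASK == 0:
        return "benign-noise"      # limited rights only, expected volume
    if not runas_ppl:
        return "tune-as-usual"     # no PPL, the classic FP-heavy case
    if source_is_protected:
        return "review"            # signer >= Lsa-Light, should be rare
    return "alert"                 # PPL bypass or kernel-driver proxy

# VM_READ on a RunAsPPL host from a non-protected source -> serious alert.
print(triage_lsass_access(0x1010, runas_ppl=True, source_is_protected=False))
# prints "alert"
```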


Sunday, 20 March 2022

Structured Approach to Triage New Detection Ideas


Triaging new detection ideas is an important aspect of detection engineering, as it allows us to focus on the most important tasks and to optimize the use of our limited resources (both human and technological).

It doesn't have to be perfect, but it needs to minimize the effect of personal preferences and the tendency to chase the most advanced or most recent techniques. To do so, we need a way to assign qualitative scores to certain critical questions; the threshold can be adjusted to your needs and context.

The following diagram tries to summarize the most relevant questions to consider when deciding whether or not to implement a new detection idea. Note that if a detection does not meet the agreed threshold, it can be moved to a hunt or a scheduled report (the goal here is to scope the rules running in near real time, which tend to consume more computing power).

Detecting all known common LOLBINs connecting to the internet seems, at first glance, to be a good idea, but let's take it through the above process:

  • Coverage width is high, since a variety of malware droppers tend to involve some kind of LOLBIN -> DS = 5 (having access to a malware sandbox helps with this point)
  • Performance impact is medium: although the number of LOLBIN binaries is considerably high (at least 25 processes), we are still using only one type of event (network), with no correlation and simple logic (if a network event is detected and the process is in lolbins_list, alert) -> DS = 10
  • It is a critical technique (matches Initial Access & Execution, and partially Defense Evasion too) -> DS = 15
  • Triage experience and noise ratio: a considerable number of LOLBINs connect to the internet for legitimate activities, which makes quick assessment a little harder; the same applies to the FP (false positive) rate -> DS = 15 (no change to the score)
  • Resilience to bypass: low effort, via renaming a LOLBIN process to something else -> DS = 12 (15-3)
So our idea's total score is 12. As it is very close to our defined threshold, we can target the weakest points of the initial idea and adjust it, by working on the evasion aspect, the triage experience, or both. To do so, we can slightly change the initial idea into a correlation of a LOLBIN process execution followed by a network connection. This allows us to improve resilience by using the process's original file name (instead of the process name), and also makes triage and FP tuning more flexible by giving access to both the process arguments and the destination address.
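The post doesn't spell out how the per-criterion scores are combined; one aggregation consistent with the example above (5, 10, 15, 15, 12 giving a total of 12) is the median, which is used here purely as an illustrative assumption:

```python
from statistics import median

# Illustrative scoring sketch: per-criterion detection scores (DS) for the
# "LOLBINs connecting to the internet" idea, aggregated with the median
# (an assumption -- the exact aggregation formula isn't shown in the text).
scores = {
    "coverage_width":     5,
    "performance_impact": 10,
    "criticality":        15,
    "triage_and_noise":   15,
    "bypass_resilience":  12,
}

THRESHOLD = 12  # adjust per your context

total = median(scores.values())
verdict = "implement" if total >= THRESHOLD else "hunt/scheduled report"
print(total, verdict)   # 12 implement
```

Whatever aggregation you pick, keeping the per-criterion scores around makes it easy to see which dimension to improve when an idea lands near the threshold.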

Note that sometimes adjusting the initial logic may require redoing the whole assessment; for instance, in our example the correlation may increase the performance impact on the rule-processing engine, but this is an acceptable (medium) impact, since we limit the correlation to a subset of known processes (LOLBINs).