PowerShell Script Block Logging with Event ID 4104

Most PowerShell-based attacks rely on the same trick: pass a Base64-encoded command, a string concatenation, or a script downloaded at runtime, and hope nothing reads what actually executed. Script block logging (Event ID 4104) defeats that assumption by recording the text of each script block as PowerShell compiles it: an encoded command is logged decoded, and code assembled or downloaded at runtime is logged in the clear once it is handed to the engine to run. This post is the enable-tune-hunt walkthrough for making 4104 useful in a real environment.

Key Takeaways

  • Event ID 4104 in the Microsoft-Windows-PowerShell/Operational channel logs every script block PowerShell compiles, including the decoded form of -EncodedCommand input and any string that is built at runtime and then executed.
  • Full 4104 logging is off by default. Enable it via Group Policy or the equivalent registry value; Step 1 shows both.
  • The default channel size overflows quickly on busy hosts. Raise Microsoft-Windows-PowerShell/Operational to 1 GB at minimum and forward to a central collector.
  • Hunt for the few high-signal patterns first: -EncodedCommand, FromBase64String, DownloadString, and reflective Assembly.Load calls. They produce one or two hits per fleet per day and are almost always worth reading.
  • 4104 pairs with Constrained Language Mode and AMSI — the event sees the code; the other two stop or scan it. Run all three.

Environment

  • Windows 10/11 and Windows Server 2019/2022 endpoints with Windows PowerShell 5.1 or later (script block logging arrived in PowerShell 5.0; 5.1 is the version these operating systems ship).
  • Group Policy or Intune deployment channel to push the registry key fleet-wide.
  • Windows Event Forwarding or a SIEM agent to ship the PowerShell Operational channel off-host. See our WEF setup post.
  • PowerShell 5.1 or 7.4 on the collector for hunting queries.

The Problem

PowerShell is the default execution environment for both legitimate administration and a large share of post-exploitation tooling on Windows. Empire, PowerSploit, Covenant, and most commodity loaders all reach for PowerShell because it is signed, present on every host, and usually allowed by application control. The defences Microsoft added or extended in PowerShell 5.0 (script block logging, AMSI, Constrained Language Mode, transcription) are what make PowerShell defensible. Of those, 4104 is the one that produces the analyst-readable artefact: the actual code that ran, after any encoding layer has been stripped.

The wrinkle is that 4104 is off out of the box, and even when it is on, the channel size is too small to be useful on a busy host. A single web shell that loops every five seconds will fill the default 15 MB log in an afternoon. Detection engineering against 4104 starts with the operational work of enabling it correctly, forwarding it, and writing queries that pick out the few patterns worth alerting on.

The Solution

Step 1 — Enable script block logging

The supported path is Group Policy: Computer Configuration → Administrative Templates → Windows Components → Windows PowerShell → Turn on PowerShell Script Block Logging → Enabled. The "Log script block invocation start / stop events" checkbox underneath generates additional 4105/4106 events for every block — useful for execution chaining, noisy for everything else.

The registry equivalent is one key, scriptable for non-domain endpoints:

# Enable on a single host
$path = 'HKLM:\Software\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging'
New-Item -Path $path -Force | Out-Null
Set-ItemProperty -Path $path -Name 'EnableScriptBlockLogging' -Value 1 -Type DWord

Microsoft's authoritative reference for the logging settings is in the about_Logging_Windows documentation. The setting takes effect on the next PowerShell session — running sessions are not retroactively instrumented.

Step 2 — Size the channel and add transcription

Before traffic builds up, raise the channel size on every endpoint:

wevtutil sl 'Microsoft-Windows-PowerShell/Operational' /ms:1073741824   # 1 GB

For high-value hosts (jump boxes, admin workstations, build servers), enable transcription as well — script block logging captures what was compiled, transcription captures the full input and output as a text file:

$tx = 'HKLM:\Software\Policies\Microsoft\Windows\PowerShell\Transcription'
New-Item -Path $tx -Force | Out-Null
Set-ItemProperty -Path $tx -Name 'EnableTranscripting'   -Value 1 -Type DWord
Set-ItemProperty -Path $tx -Name 'EnableInvocationHeader' -Value 1 -Type DWord
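# Resolve the host name at write time; a literal %COMPUTERNAME% in a REG_SZ value is not guaranteed to be expanded
Set-ItemProperty -Path $tx -Name 'OutputDirectory'       -Value "\\fileserver\pstx`$\$env:COMPUTERNAME" -Type String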

Point transcripts to a write-only network share that the host can append to but not read back. That stops a compromised endpoint from tampering with its own transcript history while keeping the data centralised.

Step 3 — Where the deobfuscated text actually lives

Large script blocks are split across multiple 4104 events. Each fragment carries a ScriptBlockId plus MessageNumber and MessageTotal, which together are enough to stitch the block back together. The source text is in the ScriptBlockText event data field, which is also rendered into the event message body. The crucial point is that PowerShell logs the script as the engine compiles it: passing -EncodedCommand <Base64> produces a 4104 containing the decoded source (the Base64 wraps UTF-16LE text), not the Base64 string the operator typed.

String tricks that happen at runtime (assembling a command from variables, building method names character by character) appear in the first 4104 exactly as written, because that block is logged as source text. Once the assembled string is handed to Invoke-Expression, [scriptblock]::Create, or a nested powershell.exe, it is compiled as a new script block and logged again, this time as the resolved code. Obfuscation that defeats 4104 has to avoid compilation by the engine altogether, which leaves less room than most operators realise.
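The stitching described in this step can be sketched as a small helper. This is a minimal illustration, not a production parser: the function name Join-ScriptBlockFragment is hypothetical, and it assumes each input object already exposes the 4104 event data fields ScriptBlockId, MessageNumber, MessageTotal, and ScriptBlockText (for real events these would first have to be pulled out of the event XML).

```powershell
# Hypothetical helper: rebuild whole script blocks from 4104 fragments.
# Input objects need ScriptBlockId, MessageNumber, MessageTotal, ScriptBlockText.
function Join-ScriptBlockFragment {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object[]]$Fragment
    )
    begin   { $all = [System.Collections.Generic.List[object]]::new() }
    process { foreach ($f in $Fragment) { $all.Add($f) } }
    end {
        foreach ($g in ($all | Group-Object -Property ScriptBlockId)) {
            # Order fragments numerically, then concatenate their text
            $parts = @($g.Group | Sort-Object { [int]$_.MessageNumber })
            $total = [int]$parts[0].MessageTotal
            [pscustomobject]@{
                ScriptBlockId = $g.Name
                Complete      = ($parts.Count -eq $total)   # false if a fragment is missing
                Text          = -join ($parts | ForEach-Object { $_.ScriptBlockText })
            }
        }
    }
}

# Illustrative fragments (made-up IDs and text), deliberately out of order
$sample = @(
    [pscustomobject]@{ ScriptBlockId = 'a1'; MessageNumber = 2; MessageTotal = 2; ScriptBlockText = "put 'hi'" }
    [pscustomobject]@{ ScriptBlockId = 'a1'; MessageNumber = 1; MessageTotal = 2; ScriptBlockText = 'Write-Out' }
    [pscustomobject]@{ ScriptBlockId = 'b2'; MessageNumber = 1; MessageTotal = 3; ScriptBlockText = 'Get-Pro' }
)
$joined = $sample | Join-ScriptBlockFragment
```

The Complete flag matters on a forwarded stream: a block whose fragments were partly lost to log rollover is still worth reading, but should not be treated as the full payload. Duplicate deliveries of the same fragment are not handled here.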

Step 4 — Hunt encoded payloads and download cradles

The single highest-signal query against 4104 is a regex over the message body for known loader strings:

# Encoded-or-downloaded payloads in the last 24h
Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    ProviderName = 'Microsoft-Windows-PowerShell'
    Id        = 4104
    StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue |
    Where-Object {
        # \b stops the -enc prefix from matching -Encoding (Out-File, Set-Content, and similar)
        $_.Message -match '(?i)(-enc(odedcommand)?\b|frombase64string|downloadstring|downloadfile|invoke-webrequest\s+[^|]+\.(ps1|exe|dll))'
    } |
    Select-Object TimeCreated, MachineName,
                  @{Name='User';    Expression={ $_.UserId }},
                  @{Name='Snippet'; Expression={ ($_.Message -split "`n")[0..3] -join ' / ' }}

This will catch the obvious cases. A typical environment produces a handful of hits per day — almost all of them either legitimate admin automation (treat as a tuning opportunity, allowlist by signed cert or known path) or genuinely worth investigating.

Step 5 — Hunt reflective .NET loading

A more advanced loader pattern: load a .NET assembly from a byte array in memory, never touching disk. Cobalt Strike's PowerShell stager, plenty of public C2 frameworks, and a fair number of red-team tools use it:

Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    ProviderName = 'Microsoft-Windows-PowerShell'
    Id        = 4104
    StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue |
    Where-Object {
        $_.Message -match '\[(System\.)?Reflection\.Assembly\]::Load\(' -or
        $_.Message -match 'System\.Reflection\.AssemblyName' -or
        $_.Message -match 'GetDelegateForFunctionPointer'
    } |
    Select-Object TimeCreated, MachineName, Id,
                  @{Name='Snippet'; Expression={ ($_.Message -split "`n")[0..5] -join ' / ' }}

Reflective loading has almost no legitimate use case in modern administration — virtually every hit is worth a closer look at the host. Pair with Sysmon event 10 (process access) against lsass.exe from the same host to chain credential-access tooling.

Step 6 — Watch for AMSI tampering

AMSI (Antimalware Scan Interface) is the user-mode interface, loaded into the PowerShell process as amsi.dll, that lets antivirus inspect script content before it runs. Because it lives in the same process as the attacker's code, operators routinely try to break it by patching amsi.dll in memory or setting amsiInitFailed via reflection. The bypass code itself is a 4104 event:

Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    ProviderName = 'Microsoft-Windows-PowerShell'
    Id        = 4104
    StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue |
    Where-Object {
        $_.Message -match '(?i)amsiInitFailed|amsi\.dll|AmsiUtils|AmsiScanBuffer'
    } |
    Select-Object TimeCreated, MachineName,
                  @{Name='Snippet'; Expression={ ($_.Message -split "`n")[0..5] -join ' / ' }}

Even when the bypass succeeds against AMSI, the code that performed it still had to be compiled as a script block, and that block is logged as a 4104 like any other. The implication is that 4104 records the bypass attempt even in sessions where AMSI itself has been blinded. Expect these strings to be obfuscated in practice; the regex above catches the plain forms.

Frequently Asked Questions

Does script block logging slow PowerShell down?

In benchmarks on modern hardware, the overhead is in the low single digits of percent for typical workloads. Heavy automation that emits very large script blocks (multi-megabyte modules) sees more measurable impact because the channel write is synchronous. For everything else — interactive admin, scheduled tasks, normal scripting — the cost is not user-visible.

Will obfuscation defeat 4104?

Not the common kinds. Base64-encoded commands are decoded before the engine compiles them, so the 4104 contains the recovered source. Character substitution and string concatenation show up as written in the block that builds the string, and again in resolved form when the result is executed. Obfuscation that survives is the rarer kind that builds the eventual code through layered Invoke-Expression chains or runtime AST manipulation, and even then each layer produces its own 4104. The forensic question shifts from "what did this run" to "which 4104 has the final payload."
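For reference, the Base64 layer is nothing more than UTF-16LE text run through standard Base64. The sketch below round-trips a command so an analyst can decode a string lifted from a process command line and compare it with the matching 4104. The two function names are hypothetical helpers, not built-in cmdlets.

```powershell
# Hypothetical helpers: encode and decode the -EncodedCommand format.
function ConvertTo-EncodedCommand {
    param([Parameter(Mandatory)][string]$Command)
    # [Text.Encoding]::Unicode is UTF-16LE, the encoding powershell.exe expects
    [System.Convert]::ToBase64String([System.Text.Encoding]::Unicode.GetBytes($Command))
}

function ConvertFrom-EncodedCommand {
    param([Parameter(Mandatory)][string]$Base64)
    [System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String($Base64))
}

$encoded = ConvertTo-EncodedCommand -Command 'Get-Date'
$decoded = ConvertFrom-EncodedCommand -Base64 $encoded
```

If the decoded text is itself another encoded launcher, repeat the decode; each nested layer should also have its own 4104 on the host.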

What is the relationship between 4103 and 4104?

4103 is module logging — parameters and module member calls. 4104 is script block logging — the actual code text. 4103 tells you that Invoke-WebRequest was called with a particular URL parameter; 4104 tells you the full surrounding script. Both are useful; 4104 is the higher-signal of the two for hunting.

Why are some 4104 events flagged as Warning level?

PowerShell promotes a 4104 to Warning when the parser detects content matching its built-in suspicious-strings list (encoded commands, known reflection patterns, AMSI bypass strings). Filtering on LevelDisplayName = 'Warning' is a cheap pre-filter that surfaces the highest-value events without any custom regex.
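In a Get-WinEvent hashtable filter, the Warning pre-filter is the Level key, where 3 is the standard Windows value for Warning. A minimal sketch, assuming the same ForwardedEvents collector log used in the hunting queries; the function name New-SuspiciousBlockFilter is hypothetical.

```powershell
# Hypothetical helper: build a filter for Warning-level 4104 events only.
function New-SuspiciousBlockFilter {
    param(
        [string]$LogName = 'ForwardedEvents',   # assumption: events land in the default forwarded log
        [int]$Hours = 24
    )
    @{
        LogName      = $LogName
        ProviderName = 'Microsoft-Windows-PowerShell'
        Id           = 4104
        Level        = 3                        # 3 = Warning
        StartTime    = (Get-Date).AddHours(-$Hours)
    }
}

$filter = New-SuspiciousBlockFilter -Hours 24

# On the collector:
# Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
#     Select-Object TimeCreated, MachineName, Message
```

Because the level is evaluated by the event log service rather than in a Where-Object stage, this is also cheaper than the regex queries on a large forwarded log.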

Should I also enable transcription on every endpoint?

Workstation-wide transcription generates a significant volume of small text files and a corresponding storage and retention cost. The pragmatic split is to enable transcription on jump boxes, admin workstations, and high-value servers — places where every interactive session is worth keeping — and rely on 4104 alone for the broader fleet. Transcription pays off most when an analyst needs to read the exact output an operator saw, which is rare outside incident response.

Conclusion

PowerShell Script Block Logging is one of the cheapest, most analyst-friendly Windows detection events available, and one of the most commonly left disabled. Enable 4104 via Group Policy, raise the channel size, forward it off-host, and run the three regex queries above against the centralised stream. The signal-to-noise ratio is high enough that the alerts produced are mostly worth reading.

Combined with audit policy, Windows Event Forwarding, and Sysmon, 4104 closes the last big visibility gap in the standard Windows detection stack. Attackers who rely on PowerShell are loud against it; the work is making sure the channel is on, sized, and shipped.

Related Posts

Detecting Kerberoasting with Windows Event ID 4769

Kerberoasting (MITRE ATT&CK T1558.003) is one of the few credential-access techniques that produces a clean, on-prem audit signal — provided the right event is enabled and the right field is read. Detecting Kerberoasting with Event ID 4769 comes down to two things: alerting on RC4-HMAC service ticket requests in an environment that should be running AES, and watching for bursts of ticket requests against many SPNs from a single account. This post is the detection and hardening pair we use on the domain controllers we monitor.

Key Takeaways

  • Event ID 4769 on domain controllers records every Kerberos service ticket request. The Ticket Encryption Type field is the primary detection signal: 0x17 means RC4-HMAC, the type offline cracking tools prefer because it is by far the fastest to attack.
  • Any modern Active Directory environment should rarely see RC4 service tickets. Treat 0x17 against domain user SPNs as anomalous until proven legitimate.
  • A burst of 4769s — one user requesting tickets for many distinct SPNs in a short window — is the classic Kerberoasting pattern, with or without RC4.
  • Hardening beats detection: set msDS-SupportedEncryptionTypes to AES-only on service accounts, migrate to Group Managed Service Accounts (gMSAs), and deploy a honey SPN for high-fidelity alerting.
  • 4769 is logged on the issuing domain controller, not the client. Centralise the Security log from every DC; one DC's events are not enough.

Environment

  • Active Directory domain at Windows Server 2016 functional level or higher.
  • Windows Server 2019/2022 domain controllers with Advanced Audit Policy applied via Group Policy.
  • 4769 events forwarded from every DC via Windows Event Forwarding or a SIEM agent.
  • PowerShell 5.1 or 7.4 for ad-hoc analysis on the collector.
  • RSAT Active Directory tooling for service-account hardening tasks.

The Problem

Kerberoasting works because Kerberos is doing exactly what it was designed to do. Any authenticated domain user can request a service ticket (TGS) for any account that has a Service Principal Name (SPN). Part of that ticket is encrypted with a key derived from the target account's password. Older or misconfigured accounts produce RC4-HMAC tickets, which can be cracked offline at billions of guesses per second on a modern GPU. AES-encrypted tickets can still be attacked offline, but each guess is orders of magnitude slower, which is why attackers explicitly request RC4 even on AES-capable accounts when they can.

The detection challenge is volume. 4769 fires for every service ticket request in the domain — Outlook to Exchange, SCCM to its database, end-user RDP, every internal web app. A single DC issues thousands of 4769s per minute. The trick is filtering down to the small number of requests that have the shape of an attack.

The Solution

Step 1 — Enable Kerberos service ticket auditing

Under Advanced Audit Policy → Account Logon, enable both Success and Failure for Audit Kerberos Service Ticket Operations. Apply via the Default Domain Controllers Policy so every DC inherits the same configuration:

# Verify on a DC
auditpol /get /subcategory:"Kerberos Service Ticket Operations"

Without this subcategory enabled, 4769 will never fire and the rest of this post is moot. Confirm at least one DC is logging the events before scaling out the detection.

Step 2 — Anatomy of a 4769 event

The fields that matter for detection:

  • Account Name — the user requesting the ticket. Will appear as USERNAME@DOMAIN.LOCAL.
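  • Service Name — the account the ticket was requested for, recorded as an account name rather than the full SPN string. For Kerberoasting, this will be a domain user account name (not a computer or krbtgt).
  • Ticket Options — a flags field. 0x40810000 decodes to Forwardable, Renewable, Canonicalize; 0x40810010 adds Renewable-ok. Both are common in normal traffic, so the field is context rather than a signal on its own.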
  • Ticket Encryption Type — the heart of the detection. Common values: 0x12 (AES256-CTS-HMAC-SHA1-96), 0x11 (AES128), 0x17 (RC4-HMAC), 0x18 (RC4-HMAC-EXP).
  • Client Address — the source IP of the requester. Useful for narrowing the actor.
  • Failure Code — 0x0 for successful issuance; non-zero for errors.
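The two hex fields above are easier to triage once decoded. The sketch below maps Ticket Encryption Type values to names and expands Ticket Options into its Kerberos flag names. The function and variable names are hypothetical, and the flag table is a subset of the KDC options, not the full list.

```powershell
# Kerberos numbers option bits from the most significant end: bit 0 is 0x80000000.
$KdcOptionBits = @{
    1  = 'Forwardable'
    2  = 'Forwarded'
    3  = 'Proxiable'
    4  = 'Proxy'
    8  = 'Renewable'
    15 = 'Canonicalize'
    27 = 'Renewable-ok'
    28 = 'Enc-tkt-in-skey'
    30 = 'Renew'
    31 = 'Validate'
}

$EncTypeNames = @{
    0x11 = 'AES128-CTS-HMAC-SHA1-96'
    0x12 = 'AES256-CTS-HMAC-SHA1-96'
    0x17 = 'RC4-HMAC'
    0x18 = 'RC4-HMAC-EXP'
}

# Hypothetical helper: list the flag names set in a Ticket Options value.
function Get-TicketOptionFlag {
    param([Parameter(Mandatory)][long]$Value)
    foreach ($entry in ($KdcOptionBits.GetEnumerator() | Sort-Object Key)) {
        $mask = [long]1 -shl (31 - $entry.Key)
        if ($Value -band $mask) { $entry.Value }
    }
}

# Hypothetical helper: name a Ticket Encryption Type value.
function Get-EncTypeName {
    param([Parameter(Mandatory)][long]$Value)
    $name = $EncTypeNames[[int]$Value]   # cast so the lookup matches the Int32 keys
    if ($name) { $name } else { 'Unknown (0x{0:X})' -f $Value }
}

$flags   = Get-TicketOptionFlag -Value 0x40810010
$encName = Get-EncTypeName -Value 0x17
```

Adding these as calculated properties in a Select-Object stage makes query output readable without a lookup table at hand.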

Step 3 — Alert on RC4 service tickets

Modern domain members negotiate AES by default when the target account supports it. RC4 service tickets in an AES-capable environment fall into a few legitimate buckets — pre-Windows-Server-2008 trusts, accounts with msDS-SupportedEncryptionTypes unset or explicitly RC4 — and one illegitimate bucket: Kerberoasting tools forcing the encryption type down to make the ticket crackable.

# Surface RC4 service tickets issued in the last 24h, excluding machine accounts
Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    Id        = 4769
    StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue |
    Where-Object {
        ($_.Properties[5].Value -eq '0x17' -or $_.Properties[5].Value -eq '0x18') -and
        ($_.Properties[2].Value -notmatch '\$$')
    } |
    Select-Object TimeCreated, MachineName,
                  @{Name='User';     Expression={ $_.Properties[0].Value }},
                  @{Name='Service';  Expression={ $_.Properties[2].Value }},
                  @{Name='ClientIP'; Expression={ $_.Properties[6].Value }},
                  @{Name='EncType';  Expression={ $_.Properties[5].Value }}

The -notmatch '\$$' filter drops machine accounts (Kerberoasting targets user accounts with SPNs, not computer accounts). Whatever survives this query should be a short list — investigate every entry.

Step 4 — Alert on ticket-request bursts

Attackers that cannot force RC4 will still leave a behavioural fingerprint: one principal requesting tickets for an unusually large number of distinct SPNs in a short window. The query is shape-based and works regardless of encryption type:

# Same user requesting tickets for many SPNs in an hour
Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    Id        = 4769
    StartTime = (Get-Date).AddHours(-1)
} -ErrorAction SilentlyContinue |
    Where-Object { $_.Properties[2].Value -notmatch '\$$' -and
                   $_.Properties[2].Value -notmatch 'krbtgt' } |
    Group-Object { $_.Properties[0].Value } |
    ForEach-Object {
        [pscustomobject]@{
            User           = $_.Name
            DistinctSPNs   = @($_.Group | ForEach-Object { $_.Properties[2].Value } |
                              Sort-Object -Unique).Count
            Total          = $_.Count
        }
    } |
    Where-Object DistinctSPNs -gt 20 |
    Sort-Object DistinctSPNs -Descending

Tune the threshold to the environment — 20 distinct SPNs per hour from one user is loud in most domains and quiet in a few. Legitimate hits are typically service accounts running discovery tooling (vulnerability scanners, asset inventories, monitoring agents). Allowlist by account name once those are identified.

Step 5 — Deploy a honey SPN

The highest-fidelity Kerberoasting detection is a decoy. Create a domain user account that no legitimate service ever talks to, register a plausible SPN against it, and alert on every 4769 issued for that account. Enumerating SPNs over LDAP (for example, a curious admin running Get-ADUser -Filter * -Properties servicePrincipalName) does not request a ticket and does not trip the alert. The false-positive sources to plan around are the tools that do request tickets for everything they find: sanctioned red-team runs and some vulnerability scanners.

# Create the decoy
$pw = -join ((33..126) | Get-Random -Count 64 | ForEach-Object { [char]$_ })
New-ADUser -Name 'svc-backup-sql' `
           -SamAccountName 'svc-backup-sql' `
           -AccountPassword (ConvertTo-SecureString $pw -AsPlainText -Force) `
           -Enabled $true `
           -Description 'Service account — do not modify'

setspn -S MSSQLSvc/backup-sql.example.local:1433 svc-backup-sql

# Keep the decoy stable: the account cannot change its own password and the password never expires
Set-ADUser -Identity svc-backup-sql -CannotChangePassword $true -PasswordNeverExpires $true

Then create a SIEM rule that alerts on any 4769 whose Service Name is the decoy account (svc-backup-sql here; 4769 records the account the SPN maps to, not the SPN string). Nothing legitimate ever requests a ticket for it, so every match deserves immediate triage.
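Where no SIEM rule engine is available, the same check can run on the collector with a server-side XPath filter. This is a sketch under assumptions: the function name New-ServiceTicketXPath is hypothetical, the events are assumed to land in ForwardedEvents, and the value passed in should be whatever a test ticket request against the decoy actually puts in the Service Name field (typically the account name, svc-backup-sql in this example).

```powershell
# Hypothetical helper: XPath for 4769 events naming one service account.
function New-ServiceTicketXPath {
    param([Parameter(Mandatory)][string]$ServiceName)
    # Refuse quotes rather than try to escape them inside the XPath literal
    if ($ServiceName -match '[''"]') {
        throw 'ServiceName must not contain quote characters'
    }
    "*[System[EventID=4769]] and *[EventData[Data[@Name='ServiceName']='$ServiceName']]"
}

$xpath = New-ServiceTicketXPath -ServiceName 'svc-backup-sql'

# On the collector:
# Get-WinEvent -LogName 'ForwardedEvents' -FilterXPath $xpath -ErrorAction SilentlyContinue
```

XPath string comparison is case-sensitive, so match the casing the domain controller logs.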

Step 6 — Harden service accounts

Detection is reactive; the hardening below removes the technique outright for any account it covers:

  • Force AES on service accounts. Set msDS-SupportedEncryptionTypes to 0x18 (AES128 + AES256). Tickets issued to those accounts will no longer be RC4 regardless of what the client requests.
  • Use Group Managed Service Accounts. A gMSA has a 240-byte randomly generated password that AD rotates automatically (every 30 days by default). The password is never typed, never stored, and cannot be cracked at any practical speed. Migrate any service that supports gMSAs.
  • Long passwords on remaining accounts. For services that do not support gMSAs, set a 25+ character random password. RC4 cracking against a 25-character password is computationally infeasible regardless of GPU budget.
  • Remove unused SPNs. SPNs on accounts that no longer host the service are pure attack surface. Audit servicePrincipalName against actual running services annually.

# Set AES-only on a service account
Set-ADUser -Identity svc-sql-prod -Replace @{ 'msDS-SupportedEncryptionTypes' = 24 }

# List all SPNs in the domain for review
Get-ADUser -Filter * -Properties servicePrincipalName |
    Where-Object servicePrincipalName |
    Select-Object SamAccountName, @{N='SPNs'; E={ $_.servicePrincipalName -join '; ' }}
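When reviewing accounts, the raw msDS-SupportedEncryptionTypes integer is a bitmask, and it is easy to confuse the attribute value 0x18 (AES128 + AES256) with the ticket encryption type 0x18 (RC4-HMAC-EXP). A small decoder helps. Both function names below are hypothetical helpers.

```powershell
# Hypothetical helper: expand msDS-SupportedEncryptionTypes into names.
function Get-SupportedEncryptionType {
    param([Parameter(Mandatory)][int]$Value)
    $bits = [ordered]@{
        'DES-CBC-CRC' = 0x1
        'DES-CBC-MD5' = 0x2
        'RC4-HMAC'    = 0x4
        'AES128'      = 0x8
        'AES256'      = 0x10
    }
    foreach ($name in $bits.Keys) {
        if ($Value -band $bits[$name]) { $name }
    }
}

# Hypothetical helper: true only when at least one AES bit is set and no DES/RC4 bit is.
function Test-PinnedToAes {
    param([Parameter(Mandatory)][int]$Value)
    [bool](($Value -band 0x18) -and -not ($Value -band 0x7))
}

$types  = Get-SupportedEncryptionType -Value 24
$pinned = Test-PinnedToAes -Value 24
```

A value of 0 (or an unset attribute) returns no names and is not pinned to AES; treat those accounts as candidates for the explicit setting above.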

Frequently Asked Questions

Why does Event ID 4769 fire so often even in a quiet domain?

4769 is the standard Kerberos service ticket flow — every domain client requests one for every service it talks to, then caches it for the ticket lifetime (10 hours by default). Outlook, file shares, SQL connections, RDP, and internal web apps all generate 4769s constantly. Volume is normal; the goal is filtering on encryption type and request shape, not on overall count.

Can I detect Kerberoasting without forwarding logs from every domain controller?

Not reliably. A client can request a service ticket from any DC the domain resolves; observing only one DC misses tickets issued by the others. For consistent coverage, forward the Security log from every DC to a single collector or SIEM and run detections against the merged stream.

What encryption type values should I expect in a healthy domain?

0x12 (AES256) is the modern default for tickets issued to AES-capable accounts. 0x11 (AES128) appears for older or differently configured accounts. 0x17 (RC4-HMAC) should be rare and should map to a known list of legacy accounts. 0x18 (RC4-HMAC-EXP) is exceptional and worth investigating wherever it shows up.

Do gMSAs eliminate Kerberoasting entirely?

For the accounts they cover, effectively yes. The 240-byte random password is rotated automatically and cannot be brute-forced at any realistic speed. gMSAs do not retroactively protect accounts that still hold cracking-feasible passwords, so the migration is the work; the protection is automatic once it lands.

Will Kerberoasting still be detectable if the attacker uses AES?

The encryption-type signal goes away, but the behavioural signal does not. Tools that request tickets for every SPN they enumerate still produce the burst pattern in Step 4 — one principal asking for many distinct SPNs in a short window. The honey SPN in Step 5 also fires regardless of encryption type. Defense in depth matters here precisely because the easiest detection can be bypassed.

Conclusion

Kerberoasting is the rare offensive technique where the protocol-level signal is unambiguous if the right audit is on. Enable Kerberos Service Ticket Operations auditing, forward 4769s from every DC, alert on RC4 issuance to non-machine accounts, and add a honey SPN for high-fidelity coverage. Then do the unglamorous half of the work: AES-only encryption types, gMSAs where possible, and long random passwords on whatever remains.

The detections in this post are not novel — they are the same patterns the public detection-engineering community has published since 2016. What makes them effective is having them on, having them centralised, and having the hardening done so the alerts that fire actually mean something.

Related Posts

Sysmon Configuration for Windows Security Monitoring

Native Windows auditing covers a surprising amount of ground, but it has known gaps: no file hashes on process creation, no outbound network connections, no LSASS access telemetry, and no built-in DNS query log. Sysmon configuration for security monitoring closes most of those gaps and is the single highest-value addition to a Windows endpoint detection stack after audit policy itself. This post is the deployment we use internally — install, baseline config, scaled rollout, and the event IDs that actually carry signal.

Key Takeaways

  • Sysmon is a free Sysinternals tool that augments the Windows event log with process, network, registry, file, image-load, and DNS telemetry — none of which native auditing covers well out of the box.
  • The configuration file is what matters. A Sysmon install with no config falls back to defaults (SHA-1 process hashes, no network monitoring) and misses most of what detection needs; a tuned config (SwiftOnSecurity or Olaf Hartong's modular project) is the sane starting point.
  • Deploy via a Group Policy scheduled task or Intune script so updates flow through the same channel as the rest of the fleet's tooling.
  • Event IDs 1 (process), 3 (network), 10 (process access), 11 (file create), and 22 (DNS query) cover the majority of high-value detection use cases.
  • Forward the Microsoft-Windows-Sysmon/Operational channel to a Windows Event Collector or SIEM. Sysmon on its own is per-host; centralised logs are what make it useful for a fleet.

Environment

  • Windows 10/11 and Windows Server 2019/2022 endpoints.
  • Sysmon v15 for Windows (the current major version as of writing — Sysmon for Linux is a separate project with a different event schema).
  • An existing Group Policy or Intune deployment channel to push binaries and config updates.
  • A Windows Event Collector (WEC) or SIEM to receive forwarded Sysmon events. See our Windows Event Forwarding setup guide for the collector side.
  • PowerShell 5.1 or 7.4 for hunting queries on the collector.

The Problem

Built-in Windows auditing is good at telling you that a process started (4688), that an account logged on (4624), or that an AD object changed (5136). It is less good at the things modern detection actually depends on: the SHA-256 of the binary that ran, the IP it then connected to, the DLL it loaded that did not match a Microsoft signature, the handle it requested on the LSASS process. Some of that is available with extra audit subcategories, command-line inclusion settings, or Object Access auditing, but the coverage is uneven and the events are designed for compliance more than detection.

Sysmon was written by Sysinternals (Mark Russinovich and Thomas Garnier) to fill those gaps. It installs as a kernel driver plus a user-mode service, hooks the events of interest at the source, and writes them to a dedicated event channel. Crucially, it is free, signed by Microsoft, supported on every modern Windows version, and produces telemetry that maps cleanly to MITRE ATT&CK techniques. The trade-off is that it ships with no tuned configuration: running sysmon.exe -i with no config file falls back to default settings (process images hashed with SHA-1, no network monitoring), which omits most of the telemetry this post relies on. The configuration file is the entire product.

The Solution

Step 1 — Install Sysmon on a test endpoint

Download Sysmon from the Sysinternals page on Microsoft Learn. Verify the signature, then install with a configuration file in one step:

# Run elevated
sysmon.exe -accepteula -i sysmonconfig.xml

# Confirm install
Get-Service Sysmon* | Format-Table Name, Status, StartType

The installer registers the driver, starts the service, and begins writing to the Microsoft-Windows-Sysmon/Operational channel. To update the config later, run sysmon.exe -c sysmonconfig.xml — no reinstall, no reboot. The channel is small by default; bump it before traffic builds up:

wevtutil sl Microsoft-Windows-Sysmon/Operational /ms:1073741824   # 1 GB

Step 2 — Pick a configuration baseline

Two community baselines cover virtually every production Sysmon deployment in the wild:

  • SwiftOnSecurity/sysmon-config — a single curated XML, heavily commented, conservative on volume. The right starting point for most environments.
  • olafhartong/sysmon-modular — a modular framework where rules live in separate files per ATT&CK technique. More work to assemble, but cleaner to maintain and easier to map to detection coverage.

Both projects are open source and well-maintained. We use SwiftOnSecurity's baseline as the floor and pull additional modules from sysmon-modular for areas the baseline omits — most notably credential-access (LSASS handle requests) and lateral-movement (named pipe creation) coverage. The rule of thumb is to enable everything except event ID 7 (image load) initially, then turn 7 on for a defined set of high-value processes once log volume is understood.

Step 3 — Deploy and update across the fleet

At scale, copy the Sysmon binary and config to a known location and trigger install or reconfigure on every endpoint. A Group Policy scheduled task is the simplest mechanism that does not require additional tooling:

# Idempotent deployment script
$bin    = '\\dfs\sysmon\Sysmon64.exe'
$config = '\\dfs\sysmon\sysmonconfig.xml'

if (Get-Service Sysmon* -ErrorAction SilentlyContinue) {
    & $bin -c $config
} else {
    & $bin -accepteula -i $config
}

Wrap that in a scheduled task that runs daily as SYSTEM, target it via GPO at every domain-joined endpoint, and config changes propagate within 24 hours of being copied to the share. Intune Win32 app deployment works the same way for cloud-managed endpoints. Whichever channel is used, the binary and configuration both need version pinning — Sysmon's schema occasionally adds new fields, and a config authored against schema 4.90 will fail to load on a host running an older binary.
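One optional refinement to the daily task is to reconfigure only when the config file has actually changed, which keeps the Sysmon log free of a config-change event every day. A minimal sketch, assuming a local stamp file that records the last applied hash; the function name and the stamp file location are hypothetical.

```powershell
# Hypothetical helper: compare the config's SHA-256 with the last applied hash.
function Test-SysmonConfigChanged {
    param(
        [Parameter(Mandatory)][string]$ConfigPath,
        [Parameter(Mandatory)][string]$StampPath   # assumption: a small local text file
    )
    $current  = (Get-FileHash -Path $ConfigPath -Algorithm SHA256).Hash
    $previous = if (Test-Path -Path $StampPath) {
        # [string] cast keeps an empty stamp file from producing $null
        ([string](Get-Content -Path $StampPath -Raw)).Trim()
    } else {
        ''
    }
    [pscustomobject]@{
        Changed = ($current -ne $previous)
        Hash    = $current
    }
}

# Inside the deployment script, after the service check:
# $state = Test-SysmonConfigChanged -ConfigPath $config -StampPath "$env:ProgramData\sysmon-config.sha256"
# if ($state.Changed) {
#     & $bin -c $config
#     Set-Content -Path "$env:ProgramData\sysmon-config.sha256" -Value $state.Hash
# }
```

Write the stamp only after the reconfigure has succeeded, so a failed run is retried the next day.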

Step 4 — The Sysmon event IDs that carry signal

Sysmon v15 defines event IDs 1 through 29, plus 255 for Sysmon's own errors. A much smaller subset is where most detections live:

  • 1 — Process creation. Includes the file hash (SHA-256 when the config's HashAlgorithms setting asks for it), parent process, full command line, and integrity level. The single most valuable Sysmon event.
  • 3 — Network connection. Outbound TCP/UDP including destination IP, port, and the process that initiated it.
  • 7 — Image loaded (DLL load). High volume; enable selectively for processes like lsass.exe, winlogon.exe, and Office binaries.
  • 8 — CreateRemoteThread. Classic process injection primitive.
  • 10 — Process access. The GrantedAccess field is where LSASS credential dumping shows up (values around 0x1010 or 0x1410).
  • 11 — File create. Useful for staging detections (executables written to %TEMP%, %APPDATA%, or web-shell paths).
  • 12 / 13 / 14 — Registry events. Pair with autorun and persistence keys.
  • 22 — DNS query. Every resolution made by every process — the single best telemetry source for C2 callback detection.
  • 25 — Process tampering. Process hollowing and image replacement.
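The GrantedAccess values quoted for event 10 are bitmasks of standard Windows process access rights, and decoding them shows why 0x1010 and 0x1410 matter: both include the right to read the target's memory. A small decoder, with a hypothetical function name:

```powershell
# Hypothetical helper: expand a Sysmon event 10 GrantedAccess mask into right names.
function Get-ProcessAccessRight {
    param([Parameter(Mandatory)][int]$GrantedAccess)
    $rights = [ordered]@{
        'PROCESS_TERMINATE'                 = 0x0001
        'PROCESS_CREATE_THREAD'             = 0x0002
        'PROCESS_VM_OPERATION'              = 0x0008
        'PROCESS_VM_READ'                   = 0x0010
        'PROCESS_VM_WRITE'                  = 0x0020
        'PROCESS_DUP_HANDLE'                = 0x0040
        'PROCESS_CREATE_PROCESS'            = 0x0080
        'PROCESS_SET_QUOTA'                 = 0x0100
        'PROCESS_SET_INFORMATION'           = 0x0200
        'PROCESS_QUERY_INFORMATION'         = 0x0400
        'PROCESS_SUSPEND_RESUME'            = 0x0800
        'PROCESS_QUERY_LIMITED_INFORMATION' = 0x1000
    }
    foreach ($name in $rights.Keys) {
        if ($GrantedAccess -band $rights[$name]) { $name }
    }
}

$r1010 = Get-ProcessAccessRight -GrantedAccess 0x1010
$r1410 = Get-ProcessAccessRight -GrantedAccess 0x1410
```

Any mask against lsass.exe that includes PROCESS_VM_READ is worth a look, not only the two well-known values.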

Step 5 — Forward Sysmon to the collector

Add Microsoft-Windows-Sysmon/Operational to the subscription XML on the Windows Event Collector. The simplest path is a second Select path inside an existing baseline subscription:

<Select Path="Microsoft-Windows-Sysmon/Operational">
  *[System[(EventID=1 or EventID=3 or EventID=7 or EventID=8 or
            EventID=10 or EventID=11 or EventID=22 or EventID=25)]]
</Select>

Sysmon volume is significant — a single workstation can push 500 to 2,000 events per minute depending on which IDs are enabled. Plan for a separate forwarded log if the collector is serving more than a handful of endpoints; the WEF setup post covers custom event channels for exactly this case.

Step 6 — Hunting queries

A few queries that pay off the first time they run against forwarded Sysmon data:

# Event 10 — handle requests on lsass.exe with credential-dumping access masks
Get-WinEvent -FilterHashtable @{
    LogName = 'ForwardedEvents'
    ProviderName = 'Microsoft-Windows-Sysmon'
    Id = 10
    StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -match 'TargetImage:.*lsass\.exe' -and
                   $_.Message -match 'GrantedAccess:\s*0x1(0|4)10' } |
    Select-Object TimeCreated, MachineName,
                  @{Name='SourceImage'; Expression={
                      ($_.Message -split "`n" | Where-Object { $_ -match 'SourceImage' }) -replace '.*: '
                  }}

# Event 22 — DNS queries from processes that should not be resolving names
Get-WinEvent -FilterHashtable @{
    LogName = 'ForwardedEvents'
    ProviderName = 'Microsoft-Windows-Sysmon'
    Id = 22
    StartTime = (Get-Date).AddHours(-1)
} -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -match 'Image:.*\\(certutil|bitsadmin|powershell|regsvr32)\.exe' } |
    Select-Object TimeCreated, MachineName,
                  @{Name='Query'; Expression={
                      ($_.Message -split "`n" | Where-Object { $_ -match 'QueryName' }) -replace '.*: '
                  }}

These are the kinds of queries that produce one or two hits per fleet per day and almost always warrant attention when they fire.
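
Matching on the rendered Message works, but it depends on the message text surviving forwarding intact. As a hedged alternative, the sketch below reads the named EventData fields from the event XML instead. ConvertFrom-EventXml is a hypothetical helper name; the field names are whatever the event itself carries.

```powershell
# Turn the EventData section of an event's XML into an object with one
# property per named Data element. Hypothetical helper for illustration.
function ConvertFrom-EventXml {
    param([Parameter(Mandatory)][string]$Xml)

    $doc = [xml]$Xml
    $fields = [ordered]@{}
    foreach ($data in $doc.Event.EventData.Data) {
        # Name is the Data element's attribute, '#text' its content
        $fields[$data.Name] = $data.'#text'
    }
    [pscustomobject]$fields
}
```

With that in place, the event 10 filter can test fields directly, for example Get-WinEvent ... | ForEach-Object { ConvertFrom-EventXml -Xml $_.ToXml() } | Where-Object { $_.TargetImage -like '*\lsass.exe' -and $_.GrantedAccess -match '^0x1[04]10$' }.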

Frequently Asked Questions

Do I still need native Windows auditing if I have Sysmon?

Yes. Sysmon does not replace the Security log; it sits next to it. Authentication events (4624, 4625), account management (4720, 4728), AD object changes (5136), and audit log clearing and audit policy changes (1102, 4719) all live in the Security channel and have no Sysmon equivalent. Sysmon adds process, network, image, and DNS telemetry; the Security log keeps everything else. Run both, forward both.

Will Sysmon affect endpoint performance?

On modern hardware, the user-visible impact is minimal — Sysmon hooks the events in kernel mode and writes to a dedicated channel, so most cost is in disk I/O on the event log itself. The two configuration choices that drive measurable CPU and disk use are event ID 7 (image load — every DLL load on every process) and overly broad rules on event ID 13 (registry value set). Both can be tuned with exclude rules in the config.

How is Sysmon different from Microsoft Defender for Endpoint?

Defender for Endpoint is a commercial EDR with its own telemetry pipeline, behavioural detections, and response capability. Sysmon is a free telemetry source that writes to the Windows event log; the analytics happen wherever you ship the log. Most environments running Defender for Endpoint do not also need Sysmon, since Defender collects similar telemetry through its agent. Sysmon is the right answer when the SIEM is something other than Defender XDR — Sentinel without P2, Splunk, Elastic, or a homegrown collector.

How should I update the Sysmon configuration across the fleet?

Treat the config like any other production artefact — version-control the XML, copy it to a known share or container, and run sysmon.exe -c on every endpoint on a schedule. The command is idempotent: applying the same config twice is a no-op. The Group Policy scheduled task pattern in Step 3 handles this without additional infrastructure.
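
A minimal sketch of what such a scheduled task could run is below. It only re-applies the config when the file on the share differs from the locally cached copy, which keeps the Sysmon log free of needless config-change events. The function names are hypothetical, the binary may be Sysmon64.exe on your hosts, and the paths in the usage note are placeholders.

```powershell
# Returns $true when the shared config differs from the cached copy
# (or when no cached copy exists yet). Hypothetical helper.
function Test-SysmonConfigChanged {
    param(
        [Parameter(Mandatory)][string]$SourcePath,
        [Parameter(Mandatory)][string]$CachedPath
    )
    if (-not (Test-Path -LiteralPath $CachedPath)) { return $true }
    $source = (Get-FileHash -LiteralPath $SourcePath -Algorithm SHA256).Hash
    $cached = (Get-FileHash -LiteralPath $CachedPath -Algorithm SHA256).Hash
    return ($source -ne $cached)
}

# Copies the config locally and applies it with sysmon.exe -c when it changed.
function Update-SysmonConfig {
    param(
        [Parameter(Mandatory)][string]$SourcePath,
        [Parameter(Mandatory)][string]$CachedPath,
        [string]$SysmonExe = 'sysmon.exe'   # assumption: on PATH under this name
    )
    if (-not (Test-SysmonConfigChanged -SourcePath $SourcePath -CachedPath $CachedPath)) {
        return $false
    }
    $folder = Split-Path -Path $CachedPath -Parent
    if (-not (Test-Path -LiteralPath $folder)) {
        New-Item -ItemType Directory -Path $folder -Force | Out-Null
    }
    Copy-Item -LiteralPath $SourcePath -Destination $CachedPath -Force
    & $SysmonExe -c $CachedPath | Out-Host
    return $true
}
```

A scheduled task would then call something like Update-SysmonConfig -SourcePath '\\fileserver\sysmon\sysmonconfig.xml' -CachedPath 'C:\ProgramData\Sysmon\sysmonconfig.xml', with both paths adjusted to your environment.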

Is Sysmon event ID 7 (image load) worth the volume?

Selectively, yes — but not for every process. Enabling event 7 globally adds hundreds of events per second per endpoint. The useful pattern is to include event 7 only for the small set of processes that matter for credential-access and persistence: lsass.exe, winlogon.exe, services.exe, and any Office or browser binary that hosts macros or extensions. Both SwiftOnSecurity and sysmon-modular ship include-by-default rules along these lines.

Conclusion

Sysmon is rare among free security tools in that the deployment effort is small and the detection lift is large. The hard parts are picking a configuration baseline and getting the events centralised — both are solved problems with mature open-source projects. Once those two pieces are in place, the event IDs above expose roughly the same telemetry an expensive EDR collects, in a format any SIEM can ingest.

Combined with native audit policy and Windows Event Forwarding, Sysmon gives a Windows environment detection coverage that is genuinely useful — not the box-checking kind, but the kind that catches things.

Related Posts

Windows Event Forwarding Setup for Centralised Logs

Knowing which event IDs to watch is half the job; getting them off every endpoint before the local Security log wraps is the other half. Windows Event Forwarding setup is the native, no-agent, no-cost way to do it — and it is what feeds nearly every SIEM-native Windows pipeline, including Microsoft Sentinel. This post is the build we use internally: source-initiated subscriptions, a Windows Server 2022 collector, and the Group Policy that wires the endpoints in.

Key Takeaways

  • Source-initiated subscriptions scale better than collector-initiated for anything past a small lab — endpoints push to the collector instead of the collector pulling from each one.
  • The Group Policy SubscriptionManager URL is what tells endpoints where to send events.
  • On each source the forwarder runs as NETWORK SERVICE, which means NETWORK SERVICE has to be a member of Event Log Readers on every source before the Security log can be forwarded.
  • Custom event channels prevent the default ForwardedEvents log from overflowing on a busy collector — split by source type once you pass a few hundred endpoints.
  • Forwarded events keep the original computer name in Event/System/Computer; write queries against that field, not the collector hostname.

Environment

  • Windows Server 2022 acting as the Windows Event Collector (WEC).
  • Windows 10/11 and Windows Server 2019/2022 source machines, joined to the same Active Directory domain.
  • WinRM available on TCP 5985 (HTTP) inside the management network. TCP 5986 (HTTPS) for sources crossing a less-trusted boundary.
  • Group Policy management rights to push the subscription URL, the WinRM service state, and the restricted-groups membership to source machines.
  • PowerShell 5.1 or 7.4 on the collector for verification and ad-hoc queries.

The Problem

A busy domain controller will overwrite its Security log in hours, not days, regardless of the size you give it. 4624 and 4634 alone can generate thousands of events per minute. Trying to retain a week of DC security history on the DC itself is wishful thinking. Third-party log shippers solve this, but they bring agents, licensing, and a software supply chain that has to be vetted before it touches every server in the estate.

Windows Event Forwarding is built in, free, and rides on WinRM — which is already present on every supported Windows release. The trade-off is that the setup is fiddly, the documentation is scattered across several Microsoft Learn pages, and the failure modes are quiet. There are two modes: collector-initiated, where the collector pulls from a hand-listed set of sources, and source-initiated, where sources push to the collector based on Group Policy. Collector-initiated is fine for a handful of servers. Source-initiated is what scales, because new machines start forwarding the moment they apply the GPO and the collector does not have to know they exist beforehand.

The Solution

Step 1 — Prepare the Windows Event Collector

On the collector host, enable the Windows Event Collector service and configure WinRM. Two elevated commands handle both:

# Run elevated on the collector
wecutil qc /quiet
winrm quickconfig -quiet

wecutil qc sets the Wecsvc service to delayed auto-start, starts it, and enables the ForwardedEvents channel. winrm quickconfig starts WinRM, registers the default WS-Management listener, and opens the firewall for it. After both complete, the Event Viewer → Subscriptions node becomes usable, and the ForwardedEvents log is ready to receive data.

Resize ForwardedEvents before any traffic arrives. The default 20 MB overflows within minutes once a few dozen domain controllers start pushing 4624s:

wevtutil sl ForwardedEvents /ms:8589934592   # 8 GB

Step 2 — Grant the collector permission to read remote logs

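Source-initiated subscriptions are pushed by the forwarder on each source, and that forwarder runs inside the WinRM service as the local NETWORK SERVICE account. NETWORK SERVICE cannot read the Security log by default, so adding it to Event Log Readers on every source is what unlocks read access. (The collector's computer account only needs that membership for collector-initiated subscriptions, where the collector pulls the logs remotely.) The cleanest way is a Group Policy Preferences local-group item targeting source machines:

Computer Configuration
 → Preferences
   → Control Panel Settings
     → Local Users and Groups
       → New → Local Group
         Group: Event Log Readers (built-in)
         Action: Update
         Members: NT AUTHORITY\NETWORK SERVICE

If you also run collector-initiated subscriptions, add the collector's computer account (for example DOMAIN\WEC01$) as a second member. The trailing dollar sign is required: without it, Group Policy will silently match nothing and you will spend an afternoon wondering why subscriptions are reporting "access denied". The WinRM service on a source may need a restart (or the machine a reboot) before the new group membership takes effect.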

Step 3 — Push the subscription URL via Group Policy

The single setting that activates source-initiated forwarding is the subscription manager URL. Under Computer Configuration → Administrative Templates → Windows Components → Event Forwarding → Configure target Subscription Manager, enable the policy and add the entry:

Server=http://wec01.example.local:5985/wsman/SubscriptionManager/WEC,Refresh=60

A few details that bite first-time deployments:

  • The FQDN must resolve from every source. IP addresses technically work but break the Kerberos authentication WinRM expects by default.
  • Refresh=60 tells the source to re-pull the subscription configuration every 60 seconds. Useful during rollout; raise it to 600 or higher once the fleet is stable.
  • WinRM also has to be running on the source. Set the Windows Remote Management (WS-Management) service startup to Automatic via a service GPO if it is not already.

After a gpupdate /force and a minute or two of patience, each source starts contacting the collector. Once the subscription from Step 4 exists, the source appears under Event Viewer → Subscriptions → Source Computers. Microsoft's authoritative reference is on Microsoft Learn: Setting up a Source Initiated Subscription.
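
To confirm on a source that the policy actually landed, the sketch below reads the SubscriptionManager entries from the policy registry key and splits each one into its Server and Refresh parts. Both function names are hypothetical helpers written for this post.

```powershell
# Split a SubscriptionManager policy entry into its key=value parts.
function ConvertFrom-SubscriptionManagerEntry {
    param([Parameter(Mandatory)][string]$Entry)

    $result = [ordered]@{}
    foreach ($pair in ($Entry -split ',')) {
        if ($pair -notmatch '=') { continue }
        $name, $value = $pair -split '=', 2
        $result[$name.Trim()] = $value.Trim()
    }
    [pscustomobject]$result
}

# Read every entry the GPO wrote on this machine (run on a source).
function Get-SubscriptionManagerPolicy {
    $key = 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\EventLog\EventForwarding\SubscriptionManager'
    if (-not (Test-Path -Path $key)) { return }

    (Get-ItemProperty -Path $key).PSObject.Properties |
        Where-Object { $_.Name -notlike 'PS*' } |   # skip provider metadata
        ForEach-Object { ConvertFrom-SubscriptionManagerEntry -Entry $_.Value }
}
```

Running Get-SubscriptionManagerPolicy on a source should return one object per configured collector. An empty result means the GPO has not applied yet.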

Step 4 — Author the Windows Event Forwarding subscription

Subscriptions can be created in the GUI, but XML is the only sane way to manage them at scale. Save the following as baseline.xml on the collector. It pulls the high-signal IDs from our Windows Event IDs reference and drops everything else:

<Subscription xmlns="http://schemas.microsoft.com/2006/03/windows/events/subscription">
  <SubscriptionId>Baseline-Security</SubscriptionId>
  <SubscriptionType>SourceInitiated</SubscriptionType>
  <Description>Baseline forwarding: auth, account mgmt, process, PowerShell</Description>
  <Enabled>true</Enabled>
  <Uri>http://schemas.microsoft.com/wbem/wsman/1/windows/EventLog</Uri>
  <ConfigurationMode>Custom</ConfigurationMode>
  <Delivery Mode="Push">
    <Batching>
      <MaxItems>20</MaxItems>
      <MaxLatencyTime>30000</MaxLatencyTime>
    </Batching>
    <PushSettings>
      <Heartbeat Interval="60000"/>
    </PushSettings>
  </Delivery>
  <Query>
    <![CDATA[
      <QueryList>
        <Query Id="0">
          <Select Path="Security">
            *[System[(EventID=1102 or EventID=4624 or EventID=4625 or EventID=4648 or
                      EventID=4672 or EventID=4688 or EventID=4720 or EventID=4724 or
                      EventID=4728 or EventID=4732 or EventID=4756 or EventID=4738 or
                      EventID=4698 or EventID=4699 or EventID=4702 or EventID=4719)]]
          </Select>
          <Select Path="Microsoft-Windows-PowerShell/Operational">
            *[System[(EventID=4104)]]
          </Select>
        </Query>
      </QueryList>
    ]]>
  </Query>
  <ReadExistingEvents>false</ReadExistingEvents>
  <TransportName>http</TransportName>
  <ContentFormat>RenderedText</ContentFormat>
  <Locale Language="en-US"/>
  <LogFile>ForwardedEvents</LogFile>
  <AllowedSourceNonDomainComputers></AllowedSourceNonDomainComputers>
  <AllowedSourceDomainComputers>O:NSG:NSD:(A;;GA;;;DC)(A;;GA;;;NS)</AllowedSourceDomainComputers>
</Subscription>

Register it with wecutil cs baseline.xml and confirm with wecutil es. The AllowedSourceDomainComputers SDDL above permits every member of Domain Computers (DC) plus NETWORK SERVICE (NS). Note that DC is the SDDL alias for Domain Computers, not domain controllers (those are DD), so narrow it to a specific computer group SID for production. The XPath query is the same shape you would write in Event Viewer's filter dialog; copying a working filter and pasting it inside the Select element is a reliable starting point.
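
SDDL strings are easy to misread, so here is a small sketch that lists the trustee of each ACE in a string like the one above. It only knows a handful of the well-known two-letter SID aliases, and Get-SddlTrustee is a hypothetical name, not a built-in cmdlet.

```powershell
# List the trustee of every ACE in an SDDL string, translating a few
# well-known two-letter SID aliases. Hypothetical helper for illustration.
function Get-SddlTrustee {
    param([Parameter(Mandatory)][string]$Sddl)

    $aliases = @{
        'DC' = 'Domain Computers'
        'DD' = 'Domain Controllers'
        'DA' = 'Domain Admins'
        'DU' = 'Domain Users'
        'NS' = 'Network Service'
        'AU' = 'Authenticated Users'
        'SY' = 'Local System'
    }

    foreach ($match in [regex]::Matches($Sddl, '\(([^)]*)\)')) {
        # ACE layout: type;flags;rights;object_guid;inherit_guid;account_sid
        $parts = $match.Groups[1].Value -split ';'
        $sid = $parts[5]
        $name = $sid
        if ($aliases.ContainsKey($sid)) { $name = $aliases[$sid] }

        [pscustomobject]@{
            AceType = $parts[0]
            Rights  = $parts[2]
            Trustee = $name
        }
    }
}

Get-SddlTrustee -Sddl 'O:NSG:NSD:(A;;GA;;;DC)(A;;GA;;;NS)'
```

For the subscription above this returns two allow ACEs with generic-all rights, one for Domain Computers and one for Network Service. A raw SID in place of an alias is passed through unchanged, which is what you would see after narrowing the ACL to a specific group.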

Step 5 — Verify and tune

Events should start landing in ForwardedEvents within a minute. The source computer is in Event/System/Computer, not the collector's name — write queries against that field:

# Top forwarders in the last hour
Get-WinEvent -FilterHashtable @{
    LogName   = 'ForwardedEvents'
    StartTime = (Get-Date).AddHours(-1)
} |
    Group-Object MachineName |
    Sort-Object Count -Descending |
    Select-Object -First 20 Name, Count

On the source, check the runtime status of the forwarding plugin:

# Inspect the subscription state from a source machine
wevtutil gl Microsoft-Windows-Forwarding/Operational
Get-WinEvent -LogName 'Microsoft-Windows-Forwarding/Operational' -MaxEvents 20

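The two most common failures: the forwarder cannot reach the subscription manager, typically because WinRM is not running on the source or the URL does not resolve (event ID 105 in Microsoft-Windows-Forwarding/Operational), and NETWORK SERVICE missing from Event Log Readers on the source, which leaves the subscription created but the Security channel unreadable (event ID 102). Both are GPO fixes.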
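
The top-forwarders query shows who is loud; the quieter failure is a source that sends nothing at all. A minimal sketch for spotting those is below, assuming you can produce a list of hostnames that should be forwarding. Get-SilentSource is a hypothetical helper name.

```powershell
# Return every expected source that does not appear in the list of
# machines seen in ForwardedEvents. Comparison is case-insensitive.
function Get-SilentSource {
    param(
        [Parameter(Mandatory)][string[]]$Expected,
        [string[]]$Seen = @()
    )
    $Expected | Where-Object { $Seen -notcontains $_ }
}
```

The Seen list can come from Get-WinEvent -FilterHashtable @{ LogName = 'ForwardedEvents'; StartTime = (Get-Date).AddHours(-1) } | Select-Object -ExpandProperty MachineName -Unique. Forwarded events usually carry the fully qualified name, so the expected list has to use the same form.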

Step 6 — Split high-volume sources into custom event channels

Once you cross a few hundred endpoints, the default ForwardedEvents log churns fast enough that even an 8 GB log only holds a few hours of history. The fix is custom event channels: author a provider manifest (ecmangen from the Windows SDK is the usual editor), compile it into a resource DLL with the SDK's mc.exe and rc.exe plus a link step, register it on the collector with wevtutil im, and target separate subscriptions at separate logs, for example WEC-Auth, WEC-Process, and WEC-PowerShell. Each log gets its own size budget, retention policy, and query surface. Palantir's windows-event-forwarding repository on GitHub has production-grade manifests and subscription XMLs worth borrowing as a starting point.

Frequently Asked Questions

Should I use HTTP or HTTPS for Windows Event Forwarding?

Inside an Active Directory domain, HTTP on TCP 5985 is already encrypted end-to-end by Kerberos and authenticated by the machine accounts on both sides. HTTPS on 5986 only adds value when sources are non-domain or cross an untrusted boundary, and that path also requires client certificates provisioned on every source. For a domain-joined fleet, HTTP is the documented default and the simpler operational choice.

Do I need a SIEM if I have Windows Event Forwarding?

Not strictly. A WEC collector with adequate log sizing and a few PowerShell queries handles ad-hoc incident response and small environments fine. A SIEM adds correlation across sources, longer retention, alerting, and search beyond just Windows logs. WEF is what feeds the SIEM in most native deployments — Microsoft Sentinel's Azure Monitor Agent, for example, can ingest forwarded events directly from a WEC.

Why are some events missing from forwarded logs even though they appear locally?

Three usual suspects: the event ID is not in the subscription's XPath query; the source has not yet refreshed the subscription configuration (wait for the Refresh interval to elapse); or the forwarder on the source cannot read the channel because NETWORK SERVICE is not in Event Log Readers. Run wecutil gr SUBSCRIPTION_NAME on the collector for the per-source heartbeat and last-error code, which usually points straight at the cause.

How big should the ForwardedEvents log be?

Budget at least 100 MB per active source per day at the baseline event set above. For a fleet of 500 endpoints, that is roughly a 50 GB log to retain a single day — which is the point at which custom event channels start paying for themselves. Domain controllers push significantly more volume than member servers, so split them out first.
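
The arithmetic is simple enough to script, which helps when the per-source figure or the retention target changes. The sketch below assumes the 100 MB per source per day figure from this answer as its default; Get-ForwardedLogBudget is a hypothetical helper name.

```powershell
# Estimate the log size needed for a number of sources and retention days.
function Get-ForwardedLogBudget {
    param(
        [Parameter(Mandatory)][int]$SourceCount,
        [double]$MBPerSourcePerDay = 100,   # assumption: baseline event set
        [double]$RetentionDays = 1
    )
    $totalMB = $SourceCount * $MBPerSourcePerDay * $RetentionDays
    $bytes = [long]($totalMB * 1MB)

    [pscustomobject]@{
        TotalMB = $totalMB
        SizeGB  = [math]::Round($bytes / 1GB, 1)
        Bytes   = $bytes   # value to pass to wevtutil sl <log> /ms:
    }
}

Get-ForwardedLogBudget -SourceCount 500
# TotalMB = 50000, SizeGB = 48.8, Bytes = 52428800000
```

The Bytes value can be passed straight to wevtutil sl with /ms: for whichever log the subscription writes to.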

Can I forward events from workgroup (non-domain) machines?

Yes, but with significantly more work. WinRM has to be configured with client certificates on the source, a matching certificate authority has to be trusted by the collector, and the source thumbprint has to be added to AllowedSourceNonDomainComputers in the subscription. For more than a handful of non-domain sources, an agent-based shipper is usually cheaper than maintaining the certificate plumbing.

Conclusion

Windows Event Forwarding is one of the more useful native Windows features that almost nobody enables until they have to. The setup is tedious — multiple GPOs, an XML subscription, an SDDL string — but everything is built in, costs nothing, and survives both the collector and the SIEM being swapped out for something else. Once it is running, the rest of the Windows-side detection stack quietly assumes it is there.

If we had to pick the single highest-leverage thing to do after configuring audit policy, this would be it. Centralised logs make every other monitoring decision — log retention, correlation rules, hunting queries — significantly easier.

Related Posts

Useful Advanced Hunting KQL Queries in Microsoft Defender

Microsoft Defender's Advanced Hunting is genuinely useful — possibly the most useful thing in the portal that does not have a flashy dashboard. We are currently building our own on-premises log management stack, but before going all-in we wanted to see how far we could get with Advanced Hunting KQL queries against the Defender data we already have, without committing to a full Microsoft Sentinel workspace. The short answer: surprisingly far.

Key Takeaways

  • Advanced Hunting in Microsoft Defender XDR lets us run KQL queries against device, email, and identity telemetry without a Sentinel workspace.
  • A short summarize over DeviceLogonEvents is enough to surface noisy or misbehaving service accounts in seconds.
  • An EmailEvents query grouped by SenderFromDomain turns inbound threat telemetry into a concrete blocklist candidate list.
  • A let-and-join pattern between DeviceLogonEvents and DeviceNetworkEvents can hint at which human is behind a shared service-account logon.
  • None of this replaces a SIEM, but for spot checks, validation, and small investigations it is faster than firing up Sentinel.

Environment

  • Microsoft 365 tenant with Microsoft Defender XDR (formerly Microsoft 365 Defender) — Advanced Hunting is part of the unified Defender portal.
  • All users on Microsoft 365 E5, which includes Defender for Endpoint Plan 2 and Defender for Office 365 Plan 2.
  • No Microsoft Sentinel workspace in use for this work — queries run inside Defender XDR directly.
  • Tables referenced: DeviceLogonEvents, DeviceNetworkEvents, EmailEvents.

The Problem

Like many teams, we are slowly moving toward a setup where the bulk of our log management runs on premises, on our own terms. The plan does not change — but the plan also takes a while to roll out. In the meantime, we still need to answer questions like "which user is hammering the directory with logons today?" or "which inbound sender domains are responsible for most of the spam getting filtered?" without spinning up a parallel Sentinel deployment.

Defender's Advanced Hunting fills that gap. It is a KQL query interface over the same data Microsoft uses to power its alerts, sitting right inside the Defender portal. Microsoft maintains a large library of pre-defined queries — useful as a starting point — and the full Advanced Hunting schema reference is on Microsoft Learn. But the queries we end up running over and over are short, narrow, and embarrassingly simple. The three below are the ones we have actually pinned and re-used.

The Solution

Step 1 — Find the noisiest accounts with DeviceLogonEvents

Occasionally we just want to know who is authenticating a lot. The usual reason is a service account that has decided to misbehave — a stuck loop, a misconfigured scheduled task, or an integration that lost its mind after a credential rotation. A short summarize on DeviceLogonEvents answers that in about two seconds:

DeviceLogonEvents
| summarize LogonCount = count() by AccountName
| top 10 by LogonCount
| render piechart

The output is a pie chart of the top 10 most active accounts in whatever time range is selected at the top of the page. If one slice eats half the chart, you have your answer. We tend to scope the time range to the last 24 hours by default: short enough to catch a runaway, long enough to spot a slow drip. Once an account stands out, widen the time range, swap the top and render lines for a where AccountName == "..." filter, and summarize by bin(Timestamp, 1d) to get a per-day breakdown that confirms the trend is real.

A couple of small caveats. DeviceLogonEvents only sees logon activity on devices onboarded to Defender for Endpoint. Sign-ins to cloud apps land in AADSignInEventsBeta in Advanced Hunting (or SigninLogs in a Sentinel workspace) instead: different tables, different schema, same KQL. And render piechart is a presentation hint: it has no effect when the query runs from the API or as a scheduled detection, so do not rely on it for anything except the in-portal view. If you are wondering which underlying logon types these rows correspond to, our post on essential Windows event IDs for security monitoring covers the mapping in more detail.

Step 2 — Surface the worst inbound sender domains

This one started life in the Azure-Sentinel GitHub repository and we kept it because it tells us something useful we cannot easily see in the standard Defender for Office 365 reports. The point is to find sender domains whose inbound mail is mostly junk — malware, phish, or spam — so we can decide whether to block them at the connector instead of relying on the filter to catch every message.

EmailEvents
| where EmailDirection == "Inbound"
| where Timestamp > ago(30d)
| summarize TotalEmailCount = count(),
            SpamEmailCount  = countif(ThreatTypes has_any ("Malware", "Phish", "Spam")) by SenderFromDomain
| extend Bad_Traffic_Percentage_Inbound = todouble(round(SpamEmailCount / todouble(TotalEmailCount) * 100, 2))
| where SpamEmailCount != 0
| sort by SpamEmailCount desc
| project SenderFromDomain, SpamEmailCount, TotalEmailCount, Bad_Traffic_Percentage_Inbound
| top 100 by SpamEmailCount

Each row is a sender domain, the number of messages it sent us in the last 30 days, the number that tripped at least one threat type, and the resulting "bad traffic" percentage. Defender blocked most of those, of course — but a domain with a 98.5% block rate is still a domain we have no business accepting mail from. Once you have the list, it is straightforward to push the worst offenders into a tenant-wide block entry in Exchange Online Protection or your secure email gateway.

Two things to watch out for. First, countif with has_any counts a message once even if it carries more than one threat type — so the count is the number of threat-bearing messages, not the number of threats. Second, low-volume domains with one or two malicious messages will rocket to 100% bad traffic and crowd out the real signal. Sort by SpamEmailCount first and only consider Bad_Traffic_Percentage_Inbound meaningful on domains with a non-trivial sample.
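
If the result is exported to CSV from the portal, the second caveat can be applied offline with a minimum-sample filter. The sketch below is one way to do it; the thresholds of 50 messages and 90 percent are arbitrary assumptions to tune, and Select-BlockCandidate is a hypothetical helper name.

```powershell
# Keep only sender domains with enough volume for the percentage to mean
# something. Column names match the project line of the query above.
function Select-BlockCandidate {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]$Row,
        [int]$MinTotal = 50,            # assumption: minimum sample size
        [double]$MinBadPercent = 90     # assumption: block threshold
    )
    process {
        # CSV fields arrive as strings, so cast before comparing
        $total = [int]$Row.TotalEmailCount
        $bad = [double]$Row.Bad_Traffic_Percentage_Inbound
        if ($total -ge $MinTotal -and $bad -ge $MinBadPercent) { $Row }
    }
}
```

Usage would look like Import-Csv .\sender-domains.csv | Select-BlockCandidate, where the file name is a placeholder for wherever the export was saved.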

Step 3 — Tie a service-account logon to a probable human user

This one is more situational, and we will say upfront: if service accounts are being used by humans for interactive sign-ins, that is the problem you should be fixing, not a query you should be writing. That said, when you inherit an environment and need to figure out who is using a shared account to log into a particular box, this pattern helps you correlate it back to a likely person. We are using it strictly for authorized internal attribution in environments we own.

The idea is to pull the most recent logons by a service-account-looking name on a device of interest, capture the remote IP they came from, and then join that back to DeviceNetworkEvents to see which actual user on the remote endpoint had network sessions from that IP:

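let Logons =
    DeviceLogonEvents
    | where AccountName contains "..."
    | where DeviceName contains "..."
    | where isnotempty(RemoteIP)
    | summarize LatestTimestamp = max(Timestamp)
        by AccountName, DeviceName, RemoteIP;
Logons
| join kind=leftouter (
    DeviceNetworkEvents
    // LocalIP is the address of the machine that opened the connection,
    // so it is the field to match against the logon's RemoteIP
    | summarize LatestSeen = max(Timestamp)
        by LocalIP, SourceDeviceName = DeviceName, InitiatingProcessAccountName
) on $left.RemoteIP == $right.LocalIP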

Replace the two "..." placeholders with the service-account name fragment and the target device fragment. The leftouter join keeps the logon row even when no matching network event exists, which is useful — it tells us "we saw this account log in from IP X, but the remote side is not onboarded, so we cannot attribute it further." When a match does come back, InitiatingProcessAccountName on the network-event side is usually the breadcrumb you want: it is the user context under which the connecting process was running on the remote machine.

The right long-term fix is per-person service principals or managed identities, not better forensics on the symptom. Once you have the data to make that case to whoever owns the workflow, this query has done its job. Our write-up on SIEM correlation rules for real attacks covers the next step — turning patterns like this into scheduled detections rather than ad-hoc lookups.

Frequently Asked Questions

Do I need Microsoft Sentinel to run Advanced Hunting KQL queries?

No. Advanced Hunting runs inside the Microsoft Defender XDR portal against the data Defender already collects from endpoints, mailboxes, and identities. Sentinel adds a workspace, longer retention, and a broader correlation surface across non-Microsoft sources, but for the queries above you only need Defender XDR.

Which licenses do I need for DeviceLogonEvents and EmailEvents?

DeviceLogonEvents and DeviceNetworkEvents require Defender for Endpoint Plan 2. EmailEvents requires Defender for Office 365 Plan 2. Both ship with Microsoft 365 E5 and are available as standalone add-ons for lower SKUs.

How far back can I query in Advanced Hunting?

Advanced Hunting retains 30 days of data by default. Anything older needs to be exported to a Sentinel workspace, a Log Analytics workspace, or a downstream log store before it ages out. If you regularly need 90- or 180-day lookback, that is the cue to start streaming to Sentinel or to your own SIEM.

Can I save these queries and re-use them?

Yes. The Advanced Hunting page supports personal and shared saved queries. For team use, version-controlling them in a small internal repository alongside the rest of your detection content is preferable to relying solely on the portal — it makes them reviewable, diffable, and easier to migrate later.

Will render piechart work outside the Defender portal?

No. render directives are interpreted by the Advanced Hunting UI and ignored everywhere else. If you take a query out of the portal and run it via the API, Logic Apps, or a custom detection rule, drop the render line — it does not error, but it does nothing useful either.

Conclusion

Advanced Hunting is not going to replace a real SIEM, and it is not meant to. What it does well is give a small team a fast way to ask sharp questions about data we already pay for. The three queries above are nothing clever — a count, a group-by, and a join — but they answer questions that come up every week and that would otherwise sit in someone's "I should look into that" pile.

We will eventually move most of this to our on-premises stack, where the retention, cost model, and correlation are on our own terms. Until then, KQL inside Defender XDR is good enough for the day-to-day, and it is the rare Microsoft feature that does what it claims without a chain of licensing surprises.

Related Posts

Automating Phishing Simulations for New Employees in M365 — Sort Of

A few weeks ago we tried to improve our phishing simulation game for new employees in Microsoft 365. The goal was simple: new hires should automatically receive a phishing simulation email after their first one to two months on the job — just to make sure they know that phishing exists and that the IT security team might occasionally test them. Easy, right? Spoiler: not quite.

Key Takeaways

  • Microsoft's Attack Simulation Training can target new hires automatically, but only by way of a dynamic group — there is no native "new employee" trigger.
  • Driving the dynamic group off employeeHireDate is more accurate than using createdDateTime or whenCreated, which often predate the actual start date.
  • If you sync from on-premises Active Directory, employeeHireDate has to be fed from a custom AD attribute such as msDS-cloudExtensionAttribute1 via Entra Connect.
  • Simulation automations cannot be scheduled in advance, so the workflow is still partly manual — typically one prep session per month.
  • Attack Simulation Training requires Defender for Office 365 Plan 2, which is included in Microsoft 365 E5.

Environment

Before we dive in, here is what we are working with:

  • Microsoft 365 tenant with Defender for Office 365 Plan 2 (included in M365 E5).
  • All users have an M365 E5 license.
  • On-premises Active Directory synced to Entra ID via Entra Connect.

The Problem

Microsoft's Attack Simulation Training in the Security portal does offer an automation feature. The idea is that you define a condition, and when a user matches it, they automatically get added to a simulation. You would think "new user joins tenant" is one of those conditions. It is — but only if you define a dynamic group and point the automation at it. Fine, we can do that.

Here is where it gets interesting. You need to figure out which attribute to base the dynamic group query on. The obvious candidate is createdDateTime (or whenCreated in on-premises AD), which reflects when the user object was created. But that can be misleading — user objects are sometimes created weeks or even months before the employee actually starts. Not ideal for targeting someone fresh on their first day.

Fortunately, Entra ID has an attribute for exactly this: employeeHireDate. By default it is not populated, but we can work with that.

The Solution

Step 1 — Populate employeeHireDate via Entra Connect

Since we are running an on-premises Active Directory synced with Entra Connect, we can map a custom AD attribute to employeeHireDate in Entra ID. We used msDS-cloudExtensionAttribute1 for this. It is one of several cloud extension attributes available in AD and, unlike most standard attributes, it is free to repurpose.

For existing users, we wrote a short PowerShell script to set the attribute. Since we do not always have the exact hiring date on hand, we used whenCreated plus one month as a reasonable approximation:

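Import-Module ActiveDirectory

# Only touch users without a value, so manually entered hire dates are not overwritten
Get-ADUser -Filter * -Properties whenCreated, msDS-cloudExtensionAttribute1 |
    Where-Object { -not $_.'msDS-cloudExtensionAttribute1' } |
    ForEach-Object {
        # whenCreated is returned in local time; convert to UTC before adding the Z suffix
        $hiringDate = $_.whenCreated.ToUniversalTime().AddMonths(1).ToString("yyyy-MM-ddTHH:mm:ss.0000000Z")
        Set-ADUser -Identity $_.SamAccountName -Replace @{
            'msDS-cloudExtensionAttribute1' = $hiringDate
        }
    }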

For new users, the same logic was added to the existing interactive PowerShell user creation script. Ideally, someone manually fills in the real hire date — but let's be honest, that does not always happen. The approximation is good enough for most cases, and is better than nothing.

If you want to know how to map msDS-cloudExtensionAttribute1 to employeeHireDate in Entra Connect, check out this post: Configure EmployeeHireDate and EmployeeLeaveDateTime in Active Directory to be used with Microsoft Entra ID Governance. Microsoft's own reference for the attribute lives in the Microsoft Graph user resource documentation. After a successful sync, you should be able to see employeeHireDate populated on the user objects in Entra ID.

Step 2 — Create a Dynamic M365 Group

With employeeHireDate now populated, we can create a dynamic membership group in Entra ID. One important note: it must be a Microsoft 365 group. Attack Simulation Training also accepts mail-enabled security groups and distribution groups, but only with static membership, and regular security groups do not work at all. Yes, that is just how it is.

The dynamic membership query filters users whose hire date falls within a specific window. Rather than hardcoding dates that need monthly updates, we let Entra do the math using system.now with ISO 8601 duration offsets:

(user.employeeHireDate -le system.now -minus p30d) -and (user.employeeHireDate -ge system.now -minus p60d)

Breaking that down:

  • -le system.now -minus p30d: the hire date is at least 30 days in the past, meaning the employee has been around long enough to receive a simulation.
  • -ge system.now -minus p60d: the hire date is no older than 60 days, so we are not sending a "welcome" phishing mail to someone who joined two years ago.

The result is a window that moves forward automatically every day — no manual date changes needed. It is worth noting that employeeHireDate is still in preview as of writing, so treat it accordingly. Once the rule is in place, use the Validate Rules feature in the Entra portal to test against a few users with known hire dates before going live. It takes maybe two minutes and saves you from a potentially awkward conversation later.
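To sanity-check which hire dates should land in the group before trusting the rule, the same window can be reproduced locally. This is only a sketch of the date arithmetic, not of how Entra evaluates the rule; the function name Test-InSimulationWindow and its parameters are made up for illustration.

```powershell
# Hypothetical helper mirroring the rule: hired between 30 and 60 days ago.
function Test-InSimulationWindow {
    param(
        [Parameter(Mandatory)] [datetime] $HireDate,
        [datetime] $Now = ([datetime]::UtcNow),
        [int] $MinDays = 30,
        [int] $MaxDays = 60
    )

    $upper = $Now.AddDays(-$MinDays)   # corresponds to system.now -plus -p30d
    $lower = $Now.AddDays(-$MaxDays)   # corresponds to system.now -plus -p60d

    ($HireDate -le $upper) -and ($HireDate -ge $lower)
}

$now = [datetime]'2025-06-30'
Test-InSimulationWindow -HireDate ([datetime]'2025-05-15') -Now $now   # hired 46 days ago  -> True
Test-InSimulationWindow -HireDate ([datetime]'2025-06-20') -Now $now   # hired 10 days ago  -> False
Test-InSimulationWindow -HireDate ([datetime]'2023-06-30') -Now $now   # hired 2 years ago  -> False
```

Comparing these results with what Validate Rules reports for the same test users is a quick way to catch a flipped operator or a swapped offset.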

Step 3 — Configure Attack Simulation Training

Now we can set up the simulation in the Microsoft Defender portal (security.microsoft.com) under Attack Simulation Training → Simulations. Use your newly created dynamic group as the target audience.

Here is the catch we ran into: simulation automations cannot be scheduled. Microsoft's automation feature in this context does not let you set a future start date. This means we cannot set up a fully automated "fire and forget" workflow for new hires. Instead, we create one regular simulation per month in a single session, give each its own start and end date, and let them run on schedule. It is a bit more manual than we hoped, but it is manageable.

A few things to keep in mind when setting up each monthly simulation:

  • Only use payloads in your company's primary language, unless you intentionally want to test multilingual awareness.
  • Set start and end dates to align cleanly with the calendar month so the reporting is easier to read.
  • Choose payloads that are realistic for your industry — a fake invoice or HR notification works better than something obviously suspicious.
  • Review the results from previous months before selecting new payloads — some patterns get stale and click rates drop for the wrong reasons.
  • Make sure the landing page and training assignment match the payload theme so the learning moment lands.
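For the batch session, it can help to generate the calendar-month start and end dates up front instead of working them out in the portal. The following is a small planning sketch only: the function name Get-MonthlySimulationWindow and the simulation naming pattern are hypothetical, and nothing here talks to Attack Simulation Training itself.

```powershell
# Hypothetical helper: emits one start/end pair per calendar month.
function Get-MonthlySimulationWindow {
    param(
        [Parameter(Mandatory)] [datetime] $FirstMonth,
        [int] $Count = 12
    )

    # Snap to the first day of the starting month.
    $start = [datetime]::new($FirstMonth.Year, $FirstMonth.Month, 1)

    for ($i = 0; $i -lt $Count; $i++) {
        $monthStart = $start.AddMonths($i)
        [pscustomobject]@{
            Name  = 'New hire phishing {0:yyyy-MM}' -f $monthStart   # example name only
            Start = $monthStart
            End   = $monthStart.AddMonths(1).AddDays(-1)            # last day of the month
        }
    }
}

Get-MonthlySimulationWindow -FirstMonth ([datetime]'2025-01-15') -Count 3 | Format-Table
```

The resulting table is just a checklist for the dates to enter by hand; the simulations themselves are still created in the portal.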

Frequently Asked Questions

Do I need Microsoft 365 E5 to run Attack Simulation Training?

Attack Simulation Training is part of Defender for Office 365 Plan 2, which ships with E5 and is also available as an add-on for lower SKUs. You do not strictly need every user on E5, but the users you want to enroll do need the Defender for Office 365 Plan 2 entitlement.

Can I use a regular security group instead of an M365 group?

No. Attack Simulation Training only accepts M365 groups and mail-enabled security groups as targets. Plain security groups will not appear in the picker. This is the single most common reason a dynamic group built for this purpose fails to work as expected.

Why use employeeHireDate instead of whenCreated or createdDateTime?

Object creation timestamps reflect when IT created the account, not when the employee actually started. In environments where accounts are provisioned weeks or months ahead of the start date, whenCreated targets the wrong window. employeeHireDate is the attribute Microsoft specifically intends for this purpose.

Is employeeHireDate generally available yet?

At the time of writing, employeeHireDate in Entra ID is in preview. It works reliably in our environment, but its behavior, supported scenarios, and Graph API surface can still change. Validate the dynamic rule against known test users before scaling up.

Can I fully automate the monthly simulation rollout?

Not today. The simulation creation step itself does not support scheduling a future start date through the automation feature. The dynamic group keeps itself current, but creating and launching the next simulation is still a manual click-through. A monthly batch session is the practical workaround.

Conclusion

Is this the cleanest solution? No. Should a platform like Microsoft 365, with its automation features and extensive user attribute support, handle this natively without requiring a chain of attribute mappings, dynamic groups, and monthly manual setup? Probably. But here we are.

The good news is that it works. New employees get a phishing simulation shortly after joining, employeeHireDate gives us a more meaningful trigger than raw object creation time, and the monthly prep sessions take maybe thirty minutes once you have a template to work from. Not elegant, but effective — which, in IT, is often the best you can hope for.

Related Posts