MITRE ATT&CK + D3FEND: Mapping Defense to Attack

Part four of the MITRE ATT&CK series. ATT&CK catalogues what attackers do; MITRE D3FEND catalogues what defenders do about it. The two are linked — D3FEND exposes a mapping from each defensive technique to the offensive techniques it counters — but the link lives behind a separate API and requires some careful plumbing to use at scale. This post wires D3FEND into the mapped dataset built in Part 2 and addresses the practical problems (API quirks, payload size, deduplication) that come up along the way.

Key Takeaways

  • D3FEND exposes its mappings via a JSON API at d3fend.mitre.org/api/. Two endpoints carry most of the value: offensive-to-defensive mapping per technique, and full defensive-technique details.
  • The API returns SPARQL-style "bindings" — slightly awkward to walk, but stable.
  • Cache responses locally. The API has no documented rate limit, but it is a free service and the data does not change minute to minute.
  • The combined output bloats fast (~80 MB raw) because references and authors repeat across techniques. Deduplicate with a lookup table to cut size by 75%.
  • The point of the mapping is decision support: gaps in D3FEND coverage for techniques attackers actually use are the highest-leverage defensive investments.

Environment

  • Project structure from Part 1 and mapped data from Part 2.
  • Python 3.10+ with requests.
  • Internet access for the initial D3FEND fetch (cache locally afterwards).

The Problem

ATT&CK answers the question "what are attackers doing?" D3FEND answers "what could a defender do about it?" — but only if you can resolve the link between them. The D3FEND API surfaces that link, with some quirks: payloads are SPARQL-shaped, descriptions are deeply nested, and the same references repeat in dozens of techniques. The recipe below tames the data into something usable for downstream analysis and reporting.

The Solution

Step 1 — Fetch the D3FEND mapping for an ATT&CK technique

One endpoint per technique. Cache by technique ID:

import json, logging, requests
from pathlib import Path

logger = logging.getLogger(__name__)
D3F_CACHE = Path('cache/d3fend')
D3F_CACHE.mkdir(parents=True, exist_ok=True)

def fetch_d3fend_mapping(attack_id: str) -> dict | None:
    cached = D3F_CACHE / f'{attack_id}.json'
    if cached.exists():
        return json.loads(cached.read_text())
    try:
        url  = f'https://d3fend.mitre.org/api/offensive-technique/attack/{attack_id}.json'
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        cached.write_text(json.dumps(data))
        return data
    except requests.RequestException as exc:
        logger.warning('D3FEND fetch failed for %s: %s', attack_id, exc)
        return None

Step 2 — Fetch defensive-technique details

Each defensive technique has its own endpoint with the long-form description, references, and synonyms:

def fetch_d3fend_detail(def_id: str) -> dict | None:
    cached = D3F_CACHE / f'{def_id}_detail.json'
    if cached.exists():
        return json.loads(cached.read_text())
    try:
        url  = f'https://d3fend.mitre.org/api/technique/d3f:{def_id}.json'
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        cached.write_text(json.dumps(data))
        return data
    except requests.RequestException as exc:
        logger.warning('D3FEND detail failed for %s: %s', def_id, exc)
        return None

Step 3 — Resolve mapping into a flat list of defensive techniques

The mapping payload nests the actual links under off_to_def.results.bindings. Each binding has a defensive-technique URI; the ID is the fragment after #:

def defensive_techniques_for(attack_id: str) -> list[dict]:
    raw = fetch_d3fend_mapping(attack_id)
    if not raw or 'off_to_def' not in raw:
        return []

    out = []
    for binding in raw['off_to_def']['results']['bindings']:
        label = binding.get('def_tech_label', {}).get('value')
        uri   = binding.get('def_tech',       {}).get('value')
        if not (label and uri):
            continue
        def_id = uri.rsplit('#', 1)[-1]

        detail = fetch_d3fend_detail(def_id) or {}
        # Guard against a missing, null, or empty @graph before indexing into it
        graph = (detail.get('description') or {}).get('@graph') or [{}]
        description = graph[0].get('d3f:definition')

        out.append({
            'id':          def_id,
            'label':       label,
            'description': description,
            'url':         f'https://d3fend.mitre.org/technique/d3f:{def_id}',
        })
    return out
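
To make the payload shape concrete, here is a minimal sketch that walks a hand-written sample shaped like the off_to_def bindings described above. The sample URIs and labels are illustrative assumptions rather than captured API output, and extract_defensive_ids is a hypothetical helper name. If the same defensive technique appears in more than one binding row, the sketch also collapses the repeats:

```python
# Hand-written sample shaped like the mapping payload described above.
# The URIs and labels are illustrative assumptions, not real API output.
SAMPLE = {
    "off_to_def": {
        "results": {
            "bindings": [
                {
                    "def_tech": {"type": "uri", "value": "http://example.org/d3fend#ProcessSpawnAnalysis"},
                    "def_tech_label": {"type": "literal", "value": "Process Spawn Analysis"},
                },
                {
                    # No label: skipped, mirroring the check in Step 3.
                    "def_tech": {"type": "uri", "value": "http://example.org/d3fend#Unlabelled"},
                },
                {
                    # Repeat of the first technique in a second row.
                    "def_tech": {"type": "uri", "value": "http://example.org/d3fend#ProcessSpawnAnalysis"},
                    "def_tech_label": {"type": "literal", "value": "Process Spawn Analysis"},
                },
            ]
        }
    }
}


def extract_defensive_ids(payload: dict) -> list[tuple[str, str]]:
    """Return unique (id, label) pairs in first-seen order."""
    bindings = payload.get("off_to_def", {}).get("results", {}).get("bindings", [])
    seen: set[str] = set()
    out: list[tuple[str, str]] = []
    for binding in bindings:
        label = binding.get("def_tech_label", {}).get("value")
        uri = binding.get("def_tech", {}).get("value")
        if not (label and uri):
            continue
        def_id = uri.rsplit("#", 1)[-1]  # the ID is the fragment after '#'
        if def_id not in seen:
            seen.add(def_id)
            out.append((def_id, label))
    return out


print(extract_defensive_ids(SAMPLE))
```

Collapsing repeats before calling the detail endpoint keeps the number of requests down on a cold cache.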

Step 4 — Decorate the mapped ATT&CK data

Iterate the mapped techniques from Part 2 and add a d3fend field. The API is fast but caching is essential — the first run takes minutes, subsequent runs are seconds:

def add_d3fend(mapped: list[dict]) -> list[dict]:
    for i, tech in enumerate(mapped, start=1):
        tech['d3fend'] = defensive_techniques_for(tech['technique_id'])
        if i % 50 == 0:
            logger.info('D3FEND mapped %d/%d', i, len(mapped))
    return mapped

Step 5 — Deduplicate references to control file size

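Naively decorated, the combined output bloats to ~80 MB on the current dataset. Most of the bulk is references and authors that repeat across techniques. The function below assumes each d3fend entry also carries the references and authors lists from its detail payload, which the trimmed-down Step 3 example does not copy across. A lookup table cuts the size sharply: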

def deduplicate_references(mapped: list[dict]) -> dict:
    # key -> small integer ID, plus ID -> stored value for the metadata block.
    # Note: the entries in `mapped` are rewritten in place.
    ref_ids:    dict[str, int] = {}
    ref_values: dict[int, object] = {}
    author_ids: dict[str, int] = {}

    for tech in mapped:
        for d3f in tech.get('d3fend', []):
            new_refs = []
            for ref in d3f.get('references', []):
                # Dict references are keyed by URL, plain strings by their value
                key = ref['url'] if isinstance(ref, dict) else ref
                if key not in ref_ids:
                    ref_ids[key] = len(ref_ids) + 1
                    # Keep the full record so titles and other fields survive
                    ref_values[ref_ids[key]] = ref
                new_refs.append(ref_ids[key])
            d3f['references'] = new_refs

            new_authors = []
            for author in d3f.get('authors', []):
                if author not in author_ids:
                    author_ids[author] = len(author_ids) + 1
                new_authors.append(author_ids[author])
            d3f['authors'] = new_authors

    return {
        'techniques': mapped,
        'metadata': {
            'references': ref_values,
            'authors':    {v: k for k, v in author_ids.items()},
        },
    }

The replacement integers are small; the canonical reference data lives once in metadata.references. Output drops from ~80 MB to ~20 MB.
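
Consumers of the deduplicated file need to resolve the integers again. The sketch below shows one way to do that, assuming the document has the techniques/metadata shape returned by deduplicate_references and has been through a JSON round trip (which turns the integer keys into strings). expand_references is a hypothetical helper name and the toy data is invented for illustration:

```python
import json


def expand_references(doc: dict) -> list[dict]:
    """Return a copy of doc['techniques'] with integer IDs resolved again."""
    meta = doc.get("metadata", {})
    # JSON object keys are always strings, so normalise them back to int.
    refs = {int(k): v for k, v in meta.get("references", {}).items()}
    authors = {int(k): v for k, v in meta.get("authors", {}).items()}

    expanded = json.loads(json.dumps(doc["techniques"]))  # simple deep copy
    for tech in expanded:
        for d3f in tech.get("d3fend", []):
            d3f["references"] = [refs[i] for i in d3f.get("references", [])]
            d3f["authors"] = [authors[i] for i in d3f.get("authors", [])]
    return expanded


# Toy document (invented values) after a JSON round trip.
doc = json.loads(json.dumps({
    "techniques": [
        {"technique_id": "T0001", "d3fend": [{"id": "A", "references": [1, 2], "authors": [1]}]},
        {"technique_id": "T0002", "d3fend": [{"id": "B", "references": [2], "authors": [1]}]},
    ],
    "metadata": {
        "references": {1: "https://example.org/ref-a", 2: "https://example.org/ref-b"},
        "authors": {1: "Example Author"},
    },
}))

restored = expand_references(doc)
print(restored[1]["d3fend"][0]["references"])
```

Resolving at read time keeps the file on disk small while reports and dashboards still see the full values.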

Step 6 — Strip empty values

A small recursive cleanup removes null, empty strings, empty lists, and empty dicts before serialisation. Saves another couple of megabytes and makes the output noticeably easier to read:

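EMPTY = (None, '', [], {})

def strip_empty(value):
    # Clean children first so containers that become empty are dropped too
    if isinstance(value, dict):
        cleaned = {k: strip_empty(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if v not in EMPTY}
    if isinstance(value, list):
        cleaned = [strip_empty(v) for v in value]
        return [v for v in cleaned if v not in EMPTY]
    return value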

Step 7 — Act on the gap analysis

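Once the dataset is enriched, the most useful query is the gap list: which ATT&CK techniques have no D3FEND coverage, ranked by group usage. Those are the high-leverage defensive investments. Because a failed fetch in Step 3 also produces an empty list, check the log for fetch warnings before treating an empty d3fend field as a real gap: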

uncovered = [
    t for t in mapped
    if not t.get('d3fend') and len(t.get('groups', [])) >= 5
]
uncovered.sort(key=lambda t: len(t['groups']), reverse=True)
for t in uncovered[:20]:
    print(t['technique_id'], t['name'], 'groups:', len(t['groups']))

The top of that list is where security investment maps directly onto reduction of real-world attacker capability. Take it to your next strategy review.
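
The same enriched data can be read from the defensive side: for each D3FEND technique, which ATT&CK techniques does it counter and how many groups use them? The sketch below is one possible way to build that view. It assumes groups is a list of hashable identifiers (for example group IDs) and that each d3fend entry has an id, as produced in Step 3; defensive_reach and the toy records are hypothetical:

```python
from collections import defaultdict


def defensive_reach(mapped: list[dict]) -> list[tuple[str, int, int]]:
    """Rank defensive techniques as (id, attack techniques covered, distinct groups)."""
    covered = defaultdict(set)  # defensive id -> ATT&CK technique ids it counters
    groups = defaultdict(set)   # defensive id -> groups using those techniques
    for tech in mapped:
        for d3f in tech.get("d3fend", []):
            covered[d3f["id"]].add(tech["technique_id"])
            groups[d3f["id"]].update(tech.get("groups", []))
    rows = [(d, len(covered[d]), len(groups[d])) for d in covered]
    # Most groups first, then most techniques, then ID for a stable order.
    return sorted(rows, key=lambda r: (-r[2], -r[1], r[0]))


# Toy records (invented IDs) in the shape used throughout this post.
toy = [
    {"technique_id": "T0001", "groups": ["G1", "G2"], "d3fend": [{"id": "DefA"}, {"id": "DefB"}]},
    {"technique_id": "T0002", "groups": ["G2", "G3"], "d3fend": [{"id": "DefA"}]},
    {"technique_id": "T0003", "groups": ["G4"], "d3fend": []},
]

for row in defensive_reach(toy):
    print(row)
```

Read alongside the gap list, this shows which existing controls carry the most weight and which uncovered techniques none of them reach.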

Frequently Asked Questions

Is D3FEND a replacement for ATT&CK's mitigations?

No — they overlap but are not interchangeable. ATT&CK mitigations are broad, often process-level recommendations. D3FEND is a structured ontology of defensive techniques with more granularity. Use ATT&CK mitigations for high-level mapping; use D3FEND for detailed control-design work.

Is the D3FEND API rate-limited?

Not in a documented way at the time of writing, but it is a free service hosted by MITRE. Cache locally, respect the service, and back off if you start seeing failures. The mapping changes slowly; weekly refresh is plenty.
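
Backing off can be as simple as the sketch below, which wraps any fetch function that returns None on failure (as fetch_d3fend_mapping in Step 1 does) and doubles the wait between attempts. fetch_with_backoff, its parameters, and the flaky stand-in are hypothetical names chosen for illustration; the sleep function is injectable so the demo runs instantly:

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[str], Optional[dict]],
    key: str,
    retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch(key); after a None result wait base_delay * 2**attempt and retry."""
    for attempt in range(retries):
        result = fetch(key)
        if result is not None:
            return result
        if attempt < retries - 1:  # no point sleeping after the final attempt
            sleep(base_delay * 2 ** attempt)
    return None


# Demo with a stand-in fetcher that fails twice, then succeeds.
attempts = {"n": 0}


def flaky(key: str) -> Optional[dict]:
    attempts["n"] += 1
    return {"id": key} if attempts["n"] >= 3 else None


delays: list[float] = []
result = fetch_with_backoff(flaky, "T1059", sleep=delays.append)
print(result, delays)
```

Because Step 1 only writes the cache on success, a retried call that eventually succeeds is cached like any other.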

Why are the API payloads SPARQL-shaped?

D3FEND is built on top of an OWL ontology, which is queried in SPARQL natively. The JSON wrapper is a thin serialisation of SPARQL result bindings. Once you know the shape, the data is straightforward; the shape itself is just legacy from the underlying technology.

Can I query D3FEND directly with SPARQL?

Yes — D3FEND publishes an OWL/RDF dataset you can load into a local triplestore (Apache Jena, Blazegraph, GraphDB). Useful for advanced analysts; not necessary for typical defender-side mapping. The JSON API covers the common cases.

How big is the final combined dataset?

After mapping, deduplicating references, and stripping empty values, the combined ATT&CK + D3FEND JSON sits at about 20 MB for the current enterprise dataset. Compress on disk if you check it into git.
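
A minimal sketch of the compress-on-disk step, using only the standard library. The helper names, file name, and toy payload are assumptions for illustration; the real combined dataset would be passed in instead:

```python
import gzip
import json
import os
import tempfile


def save_json_gz(obj, path: str) -> None:
    """Write obj as compact JSON inside a gzip file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(obj, fh, separators=(",", ":"))


def load_json_gz(path: str):
    """Read a gzip-compressed JSON file back into Python objects."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


# Toy payload (invented) standing in for the combined dataset.
sample = {"techniques": [{"technique_id": f"T{i:04d}", "d3fend": []} for i in range(1000)]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "combined.json.gz")
    save_json_gz(sample, path)
    round_trip = load_json_gz(path)
    compressed_size = os.path.getsize(path)

raw_size = len(json.dumps(sample).encode("utf-8"))
print(raw_size, compressed_size)
```

The ratio on real data depends on how repetitive it still is after deduplication, so measure it on your own output rather than relying on the toy numbers.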

Conclusion

ATT&CK without D3FEND is half the picture; pairing them gives you both halves and lets you do useful gap analysis on the defensive side. The API has rough edges — SPARQL-shaped responses, repeated references, deeply nested descriptions — but they are the kind of thing a small loader + deduplicator pipeline handles once and never bothers you about again. The output is something you can hand to a security architect and have a useful conversation about which controls to build next, which is what these frameworks are for.

Related Posts

Authoritative references: MITRE D3FEND and D3FEND API documentation.

PowerShell Quick Guide: Process Investigation

PowerShell process investigation sits between two extremes: built-in Get-Process tells you almost nothing useful for triage, while a full EDR product does the job but is not always available on the box in front of you. The middle ground — a handful of Get-CimInstance and Get-NetTCPConnection queries — is what we reach for first when something looks off on a Windows endpoint and we need to decide whether to escalate.

Key Takeaways

  • Get-Process alone is not enough for defensive work; the command line and parent PID live on Win32_Process via Get-CimInstance, and signature data comes from Get-AuthenticodeSignature.
  • Parent-child relationships are the fastest way to spot living-off-the-land abuse, like Office or Outlook spawning powershell.exe.
  • Command-line inspection catches encoded PowerShell, suspicious arguments, and binaries running out of %TEMP% or %APPDATA%.
  • Authenticode signature checks separate signed Microsoft binaries from third-party or unsigned files in unusual locations.
  • Use these techniques for authorised triage on systems you administer. For production monitoring, integrate the same signals into Microsoft Defender for Endpoint or another EDR rather than relying on ad-hoc scripts.

Environment

  • Windows 10 22H2 or Windows 11 23H2 endpoint, Windows Server 2019 or later.
  • Windows PowerShell 5.1 or PowerShell 7.4 — every cmdlet used below works on both.
  • Local administrator rights, since command lines and signature paths require elevation for non-current-user processes.
  • Optional but useful: Sysmon for richer process telemetry, Microsoft Defender for Endpoint for production-grade hunting.

The Problem

Most blog posts about "PowerShell threat hunting" stop at Get-Process | Sort-Object CPU -Descending, which is fine for finding the browser tab eating your battery but tells you nothing about an attacker. The information a defender actually wants — who started this, with what arguments, signed by whom, talking to which IP — is split across at least four different sources: the WMI/CIM Win32_Process class, Authenticode signature data, the network stack via Get-NetTCPConnection, and registry-based startup locations.

The goal of this guide is to wire those sources together into a few short scripts you can run during an authorised incident triage, in a lab, or on a system you own to baseline normal behaviour.

The Solution

Step 1 — Get the data that Get-Process hides

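In Windows PowerShell 5.1, Get-Process returns a .NET System.Diagnostics.Process object that does not include the command line or parent PID (PowerShell 7 adds CommandLine and Parent properties to that object, but they are absent on 5.1). Get-CimInstance Win32_Process exposes both on either version, and it returns properties as plain values that play well with Select-Object: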

Get-CimInstance Win32_Process |
    Select-Object ProcessId,
                  ParentProcessId,
                  Name,
                  CommandLine,
                  CreationDate,
                  @{Name='Owner';Expression={ ($_ | Invoke-CimMethod -MethodName GetOwner).User }} |
    Sort-Object CreationDate -Descending

Note that CommandLine is $null for processes you do not have rights to read — usually services running as another user when you are not elevated. If you see a lot of empty command-line columns, re-run the console as administrator.

Step 2 — Map the parent-child tree

Attackers very often live off the land. winword.exe spawning cmd.exe spawning powershell.exe is the textbook macro-execution chain, and it is invisible until you draw the tree. Resolving each process's parent name against the same snapshot surfaces the direct children of Office applications:

$procs = Get-CimInstance Win32_Process
$procs |
    Select-Object ProcessId,
                  Name,
                  CommandLine,
                  @{Name='Parent';Expression={
                      ($procs | Where-Object ProcessId -eq $_.ParentProcessId).Name
                  }} |
    Where-Object Parent -in 'winword.exe','excel.exe','outlook.exe','powerpnt.exe' |
    Format-Table -AutoSize

An Office product is not supposed to be the parent of cmd.exe, powershell.exe, wscript.exe, mshta.exe, or regsvr32.exe. If the table is not empty, that is the lead you investigate first.
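
For offline analysis, the same idea extends to full ancestry, which also catches grandchildren such as powershell.exe under cmd.exe under winword.exe. The Python sketch below assumes the process list was exported with something like Get-CimInstance Win32_Process | Select-Object ProcessId, ParentProcessId, Name, CommandLine | ConvertTo-Json; the records and helper names are invented for illustration:

```python
import json

OFFICE = {"winword.exe", "excel.exe", "outlook.exe", "powerpnt.exe"}


def lineage(by_pid: dict, pid: int) -> list[str]:
    """Names from pid up to the oldest ancestor present in the snapshot."""
    chain, seen = [], set()
    while pid in by_pid and pid not in seen:  # 'seen' guards against PID-reuse loops
        seen.add(pid)
        chain.append(by_pid[pid]["Name"])
        pid = by_pid[pid]["ParentProcessId"]
    return chain


def office_descendants(records: list[dict]) -> list[tuple[int, str, list[str]]]:
    """Processes with an Office application anywhere among their ancestors."""
    by_pid = {r["ProcessId"]: r for r in records}  # index once, look up many times
    hits = []
    for r in records:
        ancestors = lineage(by_pid, r["ParentProcessId"])
        if any(name.lower() in OFFICE for name in ancestors):
            hits.append((r["ProcessId"], r["Name"], ancestors))
    return hits


# Invented sample in the shape of the exported JSON.
exported = json.loads("""
[
  {"ProcessId": 100, "ParentProcessId": 4,   "Name": "explorer.exe",   "CommandLine": "explorer.exe"},
  {"ProcessId": 200, "ParentProcessId": 100, "Name": "WINWORD.EXE",    "CommandLine": "winword.exe report.docx"},
  {"ProcessId": 300, "ParentProcessId": 200, "Name": "cmd.exe",        "CommandLine": "cmd.exe /c powershell.exe -nop"},
  {"ProcessId": 400, "ParentProcessId": 300, "Name": "powershell.exe", "CommandLine": "powershell.exe -nop"}
]
""")

for pid, name, ancestors in office_descendants(exported):
    print(pid, name, "<-", " <- ".join(ancestors))
```

Parent PIDs are only meaningful within one snapshot, because Windows reuses PIDs once a process exits.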

Step 3 — Inspect command lines for the obvious red flags

PowerShell encoded commands, -WindowStyle Hidden, -ExecutionPolicy Bypass, and outbound Invoke-WebRequest calls show up in command-line strings even when the script body never touches disk:

$flags = @(
    '-enc',           # short for -EncodedCommand
    '-encodedcommand',
    'frombase64string',
    'iex',
    'downloadstring',
    'downloadfile',
    'invoke-webrequest',
    '-w hidden',
    '-windowstyle hidden',
    '-nop',
    '-noprofile',
    '-executionpolicy bypass'
)

Get-CimInstance Win32_Process |
    Where-Object CommandLine |
    ForEach-Object {
        $cl = $_.CommandLine.ToLower()
        foreach ($f in $flags) {
            if ($cl -like "*$f*") {
                [PSCustomObject]@{
                    Pid     = $_.ProcessId
                    Parent  = $_.ParentProcessId
                    Name    = $_.Name
                    Trigger = $f
                    Command = $_.CommandLine
                }
                break
            }
        }
    }

False positives exist — your own deployment tooling may legitimately call -NoProfile -ExecutionPolicy Bypass. Baseline a clean system, list the expected hits, and treat anything outside that list as worth a second look.
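
Once a hit contains an encoded command, the next step is reading it. -EncodedCommand takes Base64 over UTF-16LE text, so decoding is short. The Python sketch below is for offline analysis of captured command lines; decode_encoded_command is a hypothetical helper, and the prefix matching reflects the abbreviated forms such as -e and -enc that commonly appear in the wild:

```python
import base64
from typing import Optional


def decode_encoded_command(command_line: str) -> Optional[str]:
    """Return the decoded script if the command line carries an encoded command."""
    tokens = command_line.split()
    for flag, value in zip(tokens, tokens[1:]):
        if flag[0] not in "-/":
            continue
        name = flag.lstrip("-/").lower()
        # Accept abbreviated spellings (-e, -enc, ...) plus the -ec shorthand.
        if name and ("encodedcommand".startswith(name) or name == "ec"):
            try:
                return base64.b64decode(value, validate=True).decode("utf-16-le")
            except ValueError:  # covers invalid Base64 and invalid UTF-16
                return None
    return None


# Build a sample the same way PowerShell expects it: UTF-16LE, then Base64.
payload = base64.b64encode("Write-Output 'hello'".encode("utf-16-le")).decode("ascii")
sample = f"powershell.exe -NoProfile -enc {payload}"

print(decode_encoded_command(sample))
```

Decode on an analysis workstation rather than on the suspect host, and never execute the decoded text.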

Step 4 — Verify Authenticode signatures

Unsigned binaries living in %TEMP%, %APPDATA%, or %PROGRAMDATA% are statistically far more interesting than signed Microsoft binaries in C:\Windows\System32:

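Get-Process |
    Where-Object Path |
    ForEach-Object {
        # Query the signature once per process rather than once per column
        $sig = Get-AuthenticodeSignature -FilePath $_.Path
        [PSCustomObject]@{
            Name   = $_.Name
            Id     = $_.Id
            Path   = $_.Path
            Signer = $sig.SignerCertificate.Subject
            Status = $sig.Status
        }
    } |
    Where-Object {
        # Keep anything in a user-writable location or without a valid signature
        $_.Path -like "$env:TEMP*" -or
        $_.Path -like "$env:APPDATA*" -or
        $_.Path -like "$env:ProgramData*" -or
        $_.Status -ne 'Valid'
    }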

Combine the signature check with the parent-child tree from Step 2 and you have a short list of the processes worth investigating first.

Step 5 — Tie processes to network connections

A process running with no network activity is rarely interesting. A process you have never heard of with an established connection to a hosting-provider IP is worth a closer look:

Get-NetTCPConnection -State Established |
    Where-Object { $_.RemoteAddress -notmatch '^10\.|^172\.(1[6-9]|2[0-9]|3[01])\.|^192\.168\.|^127\.|^::1' } |
    Select-Object LocalPort,
                  RemoteAddress,
                  RemotePort,
                  @{Name='Process';Expression={
                      (Get-Process -Id $_.OwningProcess -ErrorAction SilentlyContinue).Name
                  }},
                  @{Name='Path';Expression={
                      (Get-Process -Id $_.OwningProcess -ErrorAction SilentlyContinue).Path
                  }} |
    Sort-Object RemoteAddress

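The regex filters the RFC 1918 ranges plus IPv4 and IPv6 loopback, leaving mostly externally routable destinations. It does not exclude link-local (169.254.0.0/16, fe80::/10) or IPv6 unique-local (fc00::/7) addresses, so add those if they are noisy in your environment. Legitimate update services and telemetry endpoints will also show up here, so baseline before you alert.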
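
When the connection list is exported for offline review, address classification is easier to get right with a library than with a regex. The Python sketch below uses the standard ipaddress module; is_external is a hypothetical helper name and the sample addresses are arbitrary:

```python
import ipaddress


def is_external(address: str) -> bool:
    """True only for globally routable addresses; False for anything unparseable."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    # is_global is False for private, loopback, and link-local ranges alike.
    return ip.is_global


samples = ["8.8.8.8", "10.1.2.3", "172.16.0.1", "172.32.0.1",
           "192.168.1.1", "127.0.0.1", "169.254.1.1", "::1", "fe80::1"]

for addr in samples:
    print(f"{addr:<14} external={is_external(addr)}")
```

Note that 172.32.0.1 is external: the private block is 172.16.0.0/12, which ends at 172.31.255.255.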

Step 6 — Check the persistence surface

If a suspicious process keeps coming back after a reboot, it has a persistence anchor somewhere. Win32_StartupCommand only covers the legacy startup folders and the basic Run keys, but it is a one-line first pass:

Get-CimInstance Win32_StartupCommand |
    Select-Object Name, Command, Location, User

Scheduled tasks and WMI event subscriptions are the next places to look — covered in our scheduled-task detection post linked below.

Frequently Asked Questions

Why does CommandLine come back empty for some processes?

Reading another user's process command line requires either local administrator rights or the same token as the process owner. Empty values almost always mean an unelevated console. Re-run as administrator.

Is Get-WmiObject still safe to use?

It works on 5.1 and produces the same data, but it is deprecated and absent from PowerShell 7. Use Get-CimInstance for new code — it talks WinRM rather than DCOM, plays better with firewalls, and is the supported path going forward.

Is this a replacement for an EDR?

No. These queries are point-in-time snapshots; an EDR captures the full process tree over time, correlates with file and registry activity, and stores the result for retrospective hunting. Use these scripts for triage on machines you administer or to learn what the signals look like. For continuous monitoring, deploy Microsoft Defender for Endpoint, Sysmon plus a SIEM, or an equivalent commercial agent.

How do I run this remotely against a fleet?

Wrap any of the snippets above in Invoke-Command -ComputerName $list -ScriptBlock { ... }. PowerShell remoting carries the objects back deserialised, so most of the downstream pipeline still works. See the remote management guide linked below.

Are these queries safe to run on a live production server?

Yes — they are read-only and add negligible CPU. The risk is operational: if you are connected over a slow link, Get-CimInstance Win32_Process over the wire can take longer than expected when the host has thousands of processes. Limit the query with -Filter for tight scopes.

Conclusion

Most useful endpoint triage comes from four signals: who started the process, what arguments it has, who signed the binary, and what it is talking to on the network. PowerShell exposes all four in a few cmdlets, and the patterns above are the ones we keep recycling. They are not a substitute for proper endpoint telemetry, but on a system you administer they are enough to decide whether something deserves a deeper look or a clean bill of health.

Save the snippets as a profile module, baseline a known-clean machine, and the next time something feels off you will get to a useful answer in minutes rather than hours.

Related Posts

Authoritative reference for the WMI class used above: Win32_Process on Microsoft Learn.

PowerShell Quick Guide: Remote Management Basics

PowerShell remoting is the difference between fixing one machine and fixing two hundred in the same amount of time. The cmdlets themselves are short — Invoke-Command, Enter-PSSession, New-PSSession — but the surrounding plumbing (WinRM listeners, TrustedHosts, double-hop auth, HTTPS certificates) is where most setups quietly go sideways. This guide walks through the working baseline we deploy on domain-joined endpoints, including the security knobs that are worth tightening before this becomes someone else's lateral-movement vector.

Key Takeaways

  • PowerShell remoting is built on WinRM, which speaks WS-Management over HTTP/5985 or HTTPS/5986.
  • Invoke-Command is one-shot, Enter-PSSession is interactive, and New-PSSession is the right choice when you want to reuse a session.
  • Outside a domain, TrustedHosts must be configured explicitly; inside one, Kerberos handles auth automatically.
  • HTTP/5985 is acceptable inside a tunnel or VPN; for anything traversing untrusted networks, configure HTTPS/5986 with a real certificate.
  • Remoting is also an attacker tool. Restrict the WinRM firewall rule to management subnets, log session usage, and avoid leaving Enable-PSRemoting on standalone machines that do not need it.

Environment

  • Windows 10/11 and Windows Server 2019/2022 endpoints, all domain-joined to an Active Directory forest.
  • Windows PowerShell 5.1 and PowerShell 7.4. Both speak the same WinRM protocol, but PowerShell 7 needs its own session configuration (endpoint) on the target, registered by running Enable-PSRemoting from pwsh.
  • Microsoft Entra ID joined devices are supported via the SSH-based PowerShell remoting path; classic WinRM remoting requires line-of-sight to a domain controller or a configured Kerberos trust.

The Problem

The cmdlet documentation makes remoting look like a one-liner: Enable-PSRemoting on the target, Enter-PSSession from the client, done. In practice the failures show up later — workgroup machines refusing the connection, second-hop authentication errors when the remote script tries to touch a file share, HTTPS certificates not trusted by the client, double-encrypted sessions through a VPN that thinks WinRM traffic is suspicious. The pattern below avoids most of those.

The Solution

Step 1 — Confirm WinRM is reachable

Before anything else, prove the target is listening. Test-WSMan tells you whether the remote WinRM service answers; Test-NetConnection tells you whether the port is even open:

$target = 'fileserver01.corp.example.com'

Test-WSMan -ComputerName $target

Test-NetConnection -ComputerName $target -Port 5985  # HTTP
Test-NetConnection -ComputerName $target -Port 5986  # HTTPS

If Test-NetConnection reports TcpTestSucceeded : False, the problem is firewall or routing, not PowerShell. If the port is open but Test-WSMan fails, the WinRM service is either not running or has no listener bound to the network interface you reached it on.
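
From a management host without PowerShell, the port-level half of that check can be approximated with a plain TCP connect. The Python sketch below only answers "is the port open", like Test-NetConnection; it says nothing about whether WinRM itself is healthy. port_open is a hypothetical helper, and the demo uses a throwaway local listener instead of a real server:

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name resolution failed
        return False


# Demo against a temporary local listener on an ephemeral port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

open_result = port_open("127.0.0.1", demo_port)
listener.close()
closed_result = port_open("127.0.0.1", demo_port, timeout=1.0)

print(open_result, closed_result)
```

Against a real target the call would be port_open('fileserver01.corp.example.com', 5985) or the same with 5986.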

Step 2 — Run a one-shot command with Invoke-Command

Invoke-Command is the right tool when you want a result back and do not need to stay connected. It happily accepts an array of computer names and fans the call out in parallel:

$servers = 'web01','web02','web03'

Invoke-Command -ComputerName $servers -ScriptBlock {
    Get-Service W3SVC |
        Select-Object @{Name='Host';Expression={$env:COMPUTERNAME}}, Status, StartType
}

Output is deserialised on the way back, which means downstream Where-Object and Sort-Object work but methods on the returned objects do not. If you need to call a method, do it inside the script block.

Step 3 — Start a persistent session with New-PSSession

When you need to issue several commands and avoid the per-call connection overhead, hold the session in a variable and reuse it. Always remove the session when finished, even if a script errors:

$session = New-PSSession -ComputerName 'web01'

try {
    Invoke-Command -Session $session -ScriptBlock { Get-Service }
    Invoke-Command -Session $session -ScriptBlock { Get-WinEvent -LogName System -MaxEvents 50 }
}
finally {
    Remove-PSSession -Session $session
}

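An idle session times out after two hours by default (an IdleTimeout of 7,200,000 ms on the standard session configuration). To change that for long-running orchestration, set -IdleTimeout on New-PSSessionOption and pass the result to New-PSSession with -SessionOption.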

Step 4 — Handle credentials cleanly

Inside the same domain Kerberos handles authentication automatically — your current ticket is reused. When you need a different identity (another forest, an emergency local account, a delegated admin), prompt once and reuse:

$cred = Get-Credential -Message 'Credentials for remote management'

Invoke-Command -ComputerName 'server01' -Credential $cred -ScriptBlock {
    Get-WinEvent -LogName Security -MaxEvents 10
}

Never store the password in a plain string. Get-Credential returns a PSCredential with a SecureString backing it. For unattended scripts, use a Group Managed Service Account or an Azure Key Vault-backed secret rather than serialising credentials to disk.

Step 5 — Configure TrustedHosts for non-domain clients

Outside a domain, the client cannot use Kerberos and falls back to NTLM, which requires the target to be on the local TrustedHosts list:

# Add a single host
Set-Item WSMan:\localhost\Client\TrustedHosts -Value 'lab-server' -Concatenate -Force

# Or a group of hosts by name wildcard (use sparingly)
Set-Item WSMan:\localhost\Client\TrustedHosts -Value 'lab-*.example.local' -Concatenate -Force

# Always review the resulting list
(Get-Item WSMan:\localhost\Client\TrustedHosts).Value

TrustedHosts is per-client, not per-target. Treat it as an opt-in list for systems you actually plan to talk to, not as a permanent wildcard.

Step 6 — Switch to HTTPS for anything off-LAN

HTTP/5985 traffic is authenticated and message-encrypted by default, but the metadata still leaks. For management traffic that crosses untrusted segments, configure a real HTTPS listener with a certificate issued by your internal CA:

# On the target, with an existing cert thumbprint in LocalMachine\My
$thumb = (Get-ChildItem Cert:\LocalMachine\My |
    Where-Object Subject -like '*CN=server01*' |
    Select-Object -First 1).Thumbprint

# Create the HTTPS listener through the WSMan: provider, which avoids winrm.cmd quoting issues
New-Item -Path WSMan:\localhost\Listener -Transport HTTPS -Address * `
    -HostName 'server01.corp.example.com' -CertificateThumbPrint $thumb -Force

# Open the firewall
New-NetFirewallRule -DisplayName 'WinRM HTTPS' -Name 'WinRM-HTTPS-In-TCP' `
    -Direction Inbound -Protocol TCP -LocalPort 5986 -Action Allow `
    -RemoteAddress 10.0.0.0/8

The firewall rule restricts inbound 5986 by source address. The 10.0.0.0/8 range shown here is a placeholder, so substitute your actual management subnet. Leave 5986 open to the world only if you have a very specific reason and a corresponding network ACL upstream.

Step 7 — Copy files over an existing session

Once you have a session, Copy-Item can move files in either direction without setting up SMB or a separate transfer agent:

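$session = New-PSSession -ComputerName 'web01'

try {
    # Push a build artifact to the remote host
    Copy-Item -Path 'C:\Build\release.zip' `
              -Destination 'C:\Deploy\' `
              -ToSession $session

    # Pull a log file back for triage
    Copy-Item -Path 'C:\Logs\app.log' `
              -Destination 'C:\Triage\web01-app.log' `
              -FromSession $session
}
finally {
    # Close the session even if a copy fails
    Remove-PSSession -Session $session
}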

This rides the existing WinRM channel, so any firewall rule that permits remoting also permits file transfer. It is slower than SMB but works in environments where SMB is locked down.

Frequently Asked Questions

What is the difference between Enter-PSSession and Invoke-Command?

Enter-PSSession opens an interactive remote prompt — you type, the remote runs, you see the output as if you were on the box. Invoke-Command sends a script block, runs it remotely, and returns the result to your local pipeline. Use the first for hands-on troubleshooting and the second for automation.

Why does my remote script work locally but fail when it touches a file share?

This is the classic double-hop problem. The first hop authenticates you with Kerberos, but the credential does not delegate to a second remote server by default. The fixes are CredSSP (broad, less safe) or resource-based Kerberos constrained delegation (narrow, recommended). Microsoft documents both in detail.

Do I need PowerShell 7 to use remoting?

No — Windows PowerShell 5.1 ships with remoting enabled by default on Windows Server. PowerShell 7 adds SSH-based remoting as an alternative transport, which is what you want when targeting Linux machines or working across firewalled boundaries where WinRM is blocked.

Is HTTP/5985 actually insecure?

The payload is authenticated and message-level encrypted with the user's session key, so on a trusted LAN it is fine. The case for HTTPS/5986 is defence in depth and protection against downgrade attacks if Kerberos is unavailable. Treat 5985 as acceptable inside a controlled network, 5986 as required for anything else.

Can attackers abuse PowerShell remoting?

Yes — it is a common lateral-movement primitive. The mitigations are the same as for any privileged remote access: restrict the WinRM firewall rule to management subnets, require admin group membership for connection, log session usage via PowerShell module logging and script block logging, and audit Windows Event ID 4104 in the PowerShell operational log.

Conclusion

The point of remoting is not the cmdlets — it is the operational reach they give you when something needs fixing across more than one machine. Get the listener story right, prefer Kerberos inside a domain, scope TrustedHosts tightly outside one, and treat HTTPS as the default for anything that touches an untrusted network. Done that way, remoting is one of the most valuable tools in a Windows admin's kit. Done badly, it is a quiet path for lateral movement.

Related Posts

Microsoft's reference architecture for remoting and the security trade-offs lives at Running Remote Commands on Microsoft Learn.

PowerShell Quick Guide: Managing Event Log Sizes and Retention

Windows event log size and retention are settings nobody thinks about until an incident hits and the events that would have closed the investigation rolled off the channel three days ago. The defaults (20 MB for the classic Application, System, and Security logs, often less for other channels, with circular overwrite) are sized for a desktop running Word, not for a Domain Controller logging fifty Kerberos events per second. This post is the configuration we apply to every Windows server we build before it sees production traffic.

Key Takeaways

  • The default 20 MB Security log on a busy Domain Controller can roll over in hours. Plan for 1–4 GB on DCs and at least 256 MB on workstations.
  • Three retention modes matter: Circular (default, overwrites oldest), AutoBackup (rolls into an archive file), and Retain (refuses new events when full).
  • Configure size and retention with wevtutil sl for any channel, or Limit-EventLog for the classic logs only (Windows PowerShell 5.1; the cmdlet is not available in PowerShell 7). Manage at scale with Group Policy under Computer Configuration → Administrative Templates → Windows Components → Event Log Service.
  • Disk space is cheap relative to forensic value. Size the log to hold at least the audit window your incident response process requires — 30 days is a sensible floor.
  • Forward critical logs to a central collector with Windows Event Forwarding or a SIEM. Local sizing is a fallback, not the strategy.

Environment

  • Windows Server 2019/2022 (Domain Controllers, member servers) and Windows 10/11 workstations.
  • Windows PowerShell 5.1 or PowerShell 7.4.
  • Local administrator rights — every command below modifies a system-scope setting.
  • Group Policy editing rights if rolling out across the estate.

The Problem

The Windows event log defaults were last sensibly sized when servers had 4 GB of RAM and a 36 GB SCSI disk. A modern DC writes thousands of 4624 and 4634 events per minute. A 20 MB log holds a few hours of that volume before circular overwrite kicks in and the events you needed last Tuesday are gone. The audit trail that Microsoft and your compliance framework assume exists simply does not.

The fix is two settings, applied per channel: maximum size and retention mode. Get both right once, deploy via GPO, and stop worrying.

The Solution

Step 1 — Inspect the current state

Before changing anything, see what each channel is sized to and how full it is. Get-WinEvent -ListLog returns the same metadata Event Viewer shows under "Properties":

Get-WinEvent -ListLog * |
    Where-Object RecordCount -gt 0 |
    Select-Object LogName,
                  @{Name='SizeMB';    Expression={ [math]::Round($_.FileSize/1MB,1) }},
                  @{Name='MaxMB';     Expression={ [math]::Round($_.MaximumSizeInBytes/1MB,1) }},
                  @{Name='PercentFull';Expression={ [math]::Round(($_.FileSize/$_.MaximumSizeInBytes)*100,1) }},
                  LogMode,
                  IsEnabled |
    Sort-Object PercentFull -Descending |
    Format-Table -AutoSize

Any channel above 80% on a workload you care about is a candidate for resizing. A circular log sitting above 95% is at or near the point where every new event overwrites the oldest one.

Step 2 — Decide the right size per channel

Sizes are a function of event volume and how long you want to retain. As a starting point we use:

  • Domain Controller — Security: 4 GB (auth events dominate, very high volume).
  • Domain Controller — System / Application: 256 MB each.
  • Domain Controller — Directory Service: 1 GB.
  • Member server — Security: 1 GB.
  • Workstation — Security: 256 MB.
  • PowerShell/Operational (everywhere, if script block logging is on): 1 GB.
  • Sysmon/Operational (if Sysmon is deployed): 4 GB.

These are starting points, not gospel. Measure rollover frequency on a representative box for one week and adjust.
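
The "measure and adjust" step is proportional arithmetic. A minimal Python sketch, assuming you noted the maximum size of a log that was already full and the time span between its oldest and newest event; the function name and the 1.25 headroom factor are illustrative assumptions, not values from Microsoft guidance:

```python
import math

def size_for_retention(current_max_mb: float, observed_span_hours: float,
                       target_days: float, headroom: float = 1.25) -> int:
    """Estimate the log size in MB needed to hold target_days of events.

    Assumes the log was full when measured and that event volume stays
    roughly constant. headroom is a hypothetical safety margin.
    """
    if observed_span_hours <= 0:
        raise ValueError("observed_span_hours must be positive")
    target_hours = target_days * 24
    # Multiply first, divide last, to keep the float arithmetic tidy
    needed_mb = current_max_mb * target_hours * headroom / observed_span_hours
    return math.ceil(needed_mb)

# A full 20 MB log that only spans 3 hours, resized for a 30-day window
print(size_for_retention(20, 3, 30))
```

Rerun the estimate after the week of measurement; a burst of noisy auditing during the sample period will inflate the result.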

Step 3 — Resize and change retention with wevtutil

wevtutil works against any channel, classic or modern, and is the only option for the Microsoft-Windows-* channels:

# Set Security log to 4 GB, circular overwrite (the default mode)
wevtutil sl Security /ms:4294967296

# Switch retention to AutoBackup (archives the full log, then starts a fresh one)
# wevtutil requires /rt:true whenever /ab:true is set
wevtutil sl Security /ms:4294967296 /rt:true /ab:true

# Set Sysmon to 4 GB
wevtutil sl 'Microsoft-Windows-Sysmon/Operational' /ms:4294967296

# Inspect a single channel
wevtutil gl Security

The /rt:true flag tells Windows to retain existing events rather than overwrite them. Combined with /ab:true, it produces the AutoBackup mode, which Event Viewer shows as "Archive the log when full, do not overwrite events". /rt:true on its own produces the Retain mode and is rarely what you want, because Windows refuses new events when the log fills. /rt:false is the default circular overwrite.

Step 4 — Roll out via Group Policy

For more than a handful of machines, configure the settings under Computer Configuration → Administrative Templates → Windows Components → Event Log Service. Each classic log (Application, Security, Setup, System) has its own folder with two settings:

  • Specify the maximum log file size (KB) — value in kilobytes; multiply your target MB by 1024.
  • Retain old events — set to Disabled for circular, Enabled with Back up log automatically when full for AutoBackup.
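
The two tools want different units: wevtutil /ms takes bytes, while the Group Policy setting takes kilobytes. A small Python sketch of the conversion for the starting sizes from Step 2; the helper names and the dictionary are illustrative, not part of any tool:

```python
def mb_to_wevtutil_bytes(size_mb: int) -> int:
    """Value for `wevtutil sl <channel> /ms:<bytes>`."""
    return size_mb * 1024 * 1024

def mb_to_gpo_kb(size_mb: int) -> int:
    """Value for "Specify the maximum log file size (KB)"."""
    return size_mb * 1024

# Starting sizes from Step 2, expressed in MB
targets_mb = {
    "DC Security": 4096,
    "DC Directory Service": 1024,
    "Member server Security": 1024,
    "Workstation Security": 256,
}

for name, mb in targets_mb.items():
    print(f"{name}: /ms:{mb_to_wevtutil_bytes(mb)}  GPO KB: {mb_to_gpo_kb(mb)}")
```

Keeping the targets in megabytes and converting at the edge avoids the classic mistake of pasting a byte count into the kilobyte field.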

Microsoft-Windows-* channels are not directly covered by Administrative Templates. For those, use Group Policy Preferences to push wevtutil commands at boot, or run them from a Desired State Configuration / Ansible / Intune script.

Step 5 — Watch for the channel filling up anyway

Even with generous sizing, runaway logging (chatty third-party drivers, broken auditing policy) can fill a log in minutes. Run a scheduled check that alerts when any channel passes a threshold:

$threshold = 80

Get-WinEvent -ListLog * |
    Where-Object { $_.RecordCount -gt 0 -and $_.MaximumSizeInBytes -gt 0 } |
    ForEach-Object {
        $percent = ($_.FileSize / $_.MaximumSizeInBytes) * 100
        if ($percent -ge $threshold) {
            [PSCustomObject]@{
                Host        = $env:COMPUTERNAME
                LogName     = $_.LogName
                PercentFull = [math]::Round($percent,1)
                SizeMB      = [math]::Round($_.FileSize/1MB,1)
                MaxMB       = [math]::Round($_.MaximumSizeInBytes/1MB,1)
            }
        }
    }

Schedule the script every 15 minutes and pipe the output into your existing alerting channel. If a log is over the threshold consistently, either bump the size or fix the upstream noise source.

Step 6 — Export important channels before destructive changes

Any size change keeps existing events, but log clearing does not. Always export first if there is any chance you will need the historical data:

$dir = 'C:\LogArchive'
New-Item -ItemType Directory -Path $dir -Force | Out-Null

wevtutil epl Security "$dir\Security_$(Get-Date -Format 'yyyyMMdd_HHmm').evtx"
wevtutil epl 'Microsoft-Windows-PowerShell/Operational' "$dir\PSOperational_$(Get-Date -Format 'yyyyMMdd_HHmm').evtx"

The exported .evtx opens directly in Event Viewer and is queryable with Get-WinEvent -Path.

Frequently Asked Questions

What is the default size of the Windows Security log?

20 MB on modern Windows. On a Domain Controller that holds a few hours of authentication events. The setting has not been revisited in many Windows generations.

Should I use Circular, AutoBackup, or Retain mode?

Circular for workstations and member servers — old events disappear, no human intervention required. AutoBackup on Domain Controllers and audited servers — full logs roll into archive files for later analysis. Retain only when a compliance requirement explicitly forbids overwrite; expect operational pain when logs fill.

Will increasing the log size affect performance?

No measurable impact during normal operation; the log file is memory-mapped and writes are append-only. The cost is disk space and a slightly slower startup for the Event Log service on very large files.

Where does Windows store the actual log files?

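%SystemRoot%\System32\winevt\Logs\. Each channel is a .evtx file named after the channel, with any "/" in the channel name written as "%4" (for example Microsoft-Windows-Sysmon%4Operational.evtx). Both space planning and backup design should account for that path.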

Is forwarding logs better than local retention?

For incident response and long-term audit, yes. Windows Event Forwarding (free, built-in) or a SIEM agent ships events to a central collector you can query and back up independently. Local retention remains useful for the gap between event generation and forwarding, and for offline analysis.

Conclusion

Sizing event logs is a one-time setup task that pays off the first time you actually need the data. The defaults are wrong for any server doing real work, the GUI is fine for a single host, and Group Policy plus wevtutil covers the rest of the estate. Pair this with central forwarding, schedule a fullness check, and move on. The events will be there when you need them.

Related Posts

Reference for the underlying utility: wevtutil on Microsoft Learn.

PowerShell Quick Guide: Working with Event Logs Like a Pro

PowerShell event log queries are how we actually live in Windows logs day to day. The Event Viewer GUI is fine for clicking through a single host; Get-WinEvent with a properly built FilterHashtable is what makes it possible to ask "which user failed 50 logins in the last hour, on which DC, from which IP" without manually exporting anything. This post is the working baseline we hand to new admins on the team.

Key Takeaways

  • Get-WinEvent -FilterHashtable is dramatically faster than Get-WinEvent | Where-Object because it filters server-side on the EventLog channel rather than after retrieval.
  • The Properties array on each event is the structured payload. Positional access ($_.Properties[5].Value) is the standard pattern; positions differ per event ID and are documented per event.
  • Domain Controllers carry the highest-signal Windows security events; query them remotely with -ComputerName or fan out with Invoke-Command.
  • "No events found" surfaces as an error by default. The error is non-terminating: suppress it with -ErrorAction SilentlyContinue, or pass -ErrorAction Stop if you want try/catch to handle it.
  • Pair this with sensible log sizing and retention — otherwise the events you need are already overwritten by the time you ask.

Environment

  • Windows 10/11 endpoints and Windows Server 2019/2022, including Active Directory Domain Controllers.
  • Windows PowerShell 5.1 or PowerShell 7.4 — both ship the same Get-WinEvent cmdlet.
  • Administrative rights on each target. The Security log is unreadable without them.
  • Advanced audit policy enabled (logon, process creation, account management, policy change) so the interesting event IDs are actually generated.

The Problem

The naive pattern Get-WinEvent -LogName Security | Where-Object Id -eq 4625 pulls every event off the channel and filters in PowerShell. On a busy DC with a few million security events that takes minutes and chews memory. FilterHashtable hands the filter to the Event Log service, which evaluates it natively and returns only the rows you asked for in seconds.

The other recurring problem is the Properties array. Event 4625 has 21 properties; counting from zero, the username sits in position 5, the failure reason in position 8, and the source IP in position 19. There is no schema in the object, so you have to look up the positions for each event ID. Once you have them, calculated properties give the output sensible column names.

The Solution

Step 1 — Discover the log you actually want

Most useful events live in three logs: Security, System, and Microsoft-Windows-PowerShell/Operational. List everything that has data to confirm which channels are actually being written:

Get-WinEvent -ListLog * |
    Where-Object RecordCount -gt 0 |
    Sort-Object RecordCount -Descending |
    Select-Object -First 25 LogName, RecordCount, IsEnabled, LogMode

Channels reporting RecordCount = 0 are usually either disabled, scoped to a feature you do not have installed, or freshly cleared.

Step 2 — Build the filter on the server side

FilterHashtable accepts LogName, ID, Level, StartTime, EndTime, ProviderName, and a few more keys. Multiple values in any key become an OR; multiple keys combine with AND:

# Failed logons in the last hour
Get-WinEvent -FilterHashtable @{
    LogName   = 'Security'
    Id        = 4625
    StartTime = (Get-Date).AddHours(-1)
} -ErrorAction SilentlyContinue

# Successful AND failed logons together
Get-WinEvent -FilterHashtable @{
    LogName   = 'Security'
    Id        = 4624, 4625
    StartTime = (Get-Date).AddMinutes(-15)
} -ErrorAction SilentlyContinue

# Errors and warnings from System log, last 24 hours
Get-WinEvent -FilterHashtable @{
    LogName   = 'System'
    Level     = 2, 3   # Error, Warning
    StartTime = (Get-Date).AddDays(-1)
} -ErrorAction SilentlyContinue

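Use -ErrorAction SilentlyContinue when the filter may legitimately return nothing; otherwise the cmdlet writes a "No events were found" error for every empty result. The error is non-terminating, so try/catch only sees it if you pass -ErrorAction Stop instead.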
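
The OR-within-a-key, AND-across-keys rule is easier to see as the kind of XPath selector the Event Log service evaluates. A minimal Python sketch of that shape, handling only simple System fields; it illustrates the semantics and is not a reproduction of the exact query Get-WinEvent generates:

```python
def to_xpath(filters: dict[str, list[int]]) -> str:
    """Build an event-log XPath selector from {field: [values]}.

    Values within one field are OR-ed, fields are AND-ed. Only plain
    System fields such as EventID and Level are handled in this sketch.
    """
    if not filters:
        return "*"  # no constraints: select every event
    clauses = []
    for field, values in filters.items():
        ors = " or ".join(f"{field}={value}" for value in values)
        clauses.append(f"({ors})")
    return "*[System[" + " and ".join(clauses) + "]]"

# Two event IDs (OR) restricted to two levels (AND between the keys)
print(to_xpath({"EventID": [4624, 4625], "Level": [0, 4]}))
```

A selector of this form can be pasted into Event Viewer's "Filter Current Log" XML tab, which is a handy way to check a filter by eye.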

Step 3 — Project the properties you care about

Each event has a Properties array indexed by position. The mapping for 4625 (failed logon) is documented by Microsoft; the practical positions are:

Get-WinEvent -FilterHashtable @{
    LogName = 'Security'; Id = 4625; StartTime = (Get-Date).AddHours(-1)
} -ErrorAction SilentlyContinue |
    Select-Object TimeCreated,
                  @{Name='User';      Expression={ $_.Properties[5].Value }},
                  @{Name='Domain';    Expression={ $_.Properties[6].Value }},
                  @{Name='Reason';    Expression={ $_.Properties[8].Value }},
                  @{Name='SourceIP';  Expression={ $_.Properties[19].Value }},
                  @{Name='WorkstationName'; Expression={ $_.Properties[13].Value }}

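For other event IDs, the simplest way to find the right positions is to grab one sample event into a variable and inspect $sample.Properties | ForEach-Object { $_.Value } alongside $sample.Message. Avoid naming the variable $event, which PowerShell reserves as an automatic variable.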
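
Once the positions are known, a name table is less error-prone than bare indexes. A Python sketch for post-processing exported 4625 records; the field order below is the EventData order as commonly seen on current Windows versions and should be verified against a sample event, and the helper name is hypothetical:

```python
# EventData field order for event 4625 (zero-based positions)
FIELDS_4625 = [
    "SubjectUserSid", "SubjectUserName", "SubjectDomainName", "SubjectLogonId",
    "TargetUserSid", "TargetUserName", "TargetDomainName", "Status",
    "FailureReason", "SubStatus", "LogonType", "LogonProcessName",
    "AuthenticationPackageName", "WorkstationName", "TransmittedServices",
    "LmPackageName", "KeyLength", "ProcessId", "ProcessName",
    "IpAddress", "IpPort",
]

def name_properties(values: list) -> dict:
    """Turn a positional Properties list into a {field_name: value} dict."""
    if len(values) != len(FIELDS_4625):
        raise ValueError(f"expected {len(FIELDS_4625)} values, got {len(values)}")
    return dict(zip(FIELDS_4625, values))

# Positions used by the calculated properties above
for field in ("TargetUserName", "TargetDomainName", "FailureReason",
              "WorkstationName", "IpAddress"):
    print(field, FIELDS_4625.index(field))
```

The length check catches the case where a different event ID, with a different layout, slips into the same pipeline.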

Step 4 — Detect simple brute-force patterns

Once you can pull failed logons with user and source IP, grouping turns them into a triage table:

Get-WinEvent -FilterHashtable @{
    LogName = 'Security'; Id = 4625; StartTime = (Get-Date).AddHours(-6)
} -ErrorAction SilentlyContinue |
    Select-Object @{Name='User';     Expression={ $_.Properties[5].Value }},
                  @{Name='SourceIP'; Expression={ $_.Properties[19].Value }} |
    Group-Object User, SourceIP |
    Where-Object Count -gt 10 |
    Sort-Object Count -Descending

Real brute-force detection lives in a SIEM, not in an interactive PowerShell session — but on a single DC this query is enough to confirm whether something is actively going wrong.
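
The same triage works on exported rows outside PowerShell. A minimal Python sketch of the grouping, assuming each row is a dict with User and SourceIP keys (for example loaded from the CSV produced in Step 6); the function name and sample data are illustrative:

```python
from collections import Counter

def noisy_pairs(rows: list[dict], threshold: int = 10) -> list[tuple[tuple[str, str], int]]:
    """Return (user, source IP) pairs with more than `threshold` failures,
    most frequent first."""
    counts = Counter((row["User"], row["SourceIP"]) for row in rows)
    return [(pair, n) for pair, n in counts.most_common() if n > threshold]

# Hypothetical sample: one noisy pair, one quiet pair
rows = (
    [{"User": "svc_backup", "SourceIP": "10.0.0.7"}] * 12
    + [{"User": "alice", "SourceIP": "10.0.0.9"}] * 3
)
for (user, ip), n in noisy_pairs(rows):
    print(f"{user} from {ip}: {n} failed logons")
```

Grouping on the pair rather than on the user alone separates one mistyped password on many machines from one machine hammering a single account.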

Step 5 — Query multiple Domain Controllers at once

Domain logons land on whichever DC the client happened to talk to. Aggregate across all of them with Invoke-Command:

$dcs = (Get-ADDomainController -Filter *).HostName

Invoke-Command -ComputerName $dcs -ScriptBlock {
    Get-WinEvent -FilterHashtable @{
        LogName   = 'Security'
        Id        = 4625
        StartTime = (Get-Date).AddHours(-1)
    } -ErrorAction SilentlyContinue |
        Select-Object @{Name='DC';       Expression={ $env:COMPUTERNAME }},
                      TimeCreated,
                      @{Name='User';     Expression={ $_.Properties[5].Value }},
                      @{Name='SourceIP'; Expression={ $_.Properties[19].Value }}
} | Sort-Object TimeCreated -Descending

This fans out the query, then aggregates locally. For domains with more than a handful of DCs, consider scheduling the collection rather than running it ad-hoc.

Step 6 — Export the result for an analyst

Once you have a useful row set, pipe it to Export-Csv for sharing or follow-up analysis. The CSV-export guide linked below covers the encoding flags that keep Excel happy:

... |
    Export-Csv -Path .\failed_logons.csv -NoTypeInformation -Encoding UTF8BOM

Frequently Asked Questions

Why is Get-WinEvent -FilterHashtable so much faster than Where-Object?

FilterHashtable is translated to an XPath query and pushed down into the Event Log service. The service evaluates it on the channel directly and returns only matching records. Where-Object only filters after the cmdlet has already pulled every row across the pipeline.

What is the difference between Get-EventLog and Get-WinEvent?

Get-EventLog targets the legacy classic event logs (System, Security, Application). Get-WinEvent targets both classic logs and the modern EventLog 2.0 channels (Microsoft-Windows-*, PowerShell/Operational, Sysmon). Use Get-WinEvent for new code; Get-EventLog is deprecated and absent from PowerShell 7.

Why does my filter return zero events when Event Viewer shows them?

The two most common causes are case-sensitive provider names and the wrong log channel. Confirm with Get-WinEvent -ListLog * and then with Get-WinEvent -ListProvider *. Time-zone mismatches also bite — StartTime is local, and your script may be evaluating UTC.

Can I query event logs from offline .evtx files?

Yes. Get-WinEvent -Path .\Security.evtx reads an exported log file directly. Combined with FilterHashtable's Path key it scales to hundreds of files for retrospective analysis.

How do I avoid getting throttled when querying a busy DC?

Use tight StartTime and EndTime windows, narrow Id filters, and set -MaxEvents when you only need a sample. For continuous collection, push to Windows Event Forwarding instead of polling.

Conclusion

Event logs only tell you anything useful when you can ask them sharp questions. Get-WinEvent -FilterHashtable is the right primitive, the Properties array is the structured payload underneath the human-readable message, and calculated properties turn the result into a table you can paste into a ticket. Spend ten minutes learning the property positions for the four or five event IDs you actually care about and you will outpace the GUI for anything beyond a single click-through.

Related Posts

Reference for the cmdlet and its filter syntax: Get-WinEvent on Microsoft Learn.

PowerShell Quick Guide: Exporting Data to CSV Files

Exporting data with PowerShell Export-Csv is one of those tasks that looks trivial until you actually hit Excel. Wrong encoding, mangled non-ASCII characters, columns full of System.Object[], calculated properties that disappear — the cmdlet is simple, but the surrounding behaviour is full of small traps. This post collects the patterns we keep coming back to when we need a clean CSV out of a PowerShell pipeline.

Key Takeaways

  • Export-Csv always pairs with -NoTypeInformation on Windows PowerShell 5.1 — the type header it omits is otherwise the first thing that confuses Excel.
  • Use -Encoding UTF8BOM (PowerShell 7) or UTF8 (5.1) when the data contains non-ASCII characters, or Excel will render them as garbage.
  • Filter and project the data before piping into Export-Csv; the cmdlet writes every property it sees, including the noisy ones.
  • Properties that hold arrays or nested objects export as type names. Flatten them with -join or calculated properties first.
  • For long-running collections, prefer -Append over building one huge in-memory array.

Environment

Examples were tested on both supported PowerShell editions. Behaviour for encoding flags and default delimiters differs between them, so it matters which you are running:

  • Windows PowerShell 5.1 on Windows 10/11 — ships in box, default encoding is ASCII unless overridden.
  • PowerShell 7.4 LTS on Windows and Linux — default encoding is UTF-8 without BOM.
  • Excel for Microsoft 365 (current channel) is the consumer for most of these CSVs; Excel cares deeply about BOMs and locale-specific delimiters.

The Problem

The naive call Get-Process | Export-Csv processes.csv works. It produces a file. Opening it in Excel, however, gives you a first row reading #TYPE System.Diagnostics.Process (on Windows PowerShell 5.1), dozens of columns of process internals nobody asked for, a column called Modules that contains the literal text System.Diagnostics.ProcessModuleCollection, and, if you happen to be in a German, Dutch, or French Windows locale, every column glued together because Excel expected semicolons and PowerShell wrote commas.

Most of these are one-flag fixes once you know which flag. The rest are pipeline shape issues that have to be solved before the data ever reaches Export-Csv.

The Solution

Step 1 — Strip the type header and project only the columns you want

Always pass -NoTypeInformation on Windows PowerShell 5.1. On PowerShell 7 the flag is the default and the parameter is a no-op kept for compatibility, but writing it explicitly keeps scripts portable.

Get-Process |
    Select-Object Name, Id, CPU, @{Name='WorkingSetMB';Expression={[math]::Round($_.WorkingSet64/1MB, 2)}} |
    Export-Csv -Path .\processes.csv -NoTypeInformation

The calculated property here does two useful things at once: it gives the column a clean name, and it converts the raw byte count into megabytes rounded to two decimals. Doing the same work in Excel afterwards is painful; doing it in the pipeline is one line.

Step 2 — Pick the right encoding for Excel

Excel does not auto-detect UTF-8 when a CSV lacks a BOM. If the data contains any non-ASCII character — an accented user name, a Cyrillic group, a German umlaut, an em dash — and you do not write a BOM, Excel will mojibake the cells:

# PowerShell 7+: write UTF-8 with byte-order mark for Excel
Get-LocalUser |
    Select-Object Name, Enabled, LastLogon |
    Export-Csv -Path .\users.csv -NoTypeInformation -Encoding UTF8BOM

# Windows PowerShell 5.1: UTF8 already means UTF-8 with BOM
Get-LocalUser |
    Select-Object Name, Enabled, LastLogon |
    Export-Csv -Path .\users.csv -NoTypeInformation -Encoding UTF8

In PowerShell 7 the value UTF8 writes without a BOM, which is the opposite of 5.1. If you are sharing scripts between versions, use UTF8BOM on 7 and UTF8 on 5.1 — or wrap the call so the script picks the right one based on $PSVersionTable.PSEdition.
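
The difference between the two variants is three bytes at the start of the file. A short Python sketch that makes them visible; Python's utf-8-sig codec is used here as a stand-in for what UTF8BOM produces, and the helper name is illustrative:

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """True when the byte string starts with the UTF-8 byte-order mark."""
    return data.startswith(codecs.BOM_UTF8)

text = "Name\r\nZoë Müller\r\n"

with_bom = text.encode("utf-8-sig")   # BOM + UTF-8 payload
without_bom = text.encode("utf-8")    # UTF-8 payload only

print(with_bom[:3].hex(), has_utf8_bom(with_bom))
print(without_bom[:3].hex(), has_utf8_bom(without_bom))
```

Running the same check on a file written by each PowerShell edition is a quick way to confirm which variant a script actually emitted.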

Step 3 — Match the delimiter to the consuming locale

When you open a CSV by double-clicking it, Excel silently takes its delimiter from the list separator in the Windows regional settings. In English-speaking locales that is a comma, which matches CSV. In most of continental Europe it is a semicolon, which does not, and every row lands in column A. Set the delimiter to whatever the target machine expects:

# Force semicolon for German/Dutch/French Excel
Get-Service |
    Where-Object Status -eq 'Running' |
    Select-Object Name, DisplayName, StartType |
    Export-Csv -Path .\services.csv -NoTypeInformation -Delimiter ';' -Encoding UTF8BOM

# Or use the culture-appropriate one automatically
$listSep = (Get-Culture).TextInfo.ListSeparator
Get-Service | Export-Csv -Path .\services.csv -NoTypeInformation -Delimiter $listSep

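The Get-Culture approach pulls the same list separator Excel reads from the regional settings, which is the safest choice when scripts move between machines. Export-Csv -UseCulture has the same effect without the extra variable.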
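
Changing the delimiter also changes which cells need quoting. A minimal Python sketch with the standard csv module, shown only to illustrate the file shape; the function name and sample rows are assumptions:

```python
import csv
import io

def render_csv(rows: list[list[str]], delimiter: str) -> str:
    """Serialise rows with the given delimiter, quoting only where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter)
    writer.writerows(rows)
    return buffer.getvalue()

rows = [
    ["Name", "DisplayName"],
    ["Spooler", "Print Spooler"],
    ["Odd", "Value; with semicolon"],
]

print(render_csv(rows, ";"))
print(render_csv(rows, ","))
```

With a semicolon delimiter the third row's second cell is wrapped in quotes; with a comma it is written bare. Export-Csv sidesteps the question by quoting every field by default.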

Step 4 — Flatten properties that contain collections

PowerShell objects often hold sub-objects or arrays. Export-Csv resolves those to their type name — Microsoft.ActiveDirectory.Management.ADPropertyValueCollection is not a useful cell value. Convert collections to a delimiter-joined string in a calculated property:

Get-LocalGroup |
    Select-Object Name,
        @{Name='Members';Expression={
            (Get-LocalGroupMember -Group $_.Name -ErrorAction SilentlyContinue |
                Select-Object -ExpandProperty Name) -join '; '
        }} |
    Export-Csv -Path .\local_groups.csv -NoTypeInformation -Encoding UTF8BOM

Joining on '; ' rather than the row delimiter keeps the cell readable in Excel without breaking the CSV structure. If a cell may itself contain the join delimiter, switch to a pipe ('|') or rethink the schema.
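
The same flattening rule applies to any tabular export. A Python sketch, assuming rows arrive as dicts whose values may be lists; the helper name and the sample row are hypothetical:

```python
def flatten_row(row: dict, separator: str = "; ") -> dict:
    """Join list or tuple values into one string so each cell is scalar."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, (list, tuple)):
            flat[key] = separator.join(str(item) for item in value)
        else:
            flat[key] = value
    return flat

group = {"Name": "Administrators", "Members": ["HOST\\alice", "HOST\\bob"]}
print(flatten_row(group))
print(flatten_row(group, separator="|"))
```

Passing the separator as a parameter makes the switch to a pipe a one-argument change when member names may contain semicolons.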

Step 5 — Append instead of buffering for long-running collectors

Scripts that watch event logs, poll endpoints, or walk an Active Directory forest can produce more rows than fit comfortably in memory. -Append writes one row at a time, reusing the header from the first call:

foreach ($computer in $allComputers) {
    Get-CimInstance Win32_OperatingSystem -ComputerName $computer -ErrorAction SilentlyContinue |
        Select-Object PSComputerName, Caption, Version, LastBootUpTime |
        Export-Csv -Path .\os_inventory.csv -NoTypeInformation -Append -Encoding UTF8BOM
}

-Append does not rewrite the header; it matches each new object against the columns already in the file. If an object lacks one of those columns the cmdlet raises an error, and with -Force it writes only the properties that match and silently discards the rest. Project to a fixed set of columns with Select-Object on every iteration so the schema is stable.
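
The fixed-schema idea looks like this in a Python collector. A minimal sketch with csv.DictWriter, assuming the column list mirrors the Select-Object projection above; the function name is hypothetical, and the BOM is written only when the file is created so that it never lands mid-file:

```python
import csv
import tempfile
from pathlib import Path

COLUMNS = ["PSComputerName", "Caption", "Version", "LastBootUpTime"]

def append_row(path: Path, row: dict) -> None:
    """Append one row using a fixed column list; write the header once."""
    is_new = not path.exists() or path.stat().st_size == 0
    encoding = "utf-8-sig" if is_new else "utf-8"
    with path.open("a", newline="", encoding=encoding) as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS,
                                restval="", extrasaction="ignore")
        if is_new:
            writer.writeheader()
        # Missing keys become empty cells, unknown keys are dropped
        writer.writerow(row)

with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "os_inventory.csv"
    append_row(target, {"PSComputerName": "srv01", "Caption": "Windows Server 2022",
                        "Version": "10.0.20348", "LastBootUpTime": "2024-01-01"})
    append_row(target, {"PSComputerName": "srv02", "Unexpected": "ignored"})
    print(target.read_text(encoding="utf-8-sig"))
```

Because every row is written by name against the same list, a host that returns a partial object produces empty cells rather than shifted columns.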

Step 6 — Round-trip with Import-Csv when you need to keep working

A CSV exported with the above flags imports cleanly back into PowerShell with Import-Csv. Everything comes back as a string — there is no type metadata in CSV — so cast as needed:

$processes = Import-Csv .\processes.csv -Encoding UTF8BOM
$processes |
    Where-Object { [int]$_.Id -gt 1000 } |
    Sort-Object { [double]$_.WorkingSetMB } -Descending

For one-off transformations that never touch disk, skip the file entirely with ConvertTo-Csv and ConvertFrom-Csv. Same flags, same gotchas, no I/O.
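
The "everything is a string" rule is a property of CSV, not of PowerShell. A Python sketch of the same round trip, assuming the column names from the Step 1 export; the loader name and the sample text are illustrative:

```python
import csv
import io

def load_processes(csv_text: str) -> list[dict]:
    """Parse the CSV and cast the numeric columns back to numbers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "Name": row["Name"],
            "Id": int(row["Id"]),
            "WorkingSetMB": float(row["WorkingSetMB"]),
        }
        for row in reader
    ]

sample = "Name,Id,WorkingSetMB\r\npwsh,4321,182.45\r\nidle,0,0.01\r\ncode,1500,950.5\r\n"

processes = load_processes(sample)
busy = sorted((p for p in processes if p["Id"] > 1000),
              key=lambda p: p["WorkingSetMB"], reverse=True)
print([p["Name"] for p in busy])
```

Without the casts the sort would compare "182.45" and "950.5" as text, which is the same trap the [double] cast avoids in the PowerShell version.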

Frequently Asked Questions

Why does Excel show System.Object[] in my column?

The property contained an array, and Export-Csv rendered it as the type name rather than the contents. Flatten the property with -join inside a calculated property before exporting (see Step 4).

What is the difference between UTF8 and UTF8BOM in PowerShell?

In Windows PowerShell 5.1, UTF8 writes a BOM. In PowerShell 7, UTF8 writes without a BOM and UTF8BOM is the variant Excel needs. The defaults differ, so the same script can produce different files on the two editions.

How do I export to a semicolon-delimited CSV without hardcoding the delimiter?

Use (Get-Culture).TextInfo.ListSeparator as the value for -Delimiter. That gives you the same separator Excel will expect on the same machine, regardless of regional settings.

Can Export-Csv overwrite a file that is open in Excel?

No. Excel opens CSV files with a write lock. Export-Csv throws The process cannot access the file because it is being used by another process. Either close Excel or write to a new file name and swap.

Is there a faster alternative for very large datasets?

For millions of rows, the per-object overhead of Export-Csv becomes the bottleneck. Use StreamWriter directly or pipe through ConvertTo-Csv once and write the resulting strings. For tabular SQL-style work, exporting to Parquet via a dedicated module is usually a better fit than CSV.

Conclusion

Export-Csv is one of the most-used cmdlets in any administrative script, and the boring details — encoding, delimiter, schema stability — are what separate a CSV that opens cleanly in Excel from one that needs a follow-up cleanup pass. None of this is rocket science, but it is the kind of thing that bites once per locale change and once per PowerShell version bump.

Most of the time, the recipe is: project with Select-Object, encode with UTF8BOM, set the delimiter from the culture, flatten arrays in calculated properties, and append for long runs. If that pattern is in your fingers, you will spend approximately zero further minutes fighting Excel.

Related Posts

Microsoft's authoritative reference for the cmdlet lives at Export-Csv on Microsoft Learn.

Visualizing with MITRE ATT&CK Navigator: How to Visualize Mapped Data in MITRE ATT&CK Navigator

Part three of the MITRE ATT&CK series. With the dataset mapped from Part 2, the natural next step is visualisation. MITRE ATT&CK Navigator is the canonical tool for it — a web app that renders the ATT&CK matrix and accepts JSON "layer" files describing which techniques to highlight, in what colour, with what annotation. This post walks through generating Navigator layers from the mapped data, so coverage maps, group footprints, and gap analyses produce themselves.

Key Takeaways

  • MITRE ATT&CK Navigator consumes JSON "layer" files conforming to the published layer.json schema (current version 4.5).
  • A layer is essentially a list of (technique_id, color, comment) entries plus matrix-wide display options.
  • The three highest-value layers to generate from mapped data: groups heatmap, mitigations coverage, and detection coverage (technique IDs you have a SIEM rule for).
  • Heatmap colours should reflect the data distribution, not a fixed scale. Use quantiles over the count series rather than equal-width buckets.
  • Host Navigator yourself (it is a small Angular app) for an air-gapped workflow; otherwise load layers into the public instance at mitre-attack.github.io/attack-navigator.

Environment

The Problem

Numbers are useful; pictures are persuasive. A defender-friendly question like "how much of the ATT&CK matrix do our detections cover?" is impossible to answer convincingly from a table. The same data, rendered as a coloured heatmap over the matrix, makes the gaps obvious to anyone in the room — including stakeholders who do not know what a technique ID is. Generating the layer JSON is straightforward once you understand the schema and pick the right colour scale.

The Solution

Step 1 — Compute per-technique counts

Each layer needs one number per technique. Pull them from the mapped data — groups using the technique, mitigations covering it, internal detections referencing it:

def per_technique_counts(mapped: list[dict], key: str) -> dict[str, int]:
    """key in ('groups','mitigations'); returns {technique_id: count}"""
    return {t['technique_id']: len(t.get(key, [])) for t in mapped}

groups_count = per_technique_counts(mapped, 'groups')
mitig_count  = per_technique_counts(mapped, 'mitigations')

Step 2 — Pick colour thresholds from the data distribution

Equal-width bins look terrible when the count distribution is skewed (and it always is — a handful of techniques are used by 70+ groups, the median is 2). Quantile-based bins reflect the data:

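from statistics import quantiles

def colour_buckets(counts: dict[str, int]) -> list[tuple[int, str]]:
    """Return ascending (threshold, colour) pairs from quartiles of the non-zero counts."""
    values = [v for v in counts.values() if v > 0]
    if not values:
        return [(0, '#ffffff')]
    # quantiles() raises StatisticsError on a single data point before Python 3.13
    qs = quantiles(values, n=4) if len(values) > 1 else [values[0]] * 3
    palette = ['#ffe5e5', '#ff9999', '#ff4d4d', '#cc0000', '#660000']
    # The set collapses duplicate thresholds when the distribution is flat
    thresholds = sorted({0, round(qs[0]), round(qs[1]), round(qs[2]), max(values)})
    return list(zip(thresholds, palette[:len(thresholds)]))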

def colour_for(count: int, buckets: list[tuple[int, str]]) -> str:
    chosen = buckets[0][1]
    for thr, col in buckets:
        if count >= thr:
            chosen = col
    return chosen
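
To see why quantile bins behave better on skewed counts, compare how the two schemes spread a small sample. A standalone Python sketch with hypothetical counts; the helper names are illustrative and independent of the functions above:

```python
from statistics import quantiles

# Hypothetical per-technique group counts with one heavy outlier
counts = [1, 1, 2, 2, 3, 5, 8, 70]

def equal_width_edges(values: list[int], bins: int = 4) -> list[float]:
    """Interior bin edges that split the min-max range into equal widths."""
    low, high = min(values), max(values)
    step = (high - low) / bins
    return [low + step * i for i in range(1, bins)]

def bucket_sizes(values: list[int], edges: list[float]) -> list[int]:
    """Count how many values fall into each bin defined by the edges."""
    sizes = [0] * (len(edges) + 1)
    for value in values:
        index = sum(value > edge for edge in edges)
        sizes[index] += 1
    return sizes

print("equal width:", bucket_sizes(counts, equal_width_edges(counts)))
print("quartiles:  ", bucket_sizes(counts, quantiles(counts, n=4)))
```

Equal-width edges put seven of the eight techniques in the palest bucket and leave two buckets empty, while the quartile edges spread them evenly across all four colours.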

Step 3 — Emit a layer file

The layer schema is documented in the Navigator layer format reference. The boilerplate is minimal:

def build_layer(name: str, counts: dict[str, int]) -> dict:
    buckets = colour_buckets(counts)
    return {
        'name':        name,
        'description': f'Heatmap of {name} per technique',
        'domain':      'enterprise-attack',
        'versions':    { 'attack': '15', 'navigator': '5.0.0', 'layer': '4.5' },
        'gradient':    { 'colors': [c for _, c in buckets], 'minValue': 0, 'maxValue': max(counts.values(), default=0) or 1 },
        'legendItems': [ { 'label': str(thr), 'color': col } for thr, col in buckets ],
        'techniques':  [
            {
                'techniqueID': tid,
                'color':       colour_for(count, buckets),
                'comment':     f'{count} {name.lower()}',
                'enabled':     True,
            }
            for tid, count in counts.items()
        ],
        'layout':      { 'layout': 'flat', 'showName': True, 'showID': False, 'expandedSubtechniques': 'all' },
        'showTacticRowBackground': True,
        'tacticRowBackground':     '#dddddd',
        'hideDisabled':            True,
    }

Step 4 — Generate the standard three layers

Three layers cover most stakeholder conversations: groups heatmap (volume of attacker usage), mitigations coverage (defensive options on paper), and your own detection coverage (what your SIEM actually catches):

import json
from pathlib import Path

def write_layer(layer: dict, path: Path) -> None:
    path.write_text(json.dumps(layer, ensure_ascii=False, indent=2), encoding='utf-8')

out = Path('navigator_layers')
out.mkdir(exist_ok=True)

write_layer(build_layer('Groups',      groups_count), out / 'groups.json')
write_layer(build_layer('Mitigations', mitig_count),  out / 'mitigations.json')

# Detection coverage layer: load your own technique IDs from a CSV / SIEM export
detection_count = {tid: 1 for tid in detected_technique_ids}
write_layer(build_layer('Detection Coverage', detection_count), out / 'detection.json')

Step 5 — Load the layer in Navigator

Open mitre-attack.github.io/attack-navigator, open a new tab, expand Open Existing Layer, choose Upload from local, and select the JSON file. The matrix renders with the colours and comments you generated. For a hosted internal instance, clone mitre-attack/attack-navigator, build with npm, and serve the static output behind your usual reverse proxy.
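
A quick sanity check before uploading saves a round trip through the browser. A minimal Python sketch that looks for the most common mistakes; the checks are illustrative assumptions about what tends to go wrong and are not the official layer schema:

```python
import re

TECHNIQUE_ID = re.compile(r"T\d{4}(\.\d{3})?")   # e.g. T1059 or T1059.001
HEX_COLOUR = re.compile(r"#[0-9a-fA-F]{6}")

def layer_problems(layer: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for key in ("name", "domain", "versions"):
        if not layer.get(key):
            problems.append(f"missing {key}")
    for index, entry in enumerate(layer.get("techniques", [])):
        technique_id = entry.get("techniqueID", "")
        if not TECHNIQUE_ID.fullmatch(technique_id):
            problems.append(f"techniques[{index}]: bad techniqueID {technique_id!r}")
        colour = entry.get("color", "")
        if colour and not HEX_COLOUR.fullmatch(colour):
            problems.append(f"techniques[{index}]: bad color {colour!r}")
    return problems

sample = {
    "name": "Groups",
    "domain": "enterprise-attack",
    "versions": {"layer": "4.5"},
    "techniques": [
        {"techniqueID": "T1059.001", "color": "#cc0000"},
        {"techniqueID": "1059", "color": "red"},
    ],
}
print(layer_problems(sample))
```

An empty list does not guarantee Navigator will accept the file, but a non-empty one reliably points at the entry to fix.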

Step 6 — Read the picture and act on it

The three layers answer three distinct questions:

  • Groups heatmap — where attackers concentrate effort. T1059.001 (PowerShell) and T1566.001 (Spearphishing Attachment) are perpetual heat sinks; the long tail is interesting too.
  • Mitigations coverage — where MITRE documents defensive options. Light cells with high group usage are gaps in the framework itself, often interesting research targets.
  • Detection coverage — where you actually have a SIEM rule, a YARA signature, or an EDR alert. The overlap with the groups heatmap is your real coverage posture; the gap between them is your detection roadmap.
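
The "gap between them" can be computed as well as eyeballed. A Python sketch, assuming a {technique_id: group_count} dict like groups_count from Step 1 and a set of technique IDs you have detections for; the function name, the min_groups cut-off, and the sample numbers are hypothetical:

```python
def detection_gaps(groups_count: dict[str, int], detected: set[str],
                   min_groups: int = 1) -> list[tuple[str, int]]:
    """Techniques used by at least min_groups groups with no detection,
    most heavily used first (ties broken by technique ID)."""
    gaps = [
        (technique_id, count)
        for technique_id, count in groups_count.items()
        if count >= min_groups and technique_id not in detected
    ]
    return sorted(gaps, key=lambda item: (-item[1], item[0]))

# Hypothetical numbers for illustration only
sample_groups = {"T1059.001": 70, "T1566.001": 65, "T1003.001": 30, "T1612": 0}
sample_detected = {"T1059.001"}

for technique_id, count in detection_gaps(sample_groups, sample_detected):
    print(f"{technique_id}: used by {count} groups, no detection")
```

The sorted output doubles as a first draft of the detection roadmap: the top rows are the techniques with the most attacker usage and no rule behind them.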

Frequently Asked Questions

Why not just take a screenshot from the ATT&CK website?

Because the website does not know which techniques you have detections for or which mitigations you have actually implemented. Navigator with custom layers does. The website is for browsing the framework; Navigator is for representing your environment against it.

Can I overlay two layers?

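Yes. With both layers open, add a new tab, expand Create Layer from Other Layers, and enter a score expression that refers to the open layers by their letters (e.g. a + b). The expression works on technique scores, so give each technique entry a numeric score field if you want the layers generated here to combine. Useful for visualising "techniques used by groups AND covered by my detections" in one pass.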

What is the difference between technique colour and tactic colour?

Technique colours come from the layer file's techniques array. The tactic row (the column headers) is coloured separately: when showTacticRowBackground is true, Navigator fills it with the layer's tacticRowBackground value, and otherwise it keeps its default styling.

How do I publish a layer to colleagues who do not have Navigator open?

Two options. Either export an image from Navigator's layer controls (the render-to-SVG option), or host the layer JSON on an internal URL and share a Navigator link with #layerURL= followed by the URL-encoded address of the layer. Anyone with access to both Navigator and the URL gets the same view.

What schema version should I target?

The current schema is 4.5, supported by Navigator 4.x and 5.x. The schema is backward compatible, so older Navigator instances may load newer layers with reduced features. Target the lowest version that includes the features you actually use.

Conclusion

A Navigator layer is just a JSON file with technique IDs, colours, and comments. Once you can generate one from mapped data, the same pipeline produces every coverage view you will ever need: by group, by mitigation, by data source, by detection rule. The hard work is honest accounting of which techniques you actually catch; the picture is just the byproduct.

Related Posts

Authoritative references: MITRE ATT&CK Navigator on GitHub and the layer format specification.