Parsing Windows Event Logs (EVTX) with Python

Windows saves its event logs as .evtx files, a binary XML format that you cannot open and read line by line, so parsing Windows Event Logs with Python takes a dedicated parser. This Python Quick Guide walks through reading an .evtx export with the python-evtx library, pulling out the fields that matter for detection, and filtering by Event ID. It is the same work I would otherwise do in PowerShell, but in a form that travels to Linux analysis boxes and slots into a larger pipeline.

Key Takeaways

Windows Event Logs are stored as binary XML in .evtx files, so to parse Windows Event Logs with Python you need a dedicated parser rather than plain text handling.
The pure-Python python-evtx library reads .evtx files on any platform and hands you each record as XML you can walk with the standard library.
Filtering by Event ID means parsing the record XML and reading the EventID element under the event's System block, which lives in a specific XML namespace.
For multi-gigabyte logs, the Rust-backed evtx parser is considerably faster than the pure-Python option, at the cost of a compiled dependency.
Reading saved .evtx exports offline keeps your analysis off the source host and avoids touching a system you may need to preserve.

Environment

Python 3.9+ on Windows 11, though the same code runs unchanged on Linux or macOS.
python-evtx installed via pip install python-evtx (the import name is Evtx).
A saved .evtx export — for testing I used a copy of Security.evtx pulled from C:\Windows\System32\winevt\Logs\.
Standard-library xml.etree.ElementTree for reading fields out of each record — no extra dependency.

The Problem

On a live Windows box I would reach for Get-WinEvent and be done. The trouble starts when the logs are not on a live Windows box: an analyst hands me a folder of exported .evtx files from an incident, or the logs land on a Linux host that has no Get-WinEvent at all. The .evtx format is binary XML with chunked records and templated structures, so opening one in a text editor gives you a wall of unreadable bytes. You cannot grep it, and you cannot stream it like a CSV.

Python gives me a portable way to read these files anywhere, and once each record is XML I can do whatever I want with it — filter by Event ID, extract account names, count failed logons, or feed the result into the same kind of analysis I described in my SIEM correlation walkthrough. The one thing to get right first is the parser, because the binary format is not something you want to decode by hand.
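
To make the "binary, not text" point concrete, here is a minimal sketch that checks whether a file starts with the EVTX file-header signature before you hand it to a parser. The 8-byte signature (the ASCII text ElfFile followed by a NUL byte) is part of the format; the function names looks_like_evtx and find_evtx_files are my own, made up for illustration, and the check says nothing about whether the rest of the file is intact.

```python
from pathlib import Path

# EVTX files begin with this 8-byte signature: "ElfFile" plus a NUL byte.
EVTX_MAGIC = b"ElfFile\x00"


def looks_like_evtx(path):
    """Return True if the file begins with the EVTX file-header signature.

    Only the first 8 bytes are inspected, so a truncated or corrupt log
    can still pass this check.
    """
    with Path(path).open("rb") as handle:
        return handle.read(len(EVTX_MAGIC)) == EVTX_MAGIC


def find_evtx_files(folder):
    """Hypothetical helper: list the files in a folder that carry the signature."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and looks_like_evtx(p)
    )
```

This is handy when an incident folder mixes real exports with renamed or partial files: anything that fails the check is not worth passing to the parser.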

The Solution — Parse Windows Event Logs with Python

Step 1 — Read every record out of an .evtx file

The python-evtx library exposes the file as a context manager and yields each record in turn. Calling record.xml() gives you the event as a readable XML string. This is the whole minimal parser:

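import Evtx.Evtx as evtx

# Open the export; the context manager closes the file when the block ends
with evtx.Evtx("Security.evtx") as log:
    # records() yields one record object per event in the file
    for record in log.records():
        # xml() renders the binary record as a readable XML string
        print(record.xml())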

That prints every event in the file as XML. Note that records() is a method — the parentheses matter. Each record's XML is a complete <Event> element with a System block (metadata such as the Event ID, time, and computer) and an EventData block (the event-specific fields). If you only need a quick human-readable dump and not a script, the library also ships a command-line tool, evtx_dump.py, that does exactly this.

Step 2 — Pull out the Event ID and timestamp

Printing raw XML is not analysis. To filter and report, I parse each record's XML with the standard library and read specific elements. The catch that trips people up: every element sits in the Windows event schema namespace, so a plain find("System/EventID") returns None. You have to register the namespace and prefix your paths with it:

import Evtx.Evtx as evtx
import xml.etree.ElementTree as ET

# The namespace every Windows event XML element lives in
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

with evtx.Evtx("Security.evtx") as log:
    for record in log.records():
        root = ET.fromstring(record.xml())
        system = root.find("e:System", NS)
        event_id = system.find("e:EventID", NS).text
        time_created = system.find("e:TimeCreated", NS).get("SystemTime")
        print(f"{time_created}  EventID={event_id}")

The timestamp is an attribute (SystemTime) on the TimeCreated element, not element text, which is why it is read with .get() rather than .text. That asymmetry is easy to miss: TimeCreated has no text, so .text quietly returns None instead of raising an error, and the mistake only shows up later as a missing timestamp in your output.
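
You can see both gotchas without an .evtx file at all. The sketch below parses a hand-written event that I trimmed down for illustration (it is not output from a real log, and the timestamp value is made up) and compares the namespaced and non-namespaced lookups, then the text and the attribute of TimeCreated.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Hand-written sample for illustration only; not output from a real log.
SAMPLE = (
    '<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">'
    "<System>"
    "<EventID>4625</EventID>"
    '<TimeCreated SystemTime="2024-01-01 00:00:00.000000"/>'
    "</System>"
    "</Event>"
)

root = ET.fromstring(SAMPLE)

# Without the namespace prefix the path matches nothing.
missing = root.find("System/EventID")

# With the prefix and the NS mapping the same path resolves.
event_id = root.find("e:System/e:EventID", NS)
time_created = root.find("e:System/e:TimeCreated", NS)

print(missing)                         # None
print(event_id.text)                   # 4625
print(time_created.text)               # None, the element has no text
print(time_created.get("SystemTime"))  # the attribute holds the value
```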

Step 3 — Filter for the events you actually care about

Most of the time I am after one or two Event IDs — failed logons (4625), successful logons (4624), or whatever the investigation calls for. Filtering is just a comparison, but the EventData fields are stored as named <Data Name="..."> elements, so a small helper to pull a field by name keeps the code readable:

import Evtx.Evtx as evtx
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

def get_data(root, name):
    """Return the text of an EventData/Data element by its Name attribute."""
    node = root.find(f"e:EventData/e:Data[@Name='{name}']", NS)
    return node.text if node is not None else None

with evtx.Evtx("Security.evtx") as log:
    for record in log.records():
        root = ET.fromstring(record.xml())
        event_id = root.find("e:System/e:EventID", NS).text
        if event_id != "4625":          # failed logon only
            continue
        account = get_data(root, "TargetUserName")
        source_ip = get_data(root, "IpAddress")
        print(f"Failed logon: account={account} src={source_ip}")

Event ID 4625 is the failed-logon event, and the TargetUserName and IpAddress fields are where a password-spray or brute-force pattern shows up. This is the offline equivalent of the kind of monitoring I described in essential Windows Event IDs for security monitoring — same events, read from a saved file instead of a live channel. From here, counting failures per source IP or per account is a few more lines with collections.Counter.
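
As a rough sketch of that counting step, the function below tallies 4625 events per source IP and per account. It takes any iterable of event XML strings, so it does not care which parser produced them. The name count_failed_logons and the "-" placeholder for a missing field are my own choices for this example, not anything the library defines.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def get_data(root, name):
    """Return the text of an EventData/Data element by its Name attribute."""
    node = root.find(f"e:EventData/e:Data[@Name='{name}']", NS)
    return node.text if node is not None else None


def count_failed_logons(xml_records):
    """Tally 4625 events per source IP and per account.

    xml_records is any iterable of event XML strings, for example
    (record.xml() for record in log.records()).
    """
    by_ip = Counter()
    by_account = Counter()
    for xml_text in xml_records:
        root = ET.fromstring(xml_text)
        event_id = root.findtext("e:System/e:EventID", default="", namespaces=NS)
        if event_id != "4625":
            continue
        # "-" stands in for a field that is absent or empty in the record
        by_ip[get_data(root, "IpAddress") or "-"] += 1
        by_account[get_data(root, "TargetUserName") or "-"] += 1
    return by_ip, by_account
```

Calling by_ip.most_common(10) on the first result lists the ten noisiest sources, which is usually the first thing I look at when a spray is suspected.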

The Step-3 filter on a real Security.evtx export: each 4625 reduced to account and source IP, parsed offline in pure Python — the same events read without touching the source host.

Step 4 — Know when to switch to the faster parser

python-evtx is pure Python, which is exactly what you want for portability and for reading the code to understand it. The trade-off is speed: on a multi-gigabyte Security.evtx it is noticeably slow. When throughput matters, the Rust-backed evtx parser (installed with pip install evtx; the module is evtx and the class you import from it is PyEvtxParser) parses the same files much faster because the heavy lifting happens in compiled code:

from evtx import PyEvtxParser

parser = PyEvtxParser("Security.evtx")
for record in parser.records():
    # record is a dict: event_record_id, timestamp, and 'data' (XML string)
    print(record["timestamp"], record["data"][:80])

Both libraries hand you XML in the end, so the parsing logic from Steps 2 and 3 carries over with only the iteration changed. I reach for python-evtx when I want a dependency-free script I can drop on any machine, and for the Rust-backed evtx when I am grinding through large collections.
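
One way to keep the parsing logic identical is to hide the iteration behind a small generator, sketched below. The backend labels "python-evtx" and "rust" and the function names are invented for this example. The imports are deferred into each branch on the assumption that you only have one of the two libraries installed on a given machine.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def iter_event_xml(path, backend="python-evtx"):
    """Yield each record's XML string using the chosen parser.

    The backend names are labels invented for this sketch. Imports are
    deferred so only the library you select has to be installed.
    """
    if backend == "python-evtx":
        import Evtx.Evtx as evtx
        with evtx.Evtx(path) as log:
            for record in log.records():
                yield record.xml()
    elif backend == "rust":
        from evtx import PyEvtxParser
        for record in PyEvtxParser(path).records():
            yield record["data"]
    else:
        raise ValueError(f"unknown backend: {backend!r}")


def event_id_of(xml_text):
    """Shared parsing step: works on XML from either backend."""
    root = ET.fromstring(xml_text)
    return root.findtext("e:System/e:EventID", namespaces=NS)
```

With that in place, switching parsers is a one-word change at the call site, for example (event_id_of(x) for x in iter_event_xml("Security.evtx", backend="rust")).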

The script from this post is available in my defensive-toolkit repository on GitHub.

Frequently Asked Questions

Why does find() return None when parsing EVTX XML in Python?

Because every element is in the Windows event schema namespace. A path like find("System/EventID") looks for elements with no namespace and finds nothing. Register the namespace (http://schemas.microsoft.com/win/2004/08/events/event) and prefix each path segment, as in Step 2.

Can python-evtx read a log from a live, running system?

It reads .evtx files, so you point it at a saved export or a copy from C:\Windows\System32\winevt\Logs\. The active log file can be locked by the Event Log service, so the reliable approach is to export or copy it first. Reading a saved copy also keeps your analysis off the source host.

Is python-evtx or the Rust evtx parser better?

For small files and maximum portability, python-evtx is pure Python with no compiled dependency. For large files where speed matters, the Rust-backed evtx is considerably faster. Both ultimately give you the record as XML, so switching between them changes only a few lines.

How do I extract a specific field like the account name?

Event-specific fields live in the EventData block as <Data Name="..."> elements. Match on the Name attribute — for example EventData/Data[@Name='TargetUserName'] — and read the element's text, as shown by the get_data helper in Step 3.
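
If you want every field rather than one, a variation on the same idea is to collect all the named Data elements into a dictionary. This is a minimal sketch: event_data_dict is a name I made up, and it assumes the Name attributes within a single event are unique, so a repeated name would overwrite the earlier value.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_data_dict(root):
    """Map each named EventData/Data element to its text.

    Data elements without a Name attribute are skipped; an empty
    element maps to None.
    """
    return {
        node.get("Name"): node.text
        for node in root.findall("e:EventData/e:Data", NS)
        if node.get("Name") is not None
    }
```

From there fields["TargetUserName"] or fields.get("IpAddress") reads like any other dictionary lookup, which is convenient when you are writing rows out to CSV or JSON.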

Conclusion

Parsing .evtx files in Python is not hard once you accept that the format is binary XML and let a real parser handle it. python-evtx gives you each record as XML, the standard library reads the fields, and the only genuine gotcha is the XML namespace that quietly breaks every path you write until you register it.

The honest limitation is performance: the pure-Python parser is fine for an export from a single host but slow across a large collection, which is where the Rust-backed parser earns its compiled dependency. Either way, the value is portability — the same script reads Windows logs on a Linux analysis box, and the parsed output feeds straight into counting, correlation, or whatever the investigation needs next.

Related Posts

Essential Windows Event IDs for Security Monitoring: which Event IDs to filter for once you can read the logs.
PowerShell Quick Guide: Process Investigation, the live-system counterpart to this offline parsing approach.
From Logs to Threats: SIEM Correlation Rules for Real Attacks, where parsed events turn into detections.

Editorial note: posts on this blog are drafted with AI assistance and then reviewed, edited, and tested against a real environment before publishing. Commands, output, and screenshots come from systems I actually ran the work on.

Through Security Scriptographer, I transform complex security concepts into practical scripts and tutorials. Proficient in PowerShell, Python and various security frameworks, I'm here to help others enhance their security toolkit. Simple code, serious security. 🛡️
