From Logs to Threats: SIEM Correlation Rules for Real Attacks

Hey there, fellow threat hunters! 👋 Today we're talking about something that separates the SIEM wizards from the alert-drowning masses: correlation rules. Because collecting millions of logs is easy, but actually catching the bad guys? That's where the magic happens. If your SIEM is just a very expensive log storage system generating more noise than a construction site, this one's for you!

The Single Event Trap

Here's the thing about security events - attackers rarely announce themselves with a single, obvious "I'M BEING MALICIOUS" event. Real attacks are like a story unfolding over time, with each event being just one chapter. A failed login attempt? Could be a typo. Fifty failed login attempts followed by a successful one, then immediate lateral movement? That's a different story entirely.

Most security teams start their SIEM journey by creating rules for individual events:

  • Alert on Event ID 4625 (failed logons)
  • Alert on Event ID 4720 (user account creation)
  • Alert on suspicious PowerShell execution

While these individual rules have their place, they're like trying to understand a movie by watching random 30-second clips. You'll miss the plot entirely. This is where correlation comes in - it's about connecting the dots across time, systems, and users to tell the complete attack story.

Understanding Attack Chains

Real attackers follow patterns mapped beautifully in the MITRE ATT&CK framework. They don't just magically appear with domain admin privileges - they follow a progression:

  • Initial Access: Phishing, exploit, credential stuffing
  • Execution: PowerShell, WMI, scheduled tasks
  • Persistence: Registry modifications, service creation
  • Privilege Escalation: Exploit elevation, credential theft
  • Lateral Movement: Pass-the-hash, RDP, network shares

Each step generates multiple log events. The key is correlating these events across time to identify the complete attack chain rather than treating each event in isolation.
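To make the idea of an ordered attack chain concrete, here is a minimal Python sketch. It assumes events have already been tagged with a stage label during enrichment; the `Event` class, the stage names, and the one-hour `max_span` are illustrative assumptions, not part of any SIEM product.

```python
from dataclasses import dataclass

# Hypothetical stage labels; a real pipeline would derive them from
# event IDs or ATT&CK technique mappings during enrichment.
CHAIN = ("initial_access", "execution", "persistence")


@dataclass
class Event:
    timestamp: float  # epoch seconds
    host: str
    stage: str


def chain_completed(events, chain=CHAIN, max_span=3600):
    """True if the stages in `chain` occur in order within `max_span` seconds."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    for i, start in enumerate(ordered):
        if start.stage != chain[0]:
            continue
        remaining = list(chain[1:])
        for ev in ordered[i + 1:]:
            if not remaining or ev.timestamp - start.timestamp > max_span:
                break
            if ev.stage == remaining[0]:
                remaining.pop(0)
        if not remaining:
            return True
    return False


def hosts_with_full_chain(events, chain=CHAIN, max_span=3600):
    """Group events by host and return the hosts that show the whole chain."""
    by_host = {}
    for ev in events:
        by_host.setdefault(ev.host, []).append(ev)
    return sorted(h for h, evs in by_host.items()
                  if chain_completed(evs, chain, max_span))
```

A host that shows the stages out of order, or spread over more than `max_span` seconds, is not reported. That is the difference between correlating a chain and alerting on each event in isolation.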

Attack Scenario 1: RDP Brute Force to Lateral Movement

Let's start with a classic - RDP brute force attacks that actually succeed. Here's what this looks like in the logs:

The Attack Timeline:

  • T+0 to T+300: Multiple Event ID 4625 (failed logons) from single IP
  • T+305: Event ID 4624 (successful logon) from same IP
  • T+310: Event ID 4648 (explicit credential use) - attacker trying other systems
  • T+320: Event ID 5140 (network share access) - lateral movement begins

SIEM Correlation Rule (Splunk Example):

index=windows EventCode=4625
| bucket _time span=5m
| stats count AS failed_count BY src_ip, dest_host, _time
| where failed_count > 10
| rename _time AS window_start
| join type=inner max=0 src_ip
    [ search index=windows EventCode=4624
    | eval success_time=_time
    | fields src_ip, success_time, user ]
| where success_time > window_start AND success_time < (window_start + 900)
| join type=inner max=0 src_ip
    [ search index=windows (EventCode=4648 OR EventCode=5140)
    | eval lateral_time=_time
    | rename dest_host AS lateral_dest
    | fields src_ip, lateral_time, lateral_dest ]
| where lateral_time > success_time AND lateral_time < (success_time + 600)
| eval correlation_id=md5(src_ip . tostring(success_time))
| table src_ip, dest_host, lateral_dest, user, correlation_id, window_start

Key Correlation Elements:

  • Time window: Events must occur within specific timeframes
  • Source IP persistence: Same attacker IP across all events
  • Escalation pattern: Failed attempts → success → lateral movement
  • Threshold logic: More than 10 failed attempts (tune based on environment)
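The same failed → success → lateral movement logic can be sketched outside the SIEM in a few lines of Python, which is handy for testing thresholds against exported logs. This is a simplified sketch under stated assumptions: events arrive as `(epoch_seconds, src_ip, event_code)` tuples, failures are counted in the window immediately before the successful logon, and the function name and default windows are hypothetical.

```python
from collections import defaultdict

FAILED, SUCCESS = "4625", "4624"
LATERAL = {"4648", "5140"}


def correlate_bruteforce(events, min_failures=10, fail_window=300,
                         lateral_window=600):
    """Return one hit per source IP that shows failures -> success -> lateral.

    events: iterable of (epoch_seconds, src_ip, event_code) tuples.
    """
    by_ip = defaultdict(list)
    for ts, ip, code in sorted(events):
        by_ip[ip].append((ts, code))

    hits = []
    for ip, evs in by_ip.items():
        fail_times = [ts for ts, code in evs if code == FAILED]
        for ts, code in evs:
            if code != SUCCESS:
                continue
            # Failures in the window leading up to this successful logon.
            recent = [f for f in fail_times if 0 < ts - f <= fail_window]
            if len(recent) <= min_failures:
                continue
            # Lateral-movement events shortly after the successful logon.
            lateral = [t for t, c in evs
                       if c in LATERAL and 0 < t - ts <= lateral_window]
            if lateral:
                hits.append({
                    "src_ip": ip,
                    "failures": len(recent),
                    "success_time": ts,
                    "first_lateral": min(lateral),
                })
                break  # one hit per source IP is enough for an alert
    return hits
```

Replaying the timeline above (failures up to T+300, success at T+305, lateral activity at T+310 and T+320) should produce a single hit, while a user who mistypes a password three times and then logs in should not.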

Attack Scenario 2: Malicious PowerShell Execution Chain

PowerShell attacks are particularly sneaky because PowerShell is a legitimate administrative tool. Here's how to correlate suspicious PowerShell usage:

The Attack Pattern:

  • Process Creation: Event ID 4688 with PowerShell execution
  • Script Block Logging: Event ID 4104 with suspicious content
  • Network Activity: Outbound connections or file downloads
  • File Creation: New executable files in temp directories

ELK Stack Correlation Query:

GET windows-logs/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "bool": {
                  "must": [
                    {"term": {"event.code": "4688"}},
                    {"wildcard": {"process.command_line": "*powershell*"}}
                  ]
                }
              },
              {
                "bool": {
                  "must": [
                    {"term": {"event.code": "4104"}},
                    {
                      "bool": {
                        "should": [
                          {"wildcard": {"powershell.script_block_text": "*downloadstring*"}},
                          {"wildcard": {"powershell.script_block_text": "*invoke-expression*"}},
                          {"wildcard": {"powershell.script_block_text": "*-encodedcommand*"}}
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-1h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "by_user": {
      "terms": {
        "field": "user.name",
        "size": 10
      },
      "aggs": {
        "event_timeline": {
          "date_histogram": {
            "field": "@timestamp",
            "fixed_interval": "1m"
          }
        }
      }
    }
  }
}

What Makes This Effective:

  • Behavioral focus: Looks for download and execution keywords in script blocks
  • User correlation: Groups events by user account
  • Timeline analysis: The per-minute histogram shows progression over time
  • Multiple indicators: Matches process creation or suspicious script content; a user who shows both deserves the closest look
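The query above matches either indicator, so a follow-up step is needed to find users who show both. Here is a minimal Python sketch of that step, assuming the search hits have been flattened into dicts with hypothetical `user`, `event_code`, `command_line`, and `script_block_text` keys. It matches case-insensitively, because PowerShell itself ignores case and attackers routinely write things like `DoWnLoAdStRiNg`.

```python
import re
from collections import defaultdict

# Same keywords as the query above, matched regardless of case.
SUSPICIOUS = re.compile(
    r"downloadstring|invoke-expression|-encodedcommand", re.IGNORECASE
)


def users_with_both_indicators(events):
    """Return users with a PowerShell process start AND a suspicious script block."""
    seen = defaultdict(set)
    for ev in events:
        code = ev.get("event_code")
        if code == "4688" and "powershell" in ev.get("command_line", "").lower():
            seen[ev["user"]].add("process")
        elif code == "4104" and SUSPICIOUS.search(ev.get("script_block_text", "")):
            seen[ev["user"]].add("script")
    return sorted(u for u, kinds in seen.items() if kinds == {"process", "script"})
```

Requiring both indicators per user trades some recall for a much shorter alert queue; whether that trade is right depends on how much PowerShell your admins run.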

Attack Scenario 3: Persistence Through Registry and Services

Attackers love persistence mechanisms because they want their access to survive reboots. Here's how to correlate registry modifications with service creation:

The Persistence Pattern:

  • Registry Modification: Event ID 4657 - Run keys or service parameters
  • Service Creation: Event ID 7045 - New service installation
  • File Creation: New executables in system directories
  • Process Creation: Service execution at startup

QRadar AQL Correlation:

SELECT
    username,
    sourceip,
    CONCAT(CONCAT(username, '@'), sourceip) AS user_ip,
    MIN(starttime) AS first_event,
    MAX(starttime) AS last_event,
    COUNT(*) AS event_count,
    UNIQUECOUNT(eventid) AS distinct_event_ids,
    UNIQUECOUNT(logsourceid) AS log_source_count
FROM events
WHERE
    (eventid = '4657' AND "Object Name" ILIKE '%\\Run%') OR
    (eventid = '7045') OR
    (eventid = '4688' AND "Process Name" ILIKE '%\\System32\\%')
GROUP BY username, sourceip
HAVING distinct_event_ids >= 2
ORDER BY first_event DESC
LAST 2 HOURS

Advanced Correlation Logic:

Splunk advanced correlation for persistence detection (events grouped into 30-minute buckets):

index=windows (EventCode=4657 OR EventCode=7045 OR EventCode=4688)
| eval correlation_key=coalesce(user, User_Name, Account_Name)
| eval event_time=_time
| bucket _time span=30m
| stats
    values(EventCode) as event_codes,
    values(Object_Name) as objects,
    values(Service_Name) as services,
    values(Process_Name) as processes,
    min(event_time) as start_time,
    max(event_time) as end_time,
    dc(EventCode) as unique_events
    by correlation_key, _time
| where unique_events >= 2
| eval duration = end_time - start_time
| where duration < 1800
| eval
    has_registry = if(isnotnull(mvfind(event_codes, "^4657$")), 1, 0),
    has_service = if(isnotnull(mvfind(event_codes, "^7045$")), 1, 0),
    has_process = if(isnotnull(mvfind(event_codes, "^4688$")), 1, 0)
| where (has_registry=1 AND has_service=1) OR (has_registry=1 AND has_process=1)
| eval threat_score = has_registry + has_service + has_process
| table correlation_key, start_time, duration, threat_score, event_codes, objects, services

Fine-Tuning Your Correlation Rules

Creating correlation rules is an art form. Here are the key principles that separate good rules from alert spam:

Time Windows Matter

  • Too narrow: Miss related events that happen minutes apart
  • Too wide: Correlate unrelated events, creating false positives
  • Sweet spot: Match your environment's typical admin patterns
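One way to pick a window is to replay historical events and watch how the number of correlated pairs changes as the window grows. The sketch below is a hypothetical helper, not a SIEM feature: it counts how often an event of one type is followed by an event of another type within a given number of seconds.

```python
from bisect import bisect_right


def correlated_pairs(first_times, second_times, window):
    """Count (first, second) pairs where second follows first within `window` seconds."""
    second_sorted = sorted(second_times)
    total = 0
    for t in first_times:
        lo = bisect_right(second_sorted, t)           # strictly after t
        hi = bisect_right(second_sorted, t + window)  # up to and including t + window
        total += hi - lo
    return total


if __name__ == "__main__":
    logons = [0, 1000]       # epoch seconds of successful logons
    shares = [30, 400, 5000]  # epoch seconds of share-access events
    for window in (60, 600, 86400):
        print(window, correlated_pairs(logons, shares, window))
```

If the pair count keeps climbing as the window widens, the extra matches are probably unrelated activity rather than one attacker's session.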

Baseline Your Environment

Before deploying correlation rules, understand your normal patterns:

Baseline PowerShell usage patterns and surface users with unusual hourly spikes:

index=windows EventCode=4688 Process_Name="*powershell*"
| bucket _time span=1h
| stats count by user, _time
| eval hour=strftime(_time, "%H")
| stats avg(count) as avg_hourly, max(count) as max_hourly by user, hour
| where max_hourly > (avg_hourly * 3)

Threshold Tuning

  • Start conservative: Higher thresholds, fewer false positives
  • Monitor performance: Complex correlations can impact SIEM performance
  • Iterate based on feedback: Analysts will tell you what's useful
  • Document exceptions: Known legitimate use cases

Common Correlation Pitfalls

Here's what we've learned from deploying correlation rules in the real world:

The "Everything Correlates" Trap

Don't try to correlate every event type. Focus on high-value correlations that indicate actual threats, not just "interesting" activity.

Time Zone Confusion

Make sure all your log sources are normalized to the same time zone, ideally UTC, and that their clocks are synchronized. Nothing breaks correlation like timestamp mismatches.
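If a source cannot be fixed upstream, timestamps can be normalized at ingestion. Here is a minimal Python sketch, assuming ISO 8601 timestamps and a known fixed offset for sources that omit one; the function name and parameter are hypothetical, and a production pipeline would use named time zones so daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp: str, assumed_offset_hours: float = 0.0) -> datetime:
    """Parse an ISO 8601 timestamp and return it as an aware UTC datetime.

    If the timestamp carries no offset, `assumed_offset_hours` is applied.
    A fixed offset ignores daylight saving time; that is a simplification.
    """
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(
            tzinfo=timezone(timedelta(hours=assumed_offset_hours))
        )
    return parsed.astimezone(timezone.utc)
```

Two events logged as 09:00 at UTC-5 and 15:00 at UTC+1 turn out to be the same instant once both are converted, which is exactly the kind of match a correlation rule would otherwise miss.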

Field Mapping Inconsistencies

Different log sources might use different field names for the same data. Normalize these during ingestion or account for variations in your rules.
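Normalization can be as simple as a table of aliases applied before events reach your rules. The Python sketch below mirrors what `coalesce` does in the Splunk example earlier; the alias lists are illustrative assumptions and would need to match the field names your own sources actually emit.

```python
# Canonical field name -> source-specific names, checked in order.
# These aliases are examples only; adjust them to your log sources.
FIELD_ALIASES = {
    "user": ("user", "User_Name", "Account_Name"),
    "src_ip": ("src_ip", "sourceip", "IpAddress"),
}


def normalize(event: dict) -> dict:
    """Return a copy of `event` with canonical fields filled from the first alias present."""
    normalized = dict(event)
    for canonical, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            value = event.get(alias)
            if value not in (None, ""):
                normalized[canonical] = value
                break
    return normalized
```

The original fields are kept alongside the canonical ones, so analysts can still see what the source reported.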

Integration with Threat Intelligence

Take your correlation rules to the next level by incorporating threat intelligence:

Correlate successful logons from known bad IPs with follow-on activity:

index=windows EventCode=4624
| lookup threat_intel_ips ip AS src_ip OUTPUT threat_type, confidence
| where confidence > 70
| join type=inner src_ip
    [ search index=windows (EventCode=4648 OR EventCode=5140)
    | fields src_ip ]
| stats count by src_ip, user, threat_type
| where count > 1

Measuring Success

How do you know if your correlation rules are working? Track these metrics:

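  • True Positive Rate: Percentage of alerts that turn out to be actual threats (strictly speaking, alert precision)
  • Time to Detection: How quickly rules identify attack patterns after the first malicious event
  • False Positive Rate: Percentage of alerts that turn out to be benign; keep this under 10% for analyst sanity
  • Coverage: What percentage of the MITRE ATT&CK techniques relevant to your environment you can detect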
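The first three metrics fall out of your triage data with very little code. Here is a minimal Python sketch, assuming each closed alert is a dict with a `true_positive` flag and, for confirmed incidents, `attack_start` and `alert_time` in epoch seconds. These field names are hypothetical, so map them to whatever your ticketing system exports.

```python
from statistics import mean


def rule_metrics(alerts):
    """Summarize triaged alerts for one correlation rule.

    Returns None when there are no alerts to measure.
    """
    total = len(alerts)
    if total == 0:
        return None
    confirmed = [a for a in alerts if a["true_positive"]]
    delays = [a["alert_time"] - a["attack_start"] for a in confirmed]
    return {
        "true_positive_share": len(confirmed) / total,
        "false_positive_share": (total - len(confirmed)) / total,
        "mean_time_to_detect": mean(delays) if delays else None,
    }
```

Tracking these per rule rather than per SIEM makes it obvious which correlations earn their keep and which ones need tuning or retirement.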

Wrapping Up

Correlation rules transform your SIEM from a log vacuum into a threat detection powerhouse. The key is thinking like an attacker - understanding the sequence of actions they need to take and building rules that identify those patterns.

Start simple with basic correlations like failed/successful logons, then gradually build more sophisticated rules as you understand your environment better. Remember, the goal isn't to correlate everything - it's to correlate the things that matter for detecting real attacks.

Your future self (and your security team) will thank you when the SIEM starts catching actual threats instead of just generating pretty dashboards of meaningless metrics.

The best correlation rules are like a good detective story - they connect seemingly unrelated events to reveal the bigger picture. And unlike TV detectives, your SIEM never gets tired, never takes coffee breaks, and never misses the obvious clue because it was distracted by personal drama.

Stay safe, and happy hunting! 🕵️‍♂️

P.S. Remember, correlation rules are living documents. As attackers evolve their techniques, your rules need to evolve too. Regular tuning isn't just recommended - it's essential for staying ahead of the threats.
