Blue Team · Hard
SIEM Alert Correlation

Master the SOC analyst's core synthesis skill — understanding why no single SIEM alert tells the full story, correlating events across identity, endpoint, network, and email sources to reconstruct a complete kill chain, writing SPL and KQL correlation rules that fire on multi-source attack patterns, mapping every observed technique to MITRE ATT&CK, and producing a professional incident timeline and severity-rated report suitable for management and legal review.

SIEM and Alert Correlation

A Security Information and Event Management system aggregates logs from across the entire environment — endpoints, network devices, cloud services, identity platforms — and applies correlation rules to surface potential threats. The challenge is that a single attack generates dozens of alerts across different systems. The analyst's job is to correlate these into a coherent attack timeline.

Alert correlation is the core analytical skill of a SOC analyst. It transforms individual data points — a failed login here, a suspicious process there, an unusual outbound connection — into a unified understanding of what the attacker did, in what order, and where they are now. This synthesis skill is what separates an analyst who can investigate from one who can only respond to pre-configured alerts.

💡 The full picture: No single alert tells the whole story. A failed login is noise. A failed login + success from a different country + large download + email forwarding rule creation = confirmed account takeover, requiring four separate response actions. Correlation is what transforms noise into signal.

Why Correlation Requires Multiple Sources

Each SIEM data source has a narrow window of visibility. The email gateway sees delivery and clicks but not what the process did after executing the attachment. The endpoint EDR sees process execution and file creation but not the network traffic to C2. The firewall sees the outbound connection but not which process on which endpoint made it. The identity platform sees authentication events but not what the authenticated session accessed. Only by joining all four sources on the common fields — timestamp, username, hostname, IP — does the complete attack chain emerge.

This is not a limitation to be overcome with better tools. It is the fundamental architecture of modern enterprise IT: different systems log different things, and the attacker's trail runs through all of them. The analyst who can correlate across all sources is the analyst who catches attacks that single-source rules miss entirely.

📌 Non-Technical Analogy

Imagine reconstructing a bank robbery from security footage. Camera 1 (entrance) shows someone entering at 14:22 — just a person walking in, nothing unusual. Camera 2 (vault corridor) shows the same person at 14:31 — still not suspicious on its own, customers walk there. Camera 3 (vault) shows them at 14:33 removing contents from a safe deposit box. Camera 4 (exit) shows them leaving at 14:47 carrying a bag they didn't have when they entered. Any single camera shows nothing definitively wrong. But joined by timestamp and person, across all four cameras, the complete crime is visible. SIEM correlation is this — joining events from different systems on shared identifiers to reconstruct a crime that was invisible from any single perspective.

SIEM Data Sources

Core SIEM Ingestion Sources — What Each Sees
Identity    Azure AD, Active Directory -- logon events, MFA, privilege changes
Endpoint    EDR, Sysmon, Windows Events -- process, file, registry, DLL activity
Network     Firewall, proxy, DNS, NetFlow -- connections, traffic patterns
Email       Mail gateway -- phishing delivery, link clicks, attachment activity
Cloud       AWS CloudTrail, Azure Activity Log -- API calls, config changes
Application Web app logs, database audit, API gateways

SIEM Correlation in Practice

Example 01: Correlating a phishing-to-compromise kill chain

Individual alerts from different sources that appear unrelated in isolation are actually one complete attack chain when joined by user and time.

09:14  [EMAIL]    Phishing email delivered to j.smith@corp.com
09:22  [PROXY]    j.smith clicks link: malicious-site.xyz/invoice.html
09:23  [PROXY]    payload.exe downloaded from malicious-site.xyz
09:24  [ENDPOINT] New process: payload.exe spawned by chrome.exe
09:25  [ENDPOINT] Encoded PowerShell executed (C2 staging)
09:26  [FIREWALL] Outbound: CORP-WS-022 to 185.220.101.45:443 ESTABLISHED
09:26  [ENDPOINT] Persistence: HKCU Run key created
# Seven separate alerts from four sources. Joined on j.smith + CORP-WS-022 + timestamp:
# One complete phishing-to-C2-to-persistence chain -- 12 minutes start to finish
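
The join described above can be sketched in Python. This is a minimal illustration, not a SIEM implementation: it assumes the events have already been normalised into a common shape, that the proxy and endpoint logs carry both user and hostname, and that the firewall log carries hostname only. The `correlate` function and the two unrelated noise events are hypothetical additions for illustration.

```python
from datetime import datetime

FIELDS = ("time", "source", "user", "host", "event")

# Values follow the example above; the a.jones and CORP-WS-101 rows are
# hypothetical noise that the correlation should exclude.
RAW = [
    ("09:14", "EMAIL", "j.smith", None, "Phishing email delivered"),
    ("09:20", "FIREWALL", None, "CORP-WS-101", "Outbound HTTPS (unrelated)"),
    ("09:22", "PROXY", "j.smith", "CORP-WS-022", "Link clicked"),
    ("09:23", "PROXY", "j.smith", "CORP-WS-022", "payload.exe downloaded"),
    ("09:24", "ENDPOINT", "j.smith", "CORP-WS-022", "payload.exe spawned by chrome.exe"),
    ("09:25", "EMAIL", "a.jones", None, "Newsletter delivered (unrelated)"),
    ("09:25", "ENDPOINT", "j.smith", "CORP-WS-022", "Encoded PowerShell executed"),
    ("09:26", "FIREWALL", None, "CORP-WS-022", "Outbound to 185.220.101.45:443"),
    ("09:26", "ENDPOINT", "j.smith", "CORP-WS-022", "HKCU Run key created"),
]
EVENTS = [dict(zip(FIELDS, row)) for row in RAW]


def parse(hhmm):
    """Parse an HH:MM string so events can be ordered and subtracted."""
    return datetime.strptime(hhmm, "%H:%M")


def correlate(events, user):
    """Return events tied to `user` directly, or via a host the user was seen on."""
    hosts = {e["host"] for e in events if e["user"] == user and e["host"]}
    related = [e for e in events if e["user"] == user or e["host"] in hosts]
    return sorted(related, key=lambda e: parse(e["time"]))


chain = correlate(EVENTS, "j.smith")
duration_minutes = (parse(chain[-1]["time"]) - parse(chain[0]["time"])).seconds // 60
for e in chain:
    print(e["time"], f"[{e['source']}]", e["event"])
```

The pivot through `hosts` is what pulls in the firewall event, which has no username of its own.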

Example 02: Writing a correlation rule (impossible travel)

Correlation rules automatically detect multi-event patterns across sources. This Splunk SPL rule fires when the same user signs in from two or more countries within a sliding one-hour window.

# Splunk SPL -- impossible travel detection across Azure AD sign-in logs
# (index, sourcetype, and field names depend on how the logs were ingested):
index=auth sourcetype=azure_ad EventType=Signin
| sort 0 _time
| streamstats time_window=1h dc(Country) as country_count
        values(Country) as countries values(src_ip) as ips by user
| where country_count > 1
| stats latest(_time) as last_seen values(countries) as countries values(ips) as ips by user
| table user countries ips last_seen
# sort 0 _time: streamstats time_window needs the events in time order
# Fires when same user logs in from 2+ countries within any 1-hour window,
# regardless of how long the overall search range is
# Severity: High -- strong indicator of account compromise; rule out VPN egress and geo-IP errors during triage
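
The sliding-window logic behind an impossible-travel rule can be sketched in Python. This is an illustrative sketch only: the `impossible_travel` function and the sample sign-ins are hypothetical, and it assumes each sign-in has already been reduced to an epoch timestamp, a username, and a country code.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one hour, matching the rule above


def impossible_travel(signins, window=WINDOW_SECONDS):
    """Return (user, country_a, country_b, gap_seconds) for suspicious pairs.

    signins: iterable of (epoch_seconds, user, country).
    Comparing consecutive sign-ins per user is sufficient: if any two
    sign-ins from different countries fall inside the window, some
    consecutive pair between them also differs and is at least as close.
    """
    by_user = defaultdict(list)
    for ts, user, country in signins:
        by_user[user].append((ts, country))

    hits = []
    for user, rows in by_user.items():
        rows.sort(key=lambda r: r[0])
        for (t1, c1), (t2, c2) in zip(rows, rows[1:]):
            if c1 != c2 and t2 - t1 < window:
                hits.append((user, c1, c2, t2 - t1))
    return hits


# Hypothetical sample data
sample = [
    (0, "alice", "GB"), (1800, "alice", "RU"),  # 30 minutes apart: flagged
    (0, "bob", "GB"), (7200, "bob", "FR"),      # 2 hours apart: not flagged
    (0, "carol", "US"), (10, "carol", "US"),    # same country: not flagged
]
hits = impossible_travel(sample)
print(hits)
```

Measuring the gap between the specific sign-ins that differ, rather than between a user's first and last sign-in in the search range, keeps the rule from being suppressed by unrelated earlier activity.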

Example 03: MITRE ATT&CK mapping

Mapping observed behaviours to ATT&CK provides a common language for reporting and gap analysis — revealing which techniques your detection coverage misses.

Initial Access    T1566.002  Spearphishing Link (clicked link led to payload download)
Execution         T1059.001  PowerShell (encoded command, -enc flag)
Persistence       T1547.001  Registry Run Keys (HKCU\...\Run)
C2                T1071.001  Web Protocols (HTTPS port 443)
Lateral Movement  T1021.001  Remote Desktop Protocol (RDP to CORP-SRV-02)

Example 04: Building the incident timeline

The incident timeline is the primary deliverable — a chronological narrative that documents every confirmed event, suitable for management briefing, legal review, and insurance claims.

14 May 2026 -- Incident INC-2026-047

09:14:33  Phishing email delivered to j.smith (mail gateway)
09:22:01  j.smith clicks embedded link (proxy log)
09:23:44  payload.exe downloaded from malicious-site.xyz (proxy + endpoint)
09:24:12  payload.exe executed; parent: chrome.exe (Sysmon Event 1)
09:25:01  Encoded PowerShell executed, C2 staging (Sysmon Event 1)
09:26:18  Outbound HTTPS to 185.220.101.45:443 ESTABLISHED (firewall)
09:26:55  HKCU Run key: WindowsUpdate persistence (Sysmon Event 13)
10:14:22  RDP lateral movement to CORP-SRV-02 (Event 4624 Type 10)

Example 05: Severity scoring and escalation thresholds

Every incident needs a severity rating to determine escalation path, response SLA, and resource allocation. These thresholds should be defined in the IR playbook before incidents occur, not decided under pressure during one.

Critical  Active ransomware / confirmed data exfiltration / domain compromise
          SLA: 15-minute response, CISO notification, crisis team activated

High      Active C2 beacon / lateral movement confirmed / privileged account compromise
          SLA: 1-hour response, security manager notification, IR playbook initiated

Medium    Phishing delivered (no click confirmed) / policy violation / anomalous behaviour
          SLA: 4-hour response, analyst investigation, potential escalation

Low       Blocked attempt / informational alert / single failed login
          SLA: Next business day, log and continue monitoring
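
One way to keep these thresholds out of in-the-moment debate is to encode them as data. The sketch below is a hypothetical illustration: the indicator names, the `rate` function, and the choice to take the highest matching severity are assumptions, while the tiers and SLA values come from the table above.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Response SLA in minutes; None stands for "next business day".
SLA_MINUTES = {
    Severity.CRITICAL: 15,
    Severity.HIGH: 60,
    Severity.MEDIUM: 240,
    Severity.LOW: None,
}

# Hypothetical indicator names mapped to the tiers in the table above.
INDICATOR_SEVERITY = {
    "active_ransomware": Severity.CRITICAL,
    "confirmed_exfiltration": Severity.CRITICAL,
    "domain_compromise": Severity.CRITICAL,
    "active_c2_beacon": Severity.HIGH,
    "lateral_movement": Severity.HIGH,
    "privileged_account_compromise": Severity.HIGH,
    "phishing_delivered": Severity.MEDIUM,
    "policy_violation": Severity.MEDIUM,
    "anomalous_behaviour": Severity.MEDIUM,
    "blocked_attempt": Severity.LOW,
    "informational": Severity.LOW,
    "single_failed_login": Severity.LOW,
}


def rate(indicators):
    """Incident severity is the highest tier among its confirmed indicators."""
    return max((INDICATOR_SEVERITY[i] for i in indicators), default=Severity.LOW)


incident = ["phishing_delivered", "active_c2_beacon"]
severity = rate(incident)
print(severity.name, SLA_MINUTES[severity])
```

An unknown indicator name raises `KeyError` here, which is deliberate in this sketch: an unclassified indicator should be added to the playbook rather than silently rated.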

What You Need to Know

🔗
Alert Correlation
Connecting related alerts across data sources by time, user, IP, and hostname to reveal an attack chain that no single alert would surface. The core analytical skill that distinguishes SOC investigators from alert processors.
🗺️
MITRE ATT&CK
A globally accessible adversary behaviour knowledge base. Mapping incidents to ATT&CK provides a common language for reporting, enables cross-organisation comparison, and reveals detection coverage gaps by tactic.
⏱️
Incident Timeline
A chronological narrative of every confirmed event with source attribution. The primary deliverable of any SIEM investigation — used for management briefing, legal review, insurance, and post-incident lessons learned.
📊
SIEM Query Languages
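Splunk uses SPL. Microsoft Sentinel uses KQL (Kusto Query Language). Elastic uses EQL and the Kibana Query Language, which is also abbreviated KQL but is a separate language. Different syntax but the same goal: filter, aggregate, and correlate event data at scale to surface threats that individual rules miss.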

Writing Effective Correlation Rules

Correlation rules are the automation layer that makes a SIEM more than a log storage system. A well-designed rule fires reliably on real attacks, rarely on legitimate activity, and when it fires, provides enough context for an analyst to make an immediate triage decision without additional investigation. A poorly designed rule either floods the queue with false positives (training analysts to ignore it) or misses attacks entirely through over-specificity.

The balance between sensitivity and specificity is the central challenge of rule design. Rules should be specific enough to have low false positive rates — but not so specific that they only catch one historical attack and miss all variants. The best rules are written at the behavioural level (technique) rather than the indicator level (specific IP or hash), because techniques persist while indicators rotate.

Effective Rule Patterns

Multi-source correlation: Rules that require evidence from two or more independent sources raise confidence dramatically. A C2 connection that also has a corresponding process injection on the endpoint is more reliable than either signal alone.

Sequence detection: Rules that require events to occur in a specific order within a time window (phishing delivery → link click → payload execution, all within 15 minutes) filter out coincidental co-occurrence of unrelated events.

Baseline deviation: Rules that compare current to historical baseline (user downloaded 10x their normal weekly volume today) catch behaviour that no signature covers but that stands out against personal history.
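
Two of the patterns above, sequence detection and baseline deviation, can be sketched in Python. This is a simplified illustration under stated assumptions: events are already normalised to `(epoch_seconds, user, event_type)` tuples, the event type names are hypothetical, and the baseline is a plain mean of past weekly volumes.

```python
from collections import defaultdict


def detect_sequence(events, steps, window_seconds):
    """Return users whose events contain `steps` in order within the window.

    The window is measured from each occurrence of the first step.
    """
    by_user = defaultdict(list)
    for ts, user, kind in events:
        by_user[user].append((ts, kind))

    matched = set()
    for user, rows in by_user.items():
        rows.sort(key=lambda r: r[0])
        for i, (start, kind) in enumerate(rows):
            if kind != steps[0]:
                continue
            next_step = 1
            for ts, k in rows[i + 1:]:
                if ts - start > window_seconds:
                    break
                if next_step < len(steps) and k == steps[next_step]:
                    next_step += 1
            if next_step == len(steps):
                matched.add(user)
                break
    return matched


def exceeds_baseline(weekly_history, today, factor=10):
    """True when today's volume is at least `factor` times the weekly mean."""
    baseline = sum(weekly_history) / len(weekly_history)
    return today >= factor * baseline


STEPS = ["phish_delivered", "link_click", "payload_exec"]
events = [
    (0, "alice", "phish_delivered"), (300, "alice", "link_click"), (600, "alice", "payload_exec"),
    (0, "bob", "payload_exec"), (100, "bob", "link_click"), (200, "bob", "phish_delivered"),
    (0, "carol", "phish_delivered"), (100, "carol", "link_click"), (5000, "carol", "payload_exec"),
]
print(detect_sequence(events, STEPS, window_seconds=15 * 60))
print(exceeds_baseline([2.0, 3.0, 1.0], today=25.0))
```

In the sample, bob has all three event types but in the wrong order, and carol has the right order but falls outside the 15-minute window, so only alice matches.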

Common Rule Failure Modes

Too broad: "Any process that runs PowerShell" fires on every legitimate IT operation. Narrow by: PowerShell with -enc flag, OR from a non-admin user account, OR spawned by Office applications, NOT from known admin hosts during maintenance windows.

Too specific: "Alert on IP 185.220.101.45" catches exactly one historical attack. A better rule: "Any internal host establishing HTTPS connection to IP with no prior history + high entropy certificate + beacon timing regularity."

Missing time bounds: Rules without time windows can correlate events days apart that have no causal relationship. Most attack sequences complete within hours — a 2-hour window eliminates most spurious correlations.
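
The narrowing described for the "too broad" PowerShell rule can be expressed as a predicate. The sketch below is hypothetical: the event field names, the list of Office parent processes, and the `should_alert` function are assumptions for illustration. It treats any prefix of `-EncodedCommand` (such as `-enc` or `-e`) as the encoded flag, since PowerShell accepts abbreviated parameter names.

```python
OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe", "powerpnt.exe"}
POWERSHELL_NAMES = {"powershell.exe", "pwsh.exe"}


def has_encoded_flag(command_line):
    """Detect -EncodedCommand, its abbreviations (-e, -en, -enc, ...), or -ec."""
    for token in command_line.lower().split():
        if token.startswith(("-", "/")):
            flag = token[1:]
            if flag == "ec" or (flag and "encodedcommand".startswith(flag)):
                return True
    return False


def should_alert(event, admin_hosts, in_maintenance_window):
    """Alert on risky PowerShell, excluding admin hosts during maintenance."""
    if event["process"].lower() not in POWERSHELL_NAMES:
        return False
    if event["host"] in admin_hosts and in_maintenance_window:
        return False
    return (
        has_encoded_flag(event["command_line"])
        or not event["user_is_admin"]
        or event["parent"].lower() in OFFICE_PARENTS
    )


ADMIN_HOSTS = {"CORP-ADM-01"}  # hypothetical admin workstation

suspicious = {"process": "powershell.exe", "host": "CORP-WS-022", "user_is_admin": False,
              "parent": "WINWORD.EXE", "command_line": "powershell.exe -nop -enc SQBFAFgA"}
maintenance = {"process": "powershell.exe", "host": "CORP-ADM-01", "user_is_admin": True,
               "parent": "explorer.exe", "command_line": "powershell.exe -enc SQBFAFgA"}
routine = {"process": "powershell.exe", "host": "CORP-WS-050", "user_is_admin": True,
           "parent": "explorer.exe", "command_line": "powershell.exe -ExecutionPolicy Bypass -File patch.ps1"}

print(should_alert(suspicious, ADMIN_HOSTS, in_maintenance_window=False))
```

Before deploying anything like this, replay it against a sample of known-good activity; the non-admin clause on its own may still be noisy in environments where ordinary users run logon scripts.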

Example 06: Multi-source C2 detection rule (KQL/Sentinel)

A correlation rule that requires both endpoint (process creation) and network (firewall outbound) evidence within 5 minutes — dramatically reducing false positives compared to either signal alone.

// Microsoft Sentinel KQL -- multi-source C2 detection
// Requires: encoded PowerShell on endpoint AND outbound HTTPS to an external IP
// Assumption: a watchlist named 'HostInventory' (hypothetical) maps Hostname to IPAddress.
// It is needed because 4688 events identify the host by name, firewall logs by source IP.
// Firewall table and column names are kept from the original example; adjust to your schema.
let host_ips = _GetWatchlist('HostInventory')
| project Computer = tostring(Hostname), SourceIP = tostring(IPAddress);
let endpoint_events = SecurityEvent
| where EventID == 4688
| where CommandLine has "-enc" or CommandLine has "-EncodedCommand"
| project ProcessTime = TimeGenerated, Computer, Account, CommandLine
| join kind=inner (host_ips) on Computer;
let firewall_events = AzureFirewallNetworkRule
| where Action == "Allow" and toint(DestinationPort) == 443
| where not(ipv4_is_private(DestinationIP))
| project FirewallTime = TimeGenerated, SourceIP, DestinationIP;
endpoint_events
| join kind=inner (firewall_events) on SourceIP
| where FirewallTime between (ProcessTime .. ProcessTime + 5m)
| project ProcessTime, FirewallTime, Computer, Account, CommandLine, DestinationIP
// Fires only when: encoded PowerShell runs AND outbound HTTPS follows within 5 minutes
// Both signals required -- dramatically reduces false positives from either alone

MITRE ATT&CK as a Detection Coverage Map

MITRE ATT&CK is most useful not just as a labelling system for incidents, but as a structured map for identifying where your detection coverage has gaps. For each tactic in the kill chain, mapping your current SIEM rules to ATT&CK technique IDs reveals which techniques you can detect and — critically — which techniques an attacker could use in your environment without triggering any alert.

Tactic             Common Technique               ATT&CK ID   Detection Data Source
Initial Access     Spearphishing Attachment       T1566.001   Email gateway: attachment type + sender reputation + SPF/DKIM result
Execution          PowerShell (encoded)           T1059.001   Event 4688: CommandLine contains -enc or -EncodedCommand + non-admin parent
Persistence        Registry Run Keys              T1547.001   Sysmon Event 13: registry write to \CurrentVersion\Run by non-system process
Defence Evasion    Process Injection              T1055       Sysmon Event 8 (CreateRemoteThread) + Event 10 (ProcessAccess lsass)
Credential Access  OS Credential Dumping          T1003       Sysmon Event 10: lsass.exe accessed by non-system process; Event 4624 + NTLM
Lateral Movement   PsExec / SMB Admin Shares      T1021.002   Event 7045 (PSEXESVC) + Event 5140 (ADMIN$ write) from same source within 2 min
Collection         Data from Local System         T1005       File access audit: high volume reads from sensitive directories outside business hours
Exfiltration       Exfiltration Over Web Service  T1567       Proxy: large POST to cloud storage domains (Dropbox, MEGA, Google Drive) after hours

Coverage gap analysis: After mapping your current rules to this table, any tactic row with no coverage is a detection gap — a phase of the attack kill chain where an attacker could operate without triggering any SIEM alert. The gap analysis drives the detection development roadmap: prioritise writing rules for the tactics with the highest attacker value (Credential Access, Lateral Movement) that currently have no coverage.
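
The gap analysis itself is a small set operation. The sketch below is a hypothetical illustration: the rule names and the `coverage_gaps` function are invented, the technique-to-tactic mapping covers only the techniques in the table above, and lateral movement is keyed on T1021.002 (SMB/Windows Admin Shares).

```python
# Tactics in kill-chain order, as listed in the table above.
TACTICS = [
    "Initial Access", "Execution", "Persistence", "Defence Evasion",
    "Credential Access", "Lateral Movement", "Collection", "Exfiltration",
]

TECHNIQUE_TACTIC = {
    "T1566.001": "Initial Access",
    "T1059.001": "Execution",
    "T1547.001": "Persistence",
    "T1055": "Defence Evasion",
    "T1003": "Credential Access",
    "T1021.002": "Lateral Movement",
    "T1005": "Collection",
    "T1567": "Exfiltration",
}

# Tactics the text singles out as highest attacker value.
PRIORITY = {"Credential Access", "Lateral Movement"}


def coverage_gaps(rules):
    """Return uncovered tactics, priority tactics first, then kill-chain order.

    rules: dict of rule name -> list of ATT&CK technique IDs it detects.
    """
    covered = {
        TECHNIQUE_TACTIC[tid]
        for technique_ids in rules.values()
        for tid in technique_ids
        if tid in TECHNIQUE_TACTIC
    }
    gaps = [t for t in TACTICS if t not in covered]
    return sorted(gaps, key=lambda t: t not in PRIORITY)  # stable sort


# Hypothetical current rule set
current_rules = {
    "phish_attachment": ["T1566.001"],
    "encoded_powershell": ["T1059.001"],
    "run_key_write": ["T1547.001"],
}
gaps = coverage_gaps(current_rules)
print(gaps)
```

The ordered list doubles as a first draft of the detection development roadmap.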

Professional Incident Report Structure

The incident timeline is necessary but not sufficient as a deliverable. A professional incident report contextualises the timeline for different audiences — technical (what happened and how), management (business impact and remediation status), and legal (evidence chain and liability assessment). Writing for multiple audiences from a single investigation is a core analyst skill.

Required Report Sections

Executive summary: 3–5 sentences. What happened, who was affected, what was done. No technical jargon. Suitable for CISO to brief board.

Timeline: Complete chronological event list with timestamps, sources, and analyst confidence level for each entry. The factual backbone of the report.

Technical findings: Attack vector, tools/techniques used, IOCs, ATT&CK mapping, affected systems, data exposure assessment.

Containment and remediation: Every action taken, by whom, at what time. Patch applied, account reset, firewall rule added — all documented.

Recommendations: What would have detected or prevented this incident? New SIEM rule, policy change, security control. Prioritised by risk reduction value.

Quality Standards

Source attribution for every finding: Never write "attacker accessed the database" without citing the log entry that proves it. Every claim in the technical section must reference a specific log source, timestamp, and field.

Distinguish confirmed from suspected: "Confirmed: attacker established C2 (firewall log, 09:26:18)" versus "Suspected: attacker may have accessed source code (access logs not available for this server)." Unsupported claims undermine the entire report.

Confidence levels: High (multiple corroborating sources), Medium (single source, no corroboration), Low (inferred from context). Makes clear to legal counsel which findings are defensible and which are assessments.

Consistent timestamps and timezones: All timestamps in UTC, all sources normalised. A timeline with mixed timezones is a legal liability — courts and regulators will challenge inconsistencies.
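
Two of these standards, UTC normalisation and confidence levels, lend themselves to a short sketch. It is an illustration under stated assumptions: the entry format, the `build_timeline` function, and the sample UTC offsets are hypothetical, and confidence is derived only from the number of corroborating sources.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Convert an ISO 8601 timestamp with an explicit offset to UTC."""
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:
        # A timestamp without an offset cannot be normalised safely.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return dt.astimezone(timezone.utc)


def confidence(sources, inferred=False):
    """High: multiple corroborating sources. Medium: one source. Low: inferred."""
    if inferred or not sources:
        return "Low"
    return "High" if len(set(sources)) > 1 else "Medium"


def build_timeline(entries):
    """Return formatted timeline lines sorted by UTC time."""
    rows = [(to_utc(e["time"]), e["event"], e["sources"]) for e in entries]
    rows.sort(key=lambda r: r[0])
    return [
        f"{t:%H:%M:%S}Z  {event} ({', '.join(sources)}) [{confidence(sources)}]"
        for t, event, sources in rows
    ]


# Hypothetical entries: one source logs in UTC+1, the other in UTC.
entries = [
    {"time": "2026-05-14T10:23:44+01:00", "event": "payload.exe downloaded",
     "sources": ["proxy", "endpoint"]},
    {"time": "2026-05-14T09:14:33+00:00", "event": "Phishing email delivered",
     "sources": ["mail gateway"]},
]
timeline = build_timeline(entries)
for line in timeline:
    print(line)
```

Rejecting timestamps that lack an offset, rather than guessing a timezone, keeps an undocumented assumption out of a document that may be reviewed by legal counsel.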

Multi-Source Correlation — Full Kill Chain Investigation

SIEM Scenario: Phishing to Lateral Movement, Assembled from Six Data Sources

Starting point: SOC queue shows three separate alerts within 12 minutes: (1) Mail gateway: suspicious email delivered to j.smith, (2) Proxy: executable downloaded by j.smith's workstation, (3) Firewall: CORP-WS-022 establishing outbound HTTPS to an unknown IP. Each alert is rated Medium or below on its own. An analyst correlates all three on username j.smith and hostname CORP-WS-022.

Correlation query (Splunk SPL): Searches all sources for j.smith and CORP-WS-022 in a 30-minute window. Returns: email delivery at 09:14, proxy click and download at 09:22-09:23, Sysmon process creation at 09:24 (chrome.exe → payload.exe → powershell.exe -enc), firewall outbound HTTPS at 09:26, Sysmon registry write at 09:26 (HKCU\Run\WindowsUpdate). Four sources, six events, one attack chain confirmed in 4 minutes of investigation.

ATT&CK mapping: T1566.002 (Spearphishing Link) → T1059.001 (PowerShell encoded) → T1547.001 (Registry Run Key) → T1071.001 (HTTPS C2). Four techniques, four detection data sources, one complete kill chain from initial access to C2 establishment.

Escalation: Severity upgraded from Medium (individual alerts) to High (correlated C2 establishment). IR playbook IR02 (Malware Endpoint) initiated. CORP-WS-022 network quarantined via EDR. j.smith account disabled pending investigation. C2 IP 185.220.101.45 blocked at firewall. Lateral movement check initiated: EDR query for any connections from CORP-WS-022 to other internal hosts after 09:26.

Lateral movement finding: NetFlow shows CORP-WS-022 → CORP-SRV-02:3389 (RDP) at 10:14:22, successful (Event 4624 Type 10 on SRV-02). Scope expanded to two hosts. SRV-02 isolated. Timeline extended. Severity upgraded to Critical.

Incident report: Timeline documented across 6 sources with full source attribution. ATT&CK mapping provided for security operations post-mortem. Three recommendations: (1) Block executable downloads at the proxy for users who have recently clicked a suspected phishing link, using a SIEM rule to maintain the dynamic block list, (2) Real-time 4688+firewall correlation rule deployed (the one written in Example 06), (3) RDP from workstations to servers blocked at the firewall, as no legitimate use case was identified.

Core Concepts Summary

🔗
Correlation = Synthesis
Join events on username, hostname, and IP across sources. A single event is noise; the same user appearing in email + proxy + endpoint + firewall within minutes is a kill chain. Time-bound correlation windows eliminate spurious matches.
🗺️
ATT&CK as Coverage Map
Map every SIEM rule to a technique ID. Tactics with no coverage = blind spots. Credential Access and Lateral Movement gaps are highest priority — they're the steps between initial compromise and domain takeover.
⚙️
Rule Design Principles
Multi-source rules over single-source. Sequence detection with time windows. Baseline deviation for technique-level coverage. Narrow to behaviour (encoded PowerShell by non-admin) not indicator (specific IP). Test against known-good before deploying.
📊
SPL vs KQL
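Splunk SPL: pipe-based, stats/eval/where/table. Microsoft Sentinel KQL (Kusto): also pipe-based, with let statements, join, and summarize. Elastic's Kibana Query Language shares the KQL acronym but is a separate filter language without joins. Different syntax, identical goals. Learn the pattern, translate to your platform's syntax.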
⏱️
Timeline Quality
Source attribution for every event. UTC normalised. Confidence levels (High/Medium/Low). Confirmed findings kept separate from suspected ones and from analyst inferences. The timeline is a legal document: every entry must be defensible with a specific log reference.
🎯
Severity Escalation
Individual alert severity + correlation context = final severity. Three Medium alerts correlating to one kill chain = High or Critical. Pre-defined escalation thresholds with SLAs prevent inconsistent severity decisions under pressure.
📝
Report Audiences
Executive summary: 3-5 sentences, no jargon, business impact. Technical findings: source-attributed, ATT&CK-mapped, IOC-listed. Recommendations: prioritised by risk reduction. Same investigation, three different deliverables.
🔄
Findings to Rules
Every investigation should produce at least one new detection rule. The phishing-to-C2 chain you just correlated manually should be automated so the next instance fires immediately. Investigations close the gap once; rules close it permanently.