Master cloud-native incident response — reading CloudTrail events to reconstruct attack timelines, detecting IAM privilege escalation and persistence techniques, understanding the shared responsibility model, and applying AWS-native containment and eradication procedures that preserve forensic evidence while stopping active compromise.
Cloud Incident Response
Cloud IR follows the same principles as traditional incident response but requires understanding cloud-specific attack surfaces, log sources, and attacker techniques. Cloud environments present unique challenges: infrastructure is ephemeral (instances can be terminated, losing volatile evidence), logs may not persist by default, and the blast radius of a compromised cloud identity can be enormous — a single IAM key with excessive permissions can compromise an entire cloud estate within minutes.
This lab focuses on AWS, but the principles apply to Azure and GCP. The attacker techniques are similar across platforms; only the log source names and CLI syntax differ. The fundamental principle — that identity is the security boundary in cloud environments — is universal.
How Cloud IR Differs from Traditional IR
Traditional on-premises incident response assumes a relatively stable infrastructure: servers stay running, logs persist on disk, and the network perimeter provides some control over attacker movement. Cloud environments break all three of these assumptions in ways that require fundamentally different response procedures:
- Ephemeral infrastructure: Auto-scaling groups launch and terminate instances constantly. An instance involved in a security event may be terminated by normal scaling operations before you can capture its memory or disk state. Cloud IR requires proactive evidence preservation before attempting any investigation or remediation: snapshotting EBS volumes, enabling instance termination protection, and, for instances in an Auto Scaling group, setting scale-in protection or detaching the instance, since termination protection alone does not stop Auto Scaling from terminating it.
- Default log gaps: AWS CloudTrail is enabled by default for management API events, but S3 data events, Lambda invocations, and RDS query logs require explicit enablement. Many cloud breaches cannot be fully reconstructed because the relevant logs were never enabled. The first hardening question for cloud IR readiness is: what are we not logging?
- Identity-based lateral movement: In on-premises environments, lateral movement requires network access. In cloud environments, an attacker with stolen IAM credentials can pivot to any authorised resource from their own laptop — bypassing all network-layer controls. The blast radius is defined by IAM permissions, not network topology.
- Shared responsibility: AWS manages the security of the cloud infrastructure; you manage the security of what you put in the cloud. AWS will not respond to your incident — that is entirely your responsibility. Understanding what AWS logs are available, how to access them under time pressure, and what they do and don't contain is prerequisite knowledge for cloud IR.
Traditional IR is like investigating a burglary in a building you own: the rooms are still there, the CCTV footage is still on the DVR, and you can take your time examining everything. Cloud IR is like investigating a crime in a hotel where rooms are automatically reassigned every few hours. Security footage is kept for only 90 days by default, and the right footage exists only if you requested it in advance. The hotel's security team won't help you, because their responsibility ends at the building; what happens inside the room is yours. And the suspect did not break a window: they used a valid electronic keycard they obtained through a locksmith. Different tools, different assumptions, different urgency around evidence preservation.
Cloud-Specific Attack Techniques
- Credential theft: IAM key from GitHub, code repos, env vars, SSRF → metadata service
- Privilege escalation: PassRole, iam:CreatePolicyVersion, iam:AttachUserPolicy, iam:AddUserToGroup
- Persistence: New IAM user, access key creation, Lambda backdoor, SSO hijack
- Discovery: DescribeInstances, ListBuckets, GetCallerIdentity, ListRoles
- Lateral movement: AssumeRole, switch between accounts, cross-account trust abuse
- Exfiltration: S3 GetObject bulk download, RDS snapshot export, EBS snapshot share
- Impact: EC2 cryptomining, S3 ransomware (object deletion/encryption), route table
Primary forensic log source: AWS CloudTrail (management API calls by default, 90-day default retention in Event History)
Secondary sources: VPC Flow Logs, S3 Access Logs, CloudWatch Logs, GuardDuty findings
The IAM Privilege Escalation Landscape
IAM privilege escalation is one of the most critical categories for cloud IR analysts to understand. Attackers who obtain limited IAM credentials frequently use specific API call sequences to expand their permissions toward full administrator access. The Pacu framework documents over 30 distinct escalation paths from different starting IAM permission sets. Each path leaves a specific CloudTrail signature that a well-tuned alert rule can detect.
The most commonly observed paths in real incidents are: iam:CreatePolicyVersion (create a new policy version with AdministratorAccess and make it active), iam:AttachUserPolicy (attach an existing AWS-managed admin policy directly to the attacker's user), and iam:PassRole combined with service abuse (pass a highly-privileged role to an EC2 instance or Lambda, then access that service to execute code with the elevated permissions). All three share a common detection pattern: IAM modification events originating from an unexpected source IP or outside business hours.
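The shared detection pattern can be sketched as a small filter over CloudTrail records. This is a minimal illustration, not a production rule: the function name `is_suspicious`, the `known_ips` allow-list, and the 08:00 to 18:00 UTC business-hours window are all assumptions made for the example. `iam:PassRole` is left out because it is a permission check rather than a logged API call.

```python
from datetime import datetime, time

# IAM API calls named in the escalation paths above (illustrative, not exhaustive).
ESCALATION_EVENTS = {
    "CreatePolicyVersion",
    "SetDefaultPolicyVersion",
    "AttachUserPolicy",
    "AddUserToGroup",
}


def is_suspicious(record, known_ips, start=time(8, 0), end=time(18, 0)):
    """Flag an IAM modification made from an unknown IP or outside business hours.

    `record` is one CloudTrail event as a dict; eventTime is assumed to be UTC
    in the usual 'YYYY-MM-DDTHH:MM:SSZ' form. The hours window is hypothetical.
    """
    if record.get("eventSource") != "iam.amazonaws.com":
        return False
    if record.get("eventName") not in ESCALATION_EVENTS:
        return False
    when = datetime.strptime(record["eventTime"], "%Y-%m-%dT%H:%M:%SZ").time()
    off_hours = not (start <= when < end)
    unknown_ip = record.get("sourceIPAddress") not in known_ips
    return off_hours or unknown_ip
```

In practice the same logic would more likely live in a SIEM or EventBridge rule; the Python form just makes the two conditions explicit.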
Cloud IR Investigation in Practice
When an IAM key is exposed in a public repository, automated bots typically use it within minutes. CloudTrail shows API calls from unexpected sources providing the timeline of compromise.
# Developer pushes code with AWS key to public GitHub repo
# Within 8 minutes, CloudTrail shows:
Time: 14:33:01  Event: GetCallerIdentity  UserAgent: aws-cli/2.x  Source IP: 185.220.101.45 (Tor exit node -- not developer's IP)
Time: 14:33:15  Event: ListBuckets (enumerating S3)
Time: 14:33:44  Event: DescribeInstances (enumerating EC2)
Time: 14:34:02  Event: GetSecretValue  SecretId: prod/database/password
Time: 14:34:18  Event: CreateUser  UserName: backup-svc (persistence)
# Key compromised, recon complete, persistence established in 77 seconds
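Reconstructing a timeline like the one above is mostly a matter of sorting events and measuring offsets from first use. The sketch below assumes each record is a CloudTrail event dict with a UTC `eventTime` string; `build_timeline` is a hypothetical helper name, not part of any AWS SDK.

```python
from datetime import datetime


def build_timeline(records):
    """Return (seconds_since_first_event, eventName, sourceIPAddress) tuples in time order."""
    parsed = sorted(
        (
            (datetime.strptime(r["eventTime"], "%Y-%m-%dT%H:%M:%SZ"), r)
            for r in records
        ),
        key=lambda pair: pair[0],
    )
    if not parsed:
        return []
    first = parsed[0][0]
    return [
        (int((ts - first).total_seconds()), r["eventName"], r.get("sourceIPAddress"))
        for ts, r in parsed
    ]
```

Fed the five events shown above, the last tuple carries an offset of 77 seconds, matching the gap between GetCallerIdentity and CreateUser.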
SSRF vulnerabilities in web apps can reach the EC2 metadata service and steal the instance's IAM role credentials — granting all permissions the role has without any IAM key being exposed.
# Attacker exploits SSRF in web app to reach metadata service:
GET http://169.254.169.254/latest/meta-data/iam/security-credentials/
WebAppRole   (IAM role name)
GET http://169.254.169.254/latest/meta-data/iam/security-credentials/WebAppRole
{
  "AccessKeyId": "ASIA...",
  "SecretAccessKey": "...",
  "Token": "...",
  "Expiration": "2026-05-14T16:00:00Z"
}
# Temporary credentials stolen -- attacker has WebAppRole permissions
# IMDSv2 (requiring a PUT token request first) blocks this GET-only attack;
# an SSRF that lets the attacker control the HTTP method and headers could still obtain a token
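During scoping it helps to know which instances still accept IMDSv1 requests. The sketch below walks a response shaped like the EC2 `DescribeInstances` output and reads the `MetadataOptions` fields (`HttpTokens`, `HttpEndpoint`); the function name is hypothetical, and the input is assumed to have already been fetched, for example with boto3.

```python
def instances_allowing_imdsv1(describe_instances_response):
    """Return IDs of instances whose metadata endpoint is on and does not require tokens."""
    flagged = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            # HttpTokens == "required" means IMDSv2 only; anything else permits IMDSv1.
            if endpoint_on and options.get("HttpTokens") != "required":
                flagged.append(instance["InstanceId"])
    return flagged
```

A missing `MetadataOptions` block is treated as "IMDSv1 allowed" here, which is a deliberately cautious assumption for triage.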
Attackers with limited IAM permissions use specific API calls to escalate to full admin access. The sequence in CloudTrail is distinctive and should trigger immediate alerting.
# CloudTrail shows privilege escalation chain:
14:35:01  CreatePolicyVersion (attacker creates new policy version with administrator permissions)
14:35:12  SetDefaultPolicyVersion (makes new admin version active)
14:35:33  AttachUserPolicy  UserName=attacker  PolicyArn=arn:aws:iam::aws:policy/AdministratorAccess
14:36:01  CreateAccessKey  UserName=attacker (new persistent key created)
# 60 seconds from limited to full admin access using iam:CreatePolicyVersion
# The Pacu framework automates this from 30+ different starting positions
Attackers with S3 access enumerate buckets and systematically exfiltrate sensitive data. CloudTrail logs every S3 API call when data event logging is explicitly enabled.
14:40:01  ListBuckets (enumerating all accessible buckets)
14:40:22  ListObjectsV2  Bucket: corp-financial-data-prod
14:40:44  GetBucketAcl (checking if bucket is public)
14:41:01 to 14:58:33  GetObject  Bucket: corp-financial-data-prod
          Objects downloaded: 4,821 files  Total: 2.3 GB in 17 minutes
# Systematic exfiltration of entire S3 bucket
# GuardDuty fires: "Unusual data access from anomalous location"
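A bulk download like this shows up as an unusually high GetObject count for one principal and one bucket. The sketch below counts those pairs across S3 data event records; the function name and the threshold of 1,000 calls are illustrative assumptions, and a real threshold would need tuning against normal access patterns.

```python
from collections import Counter


def bulk_download_candidates(records, threshold=1000):
    """Return {(principal_arn, bucket): count} for GetObject counts at or above threshold."""
    counts = Counter()
    for r in records:
        if r.get("eventSource") != "s3.amazonaws.com":
            continue
        if r.get("eventName") != "GetObject":
            continue
        arn = r.get("userIdentity", {}).get("arn", "unknown")
        # requestParameters can be null in CloudTrail records, hence the `or {}`.
        bucket = (r.get("requestParameters") or {}).get("bucketName", "unknown")
        counts[(arn, bucket)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

This only works when S3 data events were enabled for the bucket before the incident; without them there are no GetObject records to count.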
Cloud IR containment uses AWS-native tools to isolate compromised identities and resources without disrupting unaffected services. Order of operations matters — preserve evidence before remediation.
Immediate (IAM key compromise): preserve, then disable
  aws iam update-access-key --user-name <user> --access-key-id AKIA... --status Inactive
  aws iam attach-user-policy --user-name attacker --policy-arn arn:aws:iam::aws:policy/AWSDenyAll
  Review and revoke all access keys created by compromised identity
  Delete the key (aws iam delete-access-key --user-name <user> --access-key-id AKIA...) only after evidence has been exported
Containment (EC2 instance): isolate before terminating
  Isolate: move instance to quarantine security group (no inbound/outbound)
  Preserve: create EBS snapshot before any remediation
  Memory: SSM Run Command to acquire memory dump if feasible
Eradication
  Delete attacker-created IAM users, access keys, and policy versions
  Review CloudTrail for ALL actions taken with compromised credentials
  Enable AWS Config rules to detect future policy violations
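The IAM containment step can also be scripted so it runs the same way under pressure. The sketch below assumes `iam` is a boto3 IAM client (or any object exposing `update_access_key` and `put_user_policy` with the same keyword arguments); the function name, policy name format, and incident ID are hypothetical. It deactivates the key rather than deleting it, then applies an inline deny-all policy.

```python
import json

# Inline policy that denies every action on every resource.
DENY_ALL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Deny", "Action": "*", "Resource": "*"}],
}


def contain_iam_user(iam, user_name, access_key_id, incident_id):
    """Deactivate (not delete) the key, then block the user with an inline deny-all policy."""
    iam.update_access_key(
        UserName=user_name, AccessKeyId=access_key_id, Status="Inactive"
    )
    iam.put_user_policy(
        UserName=user_name,
        PolicyName=f"ir-deny-all-{incident_id}",
        PolicyDocument=json.dumps(DENY_ALL_POLICY),
    )
```

With real credentials this would be called as `contain_iam_user(boto3.client("iam"), "backup-svc", key_id, "IR-0001")`, after the relevant CloudTrail events have been exported.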
What You Need to Know
CloudTrail Deep Dive — What It Logs and What It Misses
CloudTrail is the cornerstone of AWS forensics, but understanding its coverage gaps is as important as knowing how to query it. An analyst who trusts CloudTrail as a complete audit log of all activity in their AWS account will miss significant attacker activity that occurs outside its scope.
What CloudTrail Covers by Default
CloudTrail management events are enabled by default and capture all control-plane API calls: IAM changes, EC2 instance launch and termination, S3 bucket creation and policy changes, VPC configuration changes, and similar operations. These events are retained for 90 days in the CloudTrail Event History with no additional configuration.
What management events do NOT capture: object-level S3 operations (who downloaded which file), Lambda function invocations and their input data, RDS query execution, or the actual network traffic within VPCs. For a complete investigation, management events must be supplemented by the data-plane logs described below.
What Requires Explicit Enablement
- S3 data events: Captures every GetObject, PutObject, and DeleteObject call on specific buckets. Required for exfiltration investigation. Has cost implications — only enable on sensitive buckets if budget-constrained.
- Lambda data events: Captures every Lambda function invocation. Required for investigating backdoored Lambda functions.
- VPC Flow Logs: Network traffic metadata (source IP, destination IP, port, protocol, bytes transferred). Does not capture payload content. Essential for lateral movement investigation and anomaly detection.
- CloudWatch Logs: Application-level logs from EC2 instances, ECS containers, and other services. Requires CloudWatch agent installation or direct API calls from applications.
- S3 Server Access Logs: More granular than CloudTrail S3 data events. Captures requestor IP, request URI, response code, and bytes transferred for every request.
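The "what are we not logging?" question can be checked mechanically for S3 data events. The sketch below inspects a response shaped like CloudTrail's `GetEventSelectors` output and reports which read/write types are logged for a given bucket. It is a simplification under stated assumptions: only basic `EventSelectors` are examined, trails configured with `AdvancedEventSelectors` are ignored, and the function name is hypothetical.

```python
def s3_data_event_coverage(selectors_response, bucket):
    """Return the set of ReadWriteType values ('All', 'ReadOnly', 'WriteOnly')
    under which object-level events for `bucket` are logged. Empty set = not logged.
    """
    target = f"arn:aws:s3:::{bucket}/"
    coverage = set()
    for selector in selectors_response.get("EventSelectors", []):
        for resource in selector.get("DataResources", []):
            if resource.get("Type") != "AWS::S3::Object":
                continue
            # Values are ARN prefixes; a broad prefix such as "arn:aws:s3" covers all buckets.
            if any(target.startswith(value) for value in resource.get("Values", [])):
                coverage.add(selector.get("ReadWriteType", "All"))
    return coverage
```

An exfiltration investigation needs read events, so a result of `{"WriteOnly"}` is a logging gap even though data events are technically enabled.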
Structured CloudTrail query patterns for the most common investigation scenarios — usable directly in Athena, CloudWatch Logs Insights, or exported to a SIEM.
# Find all actions by a specific access key (scope compromise)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=AKIAIOSFODNN7EXAMPLE \
  --start-time 2026-05-01 --end-time 2026-05-15

-- Athena query: all IAM modifications in past 24 hours
-- (assumes the standard CloudTrail table, where eventTime is an ISO 8601 string)
SELECT eventTime, userIdentity.arn, eventName, requestParameters
FROM cloudtrail_logs
WHERE eventSource = 'iam.amazonaws.com'
  AND from_iso8601_timestamp(eventTime) > current_timestamp - interval '24' hour
ORDER BY eventTime;

# GuardDuty findings filtered by severity
aws guardduty list-findings \
  --detector-id [id] \
  --finding-criteria '{"Criterion":{"severity":{"Gte":7}}}'
# severity >= 7 = High/Critical findings requiring immediate investigation

-- Find newly created IAM users in last 30 days
-- (assumes requestParameters is stored as a JSON string, as in the standard table)
SELECT eventTime, userIdentity.arn,
       json_extract_scalar(requestParameters, '$.userName') AS userName
FROM cloudtrail_logs
WHERE eventName = 'CreateUser'
  AND from_iso8601_timestamp(eventTime) > current_timestamp - interval '30' day;
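The output of `lookup-events` is easier to scope once it is summarised. In the API response each entry under `Events` carries the full record as a JSON string in its `CloudTrailEvent` field; the sketch below parses those strings and tallies event names and source IPs. The helper name `summarise_lookup_events` is hypothetical, and `pages` is assumed to be an iterable of response pages.

```python
import json
from collections import Counter


def summarise_lookup_events(pages):
    """Return (Counter of event names, set of source IPs) across lookup-events pages."""
    names = Counter()
    source_ips = set()
    for page in pages:
        for event in page.get("Events", []):
            detail = json.loads(event["CloudTrailEvent"])
            names[detail["eventName"]] += 1
            if "sourceIPAddress" in detail:
                source_ips.add(detail["sourceIPAddress"])
    return names, source_ips
```

With boto3, the pages would typically come from `boto3.client("cloudtrail").get_paginator("lookup_events").paginate(LookupAttributes=[{"AttributeKey": "AccessKeyId", "AttributeValue": key_id}])`; that wiring is left out so the function can be exercised offline.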
Cloud IR Phase Framework
Cloud incident response follows the same NIST SP 800-61 phases as traditional IR — Preparation, Detection & Analysis, Containment, Eradication, Recovery, and Post-Incident Activity — but each phase has cloud-specific procedures and sequencing requirements that differ significantly from on-premises response.
Phase 2 — Evidence Preservation Is Non-Negotiable
The most common cloud IR failure is performing containment actions that destroy evidence before it is preserved. Deleting an IAM key immediately sounds correct, but deletion is irreversible and discards the key's IAM metadata (owning user, creation date, last-used service and region) that helps tie CloudTrail events back to the identity; deactivating the key stops the attacker just as effectively and keeps that record. Terminating a compromised EC2 instance loses volatile memory state. Cloud IR requires a strict preserve-then-contain sequence that many practitioners trained on traditional IR get wrong initially.
The evidence preservation checklist before any containment action:
- Export all CloudTrail events associated with the compromised credential's access key ID to persistent storage outside the affected account.
- Create an EBS snapshot of any compromised EC2 instances before isolation or termination. Tag the snapshot with the incident ID and timestamp.
- Export VPC Flow Logs for the relevant time window and relevant VPC.
- Document the current state of all IAM policies, attached policies, and inline policies for the compromised user or role — before revoking anything.
- Record the exact IAM permissions the attacker had at the time of discovery. This is needed for scoping what resources could have been accessed.
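The EC2 part of this checklist can be sketched as one function. It assumes `ec2` is a boto3 EC2 client (or an object with matching `modify_instance_attribute` and `create_snapshot` methods); the function name and the tag keys `IncidentId` and `CapturedAt` are illustrative choices, not a standard.

```python
def preserve_instance(ec2, instance_id, volume_ids, incident_id, captured_at):
    """Enable termination protection, then snapshot each volume with incident tags.

    Returns the list of snapshot IDs. Note: termination protection does not stop
    an Auto Scaling group from terminating the instance.
    """
    ec2.modify_instance_attribute(
        InstanceId=instance_id, DisableApiTermination={"Value": True}
    )
    tags = [
        {"Key": "IncidentId", "Value": incident_id},
        {"Key": "CapturedAt", "Value": captured_at},
    ]
    snapshot_ids = []
    for volume_id in volume_ids:
        response = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"IR evidence for {incident_id} from {instance_id}",
            TagSpecifications=[{"ResourceType": "snapshot", "Tags": tags}],
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

Snapshots created this way stay in the affected account; copying or sharing them to a separate forensics account is a further step not shown here.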
T+0:00 — Detection: AWS GuardDuty fires "UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.InsideAWS" finding. Simultaneously, GitHub's secret scanning sends an email notification that a credential was detected in a public commit pushed 8 minutes ago. The responder's first action is to pull the full CloudTrail record for the affected access key ID — not to delete the key.
T+0:05 — Scoping: CloudTrail shows GetCallerIdentity (reconnaissance), ListBuckets, ListSecrets, GetSecretValue (two secrets accessed), CreateUser, and CreateAccessKey — all within 77 seconds of the key's first use. The attacker has read two production secrets and created a new backdoor IAM user.
T+0:08 — Preserve then contain: Export CloudTrail events for both the original key and the new backdoor user. Document all permissions on both. Then: delete the original leaked key, apply a deny-all inline policy to the backdoor user, and begin rotating the two accessed secrets in parallel.
T+0:15 — Scope confirmation: Review each API call for downstream impact. GetSecretValue on prod/database/password — check RDS CloudWatch metrics and access logs for anomalous query patterns in the 15-minute window. No anomalous database activity found — likely key was collected but not yet used for database access.
T+0:30 — Eradication: Backdoor IAM user deleted. Original key deleted. Both secrets rotated in Secrets Manager with all consumers updated. Root cause: developer committed .env file containing credentials. Post-incident action: implement pre-commit hooks with secret scanning and enforce git history scrubbing policy.
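The pre-commit secret scanning mentioned in the post-incident action can be as simple as a pattern match on staged content. The sketch below looks only for AWS access key IDs, using the commonly seen shape of a four-letter prefix (`AKIA` for long-lived keys, `ASIA` for temporary ones) followed by 16 uppercase alphanumeric characters. It is an illustration; dedicated scanners cover many more secret types.

```python
import re

# AWS access key IDs: "AKIA" (long-lived) or "ASIA" (temporary) plus 16 characters.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_access_key_ids(text):
    """Return every substring of `text` that looks like an AWS access key ID."""
    return ACCESS_KEY_ID.findall(text)
```

A pre-commit hook would run this over each staged file and reject the commit if the returned list is non-empty.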
Preventive Controls — Building a Resilient Cloud Posture
IR capability is the response to failure — but the goal is to build controls that prevent, detect, and limit the blast radius of cloud compromises before they escalate. The following controls represent the highest-ROI investments for cloud security posture.
No long-lived IAM access keys for human users. Use AWS SSO/IAM Identity Center with short-lived credentials. Reserve IAM users for service accounts only.
Enforce least privilege: Regularly audit IAM policies with AWS IAM Access Analyzer. Remove unused permissions. Use permission boundaries to cap maximum effective permissions.
MFA on all human accounts: Require MFA for AWS Console access. Enforce with SCPs at the organisation level so individual account administrators cannot bypass it.
IMDSv2 everywhere: Require IMDSv2 (token-based) for all EC2 instances via instance metadata options. Blocks the common GET-only SSRF path to instance role credentials.
GuardDuty in all regions and all accounts. Route findings to a centralised security account. Enable S3 malware protection and EKS audit log monitoring.
CloudTrail with Object Lock: Write to S3 with Object Lock (Compliance mode) preventing any deletion or modification. Multi-region trail covering all regions including ones you don't actively use.
AWS Config rules: Automated compliance checks — alert on public S3 buckets, security groups with 0.0.0.0/0 ingress, IAM users without MFA, and root account usage.
Billing alerts: Unusual spending (particularly EC2 GPU instances) is often the first indicator of cryptomining. Set budget alerts for sudden spending increases.