LUCENT GRID
SQL Lab ยท ETL & Data Import โ€” Medium
Loading staging environment...
๐Ÿ–ฅ๏ธ
SQL Terminal
๐Ÿ“ฆ
Data Sources
๐Ÿ”„
Pipeline Status
๐Ÿ“Œ PIPELINE STAGES
โ—‹Create staging tables
โ—‹Load CSV data (LOAD DATA)
โ—‹Merge with INSERT SELECT
โ—‹Handle NULLs with COALESCE
โ—‹Deduplicate with ON DUPLICATE KEY
โ—‹Validate row counts match
0 / 6 complete
SQL TERMINAL โ€” etl_db (staging)
MySQL 8.0 โ€” etl_db (staging environment) Three source systems: CRM, ERP, eCommerce platform Task: Extract, transform and merge 3 sources into unified customers table โš  Sources use different schemas and NULL conventions โ€” normalise before merge โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
mysql>
DATA SOURCE MANIFEST
CRM System (Salesforce export)
File: crm_export.csv (2,841 rows)
Columns: customer_id, name, email, phone, country
Issues: phone numbers as text, some NULL emails
PENDING
ERP System (SAP extract)
File: erp_customers.csv (1,940 rows)
Columns: erp_id, full_name, email_addr, region
Issues: no phone field, uses 'N/A' for missing emails
PENDING
eCommerce Platform
Table: ecom_db.users (3,120 rows)
Columns: uid, display_name, email, signup_date
Issues: duplicates with CRM data (same email)
PENDING
ETL PIPELINE PROGRESS
01Create staging tables (staging_crm, staging_erp, staging_ecom)โ—‹
02LOAD DATA INFILE into staging tablesโ—‹
03INSERT SELECT with COALESCE and NULL handlingโ—‹
04ON DUPLICATE KEY UPDATE to handle cross-source dupesโ—‹
05Row count validation โ€” source total vs unified tableโ—‹
โ˜ฐ BRIEF
๐Ÿ–ฅ Terminal
๐Ÿ“ฆ Sources
๐Ÿ”„ Pipeline
MISSION BRIEF

MISSION BRIEFING

ETL & DATA IMPORT โ€” FUSION RETAIL GROUP

SCENARIO

Fusion Retail Group has just acquired two companies. They now have customer data scattered across three systems: their own CRM, the acquired ERP, and an eCommerce platform. All three use different schemas. Your job is to build an ETL pipeline that stages all three sources, transforms and normalises the data, and merges them into a single unified customers table โ€” handling NULLs, deduplicating by email, and validating final row counts.

PIPELINE STEPS

  • CREATE TABLE staging_crm (id INT, name VARCHAR(100), email VARCHAR(100), phone VARCHAR(20), country CHAR(2));
  • LOAD DATA INFILE '/tmp/crm_export.csv' INTO TABLE staging_crm FIELDS TERMINATED BY ',' IGNORE 1 ROWS;
  • INSERT INTO customers (name, email, phone, country) SELECT name, NULLIF(email,'N/A'), COALESCE(phone,'Unknown'), country FROM staging_crm;
  • INSERT INTO customers ... ON DUPLICATE KEY UPDATE name = VALUES(name);
  • SELECT COUNT(*) FROM customers; โ€” validate total

KEY FUNCTIONS

  • COALESCE(a, b, c) โ€” returns first non-NULL value
  • NULLIF(val, 'N/A') โ€” convert sentinel strings to NULL
  • ON DUPLICATE KEY UPDATE โ€” upsert on unique constraint
  • LOAD DATA INFILE โ€” fast bulk import from CSV
Display Mode
LAB COMPLETE
๐Ÿ†
PIPELINE COMPLETE
ETL & Data Import โ€” Fusion Retail Group
โšก
+100 XP
EXPERIENCE EARNED
โฌ† LEVEL UP โ€” Level
โ†’ Return to Labs