Fusion Retail Group has just acquired two companies. They now have customer data scattered across three systems: their own CRM, the acquired ERP, and an eCommerce platform. All three use different schemas. Your job is to build an ETL pipeline that stages all three sources, transforms and normalises the data, and merges them into a single unified customers table โ handling NULLs, deduplicating by email, and validating final row counts.
CREATE TABLE staging_crm (id INT, name VARCHAR(100), email VARCHAR(100), phone VARCHAR(20), country CHAR(2));LOAD DATA INFILE '/tmp/crm_export.csv' INTO TABLE staging_crm FIELDS TERMINATED BY ',' IGNORE 1 ROWS;INSERT INTO customers (name, email, phone, country) SELECT name, NULLIF(email,'N/A'), COALESCE(phone,'Unknown'), country FROM staging_crm;INSERT INTO customers ... ON DUPLICATE KEY UPDATE name = VALUES(name);SELECT COUNT(*) FROM customers; โ validate totalCOALESCE(a, b, c) โ returns first non-NULL valueNULLIF(val, 'N/A') โ convert sentinel strings to NULLON DUPLICATE KEY UPDATE โ upsert on unique constraintLOAD DATA INFILE โ fast bulk import from CSV