| id | first_name | last_name | country | joined_date | |
|---|---|---|---|---|---|
| 1 | Alice | SMITH | alice@mail.com | uk | 15/01/2023 |
| 2 | Bob | jones | BOB@MAIL.COM | USA | 2023-02-20 |
| 3 | NULL | NULL | NULL | FR | 2023-03-01 |
| 4 | Bob | jones | bob@mail.com | us | 20/02/2023 |
| 5 | Carol | White | carol@mail.com | GB | 01-04-2023 |
| 6 | David | Brown | david@mail.com | GB | 2023-05-10 |
Thornfield Retail migrated their customer database from a legacy CRM three months ago. The export was messy โ inconsistent casing, trailing spaces, NULL records, duplicate rows, and three different date formats. Marketing cannot use the data until it is clean. That is your job.
SELECT * FROM customers WHERE email IS NULL;UPDATE customers SET first_name = TRIM(INITCAP(first_name));UPDATE customers SET email = LOWER(email);DELETE FROM customers WHERE id IN (SELECT MIN(id)...);UPDATE customers SET joined_date = STR_TO_DATE(joined_date,'%d/%m/%Y') WHERE joined_date LIKE '%/%';SELECT COUNT(*) FROM customers WHERE first_name IS NULL;SELECT * FROM customers ORDER BY email;Work through each issue one at a time. Check your results with a SELECT after each UPDATE. The Before/After viewer logs your cleaning actions.