Dirty CSV Triage And Cleaning Playbook

Data & Spreadsheets
#cleaning #csv #workflow

Audits a messy dataset, then produces a prioritized, reversible cleaning plan with exact transforms.

You are a senior data quality engineer who specializes in preparing raw exports for analysis.

CONTEXT: I have a dataset of [ROW_COUNT] rows with these columns and types: [COLUMN_LIST_WITH_TYPES]. The intended downstream use is [DOWNSTREAM_USE], and the tool I will clean it in is [TOOL: Excel/Google Sheets/Python/SQL].

TASK, step by step:
1) Profile likely defects per column (nulls, duplicates, mixed types, inconsistent casing, trailing spaces, bad encodings, outliers, date-format drift).
2) Rank each defect by severity: high (blocks analysis), medium (distorts results), or low (cosmetic).
3) For each high- and medium-severity defect, give the exact cleaning operation as a [TOOL] formula or snippet, plus a one-line rationale.
4) Flag any transform that loses information and propose a non-destructive alternative.

CONSTRAINTS: never silently drop rows; preserve an audit column; assume no internet lookups; do not invent values to fill blanks unless I name a rule.

OUTPUT FORMAT: a Markdown table with columns Defect | Column | Severity | Fix (code) | Risk | Reversible?, followed by an ordered checklist of the steps to run.
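To make the constraints concrete, here is a minimal sketch of the kind of non-destructive fix the prompt asks for when [TOOL] is Python. It is an illustration under assumptions, not output the prompt is guaranteed to produce: the function name `clean_rows`, the `_raw` suffix, the `audit` column name, and the sample data are all hypothetical. It trims whitespace while keeping the original value, flags blanks and case-insensitive duplicates, and never drops a row or fills in a value.

```python
import csv
import io


def clean_rows(rows, text_columns, key_columns):
    """Return cleaned copies of rows; no row is dropped, no raw value is lost."""
    seen = set()
    cleaned = []
    for row in rows:
        out = dict(row)
        notes = []
        for col in text_columns:
            raw = row.get(col)
            if raw is None:
                continue
            out[col + "_raw"] = raw  # keep the original so the trim is reversible
            value = raw.strip()
            if value != raw:
                notes.append(f"trimmed:{col}")
            if value == "":
                notes.append(f"blank:{col}")  # flagged only, never filled in
            out[col] = value
        # Compare keys case-insensitively; the stored values keep their casing.
        key = tuple((out.get(col) or "").casefold() for col in key_columns)
        if key in seen:
            notes.append("duplicate")  # flagged, not removed
        seen.add(key)
        out["audit"] = ";".join(notes)
        cleaned.append(out)
    return cleaned


# Hypothetical sample data, invented for this illustration.
SAMPLE = (
    "email,city\n"
    "ana@example.com, Lisbon \n"
    "ANA@example.com,Lisbon\n"
    "bo@example.com,\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
result = clean_rows(rows, text_columns=["email", "city"], key_columns=["email", "city"])
for r in result:
    print(r["email"], repr(r["city"]), r["audit"])
```

Because duplicates and blanks are only annotated in `audit`, any later decision to drop or fill them stays an explicit, reviewable step, which fits the Risk and Reversible? columns in the requested output table.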