The Complete Guide to Cleaning Your CSV Before CRM Import (HubSpot, Zoho, Mailchimp)

CRMApr 2026· 9 min read

A CRM import is only as good as the file you feed it. Dirty data creates duplicate contacts, broken automations, failed field mappings, and hours of cleanup work you shouldn't need to do. Here's a systematic process for cleaning CSV data before it goes anywhere near your CRM.

Why CRM imports fail (and why "clean" is relative)

Every CRM has its own quirks. HubSpot is strict about email format. Salesforce requires specific date formats and has hard limits on field length. Mailchimp will silently drop records with invalid email addresses rather than telling you. Zoho CRM will reject the entire import if a required field is blank in any row.

There are two failure modes you need to care about. The first is the obvious one: the import fails entirely and you get an error message. That's frustrating but fixable — the error usually tells you what went wrong. The second is much worse: the import succeeds, but the data inside the CRM is wrong. Contacts with blank required fields, phone numbers in the email column, dates that imported as text strings, names that got truncated. That's the one that causes real damage — because you might not notice it for weeks.

Cleaning your CSV before import is how you avoid both.

Step 1: Validate the structure first

Before touching the data, check that the file itself is structurally sound. Open it in CSV Splitter Online's Validate tab and look for:

Blank column headers — a CRM has no idea where to map a column with no name
Duplicate column headers — if you have two "Email" columns, the CRM will use one and silently discard the other
Inconsistent column counts — rows with more or fewer fields than the header, caused by unquoted commas in the data
Completely empty rows — they import as blank records

Fix any structural errors before moving on. A file with inconsistent column counts can't be cleaned reliably — you need to fix the structure first.

Step 2: Remove columns you don't need

This sounds obvious but gets skipped constantly. Every column you import into a CRM takes up space in your field mapping step, adds to the chance of a mismap, and stays in the system as clutter you'll need to deal with later.

Common candidates for removal before import:

Internal database IDs from the source system (these mean nothing in the destination CRM)
Audit fields like "Created by", "Last modified date" from the source — these will be overwritten in the CRM anyway
Calculated fields that the CRM will compute itself (e.g. a "Days since last contact" field that the CRM tracks natively)
Columns with 80%+ empty values (check in CSV Splitter Online's stats panel or the Validate tab)
Duplicate information — if you have both "Full Name" and "First Name" + "Last Name" columns, keep the split version

Use CSV Splitter Online's Column Filter: uncheck the columns you want to drop, then run a split or export — only your selected columns come through.

Step 3: Standardize your data formats

This is the most time-consuming but highest-impact step. CRMs expect data in specific formats. Getting this wrong is the main cause of "import succeeded but data is wrong" failures.

Phone numbers

CRMs usually want a consistent format. The problem is that phone numbers come in every possible variation: (555) 867-5309, 5558675309, +1-555-867-5309, 555.867.5309. Most CRMs accept international E.164 format (+15558675309) but also handle plain 10-digit numbers.

Use Find & Replace in CSV Splitter Online to strip common formatting characters from your phone column: remove (, remove ), replace - with nothing, replace . with nothing, replace spaces with nothing. You'll end up with raw digit strings you can format consistently.

Dates

This is the biggest silent killer. Your Excel export shows "January 15, 2024" but Salesforce expects "2024-01-15" or "01/15/2024" depending on your locale settings. Check what format your CRM expects, then use Find & Replace or a quick Python/Excel formula to reformat.

Salesforce uses MM/DD/YYYY for dates and MM/DD/YYYY HH:MM for datetimes by default. HubSpot accepts both YYYY-MM-DD and MM/DD/YYYY. Mailchimp uses MM/DD/YYYY. When in doubt, ISO format (YYYY-MM-DD) is the most universally accepted.

Boolean / Yes-No fields

If you're importing into a checkbox or boolean field, the CRM expects a specific value. HubSpot wants true or false (lowercase). Salesforce accepts True, False, 1, 0. Your export might have Yes, No, Y, N, or ✓.

Use Find & Replace → Column → "Status checked" → Replace "Yes" with "true", "No" with "false".

Text case for names

If your source has names in all-caps (JOHN SMITH) or all-lowercase (john smith), that's going to look bad in your CRM. This is harder to fix without a formula or script, but it's worth doing for contacts that will receive emails.

In Excel: =PROPER(A2) converts to title case. In Python: df['Name'] = df['Name'].str.title()

Step 4: Handle duplicates

Importing duplicates into a CRM creates a cleanup problem that's much harder to fix after the fact. Two contacts with the same email address, or the same person imported twice with slightly different names.

The best approach depends on your CRM's deduplication settings:

HubSpot automatically deduplicates on email address during import. Two records with the same email will be merged. This is usually what you want — but verify it's on before importing.
Salesforce has optional duplicate rules you configure. Without them, it imports everything including duplicates.
Mailchimp deduplicates by email address in each list but will let you have the same contact in multiple lists.

For a clean import: use CSV Splitter Online's Deduplicate tab. Choose your key column (email is usually the right choice for contact data), and remove duplicates before import rather than relying on the CRM to handle them.

CRM deduplication isn't the same as CSV deduplication. If a contact already exists in your CRM and you import them again, that's a CRM-level duplicate — the CRM's dedup rules handle it. If you have two rows with the same email in your CSV, that's a file-level duplicate — handle it in CSV Splitter Online before you import.

Step 5: Check required fields

Every CRM has required fields for import. If any row is missing a value in a required field, the import will either fail entirely or skip that row (silently in some CRMs).

Before importing, check which fields your CRM requires and use the Row Filter in CSV Splitter Online to find rows where those columns are empty:

Filter → Column: "Email" → Condition: "is empty" → Action: "Keep matching rows"
This gives you a list of rows with no email address — either fix them or remove them from the import file

Common required fields by CRM:

Salesforce: Last Name (for contacts), Company (for leads is optional but strongly recommended)
HubSpot: Email (for contacts)
Mailchimp: Email Address
Zoho CRM: Last Name (for contacts), Company (for leads)

Step 6: Validate email addresses

Invalid email addresses are the #1 cause of import rejections in email marketing tools like Mailchimp and Klaviyo. An email is invalid if it doesn't contain @ with at least one character on each side, has spaces, or uses a clearly fake domain.

CSV Splitter Online doesn't validate email format, but you can do a quick pass with the Row Filter:

Filter → Column: "Email" → Condition: "does not contain" → Value: "@" → Keep matching rows
This shows you every email that's missing the @ sign — likely malformed or empty

For thorough email validation, Python's email-validator library or a dedicated service like ZeroBounce is more appropriate for large lists.

Step 7: Handle special characters and encoding

Names with accents (José, François, Müller), Arabic or Chinese characters, or special punctuation can cause encoding issues that corrupt your data in the CRM. The safe format for most CRMs is UTF-8.

If your CSV was created on Windows, it might be in Windows-1252 encoding rather than UTF-8, which handles accented characters differently. The giveaway is garbled characters like Ã© instead of é.

To re-save in UTF-8: open the file in Notepad++ (free), choose Encoding → Convert to UTF-8, save. Or in Python: df.to_csv('output.csv', encoding='utf-8', index=False).

A quick pre-import checklist

Before every CRM import, verify:

✓ No blank or duplicate column headers
✓ Consistent column count across all rows
✓ No completely empty rows
✓ Unnecessary columns removed
✓ Dates in the format your CRM expects
✓ Phone numbers consistently formatted
✓ Boolean fields use the values your CRM expects
✓ No duplicate records on your key field (usually email)
✓ All required fields populated
✓ File encoding is UTF-8

Clean your CSV before it causes problems

Validate, filter, deduplicate, and find-and-replace — all the tools you need in one place. Free, private, runs in your browser.

Try CSV Splitter Online free →

Frequently asked questions

My HubSpot import keeps failing with "invalid email" errors on rows that look fine. What's happening?

A few common causes: (1) there are invisible characters in the email field — leading/trailing spaces, non-breaking spaces, or zero-width characters pasted from another source. Use Find & Replace to trim whitespace from the email column. (2) The column has a mix of valid emails and values that were pasted into the wrong column during data prep. (3) The encoding is wrong and some @ symbols got corrupted. Preview your email column carefully before importing.

Should I import all my fields or just the ones I'll actually use?

Import only what you'll use. Every extra field in your CRM is a field that needs to be mapped, might conflict with an existing field, and could cause confusion for other users. It's much easier to add fields later than to remove or ignore fields that are already in the system. Strip unused columns in CSV Splitter Online's Column Filter before import.

How do I handle contacts that already exist in my CRM?

That's a CRM-level decision, not a CSV cleaning decision. Most CRMs let you choose during import: update existing records, skip existing records, or create duplicates. Check your CRM's import documentation for the exact setting. What you should handle before import is file-level duplicates — two rows in your CSV with the same email — using CSV Splitter Online's Deduplicate tab.

What's the safest date format to use across all CRMs?

ISO 8601 — YYYY-MM-DD (e.g. 2024-01-15) — is the most universally accepted format. All major CRMs accept it. The month/day ambiguity in formats like 01/05/2024 (is that January 5th or May 1st?) causes subtle import errors that are hard to catch. Use ISO format and you'll never have that problem.

Ready to try it?

Use our free, browser-based CSV tools to apply what you've just learned — no upload, no signup.

Open Split CSV →

Why CRM imports fail (and why "clean" is relative)

Step 1: Validate the structure first

Step 2: Remove columns you don't need

Step 3: Standardize your data formats

Phone numbers

Dates

Boolean / Yes-No fields

Text case for names

Step 4: Handle duplicates

Step 5: Check required fields

Step 6: Validate email addresses

Step 7: Handle special characters and encoding

A quick pre-import checklist

Clean your CSV before it causes problems

Frequently asked questions

My HubSpot import keeps failing with "invalid email" errors on rows that look fine. What's happening?

Should I import all my fields or just the ones I'll actually use?

How do I handle contacts that already exist in my CRM?

What's the safest date format to use across all CRMs?

Ready to try it?

Related articles