CSV Formatting Guide for Product Matching
To ensure your product data is processed correctly by the Taxonomy Matcher, please adhere to the following CSV formatting guidelines when preparing your files for upload or when pasting CSV data directly.
1. Delimiter: Semicolon (;)
- The Taxonomy Matcher expects CSV files to use a semicolon (;) as the delimiter (field separator).
- This differs from the more common comma (,) delimiter. Using commas will result in parsing errors.

Correct Example (Semicolon Delimited):

    id;title;description
    SKU001;Product A;This is product A.
    SKU002;Product B;This is product B.

Incorrect Example (Comma Delimited):

    id,title,description
    SKU001,Product A,This is product A.
    SKU002,Product B,This is product B.
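To see why the delimiter matters, here is a minimal sketch that reads semicolon-delimited text with Python's standard csv module. It is only an illustration and not the Taxonomy Matcher's own parser; read_products is a hypothetical helper name.

```python
import csv
import io

# Sample data in the semicolon-delimited format described above.
SAMPLE = (
    "id;title;description\n"
    "SKU001;Product A;This is product A.\n"
    "SKU002;Product B;This is product B.\n"
)

def read_products(text):
    """Parse semicolon-delimited CSV text into a list of dicts keyed by header name.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=";")
    return list(reader)

rows = read_products(SAMPLE)
print(rows[0]["title"])  # Product A

# A comma-delimited file read with ';' collapses into a single column,
# which is the kind of parsing problem the guideline warns about.
comma_rows = read_products("id,title,description\nSKU001,Product A,This is product A.\n")
print(list(comma_rows[0].keys()))  # ['id,title,description']
```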
2. Header Row (Required)
- The very first row of your CSV file must be a header row.
- This header row defines the names of your data columns.
- The matching engine specifically looks for columns named id, title, and description.

Required Headers:
- id: A unique identifier for your product (e.g., SKU, product code).
- title: The name or title of your product.
- description: A detailed description of your product.

Example Header Row:

    id;title;description;brand;color;material

(While brand, color, and material are optional and currently not directly used by the core matching AI, including them is good practice for your data management. The AI primarily focuses on title and description for categorization.)
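The header requirement can be checked before upload. The following is a rough sketch using Python's standard csv module; missing_headers is a hypothetical helper, and the exact, case-sensitive comparison is an assumption made here for safety rather than documented behavior of the matching engine.

```python
import csv
import io

REQUIRED_HEADERS = ("id", "title", "description")

def missing_headers(text):
    """Return the required column names absent from the first row of the CSV text.

    Hypothetical helper; the comparison is exact and case-sensitive (an assumption).
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=";")
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in REQUIRED_HEADERS if name not in header]

print(missing_headers("id;title;description;brand;color;material\n"))  # []
print(missing_headers("id;Title;brand\n"))  # ['title', 'description']
```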
3. Required Data Fields
For each product (each row after the header), the following fields are essential:
- id:
  - Must be unique for each product in your list.
  - Should be a string.
- title:
  - The product's title.
  - Should be a string.
  - Automatically truncated to 2000 characters if longer.
- description:
  - A comprehensive description of the product. More detail often leads to better matching.
  - Should be a string.
  - Automatically truncated to 2000 characters if longer.
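The field rules above can be checked locally so that duplicate ids and silently truncated text do not come as a surprise. This is a sketch under the assumption that rows have already been parsed into dicts; check_rows and MAX_LEN are hypothetical names, and the 2000-character limit is taken from the rules above.

```python
from collections import Counter

MAX_LEN = 2000  # truncation limit stated above for title and description

def check_rows(rows):
    """Return (duplicate_ids, overlong) for a list of dicts with id/title/description.

    Hypothetical helper: duplicate_ids lists ids that appear more than once;
    overlong lists (id, field) pairs whose text exceeds MAX_LEN.
    """
    counts = Counter(row["id"] for row in rows)
    duplicate_ids = sorted(pid for pid, n in counts.items() if n > 1)
    overlong = [
        (row["id"], field)
        for row in rows
        for field in ("title", "description")
        if len(row[field]) > MAX_LEN
    ]
    return duplicate_ids, overlong

rows = [
    {"id": "SKU001", "title": "Product A", "description": "x" * 2500},
    {"id": "SKU001", "title": "Product B", "description": "This is product B."},
]
print(check_rows(rows))  # (['SKU001'], [('SKU001', 'description')])
```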
4. File Encoding
- Use UTF-8 encoding for your CSV files. This ensures that special characters, accented letters, and different language scripts are handled correctly.
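If you are unsure how a file was saved, one quick check is whether its raw bytes decode as UTF-8. A minimal sketch in Python; is_valid_utf8 is a hypothetical helper and the sample text is made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw file bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

sample = "P010;Café crème;Thé vert"
print(is_valid_utf8(sample.encode("utf-8")))    # True
print(is_valid_utf8(sample.encode("latin-1")))  # False
```

In practice the bytes would come from reading the file in binary mode, for example open(path, "rb").read(). When writing a file from Python, passing encoding="utf-8" to open() makes the encoding explicit.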
5. Quotation Marks
- If your data fields (especially title or description) contain semicolons (;), newlines, or double quotes ("), they should be enclosed in double quotes.
- If a field enclosed in double quotes contains a double quote character itself, it should be escaped by doubling it (e.g., "").

Example with Quoting:

    id;title;description
    SKU003;"Super Widget; ""Deluxe"" Model";"This widget is truly super, includes advanced features; and a ""deluxe"" carrying case."

- In this example, the title Super Widget; "Deluxe" Model contains both a semicolon and double quotes, so the entire field is enclosed in double quotes, and the internal double quotes are doubled.
- The description also contains a semicolon and double quotes, handled similarly.

Most spreadsheet programs (like Excel, Google Sheets, LibreOffice Calc) handle this quoting automatically when you save or export as CSV, provided you select semicolon as the delimiter.
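When the file is generated by a script rather than a spreadsheet, the quoting rules can be left to a CSV library. The sketch below uses Python's standard csv module, whose default settings quote a field only when needed and double any embedded quotes; to_csv is a hypothetical helper name.

```python
import csv
import io

def to_csv(rows):
    """Serialize rows as semicolon-delimited CSV text, quoting only where required.

    Hypothetical helper for illustration.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter=";",
        quoting=csv.QUOTE_MINIMAL,  # quote only fields containing ; " or a newline
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()

data = [
    ["id", "title", "description"],
    ["SKU003", 'Super Widget; "Deluxe" Model', "Plain description"],
]
text = to_csv(data)
print(text)
# id;title;description
# SKU003;"Super Widget; ""Deluxe"" Model";Plain description

# Reading the text back with the same delimiter recovers the original values.
parsed = list(csv.reader(io.StringIO(text, newline=""), delimiter=";"))
```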
6. Example of a Well-Formatted CSV
    id;title;description
    P001;Organic Green Tea;High-quality organic Sencha green tea, 50 bags. Rich in antioxidants.
    P002;Bluetooth Speaker X200;"Portable Bluetooth speaker with 10-hour battery life; waterproof (IPX7) and ""SuperBass"" technology."
    P003;Men's Running Shoes - UltraBoost;Lightweight running shoes for men, designed for comfort and performance. Available in sizes 7-13.
Tips for Preparing Your CSV
- Check Delimiter on Export: When exporting from a spreadsheet program, ensure you select "Semicolon" as the field delimiter in the export options.
- Validate Headers: Double-check that your header row contains the exact names id, title, and description. Treat the names as case-sensitive; the application might try to be flexible, but it's best to match exactly.
- No Empty Rows: Avoid empty rows within your data or at the end of the file.
- Test with a Small Sample: Before uploading a large file, test with a small CSV (2-3 products) to ensure the format is correct and it's processed as expected.
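The last two tips can be combined in a small script that drops empty rows and keeps only the first few products for a trial upload. This is a sketch using Python's standard csv module; make_sample and its n_products parameter are hypothetical names.

```python
import csv
import io

def make_sample(text, n_products=3):
    """Return CSV text with the header plus the first n_products non-empty rows.

    Hypothetical helper; assumes semicolon-delimited input with a header row.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=";")
    # csv.reader yields [] for a blank line; rows of only empty cells are skipped too.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerows(rows[: n_products + 1])  # +1 keeps the header row
    return out.getvalue()

full = "id;title;description\nP001;A;a\n\nP002;B;b\nP003;C;c\nP004;D;d\n"
sample_text = make_sample(full, n_products=2)
print(sample_text)
# id;title;description
# P001;A;a
# P002;B;b
```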
By following these formatting guidelines, you'll ensure smooth processing of your product data and achieve the best possible matching results with the Taxonomy Matcher.