
Guide: Custom Taxonomy File Format

Understand the correct file format (.txt) and structure for creating custom taxonomies to use with the Taxonomy Matcher.

May 23, 2024 | 4 min read | By AI Assistant

Custom Taxonomy File Format Guide

To use your own custom taxonomies with the Taxonomy Matcher, you need to provide them in a specific plain text format. This guide details the requirements for your custom taxonomy files.

1. File Type: Plain Text (.txt)

  • Your custom taxonomy must be saved as a plain text file, typically with a .txt extension.
  • Do not use rich text formats (like .docx, .rtf) or spreadsheet formats (like .xlsx, .csv).

2. Structure: One Category Path Per Line

  • Each complete category path must be on its own line in the text file.
  • Press Enter or Return to start a new category path.

3. Hierarchy Delimiter: > (Space, Greater Than, Space)

  • Levels within a category path are separated by a greater-than sign (>) with exactly one space on each side.
    • Correct: Electronics > Audio > Headphones
    • Incorrect: Electronics>Audio>Headphones (missing spaces)
    • Incorrect: Electronics -> Audio -> Headphones (wrong delimiter character)
  • This delimiter clearly defines the parent-child relationships between category levels.
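The delimiter rule above can be checked before uploading a file. The following is a minimal sketch in Python, not the Taxonomy Matcher's own parser; the function name is_well_formed is hypothetical, and the check is deliberately strict (it also rejects extra spaces around the delimiter).

```python
DELIMITER = " > "  # space, greater-than sign, space

def is_well_formed(path: str) -> bool:
    """Strict check: levels are separated by exactly ' > ' and none is empty."""
    levels = path.split(DELIMITER)
    return all(
        level != ""                  # no empty level such as "A >  > B"
        and level == level.strip()   # no extra spaces left around a level
        and ">" not in level         # no stray '>' that was not a full delimiter
        for level in levels
    )

print(is_well_formed("Electronics > Audio > Headphones"))    # True
print(is_well_formed("Electronics>Audio>Headphones"))        # False
print(is_well_formed("Electronics -> Audio -> Headphones"))  # False
```

A stray ">" left inside a level after splitting is the sign that a delimiter was written without its surrounding spaces or with an extra character.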

4. Category Naming

  • Category names can include spaces and most common characters.
  • Leading or trailing spaces around a category name (before or after the > delimiter) will generally be trimmed by the parser. Even so, stay consistent and avoid extra spaces.
    • Example: a line typed with extra spaces around the delimiters, such as "Electronics  >  Audio  >  Headphones", will likely be parsed as Electronics > Audio > Headphones.
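Rather than relying on the parser's trimming, paths can be normalized before the file is saved. This is a small sketch under assumptions: normalize_path is a hypothetical helper, and it is lenient in that it splits on the bare ">" character, so it also repairs paths whose delimiter is missing its spaces.

```python
def normalize_path(line: str) -> str:
    """Trim each level and rejoin the levels with the ' > ' delimiter."""
    levels = [level.strip() for level in line.split(">")]
    return " > ".join(levels)

print(normalize_path("  Electronics  >   Audio >  Headphones "))
# Electronics > Audio > Headphones
```

Because it splits on ">" alone, this helper does not fix a wrong delimiter such as "->"; those lines still need to be corrected by hand.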

5. Example Custom Taxonomy File Content

Here's an example of what the content of a custom_taxonomy.txt file might look like:

Electronics > Audio > Headphones > Over-Ear Headphones
Electronics > Audio > Headphones > In-Ear Headphones
Electronics > Audio > Speakers > Bluetooth Speakers
Electronics > Audio > Speakers > Bookshelf Speakers
Electronics > Video > Televisions > Smart TVs
Electronics > Video > Televisions > OLED TVs
Apparel > Men's Wear > Tops > T-Shirts
Apparel > Men's Wear > Tops > Formal Shirts
Apparel > Men's Wear > Bottoms > Jeans
Apparel > Men's Wear > Bottoms > Chinos
Apparel > Women's Wear > Dresses > Summer Dresses
Apparel > Women's Wear > Dresses > Evening Gowns
Apparel > Accessories > Bags > Handbags
Home Goods > Kitchen > Cookware > Pots & Pans
Home Goods > Furniture > Living Room > Sofas

Key Points to Remember:

  • File Extension: .txt
  • Encoding: Use UTF-8 encoding to ensure all characters are preserved correctly, especially if your category names include special characters or are in languages other than English.
  • One Path Per Line: Each full category path, from the root to the leaf node, goes on its own line.
  • Delimiter: > (space, greater-than sign, space) is crucial for defining hierarchy.
  • No Empty Lines (Ideally): While the parser might ignore them, it's cleaner to avoid empty lines between category paths.
  • No Duplicate Paths: Each full category path should ideally be unique within the file. The system will build a tree structure, and duplicate paths might lead to ambiguity or be overwritten depending on the parsing logic.
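Several of the points above (UTF-8 encoding, no empty lines, no duplicate paths) can be checked with a short script before uploading. The sketch below is illustrative only: lint_lines and lint_file are hypothetical names, and the checks reflect this guide's recommendations, not the Taxonomy Matcher's actual validation logic.

```python
from pathlib import Path

def lint_lines(lines):
    """Return (line_number, message) pairs for empty lines and duplicate paths."""
    problems = []
    first_seen = {}  # path -> line number where it first appeared
    for number, line in enumerate(lines, start=1):
        path = line.strip()
        if not path:
            problems.append((number, "empty line"))
        elif path in first_seen:
            problems.append((number, f"duplicate of line {first_seen[path]}"))
        else:
            first_seen[path] = number
    return problems

def lint_file(file_path):
    """Read the file as UTF-8 (raises UnicodeDecodeError otherwise) and lint it."""
    text = Path(file_path).read_text(encoding="utf-8")
    return lint_lines(text.splitlines())

sample = ["Apparel > Accessories > Bags > Handbags", "",
          "Apparel > Accessories > Bags > Handbags"]
print(lint_lines(sample))
# [(2, 'empty line'), (3, 'duplicate of line 1')]
```

To check a saved file, call lint_file("custom_taxonomy.txt"); an empty list means none of these problems were found.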

How the System Uses This Format

The Taxonomy Matcher reads your .txt file line by line. Each line is parsed to understand the hierarchical structure based on the > delimiter. This structure is then used to build an internal tree representation of your custom taxonomy, which the AI uses to find the best match for your products.
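To make the idea of a tree representation concrete, here is a minimal sketch of how lines in this format could be turned into a nested structure. It assumes a plain nested dictionary and a hypothetical build_tree function; the Taxonomy Matcher's internal representation is not documented here and may differ.

```python
def build_tree(lines):
    """Build a nested dict in which each key is a category and each value its children."""
    tree = {}
    for line in lines:
        # Split on the documented delimiter and trim each level; blank lines yield no levels.
        levels = [level.strip() for level in line.split(" > ") if level.strip()]
        node = tree
        for level in levels:
            # Reuse the existing child if present, otherwise create it.
            node = node.setdefault(level, {})
    return tree

tree = build_tree([
    "Electronics > Audio > Headphones",
    "Electronics > Audio > Speakers",
    "Apparel > Accessories",
])
print(tree)
# {'Electronics': {'Audio': {'Headphones': {}, 'Speakers': {}}}, 'Apparel': {'Accessories': {}}}
```

In this sketch, paths that share a prefix share the same parent node, and a duplicate line simply revisits nodes that already exist, which is one reason duplicates add nothing to the file.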

By adhering to this simple format, you can easily create and use highly customized taxonomies tailored to your specific product catalog and business needs.
