PXL-831 Running
High

๐Ÿ” Smarter food search that understands synonyms

Labels
1-circe, search, ui/ux, Improvement

What we're solving:

The food search engine currently has trouble understanding that different words can mean the same thing. For example, searching for "butter unsalted" won't find "butter without salt" even though they describe the same food. Additionally, the search prioritizes your personal foods over the main database regardless of match quality, so searching for "cream" might suggest "cream cheese" from your saved foods instead of plain cream from the USDA database. This task teaches the search system to recognize synonym descriptors, detect conflicting descriptors, and prioritize exact matches over partial ones.


Overview

Improve the food matching algorithm to handle descriptor synonyms, reject conflicting descriptors, and prioritize exact name matches over partial matches from higher-priority sources.


Issue 1: Descriptor Synonym Matching

Problem: Currently "butter unsalted" doesn't match "butter without salt" because the matching algorithm doesn't recognize equivalent food descriptors.

Solution: Implement synonym groups for common food descriptors:

Category | Synonyms
Salt | "unsalted" ↔ "without salt" ↔ "no salt" ↔ "salt free"
Pasteurization | "unpasteurized" ↔ "raw"
Sugar | "unsweetened" ↔ "no sugar" ↔ "sugar free"
Fat | "nonfat" ↔ "fat free" ↔ "skim" / "low fat" ↔ "reduced fat" ↔ "light"
Cooking methods | "grilled" ↔ "chargrilled" ↔ "bbq", etc.
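
A minimal sketch of how these groups could be flattened into a synonym-to-canonical lookup. The canonical keys no_salt, no_sugar, fat_free and raw follow the names used elsewhere in this ticket; low_fat and grilled are placeholder keys, and normalizeDescriptors(in:) is a hypothetical signature, not the existing API.

```swift
// Each synonym maps to one canonical key. "low_fat" and "grilled" are assumed names.
let descriptorSynonyms: [String: String] = [
    "unsalted": "no_salt", "without salt": "no_salt", "no salt": "no_salt", "salt free": "no_salt",
    "unpasteurized": "raw", "raw": "raw",
    "unsweetened": "no_sugar", "no sugar": "no_sugar", "sugar free": "no_sugar",
    "nonfat": "fat_free", "fat free": "fat_free", "skim": "fat_free",
    "low fat": "low_fat", "reduced fat": "low_fat", "light": "low_fat",
    "grilled": "grilled", "chargrilled": "grilled", "bbq": "grilled",
]

/// Splits `text` into canonical descriptors and the remaining (non-descriptor) words.
func normalizeDescriptors(in text: String) -> (descriptors: Set<String>, rest: [String]) {
    // Lowercase and split on anything that is not a letter or digit,
    // so "Butter, salted" and "salt-free" tokenize into whole words.
    let tokens = text.lowercased()
        .split(whereSeparator: { !($0.isLetter || $0.isNumber) })
        .map(String.init)

    var descriptors = Set<String>()
    var rest: [String] = []
    var i = 0
    while i < tokens.count {
        // Try the two-word phrase first so "without salt" is matched as a unit.
        if i + 1 < tokens.count,
           let canonical = descriptorSynonyms[tokens[i] + " " + tokens[i + 1]] {
            descriptors.insert(canonical)
            i += 2
        } else if let canonical = descriptorSynonyms[tokens[i]] {
            descriptors.insert(canonical)
            i += 1
        } else {
            rest.append(tokens[i])
            i += 1
        }
    }
    return (descriptors, rest)
}

let a = normalizeDescriptors(in: "butter unsalted")
let b = normalizeDescriptors(in: "Butter without salt")
print(a.descriptors == b.descriptors && a.rest == b.rest)
```

Comparing the canonical sets and the remaining words separately keeps "butter unsalted" and "butter without salt" equal without rewriting either string.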

Issue 2: Conflicting Descriptors Not Handled

Problem: Synonym matching normalizes "unsalted" → no_salt, but it does not check for conflicting descriptors. "salted" is the opposite of "unsalted", so a food described as "salted" should be rejected.

Examples:

  • โ—Query: "butter unsalted" should NOT match "Butter, salted"
  • โ—Query: "unsweetened tea" should NOT match "sweetened tea"

Solution: Add conflict detection. If the query contains a descriptor, reject foods that carry the opposite descriptor:

  • no_salt conflicts with salted
  • no_sugar conflicts with sweetened
  • fat_free conflicts with full_fat
  • raw conflicts with cooking methods (grilled, fried, baked, boiled)

Issue 3: Exact Match Prioritization

Problem: "200g cream" matches "Cheese, cream" from My Foods instead of "Cream" from USDA dataset.

Current behavior: The priority system (My Foods > Dataset > OFF) always prefers higher-priority sources regardless of match quality.

Expected behavior: Exact or near-exact name matches in lower-priority sources should beat partial matches in higher-priority sources.

Example: When searching "cream", "Cream" from USDA should rank higher than "Cheese, cream" from My Foods because "Cream" is an exact match.
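
One possible way to tell an exact name match from a partial one is to compare word sets, which also handles USDA-style inverted names such as "Cheese, cream". The NameMatch enum and classifyNameMatch function below are hypothetical names used only for illustration.

```swift
enum NameMatch { case exact, partial, none }

/// Lowercased set of words, split on any non-alphanumeric character.
func words(_ text: String) -> Set<String> {
    Set(text.lowercased()
        .split(whereSeparator: { !($0.isLetter || $0.isNumber) })
        .map(String.init))
}

/// exact: same words (order ignored); partial: every query word appears in the name.
func classifyNameMatch(query: String, foodName: String) -> NameMatch {
    let q = words(query), n = words(foodName)
    if q.isEmpty || !q.isSubset(of: n) { return .none }
    return q == n ? .exact : .partial
}

print(classifyNameMatch(query: "cream", foodName: "Cream"))          // exact
print(classifyNameMatch(query: "cream", foodName: "Cheese, cream"))  // partial
```

Ignoring word order is a design choice here: it treats "butter unsalted" and "unsalted butter" as the same name, which may or may not be what the matcher wants.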


Issue 4: Brand Matches Ranked Too High

Problem: Searching "milk skim" matches "Yoghurt, plain" with brand "skim milk" instead of actual skim milk products.

Solution:

  1. โ—Deprioritize brand matches in isStrongMatch() - brand should only be a tiebreaker, not a primary match criterion
  2. โ—In relevanceScore(), give lower scores to brand-only matches
  3. โ—Consider excluding brand from single-word query matching entirely

Technical Approach

  1. โ—Add descriptorSynonymGroups dictionary in FoodMatchingUtils.swift
    • โ—Create a data structure mapping each synonym to its canonical form
    • โ—Example: ["unsalted": "no_salt", "without salt": "no_salt", "no salt": "no_salt", "salt free": "no_salt"]
  2. โ—Add descriptorConflicts dictionary to define mutually exclusive pairs
    • โ—Example: ["no_salt": ["salted"], "salted": ["no_salt"]]
  3. โ—Create normalizeDescriptor() function to map synonyms to canonical forms
    • โ—Should handle compound descriptors (e.g., "unsalted low fat")
    • โ—Case-insensitive matching
  4. โ—Create haveMatchingDescriptors() function to detect conflicts
    • โ—Returns false if query and food have conflicting descriptors
  5. โ—Update isStrongMatch() and relevanceScore() to use synonym matching and conflict detection
    • โ—Normalize both query and candidate descriptors before comparison
    • โ—Treat synonym-matched descriptors as equivalent for scoring purposes
    • โ—Reject matches with conflicting descriptors
  6. โ—Modify pickBestCandidate() to consider match quality across all sources before applying priority
    • โ—Introduce a match quality threshold (e.g., exact match = 1.0, strong match = 0.9, partial = 0.5)
    • โ—Only apply source priority as a tiebreaker when match quality is similar
    • โ—Proposed logic: finalScore = matchQuality * 100 + sourcePriority (ensures quality dominates)

Files to Modify

  • โ—NutriKit/Voice/FoodMatchingUtils.swift - Main implementation location
  • โ—NutriKit/Voice/TextLoggingModel.swift - May need updates if search flow changes affect the model layer

Acceptance Criteria

  • โ— Synonym matching: "butter unsalted" matches "butter without salt" with high confidence
  • โ— Synonym matching: "unsweetened yogurt" matches "yogurt no sugar"
  • โ— Conflict detection: "butter unsalted" does NOT match "Butter, salted"
  • โ— Conflict detection: "unsweetened tea" does NOT match "sweetened tea"
  • โ— Exact match priority: Searching "cream" returns "Cream" from USDA over "Cheese, cream" from My Foods
  • โ— Exact match priority: Searching "chicken" returns "Chicken" over "Chicken salad" when exact match exists
  • โ— Brand deprioritization: "milk skim" matches milk products, not yogurt with "skim milk" brand
  • โ— Brand deprioritization: Name/detail matches always rank higher than brand-only matches
  • โ— Existing exact matches continue to work correctly (no regressions)
  • โ— Unit tests added for synonym normalization and conflict detection logic

Related

Follow-up to PXL-826 (Voice/Text Logging food matching fix).
Incorporates requirements from PXL-837 (now canceled as duplicate).


Build instruction: Use -destination 'platform=iOS Simulator,name=iPhone 17 Pro' when building this project.

Created Jan 3, 2026, 10:11 AM · Updated Jan 6, 2026, 1:09 PM