Branded+ Anchor Text Implementation Plan

Overview

Enhance the ingest-cora command to support "branded+" anchor text generation, which combines brand names with related searches. Add a brand mapping system to store company URLs and their associated brand names, and update the anchor text calculation logic to handle branded, branded+, and regular terms sequentially.

Components

1. Brand Mapping Storage

File: brands.json (root directory)

Format: JSON mapping normalized domains to brand name arrays

{
  "gullco.com": ["Gullco", "Gullco International"]
}

Location: Project root for easy editing
Normalization: Store only normalized domains (no www., no scheme)

2. Brand Lookup Helper (Inline)

File: src/cli/commands.py (add helper function)
Function: _get_brands_for_url(url: str) -> List[str]
- Extract domain from URL (remove scheme, www., trailing slash)
- Load brands.json from project root
- Lookup normalized domain
- Return the list of brand names, or an empty list if the domain is not found or the file is missing

3. Branded+ Anchor Text Generation

File: src/cli/commands.py (modify create_job_file_for_project)
Patterns: Generate two variations per related search:
- "{brand} {term}" (e.g., "Gullco welder")
- "{term} by {brand}" (e.g., "welder by Gullco")
Logic: For each brand name and each related search, generate both patterns

4. CLI Command Updates

File: src/cli/commands.py (modify ingest_cora)
New flag: --tier1-branded-plus-ratio (float, optional)
- Branded+ is prompted for only if this flag is provided
- Prompts for the ratio (0.0-1.0) of the slots that remain after branded terms are allocated
Brand text prompt update:
- Show default brands from brand mapping if URL found
- Allow Enter to accept defaults
- Format: "Enter branded anchor text (company name) for tier1 [default: 'Gullco, Gullco International'] (press Enter for default):"
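
One way the default-aware brand prompt could be handled is sketched below. The function names `build_brand_prompt` and `resolve_brand_input` are hypothetical, and the sketch assumes that custom input is comma-separated, mirroring how the defaults are displayed; the plan does not specify the input format.

```python
from typing import List


def build_brand_prompt(default_brands: List[str]) -> str:
    """Build the tier1 brand prompt, showing defaults when the URL has a mapping."""
    base = "Enter branded anchor text (company name) for tier1"
    if not default_brands:
        return f"{base}:"
    joined = ", ".join(default_brands)
    return f"{base} [default: '{joined}'] (press Enter for default):"


def resolve_brand_input(raw: str, default_brands: List[str]) -> List[str]:
    """Empty input (plain Enter) keeps the defaults; otherwise split on commas."""
    if not raw.strip():
        return list(default_brands)
    # Assumption: multiple custom brand names are separated by commas.
    return [name.strip() for name in raw.split(",") if name.strip()]
```

Keeping the prompt text and the input parsing as pure functions makes the Enter-for-default path testable without simulating terminal input.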

5. Anchor Text Calculation Logic

File: src/cli/commands.py (modify create_job_file_for_project)
Calculation order:
1. Get available terms (custom_anchor_text or related_searches)
2. Calculate branded count: total * tier1_branded_ratio
3. Calculate remaining: total - branded_count
4. Calculate branded+ count: remaining * branded_plus_ratio (if enabled)
5. Calculate regular count: remaining - branded_plus_count
Generation:
- Branded terms: Use provided brand names (cycled)
- Branded+ terms: Generate from brands + related_searches (both patterns)
- Regular terms: Use remaining related_searches/keyword variations
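
The calculation order above might be expressed as a small helper like the one below. The name `split_anchor_counts` is hypothetical, and rounding to the nearest integer with `round()` is an assumption of this sketch; the plan does not state how fractional counts are resolved.

```python
from typing import Optional, Tuple


def split_anchor_counts(
    total: int,
    branded_ratio: float,
    branded_plus_ratio: Optional[float] = None,
) -> Tuple[int, int, int]:
    """Return (branded, branded_plus, regular) counts that sum to total."""
    # Assumption: fractional counts are rounded to the nearest integer.
    branded = min(total, max(0, round(total * branded_ratio)))
    remaining = total - branded
    if branded_plus_ratio is None:
        # Branded+ is disabled when no ratio is supplied.
        branded_plus = 0
    else:
        branded_plus = min(remaining, max(0, round(remaining * branded_plus_ratio)))
    regular = remaining - branded_plus
    return branded, branded_plus, regular
```

Because the branded+ ratio applies to the remainder rather than the total, the three counts always add up to `total` regardless of the ratios chosen.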

6. Function Signature Updates

File: src/cli/commands.py
create_job_file_for_project:
- Add tier1_branded_plus_ratio: Optional[float] = None
- Add brand_names: Optional[List[str]] = None (for branded+ generation)
ingest_cora:
- Add tier1_branded_plus_ratio: Optional[float] = None parameter
- Pass brand names to create_job_file_for_project

Implementation Details

Brand Lookup Flow

1. Normalize money_site_url: remove scheme (http://, https://), remove www. prefix, remove trailing slash
2. Look up normalized domain in brands.json
3. Return list of brand names or empty list if not found

Branded+ Generation Example

Brands: ["Gullco", "Gullco International"]
Related searches: ["welder", "automatic welder"]
Generated terms:
- "Gullco welder"
- "welder by Gullco"
- "Gullco automatic welder"
- "automatic welder by Gullco"
- "Gullco International welder"
- "welder by Gullco International"
- "Gullco International automatic welder"
- "automatic welder by Gullco International"
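
A minimal sketch that reproduces the list above, iterating brands in the outer loop and related searches in the inner loop. The function name `generate_branded_plus_terms` is hypothetical, and the ordering is inferred from this example rather than specified by the plan.

```python
from typing import List


def generate_branded_plus_terms(brands: List[str], related_searches: List[str]) -> List[str]:
    """Emit "{brand} {term}" and "{term} by {brand}" for every brand/term pair."""
    terms: List[str] = []
    for brand in brands:
        for term in related_searches:
            terms.append(f"{brand} {term}")
            terms.append(f"{term} by {brand}")
    return terms


terms = generate_branded_plus_terms(
    ["Gullco", "Gullco International"],
    ["welder", "automatic welder"],
)
print(len(terms))  # 2 brands x 2 searches x 2 patterns = 8
```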

Anchor Text Distribution Example

Total available terms: 10
tier1_branded_ratio: 0.4 → 4 branded terms
Remaining: 6
tier1_branded_plus_ratio: 0.67 → 6 × 0.67 ≈ 4 branded+ terms
Regular: 2 terms
Final list: [4 branded, 4 branded+, 2 regular]
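
Assembling the final list from those counts could look roughly like this. The names `take_cycled` and `assemble_anchor_terms` are hypothetical. Cycling the brand names follows the plan; taking the first N branded+ and regular terms without cycling is an assumption of this sketch.

```python
from itertools import cycle, islice
from typing import List, Tuple


def take_cycled(pool: List[str], count: int) -> List[str]:
    """Take count items from pool, wrapping around when the pool is shorter."""
    if not pool or count <= 0:
        return []
    return list(islice(cycle(pool), count))


def assemble_anchor_terms(
    brands: List[str],
    branded_plus_terms: List[str],
    regular_terms: List[str],
    counts: Tuple[int, int, int],
) -> List[str]:
    """Concatenate branded, branded+, and regular terms in that order."""
    branded_count, plus_count, regular_count = counts
    return (
        take_cycled(brands, branded_count)      # brand names are cycled
        + branded_plus_terms[:plus_count]       # assumption: first N, no cycling
        + regular_terms[:regular_count]         # assumption: first N, no cycling
    )
```

With two brand names and a branded count of 4, the branded portion alternates between the two names twice before the branded+ terms begin.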

Files to Modify

- src/cli/commands.py: Add branded+ logic, brand lookup helper, update prompts, calculation
- brands.json: New file for brand mappings (create with example entry)

Testing Considerations

- Test with brand mapping present and absent
- Test with Enter (default) and custom brand input
- Test branded+ calculation with various ratios
- Test URL normalization (with/without www., http/https)
- Test with multiple brand names per URL
- Test with no related searches (fallback behavior)
