# Branded+ Anchor Text Implementation Plan

## Overview

Enhance the `ingest-cora` command to support "branded+" anchor text generation, which combines brand names with related searches. Add a brand mapping system to store company URLs and their associated brand names, and update the anchor text calculation logic to handle branded, branded+, and regular terms sequentially.

## Components

### 1. Brand Mapping Storage

- **File**: `brands.json` (root directory)
- **Format**: JSON mapping normalized domains to brand name arrays

```json
{
  "gullco.com": ["Gullco", "Gullco International"]
}
```

- **Location**: Project root for easy editing
- **Normalization**: Store only normalized domains (no www., no scheme)

### 2. Brand Lookup Helper (Inline)

- **File**: `src/cli/commands.py` (add helper function)
- **Function**: `_get_brands_for_url(url: str) -> List[str]`
  - Extract the domain from the URL (remove scheme, www., trailing slash)
  - Load `brands.json` from the project root
  - Look up the normalized domain
  - Return the list of brand names, or an empty list if the domain is not found or the file is missing

### 3. Branded+ Anchor Text Generation

- **File**: `src/cli/commands.py` (modify `create_job_file_for_project`)
- **Patterns**: Generate two variations per related search:
  - `"{brand} {term}"` (e.g., "Gullco welder")
  - `"{term} by {brand}"` (e.g., "welder by Gullco")
- **Logic**: For each brand name and each related search, generate both patterns

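The two patterns can be illustrated with a short sketch. The function name `generate_branded_plus_terms` is hypothetical; in the plan this logic lives inside `create_job_file_for_project`. The iteration order (brands in the outer loop, related searches in the inner loop) is an assumption.

```python
from typing import List


def generate_branded_plus_terms(brands: List[str], related_searches: List[str]) -> List[str]:
    """Build '{brand} {term}' and '{term} by {brand}' for every brand/term pair."""
    terms: List[str] = []
    for brand in brands:
        for term in related_searches:
            terms.append(f"{brand} {term}")      # e.g. "Gullco welder"
            terms.append(f"{term} by {brand}")   # e.g. "welder by Gullco"
    return terms


if __name__ == "__main__":
    for anchor in generate_branded_plus_terms(["Gullco"], ["welder"]):
        print(anchor)
```

With B brands and R related searches this yields 2 × B × R candidate terms, so the branded+ count from the calculation step decides how many of them are actually used.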
### 4. CLI Command Updates

- **File**: `src/cli/commands.py` (modify `ingest_cora`)
- **New flag**: `--tier1-branded-plus-ratio` (float, optional)
  - Prompt for branded+ only if this flag is provided
  - Prompt for the fraction (0.0-1.0) of the slots that remain after branded terms are allocated
- **Brand text prompt update**:
  - Show default brands from the brand mapping if the URL is found
  - Allow Enter to accept the defaults
  - Format: "Enter branded anchor text (company name) for tier1 [default: 'Gullco, Gullco International'] (press Enter for default):"

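The prompt behaviour above might look like the following sketch. It is framework-agnostic on purpose, since the plan does not name the CLI library. Both function names are hypothetical, and two details are assumptions: custom input is treated as a comma-separated list (mirroring how the defaults are displayed), and the prompt shown when no defaults exist simply drops the bracketed part.

```python
from typing import List


def build_brand_prompt(default_brands: List[str]) -> str:
    """Format the branded anchor text prompt, showing defaults when available."""
    base = "Enter branded anchor text (company name) for tier1"
    if default_brands:
        joined = ", ".join(default_brands)
        return f"{base} [default: '{joined}'] (press Enter for default):"
    # Assumption: without a brand mapping entry the prompt has no default hint.
    return f"{base}:"


def parse_brand_input(raw: str, default_brands: List[str]) -> List[str]:
    """Empty input accepts the defaults; otherwise split on commas."""
    if not raw.strip():
        return list(default_brands)
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    defaults = ["Gullco", "Gullco International"]
    answer = input(build_brand_prompt(defaults) + " ")
    print(parse_brand_input(answer, defaults))
```

Separating prompt formatting from input parsing keeps both pieces testable without simulating terminal input.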
### 5. Anchor Text Calculation Logic

- **File**: `src/cli/commands.py` (modify `create_job_file_for_project`)
- **Calculation order**:
  1. Get available terms (`custom_anchor_text` or `related_searches`)
  2. Calculate branded count: `total * tier1_branded_ratio`
  3. Calculate remaining: `total - branded_count`
  4. Calculate branded+ count: `remaining * tier1_branded_plus_ratio` (if enabled)
  5. Calculate regular count: `remaining - branded_plus_count`
- **Generation**:
  - Branded terms: Use provided brand names (cycled)
  - Branded+ terms: Generate from brands + `related_searches` (both patterns)
  - Regular terms: Use remaining `related_searches`/keyword variations

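The calculation order can be sketched as a small pure function. The names `AnchorCounts` and `split_anchor_counts` are hypothetical, and the plan does not specify how fractional counts become whole numbers; this sketch assumes Python's built-in `round()` and clamps each count so it never exceeds the slots available.

```python
from typing import NamedTuple, Optional


class AnchorCounts(NamedTuple):
    branded: int
    branded_plus: int
    regular: int


def split_anchor_counts(
    total: int,
    tier1_branded_ratio: float,
    tier1_branded_plus_ratio: Optional[float] = None,
) -> AnchorCounts:
    """Split `total` slots into branded, branded+, and regular counts, in that order."""
    # Assumption: round() converts the fractional product to a whole count.
    branded = min(total, round(total * tier1_branded_ratio))
    remaining = total - branded
    if tier1_branded_plus_ratio is None:
        branded_plus = 0  # branded+ disabled: everything left over is regular
    else:
        branded_plus = min(remaining, round(remaining * tier1_branded_plus_ratio))
    regular = remaining - branded_plus
    return AnchorCounts(branded, branded_plus, regular)


if __name__ == "__main__":
    print(split_anchor_counts(10, 0.4, 0.67))
```

Because the regular count is derived by subtraction, the three counts always sum to `total` regardless of the rounding rule chosen.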
### 6. Function Signature Updates

- **File**: `src/cli/commands.py`
- **`create_job_file_for_project`**:
  - Add `tier1_branded_plus_ratio: Optional[float] = None`
  - Add `brand_names: Optional[List[str]] = None` (for branded+ generation)
- **`ingest_cora`**:
  - Add `tier1_branded_plus_ratio: Optional[float] = None` parameter
  - Pass brand names to `create_job_file_for_project`

## Implementation Details

### Brand Lookup Flow

1. Normalize `money_site_url`: remove the scheme (http://, https://), the www. prefix, and any trailing slash
2. Look up the normalized domain in `brands.json`
3. Return the list of brand names, or an empty list if not found

### Branded+ Generation Example

- Brands: ["Gullco", "Gullco International"]
- Related searches: ["welder", "automatic welder"]
- Generated terms:
  - "Gullco welder"
  - "welder by Gullco"
  - "Gullco automatic welder"
  - "automatic welder by Gullco"
  - "Gullco International welder"
  - "welder by Gullco International"
  - "Gullco International automatic welder"
  - "automatic welder by Gullco International"

### Anchor Text Distribution Example

- Total available terms: 10
- `tier1_branded_ratio`: 0.4 → 4 branded terms (10 × 0.4 = 4)
- Remaining: 6
- `tier1_branded_plus_ratio`: 0.67 → 4 branded+ terms (6 × 0.67 = 4.02, taken as a whole count of 4)
- Regular: 2 terms (6 - 4)
- Final list: [4 branded, 4 branded+, 2 regular]

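To make the final ordering concrete, here is a hedged sketch of how the three groups might be assembled once the counts are known. The function name `build_anchor_list` is hypothetical. Branded names are cycled as the plan states; truncating the branded+ and regular pools to their counts is an assumption, since the plan does not say what happens when a pool is shorter than its count.

```python
from itertools import cycle, islice
from typing import List


def build_anchor_list(
    brands: List[str],
    branded_plus_terms: List[str],
    regular_terms: List[str],
    branded_count: int,
    branded_plus_count: int,
    regular_count: int,
) -> List[str]:
    """Concatenate branded, branded+, and regular anchors in that order."""
    # Cycle through brand names until the branded quota is filled.
    branded = list(islice(cycle(brands), branded_count)) if brands else []
    # Assumption: pools are truncated, not cycled, when filling their quotas.
    branded_plus = branded_plus_terms[:branded_plus_count]
    regular = regular_terms[:regular_count]
    return branded + branded_plus + regular


if __name__ == "__main__":
    anchors = build_anchor_list(
        brands=["Gullco", "Gullco International"],
        branded_plus_terms=["Gullco welder", "welder by Gullco",
                            "Gullco automatic welder", "automatic welder by Gullco"],
        regular_terms=["welder", "automatic welder"],
        branded_count=4,
        branded_plus_count=4,
        regular_count=2,
    )
    print(anchors)
```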
## Files to Modify

1. `src/cli/commands.py` - Add branded+ logic, brand lookup helper, update prompts, calculation
2. `brands.json` - New file for brand mappings (create with example entry)

## Testing Considerations

- Test with brand mapping present and absent
- Test with Enter (default) and custom brand input
- Test branded+ calculation with various ratios
- Test URL normalization (with/without www., http/https)
- Test with multiple brand names per URL
- Test with no related searches (fallback behavior)