# Story 3.2: Find Tiered Links - Implementation Summary

## Status

Completed

## What Was Implemented

### 1. Database Models

- **Added `money_site_url` to Project model** - Stores the client's actual website URL for tier 1 articles to link to
- **Created `ArticleLink` model** - Tracks all link relationships between articles (tiered, wheel, homepage)

### 2. Database Repositories

- **Extended `ProjectRepository`** - Now accepts `money_site_url` in the data dict during creation
- **Extended `GeneratedContentRepository`** - Added filter for `site_deployment_id` in `get_by_project_and_tier()`
- **Created `ArticleLinkRepository`** - Full CRUD operations for article link tracking
  - `create()` - Create internal or external links
  - `get_by_source_article()` - Get all outbound links from an article
  - `get_by_target_article()` - Get all inbound links to an article
  - `get_by_link_type()` - Get all links of a specific type
  - `delete()` - Remove a link

### 3. Job Configuration

- **Extended `Job` dataclass** - Added optional `tiered_link_count_range` field
- **Validation** - Validates that min >= 1 and max >= min
- **Defaults** - If not specified, uses `{min: 2, max: 4}`

### 4. Core Functionality

Created `src/interlinking/tiered_links.py` with:

- **`find_tiered_links()`** - Main function to find tiered links for a batch
  - For tier 1: Returns the money site URL
  - For tier 2+: Returns random selection of lower-tier article URLs
  - Respects project boundaries (only queries same project)
  - Applies link count configuration
  - Handles edge cases (insufficient articles, missing money site URL)

### 5. Tests

- **22 unit tests** in `tests/unit/test_tiered_links.py` - All passing
- **8 unit tests** in `tests/unit/test_article_link_repository.py` - All passing
- **9 integration tests** in `tests/integration/test_story_3_2_integration.py` - All passing
- **Total: 39 tests, all passing**

## Usage Examples

### Finding Tiered Links for Tier 1 Batch

```python
from src.interlinking.tiered_links import find_tiered_links

# Tier 1 articles link to the money site
result = find_tiered_links(tier1_content_records, job, project_repo, content_repo, site_repo)
# Returns: {
#   "tier": 1,
#   "money_site_url": "https://www.mymoneysite.com"
# }
```

### Finding Tiered Links for Tier 2 Batch

```python
# Tier 2 articles link to random tier 1 articles
result = find_tiered_links(tier2_content_records, job, project_repo, content_repo, site_repo)
# Returns: {
#   "tier": 2,
#   "lower_tier": 1,
#   "lower_tier_urls": [
#     "https://site1.b-cdn.net/article-1.html",
#     "https://site2.b-cdn.net/article-2.html",
#     "https://site3.b-cdn.net/article-3.html"
#   ]
# }
```

### Job Config with Custom Link Count

```json
{
  "jobs": [{
    "project_id": 1,
    "tiers": {
      "tier1": {"count": 5},
      "tier2": {"count": 10}
    },
    "tiered_link_count_range": {
      "min": 3,
      "max": 5
    }
  }]
}
```

### Recording Links in Database

```python
from src.database.repositories import ArticleLinkRepository

link_repo = ArticleLinkRepository(session)

# Record tier 1 article linking to money site
link_repo.create(
    from_content_id=tier1_article.id,
    to_content_id=None,
    to_url="https://www.moneysite.com",
    link_type="tiered"
)

# Record tier 2 article linking to tier 1 article
link_repo.create(
    from_content_id=tier2_article.id,
    to_content_id=tier1_article.id,
    to_url=None,
    link_type="tiered"
)

# Query all links from an article
outbound_links = link_repo.get_by_source_article(article.id)
```

## Database Schema Changes

### Project Table

```sql
ALTER TABLE projects ADD COLUMN money_site_url VARCHAR(500) NULL;
CREATE INDEX idx_projects_money_site_url ON projects(money_site_url);
```

### Article Links Table (New)

```sql
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_content_id INTEGER NOT NULL,
    to_content_id INTEGER NULL,
    to_url TEXT NULL,
    anchor_text TEXT NULL,  -- Added in Story 4.5
    link_type VARCHAR(20) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (from_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    FOREIGN KEY (to_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    CHECK (to_content_id IS NOT NULL OR to_url IS NOT NULL)
);

CREATE INDEX idx_article_links_from ON article_links(from_content_id);
CREATE INDEX idx_article_links_to ON article_links(to_content_id);
CREATE INDEX idx_article_links_type ON article_links(link_type);
```

## Link Types

- `tiered` - Link from tier N to tier N-1 (or money site for tier 1)
- `wheel_next` - Link to next article in wheel structure
- `wheel_prev` - Link to previous article in wheel structure
- `homepage` - Link to site homepage

## Key Features

1. **Project Isolation** - Only queries articles from the same project
2. **Random Selection** - Randomly selects articles within configured range
3. **Flexible Configuration** - Supports both range (min-max) and exact counts
4. **Error Handling** - Clear error messages for missing data
5. **Warning Logs** - Logs warnings when fewer articles available than requested
6. **URL Generation** - Integrates with Story 3.1 URL generation

## Next Steps (Future Stories)

- Story 3.3 will use `find_tiered_links()` for actual content injection
- Story 3.3 will populate `article_links` table with wheel and homepage links
- Story 4.2 will log tiered links after deployment
- Future: Analytics dashboard using `article_links` data

## Files Created/Modified

### Created

- `src/interlinking/tiered_links.py`
- `tests/unit/test_tiered_links.py`
- `tests/unit/test_article_link_repository.py`
- `tests/integration/test_story_3_2_integration.py`
- `jobs/example_story_3.2_tiered_links.json`
- `STORY_3.2_IMPLEMENTATION_SUMMARY.md` (this file)

### Modified

- `src/database/models.py` - Added `money_site_url` to Project, added `ArticleLink` model
- `src/database/interfaces.py` - Added `IArticleLinkRepository` interface
- `src/database/repositories.py` - Extended `ProjectRepository`, added `ArticleLinkRepository`
- `src/generation/job_config.py` - Added `tiered_link_count_range` to Job config

## Test Coverage

All acceptance criteria from the story are covered by tests:

- Tier 1 returns money site URL
- Tier 2+ queries lower tier from same project
- Custom link count ranges work
- Error handling for missing data
- Warning logs for insufficient articles
- ArticleLink CRUD operations
- Integration with URL generation
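
For readers who want a feel for the link count rules described under Job Configuration and Core Functionality, the following is a minimal sketch and not the code in `src/interlinking/tiered_links.py`. The helper names `resolve_link_count_range` and `pick_lower_tier_urls` are hypothetical, and raising on an empty candidate list is an assumption; the summary only states that insufficient articles are handled and that a warning is logged when fewer are available than requested.

```python
import logging
import random

logger = logging.getLogger(__name__)

# Default stated in this summary: {min: 2, max: 4}
DEFAULT_RANGE = {"min": 2, "max": 4}


def resolve_link_count_range(raw=None):
    """Return (min, max), applying the default and the documented validation."""
    cfg = DEFAULT_RANGE if raw is None else raw
    lo, hi = cfg["min"], cfg["max"]
    if lo < 1:
        raise ValueError("tiered_link_count_range.min must be >= 1")
    if hi < lo:
        raise ValueError("tiered_link_count_range.max must be >= min")
    return lo, hi


def pick_lower_tier_urls(candidate_urls, raw_range=None, rng=None):
    """Pick a random subset of lower-tier URLs within the configured range.

    Hypothetical helper: the real function also resolves URLs through the
    repositories, which is omitted here.
    """
    if rng is None:
        rng = random.Random()
    lo, hi = resolve_link_count_range(raw_range)
    candidates = list(candidate_urls)
    if not candidates:
        # Assumption: an empty lower tier is treated as an error.
        raise ValueError("no lower-tier articles available")
    wanted = rng.randint(lo, hi)  # inclusive on both ends
    if len(candidates) < wanted:
        logger.warning(
            "Requested %d links but only %d lower-tier articles exist",
            wanted,
            len(candidates),
        )
        wanted = len(candidates)
    # sample() selects without replacement, so no URL is repeated
    return rng.sample(candidates, wanted)
```

Setting `min` equal to `max` in this sketch yields an exact count, which is one way the "range and exact counts" behaviour listed under Key Features could be expressed.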
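
The `article_links` schema shown under Database Schema Changes relies on a `CHECK` constraint and `ON DELETE CASCADE`. The sketch below exercises both against an in-memory SQLite database using Python's standard `sqlite3` module. It assumes SQLite (suggested by `AUTOINCREMENT` in the schema) and uses a stripped-down, hypothetical `generated_content` table with only an `id` column; the project's real table has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical minimal stand-in for the real generated_content table.
conn.execute("CREATE TABLE generated_content (id INTEGER PRIMARY KEY)")
conn.execute("""
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_content_id INTEGER NOT NULL,
    to_content_id INTEGER NULL,
    to_url TEXT NULL,
    anchor_text TEXT NULL,
    link_type VARCHAR(20) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (from_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    FOREIGN KEY (to_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    CHECK (to_content_id IS NOT NULL OR to_url IS NOT NULL)
)
""")
conn.executemany("INSERT INTO generated_content (id) VALUES (?)", [(1,), (2,)])

# Article 1 (tier 1) links to an external URL, the money site.
conn.execute(
    "INSERT INTO article_links (from_content_id, to_url, link_type) "
    "VALUES (1, 'https://www.moneysite.com', 'tiered')"
)
# Article 2 (tier 2) links to article 1 internally.
conn.execute(
    "INSERT INTO article_links (from_content_id, to_content_id, link_type) "
    "VALUES (2, 1, 'tiered')"
)


def targetless_link_rejected():
    """Try to insert a link with neither to_content_id nor to_url."""
    try:
        conn.execute(
            "INSERT INTO article_links (from_content_id, link_type) "
            "VALUES (2, 'tiered')"
        )
    except sqlite3.IntegrityError:
        return True  # the CHECK constraint fired
    return False


rejected = targetless_link_rejected()

# Deleting article 1 cascades to links that start from it or point at it.
conn.execute("DELETE FROM generated_content WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM article_links").fetchone()[0]
```

One consequence worth noting: because both foreign keys cascade, removing a tier 1 article also removes the tier 2 links that pointed to it, so link records never dangle.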