# Story 3.2: Find Tiered Links - Implementation Summary
## Status
Completed
## What Was Implemented
### 1. Database Models
- **Added `money_site_url` to Project model** - Stores the client's actual website URL for tier 1 articles to link to
- **Created `ArticleLink` model** - Tracks all link relationships between articles (tiered, wheel, homepage)
### 2. Database Repositories
- **Extended `ProjectRepository`** - Now accepts `money_site_url` in the data dict during creation
- **Extended `GeneratedContentRepository`** - Added a `site_deployment_id` filter to `get_by_project_and_tier()`
- **Created `ArticleLinkRepository`** - Full CRUD operations for article link tracking
  - `create()` - Create internal or external links
  - `get_by_source_article()` - Get all outbound links from an article
  - `get_by_target_article()` - Get all inbound links to an article
  - `get_by_link_type()` - Get all links of a specific type
  - `delete()` - Remove a link
### 3. Job Configuration
- **Extended `Job` dataclass** - Added optional `tiered_link_count_range` field
- **Validation** - Requires `min >= 1` and `max >= min`
- **Defaults** - If not specified, uses `{min: 2, max: 4}`
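The validation and default rules above can be sketched as follows. This is a minimal illustration, not the code in `src/generation/job_config.py`: `JobSketch`, `resolve_link_count_range`, and the error messages are hypothetical names chosen for the example; only the `{min: 2, max: 4}` default and the two constraints come from this summary.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Default taken from this summary; everything else is illustrative.
DEFAULT_LINK_COUNT_RANGE = {"min": 2, "max": 4}


def resolve_link_count_range(raw: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Return a validated {min, max} dict, falling back to the default."""
    if raw is None:
        return dict(DEFAULT_LINK_COUNT_RANGE)
    if "min" not in raw or "max" not in raw:
        raise ValueError("tiered_link_count_range needs both 'min' and 'max'")
    low, high = raw["min"], raw["max"]
    if low < 1:
        raise ValueError("tiered_link_count_range.min must be >= 1")
    if high < low:
        raise ValueError("tiered_link_count_range.max must be >= min")
    return {"min": low, "max": high}


@dataclass
class JobSketch:
    """Hypothetical stand-in for the real Job dataclass."""

    project_id: int
    tiered_link_count_range: Optional[Dict[str, int]] = None

    def __post_init__(self) -> None:
        # Normalize once at construction so later code can rely on the range.
        self.tiered_link_count_range = resolve_link_count_range(
            self.tiered_link_count_range
        )
```

Validating at construction time means a malformed job file fails before any articles are processed, which is presumably the intent of the validation step.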
### 4. Core Functionality
Created `src/interlinking/tiered_links.py` with:
- **`find_tiered_links()`** - Main function to find tiered links for a batch
  - For tier 1: Returns the money site URL
  - For tier 2+: Returns random selection of lower-tier article URLs
  - Respects project boundaries (only queries same project)
  - Applies link count configuration
  - Handles edge cases (insufficient articles, missing money site URL)
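The selection behaviour described above might look roughly like the sketch below. It is not the actual `find_tiered_links()` implementation: the real function takes content records and repositories, whereas this hypothetical `pick_tiered_links()` assumes the money site URL and the lower-tier URLs have already been looked up. Raising `ValueError` for missing data and capping the count when too few articles exist are assumptions about how the edge cases are handled.

```python
import logging
import random
from typing import Dict, List, Optional

logger = logging.getLogger(__name__)


def pick_tiered_links(
    tier: int,
    money_site_url: Optional[str],
    lower_tier_urls: List[str],
    link_count_range: Dict[str, int],
    rng: Optional[random.Random] = None,
) -> dict:
    """Hypothetical helper: choose link targets for one batch of a given tier."""
    rng = rng or random.Random()

    if tier == 1:
        # Tier 1 articles link to the client's money site.
        if not money_site_url:
            raise ValueError("Project has no money_site_url; tier 1 cannot be linked")
        return {"tier": 1, "money_site_url": money_site_url}

    if not lower_tier_urls:
        raise ValueError(f"No tier {tier - 1} articles available for tier {tier}")

    # Pick how many links to use, then cap at what is actually available.
    wanted = rng.randint(link_count_range["min"], link_count_range["max"])
    if wanted > len(lower_tier_urls):
        logger.warning(
            "Wanted %d tier %d links but only %d available",
            wanted, tier - 1, len(lower_tier_urls),
        )
        wanted = len(lower_tier_urls)

    return {
        "tier": tier,
        "lower_tier": tier - 1,
        # sample() selects without replacement, so no URL is repeated.
        "lower_tier_urls": rng.sample(lower_tier_urls, wanted),
    }
```

Passing a seeded `random.Random` as `rng` makes the selection reproducible in tests.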
### 5. Tests
- **22 unit tests** in `tests/unit/test_tiered_links.py` - All passing
- **8 unit tests** in `tests/unit/test_article_link_repository.py` - All passing
- **9 integration tests** in `tests/integration/test_story_3_2_integration.py` - All passing
- **Total: 39 tests, all passing**
## Usage Examples
### Finding Tiered Links for Tier 1 Batch
```python
from src.interlinking.tiered_links import find_tiered_links
# Tier 1 articles link to the money site
result = find_tiered_links(tier1_content_records, job, project_repo, content_repo, site_repo)
# Returns: {
#     "tier": 1,
#     "money_site_url": "https://www.mymoneysite.com"
# }
```
### Finding Tiered Links for Tier 2 Batch
```python
# Tier 2 articles link to random tier 1 articles
result = find_tiered_links(tier2_content_records, job, project_repo, content_repo, site_repo)
# Returns: {
#     "tier": 2,
#     "lower_tier": 1,
#     "lower_tier_urls": [
#         "https://site1.b-cdn.net/article-1.html",
#         "https://site2.b-cdn.net/article-2.html",
#         "https://site3.b-cdn.net/article-3.html"
#     ]
# }
```
### Job Config with Custom Link Count
```json
{
  "jobs": [{
    "project_id": 1,
    "tiers": {
      "tier1": {"count": 5},
      "tier2": {"count": 10}
    },
    "tiered_link_count_range": {
      "min": 3,
      "max": 5
    }
  }]
}
```
### Recording Links in Database
```python
from src.database.repositories import ArticleLinkRepository

# `session` is an open database session created elsewhere;
# `tier1_article`, `tier2_article`, and `article` are existing content records.
link_repo = ArticleLinkRepository(session)

# Record tier 1 article linking to money site (external link: URL only)
link_repo.create(
    from_content_id=tier1_article.id,
    to_content_id=None,
    to_url="https://www.moneysite.com",
    link_type="tiered"
)

# Record tier 2 article linking to tier 1 article (internal link: content ID only)
link_repo.create(
    from_content_id=tier2_article.id,
    to_content_id=tier1_article.id,
    to_url=None,
    link_type="tiered"
)

# Query all links from an article
outbound_links = link_repo.get_by_source_article(article.id)
```
## Database Schema Changes
### Project Table
```sql
ALTER TABLE projects ADD COLUMN money_site_url VARCHAR(500) NULL;
CREATE INDEX idx_projects_money_site_url ON projects(money_site_url);
```
### Article Links Table (New)
```sql
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_content_id INTEGER NOT NULL,
    to_content_id INTEGER NULL,
    to_url TEXT NULL,
    link_type VARCHAR(20) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (from_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    FOREIGN KEY (to_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    CHECK (to_content_id IS NOT NULL OR to_url IS NOT NULL)
);
CREATE INDEX idx_article_links_from ON article_links(from_content_id);
CREATE INDEX idx_article_links_to ON article_links(to_content_id);
CREATE INDEX idx_article_links_type ON article_links(link_type);
```
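To see what the `CHECK` constraint and the cascading foreign keys do in practice, the schema can be exercised against an in-memory SQLite database. This is a sketch under assumptions: the `AUTOINCREMENT` syntax suggests SQLite, but the project's actual database setup is not shown here, and the one-column `generated_content` table is a simplified stand-in for the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE generated_content (
    id INTEGER PRIMARY KEY AUTOINCREMENT  -- simplified stand-in table
);
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_content_id INTEGER NOT NULL,
    to_content_id INTEGER NULL,
    to_url TEXT NULL,
    link_type VARCHAR(20) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (from_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    FOREIGN KEY (to_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
    CHECK (to_content_id IS NOT NULL OR to_url IS NOT NULL)
);
INSERT INTO generated_content DEFAULT VALUES;  -- article id 1
INSERT INTO generated_content DEFAULT VALUES;  -- article id 2
""")

# External link: article 1 points at a URL, no target article.
conn.execute(
    "INSERT INTO article_links (from_content_id, to_url, link_type) VALUES (?, ?, ?)",
    (1, "https://www.moneysite.com", "tiered"),
)
# Internal link: article 2 points at article 1.
conn.execute(
    "INSERT INTO article_links (from_content_id, to_content_id, link_type) VALUES (?, ?, ?)",
    (2, 1, "tiered"),
)

# A link with neither target violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO article_links (from_content_id, link_type) VALUES (?, ?)",
        (1, "tiered"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting article 1 cascades to links that start from it or point to it.
conn.execute("DELETE FROM generated_content WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM article_links").fetchone()[0]
```

Note that the cascade on `to_content_id` means deleting a tier 1 article also removes the link rows of the tier 2 articles that pointed at it.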
## Link Types
- `tiered` - Link from tier N to tier N-1 (or money site for tier 1)
- `wheel_next` - Link to next article in wheel structure
- `wheel_prev` - Link to previous article in wheel structure
- `homepage` - Link to site homepage
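One way to keep these values consistent in application code is a string enum. The `LinkType` class and `parse_link_type()` helper below are hypothetical and not part of the files listed in this summary; only the four string values come from the list above.

```python
from enum import Enum


class LinkType(str, Enum):
    """Hypothetical enum mirroring the link_type values stored in article_links."""

    TIERED = "tiered"
    WHEEL_NEXT = "wheel_next"
    WHEEL_PREV = "wheel_prev"
    HOMEPAGE = "homepage"


def parse_link_type(value: str) -> LinkType:
    """Convert a raw string to a LinkType, with a readable error for unknown values."""
    try:
        return LinkType(value)
    except ValueError:
        allowed = ", ".join(t.value for t in LinkType)
        raise ValueError(
            f"Unknown link_type {value!r}; expected one of: {allowed}"
        ) from None
```

Because the enum subclasses `str`, its members compare equal to the plain strings, for example `LinkType.TIERED == "tiered"`.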
## Key Features
1. **Project Isolation** - Only queries articles from the same project
2. **Random Selection** - Randomly selects articles within configured range
3. **Flexible Configuration** - Supports both range (min-max) and exact counts
4. **Error Handling** - Clear error messages for missing data
5. **Warning Logs** - Logs warnings when fewer articles available than requested
6. **URL Generation** - Integrates with Story 3.1 URL generation
## Next Steps (Future Stories)
- Story 3.3 will use `find_tiered_links()` for actual content injection
- Story 3.3 will populate `article_links` table with wheel and homepage links
- Story 4.2 will log tiered links after deployment
- Future: Analytics dashboard using `article_links` data
## Files Created/Modified
### Created
- `src/interlinking/tiered_links.py`
- `tests/unit/test_tiered_links.py`
- `tests/unit/test_article_link_repository.py`
- `tests/integration/test_story_3_2_integration.py`
- `jobs/example_story_3.2_tiered_links.json`
- `STORY_3.2_IMPLEMENTATION_SUMMARY.md` (this file)
### Modified
- `src/database/models.py` - Added `money_site_url` to Project, added `ArticleLink` model
- `src/database/interfaces.py` - Added `IArticleLinkRepository` interface
- `src/database/repositories.py` - Extended `ProjectRepository`, added `ArticleLinkRepository`
- `src/generation/job_config.py` - Added `tiered_link_count_range` to Job config
## Test Coverage
All acceptance criteria from the story are covered by tests:
- Tier 1 returns money site URL
- Tier 2+ queries lower tier from same project
- Custom link count ranges work
- Error handling for missing data
- Warning logs for insufficient articles
- ArticleLink CRUD operations
- Integration with URL generation