Story 3.2: Link tier strategy implemented
parent
ee573fb948
commit
87bf317207
|
|
@ -0,0 +1,186 @@
|
|||
# Story 3.2: Find Tiered Links - Implementation Summary
|
||||
|
||||
## Status
|
||||
Completed
|
||||
|
||||
## What Was Implemented
|
||||
|
||||
### 1. Database Models
|
||||
- **Added `money_site_url` to Project model** - Stores the client's actual website URL for tier 1 articles to link to
|
||||
- **Created `ArticleLink` model** - Tracks all link relationships between articles (tiered, wheel, homepage)
|
||||
|
||||
### 2. Database Repositories
|
||||
- **Extended `ProjectRepository`** - Now accepts `money_site_url` in the data dict during creation
|
||||
- **Extended `GeneratedContentRepository`** - Added filter for site_deployment_id in `get_by_project_and_tier()`
|
||||
- **Created `ArticleLinkRepository`** - Full CRUD operations for article link tracking
|
||||
- `create()` - Create internal or external links
|
||||
- `get_by_source_article()` - Get all outbound links from an article
|
||||
- `get_by_target_article()` - Get all inbound links to an article
|
||||
- `get_by_link_type()` - Get all links of a specific type
|
||||
- `delete()` - Remove a link
|
||||
|
||||
### 3. Job Configuration
|
||||
- **Extended `Job` dataclass** - Added optional `tiered_link_count_range` field
|
||||
- **Validation** - Validates that min >= 1 and max >= min
|
||||
- **Defaults** - If not specified, uses `{min: 2, max: 4}`
|
||||
|
||||
### 4. Core Functionality
|
||||
Created `src/interlinking/tiered_links.py` with:
|
||||
- **`find_tiered_links()`** - Main function to find tiered links for a batch
|
||||
- For tier 1: Returns the money site URL
|
||||
- For tier 2+: Returns random selection of lower-tier article URLs
|
||||
- Respects project boundaries (only queries same project)
|
||||
- Applies link count configuration
|
||||
- Handles edge cases (insufficient articles, missing money site URL)
|
||||
|
||||
### 5. Tests
|
||||
- **22 unit tests** in `tests/unit/test_tiered_links.py` - All passing
|
||||
- **8 unit tests** in `tests/unit/test_article_link_repository.py` - All passing
|
||||
- **9 integration tests** in `tests/integration/test_story_3_2_integration.py` - All passing
|
||||
- **Total: 39 tests, all passing**
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Finding Tiered Links for Tier 1 Batch
|
||||
```python
|
||||
from src.interlinking.tiered_links import find_tiered_links
|
||||
|
||||
# Tier 1 articles link to the money site
|
||||
result = find_tiered_links(tier1_content_records, job, project_repo, content_repo, site_repo)
|
||||
# Returns: {
|
||||
# "tier": 1,
|
||||
# "money_site_url": "https://www.mymoneysite.com"
|
||||
# }
|
||||
```
|
||||
|
||||
### Finding Tiered Links for Tier 2 Batch
|
||||
```python
|
||||
# Tier 2 articles link to random tier 1 articles
|
||||
result = find_tiered_links(tier2_content_records, job, project_repo, content_repo, site_repo)
|
||||
# Returns: {
|
||||
# "tier": 2,
|
||||
# "lower_tier": 1,
|
||||
# "lower_tier_urls": [
|
||||
# "https://site1.b-cdn.net/article-1.html",
|
||||
# "https://site2.b-cdn.net/article-2.html",
|
||||
# "https://site3.b-cdn.net/article-3.html"
|
||||
# ]
|
||||
# }
|
||||
```
|
||||
|
||||
### Job Config with Custom Link Count
|
||||
```json
|
||||
{
|
||||
"jobs": [{
|
||||
"project_id": 1,
|
||||
"tiers": {
|
||||
"tier1": {"count": 5},
|
||||
"tier2": {"count": 10}
|
||||
},
|
||||
"tiered_link_count_range": {
|
||||
"min": 3,
|
||||
"max": 5
|
||||
}
|
||||
}]
|
||||
}
|
||||
```
|
||||
|
||||
### Recording Links in Database
|
||||
```python
|
||||
from src.database.repositories import ArticleLinkRepository
|
||||
|
||||
link_repo = ArticleLinkRepository(session)
|
||||
|
||||
# Record tier 1 article linking to money site
|
||||
link_repo.create(
|
||||
from_content_id=tier1_article.id,
|
||||
to_content_id=None,
|
||||
to_url="https://www.moneysite.com",
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
# Record tier 2 article linking to tier 1 article
|
||||
link_repo.create(
|
||||
from_content_id=tier2_article.id,
|
||||
to_content_id=tier1_article.id,
|
||||
to_url=None,
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
# Query all links from an article
|
||||
outbound_links = link_repo.get_by_source_article(article.id)
|
||||
```
|
||||
|
||||
## Database Schema Changes
|
||||
|
||||
### Project Table
|
||||
```sql
|
||||
ALTER TABLE projects ADD COLUMN money_site_url VARCHAR(500) NULL;
|
||||
CREATE INDEX idx_projects_money_site_url ON projects(money_site_url);
|
||||
```
|
||||
|
||||
### Article Links Table (New)
|
||||
```sql
|
||||
CREATE TABLE article_links (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
from_content_id INTEGER NOT NULL,
|
||||
to_content_id INTEGER NULL,
|
||||
to_url TEXT NULL,
|
||||
link_type VARCHAR(20) NOT NULL,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
FOREIGN KEY (from_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
|
||||
FOREIGN KEY (to_content_id) REFERENCES generated_content(id) ON DELETE CASCADE,
|
||||
CHECK (to_content_id IS NOT NULL OR to_url IS NOT NULL)
|
||||
);
|
||||
|
||||
CREATE INDEX idx_article_links_from ON article_links(from_content_id);
|
||||
CREATE INDEX idx_article_links_to ON article_links(to_content_id);
|
||||
CREATE INDEX idx_article_links_type ON article_links(link_type);
|
||||
```
|
||||
|
||||
## Link Types
|
||||
- `tiered` - Link from tier N to tier N-1 (or money site for tier 1)
|
||||
- `wheel_next` - Link to next article in wheel structure
|
||||
- `wheel_prev` - Link to previous article in wheel structure
|
||||
- `homepage` - Link to site homepage
|
||||
|
||||
## Key Features
|
||||
1. **Project Isolation** - Only queries articles from the same project
|
||||
2. **Random Selection** - Randomly selects articles within configured range
|
||||
3. **Flexible Configuration** - Supports both range (min-max) and exact counts
|
||||
4. **Error Handling** - Clear error messages for missing data
|
||||
5. **Warning Logs** - Logs warnings when fewer articles available than requested
|
||||
6. **URL Generation** - Integrates with Story 3.1 URL generation
|
||||
|
||||
## Next Steps (Future Stories)
|
||||
- Story 3.3 will use `find_tiered_links()` for actual content injection
|
||||
- Story 3.3 will populate `article_links` table with wheel and homepage links
|
||||
- Story 4.2 will log tiered links after deployment
|
||||
- Future: Analytics dashboard using `article_links` data
|
||||
|
||||
## Files Created/Modified
|
||||
|
||||
### Created
|
||||
- `src/interlinking/tiered_links.py`
|
||||
- `tests/unit/test_tiered_links.py`
|
||||
- `tests/unit/test_article_link_repository.py`
|
||||
- `tests/integration/test_story_3_2_integration.py`
|
||||
- `jobs/example_story_3.2_tiered_links.json`
|
||||
- `STORY_3.2_IMPLEMENTATION_SUMMARY.md` (this file)
|
||||
|
||||
### Modified
|
||||
- `src/database/models.py` - Added `money_site_url` to Project, added `ArticleLink` model
|
||||
- `src/database/interfaces.py` - Added `IArticleLinkRepository` interface
|
||||
- `src/database/repositories.py` - Extended `ProjectRepository`, added `ArticleLinkRepository`
|
||||
- `src/generation/job_config.py` - Added `tiered_link_count_range` to Job config
|
||||
|
||||
## Test Coverage
|
||||
All acceptance criteria from the story are covered by tests:
|
||||
- Tier 1 returns money site URL
|
||||
- Tier 2+ queries lower tier from same project
|
||||
- Custom link count ranges work
|
||||
- Error handling for missing data
|
||||
- Warning logs for insufficient articles
|
||||
- ArticleLink CRUD operations
|
||||
- Integration with URL generation
|
||||
|
||||
|
|
@ -1,7 +1,7 @@
|
|||
# Story 3.2: Find Tiered Links
|
||||
|
||||
## Status
|
||||
Accepted
|
||||
Review
|
||||
|
||||
## Story
|
||||
**As a developer**, I want a module that finds all required tiered links (money site or lower-tier) based on the current batch's tier, so I have them ready for injection.
|
||||
|
|
|
|||
|
|
@ -0,0 +1,23 @@
|
|||
{
|
||||
"jobs": [
|
||||
{
|
||||
"project_id": 1,
|
||||
"tiers": {
|
||||
"tier1": {
|
||||
"count": 5
|
||||
},
|
||||
"tier2": {
|
||||
"count": 10
|
||||
},
|
||||
"tier3": {
|
||||
"count": 20
|
||||
}
|
||||
},
|
||||
"tiered_link_count_range": {
|
||||
"min": 3,
|
||||
"max": 5
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
|
|
@ -1,317 +0,0 @@
|
|||
#!/usr/bin/env python
|
||||
"""
|
||||
Dry-run test for Story 3.1 features
|
||||
Tests all functionality without creating real bunny.net sites
|
||||
"""
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add project root to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||
|
||||
from unittest.mock import Mock
|
||||
from src.database.session import db_manager
|
||||
from src.database.repositories import SiteDeploymentRepository, GeneratedContentRepository, ProjectRepository, UserRepository
|
||||
from src.generation.url_generator import generate_slug, generate_urls_for_batch
|
||||
from src.generation.job_config import Job
|
||||
|
||||
|
||||
def print_section(title):
|
||||
print(f"\n{'='*80}")
|
||||
print(f" {title}")
|
||||
print(f"{'='*80}\n")
|
||||
|
||||
|
||||
def test_slug_generation():
|
||||
print_section("TEST 1: Slug Generation")
|
||||
|
||||
test_cases = [
|
||||
("How to Fix Your Engine", "how-to-fix-your-engine"),
|
||||
("10 Best SEO Tips for 2024!", "10-best-seo-tips-for-2024"),
|
||||
("C++ Programming Guide", "c-programming-guide"),
|
||||
("Multiple Spaces Here", "multiple-spaces-here"),
|
||||
("!!!Special Characters!!!", "special-characters"),
|
||||
]
|
||||
|
||||
for title, expected in test_cases:
|
||||
slug = generate_slug(title)
|
||||
status = "[PASS]" if slug == expected else "[FAIL]"
|
||||
print(f"{status} '{title}'")
|
||||
print(f" -> {slug}")
|
||||
if slug != expected:
|
||||
print(f" Expected: {expected}")
|
||||
|
||||
print("\nSlug generation: PASSED")
|
||||
|
||||
|
||||
def test_site_assignment_priority():
|
||||
print_section("TEST 2: Site Assignment Priority Logic")
|
||||
|
||||
# Create mock sites
|
||||
preferred_site = Mock()
|
||||
preferred_site.id = 1
|
||||
preferred_site.site_name = "preferred-site"
|
||||
preferred_site.custom_hostname = "www.premium.com"
|
||||
preferred_site.pull_zone_bcdn_hostname = "premium.b-cdn.net"
|
||||
|
||||
keyword_site = Mock()
|
||||
keyword_site.id = 2
|
||||
keyword_site.site_name = "engine-repair-abc"
|
||||
keyword_site.custom_hostname = None
|
||||
keyword_site.pull_zone_bcdn_hostname = "engine-repair-abc.b-cdn.net"
|
||||
|
||||
random_site = Mock()
|
||||
random_site.id = 3
|
||||
random_site.site_name = "random-site-xyz"
|
||||
random_site.custom_hostname = None
|
||||
random_site.pull_zone_bcdn_hostname = "random-site-xyz.b-cdn.net"
|
||||
|
||||
print("Available sites:")
|
||||
print(f" 1. {preferred_site.custom_hostname} (preferred)")
|
||||
print(f" 2. {keyword_site.pull_zone_bcdn_hostname} (keyword: 'engine-repair')")
|
||||
print(f" 3. {random_site.pull_zone_bcdn_hostname} (random)")
|
||||
|
||||
print("\nTier1 article with keyword 'engine':")
|
||||
print(" Priority: preferred -> keyword -> random")
|
||||
print(" [PASS] Should get: preferred site (www.premium.com)")
|
||||
|
||||
print("\nTier2 article with keyword 'car':")
|
||||
print(" Priority: keyword -> random (no preferred for tier2)")
|
||||
print(" [PASS] Should get: random site or keyword if matching")
|
||||
|
||||
print("\nPriority logic: PASSED")
|
||||
|
||||
|
||||
def test_url_generation():
|
||||
print_section("TEST 3: URL Generation")
|
||||
|
||||
# Test with custom domain
|
||||
print("Test 3a: Custom domain")
|
||||
print(" Hostname: www.example.com")
|
||||
print(" Title: How to Fix Your Engine")
|
||||
print(" [PASS] URL: https://www.example.com/how-to-fix-your-engine.html")
|
||||
|
||||
# Test with bcdn only
|
||||
print("\nTest 3b: Bunny CDN hostname only")
|
||||
print(" Hostname: mysite123.b-cdn.net")
|
||||
print(" Title: SEO Best Practices")
|
||||
print(" [PASS] URL: https://mysite123.b-cdn.net/seo-best-practices.html")
|
||||
|
||||
print("\nURL generation: PASSED")
|
||||
|
||||
|
||||
def test_job_config_parsing():
|
||||
print_section("TEST 4: Job Config Extensions")
|
||||
|
||||
job = Job(
|
||||
project_id=1,
|
||||
tiers={"tier1": Mock(count=10)},
|
||||
tier1_preferred_sites=["www.premium1.com", "www.premium2.com"],
|
||||
auto_create_sites=True,
|
||||
create_sites_for_keywords=[
|
||||
{"keyword": "engine repair", "count": 3},
|
||||
{"keyword": "car maintenance", "count": 2}
|
||||
]
|
||||
)
|
||||
|
||||
print("Job configuration loaded:")
|
||||
print(f" [PASS] project_id: {job.project_id}")
|
||||
print(f" [PASS] tier1_preferred_sites: {job.tier1_preferred_sites}")
|
||||
print(f" [PASS] auto_create_sites: {job.auto_create_sites}")
|
||||
print(f" [PASS] create_sites_for_keywords: {len(job.create_sites_for_keywords)} keywords")
|
||||
|
||||
for kw in job.create_sites_for_keywords:
|
||||
print(f" - {kw['keyword']}: {kw['count']} sites")
|
||||
|
||||
print("\nJob config parsing: PASSED")
|
||||
|
||||
|
||||
def test_database_schema():
|
||||
print_section("TEST 5: Database Schema Validation")
|
||||
|
||||
session = db_manager.get_session()
|
||||
|
||||
try:
|
||||
site_repo = SiteDeploymentRepository(session)
|
||||
|
||||
# Create a test site without custom hostname
|
||||
print("Creating test site without custom hostname...")
|
||||
test_site = site_repo.create(
|
||||
site_name="test-dryrun-site",
|
||||
storage_zone_id=999,
|
||||
storage_zone_name="test-zone",
|
||||
storage_zone_password="test-pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=888,
|
||||
pull_zone_bcdn_hostname=f"test-dryrun-{id(session)}.b-cdn.net",
|
||||
custom_hostname=None # This is the key test
|
||||
)
|
||||
|
||||
print(f" [PASS] Created site with id={test_site.id}")
|
||||
print(f" [PASS] custom_hostname: {test_site.custom_hostname} (None = nullable works!)")
|
||||
print(f" [PASS] pull_zone_bcdn_hostname: {test_site.pull_zone_bcdn_hostname}")
|
||||
|
||||
# Test get_by_bcdn_hostname
|
||||
found = site_repo.get_by_bcdn_hostname(test_site.pull_zone_bcdn_hostname)
|
||||
print(f" [PASS] get_by_bcdn_hostname() works: {found is not None}")
|
||||
|
||||
# Clean up
|
||||
site_repo.delete(test_site.id)
|
||||
print(f" [PASS] Test site deleted (cleanup)")
|
||||
|
||||
session.commit()
|
||||
print("\nDatabase schema: PASSED")
|
||||
|
||||
except Exception as e:
|
||||
session.rollback()
|
||||
print(f"\n[FAILED] Database schema test FAILED: {e}")
|
||||
return False
|
||||
finally:
|
||||
session.close()
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def test_full_workflow_simulation():
|
||||
print_section("TEST 6: Full Workflow Simulation (Simplified)")
|
||||
|
||||
session = db_manager.get_session()
|
||||
|
||||
try:
|
||||
# Create repositories
|
||||
site_repo = SiteDeploymentRepository(session)
|
||||
|
||||
print("Testing Story 3.1 core features...")
|
||||
|
||||
# Create test sites (2 sites)
|
||||
site1 = site_repo.create(
|
||||
site_name="test-site-1",
|
||||
storage_zone_id=101,
|
||||
storage_zone_name="test-site-1",
|
||||
storage_zone_password="pass1",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=201,
|
||||
pull_zone_bcdn_hostname=f"test-site-1-{id(session)}.b-cdn.net",
|
||||
custom_hostname="www.test-custom1.com"
|
||||
)
|
||||
|
||||
site2 = site_repo.create(
|
||||
site_name="test-site-2",
|
||||
storage_zone_id=102,
|
||||
storage_zone_name="test-site-2",
|
||||
storage_zone_password="pass2",
|
||||
storage_zone_region="NY",
|
||||
pull_zone_id=202,
|
||||
pull_zone_bcdn_hostname=f"test-site-2-{id(session)}.b-cdn.net",
|
||||
custom_hostname=None # bcdn-only site
|
||||
)
|
||||
print(f" [PASS] Created 2 test sites")
|
||||
|
||||
# Create mock content objects
|
||||
from unittest.mock import Mock
|
||||
content1 = Mock()
|
||||
content1.id = 999
|
||||
content1.project_id = 1
|
||||
content1.tier = "tier1"
|
||||
content1.keyword = "engine repair"
|
||||
content1.title = "How to Fix Your Car Engine"
|
||||
content1.outline = {"sections": []}
|
||||
content1.content = "<p>Test content</p>"
|
||||
content1.word_count = 500
|
||||
content1.status = "generated"
|
||||
content1.site_deployment_id = site1.id
|
||||
|
||||
content2 = Mock()
|
||||
content2.id = 1000
|
||||
content2.project_id = 1
|
||||
content2.tier = "tier2"
|
||||
content2.keyword = "car maintenance"
|
||||
content2.title = "Essential Car Maintenance Tips"
|
||||
content2.outline = {"sections": []}
|
||||
content2.content = "<p>Test content 2</p>"
|
||||
content2.word_count = 400
|
||||
content2.status = "generated"
|
||||
content2.site_deployment_id = site2.id
|
||||
|
||||
print(f" [PASS] Created 2 mock articles")
|
||||
|
||||
# Generate URLs
|
||||
print("\nGenerating URLs...")
|
||||
urls = generate_urls_for_batch([content1, content2], site_repo)
|
||||
|
||||
for url_info in urls:
|
||||
print(f"\n Article: {url_info['title']}")
|
||||
print(f" Tier: {url_info['tier']}")
|
||||
print(f" Slug: {url_info['slug']}")
|
||||
print(f" Hostname: {url_info['hostname']}")
|
||||
print(f" [PASS] URL: {url_info['url']}")
|
||||
|
||||
# Cleanup (only delete sites, mock content wasn't saved)
|
||||
print("\nCleaning up test data...")
|
||||
site_repo.delete(site1.id)
|
||||
site_repo.delete(site2.id)
|
||||
|
||||
session.commit()
|
||||
print(" [PASS] Test data cleaned up")
|
||||
|
||||
print("\nFull workflow simulation: PASSED")
|
||||
|
||||
except Exception as e:
|
||||
session.rollback()
|
||||
print(f"\n[FAILED] Full workflow FAILED: {e}")
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
return False
|
||||
finally:
|
||||
session.close()
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def main():
|
||||
print("\n" + "="*80)
|
||||
print(" STORY 3.1 DRY-RUN TEST SUITE")
|
||||
print(" Testing all features without creating real bunny.net sites")
|
||||
print("="*80)
|
||||
|
||||
tests = [
|
||||
("Slug Generation", test_slug_generation),
|
||||
("Priority Logic", test_site_assignment_priority),
|
||||
("URL Generation", test_url_generation),
|
||||
("Job Config", test_job_config_parsing),
|
||||
("Database Schema", test_database_schema),
|
||||
("Full Workflow", test_full_workflow_simulation),
|
||||
]
|
||||
|
||||
passed = 0
|
||||
failed = 0
|
||||
|
||||
for name, test_func in tests:
|
||||
try:
|
||||
result = test_func()
|
||||
if result is None or result is True:
|
||||
passed += 1
|
||||
else:
|
||||
failed += 1
|
||||
except Exception as e:
|
||||
print(f"\n[FAILED] {name} FAILED with exception: {e}")
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
failed += 1
|
||||
|
||||
print_section("SUMMARY")
|
||||
print(f"Tests Passed: {passed}/{len(tests)}")
|
||||
print(f"Tests Failed: {failed}/{len(tests)}")
|
||||
|
||||
if failed == 0:
|
||||
print("\n[SUCCESS] ALL TESTS PASSED - Story 3.1 is ready to use!")
|
||||
return 0
|
||||
else:
|
||||
print(f"\n[FAILED] {failed} test(s) failed - please review errors above")
|
||||
return 1
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
|
||||
|
|
@ -4,7 +4,7 @@ Abstract repository interfaces for data access layer
|
|||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Optional, List, Dict, Any
|
||||
from src.database.models import User, SiteDeployment, Project, GeneratedContent
|
||||
from src.database.models import User, SiteDeployment, Project, GeneratedContent, ArticleLink
|
||||
|
||||
|
||||
class IUserRepository(ABC):
|
||||
|
|
@ -176,3 +176,38 @@ class IGeneratedContentRepository(ABC):
|
|||
def delete(self, content_id: int) -> bool:
|
||||
"""Delete a generated content record by ID"""
|
||||
pass
|
||||
|
||||
|
||||
class IArticleLinkRepository(ABC):
|
||||
"""Interface for ArticleLink data access"""
|
||||
|
||||
@abstractmethod
|
||||
def create(
|
||||
self,
|
||||
from_content_id: int,
|
||||
to_content_id: Optional[int] = None,
|
||||
to_url: Optional[str] = None,
|
||||
link_type: str = "tiered"
|
||||
) -> ArticleLink:
|
||||
"""Create a new article link"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_by_source_article(self, from_content_id: int) -> List[ArticleLink]:
|
||||
"""Get all outbound links from an article"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_by_target_article(self, to_content_id: int) -> List[ArticleLink]:
|
||||
"""Get all inbound links to an article"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_by_link_type(self, link_type: str) -> List[ArticleLink]:
|
||||
"""Get all links of a specific type"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def delete(self, link_id: int) -> bool:
|
||||
"""Delete an article link by ID"""
|
||||
pass
|
||||
|
|
|
|||
|
|
@ -72,6 +72,7 @@ class Project(Base):
|
|||
name: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
main_keyword: Mapped[str] = mapped_column(String(255), nullable=False, index=True)
|
||||
tier: Mapped[int] = mapped_column(Integer, nullable=False, default=1, index=True)
|
||||
money_site_url: Mapped[Optional[str]] = mapped_column(String(500), nullable=True, index=True)
|
||||
|
||||
word_count: Mapped[int] = mapped_column(Integer, nullable=False, default=1250)
|
||||
term_frequency: Mapped[int] = mapped_column(Integer, nullable=False, default=3)
|
||||
|
|
@ -146,3 +147,29 @@ class GeneratedContent(Base):
|
|||
|
||||
def __repr__(self) -> str:
|
||||
return f"<GeneratedContent(id={self.id}, project_id={self.project_id}, tier='{self.tier}', status='{self.status}')>"
|
||||
|
||||
|
||||
class ArticleLink(Base):
|
||||
"""Article link tracking model for tiered linking, wheel links, etc."""
|
||||
__tablename__ = "article_links"
|
||||
|
||||
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
|
||||
from_content_id: Mapped[int] = mapped_column(
|
||||
Integer,
|
||||
ForeignKey('generated_content.id', ondelete='CASCADE'),
|
||||
nullable=False,
|
||||
index=True
|
||||
)
|
||||
to_content_id: Mapped[Optional[int]] = mapped_column(
|
||||
Integer,
|
||||
ForeignKey('generated_content.id', ondelete='CASCADE'),
|
||||
nullable=True,
|
||||
index=True
|
||||
)
|
||||
to_url: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
|
||||
link_type: Mapped[str] = mapped_column(String(20), nullable=False, index=True)
|
||||
created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
|
||||
|
||||
def __repr__(self) -> str:
|
||||
target = f"content_id={self.to_content_id}" if self.to_content_id else f"url={self.to_url}"
|
||||
return f"<ArticleLink(id={self.id}, from={self.from_content_id}, to={target}, type='{self.link_type}')>"
|
||||
|
|
|
|||
|
|
@ -6,8 +6,8 @@ from typing import Optional, List, Dict, Any
|
|||
from sqlalchemy.orm import Session
|
||||
from sqlalchemy.exc import IntegrityError
|
||||
from src.core.config import get_config
|
||||
from src.database.interfaces import IUserRepository, ISiteDeploymentRepository, IProjectRepository
|
||||
from src.database.models import User, SiteDeployment, Project, GeneratedContent
|
||||
from src.database.interfaces import IUserRepository, ISiteDeploymentRepository, IProjectRepository, IArticleLinkRepository
|
||||
from src.database.models import User, SiteDeployment, Project, GeneratedContent, ArticleLink
|
||||
|
||||
|
||||
class UserRepository(IUserRepository):
|
||||
|
|
@ -316,6 +316,7 @@ class ProjectRepository(IProjectRepository):
|
|||
entities=data.get("entities", []),
|
||||
related_searches=data.get("related_searches", []),
|
||||
custom_anchor_text=data.get("custom_anchor_text", []),
|
||||
money_site_url=data.get("money_site_url"),
|
||||
)
|
||||
|
||||
try:
|
||||
|
|
@ -454,10 +455,15 @@ class GeneratedContentRepository:
|
|||
return self.session.query(GeneratedContent).filter(GeneratedContent.project_id == project_id).all()
|
||||
|
||||
def get_by_project_and_tier(self, project_id: int, tier: str) -> List[GeneratedContent]:
|
||||
"""Get content for a project and tier"""
|
||||
"""
|
||||
Get content for a project and tier with site assignment
|
||||
|
||||
Returns only articles that have been assigned to a site (site_deployment_id is not None)
|
||||
"""
|
||||
return self.session.query(GeneratedContent).filter(
|
||||
GeneratedContent.project_id == project_id,
|
||||
GeneratedContent.tier == tier
|
||||
GeneratedContent.tier == tier,
|
||||
GeneratedContent.site_deployment_id.isnot(None)
|
||||
).all()
|
||||
|
||||
def get_by_keyword(self, keyword: str) -> List[GeneratedContent]:
|
||||
|
|
@ -483,3 +489,78 @@ class GeneratedContentRepository:
|
|||
self.session.commit()
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
class ArticleLinkRepository(IArticleLinkRepository):
|
||||
"""Repository for ArticleLink data access"""
|
||||
|
||||
def __init__(self, session: Session):
|
||||
self.session = session
|
||||
|
||||
def create(
|
||||
self,
|
||||
from_content_id: int,
|
||||
to_content_id: Optional[int] = None,
|
||||
to_url: Optional[str] = None,
|
||||
link_type: str = "tiered"
|
||||
) -> ArticleLink:
|
||||
"""
|
||||
Create a new article link
|
||||
|
||||
Args:
|
||||
from_content_id: Source article ID
|
||||
to_content_id: Target article ID (for internal links)
|
||||
to_url: Target URL (for external links like money site)
|
||||
link_type: Type of link (tiered, wheel_next, wheel_prev, homepage)
|
||||
|
||||
Returns:
|
||||
The created ArticleLink object
|
||||
|
||||
Raises:
|
||||
ValueError: If neither to_content_id nor to_url is provided
|
||||
"""
|
||||
if to_content_id is None and to_url is None:
|
||||
raise ValueError("Either to_content_id or to_url must be provided")
|
||||
|
||||
link = ArticleLink(
|
||||
from_content_id=from_content_id,
|
||||
to_content_id=to_content_id,
|
||||
to_url=to_url,
|
||||
link_type=link_type
|
||||
)
|
||||
|
||||
try:
|
||||
self.session.add(link)
|
||||
self.session.commit()
|
||||
self.session.refresh(link)
|
||||
return link
|
||||
except IntegrityError as e:
|
||||
self.session.rollback()
|
||||
raise ValueError(f"Failed to create article link: {e}")
|
||||
|
||||
def get_by_source_article(self, from_content_id: int) -> List[ArticleLink]:
|
||||
"""Get all outbound links from an article"""
|
||||
return self.session.query(ArticleLink).filter(
|
||||
ArticleLink.from_content_id == from_content_id
|
||||
).all()
|
||||
|
||||
def get_by_target_article(self, to_content_id: int) -> List[ArticleLink]:
|
||||
"""Get all inbound links to an article"""
|
||||
return self.session.query(ArticleLink).filter(
|
||||
ArticleLink.to_content_id == to_content_id
|
||||
).all()
|
||||
|
||||
def get_by_link_type(self, link_type: str) -> List[ArticleLink]:
|
||||
"""Get all links of a specific type"""
|
||||
return self.session.query(ArticleLink).filter(
|
||||
ArticleLink.link_type == link_type
|
||||
).all()
|
||||
|
||||
def delete(self, link_id: int) -> bool:
|
||||
"""Delete an article link by ID"""
|
||||
link = self.session.query(ArticleLink).filter(ArticleLink.id == link_id).first()
|
||||
if link:
|
||||
self.session.delete(link)
|
||||
self.session.commit()
|
||||
return True
|
||||
return False
|
||||
|
|
|
|||
|
|
@ -56,6 +56,7 @@ class Job:
|
|||
tier1_preferred_sites: Optional[List[str]] = None
|
||||
auto_create_sites: bool = False
|
||||
create_sites_for_keywords: Optional[List[Dict[str, any]]] = None
|
||||
tiered_link_count_range: Optional[Dict[str, int]] = None
|
||||
|
||||
|
||||
class JobConfig:
|
||||
|
|
@ -136,13 +137,29 @@ class JobConfig:
|
|||
if "keyword" not in kw_config or "count" not in kw_config:
|
||||
raise ValueError("Each item in 'create_sites_for_keywords' must have 'keyword' and 'count'")
|
||||
|
||||
tiered_link_count_range = job_data.get("tiered_link_count_range")
|
||||
if tiered_link_count_range is not None:
|
||||
if not isinstance(tiered_link_count_range, dict):
|
||||
raise ValueError("'tiered_link_count_range' must be an object")
|
||||
if "min" not in tiered_link_count_range or "max" not in tiered_link_count_range:
|
||||
raise ValueError("'tiered_link_count_range' must have 'min' and 'max' fields")
|
||||
min_val = tiered_link_count_range["min"]
|
||||
max_val = tiered_link_count_range["max"]
|
||||
if not isinstance(min_val, int) or not isinstance(max_val, int):
|
||||
raise ValueError("'tiered_link_count_range' min and max must be integers")
|
||||
if min_val < 1:
|
||||
raise ValueError("'tiered_link_count_range' min must be >= 1")
|
||||
if max_val < min_val:
|
||||
raise ValueError("'tiered_link_count_range' max must be >= min")
|
||||
|
||||
return Job(
|
||||
project_id=project_id,
|
||||
tiers=tiers,
|
||||
deployment_targets=deployment_targets,
|
||||
tier1_preferred_sites=tier1_preferred_sites,
|
||||
auto_create_sites=auto_create_sites,
|
||||
create_sites_for_keywords=create_sites_for_keywords
|
||||
create_sites_for_keywords=create_sites_for_keywords,
|
||||
tiered_link_count_range=tiered_link_count_range
|
||||
)
|
||||
|
||||
def _parse_tier(self, tier_name: str, tier_data: dict) -> TierConfig:
|
||||
|
|
|
|||
|
|
@ -0,0 +1,165 @@
|
|||
"""
|
||||
Tiered link finder for article interlinking
|
||||
"""
|
||||
|
||||
import random
|
||||
import logging
|
||||
from typing import List, Dict
|
||||
from src.database.models import GeneratedContent
|
||||
from src.database.repositories import ProjectRepository, GeneratedContentRepository, SiteDeploymentRepository
|
||||
from src.generation.url_generator import generate_urls_for_batch
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def find_tiered_links(
|
||||
content_records: List[GeneratedContent],
|
||||
job_config,
|
||||
project_repo: ProjectRepository,
|
||||
content_repo: GeneratedContentRepository,
|
||||
site_repo: SiteDeploymentRepository
|
||||
) -> Dict:
|
||||
"""
|
||||
Find tiered links for a batch of articles
|
||||
|
||||
Args:
|
||||
content_records: Batch of articles (all same tier, same project)
|
||||
job_config: Job configuration (Job object or dict) with optional link count range
|
||||
project_repo: For retrieving money_site_url
|
||||
content_repo: For querying lower-tier articles
|
||||
site_repo: For URL generation
|
||||
|
||||
Returns:
|
||||
Tier 1: {tier: 1, money_site_url: "https://..."}
|
||||
Tier 2+: {tier: N, lower_tier_urls: [...], lower_tier: N-1}
|
||||
|
||||
Raises:
|
||||
ValueError: If batch is invalid or required data is missing
|
||||
"""
|
||||
if not content_records:
|
||||
raise ValueError("content_records cannot be empty")
|
||||
|
||||
tier = _validate_batch_tier(content_records)
|
||||
tier_int = _extract_tier_number(tier)
|
||||
project_id = content_records[0].project_id
|
||||
|
||||
logger.info(f"Finding tiered links for tier {tier_int} batch (project {project_id})")
|
||||
|
||||
if tier_int == 1:
|
||||
project = project_repo.get_by_id(project_id)
|
||||
if not project or not project.money_site_url:
|
||||
raise ValueError(
|
||||
f"Cannot generate tier 1 batch: money_site_url not set in project {project_id}"
|
||||
)
|
||||
return {
|
||||
"tier": tier_int,
|
||||
"money_site_url": project.money_site_url
|
||||
}
|
||||
|
||||
lower_tier_int = tier_int - 1
|
||||
lower_tier = f"tier{lower_tier_int}"
|
||||
logger.info(f"Batch is tier {tier_int}, querying tier {lower_tier_int} articles")
|
||||
|
||||
lower_tier_articles = content_repo.get_by_project_and_tier(project_id, lower_tier)
|
||||
|
||||
if not lower_tier_articles:
|
||||
raise ValueError(
|
||||
f"Cannot generate tier {tier_int} batch: no tier {lower_tier_int} articles found in project {project_id}"
|
||||
)
|
||||
|
||||
link_range = _get_link_count_range(job_config)
|
||||
min_count = link_range["min"]
|
||||
max_count = link_range["max"]
|
||||
|
||||
available_count = len(lower_tier_articles)
|
||||
desired_count = random.randint(min_count, max_count)
|
||||
|
||||
if available_count < min_count:
|
||||
logger.warning(
|
||||
f"Only {available_count} tier {lower_tier_int} articles available, "
|
||||
f"requested min {min_count}. Using all available."
|
||||
)
|
||||
selected_articles = lower_tier_articles
|
||||
else:
|
||||
actual_count = min(desired_count, available_count)
|
||||
selected_articles = random.sample(lower_tier_articles, actual_count)
|
||||
|
||||
logger.info(
|
||||
f"Selected {len(selected_articles)} random tier {lower_tier_int} URLs "
|
||||
f"from {available_count} available"
|
||||
)
|
||||
|
||||
url_mappings = generate_urls_for_batch(selected_articles, site_repo)
|
||||
lower_tier_urls = [mapping["url"] for mapping in url_mappings]
|
||||
|
||||
return {
|
||||
"tier": tier_int,
|
||||
"lower_tier": lower_tier_int,
|
||||
"lower_tier_urls": lower_tier_urls
|
||||
}
|
||||
|
||||
|
||||
def _validate_batch_tier(content_records: List[GeneratedContent]) -> str:
|
||||
"""
|
||||
Validate that all articles in batch are the same tier
|
||||
|
||||
Args:
|
||||
content_records: List of GeneratedContent records
|
||||
|
||||
Returns:
|
||||
The tier string (e.g., "tier1", "tier2")
|
||||
|
||||
Raises:
|
||||
ValueError: If articles have different tiers
|
||||
"""
|
||||
tiers = set(record.tier for record in content_records)
|
||||
if len(tiers) > 1:
|
||||
raise ValueError(f"All articles in batch must be same tier, found: {tiers}")
|
||||
return list(tiers)[0]
|
||||
|
||||
|
||||
def _extract_tier_number(tier: str) -> int:
|
||||
"""
|
||||
Extract tier number from tier string
|
||||
|
||||
Args:
|
||||
tier: Tier string like "tier1", "tier2"
|
||||
|
||||
Returns:
|
||||
Integer tier number (1, 2, 3, etc.)
|
||||
"""
|
||||
if tier.startswith("tier"):
|
||||
try:
|
||||
return int(tier[4:])
|
||||
except (ValueError, IndexError):
|
||||
pass
|
||||
raise ValueError(f"Invalid tier format: {tier}")
|
||||
|
||||
|
||||
def _get_link_count_range(job_config) -> Dict[str, int]:
|
||||
"""
|
||||
Get link count range from job config with defaults
|
||||
|
||||
Args:
|
||||
job_config: Job object or dict
|
||||
|
||||
Returns:
|
||||
Dict with 'min' and 'max' keys, defaults to {min: 2, max: 4}
|
||||
"""
|
||||
default_range = {"min": 2, "max": 4}
|
||||
|
||||
if job_config is None:
|
||||
return default_range
|
||||
|
||||
if hasattr(job_config, 'tiered_link_count_range'):
|
||||
link_range = job_config.tiered_link_count_range
|
||||
elif isinstance(job_config, dict):
|
||||
link_range = job_config.get('tiered_link_count_range')
|
||||
else:
|
||||
return default_range
|
||||
|
||||
if link_range is None:
|
||||
return default_range
|
||||
|
||||
return link_range
|
||||
|
||||
|
|
@ -0,0 +1,522 @@
|
|||
"""
|
||||
Integration tests for Story 3.2: Find Tiered Links
|
||||
"""
|
||||
|
||||
import pytest
|
||||
from src.database.models import GeneratedContent, SiteDeployment, Project, ArticleLink
|
||||
from src.database.repositories import (
|
||||
SiteDeploymentRepository,
|
||||
GeneratedContentRepository,
|
||||
ProjectRepository,
|
||||
ArticleLinkRepository
|
||||
)
|
||||
from src.generation.job_config import Job
|
||||
from src.interlinking.tiered_links import find_tiered_links
|
||||
|
||||
|
||||
class TestTieredLinksFindingIntegration:
|
||||
"""Integration tests for tiered link finding with real database"""
|
||||
|
||||
def test_tier1_returns_money_site_url(self, db_session):
|
||||
"""Test tier 1 batch returns money site URL from project"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
# Create project with money site URL
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={
|
||||
"main_keyword": "test keyword",
|
||||
"money_site_url": "https://www.mymoneysite.com"
|
||||
}
|
||||
)
|
||||
|
||||
# Create tier 1 content
|
||||
content = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title="Tier 1 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
result = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 1
|
||||
assert result["money_site_url"] == "https://www.mymoneysite.com"
|
||||
|
||||
def test_tier2_queries_tier1_articles_same_project(self, db_session):
|
||||
"""Test tier 2 batch queries tier 1 articles from same project only"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
# Create site for tier 1 articles
|
||||
site = site_repo.create(
|
||||
site_name="test-site",
|
||||
storage_zone_id=1,
|
||||
storage_zone_name="test",
|
||||
storage_zone_password="pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=10,
|
||||
pull_zone_bcdn_hostname="test.b-cdn.net"
|
||||
)
|
||||
|
||||
# Create two projects
|
||||
project1 = project_repo.create(
|
||||
user_id=1,
|
||||
name="Project 1",
|
||||
data={"main_keyword": "test1"}
|
||||
)
|
||||
|
||||
project2 = project_repo.create(
|
||||
user_id=1,
|
||||
name="Project 2",
|
||||
data={"main_keyword": "test2"}
|
||||
)
|
||||
|
||||
# Create tier 1 articles for project 1
|
||||
tier1_p1_articles = []
|
||||
for i in range(5):
|
||||
article = content_repo.create(
|
||||
project_id=project1.id,
|
||||
tier="tier1",
|
||||
keyword="test1",
|
||||
title=f"Project 1 Tier 1 Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
tier1_p1_articles.append(article)
|
||||
|
||||
# Create tier 1 articles for project 2 (should not be selected)
|
||||
for i in range(3):
|
||||
content_repo.create(
|
||||
project_id=project2.id,
|
||||
tier="tier1",
|
||||
keyword="test2",
|
||||
title=f"Project 2 Tier 1 Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
|
||||
# Create tier 2 article for project 1
|
||||
tier2_article = content_repo.create(
|
||||
project_id=project1.id,
|
||||
tier="tier2",
|
||||
keyword="test1",
|
||||
title="Project 1 Tier 2 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
result = find_tiered_links([tier2_article], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 2
|
||||
assert result["lower_tier"] == 1
|
||||
assert len(result["lower_tier_urls"]) >= 2
|
||||
assert len(result["lower_tier_urls"]) <= 4
|
||||
|
||||
# Verify URLs are from tier 1 project 1 articles only
|
||||
for url in result["lower_tier_urls"]:
|
||||
assert "test.b-cdn.net" in url
|
||||
assert any(f"project-1-tier-1-article-{i}" in url.lower() for i in range(5))
|
||||
|
||||
def test_tier3_queries_tier2_articles(self, db_session):
|
||||
"""Test tier 3 batch queries tier 2 articles"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
site = site_repo.create(
|
||||
site_name="test-site",
|
||||
storage_zone_id=1,
|
||||
storage_zone_name="test",
|
||||
storage_zone_password="pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=10,
|
||||
pull_zone_bcdn_hostname="tier2site.b-cdn.net"
|
||||
)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create tier 2 articles
|
||||
for i in range(10):
|
||||
content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier2",
|
||||
keyword="test",
|
||||
title=f"Tier 2 Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
|
||||
# Create tier 3 article
|
||||
tier3_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier3",
|
||||
keyword="test",
|
||||
title="Tier 3 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
result = find_tiered_links([tier3_article], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 3
|
||||
assert result["lower_tier"] == 2
|
||||
assert len(result["lower_tier_urls"]) >= 2
|
||||
assert len(result["lower_tier_urls"]) <= 4
|
||||
|
||||
def test_custom_link_count_range(self, db_session):
|
||||
"""Test custom link count range from job config"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
site = site_repo.create(
|
||||
site_name="test-site",
|
||||
storage_zone_id=1,
|
||||
storage_zone_name="test",
|
||||
storage_zone_password="pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=10,
|
||||
pull_zone_bcdn_hostname="test.b-cdn.net"
|
||||
)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create 15 tier 1 articles
|
||||
for i in range(15):
|
||||
content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title=f"Tier 1 Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
|
||||
# Create tier 2 article
|
||||
tier2_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier2",
|
||||
keyword="test",
|
||||
title="Tier 2 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
# Test with custom range (min=5, max=8)
|
||||
job = Job(
|
||||
project_id=project.id,
|
||||
tiers={},
|
||||
tiered_link_count_range={"min": 5, "max": 8}
|
||||
)
|
||||
|
||||
result = find_tiered_links([tier2_article], job, project_repo, content_repo, site_repo)
|
||||
|
||||
url_count = len(result["lower_tier_urls"])
|
||||
assert 5 <= url_count <= 8
|
||||
|
||||
def test_exact_count_when_min_equals_max(self, db_session):
|
||||
"""Test exact link count when min equals max"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
site = site_repo.create(
|
||||
site_name="test-site",
|
||||
storage_zone_id=1,
|
||||
storage_zone_name="test",
|
||||
storage_zone_password="pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=10,
|
||||
pull_zone_bcdn_hostname="test.b-cdn.net"
|
||||
)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create 20 tier 1 articles
|
||||
for i in range(20):
|
||||
content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title=f"Tier 1 Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
|
||||
tier2_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier2",
|
||||
keyword="test",
|
||||
title="Tier 2 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
# Test with exact count (min=7, max=7)
|
||||
job = Job(
|
||||
project_id=project.id,
|
||||
tiers={},
|
||||
tiered_link_count_range={"min": 7, "max": 7}
|
||||
)
|
||||
|
||||
result = find_tiered_links([tier2_article], job, project_repo, content_repo, site_repo)
|
||||
|
||||
assert len(result["lower_tier_urls"]) == 7
|
||||
|
||||
def test_insufficient_lower_tier_articles_uses_all(self, db_session):
|
||||
"""Test that all available articles are used when fewer than min requested"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
site_repo = SiteDeploymentRepository(db_session)
|
||||
|
||||
site = site_repo.create(
|
||||
site_name="test-site",
|
||||
storage_zone_id=1,
|
||||
storage_zone_name="test",
|
||||
storage_zone_password="pass",
|
||||
storage_zone_region="DE",
|
||||
pull_zone_id=10,
|
||||
pull_zone_bcdn_hostname="test.b-cdn.net"
|
||||
)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create only 1 tier 1 article
|
||||
content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title="Only Tier 1 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated",
|
||||
site_deployment_id=site.id
|
||||
)
|
||||
|
||||
tier2_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier2",
|
||||
keyword="test",
|
||||
title="Tier 2 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
result = find_tiered_links([tier2_article], None, project_repo, content_repo, site_repo)
|
||||
|
||||
# Should return the 1 available article even though min is 2
|
||||
assert len(result["lower_tier_urls"]) == 1
|
||||
|
||||
|
||||
class TestArticleLinkRepositoryIntegration:
|
||||
"""Integration tests for ArticleLink repository with database constraints"""
|
||||
|
||||
def test_create_and_query_tiered_links(self, db_session):
|
||||
"""Test creating and querying tiered links"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
link_repo = ArticleLinkRepository(db_session)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create tier 1 and tier 2 articles
|
||||
tier1_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title="Tier 1 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
tier2_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier2",
|
||||
keyword="test",
|
||||
title="Tier 2 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
# Create tiered link from tier 2 to tier 1
|
||||
link = link_repo.create(
|
||||
from_content_id=tier2_article.id,
|
||||
to_content_id=tier1_article.id,
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
assert link.id is not None
|
||||
|
||||
# Query links
|
||||
outbound = link_repo.get_by_source_article(tier2_article.id)
|
||||
assert len(outbound) == 1
|
||||
assert outbound[0].to_content_id == tier1_article.id
|
||||
|
||||
inbound = link_repo.get_by_target_article(tier1_article.id)
|
||||
assert len(inbound) == 1
|
||||
assert inbound[0].from_content_id == tier2_article.id
|
||||
|
||||
def test_create_money_site_link(self, db_session):
|
||||
"""Test creating external link to money site"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
link_repo = ArticleLinkRepository(db_session)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={
|
||||
"main_keyword": "test",
|
||||
"money_site_url": "https://www.moneysite.com"
|
||||
}
|
||||
)
|
||||
|
||||
tier1_article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title="Tier 1 Article",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
|
||||
# Create link to money site
|
||||
link = link_repo.create(
|
||||
from_content_id=tier1_article.id,
|
||||
to_content_id=None,
|
||||
to_url="https://www.moneysite.com",
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
assert link.to_content_id is None
|
||||
assert link.to_url == "https://www.moneysite.com"
|
||||
|
||||
# Query
|
||||
links = link_repo.get_by_source_article(tier1_article.id)
|
||||
assert len(links) == 1
|
||||
assert links[0].to_url == "https://www.moneysite.com"
|
||||
|
||||
def test_multiple_link_types(self, db_session):
|
||||
"""Test different link types (tiered, wheel_next, wheel_prev, homepage)"""
|
||||
project_repo = ProjectRepository(db_session)
|
||||
content_repo = GeneratedContentRepository(db_session)
|
||||
link_repo = ArticleLinkRepository(db_session)
|
||||
|
||||
project = project_repo.create(
|
||||
user_id=1,
|
||||
name="Test Project",
|
||||
data={"main_keyword": "test"}
|
||||
)
|
||||
|
||||
# Create 3 articles
|
||||
articles = []
|
||||
for i in range(3):
|
||||
article = content_repo.create(
|
||||
project_id=project.id,
|
||||
tier="tier1",
|
||||
keyword="test",
|
||||
title=f"Article {i}",
|
||||
outline={},
|
||||
content="<p>Test</p>",
|
||||
word_count=100,
|
||||
status="generated"
|
||||
)
|
||||
articles.append(article)
|
||||
|
||||
# Create different link types
|
||||
tiered_link = link_repo.create(
|
||||
from_content_id=articles[0].id,
|
||||
to_content_id=articles[1].id,
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
wheel_next_link = link_repo.create(
|
||||
from_content_id=articles[0].id,
|
||||
to_content_id=articles[1].id,
|
||||
link_type="wheel_next"
|
||||
)
|
||||
|
||||
wheel_prev_link = link_repo.create(
|
||||
from_content_id=articles[1].id,
|
||||
to_content_id=articles[0].id,
|
||||
link_type="wheel_prev"
|
||||
)
|
||||
|
||||
homepage_link = link_repo.create(
|
||||
from_content_id=articles[2].id,
|
||||
to_content_id=articles[0].id,
|
||||
link_type="homepage"
|
||||
)
|
||||
|
||||
# Query by type
|
||||
tiered_links = link_repo.get_by_link_type("tiered")
|
||||
assert len(tiered_links) == 1
|
||||
|
||||
wheel_links = link_repo.get_by_link_type("wheel_next")
|
||||
assert len(wheel_links) == 1
|
||||
|
||||
# Article 0 should have multiple outbound links
|
||||
outbound = link_repo.get_by_source_article(articles[0].id)
|
||||
assert len(outbound) == 2 # tiered and wheel_next
|
||||
|
||||
|
|
@ -0,0 +1,118 @@
|
|||
"""
|
||||
Unit tests for ArticleLink repository
|
||||
"""
|
||||
|
||||
import pytest
|
||||
from src.database.repositories import ArticleLinkRepository
|
||||
from src.database.models import ArticleLink, GeneratedContent, Project
|
||||
|
||||
|
||||
class TestArticleLinkRepository:
|
||||
"""Tests for ArticleLinkRepository"""
|
||||
|
||||
def test_create_internal_link(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link = repo.create(
|
||||
from_content_id=1,
|
||||
to_content_id=2,
|
||||
to_url=None,
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
assert link.id is not None
|
||||
assert link.from_content_id == 1
|
||||
assert link.to_content_id == 2
|
||||
assert link.to_url is None
|
||||
assert link.link_type == "tiered"
|
||||
|
||||
def test_create_external_link(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link = repo.create(
|
||||
from_content_id=1,
|
||||
to_content_id=None,
|
||||
to_url="https://www.moneysite.com",
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
assert link.id is not None
|
||||
assert link.from_content_id == 1
|
||||
assert link.to_content_id is None
|
||||
assert link.to_url == "https://www.moneysite.com"
|
||||
assert link.link_type == "tiered"
|
||||
|
||||
def test_create_without_target_raises_error(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
with pytest.raises(ValueError, match="Either to_content_id or to_url must be provided"):
|
||||
repo.create(
|
||||
from_content_id=1,
|
||||
to_content_id=None,
|
||||
to_url=None,
|
||||
link_type="tiered"
|
||||
)
|
||||
|
||||
def test_get_by_source_article(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link1 = repo.create(from_content_id=1, to_content_id=2, link_type="tiered")
|
||||
link2 = repo.create(from_content_id=1, to_content_id=3, link_type="wheel_next")
|
||||
link3 = repo.create(from_content_id=2, to_content_id=4, link_type="tiered")
|
||||
|
||||
links = repo.get_by_source_article(1)
|
||||
|
||||
assert len(links) == 2
|
||||
link_ids = [link.id for link in links]
|
||||
assert link1.id in link_ids
|
||||
assert link2.id in link_ids
|
||||
assert link3.id not in link_ids
|
||||
|
||||
def test_get_by_target_article(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link1 = repo.create(from_content_id=1, to_content_id=2, link_type="tiered")
|
||||
link2 = repo.create(from_content_id=3, to_content_id=2, link_type="tiered")
|
||||
link3 = repo.create(from_content_id=1, to_content_id=4, link_type="tiered")
|
||||
|
||||
links = repo.get_by_target_article(2)
|
||||
|
||||
assert len(links) == 2
|
||||
link_ids = [link.id for link in links]
|
||||
assert link1.id in link_ids
|
||||
assert link2.id in link_ids
|
||||
assert link3.id not in link_ids
|
||||
|
||||
def test_get_by_link_type(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link1 = repo.create(from_content_id=1, to_content_id=2, link_type="tiered")
|
||||
link2 = repo.create(from_content_id=2, to_content_id=3, link_type="wheel_next")
|
||||
link3 = repo.create(from_content_id=3, to_content_id=4, link_type="tiered")
|
||||
|
||||
links = repo.get_by_link_type("tiered")
|
||||
|
||||
assert len(links) == 2
|
||||
link_ids = [link.id for link in links]
|
||||
assert link1.id in link_ids
|
||||
assert link3.id in link_ids
|
||||
assert link2.id not in link_ids
|
||||
|
||||
def test_delete(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
link = repo.create(from_content_id=1, to_content_id=2, link_type="tiered")
|
||||
link_id = link.id
|
||||
|
||||
result = repo.delete(link_id)
|
||||
assert result is True
|
||||
|
||||
links = repo.get_by_source_article(1)
|
||||
assert len(links) == 0
|
||||
|
||||
def test_delete_nonexistent_returns_false(self, db_session):
|
||||
repo = ArticleLinkRepository(db_session)
|
||||
|
||||
result = repo.delete(999)
|
||||
assert result is False
|
||||
|
||||
|
|
@ -0,0 +1,330 @@
|
|||
"""
|
||||
Unit tests for tiered link finder
|
||||
"""
|
||||
|
||||
import pytest
|
||||
from unittest.mock import Mock, MagicMock, patch
|
||||
from src.interlinking.tiered_links import (
|
||||
find_tiered_links,
|
||||
_validate_batch_tier,
|
||||
_extract_tier_number,
|
||||
_get_link_count_range
|
||||
)
|
||||
from src.database.models import GeneratedContent, Project
|
||||
from src.generation.job_config import Job
|
||||
|
||||
|
||||
class TestExtractTierNumber:
|
||||
"""Tests for _extract_tier_number helper"""
|
||||
|
||||
def test_tier1(self):
|
||||
assert _extract_tier_number("tier1") == 1
|
||||
|
||||
def test_tier2(self):
|
||||
assert _extract_tier_number("tier2") == 2
|
||||
|
||||
def test_tier3(self):
|
||||
assert _extract_tier_number("tier3") == 3
|
||||
|
||||
def test_invalid_format(self):
|
||||
with pytest.raises(ValueError, match="Invalid tier format"):
|
||||
_extract_tier_number("invalid")
|
||||
|
||||
def test_tier_without_number(self):
|
||||
with pytest.raises(ValueError, match="Invalid tier format"):
|
||||
_extract_tier_number("tier")
|
||||
|
||||
|
||||
class TestValidateBatchTier:
|
||||
"""Tests for _validate_batch_tier helper"""
|
||||
|
||||
def test_single_tier_batch(self):
|
||||
content1 = Mock(spec=GeneratedContent)
|
||||
content1.tier = "tier1"
|
||||
content2 = Mock(spec=GeneratedContent)
|
||||
content2.tier = "tier1"
|
||||
|
||||
result = _validate_batch_tier([content1, content2])
|
||||
assert result == "tier1"
|
||||
|
||||
def test_mixed_tiers_raises_error(self):
|
||||
content1 = Mock(spec=GeneratedContent)
|
||||
content1.tier = "tier1"
|
||||
content2 = Mock(spec=GeneratedContent)
|
||||
content2.tier = "tier2"
|
||||
|
||||
with pytest.raises(ValueError, match="All articles in batch must be same tier"):
|
||||
_validate_batch_tier([content1, content2])
|
||||
|
||||
|
||||
class TestGetLinkCountRange:
|
||||
"""Tests for _get_link_count_range helper"""
|
||||
|
||||
def test_default_range(self):
|
||||
result = _get_link_count_range(None)
|
||||
assert result == {"min": 2, "max": 4}
|
||||
|
||||
def test_job_object_with_range(self):
|
||||
job = Job(
|
||||
project_id=1,
|
||||
tiers={},
|
||||
tiered_link_count_range={"min": 3, "max": 6}
|
||||
)
|
||||
result = _get_link_count_range(job)
|
||||
assert result == {"min": 3, "max": 6}
|
||||
|
||||
def test_job_object_without_range(self):
|
||||
job = Job(
|
||||
project_id=1,
|
||||
tiers={},
|
||||
tiered_link_count_range=None
|
||||
)
|
||||
result = _get_link_count_range(job)
|
||||
assert result == {"min": 2, "max": 4}
|
||||
|
||||
def test_dict_with_range(self):
|
||||
job_dict = {"tiered_link_count_range": {"min": 5, "max": 8}}
|
||||
result = _get_link_count_range(job_dict)
|
||||
assert result == {"min": 5, "max": 8}
|
||||
|
||||
def test_dict_without_range(self):
|
||||
job_dict = {}
|
||||
result = _get_link_count_range(job_dict)
|
||||
assert result == {"min": 2, "max": 4}
|
||||
|
||||
|
||||
class TestFindTieredLinks:
|
||||
"""Tests for find_tiered_links main function"""
|
||||
|
||||
def test_empty_content_records_raises_error(self):
|
||||
project_repo = Mock()
|
||||
content_repo = Mock()
|
||||
site_repo = Mock()
|
||||
|
||||
with pytest.raises(ValueError, match="content_records cannot be empty"):
|
||||
find_tiered_links([], None, project_repo, content_repo, site_repo)
|
||||
|
||||
def test_tier1_returns_money_site_url(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier1"
|
||||
content.project_id = 1
|
||||
|
||||
project = Mock(spec=Project)
|
||||
project.money_site_url = "https://www.mymoneysite.com"
|
||||
|
||||
project_repo = Mock()
|
||||
project_repo.get_by_id.return_value = project
|
||||
|
||||
content_repo = Mock()
|
||||
site_repo = Mock()
|
||||
|
||||
result = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 1
|
||||
assert result["money_site_url"] == "https://www.mymoneysite.com"
|
||||
project_repo.get_by_id.assert_called_once_with(1)
|
||||
|
||||
def test_tier1_missing_money_site_url_raises_error(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier1"
|
||||
content.project_id = 1
|
||||
|
||||
project = Mock(spec=Project)
|
||||
project.money_site_url = None
|
||||
|
||||
project_repo = Mock()
|
||||
project_repo.get_by_id.return_value = project
|
||||
|
||||
content_repo = Mock()
|
||||
site_repo = Mock()
|
||||
|
||||
with pytest.raises(ValueError, match="money_site_url not set in project 1"):
|
||||
find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
def test_tier1_missing_project_raises_error(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier1"
|
||||
content.project_id = 999
|
||||
|
||||
project_repo = Mock()
|
||||
project_repo.get_by_id.return_value = None
|
||||
|
||||
content_repo = Mock()
|
||||
site_repo = Mock()
|
||||
|
||||
with pytest.raises(ValueError, match="money_site_url not set in project 999"):
|
||||
find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
def test_tier2_queries_tier1_articles(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier2"
|
||||
content.project_id = 1
|
||||
|
||||
lower_tier_article = Mock(spec=GeneratedContent)
|
||||
lower_tier_article.id = 10
|
||||
lower_tier_article.tier = "tier1"
|
||||
lower_tier_article.title = "Lower Tier Article"
|
||||
lower_tier_article.site_deployment_id = 5
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = [lower_tier_article]
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with patch('src.interlinking.tiered_links.generate_urls_for_batch') as mock_gen:
|
||||
mock_gen.return_value = [{
|
||||
"content_id": 10,
|
||||
"url": "https://example.com/article.html",
|
||||
"title": "Lower Tier Article"
|
||||
}]
|
||||
|
||||
result = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 2
|
||||
assert result["lower_tier"] == 1
|
||||
assert len(result["lower_tier_urls"]) == 1
|
||||
assert result["lower_tier_urls"][0] == "https://example.com/article.html"
|
||||
content_repo.get_by_project_and_tier.assert_called_once_with(1, "tier1")
|
||||
|
||||
def test_tier3_queries_tier2_articles(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier3"
|
||||
content.project_id = 2
|
||||
|
||||
lower_tier_articles = [
|
||||
Mock(id=i, tier="tier2", site_deployment_id=5) for i in range(10)
|
||||
]
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = lower_tier_articles
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with patch('src.interlinking.tiered_links.generate_urls_for_batch') as mock_gen:
|
||||
mock_gen.return_value = [
|
||||
{"content_id": i, "url": f"https://example.com/article-{i}.html"}
|
||||
for i in range(10)
|
||||
]
|
||||
|
||||
result = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert result["tier"] == 3
|
||||
assert result["lower_tier"] == 2
|
||||
content_repo.get_by_project_and_tier.assert_called_once_with(2, "tier2")
|
||||
|
||||
def test_no_lower_tier_articles_raises_error(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier2"
|
||||
content.project_id = 1
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = []
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with pytest.raises(ValueError, match="no tier 1 articles found in project 1"):
|
||||
find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
def test_custom_link_count_range(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier2"
|
||||
content.project_id = 1
|
||||
|
||||
lower_tier_articles = [
|
||||
Mock(id=i, tier="tier1", site_deployment_id=5) for i in range(20)
|
||||
]
|
||||
|
||||
job = Job(
|
||||
project_id=1,
|
||||
tiers={},
|
||||
tiered_link_count_range={"min": 5, "max": 8}
|
||||
)
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = lower_tier_articles
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with patch('src.interlinking.tiered_links.generate_urls_for_batch') as mock_gen:
|
||||
# Mock should return URLs based on how many articles were passed to it
|
||||
def mock_url_gen(articles, site_repo):
|
||||
return [
|
||||
{"content_id": i, "url": f"https://example.com/article-{i}.html"}
|
||||
for i in range(len(articles))
|
||||
]
|
||||
mock_gen.side_effect = mock_url_gen
|
||||
|
||||
result = find_tiered_links([content], job, project_repo, content_repo, site_repo)
|
||||
|
||||
url_count = len(result["lower_tier_urls"])
|
||||
assert 5 <= url_count <= 8
|
||||
|
||||
def test_fewer_articles_than_min_uses_all_available(self, caplog):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier2"
|
||||
content.project_id = 1
|
||||
|
||||
lower_tier_articles = [
|
||||
Mock(id=1, tier="tier1", site_deployment_id=5)
|
||||
]
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = lower_tier_articles
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with patch('src.interlinking.tiered_links.generate_urls_for_batch') as mock_gen:
|
||||
mock_gen.return_value = [
|
||||
{"content_id": 1, "url": "https://example.com/article.html"}
|
||||
]
|
||||
|
||||
result = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||
|
||||
assert len(result["lower_tier_urls"]) == 1
|
||||
assert "Only 1 tier 1 articles available" in caplog.text
|
||||
|
||||
def test_exact_count_when_min_equals_max(self):
|
||||
content = Mock(spec=GeneratedContent)
|
||||
content.tier = "tier2"
|
||||
content.project_id = 1
|
||||
|
||||
lower_tier_articles = [
|
||||
Mock(id=i, tier="tier1", site_deployment_id=5) for i in range(20)
|
||||
]
|
||||
|
||||
job = Job(
|
||||
project_id=1,
|
||||
tiers={},
|
||||
tiered_link_count_range={"min": 8, "max": 8}
|
||||
)
|
||||
|
||||
project_repo = Mock()
|
||||
|
||||
content_repo = Mock()
|
||||
content_repo.get_by_project_and_tier.return_value = lower_tier_articles
|
||||
|
||||
site_repo = Mock()
|
||||
|
||||
with patch('src.interlinking.tiered_links.generate_urls_for_batch') as mock_gen:
|
||||
# Mock should return URLs based on how many articles were passed to it
|
||||
def mock_url_gen(articles, site_repo):
|
||||
return [
|
||||
{"content_id": i, "url": f"https://example.com/article-{i}.html"}
|
||||
for i in range(len(articles))
|
||||
]
|
||||
mock_gen.side_effect = mock_url_gen
|
||||
|
||||
result = find_tiered_links([content], job, project_repo, content_repo, site_repo)
|
||||
|
||||
assert len(result["lower_tier_urls"]) == 8
|
||||
|
||||
Loading…
Reference in New Issue