Implement Story 8.1: Job-Level Anchor Text Control for T1 and T2+

- Add explicit anchor text mode support in AnchorTextConfig
- Support tier-specific anchor text terms at job-level (tier1, tier2, tier3, tier4_plus)
- Support tier-level explicit anchor text with 'terms' array
- Update content injection to prioritize explicit terms when mode is 'explicit'
- Add validation for explicit mode requiring term lists
- Update JOB_FIELD_REFERENCE.md with explicit mode documentation and examples
- Add comprehensive unit and integration tests for explicit anchor text

Includes multi-cloud storage migration script and related database changes.

branch main
parent a815cbcf3e
commit de21a22b72

@@ -11,7 +11,8 @@ auto_create_sites - Boolean (NOT IMPLEMENTED - parsed but doesn't wor
 create_sites_for_keywords - Array of {keyword, count} objects (NOT IMPLEMENTED - parsed but doesn't work)
 models - {title, outline, content} with model strings
 tiered_link_count_range - {min, max} integers
-anchor_text_config - {mode, custom_text}
+anchor_text_config - {mode, custom_text, tier1, tier2, tier3, tier4_plus}
+- For "explicit" mode, use tier-specific arrays (tier1, tier2, etc.) instead of custom_text
 failure_config - {max_consecutive_failures, skip_on_failure}
 interlinking - {links_per_article_min, links_per_article_max, see_also_min, see_also_max}
 tiers - Required, object with tier1/tier2/tier3
@@ -28,7 +29,8 @@ min_h3_tags - Integer
 max_h3_tags - Integer
 models - {title, outline, content} - overrides job-level
 interlinking - {links_per_article_min, links_per_article_max, see_also_min, see_also_max} - overrides job-level
-anchor_text_config - {mode, custom_text} - overrides job-level for this tier only
+anchor_text_config - {mode, custom_text, terms} - overrides job-level for this tier only
+- For "explicit" mode, use "terms" array instead of "custom_text"
 ```
 
 ## Field Behaviors
@@ -43,6 +45,9 @@ anchor_text_config - {mode, custom_text} - overrides job-level for this tier
 - "default" = Use master.config.json tier rules
 - "override" = Replace with custom_text
 - "append" = Add custom_text to tier rules
+- "explicit" = Use only explicitly provided terms (no algorithm-generated terms)
+- Job-level: Provide tier1, tier2, tier3, tier4_plus arrays with terms
+- Tier-level: Provide terms array for that specific tier
 - Tier-level config overrides job-level config for that tier
 
 **tiered_link_count_range**: How many links to lower tier
@@ -161,3 +166,54 @@ If not specified, these defaults apply:
 - Tier2: Uses related_searches from project
 - Can override with anchor_text_config
 
+## Explicit Anchor Text Example
+
+Use "explicit" mode to specify exact anchor text terms for each tier:
+
+```json
+{
+  "jobs": [{
+    "project_id": 26,
+    "anchor_text_config": {
+      "mode": "explicit",
+      "tier1": ["high volume", "precision machining", "custom manufacturing"],
+      "tier2": ["high volume production", "bulk manufacturing", "large scale"]
+    },
+    "tiers": {
+      "tier1": {"count": 12},
+      "tier2": {"count": 38}
+    }
+  }]
+}
+```
+
+Or use tier-level explicit config to override job-level for a specific tier:
+
+```json
+{
+  "jobs": [{
+    "project_id": 26,
+    "anchor_text_config": {
+      "mode": "explicit",
+      "tier1": ["high volume", "precision machining"],
+      "tier2": ["bulk manufacturing"]
+    },
+    "tiers": {
+      "tier1": {
+        "count": 12,
+        "anchor_text_config": {
+          "mode": "explicit",
+          "terms": ["high volume", "precision"]
+        }
+      },
+      "tier2": {"count": 38}
+    }
+  }]
+}
+```
+
+When using "explicit" mode, the system will:
+- Use only the provided terms (no algorithm-generated terms)
+- Try to find these terms in content first, then insert if not found
+- Tier-level explicit config takes precedence over job-level for that tier
+

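The precedence rule documented in this hunk (tier-level config overrides job-level config for that tier) can be sketched as a small lookup. This is a minimal illustration, not the code from `content_injection.py`: the function names `job_level_key` and `resolve_explicit_terms` are hypothetical, configs are assumed to be plain dicts, and mapping tier 4 and above to `tier4_plus` is an assumption based on the key name.

```python
from typing import List, Optional


def job_level_key(tier_name: str) -> str:
    """Map a tier name such as 'tier5' to its job-level key.

    Assumption: tiers 1-3 have their own key, everything above uses 'tier4_plus'.
    """
    number = int(tier_name[len("tier"):])
    return tier_name if number <= 3 else "tier4_plus"


def resolve_explicit_terms(
    job_cfg: Optional[dict],
    tier_cfg: Optional[dict],
    tier_name: str,
) -> Optional[List[str]]:
    """Return the explicit terms for a tier, or None if explicit mode does not apply."""
    # A tier-level config, when present, replaces the job-level config for that tier.
    is_tier_level = tier_cfg is not None
    cfg = tier_cfg if is_tier_level else job_cfg
    if not cfg or cfg.get("mode") != "explicit":
        return None
    # Tier-level explicit config stores its list under 'terms';
    # job-level explicit config stores one list per tier key.
    terms = cfg.get("terms") if is_tier_level else cfg.get(job_level_key(tier_name))
    return list(terms) if terms else None
```

With the second JSON example above, `tier1` would resolve to the tier-level `terms` list while `tier2` would fall back to the job-level `tier2` list.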
@@ -0,0 +1,51 @@
+# Epic 8: Functional Debt
+
+## Epic Goal
+To address functional limitations and gaps in the system that prevent users from achieving specific business requirements, particularly around customization and control of content generation and linking behavior.
+
+## Rationale
+While the system provides automated content generation with sensible defaults, there are cases where users need explicit control over specific aspects (like anchor text terms) that don't fit the standard algorithmic approach. This epic addresses these functional gaps to provide the flexibility needed for real-world use cases.
+
+## Status
+- **Story 8.1**: 🔄 PLANNING (Job-Level Anchor Text Control)
+
+## Stories
+
+### Story 8.1: Job-Level Anchor Text Control for T1 and T2+
+**Estimated Effort**: 2 story points
+
+**As a user**, I want to explicitly specify anchor text terms in my job configuration for both Tier 1 and Tier 2+ links, so that I can include specific terms (like "high volume") that aren't covered by the standard algorithm.
+
+**Acceptance Criteria**:
+* Job JSON supports explicit anchor text configuration for both Tier 1 and Tier 2+
+* Anchor text can be specified at job-level (applies to all tiers) or tier-level (tier-specific)
+* When explicit anchor text is provided, it should be used in addition to or instead of algorithm-generated anchor text
+* Support for multiple anchor text terms per tier
+* Anchor text terms are used when injecting links (tiered links, homepage links, etc.)
+* If explicit anchor text is provided, it takes precedence over algorithm-generated terms
+* Backward compatible: existing jobs without explicit anchor text continue to work with current algorithm
+
+**Technical Notes**:
+* Extend `anchor_text_config` in job JSON to support explicit term lists
+* Update `_get_anchor_texts_for_tier()` in `content_injection.py` to prioritize explicit terms
+* Consider adding a new mode like "explicit" that uses only the provided terms, or extend "override" mode
+* Document in `JOB_FIELD_REFERENCE.md` how to use explicit anchor text
+
+**Example Job Configuration**:
+```json
+{
+  "jobs": [{
+    "project_id": 26,
+    "anchor_text_config": {
+      "mode": "explicit",
+      "tier1": ["high volume", "precision machining", "custom manufacturing"],
+      "tier2": ["high volume production", "bulk manufacturing", "large scale"]
+    },
+    "tiers": {
+      "tier1": {"count": 12},
+      "tier2": {"count": 38}
+    }
+  }]
+}
+```
+

@@ -0,0 +1,101 @@
+# Story 8.1: Job-Level Anchor Text Control for T1 and T2+
+
+## Story Details
+**As a user**, I want to explicitly specify anchor text terms in my job configuration for both Tier 1 and Tier 2+ links, so that I can include specific terms (like "high volume") that aren't covered by the standard algorithm.
+
+## Acceptance Criteria
+
+### 1. Job JSON Configuration Support
+**Status:** TODO
+
+- Job JSON supports explicit anchor text configuration for both Tier 1 and Tier 2+
+- Anchor text can be specified at job-level (applies to all tiers) or tier-level (tier-specific)
+- Support for multiple anchor text terms per tier
+- Configuration format is intuitive and well-documented
+
+**Example Format:**
+```json
+{
+  "jobs": [{
+    "project_id": 26,
+    "anchor_text_config": {
+      "mode": "explicit",
+      "tier1": ["high volume", "precision machining", "custom manufacturing"],
+      "tier2": ["high volume production", "bulk manufacturing", "large scale"]
+    },
+    "tiers": {
+      "tier1": {
+        "count": 12,
+        "anchor_text_config": {
+          "mode": "explicit",
+          "terms": ["high volume", "precision"]
+        }
+      },
+      "tier2": {"count": 38}
+    }
+  }]
+}
+```
+
+### 2. Anchor Text Priority and Usage
+**Status:** TODO
+
+- When explicit anchor text is provided, it should be used instead of algorithm-generated anchor text
+- Explicit anchor text takes precedence over algorithm-generated terms
+- Anchor text terms are used when injecting links (tiered links, homepage links, money site links, etc.)
+- System tries to find explicit terms in content first, then falls back to insertion if not found
+
+### 3. Backward Compatibility
+**Status:** TODO
+
+- Existing jobs without explicit anchor text continue to work with current algorithm
+- Default behavior unchanged when no explicit anchor text is provided
+- All existing anchor text modes ("default", "override", "append") continue to work
+
+### 4. Implementation Details
+**Status:** TODO
+
+- Extend `AnchorTextConfig` dataclass in `src/generation/job_config.py` to support tier-specific term lists
+- Update `_parse_job()` and `_parse_tier()` methods to parse explicit anchor text configuration
+- Update `_get_anchor_texts_for_tier()` in `src/interlinking/content_injection.py` to prioritize explicit terms
+- Add validation to ensure explicit terms are provided when mode is "explicit"
+- Update `JOB_FIELD_REFERENCE.md` with documentation and examples
+
+### 5. Testing
+**Status:** TODO
+
+- Unit tests for parsing explicit anchor text configuration
+- Integration tests verifying explicit anchor text is used in link injection
+- Tests for tier-level override of job-level anchor text
+- Tests for backward compatibility with existing configurations
+
+## Technical Implementation
+
+### Changes Required
+
+1. **Job Config Parser** (`src/generation/job_config.py`):
+   - Extend `AnchorTextConfig` to support `tier1`, `tier2`, etc. fields
+   - Update parsing logic to handle tier-specific anchor text lists
+   - Add validation for explicit mode requiring term lists
+
+2. **Content Injection** (`src/interlinking/content_injection.py`):
+   - Update `_get_anchor_texts_for_tier()` to check for explicit terms first
+   - If explicit terms exist, use them; otherwise fall back to current algorithm
+   - Support both job-level and tier-level explicit anchor text
+
+3. **Documentation**:
+   - Update `JOB_FIELD_REFERENCE.md` with new anchor text configuration options
+   - Add examples showing explicit anchor text usage
+
+## Example Use Cases
+
+1. **Wide Variety**: User wants to include multiple different terms to a page, which isn't in related_searches or main_keyword variations
+2. **Brand-Specific Terms**: User wants to use specific branded terms that aren't algorithmically generated
+3. **Industry-Specific Jargon**: Terms that are important for SEO but don't appear in standard keyword extraction
+
+## Dependencies
+- None (standalone feature)
+
+## Related Stories
+- Story 3.3: Content Interlinking Injection (existing anchor text system)
+

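Acceptance criterion 2 in the story above says the system tries to find explicit terms in the content first and falls back to insertion if none is found. The sketch below shows one way that decision could look. It is illustrative only: `pick_anchor_term` is a hypothetical name, the content is treated as plain text rather than HTML, and the case-insensitive whole-phrase match and the "insert the first term" fallback are assumptions, not behavior confirmed by the commit.

```python
import re
from typing import List, Optional, Tuple


def pick_anchor_term(content: str, terms: List[str]) -> Optional[Tuple[str, bool]]:
    """Pick an anchor term for a link.

    Returns (term, True) when the term already occurs in the content,
    (term, False) when it would have to be inserted, or None when no
    terms were provided.
    """
    for term in terms:
        # Whole-phrase, case-insensitive match (assumed matching policy).
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, content, flags=re.IGNORECASE):
            return term, True
    if terms:
        # Nothing matched: fall back to inserting the first provided term.
        return terms[0], False
    return None
```

A caller would wrap the existing occurrence in a link when the flag is true, and otherwise insert new linked text containing the term.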
@@ -0,0 +1,142 @@
+"""
+Database migration script to add multi-cloud storage fields to site_deployments table
+Story 6.3: Database Schema Updates for Multi-Cloud
+
+Adds:
+- storage_provider (String(20), Not Null, Default: 'bunny', Indexed)
+- s3_bucket_name (String(255), Nullable)
+- s3_bucket_region (String(50), Nullable)
+- s3_custom_domain (String(255), Nullable)
+- s3_endpoint_url (String(500), Nullable)
+
+Usage:
+    python scripts/migrate_add_multi_cloud_storage_fields.py
+"""
+
+import sys
+from pathlib import Path
+
+project_root = Path(__file__).parent.parent
+sys.path.insert(0, str(project_root))
+
+from sqlalchemy import text
+from src.database.session import db_manager
+from src.core.config import get_config
+
+
+def migrate():
+    """Add multi-cloud storage fields to site_deployments table"""
+    print("Starting migration: add multi-cloud storage fields to site_deployments...")
+
+    try:
+        config = get_config()
+        print(f"Database URL: {config.database.url}")
+    except Exception as e:
+        print(f"Error loading configuration: {e}")
+        sys.exit(1)
+
+    try:
+        db_manager.initialize()
+        engine = db_manager.get_engine()
+
+        with engine.connect() as conn:
+            print("Checking for existing columns...")
+
+            result = conn.execute(text("PRAGMA table_info(site_deployments)"))
+            existing_columns = [row[1] for row in result]
+            print(f"Existing columns: {', '.join(existing_columns)}")
+
+            migrations_applied = []
+
+            if "storage_provider" not in existing_columns:
+                print("Adding storage_provider column...")
+                conn.execute(text("""
+                    ALTER TABLE site_deployments
+                    ADD COLUMN storage_provider VARCHAR(20) NOT NULL DEFAULT 'bunny'
+                """))
+                conn.commit()
+
+                print("Setting storage_provider='bunny' for all existing records...")
+                conn.execute(text("""
+                    UPDATE site_deployments
+                    SET storage_provider = 'bunny'
+                """))
+                conn.commit()
+
+                print("Creating index on storage_provider...")
+                conn.execute(text("""
+                    CREATE INDEX IF NOT EXISTS idx_site_deployments_storage_provider
+                    ON site_deployments(storage_provider)
+                """))
+                conn.commit()
+                migrations_applied.append("storage_provider")
+            else:
+                print("storage_provider column already exists, skipping")
+
+            if "s3_bucket_name" not in existing_columns:
+                print("Adding s3_bucket_name column...")
+                conn.execute(text("""
+                    ALTER TABLE site_deployments
+                    ADD COLUMN s3_bucket_name VARCHAR(255)
+                """))
+                conn.commit()
+                migrations_applied.append("s3_bucket_name")
+            else:
+                print("s3_bucket_name column already exists, skipping")
+
+            if "s3_bucket_region" not in existing_columns:
+                print("Adding s3_bucket_region column...")
+                conn.execute(text("""
+                    ALTER TABLE site_deployments
+                    ADD COLUMN s3_bucket_region VARCHAR(50)
+                """))
+                conn.commit()
+                migrations_applied.append("s3_bucket_region")
+            else:
+                print("s3_bucket_region column already exists, skipping")
+
+            if "s3_custom_domain" not in existing_columns:
+                print("Adding s3_custom_domain column...")
+                conn.execute(text("""
+                    ALTER TABLE site_deployments
+                    ADD COLUMN s3_custom_domain VARCHAR(255)
+                """))
+                conn.commit()
+                migrations_applied.append("s3_custom_domain")
+            else:
+                print("s3_custom_domain column already exists, skipping")
+
+            if "s3_endpoint_url" not in existing_columns:
+                print("Adding s3_endpoint_url column...")
+                conn.execute(text("""
+                    ALTER TABLE site_deployments
+                    ADD COLUMN s3_endpoint_url VARCHAR(500)
+                """))
+                conn.commit()
+                migrations_applied.append("s3_endpoint_url")
+            else:
+                print("s3_endpoint_url column already exists, skipping")
+
+            if migrations_applied:
+                print(f"\nMigration complete! Added columns: {', '.join(migrations_applied)}")
+                print("\nNew fields added:")
+                print(" - storage_provider (VARCHAR(20), NOT NULL, DEFAULT 'bunny', indexed)")
+                print(" - s3_bucket_name (VARCHAR(255), nullable)")
+                print(" - s3_bucket_region (VARCHAR(50), nullable)")
+                print(" - s3_custom_domain (VARCHAR(255), nullable)")
+                print(" - s3_endpoint_url (VARCHAR(500), nullable)")
+            else:
+                print("\nNo migrations needed - all columns already exist")
+
+    except Exception as e:
+        print(f"Error during migration: {e}")
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
+    finally:
+        db_manager.close()
+
+
+if __name__ == "__main__":
+    migrate()
+

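The migration above is idempotent because it reads the current column list with `PRAGMA table_info` (a SQLite-specific statement) and only issues `ALTER TABLE ... ADD COLUMN` for columns that are missing. The same check-then-add pattern can be tried in isolation with the standard-library `sqlite3` module. This is a minimal sketch under assumptions: `NEW_COLUMNS` and `add_missing_columns` are hypothetical names, and the script in the commit uses SQLAlchemy with the project's `db_manager` instead.

```python
import sqlite3
from typing import Dict, List

# Column name -> SQL type and constraints, mirroring the fields the migration adds.
NEW_COLUMNS: Dict[str, str] = {
    "storage_provider": "VARCHAR(20) NOT NULL DEFAULT 'bunny'",
    "s3_bucket_name": "VARCHAR(255)",
    "s3_bucket_region": "VARCHAR(50)",
    "s3_custom_domain": "VARCHAR(255)",
    "s3_endpoint_url": "VARCHAR(500)",
}


def add_missing_columns(
    conn: sqlite3.Connection, table: str, columns: Dict[str, str]
) -> List[str]:
    """Add each column that the table does not have yet; return the names added."""
    # Row index 1 of PRAGMA table_info is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, ddl in columns.items():
        if name not in existing:
            # Identifiers cannot be bound as parameters, so they are formatted in.
            # Only call this with trusted, hard-coded names.
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {ddl}")
            added.append(name)
    conn.commit()
    return added
```

Running the helper twice against the same table adds the columns on the first call and returns an empty list on the second. In SQLite, rows that existed before the `ALTER TABLE` read the new column's `DEFAULT` value, so the separate `UPDATE` in the script acts as a safeguard rather than a requirement on that backend.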
@@ -59,7 +59,12 @@ class ISiteDeploymentRepository(ABC):
         storage_zone_region: str,
         pull_zone_id: int,
         pull_zone_bcdn_hostname: str,
-        custom_hostname: Optional[str] = None
+        custom_hostname: Optional[str] = None,
+        storage_provider: Optional[str] = None,
+        s3_bucket_name: Optional[str] = None,
+        s3_bucket_region: Optional[str] = None,
+        s3_custom_domain: Optional[str] = None,
+        s3_endpoint_url: Optional[str] = None
     ) -> SiteDeployment:
         """Create a new site deployment"""
         pass

@@ -38,12 +38,13 @@ class User(Base):
 
 
 class SiteDeployment(Base):
-    """Site deployment model for bunny.net infrastructure tracking"""
+    """Site deployment model for multi-cloud storage infrastructure tracking"""
     __tablename__ = "site_deployments"
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     site_name: Mapped[str] = mapped_column(String(255), nullable=False)
     custom_hostname: Mapped[Optional[str]] = mapped_column(String(255), unique=True, nullable=True, index=True)
+    storage_provider: Mapped[str] = mapped_column(String(20), nullable=False, default="bunny", index=True)
     storage_zone_id: Mapped[int] = mapped_column(Integer, nullable=False)
     storage_zone_name: Mapped[str] = mapped_column(String(255), nullable=False)
     storage_zone_password: Mapped[str] = mapped_column(String(255), nullable=False)
@@ -51,6 +52,10 @@ class SiteDeployment(Base):
     pull_zone_id: Mapped[int] = mapped_column(Integer, nullable=False)
     pull_zone_bcdn_hostname: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
     template_name: Mapped[str] = mapped_column(String(50), default="basic", nullable=False)
+    s3_bucket_name: Mapped[Optional[str]] = mapped_column(String(255), nullable=True)
+    s3_bucket_region: Mapped[Optional[str]] = mapped_column(String(50), nullable=True)
+    s3_custom_domain: Mapped[Optional[str]] = mapped_column(String(255), nullable=True)
+    s3_endpoint_url: Mapped[Optional[str]] = mapped_column(String(500), nullable=True)
     created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
     updated_at: Mapped[datetime] = mapped_column(
         DateTime,
@@ -61,7 +66,7 @@ class SiteDeployment(Base):
 
     def __repr__(self) -> str:
         hostname = self.custom_hostname or self.pull_zone_bcdn_hostname
-        return f"<SiteDeployment(id={self.id}, site_name='{self.site_name}', hostname='{hostname}')>"
+        return f"<SiteDeployment(id={self.id}, site_name='{self.site_name}', hostname='{hostname}', provider='{self.storage_provider}')>"
 
 
 class Project(Base):

@@ -143,7 +143,12 @@ class SiteDeploymentRepository(ISiteDeploymentRepository):
         storage_zone_region: str,
         pull_zone_id: int,
         pull_zone_bcdn_hostname: str,
-        custom_hostname: Optional[str] = None
+        custom_hostname: Optional[str] = None,
+        storage_provider: Optional[str] = None,
+        s3_bucket_name: Optional[str] = None,
+        s3_bucket_region: Optional[str] = None,
+        s3_custom_domain: Optional[str] = None,
+        s3_endpoint_url: Optional[str] = None
     ) -> SiteDeployment:
         """
         Create a new site deployment
@@ -157,6 +162,11 @@ class SiteDeploymentRepository(ISiteDeploymentRepository):
             pull_zone_id: bunny.net Pull Zone ID
             pull_zone_bcdn_hostname: Default b-cdn.net hostname
             custom_hostname: Optional custom FQDN (e.g., www.yourdomain.com)
+            storage_provider: Storage provider type ('bunny', 's3', 's3_compatible'). Defaults to 'bunny'
+            s3_bucket_name: S3 bucket name (for S3 providers)
+            s3_bucket_region: S3 bucket region (for S3 providers)
+            s3_custom_domain: Custom domain for S3 (optional)
+            s3_endpoint_url: Custom endpoint URL for S3-compatible services (optional)
 
         Returns:
             The created SiteDeployment object
@@ -167,12 +177,17 @@ class SiteDeploymentRepository(ISiteDeploymentRepository):
         deployment = SiteDeployment(
             site_name=site_name,
             custom_hostname=custom_hostname,
+            storage_provider=storage_provider or "bunny",
             storage_zone_id=storage_zone_id,
             storage_zone_name=storage_zone_name,
             storage_zone_password=storage_zone_password,
             storage_zone_region=storage_zone_region,
             pull_zone_id=pull_zone_id,
-            pull_zone_bcdn_hostname=pull_zone_bcdn_hostname
+            pull_zone_bcdn_hostname=pull_zone_bcdn_hostname,
+            s3_bucket_name=s3_bucket_name,
+            s3_bucket_region=s3_bucket_region,
+            s3_custom_domain=s3_custom_domain,
+            s3_endpoint_url=s3_endpoint_url
         )
 
         try:

@@ -46,8 +46,13 @@ class ModelConfig:
 @dataclass
 class AnchorTextConfig:
     """Anchor text configuration for interlinking"""
-    mode: str  # "default", "override", "append"
+    mode: str  # "default", "override", "append", "explicit"
     custom_text: Optional[List[str]] = None
+    tier1: Optional[List[str]] = None
+    tier2: Optional[List[str]] = None
+    tier3: Optional[List[str]] = None
+    tier4_plus: Optional[List[str]] = None
+    terms: Optional[List[str]] = None  # For tier-level explicit config
 
 
 @dataclass
@@ -262,12 +267,45 @@ class JobConfig:
             if "mode" not in anchor_text_data:
                 raise ValueError("'anchor_text_config' must have 'mode' field")
             mode = anchor_text_data["mode"]
-            if mode not in ["default", "override", "append"]:
-                raise ValueError("'anchor_text_config' mode must be 'default', 'override', or 'append'")
+            if mode not in ["default", "override", "append", "explicit"]:
+                raise ValueError("'anchor_text_config' mode must be 'default', 'override', 'append', or 'explicit'")
+
+            # Validate explicit mode requires tier-specific terms
+            if mode == "explicit":
+                has_tier_terms = any(
+                    anchor_text_data.get(tier_key) is not None
+                    for tier_key in ["tier1", "tier2", "tier3", "tier4_plus"]
+                )
+                if not has_tier_terms:
+                    raise ValueError("'anchor_text_config' with mode 'explicit' must have at least one tier-specific term list (tier1, tier2, tier3, or tier4_plus)")
+
             custom_text = anchor_text_data.get("custom_text")
             if custom_text is not None and not isinstance(custom_text, list):
                 raise ValueError("'anchor_text_config' custom_text must be an array")
-            anchor_text_config = AnchorTextConfig(mode=mode, custom_text=custom_text)
+
+            # Parse tier-specific terms for explicit mode
+            tier1_terms = anchor_text_data.get("tier1")
+            tier2_terms = anchor_text_data.get("tier2")
+            tier3_terms = anchor_text_data.get("tier3")
+            tier4_plus_terms = anchor_text_data.get("tier4_plus")
+
+            if tier1_terms is not None and not isinstance(tier1_terms, list):
+                raise ValueError("'anchor_text_config' tier1 must be an array")
+            if tier2_terms is not None and not isinstance(tier2_terms, list):
+                raise ValueError("'anchor_text_config' tier2 must be an array")
+            if tier3_terms is not None and not isinstance(tier3_terms, list):
+                raise ValueError("'anchor_text_config' tier3 must be an array")
+            if tier4_plus_terms is not None and not isinstance(tier4_plus_terms, list):
+                raise ValueError("'anchor_text_config' tier4_plus must be an array")
+
+            anchor_text_config = AnchorTextConfig(
+                mode=mode,
+                custom_text=custom_text,
+                tier1=tier1_terms,
+                tier2=tier2_terms,
+                tier3=tier3_terms,
+                tier4_plus=tier4_plus_terms
+            )
 
             # Parse failure configuration
             failure_config = None
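The job-level hunk above repeats the same type check once per tier key. As a sketch of an equivalent, more compact form, the checks can be driven by a loop over the key names. `TIER_KEYS` and `parse_tier_term_lists` are hypothetical names that do not appear in the commit. One deliberate difference is labeled in the code: this version rejects explicit mode when every list is empty, whereas the committed check only requires that at least one tier key is present and not `None`.

```python
from typing import Any, Dict, List, Optional

TIER_KEYS = ("tier1", "tier2", "tier3", "tier4_plus")


def parse_tier_term_lists(anchor_text_data: Dict[str, Any]) -> Dict[str, Optional[List[str]]]:
    """Validate and collect the job-level tier term lists from anchor_text_config."""
    terms_by_tier: Dict[str, Optional[List[str]]] = {}
    for key in TIER_KEYS:
        value = anchor_text_data.get(key)
        if value is not None and not isinstance(value, list):
            raise ValueError(f"'anchor_text_config' {key} must be an array")
        terms_by_tier[key] = value

    # Assumption (stricter than the diff): an explicit config whose lists are
    # all missing or empty is rejected, since it would provide no terms at all.
    if anchor_text_data.get("mode") == "explicit" and not any(terms_by_tier.values()):
        raise ValueError(
            "'anchor_text_config' with mode 'explicit' must have at least one "
            "tier-specific term list (tier1, tier2, tier3, or tier4_plus)"
        )
    return terms_by_tier
```

The returned dict can be unpacked straight into the dataclass, for example `AnchorTextConfig(mode=mode, custom_text=custom_text, **terms_by_tier)`, because its keys match the new field names.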
@@ -358,12 +396,24 @@ class JobConfig:
             if "mode" not in anchor_text_data:
                 raise ValueError(f"'{tier_name}.anchor_text_config' must have 'mode' field")
             mode = anchor_text_data["mode"]
-            if mode not in ["default", "override", "append"]:
-                raise ValueError(f"'{tier_name}.anchor_text_config' mode must be 'default', 'override', or 'append'")
+            if mode not in ["default", "override", "append", "explicit"]:
+                raise ValueError(f"'{tier_name}.anchor_text_config' mode must be 'default', 'override', 'append', or 'explicit'")
+
+            # Validate explicit mode requires terms
+            if mode == "explicit":
+                terms = anchor_text_data.get("terms")
+                if not terms or not isinstance(terms, list):
+                    raise ValueError(f"'{tier_name}.anchor_text_config' with mode 'explicit' must have 'terms' array")
+
             custom_text = anchor_text_data.get("custom_text")
             if custom_text is not None and not isinstance(custom_text, list):
                 raise ValueError(f"'{tier_name}.anchor_text_config' custom_text must be an array")
-            anchor_text_config = AnchorTextConfig(mode=mode, custom_text=custom_text)
+
+            terms = anchor_text_data.get("terms")
+            if terms is not None and not isinstance(terms, list):
+                raise ValueError(f"'{tier_name}.anchor_text_config' terms must be an array")
+
+            anchor_text_config = AnchorTextConfig(mode=mode, custom_text=custom_text, terms=terms)
 
             # Parse tier-level models if present
             tier_models = None
@ -462,12 +512,24 @@ class JobConfig:
|
||||||
if "mode" not in anchor_text_data:
|
if "mode" not in anchor_text_data:
|
||||||
raise ValueError(f"'{tier_name}.anchor_text_config' must have 'mode' field")
|
raise ValueError(f"'{tier_name}.anchor_text_config' must have 'mode' field")
|
||||||
mode = anchor_text_data["mode"]
|
mode = anchor_text_data["mode"]
|
||||||
if mode not in ["default", "override", "append"]:
|
if mode not in ["default", "override", "append", "explicit"]:
|
||||||
raise ValueError(f"'{tier_name}.anchor_text_config' mode must be 'default', 'override', or 'append'")
|
raise ValueError(f"'{tier_name}.anchor_text_config' mode must be 'default', 'override', 'append', or 'explicit'")
|
||||||
|
|
||||||
|
# Validate explicit mode requires terms
|
||||||
|
if mode == "explicit":
|
||||||
|
terms = anchor_text_data.get("terms")
|
||||||
|
if not terms or not isinstance(terms, list):
|
||||||
|
raise ValueError(f"'{tier_name}.anchor_text_config' with mode 'explicit' must have 'terms' array")
|
||||||
|
|
||||||
custom_text = anchor_text_data.get("custom_text")
|
custom_text = anchor_text_data.get("custom_text")
|
||||||
if custom_text is not None and not isinstance(custom_text, list):
|
if custom_text is not None and not isinstance(custom_text, list):
|
||||||
raise ValueError(f"'{tier_name}.anchor_text_config' custom_text must be an array")
|
raise ValueError(f"'{tier_name}.anchor_text_config' custom_text must be an array")
|
||||||
anchor_text_config = AnchorTextConfig(mode=mode, custom_text=custom_text)
|
|
||||||
|
terms = anchor_text_data.get("terms")
|
||||||
|
if terms is not None and not isinstance(terms, list):
|
||||||
|
raise ValueError(f"'{tier_name}.anchor_text_config' terms must be an array")
|
||||||
|
|
||||||
|
anchor_text_config = AnchorTextConfig(mode=mode, custom_text=custom_text, terms=terms)
|
||||||
|
|
||||||
# Parse image_config if present (same logic as _parse_tier)
|
# Parse image_config if present (same logic as _parse_tier)
|
||||||
image_config = None
|
image_config = None
|
||||||
|
|
|
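The tier-level validation added in the hunk above can be read in isolation as the following minimal sketch. `validate_tier_anchor_text` and `VALID_MODES` are hypothetical names introduced only for illustration; the commit itself performs these checks inline inside `JobConfig`, and only the error messages and the order of checks are taken from the diff.

```python
from typing import Any, Dict, List, Optional

# Modes accepted after this commit ("explicit" is the new one).
VALID_MODES = ("default", "override", "append", "explicit")


def validate_tier_anchor_text(tier_name: str, data: Dict[str, Any]) -> Optional[List[str]]:
    """Hypothetical helper: mirrors the inline tier-level checks and returns 'terms'."""
    if "mode" not in data:
        raise ValueError(f"'{tier_name}.anchor_text_config' must have 'mode' field")

    mode = data["mode"]
    if mode not in VALID_MODES:
        raise ValueError(
            f"'{tier_name}.anchor_text_config' mode must be "
            "'default', 'override', 'append', or 'explicit'"
        )

    terms = data.get("terms")

    # Explicit mode requires a non-empty list of terms.
    if mode == "explicit" and (not terms or not isinstance(terms, list)):
        raise ValueError(
            f"'{tier_name}.anchor_text_config' with mode 'explicit' must have 'terms' array"
        )

    # In any other mode 'terms' is optional, but must be a list when present.
    if terms is not None and not isinstance(terms, list):
        raise ValueError(f"'{tier_name}.anchor_text_config' terms must be an array")

    return terms
```

With this sketch, `{"mode": "explicit"}` is rejected while `{"mode": "explicit", "terms": ["high volume"]}` passes, which matches what `test_explicit_mode_requires_terms_tier_level` and `test_explicit_anchor_text_tier_level` exercise further down.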
||||||
|
|
@ -302,6 +302,27 @@ def _get_anchor_texts_for_tier(
|
||||||
mode = anchor_text_config.get('mode') if isinstance(anchor_text_config, dict) else getattr(anchor_text_config, 'mode', None)
|
mode = anchor_text_config.get('mode') if isinstance(anchor_text_config, dict) else getattr(anchor_text_config, 'mode', None)
|
||||||
custom_text = anchor_text_config.get('custom_text') if isinstance(anchor_text_config, dict) else getattr(anchor_text_config, 'custom_text', None)
|
custom_text = anchor_text_config.get('custom_text') if isinstance(anchor_text_config, dict) else getattr(anchor_text_config, 'custom_text', None)
|
||||||
|
|
||||||
|
# Handle explicit mode - prioritize explicit terms
|
||||||
|
if mode == "explicit":
|
||||||
|
# Check for tier-level explicit terms first
|
||||||
|
if hasattr(anchor_text_config, 'terms') and anchor_text_config.terms:
|
||||||
|
return anchor_text_config.terms
|
||||||
|
|
||||||
|
# Check for job-level tier-specific terms
|
||||||
|
tier_attr = getattr(anchor_text_config, tier, None)
|
||||||
|
if tier_attr:
|
||||||
|
return tier_attr
|
||||||
|
|
||||||
|
# Fallback: check if it's a dict with tier key
|
||||||
|
if isinstance(anchor_text_config, dict):
|
||||||
|
tier_terms = anchor_text_config.get(tier)
|
||||||
|
if tier_terms:
|
||||||
|
return tier_terms
|
||||||
|
|
||||||
|
# If explicit mode but no terms found, fall back to default anchors (shouldn't happen due to validation)
|
||||||
|
logger.warning(f"Explicit mode specified for {tier} but no terms found, falling back to defaults")
|
||||||
|
return default_anchors
|
||||||
|
|
||||||
if mode == "override" and custom_text:
|
if mode == "override" and custom_text:
|
||||||
return custom_text
|
return custom_text
|
||||||
elif mode == "append" and custom_text:
|
elif mode == "append" and custom_text:
|
||||||
|
|
|
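The lookup order in the explicit-mode branch above (tier-level `terms` first, then the job-level list named after the tier, then the defaults) can be sketched as a standalone function. `resolve_explicit_terms` and `_read` are hypothetical names, not part of the commit. As a simplifying assumption, the sketch reads both keys the same way from dicts and from objects, whereas the hunk only checks `terms` via attribute access.

```python
from types import SimpleNamespace
from typing import Any, List, Optional


def _read(config: Any, name: str) -> Optional[List[str]]:
    """Read a field from either a dict-style or an attribute-style config."""
    if isinstance(config, dict):
        return config.get(name)
    return getattr(config, name, None)


def resolve_explicit_terms(config: Any, tier: str, defaults: List[str]) -> List[str]:
    """Hypothetical helper: pick anchor terms for `tier` when mode is 'explicit'."""
    # 1. Tier-level explicit terms, then 2. job-level list keyed by the tier name.
    for name in ("terms", tier):
        value = _read(config, name)
        if value:
            return value
    # 3. Nothing configured: fall back to the default anchors.
    return defaults


# Attribute-style config carrying tier-level terms.
tier_cfg = SimpleNamespace(mode="explicit", terms=["tier level term"])
# Dict-style job-level config with per-tier lists.
job_cfg = {"mode": "explicit", "tier1": ["job level term"], "tier2": ["bulk manufacturing"]}

print(resolve_explicit_terms(tier_cfg, "tier1", ["default"]))
print(resolve_explicit_terms(job_cfg, "tier2", ["default"]))
```

How the caller chooses between the tier-level and the job-level config object is outside this hunk, so the sketch takes a single `config` argument and leaves that choice to the caller.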
||||||
|
|
@ -4,6 +4,7 @@ Tests full flow with database
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
from unittest.mock import Mock
|
||||||
from sqlalchemy import create_engine
|
from sqlalchemy import create_engine
|
||||||
from sqlalchemy.orm import sessionmaker
|
from sqlalchemy.orm import sessionmaker
|
||||||
from src.database.models import Base, User, Project, SiteDeployment, GeneratedContent, ArticleLink
|
from src.database.models import Base, User, Project, SiteDeployment, GeneratedContent, ArticleLink
|
||||||
|
|
@ -345,6 +346,89 @@ class TestAnchorTextConfigOverrides:
|
||||||
db_session.refresh(content)
|
db_session.refresh(content)
|
||||||
assert '<a href=' in content.content
|
assert '<a href=' in content.content
|
||||||
|
|
||||||
|
def test_explicit_mode_job_level(
|
||||||
|
self, db_session, project, site_deployment, content_repo, project_repo, site_repo, link_repo
|
||||||
|
):
|
||||||
|
"""Test explicit anchor text mode with job-level tier-specific terms"""
|
||||||
|
from src.generation.job_config import AnchorTextConfig
|
||||||
|
|
||||||
|
content = content_repo.create(
|
||||||
|
project_id=project.id,
|
||||||
|
tier="tier1",
|
||||||
|
keyword="test",
|
||||||
|
title="Test",
|
||||||
|
outline={},
|
||||||
|
content="<p>Article about high volume production and precision machining services.</p>",
|
||||||
|
word_count=30,
|
||||||
|
status="generated",
|
||||||
|
site_deployment_id=site_deployment.id
|
||||||
|
)
|
||||||
|
|
||||||
|
article_urls = generate_urls_for_batch([content], site_repo)
|
||||||
|
tiered_links = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||||
|
|
||||||
|
job_config = Mock()
|
||||||
|
job_config.anchor_text_config = AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
tier1=["high volume", "precision machining"]
|
||||||
|
)
|
||||||
|
job_config.tiers = {}
|
||||||
|
|
||||||
|
inject_interlinks([content], article_urls, tiered_links, project, job_config, content_repo, link_repo)
|
||||||
|
|
||||||
|
db_session.refresh(content)
|
||||||
|
|
||||||
|
# Should use explicit anchor text
|
||||||
|
assert 'high volume' in content.content or 'precision machining' in content.content
|
||||||
|
assert '<a href=' in content.content
|
||||||
|
|
||||||
|
def test_explicit_mode_tier_level(
|
||||||
|
self, db_session, project, site_deployment, content_repo, project_repo, site_repo, link_repo
|
||||||
|
):
|
||||||
|
"""Test explicit anchor text mode with tier-level terms"""
|
||||||
|
from src.generation.job_config import AnchorTextConfig, TierConfig
|
||||||
|
|
||||||
|
content = content_repo.create(
|
||||||
|
project_id=project.id,
|
||||||
|
tier="tier1",
|
||||||
|
keyword="test",
|
||||||
|
title="Test",
|
||||||
|
outline={},
|
||||||
|
content="<p>Article about custom manufacturing and bulk production.</p>",
|
||||||
|
word_count=30,
|
||||||
|
status="generated",
|
||||||
|
site_deployment_id=site_deployment.id
|
||||||
|
)
|
||||||
|
|
||||||
|
article_urls = generate_urls_for_batch([content], site_repo)
|
||||||
|
tiered_links = find_tiered_links([content], None, project_repo, content_repo, site_repo)
|
||||||
|
|
||||||
|
job_config = Mock()
|
||||||
|
job_config.anchor_text_config = None
|
||||||
|
job_config.tiers = {
|
||||||
|
"tier1": TierConfig(
|
||||||
|
count=12,
|
||||||
|
min_word_count=2000,
|
||||||
|
max_word_count=2500,
|
||||||
|
min_h2_tags=3,
|
||||||
|
max_h2_tags=5,
|
||||||
|
min_h3_tags=5,
|
||||||
|
max_h3_tags=10,
|
||||||
|
anchor_text_config=AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
terms=["custom manufacturing", "bulk production"]
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
|
||||||
|
inject_interlinks([content], article_urls, tiered_links, project, job_config, content_repo, link_repo)
|
||||||
|
|
||||||
|
db_session.refresh(content)
|
||||||
|
|
||||||
|
# Should use tier-level explicit anchor text
|
||||||
|
assert 'custom manufacturing' in content.content or 'bulk production' in content.content
|
||||||
|
assert '<a href=' in content.content
|
||||||
|
|
||||||
|
|
||||||
class TestDifferentBatchSizes:
|
class TestDifferentBatchSizes:
|
||||||
"""Test with various batch sizes"""
|
"""Test with various batch sizes"""
|
||||||
|
|
|
||||||
|
|
@ -213,6 +213,93 @@ class TestGetAnchorTextsForTier:
|
||||||
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
||||||
assert result == ["default"]
|
assert result == ["default"]
|
||||||
|
|
||||||
|
def test_explicit_mode_job_level(self, mock_project):
|
||||||
|
"""Test explicit mode with job-level tier-specific terms"""
|
||||||
|
from src.generation.job_config import AnchorTextConfig
|
||||||
|
|
||||||
|
job_config = Mock()
|
||||||
|
job_config.anchor_text_config = AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
tier1=["high volume", "precision machining"],
|
||||||
|
tier2=["bulk manufacturing"]
|
||||||
|
)
|
||||||
|
job_config.tiers = {}
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
||||||
|
assert result == ["high volume", "precision machining"]
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier2", mock_project, job_config)
|
||||||
|
assert result == ["bulk manufacturing"]
|
||||||
|
|
||||||
|
def test_explicit_mode_tier_level(self, mock_project):
|
||||||
|
"""Test explicit mode with tier-level terms"""
|
||||||
|
from src.generation.job_config import AnchorTextConfig, TierConfig
|
||||||
|
|
||||||
|
job_config = Mock()
|
||||||
|
job_config.anchor_text_config = None
|
||||||
|
job_config.tiers = {
|
||||||
|
"tier1": TierConfig(
|
||||||
|
count=12,
|
||||||
|
min_word_count=2000,
|
||||||
|
max_word_count=2500,
|
||||||
|
min_h2_tags=3,
|
||||||
|
max_h2_tags=5,
|
||||||
|
min_h3_tags=5,
|
||||||
|
max_h3_tags=10,
|
||||||
|
anchor_text_config=AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
terms=["tier level term 1", "tier level term 2"]
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
||||||
|
assert result == ["tier level term 1", "tier level term 2"]
|
||||||
|
|
||||||
|
def test_explicit_mode_tier_overrides_job(self, mock_project):
|
||||||
|
"""Test tier-level explicit config overrides job-level"""
|
||||||
|
from src.generation.job_config import AnchorTextConfig, TierConfig
|
||||||
|
|
||||||
|
job_config = Mock()
|
||||||
|
job_config.anchor_text_config = AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
tier1=["job level term"]
|
||||||
|
)
|
||||||
|
job_config.tiers = {
|
||||||
|
"tier1": TierConfig(
|
||||||
|
count=12,
|
||||||
|
min_word_count=2000,
|
||||||
|
max_word_count=2500,
|
||||||
|
min_h2_tags=3,
|
||||||
|
max_h2_tags=5,
|
||||||
|
min_h3_tags=5,
|
||||||
|
max_h3_tags=10,
|
||||||
|
anchor_text_config=AnchorTextConfig(
|
||||||
|
mode="explicit",
|
||||||
|
terms=["tier level term"]
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
||||||
|
assert result == ["tier level term"]
|
||||||
|
|
||||||
|
def test_explicit_mode_dict_format(self, mock_project):
|
||||||
|
"""Test explicit mode with dict format job config"""
|
||||||
|
job_config = {
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"tier1": ["dict tier1 term"],
|
||||||
|
"tier2": ["dict tier2 term"]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier1", mock_project, job_config)
|
||||||
|
assert result == ["dict tier1 term"]
|
||||||
|
|
||||||
|
result = _get_anchor_texts_for_tier("tier2", mock_project, job_config)
|
||||||
|
assert result == ["dict tier2 term"]
|
||||||
|
|
||||||
|
|
||||||
class TestTryInjectLink:
|
class TestTryInjectLink:
|
||||||
"""Tests for link injection attempts"""
|
"""Tests for link injection attempts"""
|
||||||
|
|
|
||||||
|
|
@ -150,7 +150,7 @@ def test_invalid_job_file_no_jobs_key(temp_job_file):
|
||||||
|
|
||||||
job_file = temp_job_file(data)
|
job_file = temp_job_file(data)
|
||||||
|
|
||||||
with pytest.raises(ValueError, match="must contain 'jobs'"):
|
with pytest.raises(ValueError, match="must contain either 'jobs' array or 'project_id' field"):
|
||||||
JobConfig(job_file)
|
JobConfig(job_file)
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -174,3 +174,161 @@ def test_file_not_found():
|
||||||
"""Test error when file doesn't exist"""
|
"""Test error when file doesn't exist"""
|
||||||
with pytest.raises(FileNotFoundError):
|
with pytest.raises(FileNotFoundError):
|
||||||
JobConfig("nonexistent_file.json")
|
JobConfig("nonexistent_file.json")
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_anchor_text_job_level(temp_job_file):
|
||||||
|
"""Test explicit anchor text configuration at job level"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"tier1": ["high volume", "precision machining"],
|
||||||
|
"tier2": ["bulk manufacturing", "large scale"]
|
||||||
|
},
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {"count": 12},
|
||||||
|
"tier2": {"count": 38}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
config = JobConfig(job_file)
|
||||||
|
|
||||||
|
job = config.get_jobs()[0]
|
||||||
|
assert job.anchor_text_config is not None
|
||||||
|
assert job.anchor_text_config.mode == "explicit"
|
||||||
|
assert job.anchor_text_config.tier1 == ["high volume", "precision machining"]
|
||||||
|
assert job.anchor_text_config.tier2 == ["bulk manufacturing", "large scale"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_anchor_text_tier_level(temp_job_file):
|
||||||
|
"""Test explicit anchor text configuration at tier level"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {
|
||||||
|
"count": 12,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"terms": ["high volume", "precision"]
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"tier2": {"count": 38}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
config = JobConfig(job_file)
|
||||||
|
|
||||||
|
job = config.get_jobs()[0]
|
||||||
|
tier1_config = job.tiers["tier1"]
|
||||||
|
assert tier1_config.anchor_text_config is not None
|
||||||
|
assert tier1_config.anchor_text_config.mode == "explicit"
|
||||||
|
assert tier1_config.anchor_text_config.terms == ["high volume", "precision"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_anchor_text_tier_override_job(temp_job_file):
|
||||||
|
"""Test tier-level explicit config overrides job-level"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"tier1": ["job level term"],
|
||||||
|
"tier2": ["bulk manufacturing"]
|
||||||
|
},
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {
|
||||||
|
"count": 12,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"terms": ["tier level term"]
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"tier2": {"count": 38}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
config = JobConfig(job_file)
|
||||||
|
|
||||||
|
job = config.get_jobs()[0]
|
||||||
|
tier1_config = job.tiers["tier1"]
|
||||||
|
assert tier1_config.anchor_text_config.terms == ["tier level term"]
|
||||||
|
assert job.anchor_text_config.tier1 == ["job level term"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_mode_requires_terms_job_level(temp_job_file):
|
||||||
|
"""Test that explicit mode requires tier-specific terms at job level"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit"
|
||||||
|
},
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {"count": 12}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="must have at least one tier-specific term list"):
|
||||||
|
JobConfig(job_file)
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_mode_requires_terms_tier_level(temp_job_file):
|
||||||
|
"""Test that explicit mode requires terms at tier level"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {
|
||||||
|
"count": 12,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
|
||||||
|
with pytest.raises(ValueError, match="must have 'terms' array"):
|
||||||
|
JobConfig(job_file)
|
||||||
|
|
||||||
|
|
||||||
|
def test_explicit_anchor_text_all_tiers(temp_job_file):
|
||||||
|
"""Test explicit anchor text for all tier levels"""
|
||||||
|
data = {
|
||||||
|
"jobs": [{
|
||||||
|
"project_id": 26,
|
||||||
|
"anchor_text_config": {
|
||||||
|
"mode": "explicit",
|
||||||
|
"tier1": ["tier1 term"],
|
||||||
|
"tier2": ["tier2 term"],
|
||||||
|
"tier3": ["tier3 term"],
|
||||||
|
"tier4_plus": ["tier4 term"]
|
||||||
|
},
|
||||||
|
"tiers": {
|
||||||
|
"tier1": {"count": 12},
|
||||||
|
"tier2": {"count": 38}
|
||||||
|
}
|
||||||
|
}]
|
||||||
|
}
|
||||||
|
|
||||||
|
job_file = temp_job_file(data)
|
||||||
|
config = JobConfig(job_file)
|
||||||
|
|
||||||
|
job = config.get_jobs()[0]
|
||||||
|
assert job.anchor_text_config.tier1 == ["tier1 term"]
|
||||||
|
assert job.anchor_text_config.tier2 == ["tier2 term"]
|
||||||
|
assert job.anchor_text_config.tier3 == ["tier3 term"]
|
||||||
|
assert job.anchor_text_config.tier4_plus == ["tier4 term"]
|
||||||