Deploy-Batch Analysis for test_shaft_machining.json

Quick Answers to Your Questions

1. What should the anchor text be at each level?

Tier 1 Articles (5 articles):

  • Money Site Links: Uses main_keyword variations from project

    • "shaft machining"
    • "learn about shaft machining"
    • "shaft machining guide"
    • "best shaft machining"
    • "shaft machining tips"
    • The system searches the article content for these phrases and uses the first one that matches
  • Home Link: Now in navigation menu (not injected into content)

  • See Also Links: Uses article titles as anchor text
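
The first-match rule described above can be sketched as follows. This is a minimal illustration, not the project's actual matcher: the function name `pick_anchor_text`, the case-insensitive whole-phrase matching, and the assumption that candidates are tried in the order listed are all hypothetical.

```python
import re
from typing import Optional, Sequence

# Variations listed above, in the order shown (the real order is an assumption).
VARIATIONS = [
    "shaft machining",
    "learn about shaft machining",
    "shaft machining guide",
    "best shaft machining",
    "shaft machining tips",
]


def pick_anchor_text(content: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate phrase found in the content, or None."""
    for phrase in candidates:
        # Case-insensitive, whole-phrase match (hypothetical matching rule).
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, content, flags=re.IGNORECASE):
            return phrase
    return None
```

If the candidates really are tried in this order, the bare keyword wins whenever any variation is present, because every other variation contains it.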

Tier 2 Articles (20 articles):

  • Lower Tier Links: Uses related_searches from CORA data

    • Depends on what related searches were in the shaft_machining.xlsx file
    • If no related searches exist, falls back to main_keyword variations
  • Home Link: Now in navigation menu (not injected into content)

  • See Also Links: Uses article titles as anchor text
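
The tier 2 fallback can be pictured like this. It is a sketch under stated assumptions: `tier2_anchor_candidates` is a hypothetical helper, and the variation templates are inferred from the five tier 1 examples listed earlier rather than taken from the code.

```python
from typing import List, Optional, Sequence


def tier2_anchor_candidates(
    related_searches: Optional[Sequence[str]], main_keyword: str
) -> List[str]:
    """Prefer CORA related searches; fall back to main_keyword variations."""
    cleaned = [s.strip() for s in (related_searches or []) if s and s.strip()]
    if cleaned:
        return cleaned
    # Fallback templates inferred from the tier 1 examples (assumption).
    return [
        main_keyword,
        f"learn about {main_keyword}",
        f"{main_keyword} guide",
        f"best {main_keyword}",
        f"{main_keyword} tips",
    ]
```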

Configuration:

  • Anchor text rules come from interlinking.tier_anchor_text_rules in master.config.json
  • Can be overridden in job config with anchor_text_config

2. How many links should be in each article?

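Tier 1 Articles:

  • 1 link to the money site
  • 4 "See Also" links (to the other 4 tier1 articles)
  • Total: 5 links per tier1 article (plus Home in nav menu)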

Tier 2 Articles:

  • 2-4 links to tier1 articles (random selection; the count falls between interlinking.links_per_article_min and interlinking.links_per_article_max)
  • 19 "See Also" links (to the other 19 tier2 articles)
  • Total: 21-23 links per tier2 article (plus Home in nav menu)
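
The totals are simple arithmetic: each article gets one "See Also" link per sibling in its tier, plus its tiered links. A small sketch (the function name is hypothetical):

```python
from typing import Tuple


def link_total_range(tier_count: int, tiered_min: int, tiered_max: int) -> Tuple[int, int]:
    """Return (min, max) links per article: tiered links plus See Also links."""
    see_also = tier_count - 1  # every other article in the same tier
    return tiered_min + see_also, tiered_max + see_also


# Tier 2: 20 articles, 2-4 tiered links each.
tier2_range = link_total_range(20, 2, 4)
# Tier 1: 5 articles, exactly 1 money site link each.
tier1_range = link_total_range(5, 1, 1)
```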

Your JSON Configuration:

"interlinking": {
    "links_per_article_min": 2,
    "links_per_article_max": 4
}

This controls the tiered links (tier2 → tier1). Each tier2 article links to between 2 and 4 randomly chosen tier1 articles.
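
One plausible way to implement this selection is shown below. It is a sketch, not the project's code: `choose_tier1_targets` is a hypothetical name, and clamping the count to the number of available tier1 URLs is an assumption.

```python
import random
from typing import List, Optional, Sequence


def choose_tier1_targets(
    tier1_urls: Sequence[str],
    links_min: int,
    links_max: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick a random number of distinct tier1 URLs within [links_min, links_max]."""
    if rng is None:
        rng = random.Random()
    # Never ask for more targets than exist (assumed behavior).
    upper = min(links_max, len(tier1_urls))
    lower = min(links_min, upper)
    count = rng.randint(lower, upper)
    return rng.sample(list(tier1_urls), count)
```

With 5 tier1 articles and a 2-4 range, every tier2 article skips at least one tier1 article.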

3. Is Home a link?

Yes. Home is a link in the navigation menu at the top of every page.

How it works:

  • The HTML template (basic.html) includes a <nav> menu with Home link
  • Template line 113: <li><a href="/index.html">Home</a></li>
  • This is part of the template wrapper, not injected into article content

Old behavior (now removed):

  • Previously, the system searched article content for "Home" and tried to link it
  • This was redundant because Home is already in the nav menu
  • The code has been updated to remove this injection

Step-by-Step: What Happens During deploy-batch

Step 1: Load Articles from Database

- Project 1 has generated content already
- Tier 1: 5 articles
- Tier 2: 20 articles
- Each article has: title, content (HTML), site_deployment_id

Step 2: URL Generation (already done during generate-batch)

Tier 1 URLs (round-robin between getcnc.info and www.textbullseye.com):
- Article 0: https://getcnc.info/{slug}.html
- Article 1: https://www.textbullseye.com/{slug}.html
- Article 2: https://getcnc.info/{slug}.html
- Article 3: https://www.textbullseye.com/{slug}.html
- Article 4: https://getcnc.info/{slug}.html

Tier 2 URLs (round-robin):
- Articles 0-19 distributed across both domains
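
The round-robin assignment shown above amounts to indexing the domain list modulo its length. A minimal sketch, with a hypothetical function name and made-up slugs:

```python
from typing import List, Sequence

BASE_URLS = ["https://getcnc.info", "https://www.textbullseye.com"]


def assign_urls(slugs: Sequence[str], base_urls: Sequence[str]) -> List[str]:
    """Assign each article slug to a domain in round-robin order."""
    return [
        f"{base_urls[index % len(base_urls)]}/{slug}.html"
        for index, slug in enumerate(slugs)
    ]
```

For five tier 1 articles this puts articles 0, 2, and 4 on getcnc.info and articles 1 and 3 on www.textbullseye.com, matching the list above.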

Step 3: Tiered Links (already injected)

For Tier 1:

  • Target: Money site URL from project database
  • Anchor text: main_keyword variations
  • Links already in generated_content.content HTML

For Tier 2:

  • Target: Random selection of tier1 URLs (2-4 per article)
  • Anchor text: related_searches from project
  • Links already in HTML

Step 4: Home Link

  • Home link is in the navigation menu (template)
  • No longer injected into article content

Step 5: See Also Section (already injected)

  • HTML section with links to other articles in same tier
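
A See Also section of this kind could be generated roughly as follows. The markup (a `see-also` section with an `h3` and a list) and the article dictionary keys are assumptions for illustration; the real template may differ.

```python
from html import escape
from typing import Dict, List


def build_see_also(current_id: int, tier_articles: List[Dict]) -> str:
    """Build an HTML list linking to every other article in the same tier."""
    items = [
        f'<li><a href="{escape(article["url"], quote=True)}">'
        f'{escape(article["title"])}</a></li>'
        for article in tier_articles
        if article["id"] != current_id  # an article never links to itself
    ]
    return (
        '<section class="see-also"><h3>See Also</h3><ul>'
        + "".join(items)
        + "</ul></section>"
    )
```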

Step 6: Template Application (already done)

  • HTML wrapped in template from src/templating/templates/basic.html
  • Navigation menu added
  • Stored in generated_content.formatted_html
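
The wrapping step can be illustrated with Python's `string.Template`. This is only a sketch: the page skeleton here is a stand-in for basic.html, whose real contents are not shown in this document, and `apply_template` is a hypothetical name. Only the Home link line is taken from the template as quoted earlier.

```python
from html import escape
from string import Template

# Stand-in for basic.html (assumed structure, not the real template).
PAGE = Template(
    """<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
<nav><ul><li><a href="/index.html">Home</a></li></ul></nav>
<main>$content</main>
</body>
</html>"""
)


def apply_template(title: str, content_html: str) -> str:
    """Wrap already-linked article HTML in the page template."""
    return PAGE.substitute(title=escape(title), content=content_html)
```

Because the Home link lives in the wrapper, the article content itself needs no Home anchor.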

Step 7: Upload to Bunny.net

For each article:
  1. Get site deployment credentials
  2. Upload formatted_html to storage zone
  3. File path: /{slug}.html
  4. Log URL to deployment_logs/
  5. Update database: deployed_url, status='deployed'

For each site's boilerplate pages:
  1. Upload index.html (if exists)
  2. Upload about.html
  3. Upload contact.html
  4. Upload privacy.html
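
As a rough picture of the per-article upload, the sketch below builds (but does not send) a request in the shape used by the Bunny.net Edge Storage HTTP API: a PUT to `/{storage zone}/{path}` with the zone password in an `AccessKey` header. The function name, parameter names, and default hostname are assumptions; other storage regions use a different hostname, and the project's real uploader may work differently.

```python
from urllib.parse import quote
from urllib.request import Request


def build_upload_request(
    storage_zone: str,
    access_key: str,
    slug: str,
    formatted_html: str,
    host: str = "storage.bunnycdn.com",  # assumed default region endpoint
) -> Request:
    """Build a PUT request that uploads formatted_html to /{slug}.html."""
    url = f"https://{host}/{quote(storage_zone)}/{quote(slug)}.html"
    return Request(
        url,
        data=formatted_html.encode("utf-8"),
        method="PUT",
        headers={
            "AccessKey": access_key,
            "Content-Type": "application/octet-stream",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the upload; logging the URL and updating the database would follow a successful response.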

All links are tracked in the article_links table:

Tier 1 Article Example (ID: 43):

| from_content_id | to_content_id | to_url | anchor_text | link_type |
|-----------------|---------------|--------|-------------|-----------|
| 43 | NULL | https://fzemanufacturing.com/... | "shaft machining" | tiered |
| 43 | 44 | NULL | "Understanding CNC..." | wheel_see_also |
| 43 | 45 | NULL | "Advanced Shaft..." | wheel_see_also |
| 43 | 46 | NULL | "Precision Machining..." | wheel_see_also |
| 43 | 47 | NULL | "Modern Shaft..." | wheel_see_also |

Tier 2 Article Example (ID: 48):

| from_content_id | to_content_id | to_url | anchor_text | link_type |
|-----------------|---------------|--------|-------------|-----------|
| 48 | NULL | https://getcnc.info/{slug1}.html | "cnc machining services" | tiered |
| 48 | NULL | https://www.textbullseye.com/{slug2}.html | "precision shaft work" | tiered |
| 48 | NULL | https://getcnc.info/{slug3}.html | "shaft turning operations" | tiered |
| 48 | 49 | NULL | "Tier 2 Article 2 Title" | wheel_see_also |
| ... | ... | ... | ... | ... |
| 48 | 67 | NULL | "Tier 2 Article 20 Title" | wheel_see_also |

Note: Home link is no longer tracked in the database since it's in the template, not injected into content.
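
To make the table layout concrete, the sketch below loads the tier 1 example rows into an in-memory SQLite table and counts links by type. The column names come from the tables above; the column types and the use of SQLite are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE article_links (
        from_content_id INTEGER NOT NULL,
        to_content_id   INTEGER,          -- NULL when the target is a raw URL
        to_url          TEXT,             -- NULL when the target is an article ID
        anchor_text     TEXT,
        link_type       TEXT NOT NULL
    )
    """
)

# Rows copied from the tier 1 example (article 43).
rows = [
    (43, None, "https://fzemanufacturing.com/...", "shaft machining", "tiered"),
    (43, 44, None, "Understanding CNC...", "wheel_see_also"),
    (43, 45, None, "Advanced Shaft...", "wheel_see_also"),
    (43, 46, None, "Precision Machining...", "wheel_see_also"),
    (43, 47, None, "Modern Shaft...", "wheel_see_also"),
]
conn.executemany("INSERT INTO article_links VALUES (?, ?, ?, ?, ?)", rows)

# Count outgoing links per type for article 43.
counts = dict(
    conn.execute(
        "SELECT link_type, COUNT(*) FROM article_links "
        "WHERE from_content_id = ? GROUP BY link_type",
        (43,),
    )
)
```

The same query run for a tier 2 article should show 2-4 tiered rows and 19 wheel_see_also rows.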

Your Specific JSON File Analysis

{
  "jobs": [
    {
      "project_id": 1,
      "deployment_targets": [
        "getcnc.info",
        "www.textbullseye.com"
      ],
      "tiers": {
        "tier1": {
          "count": 5,
          "min_word_count": 1500,
          "max_word_count": 2000,
          "models": {
            "title": "openai/gpt-4o-mini",
            "outline": "openai/gpt-4o-mini",
            "content": "anthropic/claude-3.5-sonnet"
          }
        },
        "tier2": {
          "count": 20,
          "models": {
            "title": "openai/gpt-4o-mini",
            "outline": "openai/gpt-4o-mini",
            "content": "openai/gpt-4o-mini"
          },
          "interlinking": {
            "links_per_article_min": 2,
            "links_per_article_max": 4
          }
        }
      }
    }
  ]
}

What This Configuration Does:

  1. Tier 1 (5 articles):

    • Uses Claude Sonnet for content, GPT-4o-mini for titles/outlines
    • 1500-2000 words per article
    • Distributed across getcnc.info and www.textbullseye.com
    • Each links to: money site (1) + See Also (4) = 5 total links (plus Home in nav menu)
  2. Tier 2 (20 articles):

    • Uses GPT-4o-mini for everything (cheaper)
    • Default word count (1100-1500)
    • Each links to: 2-4 tier1 articles + See Also (19) = 21-23 total links (plus Home in nav menu)
    • Distributed across both domains
  3. Missing Configurations (using defaults):

    • tier1.interlinking: Not specified → uses defaults (but tier1 always gets 1 money site link anyway)
    • anchor_text_config: Not specified → uses master.config.json rules

All JSON Fields That Affect Behavior

See MASTER_JSON.json for the complete reference. Key fields:

Top-level job fields:

  • project_id - Which project's data to use
  • deployment_targets - Which domains to deploy to
  • models - Which AI models to use
  • tiered_link_count_range - How many tiered links (job-level default)
  • anchor_text_config - Override anchor text generation
  • interlinking - Job-level interlinking defaults

Tier-level fields:

  • count - Number of articles
  • min_word_count, max_word_count - Content length
  • min_h2_tags, max_h2_tags, min_h3_tags, max_h3_tags - Outline structure
  • models - Tier-specific model overrides
  • interlinking - Tier-specific interlinking overrides
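
The layering of job-level defaults and tier-level overrides can be sketched as a simple dictionary merge. The function name and the values in `DEFAULTS` are hypothetical; the real defaults are not stated in this document.

```python
from typing import Any, Dict

# Hypothetical system defaults, for illustration only.
DEFAULTS = {"links_per_article_min": 1, "links_per_article_max": 3}

# Trimmed version of the job analyzed above.
JOB = {
    "project_id": 1,
    "tiers": {
        "tier1": {"count": 5},
        "tier2": {
            "count": 20,
            "interlinking": {"links_per_article_min": 2, "links_per_article_max": 4},
        },
    },
}


def resolve_interlinking(
    job: Dict[str, Any], tier_name: str, defaults: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge settings: system defaults, then job-level, then tier-level."""
    resolved = dict(defaults)
    resolved.update(job.get("interlinking", {}))
    resolved.update(job.get("tiers", {}).get(tier_name, {}).get("interlinking", {}))
    return resolved
```

Under this merge, tier2 resolves to the 2-4 range from the job file, while tier1 falls through to whatever the defaults are.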

Fields in master.config.json:

  • interlinking.tier_anchor_text_rules - Defines anchor text sources per tier
  • interlinking.include_home_link - Global default for Home links
  • interlinking.wheel_links - Enable/disable See Also sections

Fields in project database:

  • main_keyword - Used for tier1 anchor text
  • related_searches - Used for tier2 anchor text
  • entities - Used for tier3+ anchor text
  • money_site_url - Destination for tier1 links
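
The mapping from tier to anchor text source listed above can be summarized in a few lines. This is an illustrative sketch: the function name, the integer tier argument, and the sample project values (other than the main keyword and the related searches that appear in the example rows earlier) are assumptions.

```python
from typing import Any, Dict, List


def anchor_source_for_tier(project: Dict[str, Any], tier: int) -> List[str]:
    """Return the project field used for anchor text at the given tier."""
    if tier == 1:
        return [project["main_keyword"]]
    if tier == 2:
        return list(project.get("related_searches", []))
    return list(project.get("entities", []))  # tier 3 and deeper


# Sample project record; the entities value is made up for illustration.
PROJECT = {
    "main_keyword": "shaft machining",
    "related_searches": ["cnc machining services", "precision shaft work"],
    "entities": ["lathe"],
}
```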