Data Models

The following data models will be implemented using SQLAlchemy.

1. User

Purpose: Stores user credentials and role information.

Key Attributes:

  • id: Integer, Primary Key
  • username: String, Unique, Not Null
  • hashed_password: String, Not Null
  • role: String, Not Null ("Admin" or "User")

Relationships: A User can have many Projects.

2. Project

Purpose: Represents a single content generation job initiated from a CORA report.

Key Attributes:

  • id: Integer, Primary Key
  • user_id: Integer, Foreign Key to User
  • project_name: String, Not Null
  • cora_data: JSON (stores extracted keywords, entities, etc.)
  • status: String (e.g., "Pending", "Generating", "Complete")

Relationships: A Project belongs to one User and has many GeneratedContent records.
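The User and Project models above could be declared roughly as follows. This is a minimal sketch, not the project's actual code: the table names, the relationship attribute names (`projects`, `user`), and the in-memory SQLite engine are assumptions made for illustration.

```python
from sqlalchemy import JSON, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # table name is an assumption

    id = Column(Integer, primary_key=True)
    username = Column(String, unique=True, nullable=False)
    hashed_password = Column(String, nullable=False)
    role = Column(String, nullable=False)  # "Admin" or "User"

    # One user owns many projects.
    projects = relationship("Project", back_populates="user")


class Project(Base):
    __tablename__ = "projects"  # table name is an assumption

    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))
    project_name = Column(String, nullable=False)
    cora_data = Column(JSON)  # extracted keywords, entities, etc.
    status = Column(String)  # e.g. "Pending", "Generating", "Complete"

    # Each project belongs to exactly one user.
    user = relationship("User", back_populates="projects")


# In-memory SQLite is used only so the sketch runs on its own.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    owner = User(username="alice", hashed_password="<hash>", role="Admin")
    owner.projects.append(
        Project(
            project_name="demo",
            cora_data={"keywords": ["example"]},
            status="Pending",
        )
    )
    session.add(owner)  # the project is saved through the relationship
    session.commit()

    project_count = len(owner.projects)
    stored_cora_data = owner.projects[0].cora_data
    owner_name = owner.projects[0].user.username
```

Using `back_populates` on both sides keeps `user.projects` and `project.user` in sync in memory, which is one common way to express a one-to-many relationship.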

3. GeneratedContent

Purpose: Stores the AI-generated content from the three-stage pipeline.

Key Attributes:

  • id: Integer, Primary Key, Auto-increment
  • project_id: Integer, Foreign Key to Project, Indexed
  • tier: String(20), Not Null, Indexed (tier1, tier2, tier3)
  • keyword: String(255), Not Null, Indexed
  • title: Text, Not Null (Generated in stage 1)
  • outline: JSON, Not Null (Generated in stage 2)
  • content: Text, Not Null (HTML fragment from stage 3)
  • word_count: Integer, Not Null (Validated word count)
  • status: String(20), Not Null (generated, augmented, failed)
  • created_at: DateTime, Not Null
  • updated_at: DateTime, Not Null

Relationships: Belongs to one Project.

Status Values:

  • generated: Content was successfully generated within the word count range
  • augmented: Content fell below the minimum word count and was augmented
  • failed: Generation failed (error details in outline JSON)
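A sketch of how the GeneratedContent attributes and status values might map onto a SQLAlchemy model. Several details are assumptions rather than documented behaviour: the table names, the stand-in `Project` class, the use of UTC timestamps set through `default`/`onupdate`, and the CHECK constraint that restricts `status` to the three listed values.

```python
from datetime import datetime, timezone

from sqlalchemy import (
    JSON, CheckConstraint, Column, DateTime, ForeignKey,
    Integer, String, Text, create_engine,
)
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


def utc_now():
    """Timestamp helper; storing UTC is an assumption."""
    return datetime.now(timezone.utc)


class Project(Base):
    """Minimal stand-in so the foreign key below resolves."""
    __tablename__ = "projects"
    id = Column(Integer, primary_key=True)


class GeneratedContent(Base):
    __tablename__ = "generated_content"  # table name is an assumption
    __table_args__ = (
        # Assumption: enforce the documented status values in the database.
        CheckConstraint(
            "status IN ('generated', 'augmented', 'failed')",
            name="ck_generated_content_status",
        ),
    )

    id = Column(Integer, primary_key=True, autoincrement=True)
    project_id = Column(Integer, ForeignKey("projects.id"), index=True)
    tier = Column(String(20), nullable=False, index=True)  # tier1, tier2, tier3
    keyword = Column(String(255), nullable=False, index=True)
    title = Column(Text, nullable=False)      # stage 1
    outline = Column(JSON, nullable=False)    # stage 2
    content = Column(Text, nullable=False)    # stage 3 HTML fragment
    word_count = Column(Integer, nullable=False)
    status = Column(String(20), nullable=False)
    created_at = Column(DateTime, nullable=False, default=utc_now)
    updated_at = Column(DateTime, nullable=False, default=utc_now, onupdate=utc_now)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

fields = dict(
    tier="tier1",
    keyword="example keyword",
    title="Example title",
    outline={"sections": ["Intro"]},
    content="<p>Example body</p>",
    word_count=2,
)

with Session(engine) as session:
    project = Project()
    session.add(project)
    session.flush()  # assigns project.id

    article = GeneratedContent(project_id=project.id, status="generated", **fields)
    session.add(article)
    session.commit()
    saved_status = article.status
    has_timestamps = article.created_at is not None and article.updated_at is not None

    # A status outside the documented set is rejected by the CHECK constraint.
    session.add(GeneratedContent(project_id=project.id, status="unknown", **fields))
    try:
        session.commit()
        invalid_status_rejected = False
    except IntegrityError:
        session.rollback()
        invalid_status_rejected = True
```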

Purpose: Tracks link relationships between articles for interlinking (tiered links, wheel links, homepage links).

Key Attributes:

  • id: Integer, Primary Key, Auto-increment
  • from_content_id: Integer, Foreign Key to GeneratedContent, Not Null, Indexed
  • to_content_id: Integer, Foreign Key to GeneratedContent, Nullable, Indexed
  • to_url: Text, Nullable (for external links like money site)
  • anchor_text: Text, Nullable (actual anchor text used for the link, added in Story 4.5)
  • link_type: String(20), Not Null, Indexed (tiered, wheel_next, wheel_prev, homepage, wheel_see_also)
  • created_at: DateTime, Not Null

Relationships:

  • Belongs to one GeneratedContent (source)
  • Optionally belongs to another GeneratedContent (target)

Link Types:

  • tiered: Link from a tier N article to a tier N-1 article (or to the money site for tier 1)
  • wheel_next: Link to next article in batch wheel
  • wheel_prev: Link to previous article in batch wheel
  • wheel_see_also: Link in "See Also" section
  • homepage: Link to site homepage

Constraints:

  • Exactly one of to_content_id or to_url must be set (never both, never neither)
  • Unique constraint on (from_content_id, to_content_id, link_type)
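The two constraints above can be expressed at the database level. The sketch below is one possible SQLAlchemy rendering: the class name `ContentLink`, the table names, the constraint names, and the `try_insert` helper are all hypothetical, since the model's name is not given in this section.

```python
from datetime import datetime, timezone

from sqlalchemy import (
    CheckConstraint, Column, DateTime, ForeignKey, Integer,
    String, Text, UniqueConstraint, create_engine,
)
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class GeneratedContent(Base):
    """Minimal stand-in so the foreign keys below resolve."""
    __tablename__ = "generated_content"
    id = Column(Integer, primary_key=True)


class ContentLink(Base):  # hypothetical class name
    __tablename__ = "content_links"  # hypothetical table name
    __table_args__ = (
        # Exactly one target: an internal article or an external URL.
        CheckConstraint(
            "(to_content_id IS NOT NULL AND to_url IS NULL) OR "
            "(to_content_id IS NULL AND to_url IS NOT NULL)",
            name="ck_link_exactly_one_target",
        ),
        UniqueConstraint(
            "from_content_id", "to_content_id", "link_type",
            name="uq_link_source_target_type",
        ),
    )

    id = Column(Integer, primary_key=True, autoincrement=True)
    from_content_id = Column(
        Integer, ForeignKey("generated_content.id"), nullable=False, index=True
    )
    to_content_id = Column(
        Integer, ForeignKey("generated_content.id"), nullable=True, index=True
    )
    to_url = Column(Text, nullable=True)
    anchor_text = Column(Text, nullable=True)
    link_type = Column(String(20), nullable=False, index=True)
    created_at = Column(
        DateTime, nullable=False, default=lambda: datetime.now(timezone.utc)
    )


def try_insert(session, **fields):
    """Return True if the row is accepted, False if a constraint rejects it."""
    session.add(ContentLink(**fields))
    try:
        session.commit()
        return True
    except IntegrityError:
        session.rollback()
        return False


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([GeneratedContent(id=1), GeneratedContent(id=2)])
    session.commit()

    # Tier 2 article links to a tier 1 article.
    internal_ok = try_insert(
        session, from_content_id=2, to_content_id=1, link_type="tiered"
    )
    # Tier 1 article links to an external URL.
    external_ok = try_insert(
        session, from_content_id=1, to_url="https://example.com/", link_type="tiered"
    )
    # Both targets set: violates the CHECK constraint.
    both_targets_ok = try_insert(
        session, from_content_id=2, to_content_id=1,
        to_url="https://example.com/", link_type="homepage",
    )
    # No target set: also violates the CHECK constraint.
    no_target_ok = try_insert(session, from_content_id=2, link_type="homepage")
    # Same (source, target, type) as the first row: violates the unique constraint.
    duplicate_ok = try_insert(
        session, from_content_id=2, to_content_id=1, link_type="tiered"
    )
```

One design point worth keeping in mind: SQLite and PostgreSQL (by default) treat NULLs as distinct in unique constraints, so the unique constraint on (from_content_id, to_content_id, link_type) does not deduplicate external links, where to_content_id is NULL.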

5. FqdnMapping

Purpose: Maps cloud storage buckets to fully qualified domain names for URL generation.

Key Attributes:

  • id: Integer, Primary Key
  • bucket_name: String, Not Null
  • provider: String, Not Null (e.g., "aws", "bunny", "azure")
  • fqdn: String, Not Null

Relationships: None.
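To illustrate how such a mapping might feed URL generation, here is a small standard-library sketch. The `build_public_url` function, the lookup by (bucket_name, provider), the `https` scheme, and the object-key argument are assumptions for illustration and are not described in this document.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FqdnMapping:
    """Plain-Python mirror of the FqdnMapping attributes."""
    bucket_name: str
    provider: str
    fqdn: str


def build_public_url(
    mappings: Iterable[FqdnMapping],
    bucket_name: str,
    provider: str,
    object_key: str,
) -> Optional[str]:
    """Return the public URL for an object, or None if no mapping matches.

    Hypothetical helper: assumes HTTPS and that the object key is the URL path.
    """
    for mapping in mappings:
        if mapping.bucket_name == bucket_name and mapping.provider == provider:
            return f"https://{mapping.fqdn}/{object_key.lstrip('/')}"
    return None


mappings = [
    FqdnMapping(bucket_name="site-a-bucket", provider="aws", fqdn="www.example.com"),
    FqdnMapping(bucket_name="site-b-zone", provider="bunny", fqdn="blog.example.org"),
]

url = build_public_url(mappings, "site-a-bucket", "aws", "/articles/post.html")
missing = build_public_url(mappings, "unknown-bucket", "azure", "index.html")
```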