# Data Models

The following data models will be implemented using SQLAlchemy.

## 1. User

**Purpose**: Stores user credentials and role information.

**Key Attributes**:
- `id`: Integer, Primary Key
- `username`: String, Unique, Not Null
- `hashed_password`: String, Not Null
- `role`: String, Not Null ("Admin" or "User")

**Relationships**: A User can have many Projects.

## 2. Project

**Purpose**: Represents a single content generation job initiated from a CORA report.

**Key Attributes**:
- `id`: Integer, Primary Key
- `user_id`: Integer, Foreign Key to User
- `project_name`: String, Not Null
- `cora_data`: JSON (stores extracted keywords, entities, etc.)
- `status`: String (e.g., "Pending", "Generating", "Complete")

**Relationships**: A Project belongs to one User and has many GeneratedContent records.
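The one-to-many relationship between User and Project can be sketched as follows. This is a minimal illustration that uses the standard-library `sqlite3` module rather than SQLAlchemy, so that it runs without extra dependencies. The table names (`users`, `projects`), the helper function names, the `CHECK` on `role`, and the choice to store `cora_data` as serialized JSON text are all assumptions made for the example, not part of the specification above.

```python
import json
import sqlite3

# Hypothetical table names; the specification names only the models.
USER_PROJECT_SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL,
    -- Assumption: the two allowed roles are enforced with a CHECK constraint.
    role TEXT NOT NULL CHECK (role IN ('Admin', 'User'))
);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    project_name TEXT NOT NULL,
    cora_data TEXT,  -- JSON serialized as text (assumption for SQLite)
    status TEXT
);
"""


def make_user_project_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(USER_PROJECT_SCHEMA)
    return conn


def add_user(conn, username, hashed_password, role):
    """Insert a user and return its id."""
    cur = conn.execute(
        "INSERT INTO users (username, hashed_password, role) VALUES (?, ?, ?)",
        (username, hashed_password, role),
    )
    return cur.lastrowid


def add_project(conn, user_id, project_name, cora_data):
    """Insert a project owned by user_id, starting in the 'Pending' status."""
    cur = conn.execute(
        "INSERT INTO projects (user_id, project_name, cora_data, status) "
        "VALUES (?, ?, ?, 'Pending')",
        (user_id, project_name, json.dumps(cora_data)),
    )
    return cur.lastrowid


def project_names_for(conn, user_id):
    """Return the names of all projects that belong to one user."""
    rows = conn.execute(
        "SELECT project_name FROM projects WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

In SQLAlchemy, the same shape would typically be expressed with a `ForeignKey` column on the Project model and a `relationship()` on each side.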

## 3. GeneratedContent

**Purpose**: Stores the AI-generated content from the three-stage pipeline.

**Key Attributes**:
- `id`: Integer, Primary Key, Auto-increment
- `project_id`: Integer, Foreign Key to Project, Indexed
- `tier`: String(20), Not Null, Indexed (one of `tier1`, `tier2`, `tier3`)
- `keyword`: String(255), Not Null, Indexed
- `title`: Text, Not Null (Generated in stage 1)
- `outline`: JSON, Not Null (Generated in stage 2)
- `content`: Text, Not Null (HTML fragment from stage 3)
- `word_count`: Integer, Not Null (Validated word count)
- `status`: String(20), Not Null (one of `generated`, `augmented`, `failed`)
- `created_at`: DateTime, Not Null
- `updated_at`: DateTime, Not Null

**Relationships**: Belongs to one Project.

**Status Values**:
- `generated`: Content was successfully generated within word count range
- `augmented`: Content was below minimum and was augmented
- `failed`: Generation failed (error details are stored in the `outline` JSON)
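The status values above could be derived from the validated word count roughly as shown here. This is a sketch under stated assumptions: the names `ContentStatus` and `resolve_status`, the `min_words`/`max_words` parameters, and the `augmented` and `error` flags are hypothetical. The specification does not say what happens when the final count is still outside the range, so the sketch treats that case as `failed`.

```python
from enum import Enum
from typing import Optional


class ContentStatus(str, Enum):
    """The three status strings stored in GeneratedContent.status."""

    GENERATED = "generated"
    AUGMENTED = "augmented"
    FAILED = "failed"


def resolve_status(
    word_count: int,
    min_words: int,
    max_words: int,
    augmented: bool = False,
    error: Optional[str] = None,
) -> ContentStatus:
    """Pick a status from the final word count and pipeline flags.

    word_count is the count after any augmentation step has run.
    """
    if error is not None:
        return ContentStatus.FAILED
    if not (min_words <= word_count <= max_words):
        # Assumption: content that is still out of range counts as failed.
        return ContentStatus.FAILED
    return ContentStatus.AUGMENTED if augmented else ContentStatus.GENERATED
```

Because `ContentStatus` subclasses `str`, its `.value` can be written directly to the `status` column.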

## 4. ArticleLink

**Purpose**: Tracks link relationships between articles for interlinking (tiered links, wheel links, homepage links).

**Key Attributes**:
- `id`: Integer, Primary Key, Auto-increment
- `from_content_id`: Integer, Foreign Key to GeneratedContent, Not Null, Indexed
- `to_content_id`: Integer, Foreign Key to GeneratedContent, Nullable, Indexed
- `to_url`: Text, Nullable (for external links such as the money site)
- `anchor_text`: Text, Nullable (actual anchor text used for the link, added in Story 4.5)
- `link_type`: String(20), Not Null, Indexed (one of `tiered`, `wheel_next`, `wheel_prev`, `wheel_see_also`, `homepage`)
- `created_at`: DateTime, Not Null

**Relationships**:
- Belongs to one GeneratedContent (source)
- Optionally belongs to another GeneratedContent (target)

**Link Types**:
- `tiered`: Link from a tier N article to a tier N-1 article (or to the money site for tier 1)
- `wheel_next`: Link to next article in batch wheel
- `wheel_prev`: Link to previous article in batch wheel
- `wheel_see_also`: Link in "See Also" section
- `homepage`: Link to site homepage
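The `tiered` rule can be illustrated with a small helper that decides where a tiered link should point. This is a sketch only: the function names, the dictionary shape of the returned row, and the `money_site_url` parameter are assumptions, and the tier labels are parsed on the assumption that they follow the `tierN` pattern listed for GeneratedContent.

```python
import re
from typing import Optional


def tiered_target_tier(tier: str) -> Optional[str]:
    """Return the tier a `tiered` link points to, or None for the money site."""
    match = re.fullmatch(r"tier(\d+)", tier)
    if match is None or int(match.group(1)) < 1:
        raise ValueError(f"unrecognized tier label: {tier!r}")
    n = int(match.group(1))
    return None if n == 1 else f"tier{n - 1}"


def build_tiered_link(
    from_content_id: int,
    from_tier: str,
    target_content_id: Optional[int] = None,
    money_site_url: Optional[str] = None,
) -> dict:
    """Build the column values for one `tiered` ArticleLink row."""
    row = {
        "from_content_id": from_content_id,
        "to_content_id": None,
        "to_url": None,
        "link_type": "tiered",
    }
    if tiered_target_tier(from_tier) is None:
        # Tier 1 links out to the money site, so only to_url is set.
        if money_site_url is None:
            raise ValueError("tier1 links need a money site URL")
        row["to_url"] = money_site_url
    else:
        # Higher tiers link to an article one tier down.
        if target_content_id is None:
            raise ValueError("tier N > 1 links need a target content id")
        row["to_content_id"] = target_content_id
    return row
```

For example, `build_tiered_link(9, "tier2", target_content_id=5)` fills `to_content_id` and leaves `to_url` empty, while a `tier1` source fills `to_url` instead.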

**Constraints**:
- Exactly one of `to_content_id` or `to_url` must be set (never both, never neither)
- Unique constraint on (`from_content_id`, `to_content_id`, `link_type`)
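Both constraints can be enforced at the database level. The sketch below shows one way to do it, again using the standard-library `sqlite3` module instead of SQLAlchemy to stay dependency-free. The table name `article_links`, the helper names, and the `CURRENT_TIMESTAMP` default are assumptions, and the foreign keys to GeneratedContent are omitted to keep the example self-contained.

```python
import sqlite3

# Hypothetical table name; foreign keys to GeneratedContent are left out here.
ARTICLE_LINK_SCHEMA = """
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_content_id INTEGER NOT NULL,
    to_content_id INTEGER,
    to_url TEXT,
    anchor_text TEXT,
    link_type TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    -- Exactly one target: the two IS NULL results must differ.
    CHECK ((to_content_id IS NULL) <> (to_url IS NULL)),
    UNIQUE (from_content_id, to_content_id, link_type)
);
"""


def make_links_db():
    """Create an in-memory database holding only the article_links table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ARTICLE_LINK_SCHEMA)
    return conn


def insert_link(conn, from_content_id, link_type, to_content_id=None, to_url=None):
    """Insert one link row; sqlite3.IntegrityError signals a violated constraint."""
    cur = conn.execute(
        "INSERT INTO article_links "
        "(from_content_id, to_content_id, to_url, link_type) VALUES (?, ?, ?, ?)",
        (from_content_id, to_content_id, to_url, link_type),
    )
    return cur.lastrowid
```

In SQLAlchemy these would usually be declared with `CheckConstraint` and `UniqueConstraint` in the model's `__table_args__`. One caveat applies in SQLite and most other SQL databases: `NULL` values are treated as distinct inside a unique constraint, so this constraint does not deduplicate external links, where `to_content_id` is `NULL`.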

## 5. FqdnMapping

**Purpose**: Maps cloud storage buckets to fully qualified domain names for URL generation.

**Key Attributes**:
- `id`: Integer, Primary Key
- `bucket_name`: String, Not Null
- `provider`: String, Not Null (e.g., "aws", "bunny", "azure")
- `fqdn`: String, Not Null

**Relationships**: None.
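URL generation from a mapping could look something like the following. This is an illustrative sketch, not the project's implementation: the function name `build_public_url`, the in-memory dictionary standing in for FqdnMapping rows, the `(provider, bucket_name)` lookup key, and the `https` scheme are all assumptions.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for FqdnMapping rows: (provider, bucket_name) -> fqdn.
FqdnLookup = Dict[Tuple[str, str], str]


def build_public_url(
    mappings: FqdnLookup, provider: str, bucket_name: str, object_path: str
) -> str:
    """Resolve a stored object to a public URL on its mapped domain."""
    key = (provider, bucket_name)
    if key not in mappings:
        raise LookupError(f"no FQDN mapping for {provider}/{bucket_name}")
    # Assumption: content is served over HTTPS from the root of the FQDN.
    return f"https://{mappings[key]}/{object_path.lstrip('/')}"
```

For instance, with a mapping of `("aws", "my-bucket")` to `www.example.com`, the path `/posts/a.html` resolves to `https://www.example.com/posts/a.html`.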