# Auto-Ingestion System Documentation

## Table of Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Implementation Plan](#implementation-plan)
4. [Components](#components)
5. [Setup Instructions](#setup-instructions)
6. [Testing](#testing)
7. [Monitoring](#monitoring)
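8. [Error Handling](#error-handling)
9. [Performance Considerations](#performance-considerations)
10. [Multi-Tenant Safety](#multi-tenant-safety)
11. [Future Enhancements](#future-enhancements)
12. [Troubleshooting](#troubleshooting)
13. [Reference Commands](#reference-commands)
14. [File Reference](#file-reference)
15. [Conclusion](#conclusion)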

---

## Overview

This document describes the automatic ingestion system for emails and reviews in the ReplyPilot application.

### What We're Building
- **Email Auto-Ingestion**: Fetch emails from Gmail every 15 minutes automatically
- **Review Auto-Ingestion**: Fetch reviews from Google Business Profile every 15 minutes automatically

### Key Requirements
- ✅ Run every 15 minutes
- ✅ Background processing (non-blocking)
- ✅ Multi-tenant safe (each user's data isolated)
- ✅ Automatic retries on failure
- ✅ No filtering (basic implementation first)
- ✅ Reuse existing ingestion logic
- ✅ Full logging for debugging

---

## Architecture

### High-Level Flow

```
┌─────────────────────────────────────────────────────────────┐
│                    LARAVEL SCHEDULER                         │
│                  (runs every minute via cron)                │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ├─── Every 15 minutes ───┐
                       │                         │
                       ▼                         ▼
        ┌──────────────────────┐   ┌──────────────────────┐
        │  emails:ingest-all   │   │ reviews:ingest-all   │
        │      COMMAND         │   │      COMMAND         │
        └──────────┬───────────┘   └──────────┬───────────┘
                   │                           │
                   │ Query all                 │ Query all
                   │ MailAccounts              │ Locations
                   │                           │
                   ▼                           ▼
        ┌──────────────────────┐   ┌──────────────────────┐
        │  Dispatch Jobs       │   │  Dispatch Jobs       │
        │  to Queue            │   │  to Queue            │
        └──────────┬───────────┘   └──────────┬───────────┘
                   │                           │
                   └───────────┬───────────────┘
                               │
                               ▼
                    ┌─────────────────────┐
                    │   DATABASE QUEUE    │
                    │   (jobs table)      │
                    └──────────┬──────────┘
                               │
                               ▼
                    ┌─────────────────────┐
                    │   QUEUE WORKER      │
                    │   (always running)  │
                    └──────────┬──────────┘
                               │
                ┌──────────────┴──────────────┐
                │                             │
                ▼                             ▼
     ┌────────────────────┐      ┌────────────────────┐
     │ IngestEmailsJob    │      │ IngestReviewsJob   │
     │                    │      │                    │
     │ - Connect to Gmail │      │ - Connect to GBP   │
     │ - Fetch emails     │      │ - Fetch reviews    │
     │ - Create threads   │      │ - Save to DB       │
     │ - Generate drafts  │      │ - Log results      │
     └────────────────────┘      └────────────────────┘
```

---

## Implementation Plan

### Phase 1: Setup Queue System ✅
1. Create jobs table migration
2. Configure queue driver to use database
3. Create failed_jobs table for error tracking

### Phase 2: Create Background Jobs ✅
1. `IngestEmailsJob` - Processes one mail account
2. `IngestReviewsJob` - Processes one location

### Phase 3: Create Console Commands ✅
1. `IngestAllEmailsCommand` - Dispatches jobs for all mail accounts
2. `IngestAllReviewsCommand` - Dispatches jobs for all locations

### Phase 4: Schedule Tasks ✅
1. Modify `app/Console/Kernel.php`
2. Schedule both commands to run every 15 minutes
3. Add overlap prevention

### Phase 5: Extract Service Logic ✅
1. Create `EmailIngestionService` - Reusable email ingestion logic
2. Create `ReviewIngestionService` - Reusable review ingestion logic
3. Update controllers to use services
4. Jobs also use same services (DRY principle)

### Phase 6: Server Setup 📋
1. Run migrations
2. Start queue worker
3. Set up Windows Task Scheduler for Laravel scheduler

---

## Components

### 1. Database Tables

#### `jobs` Table
Stores queued jobs waiting to be processed.

```sql
- id (bigint)
- queue (string)
- payload (longtext)
- attempts (unsigned tinyint)
- reserved_at (unsigned int, nullable)
- available_at (unsigned int)
- created_at (unsigned int)
```

#### `failed_jobs` Table
Stores jobs that failed after all retry attempts.

```sql
- id (bigint)
- uuid (string)
- connection (text)
- queue (text)
- payload (longtext)
- exception (longtext)
- failed_at (timestamp)
```

---

### 2. Background Jobs

#### `IngestEmailsJob`
**File:** `app/Jobs/IngestEmailsJob.php`

**Purpose:** Fetch emails for a single mail account

**Properties:**
- `$mailAccount` - The MailAccount to process
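- `$tries = 3` - Attempt the job up to 3 times in total before it is marked as failed
- `$timeout = 300` - Kill job after 5 minutes; keep this below the queue connection's `retry_after` value, or a long-running job can be picked up a second time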

**Flow:**
```
1. Receive MailAccount as parameter
2. Call EmailIngestionService->ingestForAccount($mailAccount)
3. Log success with count of emails fetched
4. If error, log and throw exception (will retry)
```

**Multi-tenant Safety:**
- Job scoped to single MailAccount
- MailAccount already has user_id/tenant_id
- No cross-tenant access possible
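
A minimal sketch of how the properties and flow above could fit together in one job class. The `App\Models\MailAccount` namespace and the log messages are assumptions for illustration; the actual class may differ.

```php
<?php

namespace App\Jobs;

use App\Models\MailAccount; // Assumed model namespace
use App\Services\EmailIngestionService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Log;
use Throwable;

class IngestEmailsJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** Maximum number of attempts before the job is marked as failed. */
    public $tries = 3;

    /** Seconds the job may run before the worker kills it. */
    public $timeout = 300;

    public function __construct(public MailAccount $mailAccount)
    {
    }

    public function handle(EmailIngestionService $service): void
    {
        try {
            $result = $service->ingestForAccount($this->mailAccount);

            Log::info('IngestEmailsJob: ingestion finished', [
                'mail_account_id' => $this->mailAccount->id,
                'messages_fetched' => $result['messages_fetched'] ?? null,
            ]);
        } catch (Throwable $e) {
            Log::error('IngestEmailsJob: ingestion failed', [
                'mail_account_id' => $this->mailAccount->id,
                'error' => $e->getMessage(),
            ]);

            // Rethrow so the worker records the failed attempt and retries.
            throw $e;
        }
    }
}
```

The job touches only the model it was constructed with, which is what keeps each run scoped to a single account.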

---

#### `IngestReviewsJob`
**File:** `app/Jobs/IngestReviewsJob.php`

**Purpose:** Fetch reviews for a single location

**Properties:**
- `$location` - The Location to process
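- `$tries = 3` - Attempt the job up to 3 times in total before it is marked as failed
- `$timeout = 300` - Kill job after 5 minutes; keep this below the queue connection's `retry_after` value, or a long-running job can be picked up a second time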

**Flow:**
```
1. Receive Location as parameter
2. Call ReviewIngestionService->ingestForLocation($location)
3. Log success with count of reviews fetched
4. If error, log and throw exception (will retry)
```

**Multi-tenant Safety:**
- Job scoped to single Location
- Location already has user_id/tenant_id
- No cross-tenant access possible

---

### 3. Console Commands

#### `IngestAllEmailsCommand`
**File:** `app/Console/Commands/IngestAllEmailsCommand.php`

**Signature:** `emails:ingest-all`

**Description:** Dispatch email ingestion jobs for all mail accounts

**Flow:**
```
1. Query all MailAccounts from database
   - Bypass the BelongsToUser global scope (withoutGlobalScopes)
   - The scheduler runs without a logged-in user, so the command must see every account
2. Loop through each account
3. Dispatch IngestEmailsJob::dispatch($mailAccount)
4. Log count of jobs dispatched
```
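
The flow can be sketched as a console command along these lines. This follows the corrected approach described under Multi-Tenant Safety (bypassing global scopes); the `App\Models\MailAccount` namespace and the chunk size of 100 are assumptions.

```php
<?php

namespace App\Console\Commands;

use App\Jobs\IngestEmailsJob;
use App\Models\MailAccount; // Assumed model namespace
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Log;

class IngestAllEmailsCommand extends Command
{
    protected $signature = 'emails:ingest-all';

    protected $description = 'Dispatch email ingestion jobs for all mail accounts';

    public function handle(): int
    {
        $dispatched = 0;

        // Bypass global scopes: the scheduler has no authenticated user,
        // so a user-based scope cannot be relied on to select every account.
        MailAccount::withoutGlobalScopes()
            ->chunkById(100, function ($accounts) use (&$dispatched) {
                foreach ($accounts as $account) {
                    IngestEmailsJob::dispatch($account);
                    $dispatched++;
                }
            });

        $message = "Dispatched {$dispatched} email ingestion jobs";
        $this->info($message); // Captured by appendOutputTo() in the schedule
        Log::info($message);

        return 0; // Exit code 0 signals success to the scheduler
    }
}
```

Chunking keeps memory use flat if the number of accounts grows; with a few dozen accounts a plain `get()` would work equally well.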

**Manual Usage:**
```bash
php artisan emails:ingest-all
```

**Scheduled Usage:**
Runs automatically every 15 minutes via scheduler

---

#### `IngestAllReviewsCommand`
**File:** `app/Console/Commands/IngestAllReviewsCommand.php`

**Signature:** `reviews:ingest-all`

**Description:** Dispatch review ingestion jobs for all locations

**Flow:**
```
1. Query all Locations from database
   - Bypass the BelongsToUser global scope (withoutGlobalScopes)
   - The scheduler runs without a logged-in user, so the command must see every location
2. Loop through each location
3. Dispatch IngestReviewsJob::dispatch($location)
4. Log count of jobs dispatched
```

**Manual Usage:**
```bash
php artisan reviews:ingest-all
```

**Scheduled Usage:**
Runs automatically every 15 minutes via scheduler

---

### 4. Services (DRY Code)

#### `EmailIngestionService`
**File:** `app/Services/EmailIngestionService.php`

**Purpose:** Centralized email ingestion logic (used by both controller and job)

**Method:** `ingestForAccount(MailAccount $mailAccount): array`

**Returns:**
```php
[
    'success' => true,
    'threads_created' => 5,
    'messages_fetched' => 12,
    'drafts_generated' => 3,
]
```

**Logic Extracted From:**
- `EmailController@ingest` method
- Same Gmail API calls
- Same thread/draft creation logic

---

#### `ReviewIngestionService`
**File:** `app/Services/ReviewIngestionService.php`

**Purpose:** Centralized review ingestion logic (used by both controller and job)

**Method:** `ingestForLocation(Location $location): array`

**Returns:**
```php
[
    'success' => true,
    'reviews_ingested' => 8,
    'new_reviews' => 3,
    'updated_reviews' => 5,
]
```

**Logic Extracted From:**
- `ReviewController@ingest` method
- Same Google Business Profile API calls
- Same review creation/update logic

---

### 5. Task Scheduler

#### Modified File: `app/Console/Kernel.php`

**Scheduled Tasks:**

```php
protected function schedule(Schedule $schedule)
{
    // Ingest emails every 15 minutes
    $schedule->command('emails:ingest-all')
             ->everyFifteenMinutes()
             ->withoutOverlapping()
             ->appendOutputTo(storage_path('logs/email-ingestion.log'));

    // Ingest reviews every 15 minutes
    $schedule->command('reviews:ingest-all')
             ->everyFifteenMinutes()
             ->withoutOverlapping()
             ->appendOutputTo(storage_path('logs/review-ingestion.log'));
}
```

**Explanation:**
- `->everyFifteenMinutes()` - Runs at :00, :15, :30, :45 of every hour
- `->withoutOverlapping()` - Skips a run if the previous run of the same command is still executing. This guards only the dispatching command, not the queued jobs it creates
- `->appendOutputTo()` - Logs output to separate files for easy debugging

---

## Setup Instructions

### Step 1: Environment Configuration

Update `.env` file:
```env
QUEUE_CONNECTION=database
```

This tells Laravel to use the database for queueing jobs.
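
One related setting is worth checking at this point. The `database` connection in `config/queue.php` has a `retry_after` value (90 seconds in Laravel's default config). A job still running after that many seconds is treated as stalled and made available again, so with a 300-second job timeout the value should be raised above 300. A sketch of the relevant excerpt, where 330 is an assumed value rather than a project requirement:

```php
// config/queue.php (excerpt)
'connections' => [
    'database' => [
        'driver' => 'database',
        'table' => 'jobs',
        'queue' => 'default',
        // Must exceed the longest job timeout (300 s here); 330 is an assumption.
        'retry_after' => 330,
    ],
],
```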

---

### Step 2: Run Migrations

Create the queue tables:

```bash
# Create jobs table
php artisan queue:table

# Create failed jobs table (if not exists)
php artisan queue:failed-table

# Run migrations
php artisan migrate
```

**Expected Output:**
```
Migration table created successfully.
Created Migration: 2024_xx_xx_xxxxxx_create_jobs_table
Migrating: 2024_xx_xx_xxxxxx_create_jobs_table
Migrated:  2024_xx_xx_xxxxxx_create_jobs_table
```

---

### Step 3: Start Queue Worker

**For Development (Windows/WAMP):**

Open a new Command Prompt and run:
```bash
cd C:\Users\iyall\OneDrive\Documents\AI\claude\replypilot
php artisan queue:work --daemon --tries=3 --timeout=300
```

**Leave this window open!** The queue worker must run continuously.

**Parameters:**
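- `--daemon` - Deprecated and optional; `queue:work` already runs continuously until it is stopped
- `--tries=3` - Attempt each job up to 3 times in total before marking it as failed
- `--timeout=300` - Kill job if it runs longer than 5 minutes (300 seconds). Enforcing the timeout requires the `pcntl` PHP extension, which is not available on Windows, so under WAMP a hung job is not killed automatically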

**Expected Output:**
```
[2024-10-20 12:00:00] Processing: App\Jobs\IngestEmailsJob
[2024-10-20 12:00:05] Processed:  App\Jobs\IngestEmailsJob
[2024-10-20 12:00:05] Processing: App\Jobs\IngestReviewsJob
[2024-10-20 12:00:10] Processed:  App\Jobs\IngestReviewsJob
```

---

### Step 4: Set Up Laravel Scheduler

Laravel's scheduler needs to be triggered every minute. We'll use Windows Task Scheduler.

#### Windows Task Scheduler Setup:

1. **Open Task Scheduler**
   - Press `Win + R`
   - Type `taskschd.msc`
   - Press Enter

2. **Create Basic Task**
   - Click "Create Basic Task" in right panel
   - Name: `Laravel Scheduler - ReplyPilot`
   - Description: `Runs Laravel task scheduler every minute`

3. **Trigger Settings**
   - Trigger: Daily
   - Start: Today at 12:00 AM
   - Recur every: 1 days
   - Click "Next"

4. **Action Settings**
   - Action: Start a program
   - Program/script: `C:\wamp64\bin\php\php8.3.25\php.exe` (adjust to your PHP path)
   - Add arguments: `C:\Users\iyall\OneDrive\Documents\AI\claude\replypilot\artisan schedule:run`
   - Click "Next"

5. **Advanced Settings**
   - Check "Open the Properties dialog"
   - Click "Finish"

6. **Properties Dialog**
   - Go to "Triggers" tab
   - Edit the trigger
   - Check "Repeat task every: 1 minute"
   - Duration: Indefinitely
   - Click OK

**Alternative: Simple Batch File (Easier)**

Create file: `C:\ReplyPilot-Scheduler.bat`
```batch
@echo off
cd /d C:\Users\iyall\OneDrive\Documents\AI\claude\replypilot
C:\wamp64\bin\php\php8.3.25\php.exe artisan schedule:run
```

Then create Task Scheduler to run this batch file every 1 minute.

---

### Step 5: Verify Setup

#### Test Manually:

```bash
# Test email ingestion command
php artisan emails:ingest-all

# Test review ingestion command
php artisan reviews:ingest-all

# Run scheduler manually (simulates cron)
php artisan schedule:run

# Process one job from queue
php artisan queue:work --once
```

#### Check Queue:

```bash
# Show the number of jobs waiting on the default database queue
php artisan queue:monitor database:default

# View failed jobs
php artisan queue:failed
```

---

## Testing

### Manual Testing Checklist

- [ ] Migrations ran successfully (jobs and failed_jobs tables exist)
- [ ] Queue worker starts without errors
- [ ] `php artisan emails:ingest-all` dispatches jobs
- [ ] `php artisan reviews:ingest-all` dispatches jobs
- [ ] Jobs appear in `jobs` table
- [ ] Queue worker processes jobs successfully
- [ ] Emails are fetched and saved
- [ ] Reviews are fetched and saved
- [ ] Logs show successful ingestion
- [ ] No errors in `storage/logs/laravel.log`
- [ ] Failed jobs table is empty (or retries work)

---

### Test Scenarios

#### Scenario 1: First Run (Empty Queue)
```
1. Start queue worker
2. Run: php artisan emails:ingest-all
3. Observe: Jobs dispatched to queue
4. Observe: Worker picks up jobs immediately
5. Check: Emails appear in database
6. Check: Logs show success
```

#### Scenario 2: Scheduled Run
```
1. Ensure scheduler is running (Task Scheduler)
2. Wait for :15, :30, :45, or :00 minute mark
3. Check: Commands execute automatically
4. Check: Jobs dispatched to queue
5. Check: Worker processes jobs
6. Check: New emails/reviews in database
```

#### Scenario 3: Error Handling
```
1. Temporarily break OAuth connection (delete tokens)
2. Run: php artisan emails:ingest-all
3. Observe: Job fails
4. Observe: Job retries (attempts 1, 2, 3)
5. Observe: After 3 failures, moves to failed_jobs table
6. Fix OAuth connection
7. Run: php artisan queue:retry all
8. Observe: Failed job retries and succeeds
```

---

## Monitoring

### Log Files

#### Laravel Log
**Location:** `storage/logs/laravel.log`

**Contains:**
- All application logs
- Job execution logs
- Error traces
- API call logs

**Example Entry:**
```
[2024-10-20 12:15:00] local.INFO: Dispatched 5 email ingestion jobs
[2024-10-20 12:15:05] local.INFO: IngestEmailsJob: Fetched 12 emails for account@example.com
[2024-10-20 12:15:10] local.INFO: IngestEmailsJob: Created 3 new threads
```

---

#### Email Ingestion Log
**Location:** `storage/logs/email-ingestion.log`

**Contains:**
- Output from `emails:ingest-all` command
- Count of jobs dispatched
- Timestamp of each run, if the command prints one (`appendOutputTo()` itself adds no timestamps)

**Example Entry:**
```
[2024-10-20 12:15:00] Dispatched 5 email ingestion jobs
```

---

#### Review Ingestion Log
**Location:** `storage/logs/review-ingestion.log`

**Contains:**
- Output from `reviews:ingest-all` command
- Count of jobs dispatched
- Timestamp of each run, if the command prints one (`appendOutputTo()` itself adds no timestamps)

**Example Entry:**
```
[2024-10-20 12:15:00] Dispatched 3 review ingestion jobs
```

---

### Database Monitoring

#### Check Jobs Queue
```sql
SELECT * FROM jobs ORDER BY created_at DESC LIMIT 10;
```

Shows pending/processing jobs.

#### Check Failed Jobs
```sql
SELECT * FROM failed_jobs ORDER BY failed_at DESC LIMIT 10;
```

Shows jobs that failed after all retries.

#### Check Last Ingestion
```sql
-- Last email ingested
SELECT MAX(created_at) as last_email_ingestion FROM email_threads;

-- Last review ingested
SELECT MAX(created_at) as last_review_ingestion FROM reviews;
```
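
The same checks can be run from `php artisan tinker` with Laravel's query builder. This is a sketch that assumes the default `jobs` and `failed_jobs` table names shown earlier:

```php
use Illuminate\Support\Facades\DB;

// Jobs waiting for a worker (not yet reserved).
$pending = DB::table('jobs')->whereNull('reserved_at')->count();

// Jobs currently reserved by a worker.
$running = DB::table('jobs')->whereNotNull('reserved_at')->count();

// Jobs that exhausted their attempts in the last 24 hours.
$failedLastDay = DB::table('failed_jobs')
    ->where('failed_at', '>=', now()->subDay())
    ->count();

echo "pending={$pending} running={$running} failed_24h={$failedLastDay}\n";
```

A steadily growing `pending` count with `running` at zero usually points to a stopped queue worker.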

---

### Artisan Commands for Monitoring

```bash
# List all scheduled tasks
php artisan schedule:list

# View failed jobs
php artisan queue:failed

# Retry all failed jobs
php artisan queue:retry all

# Retry specific failed job
php artisan queue:retry {job-id}

# Flush all failed jobs (delete them)
php artisan queue:flush

# Clear all jobs from queue (nuclear option)
php artisan queue:clear
```

---

## Error Handling

### Common Errors and Solutions

#### Error: "No OAuth connection found"
**Cause:** User's OAuth token expired or missing

**Solution:**
1. User needs to re-authenticate with Google
2. Job will fail and retry
3. After 3 failures, moves to failed_jobs
4. Once OAuth fixed, retry the job

---

#### Error: "Failed to refresh access token"
**Cause:** Refresh token expired (rare, happens after ~6 months of inactivity)

**Solution:**
1. User must re-authenticate completely
2. System will log detailed error
3. Admin can see which user needs to reconnect

---

#### Error: "Queue worker not running"
**Cause:** Queue worker process stopped or crashed

**Symptoms:**
- Jobs pile up in jobs table
- Nothing gets processed
- No "Processed" logs

**Solution:**
1. Restart queue worker: `php artisan queue:work --daemon --tries=3 --timeout=300`
2. For production, use Supervisor to auto-restart worker

---

#### Error: "Scheduler not running"
**Cause:** Windows Task Scheduler not set up or disabled

**Symptoms:**
- Commands don't run every 15 minutes
- No automatic ingestion happening
- Manual commands work fine

**Solution:**
1. Verify Task Scheduler task exists and is enabled
2. Check task history for errors
3. Test manually: `php artisan schedule:run`

---

### Job Failure Flow

```
Job Dispatched
     │
     ▼
Job Attempts (1st try)
     │
     ├─ Success ──> Job Complete ✅
     │
     └─ Failure ──> Wait 0 seconds
                        │
                        ▼
                Job Attempts (2nd try)
                        │
                        ├─ Success ──> Job Complete ✅
                        │
                        └─ Failure ──> Wait 0 seconds
                                           │
                                           ▼
                                   Job Attempts (3rd try)
                                           │
                                           ├─ Success ──> Job Complete ✅
                                           │
                                           └─ Failure ──> Move to failed_jobs ❌
                                                          (Manual retry needed)
```
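
The flow above can be imitated in plain PHP to make the attempt counting concrete. This is a simulation only, not Laravel's worker code; `runWithTries` and the `$flaky` closure are hypothetical names.

```php
<?php

declare(strict_types=1);

/**
 * Run $job up to $tries times, mirroring the failure flow above.
 *
 * @return array{status: string, attempts: int}
 */
function runWithTries(callable $job, int $tries = 3): array
{
    for ($attempt = 1; $attempt <= $tries; $attempt++) {
        try {
            $job($attempt);

            return ['status' => 'completed', 'attempts' => $attempt];
        } catch (Throwable $e) {
            // A real worker would release the job back onto the queue here.
        }
    }

    // All attempts used: a real worker would insert a row into failed_jobs.
    return ['status' => 'failed', 'attempts' => $tries];
}

// A job that fails twice, then succeeds on the third attempt.
$flaky = function (int $attempt): void {
    if ($attempt < 3) {
        throw new RuntimeException("Attempt {$attempt} failed");
    }
};

$result = runWithTries($flaky);
echo $result['status'] . ' after ' . $result['attempts'] . " attempts\n";
// prints "completed after 3 attempts"
```

With the default settings the retries happen back to back. If a pause between attempts is wanted (for example to ride out a short API outage), Laravel 8 and later let a job declare a `$backoff` property in seconds.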

---

## Performance Considerations

### Current Scale Estimate

**Assumptions:**
- 10 users in system
- Each user has 2 mail accounts on average
- Each user has 1 location on average
- Total: ~20 email jobs + ~10 review jobs = 30 jobs per 15-minute cycle

**Processing Time:**
- Email ingestion: ~5-30 seconds per account
- Review ingestion: ~3-10 seconds per location
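- Total time per cycle with one worker: ~2-12 minutes (20 x 5-30 s + 10 x 3-10 s = 130-700 s)

**Conclusion:**
- All jobs complete before the next cycle (15 minutes), though the worst case leaves only about 3 minutes of headroom
- Database queue is sufficient
- Single queue worker is sufficient at this scale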
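
The estimate can be reproduced with a few lines of arithmetic. The figures come straight from the assumptions above, and the calculation assumes one worker processing jobs one after another:

```php
<?php

declare(strict_types=1);

// Job counts from the stated assumptions.
$emailJobs = 10 * 2;   // 10 users x 2 mail accounts
$reviewJobs = 10 * 1;  // 10 users x 1 location

// Per-job duration ranges in seconds: [best case, worst case].
$emailSeconds = [5, 30];
$reviewSeconds = [3, 10];

$bestCase = $emailJobs * $emailSeconds[0] + $reviewJobs * $reviewSeconds[0];
$worstCase = $emailJobs * $emailSeconds[1] + $reviewJobs * $reviewSeconds[1];

printf("Best case:  %d s (%.1f min)\n", $bestCase, $bestCase / 60);
printf("Worst case: %d s (%.1f min)\n", $worstCase, $worstCase / 60);
printf("Fits in 15-minute cycle: %s\n", $worstCase < 15 * 60 ? 'yes' : 'no');
// prints:
// Best case:  130 s (2.2 min)
// Worst case: 700 s (11.7 min)
// Fits in 15-minute cycle: yes
```

Re-running the numbers with a larger user count shows quickly when a second worker becomes necessary.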

---

### Scaling Considerations

**If system grows to 100+ users:**

**Option 1: Multiple Queue Workers**
Run 2-5 workers in parallel:
```bash
# Worker 1
php artisan queue:work --daemon --tries=3 --timeout=300

# Worker 2 (in separate terminal)
php artisan queue:work --daemon --tries=3 --timeout=300
```

**Option 2: Redis Queue**
Switch to Redis for faster queue processing:
```env
QUEUE_CONNECTION=redis
```

**Option 3: Staggered Scheduling**
Instead of all jobs at once every 15 minutes, stagger them:
```php
// Emails at :00, :15, :30, :45
$schedule->command('emails:ingest-all')->everyFifteenMinutes();

// Reviews at :05, :20, :35, :50
$schedule->command('reviews:ingest-all')->cron('5,20,35,50 * * * *');
```

---

## Multi-Tenant Safety

### How Isolation Works

#### At Command Level:
```php
// IngestAllEmailsCommand
$mailAccounts = MailAccount::all(); // BelongsToUser scope applied automatically

// If Super Admin is running: sees ALL mail accounts
// If Tenant Admin is running: sees only THEIR mail accounts
// If Agent is running: sees only THEIR mail accounts
```

#### At Job Level:
```php
// IngestEmailsJob receives $mailAccount
// Job only processes THIS specific account
// No queries for other accounts
// No possibility of cross-contamination
```

#### At Service Level:
```php
// EmailIngestionService->ingestForAccount($mailAccount)
// All queries scoped to this mailAccount
// All created threads/drafts belong to mailAccount->user_id
```

### Visual Flow

```
emails:ingest-all queries mail accounts with the BelongsToUser scope active
    │
    ├─ Super Admin Context
    │   └─ Fetches ALL mail accounts (no scope)
    │       ├─ Dispatch job for user1@example.com (user_id=1)
    │       ├─ Dispatch job for user2@example.com (user_id=2)
    │       └─ Dispatch job for user3@example.com (user_id=3)
    │
    ├─ Tenant Admin Context (user_id=2)
    │   └─ Fetches ONLY their mail accounts (BelongsToUser scope)
    │       └─ Dispatch job for user2@example.com (user_id=2)
    │
    └─ Agent Context (user_id=4)
        └─ Fetches ONLY their mail accounts (BelongsToUser scope)
            └─ Dispatch job for user4@example.com (user_id=4)
```

**Note:** The scheduler runs in application context with no authenticated user, so none of the user contexts above applies to a scheduled run. A scoped query such as `MailAccount::all()` therefore cannot be relied on to return every account. For fully automated ingestion, dispatch jobs for ALL accounts and rely on job-level isolation.

**Updated Approach:**
Commands should explicitly bypass global scopes, fetch ALL accounts, and dispatch one job per account. Job-level isolation keeps each tenant's data separate.

```php
// IngestAllEmailsCommand (CORRECTED)
$mailAccounts = MailAccount::withoutGlobalScopes()->get(); // Get ALL accounts
foreach ($mailAccounts as $account) {
    IngestEmailsJob::dispatch($account); // Job scoped to this account
}
```

---

## Future Enhancements

### Phase 2: User Settings (Not Implemented Yet)

**Database Schema:**
```sql
-- Add columns to users table
ALTER TABLE users ADD COLUMN auto_ingest_emails BOOLEAN DEFAULT true;
ALTER TABLE users ADD COLUMN auto_ingest_reviews BOOLEAN DEFAULT true;
ALTER TABLE users ADD COLUMN email_ingest_frequency ENUM('5min','15min','30min','1hour') DEFAULT '15min';
ALTER TABLE users ADD COLUMN review_ingest_frequency ENUM('15min','30min','1hour','daily') DEFAULT '15min';
```

**Command Logic:**
```php
// Only dispatch jobs for users who have auto-ingestion enabled
$mailAccounts = MailAccount::whereHas('user', function($q) {
    $q->where('auto_ingest_emails', true);
})->get();
```
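
Per-user frequencies would also need a check for whether an account is due on a given scheduler tick. A minimal sketch in plain PHP, assuming the command itself would run every 5 minutes (the shortest proposed frequency); `FREQUENCY_MINUTES` and `isIngestionDue` are hypothetical names, and the mapping simply mirrors the ENUM values in the proposed schema:

```php
<?php

declare(strict_types=1);

/** Minutes between runs for each frequency value in the proposed schema. */
const FREQUENCY_MINUTES = [
    '5min' => 5,
    '15min' => 15,
    '30min' => 30,
    '1hour' => 60,
    'daily' => 1440,
];

/**
 * Decide whether ingestion is due at $now for the given frequency,
 * based on the number of minutes elapsed since midnight.
 */
function isIngestionDue(string $frequency, DateTimeInterface $now): bool
{
    if (!array_key_exists($frequency, FREQUENCY_MINUTES)) {
        throw new InvalidArgumentException("Unknown frequency: {$frequency}");
    }

    $minutesSinceMidnight = (int) $now->format('G') * 60 + (int) $now->format('i');

    return $minutesSinceMidnight % FREQUENCY_MINUTES[$frequency] === 0;
}

$now = new DateTimeImmutable('2024-10-20 12:30:00');
var_dump(isIngestionDue('15min', $now)); // bool(true)
var_dump(isIngestionDue('1hour', $now)); // bool(false)
```

The command would then dispatch a job only for accounts whose owner's frequency is due at the current tick.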

---

### Phase 3: Email/SMS Notifications (Not Implemented Yet)

**Notify on Failure:**
```php
// In IngestEmailsJob: called once after the final attempt has failed
public function failed(\Throwable $exception): void
{
    Mail::to($this->mailAccount->user->email)->send(
        new IngestionFailedNotification($this->mailAccount)
    );
}
```

**Notify on Success (Optional):**
```php
// After successful ingestion with new emails
if ($newEmailsCount > 0) {
    Notification::send($user, new NewEmailsIngested($newEmailsCount));
}
```

---

### Phase 4: Real-Time Ingestion via Gmail Push (Not Implemented Yet)

**Gmail Pub/Sub Setup:**
1. Set up Google Cloud Pub/Sub topic
2. Configure Gmail watch endpoint
3. Receive push notifications when new email arrives
4. Trigger immediate ingestion (instead of waiting 15 minutes)

**Benefits:**
- Instant email ingestion (within seconds)
- Reduced API calls (no polling every 15 minutes)
- Better user experience

**Complexity:**
- Requires public HTTPS endpoint
- Needs Pub/Sub subscription management
- More complex error handling
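
To give a sense of what step 3 involves, here is a sketch of decoding a push notification body in plain PHP. It assumes the payload shape described in Google's Gmail push documentation (a Pub/Sub envelope whose `message.data` field is base64-encoded JSON containing `emailAddress` and `historyId`); verify the format against the current documentation before relying on it. The function name and the sample values are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Decode the JSON body of a Pub/Sub push request carrying a Gmail notification.
 *
 * @return array{emailAddress: string, historyId: string}
 */
function decodeGmailPushNotification(string $requestBody): array
{
    $envelope = json_decode($requestBody, true, 512, JSON_THROW_ON_ERROR);
    $encoded = $envelope['message']['data'] ?? null;

    if (!is_string($encoded)) {
        throw new InvalidArgumentException('Missing message.data in push payload');
    }

    // Accept both the standard and the URL-safe base64 alphabet.
    $json = base64_decode(strtr($encoded, '-_', '+/'), true);
    if ($json === false) {
        throw new InvalidArgumentException('message.data is not valid base64');
    }

    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    return [
        'emailAddress' => (string) $data['emailAddress'],
        'historyId' => (string) $data['historyId'],
    ];
}

// Example payload built locally for illustration (values are made up).
$payload = json_encode([
    'message' => [
        'data' => base64_encode(json_encode([
            'emailAddress' => 'user@example.com',
            'historyId' => 1234567890,
        ])),
        'messageId' => '1',
    ],
    'subscription' => 'projects/example/subscriptions/example',
]);

$notification = decodeGmailPushNotification($payload);
echo $notification['emailAddress'] . ' @ history ' . $notification['historyId'] . "\n";
// prints "user@example.com @ history 1234567890"
```

The decoded `emailAddress` would be used to look up the matching MailAccount and dispatch `IngestEmailsJob` for it immediately.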

---

### Phase 5: Advanced Filtering (Not Implemented Yet)

**Candidate Filters:**
- Sender whitelist/blacklist
- Subject pattern filtering
- AI classification
- Spam scoring
- Marketing email detection
- Auto-archiving

---

### Phase 6: Analytics Dashboard (Not Implemented Yet)

**Metrics to Track:**
- Last successful ingestion time
- Total emails ingested (last 24h, 7d, 30d)
- Total reviews ingested (last 24h, 7d, 30d)
- Failed ingestion attempts
- Average ingestion time
- API quota usage

**UI Mockup:**
```
┌─────────────────────────────────────────┐
│     Auto-Ingestion Dashboard            │
├─────────────────────────────────────────┤
│                                          │
│  📧 Email Ingestion                     │
│  Last Run: 5 minutes ago ✅              │
│  Status: Success                         │
│  Emails Fetched: 12                      │
│  Next Run: in 10 minutes                 │
│                                          │
│  ⭐ Review Ingestion                     │
│  Last Run: 5 minutes ago ✅              │
│  Status: Success                         │
│  Reviews Fetched: 3                      │
│  Next Run: in 10 minutes                 │
│                                          │
│  ⚠️ Recent Issues: None                  │
│                                          │
│  [View Logs] [Retry Failed Jobs]        │
│                                          │
└─────────────────────────────────────────┘
```

---

## Troubleshooting

### Queue Worker Keeps Stopping

**Symptom:** Worker process exits unexpectedly

**Possible Causes:**
1. PHP timeout/memory limit
2. Fatal error in job code
3. Server reboot

**Solutions:**
1. Increase PHP limits in `php.ini`, or raise the worker's `--memory` option (default 128 MB)
2. Fix job errors (check failed_jobs table)
3. Use process monitor (Supervisor on Linux, NSSM on Windows)

---

### Jobs Not Processing

**Symptom:** Jobs pile up in jobs table but never process

**Checklist:**
- [ ] Is queue worker running? (Check task manager)
- [ ] Is queue connection set to 'database' in .env?
- [ ] Did you run migrations? (jobs table exists?)
- [ ] Any errors in worker console output?
- [ ] Check storage/logs/laravel.log for errors

---

### Scheduler Not Running

**Symptom:** Commands don't execute every 15 minutes

**Checklist:**
- [ ] Is Task Scheduler task enabled?
- [ ] Does task run every 1 minute?
- [ ] Test manually: `php artisan schedule:run`
- [ ] Check `php artisan schedule:list` output
- [ ] Verify task history in Task Scheduler

---

### Duplicate Jobs Running

**Symptom:** Same job runs twice simultaneously

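**Cause:** A new job for the same account is dispatched while the previous one is still running, a job runs longer than the queue's `retry_after` value and is handed out again, or the scheduler is triggered by more than one Task Scheduler task. Running several queue workers does not by itself cause duplicates, because each job is reserved by a single worker. `withoutOverlapping()` in the schedule only covers the dispatching command.

**Solution:**
1. Check for duplicate Task Scheduler tasks
2. Make sure `retry_after` in `config/queue.php` is larger than the job `$timeout`
3. Verify `withoutOverlapping()` is in schedule
4. Prevent per-account overlap at the job level, for example with Laravel's `WithoutOverlapping` job middleware or the `ShouldBeUnique` interface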
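
One way to stop two jobs for the same account from running at once is Laravel's `WithoutOverlapping` job middleware (available since Laravel 8). The sketch below is an assumption about how it could be applied to `IngestEmailsJob`, not the project's current code. It needs a cache driver that supports atomic locks, and the 330-second lock expiry is an assumed value slightly above the job timeout.

```php
<?php

namespace App\Jobs;

use App\Models\MailAccount; // Assumed model namespace
use App\Services\EmailIngestionService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\Middleware\WithoutOverlapping;
use Illuminate\Queue\SerializesModels;

class IngestEmailsJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $tries = 3;

    public $timeout = 300;

    public function __construct(public MailAccount $mailAccount)
    {
    }

    /**
     * Allow only one running job per mail account at a time.
     */
    public function middleware(): array
    {
        return [
            (new WithoutOverlapping((string) $this->mailAccount->id))
                ->dontRelease()      // Drop the duplicate instead of re-queueing it
                ->expireAfter(330),  // Free the lock if a worker dies mid-job
        ];
    }

    public function handle(EmailIngestionService $service): void
    {
        $service->ingestForAccount($this->mailAccount);
    }
}
```

Dropping the duplicate is acceptable here because the next 15-minute cycle dispatches a fresh job for the same account anyway.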

---

## Reference Commands

### Daily Operations
```bash
# Check scheduled tasks
php artisan schedule:list

# Check failed jobs
php artisan queue:failed

# Retry all failed jobs
php artisan queue:retry all

# View logs (Git Bash/Linux; in PowerShell use: Get-Content storage\logs\laravel.log -Wait -Tail 50)
tail -f storage/logs/laravel.log
```

### Debugging
```bash
# Test commands manually
php artisan emails:ingest-all
php artisan reviews:ingest-all

# Process one job and exit (for testing)
php artisan queue:work --once

# Run scheduler manually
php artisan schedule:run

# Clear failed jobs
php artisan queue:flush
```

### Maintenance
```bash
# Restart queue worker (if hung)
# 1. Kill process in Task Manager
# 2. Restart:
php artisan queue:work --daemon --tries=3 --timeout=300

# Clear all jobs from queue (nuclear option)
php artisan queue:clear

# Prune old failed jobs (older than 48 hours)
php artisan queue:prune-failed --hours=48
```

---

## File Reference

### Files Created
1. `app/Jobs/IngestEmailsJob.php`
2. `app/Jobs/IngestReviewsJob.php`
3. `app/Console/Commands/IngestAllEmailsCommand.php`
4. `app/Console/Commands/IngestAllReviewsCommand.php`
5. `app/Services/EmailIngestionService.php`
6. `app/Services/ReviewIngestionService.php`
7. `database/migrations/xxxx_create_jobs_table.php` (via artisan)
8. `database/migrations/xxxx_create_failed_jobs_table.php` (via artisan)

### Files Modified
1. `app/Console/Kernel.php` - Added scheduled tasks
2. `.env` - Set QUEUE_CONNECTION=database
3. `app/Http/Controllers/EmailController.php` - Use EmailIngestionService
4. `app/Http/Controllers/ReviewController.php` - Use ReviewIngestionService

---

## Conclusion

This auto-ingestion system provides:
- ✅ Automatic email fetching every 15 minutes
- ✅ Automatic review fetching every 15 minutes
- ✅ Background processing (non-blocking)
- ✅ Multi-tenant isolation
- ✅ Automatic retries on failure
- ✅ Full logging and monitoring
- ✅ Easy to test and debug
- ✅ Scalable for future growth

**Next Steps:**
1. Follow setup instructions to configure server
2. Test manually to verify functionality
3. Monitor logs for first few days
4. Consider Phase 2+ enhancements based on usage patterns

---

**Document Version:** 1.0
**Created:** 2024-10-20
**Last Updated:** 2024-10-20
**Author:** Claude Code
