# Benefits Navigator — ChatGPT (OpenAI) API Integration
**Provider:** OpenAI  
**Recommended Model:** `gpt-4o` (best) · `gpt-4o-mini` (cost-efficient)  
**Endpoint:** `https://api.openai.com/v1/chat/completions`  
**Auth Header:** `Authorization: Bearer YOUR_OPENAI_KEY`

---

## Project Role: Where ChatGPT Is Used

| Section | ChatGPT Task |
|---|---|
| Conversational FAQ Bot | Fast real-time chat answers for common questions |
| Document OCR Extraction | Extract text/data from uploaded document images (GPT-4o Vision) |
| Multilingual Support | Translate the full interface and AI responses |
| Intake Form Parsing | Extract structured fields from free-form user descriptions |
| Quick Eligibility Pre-Screen | Fast yes/no pre-screening before full Claude analysis |

**Key difference from Claude:** Use GPT for fast conversational responses, vision/image tasks, and multilingual support. Use Claude for deep policy analysis and long-form reasoning.

---

## Section 1 — Conversational FAQ Bot

This is the real-time chat widget shown at the bottom of every page.

### System Prompt

```
You are a helpful benefits assistant for a US government benefits navigator app.
Answer questions about SNAP, Medicaid, WIC, housing assistance, and other programs.

Rules:
- Answers must be 2-4 sentences only
- Plain English, no jargon
- If someone seems in crisis, provide crisis resources
- Never promise benefit amounts — say "estimated" or "up to"
- End responses with a practical next step when possible
- If unsure about a state-specific rule, say "this may vary by state — check with your local office"

Current user context: {USER_CONTEXT}
```

### PHP Code — FAQ Chat

```php
<?php
// File: /api/gpt/chat.php

class GPTChatService {
    private string $apiKey;
    private string $model;
    private string $baseUrl = 'https://api.openai.com/v1/chat/completions';

    public function __construct(string $apiKey, string $model = 'gpt-4o-mini') {
        $this->apiKey = $apiKey;
        $this->model  = $model;
    }

    public function chat(
        array  $history,
        string $userMessage,
        array  $userContext = []
    ): array {
        $contextStr = empty($userContext) ? 'No specific user context available.' : json_encode($userContext);

        $systemPrompt = str_replace(
            '{USER_CONTEXT}',
            $contextStr,
            'You are a helpful benefits assistant for a US government benefits navigator app. '
          . 'Answer questions about SNAP, Medicaid, WIC, housing assistance, and other programs. '
          . 'Rules: Answers must be 2-4 sentences only. Plain English, no jargon. '
          . 'Never promise benefit amounts — say "estimated" or "up to". '
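          . 'If someone seems in crisis, provide crisis resources. '
          . 'End responses with a practical next step when possible. '
          . 'If unsure about a state-specific rule, say it may vary by state and to check with the local office. '
          . 'Current user context: {USER_CONTEXT}'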
        );

        $messages   = [['role' => 'system', 'content' => $systemPrompt]];
        $messages   = array_merge($messages, $history);
        $messages[] = ['role' => 'user', 'content' => $userMessage];

        $payload = [
            'model'       => $this->model,
            'messages'    => $messages,
            'max_tokens'  => 300,
            'temperature' => 0.5,
            'stream'      => false,
        ];

        $ch = curl_init($this->baseUrl);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => json_encode($payload),
            CURLOPT_HTTPHEADER     => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->apiKey,
            ],
            CURLOPT_TIMEOUT => 30,
        ]);

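        $body      = curl_exec($ch);
        $code      = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $curlError = curl_error($ch);
        curl_close($ch);

        // curl_exec() returns false on a network failure or timeout
        if ($body === false) {
            throw new RuntimeException('GPT request failed: ' . $curlError);
        }

        if ($code !== 200) {
            $err = json_decode($body, true);
            throw new RuntimeException('GPT error ' . $code . ': ' . ($err['error']['message'] ?? $body));
        }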

        $data = json_decode($body, true);

        return [
            'reply'           => $data['choices'][0]['message']['content'] ?? '',
            'prompt_tokens'   => $data['usage']['prompt_tokens'] ?? 0,
            'completion_tokens' => $data['usage']['completion_tokens'] ?? 0,
            'model_used'      => $data['model'] ?? $this->model,
        ];
    }
}

// ---- Streaming version (for real-time typing effect in browser) ----
function chatStream(array $messages, string $apiKey): void {
    header('Content-Type: text/event-stream');
    header('Cache-Control: no-cache');
    header('X-Accel-Buffering: no');

    $payload = [
        'model'    => 'gpt-4o-mini',
        'messages' => $messages,
        'stream'   => true,
        'max_tokens' => 300,
    ];

    $ch = curl_init('https://api.openai.com/v1/chat/completions');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => false,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($payload),
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            "Authorization: Bearer $apiKey",
        ],
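        CURLOPT_WRITEFUNCTION  => function($ch, $chunk) {
            // Forward each raw SSE chunk to the browser as it arrives
            echo $chunk;
            if (ob_get_level() > 0) {
                ob_flush(); // Only valid while an output buffer is active
            }
            flush();
            return strlen($chunk);
        },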
    ]);
    curl_exec($ch);
    curl_close($ch);
}
?>
```
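
The streaming function above forwards OpenAI's raw server-sent events, so whatever consumes them still has to pull the text out of each `data:` line. The sketch below shows that step in PHP. It assumes every event arrives complete within one chunk; real network chunks can split an event, so production code would need to buffer partial lines. The function name `extractDeltaText` is hypothetical.

```php
<?php
// Hypothetical helper: pull the incremental text out of one SSE chunk
// returned by the Chat Completions API when 'stream' => true.
function extractDeltaText(string $chunk): string {
    $text = '';
    foreach (preg_split('/\r?\n/', $chunk) as $line) {
        // Event lines look like: data: {"choices":[{"delta":{"content":"Hi"}}]}
        if (strncmp($line, 'data: ', 6) !== 0) {
            continue; // Skip blank separators and non-data lines
        }
        $json = substr($line, 6);
        if ($json === '[DONE]') {
            break; // Sentinel that marks the end of the stream
        }
        $event = json_decode($json, true);
        $text .= $event['choices'][0]['delta']['content'] ?? '';
    }
    return $text;
}
```

A server-side consumer could call this inside the write callback to log or post-process the reply while still echoing the raw chunk to the browser.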

---

## Section 2 — Document Image Extraction (GPT-4o Vision)

Users can photograph their documents, and GPT-4o extracts the relevant fields automatically, which reduces data entry errors.

### PHP Code — OCR Document Extraction

```php
<?php
// File: /api/gpt/document-ocr.php

class GPTDocumentOCR {
    private string $apiKey;
    private string $baseUrl = 'https://api.openai.com/v1/chat/completions';

    public function __construct(string $apiKey) {
        $this->apiKey = $apiKey;
    }

    /**
     * Extract structured data from a photographed document
     * @param string $base64Image  Base64-encoded image (JPEG, PNG, WebP)
     * @param string $documentType E.g. "pay_stub", "photo_id", "utility_bill"
     * @param string $mimeType     MIME type of the image; must match the encoded data
     */
    public function extractFromImage(
        string $base64Image,
        string $documentType,
        string $mimeType = 'image/jpeg'
    ): array {
        $instructions = $this->getExtractionInstructions($documentType);

        $payload = [
            'model'      => 'gpt-4o',
            'max_tokens' => 800,
            'messages'   => [[
                'role'    => 'user',
                'content' => [
                    [
                        'type'      => 'image_url',
                        'image_url' => [
                            'url'    => 'data:' . $mimeType . ';base64,' . $base64Image,
                            'detail' => 'high'
                        ]
                    ],
                    [
                        'type' => 'text',
                        'text' => $instructions
                    ]
                ]
            ]]
        ];

        $ch = curl_init($this->baseUrl);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => json_encode($payload),
            CURLOPT_HTTPHEADER     => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->apiKey,
            ],
            CURLOPT_TIMEOUT => 45,
        ]);

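        $body = curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        // Fail loudly instead of returning an empty extraction on API errors
        if ($body === false || $code !== 200) {
            throw new RuntimeException('GPT vision request failed with HTTP ' . $code);
        }

        $data = json_decode($body, true);
        $text = $data['choices'][0]['message']['content'] ?? '{}';
        // Strip Markdown code fences the model may wrap around its JSON
        $text = preg_replace('/```(?:json)?\s*|\s*```/', '', trim($text));
        return json_decode($text, true) ?? [];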
    }

    private function getExtractionInstructions(string $docType): string {
        $instructions = [
            // Newlines use double-quoted "\n"; inside single quotes \n would stay a literal backslash-n
            'pay_stub' =>
                "This is a pay stub. Extract these fields as JSON:\n"
              . '{"employer_name": string, "employee_name": string, "pay_period_start": "YYYY-MM-DD", '
              . '"pay_period_end": "YYYY-MM-DD", "gross_pay": number, "net_pay": number, '
              . '"is_readable": boolean, "issues": []}' . "\n"
              . 'If a field is not visible, use null. No text outside JSON.',

            'photo_id' =>
                "This is a government-issued photo ID. Extract as JSON:\n"
              . '{"full_name": string, "date_of_birth": "YYYY-MM-DD", "id_number": "REDACTED", '
              . '"expiration_date": "YYYY-MM-DD", "state": string, "id_type": string, '
              . '"is_expired": boolean, "is_readable": boolean, "issues": []}' . "\n"
              . 'Do not return the actual ID number; always set "id_number" to "REDACTED". No text outside JSON.',

            'utility_bill' =>
                "This is a utility bill. Extract as JSON:\n"
              . '{"account_holder_name": string, "service_address": string, "city": string, '
              . '"state": string, "zip": string, "bill_date": "YYYY-MM-DD", '
              . '"is_readable": boolean, "issues": []}' . "\n"
              . 'No text outside JSON.',

            'ssn_card' =>
                "This is a Social Security card. Return JSON:\n"
              . '{"name_on_card": string, "is_readable": boolean, '
              . '"issues": [], "warning": "SSN not extracted for security"}' . "\n"
              . 'NEVER extract the SSN. No text outside JSON.',
        ];

        return $instructions[$docType] ??
            'Extract all text from this document and return as JSON: {"text": string, "is_readable": boolean}';
    }
}

// ---- Usage ----
// $ocr = new GPTDocumentOCR($_ENV['OPENAI_API_KEY']);
// $imageData = base64_encode(file_get_contents($_FILES['document']['tmp_name']));
// $result = $ocr->extractFromImage($imageData, 'pay_stub');
// echo json_encode($result);
?>
```
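
The extraction method documents JPEG, PNG, and WebP input, so the MIME type in the data URL should match the actual image bytes. The sketch below detects the type from the file's leading bytes and builds the data URL accordingly. The function name `buildImageDataUrl` is hypothetical, and only these three formats are checked.

```php
<?php
// Hypothetical helper: build a data URL whose MIME type matches the image bytes.
function buildImageDataUrl(string $bytes): string {
    if (strncmp($bytes, "\xFF\xD8\xFF", 3) === 0) {
        $mime = 'image/jpeg'; // JPEG files start with FF D8 FF
    } elseif (strncmp($bytes, "\x89PNG\r\n\x1A\n", 8) === 0) {
        $mime = 'image/png';  // Fixed 8-byte PNG signature
    } elseif (substr($bytes, 0, 4) === 'RIFF' && substr($bytes, 8, 4) === 'WEBP') {
        $mime = 'image/webp'; // RIFF container with a WEBP tag at offset 8
    } else {
        throw new InvalidArgumentException('Unsupported image type; expected JPEG, PNG, or WebP');
    }
    return 'data:' . $mime . ';base64,' . base64_encode($bytes);
}
```

The returned string is what goes into `image_url.url` in the request payload. Rejecting unknown formats before the API call also avoids paying for a request that cannot succeed.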

---

## Section 3 — Multilingual Translation

Supports Spanish, Simplified Chinese, Arabic, Tagalog, Vietnamese, Somali, and Amharic, languages common among benefit-seeking populations.

### PHP Code — Translation Service

```php
<?php
class GPTTranslationService {
    private string $apiKey;
    private array $supportedLanguages = [
        'es' => 'Spanish',
        'zh' => 'Simplified Chinese',
        'ar' => 'Arabic',
        'tl' => 'Tagalog (Filipino)',
        'vi' => 'Vietnamese',
        'so' => 'Somali',
        'am' => 'Amharic',
    ];

    public function __construct(string $apiKey) {
        $this->apiKey = $apiKey;
    }

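    public function translate(string $text, string $targetLangCode): string {
        // Reject unknown codes rather than silently translating into the wrong language
        if (!isset($this->supportedLanguages[$targetLangCode])) {
            throw new InvalidArgumentException("Unsupported language code: $targetLangCode");
        }
        $language = $this->supportedLanguages[$targetLangCode];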

        $payload = [
            'model'      => 'gpt-4o-mini',
            'max_tokens' => 1000,
            'messages'   => [
                [
                    'role'    => 'system',
                    'content' => "Translate the following text to $language. "
                               . "Maintain all formatting. "
                               . "For government program names (SNAP, Medicaid, etc), keep the English name "
                               . "followed by the translation in parentheses on first use. "
                               . "Return only the translation, nothing else."
                ],
                [
                    'role'    => 'user',
                    'content' => $text
                ]
            ],
            'temperature' => 0.1, // Low temperature for accurate translation
        ];

        $ch = curl_init('https://api.openai.com/v1/chat/completions');
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => json_encode($payload),
            CURLOPT_HTTPHEADER     => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->apiKey,
            ],
            CURLOPT_TIMEOUT => 30, // Avoid hanging the page if the API stalls
        ]);
        $body = curl_exec($ch);
        curl_close($ch);

        $data = json_decode($body, true);
        return $data['choices'][0]['message']['content'] ?? $text;
    }

    /**
     * Translate the entire benefits result page content
     */
    public function translateBenefitsResult(array $benefitResult, string $targetLang): array {
        $translatable = ['program_name', 'eligibility_reason', 'tip'];
        foreach ($translatable as $field) {
            if (!empty($benefitResult[$field])) {
                $benefitResult[$field . '_translated'] = $this->translate($benefitResult[$field], $targetLang);
            }
        }
        return $benefitResult;
    }
}
?>
```
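
`translate()` caps output at 1000 tokens, so a long page could come back truncated. One possible workaround is to split the text on blank lines and translate each chunk separately. The sketch below shows the splitting step only. The function name `splitForTranslation` and the 1500-byte budget are assumptions, not measured limits.

```php
<?php
// Hypothetical helper: group paragraphs into chunks of at most $maxBytes bytes.
// A single paragraph longer than the budget is kept whole rather than cut mid-sentence.
function splitForTranslation(string $text, int $maxBytes = 1500): array {
    $paragraphs = preg_split('/\n\s*\n/', trim($text));
    $chunks  = [];
    $current = '';
    foreach ($paragraphs as $paragraph) {
        $candidate = $current === '' ? $paragraph : $current . "\n\n" . $paragraph;
        if ($current !== '' && strlen($candidate) > $maxBytes) {
            $chunks[] = $current;   // Budget exceeded: close the current chunk
            $current  = $paragraph; // Start a new chunk with this paragraph
        } else {
            $current = $candidate;
        }
    }
    if ($current !== '') {
        $chunks[] = $current;
    }
    return $chunks;
}
```

Each chunk could then be passed to `translate()` and the results joined with a blank line, which preserves the paragraph structure of the original.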

---

## Section 4 — Intake Form Parsing (Free-Form Input)

Some users find forms overwhelming. Let them describe their situation in their own words, and have GPT extract the structured fields.

### PHP Code — Natural Language Intake

```php
<?php
function parseIntakeFromText(string $apiKey, string $userDescription): array {
    $prompt = "A person described their situation. Extract structured data as JSON.\n\n"
            . "Their description:\n\"$userDescription\"\n\n"
            . "Return JSON with these fields (use null if not mentioned):\n"
            . '{'
            . '"state": string (2-letter), '
            . '"household_size": integer, '
            . '"annual_income": number, '
            . '"employment_status": string (employed|unemployed|self_employed|retired|disabled|student), '
            . '"has_children": boolean, '
            . '"youngest_child_age": integer or null, '
            . '"has_disability": boolean, '
            . '"pregnant": boolean, '
            . '"rent_monthly": number or null, '
            . '"confidence": integer 0-100 (how confident you are in the extraction)'
            . '}' . "\nNo text outside JSON."; // "\n" must be double-quoted to produce a newline

    $payload = [
        'model'      => 'gpt-4o',
        'max_tokens' => 500,
        'messages'   => [
            [
                'role'    => 'system',
                'content' => 'You are a structured data extraction assistant. '
                           . 'Extract benefit eligibility intake fields from free-form text. '
                           . 'Return only valid JSON. No text outside JSON.'
            ],
            ['role' => 'user', 'content' => $prompt]
        ],
        'temperature'    => 0.1,
        'response_format' => ['type' => 'json_object'], // GPT-4o JSON mode
    ];

    $ch = curl_init('https://api.openai.com/v1/chat/completions');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($payload),
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            "Authorization: Bearer $apiKey",
        ],
    ]);
    $body = curl_exec($ch);
    curl_close($ch);

    $data = json_decode($body, true);
    $text = $data['choices'][0]['message']['content'] ?? '{}';
    return json_decode($text, true) ?? [];
}

// ---- Example Usage ----
// $userSaid = 'I live in Seattle with my wife and 2 kids. '
//           . 'I make about $2,000 a month driving for a delivery app. '
//           . 'My rent is $1,400 a month.';
// $result = parseIntakeFromText($_ENV['OPENAI_API_KEY'], $userSaid);
// Output: {"state":"WA","household_size":4,"annual_income":24000,...}
?>
```
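
The parser returns `null` for anything the user did not mention, plus a `confidence` score, so the result is worth checking before it feeds an eligibility decision. The sketch below lists the fields a user should be asked to confirm. The function name `intakeFieldsToConfirm`, the choice of required fields, and the threshold of 70 are assumptions for illustration.

```php
<?php
// Hypothetical helper: decide which parsed intake fields need user confirmation.
function intakeFieldsToConfirm(array $parsed, int $minConfidence = 70): array {
    $required  = ['state', 'household_size', 'annual_income'];
    $toConfirm = [];
    foreach ($required as $field) {
        if (!isset($parsed[$field])) {
            $toConfirm[] = $field; // Missing or null: the user never mentioned it
        }
    }
    // A state must be a two-letter uppercase code such as "WA"
    if (isset($parsed['state']) && !preg_match('/^[A-Z]{2}$/', (string) $parsed['state'])) {
        $toConfirm[] = 'state';
    }
    // Low overall confidence: ask the user to review every extracted value
    if (($parsed['confidence'] ?? 0) < $minConfidence) {
        foreach (array_keys($parsed) as $field) {
            if ($field !== 'confidence' && $parsed[$field] !== null) {
                $toConfirm[] = $field;
            }
        }
    }
    return array_values(array_unique($toConfirm));
}
```

An empty result means the parsed data can go straight to the pre-screen; otherwise the listed fields can be shown in a short confirmation form.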

---

## Section 5 — Quick Pre-Screen (Before Full Claude Analysis)

Use GPT-4o-mini for a fast initial filter to avoid wasting Claude tokens on clearly ineligible users.

### PHP Code — Quick Pre-Screen

```php
<?php
function quickPreScreen(string $apiKey, array $inputs): array {
    $payload = [
        'model'      => 'gpt-4o-mini',
        'max_tokens' => 200,
        'messages'   => [
            [
                'role'    => 'system',
                'content' => 'You are a benefits pre-screening tool. Given household data, '
                           . 'return JSON: {"likely_eligible_programs": array of strings, '
                           . '"worth_full_check": boolean, "reason": string}. '
                           . 'No text outside JSON.'
            ],
            [
                'role'    => 'user',
                'content' => 'Pre-screen this household: ' . json_encode([
                    'state'          => $inputs['state'],
                    'household_size' => $inputs['household_size'],
                    'annual_income'  => $inputs['annual_income'],
                    'has_children'   => $inputs['has_children'] ?? false,
                ])
            ]
        ],
        'temperature'    => 0.1,
        'response_format' => ['type' => 'json_object'],
    ];

    $ch = curl_init('https://api.openai.com/v1/chat/completions');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($payload),
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            "Authorization: Bearer $apiKey",
        ],
    ]);
    $body = curl_exec($ch);
    curl_close($ch);

    $data = json_decode($body, true);
    $text = $data['choices'][0]['message']['content'] ?? '{}';
    return json_decode($text, true) ?? ['worth_full_check' => true];
}

// If worth_full_check === true, then call Claude for deep analysis
// If worth_full_check === false, show "You may not qualify" message with explanation
?>
```
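
The two branches described in the comments above can be sketched as a small routing function. The name `nextStepAfterPreScreen` and the action labels are hypothetical; the array keys come from the pre-screen JSON schema.

```php
<?php
// Hypothetical helper: turn a pre-screen result into the next step in the flow.
function nextStepAfterPreScreen(array $preScreen): array {
    // Treat a missing or malformed flag as "run the full check" so nobody is
    // turned away because of a parsing problem.
    $worthFullCheck = $preScreen['worth_full_check'] ?? true;
    if ($worthFullCheck !== false) {
        return [
            'action'   => 'claude_full_analysis',
            'programs' => $preScreen['likely_eligible_programs'] ?? [],
        ];
    }
    return [
        'action'  => 'show_message',
        'message' => trim('You may not qualify. ' . ($preScreen['reason'] ?? '')),
    ];
}
```

Only an explicit `false` skips the deep analysis, which matches the fail-open default that `quickPreScreen()` returns when the response cannot be decoded.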

---

## API Parameter Reference

| Parameter | Usage | Value Example |
|---|---|---|
| `model` | Which GPT model | `gpt-4o`, `gpt-4o-mini` |
| `messages[].role` | Turn type | `system`, `user`, `assistant` |
| `messages[].content` | Message content | String or array (for vision) |
| `max_tokens` | Output limit | `300` – `1500` |
| `temperature` | Randomness | `0.1` factual · `0.7` creative |
| `stream` | Streaming output | `true` or `false` |
| `response_format` | Force JSON mode | `{"type": "json_object"}` |
| `top_p` | Nucleus sampling; tune this or `temperature`, not both | `0.9` |
| `presence_penalty` | Encourage new topics over already-mentioned ones | `0.1` |

### Vision-Specific Fields (GPT-4o)

| Field | Value |
|---|---|
| `content[].type` | `"image_url"` or `"text"` |
| `content[].image_url.url` | `"data:image/jpeg;base64,BASE64DATA"` |
| `content[].image_url.detail` | `"low"` (fast) or `"high"` (accurate) |

### Response Fields

| Field | Contains |
|---|---|
| `choices[0].message.role` | Always `"assistant"` |
| `choices[0].message.content` | GPT's response text |
| `choices[0].finish_reason` | `stop`, `length`, `content_filter` |
| `usage.prompt_tokens` | Input tokens used |
| `usage.completion_tokens` | Output tokens used |
| `usage.total_tokens` | Sum of both |
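
The response fields above can be read defensively in one place. The sketch below uses `finish_reason` to flag replies that were cut off by `max_tokens` or withheld by the content filter. The function name `readCompletion` and the returned keys are hypothetical.

```php
<?php
// Hypothetical helper: read the reply and flag responses that were cut off or filtered.
function readCompletion(array $response): array {
    $choice = $response['choices'][0] ?? [];
    $finish = $choice['finish_reason'] ?? 'unknown';
    return [
        'text'         => $choice['message']['content'] ?? '',
        'truncated'    => $finish === 'length',         // Hit max_tokens before finishing
        'filtered'     => $finish === 'content_filter', // Content was withheld by the filter
        'total_tokens' => $response['usage']['total_tokens'] ?? 0,
    ];
}
```

A truncated JSON reply will usually fail to decode, so checking `truncated` first gives a clearer error than a generic parse failure.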

---

## Model Selection Guide

| Task | Model | Reason |
|---|---|---|
| FAQ chat | `gpt-4o-mini` | Fast, cheap, good enough |
| Document OCR (vision) | `gpt-4o` | More accurate on dense document images (`gpt-4o-mini` also accepts images) |
| Translation | `gpt-4o-mini` | Excellent multilingual performance |
| JSON extraction | `gpt-4o` with `response_format` | More reliable structured output |
| Pre-screening | `gpt-4o-mini` | Speed and cost |

---

## Cost Estimation (2025 Pricing)

| Operation | Model | Est. Tokens In | Est. Tokens Out | Est. Cost |
|---|---|---|---|---|
| FAQ chat | gpt-4o-mini | 400 | 150 | ~$0.0001 |
| Document OCR | gpt-4o | 1000 + image | 400 | ~$0.007 |
| Translation (500 words) | gpt-4o-mini | 800 | 800 | ~$0.0002 |
| JSON extraction | gpt-4o | 500 | 200 | ~$0.004 |
| Pre-screen | gpt-4o-mini | 200 | 100 | ~$0.00005 |

*Check platform.openai.com/docs/pricing for current rates*
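
Each estimate in the table follows from one formula: input tokens times the input rate, plus output tokens times the output rate. The sketch below applies it to the FAQ chat row. The function name `estimateCostUsd` is hypothetical, and the rates shown are assumed placeholders that should be replaced with the values on the pricing page.

```php
<?php
// Hypothetical helper: estimate the dollar cost of one request.
// Rates are expressed in USD per 1 million tokens.
function estimateCostUsd(int $tokensIn, int $tokensOut, float $rateInPerM, float $rateOutPerM): float {
    return ($tokensIn * $rateInPerM + $tokensOut * $rateOutPerM) / 1_000_000;
}

// Assumed example rates, not authoritative; look up the current ones before relying on this.
$assumedRates = ['input' => 0.15, 'output' => 0.60];

// FAQ chat row from the table: 400 tokens in, 150 tokens out
$faqCost = estimateCostUsd(400, 150, $assumedRates['input'], $assumedRates['output']);
printf("FAQ chat: ~$%.5f per request\n", $faqCost);
```

Image inputs add tokens on top of the text prompt, so the Document OCR row needs the image token count included in `$tokensIn`.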

---

## Error Handling

```php
<?php
function callOpenAIWithRetry(array $payload, string $apiKey, int $maxRetries = 3): array {
    $retryStatuses = [429, 500, 502, 503];

    for ($attempt = 0; $attempt < $maxRetries; $attempt++) {
        $ch = curl_init('https://api.openai.com/v1/chat/completions');
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => json_encode($payload),
            CURLOPT_HTTPHEADER     => [
                'Content-Type: application/json',
                "Authorization: Bearer $apiKey",
            ],
            CURLOPT_TIMEOUT => 30,
        ]);
        $body   = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        if ($body !== false && $status === 200) {
            return json_decode($body, true);
        }

        // Retry on transport failures (curl_exec() returned false) and transient HTTP errors
        if ($body === false || in_array($status, $retryStatuses, true)) {
            // Exponential backoff: 1, 2, 4 seconds. The Retry-After header is not read here.
            sleep(2 ** $attempt);
            continue;
        }

        $errData = json_decode($body, true);
        $errMsg  = $errData['error']['message'] ?? "HTTP $status: $body";

        // Content filter triggered — return safe fallback
        if ($status === 400 && str_contains($errMsg, 'content_filter')) {
            return ['choices' => [['message' => ['content' => 
                'I\'m unable to answer that. Please contact your local benefits office for help.'
            ]]]];
        }

        throw new RuntimeException("OpenAI API error: $errMsg");
    }

    throw new RuntimeException("OpenAI API failed after $maxRetries retries");
}
?>
```
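
The retry loop above mentions the `Retry-After` header for 429 responses. The sketch below shows one way to combine that header with exponential backoff. It handles only the delay-in-seconds form of the header, not the HTTP-date form. The function name `retryDelaySeconds` and the 30-second cap are assumptions.

```php
<?php
// Hypothetical helper: choose how long to wait before the next retry.
// $attempt is zero-based; $retryAfter is the raw header value, or null if absent.
function retryDelaySeconds(int $attempt, ?string $retryAfter = null, int $maxDelay = 30): int {
    if ($retryAfter !== null && ctype_digit(trim($retryAfter))) {
        // A server-specified delay in whole seconds takes priority
        $delay = (int) trim($retryAfter);
    } else {
        // Otherwise fall back to exponential backoff: 1, 2, 4, ... seconds
        $delay = 2 ** $attempt;
    }
    return min($delay, $maxDelay);
}
```

The header value itself could be captured with cURL's `CURLOPT_HEADERFUNCTION` callback and passed in as `$retryAfter`.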

---

## Environment Setup (.env)

```
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
GPT_MODEL_CHAT=gpt-4o-mini
GPT_MODEL_VISION=gpt-4o
GPT_MODEL_EXTRACTION=gpt-4o
GPT_TEMPERATURE_FACTUAL=0.1
GPT_TEMPERATURE_CHAT=0.5
GPT_MAX_TOKENS_CHAT=300
GPT_MAX_TOKENS_OCR=800
GPT_MAX_TOKENS_TRANSLATION=1000
```
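
These settings can be read once at startup and cast to the types the API expects. The sketch below assumes the `.env` file has already been loaded into the process environment (PHP does not read `.env` files on its own). The function name `loadGptConfig` and the array keys are hypothetical; the defaults mirror the values listed above.

```php
<?php
// Hypothetical helper: read the GPT settings from the environment with fallbacks.
function loadGptConfig(): array {
    $env = static function (string $key, string $default): string {
        $value = getenv($key);
        return ($value === false || $value === '') ? $default : $value;
    };
    return [
        'model_chat'             => $env('GPT_MODEL_CHAT', 'gpt-4o-mini'),
        'model_vision'           => $env('GPT_MODEL_VISION', 'gpt-4o'),
        'model_extraction'       => $env('GPT_MODEL_EXTRACTION', 'gpt-4o'),
        'temperature_factual'    => (float) $env('GPT_TEMPERATURE_FACTUAL', '0.1'),
        'temperature_chat'       => (float) $env('GPT_TEMPERATURE_CHAT', '0.5'),
        'max_tokens_chat'        => (int) $env('GPT_MAX_TOKENS_CHAT', '300'),
        'max_tokens_ocr'         => (int) $env('GPT_MAX_TOKENS_OCR', '800'),
        'max_tokens_translation' => (int) $env('GPT_MAX_TOKENS_TRANSLATION', '1000'),
    ];
}
```

The API key is left out of this array on purpose so that it is not carried around with ordinary settings or written to logs by accident.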

---

## Dual-AI Decision Flow

```
User submits intake form
        │
        ▼
GPT-4o-mini Pre-Screen (fast, cheap)
        │
   worth_full_check?
     │          │
    NO          YES
     │           │
Show basic    Claude Sonnet
message      Deep Analysis
                │
                ▼
          Results page
                │
         User has questions?
                │
                ▼
        GPT-4o-mini FAQ chat
                │
         Uploads a document?
                │
                ▼
          GPT-4o Vision OCR
                │
         Needs another language?
                │
                ▼
        GPT-4o-mini Translation
```
