OpenAI Users Manual
A practical, step-by-step guide to OpenAI's full lineup — every current model, every product surface (ChatGPT, the API, Codex), and copy-paste prompt templates for the most common goals.
ChatGPT is a smart helper that can read, write, and answer questions. The same brain comes in different doors: ChatGPT (just type and ask), Business/Enterprise (ChatGPT for your whole team with your tools plugged in), and the API + Codex (the LEGO version you bolt into your own apps and code projects).
The magic isn't that it's smart — the magic is telling it exactly what you want (who it should be, what to make, who reads it, how it should look). Be specific and you'll get great answers.
Getting started in 60 seconds
- Sign in at chatgpt.com for the consumer app, or at platform.openai.com for API/developer access. Free, Plus, Pro, Business, Enterprise, and Edu plans share the same chat interface.
- Pick the right surface for the job: ChatGPT for everyday conversations and tasks, Business/Enterprise for team workspaces, the API for building products, Codex for cloud-based coding agents.
- Pick a model that matches the task — flagship (GPT-5.5) for hard reasoning, GPT-5 / GPT-5.4 for production defaults, Mini/Nano for speed and cost, gpt-image-2 for images, the Realtime API for voice.
- Tell the model what good looks like. Goal, audience, format. The single biggest jump in quality comes from saying these three things up front.
Which OpenAI surface should I use?
ChatGPT
chatgpt.com & mobile
- Quick questions, writing, brainstorms
- Vision, files, code, charts
- Browsing & deep research
- Voice mode
- Custom GPTs and memory
ChatGPT Business / Enterprise
Team workspace
- SSO, admin controls, audit logs
- Shared connectors (Drive, GitHub, Slack…)
- Shared GPTs across the team
- Data retention controls — your data is not used for training
- Higher rate limits and bigger context
API & Codex
platform.openai.com
- Build products with any model
- Tool use / function calling
- Realtime API for voice
- Batch API for 50% off async work
- Codex for cloud-based coding agents
The five prompt fundamentals
Every great prompt — chat or API — has at most five parts. Use the ones that apply.
| Part | Purpose | Example phrase |
|---|---|---|
| Role | Frame the model's perspective | "You are a senior staff engineer reviewing a junior PR." |
| Goal | What "done" looks like | "Produce a 1-page exec summary I can paste into Notion." |
| Context | Background & constraints | "Audience: non-technical execs. Tone: confident, plain English." |
| Inputs | The raw material | Pasted text, attached file, URL, image. |
| Format | Shape of the output | "5 bullets, ≤15 words each, no preamble." |
Compact universal template
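One possible compact template assembling the five parts above (the bracketed slots are placeholders to fill in; trim any part that doesn't apply):

```text
You are [ROLE].
Goal: [what "done" looks like].
Context: [audience, tone, constraints].
Inputs: [pasted text / attached file / URL below].
Format: [bullets, table, or JSON; any length limits].
```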
OpenAI has lots of models — think of it like a tool drawer. GPT-5.5 is the big screwdriver: most powerful, newest, slowest. GPT-5 is the everyday hammer: what you usually grab. Mini and Nano sizes are tiny tools — cheap and fast for tiny clear jobs. The o-series (o3, o4-mini) are the math helpers — they think extra-hard but slowly, like solving homework one step at a time.
There are also helpers for pictures (gpt-image-2 draws), voice (Realtime API talks), and search-y "memory cards" (embeddings) you use behind the scenes. Pick the one that matches the job and the budget.
The current OpenAI lineup
As of 2026-05-03, OpenAI's text models are organised into three families: the GPT-5.x series (flagship and frontier), the GPT-4.1 family (production-recommended for most apps), and the o-series (deep reasoning). Plus image, audio, and embedding models.
Frontier & flagship (GPT-5.x)
| Model | API ID | Released | Best for | Pricing (in / out) |
|---|---|---|---|---|
| GPT-5.5 flagship | gpt-5.5 | 2026-04-23 | Hardest reasoning, complex goals, multi-step tool use, professional work in ChatGPT & Codex | API "coming very soon" |
| GPT-5.5 Pro | gpt-5.5-pro | 2026-04-23 | Pro/Business/Enterprise users — extended thinking, longer autonomous runs | API forthcoming |
| GPT-5.4 | gpt-5.4 | 2026-03-05 | Frontier reasoning + coding (incorporates 5.3-Codex stack) + native Computer Use. 272K context standard, up to 1,050,000 via API/Codex; 128K max output. | $2.50 / $15.00 (input doubles to $5.00 above 272K) |
| GPT-5.4 Pro | gpt-5.4-pro | 2026-03-05 | Premium GPT-5.4 variant for the most complex tasks — extended thinking, deeper reasoning | $30.00 / $180.00 |
| GPT-5.4 Mini | gpt-5.4-mini | 2026-03 † | Mid-tier 5.4 for high-volume product use cases (separate announcement) | See API pricing page |
| GPT-5.4 Nano | gpt-5.4-nano | 2026-03 † | Cheapest production model — bulk classification, simple transforms (separate announcement) | See API pricing page |
| GPT-5.3-Codex | gpt-5.3-codex | 2025-Q4 † | Most capable agentic coding model. ~25% faster than GPT-5.2 on coding benchmarks | See API pricing page |
| GPT-5.2 Instant | gpt-5.2-instant | 2025-Q4 † | Fast, grounded, measured tone — good for chat UX | See API pricing page |
| GPT-5 | gpt-5 | 2025-08-07 | The original GPT-5; 400K-token context; strong all-rounder for production | $1.25 / $10.00 |
| GPT-5 Mini | gpt-5-mini | 2025-08-07 | GPT-5 architecture at a budget — quality jump over GPT-4.1 Mini | $0.25 / $2.00 |
Production & long-context (GPT-4.1 family)
| Model | API ID | Released | Best for | Pricing (in / out) |
|---|---|---|---|---|
| GPT-4.1 | gpt-4.1 | 2025-04 † | Production-recommended replacement for GPT-4o. 1,000,000-token context window | $2.00 / $8.00 |
| GPT-4.1 Mini | gpt-4.1-mini | 2025-04 † | Mid-tier for high-volume product use cases | See API pricing page |
| GPT-4.1 Nano | gpt-4.1-nano | 2025-04 † | Cheapest in the GPT-4.1 family — bulk extraction, classification | $0.10 / $0.40 |
Reasoning (o-series)
| Model | API ID | Released | Best for | Pricing (in / out) |
|---|---|---|---|---|
| o3 | o3 | 2025-04 † | Multi-step reasoning, math proofs, complex debugging, scientific analysis | $2.00 / $8.00 |
| o4-mini | o4-mini | 2025-04 † | Cheaper reasoning at high volume; still beats GPT-4-class on hard tasks | See API pricing page |
| o1 | o1 | 2024-12-05 | Original reasoning model — superseded by o3 for most tasks but still in the API | See API pricing page |
Release timeline (chronological)
Useful when looking at old code, picking up a deprecated app, or understanding capability jumps.
| Date | Release | What changed |
|---|---|---|
| 2018-06 | GPT-1 | First transformer-based generative model from OpenAI. |
| 2019-02 | GPT-2 | Larger, more coherent — initially released in stages over safety concerns. |
| 2020-06 | GPT-3 | 175B parameters; introduced few-shot prompting at scale. |
| 2022-09 | Whisper | Open-source multilingual speech recognition. Trained on 680,000 hours of audio. |
| 2022-11-30 | ChatGPT (GPT-3.5) | The product that started the wave. Free chat interface. |
| 2023-03-14 | GPT-4 | Multimodal-capable, much stronger reasoning. Initially text-only in API. |
| 2023-09 | DALL-E 3 | Text-to-image, integrated into ChatGPT. |
| 2023-11-06 | GPT-4 Turbo + Custom GPTs + Assistants API | DevDay 2023 — 128K context, GPTs marketplace, first agent-style API. |
| 2024-01-25 | text-embedding-3-small & -large | 3rd-gen embeddings — 1536 / 3072 dim, with shrinkable dimensions parameter. |
| 2024-05-13 | GPT-4o | "Omni" — natively multimodal text/audio/vision in one model. Free-tier access. |
| 2024-07-18 | GPT-4o mini | Cheap, fast everyday model that replaced GPT-3.5 Turbo as the default. |
| 2024-09-12 | o1 (preview) | First reasoning-trained model — explicit chain-of-thought. |
| 2024-12-05 | o1 (general availability) | Plus o1-pro mode for ChatGPT Pro. |
| 2025-03 † | gpt-4o-transcribe / gpt-4o-mini-transcribe | Next-gen transcription beating Whisper on word-error rate. |
| 2025-04 † | GPT-4.1 family + o3 + o4-mini | 1M-token context for GPT-4.1; o3 supersedes o1 for most reasoning. |
| 2025-08-07 | GPT-5 + GPT-5 Mini | Major lift on math, code, finance, multimodal. 400K context. $1.25/$10. |
| 2025-08-28 | Realtime API — general availability | Speech-to-speech voice agents at low latency. |
| 2025-09-30 | Sora 2 | Text-to-video model with audio. iOS app, then Android two months later. |
| 2025-Q4 † | GPT-5.2 Instant + GPT-5.3-Codex | Faster default in ChatGPT; coding-specialised stack for Codex. |
| 2026-03-05 | GPT-5.4 + GPT-5.4 Pro | Frontier model unifying reasoning + coding (5.3-Codex stack) + native Computer Use. 75.0% on OSWorld-Verified (vs 47.3% for GPT-5.2; 72.4% human baseline). Tool search in API cuts token usage 47% with many MCP servers. 272K context standard, up to 1,050,000 via API/Codex. |
| 2026-03 † | GPT-5.4 Mini + GPT-5.4 Nano | Smaller siblings for high-volume and cheap-and-fast workloads. |
| 2026-03-11 | GPT-5.1 family retired | Instant, Thinking, and Pro variants removed from ChatGPT. |
| 2026-04-21 | ChatGPT Images 2.0 (gpt-image-2) | First OpenAI image model with native reasoning capabilities. |
| 2026-04-23 | GPT-5.5 + GPT-5.5 Pro | Current flagship; rollout to Plus/Pro/Business/Enterprise + Codex. API access coming. |
| 2026-04-26 | Sora app shut down | Mobile app discontinued; API to follow on 2026-09-24. |
| 2026-05-12 | DALL-E 2 & 3 retiring | Replaced by gpt-image-2. |
What's new in GPT-5.4 (2026-03-05)
GPT-5.4 is the first mainline OpenAI reasoning model that incorporates the frontier coding capabilities of GPT-5.3-Codex, plus a step change in agentic capability — native Computer Use without a separate specialist model. Highlights:
| Area | What changed |
|---|---|
| Native Computer Use new | First general-purpose OpenAI model that can take control of a computer — clicking, typing, navigating software using screenshots + mouse/keyboard commands. No specialised CUA model required. 75.0% on OSWorld-Verified (vs 47.3% for GPT-5.2 and 72.4% human baseline). |
| Reliability at scale | On ~30,000 HOA and property-tax portals: 95% success on first attempt, 100% within three attempts (vs ~73–79% with prior CUA models). Sessions ran ~3× faster using ~70% fewer tokens. |
| Tool search in API new | When given many tools, the model searches its toolset before deciding what to call. On 250 tasks from Scale's MCP Atlas benchmark with 36 MCP servers enabled, total token use dropped 47% while accuracy held. |
| Five-level reasoning effort | Finer control than the prior low/medium/high. Tune reasoning depth ↔ latency more precisely per request. |
| Vision fidelity | New "original" image input detail level — full-fidelity perception up to 10.24 megapixels total or 6,000 px max edge (whichever is lower). "High" detail now supports up to 2.56 megapixels total / 2,048 px max edge. |
| Coding | ~80% on SWE-bench Verified. Folds in the GPT-5.3-Codex coding training stack, so you don't have to swap models for hard coding work. |
| Context window | 272K standard, expandable to 1,050,000 (1M+) tokens via the API and Codex. Max output 128,000 tokens. |
| Variants | gpt-5.4-pro for the most demanding tasks. gpt-5.4-mini and gpt-5.4-nano announced separately for high-volume / low-cost workloads. |
Pricing
| Model | Input | Output |
|---|---|---|
| GPT-5.4 (≤272K input) | $2.50 per million tokens | $15.00 per million tokens |
| GPT-5.4 (above 272K input) | $5.00 per million tokens | $15.00 per million tokens |
| GPT-5.4 Pro | $30.00 per million tokens | $180.00 per million tokens |
| GPT-5 (for comparison) | $1.25 per million tokens | $10.00 per million tokens |
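As a sketch of how the split input rate works. One simplifying assumption here: that the higher $5.00/M rate applies to the entire input once it crosses 272K tokens (check the official pricing page for the exact proration rules):

```python
def gpt54_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a GPT-5.4 request cost from the table above.

    Assumption: the $5.00/M input rate applies to the whole input
    once it exceeds the 272K threshold.
    """
    THRESHOLD = 272_000
    input_rate = 2.50 if input_tokens <= THRESHOLD else 5.00
    output_rate = 15.00
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# 100K input / 10K output at the standard rate:
# 0.1 * 2.50 + 0.01 * 15.00 = 0.25 + 0.15 = $0.40
print(gpt54_cost_usd(100_000, 10_000))
```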
Where you can use it
- ChatGPT (Plus / Pro / Business / Enterprise)
- OpenAI API (gpt-5.4, gpt-5.4-pro)
- Codex (cloud and CLI)
Default to gpt-5.4 initially; only step up to gpt-5.4-pro when an eval shows the standard tier missing on your hardest tasks.
Optimal prompts for GPT-5.4's new features
How to pick a model
Pick GPT-5.5 / GPT-5.5 Pro when…
- The task is genuinely hard reasoning or strategy.
- Long-horizon agentic work — coding sessions, multi-source research.
- The cost of a wrong answer is high.
- You want the best available model regardless of speed/cost.
Pick GPT-5.4 when…
- You need native Computer Use (browser automation, desktop apps, portals).
- You're orchestrating many tools / MCP servers — tool search keeps token use down.
- You need a true 1M+ token context via API/Codex.
- You want a single model that handles reasoning, coding (5.3-Codex stack), and agents.
Pick GPT-5 when…
- You want a stable production default at $1.25 / $10 pricing.
- Your prompts are already calibrated to GPT-5.
- You don't need Computer Use or the 1M context window.
- Latency and cost matter more than 5.4's bleeding-edge features.
Pick GPT-4.1 when…
- You need a true 1M-token context window.
- You're processing very long docs, codebases, transcripts.
- Your stack was built around GPT-4o and you don't want to retune yet.
Pick GPT-5.4 Nano / GPT-4.1 Nano when…
- The task is repetitive, unambiguous, and high-volume.
- Cost dominates the decision.
- Real-time UX where latency matters more than depth.
Pick o3 / o4-mini when…
- The task rewards explicit step-by-step reasoning.
- Math, formal logic, code debugging, scientific analysis.
- You're OK with longer latency in exchange for accuracy.
Pick GPT-5.3-Codex when…
- You're using Codex (cloud or CLI) for agentic coding.
- Long-running build/refactor jobs that touch many files.
- You need the strongest coding-specific tuning available.
Image, audio, and video models
Image generation
| Model | API ID | Released | Notes |
|---|---|---|---|
| ChatGPT Images 2.0 | gpt-image-2 | 2026-04-21 | First OpenAI image model with native reasoning. Better text rendering, layout control, and instruction-following than DALL-E 3. |
| gpt-image-1 | gpt-image-1 | 2025-04 † | Production image model in ChatGPT through 2025. |
| DALL-E 3 | dall-e-3 | 2023-09 | Retiring 2026-05-12. Migrate to gpt-image-2. |
| DALL-E 2 | dall-e-2 | 2022-04 | Retiring 2026-05-12. |
Speech & audio
| Model | API ID | Type | Notes |
|---|---|---|---|
| gpt-4o-transcribe | gpt-4o-transcribe | Speech → text | Released 2025-03 †. Lower word-error rate than Whisper. API-only. |
| gpt-4o-mini-transcribe | gpt-4o-mini-transcribe | Speech → text | Cheaper sibling of the above. |
| Whisper | whisper-1 | Speech → text | Released 2022-09. Open-source weights (MIT). Useful when you need to self-host. |
| gpt-4o-tts | gpt-4o-tts | Text → speech | Steerable voices. Pair with the Realtime API for interactive use. |
| Realtime API | gpt-4o-realtime-preview / gpt-realtime | Speech ↔ speech | GA 2025-08-28. Build voice agents end-to-end without separate STT/TTS. |
Video
| Model | API ID | Released | Status |
|---|---|---|---|
| Sora 2 | sora-2 | 2025-09-30 | App shut down 2026-04-26. API planned to be discontinued 2026-09-24. Plan migration paths if you depend on it. |
Embeddings
| Model | API ID | Released | Dim | Pricing |
|---|---|---|---|---|
| text-embedding-3-large | text-embedding-3-large | 2024-01-25 | 3072 (shrinkable) | $0.13 /M tokens |
| text-embedding-3-small | text-embedding-3-small | 2024-01-25 | 1536 (shrinkable) | $0.02 /M tokens |
| text-embedding-ada-002 (legacy) | text-embedding-ada-002 | 2022-12 | 1536 | Legacy — migrate to 3-small. |
text-embedding-3-large with its dimensions parameter set to 1024 or 1536 usually outperforms text-embedding-3-small at full size — at lower storage cost. Benchmark before committing.
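Shrinking works because the 3rd-gen embeddings are trained so that a prefix of the vector remains useful; the dimensions parameter is equivalent to truncating and re-normalizing client-side. A minimal sketch of that equivalence (pure Python, no API call):

```python
import math

def shrink_embedding(vec: list[float], dim: int) -> list[float]:
    """Truncate an embedding to `dim` components and re-normalize
    to unit length, mirroring what the `dimensions` parameter does
    server-side for text-embedding-3 models."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.6, 0.8, 0.0, 0.0]        # toy 4-dim "embedding"
short = shrink_embedding(full, 2)  # approx [0.6, 0.8], unit length
```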
Switching models
In ChatGPT
- Click the model name at the top of the conversation.
- Pick from the dropdown. Free plans see a subset; Plus/Pro/Business/Enterprise see the full lineup.
- Switching mid-conversation is fine — the new model inherits the full context. Useful for "draft with GPT-5, polish with GPT-5.5."
- Some models have modes — e.g. GPT-5 has Auto / Fast / Thinking. Pick by what the task rewards.
In the API
Pass the model field in your request. Snapshots are dated (gpt-5-2025-08-07); aliases (gpt-5) auto-track the newest.
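A sketch of the request shape (not a full client call), showing the trade-off. Pinning a dated snapshot in production keeps behaviour reproducible; the alias is convenient for prototypes:

```python
ALIAS = "gpt-5"                # auto-tracks the newest snapshot
SNAPSHOT = "gpt-5-2025-08-07"  # frozen, reproducible behaviour

def build_request(prompt: str, pinned: bool = True) -> dict:
    """Minimal request payload; the `model` field selects which
    snapshot serves the request."""
    return {"model": SNAPSHOT if pinned else ALIAS, "input": prompt}

build_request("Hello")  # {'model': 'gpt-5-2025-08-07', 'input': 'Hello'}
```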
Deprecated & sunsetting models
| Model | Status | Migrate to |
|---|---|---|
| GPT-5.1 (Instant / Thinking / Pro) | Retired 2026-03-11 | GPT-5.5 / GPT-5.4 |
| GPT-4o | Superseded by GPT-4.1 / GPT-5 | GPT-4.1 (drop-in) or GPT-5 |
| GPT-4o mini | Superseded | GPT-5 Mini or GPT-4.1 Nano |
| GPT-4 Turbo | Legacy | GPT-4.1 |
| GPT-4 (original) | Legacy | GPT-4.1 / GPT-5 |
| GPT-3.5 Turbo | Legacy | GPT-4.1 Nano |
| o1 (preview & GA) | Active but superseded | o3 / o4-mini |
| DALL-E 2 / DALL-E 3 | Retiring 2026-05-12 | gpt-image-2 |
| Sora 2 app | Shut down 2026-04-26 | API still works until 2026-09-24 |
| text-embedding-ada-002 | Legacy | text-embedding-3-small |
ChatGPT is the basic door — type in the box, get an answer. You can drag in files, ask it to draw pictures, or even talk to it with your voice. Canvas is the side panel for editing long documents and code together — like Google Docs, but with a smart helper.
Custom GPTs are like saved bookmarks for the way you like to talk to it. Build one with your style guide or instructions, give it a name, and use it again and again. Memory lets ChatGPT remember stuff about you across chats so you don't have to repeat yourself.
Setup & the ChatGPT interface
- Sign in at chatgpt.com or install the desktop / mobile app.
- Start a new chat. The big input accepts text, drag-and-dropped files, pasted images, audio recordings.
- Pick a model from the top selector. Plus/Pro/Business/Enterprise see the full lineup.
- Toggle tools as needed: web search, image generation, code execution (Python sandbox), Canvas, voice mode, deep research.
- Use Projects to bundle related chats with shared instructions and files.
Modes & tools
- Auto — ChatGPT picks the right model and reasoning depth for your prompt. The default in 2026.
- Fast — quick, low-latency responses. Good for chat-style back-and-forth.
- Thinking / Pro — the model takes longer and reasons more. Use for hard problems.
- Search — browses the web, returns cited sources.
- Deep Research — agentic, longer-running research that produces a structured report.
- Image — calls gpt-image-2 to generate or edit pictures.
- Voice — full duplex speech-to-speech via the Realtime API.
Canvas — long-form writing & code
Canvas is a side panel for editing documents and code with the model. Like Claude's Artifacts, but with inline edit suggestions you can accept or reject.
- Open Canvas from the tools menu, or just ask: "Open this in canvas."
- Highlight a passage and ask for a tightening, a tone shift, or a translation — only that span changes.
- Ask for inline comments ("Mark every place I'm being too vague") instead of a rewrite.
- Export when done — copy/paste to your final destination.
Files & vision
- Drag a file onto the chat box. PDFs, Word, spreadsheets, code files, images, and short audio clips work.
- Tell the model what to do with it. "Summarize" is weak — say "Pull every dollar amount and the page it appears on, return as a table."
- For long docs, tell it where to focus: "Only the financial statements section, pages 14-22."
- For images / screenshots, ask for transcription first, then analysis.
Search & deep research
Search returns cited results in a single response. Deep Research kicks off an agentic, multi-source investigation that takes several minutes and produces a structured report.
Memory & projects
- Memory stores facts about you across chats — preferences, recurring projects, names of teammates. Manage it in Settings → Personalization → Memory.
- Projects are folders of related chats with shared custom instructions and files. Use one per ongoing initiative.
- Custom instructions in a project override your global ones — useful for switching context (work voice vs. personal voice).
Custom GPTs
A Custom GPT is a packaged ChatGPT — a name, instructions, optional knowledge files, optional API actions — that anyone with the link can use.
- Open the GPT builder from the sidebar → "Explore GPTs" → "Create."
- Tell the builder what the GPT does in plain English. It drafts the system prompt for you.
- Add knowledge files — style guides, schemas, FAQs.
- Add Actions if it needs to call your API (paste an OpenAPI spec).
- Publish as private, link-only, or to the GPT Store.
Voice mode
- Tap the voice icon. Phone, desktop, or web. Standard voice = turn-taking; Advanced voice = continuous, with interruptions.
- Pick a voice in Settings → Voice.
- Use it for thinking-out-loud tasks — driving brainstorms, talking through code, language practice.
- Switch to text mid-conversation — context carries over.
Optimal prompts for ChatGPT
Writing & editing
Brainstorming
Learning a topic
Document analysis
Image generation
Code in chat (one-offs)
Spreadsheet / data tasks
Decision-making
Same ChatGPT, but for your whole company. Your bosses can decide who sees what, your data stays private (it's not used to train the AI), and your team can share helpers (custom GPTs) so nobody starts from scratch.
You can plug in your work tools — Drive, Slack, GitHub, Salesforce, Notion — so ChatGPT can read them and pull info into your chats. Great for getting up to speed on a project or prepping for a customer call.
What ChatGPT Business / Enterprise gives you
- SSO & admin controls — central provisioning, audit logs, group-level permissions.
- Data privacy — your prompts and outputs are not used to train OpenAI models.
- Higher rate limits and bigger context than consumer plans.
- Shared connectors — Drive, Slack, GitHub, SharePoint, Box, Outlook, Confluence, and more.
- Shared GPTs — your team builds custom GPTs that everyone can use.
- Workspace memory and analytics — usage reports for the org.
Setting up a workspace
- Admin creates the workspace at chatgpt.com/admin.
- Configure SSO (Okta, Azure AD, Google, etc.) and SCIM if you want auto-provisioning.
- Invite teammates in bulk by domain or individually.
- Pin a default model and decide which models are available org-wide.
- Drop a workspace-wide instructions doc (style guide, glossary, escalation policy). Every chat inherits it.
Connectors & data
Common connectors in 2026:
| Connector | What it unlocks |
|---|---|
| Google Drive / OneDrive / Box | Search and read files inside a chat without copy-paste. |
| Slack / Teams | Pull recent threads from a channel as context. |
| GitHub | Read PRs, issues, file contents. |
| Salesforce / HubSpot | Surface accounts, contacts, opportunities. |
| Linear / Jira / Asana | Read tickets, summarize backlogs. |
| Confluence / Notion | Treat your team wiki as searchable context. |
| Outlook / Gmail | Read recent threads, draft replies (you send). |
Shared GPTs
- Build a GPT the same way as in consumer ChatGPT.
- Publish to your workspace instead of "Anyone with the link."
- Pin to the sidebar for the team.
- Govern with admin tools — turn off external GPTs in settings if you want to lock things down.
Optimal prompts for Business / Enterprise
Cross-tool research
Sales prep
Knowledge-grounded support reply
The API is the LEGO version of ChatGPT — you bolt the AI brain into your own apps with code. You get an API key (a secret password), pick a model, and call it from Python or JavaScript. The Responses API is the modern way to do this; function calling lets the AI call your code (look up a price, send an email, search a database) and feed the answer back.
Codex is a coding-specialist version. Connect it to a GitHub project and tell it "add this feature." It writes the code in a branch and opens a pull request — like having an extra teammate who never sleeps. There's also a CLI version that runs on your laptop.
Account & API keys
- Sign in at platform.openai.com. You can use the same account as ChatGPT.
- Create an API key under "API keys." Use named, scoped keys per project — never reuse.
- Set a usage limit in Settings → Limits before you start. Cheap insurance against runaway costs.
- Add a payment method and watch the first day's usage closely.
- Pick an SDK: openai (Python) or openai (Node/TS) are the official ones.
First API call
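A minimal first call, assuming the official Python SDK and an OPENAI_API_KEY in your environment (model name gpt-5 as in the tables above):

```python
# pip install openai
def first_call(prompt: str) -> str:
    """One round-trip through the Responses API."""
    from openai import OpenAI  # official SDK
    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    resp = client.responses.create(model="gpt-5", input=prompt)
    return resp.output_text  # convenience accessor for the text output

# Uncomment to run (costs a fraction of a cent):
# print(first_call("Say hello in five words."))
```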
Responses API — the modern shape
The responses endpoint is OpenAI's unified API for text, images, audio, and tool use. Prefer it over the older chat.completions endpoint for new projects.
Tool use & function calling
Define functions the model can call and the API returns structured arguments you execute, then feed back the result. This is how you build agents that touch real systems.
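A sketch of that loop's application side: you declare a schema, the model returns structured arguments, you execute and feed the result back. The lookup_price function and its schema are hypothetical examples; the exact wire format is in the OpenAI function-calling docs:

```python
import json

# 1. Describe a function the model may call (hypothetical example).
PRICE_TOOL = {
    "type": "function",
    "name": "lookup_price",
    "description": "Look up the current price of a product by SKU.",
    "parameters": {
        "type": "object",
        "properties": {"sku": {"type": "string"}},
        "required": ["sku"],
    },
}

def lookup_price(sku: str) -> float:
    """Stand-in for your real system of record."""
    return {"ABC-123": 19.99}.get(sku, 0.0)

def handle_tool_call(name: str, arguments_json: str) -> str:
    """Execute the model's requested call and return a JSON result
    string to feed back into the conversation."""
    args = json.loads(arguments_json)
    if name == "lookup_price":
        return json.dumps({"price": lookup_price(args["sku"])})
    raise ValueError(f"unknown tool: {name}")

handle_tool_call("lookup_price", '{"sku": "ABC-123"}')  # '{"price": 19.99}'
```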
Realtime API — voice agents
Bidirectional WebSocket / WebRTC stream for speech-to-speech experiences. GA since 2025-08-28. Use it for phone bots, voice assistants, language tutors.
If you only need turn-taking voice rather than full duplex, chaining gpt-4o-transcribe + gpt-4o-tts is simpler and cheaper.
Voice agent prompt
Batch API — 50% off, 24-hour turnaround
Submit a JSONL file of requests; results come back within 24 hours at half price. Perfect for evals, classification at scale, embedding back-fills, nightly summarization.
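A sketch of building the input file, assuming the documented JSONL shape (one request per line, with a custom_id to match results back to inputs):

```python
import json

def batch_line(custom_id: str, prompt: str, model: str = "gpt-5-mini") -> str:
    """One JSONL line for the Batch API (chat.completions shape)."""
    return json.dumps({
        "custom_id": custom_id,          # your key for matching results
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

lines = [batch_line(f"task-{i}", f"Classify ticket #{i}") for i in range(3)]
jsonl = "\n".join(lines)  # upload this file, then create the batch job
```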
Codex — the cloud coding agent
Codex is OpenAI's agentic coding product. It can run as a cloud agent (parallel tasks against your GitHub repo) or as a local CLI (similar to Claude Code).
- Connect a repo at chatgpt.com/codex (Plus/Pro/Business/Enterprise).
- Pick a model — GPT-5.3-Codex is the coding-specialised tune; GPT-5.5 / GPT-5.4 also work.
- Give it a task in plain English. It clones the repo, makes changes in a branch, runs your tests, and opens a PR for review.
- Review and merge like any human PR.
Codex CLI (local)
- Install: npm install -g @openai/codex (or follow the platform docs).
- Authenticate with your OpenAI account.
- Run codex inside a project and type your goal.
- Approve actions as it proposes file edits and shell commands.
- Configure per-project rules in a codex.md at repo root (similar to CLAUDE.md).
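An illustrative codex.md (the contents here are hypothetical; adapt them to your repo's actual conventions and tooling):

```markdown
# Project rules for Codex

- Run `npm test` before proposing any commit.
- Never edit files under `vendor/`.
- Follow the existing ESLint config; do not add new dependencies
  without flagging them in the PR description.
```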
Optimal prompts for the API & Codex
Structured extraction
Codex feature work
Long-context summarization (GPT-4.1)
Reasoning model (o3) for hard analysis
Cost-aware routing
A prompt is directions for the AI. The better the directions, the better the help. Tell it four things: WHO it should be (a sharp editor? a patient teacher?), WHAT you want done, WHO reads the answer, and HOW the answer should look (5 bullets? a paragraph? a JSON object?).
If the answer is bad, don't start over. Say "redo that, but tighter and bossier" or "drop the both-sides framing — pick one and defend it." It already has the context — just steer it.
Use-case prompt library
Copy-paste-ready prompts for the most common goals. Edit the bracketed parts to fit your situation.
Writing & email
Editing & feedback
Learning & research
Decision-making
Brainstorming
Coding
Data & analysis
Documents
Creative & personal
Patterns library
The "stop and ask" pattern — for ambiguous tasks
The "two passes" pattern — for higher-quality output
The "narrow-then-widen" pattern — for research
The "constraint stack" pattern — for code
The "rubric" pattern — for evaluating output
Anti-patterns (what to stop doing)
| Anti-pattern | Why it hurts | Do this instead |
|---|---|---|
| "Help me with this." | Generic input → generic output. | State the deliverable: "Rewrite this email so it's 30% shorter and warmer." |
| "Be creative." / "Be smart." | Vague adjectives don't constrain output. | Name a target: "Three options ranging from safe to ambitious. Each with one risk." |
| Asking again instead of correcting | Loses context — model restarts from scratch. | "Same task, but tighter and with no jargon" in the same conversation. |
| 10-paragraph prompt for a simple task | Drowns the actual ask. | Match prompt length to task complexity. |
| "Don't hallucinate." | It's a vibe, not a constraint. | "If you're not sure, write 'unknown' and explain what would resolve it." |
| Pasting 50-page docs without focus | Model treats it all as equally important. | "Only the methodology section. Ignore the appendices." |
| Using GPT-5.5 for everything | Slow + expensive when GPT-5 / GPT-5.4 would do. | Default to GPT-5; escalate when an answer disappoints. |
| Ignoring temperature for structured output | Drift between calls breaks downstream parsing. | Set temperature 0.0–0.2 + use json_schema response format. |
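The last row's fix, sketched as a request payload. The schema is a hypothetical ticket-classification example; the field names follow the documented json_schema response format:

```python
def classification_request(ticket_text: str) -> dict:
    """Chat Completions payload with temperature pinned low and a
    strict JSON schema, so every response parses the same way."""
    return {
        "model": "gpt-5-mini",
        "temperature": 0.0,
        "messages": [{"role": "user", "content": f"Classify: {ticket_text}"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "ticket_label",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {"label": {"type": "string"}},
                    "required": ["label"],
                    "additionalProperties": False,
                },
            },
        },
    }

req = classification_request("App crashes on login")
```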
Universal rescue prompts
When something isn't working, paste one of these mid-conversation rather than starting over.