Google AI Users Manual

As of …

A practical, step-by-step guide to Google's AI lineup — every current model (Gemini, Imagen, Veo, Lyria, Gemma), every product surface (Gemini app, AI Studio, Vertex AI, Workspace, Code Assist), and copy-paste prompt templates for the most common goals.

🎈 ELI5

Google has lots of AI helpers under one big umbrella called Gemini. The Gemini app is where you chat. AI Studio is the developer playground. Vertex AI is the big enterprise version on Google Cloud. Workspace is where Gemini lives inside your Google Docs, Sheets, and Gmail.

Google also makes Imagen (draws pictures), Veo (makes videos with sound), Lyria (composes music), and Gemma (free open-weight models you can run yourself). Pick the door that matches what you're doing.

Getting started in 60 seconds

Sign in at gemini.google.com for the consumer app, or at aistudio.google.com for the developer playground. Same Google account works for both.
Pick the right surface for the job: Gemini app for chat, Workspace for in-document AI, AI Studio for prototyping with the API, Vertex AI for production deployments on Google Cloud.
Pick a model that matches the task — Gemini 3.1 Pro for hard reasoning, Gemini 3 Flash for the everyday default, Flash-Lite for speed/cost, Imagen 4 for images, Veo 3 for video, Live API for voice.
Tell the model what good looks like. Goal, audience, format. The single biggest jump in quality comes from saying these three things up front.

Which Google AI surface should I use?

Gemini app

gemini.google.com

Quick questions, writing, brainstorms
Vision, files, code, charts
Deep Research & Canvas
Live voice mode
Custom Gems and memory

Workspace + Vertex AI

Enterprise & in-product

Gemini inside Docs, Sheets, Gmail, Meet
Vertex AI for production deployments
Data residency & admin controls
Connect to your own data via grounding

AI Studio + Gemini API

aistudio.google.com

Free playground for testing prompts
Code Assist in your IDE
Live API for voice/video agents
Massive context windows (up to 2M)

The five prompt fundamentals

Every great prompt — chat or API — has at most five parts. Use the ones that apply.

Part	Purpose	Example phrase
Role	Frame the model's perspective	"You are a senior staff engineer reviewing a junior PR."
Goal	What "done" looks like	"Produce a 1-page exec summary I can paste into Notion."
Context	Background & constraints	"Audience: non-technical execs. Tone: confident, plain English."
Inputs	The raw material	Pasted text, attached file, URL, image, video.
Format	Shape of the output	"5 bullets, ≤15 words each, no preamble."
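The five parts above can be assembled mechanically, which helps when prompts are built in code rather than typed by hand. The following is a minimal sketch only; `PromptParts` and `build_prompt` are hypothetical names made up for illustration, and the "Label: text" layout is one reasonable choice, not a format Gemini requires.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The five parts from the table above; every field is optional."""
    role: str = ""
    goal: str = ""
    context: str = ""
    inputs: str = ""
    format: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts in a fixed order, one labeled block each."""
    labeled = [
        ("Role", parts.role),
        ("Goal", parts.goal),
        ("Context", parts.context),
        ("Inputs", parts.inputs),
        ("Format", parts.format),
    ]
    blocks = [f"{label}: {text.strip()}" for label, text in labeled if text.strip()]
    return "\n\n".join(blocks)


# Only three of the five parts are used here; the rest are skipped.
prompt = build_prompt(PromptParts(
    role="You are a senior staff engineer reviewing a junior PR.",
    goal="Produce a 1-page exec summary I can paste into Notion.",
    format="5 bullets, ≤15 words each, no preamble.",
))
print(prompt)
```

Skipping empty parts keeps short prompts short, in line with "use the ones that apply."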

Gemini-specific tip: Gemini handles video natively in the API and app. Don't transcribe video to text first. Attach or upload the video and ask Gemini to reason about it directly. Video understanding is a Gemini superpower.

🎈 ELI5

Gemini comes in three sizes — Pro, Flash, and Flash-Lite. Pro is the biggest brain — slow and pricey but smartest. Flash is the everyday hammer — fast and good. Flash-Lite is the tiniest — super cheap for tiny clear jobs.

The numbers (3.1, 3, 2.5…) are version numbers: bigger is newer. Today the headline is Gemini 3.1 Pro (the biggest, smartest one, with a 2-million-token memory; a token is a small chunk of a word). Google also makes special models for pictures (Imagen 4), videos (Veo 3), music (Lyria 2), and free-to-run open ones (Gemma).

The current Gemini lineup

As of 2026-05-03, the active flagship is Gemini 3.1 Pro. The 3.x family follows the standard three-tier pattern: Pro (frontier), Flash (production default), Flash-Lite (cheapest).

Latest releases: Gemini 3.1 Pro launched 2026-02-19 with large gains on reasoning benchmarks (94.3% GPQA Diamond, 77.1% ARC-AGI-2). Gemini 3.1 Flash TTS shipped 2026-04-15 with natural-language voice control. Gemma 4 (open weights) was released in April 2026. As of 2026-04-01, the Pro tier is paid-only on the API; Flash and Flash-Lite retain a reduced-quota free tier.

About these dates: Dates are pulled from Google AI release notes and announcements where confirmed. Uncertain dates are marked †. Pricing is in USD per million tokens. Always confirm against ai.google.dev/gemini-api/docs/models before billing-sensitive decisions.

Frontier & flagship (Gemini 3.x)

Model	API ID	Released	Best for	Pricing (in / out)
Gemini 3.1 Pro flagship	`gemini-3.1-pro`	2026-02-19	Hardest reasoning, agentic tasks. 2M-token context. 94.3% GPQA Diamond, 77.1% ARC-AGI-2.	$2.00 / $12.00 (≤200K) · $4.00 / $18.00 (above)
Gemini 3 Pro	`gemini-3-pro`	2025-11 †	Stable flagship alternative. LMArena Elo 1501; 91.9% GPQA Diamond. Includes Deep Think mode for hardest problems.	$2.00 / $12.00
Gemini 3 Flash	`gemini-3-flash`	2025-Q4 †	Mid-tier production default. Strong all-rounder.	$0.50 / $3.00 (free tier reduced)
Gemini 3.1 Flash	`gemini-3.1-flash`	2026-Q1 †	Newest mid-tier; foundation for the 3.1 Flash audio/TTS variants.	See pricing page
Gemini 3.1 Flash-Lite	`gemini-3.1-flash-lite`	2026-Q1 †	Cheapest tier-1 production model. High-volume classification, simple transforms.	$0.25 / $1.50 (free tier retained)

Audio variants (Gemini 3.1 Flash family)

Model	Released	Notes
Gemini 3.1 Flash Live	2026 †	Realtime speech-to-speech voice agents via the Live API.
Gemini 3.1 Flash TTS	2026-04-15	Natural-language control of style, pace, pitch, emphasis — no SSML required. Single-speaker and multi-speaker output.

Legacy Gemini models still in service

Model	Released	Status
Gemini 2.5 Pro	2025-03 †	$1.25 / $10.00. Legacy — paid-tier only as of 2026-04-01.
Gemini 2.5 Flash	2025-Q2 †	$0.30 / $2.50. Paid-only after 2026-04-01.
Gemini 2.0 Flash / Flash-Lite	2024-12-11	Flash-Lite scheduled for deprecation 2026-06-01. Migrate to 2.5 Flash or 3 Flash.

Release timeline (chronological)

Date	Release	What changed
2018-05	BERT	Foundational language model from Google Research.
2022-04	PaLM	540B-parameter Pathways Language Model.
2022-05	Imagen 1	First Google text-to-image model (research).
2023-03-21	Bard	Public chatbot launch (later renamed Gemini).
2023-05-10	PaLM 2	Powered Bard, Workspace, Vertex AI through 2023.
2023-12-06	Gemini 1.0 (Ultra, Pro, Nano)	Three-tier launch. Multimodal-native from day one.
2024-02-08	Bard renamed to Gemini	Single brand across consumer + developer.
2024-02-15	Gemini 1.5 Pro	First model with 1M-token context. Game-changer for long docs.
2024-02-21	Gemma (open) 1.0	Google's first open-weight model family.
2024-05-14	Gemini 1.5 Flash + Imagen 3 + Veo 1	I/O 2024 — first-gen video, faster Flash tier.
2024-12-11	Gemini 2.0 Flash	Native tool use, real-time multimodal in/out.
2025-03-12	Gemma 3	1B–27B params, 128k context, multimodal, 140+ languages.
2025-03 †	Gemini 2.5 Pro & Flash	Thinking models; reasoning via test-time compute.
2025-05-20	Imagen 4 + Veo 3 + Lyria 2	I/O 2025. Veo 3 generates video with synchronized audio.
2025-11 †	Gemini 3 + 3 Pro	Major generation jump. Deep Think mode debuts. LMArena Elo 1501.
2026-02-19	Gemini 3.1 Pro	2M-token context; 94.3% GPQA Diamond; 77.1% ARC-AGI-2 (more than 2× Gemini 3 Pro on novel pattern recognition).
2026-04-01	API tier changes	Pro models become paid-only on Gemini API; Flash & Flash-Lite retain reduced free tier.
2026-04 †	Gemma 4 (open)	Next-generation open-weight family.
2026-04-15	Gemini 3.1 Flash TTS	Natural-language voice control for TTS.

What's new in Gemini 3.1 Pro (2026-02-19)

Gemini 3.1 Pro is Google's frontier flagship, built for the most complex tasks. At release, it made Gemini the benchmark leader in most categories.

Area	What changed
Reasoning	Substantially improved reasoning. 94.3% on GPQA Diamond (highest score ever reported on that graduate-level science benchmark at release).
Novel pattern recognition	77.1% on ARC-AGI-2 vs 31.1% for Gemini 3 Pro — more than 2× the previous score.
Agentic performance	Stronger tool use, multi-step planning, and follow-through across long-horizon tasks.
Coding output	Expanded coding output features. Stronger at writing larger blocks of structured code.
Long context	Leads on long-context tasks. 2M-token window; a higher price tier applies when input exceeds 200K tokens.
Deep Think mode	Inherits the Deep Think reasoning mode introduced with Gemini 3 — extra test-time compute for the very hardest problems.

Pricing

	Input ≤200K	Input >200K	Output ≤200K	Output >200K
Gemini 3.1 Pro	$2.00 /M	$4.00 /M	$12.00 /M	$18.00 /M
Gemini 3 Pro (comparison, single tier)	$2.00 /M	$2.00 /M	$12.00 /M	$12.00 /M
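To make the two-tier pricing concrete, here is a small cost estimator for Gemini 3.1 Pro using the rates in the table above. It is a sketch under one stated assumption: that the tier is picked by prompt size and then applied to the whole request (input and output), which is how the column headings read. `estimate_cost` is a hypothetical helper, not part of any SDK; confirm the billing rule on the pricing page before relying on it.

```python
# Rates in USD per million tokens, copied from the pricing table above.
TIER_THRESHOLD = 200_000  # input tokens
RATES = {
    "standard": {"input": 2.00, "output": 12.00},  # prompt <= 200K tokens
    "long": {"input": 4.00, "output": 18.00},      # prompt > 200K tokens
}


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost in USD.

    Assumption: the tier is chosen by prompt size and applied to the
    whole request, not only to the tokens above the threshold.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = "long" if input_tokens > TIER_THRESHOLD else "standard"
    rate = RATES[tier]
    cost = (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000
    return round(cost, 6)


short_request = estimate_cost(100_000, 2_000)  # standard tier
long_request = estimate_cost(500_000, 2_000)   # long-context tier
print(short_request, long_request)
```

Under that assumption, crossing the 200K threshold doubles the input rate for the entire prompt, so trimming a prompt to just under the threshold can matter more than trimming the output.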

Where you can use it

Gemini app — Google AI Pro/Plus subscribers
AI Studio & Gemini API
Vertex AI on Google Cloud

How to pick a Gemini model

Pick Gemini 3.1 Pro when…

The task is genuinely hard reasoning or strategy.
You need a true 2M-token context window.
Long-horizon agentic work — multi-step planning.
Hard science / math (GPQA, ARC-AGI territory).

Pick Gemini 3 / 3.1 Flash when…

Production default — strong all-rounder.
Day-to-day coding, writing, analysis, agents.
You want cheaper per-call cost than Pro.
Real-time UX where latency matters.

Pick Gemini 3.1 Flash-Lite when…

Tasks are repetitive, unambiguous, high-volume.
Cost dominates ($0.25/$1.50 per M tokens).
You want a free-tier option for prototyping.
Classification, extraction, simple transforms.

Pick Deep Think when…

Latency doesn't matter; depth does.
Math proofs, novel synthesis, hardest reasoning.
Available in Gemini 3 / 3.1 Pro.

The "escalate, don't rewrite" rule: If Gemini Flash gives a shallow answer to a well-formed prompt, escalate to Gemini 3.1 Pro instead of rephrasing three more times. Tokens beat hours: spending more tokens is usually cheaper than spending more of your time.
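The escalation rule can be written as a short loop over a model ladder. This is an illustrative sketch: `answer_with_escalation`, `generate`, and `is_good_enough` are hypothetical names, and the stand-in generator below only exists so the example runs without network access. In real use, `generate` would wrap an actual API call and `is_good_enough` would encode your own quality check.

```python
from typing import Callable

# Model IDs as listed in the lineup table; cheapest first.
LADDER = ["gemini-3-flash", "gemini-3.1-pro"]


def answer_with_escalation(
    prompt: str,
    generate: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
) -> tuple[str, str]:
    """Send the same prompt up the ladder until an answer passes the check.

    `generate(model, prompt)` and `is_good_enough(answer)` are caller-supplied
    hooks (hypothetical, not SDK functions). Returns (model_used, answer).
    The last model's answer is returned even if it fails the check,
    because there is nowhere left to escalate.
    """
    model, answer = LADDER[0], ""
    for model in LADDER:
        answer = generate(model, prompt)
        if is_good_enough(answer):
            break
    return model, answer


def fake_generate(model: str, prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    if model == "gemini-3-flash":
        return "It depends."
    return "Use option B because of X, Y, and Z."


used, text = answer_with_escalation(
    "Which option should we pick?",
    fake_generate,
    is_good_enough=lambda answer: len(answer.split()) >= 5,
)
print(used, "->", text)
```

The prompt is never rewritten between attempts; only the model changes, which is the point of the rule.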

Media generation — Imagen, Veo, Lyria

Imagen 4 (image generation)

Model	API ID	Notes
Imagen 4 Ultra	`imagen-4-ultra`	Highest-fidelity tier. Released 2025-05-20.
Imagen 4 Standard	`imagen-4`	Default production tier with substantial text-rendering improvements.
Imagen 4 Fast	`imagen-4-fast`	Lower-latency, lower-cost variant for high-volume use.
Imagen 3 (legacy)	`imagen-3`	Still works; new builds should use Imagen 4.

Veo (video generation)

Model	Released	Notes
Veo 3	2025-05	Generates video with synchronized audio — dialogue, sound effects, ambient noise.
Veo 3 Fast	2025-05	Lower-cost sibling for high-volume use.
Veo 3.1 Lite Preview	2025-Q4 †	Cost-efficient preview for rapid iteration.
Veo 4	—	Not yet announced. Likely window: Google I/O 2026 (May 19–20).

Lyria 2 (music generation)

Announced alongside Veo 3 / Imagen 4 at I/O 2025.
High-fidelity music generation; instrumental and vocal styles.
Available via Vertex AI.

Gemma — open-weight family

Google's open-weight AI models. Same DNA as Gemini, but you download the weights and run them yourself.

Family	Released	Notes
Gemma 4	2026-04 †	Newest open generation. Use when running locally or fine-tuning your own.
Gemma 3	2025-03-12	1B–27B parameters, 128k context, multimodal, 140+ languages. Solid default open model.
Gemma 2	2024-06	Legacy; migrate when convenient.

Deprecated & sunsetting models

Model	Status	Migrate to
Gemini 1.0 (Ultra/Pro/Nano)	Retired	Gemini 3.1 Pro / Flash
Gemini 1.5 Pro / Flash	Legacy	Gemini 3 Pro / 3 Flash
Gemini 2.0 Flash-Lite	Deprecating 2026-06-01	Gemini 2.5 Flash or 3 Flash
Gemini 2.5 Pro / Flash	Legacy — paid-only since 2026-04-01	Gemini 3.1 Pro / 3 Flash
PaLM 2	Retired	Gemini 2.5+ family
Bard branding	Renamed to Gemini (2024-02-08)	—
Imagen 3	Legacy	Imagen 4
Veo 1 / 2	Legacy	Veo 3

🎈 ELI5

The Gemini app is the basic door — type, get an answer. You can drag in pictures, files, or even videos (Gemini can watch them and tell you what's happening). Canvas is the side panel for editing long documents and code. Deep Research is when you want it to spend 10 minutes researching something properly.

Gems are like saved bookmarks for the way you like to talk to Gemini — set up your style guide, give it a name, use it again and again. Live lets you talk to Gemini with your voice and even share your camera or screen with it.

Setup & the Gemini interface

Sign in at gemini.google.com with any Google account.
Pick a model from the top selector — Pro for hard tasks, Flash for everyday.
Drop files, images, or videos into the chat. Gemini reads them all natively.
Subscribe to Google AI Plus ($7.99/mo) or Pro ($19.99/mo) for higher limits and access to Gemini 3.1 Pro.

Modes & tools

Standard — quick chat replies.
Deep Think — Pro-only mode that takes longer and reasons more for the hardest problems.
Deep Research — agentic mode that spends several minutes researching and produces a structured report with citations.
Canvas — side-panel document/code editor with inline AI edits.
Live — voice + video conversation; show Gemini your camera or screen.
Image & Video — generate via Imagen 4 / Veo 3 directly from chat.

Canvas & Deep Research

Canvas opens a document/code editor beside the chat. Highlight a passage and ask for a tone shift, a tightening, or a rewrite — only that span changes. Useful for long-form writing and code review.

Deep Research kicks off a multi-step research agent that visits dozens of pages, synthesizes findings, and returns a long structured report with sources. Best for "I want to understand X comprehensively" — not for quick lookups.

Deep Research

Run Deep Research on [topic]. I need:
- 10–15 primary sources (vendor docs, papers, official blogs preferred)
- A 200-word executive summary
- A "what changed in the last 6 months" section
- A list of open questions worth asking an expert
- Source quality flagged — primary, secondary, or weak
Where sources contradict each other, surface it. Don't pad with marketing copy.

Files & vision (and video)

Drag a file onto the chat box. PDFs, Word, spreadsheets, code files, images, and videos all work natively.
For long PDFs / videos — Gemini's huge context handles whole books and hours of video. Tell it where to focus: "Watch minutes 4–8 closely; ignore the intro."
For images / screenshots — ask for transcription first, then analysis.

Gems — custom Geminis

Gems are reusable, named Geminis with their own custom instructions and (optionally) knowledge files. Same idea as ChatGPT's Custom GPTs.

Open Gem manager from the sidebar → "Create Gem."
Tell the builder what the Gem does in plain English — it drafts the system prompt.
Add knowledge files — style guides, schemas, FAQs.
Save and pin to your sidebar.

Live — voice & video conversation

Tap the Live icon to start a voice conversation. Continuous, with interruptions.
Share your camera — point your phone at something and ask Gemini what it is.
Share your screen — Gemini sees what you see and helps in real-time.
Switch to text mid-conversation — context carries over.

Optimal prompts for the Gemini app

Video analysis (a Gemini superpower)

Video deep-read

Watch the attached video carefully and produce:
1. A timestamped outline (every section change with HH:MM:SS).
2. A list of every claim made by speakers, with the timestamp and speaker.
3. The 3 most important quotes verbatim, with timestamps.
4. Anything visually shown that contradicts what's said aloud.
Don't summarize the whole thing — be granular.

Long-document analysis

PDF / book deep-read

I've attached [doc name] (~300 pages). Treat it as the source of truth.
Return, in this order:
1. The thesis in one sentence.
2. The structural outline (chapters → key arguments).
3. Five claims it actually defends, with the page they're argued on.
4. Any claim that's asserted but not supported.
5. Three questions I should ask the author to stress-test the argument.
If something is unclear from the doc, say "not stated" — never guess.

Writing & editing

Edit my draft

Act as a sharp editor. Don't rewrite — diagnose and prescribe.
For the draft below, return:
1. The single biggest weakness, in one sentence.
2. Three specific edits with before → after.
3. One sentence I should consider cutting entirely, and why.
Audience: [who reads this]. Tone target: [tone].
Draft:
"""
[paste draft]
"""

Image generation

Imagen 4

Use Imagen 4. Generate a [scene/object/illustration].
Style: [art direction — photorealism, flat illustration, magazine cover…]
Composition: [framing — close-up, wide shot, isometric, top-down]
Must include (legible): [exact text or signage, if any]
Avoid: [what should not appear]
Aspect ratio: [16:9 / 1:1 / 4:5].
After generating, check the result against the "must include" list and regenerate if anything is missing or misspelled.

Video generation

Veo 3

Use Veo 3. Generate an 8-second clip.
Scene: [what's happening, camera angle, subject]
Style: [cinematic / handheld / drone / cartoon]
Audio: [dialogue line, ambient sounds, music style]
Pacing: [slow contemplative / fast cuts / single take]
Aspect: 16:9. Resolution: highest available.
Make audio match the action — footsteps when feet move, dialogue lip-synced, ambient sound to match the location.

🎈 ELI5

Gemini lives inside Google Docs, Sheets, Gmail, and Meet — so you don't have to copy-paste between apps. Right inside Docs, Gemini can rewrite a paragraph. Right inside Gmail, it can draft a reply. Right inside Meet, it can take notes for you.

Vertex AI is the big enterprise version on Google Cloud — for when your company needs to deploy AI with strict data controls and connect it to your own databases. NotebookLM is a separate tool for thinking with a stack of documents — drop research papers in and ask questions across all of them.

Gemini in Google Workspace

Gemini is now woven into the Google productivity suite. Where ChatGPT requires a separate tab, Gemini lives where you already work.

Surface	What you get
Docs	"Help me write" + side-panel editor; rewrite, expand, summarize, translate.
Sheets	Generate formulas from natural language; build tables, automate ranges.
Gmail	"Help me write" replies; summarize threads; suggested response chips.
Meet	Take notes, capture action items, summarize the meeting after.
Slides	Generate slide decks, suggest images, restyle in one click.
Drive	Search across all your files semantically; ask questions about any doc.

Plans: Most Workspace AI features require a paid tier, either a Gemini for Workspace add-on (Business or Enterprise) or a consumer AI Pro / AI Plus plan. Some basic features, such as summarization, are now included in standard Workspace; advanced features (Gems in Workspace, deeper integrations) require a paid AI tier.

Vertex AI — enterprise on Google Cloud

The full Gemini, Imagen, Veo, Lyria, and Gemma stack — running inside Google Cloud with enterprise controls.

Model Garden — pick from Gemini, Anthropic Claude, Meta Llama, and others (yes, Vertex hosts third-party models too).
Grounding — anchor responses to your own data (BigQuery, Cloud Storage, Search) without exposing it.
Tuning — fine-tune Gemini on your data; supervised & RLHF available.
Agent Builder — managed service for building production agents with grounding + tools.
Data residency, audit logs, IAM — typical Google Cloud governance.

NotebookLM

A standalone tool for "thinking with a stack of sources." Different shape from chat.

Drop in your sources — PDFs, Google Docs, websites, YouTube videos, audio files. A notebook holds 50 or more sources.
Ask questions grounded in those sources. Every answer cites which sources it came from.
Generate study aids — briefing docs, FAQs, timelines, mind maps, even AI-generated podcasts about your sources.
Best for: research synthesis, study, deep document analysis. Not for general chat.

Optimal prompts for Workspace & Vertex

Inside Google Docs

Doc rewrite

Rewrite the highlighted paragraph for a non-technical executive audience.
Constraints:
- Cut every acronym not defined in the previous paragraph.
- Replace passive voice with active.
- Maximum 80 words.
- Keep the load-bearing facts — drop the rest.
Show only the rewritten version, not commentary.

Inside Sheets

Sheet formulas

Generate a formula for column F that computes "days since last activity" based on column D (last activity date).
If column D is empty, return blank. Format the result as plain integer days, not text.
After the formula, give me 3 test rows that demonstrate edge cases (empty, today, future date).

Inside Gmail

Reply draft

Draft a reply to this thread.
Tone: warm, professional, no apologies-for-apologies'-sake.
Length: 4–6 sentences. End with one clear next step.
If the thread mentions a specific commitment I made, restate it back so they know I read carefully.

Inside NotebookLM

Cross-source synthesis

Across all sources in this notebook, find every mention of [topic]. Return:
1. A timeline of when each source addresses it.
2. Any disagreements between sources, side-by-side.
3. Which source is most authoritative on this topic and why.
4. The 3 questions still unanswered after reading all of them.
Cite specific sources for every claim.

🎈 ELI5

AI Studio is the developer playground at aistudio.google.com — where you try out prompts, get an API key, and copy code into your app. The Gemini API is what you call from your code. Vertex AI is the bigger enterprise version with extra tooling.

Gemini Code Assist is Google's AI for your IDE — like GitHub Copilot but powered by Gemini. The Live API is for building voice agents that listen and talk in real time.

AI Studio — the playground

Open aistudio.google.com — sign in with your Google account.
Test prompts in the playground. Try every model. Free tier available for Flash & Flash-Lite.
Get an API key from the API Keys page. Use named, scoped keys per project.
Copy code — Studio generates Python, Node, Go snippets you can paste into your app.
Save prompts as reusable templates.
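Once you have a key, keep it out of source code. The sketch below reads it from the `GEMINI_API_KEY` environment variable (the same variable the first-call example exports) and fails early with a readable message when it is missing. `load_api_key` and `redact` are hypothetical helper names, not SDK functions.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read the key from the environment and fail early with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Create a key in AI Studio and export it "
            "in your shell instead of hard-coding it in source."
        )
    return key


def redact(key: str) -> str:
    """Shorten a key for log output so the full value is never printed."""
    return key[:4] + "..." if len(key) > 4 else "..."


# Runs whether or not the variable is set; prints the error text if it is not.
try:
    print("Using key", redact(load_api_key()))
except RuntimeError as err:
    print(err)
```

Passing a different `env_var` per project is one simple way to keep per-project keys separate on a shared machine.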

Gemini API — first call

Python

pip install google-genai
export GEMINI_API_KEY="..."
python - <<'PY'
from google import genai
# The client reads GEMINI_API_KEY from the environment.
client = genai.Client()
resp = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Write a haiku about debugging at 2am.",
)
print(resp.text)
PY

Node

npm install @google/genai
export GEMINI_API_KEY="..."
node -e "
import('@google/genai').then(async ({ GoogleGenAI }) => {
  const ai = new GoogleGenAI({});
  const resp = await ai.models.generateContent({
    model: 'gemini-3.1-pro',
    contents: 'Write a haiku about debugging at 2am.',
  });
  console.log(resp.text);
});
"

System instructions + generation config

System + config

# Reuses `client` from the first-call example.
resp = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Summarize the attached doc in 5 bullets.",
    config={
        "system_instruction": (
            "You are a precise technical editor. Reply in markdown only. "
            "If unsure, say 'not stated'."
        ),
        "max_output_tokens": 600,
        "temperature": 0.2,
    },
)

Tool use & structured output

Gemini supports function calling, code execution (built-in Python sandbox), and structured JSON output via response_schema.

Function calling

from google.genai import types
# Declare the tool: its name, purpose, and parameter schema.
get_weather = types.FunctionDeclaration(
    name="get_weather",
    description="Get current weather for a city.",
    parameters={
        "type": "OBJECT",
        "properties": {"city": {"type": "STRING"}},
        "required": ["city"],
    },
)
# Reuses `client` from the first-call example.
resp = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="What's the weather in Lagos?",
    config={"tools": [types.Tool(function_declarations=[get_weather])]},
)
# Inspect resp.candidates[0].content.parts for the function call.
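The model does not run the function; it returns a request to call it, and your code executes the call. The following sketch shows one way to dispatch those requests to local handlers. It assumes each response part exposes a `function_call` attribute (or `None`) carrying `.name` and `.args`, and uses `SimpleNamespace` stand-ins so it runs offline. `get_weather_impl`, `HANDLERS`, and `dispatch_function_calls` are hypothetical names.

```python
from types import SimpleNamespace


def get_weather_impl(city: str) -> dict:
    """Hypothetical local handler; a real one would query a weather service."""
    return {"city": city, "forecast": "unknown (stub)"}


# Maps the declared tool name to the Python function that implements it.
HANDLERS = {"get_weather": get_weather_impl}


def dispatch_function_calls(parts) -> list:
    """Run the matching local handler for every function call in `parts`.

    Assumes each part has `function_call` set to None (plain text) or to an
    object with `.name` and `.args`.
    """
    results = []
    for part in parts:
        call = getattr(part, "function_call", None)
        if call is None:
            continue  # a text part, nothing to execute
        handler = HANDLERS.get(call.name)
        if handler is None:
            results.append({"name": call.name, "error": "no handler registered"})
            continue
        results.append({"name": call.name, "result": handler(**dict(call.args))})
    return results


# Stand-in for resp.candidates[0].content.parts.
fake_parts = [
    SimpleNamespace(function_call=SimpleNamespace(name="get_weather", args={"city": "Lagos"})),
    SimpleNamespace(function_call=None, text="Checking the weather."),
]
calls = dispatch_function_calls(fake_parts)
print(calls)
```

Looking handlers up in a dictionary, rather than calling whatever name the model returns, means an unexpected tool name produces an error record instead of executing arbitrary code.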

Structured JSON output

# `resume_text` holds the resume as plain text.
resp = client.models.generate_content(
    model="gemini-3-flash",
    contents=f"Extract structured fields from this resume:\n\n{resume_text}",
    config={
        "response_mime_type": "application/json",
        "response_schema": {
            "type": "OBJECT",
            "properties": {
                "name": {"type": "STRING"},
                "email": {"type": "STRING"},
                "years_experience": {"type": "INTEGER"},
                "skills": {"type": "ARRAY", "items": {"type": "STRING"}},
            },
            "required": ["name", "skills"],
        },
    },
)
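Even with a schema, it is worth checking the returned JSON before it enters your own system. This is a minimal standard-library sketch that mirrors the resume schema above; `FIELDS` and `parse_resume_json` are hypothetical names, and the sample string stands in for `resp.text` so the code runs offline.

```python
import json

# Field name -> (expected type, required?), mirroring the response_schema above.
FIELDS = {
    "name": (str, True),
    "email": (str, False),
    "years_experience": (int, False),
    "skills": (list, True),
}


def parse_resume_json(raw: str) -> dict:
    """Parse the model's JSON text and check it against FIELDS."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for field, (expected, required) in FIELDS.items():
        value = data.get(field)
        if value is None:
            if required:
                raise ValueError(f"missing required field: {field}")
            continue
        if not isinstance(value, expected):
            raise ValueError(
                f"{field} should be {expected.__name__}, got {type(value).__name__}"
            )
    if not all(isinstance(skill, str) for skill in data["skills"]):
        raise ValueError("skills should be a list of strings")
    return data


# Stand-in for resp.text.
sample = '{"name": "Ada Obi", "skills": ["Python", "SQL"], "years_experience": 7}'
record = parse_resume_json(sample)
print(record["name"], record["skills"])
```

Optional fields that are absent or null are accepted, which matches the schema's `required` list of only `name` and `skills`.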

Live API — voice & video agents

WebSocket-based bidirectional stream for low-latency speech-to-speech (and screen/camera input). Backed by gemini-3.1-flash-live.

When NOT to use Live: For simple TTS, use gemini-3.1-flash-tts directly; for one-shot transcription, send the audio file to a standard Gemini model instead. The Live API is for turn-by-turn voice/video agents where the user can interrupt mid-sentence.

Voice agent system prompt

Voice agent

You are a voice agent. Your output will be spoken aloud. Therefore:
- Use short sentences. No bullet lists, headers, or markdown.
- Avoid em-dashes, parentheticals, and acronyms not pronounced as words.
- If the user interrupts, stop talking immediately and listen.
- When given a list, mention "I'll list three items" first, then read them.
- If you're going to take a long action, say "one moment" before doing it.
Persona: [warm, concise, professional].
Domain: [your domain].
Out of scope: [what you don't do]. When out of scope, say what you can do instead.

Gemini Code Assist

Google's AI coding assistant for IDEs (VS Code, JetBrains) and the Cloud Console.

Inline suggestions — like Copilot, but Gemini-powered.
Chat in IDE — explain code, suggest refactors, generate tests.
Code transformation — multi-file edits with review-before-apply.
Free tier for individuals; Standard / Enterprise for teams.

Optimal prompts for the Gemini API

Long-context analysis (1M / 2M tokens)

Massive context

# `full_repo_bundle` is a string holding the concatenated repository source.
resp = client.models.generate_content(
    model="gemini-3.1-pro",  # 2M-token window
    contents=[
        "Read the entire monorepo bundle below. Identify every place we issue "
        "a JWT and every place we verify one. Report file:line for each, plus "
        "whether HMAC or RSA is used. Flag any code path that bypasses verification.",
        full_repo_bundle,  # up to ~2,000,000 tokens
    ],
)
# Note: input above 200K tokens is billed at $4.00/MTok rather than $2.00.
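The example above takes `full_repo_bundle` as given. One possible way to build it is sketched below: concatenate source files under a root directory, prefix each with its relative path so the model can report locations, and stop before a rough token budget is exceeded. `build_repo_bundle` is a hypothetical helper, the `=== path ===` header is an arbitrary convention, and the 4-characters-per-token figure is a rough heuristic rather than the model's real tokenizer.

```python
import tempfile
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic, not the model's tokenizer


def build_repo_bundle(root, suffixes=(".py", ".js", ".ts", ".go"), max_tokens=2_000_000):
    """Concatenate source files under `root`, each prefixed with its path.

    Files are visited in sorted order. Bundling stops at the first file
    that would push the estimated token count past `max_tokens`.
    """
    chunks, used = [], 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        chunk = f"=== {path.relative_to(root).as_posix()} ===\n{text}\n"
        cost = len(chunk) // CHARS_PER_TOKEN + 1
        if used + cost > max_tokens:
            break
        chunks.append(chunk)
        used += cost
    return "".join(chunks)


# Tiny throwaway repo so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "auth.py").write_text("def issue_jwt():\n    pass\n", encoding="utf-8")
    Path(tmp, "notes.md").write_text("ignored", encoding="utf-8")
    full_repo_bundle = build_repo_bundle(tmp)
    empty_bundle = build_repo_bundle(tmp, max_tokens=0)

print(full_repo_bundle)
```

For an exact count before sending a large request, use the API's own token counting rather than the character heuristic.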

Cost-aware routing

Two-pass triage

# Cheap first pass: does this even need the flagship?
triage = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=f"Reply with only 'simple' or 'hard' for this query:\n{query}",
).text.strip().lower()  # normalize so "Hard" still matches
# Anything other than an exact "hard" falls through to the cheaper model.
model = "gemini-3.1-pro" if triage == "hard" else "gemini-3-flash"
final = client.models.generate_content(model=model, contents=query)

Video understanding via API

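Video API

import time
video = client.files.upload(file="meeting.mp4")
# Uploaded videos are processed asynchronously; wait until the file is ready.
# (Assumes the file's state reports "PROCESSING" while that is under way.)
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = client.files.get(name=video.name)
resp = client.models.generate_content(
    model="gemini-3.1-pro",
    contents=[
        "Produce a timestamped action-item list. Format: HH:MM:SS — owner — action.",
        video,
    ],
)
print(resp.text)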
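Because the prompt above pins the output to a fixed line format, the reply can be turned into structured records. The parser below is a sketch: `ActionItem` and `parse_action_items` are hypothetical names, the sample text stands in for `resp.text`, and lines that do not match the format are skipped instead of raising. It accepts an em dash, en dash, or hyphen as the separator, since models do not always reproduce the exact character.

```python
import re
from dataclasses import dataclass

# HH:MM:SS, a dash separator, the owner, another dash separator, the action.
LINE = re.compile(
    r"^\s*(\d{1,2}):(\d{2}):(\d{2})\s+[\u2014\u2013-]\s+(.+?)\s+[\u2014\u2013-]\s+(.+?)\s*$"
)


@dataclass
class ActionItem:
    seconds: int  # offset from the start of the video
    owner: str
    action: str


def parse_action_items(text: str) -> list:
    """Return one ActionItem per line that matches the expected format."""
    items = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # preamble, blank lines, or anything off-format
        hours, minutes, secs, owner, action = match.groups()
        offset = int(hours) * 3600 + int(minutes) * 60 + int(secs)
        items.append(ActionItem(offset, owner, action))
    return items


# Stand-in for resp.text.
sample = (
    "Here are the action items:\n"
    "00:04:10 \u2014 Priya \u2014 Send the revised budget\n"
    "01:02:03 - Sam - Book the venue\n"
)
items = parse_action_items(sample)
for item in items:
    print(item.seconds, item.owner, item.action)
```

Storing the offset in seconds makes it easy to sort items or jump a video player to the right moment.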

Vertex AI grounding (production)

Grounded answer

// Vertex AI: ground the answer in your private corpus
const resp = await generativeModel.generateContent({
  contents: [{ role: 'user', parts: [{ text: question }] }],
  tools: [{
    retrieval: {
      vertexAiSearch: { datastore: 'projects/.../dataStores/my-corpus' },
    },
  }],
});
// resp will include grounding metadata (which docs were used).

🎈 ELI5

A prompt is directions for the AI. The better your directions, the better the help. Tell Gemini four things: WHO it should be, WHAT you want done, WHO reads the answer, and HOW the answer should look.

If the answer is bad, don't start over. Say "redo that, but tighter and bossier." It already has the context — just steer it.

Use-case prompt library

Copy-paste-ready prompts for the most common goals. Edit the bracketed parts to fit your situation.

Writing & email

Email reply

Reply to this email.
Tone: warm, concise, no apology-for-apologies'-sake.
Length: 4–6 sentences. End with one clear next step.
If the thread mentions a specific commitment I made, restate it back so they know I read carefully.
Output only the reply text — no preamble, no signature.
Email:
"""
[paste email]
"""

Slack / DM rewrite

Rewrite this Slack message:
- 30% shorter
- friendlier tone
- keep the load-bearing facts
- no apology-for-apologies'-sake
Output only the rewritten version, no commentary.
Original:
"""
[paste]
"""

Difficult message — 3 versions

Help me write a [decline / push-back / disagree] message.
Recipient: [who they are, relationship]
What I need to communicate: [the message]
Constraints: [keep relationship intact / be firm / be diplomatic]
Draft three versions: gentle, neutral, firm.
End each with one open question that invites them to engage further.

Meeting follow-up

Turn these meeting notes into:
1. A 3-bullet summary I can paste into Slack
2. An action items list (owner — action — by when)
3. One open question that wasn't resolved
Notes:
"""
[paste]
"""

LinkedIn post

Draft a LinkedIn post about [topic].
Tone: confident, not braggy. Hook in the first line. 5–8 short paragraphs.
End with a question that invites comments.
No emojis, no hashtags. Avoid "thrilled to share" and "humbled".

Editing & feedback

Sharp editor

Act as a sharp editor. Don't rewrite — diagnose and prescribe.
For the draft below, return:
1. The single biggest weakness, in one sentence.
2. Three specific edits with before → after.
3. One sentence I should consider cutting entirely, and why.
Audience: [who reads this]. Tone target: [tone].
Draft:
"""
[paste]
"""

Tighten by N%

Cut this text by 40% without losing meaning. Keep the voice.
Return only the tightened version, no commentary.
"""
[paste]
"""

Stress-test my argument

Find the holes in this argument as a tough but fair critic would. Return:
1. The strongest counter-argument I haven't addressed.
2. The weakest claim I'm making, and why it's weak.
3. What evidence would change your mind.
4. One thing I should NOT change — what I'm getting right.
Argument:
"""
[paste]
"""

Learning & research

Teach me [topic]

Teach me [topic]. I already know [X, Y]. I don't know [Z].
Structure:
1. The one-sentence elevator definition.
2. The mental model (an analogy that maps to something I already know).
3. Three concrete examples of increasing complexity.
4. The two most common misconceptions and why they're wrong.
5. A 5-question self-test (no answers — I'll check myself).
Skip filler. Be willing to be wrong if I push back.

Concept to analogy

Find me the best analogy for explaining [concept] to [audience].
Return 3 candidates ranging from safe to creative.
For each, name what the analogy gets right AND what it breaks.
Pick your favorite and defend the choice.

Compare options

Compare [option A] vs [option B] for [use case]. Return a 1-page brief:
- The decision in one sentence
- Top 3 dimensions where they differ
- Best fit for each (when would you pick A, when B)
- One scenario where neither is right and what is
Skip the generic "depends on your needs" framing. Pick.

Decision-making

1-page decision memo

Help me decide between [option A] and [option B].
Context: [situation, who's affected, deadline].
Constraints: [budget, time, reversibility].
What I value most: [criteria, in order].
Produce a 1-page decision memo with:
- Recommendation in the first line
- The 3 reasons that drove it
- The strongest counter-argument and your response
- What would have to be true for the other option to win

Pre-mortem

Run a pre-mortem on this plan: [paste plan].
Imagine it's [N months] from now and the project failed. Write the failure post-mortem.
List the top 5 root causes in order of likelihood, and what we could do today to prevent each.
Be specific. "Insufficient communication" is a non-answer; "the design team and engineering used different definitions of done" is useful.

Devil's advocate

Take the position that [my proposal] is wrong. Be the strongest advocate for the opposite view. Give me:
1. Three reasons my proposal is the wrong call.
2. A version of the opposite plan that you'd actually defend.
3. One question that, if answered honestly, would tell us who's right.
Don't soften — I want the spicy version. I'll evaluate it on the merits.

Brainstorming

Diverge then converge

Help me brainstorm [topic].
Round 1 — diverge: 12 ideas covering safe, weird, ambitious, contrarian.
Round 2 — converge: pick the top 3 by [criterion] and explain trade-offs.
Round 3 — sharpen: turn the winner into a one-paragraph pitch.
Don't hedge. I want opinions.

Opposite of obvious

What's the obvious answer to [question]? State it in one sentence.
Now: what if the OPPOSITE is true? Make the strongest case for the counterintuitive answer.
Don't strawman; defend it as if you believed it.
End with: which one do you actually think is right, and why.

Coding

Bug repro & fix Bug: [describe the bug + what should happen]. Reproduce with a failing test first, then fix the smallest amount of code that turns it green. Constraints: - Don't refactor unrelated code. - No new dependencies. - Test goes in [path] next to similar cases. When done: show me the diff, the test output, and one sentence on the root cause.

Code review of a snippet
Review this code as a senior reviewer would. Flag:
1. Anything broken or buggy
2. Anything that looks like dead code or leftover debug
3. Tests that test the implementation rather than behavior
4. Risky assumptions that aren't documented
5. One thing that needs a comment explaining the WHY
Skip stylistic nits the formatter would catch.
Code:
"""
[paste]
"""

Refactor for [property]
Refactor [file/function] to [be smaller / be testable / remove duplication].
Rules:
- No behavior changes: every existing test must still pass without edits.
- Keep the public API identical.
- Split by responsibility, not by line count.
Start by reading the code and tests, then propose the split before editing.

Generate test cases
For the function below, generate test cases covering:
- Happy path
- Edge cases (empty input, null, max size, off-by-one)
- Error cases (what should throw, what should return safely)
Use [framework]. Output the tests only, no commentary.
Function:
"""
[paste]
"""
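As a rough illustration of the coverage this prompt asks for, here is a minimal Python sketch using the standard `unittest` module. The `chunk` function is a hypothetical stand-in for whatever you paste into the prompt; it is not from this guide, and a real answer would use the framework you name.

```python
import unittest


def chunk(items, size):
    """Hypothetical function under test: split a list into pieces of `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkTests(unittest.TestCase):
    # Happy path: input divides evenly.
    def test_happy_path(self):
        self.assertEqual(chunk([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    # Edge case: empty input returns an empty list, not an error.
    def test_empty_input(self):
        self.assertEqual(chunk([], 3), [])

    # Edge case: off-by-one, the last piece is shorter than `size`.
    def test_remainder_piece(self):
        self.assertEqual(chunk([1, 2, 3], 2), [[1, 2], [3]])

    # Edge case: `size` is larger than the whole input.
    def test_size_larger_than_input(self):
        self.assertEqual(chunk([1], 5), [[1]])

    # Error case: a non-positive size should raise.
    def test_invalid_size_raises(self):
        with self.assertRaises(ValueError):
            chunk([1, 2], 0)
```

Saved as a file, this can be run with `python -m unittest <file>`. Each test maps to one bullet in the prompt, which makes it easy to spot a category the model skipped.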

Regex / SQL from English
Generate a [regex / SQL query] that [what it should do].
Then explain it line-by-line in plain English.
Then give me 3 test cases: 2 that should match/return rows and 1 that shouldn't.
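To show the shape of a good answer to this prompt, here is a small Python sketch for the regex case. The request ("match a date written as YYYY-MM-DD"), the pattern, and the three test strings are all hypothetical examples chosen for illustration.

```python
import re

# Hypothetical request: "match a date written as YYYY-MM-DD".
# Piece-by-piece explanation, as the prompt asks for:
#   \d{4}                    four-digit year
#   -                        literal hyphen
#   (0[1-9]|1[0-2])          month 01 through 12
#   -                        literal hyphen
#   (0[1-9]|[12]\d|3[01])    day 01 through 31 (no per-month check)
DATE_PATTERN = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

# Two strings that should match and one that should not.
TEST_CASES = [
    ("2024-02-29", True),
    ("1999-12-01", True),
    ("2024-13-01", False),  # month 13 does not exist
]


def run_cases(pattern, cases):
    """Return (text, expected, actual) for every case that fails."""
    failures = []
    for text, expected in cases:
        actual = pattern.fullmatch(text) is not None
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


if __name__ == "__main__":
    print(run_cases(DATE_PATTERN, TEST_CASES))
```

Running the model's own test cases like this is a cheap way to catch a pattern that reads plausibly but matches the wrong strings. Note the stated limitation: this pattern accepts impossible dates such as 2024-02-31.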

Data & analysis

Clean a CSV
Attached CSV is messy. Clean it for analysis.
Steps:
1. Detect and list every quality issue (nulls, mixed types, dupes, weird dates).
2. Propose a fix for each, with the assumption you're making.
3. Apply the fixes; return a clean version.
4. End with a "trust this output if…" caveat list.
Never silently drop rows; quarantine them in a separate sheet.
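The "quarantine, don't drop" rule can also be checked or reproduced locally. The following is a minimal sketch using Python's standard `csv` module; the function name, the choice of checks (missing required values, exact duplicates, extra cells), and the sample data are assumptions for illustration, not a complete cleaning pipeline.

```python
import csv
import io


def clean_with_quarantine(csv_text, required_columns):
    """Split rows into (clean, quarantined); no row is dropped silently."""
    reader = csv.DictReader(io.StringIO(csv_text))
    clean, quarantined, seen = [], [], set()
    for raw in reader:
        # DictReader stores cells beyond the header under the key None.
        extra = raw.pop(None, None)
        # Trim stray whitespace; short rows yield None, treated as empty.
        row = {key: (value or "").strip() for key, value in raw.items()}
        missing = [col for col in required_columns if not row.get(col)]
        fingerprint = tuple(row.items())
        if extra:
            reason = "more cells than header columns"
        elif missing:
            reason = "missing: " + ", ".join(missing)
        elif fingerprint in seen:
            reason = "duplicate row"
        else:
            reason = None
        if reason is None:
            seen.add(fingerprint)
            clean.append(row)
        else:
            quarantined.append({
                "line": reader.line_num,  # source line, for later review
                "reason": reason,
                "row": row,
                "extra_cells": extra or [],
            })
    return clean, quarantined


if __name__ == "__main__":
    sample = "id,email\n1, a@example.com \n2,\n1,a@example.com\n"
    good, held = clean_with_quarantine(sample, ["id", "email"])
    print(len(good), "clean;", [q["reason"] for q in held])
```

Every input row ends up in exactly one of the two lists, so the counts can be reconciled against the original file before trusting the clean version.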

Extract structured fields
Extract structured fields from this text.
Required fields:
- [field 1] (type)
- [field 2] (type)
Output as JSON. For any field that's not stated in the text, return null. Never guess.
Text:
"""
[paste]
"""
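Once the model replies, it is worth verifying that the JSON follows the "null, never guess" contract before using it. Below is a minimal Python sketch of such a check; the schema, field names, and sample reply are hypothetical, and the reply is assumed to be a bare JSON object with no surrounding prose.

```python
import json


def check_extraction(response_text, schema):
    """Parse a JSON reply and return (record, problems).

    `schema` maps each required field to a type or tuple of types.
    A null value is always allowed: it means "not stated in the text".
    """
    data = json.loads(response_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    record, problems = {}, []
    for field, expected_type in schema.items():
        if field not in data:
            # The prompt asks for an explicit null, not an omitted key.
            problems.append(f"{field}: key missing (expected null)")
            record[field] = None
            continue
        value = data[field]
        if value is not None and not isinstance(value, expected_type):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
        record[field] = value
    for field in sorted(data.keys() - schema.keys()):
        problems.append(f"{field}: field was not requested")
    return record, problems


if __name__ == "__main__":
    # Hypothetical schema and model reply, for illustration only.
    schema = {"invoice_number": str, "total": (int, float), "due_date": str}
    reply = '{"invoice_number": "INV-7", "total": 120.5, "due_date": null}'
    print(check_extraction(reply, schema))
```

A non-empty `problems` list is a signal to re-prompt rather than to patch the record by hand. The type check is deliberately shallow; for example, it does not confirm that a date string is a valid date.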

Find anomalies
Look at this dataset. Identify the 5 most surprising rows or values.
For each, explain why it's surprising and what plausible explanations exist (data error, real outlier, definition issue).
Don't moralize about which explanation is right; just lay them out.

Documents

TL;DR a long doc
TL;DR this document. Return:
1. The thesis in one sentence.
2. The 5 claims it actually defends.
3. Any claim that's asserted but not supported.
4. The single most surprising fact.
5. Three questions worth asking the author.
If something is unclear, say "not stated"; never guess.

Find every claim
Pull every factual claim from this document into a numbered list. For each, note:
- The page/section it appears in
- Whether the doc supports it (with evidence) or just asserts it
- Confidence level (high / medium / low)
Useful for fact-checking before sharing.

Outline → draft
Turn this outline into a [first draft / one-pager / blog post].
Tone: [tone]. Length: [target]. Audience: [who].
Don't add new claims I didn't include in the outline. If a section is underspecified, leave a note like [needs example here] instead of inventing.

Creative & personal

Name brainstorm
Brainstorm 12 names for [thing]. Cover these vibes:
- 3 plain/descriptive (does what it says)
- 3 evocative/metaphorical
- 3 weird/punchy/short
- 3 contrarian (intentionally counterintuitive)
For each: a one-line rationale and one risk.
Pick your top 3 and rank them.

Cold outreach email
Draft a cold outreach email to [recipient role at company type].
Goal: [what I want them to do]
What I bring: [my context, what's interesting]
Constraint: ≤120 words, no buzzwords ("synergy," "leverage"), one specific reference to their work.
End with one easy ask (e.g., 15-min call), not "let me know if interested".

Travel itinerary
Plan a [N-day] trip to [destination] for [N people, ages, interests].
Constraints: [budget, mobility, dietary, packed-vs-slow style].
Day-by-day:
- Morning / afternoon / evening for each day
- One must-do, one easy fallback
- Estimated cost per day
- One thing locals do that tourists miss
Skip generic recommendations everyone already knows.

Interactive prompt builder

What's your goal?
Who is Gemini (role)?
Who's the audience?
Topic / subject / paste-in
Output format
Constraints / things to avoid
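The six fields above can be assembled into a prompt with a few lines of code. This is a minimal Python sketch, not the page's own builder: the `PromptFields` class, the `build_prompt` function, and the output layout are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptFields:
    """One attribute per builder field; only the goal is required."""
    goal: str
    role: str = ""
    audience: str = ""
    topic: str = ""
    output_format: str = ""
    constraints: str = ""


def build_prompt(fields):
    """Join the filled-in fields into a prompt, skipping empty ones."""
    if not fields.goal.strip():
        raise ValueError("goal is required")
    parts = []
    if fields.role.strip():
        parts.append(f"You are {fields.role.strip()}.")
    parts.append(f"Goal: {fields.goal.strip()}")
    labelled = [
        ("Audience", fields.audience),
        ("Topic", fields.topic),
        ("Output format", fields.output_format),
        ("Constraints", fields.constraints),
    ]
    for label, value in labelled:
        if value.strip():
            parts.append(f"{label}: {value.strip()}")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(PromptFields(
        goal="Summarize the attached report",
        role="a financial analyst",
        audience="executives with five minutes",
        output_format="five bullets, then one recommendation",
    )))
```

Skipping empty fields keeps short tasks short, in line with matching prompt length to task complexity.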

Patterns library

The "stop and ask" pattern

Stop & ask
Before doing anything, list the 3 most important things you'd need to know to do this well that I haven't told you. Number them. Then propose the answer you'd assume by default for each.
Don't start the work until I confirm.

Multi-source synthesis (Gemini long-context special)

Multi-source synthesis
I've attached [N] documents totaling [~M tokens]. Treat the bundle as one corpus. Cross-reference everything. Find:
1. Every place where two sources disagree, with side-by-side quotes.
2. The dominant consensus position on [topic].
3. The single most surprising claim in any source, with evidence.
4. What's missing: what should be in this corpus that isn't?
Cite source name + page/timestamp for every claim.

Deep Think escalation

Deep Think
Use Deep Think mode for this. Take your time.
Problem: [paste the hard problem]
I want:
1. Every assumption you're making, listed.
2. Alternatives you considered and rejected, with reasons.
3. Your best answer, with confidence level (high / medium / low).
4. What evidence would change your answer.

Anti-patterns

Anti-pattern	Do this instead
"Help me with this."	State the deliverable: "Rewrite this email so it's 30% shorter and warmer."
Transcribing video to text first	Upload the video itself; Gemini reads video natively.
Splitting a 500K-token doc into chunks	Use Gemini 3.1 Pro's 2M context and read it whole.
Using Pro for everything	Default to Flash; escalate to Pro only when an answer disappoints.
"Don't hallucinate."	"If unsure, say 'not stated' and explain what would resolve it."
10-paragraph prompt for a simple task	Match prompt length to task complexity.
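
The "default to Flash, escalate to Pro" row can be expressed as a small routing function. The sketch below is in Python and makes no real API calls: `ask_fast`, `ask_strong`, and `is_good_enough` are hypothetical callables you would supply, wrapping whichever client and quality check you actually use.

```python
def answer_with_escalation(prompt, ask_fast, ask_strong, is_good_enough):
    """Try the cheaper model first; escalate once if the draft falls short.

    Returns a (tier, answer) pair so callers can log how often
    escalation happens.
    """
    draft = ask_fast(prompt)
    if is_good_enough(draft):
        return "fast", draft
    return "strong", ask_strong(prompt)


if __name__ == "__main__":
    # Stub models and a toy quality check, for illustration only.
    def fast(prompt):
        return "Short answer."

    def strong(prompt):
        return "A longer, more carefully reasoned answer."

    def long_enough(text):
        return len(text.split()) >= 5

    print(answer_with_escalation("Explain the trade-off.", fast, strong, long_enough))
```

In practice the quality check is usually a person deciding the answer disappointed. An automated check like the word-count stub here is only a placeholder.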

Universal rescue prompts

Diagnose & reset
Pause. Tell me in 3 bullets why your last response missed what I wanted. Then propose a tighter prompt I could give you to fix it. Wait; don't try again yet.

Cut the fluff
Redo your last response with no preamble, no caveats, no closing summary. Just the substance.

Show your work
Walk me through your reasoning before the answer this time. I want to see the trade-offs you considered.

Stronger opinion
Drop the both-sides framing. Pick the option you'd choose if it were your own project, and defend it. I'll push back if I disagree.