Technology May 03, 2026 · 2 min read

Gemini API Cheatsheet 2026 — Free Tier Limits, Models, and Endpoints in One Place


DEV Community
by hiyoyo

If this is useful, a ❤️ helps others find it.

Everything I keep looking up when building with Gemini — in one place.

Models (2026)

| Model | Context | Best for |
| --- | --- | --- |
| gemini-2.5-flash-preview | 1M tokens | General use, thinking, fast |
| gemini-2.5-pro-preview | 1M tokens | Complex reasoning, best quality |
| gemini-1.5-flash | 1M tokens | Stable, production-ready |
| gemini-1.5-pro | 2M tokens | Longest context |
| gemini-2.0-flash-lite | 1M tokens | Lowest latency, highest volume |

For most use cases: gemini-2.5-flash-preview

Free Tier Limits (Google AI Studio)

| Model | RPM | TPM | RPD |
| --- | --- | --- | --- |
| Gemini 2.5 Flash Preview | 10 | 250,000 | 500 |
| Gemini 1.5 Flash | 15 | 1,000,000 | 1,500 |
| Gemini 1.5 Pro | 2 | 32,000 | 50 |
| Gemini 2.0 Flash Lite | 30 | 1,000,000 | 1,500 |

RPM = requests per minute, TPM = tokens per minute, RPD = requests per day
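These limits are easy to respect client-side. A minimal sketch (my own helper names, not part of any SDK): space requests by 60 s ÷ RPM, and back off exponentially when you do hit a 429.

```rust
// Sketch: client-side pacing for a free-tier RPM limit, plus an
// exponential backoff schedule for 429 responses. Stdlib only.
use std::time::Duration;

/// Minimum gap between requests for a given requests-per-minute limit.
fn min_interval(rpm: u32) -> Duration {
    Duration::from_millis(60_000 / rpm as u64)
}

/// Exponential backoff delay for the nth retry (1-based), capped at 60 s.
fn backoff(attempt: u32) -> Duration {
    Duration::from_secs(2u64.saturating_pow(attempt).min(60))
}

fn main() {
    // 10 RPM (Gemini 2.5 Flash Preview free tier) → one request every 6 s.
    assert_eq!(min_interval(10), Duration::from_secs(6));
    println!("gap: {:?}, 3rd retry backoff: {:?}", min_interval(10), backoff(3));
}
```

Sleeping `min_interval` between calls keeps you under RPM, but note it does nothing for the TPM and RPD limits — those you have to track separately.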

Basic Request (REST)

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:generateContent \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Your prompt here"}]}]
  }'

With System Prompt

{
  "system_instruction": {
    "parts": [{"text": "You are a helpful assistant."}]
  },
  "contents": [
    {"role": "user", "parts": [{"text": "Your prompt here"}]}
  ]
}

Streaming

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:streamGenerateContent \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{"contents": [{"parts": [{"text": "Tell me a story"}]}]}'

In Rust (reqwest)

use reqwest::Client;
use serde_json::json;

/// Call generateContent and return the first candidate's text.
pub async fn call_gemini(prompt: &str, api_key: &str) -> Result<String, reqwest::Error> {
    let client = Client::new();
    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:generateContent?key={}",
        api_key
    );

    let body = json!({
        "contents": [{"parts": [{"text": prompt}]}]
    });

    let res = client.post(&url).json(&body).send().await?;
    let data: serde_json::Value = res.json().await?;

    let text = data["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .unwrap_or("")
        .to_string();

    Ok(text)
}

Error Codes

| Code | Meaning | Fix |
| --- | --- | --- |
| 400 | Bad request / token limit | Shorten prompt |
| 403 | Invalid API key | Check key |
| 429 | Rate limit hit | Wait and retry |
| 500 | Internal error | Retry |
| 503 | Overloaded | Wait 2 s, retry once |
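One way to act on this table in code (a hypothetical helper, not from any SDK): split statuses into "fail fast" versus "retry after a delay", with delays following the suggestions above.

```rust
// Sketch: map Gemini HTTP status codes to a retry decision.
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Action {
    Fail,                 // 400/403: fix the request or key; retrying won't help
    RetryAfter(Duration), // 429/500/503: transient; retry after a delay
}

fn classify(status: u16) -> Action {
    match status {
        429 => Action::RetryAfter(Duration::from_secs(30)), // rate limit: back off hard
        500 => Action::RetryAfter(Duration::from_secs(1)),  // internal error: quick retry
        503 => Action::RetryAfter(Duration::from_secs(2)),  // overloaded: wait 2 s
        _ => Action::Fail,                                  // 400, 403, anything else
    }
}

fn main() {
    assert_eq!(classify(503), Action::RetryAfter(Duration::from_secs(2)));
    assert_eq!(classify(400), Action::Fail);
}
```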

Token Counting (rough guide)

  • 1 token ≈ 4 characters in English
  • 1 token ≈ 2–3 characters in Japanese
  • 100 lines of logcat ≈ 3,000–5,000 tokens
  • 1 page of PDF text ≈ 500–800 tokens
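These rules of thumb can be turned into a quick pre-flight estimator. This is a heuristic sketch only (2.5 chars/token is my midpoint for the Japanese range above); for exact numbers use the API's `countTokens` endpoint.

```rust
// Rough token estimate from the rules of thumb: ~4 chars/token for
// ASCII (English-ish) text, ~2.5 chars/token for non-ASCII (e.g. Japanese).
fn estimate_tokens(text: &str) -> usize {
    let ascii = text.chars().filter(|c| c.is_ascii()).count();
    let non_ascii = text.chars().count() - ascii;
    (ascii as f64 / 4.0 + non_ascii as f64 / 2.5).ceil() as usize
}

fn main() {
    println!("~{} tokens", estimate_tokens("hello world!")); // 12 ASCII chars / 4
}
```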

Get a Free API Key

  1. Go to aistudio.google.com
  2. Sign in with Google
  3. Click "Get API Key"
  4. Done — no credit card required

Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok

Source

This article was originally published by DEV Community and written by hiyoyo.
