This is day 2 of the series. Yesterday I showed the architecture diagram. Today we are looking at the database that sits underneath it. Every table, every column, every index, and the reason each one exists.
If you are following with kavachOS, pnpm kavachos migrate creates all of this for you, and you can skim the article to understand what is in your database. If you are building it yourself, this is your schema.
The shape at a glance
Eight tables. One user-centric join key. No fancy polymorphic associations. No stored procedures. It stays boring on purpose.
                                 +-------+
                                 | users |
                                 +---+---+
                                     |
    +----------+----------+----------+----------+----------+----------+
    |          |          |          |          |          |          |
    v          v          v          v          v          v          v
+--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
|sessions| | oauth_ | |password| | magic_ | | email_ | |passkeys| | agent_ |
|        | |accounts| | _reset_| | link_  | | verif_ | |        | | tokens |
|        | |        | | tokens | | tokens | | tokens | |        | |        |
+--------+ +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
Every other table has a user_id foreign key back to users with on delete cascade. If you delete a user, everything goes. No orphans. No manual cleanup jobs.
Postgres, and the extensions you want on
create extension if not exists "uuid-ossp";
create extension if not exists "citext";
create extension if not exists "pgcrypto";
Three of them, and each earns its keep.
citext: case-insensitive text. The moment you use it for email, Alice@Example.com and alice@example.com collide on the unique constraint. Without it, you will discover the bug when a customer complains that they cannot log in with the same email they signed up with because they capitalized it differently.
pgcrypto: gives you gen_random_uuid() for default UUID values, and digest() if you ever want to hash in the database (I do not, but it is there).
uuid-ossp: only if you want uuid_generate_v4() the old way. pgcrypto covers this, so you can skip uuid-ossp if you want. I include it because older migrations assume it.
The users table
create table users (
id bigserial primary key,
email citext not null unique,
password_hash text,
email_verified boolean not null default false,
failed_login_attempts int not null default 0,
locked_until timestamptz,
last_login_at timestamptz,
created_at timestamptz not null default now(),
updated_at timestamptz not null default now()
);
create index idx_users_email on users (email);
A few decisions worth explaining.
bigserial over uuid. A bigserial is 8 bytes, an indexed lookup is fast, and it sorts cleanly by creation order. A UUID is 16 bytes, the primary key index is larger, and on older Postgres versions random UUIDs cause B-tree bloat. If you are going to expose IDs in URLs, use a separate public_id column that is a UUID or a slug, and keep id as a bigserial for joins. Best of both.
password_hash is nullable. A user who signs up via Google OAuth never sets a password. A nullable password_hash means no password login is possible for that user until they explicitly set one. If you make it NOT NULL you need a sentinel value and now you have a bug waiting to happen.
failed_login_attempts and locked_until. These support account lockout after N failed attempts in a window. Some teams hate this (users get locked out and complain). I default to "lock after 10 failures in 10 minutes" which is almost invisible to humans and useful against credential stuffing.
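A minimal sketch of that policy as pure functions over the two columns, assuming the "10 failures, 10 minute lock" default (isLocked and recordFailure are illustrative names, not from any library):

```typescript
const MAX_FAILURES = 10;
const LOCK_MINUTES = 10;

interface LockState {
  failedLoginAttempts: number;
  lockedUntil: Date | null;
}

// Is this account currently locked out?
function isLocked(u: LockState, now: Date): boolean {
  return u.lockedUntil !== null && u.lockedUntil > now;
}

// Called after a failed password check: bump the counter and,
// once past the threshold, set locked_until ten minutes out.
function recordFailure(u: LockState, now: Date): LockState {
  const attempts = u.failedLoginAttempts + 1;
  return {
    failedLoginAttempts: attempts,
    lockedUntil:
      attempts >= MAX_FAILURES
        ? new Date(now.getTime() + LOCK_MINUTES * 60_000)
        : u.lockedUntil,
  };
}
```

On a successful login you would reset failedLoginAttempts to 0 and clear locked_until in the same update that sets last_login_at.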
citext on email. Already covered. Do it.
The sessions table
create table sessions (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
token_hash text not null unique,
expires_at timestamptz not null,
last_used_at timestamptz not null default now(),
user_agent text,
ip_address inet,
created_at timestamptz not null default now()
);
create index idx_sessions_user on sessions (user_id);
create index idx_sessions_expires on sessions (expires_at);
create index idx_sessions_last_used on sessions (last_used_at);
token_hash, not token. Same rule as every other token in this schema. You store the SHA-256 of the cookie value. If the database leaks, the leaked rows are useless.
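The write path can be sketched in a few lines of Node. This is a hedged sketch: the 32-byte token size and hex digest encoding are my assumptions; the article only fixes SHA-256.

```typescript
import { randomBytes, createHash } from "node:crypto";

// The value that goes into the cookie: 32 random bytes, base64url encoded.
function newSessionToken(): string {
  return randomBytes(32).toString("base64url");
}

// What gets stored in sessions.token_hash: the SHA-256 of the cookie value.
// A leaked row cannot be turned back into a working cookie.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}
```

On every request you hash the incoming cookie value and look up that digest; the raw token never touches the database.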
user_agent and ip_address are for display only. Do not tie session validity to either. Phone carriers rotate IPs, browsers update user agents, you will end up signing users out at random.
inet type for ip_address. Postgres has a native IP type. It handles IPv4 and IPv6, it indexes correctly, and it is smaller than the equivalent text column. No reason to use text here.
The index on last_used_at. You will want this the first time a user opens the "active sessions" screen and sees 40 rows. An index lets you order by recency cheaply.
The oauth_accounts table
create table oauth_accounts (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
provider text not null,
provider_uid text not null,
access_token text,
refresh_token text,
expires_at timestamptz,
scope text,
created_at timestamptz not null default now(),
updated_at timestamptz not null default now(),
unique (provider, provider_uid)
);
create index idx_oauth_user on oauth_accounts (user_id);
The (provider, provider_uid) unique constraint. This is the identity of the OAuth account. A Google account with sub 1234 is the same account no matter how many times it connects. Without this constraint you can end up creating duplicate users every time someone clicks "Sign in with Google".
access_token and refresh_token as text, not encrypted. I know some people want to encrypt these at rest. If you do, use application-level encryption with a KMS-managed key, not disk-level encryption (which does not help against a compromised database user). For most side projects, the risk calculus does not justify the operational cost.
Why scope as text instead of an array? You can, but you almost never query for "all accounts with scope X". Storing the original space-separated scope string is simpler and preserves what the provider actually granted.
The password_reset_tokens table
create table password_reset_tokens (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
token_hash text not null unique,
expires_at timestamptz not null,
used_at timestamptz,
created_at timestamptz not null default now()
);
create index idx_prt_user_unused on password_reset_tokens (user_id) where used_at is null;
create index idx_prt_expires on password_reset_tokens (expires_at);
Partial index on unused tokens. The lookup during a reset is "find the unused token for this user that matches this hash". A partial index where used_at is null is smaller and faster than a full one, because used tokens never participate in the lookup.
token_hash is unique. The odds of a random 32-byte token colliding are effectively zero, but the unique constraint gives you a free integrity check and a better error message if your random number generator ever breaks.
used_at is nullable. On successful reset, set it to now(). The verification query checks used_at is null, so a single row can be used exactly once. This is the whole one-time-use property.
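That one-time-use check is easy to express as a pure predicate over the row (names here are illustrative):

```typescript
interface ResetTokenRow {
  usedAt: Date | null;   // set to now() on successful reset
  expiresAt: Date;
}

// A reset token is redeemable only if it has never been used
// and has not expired. This is the whole one-time-use property.
function isRedeemable(row: ResetTokenRow, now: Date): boolean {
  return row.usedAt === null && row.expiresAt > now;
}
```

In the real flow you would do the check and the `set used_at = now()` update in one transaction so two concurrent requests cannot both redeem the same row.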
The magic_link_tokens table
create table magic_link_tokens (
id bigserial primary key,
user_id bigint references users(id) on delete cascade,
email citext not null,
token_hash text not null unique,
expires_at timestamptz not null,
used_at timestamptz,
created_at timestamptz not null default now()
);
create index idx_mlt_email on magic_link_tokens (email);
create index idx_mlt_expires on magic_link_tokens (expires_at);
user_id is nullable. A magic link can be sent to an email that does not yet have an account. On click, the system either signs in the existing user or creates a new one. Nullable user_id lets you create the token before the user exists.
email is stored directly. Redundant if the user exists, but necessary for the new-user case. Keep it.
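The sign-in-or-create decision on click can be sketched as a pure function over the token row (the names are mine, not from any library):

```typescript
interface MagicLinkRow {
  userId: number | null; // null: no account existed when the link was sent
  email: string;         // always present, so we can create the account
}

// After the token itself checks out, decide what happens next.
function resolveMagicLink(row: MagicLinkRow):
  | { action: "sign-in"; userId: number }
  | { action: "create-user"; email: string } {
  return row.userId !== null
    ? { action: "sign-in", userId: row.userId }
    : { action: "create-user", email: row.email };
}
```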
The email_verification_tokens table
create table email_verification_tokens (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
token_hash text not null unique,
expires_at timestamptz not null,
used_at timestamptz,
created_at timestamptz not null default now()
);
create index idx_evt_user_unused on email_verification_tokens (user_id) where used_at is null;
This table is almost identical to password_reset_tokens. Two observations.
Why not combine them into one tokens table with a type column? You can, and some libraries do. I keep them separate because the access patterns differ (reset tokens expire in 15 minutes, verification tokens in 24 hours), the cleanup jobs run at different cadences, and splitting them keeps each table small and easy to reason about. A type column also means every query has a where type = ... clause, which is easy to forget.
The 24-hour expiry is intentional. A password reset window needs to be short because the blast radius is high. An email verification window is lower risk (the worst case is a user who cannot activate their account), so the window is longer.
The passkeys table
create table passkeys (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
credential_id text not null unique,
public_key bytea not null,
counter bigint not null default 0,
transports text[],
name text,
last_used_at timestamptz,
created_at timestamptz not null default now()
);
create index idx_passkeys_user on passkeys (user_id);
create index idx_passkeys_cred on passkeys (credential_id);
credential_id is the WebAuthn identifier. It comes back from the browser during registration. We store it as base64url text. The unique constraint prevents the same key from being registered twice.
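A sketch of the encoding step, assuming Node's Buffer (the helper names are mine): the browser hands back credential.rawId as an ArrayBuffer, and base64url text is URL-safe and fits the text column.

```typescript
// Store the ArrayBuffer from the browser as base64url text.
function credentialIdToText(rawId: ArrayBuffer): string {
  return Buffer.from(rawId).toString("base64url");
}

// Decode it again when passing allowCredentials back to the browser.
function credentialIdToBytes(stored: string): Uint8Array {
  return new Uint8Array(Buffer.from(stored, "base64url"));
}
```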
public_key as bytea. The key material is binary. Use the native bytea type, not text. It is smaller and does not require decoding on every verification.
counter starts at 0 and increments on each use. The WebAuthn spec uses it as a cloning-detection signal. If an assertion comes in with a counter lower than or equal to the last-seen value (and the values are not both zero, which just means the authenticator does not implement counters), the credential may have been cloned. Handle that case by invalidating the credential.
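The comparison as a small function; this is a simplified sketch of the spec's guidance, not a full WebAuthn verifier:

```typescript
// Compare the stored passkeys.counter against the assertion's counter.
// Authenticators that do not implement counters always report 0,
// so a 0/0 pair carries no clone signal and is accepted.
function checkCounter(
  stored: number,
  asserted: number,
): "accept" | "possible-clone" {
  if (stored === 0 && asserted === 0) return "accept";
  return asserted > stored ? "accept" : "possible-clone";
}
```

On "accept" you write the new counter back to the row; on "possible-clone" you invalidate the credential and force re-registration.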
transports as text[]. The browser tells you which transports the authenticator supports (usb, nfc, ble, internal, hybrid). Storing them lets you pass the right set back to the browser on the next authentication, which speeds up the UX.
name is user-supplied. Let users name their passkeys ("Work laptop", "Phone"). When they lose a device, this is how they find the right row to revoke.
The agent_tokens table
create table agent_tokens (
id bigserial primary key,
user_id bigint not null references users(id) on delete cascade,
token_hash text not null unique,
name text,
permissions text[] not null default '{}',
expires_at timestamptz,
last_used_at timestamptz,
revoked_at timestamptz,
created_at timestamptz not null default now()
);
create index idx_agent_user on agent_tokens (user_id);
create index idx_agent_active on agent_tokens (user_id) where revoked_at is null;
This is the new one. The agent token system is how scripts, cron jobs, and AI agents authenticate on behalf of a user. Article 11 goes deep on this.
permissions as a text array. Something like {reports:read, invoices:write}. Granular enough to be useful, simple enough to reason about. You can upgrade to a real scope/policy engine later.
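A sketch of the check against that array, assuming exact resource:action strings; the resource:* wildcard is my extension here, not something the schema mandates:

```typescript
// Check a required permission against agent_tokens.permissions.
// Permission strings are "resource:action" pairs, e.g. "reports:read".
function hasPermission(granted: string[], required: string): boolean {
  const [resource] = required.split(":");
  return granted.includes(required) || granted.includes(`${resource}:*`);
}
```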
expires_at is nullable. Some agent tokens live forever (long-running bots). Most should have an expiry. Treat nullable as "no expiry" and document it clearly.
revoked_at is separate from expires_at. Revocation is explicit. Expiry is passive. Different signals, different columns.
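Both signals combine into one predicate (a sketch; a nullable expires_at reads as "no expiry"):

```typescript
interface AgentTokenRow {
  expiresAt: Date | null; // null means the token never expires
  revokedAt: Date | null; // null means the token was never revoked
}

// A token is usable only if it is neither revoked nor past its expiry.
function isTokenActive(row: AgentTokenRow, now: Date): boolean {
  if (row.revokedAt !== null) return false;
  return row.expiresAt === null || row.expiresAt > now;
}
```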
Drizzle schema (TypeScript)
If you are using Drizzle ORM, here is the whole thing in one file:
// db/schema.ts
import {
bigserial,
bigint,
boolean,
inet,
integer,
pgTable,
text,
timestamp,
bytea,
unique,
index,
} from "drizzle-orm/pg-core";
import { citext } from "./citext";
export const users = pgTable(
"users",
{
id: bigserial("id", { mode: "number" }).primaryKey(),
email: citext("email").notNull().unique(),
passwordHash: text("password_hash"),
emailVerified: boolean("email_verified").notNull().default(false),
failedLoginAttempts: integer("failed_login_attempts").notNull().default(0),
lockedUntil: timestamp("locked_until", { withTimezone: true }),
lastLoginAt: timestamp("last_login_at", { withTimezone: true }),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(t) => ({ emailIdx: index("idx_users_email").on(t.email) }),
);
export const sessions = pgTable(
"sessions",
{
id: bigserial("id", { mode: "number" }).primaryKey(),
userId: bigint("user_id", { mode: "number" })
.notNull()
.references(() => users.id, { onDelete: "cascade" }),
tokenHash: text("token_hash").notNull().unique(),
expiresAt: timestamp("expires_at", { withTimezone: true }).notNull(),
lastUsedAt: timestamp("last_used_at", { withTimezone: true }).notNull().defaultNow(),
userAgent: text("user_agent"),
ipAddress: inet("ip_address"),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
},
(t) => ({
userIdx: index("idx_sessions_user").on(t.userId),
expiresIdx: index("idx_sessions_expires").on(t.expiresAt),
lastUsedIdx: index("idx_sessions_last_used").on(t.lastUsedAt),
}),
);
// ...same pattern for the other 6 tables, omitted for space
The Drizzle citext helper is a tiny file:
// db/citext.ts
import { customType } from "drizzle-orm/pg-core";
export const citext = customType<{ data: string }>({
dataType: () => "citext",
});
Full schema with all 8 tables is in the starter repo at github.com/kavachos/nextjs-auth-from-scratch on the 02-schema branch.
Running the migrations
$ pnpm drizzle-kit generate
✓ Your SQL migrations file saved to drizzle/0000_initial.sql
$ pnpm drizzle-kit push
Pulling schema from database...
Changes applied:
users                      [+] 9 columns
sessions                   [+] 8 columns
oauth_accounts             [+] 10 columns
password_reset_tokens      [+] 6 columns
magic_link_tokens          [+] 7 columns
email_verification_tokens  [+] 6 columns
passkeys                   [+] 9 columns
agent_tokens               [+] 9 columns
Verify it all landed:
$ psql $DATABASE_URL -c "\dt"
             List of relations
 Schema |           Name            | Type  | Owner
--------+---------------------------+-------+----------
 public | agent_tokens              | table | postgres
 public | email_verification_tokens | table | postgres
 public | magic_link_tokens         | table | postgres
 public | oauth_accounts            | table | postgres
 public | passkeys                  | table | postgres
 public | password_reset_tokens     | table | postgres
 public | sessions                  | table | postgres
 public | users                     | table | postgres
(8 rows)
And the indexes:
$ psql $DATABASE_URL -c "\di"
                     List of relations
 Schema |          Name          | Type  |           Table
--------+------------------------+-------+---------------------------
 public | idx_agent_active       | index | agent_tokens
 public | idx_agent_user         | index | agent_tokens
 public | idx_evt_user_unused    | index | email_verification_tokens
 public | idx_mlt_email          | index | magic_link_tokens
 public | idx_mlt_expires        | index | magic_link_tokens
 public | idx_oauth_user         | index | oauth_accounts
 public | idx_passkeys_cred      | index | passkeys
 public | idx_passkeys_user      | index | passkeys
 public | idx_prt_expires        | index | password_reset_tokens
 public | idx_prt_user_unused    | index | password_reset_tokens
 public | idx_sessions_expires   | index | sessions
 public | idx_sessions_last_used | index | sessions
 public | idx_sessions_user      | index | sessions
 public | idx_users_email        | index | users
(14 rows)
Cleanup: the expiry janitor
Tokens accumulate. Sessions accumulate. Without cleanup, the tables grow and lookups slow down.
Run this once a day on a cron:
delete from password_reset_tokens
where expires_at < now() - interval '7 days';
delete from magic_link_tokens
where expires_at < now() - interval '7 days';
delete from email_verification_tokens
where expires_at < now() - interval '30 days';
delete from sessions
where expires_at < now() - interval '1 day';
Why the extra grace period? If a user comes back with an expired link, you want to show them "this link expired" rather than "this link does not exist". Keeping expired rows around for a week gives you the ability to distinguish those cases in the query.
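With the grace period in place, the lookup can distinguish those states (a sketch; a missing row means the janitor already deleted it or it never existed):

```typescript
// Classify a token lookup result for the error message shown to the user.
function classifyLink(
  row: { expiresAt: Date } | undefined,
  now: Date,
): "valid" | "expired" | "unknown" {
  if (!row) return "unknown"; // purged by the cleanup job, or never issued
  return row.expiresAt > now ? "valid" : "expired";
}
```

"expired" gets the friendly "this link expired, request a new one" page; "unknown" gets the generic "this link does not exist".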
Agent tokens and passkeys do not get cleaned up automatically. They are explicit user artifacts and should only go when the user asks or when the user account itself is deleted.
The same thing in kavachOS
If you do not want to own any of this, the kavachOS version is one command:
$ pnpm add kavachos @kavachos/nextjs
$ pnpm kavachos migrate
Running migrations against $DATABASE_URL
✓ 0001_users
✓ 0002_sessions
✓ 0003_oauth_accounts
✓ 0004_password_reset_tokens
✓ 0005_magic_link_tokens
✓ 0006_email_verification_tokens
✓ 0007_passkeys
✓ 0008_agent_tokens
✓ 0009_indexes
Done. 8 tables, 14 indexes.
The schema kavachOS ships is close to the one in this article, with two differences: it uses uuid for id columns by default (so tenants in multi-tenant deployments do not share key spaces), and it ships its own cleanup job as a scheduled worker. Docs: kavachos.com/docs/schema.
If you want to add custom columns to users (like full_name, avatar_url, tenant_id), kavachOS supports that through a schema extension mechanism. You do not fork the library.
Three things I would change if I had to do this again
I would use identity columns instead of bigserial. Postgres 10+ supports bigint generated always as identity, which is standard SQL, cleaner than hand-managed sequences, and easier to work with in replication setups. I default to bigserial because most tutorials do, but identity columns are better if you are starting fresh.
I would put locked_until on a separate user_security table. It is a hot-updated column and separating it keeps the main users row static. Minor optimization, only matters at scale.
I would back email_verified with an event log. A lone boolean that represents "the current state of a verification process" loses the history. A small user_events table that records every verification attempt, lockout, and password change is invaluable when debugging and cheap to maintain.
None of these are blocking. The schema above ships and runs for years.
What is next
Tomorrow: the register flow, which uses exactly this schema. Then the login flow with the timing attack defense nobody covers in other tutorials.
Comment with the column you think I got wrong. I like database arguments.
Gagan Deep Singh builds open source tools at Glincker. Currently working on kavachOS (open source auth for AI agents and humans) and AskVerdict (multi-model AI verdicts).
If this was useful, follow me on Dev.to or X where I post weekly.