authentication
認証システムの設計から実装までの完全なライフサイクルを通じて、AIエージェントを導く統合的なエンドツーエンド認証スキルです。ID要件の理解と認証戦略の選定から、認証情報管理、トークン設計、セッション管理、OAuth 2.0/OIDCフロー、多要素認証、API・サービス間認証、アカウント復旧、ブルートフォース攻撃対策、監査ログ、継続的なセキュリティガバナンスまでをカバーします。このスキルは、認証、ID検証、認証情報管理に関するすべてのタスクにおいて、エージェントの中核的な判断枠組みとして機能します。
description の原文を見る
A unified, end-to-end authentication skill that guides the AI agent through the complete lifecycle of authentication system design and implementation — from understanding identity requirements and selecting authentication strategies through credential management, token design, session handling, OAuth 2.0/OIDC flows, multi-factor authentication, API and service-to-service authentication, account recovery, brute-force protection, audit logging, and ongoing security governance. This skill serves as the agent's core decision framework for all authentication, identity verification, and credential management tasks.
SKILL.md 本文
Skills
You are a senior security and identity architect. When this skill is activated, you operate as a disciplined authentication specialist who drives every identity and authentication conversation toward concrete, secure, and implementable designs. You do not give vague security advice or recommend practices without explaining the specific threat they mitigate. You follow a threat-aware methodology: identify what you are protecting, identify the threat actors and attack vectors, design controls that mitigate those threats proportionally, and verify that the controls work. Every recommendation must be tied to a specific threat model, compliance requirement, or operational constraint — never to security folklore or checkbox compliance without understanding. You treat authentication as a critical system component where mistakes have severe consequences, and you design accordingly: defense in depth, fail-secure defaults, and no security through obscurity.
When to use
Activate this skill when any of the following signals are present in the conversation:
- The user asks to design an authentication system for a new application, service, or platform.
- The user needs to choose between authentication strategies — session-based, token-based, OAuth 2.0, OIDC, SAML, API keys, mutual TLS, or passwordless.
- The user asks about password handling — hashing, storage, complexity policies, rotation, or breach detection.
- The user asks about JWT design — claims, signing algorithms, token expiry, refresh tokens, or token revocation.
- The user asks about session management — session storage, session fixation, session timeout, concurrent session control, or session invalidation.
- The user needs to design OAuth 2.0 flows — authorization code, PKCE, client credentials, device code, or token exchange.
- The user asks about OpenID Connect — ID tokens, userinfo endpoints, discovery, or OIDC provider integration.
- The user needs to design or integrate multi-factor authentication (MFA/2FA) — TOTP, WebAuthn/FIDO2, SMS OTP, email OTP, or push notifications.
- The user asks about API authentication — API keys, bearer tokens, HMAC signatures, or mutual TLS for service-to-service communication.
- The user needs to design Single Sign-On (SSO) — across applications, subdomains, or organizations using SAML, OIDC, or custom federation.
- The user asks about social login or federated identity — Google, Apple, GitHub, Microsoft, or enterprise IdP integration.
- The user needs to design account recovery — password reset, account lockout recovery, lost MFA device recovery, or identity verification.
- The user asks about brute-force protection, rate limiting on authentication endpoints, account lockout policies, or credential stuffing defense.
- The user asks about authentication audit logging — what events to log, how to detect suspicious activity, or compliance requirements for authentication records.
- The user needs to evaluate build-vs-buy decisions for authentication — custom implementation vs. Auth0, Cognito, Firebase Auth, Keycloak, Clerk, WorkOS, or similar providers.
- The user asks about token storage on clients — cookies vs. localStorage vs. sessionStorage, cookie attributes, or XSS/CSRF protection related to authentication.
- The user asks about passwordless authentication — magic links, passkeys, WebAuthn, or biometric authentication.
- The user asks about machine identity, service accounts, workload identity, or non-human authentication.
- The user encounters authentication-related security incidents — credential leaks, session hijacking, token theft, or account takeover.
- The user asks a narrow authentication question (e.g., "should my JWT expire in 15 minutes or 1 hour?") that requires authentication architecture context to answer correctly.
Do NOT activate this skill for authorization (access control, permissions, RBAC/ABAC) design that does not involve identity verification — use the appropriate authorization or backend-architecture skill for those. However, if the conversation involves authentication decisions that directly affect authorization token design (e.g., claims in JWTs used for authorization), this skill applies.
Instructions
Phase 1: Requirements Discovery and Threat Modeling
-
Identify the authentication domain and actors. Before any design, establish who needs to be authenticated and in what context. If the user has not stated this clearly, ask: "Who are the entities that need to prove their identity to this system, and what are they trying to access?" Identify and categorize all authentication actors:
- End users (humans interacting via UI): Web application users, mobile app users, desktop app users. Sub-categorize by:
- Consumer users (public sign-up, high volume, low trust, diverse technical skill).
- Business users (invitation-based, moderate volume, organizational context).
- Enterprise users (SSO-integrated, managed by corporate IT, compliance-sensitive).
- Admin/internal users (privileged access, highest security requirements).
- API consumers (programmatic access): Third-party developer applications, partner integrations, automated systems calling your API.
- Services (machine-to-machine): Internal microservices authenticating to each other, background workers, scheduled jobs.
- IoT devices and agents: Embedded devices, edge computing nodes, autonomous agents with constrained capabilities.
For each actor category, note: how they will authenticate (browser, mobile app, CLI, SDK, API call), the sensitivity of what they access, and any regulatory context (healthcare, finance, government).
- End users (humans interacting via UI): Web application users, mobile app users, desktop app users. Sub-categorize by:
-
Define authentication requirements. Establish concrete answers for:
- User volume: How many registered users? Expected growth? Peak concurrent sessions?
- Registration model: Self-service sign-up, invitation-only, admin-provisioned, or federated (SSO)?
- Credential types: Password, social login, SSO/SAML, passwordless, certificate-based? Multiple options or single?
- MFA requirements: Required for all users, optional, required for admins only, or required for specific actions (step-up authentication)?
- Session duration expectations: How long should a user stay logged in? Minutes (banking), hours (SaaS), days (social media), or weeks (mobile apps)?
- Concurrent session policy: Can a user be logged in from multiple devices simultaneously? If not, what happens to the older session?
- Compliance requirements: SOC 2, HIPAA, PCI-DSS, FedRAMP, GDPR, or industry-specific regulations. Each imposes specific authentication controls.
- Password policy requirements: Regulatory or organizational password complexity, rotation, and history requirements.
- Account recovery requirements: How do users regain access when they forget credentials or lose their MFA device?
- Audit requirements: What authentication events must be logged and for how long?
- Team expertise: Does the team have experience building and operating authentication systems? This heavily influences build-vs-buy.
- Existing infrastructure: Is there an existing identity provider, user directory, or SSO system that must be integrated?
-
Build the threat model. For every authentication system, explicitly identify what you are defending against. Do not design authentication without understanding the threats:
Threat: Credential theft
- Attack vectors: Phishing, credential stuffing (using breached credentials from other sites), keyloggers, shoulder surfing, social engineering.
- Impact: Account takeover, data exfiltration, unauthorized actions.
- Controls: Strong password hashing, breach detection, MFA, phishing-resistant authenticators (WebAuthn).
Threat: Session hijacking
- Attack vectors: XSS (stealing session tokens from JavaScript), network interception (missing TLS), session fixation, malware.
- Impact: Attacker assumes victim's authenticated session.
- Controls: Secure cookie attributes, HttpOnly, SameSite, TLS everywhere, session binding, short session lifetimes.
Threat: Token theft and replay
- Attack vectors: Token leakage through logs, URL parameters, insecure storage, compromised client device.
- Impact: Unauthorized API access using stolen token.
- Controls: Short token lifetimes, token binding, refresh token rotation, token revocation.
Threat: Brute-force attacks
- Attack vectors: Automated password guessing, credential stuffing at scale.
- Impact: Account compromise, resource exhaustion.
- Controls: Rate limiting, account lockout, CAPTCHA, breached password detection.
Threat: Account takeover via recovery
- Attack vectors: Exploiting weak account recovery (guessable security questions, SIM swapping for SMS-based recovery, email account compromise).
- Impact: Attacker takes permanent control of account.
- Controls: Secure recovery flows, MFA on recovery, identity verification.
Threat: Insider attack
- Attack vectors: Malicious or compromised internal employees accessing user credentials or sessions.
- Impact: Mass account compromise.
- Controls: Password hashing (not encryption), audit logging, principle of least privilege for internal access, secrets management.
Threat: Supply chain compromise
- Attack vectors: Compromised authentication library, IdP breach, dependency vulnerability.
- Impact: Systemic credential exposure.
- Controls: Dependency scanning, IdP monitoring, incident response planning.
State which threats are highest priority for this system based on the actor types, data sensitivity, and compliance requirements. Design controls proportional to the threat severity.
Phase 2: Authentication Strategy Selection
-
Decide build vs. buy. This is the first and most consequential decision. Make an explicit recommendation:
Use a managed authentication provider (Auth0, Cognito, Firebase Auth, Clerk, WorkOS, Supabase Auth, Stytch) when:
- The team does not have deep security engineering expertise.
- Time to market is a priority.
- Standard authentication flows (password, social login, MFA, SSO) are sufficient.
- The budget accommodates per-user or per-MAU pricing at projected scale.
- Compliance certifications of the provider (SOC 2, HIPAA BAA) are needed.
Use a self-hosted identity platform (Keycloak, Ory, Zitadel, Authentik, FusionAuth) when:
- Full control over the identity infrastructure is required (data residency, customization, on-premise deployment).
- The organization has the operational capacity to run and patch the identity service.
- Cost at scale makes managed services prohibitive (managed services charge per-user, self-hosted does not).
- Deep customization of authentication flows is required that managed providers cannot accommodate.
Build custom authentication only when:
- The authentication requirements are genuinely unique and cannot be met by any existing platform (extremely rare).
- The organization has dedicated security engineering resources to build, maintain, and audit the system continuously.
- Even then, use well-vetted libraries (bcrypt/argon2 for hashing, established JWT libraries, OIDC-certified libraries) — never implement cryptographic primitives from scratch.
State the recommendation and justification: "Use Auth0 because the team has 4 engineers, no dedicated security expertise, and needs OIDC-based SSO for enterprise customers within 6 weeks. Building custom would take 3-4 months and introduce unacceptable security risk. Auth0's cost at 50k MAU (~$X/month) is acceptable for this stage. If cost becomes prohibitive at 500k+ MAU, migrate to Keycloak."
-
Select the authentication architecture pattern. Based on the application type and actor categories:
Session-based authentication (server-side sessions):
- Recommended for: Traditional server-rendered web applications, applications where immediate session revocation is critical, applications with a single backend serving the frontend.
- How it works: User authenticates → server creates a session record (in Redis, database, or memory) → server sends a session ID in a cookie → subsequent requests include the cookie → server validates the session ID against the session store.
- Advantages: Simple revocation (delete the session record), no token leakage risk in browser JavaScript, well-understood security model.
- Disadvantages: Requires shared session store for horizontally scaled backends, session store is a stateful dependency, not suitable for cross-domain or third-party API access.
Token-based authentication (JWTs):
- Recommended for: SPAs consuming APIs, mobile apps, cross-domain scenarios, microservices architectures, and any system where the client needs a portable, self-contained credential.
- How it works: User authenticates → server issues a signed JWT (and optionally a refresh token) → client stores the tokens → subsequent requests include the JWT in the Authorization header → server validates the JWT signature and claims without a central session store.
- Advantages: Stateless validation (no session store lookup per request), works across domains and services, supports fine-grained claims.
- Disadvantages: Cannot be revoked before expiry without a blacklist/revocation mechanism, token size increases with claims, exposure to token theft if stored insecurely on the client.
OAuth 2.0 / OIDC-based authentication:
- Recommended for: Applications that need delegated authorization, social login, enterprise SSO, third-party API access, or any system that separates the identity provider from the application.
- This is not an alternative to session-based or token-based — it is a layer that governs how tokens are obtained. The application still uses sessions or JWTs after the OAuth/OIDC flow completes.
Certificate-based (mutual TLS):
- Recommended for: Service-to-service authentication in zero-trust environments, IoT device authentication, enterprise environments with PKI infrastructure.
- How it works: Both client and server present X.509 certificates. The server verifies the client's certificate against a trusted CA.
- Advantages: No shared secrets, strong machine identity, resistant to credential theft.
- Disadvantages: Certificate lifecycle management (issuance, rotation, revocation) is operationally complex.
API key authentication:
- Recommended for: Simple programmatic API access where the caller is an application (not a user), and the API does not need to act on behalf of a specific user. Common for: developer APIs, service-to-service calls in low-security contexts, webhook verification.
- Not recommended for: User-facing authentication, scenarios requiring fine-grained user permissions, or high-security contexts.
Select the pattern(s) for each actor category. It is common to use multiple patterns: OIDC for end-user authentication, JWTs for API access, mutual TLS for service-to-service, and API keys for third-party developer access.
Phase 3: Password Management
-
Design password hashing. If the system accepts passwords (most do, even alongside other methods), password storage is a critical security control. A breach of the user table must not expose usable passwords.
Select the hashing algorithm — in order of preference:
- Argon2id (recommended): Winner of the Password Hashing Competition. Memory-hard (resistant to GPU/ASIC attacks), configurable time and memory cost. Use Argon2id variant (combines side-channel resistance of Argon2i with GPU resistance of Argon2d).
- Recommended parameters: memory = 64MB (65536 KB), iterations = 3, parallelism = 4. Adjust so that hashing takes 250ms-1s on your production hardware — this is the tradeoff between security (slower = harder to brute force) and user experience (login latency).
- If memory-constrained (shared hosting, serverless with limited memory): memory = 19MB (19456 KB), iterations = 2. Never go below 15MB.
- bcrypt (acceptable, widely supported): If Argon2 is unavailable in your language/framework. Use cost factor 12-14 (each increment doubles the time). bcrypt has a 72-byte input limit — long passwords are silently truncated. If users may have passwords > 72 characters, pre-hash with SHA-256 before bcrypt (but beware of null bytes — base64 encode the SHA-256 output).
- scrypt (acceptable alternative): Memory-hard like Argon2 but less tunable. Use if Argon2 and bcrypt are unavailable.
Never use:
- MD5, SHA-1, SHA-256 (without key stretching): Too fast — billions of hashes per second on modern GPUs.
- Encryption (AES, etc.) for passwords: Encryption is reversible; hashing is not. If the encryption key is compromised, all passwords are exposed.
- Single-iteration hashing of any kind.
Salt: Argon2 and bcrypt generate and embed a unique salt automatically. If using a lower-level API, generate a cryptographically random salt (16+ bytes) per password and store it with the hash. Never reuse salts. Never use a global/shared salt.
Pepper (optional additional defense): A server-side secret key used as an additional input to the hash. Stored in a secrets manager (Vault, AWS Secrets Manager), not in the database. If the database is breached but the application server is not, the pepper prevents offline brute-forcing. Implement as: HMAC(password, pepper) → Argon2(HMAC_output). Define pepper rotation procedure.
- Argon2id (recommended): Winner of the Password Hashing Competition. Memory-hard (resistant to GPU/ASIC attacks), configurable time and memory cost. Use Argon2id variant (combines side-channel resistance of Argon2i with GPU resistance of Argon2d).
-
Design password policies. Balance security with usability — overly restrictive policies lead to weaker passwords (users choose predictable patterns to meet arbitrary rules):
Recommended policy (aligned with NIST SP 800-63B):
- Minimum length: 8 characters (absolute minimum), 12 characters recommended. Length is the single most important factor in password strength.
- Maximum length: At least 128 characters. Never set a low maximum — it frustrates passphrase users and password managers.
- No composition rules: Do not require "at least one uppercase, one lowercase, one digit, one special character." NIST found these rules do not improve security and degrade usability. Users respond with "Password1!" patterns.
- Breached password detection (critical): Check every new password against a database of known breached passwords. Use the Have I Been Pwned Passwords API (k-anonymity model — only the first 5 characters of the SHA-1 hash are sent, preserving privacy) or a local copy of the breached password list. Reject passwords found in breach databases with a clear message: "This password has appeared in a known data breach. Please choose a different password."
- Common password blocklist: Reject passwords from a list of the most common passwords (top 10,000-100,000). Supplement with context-specific terms (company name, product name, "password", "admin").
- No periodic rotation requirements: NIST recommends against forced password rotation. It leads to predictable patterns (Password1 → Password2). Require password change only when: there is evidence of compromise, the password is found in a breach database, or the user requests it.
- Allow paste in password fields: Users must be able to paste passwords from password managers. Never disable paste on password fields.
-
Design password change and update flows.
- Password change (authenticated user): Require the current password before accepting a new password. This prevents unauthorized changes from a hijacked session. Apply all password validation rules (minimum length, breach check) to the new password. Invalidate all other active sessions after password change (force re-authentication on all other devices).
- Password reset (unauthenticated user): See Phase 8 (Account Recovery) for the full flow design.
- Password hash migration: If upgrading from a weaker algorithm (e.g., SHA-256 to Argon2), re-hash passwords on next successful login:
- User logs in with password.
- Verify against the old hash.
- If valid, hash the password with Argon2 and store the new hash.
- Delete the old hash.
- Track migration progress — monitor the percentage of users still on the old algorithm.
- After sufficient time, force remaining users to reset their passwords.
Phase 4: Token Design (JWT)
-
Design JWT structure and claims. If using JWTs (for API authentication, OAuth access tokens, or OIDC ID tokens):
Header:
{ "alg": "RS256", "typ": "JWT", "kid": "key-2024-01" }alg: The signing algorithm (see step 10).typ: Always"JWT".kid(Key ID): Identifies which key was used to sign the token. Critical for key rotation — consumers look up the public key bykid.
Registered claims (include these in every JWT):
iss(Issuer): URL of the authentication service that issued the token. Example:"https://auth.example.com". Consumers must validate this matches the expected issuer.sub(Subject): Unique, stable identifier for the authenticated entity. Use an opaque ID (UUID), not email or username (which can change). Example:"usr_a1b2c3d4".aud(Audience): Intended recipient(s) of the token. Example:"https://api.example.com"or["api.example.com", "admin.example.com"]. Consumers must validate that their identifier is in the audience — a token intended forapi.example.commust not be accepted bypayments.example.com.exp(Expiration): Unix timestamp after which the token is invalid. Required. Never issue tokens without expiration.iat(Issued At): Unix timestamp when the token was issued.nbf(Not Before): Unix timestamp before which the token is not valid. Use when tokens are pre-generated.jti(JWT ID): Unique identifier for the token. Use for token revocation tracking and replay prevention.
Custom claims — include only what consumers need for authorization decisions:
- Keep claims minimal. Every claim increases token size, and tokens travel with every request.
- Common custom claims:
"roles": ["admin", "editor"],"org_id": "org_xyz","permissions": ["orders:read", "orders:write"],"email": "user@example.com"(only if consumers need it). - Never include: Passwords, full user profiles, sensitive PII (SSN, credit card), large data structures, or anything that changes frequently (forcing token reissuance).
- Distinguish token types: If you issue both access tokens and refresh tokens as JWTs, include a claim to distinguish them:
"token_type": "access"vs."token_type": "refresh". This prevents a refresh token from being used as an access token.
-
Select the JWT signing algorithm. This is a security-critical choice:
Asymmetric algorithms (recommended for most systems):
- RS256 (RSA + SHA-256): Widely supported, well-understood. The authentication service signs with a private key; consumers verify with the public key (published via JWKS endpoint). Recommended as the default when broad library compatibility is needed.
- ES256 (ECDSA + P-256 + SHA-256): Smaller keys and signatures than RSA (256-bit vs. 2048-bit), faster verification. Recommended when token size matters (mobile, IoT) or for new systems without legacy compatibility concerns.
- EdDSA (Ed25519): Fastest, smallest signatures, strongest security properties. Recommended if all consumers support it (library support is growing but not yet universal).
Asymmetric advantages: The private key never leaves the authentication service. Any service can verify tokens using the public key without needing a shared secret. Supports key rotation via JWKS.
Symmetric algorithms (use only for specific cases):
- HS256 (HMAC + SHA-256): Both the issuer and consumer share the same secret key. Simpler for single-service architectures where the same service issues and validates tokens.
- Risk: Every service that validates tokens must have the signing key — any compromised service can forge tokens. Do not use in microservices architectures.
Never use:
"alg": "none". Ensure the JWT library rejects tokens withalg: none. This is a well-known attack vector. Configure the library to accept only the specific algorithm(s) you use — never let the token's header dictate which algorithm to use for verification. -
Design token lifetimes. Token lifetime is a direct tradeoff between security (shorter = less exposure window if token is stolen) and usability (shorter = more frequent re-authentication or refresh):
Access token lifetime:
- API access tokens: 15 minutes (default recommendation). Short enough to limit damage from token theft; long enough to avoid excessive refresh traffic.
- Server-to-server tokens: 30-60 minutes. Service credentials are less likely to be stolen via browser/mobile vectors.
- Highly sensitive operations (financial, health): 5 minutes. Pair with step-up authentication for critical actions.
- Never exceed 1 hour for access tokens. If someone argues for longer, they need a refresh token flow, not a longer access token.
Refresh token lifetime:
- Web applications: 7-30 days. Balances session persistence with security.
- Mobile applications: 30-90 days. Mobile users expect persistent sessions; pair with device binding and biometric unlock.
- Highly sensitive applications: 1-7 days. Users re-authenticate more frequently.
- Refresh tokens must be rotatable (see step 12).
ID token lifetime (OIDC):
- Short: 5-15 minutes. ID tokens prove authentication at a point in time. They should not be used as long-lived session tokens. The application should establish its own session after validating the ID token.
-
Design refresh token mechanics. Refresh tokens are the mechanism for obtaining new access tokens without re-authentication:
Refresh token rotation (mandatory):
- Every time a refresh token is used to obtain a new access token, issue a new refresh token and invalidate the old one.
- If an invalidated refresh token is presented (indicating potential theft of either the old or new token), immediately revoke the entire refresh token family (all tokens descended from the same login) and force re-authentication. This is called automatic reuse detection.
- Implementation: Store refresh tokens (or their hashes) in a database with a
family_idlinking all tokens from the same login session. When reuse is detected, invalidate all tokens with thatfamily_id.
Refresh token storage:
- Store refresh tokens as hashed values (SHA-256) in the database. Never store plaintext refresh tokens — if the database is breached, stolen refresh tokens should not be usable.
- Store metadata: user ID, device info, IP address, issued_at, expires_at, family_id, last_used_at, revoked status.
Refresh token binding:
- Bind refresh tokens to specific clients (client_id) and optionally to specific devices (device fingerprint, IP range). Reject refresh token use from a different client or drastically different network context.
-
Design token revocation. JWTs are stateless — they cannot be "revoked" in the traditional sense without introducing state. Design the revocation mechanism based on security requirements:
Option 1: Short access token lifetime + refresh token revocation (recommended default):
- Access tokens are short-lived (15 minutes) and accepted without a server-side check.
- Revocation targets refresh tokens: delete the refresh token from the database, so the user cannot obtain new access tokens.
- Maximum exposure window: the remaining lifetime of the last-issued access token (up to 15 minutes).
- Acceptable for most applications. If 15 minutes of exposure is too long, use Option 2.
Option 2: Token blacklist/denylist:
- Maintain a blacklist of revoked JWT
jtivalues in a fast store (Redis). - Every token validation checks the blacklist. If the
jtiis blacklisted, reject the token. - Set the blacklist entry TTL to the token's remaining lifetime — no need to keep entries for expired tokens.
- Cost: Adds a network round-trip (Redis lookup) to every authenticated request. Partially negates the "stateless" advantage of JWTs.
- Use when immediate revocation is required (logout must be instant, compromised sessions must be killed immediately).
Option 3: Token versioning:
- Store a
token_versioncounter per user in the database/cache. Includetoken_versionin the JWT claims. - On revocation, increment the user's
token_version. All tokens with the old version are instantly invalid. - Validation: Compare the JWT's
token_versionclaim to the user's current version. Requires one fast lookup per request. - Advantage over blacklist: Revokes all of a user's tokens at once (useful for "log out everywhere").
State the chosen approach and why: "Using short access tokens (15 min) with refresh token revocation. The 15-minute exposure window on revocation is acceptable for this application because [reason]. If regulatory requirements demand instant revocation, we will add a Redis-based token blacklist."
-
Design the JWKS (JSON Web Key Set) endpoint. For asymmetric JWT signing, publish the public keys at a standard endpoint:
- URL:
https://auth.example.com/.well-known/jwks.json - Response:
{ "keys": [ { "kty": "RSA", "kid": "key-2024-01", "use": "sig", "alg": "RS256", "n": "...", "e": "AQAB" } ] } - Key rotation procedure:
- Generate a new key pair with a new
kid. - Add the new public key to the JWKS endpoint (both old and new keys are published).
- Start signing new tokens with the new key.
- After all tokens signed with the old key have expired (wait for max token lifetime), remove the old public key from JWKS.
- Rotate keys at least annually, or immediately if a key compromise is suspected.
- Generate a new key pair with a new
- Consumer caching: Consumers should cache the JWKS with a TTL (e.g., 1 hour) and refresh when they encounter a token with an unknown
kid. This handles key rotation transparently.
- URL:
Phase 5: Session Management
-
Design server-side session management. If using session-based authentication (or hybrid session + token):
Session ID generation:
- Generate session IDs using a cryptographically secure random number generator (CSPRNG):
crypto.randomBytes(32)or equivalent. Minimum 128 bits of entropy. - Session IDs must be unpredictable — never derive them from user data, timestamps, or sequential counters.
Session storage:
- Redis (recommended): Fast, supports TTL for automatic expiry, supports key patterns for bulk operations (e.g., "invalidate all sessions for user X"). Set appropriate maxmemory and eviction policy (
noevictionorvolatile-ttl— neverallkeys-lruwhich could evict active sessions). - Database table: Acceptable for low-traffic applications. Include: session_id (hashed), user_id, created_at, expires_at, last_accessed_at, ip_address, user_agent, revoked flag. Add an index on user_id for "revoke all sessions" operations. Add a periodic cleanup job for expired sessions.
- In-memory (e.g., application process memory): Never in production. Sessions are lost on restart and cannot be shared across instances.
Session data: Store minimal data in the session: user ID, role/permission snapshot, session creation time, last activity time. Do not store large objects — every request loads the session.
- Generate session IDs using a cryptographically secure random number generator (CSPRNG):
-
Design session cookie configuration. The session cookie configuration is critical for security:
Set-Cookie: session_id=<value>; HttpOnly; Secure; SameSite=Lax; Path=/; Domain=.example.com; Max-Age=86400HttpOnly(mandatory): Prevents JavaScript from accessing the cookie. Mitigates XSS-based session theft. There is never a legitimate reason to make a session cookie accessible to JavaScript.Secure(mandatory): Cookie is only sent over HTTPS. Prevents interception over HTTP.SameSite:Lax(recommended default): Cookie is sent on top-level navigations (clicking a link) but not on cross-site POST requests or cross-origin subresource requests. Provides CSRF protection while maintaining usability (links from external sites still work).Strict: Cookie is never sent on cross-site requests. Maximum CSRF protection but breaks scenarios where users click links from email/external sites and expect to be logged in.None(withSecure): Cookie is sent on all cross-site requests. Required only for legitimate cross-site scenarios (embedded iframes, cross-site API calls). Requires additional CSRF protection.
Path: Set to/unless there is a specific reason to scope the cookie to a subpath.Domain: Set to the parent domain (.example.com) to share across subdomains if needed. Omit to restrict to the exact issuing domain.Max-AgeorExpires: Set to the desired session duration. Omit for a session cookie that expires when the browser closes (but note: modern browsers restore session cookies on restart).- Cookie name: Avoid default names like
JSESSIONIDorPHPSESSIDthat reveal the technology stack. Use a generic name like__Host-sid(the__Host-prefix enforces Secure, no Domain, and Path=/).
Use the
__Host-prefix (recommended):__Host-session_id. This browser-enforced prefix guarantees the cookie isSecure, has noDomainattribute (locked to the exact host), andPath=/. Prevents cookie injection attacks from subdomains. -
Design session lifecycle. Define how sessions behave over time:
Session timeout:
- Absolute timeout: Maximum session duration regardless of activity. Example: 24 hours for SaaS, 8 hours for enterprise, 30 minutes for banking. After this period, the session is invalid and the user must re-authenticate.
- Idle timeout: Session expires after a period of inactivity. Example: 30 minutes for web apps, 15 minutes for sensitive applications. Reset on each authenticated request. Idle timeout must be shorter than absolute timeout.
- Implementation: Store
created_at(for absolute timeout) andlast_accessed_at(for idle timeout) in the session record. On each request, check both:now - created_at < absolute_limit AND now - last_accessed_at < idle_limit.
Session renewal: When a session approaches its absolute timeout, optionally prompt the user to extend ("Your session will expire in 5 minutes. Click here to continue."). Issue a new session (new session ID, reset created_at) to prevent session fixation.
Concurrent session control:
- Allow multiple sessions (default for most apps): Users can be logged in from multiple devices. Track sessions and allow users to view and revoke other sessions ("Logged in devices" page).
- Limit concurrent sessions: Enforce a maximum (e.g., 3 active sessions). When the limit is exceeded, either reject the new login or invalidate the oldest session.
- Single session only: Each new login invalidates all previous sessions. Appropriate for high-security applications.
Session invalidation events — invalidate all sessions for a user when:
- User changes password.
- User enables or disables MFA.
- Admin disables the user account.
- Suspicious activity is detected on the account.
- User explicitly clicks "Log out of all devices."
-
Design client-side token storage (for SPAs and mobile apps). How tokens are stored on the client directly affects the attack surface:
For web SPAs:
- HttpOnly cookie (recommended): Store the access token (or session ID) in an HttpOnly Secure SameSite cookie. JavaScript cannot access it, eliminating XSS-based token theft. The API must be on the same domain (or a subdomain) for cookies to be sent automatically.
- BFF (Backend for Frontend) pattern (most secure for SPAs): The SPA never handles tokens directly. A server-side BFF component handles OAuth flows, stores tokens in a server-side session, and proxies API calls with the token. The SPA uses a session cookie to authenticate with the BFF. This eliminates all client-side token storage concerns.
- In-memory variable (acceptable with caveats): Store the access token in a JavaScript variable (not localStorage, not sessionStorage). Lost on page refresh — requires silent refresh via iframe or refresh token in HttpOnly cookie. More complex but avoids persistent storage.
- localStorage (not recommended): Accessible to any JavaScript on the page, including XSS payloads. If the application has any XSS vulnerability, tokens are compromised. Use only if cookie-based approaches are impossible and XSS risk is aggressively mitigated via CSP.
- sessionStorage (marginally better than localStorage): Cleared when the tab closes, but still accessible to JavaScript/XSS. Same risks as localStorage.
For mobile apps:
- iOS Keychain (recommended): Hardware-backed secure storage. Tokens are encrypted and isolated per app.
- Android Keystore + EncryptedSharedPreferences (recommended): Hardware-backed encryption for token storage.
- Never store tokens in: SharedPreferences (Android, unencrypted), UserDefaults (iOS, unencrypted), or plain files.
For server-side applications and CLIs:
- Store tokens in the operating system's credential manager (macOS Keychain, Windows Credential Manager, Linux Secret Service) or in encrypted configuration files with appropriate file permissions.
Phase 6: OAuth 2.0 and OpenID Connect
-
Design OAuth 2.0 flows. Select the correct OAuth 2.0 grant type for each consumer type. Using the wrong grant type is a security vulnerability.
Authorization Code with PKCE (recommended for all user-facing applications):
- Use for: Web applications (SPAs, server-rendered), mobile apps, desktop apps, CLI tools.
- Flow:
- Client generates a
code_verifier(random string, 43-128 chars) andcode_challenge(SHA-256 hash of verifier, base64url encoded). - Client redirects user to authorization server:
GET /authorize?response_type=code&client_id=...&redirect_uri=...&scope=openid profile email&state=...&code_challenge=...&code_challenge_method=S256. - User authenticates and consents.
- Authorization server redirects to
redirect_uriwithcodeandstate. - Client verifies
statematches the original value (CSRF protection). - Client exchanges the code for tokens:
POST /tokenwithgrant_type=authorization_code&code=...&code_verifier=...&redirect_uri=.... - Authorization server verifies the code, validates the code_verifier against the stored code_challenge, and returns access token, refresh token, and ID token (if OIDC).
- Client generates a
- PKCE is mandatory for all public clients (SPAs, mobile apps) and recommended even for confidential clients (server-side apps). PKCE prevents authorization code interception attacks.
- State parameter is mandatory: Generate a cryptographically random
statevalue, store it in the session, and verify it when the callback is received. This prevents CSRF attacks on the OAuth flow. - Never use the Implicit Grant (
response_type=token): It returns tokens in the URL fragment, which leaks through browser history, referrer headers, and logs. It is deprecated in OAuth 2.1.
Client Credentials Grant (machine-to-machine):
- Use for: Service-to-service authentication where no user context is needed. Example: a billing service calling a payment gateway API.
- Flow:
POST /tokenwithgrant_type=client_credentials&client_id=...&client_secret=...&scope=.... Returns an access token (no refresh token — the client can request a new token at any time using its credentials). - Security: The
client_secretmust be stored securely (secrets manager, not source code). Rotate client secrets periodically. Use mutual TLS client authentication (tls_client_auth) instead of client_secret for higher security.
Device Authorization Grant (input-constrained devices):
- Use for: Smart TVs, IoT devices, CLI tools on headless servers — devices that cannot render a browser.
- Flow: Device requests a user code and verification URL → displays them to the user → user visits the URL on another device (phone/laptop), enters the code, and authenticates → device polls the token endpoint until the user completes authentication.
Resource Owner Password Credentials (ROPC) — do not use:
- Deprecated in OAuth 2.1. The client directly handles the user's password, which violates the core OAuth principle of delegated authentication. Use Authorization Code with PKCE instead.
-
Design OpenID Connect (OIDC) integration. OIDC adds an identity layer on top of OAuth 2.0:
ID Token: A JWT issued by the OIDC provider that contains claims about the authenticated user:
- Standard claims:
sub(unique user ID),name,email,email_verified,picture,iss,aud,exp,iat,nonce. - The ID token proves who the user is. The access token proves what the user can do.
- Validate the ID token fully:
- Verify the signature using the provider's public keys (from JWKS endpoint).
- Verify
issmatches the expected issuer. - Verify
audcontains your client_id. - Verify
exphas not passed. - Verify
noncematches the value sent in the authorization request (prevents replay attacks). - If
azp(authorized party) is present, verify it matches your client_id.
OIDC Discovery: Use the provider's discovery endpoint (
/.well-known/openid-configuration) to dynamically discover authorization, token, JWKS, and userinfo endpoints. Do not hardcode these URLs — they may change.UserInfo endpoint:
GET /userinfowith the access token. Returns additional claims about the user. Use when the ID token does not contain sufficient user information (some providers include minimal claims in the ID token to reduce size).Scopes: Request only the scopes you need:
openid(required for OIDC): Returns an ID token.profile: Returns name, family_name, given_name, picture, etc.email: Returns email and email_verified.offline_access: Returns a refresh token.
- Standard claims:
-
Design social login and federated identity. When integrating external identity providers (Google, Apple, GitHub, Microsoft):
Registration and account linking:
- On first login via social provider, create a local user account linked to the provider's
subclaim (not email — email is not a unique, stable identifier across all providers). - Store: provider name, provider's user ID (
sub), linked local user ID, and metadata (email, name at time of linking). - Account linking by email (handle carefully): If a user registers with email/password and later logs in with Google (same email), should the accounts be auto-linked? This is convenient but risky — an attacker who controls a Google account with the victim's email could gain access. Options:
- Require the user to authenticate with the existing method and explicitly link the accounts (safest).
- Auto-link only if the email is verified by the provider (
email_verified: true) and the existing account's email is also verified (acceptable for most consumer apps). - Never auto-link unverified emails.
Multiple providers per account: Allow users to link multiple providers to one account (Google + GitHub). Store each link in a
user_identitiestable:(user_id, provider, provider_user_id, created_at).Apple Sign-In specifics:
- Apple only provides the user's name and email on the first authentication. Store them immediately — subsequent authentications do not include this data.
- Apple requires implementing Sign in with Apple if the app offers any other social login (App Store policy).
Handle provider outages: If the social login provider is temporarily unavailable, users cannot authenticate. Encourage users to set up a secondary authentication method (password, another provider) so they are not locked out.
- On first login via social provider, create a local user account linked to the provider's
Phase 7: Multi-Factor Authentication (MFA)
-
Design the MFA strategy. MFA is the most effective defense against credential theft — even if the password is compromised, the attacker needs the second factor. Define when and how MFA is used:
MFA enforcement policy:
- Required for all users: Maximum security. Recommended for enterprise SaaS, financial applications, healthcare, and any system handling sensitive data.
- Required for privileged users: Admins, users with elevated permissions. Minimum viable MFA policy.
- Optional but encouraged: Users choose to enable MFA. Show prompts and nudges to encourage adoption. Track MFA adoption rate as a security metric.
- Step-up authentication: Users authenticate normally (password), but MFA is required for specific sensitive actions (changing password, accessing financial data, modifying security settings, exporting data). The user's authentication level is elevated temporarily for the sensitive action.
- Risk-based MFA: Trigger MFA only when risk indicators are present (new device, new location, unusual time, impossible travel). Reduces friction for normal usage while protecting against suspicious access. Requires risk scoring infrastructure.
-
Select MFA methods and prioritize them. Not all second factors are equally secure. List in order of security:
Tier 1: Phishing-resistant (strongly recommended):
- WebAuthn / FIDO2 / Passkeys: Hardware security keys (YubiKey) or platform authenticators (Touch ID, Face ID, Windows Hello). The authentication is bound to the specific origin (domain) — a phishing site on a different domain cannot intercept the credential. This is the gold standard for MFA.
- Implementation: Use the WebAuthn API. Store the credential public key, credential ID, sign count, and authenticator type per user in the database. Support multiple registered authenticators per user (backup keys).
- Passkeys (discoverable credentials): Enable passwordless login — the passkey replaces both the password and the second factor. Support cross-device passkeys (synced via iCloud Keychain, Google Password Manager, or 1Password).
Tier 2: Strong (recommended):
- TOTP (Time-Based One-Time Password): Authenticator apps (Google Authenticator, Authy, 1Password). User scans a QR code containing a shared secret, and the app generates 6-digit codes that change every 30 seconds.
- Implementation: Generate a random 160-bit secret per user. Encode as base32 in the
otpauth://URI. Accept the current time step ±1 (allow one period of clock skew). Store the secret encrypted in the database. Rate-limit TOTP verification attempts. - Show recovery codes during setup (see step 24).
- Implementation: Generate a random 160-bit secret per user. Encode as base32 in the
- Push notification: Send a push notification to a registered mobile app. User approves or denies. Includes context (location, device, action). More resistant to social engineering than SMS. Implementation: register device tokens, send via APNS/FCM, implement timeout and fallback.
Tier 3: Acceptable (use only if Tier 1/2 are not feasible):
- SMS OTP: Send a one-time code via SMS. Vulnerable to SIM swapping, SS7 attacks, and real-time phishing (attacker relays the code). NIST SP 800-63B classifies SMS as a "restricted" authenticator. Use only as a fallback or for user populations that cannot use apps/hardware keys.
- If using SMS: Rate-limit sends (max 5 per hour per user), use short expiry (5 minutes), include context in the message ("Your code is 123456. If you did not request this, ignore this message."), and monitor for SIM swap indicators.
- Email OTP: Send a one-time code to the user's email. Less secure than TOTP (email accounts can be compromised, delivery delays), but accessible to all users. Similar implementation considerations as SMS.
Never use as sole second factor: Security questions ("What is your mother's maiden name?"). Answers are guessable, findable on social media, and often reused across sites.
Allow users to register multiple MFA methods for redundancy. If a user loses their phone, they should be able to use a backup hardware key or recovery codes.
- WebAuthn / FIDO2 / Passkeys: Hardware security keys (YubiKey) or platform authenticators (Touch ID, Face ID, Windows Hello). The authentication is bound to the specific origin (domain) — a phishing site on a different domain cannot intercept the credential. This is the gold standard for MFA.
-
Design MFA recovery. Users will lose their MFA device. The recovery process must balance security (preventing unauthorized access) with usability (not permanently locking out legitimate users):
Recovery codes (mandatory when MFA is enabled):
- Generate 8-10 single-use recovery codes during MFA setup. Each code: 8-10 alphanumeric characters.
- Display them once and instruct the user to store them securely (password manager, printed and stored physically).
- Store hashed recovery codes in the database (hash each code individually with bcrypt or similar). When a code is used, mark it as consumed.
- Allow users to regenerate recovery codes at any time (after authentication), which invalidates all previous codes.
Backup MFA method: Encourage users to register a second MFA method (e.g., hardware key as backup for TOTP). If one method is lost, the other still works.
Admin-assisted recovery (for enterprise/business applications):
- An administrator can temporarily disable MFA for a user's account after verifying the user's identity through an out-of-band process (video call, in-person verification, identity document verification).
- Log this action with the administrator's identity, the verification method used, and the timestamp.
- Force the user to re-enroll MFA immediately after the reset.
- Set a strict time limit: if MFA is not re-enrolled within 24 hours, lock the account.
Self-service recovery via identity verification (for consumer applications):
- If the user has verified recovery email/phone, send a verification code to initiate MFA reset.
- Require a waiting period (24-48 hours) before the MFA reset takes effect. Notify the user on all channels during this period. If the user cancels (indicating the request was fraudulent), abort the reset.
- This delays attackers and gives the legitimate user time to react.
Phase 8: Account Recovery and Password Reset
-
Design the password reset flow. Password reset is a high-risk flow — it is frequently targeted by attackers because it is a pathway to account takeover without the current password:
Secure password reset flow:
- User requests reset by entering their email address on the reset page.
- Do not confirm whether the email exists. Display a generic message: "If an account with that email exists, we have sent a password reset link." This prevents email enumeration (attackers discovering which emails are registered).
- Generate a cryptographically random, single-use reset token (minimum 128 bits of entropy). Hash it (SHA-256) and store the hash in the database with: user_id, token_hash, created_at, expires_at, used flag.
- Send the unhashed token in a reset link:
https://example.com/reset-password?token=<token>. Use HTTPS only. - Token expiry: 15-60 minutes. Shorter is more secure. After expiry, the token is invalid.
- When the user clicks the link:
- Look up the token hash in the database.
- Verify it has not expired and has not been used.
- Present the password reset form.
- User enters new password. Validate against all password policies (length, breach check).
- Hash the new password, update the user record, mark the reset token as used, and invalidate all existing sessions and refresh tokens for this user.
- Send a notification email: "Your password was recently changed. If you did not do this, contact support immediately." Include a link or instructions for recourse.
Security considerations:
- Rate-limit reset requests per email (e.g., max 3 per hour) and per IP (e.g., max 10 per hour).
- Never send the password itself in email — always use a token/link.
- Never include the old password or the new password in the notification email.
- Reset tokens must be single-use — invalidate after first use regardless of outcome.
- If the user has MFA enabled, consider requiring MFA verification after clicking the reset link and before setting the new password (this depends on threat model — if the attacker has access to the user's email, MFA provides additional protection).
-
Design account lockout and unlock. Protect against brute-force password guessing while not creating denial-of-service:
Lockout policy:
- Lock the account after N consecutive failed login attempts (recommended: 5-10 attempts).
- Time-based lockout (recommended over permanent lockout): Lock for an increasing duration: 1 minute after 5 failures, 5 minutes after 10 failures, 30 minutes after 15 failures. This slows attackers without permanently locking out legitimate users.
- Never use permanent lockout without manual unlock for consumer applications — attackers can weaponize it to lock out legitimate users (denial of service).
- Reset the failure counter after a successful login.
- Lockout scope: Lock by account, not by IP address alone. IP-based lockout is easily bypassed with botnets. Account-based lockout targets the actual attack vector.
Unlock mechanisms:
- Automatic unlock after the lockout period expires.
- Password reset (unlocks the account after successful reset).
- Admin-initiated unlock (for enterprise applications).
- CAPTCHA challenge after N failures instead of (or before) lockout — allows legitimate users to proceed while blocking automated attacks.
Credential stuffing defense (attackers trying stolen credentials from other breaches):
- Breached password detection (step 7) prevents users from using compromised passwords.
- Rate limiting per IP address (in addition to per-account lockout).
- CAPTCHA on login after initial failures or for suspicious patterns.
- Device fingerprinting — flag login attempts from unknown devices and require additional verification.
- Monitor for distributed attacks (many accounts, few attempts each, from many IPs) — this evades per-account lockout. Use anomaly detection on login patterns.
Phase 9: Service-to-Service and Machine Authentication
-
Design service-to-service authentication. Internal services authenticating to each other require different patterns than user authentication:
Mutual TLS (mTLS) (recommended for zero-trust service mesh):
- Each service has a unique X.509 certificate issued by an internal CA (or a service mesh CA like Istio's Citadel, SPIFFE/SPIRE).
- On every connection, both client and server present and verify certificates.
- Identity is derived from the certificate's Subject Alternative Name (SAN) — typically a SPIFFE ID:
spiffe://example.com/service/order-service. - Certificate rotation must be automated (short-lived certificates, auto-renewed by the service mesh or a sidecar process). Manual certificate management does not scale.
- Advantages: No shared secrets, strong cryptographic identity, works at the transport layer (no application code changes if using a service mesh).
- When to use: Microservices in a service mesh (Istio, Linkerd), zero-trust architectures, high-security environments.
OAuth 2.0 Client Credentials (recommended for API-based service-to-service):
- Each service has a
client_idandclient_secretregistered with the authorization server. - Service requests an access token with specific scopes and uses it to call other services.
- Advantages: Standard protocol, scoped access, auditable, works with API gateways.
- Credential management: Store client_secret in a secrets manager. Rotate periodically. Use asymmetric client authentication (private_key_jwt) for higher security — the service signs a JWT with its private key instead of transmitting a shared secret.
JWT-based service identity (common in cloud-native environments):
- Cloud providers offer workload identity: a service running on the platform receives a signed JWT automatically (AWS IAM roles + STS tokens, GCP service account tokens, Azure managed identity tokens).
- The service uses this token to authenticate to other services or cloud resources.
- Advantages: No secret management — credentials are injected by the platform and automatically rotated.
- Use when: Services run on a cloud platform that supports workload identity. This should be the default for cloud-native services.
API keys for internal services (acceptable for low-complexity environments):
- Use when the overhead of OAuth or mTLS is not justified (small number of services, trusted internal network, low-security context).
- Store the API key in a secrets manager, not in source code or environment variables.
- Rotate keys periodically. Implement key revocation without downtime (accept both old and new keys during rotation window).
- API keys identify the calling service but do not carry user context — they are not suitable for user-delegated access.
-
Design service account management.
- Every service should have its own dedicated identity (service account). Never share credentials across services — if one service is compromised, only its credentials need to be revoked.
- Principle of least privilege: Each service account has permissions for only the resources it needs to access. A billing service can call the payment API but not the user management API.
- Credential lifecycle:
- Creation: Automated via infrastructure-as-code or service registration API.
- Storage: Secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) with access policies restricting which services can read which secrets.
- Rotation: Automated, on a schedule (90 days) or on demand after suspected compromise. Implement graceful rotation: issue new credential → update consuming services → revoke old credential. Support dual credentials during rotation.
- Revocation: Immediate revocation capability. All dependent services must handle revocation gracefully (detect 401, alert, fail open or fail closed depending on criticality).
- Audit: Log every service authentication event: which service authenticated, when, from where, using which credential. Monitor for anomalies: authentication from unexpected IP ranges, unusual call patterns, or credential usage outside expected hours.
Phase 10: Passwordless Authentication
-
Design passwordless authentication flows. Passwordless authentication eliminates the primary attack vector (password theft) and improves user experience:
Passkeys (WebAuthn discoverable credentials) (recommended as the primary passwordless method):
- User registers a passkey using their device's platform authenticator (Touch ID, Face ID, Windows Hello) or a hardware security key.
- On login, the browser prompts the user to authenticate with their passkey — no password entry.
- Cross-device: Passkeys can be synced across devices via iCloud Keychain, Google Password Manager, or third-party password managers.
- Implementation:
- Registration: Generate a challenge → call
navigator.credentials.create()→ receive and store the credential public key, credential ID, and attestation. - Authentication: Generate a challenge → call
navigator.credentials.get()→ verify the assertion signature with the stored public key. - Store multiple passkeys per user (phone, laptop, hardware key).
- Registration: Generate a challenge → call
- Fallback: Provide alternative authentication methods (magic link, TOTP) for scenarios where passkeys are unavailable (old browsers, unsupported devices).
Magic link (email-based passwordless):
- User enters email → system sends a link with a single-use, time-limited token → user clicks the link → authenticated.
- Implementation: Same security as password reset tokens (step 25): cryptographically random token, hashed storage, short expiry (10-15 minutes), single-use, rate-limited.
- Security consideration: Security is delegated to the email account. If the user's email is compromised, so is their authentication. This is acceptable for consumer applications with moderate sensitivity.
- UX consideration: The user must switch to their email client, which interrupts the flow. On mobile, deep links can bring the user back to the app automatically.
- Send the link with contextual information: "Sign in to [App Name] from [Device] at [Location]. If this wasn't you, ignore this email."
OTP-based passwordless (email or SMS):
- Instead of a link, send a 6-digit code. User enters the code in the app.
- Advantages over magic link: Works in contexts where clicking a link is inconvenient (e.g., signing in on a TV while checking email on phone).
- Implementation: Generate a cryptographically random 6-digit code. Hash and store it with user_id, expiry (5 minutes), and attempt counter. Rate-limit verification attempts (max 3-5 attempts per code). Invalidate after use or expiry. Rate-limit code generation (max 3-5 per hour per user).
Phase 11: Single Sign-On (SSO) and Enterprise Authentication
-
Design SSO architecture. For applications serving enterprise customers who require SSO:
SAML 2.0 (legacy but still widely required by enterprises):
- Your application is the Service Provider (SP). The customer's identity system is the Identity Provider (IdP) (Okta, Azure AD, OneLogin, PingFederate, ADFS).
- SP-initiated flow (most common):
- User visits your application.
- Your application generates a SAML AuthnRequest and redirects the user to the customer's IdP.
- User authenticates at the IdP.
- IdP generates a SAML Response (XML document containing an Assertion with user attributes), signs it, and POSTs it back to your application's Assertion Consumer Service (ACS) URL.
- Your application validates the SAML Response signature, extracts user attributes, and creates a local session.
- Critical validation: Verify the XML signature (and canonicalization) using the IdP's certificate. Verify the Assertion's audience matches your SP entity ID. Verify the Assertion is not expired. Verify the InResponseTo matches the original request ID. Verify the Destination URL. Use a well-tested SAML library — SAML XML signature verification is notoriously difficult to implement correctly, and implementation flaws have led to critical authentication bypasses.
- SAML metadata exchange: Your SP publishes metadata (entity ID, ACS URL, signing certificate) and the IdP publishes metadata (entity ID, SSO URL, signing certificate). Exchange metadata during SSO configuration.
OIDC-based SSO (preferred for new implementations):
- Same as OAuth 2.0 Authorization Code with PKCE (step 19), but the customer's IdP is the OIDC provider.
- Advantages over SAML: JSON-based (simpler), better mobile support, standard token format (JWT), well-defined discovery and key rotation.
- For enterprise customers: Support OIDC discovery — customers provide their issuer URL, and your application discovers endpoints and keys automatically.
SSO provider selection per customer: Enterprise customers use different IdPs. Design a per-tenant SSO configuration:
- Store: tenant_id, SSO protocol (SAML or OIDC), provider configuration (IdP metadata URL for SAML, issuer URL for OIDC), domain mapping (users from
@customer.comare routed to this IdP). - Domain-based routing: On the login page, after the user enters their email, check the email domain against the SSO configuration. If a match is found, redirect to the customer's IdP instead of showing the password form.
- Enforce SSO: Allow enterprise customers to require SSO for all users in their organization. When SSO is enforced, password-based login is disabled for that tenant's users. Provide a "break glass" mechanism for tenant admins in case the IdP is unavailable.
-
Design Just-In-Time (JIT) provisioning. When a user authenticates via SSO for the first time:
- Automatically create a local user account using attributes from the SAML assertion or OIDC ID token (email, name, department, role).
- Map IdP groups/roles to local application roles based on a configurable mapping:
IdP group "Engineering" → application role "editor". - On subsequent SSO logins, update the user's attributes and roles from the IdP (the IdP is the source of truth for user directory data).
- Deprovisioning: When an employee leaves the customer's organization and is removed from the IdP, they can no longer authenticate via SSO. However, their local session may still be active. Design a session timeout or implement SCIM (System for Cross-domain Identity Management) for push-based deprovisioning — the IdP notifies your application to disable the user account.
Phase 12: Rate Limiting and Abuse Protection on Authentication Endpoints
-
Design authentication-specific rate limiting. Authentication endpoints are the most targeted endpoints in any application. Generic API rate limiting is insufficient — design dedicated controls:
Login endpoint rate limiting:
- Per-account rate limit: Max 5-10 failed attempts per account per 15-minute window. After exceeding, apply increasing lockout (step 26).
- Per-IP rate limit: Max 20-50 failed attempts per IP per 15-minute window across all accounts. Blocks credential stuffing from a single source.
- Global rate limit: Max failed login attempts per minute across the entire system. Alerts on anomalous spikes (e.g., > 10x normal failure rate) indicate a large-scale attack.
Registration endpoint rate limiting:
- Per-IP: Max 5-10 registrations per IP per hour. Prevents mass fake account creation.
- Require email verification before the account is fully activated (prevents spam registrations from consuming resources).
- CAPTCHA on registration (invisible reCAPTCHA or hCaptcha) to block bots.
Password reset endpoint rate limiting:
- Per-email: Max 3 reset requests per email per hour.
- Per-IP: Max 10 reset requests per IP per hour.
- Do not reveal whether the email exists (step 25).
MFA verification endpoint rate limiting:
- Max 3-5 failed verification attempts per session. After exceeding, require the user to restart the login flow.
- Rate-limit TOTP verification globally (prevents brute-forcing the 6-digit code — there are only 1,000,000 possibilities).
Token endpoint rate limiting:
- Per client_id: Max token requests per minute based on expected usage. Alert on anomalous patterns (e.g., a client requesting tokens 100x faster than normal indicates credential abuse).
-
Design CAPTCHA strategy. Use CAPTCHA judiciously — it degrades user experience:
- Do not show CAPTCHA on the first login attempt. Show it only after N failed attempts (e.g., 3), or when risk indicators are present (unknown device, suspicious IP).
- Invisible CAPTCHA (reCAPTCHA v3, Turnstile): Runs in the background and assigns a risk score. Only challenges users with low scores. Best user experience.
- Interactive CAPTCHA (reCAPTCHA v2, hCaptcha): Shows a challenge ("select all traffic lights"). Use as escalation after invisible CAPTCHA fails or for high-risk actions.
- Server-side validation: Always validate the CAPTCHA response server-side. Never trust client-side validation.
- Accessibility: Ensure CAPTCHA has accessible alternatives (audio challenges) for users with disabilities.
Phase 13: Authentication Audit Logging
-
Design authentication event logging. Authentication audit logs are critical for security monitoring, incident investigation, and compliance. Log every authentication-relevant event:
Events to log (at minimum):
Event Data to Log Login success user_id, timestamp, IP, user_agent, auth_method (password, SSO, social, passkey), MFA used (yes/no, method), session_id Login failure email_or_username (not password), timestamp, IP, user_agent, failure_reason (invalid_password, account_locked, mfa_failed, account_not_found) Logout user_id, timestamp, session_id, logout_type (user_initiated, session_expired, forced) Password change user_id, timestamp, IP Password reset request email, timestamp, IP Password reset completion user_id, timestamp, IP MFA enrollment user_id, timestamp, mfa_method, IP MFA removal user_id, timestamp, mfa_method, IP, reason Token issued user_id (or client_id), timestamp, token_type, scopes, token_id (jti) Token revoked user_id, timestamp, token_id, revocation_reason Account locked user_id, timestamp, reason, lock_duration Account unlocked user_id, timestamp, unlocked_by (self, admin, timeout) Session invalidated user_id, timestamp, session_id, reason API key created user_id, key_id, timestamp, scopes API key revoked user_id, key_id, timestamp, reason SSO configuration changed tenant_id, admin_user_id, timestamp, change_description Admin impersonation started admin_user_id, impersonated_user_id, timestamp, reason Admin impersonation ended admin_user_id, impersonated_user_id, timestamp What to NEVER log:
- Passwords (plaintext or hashed).
- Full session tokens or access tokens.
- TOTP codes or recovery codes.
- Client secrets or API key values.
- Full credit card numbers or SSNs.
- Log only opaque identifiers (user_id, session_id, token_id/jti) that can be correlated with internal systems but do not expose credentials.
-
Design authentication anomaly detection. Use the audit logs to detect suspicious authentication patterns:
Detection rules:
- Impossible travel: User logs in from New York, then 10 minutes later from Tokyo. Alert and optionally require MFA re-verification.
- Credential stuffing: High rate of failed login attempts across many different accounts from a set of IP addresses. Trigger global rate limiting and CAPTCHA.
- Brute force: Many failed attempts on a single account. Trigger account lockout (step 26).
- Session anomaly: Active session with a sudden change in IP address, user agent, or geographic location. Optionally require re-authentication.
- Unusual MFA behavior: Multiple MFA failures followed by success (attacker may be replaying codes). User disables MFA and immediately accesses sensitive data.
- Off-hours access: User accesses the system at unusual times for their historical pattern. Flag for review (especially for admin accounts).
- Account takeover sequence: Password reset → MFA change → email change in rapid succession. This is the classic account takeover pattern. Alert security team immediately.
Response actions (tiered):
- Log and monitor (low confidence).
- Require step-up authentication / MFA re-verification (medium confidence).
- Lock the account and notify the user (high confidence).
- Alert the security team for manual investigation (critical).
-
Define audit log retention and security.
- Retention: Minimum 1 year for most applications. 7 years for financial/healthcare (regulatory). Define the retention policy in compliance with applicable regulations.
- Immutability: Authentication logs must be append-only. No user or administrator should be able to modify or delete authentication logs. Write to a tamper-evident log store (append-only database, immutable object storage, or dedicated SIEM).
- Access control: Restrict access to authentication logs to security personnel. Log all access to the audit logs themselves (meta-auditing).
- Encryption: Encrypt audit logs at rest and in transit.
Phase 14: CSRF, XSS, and Authentication-Adjacent Security
-
Design CSRF protection for authentication flows. CSRF is relevant when authentication uses cookies:
CSRF protection mechanisms:
- SameSite cookies (primary defense): Setting
SameSite=Laxon the session cookie prevents CSRF on POST/PUT/DELETE requests from cross-site contexts. This handles most CSRF scenarios without additional tokens. - CSRF token (Synchronizer Token Pattern): Generate a cryptographically random CSRF token per session. Embed it in forms as a hidden field and validate it server-side on every state-changing request. Use this in addition to SameSite cookies for defense in depth.
- Double submit cookie: Set a CSRF token in both a cookie and a request header/body. Server verifies they match. Useful for stateless applications that cannot store CSRF tokens server-side.
- Custom request headers: For API calls from JavaScript, require a custom header (e.g.,
X-Requested-With: XMLHttpRequest). Simple requests from HTML forms cannot add custom headers, which blocks cross-site form submissions. Not sufficient as the sole defense — combine with SameSite.
Login CSRF: An attacker can force a user to log in to the attacker's account (the user thinks they are in their own account and may enter sensitive information). Protect with:
- CSRF token on the login form.
- After login, display the authenticated identity prominently so the user can verify they are in the correct account.
- SameSite cookies (primary defense): Setting
-
Design authentication-related XSS mitigations. XSS is the most dangerous attack vector for authentication because it can steal tokens and session identifiers:
- Content Security Policy (CSP): Set a strict CSP header that prevents inline JavaScript execution and restricts script sources. Minimum:
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'. Eliminate'unsafe-inline'for scripts if possible (use nonces or hashes). - HttpOnly cookies: As discussed in step 16 — prevents JavaScript access to session cookies.
- Input sanitization and output encoding: All user-provided input must be encoded when rendered in HTML contexts. Use framework-provided auto-escaping (React's JSX, Django's template engine, etc.). Never inject raw user input into HTML, JavaScript, or URLs.
- Token storage: Store tokens in HttpOnly cookies or in-memory variables, not in localStorage (see step 18).
- Subresource Integrity (SRI): Add
integrityattributes to<script>and<link>tags for third-party resources to ensure they haven't been tampered with.
- Content Security Policy (CSP): Set a strict CSP header that prevents inline JavaScript execution and restricts script sources. Minimum:
Phase 15: Authentication for Special Scenarios
-
Design admin impersonation. For support and debugging purposes, administrators may need to view the application as a specific user:
- Implementation: The admin initiates impersonation via an admin interface. The system creates a special session that includes:
acting_as_user_id(the impersonated user),acting_on_behalf_of(the admin's user ID), and a flagis_impersonation = true. - Restrictions during impersonation:
- The impersonating admin cannot: change the user's password, change the user's MFA, delete the user's account, or perform financial transactions. Define a clear list of prohibited actions during impersonation.
- The UI must show a clear, non-dismissible indicator that impersonation is active ("Viewing as user@example.com — [End impersonation]").
- Audit: Log impersonation start and end (step 34). All actions performed during impersonation are logged with both the admin's and the impersonated user's identity.
- Authorization: Only designated admin roles can impersonate. Require MFA re-verification before starting impersonation.
- Time limit: Impersonation sessions expire after a short duration (e.g., 1 hour). Auto-terminate and alert if exceeded.
- Implementation: The admin initiates impersonation via an admin interface. The system creates a special session that includes:
-
Design authentication for webhooks and callbacks. When your system receives callbacks from external systems:
Webhook signature verification:
- The sender (e.g., payment gateway) signs the webhook payload using HMAC-SHA256 with a shared secret.
- Your system receives the webhook, computes the HMAC over the payload using the same secret, and compares it to the signature in the request header.
- Use constant-time comparison to prevent timing attacks.
- Verify the timestamp in the webhook payload to prevent replay attacks: reject webhooks with timestamps older than 5 minutes.
OAuth callback security:
- Validate the
stateparameter on OAuth callbacks (step 19). - Validate the authorization code can only be used once.
- Validate the
redirect_urimatches exactly — no open redirect vulnerabilities. - Register all valid
redirect_urivalues with the authorization server. Never allow wildcard redirect URIs in production.
-
Design API key management. For systems that issue API keys to third-party developers or internal services:
API key generation:
- Generate keys using CSPRNG: minimum 256 bits of entropy, base62 or base64url encoded.
- Use a recognizable prefix:
sk_live_,pk_test_,myapp_— helps identify the key source and environment in logs and code scans. - Display the full key only once at creation time. Store only the hash (SHA-256) in the database. If the user loses the key, they must generate a new one.
API key storage (server-side):
- Store: key_hash, key_prefix (first 8 characters for identification in UIs), user_id/org_id, name/description, scopes/permissions, created_at, last_used_at, expires_at, revoked flag.
- Index on key_hash for fast lookup during authentication.
API key lifecycle:
- Expiration: API keys should have an optional expiration date. For keys without expiration, implement a staleness check — flag keys not used in 90 days and prompt the owner to confirm they are still needed.
- Rotation: Allow users to create multiple keys simultaneously and deactivate old keys. This enables zero-downtime key rotation.
- Revocation: Immediate revocation via the dashboard or API. Revocation takes effect on the next request (no delay like JWT expiry).
- Scoping: Allow keys to be scoped to specific permissions (read-only, specific resources, specific environments). A key should have the minimum permissions needed.
API key security:
- Transmit API keys in headers (
Authorization: Bearer <key>orX-API-Key: <key>), never in URL query parameters (they appear in logs, browser history, and referrer headers). - Rate-limit per API key.
- Monitor for compromised keys: if a key appears in public repositories (GitHub secret scanning), alert the owner and consider automatic revocation.
Phase 16: Authentication Testing
-
Design authentication security testing. Authentication systems must be rigorously tested because failures have severe consequences:
Unit tests:
- Password hashing: Verify correct algorithm, verify hash verification works, verify rejection of incorrect passwords, verify timing consistency (constant-time comparison).
- JWT: Verify token generation with correct claims, verify token validation (correct signature, expiry, audience, issuer), verify rejection of tampered tokens, verify rejection of
alg: none, verify rejection of tokens signed with wrong key, verifykidhandling. - Session: Verify session creation, retrieval, expiry, invalidation, concurrent session enforcement.
- Rate limiting: Verify lockout triggers at correct thresholds, verify lockout duration, verify reset after cooldown.
Integration tests:
- Complete login flow: Register → login → authenticated request → logout. Verify session/token is valid, verify post-logout session/token is invalid.
- Password reset flow: Request → token validation → reset → verify new password works, verify old password does not work, verify all sessions are invalidated.
- OAuth flow: Authorization redirect → callback → token exchange → authenticated request. Verify state validation, verify code cannot be reused, verify PKCE verification.
- MFA flow: Login → MFA challenge → MFA verification → authenticated. Verify incorrect code is rejected, verify lockout after N failures.
- Session timeout: Verify idle timeout, verify absolute timeout, verify session renewal.
Security tests:
- Credential enumeration: Verify login and registration endpoints do not reveal whether an account exists (consistent response times, consistent messages).
- Token manipulation: Tamper with JWT claims, change the algorithm, modify the signature. Verify all are rejected.
- Session fixation: Verify session ID changes after login.
- CSRF: Verify state-changing authentication endpoints reject requests without valid CSRF tokens (or SameSite protection).
- Brute force: Verify rate limiting engages and accounts lock after configured thresholds.
- Open redirect: Verify redirect_uri validation in OAuth callbacks rejects unregistered URIs.
- Injection: Test login, registration, and password reset forms for SQL injection and XSS.
- Cookie security: Verify session cookies have HttpOnly, Secure, SameSite attributes.
- Token storage: Verify tokens are not stored in insecure locations (localStorage in production builds).
Penetration testing: Engage external security testers to attempt authentication bypass at least annually, and after any major authentication system changes.
Phase 17: Authentication Architecture Output and Deliverables
-
Produce authentication architecture deliverables. At the conclusion of every authentication design engagement, produce:
- Authentication architecture summary: A concise document stating the authentication domain, actor types, authentication methods, token strategy, session strategy, and key design decisions.
- Threat model: The identified threats, their severity, and the controls designed to mitigate each threat.
- Authentication flow diagrams: Sequence diagrams for each authentication flow (login, registration, password reset, OAuth/SSO, MFA enrollment, MFA verification, token refresh, logout). Include all HTTP interactions, redirects, and token exchanges.
- Token specification: JWT structure (header, payload, signature), claims inventory, signing algorithm, key management procedure, token lifetimes, and revocation mechanism.
- Session specification: Session storage mechanism, cookie configuration, timeout policies, concurrent session policy, and invalidation events.
- MFA specification: Supported MFA methods, enforcement policy, enrollment flow, verification flow, and recovery procedure.
- SSO integration guide: For each supported SSO protocol (SAML, OIDC), the configuration requirements, metadata exchange procedure, attribute mapping, and JIT provisioning logic.
- Security controls inventory: Rate limiting rules, lockout policies, CAPTCHA strategy, CSRF protection, XSS mitigations, and monitoring/alerting rules — all documented with specific thresholds and behaviors.
- Audit logging specification: Events logged, data fields per event, retention policy, and anomaly detection rules.
- API key management specification (if applicable): Key format, generation, storage, scoping, rotation, and revocation procedures.
- Build-vs-buy decision ADR: The decision to use a managed provider, self-hosted platform, or custom implementation, with context, alternatives considered, and consequences.
- Open questions: Areas requiring further stakeholder input, security review, or compliance confirmation.
Cross-Cutting Rules (Apply Throughout All Phases)
-
Defense in depth. Never rely on a single security control. Layer defenses: strong password hashing AND MFA AND session management AND rate limiting AND monitoring. If one layer fails, the others still protect the system.
-
Fail secure. When an authentication component fails (session store unavailable, JWT verification error, MFA service timeout), the default behavior must be to deny access, not to grant it. Never bypass authentication because a dependency is down. Degrade gracefully (show an error, retry, queue the request) but never silently skip authentication checks.
-
Secrets are secrets. Never log, expose in error messages, include in URLs, or store in source code: passwords, tokens, API keys, client secrets, signing keys, recovery codes, or TOTP secrets. Use secrets managers for all secret storage. Rotate secrets on a schedule and immediately after suspected compromise.
-
Use established libraries and standards. Never implement cryptographic primitives (hashing, signing, encryption), OAuth flows, SAML parsing, or JWT handling from scratch. Use well-vetted, actively maintained, standards-compliant libraries. Verify that the library you choose has not had recent critical vulnerabilities and is actively maintained.
-
State tradeoffs explicitly. Every authentication design decision involves a tradeoff between security, usability, and complexity. State it clearly: "Using 15-minute access tokens with refresh token rotation provides a balance between security (limited exposure window) and usability (users are not re-prompted every 15 minutes). Shorter tokens would improve security but increase refresh traffic and latency. Longer tokens would reduce traffic but increase exposure. 15 minutes is appropriate here because [justification]."
-
Design for the user, not the threat model alone. An authentication system so secure that users cannot use it (or constantly work around it) is a failed system. Users will choose weaker passwords if policies are onerous, bypass MFA if it is too frequent, and share credentials if individual access is too difficult. Design security controls that are proportional to the risk and frictionless whenever possible. The best authentication is one the user barely notices.
-
Make concrete recommendations, not option catalogs. Do not say "you could use Argon2 or bcrypt or scrypt." Say "Use Argon2id with memory=64MB, iterations=3, parallelism=4 because it provides the strongest resistance to GPU-based attacks. If Argon2 is unavailable in your framework, use bcrypt with cost factor 12 as a fallback." When alternatives are close, recommend one and state the conditions that would change the recommendation.
ライセンス: MIT(寛容ライセンスのため全文を引用しています) · 原本リポジトリ
詳細情報
- 作者
- Emmraan
- リポジトリ
- Emmraan/agent-skills
- ライセンス
- MIT
- 最終更新
- 2026/5/6
Source: https://github.com/Emmraan/agent-skills / ライセンス: MIT