Technical guide: architecting AI with Mastra, Next.js, and TypeScript
Learn how to architect AI with Mastra and Next.js and discover how to replace the LangChain/LangGraph.js approach in production environments.

"Written in April/2026, referencing
@mastra/core@1.25.0. Check the official changelog before implementing."
Mastra + Next.js + TypeScript form the most mature and cohesive stack today for building type-safe AI agents in JavaScript, advantageously replacing the LangChain/LangGraph.js approach in production environments. Mastra 1.0 (GA since Jan/2026, current version @mastra/core@1.25.0, ~23.1k stars on GitHub) consolidated a central registry architecture with dependency injection, durable workflows with suspend/resume, memory-as-first-class, native MCP (client and server) and transparent integration with Vercel AI SDK v5/v6. The combination with Next.js App Router delivers native streaming, type-safe Server Actions and deploy on both Vercel serverless (with Fluid Compute, up to 800s) and VPS/Docker via mastra build.
This guide consolidates four pillars — framework architecture, Next.js integration, type-safety with Zod, and design patterns — into an actionable blueprint for senior architects. All snippets are idiomatic and functional against Mastra 1.x, AI SDK v5 and Next.js 14/15.

1. Architecture and Mastra capabilities
1.1 The Mastra object as central registry
Mastra is a registry with DI orchestrating Agents, Workflows, Tools, Memory, Storage, Vector, Observability, MCP Servers and Gateways. The HTTP server is generated on top of Hono (with adapters for Express/Fastify/Koa from v1.0). In production, storage by domains (memory/workflows/scores/traces) via MastraCompositeStore is the default.
```typescript
// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';
import { MastraCompositeStore } from '@mastra/core/storage';
import { WorkflowsPG, ScoresPG, PgVector } from '@mastra/pg';
import { MemoryLibSQL } from '@mastra/libsql';
import { weatherAgent } from './agents/weather-agent';
import { weatherWorkflow } from './workflows/weather-workflow';

// One store per domain: memory on LibSQL, workflows and scores on Postgres.
const storage = new MastraCompositeStore({
  id: 'composite',
  domains: {
    memory: new MemoryLibSQL({ url: 'file:./local.db' }),
    workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL! }),
    scores: new ScoresPG({ connectionString: process.env.DATABASE_URL! }),
  },
});

export const mastra = new Mastra({
  agents: { weatherAgent },
  workflows: { weatherWorkflow },
  storage,
  vectors: { pg: new PgVector({ connectionString: process.env.DATABASE_URL! }) },
  logger: new PinoLogger({ name: 'Mastra', level: 'info' }),
  server: { port: 4111, host: '0.0.0.0', timeout: 30_000 },
  // mcpServers, observability, scorers, processors, gateways, bundler...
});
```
Main packages: @mastra/core (Mastra, Agent, Workflow, Tool, Memory, Storage interfaces, Processors), @mastra/memory, @mastra/libsql, @mastra/pg, @mastra/mcp, @mastra/ai-sdk, @mastra/loggers, @mastra/observability, @mastra/client-js, mastra (CLI). From v1, subpath imports are mandatory (@mastra/core/agent, @mastra/core/workflows, etc.), except Mastra and type Config.
Current status (Apr/2026): @mastra/core@1.25.0 GA, Apache-2.0 license (with ee/ areas under Mastra Enterprise License), maintained by the team behind Gatsby (Sam Bhagwat, Shane Thomas). Positioned against LangGraph.js; uses Vercel AI SDK for model routing (40+ providers, 3000+ models via Mastra Model Router).
1.2 Agent Lifecycle
```typescript
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
import { weatherTool } from '../tools/weather';

export const weatherAgent = new Agent({
  id: 'weather-agent',
  name: 'Weather Agent',
  description: 'Answers questions about the weather.',
  instructions: 'You are a weather assistant. Use weatherTool when needed.',
  model: 'openai/gpt-5.1', // Mastra Model Router: "provider/model"
  tools: { weatherTool },
  memory: new Memory({
    storage: new LibSQLStore({ url: 'file:./agent.db' }),
    options: { lastMessages: 10, workingMemory: { enabled: true } },
  }),
});

// .generate(): full response, returns { text, toolCalls, toolResults, steps, usage }
const res = await weatherAgent.generate('Weather in Tokyo?', {
  memory: { resource: 'user-123', thread: 'conv-42' },
});

// .stream(): token by token via MastraModelOutput
const stream = await weatherAgent.stream('Plan my day');
for await (const chunk of stream.textStream) process.stdout.write(chunk);
```
1.3 Memory: threads, resources, storage and vector
The Memory class combines storage (persistent history), vector (semantic recall) and embedder. Thread isolates conversations; Resource is a stable grouper (user/project) allowing multiple agents to share working memory and embeddings across threads. Default scope changed to 'resource' in Mastra 0.10+.
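As a rough illustration of the thread/resource distinction, the sketch below models it with plain in-memory maps. It is not Mastra's implementation: `ScopedMemory`, `addMessage`, `setWorkingMemory`, and `recall` are hypothetical names invented for this example, and a real store would persist to a database.

```typescript
// Hypothetical in-memory model of thread vs. resource scoping (not the Mastra API).
type Message = { role: 'user' | 'assistant'; text: string };

class ScopedMemory {
  // Conversation history is isolated per (resource, thread) pair.
  private threads = new Map<string, Message[]>();
  // Working memory is shared by every thread that belongs to the same resource.
  private working = new Map<string, string>();

  addMessage(resource: string, thread: string, message: Message): void {
    const key = `${resource}:${thread}`;
    const history = this.threads.get(key) ?? [];
    history.push(message);
    this.threads.set(key, history);
  }

  setWorkingMemory(resource: string, value: string): void {
    this.working.set(resource, value);
  }

  recall(resource: string, thread: string, lastMessages: number) {
    const history = this.threads.get(`${resource}:${thread}`) ?? [];
    return {
      // Only the most recent messages of this thread are returned.
      messages: lastMessages > 0 ? history.slice(-lastMessages) : [],
      workingMemory: this.working.get(resource) ?? null,
    };
  }
}

const memory = new ScopedMemory();
memory.addMessage('user-123', 'conv-1', { role: 'user', text: 'My name is Ana.' });
memory.setWorkingMemory('user-123', '# User\n- First Name: Ana');
memory.addMessage('user-123', 'conv-2', { role: 'user', text: 'Weather in Tokyo?' });

// conv-2 sees only its own history, but shares the resource-level working memory.
const recalled = memory.recall('user-123', 'conv-2', 10);
```

In this model a second thread for the same user starts with an empty history yet still knows the user's first name, which is the behavior the 'resource' scope is meant to provide.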
```typescript
import { Memory } from '@mastra/memory';
import { PgStore, PgVector } from '@mastra/pg';
import { OpenAIEmbedder } from '@mastra/openai';

const memoryPg = new Memory({
  storage: new PgStore({ connectionString: process.env.DATABASE_URL! }),
  vector: new PgVector({ connectionString: process.env.DATABASE_URL! }),
  embedder: new OpenAIEmbedder({ model: 'text-embedding-3-small' }),
  options: {
    lastMessages: 20,
    semanticRecall: {
      topK: 5,
      messageRange: { before: 2, after: 1 },
      scope: 'resource',
      indexConfig: { type: 'hnsw', metric: 'dotproduct', m: 16, efConstruction: 64 },
    },
    workingMemory: {
      enabled: true,
      template: '# User\n- First Name:\n- Last Name:',
      scope: 'resource',
    },
    generateTitle: true,
  },
});
```
Supported vector stores: LibSQLVector, PgVector (HNSW/IVFFlat, bit, sparsevec), Pinecone, Upstash, Qdrant, Chroma, MongoDB, Astra, OpenSearch, S3Vectors, TurboPuffer, Lance, Cloudflare, Couchbase.
1.4 Workflow System
createWorkflow() / createStep() deliver durable execution: automatic snapshots at each suspend(), state serialized to JSON in storage, resume cross-process via runId. Tables mastra_workflow_snapshot, mastra_traces, mastra_messages are created automatically.
Flow control primitives: .then() (sequential), .parallel([]) (fan-out/fan-in), .branch([[cond, step]]) (router), .foreach(step, {concurrency}) (MapReduce), .dountil()/.dowhile() (loops), .map() (transform). Retry configurable at workflow level and step level.
```typescript
import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const approvalStep = createStep({
  id: 'approval',
  inputSchema: z.object({ amount: z.number(), needsApproval: z.boolean() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  suspendSchema: z.object({ reason: z.string(), amount: z.number() }),
  resumeSchema: z.object({ approved: z.boolean(), approver: z.string() }),
  execute: async ({ inputData, resumeData, suspend, bail }) => {
    if (!inputData.needsApproval) return { approved: true, message: 'Auto' };
    if (resumeData?.approved === false) {
      return bail({ approved: false, message: 'Rejected' });
    }
    if (resumeData?.approved === undefined) {
      return await suspend({ reason: 'Human approval required', amount: inputData.amount });
    }
    return { approved: true, message: `Approved by ${resumeData.approver}` };
  },
});

// analyzePurchase and executePayment are steps defined elsewhere (not shown).
export const paymentWorkflow = createWorkflow({
  id: 'payment-workflow',
  inputSchema: z.object({ amount: z.number(), userId: z.string() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  retryConfig: { attempts: 5, delay: 2000 },
})
  .then(analyzePurchase)
  .then(approvalStep)
  .then(executePayment)
  .commit();

// Transparent suspension
const run = await paymentWorkflow.createRunAsync();
const result = await run.start({ inputData: { amount: 5000, userId: 'u1' } });
if (result.status === 'suspended') {
  // runId saved to a queue; notify the approver
}

// Resume (same or another process, by runId); runId is the id persisted at suspension
const resumed = await paymentWorkflow.createRunAsync({ runId });
await resumed.resume({ resumeData: { approved: true, approver: 'mgr@acme' } });
```
The discriminated union returned by run.start() ('success' | 'failed' | 'suspended' | 'tripwire') ensures typed narrowing.
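The suspend/resume cycle and the status union can also be sketched without Mastra. The example below is a simplified model under stated assumptions: `startRun`, `resumeRun`, `describe`, the `snapshots` map, and the 1000 threshold are all hypothetical, and the union omits 'tripwire' for brevity. It only shows the idea of serializing state to JSON at suspension and narrowing on `status`.

```typescript
// Hypothetical sketch of suspend/resume with JSON snapshots (not the Mastra API).
type RunResult =
  | { status: 'success'; output: { approved: boolean; message: string } }
  | { status: 'suspended'; runId: string; reason: string }
  | { status: 'failed'; error: string };

type Snapshot = { runId: string; amount: number };

// Stands in for a storage table; values are serialized JSON strings.
const snapshots = new Map<string, string>();
let nextRunId = 1;

function startRun(amount: number): RunResult {
  if (amount <= 0) return { status: 'failed', error: 'Amount must be positive' };
  if (amount < 1000) return { status: 'success', output: { approved: true, message: 'Auto' } };
  const runId = `run-${nextRunId++}`;
  const snapshot: Snapshot = { runId, amount };
  snapshots.set(runId, JSON.stringify(snapshot)); // persist state before pausing
  return { status: 'suspended', runId, reason: 'Human approval required' };
}

function resumeRun(runId: string, resume: { approved: boolean; approver: string }): RunResult {
  const raw = snapshots.get(runId);
  if (raw === undefined) return { status: 'failed', error: `Unknown run ${runId}` };
  const snapshot = JSON.parse(raw) as Snapshot; // could be loaded by another process
  snapshots.delete(runId);
  return resume.approved
    ? { status: 'success', output: { approved: true, message: `Approved ${snapshot.amount} by ${resume.approver}` } }
    : { status: 'success', output: { approved: false, message: 'Rejected' } };
}

// Each case only exposes the fields that exist for that status.
function describe(result: RunResult): string {
  switch (result.status) {
    case 'success': return result.output.message;
    case 'suspended': return `waiting: ${result.reason}`;
    case 'failed': return `error: ${result.error}`;
  }
}

const first = startRun(5000);
const finalResult = first.status === 'suspended'
  ? resumeRun(first.runId, { approved: true, approver: 'mgr@acme' })
  : first;
```

Because the snapshot is plain JSON keyed by `runId`, nothing in `resumeRun` depends on the process that called `startRun`, which is the property that makes cross-process resumption possible.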
1.5 Multi-agent orchestration
Three approaches available:
Agent-as-tool (static supervisor): sub-agent wrapped in createTool(). Deterministic coordination, predictable flow.
agent.network() (dynamic routing): an Agent with agents, workflows, and tools registered; the LLM decides which primitive to call. Requires memory (persists task history and detects completion). Supports suspension with agent-execution-approval / tool-execution-approval.
Multi-agent workflows: steps invoking mastra.getAgent(...).
Important deprecation (2026): the AgentNetwork class was deprecated. Use agent.network() or an explicit supervisor.
```typescript
export const routingAgent = new Agent({
  id: 'routing-agent',
  instructions: 'Network of researchers and writers.',
  model: 'openai/gpt-5.4',
  agents: { researchAgent, writingAgent },
  workflows: { cityWorkflow },
  tools: { weatherTool },
  memory: new Memory({ storage: new LibSQLStore({ url: 'file:./mastra.db' }) }),
});

const result = await routingAgent.network('Weather in Tokyo and a suggested activity.');
for await (const chunk of result) {
  if (chunk.type === 'network-execution-event-step-finish') console.log(chunk.payload.result);
}
```
1.6 LLMs via Vercel AI SDK
Mastra v1 delegated routing to Vercel AI SDK (v1/v2/v3 compatible). Four ways to specify models:
```typescript
// (a) Model Router string (recommended)
model: 'openai/gpt-5.4'
model: 'anthropic/claude-4-5-sonnet'
model: 'google/gemini-2.5-pro'

// (b) Direct SDK instance (when you need strict typing)
import { openai } from '@ai-sdk/openai';
model: openai('gpt-4o')

// (c) Automatic cross-provider fallbacks
model: [
  { model: 'openai/gpt-5', maxRetries: 3 },
  { model: 'anthropic/claude-4-5-sonnet', maxRetries: 2 },
  { model: 'google/gemini-2.5-pro', maxRetries: 2 },
]

// (d) Dynamic per request
model: ({ requestContext }) =>
  requestContext.task === 'complex' ? 'anthropic/claude-4-5-sonnet' : 'openai/gpt-5-mini'
```
1.7 MCP (Model Context Protocol)
@mastra/mcp implements client and server. Transports stdio, SSE, and Streamable HTTP.
```typescript
// Consume external MCP servers
import { Agent } from '@mastra/core/agent';
import { MCPClient } from '@mastra/mcp';

export const mcp = new MCPClient({
  id: 'main-mcp',
  servers: {
    filesystem: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] },
    github: {
      url: new URL('https://api.githubcopilot.com/mcp/'),
      requestInit: { headers: { Authorization: `Bearer ${process.env.GH_PAT}` } },
    },
  },
});

export const researchAgent = new Agent({
  id: 'research',
  model: 'openai/gpt-4o',
  tools: await mcp.getTools(), // static
  // or dynamic per request: const toolsets = await mcp.listToolsets();
});
```
```typescript
// Expose Mastra as an MCP server
import { MCPServer } from '@mastra/mcp';

const server = new MCPServer({
  id: 'my-mcp-server',
  name: 'My MCP Server',
  version: '1.0.0',
  description: 'Exposes Mastra via MCP.',
  tools: { weatherTool },
  agents: { weatherAgent },     // generates the ask_weatherAgent tool
  workflows: { cityWorkflow },  // generates the run_cityWorkflow tool
});
server.startStdio();
```
2. Integration with Next.js App Router
2.1 Monorepo vs separate service
Criterion | Monorepo (embedded Mastra) | Separate service ( |
|---|---|---|
Deploy | Single ( | Two domains, CORS, cross-origin auth |
Agent↔UI latency | Zero internal network | +1 HTTP hop |
AI scale vs SSR | Coupled | Independent |
Workflows >5 min | Hard ( | Natural (VM/container) |
Multiple clients (web + mobile) | Frontend-centric | Reusable backend |
Vercel Hobby | Viable with caution | Not recommended |
MVP/prototype | Recommended | Overkill |
2.2 Directory structure (monorepo)
```text
my-nextjs-agent/
├── src/
│   ├── app/
│   │   ├── api/chat/route.ts      # Route Handler streaming
│   │   ├── chat/page.tsx          # UI client
│   │   ├── actions/weather.ts     # Server Actions
│   │   └── layout.tsx
│   ├── mastra/
│   │   ├── index.ts               # new Mastra({...})
│   │   ├── agents/weather-agent.ts
│   │   ├── tools/weather-tool.ts
│   │   ├── workflows/
│   │   └── memory.ts
│   └── lib/schemas.ts             # shared Zod schemas
├── next.config.ts                 # serverExternalPackages: ['@mastra/*']
└── .env.local                     # OPENAI_API_KEY, DATABASE_URL
```
Required configuration:
```typescript
// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  serverExternalPackages: ['@mastra/*'], // keeps the bundler from packaging native binaries
};

export default nextConfig;
```
Known gotcha (vercel/next.js#74816): in some versions serverExternalPackages works in dev but fails in build. Fallback via Webpack: config.externals.push('@mastra/core', '@mastra/libsql').
2.3 Server Actions invoking agents
Ideal for non-streaming synchronous operations (form submit, single generation). Keeps API keys on the server, integrates with Next.js cache/revalidation.
```typescript
// src/app/actions/weather.ts
'use server';
import { z } from 'zod';
import { mastra } from '@/mastra';
import { revalidatePath } from 'next/cache';

const WeatherInput = z.object({
  city: z.string().min(1).max(100),
  units: z.enum(['metric', 'imperial']).default('metric'),
});

export type WeatherState =
  | { status: 'idle' }
  | { status: 'success'; text: string; toolCalls: unknown[] }
  | { status: 'error'; message: string; fieldErrors?: Record<string, string[]> };

export async function getWeather(_prev: WeatherState, formData: FormData): Promise<WeatherState> {
  // Validate untrusted form input before it reaches the agent.
  const parsed = WeatherInput.safeParse({
    city: formData.get('city'),
    units: formData.get('units') ?? 'metric',
  });
  if (!parsed.success) {
    return { status: 'error', message: 'Invalid input', fieldErrors: parsed.error.flatten().fieldErrors };
  }
  try {
    const result = await mastra.getAgent('weatherAgent').generate(
      `Weather in ${parsed.data.city}? Units: ${parsed.data.units}.`,
      { memory: { thread: 'weather-thread', resource: 'public' } },
    );
    revalidatePath('/weather');
    return { status: 'success', text: result.text, toolCalls: result.toolCalls ?? [] };
  } catch (err) {
    console.error('[getWeather]', err);
    return { status: 'error', message: 'Failed to query the agent.' };
  }
}
```
Client consumption with useActionState:
```tsx
'use client';
import { useActionState } from 'react';
import { getWeather, type WeatherState } from '@/app/actions/weather';

export default function WeatherPage() {
  const [state, formAction, pending] = useActionState(getWeather, { status: 'idle' } as WeatherState);
  return (
    <form action={formAction}>
      <input name="city" required />
      <select name="units">
        <option value="metric">°C</option>
        <option value="imperial">°F</option>
      </select>
      <button disabled={pending}>{pending ? 'Loading...' : 'Get weather'}</button>
      {state.status === 'success' && <pre>{state.text}</pre>}
      {state.status === 'error' && <p>{state.message}</p>}
    </form>
  );
}
```
Important limitations: Server Actions don't stream; the client waits for the full response. They are subject to the platform's maxDuration. For streaming, use a Route Handler with useChat.
2.4 Route Handlers with streaming
Modern Mastra 1.0 pattern: @mastra/ai-sdk + handleChatStream().
```typescript
// src/app/api/chat/route.ts
import { handleChatStream } from '@mastra/ai-sdk';
import { toAISdkV5Messages } from '@mastra/ai-sdk/ui';
import { createUIMessageStreamResponse } from 'ai';
import { NextResponse } from 'next/server';
import { mastra } from '@/mastra';

export const maxDuration = 60;    // 300 default with Fluid; up to 800 on Pro
export const runtime = 'nodejs';  // REQUIRED: Mastra does not support Edge

export async function POST(req: Request) {
  const params = await req.json();
  const stream = await handleChatStream({
    mastra,
    agentId: 'weatherAgent',
    params: {
      ...params,
      memory: { thread: params.threadId ?? 'default', resource: params.resourceId ?? 'anon' },
    },
  });
  return createUIMessageStreamResponse({ stream });
}

// Hydrates history on mount
export async function GET() {
  const memory = await mastra.getAgentById('weatherAgent').getMemory();
  const res = await memory?.recall({ threadId: 'default', resourceId: 'anon' });
  return NextResponse.json(toAISdkV5Messages(res?.messages ?? []));
}
```
Low-level alternative (full control):
```typescript
export async function POST(req: Request) {
  const { messages } = await req.json();
  const stream = await mastra.getAgent('weatherAgent').stream(messages, {
    format: 'aisdk',                 // AI SDK v5 compat
    memory: { thread: 'demo', resource: 'user-1' },
    abortSignal: req.signal,         // propagates cancellation to the LLM
  });
  return stream.toUIMessageStreamResponse(); // or .toDataStreamResponse() (v4), .toTextStreamResponse()
}
```
Separate service (Next.js proxy to standalone Mastra on :4111):
```typescript
import { MastraClient } from '@mastra/client-js';

const client = new MastraClient({
  baseUrl: process.env.MASTRA_API_URL ?? 'http://localhost:4111',
  retries: 3,
  backoffMs: 300,
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const response = await client.getAgent('weatherAgent').stream({
    messages,
    threadId: 'demo',
    resourceId: 'user-1',
  });
  // Pass the upstream body through and disable proxy buffering so tokens arrive as they stream.
  return new Response(response.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'X-Accel-Buffering': 'no',
    },
  });
}
```
2.5 Chat UI with useChat and AI Elements
```tsx
'use client';
import { useEffect, useState } from 'react';
import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport, type ToolUIPart } from 'ai';

// Skeleton, ToolCard and ErrorCard are app-specific components (not shown).
export default function ChatPage() {
  const [input, setInput] = useState('');
  const { messages, setMessages, sendMessage, stop, status } = useChat({
    transport: new DefaultChatTransport({
      api: '/api/chat',
      // With Mastra Memory: send ONLY the last message plus identifiers
      prepareSendMessagesRequest({ messages, body }) {
        return {
          body: {
            ...body,
            messages: [messages[messages.length - 1]],
            threadId: 'default-thread',
            resourceId: 'user-1',
          },
        };
      },
    }),
  });

  useEffect(() => {
    fetch('/api/chat').then(r => r.json()).then(setMessages).catch(() => {});
  }, [setMessages]);

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.parts?.map((part, i) => {
            if (part.type === 'text') return <p key={i}>{part.text}</p>;
            if (part.type === 'reasoning') return <details key={i}><summary>Thinking</summary>{part.text}</details>;
            if (part.type?.startsWith('tool-')) {
              const p = part as ToolUIPart;
              // States: 'input-streaming' → 'input-available' → 'output-available' | 'output-error'
              switch (p.state) {
                case 'input-available': return <Skeleton key={i} />;
                case 'output-available': return <ToolCard key={i} output={p.output} />;
                case 'output-error': return <ErrorCard key={i} text={p.errorText} />;
              }
            }
            return null;
          })}
        </div>
      ))}
      <input value={input} onChange={e => setInput(e.target.value)} />
      <button onClick={() => { sendMessage({ text: input }); setInput(''); }}>Send</button>
      {status === 'streaming' && <button onClick={stop}>⏹ Stop</button>}
    </div>
  );
}
```
2.6 Deploy: limits, storages and runtimes
Vercel maxDuration (Apr/2026):
Plan | Default | Max. with Fluid | Max. without Fluid |
|---|---|---|---|
Hobby | 300s | 300s | 60s |
Pro | 300s | 800s | 300s |
Enterprise | 300s | 900s | 900s |
Fluid Compute (enabled by default since Apr/2025) allows concurrency on the same instance, active CPU pricing, and streams continue past 300s if the first byte is sent within ~25s.
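One way to keep route-level maxDuration values honest is to encode the table as a lookup and clamp requested values against it. This is only a sketch: `MAX_DURATION_SECONDS` and `clampMaxDuration` are hypothetical names, and the numbers are copied from the table above rather than read from any Vercel API, so they should be checked against the current Vercel documentation.

```typescript
// Hypothetical helper; limits are copied from the table above, not fetched from Vercel.
type Plan = 'hobby' | 'pro' | 'enterprise';

const MAX_DURATION_SECONDS: Record<Plan, { fluid: number; standard: number }> = {
  hobby: { fluid: 300, standard: 60 },
  pro: { fluid: 800, standard: 300 },
  enterprise: { fluid: 900, standard: 900 },
};

// Returns the requested duration, limited to what the plan allows (minimum 1 second).
function clampMaxDuration(requested: number, plan: Plan, fluid: boolean): number {
  const limits = MAX_DURATION_SECONDS[plan];
  const limit = fluid ? limits.fluid : limits.standard;
  return Math.min(Math.max(1, Math.floor(requested)), limit);
}

const proWithFluid = clampMaxDuration(600, 'pro', true);
const hobbyWithFluid = clampMaxDuration(600, 'hobby', true);
const proWithoutFluid = clampMaxDuration(600, 'pro', false);
```

Next.js expects `export const maxDuration` to be a literal in the route file, so a helper like this is better suited to a build-time check or a test than to computing the export itself.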
Critical storage in serverless: LibSQLStore with file:./mastra.db DOES NOT work on ephemeral FS (Vercel/Lambda). Use:
```typescript
// Turso (remote LibSQL)
new LibSQLStore({ url: process.env.TURSO_URL!, authToken: process.env.TURSO_TOKEN })

// Postgres (Neon, Supabase, Vercel Postgres)
new PostgresStore({ connectionString: process.env.DATABASE_URL! })

// Upstash Redis
new UpstashStore({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! })
```
Runtime: always export const runtime = 'nodejs' on routes that import Mastra. Edge runtime fails due to native dependencies (libsql, better-sqlite3, fs/crypto bindings).
VPS/Docker (mastra build):
```bash
npx mastra build --dir src/mastra   # generates .mastra/output/ (Hono bundle)
node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs
```
```dockerfile
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY src ./src
COPY tsconfig.json ./
RUN npx mastra build

FROM node:22-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && adduser -S mastra -u 1001
COPY --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output
USER mastra
EXPOSE 4111
HEALTHCHECK --interval=30s CMD wget -qO- http://localhost:4111/api/health || exit 1
CMD ["node", "--import=./.mastra/output/instrumentation.mjs", ".mastra/output/index.mjs"]
```
VercelDeployer publishes Mastra standalone as a Vercel function (no Next in front):
```typescript
import { Mastra } from '@mastra/core';
import { VercelDeployer } from '@mastra/deployer-vercel';

export const mastra = new Mastra({
  deployer: new VercelDeployer({ studio: true, maxDuration: 600, memory: 1536, regions: ['gru1', 'iad1'] }),
});
```
3. End-to-end type-safety with Zod
3.1 Zod as a quadruple contract
A Zod schema fulfills four simultaneous roles:
Role | Mechanism | Moment |
|---|---|---|
Static contract | | compile-time |
Runtime validation | | post-LLM |
Specification for the LLM | JSON Schema (via | pre-request |
Semantic documentation | | pre-request |
Critical rule: .describe() directly impacts the quality of structured output — it's "prompt engineering via types". Always describe ambiguous fields.
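Since descriptions travel to the model as part of the JSON Schema, one option is to lint the generated schema for fields that lack them. The sketch below works on a plain JSON Schema object; `JsonSchema`, `findUndescribedFields`, and `electionSchema` are hypothetical names for illustration and are not part of Zod or Mastra.

```typescript
// Hypothetical lint helper over a JSON Schema object (not part of Zod or Mastra).
type JsonSchema = {
  type?: string | string[];
  description?: string;
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
};

// Returns the dotted paths of every property that has no non-empty description.
function findUndescribedFields(schema: JsonSchema, path = ''): string[] {
  const missing: string[] = [];
  for (const [name, child] of Object.entries(schema.properties ?? {})) {
    const childPath = path ? `${path}.${name}` : name;
    if (!child.description || child.description.trim() === '') missing.push(childPath);
    missing.push(...findUndescribedFields(child, childPath)); // recurse into nested objects/arrays
  }
  if (schema.items) missing.push(...findUndescribedFields(schema.items, `${path}[]`));
  return missing;
}

const electionSchema: JsonSchema = {
  type: 'object',
  properties: {
    winner: { type: 'string', description: 'Full name of the winning candidate' },
    year: { type: 'number' },
    runnersUp: {
      type: 'array',
      description: 'Other candidates, ordered by vote share',
      items: { type: 'object', properties: { name: { type: 'string' } } },
    },
  },
};

const undescribed = findUndescribedFields(electionSchema);
// undescribed is ['year', 'runnersUp[].name']
```

A check like this could run in CI against the JSON Schema produced from each structured-output schema, failing the build when an ambiguous field ships without a description.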
3.2 Idiomatic patterns for LLMs
Use .nullable() instead of .optional() — OpenAI strict mode and GPT-5 reject optional() in structured output (mastra-ai/mastra#7234):
```typescript
// ❌ Breaks in GPT-5 strict mode
const bad = z.object({ details: z.string().optional() });

// ✅ Correct
const good = z.object({ details: z.string().nullable().describe('null if absent') });
```
Discriminated unions are the standard for agent actions (ReAct, tool-routing):
```typescript
export const AgentActionSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('search'), query: z.string() }),
  z.object({ type: z.literal('answer'), text: z.string(), confidence: z.number().min(0).max(1) }),
  z.object({ type: z.literal('escalate'), reason: z.string(), severity: z.enum(['low', 'medium', 'high']) }),
]);
```
3.3 Zod v3 vs v4: impacts on AI pipelines
Aspect | v3 | v4 | Impact |
|---|---|---|---|
Parse strings/arrays | baseline | 14×/7× faster (JIT) | Almost free streaming validation |
Compile TS | baseline | ~10× faster | Monorepos with many schemas |
Bundle | baseline | 2.3× smaller | Important on edge |
| 1 arg | 2 required args | Breaks migration |
| default ignored if missing | always returns default | Careful in working memory |
Schema creation | fast | 17× slower (JIT) | Don't instantiate in hot loops |
| any position | must be last chain call | Doesn't inherit via |
Mastra ≥ beta.16 normalizes both via Standard Schema; Zod coexists via zod/v3 and zod/v4.
3.4 Typed Tools with createTool
```typescript
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const githubRepoTool = createTool({
  id: 'get-github-repo-info',
  description: 'Fetch basic insights for a public GitHub repository',
  inputSchema: z.object({
    owner: z.string().describe('GitHub username or organization'),
    repo: z.string().describe('Repository name'),
  }),
  outputSchema: z.object({
    stars: z.number(),
    forks: z.number(),
    issues: z.number(),
    license: z.string().nullable(),
    lastPush: z.string(),
    description: z.string().nullable(),
  }),
  // context is inferred as { owner: string; repo: string }
  execute: async ({ context, runtimeContext }) => {
    const res = await fetch(`https://api.github.com/repos/${context.owner}/${context.repo}`);
    if (res.status === 404) throw new Error(`Not found`);
    const d = await res.json();
    // A mismatch with outputSchema is an error at compile time AND at runtime
    return {
      stars: d.stargazers_count,
      forks: d.forks_count,
      issues: d.open_issues_count,
      license: d.license?.name ?? null,
      lastPush: d.pushed_at,
      description: d.description,
    };
  },
});
```
The return value of tool.execute() is a discriminated union that includes an error path; narrow it via if ('error' in result && result.error). Avoid naming an outputSchema field error, since it collides with the discriminator.
Typed RuntimeContext (⚠️ known bug — .get() doesn't infer; use cast):
```typescript
export type SupportCtx = { 'user-tier': 'free' | 'pro' | 'enterprise'; language: 'en' | 'pt-BR' };

// Inside createTool({ ... }):
execute: async ({ runtimeContext }) => {
  const tier = runtimeContext.get('user-tier') as SupportCtx['user-tier'];
  const limit = tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
  // ...
}
```
3.5 Structured Outputs
Mastra v1 API:
```typescript
const result = await agent.generate('Who won 2012?', {
  structuredOutput: {
    schema: ElectionResultSchema,
    errorStrategy: 'fallback', // 'strict' | 'warn' | 'fallback'
    fallbackValue: { winner: 'Unknown', year: 0, party: 'Other' },
    jsonPromptInjection: true, // required for Gemini 2.5 with tools
  },
});
result.object.winner; // string, fully typed
```
Partial object streaming:
```typescript
const stream = await agent.stream('...', { structuredOutput: { schema } });
for await (const partial of stream.objectStream) {
  // DeepPartial<T>: fields arrive incrementally
}
const final = await stream.object; // validated T
```
Pure AI SDK (modern way with Output.object(), since generateObject is deprecated):
```typescript
import { generateText, Output, tool, stepCountIs } from 'ai';
import { z } from 'zod';

const { output } = await generateText({
  model: 'openai/gpt-5.2',
  tools: {
    weather: tool({
      inputSchema: z.object({ location: z.string() }),
      execute: async () => ({ /* ... */ }),
    }),
  },
  output: Output.object({ schema: RecipeSchema }),
  stopWhen: stepCountIs(5),
  prompt: '...',
});
```
Error handling:
```typescript
// The class is exported as NoObjectGeneratedError (its error name is 'AI_NoObjectGeneratedError').
import { generateObject, NoObjectGeneratedError } from 'ai';
import { z } from 'zod';

try {
  const { object } = await generateObject({ ... });
  return object;
} catch (err) {
  if (NoObjectGeneratedError.isInstance(err)) console.error('No object', err.text, err.cause);
  if (err instanceof z.ZodError) console.error('Zod failed', err.issues);
  throw err;
}
```
3.6 Workflows with typed steps
.then(step) only compiles if step.inputSchema is compatible with the outputSchema from the previous step — the compiler holds the pipeline shape.
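The chaining constraint can be illustrated without Mastra or Zod, using plain generics. In this sketch `Step`, `Pipeline`, `parseStep`, and `countStep` are hypothetical; the real createWorkflow() derives its types from Zod schemas, but the compile-time idea is the same: the output type of one step becomes the required input type of the next.

```typescript
// Hypothetical sketch of type-checked step chaining (not Mastra's implementation).
type Step<I, O> = { id: string; execute: (input: I) => O };

class Pipeline<I, O> {
  private constructor(private readonly runAll: (input: I) => O) {}

  static start<I, O>(step: Step<I, O>): Pipeline<I, O> {
    return new Pipeline(step.execute);
  }

  // The next step's input type must match this pipeline's current output type O.
  then<N>(step: Step<O, N>): Pipeline<I, N> {
    return new Pipeline((input: I) => step.execute(this.runAll(input)));
  }

  run(input: I): O {
    return this.runAll(input);
  }
}

const parseStep: Step<{ raw: string }, { words: string[] }> = {
  id: 'parse',
  execute: ({ raw }) => ({ words: raw.trim().split(/\s+/) }),
};

const countStep: Step<{ words: string[] }, { count: number }> = {
  id: 'count',
  execute: ({ words }) => ({ count: words.length }),
};

const pipeline = Pipeline.start(parseStep).then(countStep);
const counted = pipeline.run({ raw: 'hello typed world' });

// Pipeline.start(countStep).then(countStep) would not compile:
// { count: number } is not assignable to { words: string[] }.
```

Reordering or swapping steps with incompatible shapes is rejected by the compiler before anything runs, which is what "the compiler holds the pipeline shape" means in practice.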
```typescript
const scrapeStep = createStep({
  id: 'scrape',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: z.object({ url: z.string().url(), markdown: z.string() }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    markdown: await fetch(inputData.url).then(r => r.text()),
  }),
});

const summarizeStep = createStep({
  id: 'summarize',
  inputSchema: scrapeStep.outputSchema, // reuse the previous step's output as this step's input
  outputSchema: z.object({
    library: z.string(),
    latestVersion: z.string(),
    breakingChanges: z.array(z.string()),
  }),
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('summarizer').generate(inputData.markdown, {
      structuredOutput: {
        schema: z.object({
          library: z.string(),
          latestVersion: z.string(),
          breakingChanges: z.array(z.string()),
        }),
      },
    });
    return res.object;
  },
});

export const changelogWorkflow = createWorkflow({
  id: 'changelog',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: summarizeStep.outputSchema,
}).then(scrapeStep).then(summarizeStep).commit();
```
Runtime context validation via requestContextSchema:
```typescript
const workflow = createWorkflow({
  id: 'tiered',
  inputSchema,
  outputSchema,
  requestContextSchema: z.object({
    userTier: z.enum(['free', 'pro', 'enterprise']),
    locale: z.string(),
  }),
});
```
3.7 Where type-safety lives
Layer | Tool | Compile | Runtime | Sent to LLM |
|---|---|---|---|---|
HTTP/Form input | | ✅ | ✅ | — |
Tool input | | ✅ | ✅ | ✅ |
Tool output | | ✅ | ⚠️ informative | ✅ |
Structured output | | ✅ | ✅ | ✅ |
Workflow step | | ✅ | ✅ | — |
Runtime context | | ✅ on | ⚠️ optional | ❌ |
Memory | | ✅ | ✅ | ✅ |
4. Design patterns for agents and workflows
4.1 ReAct: implicit vs explicit
Approach | Who decides | Implementation | When to use |
|---|---|---|---|
Implicit (agent loop) | LLM, via native tool calling | | Open-ended tasks, unknown number of steps |
Explicit (workflow) | Orchestrator code | | Auditability, SLA, hard limits, HITL in the loop |
Recommendation: always start with implicit ReAct. Only migrate to explicit when you need granular trace, step budget, human approval mid-flight, or A/B testing by action type.
```typescript
// Implicit: the agent loop already implements ReAct
export const researchAgent = new Agent({
  instructions: `Follow ReAct loop: THOUGHT → ACTION (call one tool) → OBSERVATION. Repeat until confident. Never invent facts.`,
  model: 'openai/gpt-4o',
  tools: { searchDocsTool },
});

await researchAgent.generate('What is the difference between suspend() and bail()?', { maxSteps: 8 });
```
Reusable ReAct schema:
```typescript
export const ReActStepSchema = z.object({
  thought: z.string().describe('Reasoning about next step'),
  action: z.discriminatedUnion('type', [
    z.object({ type: z.literal('tool_call'), toolName: z.string(), args: z.record(z.string(), z.unknown()) }),
    z.object({ type: z.literal('final_answer'), answer: z.string(), confidence: z.number().min(0).max(1) }),
  ]),
});
```
4.2 Plan-and-Execute
Separates Planner (powerful LLM, e.g., Claude Sonnet/GPT-5) that produces plan upfront, from Executor (cheap LLMs specialized in tool use). ~30% fewer tokens than ReAct in complex multi-step tasks (LangChain 2026 benchmark).
```typescript
const plannerAgent = new Agent({
  id: 'planner',
  instructions: 'Break goal into 3-7 concrete, tool-executable steps.',
  model: 'anthropic/claude-sonnet-4',
});

const executorAgent = new Agent({
  id: 'executor',
  instructions: 'Execute a single plan step. Use tools. Return terse result.',
  model: 'openai/gpt-4o-mini',
  tools: { /* ... */ },
});

const planStep = createStep({
  id: 'plan',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({
    goal: z.string(),
    plan: z.array(z.object({
      id: z.string(),
      description: z.string(),
      dependsOn: z.array(z.string()).default([]),
    })),
  }),
  execute: async ({ inputData, mastra }) => {
    // Uses structuredOutput, matching the v1 API shown in section 3.5.
    const res = await mastra.getAgent('plannerAgent').generate(`Goal: ${inputData.goal}`, {
      structuredOutput: { schema: z.object({ plan: z.array(/* ... */) }) },
    });
    return { goal: inputData.goal, plan: res.object.plan };
  },
});

// executeStep runs one plan item with executorAgent (definition not shown).
export const planAndExecuteWorkflow = createWorkflow({
  id: 'plan-exec',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({ results: z.array(z.any()) }),
})
  .then(planStep)
  .map(async ({ inputData }) => inputData.plan)
  .foreach(executeStep, { concurrency: 3 })
  .commit();
```
4.3 Orchestration patterns matrix
Pattern | Mastra API | When to use |
|---|---|---|
Pipeline | | Fixed steps, linear dependency (ETL, content pipeline) |
Fan-out/Fan-in | | N fixed independent tasks; output is object with keys = step ids |
MapReduce | | N dynamic; process list |
Router/Branch | | Routing by classification; all branches share schemas |
Static supervisor | Agent with sub-agents as tools | Deterministic coordination |
Dynamic supervisor | | LLM decides which primitive to call |
Evaluator-Optimizer | | Convergent iterative refinement |
Human-in-the-loop | | Approvals, payment >$X, irreversible actions |
Handoff | workflow + agents with shared memory | Specialist takes control |
Council | | Multiple opinions for synthesis |
Concrete Evaluator-Optimizer:
```typescript
workflow
  .then(generateDraft)
  .dowhile(
    createStep({
      id: 'eval-refine',
      execute: async ({ inputData, state }) => {
        const scorer = createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' });
        const { score } = await scorer.run({ input: state.prompt, output: inputData.draft });
        // Stop refining once the draft is relevant enough.
        if (score >= 0.85) return { ...inputData, done: true, score };
        const refined = await refinerAgent.generate(/* with feedback */);
        return { ...inputData, draft: refined.text, done: false, score };
      },
    }),
    async ({ inputData }) => !inputData.done,
  )
  .commit();
```
4.4 External tools integration
REST API:
export const githubIssueTool = createTool({ id: 'github-create-issue', inputSchema: z.object({ repo: z.string().regex(/^[\w-]+\/[\w-]+$/), title: z.string().min(1).max(256), body: z.string().max(65_536).optional(), labels: z.array(z.string()).max(100).default([]), }), outputSchema: z.object({ number: z.number(), url: z.string().url() }), execute: async ({ context, tracingContext }) => { const span = tracingContext?.currentSpan?.startSpan('github.api.call'); try { const res = await fetch(`https://api.github.com/repos/${context.repo}/issues`, { method: 'POST', headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}`, 'Content-Type': 'application/json' }, body: JSON.stringify({ title: context.title, body: context.body, labels: context.labels }), signal: AbortSignal.timeout(10_000), }); if (!res.ok) throw new Error(`GitHub ${res.status}: ${await res.text()}`); const data = await res.json(); return { number: data.number, url: data.html_url }; } finally { span?.end(); } },});Database (Drizzle):
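import { drizzle } from 'drizzle-orm/node-postgres'; // assumes the node-postgres driver
import { eq } from 'drizzle-orm';
import { users } from './schema'; // hypothetical path to the table definition

// Create the client once at module scope so tool calls share one connection pool
const db = drizzle(process.env.DATABASE_URL!);

export const getUserTool = createTool({
  id: 'get-user',
  inputSchema: z.object({ userId: z.string().uuid() }),
  outputSchema: z
    .object({ id: z.string(), email: z.string(), plan: z.enum(['free', 'pro', 'enterprise']) })
    .nullable(),
  execute: async ({ context }) => {
    const [row] = await db.select().from(users).where(eq(users.id, context.userId)).limit(1);
    return row ?? null; // null signals "user not found" to the LLM
  },
});
Error handling in tools: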
Strategy | When | Example |
|---|---|---|
Throw | Unrecoverable error | Auth failure, timeout after retries |
Structured return | LLM must react/retry | |
Internal retry | Transient failure | |
Circuit breaker | Unstable API | |
Timeout | Prevent stuck agent | |
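The "Internal retry", "Timeout" and "Circuit breaker" strategies in the table can be sketched as small wrappers around any async call made inside a tool's execute function. This is a minimal illustration under stated assumptions: `withTimeout`, `withRetry` and `CircuitBreaker` are hypothetical helpers, not Mastra APIs, and the thresholds and delays are placeholder values.

```typescript
// Hypothetical helpers (not Mastra APIs) for wrapping calls inside a tool.

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Timeout: reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Internal retry: exponential backoff (base, 2x base, 4x base, ...).
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 100): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the last one.
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Circuit breaker: fail fast after `threshold` consecutive failures,
// then allow a trial call once `cooldownMs` has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold = 3, cooldownMs = 30_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.threshold;
    if (open && Date.now() - this.openedAt < this.cooldownMs) {
      throw new Error('Circuit open: skipping call');
    }
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }
}
```

The wrappers compose, for example `breaker.call(() => withRetry(() => withTimeout(callApi(), 10_000)))`, where `callApi` stands for whatever request the tool makes. Errors that survive all three layers are the ones worth throwing to the agent.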
// "Structured return" pattern: best for the ReAct loop
outputSchema: z.union([
  z.object({ success: z.literal(true), data: z.object({ /* ... */ }) }),
  z.object({ success: z.literal(false), error: z.object({ code: z.string(), message: z.string() }) }),
]),
4.5 Observability
Structured logging (PinoLogger):
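import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';

export const mastra = new Mastra({
  logger: new PinoLogger({
    name: 'Mastra',
    level: process.env.LOG_LEVEL ?? 'info',
    // Attach correlation fields to every log line.
    // getCurrentTraceId() is an app-specific helper that is not shown here.
    mixin() {
      return { traceId: getCurrentTraceId(), service: 'ai-api', env: process.env.NODE_ENV };
    },
  }),
});
Native AI Tracing with OTel + multiple exporters: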
import { Mastra } from '@mastra/core';
import { DefaultExporter } from '@mastra/observability';
import { LangfuseExporter } from '@mastra/langfuse';

export const mastra = new Mastra({
  observability: {
    default: { enabled: true },
    configs: {
      langfuse: {
        serviceName: 'prod-agents',
        exporters: [
          new LangfuseExporter({
            publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
            secretKey: process.env.LANGFUSE_SECRET_KEY!,
            baseUrl: process.env.LANGFUSE_BASE_URL,
          }),
        ],
        sampling: { type: 'ratio', probability: 0.1 }, // 10% in prod
      },
      debug: { exporters: [new DefaultExporter()], sampling: { type: 'always' } },
    },
    // Route support sessions to the fully sampled debug config
    configSelector: (ctx) => (ctx.runtimeContext?.get('supportMode') ? 'debug' : 'langfuse'),
  },
});
Platform comparison:
Platform | Strong in | Weak in | When to choose |
|---|---|---|---|
Langfuse | LLM-native (prompts, cost, evals). Self-host. | Generic infra tracing | Prompt engineering, cost per feature, evals |
Braintrust | Production evals, A/B side-by-side | Less rich tracing | Teams focused on regression testing |
LangSmith | LangChain integration, datasets | Vendor lock-in | Stack already LangChain/LangGraph |
SigNoz/Datadog (OTel) | Full-stack APM | Not LLM-first | Unified APM (not just AI) |
Mastra Studio + DuckDB | Built-in, zero setup, cost/latency | Local/single-node | Local dev, small teams |
Evals / Scorers (Mastra 2026 — replaces legacy evals):
Scorers run asynchronously after the response is returned, through a four-stage pipeline: preprocess → analyze → generateScore → generateReason. Built-in scorers: answer-relevancy, answer-similarity, faithfulness, hallucination, completeness, tool-call-accuracy, trajectory-accuracy, bias, toxicity, prompt-alignment.
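To make the four stages concrete, here is a hand-rolled sketch of a scorer that follows the same preprocess → analyze → generateScore → generateReason order. It is illustrative only and does not use Mastra's scorer API: `ScorerRun`, `ScoreResult` and `scoreRelevancy` are hypothetical names, and the keyword-overlap metric is a cheap stand-in for the LLM judge a real relevancy scorer would use.

```typescript
// Illustrative only: a hand-rolled four-stage scorer, NOT Mastra's scorer API.
// Keyword overlap stands in for an LLM judge.

type ScorerRun = { input: string; output: string };
type ScoreResult = { score: number; reason: string };

// Lowercase, split on non-alphanumerics, drop very short tokens.
const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2);

function scoreRelevancy(run: ScorerRun): ScoreResult {
  // 1. preprocess: normalize both sides into token collections
  const questionTerms = Array.from(new Set(tokenize(run.input)));
  const answerTerms = new Set(tokenize(run.output));

  // 2. analyze: which question terms does the answer cover?
  const covered = questionTerms.filter((t) => answerTerms.has(t));

  // 3. generateScore: fraction of question terms covered, in [0, 1]
  const score = questionTerms.length === 0 ? 0 : covered.length / questionTerms.length;

  // 4. generateReason: human-readable explanation for dashboards
  const reason = `Answer covers ${covered.length}/${questionTerms.length} question terms`;
  return { score, reason };
}
```

Keeping the stages separate means the expensive part (analyze, typically an LLM call) can be swapped out while the score and reason shapes stay stable for dashboards and CI thresholds.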
import { Agent } from '@mastra/core/agent';
import {
  createAnswerRelevancyScorer,
  createToxicityScorer,
  createHallucinationScorer,
} from '@mastra/evals/scorers/llm';

export const supportAgent = new Agent({
  id: 'support',
  model: 'openai/gpt-4o',
  scorers: {
    // Quality metric: sample 20% of responses to limit judge cost
    relevancy: {
      scorer: createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' }),
      sampling: { type: 'ratio', rate: 0.2 },
    },
    // Safety metrics: score every response
    hallucination: {
      scorer: createHallucinationScorer({ model: 'openai/gpt-4o-mini' }),
      sampling: { type: 'ratio', rate: 1.0 },
    },
    toxicity: {
      scorer: createToxicityScorer({ model: 'openai/gpt-4o-mini' }),
      sampling: { type: 'ratio', rate: 1.0 },
    },
  },
});
CI/CD with Vitest:
import { describe, it, expect } from 'vitest';
import { runEvals } from '@mastra/core/evals';
// supportAgent, relevancyScorer and hasSourcesScorer are assumed to be defined elsewhere in the project

describe('Support Agent', () => {
  it('meets quality thresholds', async () => {
    const result = await runEvals({
      target: supportAgent,
      data: [{ input: 'How to cancel?', groundTruth: 'cancellation policy' }],
      scorers: [relevancyScorer, hasSourcesScorer],
      concurrency: 3,
    });
    // Fail the build when the relevancy score drops below 0.8
    expect(result.scores['answer-relevancy']).toBeGreaterThanOrEqual(0.8);
  });
});
5. Consolidated architectural blueprint
TL;DR for architects:
Start simple. Single agent + tools. Workflow/multi-agent only when steps are known or context becomes unmanageable.
Workflows for auditability/SLO; agent.network() for flexibility. Workflows = code dictates flow. Networks = LLM dictates flow.
Zod everywhere. Every tool inputSchema/outputSchema, every step, every scorer. It's your only defense against hallucination in tool args.
Persistence from day 1. Postgres (prod) or LibSQL (dev). Without it, suspend/resume doesn't work and you lose traces on restart.
serverExternalPackages: ['@mastra/*'] + runtime = 'nodejs'. Non-negotiable in Next.js.
Route Handlers for streaming; Server Actions for sync. Don't try to stream via Server Action.
Vercel Fluid Compute + remote Postgres/Turso for serverless production. Never local LibSQL.
Observability by environment via configSelector: dev→Default, staging→10% Langfuse, prod→1% + Datadog.
Scorers with low sampling in prod (5-20%), 100% in toxicity/safety.
MCP before reimplementing tools. GitHub, Slack, Notion, filesystem, Playwright already have official servers.
HITL via suspend() whenever cost/irreversibility > convenience. Payment >$X, deletion, bulk send.
Prefer agent.network() or explicit supervisor over AgentNetwork class (deprecated).
Plan-and-Execute > ReAct for tasks >5 steps. Powerful planner + cheap executors saves 20-30% tokens.
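The environment-dependent items in the checklist (tracing per environment, scorer sampling) boil down to a lookup table plus a sampler. Below is a minimal sketch under stated assumptions: every name is hypothetical and none of it is Mastra's configuration API. The development rate of 100%, reading "prod→1% + Datadog" as two exporters at a 1% rate, and the 10% scorer rate are assumptions chosen to fit the ranges stated in the checklist.

```typescript
// Hypothetical names throughout; this mirrors the idea behind per-environment
// config selection, not Mastra's actual configuration types.

type Env = 'development' | 'staging' | 'production';
type TracingPlan = { exporters: string[]; sampleRate: number };

// Rates taken from the checklist; the development rate of 1 is an assumption.
const TRACING_BY_ENV: Record<Env, TracingPlan> = {
  development: { exporters: ['default'], sampleRate: 1 },
  staging: { exporters: ['langfuse'], sampleRate: 0.1 },
  production: { exporters: ['langfuse', 'datadog'], sampleRate: 0.01 },
};

// Map a raw NODE_ENV-style string to a known environment.
function parseEnv(raw: string | undefined): Env {
  return raw === 'staging' || raw === 'production' ? raw : 'development';
}

// Safety scorers always run; the membership of this set is an assumption.
const SAFETY_SCORERS = new Set(['toxicity', 'hallucination']);

function scorerSampleRate(scorerId: string, env: Env): number {
  if (SAFETY_SCORERS.has(scorerId)) return 1;
  // 0.1 sits inside the 5-20% band recommended for production.
  return env === 'production' ? 0.1 : 1;
}

// Ratio sampler; `rand` is injectable so tests can be deterministic.
function shouldSample(rate: number, rand: () => number = Math.random): boolean {
  return rand() < rate;
}
```

A call site would look like `shouldSample(scorerSampleRate('relevancy', parseEnv(process.env.NODE_ENV)))`. Keeping the table in one place makes the cost trade-off reviewable in a single diff.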
Known critical pitfalls
.optional() breaks OpenAI/GPT-5 strict mode → use .nullable() with .describe() (mastra-ai/mastra#7234).
Gemini 2.5 + tools + structured output → always jsonPromptInjection: true.
z.record() in Zod v4 needs 2 required args.
Field named error in outputSchema breaks tool.execute() narrowing.
RuntimeContext.get() doesn't infer types; a manual cast is needed.
.describe()/.meta() must be the last chain call (it doesn't inherit via .optional()/.extend()).
tool() helper from AI SDK is mandatory for type inference; createTool from Mastra doesn't suffer from this.
generateObject deprecated → migrate to generateText({ output: Output.object(...) }).
Zod v4 creates schemas 17× slower (JIT), so never instantiate schemas in a hot render path or loop.
toDataStreamResponse() + output: zodSchema conflicts (mastra-ai/mastra#5544); use experimental_output instead.
serverExternalPackages has a build issue (vercel/next.js#74816); keep a Webpack fallback.
AgentNetwork class → deprecated; use agent.network().
legacy_workflows → replaced by createWorkflow/createStep.
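The first pitfall is easier to see at the JSON Schema level. OpenAI's strict structured outputs require every property to be listed in required and additionalProperties to be false, so an optional field is rejected while a nullable one passes. The sketch below checks those two rules on a simplified schema shape; `JsonSchema` and `strictModeViolations` are hypothetical names, and the two example schemas only approximate what a Zod-to-JSON-Schema conversion emits.

```typescript
// Minimal JSON Schema subset; names are hypothetical, not a library API.
type JsonSchema = {
  type: string | string[];
  properties?: Record<string, JsonSchema>;
  required?: string[];
  additionalProperties?: boolean;
};

// Lists what would make an object schema unusable in strict mode:
// every property must appear in `required`, and extra keys must be forbidden.
function strictModeViolations(schema: JsonSchema, path = '$'): string[] {
  const problems: string[] = [];
  if (schema.type !== 'object' || !schema.properties) return problems;

  if (schema.additionalProperties !== false) {
    problems.push(`${path}: additionalProperties must be false`);
  }
  const required = new Set(schema.required ?? []);
  for (const key of Object.keys(schema.properties)) {
    if (!required.has(key)) {
      problems.push(`${path}.${key}: must be required (use a null union instead)`);
    }
    // Recurse so nested objects are held to the same rules.
    problems.push(...strictModeViolations(schema.properties[key], `${path}.${key}`));
  }
  return problems;
}

// Roughly what an .optional() field produces: `body` is missing from `required`.
const optionalStyle: JsonSchema = {
  type: 'object',
  properties: { title: { type: 'string' }, body: { type: 'string' } },
  required: ['title'],
  additionalProperties: false,
};

// Roughly what a .nullable() field produces: `body` is required but may be null.
const nullableStyle: JsonSchema = {
  type: 'object',
  properties: { title: { type: 'string' }, body: { type: ['string', 'null'] } },
  required: ['title', 'body'],
  additionalProperties: false,
};
```

Running a check like this in CI against converted tool schemas surfaces strict-mode problems before the provider rejects the request at runtime.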
Reference stack for production
┌──────────────────────────────────────────────────────────────┐
│  Next.js App Router (Node runtime)                           │
│  ├─ Route Handlers (streaming, useChat)                      │
│  └─ Server Actions (synchronous, forms)                      │
├──────────────────────────────────────────────────────────────┤
│  Mastra (embedded or standalone)                             │
│  ├─ Agents (implicit ReAct) + agent.network()                │
│  ├─ Workflows (.then/.parallel/.branch/.foreach/.dountil)    │
│  ├─ Tools (createTool + Zod) + MCP (client + server)         │
│  └─ Memory (threads/resources, semanticRecall, workingMemory)│
├──────────────────────────────────────────────────────────────┤
│  Storage: MastraCompositeStore                               │
│  ├─ memory: LibSQL (dev) / Postgres (prod)                   │
│  ├─ workflows: Postgres (persistent snapshots)               │
│  ├─ scores: Postgres                                         │
│  └─ vectors: pgvector / Pinecone / Upstash                   │
├──────────────────────────────────────────────────────────────┤
│  LLM: Vercel AI SDK v5/v6                                    │
│  ├─ openai/gpt-5.x, anthropic/claude-4-5-sonnet,             │
│  │  google/gemini-2.5-pro                                    │
│  └─ Automatic cross-provider fallbacks                       │
├──────────────────────────────────────────────────────────────┤
│  Observability                                               │
│  ├─ PinoLogger (structured)                                  │
│  ├─ OTel tracing → Langfuse/Braintrust/SigNoz                │
│  └─ Async scorers (relevancy, hallucination, toxicity)       │
├──────────────────────────────────────────────────────────────┤
│  Deploy                                                      │
│  ├─ Vercel (Fluid Compute, maxDuration: 800s Pro)            │
│  ├─ VPS (mastra build → node + PM2)                          │
│  └─ Docker (multi-stage, Alpine, healthcheck)                │
└──────────────────────────────────────────────────────────────┘
Conclusion
The Mastra 1.x + Next.js 15 + AI SDK v5/v6 ecosystem is, as of April 2026, the most cohesive and type-safe approach for building AI agents in TypeScript, surpassing LangChain/LangGraph.js in ergonomics, DX and native integration with the JavaScript runtime. The three architectural decisions that most affect scale and maintainability are: (1) choosing between Mastra embedded in Next.js (MVP, single frontend) and standalone (independent scaling, multiple clients); (2) using remote Postgres instead of local LibSQL from day zero in serverless (without it, suspend/resume and traces are illusions); and (3) investing in Zod as a fourfold contract (compile-time types, runtime validation, prompt to the LLM, semantic documentation) from the very first tool.
The counter-intuitive insight is that the biggest quality gain comes not from the most powerful model but from the granularity of the Zod schemas: well-written .describe() calls on outputSchema fields are prompt engineering in disguise, and using .nullable() instead of .optional() eliminates entire classes of failures in OpenAI strict mode. Combined with durable workflows (suspend/resume/bail), agent.network() for dynamic routing, MCP for interop without reimplementation, and continuous scorers with stratified sampling, the stack delivers auditable, resilient and observable agents, all of which are non-negotiable requirements in production.
The framework's immediate roadmap (post-1.25) focuses on AI SDK v6 (native ToolLoopAgent), consolidation of MastraCompositeStore, expansion of providers in the Model Router, and maturation of Agent Networks as the definitive replacement for the deprecated AgentNetwork class. For architects deciding today: adopting Mastra 1.x is safe for production, with two caveats. Check the official changelog weekly for deprecations (the pace of change is high), and validate canonical snippets against node_modules/@mastra/*/dist/docs/ or https://mastra.ai/llms.txt rather than dated blog posts.


