
Technical guide: architecting AI with Mastra, Next.js, and TypeScript

Learn how to architect AI with Mastra and Next.js and discover how to replace the LangChain/LangGraph.js approach in production environments.

By Genildo Souza · Jun 11 · 26 min read


"Written in April/2026, referencing @mastra/core@1.25.0. Check the official changelog before implementing."

Mastra, Next.js, and TypeScript form arguably the most mature and cohesive stack today for building type-safe AI agents in JavaScript, and a practical replacement for the LangChain/LangGraph.js approach in production environments. Mastra 1.0 (GA since Jan/2026; current version @mastra/core@1.25.0, ~23.1k stars on GitHub) consolidated a central registry architecture with dependency injection, durable workflows with suspend/resume, memory as a first-class primitive, native MCP (client and server), and transparent integration with Vercel AI SDK v5/v6. Combined with the Next.js App Router, it delivers native streaming, type-safe Server Actions, and deployment to both Vercel serverless (with Fluid Compute, up to 800s) and VPS/Docker via mastra build.

This guide consolidates four pillars (framework architecture, Next.js integration, type-safety with Zod, and design patterns) into an actionable blueprint for senior architects. All snippets are written against Mastra 1.x, AI SDK v5 and Next.js 14/15.

1. Architecture and Mastra capabilities

1.1 The `Mastra` object as central registry

Mastra is a registry with DI orchestrating Agents, Workflows, Tools, Memory, Storage, Vector, Observability, MCP Servers and Gateways. The HTTP server is generated on top of Hono (with adapters for Express/Fastify/Koa from v1.0). In production, storage by domains (memory/workflows/scores/traces) via MastraCompositeStore is the default.

// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';
import { MastraCompositeStore } from '@mastra/core/storage';
import { WorkflowsPG, ScoresPG, PgVector } from '@mastra/pg';
import { MemoryLibSQL } from '@mastra/libsql';
import { weatherAgent } from './agents/weather-agent';
import { weatherWorkflow } from './workflows/weather-workflow';

// One store per storage domain
const storage = new MastraCompositeStore({
  id: 'composite',
  domains: {
    memory: new MemoryLibSQL({ url: 'file:./local.db' }),
    workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL! }),
    scores: new ScoresPG({ connectionString: process.env.DATABASE_URL! }),
  },
});

export const mastra = new Mastra({
  agents: { weatherAgent },
  workflows: { weatherWorkflow },
  storage,
  vectors: { pg: new PgVector({ connectionString: process.env.DATABASE_URL! }) },
  logger: new PinoLogger({ name: 'Mastra', level: 'info' }),
  server: { port: 4111, host: '0.0.0.0', timeout: 30_000 },
  // mcpServers, observability, scorers, processors, gateways, bundler...
});

Main packages: @mastra/core (Mastra, Agent, Workflow, Tool, Memory, Storage interfaces, Processors), @mastra/memory, @mastra/libsql, @mastra/pg, @mastra/mcp, @mastra/ai-sdk, @mastra/loggers, @mastra/observability, @mastra/client-js, mastra (CLI). From v1, subpath imports are mandatory (@mastra/core/agent, @mastra/core/workflows, etc.), except Mastra and type Config.

Current status (Apr/2026): @mastra/core@1.25.0 GA, Apache-2.0 license (with ee/ areas under the Mastra Enterprise License), maintained by the team behind Gatsby (Sam Bhagwat, Shane Thomas). Positioned against LangGraph.js; uses the Vercel AI SDK for model routing (40+ providers, 3000+ models via the Mastra Model Router).

1.2 Agent Lifecycle

import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
import { weatherTool } from '../tools/weather';

export const weatherAgent = new Agent({
  id: 'weather-agent',
  name: 'Weather Agent',
  description: 'Answers questions about the weather.',
  instructions: 'You are a weather assistant. Use weatherTool when needed.',
  model: 'openai/gpt-5.1', // Mastra Model Router: "provider/model"
  tools: { weatherTool },
  memory: new Memory({
    storage: new LibSQLStore({ url: 'file:./agent.db' }),
    options: { lastMessages: 10, workingMemory: { enabled: true } },
  }),
});

// .generate(): full response; returns { text, toolCalls, toolResults, steps, usage }
const res = await weatherAgent.generate('Weather in Tokyo?', {
  memory: { resource: 'user-123', thread: 'conv-42' },
});

// .stream(): token by token via MastraModelOutput
const stream = await weatherAgent.stream('Plan my day');
for await (const chunk of stream.textStream) process.stdout.write(chunk);

1.3 Memory: threads, resources, storage and vector

The Memory class combines storage (persistent history), vector (semantic recall), and an embedder. A thread isolates one conversation; a resource is a stable grouping key (user/project) that lets multiple agents share working memory and embeddings across threads. The default scope changed to 'resource' in Mastra 0.10+.

import { Memory } from '@mastra/memory';
import { PostgresStore, PgVector } from '@mastra/pg';
import { openai } from '@ai-sdk/openai';

const memoryPg = new Memory({
  storage: new PostgresStore({ connectionString: process.env.DATABASE_URL! }),
  vector: new PgVector({ connectionString: process.env.DATABASE_URL! }),
  embedder: openai.embedding('text-embedding-3-small'),
  options: {
    lastMessages: 20,
    semanticRecall: {
      topK: 5,
      messageRange: { before: 2, after: 1 },
      scope: 'resource',
      indexConfig: { type: 'hnsw', metric: 'dotproduct', m: 16, efConstruction: 64 },
    },
    workingMemory: {
      enabled: true,
      template: '# User\n- First Name:\n- Last Name:',
      scope: 'resource',
    },
    generateTitle: true,
  },
});

Supported vector stores: LibSQLVector, PgVector (HNSW/IVFFlat, bit, sparsevec), Pinecone, Upstash, Qdrant, Chroma, MongoDB, Astra, OpenSearch, S3Vectors, TurboPuffer, Lance, Cloudflare, Couchbase.
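To make the thread/resource distinction concrete, here is a minimal in-memory sketch in plain TypeScript. It is not Mastra's implementation: `ScopedRecall`, `StoredMessage`, and `recall` are hypothetical names, and a substring match stands in for vector similarity.

```typescript
// Hypothetical in-memory model of thread vs. resource scoping (not Mastra's code).
type Scope = 'thread' | 'resource';

interface StoredMessage {
  resourceId: string; // stable owner, e.g. a user or project
  threadId: string;   // one conversation
  text: string;
}

class ScopedRecall {
  private messages: StoredMessage[] = [];

  add(message: StoredMessage): void {
    this.messages.push(message);
  }

  // A substring match stands in for semantic (vector) search.
  recall(query: string, resourceId: string, threadId: string, scope: Scope): string[] {
    return this.messages
      .filter((m) => m.resourceId === resourceId)
      .filter((m) => scope === 'resource' || m.threadId === threadId)
      .filter((m) => m.text.toLowerCase().includes(query.toLowerCase()))
      .map((m) => m.text);
  }
}

const recallStore = new ScopedRecall();
recallStore.add({ resourceId: 'user-123', threadId: 'conv-1', text: 'I prefer Celsius' });
recallStore.add({ resourceId: 'user-123', threadId: 'conv-2', text: 'Plan my trip to Tokyo' });
recallStore.add({ resourceId: 'user-999', threadId: 'conv-3', text: 'I prefer Fahrenheit' });

// Thread scope only sees conv-2; resource scope sees every thread of user-123.
const threadHits = recallStore.recall('prefer', 'user-123', 'conv-2', 'thread');
const resourceHits = recallStore.recall('prefer', 'user-123', 'conv-2', 'resource');
```

With resource scope, a preference stated in one conversation is available in the next one, while messages from other resources stay isolated in both modes.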

1.4 Workflow System

createWorkflow() / createStep() deliver durable execution: a snapshot is taken automatically at each suspend(), state is serialized to JSON in storage, and a run can be resumed from another process via its runId. The tables mastra_workflow_snapshot, mastra_traces, and mastra_messages are created automatically.

Flow control primitives: .then() (sequential), .parallel([]) (fan-out/fan-in), .branch([[cond, step]]) (router), .foreach(step, {concurrency}) (MapReduce), .dountil()/.dowhile() (loops), .map() (transform). Retries are configurable at both workflow level and step level.

import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const approvalStep = createStep({
  id: 'approval',
  inputSchema: z.object({ amount: z.number(), needsApproval: z.boolean() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  suspendSchema: z.object({ reason: z.string(), amount: z.number() }),
  resumeSchema: z.object({ approved: z.boolean(), approver: z.string() }),
  execute: async ({ inputData, resumeData, suspend, bail }) => {
    if (!inputData.needsApproval) return { approved: true, message: 'Auto' };
    if (resumeData?.approved === false) {
      return bail({ approved: false, message: 'Rejected' });
    }
    if (resumeData?.approved === undefined) {
      return await suspend({ reason: 'Human approval required', amount: inputData.amount });
    }
    return { approved: true, message: `Approved by ${resumeData.approver}` };
  },
});

// analyzePurchase and executePayment are steps defined elsewhere.
export const paymentWorkflow = createWorkflow({
  id: 'payment-workflow',
  inputSchema: z.object({ amount: z.number(), userId: z.string() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  retryConfig: { attempts: 5, delay: 2000 },
})
  .then(analyzePurchase)
  .then(approvalStep)
  .then(executePayment)
  .commit();

// Transparent suspension
const run = await paymentWorkflow.createRunAsync();
const result = await run.start({ inputData: { amount: 5000, userId: 'u1' } });
if (result.status === 'suspended') {
  // persist the runId (e.g. in a queue) and notify the approver
}

// Resume (same or another process) using the persisted runId
const resumed = await paymentWorkflow.createRunAsync({ runId });
await resumed.resume({ resumeData: { approved: true, approver: 'mgr@acme' } });

The discriminated union returned by run.start() ('success' | 'failed' | 'suspended' | 'tripwire') ensures typed narrowing.
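As an illustration of that narrowing, the sketch below models the four statuses with a simplified, hypothetical `RunResult` type; the field names (`result`, `error`, `suspendPayload`, `reason`) are assumptions for this example, and the real Mastra result types carry more fields.

```typescript
// Simplified, hypothetical result shape used only to demonstrate narrowing.
type RunResult<TOutput, TSuspend> =
  | { status: 'success'; result: TOutput }
  | { status: 'failed'; error: Error }
  | { status: 'suspended'; suspendPayload: TSuspend }
  | { status: 'tripwire'; reason: string };

type PaymentOutput = { approved: boolean; message: string };
type PaymentSuspend = { reason: string; amount: number };

function describeRun(run: RunResult<PaymentOutput, PaymentSuspend>): string {
  switch (run.status) {
    case 'success':
      return `done: ${run.result.message}`; // run.result is only visible here
    case 'failed':
      return `failed: ${run.error.message}`;
    case 'suspended':
      return `waiting: ${run.suspendPayload.reason} (${run.suspendPayload.amount})`;
    case 'tripwire':
      return `blocked: ${run.reason}`;
    default: {
      // Compile-time exhaustiveness check: adding a new status breaks the build here.
      const unreachable: never = run;
      return unreachable;
    }
  }
}

const suspendedText = describeRun({
  status: 'suspended',
  suspendPayload: { reason: 'Human approval required', amount: 5000 },
});
```

The `never` assignment in the default branch is what turns a forgotten status into a compile error instead of a silent fall-through.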

1.5 Multi-agent orchestration

Three approaches available:

Agent-as-tool (static supervisor): sub-agent wrapped in createTool(). Deterministic coordination, predictable flow.
agent.network() (dynamic routing): an Agent with agents, workflows, and tools registered; the LLM decides which primitive to call. Requires memory (persists task history and detects completion). Supports suspension with agent-execution-approval / tool-execution-approval.
Multi-agent workflows: steps invoking mastra.getAgent(...).

Important deprecation (2026): the AgentNetwork class was deprecated. Use agent.network() or explicit supervisor.

export const routingAgent = new Agent({
  id: 'routing-agent',
  instructions: 'Network of researchers and writers.',
  model: 'openai/gpt-5.4',
  agents: { researchAgent, writingAgent },
  workflows: { cityWorkflow },
  tools: { weatherTool },
  memory: new Memory({ storage: new LibSQLStore({ url: 'file:./mastra.db' }) }),
});

const result = await routingAgent.network('Weather in Tokyo and a suggested activity.');
for await (const chunk of result) {
  if (chunk.type === 'network-execution-event-step-finish') console.log(chunk.payload.result);
}

1.6 LLMs via Vercel AI SDK

Mastra v1 delegates model routing to the Vercel AI SDK (v1/v2/v3 compatible). There are four ways to specify models:

// (a) Model Router string (recommended)
model: 'openai/gpt-5.4'
model: 'anthropic/claude-4-5-sonnet'
model: 'google/gemini-2.5-pro'

// (b) Direct SDK instance (when you need strict typing)
import { openai } from '@ai-sdk/openai';
model: openai('gpt-4o')

// (c) Automatic cross-provider fallbacks
model: [
  { model: 'openai/gpt-5', maxRetries: 3 },
  { model: 'anthropic/claude-4-5-sonnet', maxRetries: 2 },
  { model: 'google/gemini-2.5-pro', maxRetries: 2 },
]

// (d) Dynamic, per request
model: ({ requestContext }) =>
  requestContext.get('task') === 'complex' ? 'anthropic/claude-4-5-sonnet' : 'openai/gpt-5-mini'

1.7 MCP (Model Context Protocol)

@mastra/mcp implements both client and server. Supported transports: stdio, SSE, and Streamable HTTP.

// Consume external MCP servers
import { Agent } from '@mastra/core/agent';
import { MCPClient } from '@mastra/mcp';

export const mcp = new MCPClient({
  id: 'main-mcp',
  servers: {
    filesystem: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] },
    github: {
      url: new URL('https://api.githubcopilot.com/mcp/'),
      requestInit: { headers: { Authorization: `Bearer ${process.env.GH_PAT}` } },
    },
  },
});

export const researchAgent = new Agent({
  id: 'research',
  model: 'openai/gpt-4o',
  tools: await mcp.listTools(), // static
  // or dynamic, per request: const toolsets = await mcp.listToolsets();
});

// Expose Mastra as an MCP server
import { MCPServer } from '@mastra/mcp';

const server = new MCPServer({
  id: 'my-mcp-server',
  name: 'My MCP Server',
  version: '1.0.0',
  description: 'Exposes Mastra via MCP.',
  tools: { weatherTool },
  agents: { weatherAgent },     // generates the tool ask_weatherAgent
  workflows: { cityWorkflow },  // generates the tool run_cityWorkflow
});
server.startStdio();

2. Integration with Next.js App Router

2.1 Monorepo vs separate service

Criterion	Monorepo (embedded Mastra)	Separate service (`mastra dev` + `@mastra/client-js`)
Deploy	Single (`vercel deploy`)	Two domains, CORS, cross-origin auth
Agent↔UI latency	Zero internal network	+1 HTTP hop
AI scale vs SSR	Coupled	Independent
Workflows >5 min	Hard (`maxDuration`)	Natural (VM/container)
Multiple clients (web + mobile)	Frontend-centric	Reusable backend
Vercel Hobby	Viable with caution	Not recommended
MVP/prototype	Recommended	Overkill

2.2 Directory structure (monorepo)

TYPESCRIPT

my-nextjs-agent/
├── src/
│   ├── app/
│   │   ├── api/chat/route.ts        # Route Handler streaming
│   │   ├── chat/page.tsx            # UI client
│   │   ├── actions/weather.ts       # Server Actions
│   │   └── layout.tsx
│   ├── mastra/
│   │   ├── index.ts                 # new Mastra({...})
│   │   ├── agents/weather-agent.ts
│   │   ├── tools/weather-tool.ts
│   │   ├── workflows/
│   │   └── memory.ts
│   └── lib/schemas.ts               # shared Zod schemas
├── next.config.ts                   # serverExternalPackages: ['@mastra/*']
└── .env.local                       # OPENAI_API_KEY, DATABASE_URL

Required configuration:

// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  // keeps the bundler from packaging native binaries
  serverExternalPackages: ['@mastra/*'],
};

export default nextConfig;

Known gotcha (vercel/next.js#74816): in some versions serverExternalPackages works in dev but fails in build. Fallback via Webpack: config.externals.push('@mastra/core', '@mastra/libsql').
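The Webpack fallback mentioned above could look roughly like the sketch below. `withMastraExternals` and the two structural types are hypothetical names introduced here so the function compiles without importing webpack or next; only the `config.externals.push(...)` idea and the two package names come from the text.

```typescript
// Minimal structural types so the sketch compiles on its own (assumptions, not Next.js types).
type WebpackConfigLike = { externals?: unknown };
type WebpackContextLike = { isServer: boolean };

const MASTRA_EXTERNALS = ['@mastra/core', '@mastra/libsql'];

// Intended to be passed as the `webpack` option in next.config.ts.
function withMastraExternals(
  config: WebpackConfigLike,
  { isServer }: WebpackContextLike,
): WebpackConfigLike {
  if (isServer) {
    // Preserve whatever externals already exist, whether or not they are an array.
    const current = Array.isArray(config.externals)
      ? config.externals
      : config.externals === undefined
        ? []
        : [config.externals];
    config.externals = [...current, ...MASTRA_EXTERNALS];
  }
  return config; // the webpack hook must return the config object
}

const serverConfig = withMastraExternals({ externals: ['pg'] }, { isServer: true });
const clientConfig = withMastraExternals({}, { isServer: false });
```

In a real project this would be wired as `webpack: withMastraExternals` inside the exported Next.js config; the client bundle is left untouched because the externals only matter on the server.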

2.3 Server Actions invoking agents

Ideal for non-streaming synchronous operations (form submit, single generation). Keeps API keys on the server, integrates with Next.js cache/revalidation.

// src/app/actions/weather.ts
'use server';

import { z } from 'zod';
import { mastra } from '@/mastra';
import { revalidatePath } from 'next/cache';

const WeatherInput = z.object({
  city: z.string().min(1).max(100),
  units: z.enum(['metric', 'imperial']).default('metric'),
});

export type WeatherState =
  | { status: 'idle' }
  | { status: 'success'; text: string; toolCalls: unknown[] }
  | { status: 'error'; message: string; fieldErrors?: Record<string, string[]> };

export async function getWeather(_prev: WeatherState, formData: FormData): Promise<WeatherState> {
  // Validate untrusted form input before it reaches the agent
  const parsed = WeatherInput.safeParse({
    city: formData.get('city'),
    units: formData.get('units') ?? 'metric',
  });
  if (!parsed.success) {
    return {
      status: 'error',
      message: 'Invalid input',
      fieldErrors: parsed.error.flatten().fieldErrors,
    };
  }
  try {
    const result = await mastra.getAgent('weatherAgent').generate(
      `Weather in ${parsed.data.city}? Units: ${parsed.data.units}.`,
      { memory: { thread: 'weather-thread', resource: 'public' } },
    );
    revalidatePath('/weather');
    return { status: 'success', text: result.text, toolCalls: result.toolCalls ?? [] };
  } catch (err) {
    console.error('[getWeather]', err);
    return { status: 'error', message: 'Failed to query the agent.' };
  }
}

Client consumption with useActionState:

TSX

'use client';

import { useActionState } from 'react';
import { getWeather, type WeatherState } from '@/app/actions/weather';

export default function WeatherPage() {
  const [state, formAction, pending] = useActionState(getWeather, { status: 'idle' } as WeatherState);
  return (
    <form action={formAction}>
      <input name="city" required />
      <select name="units">
        <option value="metric">°C</option>
        <option value="imperial">°F</option>
      </select>
      <button disabled={pending}>{pending ? 'Loading...' : 'Get weather'}</button>
      {state.status === 'success' && <pre>{state.text}</pre>}
      {state.status === 'error' && <p>{state.message}</p>}
    </form>
  );
}

Important limitations: Server Actions don't stream — the client waits for the full response. Subject to maxDuration from the platform. For streaming, use Route Handler + useChat.

2.4 Route Handlers with streaming

Modern Mastra 1.0 pattern: @mastra/ai-sdk + handleChatStream().

// src/app/api/chat/route.ts
import { handleChatStream } from '@mastra/ai-sdk';
import { toAISdkV5Messages } from '@mastra/ai-sdk/ui';
import { createUIMessageStreamResponse } from 'ai';
import { NextResponse } from 'next/server';
import { mastra } from '@/mastra';

export const maxDuration = 60;   // default is 300 with Fluid; up to 800 on Pro
export const runtime = 'nodejs'; // REQUIRED: Mastra does not support Edge

export async function POST(req: Request) {
  const params = await req.json();
  const stream = await handleChatStream({
    mastra,
    agentId: 'weatherAgent',
    params: {
      ...params,
      memory: { thread: params.threadId ?? 'default', resource: params.resourceId ?? 'anon' },
    },
  });
  return createUIMessageStreamResponse({ stream });
}

// Hydrates the chat history on mount
export async function GET() {
  const memory = await mastra.getAgent('weatherAgent').getMemory();
  const res = await memory?.recall({ threadId: 'default', resourceId: 'anon' });
  return NextResponse.json(toAISdkV5Messages(res?.messages ?? []));
}

Low-level alternative (full control):

export async function POST(req: Request) {
  const { messages } = await req.json();
  const stream = await mastra.getAgent('weatherAgent').stream(messages, {
    format: 'aisdk', // AI SDK v5 compat
    memory: { thread: 'demo', resource: 'user-1' },
    abortSignal: req.signal, // propagates cancellation down to the LLM call
  });
  // or .toDataStreamResponse() (v4), .toTextStreamResponse()
  return stream.toUIMessageStreamResponse();
}

Separate service (Next.js proxy to standalone Mastra on :4111):

import { MastraClient } from '@mastra/client-js';

const client = new MastraClient({
  baseUrl: process.env.MASTRA_API_URL ?? 'http://localhost:4111',
  retries: 3,
  backoffMs: 300,
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const response = await client.getAgent('weatherAgent').stream({
    messages,
    threadId: 'demo',
    resourceId: 'user-1',
  });
  // Pass the upstream body through and disable proxy buffering
  return new Response(response.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'X-Accel-Buffering': 'no',
    },
  });
}

2.5 Chat UI with useChat and AI Elements

TSX

'use client';

import { useEffect, useState } from 'react';
import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport, type ToolUIPart } from 'ai';

// Skeleton, ToolCard and ErrorCard are app-specific components defined elsewhere.
export default function ChatPage() {
  const [input, setInput] = useState('');
  const { messages, setMessages, sendMessage, stop, status } = useChat({
    transport: new DefaultChatTransport({
      api: '/api/chat',
      // With Mastra Memory: send ONLY the last message plus the identifiers
      prepareSendMessagesRequest({ messages, body }) {
        return {
          body: {
            ...body,
            messages: [messages[messages.length - 1]],
            threadId: 'default-thread',
            resourceId: 'user-1',
          },
        };
      },
    }),
  });

  // Load the persisted history once on mount
  useEffect(() => {
    fetch('/api/chat').then(r => r.json()).then(setMessages).catch(() => {});
  }, [setMessages]);

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.parts?.map((part, i) => {
            if (part.type === 'text') return <p key={i}>{part.text}</p>;
            if (part.type === 'reasoning') {
              return <details key={i}><summary>Thinking</summary>{part.text}</details>;
            }
            if (part.type?.startsWith('tool-')) {
              const p = part as ToolUIPart;
              // States: 'input-streaming' → 'input-available' → 'output-available' | 'output-error'
              switch (p.state) {
                case 'input-available':  return <Skeleton key={i} />;
                case 'output-available': return <ToolCard key={i} output={p.output} />;
                case 'output-error':     return <ErrorCard key={i} text={p.errorText} />;
              }
            }
            return null;
          })}
        </div>
      ))}
      <input value={input} onChange={e => setInput(e.target.value)} />
      <button onClick={() => { sendMessage({ text: input }); setInput(''); }}>Send</button>
      {status === 'streaming' && <button onClick={stop}>⏹ Stop</button>}
    </div>
  );
}

2.6 Deploy: limits, storages and runtimes

Vercel maxDuration (Apr/2026):

Plan	Default	Max. with Fluid	Max. without Fluid
Hobby	300s	300s	60s
Pro	300s	800s	300s
Enterprise	300s	900s	900s

Fluid Compute (enabled by default since Apr/2025) allows concurrency on the same instance, active CPU pricing, and streams continue past 300s if the first byte is sent within ~25s.

Critical storage in serverless: LibSQLStore with file:./mastra.db DOES NOT work on ephemeral FS (Vercel/Lambda). Use:

// Turso (remote LibSQL)
new LibSQLStore({ url: process.env.TURSO_URL!, authToken: process.env.TURSO_TOKEN })

// Postgres (Neon, Supabase, Vercel Postgres)
new PostgresStore({ connectionString: process.env.DATABASE_URL! })

// Upstash Redis
new UpstashStore({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! })
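One way to catch the ephemeral-filesystem mistake early is a small startup guard like the sketch below. `isServerless` and `resolveStorageUrl` are hypothetical helpers, not part of Mastra; the sketch assumes the `VERCEL` variable that Vercel sets and the `AWS_LAMBDA_FUNCTION_NAME` variable that AWS Lambda sets, and reuses the `TURSO_URL` name from the snippet above.

```typescript
// Hypothetical guard: fail fast when a file-backed LibSQL URL is used on an ephemeral filesystem.
type EnvLike = Record<string, string | undefined>;

function isServerless(env: EnvLike): boolean {
  // VERCEL is set on Vercel; AWS_LAMBDA_FUNCTION_NAME is set inside AWS Lambda.
  return env.VERCEL === '1' || env.AWS_LAMBDA_FUNCTION_NAME !== undefined;
}

function resolveStorageUrl(env: EnvLike): string {
  // Fall back to a local file only when no remote URL is configured.
  const url = env.TURSO_URL ?? 'file:./mastra.db';
  if (url.startsWith('file:') && isServerless(env)) {
    throw new Error(`Storage URL "${url}" points at the local filesystem, which is ephemeral here.`);
  }
  return url;
}

const localUrl = resolveStorageUrl({});
const remoteUrl = resolveStorageUrl({ VERCEL: '1', TURSO_URL: 'libsql://db.example.turso.io' });
```

Calling `resolveStorageUrl(process.env)` once at module load turns a silent data-loss scenario into an immediate deployment failure.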

Runtime: always export const runtime = 'nodejs' on routes that import Mastra. Edge runtime fails due to native dependencies (libsql, better-sqlite3, fs/crypto bindings).

VPS/Docker (mastra build):

BASH

npx mastra build --dir src/mastra    # generates .mastra/output/ (Hono bundle)
node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs

DOCKERFILE

FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY src ./src
COPY tsconfig.json ./
RUN npx mastra build

FROM node:22-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && adduser -S mastra -u 1001
COPY --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output
USER mastra
EXPOSE 4111
HEALTHCHECK --interval=30s CMD wget -qO- http://localhost:4111/api/health || exit 1
CMD ["node", "--import=./.mastra/output/instrumentation.mjs", ".mastra/output/index.mjs"]

VercelDeployer publishes Mastra standalone as a Vercel function (no Next in front):

import { Mastra } from '@mastra/core';
import { VercelDeployer } from '@mastra/deployer-vercel';

export const mastra = new Mastra({
  deployer: new VercelDeployer({
    studio: true,
    maxDuration: 600,
    memory: 1536,
    regions: ['gru1', 'iad1'],
  }),
});

3. End-to-end type-safety with Zod

3.1 Zod as a quadruple contract

A Zod schema fulfills four simultaneous roles:

Role	Mechanism	Moment
Static contract	`z.infer<typeof schema>`	compile-time
Runtime validation	`.parse()` / `.safeParse()`	post-LLM
Specification for the LLM	JSON Schema (via `zodSchema()` from AI SDK)	pre-request
Semantic documentation	`.describe()` read by the model	pre-request

Critical rule: .describe() directly impacts the quality of structured output — it's "prompt engineering via types". Always describe ambiguous fields.

3.2 Idiomatic patterns for LLMs

Use .nullable() instead of .optional() — OpenAI strict mode and GPT-5 reject optional() in structured output (mastra-ai/mastra#7234):

// ❌ Breaks in GPT-5 strict mode
const bad = z.object({ details: z.string().optional() });

// ✅ Correct
const good = z.object({ details: z.string().nullable().describe('null if absent') });

Discriminated unions are the standard for agent actions (ReAct, tool-routing):

export const AgentActionSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('search'), query: z.string() }),
  z.object({ type: z.literal('answer'), text: z.string(), confidence: z.number().min(0).max(1) }),
  z.object({ type: z.literal('escalate'), reason: z.string(), severity: z.enum(['low', 'medium', 'high']) }),
]);
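On the consuming side, the payoff of a discriminated union is exhaustive handling. The sketch below hand-writes the type that `z.infer<typeof AgentActionSchema>` would produce so it runs without Zod; `routeAction`, `assertNever`, and the 0.5 confidence cutoff are illustrative assumptions.

```typescript
// Hand-written equivalent of z.infer<typeof AgentActionSchema>.
type AgentAction =
  | { type: 'search'; query: string }
  | { type: 'answer'; text: string; confidence: number }
  | { type: 'escalate'; reason: string; severity: 'low' | 'medium' | 'high' };

function assertNever(value: never): never {
  throw new Error(`Unhandled action: ${JSON.stringify(value)}`);
}

function routeAction(action: AgentAction): string {
  switch (action.type) {
    case 'search':
      return `search:${action.query}`;
    case 'answer':
      // 0.5 is an arbitrary cutoff chosen for this example
      return action.confidence >= 0.5 ? action.text : 'low-confidence answer withheld';
    case 'escalate':
      return `escalate[${action.severity}]:${action.reason}`;
    default:
      return assertNever(action); // fails to compile if a variant is not handled
  }
}

const routed = routeAction({ type: 'escalate', reason: 'refund dispute', severity: 'high' });
```

Adding a fourth variant to the schema makes `assertNever(action)` a type error until the new case is handled, which keeps routing code in sync with the contract sent to the model.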

3.3 Zod v3 vs v4 — impacts on AI pipelines

Aspect	v3	v4	Impact
Parse strings/arrays	baseline	14×/7× faster (JIT)	Almost free streaming validation
Compile TS	baseline	~10× faster	Monorepos with many schemas
Bundle `core`	baseline	2.3× smaller	Important on edge
`z.record()`	1 arg	2 required args	Breaks migration
`.optional().default()`	default ignored if missing	always returns default	Careful in working memory
Schema creation	fast	17× slower (JIT)	Don't instantiate in hot loops
`.describe()`/`.meta()`	any position	must be last chain call	Doesn't inherit via `.optional()/.extend()`

Mastra ≥ beta.16 normalizes both via Standard Schema; Zod coexists via zod/v3 and zod/v4.

3.4 Typed Tools with `createTool`

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const githubRepoTool = createTool({
  id: 'get-github-repo-info',
  description: 'Fetch basic insights for a public GitHub repository',
  inputSchema: z.object({
    owner: z.string().describe('GitHub username or organization'),
    repo: z.string().describe('Repository name'),
  }),
  outputSchema: z.object({
    stars: z.number(),
    forks: z.number(),
    issues: z.number(),
    license: z.string().nullable(),
    lastPush: z.string(),
    description: z.string().nullable(),
  }),
  execute: async ({ context, runtimeContext }) => {
    // context is inferred as { owner: string; repo: string }
    const res = await fetch(`https://api.github.com/repos/${context.owner}/${context.repo}`);
    if (res.status === 404) throw new Error(`Not found`);
    const d = await res.json();
    // A mismatch with outputSchema is an error at both compile time and runtime
    return {
      stars: d.stargazers_count,
      forks: d.forks_count,
      issues: d.open_issues_count,
      license: d.license?.name ?? null,
      lastPush: d.pushed_at,
      description: d.description,
    };
  },
});

The return value of tool.execute() is a discriminated union that includes an error path; narrow it with if ('error' in result && result.error). Avoid naming a field error in outputSchema, since it collides with the discriminator.
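The narrowing can be sketched in plain TypeScript as follows. `ToolError`, `ToolResult`, and `RepoInfo` are hypothetical shapes for illustration; the actual error envelope returned by Mastra may differ.

```typescript
// Hypothetical result shape: either the validated output or an error envelope.
type ToolError = { error: true; message: string };
type ToolResult<T> = T | ToolError;

type RepoInfo = { stars: number; forks: number };

function summarizeRepo(result: ToolResult<RepoInfo>): string {
  // `'error' in result` narrows to ToolError because RepoInfo has no `error` key.
  if ('error' in result && result.error) {
    return `tool failed: ${result.message}`;
  }
  return `${result.stars} stars, ${result.forks} forks`;
}

const okSummary = summarizeRepo({ stars: 23100, forks: 1200 });
const errorSummary = summarizeRepo({ error: true, message: 'Not found' });
```

The sketch also shows why an `error` field in the output schema is a problem: if `RepoInfo` had its own `error` key, the `in` check could no longer tell the two branches apart.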

Typed RuntimeContext (⚠️ known bug — .get() doesn't infer; use cast):

export type SupportCtx = { 'user-tier': 'free' | 'pro' | 'enterprise'; language: 'en' | 'pt-BR' };

execute: async ({ runtimeContext }) => {
  // .get() does not infer the value type, hence the cast
  const tier = runtimeContext.get('user-tier') as SupportCtx['user-tier'];
  const limit = tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
  ...
}
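Until `.get()` infers properly, one option is to confine the cast to a single wrapper. The sketch below is an assumption-laden illustration: `TypedContext` and `UntypedContext` are hypothetical names, and a plain `Map` stands in for the runtime context object.

```typescript
type SupportCtx = { 'user-tier': 'free' | 'pro' | 'enterprise'; language: 'en' | 'pt-BR' };

// Anything with an untyped get/set, such as a Map, fits this structural interface.
interface UntypedContext {
  get(key: string): unknown;
  set(key: string, value: unknown): unknown;
}

class TypedContext<T extends Record<string, unknown>> {
  private readonly inner: UntypedContext;

  constructor(inner: UntypedContext) {
    this.inner = inner;
  }

  get<K extends keyof T & string>(key: K): T[K] | undefined {
    // The single cast lives here instead of at every call site.
    return this.inner.get(key) as T[K] | undefined;
  }

  set<K extends keyof T & string>(key: K, value: T[K]): void {
    this.inner.set(key, value);
  }
}

function requestLimit(ctx: TypedContext<SupportCtx>): number {
  const tier = ctx.get('user-tier') ?? 'free'; // typed as 'free' | 'pro' | 'enterprise'
  return tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
}

const supportCtx = new TypedContext<SupportCtx>(new Map<string, unknown>());
supportCtx.set('user-tier', 'pro');
const proLimit = requestLimit(supportCtx);
```

The wrapper does not validate anything at runtime; it only moves the unchecked cast to one place, so pairing it with a schema check at the request boundary is still advisable.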

3.5 Structured Outputs

Mastra v1 API:

const result = await agent.generate('Who won 2012?', {
  structuredOutput: {
    schema: ElectionResultSchema,
    errorStrategy: 'fallback', // 'strict' | 'warn' | 'fallback'
    fallbackValue: { winner: 'Unknown', year: 0, party: 'Other' },
    jsonPromptInjection: true, // required for Gemini 2.5 + tools
  },
});
result.object.winner; // string, fully typed

Partial object streaming:

const stream = await agent.stream('...', { structuredOutput: { schema } });
for await (const partial of stream.objectStream) {
  // DeepPartial<T>: fields arrive incrementally
}
const final = await stream.object; // validated T

Pure AI SDK (modern way with Output.object(), since generateObject is deprecated):

import { generateText, Output, tool, stepCountIs } from 'ai';

const { output } = await generateText({
  model: 'openai/gpt-5.2',
  tools: {
    weather: tool({
      inputSchema: z.object({ location: z.string() }),
      execute: async () => ({...}),
    }),
  },
  output: Output.object({ schema: RecipeSchema }),
  stopWhen: stepCountIs(5),
  prompt: '...',
});

Error handling:

import { generateObject, NoObjectGeneratedError } from 'ai';
import { z } from 'zod';

try {
  const { object } = await generateObject({ ... });
  return object;
} catch (err) {
  // The model produced no parseable object
  if (NoObjectGeneratedError.isInstance(err)) console.error('No object', err.text, err.cause);
  // The object was produced but failed schema validation
  if (err instanceof z.ZodError) console.error('Zod failed', err.issues);
  throw err;
}

3.6 Workflows with typed steps

.then(step) only compiles if step.inputSchema is compatible with the previous step's outputSchema, so the compiler enforces the shape of the pipeline.

const scrapeStep = createStep({
  id: 'scrape',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: z.object({ url: z.string().url(), markdown: z.string() }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    markdown: await fetch(inputData.url).then(r => r.text()),
  }),
});

const summarizeStep = createStep({
  id: 'summarize',
  inputSchema: scrapeStep.outputSchema, // reuse the upstream contract
  outputSchema: z.object({
    library: z.string(),
    latestVersion: z.string(),
    breakingChanges: z.array(z.string()),
  }),
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('summarizer').generate(inputData.markdown, {
      structuredOutput: {
        schema: z.object({
          library: z.string(),
          latestVersion: z.string(),
          breakingChanges: z.array(z.string()),
        }),
      },
    });
    return res.object;
  },
});

export const changelogWorkflow = createWorkflow({
  id: 'changelog',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: summarizeStep.outputSchema,
})
  .then(scrapeStep)
  .then(summarizeStep)
  .commit();
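The compile-time chaining rule can be reproduced in miniature with generics. The sketch below is a hypothetical model, not Mastra's builder: `Step`, `Pipeline`, and the two sample steps are invented for illustration, and the rejection described in the comment assumes TypeScript's `strict` mode.

```typescript
// Hypothetical miniature of a typed pipeline; Mastra's real builder is richer.
interface Step<In, Out> {
  id: string;
  execute: (input: In) => Out; // property syntax keeps parameter checks strict
}

class Pipeline<In, Out> {
  private readonly run: (input: In) => Out;

  constructor(run: (input: In) => Out) {
    this.run = run;
  }

  static start<T>(): Pipeline<T, T> {
    return new Pipeline<T, T>((input) => input);
  }

  // `next` must accept what the previous step produced.
  then<Next>(next: Step<Out, Next>): Pipeline<In, Next> {
    return new Pipeline<In, Next>((input) => next.execute(this.run(input)));
  }

  execute(input: In): Out {
    return this.run(input);
  }
}

const scrape: Step<{ url: string }, { url: string; markdown: string }> = {
  id: 'scrape',
  execute: ({ url }) => ({ url, markdown: `# Changelog for ${url}` }),
};

const countChars: Step<{ url: string; markdown: string }, { length: number }> = {
  id: 'count',
  execute: ({ markdown }) => ({ length: markdown.length }),
};

const pipeline = Pipeline.start<{ url: string }>().then(scrape).then(countChars);
// Pipeline.start<{ url: string }>().then(countChars) is rejected under `strict`:
// countChars needs `markdown`, which the pipeline input does not provide.
const pipelineResult = pipeline.execute({ url: 'https://example.com' });
```

Each `.then()` returns a pipeline whose output type is the new step's output, which is the same mechanism that lets a mismatch surface at the call site rather than at runtime.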

Runtime context validation via requestContextSchema:

const workflow = createWorkflow({
  id: 'tiered',
  inputSchema,
  outputSchema,
  requestContextSchema: z.object({
    userTier: z.enum(['free', 'pro', 'enterprise']),
    locale: z.string(),
  }),
});

3.7 Where type-safety lives

Layer	Tool	Compile	Runtime	Sent to LLM
HTTP/Form input	`safeParse` in Server Action	✅	✅	—
Tool input	`createTool({ inputSchema })`	✅	✅	✅
Tool output	`createTool({ outputSchema })`	✅	⚠️ informative	✅
Structured output	`generate({ structuredOutput })`	✅	✅	✅
Workflow step	`createStep({ inputSchema, outputSchema })`	✅	✅	—
Runtime context	`RuntimeContext<T>`	✅ on `set`; ⚠️ on `get`	⚠️ optional	❌
Memory	`workingMemory: { schema }`	✅	✅	✅

4. Design patterns for agents and workflows

4.1 ReAct: implicit vs explicit

Approach	Who decides	Implementation	When to use
Implicit (agent loop)	LLM, via native tool calling	`Agent` with `tools`; Mastra runs loop automatically	Open-ended tasks, unknown N of steps
Explicit (workflow)	Orchestrator code	`createWorkflow` + `.dountil()` calling step with agent	Auditability, SLA, hard limits, HITL in the loop

Recommendation: always start with implicit ReAct. Only migrate to explicit when you need granular trace, step budget, human approval mid-flight, or A/B testing by action type.

// Implicit: the agent loop already implements ReAct
export const researchAgent = new Agent({
  instructions: `Follow ReAct loop: THOUGHT → ACTION (call one tool) → OBSERVATION.
                 Repeat until confident. Never invent facts.`,
  model: 'openai/gpt-4o',
  tools: { searchDocsTool },
});

await researchAgent.generate('What is the difference between suspend() and bail()?', { maxSteps: 8 });

Reusable ReAct schema:

export const ReActStepSchema = z.object({
  thought: z.string().describe('Reasoning about next step'),
  action: z.discriminatedUnion('type', [
    z.object({
      type: z.literal('tool_call'),
      toolName: z.string(),
      args: z.record(z.string(), z.unknown()),
    }),
    z.object({
      type: z.literal('final_answer'),
      answer: z.string(),
      confidence: z.number().min(0).max(1),
    }),
  ]),
});
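For the explicit variant, the core idea is a loop with a hard step budget that the orchestrator controls. The sketch below uses a hand-written type mirroring the ReAct schema and a scripted `Policy` function standing in for the LLM call; `runReAct`, `Policy`, `Tools`, and the sample tool are hypothetical names, not Mastra APIs.

```typescript
// Hand-written equivalent of the inferred ReAct step type.
type ReActStep = {
  thought: string;
  action:
    | { type: 'tool_call'; toolName: string; args: Record<string, unknown> }
    | { type: 'final_answer'; answer: string; confidence: number };
};

type Policy = (observations: string[]) => ReActStep; // stands in for the LLM call
type Tools = Record<string, (args: Record<string, unknown>) => string>;

function runReAct(
  policy: Policy,
  tools: Tools,
  maxSteps: number,
): { answer: string | null; steps: number } {
  const observations: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const { action } = policy(observations);
    if (action.type === 'final_answer') return { answer: action.answer, steps: step };
    const tool = tools[action.toolName];
    // Unknown tools become an observation so the policy can recover.
    observations.push(tool ? tool(action.args) : `unknown tool: ${action.toolName}`);
  }
  return { answer: null, steps: maxSteps }; // budget exhausted without an answer
}

// Scripted policy: search once, then answer with what was observed.
const scriptedPolicy: Policy = (observations) =>
  observations.length === 0
    ? { thought: 'need docs', action: { type: 'tool_call', toolName: 'searchDocs', args: { q: 'suspend' } } }
    : { thought: 'enough', action: { type: 'final_answer', answer: observations[0], confidence: 0.9 } };

const sampleTools: Tools = { searchDocs: (args) => `docs for ${String(args.q)}` };
const reactOutcome = runReAct(scriptedPolicy, sampleTools, 8);
```

Because the budget and every observation live in orchestrator code, each iteration is a natural point for tracing, approval, or cost accounting.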

4.2 Plan-and-Execute

Separates the Planner (a powerful LLM, e.g., Claude Sonnet/GPT-5), which produces the plan upfront, from the Executor (cheap LLMs specialized in tool use). Reported to use ~30% fewer tokens than ReAct on complex multi-step tasks (LangChain 2026 benchmark).

const plannerAgent = new Agent({
  id: 'planner',
  instructions: 'Break goal into 3-7 concrete, tool-executable steps.',
  model: 'anthropic/claude-sonnet-4',
});

const executorAgent = new Agent({
  id: 'executor',
  instructions: 'Execute a single plan step. Use tools. Return terse result.',
  model: 'openai/gpt-4o-mini',
  tools: { /* ... */ },
});

const planStep = createStep({
  id: 'plan',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({
    goal: z.string(),
    plan: z.array(z.object({
      id: z.string(),
      description: z.string(),
      dependsOn: z.array(z.string()).default([]),
    })),
  }),
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('plannerAgent').generate(`Goal: ${inputData.goal}`, {
      structuredOutput: { schema: z.object({ plan: z.array(/* ... */) }) },
    });
    return { goal: inputData.goal, plan: res.object.plan };
  },
});

// executeStep (defined elsewhere) runs one plan item through the executor agent.
export const planAndExecuteWorkflow = createWorkflow({
  id: 'plan-exec',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({ results: z.array(z.any()) }),
})
  .then(planStep)
  .map(async ({ inputData }) => inputData.plan)
  .foreach(executeStep, { concurrency: 3 })
  .commit();
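The plan schema carries a `dependsOn` field, while the workflow above fans the steps out without consulting it. One way to honor dependencies is to group the plan into waves before executing, as in this sketch; `toWaves` and `PlanStep` are hypothetical helpers, not part of Mastra.

```typescript
type PlanStep = { id: string; description: string; dependsOn: string[] };

// Groups steps into waves: every step in a wave depends only on steps from earlier waves.
function toWaves(plan: PlanStep[]): PlanStep[][] {
  const done = new Set<string>();
  const waves: PlanStep[][] = [];
  let remaining = [...plan];
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error('Plan has a cycle or an unknown dependency');
    ready.forEach((s) => done.add(s.id));
    waves.push(ready);
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return waves;
}

const samplePlan: PlanStep[] = [
  { id: 'a', description: 'fetch changelog', dependsOn: [] },
  { id: 'b', description: 'fetch issues', dependsOn: [] },
  { id: 'c', description: 'correlate', dependsOn: ['a', 'b'] },
  { id: 'd', description: 'write summary', dependsOn: ['c'] },
];

const waveIds = toWaves(samplePlan).map((wave) => wave.map((s) => s.id));
```

Each wave can then be executed concurrently, with waves processed in order, so independent steps still run in parallel while dependent ones wait for their inputs.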

4.3 Orchestration patterns matrix

Pattern	Mastra API	When to use
Pipeline	`.then()`	Fixed steps, linear dependency (ETL, content pipeline)
Fan-out/Fan-in	`.parallel([])`	N fixed independent tasks; output is object with keys = step ids
MapReduce	`.foreach(step, {concurrency})`	N dynamic; process list
Router/Branch	`.branch([[cond, step]])`	Routing by classification; all branches share schemas
Static supervisor	Agent with sub-agents as tools	Deterministic coordination
Dynamic supervisor	`agent.network()`	LLM decides which primitive to call
Evaluator-Optimizer	`.dowhile()` / `.dountil()` + scorer	Convergent iterative refinement
Human-in-the-loop	`suspend()` / `resume()` / `bail()`	Approvals, payment >$X, irreversible actions
Handoff	workflow + agents with shared memory	Specialist takes control
Council	`.parallel()` + synthesis step	Multiple opinions for synthesis

Concrete Evaluator-Optimizer:

// Create the scorer once, outside the loop body.
const scorer = createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' });

workflow
  .then(generateDraft)
  .dowhile(
    createStep({
      id: 'eval-refine',
      execute: async ({ inputData, state }) => {
        const { score } = await scorer.run({ input: state.prompt, output: inputData.draft });
        // Good enough: mark as done so the loop condition stops iterating.
        if (score >= 0.85) return { ...inputData, done: true, score };
        const refined = await refinerAgent.generate(/* with feedback */);
        return { ...inputData, draft: refined.text, done: false, score };
      },
    }),
    // Keep looping while the draft has not reached the threshold.
    async ({ inputData }) => !inputData.done,
  )
  .commit();
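The loop above stops only when the score reaches the threshold, so a draft that never converges would iterate indefinitely. A minimal, framework-free sketch of the same evaluator-optimizer loop with an iteration cap follows. The refineUntil function, its parameters, and the toy scorer and refiner are hypothetical stand-ins for the scorer and refiner agent.

```typescript
interface RefineResult {
  draft: string;
  score: number;
  iterations: number;
  converged: boolean;
}

// Hypothetical loop: `score` stands in for the scorer, `refine` for the refiner agent.
function refineUntil(
  initialDraft: string,
  score: (draft: string) => number,
  refine: (draft: string, lastScore: number) => string,
  threshold = 0.85,
  maxIterations = 5,
): RefineResult {
  let draft = initialDraft;
  let current = score(draft);
  let iterations = 0;
  // Stop on convergence OR when the iteration budget is spent.
  while (current < threshold && iterations < maxIterations) {
    draft = refine(draft, current);
    current = score(draft);
    iterations += 1;
  }
  return { draft, score: current, iterations, converged: current >= threshold };
}

// Toy stand-ins: longer drafts score higher; refining appends a sentence.
const toyScore = (d: string): number => Math.min(1, d.length / 40);
const toyRefine = (d: string): string => `${d} More detail.`;

const outcome = refineUntil('Short draft.', toyScore, toyRefine);
console.log(outcome.iterations, outcome.converged); // 2 true
```

In a workflow, the same cap could be expressed by tracking an iteration counter in the step output and including it in the loop condition.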

4.4 External tools integration

REST API:

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const githubIssueTool = createTool({
  id: 'github-create-issue',
  inputSchema: z.object({
    repo: z.string().regex(/^[\w-]+\/[\w-]+$/),
    title: z.string().min(1).max(256),
    body: z.string().max(65_536).optional(),
    labels: z.array(z.string()).max(100).default([]),
  }),
  outputSchema: z.object({ number: z.number(), url: z.string().url() }),
  execute: async ({ context, tracingContext }) => {
    // Child span so the HTTP call shows up in the trace.
    const span = tracingContext?.currentSpan?.startSpan('github.api.call');
    try {
      const res = await fetch(`https://api.github.com/repos/${context.repo}/issues`, {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ title: context.title, body: context.body, labels: context.labels }),
        signal: AbortSignal.timeout(10_000),   // abort the call after 10s
      });
      if (!res.ok) throw new Error(`GitHub ${res.status}: ${await res.text()}`);
      const data = await res.json();
      return { number: data.number, url: data.html_url };
    } finally {
      span?.end();   // always close the span, even on failure
    }
  },
});

Database (Drizzle):

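// The Drizzle driver entry point and the schema path below are assumptions; adjust to your setup.
import { drizzle } from 'drizzle-orm/node-postgres';
import { eq } from 'drizzle-orm';
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { users } from '@/db/schema';

// Create the client once, not on every tool call.
const db = drizzle(process.env.DATABASE_URL!);

export const getUserTool = createTool({
  id: 'get-user',
  inputSchema: z.object({ userId: z.string().uuid() }),
  outputSchema: z.object({
    id: z.string(),
    email: z.string(),
    plan: z.enum(['free','pro','enterprise']),
  }).nullable(),
  execute: async ({ context }) => {
    const [row] = await db.select().from(users).where(eq(users.id, context.userId)).limit(1);
    return row ?? null;   // null when the user does not exist
  },
});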

Error handling in tools:

Strategy	When	Example
Throw	Unrecoverable error	Auth failure, timeout after retries
Structured return	LLM must react/retry	`{ success: false, error: { code, message } }` in union
Internal retry	Transient failure	`p-retry` inside the `execute`
Circuit breaker	Unstable API	`opossum`, opens after N failures
Timeout	Prevent stuck agent	`AbortSignal.timeout(ms)`

// "Structured return" pattern: best for the ReAct loop
outputSchema: z.union([
  z.object({ success: z.literal(true),  data: z.object({ /* ... */ }) }),
  z.object({ success: z.literal(false), error: z.object({ code: z.string(), message: z.string() }) }),
]),
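To make the circuit-breaker row of the table concrete, here is a dependency-free sketch of the underlying state machine, combined with the structured-return shape. It is not a replacement for opossum: the CircuitBreaker class, the guarded helper, and the error codes are hypothetical names chosen for illustration.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = 'closed';
  private readonly maxFailures: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, resetAfterMs: number, now: () => number = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  // True when a call may go through; flips open -> half-open after the cool-down.
  canRequest(): boolean {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = 'half-open';
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call in half-open state re-opens the breaker immediately.
    if (this.state === 'half-open' || this.failures >= this.maxFailures) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}

// Same shape as the structured-return union, so the LLM can react instead of crashing.
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: { code: string; message: string } };

async function guarded<T>(breaker: CircuitBreaker, fn: () => Promise<T>): Promise<ToolResult<T>> {
  if (!breaker.canRequest()) {
    return { success: false, error: { code: 'CIRCUIT_OPEN', message: 'Upstream API temporarily disabled' } };
  }
  try {
    const data = await fn();
    breaker.recordSuccess();
    return { success: true, data };
  } catch (err) {
    breaker.recordFailure();
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: { code: 'UPSTREAM_ERROR', message } };
  }
}
```

Inside a tool's execute, the fetch call would be passed to guarded(...) with a breaker instance shared at module level, so that failures accumulate across invocations.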

4.5 Observability

Structured logging (PinoLogger):

import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';

export const mastra = new Mastra({
  logger: new PinoLogger({
    name: 'Mastra',
    level: process.env.LOG_LEVEL ?? 'info',
    // getCurrentTraceId() is assumed to be an application helper defined elsewhere.
    mixin() {
      return { traceId: getCurrentTraceId(), service: 'ai-api', env: process.env.NODE_ENV };
    },
  }),
});

Native AI Tracing with OTel + multiple exporters:

import { Mastra } from '@mastra/core';
import { DefaultExporter } from '@mastra/observability';
import { LangfuseExporter } from '@mastra/langfuse';

export const mastra = new Mastra({
  observability: {
    default: { enabled: true },
    configs: {
      langfuse: {
        serviceName: 'prod-agents',
        exporters: [new LangfuseExporter({
          publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
          secretKey: process.env.LANGFUSE_SECRET_KEY!,
          baseUrl: process.env.LANGFUSE_BASE_URL,
        })],
        sampling: { type: 'ratio', probability: 0.1 },   // 10% in prod
      },
      debug: { exporters: [new DefaultExporter()], sampling: { type: 'always' } },
    },
    // Pick the config per request: full tracing for support sessions, sampled otherwise.
    configSelector: (ctx) => ctx.runtimeContext?.get('supportMode') ? 'debug' : 'langfuse',
  },
});
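To illustrate what a ratio sampling decision can look like, the sketch below derives the decision from a hash of the trace id, so the same trace is either fully kept or fully dropped. This is an assumption-laden illustration, not Mastra's actual sampler: fnv1a and shouldSample are hypothetical helpers.

```typescript
// FNV-1a 32-bit hash: stable across processes, so every service that sees
// the same trace id reaches the same keep/drop decision.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Hypothetical ratio sampler: maps the hash to [0, 1) and compares it to the probability.
function shouldSample(traceId: string, probability: number): boolean {
  if (probability <= 0) return false;
  if (probability >= 1) return true;
  return fnv1a(traceId) / 0x100000000 < probability;
}

console.log(shouldSample('trace-42', 0.1) === shouldSample('trace-42', 0.1)); // true
```

A useful property of this approach is monotonicity: any trace kept at 10% is also kept at a higher ratio, which keeps staging and production samples comparable.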

Platform comparison:

Platform	Strong in	Weak in	When to choose
Langfuse	LLM-native (prompts, cost, evals). Self-host.	Generic infra tracing	Prompt engineering, cost per feature, evals
Braintrust	Production evals, A/B side-by-side	Less rich tracing	Teams focused on regression testing
LangSmith	LangChain integration, datasets	Vendor lock-in	Stack already LangChain/LangGraph
SigNoz/Datadog (OTel)	Full-stack APM	Not LLM-first	Unified APM (not just AI)
Mastra Studio + DuckDB	Built-in, zero setup, cost/latency	Local/single-node	Local dev, small teams

Evals / Scorers (Mastra 2026 — replaces legacy evals):

Scorers run asynchronously after the response, through a four-stage pipeline: preprocess → analyze → generateScore → generateReason. Built-in: answer-relevancy, answer-similarity, faithfulness, hallucination, completeness, tool-call-accuracy, trajectory-accuracy, bias, toxicity, prompt-alignment.

import { Agent } from '@mastra/core/agent';
import {
  createAnswerRelevancyScorer,
  createToxicityScorer,
  createHallucinationScorer,
} from '@mastra/evals/scorers/llm';

export const supportAgent = new Agent({
  id: 'support',
  model: 'openai/gpt-4o',
  scorers: {
    // Quality signal: sample 20% of responses.
    relevancy:     { scorer: createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' }), sampling: { type: 'ratio', rate: 0.2 } },
    // Safety signals: score every response.
    hallucination: { scorer: createHallucinationScorer({ model: 'openai/gpt-4o-mini' }),   sampling: { type: 'ratio', rate: 1.0 } },
    toxicity:      { scorer: createToxicityScorer({ model: 'openai/gpt-4o-mini' }),        sampling: { type: 'ratio', rate: 1.0 } },
  },
});
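The four scorer stages can be pictured with a rule-based example that needs no LLM. The sketch below is framework-free and only mirrors the stage names; the ScorerRun and ScoreReport types and the keyword-coverage logic are hypothetical, chosen to keep the example deterministic.

```typescript
interface ScorerRun { input: string; output: string }
interface ScoreReport { score: number; reason: string }

// Stage 1, preprocess: normalize the raw output.
const preprocess = (run: ScorerRun): string => run.output.toLowerCase();

// Stage 2, analyze: find which expected keywords appear in the text.
const analyze = (text: string, keywords: string[]): string[] =>
  keywords.filter((k) => text.includes(k.toLowerCase()));

// Stage 3, generateScore: fraction of keywords covered, in [0, 1].
const generateScore = (hits: string[], keywords: string[]): number =>
  keywords.length === 0 ? 1 : hits.length / keywords.length;

// Stage 4, generateReason: human-readable explanation of the score.
const generateReason = (hits: string[], keywords: string[]): string => {
  const missing = keywords.filter((k) => !hits.includes(k));
  return missing.length === 0
    ? 'All expected keywords are present.'
    : `Missing: ${missing.join(', ')}`;
};

// Hypothetical scorer wiring the four stages together.
function scoreKeywordCoverage(run: ScorerRun, keywords: string[]): ScoreReport {
  const text = preprocess(run);
  const hits = analyze(text, keywords);
  return { score: generateScore(hits, keywords), reason: generateReason(hits, keywords) };
}

const report = scoreKeywordCoverage(
  { input: 'How to cancel?', output: 'You can cancel from the Billing page.' },
  ['cancel', 'refund'],
);
console.log(report); // { score: 0.5, reason: 'Missing: refund' }
```

Keeping the stages as separate functions makes each one testable on its own, which matters once the analyze stage is swapped for an LLM judge.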

CI/CD with Vitest:

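import { describe, it, expect } from 'vitest';
import { runEvals } from '@mastra/core/evals';
// supportAgent, relevancyScorer and hasSourcesScorer are assumed to be defined elsewhere.

describe('Support Agent', () => {
  it('meets quality thresholds', async () => {
    const result = await runEvals({
      target: supportAgent,
      data: [{ input: 'How to cancel?', groundTruth: 'cancellation policy' }],
      scorers: [relevancyScorer, hasSourcesScorer],
      concurrency: 3,
    });
    // Fail the build when average relevancy drops below the threshold.
    expect(result.scores['answer-relevancy']).toBeGreaterThanOrEqual(0.8);
  });
});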

5. Consolidated architectural blueprint

TL;DR for architects:

Start simple. Single agent + tools. Workflow/multi-agent only when steps are known or context becomes unmanageable.
Workflows for auditability/SLO; agent.network() for flexibility. Workflows = code dictates flow. Networks = LLM dictates flow.
Zod everywhere. Every tool inputSchema/outputSchema, every step, every scorer. It's your only defense against hallucination in tool args.
Persistence from day 1. Postgres (prod) or LibSQL (dev). Without it, suspend/resume doesn't work and you lose traces on restart.
serverExternalPackages: ['@mastra/*'] + runtime = 'nodejs'. Non-negotiable in Next.js.
Route Handlers for streaming; Server Actions for sync. Don't try to stream via Server Action.
Vercel Fluid Compute + remote Postgres/Turso for serverless production. Never local LibSQL.
Observability by environment via configSelector: dev→Default, staging→10% Langfuse, prod→1% + Datadog.
Scorers with low sampling in prod (5-20%), 100% in toxicity/safety.
MCP before reimplementing tools. GitHub, Slack, Notion, filesystem, Playwright already have official servers.
HITL via suspend() whenever cost/irreversibility > convenience. Payment >$X, deletion, bulk send.
Prefer agent.network() or explicit supervisor over AgentNetwork class (deprecated).
Plan-and-Execute > ReAct for tasks >5 steps. Powerful planner + cheap executors saves 20-30% tokens.

Known critical pitfalls

.optional() breaks OpenAI/GPT-5 strict mode → use .nullable() with .describe() (mastra-ai/mastra#7234).
Gemini 2.5 + tools + structured output → always jsonPromptInjection: true.
z.record() in Zod v4 needs 2 required args.
Field named error in outputSchema breaks tool.execute() narrowing.
RuntimeContext.get() doesn't infer — manual cast needed.
.describe()/.meta() must be last chain call (doesn't inherit via .optional()/.extend()).
tool() helper from AI SDK is mandatory for inference; createTool from Mastra doesn't suffer from this.
generateObject deprecated → migrate to generateText({ output: Output.object(...) }).
Zod v4 creates schemas 17× slower (JIT) — never instantiate in hot render/loop.
toDataStreamResponse() + output: zodSchema conflicts (mastra-ai/mastra#5544) — use experimental_output.
serverExternalPackages has build issue (vercel/next.js#74816) — keep Webpack fallback.
AgentNetwork class → deprecated; use agent.network().
legacy_workflows → replaced by createWorkflow/createStep.

Reference stack for production

TYPESCRIPT

┌──────────────────────────────────────────────────────────────┐
│  Next.js App Router (Node runtime)                           │
│  ├─ Route Handlers (streaming, useChat)                      │
│  └─ Server Actions (synchronous, forms)                      │
├──────────────────────────────────────────────────────────────┤
│  Mastra (embedded or standalone)                             │
│  ├─ Agents (implicit ReAct) + agent.network()                │
│  ├─ Workflows (.then/.parallel/.branch/.foreach/.dountil)    │
│  ├─ Tools (createTool + Zod) + MCP (client + server)         │
│  └─ Memory (threads/resources, semanticRecall, workingMemory)│
├──────────────────────────────────────────────────────────────┤
│  Storage: MastraCompositeStore                               │
│  ├─ memory: LibSQL (dev) / Postgres (prod)                   │
│  ├─ workflows: Postgres (persistent snapshots)               │
│  ├─ scores: Postgres                                         │
│  └─ vectors: pgvector / Pinecone / Upstash                   │
├──────────────────────────────────────────────────────────────┤
│  LLM: Vercel AI SDK v5/v6                                    │
│  ├─ openai/gpt-5.x, anthropic/claude-4-5-sonnet,             │
│  │  google/gemini-2.5-pro                                    │
│  └─ Automatic cross-provider fallbacks                       │
├──────────────────────────────────────────────────────────────┤
│  Observability                                               │
│  ├─ PinoLogger (structured)                                  │
│  ├─ OTel tracing → Langfuse/Braintrust/SigNoz                │
│  └─ Scorers async (relevancy, hallucination, toxicity)       │
├──────────────────────────────────────────────────────────────┤
│  Deploy                                                      │
│  ├─ Vercel (Fluid Compute, maxDuration: 800s Pro)            │
│  ├─ VPS (mastra build → node + PM2)                          │
│  └─ Docker (multi-stage, Alpine, healthcheck)                │
└──────────────────────────────────────────────────────────────┘

Conclusion

The Mastra 1.x + Next.js 15 + AI SDK v5/v6 ecosystem is today, in April 2026, the most cohesive and type-safe approach for building AI agents in TypeScript — surpassing LangChain/LangGraph.js in ergonomics, DX and native integration with the JavaScript runtime. The three architectural decisions that most impact scale and maintainability are: (1) choosing between Mastra embedded in Next.js (MVP, single frontend) vs. standalone (independent scale, multiple clients), (2) migrating from local LibSQL to remote Postgres on day zero in serverless (without this suspend/resume and traces are illusions), and (3) investing in Zod as a quadruple contract (compile-time, runtime, prompt to LLM, semantic documentation) from the very first tool.

The counter-intuitive insight is that the biggest quality gain comes not from the most powerful model but from the granularity of Zod schemas: well-written .describe() calls on outputSchema fields are prompt engineering in disguise, and .nullable() instead of .optional() eliminates entire classes of failures in OpenAI strict mode. Combined with durable workflows (suspend/resume/bail), agent.network() for dynamic routing, MCP for interop without reimplementation, and continuous scorers with stratified sampling, the stack delivers auditable, resilient and observable agents, all of which are non-negotiable requirements in production.

The framework's immediate roadmap (post-1.25) focuses on AI SDK v6 (native ToolLoopAgent), consolidation of MastraCompositeStore, expansion of providers in the Model Router and maturation of Agent Networks as the definitive replacement for the deprecated class. For architects deciding today: adopting Mastra 1.x is safe for production, with two caveats. Monitor the official changelog weekly for deprecations (the pace of change is high), and check canonical snippets against node_modules/@mastra/*/dist/docs/ or https://mastra.ai/llms.txt instead of dated blog posts.

1.2 Agent Lifecycle

import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
import { weatherTool } from '../tools/weather';

export const weatherAgent = new Agent({
  id:   'weather-agent',
  name: 'Weather Agent',
  description: 'Answers questions about the weather.',
  instructions: 'You are a weather assistant. Use weatherTool when needed.',
  model: 'openai/gpt-5.1',            // Mastra Model Router: "provider/model"
  tools: { weatherTool },
  memory: new Memory({
    storage: new LibSQLStore({ url: 'file:./agent.db' }),
    options: { lastMessages: 10, workingMemory: { enabled: true } },
  }),
});

// .generate(): full response, returns { text, toolCalls, toolResults, steps, usage }
const res = await weatherAgent.generate('Weather in Tokyo?', {
  memory: { resource: 'user-123', thread: 'conv-42' },
});

// .stream(): token by token via MastraModelOutput
const stream = await weatherAgent.stream('Plan my day');
for await (const chunk of stream.textStream) process.stdout.write(chunk);

1.3 Memory: threads, resources, storage and vector

import { Memory } from '@mastra/memory';
import { PgStore, PgVector } from '@mastra/pg';
import { OpenAIEmbedder } from '@mastra/openai';

const memoryPg = new Memory({
  storage: new PgStore({ connectionString: process.env.DATABASE_URL! }),
  vector:  new PgVector({ connectionString: process.env.DATABASE_URL! }),
  embedder: new OpenAIEmbedder({ model: 'text-embedding-3-small' }),
  options: {
    lastMessages: 20,
    semanticRecall: {
      topK: 5,
      messageRange: { before: 2, after: 1 },
      scope: 'resource',
      indexConfig: { type: 'hnsw', metric: 'dotproduct', m: 16, efConstruction: 64 },
    },
    workingMemory: {
      enabled: true,
      template: '# User\n- First Name:\n- Last Name:',
      scope: 'resource',
    },
    generateTitle: true,
  },
});

1.4 Workflow System

import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const approvalStep = createStep({
  id: 'approval',
  inputSchema:   z.object({ amount: z.number(), needsApproval: z.boolean() }),
  outputSchema:  z.object({ approved: z.boolean(), message: z.string() }),
  suspendSchema: z.object({ reason: z.string(), amount: z.number() }),
  resumeSchema:  z.object({ approved: z.boolean(), approver: z.string() }),
  execute: async ({ inputData, resumeData, suspend, bail }) => {
    if (!inputData.needsApproval) return { approved: true, message: 'Auto' };
    // Resumed with a rejection: end the workflow early.
    if (resumeData?.approved === false) {
      return bail({ approved: false, message: 'Rejected' });
    }
    // First pass (no resume data yet): pause and wait for a human.
    if (resumeData?.approved === undefined) {
      return await suspend({ reason: 'Human approval required', amount: inputData.amount });
    }
    return { approved: true, message: `Approved by ${resumeData.approver}` };
  },
});

// analyzePurchase and executePayment are steps assumed to be defined elsewhere.
export const paymentWorkflow = createWorkflow({
  id: 'payment-workflow',
  inputSchema:  z.object({ amount: z.number(), userId: z.string() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  retryConfig:  { attempts: 5, delay: 2000 },
})
  .then(analyzePurchase)
  .then(approvalStep)
  .then(executePayment)
  .commit();

// Transparent suspension
const run = await paymentWorkflow.createRunAsync();
const result = await run.start({ inputData: { amount: 5000, userId: 'u1' } });
if (result.status === 'suspended') {
  // persist the runId in a queue and notify the approver
}

// Resume (same or another process, using the runId persisted above)
const resumed = await paymentWorkflow.createRunAsync({ runId });
await resumed.resume({ resumeData: { approved: true, approver: 'mgr@acme' } });

The discriminated union returned by run.start() ('success' | 'failed' | 'suspended' | 'tripwire') ensures typed narrowing.

1.5 Multi-agent orchestration

Three approaches available:

Agent-as-tool (static supervisor): sub-agent wrapped in createTool(). Deterministic coordination, predictable flow.
agent.network() (dynamic routing): an Agent with agents, workflows, and tools registered; the LLM decides which primitive to call. Requires memory (persists task history and detects completion). Supports suspension with agent-execution-approval / tool-execution-approval.
Multi-agent workflows: steps invoking mastra.getAgent(...).

Important deprecation (2026): the AgentNetwork class is deprecated. Use agent.network() or an explicit supervisor.

export const routingAgent = new Agent({
  id: 'routing-agent',
  instructions: 'Network of researchers and writers.',
  model: 'openai/gpt-5.4',
  agents:    { researchAgent, writingAgent },
  workflows: { cityWorkflow },
  tools:     { weatherTool },
  memory:    new Memory({ storage: new LibSQLStore({ url: 'file:./mastra.db' }) }),
});

const result = await routingAgent.network('Weather in Tokyo and a suggested activity.');
for await (const chunk of result) {
  if (chunk.type === 'network-execution-event-step-finish') console.log(chunk.payload.result);
}

1.6 LLMs via Vercel AI SDK

Mastra v1 delegates routing to the Vercel AI SDK (v1/v2/v3 compatible). There are four ways to specify models:

// (a) Model Router string (recommended)
model: 'openai/gpt-5.4'
model: 'anthropic/claude-4-5-sonnet'
model: 'google/gemini-2.5-pro'

// (b) Direct SDK instance (when you need strict typing)
import { openai } from '@ai-sdk/openai';
model: openai('gpt-4o')

// (c) Automatic cross-provider fallbacks
model: [
  { model: 'openai/gpt-5',                maxRetries: 3 },
  { model: 'anthropic/claude-4-5-sonnet', maxRetries: 2 },
  { model: 'google/gemini-2.5-pro',       maxRetries: 2 },
]

// (d) Dynamic per request
model: ({ requestContext }) =>
  requestContext.get('task') === 'complex' ? 'anthropic/claude-4-5-sonnet' : 'openai/gpt-5-mini'

1.7 MCP (Model Context Protocol)

@mastra/mcp implements both client and server. Supported transports: stdio, SSE, and Streamable HTTP.

// Consume external MCP servers
import { Agent } from '@mastra/core/agent';
import { MCPClient } from '@mastra/mcp';

export const mcp = new MCPClient({
  id: 'main-mcp',
  servers: {
    filesystem: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] },
    github: {
      url: new URL('https://api.githubcopilot.com/mcp/'),
      requestInit: { headers: { Authorization: `Bearer ${process.env.GH_PAT}` } },
    },
  },
});

export const researchAgent = new Agent({
  id: 'research',
  model: 'openai/gpt-4o',
  tools: await mcp.getTools(),      // static
  // or dynamic per request: const toolsets = await mcp.listToolsets();
});

// Expose Mastra as an MCP server
import { MCPServer } from '@mastra/mcp';

const server = new MCPServer({
  id: 'my-mcp-server',
  name: 'My MCP Server',
  version: '1.0.0',
  description: 'Exposes Mastra via MCP.',
  tools:     { weatherTool },
  agents:    { weatherAgent },       // generates the ask_weatherAgent tool
  workflows: { cityWorkflow },       // generates the run_cityWorkflow tool
});
server.startStdio();

2. Integration with Next.js App Router

2.1 Monorepo vs separate service

Criterion	Monorepo (embedded Mastra)	Separate service (`mastra dev` + `@mastra/client-js`)
Deploy	Single (`vercel deploy`)	Two domains, CORS, cross-origin auth
Agent↔UI latency	Zero internal network	+1 HTTP hop
AI scale vs SSR	Coupled	Independent
Workflows >5 min	Hard (`maxDuration`)	Natural (VM/container)
Multiple clients (web + mobile)	Frontend-centric	Reusable backend
Vercel Hobby	Viable with caution	Not recommended
MVP/prototype	Recommended	Overkill

2.2 Directory structure (monorepo)

TYPESCRIPT

my-nextjs-agent/
├── src/
│   ├── app/
│   │   ├── api/chat/route.ts        # Route Handler streaming
│   │   ├── chat/page.tsx            # UI client
│   │   ├── actions/weather.ts       # Server Actions
│   │   └── layout.tsx
│   ├── mastra/
│   │   ├── index.ts                 # new Mastra({...})
│   │   ├── agents/weather-agent.ts
│   │   ├── tools/weather-tool.ts
│   │   ├── workflows/
│   │   └── memory.ts
│   └── lib/schemas.ts               # shared Zod schemas
├── next.config.ts                   # serverExternalPackages: ['@mastra/*']
└── .env.local                       # OPENAI_API_KEY, DATABASE_URL

Required configuration:

// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  serverExternalPackages: ['@mastra/*'],  // keeps the bundler from packaging native binaries
};

export default nextConfig;

Known gotcha (vercel/next.js#74816): in some versions serverExternalPackages works in dev but fails in build. Fallback via Webpack: config.externals.push('@mastra/core', '@mastra/libsql').
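A minimal sketch of that Webpack fallback, assuming the project builds with Webpack and that the server build exposes config.externals as an array. The package list is taken from the gotcha above; extend it with whichever @mastra packages your build trips over.

```typescript
// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  serverExternalPackages: ['@mastra/*'],
  // Fallback for builds where serverExternalPackages is not honored:
  // mark the packages as externals on the server bundle only.
  webpack: (config, { isServer }) => {
    if (isServer) {
      config.externals.push('@mastra/core', '@mastra/libsql');
    }
    return config;
  },
};

export default nextConfig;
```

Keeping both settings side by side should be harmless where serverExternalPackages already works, and it avoids a build-time surprise where it does not.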

2.3 Server Actions invoking agents

Ideal for non-streaming synchronous operations (form submit, single generation). Keeps API keys on the server, integrates with Next.js cache/revalidation.

// src/app/actions/weather.ts
'use server';
import { z } from 'zod';
import { mastra } from '@/mastra';
import { revalidatePath } from 'next/cache';

const WeatherInput = z.object({
  city: z.string().min(1).max(100),
  units: z.enum(['metric', 'imperial']).default('metric'),
});

export type WeatherState =
  | { status: 'idle' }
  | { status: 'success'; text: string; toolCalls: unknown[] }
  | { status: 'error'; message: string; fieldErrors?: Record<string, string[]> };

export async function getWeather(_prev: WeatherState, formData: FormData): Promise<WeatherState> {
  // Validate the form before spending tokens on the agent.
  const parsed = WeatherInput.safeParse({
    city: formData.get('city'),
    units: formData.get('units') ?? 'metric',
  });
  if (!parsed.success) {
    return {
      status: 'error',
      message: 'Invalid input',
      fieldErrors: parsed.error.flatten().fieldErrors,
    };
  }
  try {
    const result = await mastra.getAgent('weatherAgent').generate(
      `Weather in ${parsed.data.city}? Units: ${parsed.data.units}.`,
      { memory: { thread: 'weather-thread', resource: 'public' } },
    );
    revalidatePath('/weather');
    return { status: 'success', text: result.text, toolCalls: result.toolCalls ?? [] };
  } catch (err) {
    console.error('[getWeather]', err);
    return { status: 'error', message: 'Failed to query the agent.' };
  }
}

Client consumption with useActionState:

TSX

'use client';
import { useActionState } from 'react';
import { getWeather, type WeatherState } from '@/app/actions/weather';

export default function WeatherPage() {
  const [state, formAction, pending] = useActionState(getWeather, { status: 'idle' } as WeatherState);
  return (
    <form action={formAction}>
      <input name="city" required />
      <select name="units">
        <option value="metric">°C</option>
        <option value="imperial">°F</option>
      </select>
      <button disabled={pending}>{pending ? 'Loading...' : 'Get weather'}</button>
      {state.status === 'success' && <pre>{state.text}</pre>}
      {state.status === 'error' && <p>{state.message}</p>}
    </form>
  );
}

Important limitations: Server Actions don't stream — the client waits for the full response. Subject to maxDuration from the platform. For streaming, use Route Handler + useChat.

2.4 Route Handlers with streaming

Modern Mastra 1.0 pattern: @mastra/ai-sdk + handleChatStream().

// src/app/api/chat/route.ts
import { handleChatStream } from '@mastra/ai-sdk';
import { toAISdkV5Messages } from '@mastra/ai-sdk/ui';
import { createUIMessageStreamResponse } from 'ai';
import { NextResponse } from 'next/server';
import { mastra } from '@/mastra';

export const maxDuration = 60;        // 300 default with Fluid; up to 800 on Pro
export const runtime = 'nodejs';      // REQUIRED: Mastra does not support Edge

export async function POST(req: Request) {
  const params = await req.json();
  const stream = await handleChatStream({
    mastra,
    agentId: 'weatherAgent',
    params: {
      ...params,
      memory: { thread: params.threadId ?? 'default', resource: params.resourceId ?? 'anon' },
    },
  });
  return createUIMessageStreamResponse({ stream });
}

// Hydrates history on mount
export async function GET() {
  // Looked up by registration key, consistent with the other snippets in this guide.
  const memory = await mastra.getAgent('weatherAgent').getMemory();
  const res = await memory?.recall({ threadId: 'default', resourceId: 'anon' });
  return NextResponse.json(toAISdkV5Messages(res?.messages ?? []));
}

Low-level alternative (full control):

import { mastra } from '@/mastra';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const stream = await mastra.getAgent('weatherAgent').stream(messages, {
    format: 'aisdk',                  // AI SDK v5 compat
    memory: { thread: 'demo', resource: 'user-1' },
    abortSignal: req.signal,          // propagates cancellation down to the LLM
  });
  return stream.toUIMessageStreamResponse();   // or .toDataStreamResponse() (v4), .toTextStreamResponse()
}

Separate service (Next.js proxy to standalone Mastra on :4111):

import { MastraClient } from '@mastra/client-js';

const client = new MastraClient({
  baseUrl: process.env.MASTRA_API_URL ?? 'http://localhost:4111',
  retries: 3,
  backoffMs: 300,
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const response = await client.getAgent('weatherAgent').stream({
    messages,
    threadId: 'demo',
    resourceId: 'user-1',
  });
  // Pass the upstream body through untouched and disable proxy buffering.
  return new Response(response.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'X-Accel-Buffering': 'no',
    },
  });
}

2.5 Chat UI with useChat and AI Elements

TSX

'use client';
import { useEffect, useState } from 'react';
import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport, type ToolUIPart } from 'ai';

// Skeleton, ToolCard and ErrorCard are UI components assumed to be defined elsewhere.
export default function ChatPage() {
  const [input, setInput] = useState('');
  const { messages, setMessages, sendMessage, stop, status } = useChat({
    transport: new DefaultChatTransport({
      api: '/api/chat',
      // With Mastra Memory: send ONLY the last message + identifiers
      prepareSendMessagesRequest({ messages, body }) {
        return {
          body: {
            ...body,
            messages: [messages[messages.length - 1]],
            threadId: 'default-thread',
            resourceId: 'user-1',
          },
        };
      },
    }),
  });

  // Load the stored history once on mount.
  useEffect(() => {
    fetch('/api/chat').then(r => r.json()).then(setMessages).catch(() => {});
  }, [setMessages]);

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.parts?.map((part, i) => {
            if (part.type === 'text')      return <p key={i}>{part.text}</p>;
            if (part.type === 'reasoning') return <details key={i}><summary>Thinking</summary>{part.text}</details>;
            if (part.type?.startsWith('tool-')) {
              const p = part as ToolUIPart;
              // States: 'input-streaming' → 'input-available' → 'output-available' | 'output-error'
              switch (p.state) {
                case 'input-available':  return <Skeleton key={i} />;
                case 'output-available': return <ToolCard key={i} output={p.output} />;
                case 'output-error':     return <ErrorCard key={i} text={p.errorText} />;
              }
            }
            return null;
          })}
        </div>
      ))}
      <input value={input} onChange={e => setInput(e.target.value)} />
      <button onClick={() => { sendMessage({ text: input }); setInput(''); }}>Send</button>
      {status === 'streaming' && <button onClick={stop}>⏹ Stop</button>}
    </div>
  );
}

2.6 Deploy: limits, storages and runtimes

Vercel maxDuration (Apr/2026):

Plan	Default	Max. with Fluid	Max. without Fluid
Hobby	300s	300s	60s
Pro	300s	800s	300s
Enterprise	300s	900s	900s

Fluid Compute (enabled by default since Apr/2025) allows concurrency on the same instance, bills by active CPU time, and lets streams continue past 300s if the first byte is sent within ~25s.
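The limits in the table can be encoded once and reused wherever a route sets maxDuration, so a value that the plan cannot honor is caught early. The sketch below is a hypothetical helper; the numbers are copied from the table above and should be rechecked against the platform documentation before use.

```typescript
type Plan = 'hobby' | 'pro' | 'enterprise';

// Ceilings in seconds, copied from the table above (values as stated in this guide).
const MAX_DURATION: Record<Plan, { withFluid: number; withoutFluid: number }> = {
  hobby:      { withFluid: 300, withoutFluid: 60 },
  pro:        { withFluid: 800, withoutFluid: 300 },
  enterprise: { withFluid: 900, withoutFluid: 900 },
};

// Hypothetical helper: clamps a requested duration to the ceiling of the plan.
function clampMaxDuration(plan: Plan, fluid: boolean, requestedSeconds: number): number {
  const ceiling = fluid ? MAX_DURATION[plan].withFluid : MAX_DURATION[plan].withoutFluid;
  return Math.min(Math.max(1, Math.floor(requestedSeconds)), ceiling);
}

console.log(clampMaxDuration('pro', true, 1200));   // 800
console.log(clampMaxDuration('hobby', false, 120)); // 60
```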

Critical storage in serverless: LibSQLStore with file:./mastra.db DOES NOT work on ephemeral FS (Vercel/Lambda). Use:

// Turso (remote LibSQL)
new LibSQLStore({ url: process.env.TURSO_URL!, authToken: process.env.TURSO_TOKEN })

// Postgres (Neon, Supabase, Vercel Postgres)
new PostgresStore({ connectionString: process.env.DATABASE_URL! })

// Upstash Redis
new UpstashStore({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! })

Runtime: always export const runtime = 'nodejs' on routes that import Mastra. Edge runtime fails due to native dependencies (libsql, better-sqlite3, fs/crypto bindings).

VPS/Docker (mastra build):

BASH

npx mastra build --dir src/mastra    # generates .mastra/output/ (Hono bundle)
node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs

DOCKERFILE

# Build stage: install dependencies and bundle Mastra
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY src ./src
COPY tsconfig.json ./
RUN npx mastra build

# Runtime stage: only the bundled output, running as a non-root user
FROM node:22-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && adduser -S mastra -u 1001
COPY --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output
USER mastra
EXPOSE 4111
HEALTHCHECK --interval=30s CMD wget -qO- http://localhost:4111/api/health || exit 1
CMD ["node", "--import=./.mastra/output/instrumentation.mjs", ".mastra/output/index.mjs"]

VercelDeployer publishes Mastra standalone as a Vercel function (no Next in front):

import { Mastra } from '@mastra/core';
import { VercelDeployer } from '@mastra/deployer-vercel';

export const mastra = new Mastra({
  deployer: new VercelDeployer({
    studio: true,
    maxDuration: 600,
    memory: 1536,
    regions: ['gru1', 'iad1'],
  }),
});

3. End-to-end type-safety with Zod

3.1 Zod as a quadruple contract

A Zod schema fulfills four simultaneous roles:

Role	Mechanism	Moment
Static contract	`z.infer<typeof schema>`	compile-time
Runtime validation	`.parse()` / `.safeParse()`	post-LLM
Specification for the LLM	JSON Schema (via `zodSchema()` from AI SDK)	pre-request
Semantic documentation	`.describe()` read by the model	pre-request

Critical rule: .describe() directly impacts the quality of structured output — it's "prompt engineering via types". Always describe ambiguous fields.

3.2 Idiomatic patterns for LLMs

Use .nullable() instead of .optional() — OpenAI strict mode and GPT-5 reject optional() in structured output (mastra-ai/mastra#7234):

// ❌ Breaks in GPT-5 strict mode
const bad = z.object({ details: z.string().optional() });

// ✅ Correct
const good = z.object({ details: z.string().nullable().describe('null if absent') });
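The reason behind this rule is that strict mode expects every property of an object to be listed in required. An optional field is left out of that list, while a nullable field stays required and simply admits null. The sketch below checks hand-written JSON Schema objects for that condition; the ObjectSchema type and the missingFromRequired helper are hypothetical, and the two schema literals only approximate what a Zod-to-JSON-Schema conversion would emit.

```typescript
// Minimal JSON Schema subset, enough for the illustration.
interface ObjectSchema {
  type: 'object';
  properties: Record<string, { type: string | string[] }>;
  required?: string[];
}

// Hypothetical check: returns the property names that are not listed in `required`.
function missingFromRequired(schema: ObjectSchema): string[] {
  const required = new Set(schema.required ?? []);
  return Object.keys(schema.properties).filter((key) => !required.has(key));
}

// Roughly what an optional string yields: the key is left out of `required`.
const optionalVariant: ObjectSchema = {
  type: 'object',
  properties: { details: { type: 'string' } },
  required: [],
};

// Roughly what a nullable string yields: the key stays required, null is an allowed type.
const nullableVariant: ObjectSchema = {
  type: 'object',
  properties: { details: { type: ['string', 'null'] } },
  required: ['details'],
};

console.log(missingFromRequired(optionalVariant)); // [ 'details' ]
console.log(missingFromRequired(nullableVariant)); // []
```

A check like this could run in CI over the schemas sent to the provider, flagging optional fields before they fail at request time.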

Discriminated unions are the standard for agent actions (ReAct, tool-routing):

export const AgentActionSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('search'), query: z.string() }),
  z.object({ type: z.literal('answer'), text: z.string(), confidence: z.number().min(0).max(1) }),
  z.object({ type: z.literal('escalate'), reason: z.string(), severity: z.enum(['low','medium','high']) }),
]);
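On the consuming side, a discriminated union pays off through exhaustive handling: adding a new action type becomes a compile error until every handler covers it. The sketch below hand-writes the type that the schema above would infer, to stay dependency-free. The nextStep function and the 0.7 confidence cutoff are hypothetical.

```typescript
// Hand-written equivalent of z.infer<typeof AgentActionSchema> (assumption: same three variants).
type AgentAction =
  | { type: 'search'; query: string }
  | { type: 'answer'; text: string; confidence: number }
  | { type: 'escalate'; reason: string; severity: 'low' | 'medium' | 'high' };

// Hypothetical dispatcher: each case sees only the fields of its own variant.
function nextStep(action: AgentAction): string {
  switch (action.type) {
    case 'search':
      return `run search for "${action.query}"`;
    case 'answer':
      // 0.7 is an illustrative cutoff, not a recommended value.
      return action.confidence >= 0.7 ? `reply: ${action.text}` : 'ask a clarifying question';
    case 'escalate':
      return `page on-call (${action.severity}): ${action.reason}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a variant is unhandled.
      const unreachable: never = action;
      return unreachable;
    }
  }
}

console.log(nextStep({ type: 'search', query: 'refund policy' })); // run search for "refund policy"
```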

3.3 Zod v3 vs v4 — impacts on AI pipelines

Aspect	v3	v4	Impact
Parse strings/arrays	baseline	14×/7× faster (JIT)	Almost free streaming validation
Compile TS	baseline	~10× faster	Monorepos with many schemas
Bundle `core`	baseline	2.3× smaller	Important on edge
`z.record()`	1 arg	2 required args	Breaks migration
`.optional().default()`	default ignored if missing	always returns default	Careful in working memory
Schema creation	fast	17× slower (JIT)	Don't instantiate in hot loops
`.describe()`/`.meta()`	any position	must be last chain call	Doesn't inherit via `.optional()/.extend()`

Mastra ≥ beta.16 normalizes both via Standard Schema; Zod coexists via zod/v3 and zod/v4.
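The "Schema creation" row is the one most likely to bite in request handlers and render paths. One common mitigation is to build each schema once and reuse it. The sketch below shows a lazy singleton in plain TypeScript; `buildUserSchema` is a stand-in for an expensive `z.object({ ... })` construction and `once` is a hypothetical helper, neither taken from Zod or Mastra.

```typescript
// Counts how often the expensive factory runs, to make the effect visible.
let buildCount = 0;

// Stand-in for an expensive schema construction such as z.object({ ... }).
function buildUserSchema(): { parse: (input: unknown) => { id: string } } {
  buildCount += 1;
  return {
    parse(input) {
      if (typeof input === 'object' && input !== null && typeof (input as { id?: unknown }).id === 'string') {
        return { id: (input as { id: string }).id };
      }
      throw new Error('Invalid user');
    },
  };
}

// Generic lazy singleton: the factory runs on first use only.
function once<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) cached = { value: factory() };
    return cached.value;
  };
}

const getUserSchema = once(buildUserSchema);

// Hot path: the schema is reused instead of rebuilt per item.
const ids = [{ id: 'a' }, { id: 'b' }, { id: 'c' }].map((row) => getUserSchema().parse(row).id);

console.log(ids, buildCount); // [ 'a', 'b', 'c' ] 1
```

Declaring schemas as module-level constants achieves the same result with less code; the lazy variant mainly helps when construction should be deferred until first use.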

3.4 Typed Tools with `createTool`

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const githubRepoTool = createTool({
  id: 'get-github-repo-info',
  description: 'Fetch basic insights for a public GitHub repository',
  inputSchema: z.object({
    owner: z.string().describe('GitHub username or organization'),
    repo:  z.string().describe('Repository name'),
  }),
  outputSchema: z.object({
    stars:   z.number(),
    forks:   z.number(),
    issues:  z.number(),
    license: z.string().nullable(),
    lastPush: z.string(),
    description: z.string().nullable(),
  }),
  execute: async ({ context, runtimeContext }) => {
    //              ^ { owner: string; repo: string } inferred
    const res = await fetch(`https://api.github.com/repos/${context.owner}/${context.repo}`);
    if (res.status === 404) throw new Error(`Not found`);
    if (!res.ok) throw new Error(`GitHub ${res.status}`);   // surface other HTTP failures too
    const d = await res.json();
    // A mismatch with outputSchema is an error at compile time AND at runtime
    return {
      stars: d.stargazers_count, forks: d.forks_count, issues: d.open_issues_count,
      license: d.license?.name ?? null, lastPush: d.pushed_at, description: d.description,
    };
  },
});

Typed RuntimeContext (⚠️ known bug — .get() doesn't infer; use cast):

export type SupportCtx = { 'user-tier': 'free' | 'pro' | 'enterprise'; language: 'en' | 'pt-BR' };

// inside createTool({ ... })
execute: async ({ runtimeContext }) => {
  const tier = runtimeContext.get('user-tier') as SupportCtx['user-tier'];
  const limit = tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
  ...
}
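Since the cast is unavoidable, it can at least live in one place. The sketch below wraps it in a small typed getter. `ContextLike` is a minimal stand-in for the part of the runtime context used here (the assumption being that `get` returns an untyped value), and `getTyped` and `rateLimitFor` are hypothetical helper names, not Mastra APIs.

```typescript
// Minimal stand-in for the part of RuntimeContext used here (assumption: get returns an untyped value).
interface ContextLike {
  get(key: string): unknown;
}

type SupportCtx = { 'user-tier': 'free' | 'pro' | 'enterprise'; language: 'en' | 'pt-BR' };

// Hypothetical helper: the single place where the cast happens.
function getTyped<K extends keyof SupportCtx>(ctx: ContextLike, key: K): SupportCtx[K] {
  return ctx.get(key) as SupportCtx[K];
}

function rateLimitFor(ctx: ContextLike): number {
  const tier = getTyped(ctx, 'user-tier'); // 'free' | 'pro' | 'enterprise'
  return tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
}

// A Map satisfies ContextLike structurally, which keeps the example dependency-free.
const ctx = new Map<string, unknown>([['user-tier', 'pro'], ['language', 'en']]);
console.log(rateLimitFor(ctx)); // 25
```

The cast is still unchecked at runtime, so values written by untrusted code paths would need a validation step before being trusted.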

3.5 Structured Outputs

Mastra v1 API:

const result = await agent.generate('Who won 2012?', {
  structuredOutput: {
    schema: ElectionResultSchema,
    errorStrategy: 'fallback',                  // 'strict' | 'warn' | 'fallback'
    fallbackValue: { winner: 'Unknown', year: 0, party: 'Other' },
    jsonPromptInjection: true,                  // required for Gemini 2.5 + tools
  },
});
result.object.winner;  // string, fully typed

Partial object streaming:

const stream = await agent.stream('...', { structuredOutput: { schema } });
for await (const partial of stream.objectStream) {
  // DeepPartial<T>: fields arrive incrementally
}
const final = await stream.object;   // validated T

Pure AI SDK (modern way with Output.object(), since generateObject is deprecated):

import { generateText, Output, tool, stepCountIs } from 'ai';
import { z } from 'zod';

const { output } = await generateText({
  model: 'openai/gpt-5.2',
  tools: { weather: tool({ inputSchema: z.object({ location: z.string() }), execute: async () => ({...}) }) },
  output: Output.object({ schema: RecipeSchema }),
  stopWhen: stepCountIs(5),
  prompt: '...',
});

Error handling:

import { generateObject, NoObjectGeneratedError } from 'ai';
import { z } from 'zod';

try {
  const { object } = await generateObject({ ... });
  return object;
} catch (err) {
  // The class is exported as NoObjectGeneratedError; 'AI_NoObjectGeneratedError' is its error name
  if (NoObjectGeneratedError.isInstance(err)) console.error('No object', err.text, err.cause);
  if (err instanceof z.ZodError) console.error('Zod failed', err.issues);
  throw err;
}

3.6 Workflows with typed steps

.then(step) only compiles if step.inputSchema is compatible with the outputSchema from the previous step — the compiler holds the pipeline shape.

import { createStep, createWorkflow } from '@mastra/core/workflows';
import { z } from 'zod';

// Shared between the step's outputSchema and the agent's structured output
const SummarySchema = z.object({
  library: z.string(),
  latestVersion: z.string(),
  breakingChanges: z.array(z.string()),
});

const scrapeStep = createStep({
  id: 'scrape',
  inputSchema:  z.object({ url: z.string().url() }),
  outputSchema: z.object({ url: z.string().url(), markdown: z.string() }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    markdown: await fetch(inputData.url).then((r) => r.text()),
  }),
});

const summarizeStep = createStep({
  id: 'summarize',
  inputSchema: scrapeStep.outputSchema,
  outputSchema: SummarySchema,
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('summarizer').generate(inputData.markdown, {
      structuredOutput: { schema: SummarySchema },
    });
    return res.object;
  },
});

export const changelogWorkflow = createWorkflow({
  id: 'changelog',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: summarizeStep.outputSchema,
})
  .then(scrapeStep)
  .then(summarizeStep)
  .commit();
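How a chained `.then()` can refuse incompatible steps at compile time is easier to see in a stripped-down model. The sketch below is a simplified, synchronous pipeline built on generics. It is not Mastra's implementation; `Step`, `Pipeline`, `scrape`, and `summarize` are hypothetical names chosen to mirror the example above.

```typescript
// A step maps an input of type I to an output of type O.
type Step<I, O> = { id: string; run: (input: I) => O };

// Simplified, synchronous model of a typed pipeline; NOT Mastra's actual implementation.
class Pipeline<I, O> {
  private readonly steps: Step<any, any>[];

  private constructor(steps: Step<any, any>[]) {
    this.steps = steps;
  }

  static start<I, O>(first: Step<I, O>): Pipeline<I, O> {
    return new Pipeline<I, O>([first]);
  }

  // The next step must accept the current output type O; otherwise this call does not type-check.
  then<N>(next: Step<O, N>): Pipeline<I, N> {
    return new Pipeline<I, N>([...this.steps, next]);
  }

  run(input: I): O {
    return this.steps.reduce((acc, step) => step.run(acc), input as unknown) as O;
  }
}

const scrape: Step<{ url: string }, { url: string; markdown: string }> = {
  id: 'scrape',
  run: ({ url }) => ({ url, markdown: `# Changelog for ${url}` }),
};

const summarize: Step<{ url: string; markdown: string }, { title: string }> = {
  id: 'summarize',
  run: ({ markdown }) => ({ title: markdown.replace(/^#\s*/, '') }),
};

const pipeline = Pipeline.start(scrape).then(summarize);
// Pipeline.start(summarize).then(scrape) would be rejected by the compiler:
// scrape expects { url }, but summarize produces { title }.

console.log(pipeline.run({ url: 'https://example.com' }).title);
// Changelog for https://example.com
```

The key move is that `then` returns a new pipeline whose output type is the next step's output, so each link in the chain constrains the one after it.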

Runtime context validation via requestContextSchema:

const workflow = createWorkflow({
  id: 'tiered',
  inputSchema,
  outputSchema,
  requestContextSchema: z.object({ userTier: z.enum(['free', 'pro', 'enterprise']), locale: z.string() }),
});

3.7 Where type-safety lives

Layer	Tool	Compile	Runtime	Sent to LLM
HTTP/Form input	`safeParse` in Server Action	✅	✅	—
Tool input	`createTool({ inputSchema })`	✅	✅	✅
Tool output	`createTool({ outputSchema })`	✅	⚠️ informative	✅
Structured output	`generate({ structuredOutput })`	✅	✅	✅
Workflow step	`createStep({ inputSchema, outputSchema })`	✅	✅	—
Runtime context	`RuntimeContext<T>`	✅ on `set`; ⚠️ on `get`	⚠️ optional	❌
Memory	`workingMemory: { schema }`	✅	✅	✅

4. Design patterns for agents and workflows

4.1 ReAct: implicit vs explicit

Approach	Who decides	Implementation	When to use
Implicit (agent loop)	LLM, via native tool calling	`Agent` with `tools`; Mastra runs loop automatically	Open-ended tasks, unknown N of steps
Explicit (workflow)	Orchestrator code	`createWorkflow` + `.dountil()` calling step with agent	Auditability, SLA, hard limits, HITL in the loop

Recommendation: always start with implicit ReAct. Only migrate to explicit when you need granular trace, step budget, human approval mid-flight, or A/B testing by action type.

// Implicit: the agent loop already implements ReAct
export const researchAgent = new Agent({
  instructions: `Follow ReAct loop: THOUGHT → ACTION (call one tool) → OBSERVATION.
                 Repeat until confident. Never invent facts.`,
  model: 'openai/gpt-4o',
  tools: { searchDocsTool },
});

await researchAgent.generate('What is the difference between suspend() and bail()?', { maxSteps: 8 });
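For contrast, the explicit variant moves the loop into orchestrator code, which is what makes a hard step budget and a per-step trace possible. The sketch below shows that shape in plain TypeScript under stated assumptions: `decide` stands in for an LLM call, `runExplicitReAct`, `scriptedDecide`, and `demoTools` are hypothetical names, and no Mastra workflow API is used.

```typescript
type Decision =
  | { type: 'tool_call'; toolName: string; args: Record<string, unknown> }
  | { type: 'final_answer'; answer: string };

type Tool = (args: Record<string, unknown>) => Promise<string>;

type TraceEntry = { step: number; decision: Decision; observation?: string };

// Explicit ReAct loop: code, not the model, enforces the step budget and records a trace.
async function runExplicitReAct(
  decide: (observations: string[]) => Promise<Decision>, // stand-in for an LLM call
  tools: Record<string, Tool>,
  maxSteps: number,
): Promise<{ answer: string | null; trace: TraceEntry[] }> {
  const observations: string[] = [];
  const trace: TraceEntry[] = [];

  for (let step = 1; step <= maxSteps; step++) {
    const decision = await decide(observations);
    if (decision.type === 'final_answer') {
      trace.push({ step, decision });
      return { answer: decision.answer, trace };
    }
    const tool = tools[decision.toolName];
    // Unknown tools become an observation so the model can correct itself.
    const observation = tool ? await tool(decision.args) : `Unknown tool: ${decision.toolName}`;
    observations.push(observation);
    trace.push({ step, decision, observation });
  }
  return { answer: null, trace }; // budget exhausted
}

// Scripted decider for demonstration: search once, then answer.
const scriptedDecide = async (observations: string[]): Promise<Decision> =>
  observations.length === 0
    ? { type: 'tool_call', toolName: 'searchDocs', args: { q: 'suspend vs bail' } }
    : { type: 'final_answer', answer: `Based on: ${observations[0]}` };

const demoTools: Record<string, Tool> = {
  searchDocs: async (args) => `docs for ${String(args.q)}`,
};

runExplicitReAct(scriptedDecide, demoTools, 8).then((result) => {
  console.log(result.answer, result.trace.length); // Based on: docs for suspend vs bail 2
});
```

Returning `null` when the budget runs out, instead of throwing, leaves the caller free to escalate to a human or retry with a larger budget.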

Reusable ReAct schema:

export const ReActStepSchema = z.object({
  thought: z.string().describe('Reasoning about next step'),
  action: z.discriminatedUnion('type', [
    z.object({ type: z.literal('tool_call'), toolName: z.string(), args: z.record(z.string(), z.unknown()) }),
    z.object({ type: z.literal('final_answer'), answer: z.string(), confidence: z.number().min(0).max(1) }),
  ]),
});

4.2 Plan-and-Execute

const plannerAgent = new Agent({
  id: 'planner',
  instructions: 'Break goal into 3-7 concrete, tool-executable steps.',
  model: 'anthropic/claude-sonnet-4',
});

const executorAgent = new Agent({
  id: 'executor',
  instructions: 'Execute a single plan step. Use tools. Return terse result.',
  model: 'openai/gpt-4o-mini',
  tools: { /* ... */ },
});

const planStep = createStep({
  id: 'plan',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({
    goal: z.string(),
    plan: z.array(z.object({
      id: z.string(),
      description: z.string(),
      dependsOn: z.array(z.string()).default([]),
    })),
  }),
  execute: async ({ inputData, mastra }) => {
    // structuredOutput matches the Mastra v1 API used in section 3.5
    const res = await mastra.getAgent('plannerAgent').generate(`Goal: ${inputData.goal}`, {
      structuredOutput: { schema: z.object({ plan: z.array(/* ... */) }) },
    });
    return { goal: inputData.goal, plan: res.object.plan };
  },
});

export const planAndExecuteWorkflow = createWorkflow({
  id: 'plan-exec',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({ results: z.array(z.any()) }),
})
  .then(planStep)
  .map(async ({ inputData }) => inputData.plan)
  .foreach(executeStep, { concurrency: 3 })   // executeStep (wrapping executorAgent) is assumed to be defined elsewhere
  .commit();
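The plan schema carries a `dependsOn` field, but the workflow as shown does not say how those dependencies are honored once steps run concurrently. One possible approach, sketched below in plain TypeScript, is to group the plan into waves before execution so that each wave only contains steps whose dependencies are already done. `toExecutionWaves` is a hypothetical helper, not a Mastra API.

```typescript
type PlanStep = { id: string; description: string; dependsOn: string[] };

// Groups plan steps into waves: every step in a wave depends only on steps from earlier waves.
// Throws on unknown or circular dependencies.
function toExecutionWaves(plan: PlanStep[]): PlanStep[][] {
  const ids = new Set(plan.map((s) => s.id));
  for (const step of plan) {
    for (const dep of step.dependsOn) {
      if (!ids.has(dep)) throw new Error(`Step "${step.id}" depends on unknown step "${dep}"`);
    }
  }

  const done = new Set<string>();
  let remaining = plan;
  const waves: PlanStep[][] = [];

  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((dep) => done.has(dep)));
    if (ready.length === 0) throw new Error('Circular dependency in plan');
    waves.push(ready);
    for (const s of ready) done.add(s.id);
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return waves;
}

const waves = toExecutionWaves([
  { id: 'fetch', description: 'Fetch changelog', dependsOn: [] },
  { id: 'issues', description: 'List open issues', dependsOn: [] },
  { id: 'summary', description: 'Summarize findings', dependsOn: ['fetch', 'issues'] },
]);

console.log(waves.map((wave) => wave.map((s) => s.id)));
// [ [ 'fetch', 'issues' ], [ 'summary' ] ]
```

Each wave could then be handed to a concurrent executor in turn. Validating the planner's output this way also catches hallucinated step ids before any tool is called.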

4.3 Orchestration patterns matrix

Pattern	Mastra API	When to use
Pipeline	`.then()`	Fixed steps, linear dependency (ETL, content pipeline)
Fan-out/Fan-in	`.parallel([])`	N fixed independent tasks; output is object with keys = step ids
MapReduce	`.foreach(step, {concurrency})`	N dynamic; process list
Router/Branch	`.branch([[cond, step]])`	Routing by classification; all branches share schemas
Static supervisor	Agent with sub-agents as tools	Deterministic coordination
Dynamic supervisor	`agent.network()`	LLM decides which primitive to call
Evaluator-Optimizer	`.dowhile()` / `.dountil()` + scorer	Convergent iterative refinement
Human-in-the-loop	`suspend()` / `resume()` / `bail()`	Approvals, payment >$X, irreversible actions
Handoff	workflow + agents with shared memory	Specialist takes control
Council	`.parallel()` + synthesis step	Multiple opinions for synthesis
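The `concurrency` option in the MapReduce row can be pictured as a small worker pool over a dynamic list. The sketch below is a generic, dependency-free version of that idea. It is an illustration of the pattern and makes no claim about how Mastra schedules `.foreach()` internally; `mapWithConcurrency` is a hypothetical helper name.

```typescript
// Runs `worker` over `items` with at most `concurrency` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Track how many workers overlap to show that the limit holds.
let inFlight = 0;
let peak = 0;

const squared = mapWithConcurrency([1, 2, 3, 4, 5, 6], 3, async (n) => {
  inFlight += 1;
  peak = Math.max(peak, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulated step latency
  inFlight -= 1;
  return n * n;
});

squared.then((values) => console.log(values, peak)); // [ 1, 4, 9, 16, 25, 36 ] 3
```

A bounded pool like this is what keeps a dynamic list of N items from turning into N simultaneous LLM calls and the rate-limit errors that follow.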

Concrete Evaluator-Optimizer:

workflow
  .then(generateDraft)
  .dowhile(
    createStep({
      id: 'eval-refine',
      execute: async ({ inputData, state }) => {
        const scorer = createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' });
        const { score } = await scorer.run({ input: state.prompt, output: inputData.draft });
        if (score >= 0.85) return { ...inputData, done: true, score };
        const refined = await refinerAgent.generate(/* with feedback */);
        return { ...inputData, draft: refined.text, done: false, score };
      },
    }),
    async ({ inputData }) => !inputData.done,
  )
  .commit();

4.4 External tools integration

REST API:

export const githubIssueTool = createTool({
  id: 'github-create-issue',
  inputSchema: z.object({
    repo: z.string().regex(/^[\w-]+\/[\w-]+$/),
    title: z.string().min(1).max(256),
    body: z.string().max(65_536).optional(),
    labels: z.array(z.string()).max(100).default([]),
  }),
  outputSchema: z.object({ number: z.number(), url: z.string().url() }),
  execute: async ({ context, tracingContext }) => {
    const span = tracingContext?.currentSpan?.startSpan('github.api.call');
    try {
      const res = await fetch(`https://api.github.com/repos/${context.repo}/issues`, {
        method: 'POST',
        headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({ title: context.title, body: context.body, labels: context.labels }),
        signal: AbortSignal.timeout(10_000),
      });
      if (!res.ok) throw new Error(`GitHub ${res.status}: ${await res.text()}`);
      const data = await res.json();
      return { number: data.number, url: data.html_url };
    } finally {
      span?.end();
    }
  },
});

Database (Drizzle):

export const getUserTool = createTool({
  id: 'get-user',
  inputSchema: z.object({ userId: z.string().uuid() }),
  outputSchema: z.object({ id: z.string(), email: z.string(), plan: z.enum(['free', 'pro', 'enterprise']) }).nullable(),
  execute: async ({ context }) => {
    const db = drizzle(process.env.DATABASE_URL!);
    const [row] = await db.select().from(users).where(eq(users.id, context.userId)).limit(1);
    return row ?? null;
  },
});

Error handling in tools:

Strategy	When	Example
Throw	Unrecoverable error	Auth failure, timeout after retries
Structured return	LLM must react/retry	`{ success: false, error: { code, message } }` in union
Internal retry	Transient failure	`p-retry` inside the `execute`
Circuit breaker	Unstable API	`opossum`, opens after N failures
Timeout	Prevent stuck agent	`AbortSignal.timeout(ms)`

// "Structured return" pattern: works best for the ReAct loop
// Caveat: a field named `error` in outputSchema can break tool.execute() narrowing (see Known critical pitfalls)
outputSchema: z.union([
  z.object({ success: z.literal(true),  data: z.object({ /* ... */ }) }),
  z.object({ success: z.literal(false), error: z.object({ code: z.string(), message: z.string() }) }),
]),
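The "internal retry" and "structured return" strategies combine naturally: retry transient failures inside the tool, then report the final failure as data so the model can react to it. The sketch below hand-rolls that combination in plain TypeScript to stay dependency-free; a library such as `p-retry`, named in the table, would typically replace the loop. `callWithRetry`, `flaky`, and the `UPSTREAM_FAILED` code are hypothetical names for illustration.

```typescript
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: { code: string; message: string } };

// Retries a transient operation, then reports failure as data instead of throwing,
// so the calling agent can read the error and choose another action.
async function callWithRetry<T>(
  operation: (attempt: number) => Promise<T>,
  options: { retries: number; baseDelayMs: number },
): Promise<ToolResult<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.retries + 1; attempt++) {
    try {
      return { success: true, data: await operation(attempt) };
    } catch (err) {
      lastError = err;
      if (attempt <= options.retries) {
        // Exponential backoff: baseDelayMs, then 2x, 4x, ...
        await new Promise((resolve) => setTimeout(resolve, options.baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  const message = lastError instanceof Error ? lastError.message : String(lastError);
  return { success: false, error: { code: 'UPSTREAM_FAILED', message } };
}

// Simulated flaky dependency: fails twice, then succeeds.
const flaky = async (attempt: number): Promise<string> => {
  if (attempt < 3) throw new Error(`attempt ${attempt} failed`);
  return 'ok';
};

callWithRetry(flaky, { retries: 2, baseDelayMs: 1 }).then((result) => console.log(result));
// { success: true, data: 'ok' }
```

Unrecoverable conditions such as authentication failures are better thrown immediately, since retrying them only burns the agent's step budget.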

4.5 Observability

Structured logging (PinoLogger):

import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';

export const mastra = new Mastra({
  logger: new PinoLogger({
    name: 'Mastra',
    level: process.env.LOG_LEVEL ?? 'info',
    // getCurrentTraceId() is an application-level helper that is not shown here
    mixin() {
      return { traceId: getCurrentTraceId(), service: 'ai-api', env: process.env.NODE_ENV };
    },
  }),
});

Native AI Tracing with OTel + multiple exporters:

import { Mastra } from '@mastra/core';
import { DefaultExporter } from '@mastra/observability';
import { LangfuseExporter } from '@mastra/langfuse';

export const mastra = new Mastra({
  observability: {
    default: { enabled: true },
    configs: {
      langfuse: {
        serviceName: 'prod-agents',
        exporters: [new LangfuseExporter({
          publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
          secretKey: process.env.LANGFUSE_SECRET_KEY!,
          baseUrl: process.env.LANGFUSE_BASE_URL,
        })],
        sampling: { type: 'ratio', probability: 0.1 },   // 10% in prod
      },
      debug: { exporters: [new DefaultExporter()], sampling: { type: 'always' } },
    },
    configSelector: (ctx) => ctx.runtimeContext?.get('supportMode') ? 'debug' : 'langfuse',
  },
});
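What a ratio sampling setting means in practice is that each trace is kept independently with the given probability. The sketch below models that decision in plain TypeScript. The `Sampling` type only mirrors the two shapes used in the config above and `shouldSample` is a hypothetical function, so this is an illustration of the concept and not Mastra's sampler.

```typescript
type Sampling = { type: 'always' } | { type: 'ratio'; probability: number };

// Decides whether one trace is exported. `random` is injectable so the behavior is testable.
function shouldSample(sampling: Sampling, random: () => number = Math.random): boolean {
  switch (sampling.type) {
    case 'always':
      return true;
    case 'ratio':
      if (sampling.probability < 0 || sampling.probability > 1) {
        throw new RangeError('probability must be between 0 and 1');
      }
      return random() < sampling.probability;
  }
}

// Deterministic sequence (0.00, 0.01, ..., 0.99, repeating) to count kept traces.
let tick = 0;
const sequence = () => (tick++ % 100) / 100;

let kept = 0;
for (let i = 0; i < 1000; i++) {
  if (shouldSample({ type: 'ratio', probability: 0.1 }, sequence)) kept++;
}
console.log(kept); // 100
```

With real randomness the kept count fluctuates around 10%, which is worth remembering when a rare failure seems to be missing from the sampled traces.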

Platform comparison:

Platform	Strong in	Weak in	When to choose
Langfuse	LLM-native (prompts, cost, evals). Self-host.	Generic infra tracing	Prompt engineering, cost per feature, evals
Braintrust	Production evals, A/B side-by-side	Less rich tracing	Teams focused on regression testing
LangSmith	LangChain integration, datasets	Vendor lock-in	Stack already LangChain/LangGraph
SigNoz/Datadog (OTel)	Full-stack APM	Not LLM-first	Unified APM (not just AI)
Mastra Studio + DuckDB	Built-in, zero setup, cost/latency	Local/single-node	Local dev, small teams

Evals / Scorers (Mastra 2026 — replaces legacy evals):

import { Agent } from '@mastra/core/agent';
import { createAnswerRelevancyScorer, createToxicityScorer, createHallucinationScorer } from '@mastra/evals/scorers/llm';

export const supportAgent = new Agent({
  id: 'support',
  model: 'openai/gpt-4o',
  scorers: {
    relevancy:     { scorer: createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' }), sampling: { type: 'ratio', rate: 0.2 } },
    hallucination: { scorer: createHallucinationScorer({ model: 'openai/gpt-4o-mini' }),   sampling: { type: 'ratio', rate: 1.0 } },
    toxicity:      { scorer: createToxicityScorer({ model: 'openai/gpt-4o-mini' }),        sampling: { type: 'ratio', rate: 1.0 } },
  },
});

CI/CD with Vitest:

import { describe, it, expect } from 'vitest';
import { runEvals } from '@mastra/core/evals';

describe('Support Agent', () => {
  it('meets quality thresholds', async () => {
    const result = await runEvals({
      target: supportAgent,
      data: [{ input: 'How to cancel?', groundTruth: 'cancellation policy' }],
      scorers: [relevancyScorer, hasSourcesScorer],
      concurrency: 3,
    });
    expect(result.scores['answer-relevancy']).toBeGreaterThanOrEqual(0.8);
  });
});
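When several scorers gate a release, it can help to check all thresholds in one pass and report every failure at once, including scorers that produced no score at all. The sketch below is a dependency-free helper for that. `findThresholdFailures` and the score names are hypothetical, and the flat score map is an assumed shape, not the documented return type of `runEvals`.

```typescript
type ScoreMap = Record<string, number>;

// Returns one message per scorer that is missing or below its minimum; empty means the gate passes.
function findThresholdFailures(scores: ScoreMap, minimums: ScoreMap): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(minimums)) {
    const min = minimums[name];
    const actual = scores[name];
    if (actual === undefined) {
      failures.push(`${name}: no score reported`);
    } else if (actual < min) {
      failures.push(`${name}: ${actual} < ${min}`);
    }
  }
  return failures;
}

const failures = findThresholdFailures(
  { 'answer-relevancy': 0.86, hallucination: 0.4 },
  { 'answer-relevancy': 0.8, hallucination: 0.9, toxicity: 0.95 },
);

console.log(failures.join('; '));
// hallucination: 0.4 < 0.9; toxicity: no score reported
```

In a test, asserting that the returned list is empty gives a single failure message that names every scorer below target.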

5. Consolidated architectural blueprint

TL;DR for architects:

Start simple. Single agent + tools. Workflow/multi-agent only when steps are known or context becomes unmanageable.
Workflows for auditability/SLO; agent.network() for flexibility. Workflows = code dictates flow. Networks = LLM dictates flow.
Zod everywhere. Every tool inputSchema/outputSchema, every step, every scorer. It's your only defense against hallucination in tool args.
Persistence from day 1. Postgres (prod) or LibSQL (dev). Without it, suspend/resume doesn't work and you lose traces on restart.
serverExternalPackages: ['@mastra/*'] + runtime = 'nodejs'. Non-negotiable in Next.js.
Route Handlers for streaming; Server Actions for sync. Don't try to stream via Server Action.
Vercel Fluid Compute + remote Postgres/Turso for serverless production. Never local LibSQL.
Observability by environment via configSelector: dev→Default, staging→10% Langfuse, prod→1% + Datadog.
Scorers with low sampling in prod (5-20%), 100% in toxicity/safety.
MCP before reimplementing tools. GitHub, Slack, Notion, filesystem, Playwright already have official servers.
HITL via suspend() whenever cost/irreversibility > convenience. Payment >$X, deletion, bulk send.
Prefer agent.network() or explicit supervisor over AgentNetwork class (deprecated).
Plan-and-Execute > ReAct for tasks >5 steps. Powerful planner + cheap executors saves 20-30% tokens.

Known critical pitfalls

optional() breaks OpenAI/GPT-5 strict mode → use .nullable() with .describe() (mastra-ai/mastra#7234).
Gemini 2.5 + tools + structured output → always jsonPromptInjection: true.
z.record() in Zod v4 needs 2 required args.
Field named error in outputSchema breaks tool.execute() narrowing.
RuntimeContext.get() doesn't infer — manual cast needed.
.describe()/.meta() must be last chain call (doesn't inherit via .optional()/.extend()).
tool() helper from AI SDK is mandatory for inference; createTool from Mastra doesn't suffer from this.
generateObject deprecated → migrate to generateText({ output: Output.object(...) }).
Zod v4 creates schemas 17× slower (JIT) — never instantiate in hot render/loop.
toDataStreamResponse() + output: zodSchema conflicts (mastra-ai/mastra#5544) — use experimental_output.
serverExternalPackages has build issue (vercel/next.js#74816) — keep Webpack fallback.
AgentNetwork class → deprecated; use agent.network().
legacy_workflows → replaced by createWorkflow/createStep.

Reference stack for production

┌──────────────────────────────────────────────────────────────┐
│  Next.js App Router (Node runtime)                           │
│  ├─ Route Handlers (streaming, useChat)                      │
│  └─ Server Actions (sync, forms)                             │
├──────────────────────────────────────────────────────────────┤
│  Mastra (embedded or standalone)                             │
│  ├─ Agents (implicit ReAct) + agent.network()                │
│  ├─ Workflows (.then/.parallel/.branch/.foreach/.dountil)    │
│  ├─ Tools (createTool + Zod) + MCP (client + server)         │
│  └─ Memory (threads/resources, semanticRecall, workingMemory)│
├──────────────────────────────────────────────────────────────┤
│  Storage: MastraCompositeStore                               │
│  ├─ memory: LibSQL (dev) / Postgres (prod)                   │
│  ├─ workflows: Postgres (persistent snapshots)               │
│  ├─ scores: Postgres                                         │
│  └─ vectors: pgvector / Pinecone / Upstash                   │
├──────────────────────────────────────────────────────────────┤
│  LLM: Vercel AI SDK v5/v6                                    │
│  ├─ openai/gpt-5.x, anthropic/claude-4-5-sonnet,             │
│  │  google/gemini-2.5-pro                                    │
│  └─ Automatic cross-provider fallbacks                       │
├──────────────────────────────────────────────────────────────┤
│  Observability                                               │
│  ├─ PinoLogger (structured)                                  │
│  ├─ OTel tracing → Langfuse/Braintrust/SigNoz                │
│  └─ Scorers async (relevancy, hallucination, toxicity)       │
├──────────────────────────────────────────────────────────────┤
│  Deploy                                                      │
│  ├─ Vercel (Fluid Compute, maxDuration: 800s Pro)            │
│  ├─ VPS (mastra build → node + PM2)                          │
│  └─ Docker (multi-stage, Alpine, healthcheck)                │
└──────────────────────────────────────────────────────────────┘
