Technical guide: architecting AI with Mastra, Next.js, and TypeScript

Learn how to architect AI with Mastra and Next.js and discover how to replace the LangChain/LangGraph.js approach in production environments.

"Written in April/2026, referencing @mastra/core@1.25.0. Check the official changelog before implementing."

Mastra, Next.js, and TypeScript form the most mature and cohesive stack today for building type-safe AI agents in JavaScript, and a practical replacement for the LangChain/LangGraph.js approach in production environments. Mastra 1.0 (GA since Jan/2026, current version @mastra/core@1.25.0, ~23.1k stars on GitHub) consolidated a central registry architecture with dependency injection, durable workflows with suspend/resume, memory as a first-class primitive, native MCP (client and server), and transparent integration with Vercel AI SDK v5/v6. Combined with the Next.js App Router, it delivers native streaming, type-safe Server Actions, and deployment to both Vercel serverless (with Fluid Compute, up to 800s) and VPS/Docker via mastra build.

This guide consolidates four pillars — framework architecture, Next.js integration, type-safety with Zod, and design patterns — into an actionable blueprint for senior architects. All snippets are idiomatic and functional against Mastra 1.x, AI SDK v5 and Next.js 14/15.


1. Architecture and Mastra capabilities

1.1 The Mastra object as central registry

Mastra is a registry with DI orchestrating Agents, Workflows, Tools, Memory, Storage, Vector, Observability, MCP Servers and Gateways. The HTTP server is generated on top of Hono (with adapters for Express/Fastify/Koa from v1.0). In production, storage by domains (memory/workflows/scores/traces) via MastraCompositeStore is the default.

TS
// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';
import { MastraCompositeStore } from '@mastra/core/storage';
import { WorkflowsPG, ScoresPG, PgVector } from '@mastra/pg';
import { MemoryLibSQL } from '@mastra/libsql';
import { weatherAgent } from './agents/weather-agent';
import { weatherWorkflow } from './workflows/weather-workflow';

const storage = new MastraCompositeStore({
  id: 'composite',
  domains: {
    memory:    new MemoryLibSQL({ url: 'file:./local.db' }),
    workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL! }),
    scores:    new ScoresPG({ connectionString: process.env.DATABASE_URL! }),
  },
});

export const mastra = new Mastra({
  agents:    { weatherAgent },
  workflows: { weatherWorkflow },
  storage,
  vectors:   { pg: new PgVector({ connectionString: process.env.DATABASE_URL! }) },
  logger:    new PinoLogger({ name: 'Mastra', level: 'info' }),
  server:    { port: 4111, host: '0.0.0.0', timeout: 30_000 },
  // mcpServers, observability, scorers, processors, gateways, bundler...
});

Main packages: @mastra/core (Mastra, Agent, Workflow, Tool, Memory, Storage interfaces, Processors), @mastra/memory, @mastra/libsql, @mastra/pg, @mastra/mcp, @mastra/ai-sdk, @mastra/loggers, @mastra/observability, @mastra/client-js, mastra (CLI). From v1, subpath imports are mandatory (@mastra/core/agent, @mastra/core/workflows, etc.), except Mastra and type Config.

Current status (Apr/2026): @mastra/core@1.25.0 GA, Apache-2.0 license (with ee/ areas under the Mastra Enterprise License), maintained by the Gatsby team (Sam Bhagwat, Shane Thomas). Positioned against LangGraph.js; uses the Vercel AI SDK for model routing (40+ providers, 3000+ models via the Mastra Model Router).

1.2 Agent Lifecycle

TS
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
import { weatherTool } from '../tools/weather';

export const weatherAgent = new Agent({
  id:   'weather-agent',
  name: 'Weather Agent',
  description: 'Answers questions about the weather.',
  instructions: 'You are a weather assistant. Use weatherTool when needed.',
  model: 'openai/gpt-5.1',            // Mastra Model Router: "provider/model"
  tools: { weatherTool },
  memory: new Memory({
    storage: new LibSQLStore({ url: 'file:./agent.db' }),
    options: { lastMessages: 10, workingMemory: { enabled: true } },
  }),
});

// .generate(): full response, returns { text, toolCalls, toolResults, steps, usage }
const res = await weatherAgent.generate('Weather in Tokyo?', {
  memory: { resource: 'user-123', thread: 'conv-42' },
});

// .stream(): token by token via MastraModelOutput
const stream = await weatherAgent.stream('Plan my day');
for await (const chunk of stream.textStream) process.stdout.write(chunk);

1.3 Memory: threads, resources, storage and vector

The Memory class combines storage (persistent history), vector (semantic recall) and embedder. Thread isolates conversations; Resource is a stable grouper (user/project) allowing multiple agents to share working memory and embeddings across threads. Default scope changed to 'resource' in Mastra 0.10+.
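The thread/resource distinction can be pictured with a toy in-memory store. This is a minimal sketch, not Mastra's implementation: `ScopedNotes`, `write`, and `read` are hypothetical names, and the only behavior taken from the text is that 'thread' scope isolates one conversation while 'resource' scope is shared by every thread of the same user or project.

```typescript
// Illustrative only: a toy store showing how 'thread' and 'resource' scopes differ.
type Scope = 'thread' | 'resource';

interface MemoryKey {
  resource: string; // stable grouper, e.g. a user or project id
  thread: string;   // one conversation
}

class ScopedNotes {
  private readonly scope: Scope;
  private readonly notes = new Map<string, string>();

  constructor(scope: Scope) {
    this.scope = scope;
  }

  // With 'resource' scope the thread id is ignored, so all threads share one slot.
  private slot(key: MemoryKey): string {
    return this.scope === 'resource' ? key.resource : `${key.resource}/${key.thread}`;
  }

  write(key: MemoryKey, value: string): void {
    this.notes.set(this.slot(key), value);
  }

  read(key: MemoryKey): string | undefined {
    return this.notes.get(this.slot(key));
  }
}

const perResource = new ScopedNotes('resource');
perResource.write({ resource: 'user-123', thread: 'conv-1' }, 'First Name: Ana');

const perThread = new ScopedNotes('thread');
perThread.write({ resource: 'user-123', thread: 'conv-1' }, 'First Name: Ana');

// Reading from a different thread of the same user:
const sharedAcrossThreads = perResource.read({ resource: 'user-123', thread: 'conv-2' });
const isolatedPerThread = perThread.read({ resource: 'user-123', thread: 'conv-2' });
```

With 'resource' scope the second conversation sees what the first one wrote; with 'thread' scope it starts empty.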

TS
import { Memory } from '@mastra/memory';
import { PostgresStore, PgVector } from '@mastra/pg';
import { OpenAIEmbedder } from '@mastra/openai';

const memoryPg = new Memory({
  storage: new PostgresStore({ connectionString: process.env.DATABASE_URL! }),
  vector:  new PgVector({ connectionString: process.env.DATABASE_URL! }),
  embedder: new OpenAIEmbedder({ model: 'text-embedding-3-small' }),
  options: {
    lastMessages: 20,
    semanticRecall: {
      topK: 5,
      messageRange: { before: 2, after: 1 },
      scope: 'resource',
      indexConfig: { type: 'hnsw', metric: 'dotproduct', m: 16, efConstruction: 64 },
    },
    workingMemory: {
      enabled: true,
      template: '# User\n- First Name:\n- Last Name:',
      scope: 'resource',
    },
    generateTitle: true,
  },
});

Supported vector stores: LibSQLVector, PgVector (HNSW/IVFFlat, bit, sparsevec), Pinecone, Upstash, Qdrant, Chroma, MongoDB, Astra, OpenSearch, S3Vectors, TurboPuffer, Lance, Cloudflare, Couchbase.

1.4 Workflow System

createWorkflow() / createStep() deliver durable execution: automatic snapshots at each suspend(), state serialized to JSON in storage, resume cross-process via runId. Tables mastra_workflow_snapshot, mastra_traces, mastra_messages are created automatically.
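The snapshot-and-resume idea can be sketched without any framework. The example below is a toy model under stated assumptions, not Mastra's storage layer: `snapshotTable`, `startRun`, `resumeRun`, and the 1000 approval threshold are all hypothetical. It only illustrates that state is serialized to JSON at the suspend point and rebuilt from storage, by runId, when the run resumes.

```typescript
// Illustrative only: a toy model of suspend/resume via JSON snapshots keyed by runId.
type RunResult =
  | { status: 'suspended'; runId: string }
  | { status: 'success'; output: string };

interface Snapshot {
  runId: string;
  amount: number;
}

// Stands in for a storage table; values are JSON strings so they could survive a restart.
const snapshotTable = new Map<string, string>();

function startRun(runId: string, amount: number): RunResult {
  // Hypothetical rule: small amounts need no human approval.
  if (amount <= 1000) {
    return { status: 'success', output: `auto-approved ${amount}` };
  }
  const snapshot: Snapshot = { runId, amount };
  snapshotTable.set(runId, JSON.stringify(snapshot)); // persist before pausing
  return { status: 'suspended', runId };
}

function resumeRun(runId: string, approver: string): RunResult {
  const raw = snapshotTable.get(runId);
  if (raw === undefined) throw new Error(`No snapshot for run ${runId}`);
  const snapshot = JSON.parse(raw) as Snapshot; // state is rebuilt from storage only
  snapshotTable.delete(runId);
  return { status: 'success', output: `${snapshot.amount} approved by ${approver}` };
}

const started = startRun('run-1', 5000);
const finished = resumeRun('run-1', 'mgr@acme');
```

Because `resumeRun` reads nothing but the stored snapshot, it could run in a different process from `startRun`, which is the property the runId-based resume relies on.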

Flow control primitives: .then() (sequential), .parallel([]) (fan-out/fan-in), .branch([[cond, step]]) (router), .foreach(step, {concurrency}) (MapReduce), .dountil()/.dowhile() (loops), .map() (transform). Retry configurable at workflow level and step level.
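To make the `{concurrency}` option of `.foreach()` concrete, here is a framework-free sketch of a bounded-concurrency map. It is an assumption about the general semantics (at most N items in flight, results in input order), not Mastra's actual scheduler; `mapWithConcurrency` is a hypothetical helper.

```typescript
// Illustrative only: run `worker` over `items` with at most `concurrency` in flight,
// returning results in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError('concurrency must be a positive integer');
  }
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane claims the next unprocessed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const lanes = Array.from({ length: Math.min(concurrency, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

A call such as `mapWithConcurrency(urls, 3, fetchOne)` would keep three requests in flight at a time, which is the behavior `.foreach(step, { concurrency: 3 })` is described as providing.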

TS
import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const approvalStep = createStep({
  id: 'approval',
  inputSchema:   z.object({ amount: z.number(), needsApproval: z.boolean() }),
  outputSchema:  z.object({ approved: z.boolean(), message: z.string() }),
  suspendSchema: z.object({ reason: z.string(), amount: z.number() }),
  resumeSchema:  z.object({ approved: z.boolean(), approver: z.string() }),
  execute: async ({ inputData, resumeData, suspend, bail }) => {
    if (!inputData.needsApproval) return { approved: true, message: 'Auto' };
    if (resumeData?.approved === false) {
      return bail({ approved: false, message: 'Rejected' });
    }
    if (resumeData?.approved === undefined) {
      return await suspend({ reason: 'Human approval required', amount: inputData.amount });
    }
    return { approved: true, message: `Approved by ${resumeData.approver}` };
  },
});

// analyzePurchase and executePayment are steps defined elsewhere
export const paymentWorkflow = createWorkflow({
  id: 'payment-workflow',
  inputSchema:  z.object({ amount: z.number(), userId: z.string() }),
  outputSchema: z.object({ approved: z.boolean(), message: z.string() }),
  retryConfig:  { attempts: 5, delay: 2000 },
})
  .then(analyzePurchase)
  .then(approvalStep)
  .then(executePayment)
  .commit();

// Transparent suspension
const run = await paymentWorkflow.createRunAsync();
const result = await run.start({ inputData: { amount: 5000, userId: 'u1' } });
if (result.status === 'suspended') {
  // save the runId in a queue and notify the approver
}

// Resumption (same or another process, by runId; runId is the value saved above)
const resumed = await paymentWorkflow.createRunAsync({ runId });
await resumed.resume({ resumeData: { approved: true, approver: 'mgr@acme' } });

The return value of run.start() is a discriminated union on status ('success' | 'failed' | 'suspended' | 'tripwire'), which ensures typed narrowing.
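A short sketch of what that narrowing buys you. Only the four status literals come from the text; the payload fields on each variant (`result`, `error`, `suspended`, `reason`) and the `describeRun` helper are hypothetical stand-ins for whatever the real result type carries.

```typescript
// Hypothetical result shape: only the four status literals come from the guide.
type WorkflowRunResult =
  | { status: 'success'; result: { approved: boolean; message: string } }
  | { status: 'failed'; error: Error }
  | { status: 'suspended'; suspended: string[] }
  | { status: 'tripwire'; reason: string };

function describeRun(run: WorkflowRunResult): string {
  switch (run.status) {
    case 'success':
      return `done: ${run.result.message}`; // `result` only exists on this variant
    case 'failed':
      return `failed: ${run.error.message}`;
    case 'suspended':
      return `waiting on: ${run.suspended.join(', ')}`;
    case 'tripwire':
      return `blocked: ${run.reason}`;
    default: {
      // If a new status is added to the union, this assignment stops compiling.
      const unreachable: never = run;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch turns a forgotten status into a compile error rather than a silent fall-through.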

1.5 Multi-agent orchestration

Three approaches available:

  1. Agent-as-tool (static supervisor): sub-agent wrapped in createTool(). Deterministic coordination, predictable flow.

  2. agent.network() (dynamic routing): an Agent with agents, workflows, and tools registered; the LLM decides which primitive to call. Requires memory (persists task history and detects completion). Supports suspension with agent-execution-approval / tool-execution-approval.

  3. Multi-agent workflows: steps invoking mastra.getAgent(...).

Important deprecation (2026): the AgentNetwork class was deprecated. Use agent.network() or explicit supervisor.

TS
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';

export const routingAgent = new Agent({
  id: 'routing-agent',
  instructions: 'Network of researchers and writers.',
  model: 'openai/gpt-5.4',
  agents:    { researchAgent, writingAgent },
  workflows: { cityWorkflow },
  tools:     { weatherTool },
  memory:    new Memory({ storage: new LibSQLStore({ url: 'file:./mastra.db' }) }),
});

const result = await routingAgent.network('Weather in Tokyo and a suggested activity.');
for await (const chunk of result) {
  if (chunk.type === 'network-execution-event-step-finish') console.log(chunk.payload.result);
}

1.6 LLMs via Vercel AI SDK

Mastra v1 delegates model routing to the Vercel AI SDK (v1/v2/v3 compatible). There are four ways to specify models:

TS
// (a) Model Router string (recommended)
model: 'openai/gpt-5.4'
model: 'anthropic/claude-4-5-sonnet'
model: 'google/gemini-2.5-pro'

// (b) Direct SDK instance (when you need strict typing)
import { openai } from '@ai-sdk/openai';
model: openai('gpt-4o')

// (c) Automatic cross-provider fallbacks
model: [
  { model: 'openai/gpt-5',                maxRetries: 3 },
  { model: 'anthropic/claude-4-5-sonnet', maxRetries: 2 },
  { model: 'google/gemini-2.5-pro',       maxRetries: 2 },
]

// (d) Dynamic per request
model: ({ requestContext }) =>
  requestContext.task === 'complex' ? 'anthropic/claude-4-5-sonnet' : 'openai/gpt-5-mini'
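The fallback array in option (c) can be read as "try each entry in order, retrying each a bounded number of times". The sketch below spells out that reading in plain TypeScript. It is an assumption about the semantics rather than Mastra's router: `callWithFallbacks` is hypothetical, and treating `maxRetries` as retries on top of one initial attempt is a guess.

```typescript
// Illustrative only: try each model in order, retrying each up to maxRetries times.
interface FallbackEntry {
  model: string;
  maxRetries: number;
}

async function callWithFallbacks<T>(
  chain: readonly FallbackEntry[],
  call: (model: string) => Promise<T>,
): Promise<{ model: string; value: T }> {
  let lastError: unknown = new Error('empty fallback chain');
  for (const entry of chain) {
    // One initial attempt plus maxRetries retries (an assumption about the semantics).
    for (let attempt = 0; attempt <= entry.maxRetries; attempt++) {
      try {
        return { model: entry.model, value: await call(entry.model) };
      } catch (err) {
        lastError = err; // remember the failure and move on
      }
    }
  }
  throw lastError;
}
```

A production version would usually add backoff between attempts and skip retries for non-retryable errors such as authentication failures.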

1.7 MCP (Model Context Protocol)

@mastra/mcp implements client and server. Transports stdio, SSE, and Streamable HTTP.

TS
// Consume external MCP servers
import { Agent } from '@mastra/core/agent';
import { MCPClient } from '@mastra/mcp';

export const mcp = new MCPClient({
  id: 'main-mcp',
  servers: {
    filesystem: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] },
    github: {
      url: new URL('https://api.githubcopilot.com/mcp/'),
      requestInit: { headers: { Authorization: `Bearer ${process.env.GH_PAT}` } },
    },
  },
});

export const researchAgent = new Agent({
  id: 'research',
  model: 'openai/gpt-4o',
  tools: await mcp.getTools(),      // static
  // or dynamic per request: const toolsets = await mcp.listToolsets();
});

// Expose Mastra as an MCP server
import { MCPServer } from '@mastra/mcp';

const server = new MCPServer({
  id: 'my-mcp-server',
  name: 'My MCP Server',
  version: '1.0.0',
  description: 'Exposes Mastra via MCP.',
  tools:     { weatherTool },
  agents:    { weatherAgent },       // generates the tool ask_weatherAgent
  workflows: { cityWorkflow },       // generates the tool run_cityWorkflow
});

server.startStdio();

2. Integration with Next.js App Router

2.1 Monorepo vs separate service

| Criterion | Monorepo (embedded Mastra) | Separate service (mastra dev + @mastra/client-js) |
| --- | --- | --- |
| Deploy | Single (vercel deploy) | Two domains, CORS, cross-origin auth |
| Agent↔UI latency | Zero internal network | +1 HTTP hop |
| AI scale vs SSR | Coupled | Independent |
| Workflows >5 min | Hard (maxDuration) | Natural (VM/container) |
| Multiple clients (web + mobile) | Frontend-centric | Reusable backend |
| Vercel Hobby | Viable with caution | Not recommended |
| MVP/prototype | Recommended | Overkill |

2.2 Directory structure (monorepo)

TEXT
my-nextjs-agent/
├── src/
│   ├── app/
│   │   ├── api/chat/route.ts        # Route Handler (streaming)
│   │   ├── chat/page.tsx            # client UI
│   │   ├── actions/weather.ts       # Server Actions
│   │   └── layout.tsx
│   ├── mastra/
│   │   ├── index.ts                 # new Mastra({...})
│   │   ├── agents/weather-agent.ts
│   │   ├── tools/weather-tool.ts
│   │   ├── workflows/
│   │   └── memory.ts
│   └── lib/schemas.ts               # shared Zod schemas
├── next.config.ts                   # serverExternalPackages: ['@mastra/*']
└── .env.local                       # OPENAI_API_KEY, DATABASE_URL

Required configuration:

TS
// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  // keeps the bundler from packaging native binaries
  serverExternalPackages: ['@mastra/*'],
};

export default nextConfig;

Known gotcha (vercel/next.js#74816): in some versions serverExternalPackages works in dev but fails in build. Fallback via Webpack: config.externals.push('@mastra/core', '@mastra/libsql').
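A sketch of that Webpack fallback follows. The `WebpackConfigLike` and `WebpackContextLike` types are simplified local stand-ins for Next.js' real webpack types so the snippet stays self-contained, and gating on `isServer` is an assumption (the externals only matter for the server bundle). In a real `next.config.ts` the object would be the default export.

```typescript
// Simplified stand-ins for the config/context objects Next.js passes to `webpack`.
interface WebpackConfigLike {
  externals: unknown[];
}
interface WebpackContextLike {
  isServer: boolean;
}

// Packages to leave as runtime imports instead of bundling (list taken from the guide).
const mastraExternals = ['@mastra/core', '@mastra/libsql'];

const nextConfig = {
  serverExternalPackages: ['@mastra/*'],
  webpack(config: WebpackConfigLike, { isServer }: WebpackContextLike): WebpackConfigLike {
    // Assumption: only the server bundle needs these packages externalized.
    if (isServer) config.externals.push(...mastraExternals);
    return config;
  },
};
```

Keeping both `serverExternalPackages` and the `webpack` hook is harmless when the former works, and covers the build-time failure when it does not.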

2.3 Server Actions invoking agents

Ideal for non-streaming synchronous operations (form submit, single generation). Keeps API keys on the server, integrates with Next.js cache/revalidation.

TS
// src/app/actions/weather.ts
'use server';

import { z } from 'zod';
import { mastra } from '@/mastra';
import { revalidatePath } from 'next/cache';

const WeatherInput = z.object({
  city: z.string().min(1).max(100),
  units: z.enum(['metric', 'imperial']).default('metric'),
});

export type WeatherState =
  | { status: 'idle' }
  | { status: 'success'; text: string; toolCalls: unknown[] }
  | { status: 'error'; message: string; fieldErrors?: Record<string, string[]> };

export async function getWeather(_prev: WeatherState, formData: FormData): Promise<WeatherState> {
  const parsed = WeatherInput.safeParse({
    city: formData.get('city'),
    units: formData.get('units') ?? 'metric',
  });
  if (!parsed.success) {
    return {
      status: 'error',
      message: 'Invalid input',
      fieldErrors: parsed.error.flatten().fieldErrors,
    };
  }
  try {
    const result = await mastra.getAgent('weatherAgent').generate(
      `Weather in ${parsed.data.city}? Units: ${parsed.data.units}.`,
      { memory: { thread: 'weather-thread', resource: 'public' } },
    );
    revalidatePath('/weather');
    return { status: 'success', text: result.text, toolCalls: result.toolCalls ?? [] };
  } catch (err) {
    console.error('[getWeather]', err);
    return { status: 'error', message: 'Failed to query the agent.' };
  }
}

Client consumption with useActionState:

TSX
'use client';

import { useActionState } from 'react';
import { getWeather, type WeatherState } from '@/app/actions/weather';

export default function WeatherPage() {
  const [state, formAction, pending] = useActionState(getWeather, { status: 'idle' } as WeatherState);
  return (
    <form action={formAction}>
      <input name="city" required />
      <select name="units">
        <option value="metric">°C</option>
        <option value="imperial">°F</option>
      </select>
      <button disabled={pending}>{pending ? 'Loading...' : 'Get weather'}</button>
      {state.status === 'success' && <pre>{state.text}</pre>}
      {state.status === 'error' && <p>{state.message}</p>}
    </form>
  );
}

Important limitations: Server Actions don't stream — the client waits for the full response. Subject to maxDuration from the platform. For streaming, use Route Handler + useChat.

2.4 Route Handlers with streaming

Modern Mastra 1.0 pattern: @mastra/ai-sdk + handleChatStream().

TS
// src/app/api/chat/route.ts
import { handleChatStream } from '@mastra/ai-sdk';
import { toAISdkV5Messages } from '@mastra/ai-sdk/ui';
import { createUIMessageStreamResponse } from 'ai';
import { NextResponse } from 'next/server';
import { mastra } from '@/mastra';

export const maxDuration = 60;        // default is 300 with Fluid; up to 800 on Pro
export const runtime = 'nodejs';      // REQUIRED: Mastra does not support Edge

export async function POST(req: Request) {
  const params = await req.json();
  const stream = await handleChatStream({
    mastra,
    agentId: 'weatherAgent',
    params: {
      ...params,
      memory: { thread: params.threadId ?? 'default', resource: params.resourceId ?? 'anon' },
    },
  });
  return createUIMessageStreamResponse({ stream });
}

// Hydrates history on mount
export async function GET() {
  const memory = await mastra.getAgentById('weatherAgent').getMemory();
  const res = await memory?.recall({ threadId: 'default', resourceId: 'anon' });
  return NextResponse.json(toAISdkV5Messages(res?.messages ?? []));
}

Low-level alternative (full control):

TS
import { mastra } from '@/mastra';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const stream = await mastra.getAgent('weatherAgent').stream(messages, {
    format: 'aisdk',                  // AI SDK v5 compat
    memory: { thread: 'demo', resource: 'user-1' },
    abortSignal: req.signal,          // propagates cancellation down to the LLM
  });
  // or .toDataStreamResponse() (v4), .toTextStreamResponse()
  return stream.toUIMessageStreamResponse();
}

Separate service (Next.js proxy to standalone Mastra on :4111):

TS
import { MastraClient } from '@mastra/client-js';

const client = new MastraClient({
  baseUrl: process.env.MASTRA_API_URL ?? 'http://localhost:4111',
  retries: 3,
  backoffMs: 300,
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const response = await client.getAgent('weatherAgent').stream({
    messages,
    threadId: 'demo',
    resourceId: 'user-1',
  });
  return new Response(response.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'X-Accel-Buffering': 'no',
    },
  });
}

2.5 Chat UI with useChat and AI Elements

TSX
'use client';

import { useEffect, useState } from 'react';
import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport, type ToolUIPart } from 'ai';
// Skeleton, ToolCard and ErrorCard are app-level components (not shown)

export default function ChatPage() {
  const [input, setInput] = useState('');
  const { messages, setMessages, sendMessage, stop, status } = useChat({
    transport: new DefaultChatTransport({
      api: '/api/chat',
      // With Mastra Memory: send ONLY the last message plus the identifiers
      prepareSendMessagesRequest({ messages, body }) {
        return {
          body: {
            ...body,
            messages: [messages[messages.length - 1]],
            threadId: 'default-thread',
            resourceId: 'user-1',
          },
        };
      },
    }),
  });

  useEffect(() => {
    fetch('/api/chat').then(r => r.json()).then(setMessages).catch(() => {});
  }, [setMessages]);

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {m.parts?.map((part, i) => {
            if (part.type === 'text')      return <p key={i}>{part.text}</p>;
            if (part.type === 'reasoning') return <details key={i}><summary>Thinking</summary>{part.text}</details>;
            if (part.type?.startsWith('tool-')) {
              const p = part as ToolUIPart;
              // States: 'input-streaming' → 'input-available' → 'output-available' | 'output-error'
              switch (p.state) {
                case 'input-available':  return <Skeleton key={i} />;
                case 'output-available': return <ToolCard key={i} output={p.output} />;
                case 'output-error':     return <ErrorCard key={i} text={p.errorText} />;
              }
            }
            return null;
          })}
        </div>
      ))}
      <input value={input} onChange={e => setInput(e.target.value)} />
      <button onClick={() => { sendMessage({ text: input }); setInput(''); }}>Send</button>
      {status === 'streaming' && <button onClick={stop}>⏹ Stop</button>}
    </div>
  );
}

2.6 Deploy: limits, storages and runtimes

Vercel maxDuration (Apr/2026):

| Plan | Default | Max. with Fluid | Max. without Fluid |
| --- | --- | --- | --- |
| Hobby | 300s | 300s | 60s |
| Pro | 300s | 800s | 300s |
| Enterprise | 300s | 900s | 900s |

Fluid Compute (enabled by default since Apr/2025) allows concurrency on the same instance, active CPU pricing, and streams continue past 300s if the first byte is sent within ~25s.
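If routes are generated or configured programmatically, the limits can be encoded once and used to clamp a requested duration. The values below are copied from the table above and should be checked against current Vercel documentation; `MAX_DURATION` and `clampMaxDuration` are hypothetical names for this sketch.

```typescript
// Limits in seconds, copied from the guide's table; verify before relying on them.
type Plan = 'hobby' | 'pro' | 'enterprise';

const MAX_DURATION: Record<Plan, { withFluid: number; withoutFluid: number }> = {
  hobby:      { withFluid: 300, withoutFluid: 60 },
  pro:        { withFluid: 800, withoutFluid: 300 },
  enterprise: { withFluid: 900, withoutFluid: 900 },
};

// Returns the requested duration, clamped to [1, plan limit].
function clampMaxDuration(plan: Plan, fluid: boolean, requested: number): number {
  const limit = fluid ? MAX_DURATION[plan].withFluid : MAX_DURATION[plan].withoutFluid;
  return Math.min(Math.max(1, Math.floor(requested)), limit);
}
```

For example, asking for 1200 seconds on Pro with Fluid Compute yields the plan ceiling rather than a value that would be rejected at deploy time.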

Critical storage in serverless: LibSQLStore with file:./mastra.db DOES NOT work on ephemeral FS (Vercel/Lambda). Use:

TS
// Turso (remote LibSQL)
new LibSQLStore({ url: process.env.TURSO_URL!, authToken: process.env.TURSO_TOKEN })

// Postgres (Neon, Supabase, Vercel Postgres)
new PostgresStore({ connectionString: process.env.DATABASE_URL! })

// Upstash Redis
new UpstashStore({ url: process.env.UPSTASH_URL!, token: process.env.UPSTASH_TOKEN! })
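One way to catch the ephemeral-filesystem mistake early is a small guard that refuses a `file:` URL when the process is running on a serverless host. This is a sketch: `isServerless` and `resolveStorageUrl` are hypothetical helpers, and detection relies on the `VERCEL` and `AWS_LAMBDA_FUNCTION_NAME` environment variables that those platforms set.

```typescript
type Env = Record<string, string | undefined>;

// Vercel sets VERCEL; AWS Lambda sets AWS_LAMBDA_FUNCTION_NAME.
function isServerless(env: Env): boolean {
  return Boolean(env['VERCEL'] || env['AWS_LAMBDA_FUNCTION_NAME']);
}

// Picks the storage URL and fails fast when a local file would be used on an ephemeral FS.
function resolveStorageUrl(env: Env): string {
  const url = env['TURSO_URL'] ?? 'file:./mastra.db';
  if (isServerless(env) && url.startsWith('file:')) {
    throw new Error('file: storage does not persist on an ephemeral filesystem; set TURSO_URL');
  }
  return url;
}
```

Calling `resolveStorageUrl(process.env)` at module load would turn a silent data-loss problem into an immediate startup error.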

Runtime: always export const runtime = 'nodejs' on routes that import Mastra. Edge runtime fails due to native dependencies (libsql, better-sqlite3, fs/crypto bindings).

VPS/Docker (mastra build):

BASH
npx mastra build --dir src/mastra    # generates .mastra/output/ (Hono bundle)
node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs

DOCKERFILE
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY src ./src
COPY tsconfig.json ./
RUN npx mastra build

FROM node:22-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && adduser -S mastra -u 1001
COPY --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output
USER mastra
EXPOSE 4111
HEALTHCHECK --interval=30s CMD wget -qO- http://localhost:4111/api/health || exit 1
CMD ["node", "--import=./.mastra/output/instrumentation.mjs", ".mastra/output/index.mjs"]

VercelDeployer publishes Mastra standalone as a Vercel function (no Next in front):

TS
import { Mastra } from '@mastra/core';
import { VercelDeployer } from '@mastra/deployer-vercel';

export const mastra = new Mastra({
  deployer: new VercelDeployer({
    studio: true,
    maxDuration: 600,
    memory: 1536,
    regions: ['gru1', 'iad1'],
  }),
});

3. End-to-end type-safety with Zod

3.1 Zod as a quadruple contract

A Zod schema fulfills four simultaneous roles:

| Role | Mechanism | Moment |
| --- | --- | --- |
| Static contract | z.infer<typeof schema> | compile-time |
| Runtime validation | .parse() / .safeParse() | post-LLM |
| Specification for the LLM | JSON Schema (via zodSchema() from AI SDK) | pre-request |
| Semantic documentation | .describe() read by the model | pre-request |

Critical rule: .describe() directly impacts the quality of structured output — it's "prompt engineering via types". Always describe ambiguous fields.

3.2 Idiomatic patterns for LLMs

Use .nullable() instead of .optional() — OpenAI strict mode and GPT-5 reject optional() in structured output (mastra-ai/mastra#7234):

TS
// ❌ Breaks in GPT-5 strict mode
const bad = z.object({ details: z.string().optional() });

// ✅ Correct
const good = z.object({ details: z.string().nullable().describe('null if absent') });
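The reason shows up at the JSON Schema level: OpenAI's strict structured-output mode expects every property to be listed in `required` and `additionalProperties` to be false, so "absent" has to be expressed as a null value. The sketch below checks that rule on a minimal schema subset. `ObjectSchema` and `strictModeProblems` are hypothetical, and the two sample schemas only approximate what a Zod-to-JSON-Schema conversion emits.

```typescript
// Minimal JSON Schema subset, enough to show the strict-mode rule.
interface ObjectSchema {
  type: 'object';
  properties: Record<string, { type: string | string[] }>;
  required: string[];
  additionalProperties: boolean;
}

// Lists the reasons a schema would be rejected under the strict-mode rule described above.
function strictModeProblems(schema: ObjectSchema): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(schema.properties)) {
    if (!schema.required.includes(key)) problems.push(`"${key}" is not required`);
  }
  if (schema.additionalProperties !== false) problems.push('additionalProperties must be false');
  return problems;
}

// Roughly what an optional field produces: the key is left out of `required`.
const fromOptional: ObjectSchema = {
  type: 'object',
  properties: { details: { type: 'string' } },
  required: [],
  additionalProperties: false,
};

// Roughly what a nullable field produces: required, but allowed to be null.
const fromNullable: ObjectSchema = {
  type: 'object',
  properties: { details: { type: ['string', 'null'] } },
  required: ['details'],
  additionalProperties: false,
};
```

Running the check on both shows the optional variant failing on the missing `required` entry while the nullable variant passes.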

Discriminated unions are the standard for agent actions (ReAct, tool-routing):

TS
export const AgentActionSchema = z.discriminatedUnion('type', [
  z.object({ type: z.literal('search'), query: z.string() }),
  z.object({ type: z.literal('answer'), text: z.string(), confidence: z.number().min(0).max(1) }),
  z.object({ type: z.literal('escalate'), reason: z.string(), severity: z.enum(['low', 'medium', 'high']) }),
]);

3.3 Zod v3 vs v4 — impacts on AI pipelines

| Aspect | v3 | v4 | Impact |
| --- | --- | --- | --- |
| Parse strings/arrays | baseline | 14×/7× faster (JIT) | Almost free streaming validation |
| Compile TS | baseline | ~10× faster | Monorepos with many schemas |
| Bundle core | baseline | 2.3× smaller | Important on edge |
| z.record() | 1 arg | 2 required args | Breaks migration |
| .optional().default() | default ignored if missing | always returns default | Careful in working memory |
| Schema creation | fast | 17× slower (JIT) | Don't instantiate in hot loops |
| .describe()/.meta() | any position | must be last chain call | Doesn't inherit via .optional()/.extend() |

Mastra ≥ beta.16 normalizes both via Standard Schema; Zod coexists via zod/v3 and zod/v4.

3.4 Typed Tools with createTool

TS
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const githubRepoTool = createTool({
  id: 'get-github-repo-info',
  description: 'Fetch basic insights for a public GitHub repository',
  inputSchema: z.object({
    owner: z.string().describe('GitHub username or organization'),
    repo:  z.string().describe('Repository name'),
  }),
  outputSchema: z.object({
    stars:   z.number(),
    forks:   z.number(),
    issues:  z.number(),
    license: z.string().nullable(),
    lastPush: z.string(),
    description: z.string().nullable(),
  }),
  execute: async ({ context, runtimeContext }) => {
    //              ^ { owner: string; repo: string } inferred
    const res = await fetch(`https://api.github.com/repos/${context.owner}/${context.repo}`);
    if (res.status === 404) throw new Error(`Not found`);
    const d = await res.json();
    // A mismatch with outputSchema is an error at both compile time and runtime
    return {
      stars: d.stargazers_count,
      forks: d.forks_count,
      issues: d.open_issues_count,
      license: d.license?.name ?? null,
      lastPush: d.pushed_at,
      description: d.description,
    };
  },
});

The return value of tool.execute() is a discriminated union that includes an error path; narrow it with if ('error' in result && result.error). Avoid naming an outputSchema field error, since it collides with the discriminator.
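A sketch of that narrowing, wrapped in a reusable type guard. The shapes here are hypothetical: `RepoInfo` and `ToolFailure` stand in for a tool's success output and for whatever error variant Mastra's types actually define.

```typescript
// Hypothetical shapes; the real union comes from the framework's types.
interface RepoInfo {
  stars: number;
  forks: number;
}
interface ToolFailure {
  error: true;
  message: string;
}
type ToolResult = RepoInfo | ToolFailure;

// Type guard: the `in` check narrows the union, the equality check confirms the flag.
function isToolFailure(result: ToolResult): result is ToolFailure {
  return 'error' in result && result.error === true;
}

function describeResult(result: ToolResult): string {
  if (isToolFailure(result)) {
    return `tool failed: ${result.message}`;
  }
  // Here the compiler knows `result` is RepoInfo.
  return `${result.stars} stars, ${result.forks} forks`;
}
```

This also shows why an output field named `error` is a problem: a successful result carrying that key would be misclassified by the `in` check.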

Typed RuntimeContext (⚠️ known bug — .get() doesn't infer; use cast):

TS
export type SupportCtx = {
  'user-tier': 'free' | 'pro' | 'enterprise';
  language: 'en' | 'pt-BR';
};

execute: async ({ runtimeContext }) => {
  const tier = runtimeContext.get('user-tier') as SupportCtx['user-tier'];
  const limit = tier === 'enterprise' ? 100 : tier === 'pro' ? 25 : 5;
  ...
}

3.5 Structured Outputs

Mastra v1 API:

TS
const result = await agent.generate('Who won 2012?', {
  structuredOutput: {
    schema: ElectionResultSchema,
    errorStrategy: 'fallback',                  // 'strict' | 'warn' | 'fallback'
    fallbackValue: { winner: 'Unknown', year: 0, party: 'Other' },
    jsonPromptInjection: true,                  // required for Gemini 2.5 with tools
  },
});

result.object.winner;  // string, fully typed

Partial object streaming:

TS
const stream = await agent.stream('...', { structuredOutput: { schema } });

for await (const partial of stream.objectStream) {
  // DeepPartial<T>: fields arrive incrementally
}

const final = await stream.object;   // validated T

Pure AI SDK (modern way with Output.object(), since generateObject is deprecated):

TS
import { generateText, Output, tool, stepCountIs } from 'ai';

const { output } = await generateText({
  model: 'openai/gpt-5.2',
  tools: {
    weather: tool({
      inputSchema: z.object({ location: z.string() }),
      execute: async () => ({...}),
    }),
  },
  output: Output.object({ schema: RecipeSchema }),
  stopWhen: stepCountIs(5),
  prompt: '...',
});

Error handling:

TS
import { generateObject, NoObjectGeneratedError } from 'ai';
import { z } from 'zod';

try {
  const { object } = await generateObject({ ... });
  return object;
} catch (err) {
  // The class is exported as NoObjectGeneratedError; 'AI_NoObjectGeneratedError' is its error name
  if (NoObjectGeneratedError.isInstance(err)) console.error('No object', err.text, err.cause);
  if (err instanceof z.ZodError) console.error('Zod failed', err.issues);
  throw err;
}

3.6 Workflows with typed steps

.then(step) only compiles if step.inputSchema is compatible with the outputSchema of the previous step, so the compiler enforces the pipeline shape.

TS
const scrapeStep = createStep({
  id: 'scrape',
  inputSchema:  z.object({ url: z.string().url() }),
  outputSchema: z.object({ url: z.string().url(), markdown: z.string() }),
  execute: async ({ inputData }) => ({
    url: inputData.url,
    markdown: await fetch(inputData.url).then(r => r.text()),
  }),
});

const summarizeStep = createStep({
  id: 'summarize',
  inputSchema: scrapeStep.outputSchema,
  outputSchema: z.object({
    library: z.string(),
    latestVersion: z.string(),
    breakingChanges: z.array(z.string()),
  }),
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('summarizer').generate(inputData.markdown, {
      structuredOutput: {
        schema: z.object({
          library: z.string(),
          latestVersion: z.string(),
          breakingChanges: z.array(z.string()),
        }),
      },
    });
    return res.object;
  },
});

export const changelogWorkflow = createWorkflow({
  id: 'changelog',
  inputSchema: z.object({ url: z.string().url() }),
  outputSchema: summarizeStep.outputSchema,
})
  .then(scrapeStep)
  .then(summarizeStep)
  .commit();
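How a builder can make incompatible steps fail at compile time is easiest to see in a stripped-down form. The sketch below is not Mastra's implementation: `Step`, `Pipeline`, `pipeline`, and `andThen` (a stand-in for `.then()`) are hypothetical, and plain TypeScript types replace Zod schemas.

```typescript
// A step is a typed function with an id.
interface Step<I, O> {
  id: string;
  run: (input: I) => O;
}

// The builder carries the overall input type I and the current output type O.
interface Pipeline<I, O> {
  andThen<N>(next: Step<O, N>): Pipeline<I, N>;
  run(input: I): O;
}

function pipeline<I, O>(first: Step<I, O>): Pipeline<I, O> {
  return {
    // `next` must accept the current output type O, otherwise the call does not type-check.
    andThen<N>(next: Step<O, N>): Pipeline<I, N> {
      return pipeline<I, N>({
        id: `${first.id}>${next.id}`,
        run: (input: I) => next.run(first.run(input)),
      });
    },
    run: (input: I) => first.run(input),
  };
}

const scrape: Step<{ url: string }, { url: string; markdown: string }> = {
  id: 'scrape',
  run: ({ url }) => ({ url, markdown: `# page at ${url}` }),
};

const summarize: Step<{ markdown: string }, { length: number }> = {
  id: 'summarize',
  run: ({ markdown }) => ({ length: markdown.length }),
};

const changelog = pipeline(scrape).andThen(summarize);
// pipeline(summarize).andThen(scrape) would not compile: { length: number } has no `url`.
const summary = changelog.run({ url: 'https://example.com' });
```

The accumulated generic parameter is what lets the compiler reject a reordered chain before anything runs.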

Runtime context validation via requestContextSchema:

TS
const workflow = createWorkflow({
  id: 'tiered',
  inputSchema,
  outputSchema,
  requestContextSchema: z.object({
    userTier: z.enum(['free', 'pro', 'enterprise']),
    locale: z.string(),
  }),
});

3.7 Where type-safety lives

Layer

Tool

Compile

Runtime

Sent to LLM

HTTP/Form input

safeParse in Server Action

Tool input

createTool({ inputSchema })

Tool output

createTool({ outputSchema })

⚠️ informative

Structured output

generate({ structuredOutput })

Workflow step

createStep({ inputSchema, outputSchema })

Runtime context

RuntimeContext<T>

✅ on set; ⚠️ on get

⚠️ optional

Memory

workingMemory: { schema }


4. Design patterns for agents and workflows

4.1 ReAct: implicit vs explicit

| Approach | Who decides | Implementation | When to use |
| --- | --- | --- | --- |
| Implicit (agent loop) | LLM, via native tool calling | Agent with tools; Mastra runs loop automatically | Open-ended tasks, unknown N of steps |
| Explicit (workflow) | Orchestrator code | createWorkflow + .dountil() calling step with agent | Auditability, SLA, hard limits, HITL in the loop |

Recommendation: always start with implicit ReAct. Only migrate to explicit when you need granular trace, step budget, human approval mid-flight, or A/B testing by action type.

TS
// Implicit: the agent loop already implements ReAct
export const researchAgent = new Agent({
  instructions: `Follow ReAct loop: THOUGHT → ACTION (call one tool) → OBSERVATION.
                 Repeat until confident. Never invent facts.`,
  model: 'openai/gpt-4o',
  tools: { searchDocsTool },
});

await researchAgent.generate('What is the difference between suspend() and bail()?', { maxSteps: 8 });

Reusable ReAct schema:

TS
export const ReActStepSchema = z.object({
  thought: z.string().describe('Reasoning about next step'),
  action: z.discriminatedUnion('type', [
    z.object({ type: z.literal('tool_call'), toolName: z.string(), args: z.record(z.string(), z.unknown()) }),
    z.object({ type: z.literal('final_answer'), answer: z.string(), confidence: z.number().min(0).max(1) }),
  ]),
});

4.2 Plan-and-Execute

Separates a Planner (a powerful LLM, e.g., Claude Sonnet/GPT-5) that produces the plan upfront from an Executor (cheap LLMs specialized in tool use). ~30% fewer tokens than ReAct in complex multi-step tasks (LangChain 2026 benchmark).

TS
const plannerAgent = new Agent({
  id: 'planner',
  instructions: 'Break goal into 3-7 concrete, tool-executable steps.',
  model: 'anthropic/claude-sonnet-4',
});

const executorAgent = new Agent({
  id: 'executor',
  instructions: 'Execute a single plan step. Use tools. Return terse result.',
  model: 'openai/gpt-4o-mini',
  tools: { /* ... */ },
});

const planStep = createStep({
  id: 'plan',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({
    goal: z.string(),
    plan: z.array(z.object({
      id: z.string(),
      description: z.string(),
      dependsOn: z.array(z.string()).default([]),
    })),
  }),
  execute: async ({ inputData, mastra }) => {
    const res = await mastra.getAgent('plannerAgent').generate(`Goal: ${inputData.goal}`, {
      structuredOutput: { schema: z.object({ plan: z.array(/* ... */) }) },
    });
    return { goal: inputData.goal, plan: res.object.plan };
  },
});

// executeStep is assumed to be defined elsewhere; it runs a single plan item
export const planAndExecuteWorkflow = createWorkflow({
  id: 'plan-exec',
  inputSchema: z.object({ goal: z.string() }),
  outputSchema: z.object({ results: z.array(z.any()) }),
})
  .then(planStep)
  .map(async ({ inputData }) => inputData.plan)
  .foreach(executeStep, { concurrency: 3 })
  .commit();
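The plan items carry a `dependsOn` list, but a plain `.foreach()` with concurrency does not look at it. If ordering matters, one option is to group the plan into dependency "waves" first and run each wave in turn. The helper below is a hypothetical sketch of that grouping (`PlanStep` mirrors the plan schema; `toWaves` is not part of Mastra).

```typescript
// Mirrors the shape of one plan item from the planner's output schema.
interface PlanStep {
  id: string;
  description: string;
  dependsOn: string[];
}

// Groups step ids into waves: every step appears after all of its dependencies.
function toWaves(plan: readonly PlanStep[]): string[][] {
  const done = new Set<string>();
  let pending = [...plan];
  const waves: string[][] = [];

  while (pending.length > 0) {
    // A step is ready once every dependency finished in an earlier wave.
    const ready = pending.filter((step) => step.dependsOn.every((dep) => done.has(dep)));
    if (ready.length === 0) {
      throw new Error('Plan has a cycle or references an unknown step id');
    }
    waves.push(ready.map((step) => step.id));
    for (const step of ready) done.add(step.id);
    pending = pending.filter((step) => !done.has(step.id));
  }
  return waves;
}
```

Steps inside one wave are independent of each other and can run concurrently; waves run sequentially. Because LLM-produced plans can contain cycles or dangling ids, the explicit error matters more here than it would for hand-written plans.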

4.3 Orchestration patterns matrix

| Pattern | Mastra API | When to use |
| --- | --- | --- |
| Pipeline | .then() | Fixed steps, linear dependency (ETL, content pipeline) |
| Fan-out/Fan-in | .parallel([]) | N fixed independent tasks; output is an object keyed by step id |
| MapReduce | .foreach(step, {concurrency}) | Dynamic N; processes a list |
| Router/Branch | .branch([[cond, step]]) | Routing by classification; all branches share schemas |
| Static supervisor | Agent with sub-agents as tools | Deterministic coordination |
| Dynamic supervisor | agent.network() | LLM decides which primitive to call |
| Evaluator-Optimizer | .dowhile() / .dountil() + scorer | Convergent iterative refinement |
| Human-in-the-loop | suspend() / resume() / bail() | Approvals, payment >$X, irreversible actions |
| Handoff | workflow + agents with shared memory | Specialist takes control |
| Council | .parallel() + synthesis step | Multiple opinions for synthesis |
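To make the MapReduce row concrete, the sketch below shows the general idea behind processing a dynamic list with a concurrency cap, in plain TypeScript. It is a conceptual illustration only: mapWithConcurrency is a hypothetical helper, and nothing here describes how Mastra implements .foreach internally.

```typescript
// Runs `worker` over `items` with at most `concurrency` calls in flight,
// preserving input order in the returned array.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new RangeError('concurrency must be a positive integer');
  }
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

The cap matters for agent workloads because each item usually triggers at least one LLM or tool call, so unbounded fan-out tends to hit provider rate limits before it hits CPU limits.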

Concrete Evaluator-Optimizer:

TS
workflow
  .then(generateDraft)
  .dowhile(
    createStep({
      id: 'eval-refine',
      execute: async ({ inputData, state }) => {
        const scorer = createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' });
        const { score } = await scorer.run({ input: state.prompt, output: inputData.draft });
        if (score >= 0.85) return { ...inputData, done: true, score };
        const refined = await refinerAgent.generate(/* with feedback */);
        return { ...inputData, draft: refined.text, done: false, score };
      },
    }),
    async ({ inputData }) => !inputData.done,
  )
  .commit();
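The loop above stops only when the score reaches the threshold, so a draft that never converges would keep consuming tokens. A minimal sketch of the same evaluate/refine cycle with an explicit iteration cap follows, written as a framework-independent function. The names refineUntil and RefineResult and the maxIterations option are assumptions introduced for illustration, not Mastra APIs.

```typescript
interface RefineResult<T> {
  value: T;
  score: number;
  iterations: number;
  converged: boolean;
}

// Evaluates a candidate, refines it while the score is below the threshold,
// and gives up after `maxIterations` refinements.
async function refineUntil<T>(
  initial: T,
  evaluate: (candidate: T) => Promise<number>,
  refine: (candidate: T, score: number) => Promise<T>,
  options: { threshold: number; maxIterations: number },
): Promise<RefineResult<T>> {
  let value = initial;
  let score = await evaluate(value);
  let iterations = 0;

  while (score < options.threshold && iterations < options.maxIterations) {
    value = await refine(value, score);
    score = await evaluate(value);
    iterations++;
  }
  return { value, score, iterations, converged: score >= options.threshold };
}
```

In a workflow, the same guard could be expressed by carrying an attempt counter in the step output and adding it to the loop condition. The converged flag lets the caller decide whether to ship the best effort or escalate to a human.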

4.4 External tools integration

REST API:

TS
export const githubIssueTool = createTool({
  id: 'github-create-issue',
  inputSchema: z.object({
    repo: z.string().regex(/^[\w-]+\/[\w-]+$/),
    title: z.string().min(1).max(256),
    body: z.string().max(65_536).optional(),
    labels: z.array(z.string()).max(100).default([]),
  }),
  outputSchema: z.object({ number: z.number(), url: z.string().url() }),
  execute: async ({ context, tracingContext }) => {
    const span = tracingContext?.currentSpan?.startSpan('github.api.call');
    try {
      const res = await fetch(`https://api.github.com/repos/${context.repo}/issues`, {
        method: 'POST',
        headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({ title: context.title, body: context.body, labels: context.labels }),
        signal: AbortSignal.timeout(10_000),
      });
      if (!res.ok) throw new Error(`GitHub ${res.status}: ${await res.text()}`);
      const data = await res.json();
      return { number: data.number, url: data.html_url };
    } finally {
      span?.end();
    }
  },
});

Database (Drizzle):

TS
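import { eq } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/node-postgres';
import { users } from './schema'; // assumed location of the Drizzle table definition

// Created once at module scope so each tool call reuses the same connection pool.
const db = drizzle(process.env.DATABASE_URL!);

export const getUserTool = createTool({
  id: 'get-user',
  inputSchema: z.object({ userId: z.string().uuid() }),
  outputSchema: z.object({
    id: z.string(),
    email: z.string(),
    plan: z.enum(['free', 'pro', 'enterprise']),
  }).nullable(),
  execute: async ({ context }) => {
    const [row] = await db.select().from(users).where(eq(users.id, context.userId)).limit(1);
    return row ?? null;
  },
});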

Error handling in tools:

| Strategy | When | Example |
| --- | --- | --- |
| Throw | Unrecoverable error | Auth failure, timeout after retries |
| Structured return | LLM must react/retry | { success: false, error: { code, message } } in union |
| Internal retry | Transient failure | p-retry inside the execute |
| Circuit breaker | Unstable API | opossum, opens after N failures |
| Timeout | Prevent stuck agent | AbortSignal.timeout(ms) |

TS
// "Structured return" pattern: works best inside the ReAct loop
outputSchema: z.union([
  z.object({ success: z.literal(true),  data: z.object({ /* ... */ }) }),
  z.object({ success: z.literal(false), error: z.object({ code: z.string(), message: z.string() }) }),
]),
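The strategies in the table combine naturally: retry transient failures internally, bound each attempt with a timeout, and only then surface a structured failure the LLM can react to. The sketch below shows that combination in plain TypeScript with a hand-rolled backoff loop (it does not use p-retry). ToolResult, callWithRetry and the RETRIES_EXHAUSTED code are hypothetical names chosen for illustration.

```typescript
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: { code: string; message: string } };

// Retries `operation` with exponential backoff and a per-attempt timeout,
// returning a structured result instead of throwing.
async function callWithRetry<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  options: { retries: number; timeoutMs: number; baseDelayMs?: number },
): Promise<ToolResult<T>> {
  const baseDelayMs = options.baseDelayMs ?? 100;
  let lastError: unknown;

  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      // A fresh signal per attempt; the operation must pass it on (e.g. to fetch) for the timeout to apply.
      const data = await operation(AbortSignal.timeout(options.timeoutMs));
      return { success: true, data };
    } catch (err) {
      lastError = err;
      if (attempt < options.retries) {
        // Backoff doubles on every failed attempt: base, 2x base, 4x base, ...
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  const message = lastError instanceof Error ? lastError.message : String(lastError);
  return { success: false, error: { code: 'RETRIES_EXHAUSTED', message } };
}
```

Inside a tool's execute, the operation would typically be something like (signal) => fetch(url, { signal }), and the returned object matches the union schema shown above, so the agent sees a machine-readable error code rather than a stack trace.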

4.5 Observability

Structured logging (PinoLogger):

TS
import { Mastra } from '@mastra/core';
import { PinoLogger } from '@mastra/loggers';

export const mastra = new Mastra({
  logger: new PinoLogger({
    name: 'Mastra',
    level: process.env.LOG_LEVEL ?? 'info',
    // getCurrentTraceId() is an application-level helper that is not shown here.
    mixin() {
      return { traceId: getCurrentTraceId(), service: 'ai-api', env: process.env.NODE_ENV };
    },
  }),
});

Native AI Tracing with OTel + multiple exporters:

TS
import { Mastra } from '@mastra/core';
import { DefaultExporter } from '@mastra/observability';
import { LangfuseExporter } from '@mastra/langfuse';

export const mastra = new Mastra({
  observability: {
    default: { enabled: true },
    configs: {
      langfuse: {
        serviceName: 'prod-agents',
        exporters: [new LangfuseExporter({
          publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
          secretKey: process.env.LANGFUSE_SECRET_KEY!,
          baseUrl: process.env.LANGFUSE_BASE_URL,
        })],
        sampling: { type: 'ratio', probability: 0.1 }, // 10% in prod
      },
      debug: { exporters: [new DefaultExporter()], sampling: { type: 'always' } },
    },
    // Requests flagged with supportMode get full tracing; everything else is sampled.
    configSelector: (ctx) => ctx.runtimeContext?.get('supportMode') ? 'debug' : 'langfuse',
  },
});

Platform comparison:

| Platform | Strong in | Weak in | When to choose |
| --- | --- | --- | --- |
| Langfuse | LLM-native (prompts, cost, evals). Self-host. | Generic infra tracing | Prompt engineering, cost per feature, evals |
| Braintrust | Production evals, A/B side-by-side | Less rich tracing | Teams focused on regression testing |
| LangSmith | LangChain integration, datasets | Vendor lock-in | Stack already LangChain/LangGraph |
| SigNoz/Datadog (OTel) | Full-stack APM | Not LLM-first | Unified APM (not just AI) |
| Mastra Studio + DuckDB | Built-in, zero setup, cost/latency | Local/single-node | Local dev, small teams |

Evals / Scorers (Mastra 2026 — replaces legacy evals):

Scorers run asynchronously after the response is returned, through a four-stage pipeline: preprocess → analyze → generateScore → generateReason. Built-in scorers: answer-relevancy, answer-similarity, faithfulness, hallucination, completeness, tool-call-accuracy, trajectory-accuracy, bias, toxicity, prompt-alignment.

TS
import { Agent } from '@mastra/core/agent';
import { createAnswerRelevancyScorer, createToxicityScorer, createHallucinationScorer } from '@mastra/evals/scorers/llm';

export const supportAgent = new Agent({
  id: 'support',
  model: 'openai/gpt-4o',
  scorers: {
    // Quality signal: score 20% of responses.
    relevancy:     { scorer: createAnswerRelevancyScorer({ model: 'openai/gpt-4o-mini' }), sampling: { type: 'ratio', rate: 0.2 } },
    // Safety signals: score every response.
    hallucination: { scorer: createHallucinationScorer({ model: 'openai/gpt-4o-mini' }),   sampling: { type: 'ratio', rate: 1.0 } },
    toxicity:      { scorer: createToxicityScorer({ model: 'openai/gpt-4o-mini' }),        sampling: { type: 'ratio', rate: 1.0 } },
  },
});
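To build intuition for what a ratio rate means, the sketch below implements a deterministic variant of ratio sampling in plain TypeScript: the decision is derived from a hash of a key (for example a thread id), so the same conversation is always either scored or skipped. This is an illustration under stated assumptions and does not describe how Mastra implements sampling; fnv1a and shouldSample are hypothetical helper names.

```typescript
// 32-bit FNV-1a hash of a string, returned as an unsigned integer.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Returns true when `key` falls inside the sampled fraction `rate` (0..1).
// The same key always yields the same decision for a given rate.
function shouldSample(key: string, rate: number): boolean {
  if (rate <= 0) return false;
  if (rate >= 1) return true;
  // Map the hash onto [0, 1) and compare it with the rate.
  return fnv1a(key) / 0x100000000 < rate;
}
```

A useful property of this approach is that raising the rate only adds keys to the sample: anything scored at 5% is still scored at 20%, which keeps dashboards comparable when the rate is tuned.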

CI/CD with Vitest:

TS
import { describe, it, expect } from 'vitest';
import { runEvals } from '@mastra/core/evals';

// supportAgent, relevancyScorer and hasSourcesScorer are assumed to be imported from the application code.
describe('Support Agent', () => {
  it('meets quality thresholds', async () => {
    const result = await runEvals({
      target: supportAgent,
      data: [{ input: 'How to cancel?', groundTruth: 'cancellation policy' }],
      scorers: [relevancyScorer, hasSourcesScorer],
      concurrency: 3,
    });
    expect(result.scores['answer-relevancy']).toBeGreaterThanOrEqual(0.8);
  });
});

5. Consolidated architectural blueprint

TL;DR for architects:

  1. Start simple. Single agent + tools. Workflow/multi-agent only when steps are known or context becomes unmanageable.

  2. Workflows for auditability/SLO; agent.network() for flexibility. Workflows = code dictates flow. Networks = LLM dictates flow.

  3. Zod everywhere. Every tool inputSchema/outputSchema, every step, every scorer. It's your only defense against hallucination in tool args.

  4. Persistence from day 1. Postgres (prod) or LibSQL (dev). Without it, suspend/resume doesn't work and you lose traces on restart.

  5. serverExternalPackages: ['@mastra/*'] + runtime = 'nodejs'. Non-negotiable in Next.js.

  6. Route Handlers for streaming; Server Actions for sync. Don't try to stream via Server Action.

  7. Vercel Fluid Compute + remote Postgres/Turso for serverless production. Never local LibSQL.

  8. Observability by environment via configSelector: dev→Default, staging→10% Langfuse, prod→1% + Datadog.

  9. Scorers with low sampling in prod (5-20%), 100% in toxicity/safety.

  10. MCP before reimplementing tools. GitHub, Slack, Notion, filesystem, Playwright already have official servers.

  11. HITL via suspend() whenever cost/irreversibility > convenience. Payment >$X, deletion, bulk send.

  12. Prefer agent.network() or explicit supervisor over AgentNetwork class (deprecated).

  13. Plan-and-Execute > ReAct for tasks >5 steps. Powerful planner + cheap executors saves 20-30% tokens.

Known critical pitfalls

  • .optional() breaks OpenAI/GPT-5 strict mode → use .nullable() with .describe() (mastra-ai/mastra#7234).

  • Gemini 2.5 + tools + structured output → always jsonPromptInjection: true.

  • z.record() in Zod v4 needs 2 required args.

  • Field named error in outputSchema breaks tool.execute() narrowing.

  • RuntimeContext.get() doesn't infer — manual cast needed.

  • .describe()/.meta() must be last chain call (doesn't inherit via .optional()/.extend()).

  • tool() helper from AI SDK is mandatory for inference; createTool from Mastra doesn't suffer from this.

  • generateObject deprecated → migrate to generateText({ output: Output.object(...) }).

  • Zod v4 creates schemas 17× slower (JIT) — never instantiate in hot render/loop.

  • toDataStreamResponse() + output: zodSchema conflicts (mastra-ai/mastra#5544) — use experimental_output.

  • serverExternalPackages has build issue (vercel/next.js#74816) — keep Webpack fallback.

  • AgentNetwork class → deprecated; use agent.network().

  • legacy_workflows → replaced by createWorkflow/createStep.
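For the RuntimeContext.get() pitfall listed above, one option is to confine the manual cast to a single typed wrapper rather than repeating it at every call site. The sketch below works against any object exposing get/set; UntypedContext and TypedContext are hypothetical names, and a plain Map stands in for the real context object.

```typescript
// Minimal stand-in for an untyped key/value context (not Mastra's class).
interface UntypedContext {
  get(key: string): unknown;
  set(key: string, value: unknown): void;
}

// Wraps an untyped context so keys and value types are checked at compile time.
class TypedContext<Shape extends Record<string, unknown>> {
  private readonly inner: UntypedContext;

  constructor(inner: UntypedContext) {
    this.inner = inner;
  }

  get<K extends keyof Shape & string>(key: K): Shape[K] | undefined {
    // The only cast in the codebase lives here.
    return this.inner.get(key) as Shape[K] | undefined;
  }

  set<K extends keyof Shape & string>(key: K, value: Shape[K]): void {
    this.inner.set(key, value);
  }
}

// Usage: the shape is declared once and enforced on every access.
type AppContext = { userTier: 'free' | 'pro'; supportMode: boolean };
const ctx = new TypedContext<AppContext>(new Map<string, unknown>());
ctx.set('userTier', 'pro');
const tier = ctx.get('userTier'); // typed as 'free' | 'pro' | undefined
```

The cast is still unchecked at runtime, so values that cross a trust boundary (headers, request bodies) should be validated with a schema before they are stored.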

Reference stack for production

TYPESCRIPT
┌──────────────────────────────────────────────────────────────┐
│  Next.js App Router (Node runtime)                           │
│  ├─ Route Handlers (streaming, useChat)                      │
│  └─ Server Actions (synchronous, forms)                      │
├──────────────────────────────────────────────────────────────┤
│  Mastra (embedded or standalone)                             │
│  ├─ Agents (implicit ReAct) + agent.network()                │
│  ├─ Workflows (.then/.parallel/.branch/.foreach/.dountil)    │
│  ├─ Tools (createTool + Zod) + MCP (client + server)         │
│  └─ Memory (threads/resources, semanticRecall, workingMemory)│
├──────────────────────────────────────────────────────────────┤
│  Storage: MastraCompositeStore                               │
│  ├─ memory: LibSQL (dev) / Postgres (prod)                   │
│  ├─ workflows: Postgres (persistent snapshots)               │
│  ├─ scores: Postgres                                         │
│  └─ vectors: pgvector / Pinecone / Upstash                   │
├──────────────────────────────────────────────────────────────┤
│  LLM: Vercel AI SDK v5/v6                                    │
│  ├─ openai/gpt-5.x, anthropic/claude-4-5-sonnet,             │
│  │  google/gemini-2.5-pro                                    │
│  └─ Automatic cross-provider fallbacks                       │
├──────────────────────────────────────────────────────────────┤
│  Observability                                               │
│  ├─ PinoLogger (structured)                                  │
│  ├─ OTel tracing → Langfuse/Braintrust/SigNoz                │
│  └─ Scorers async (relevancy, hallucination, toxicity)       │
├──────────────────────────────────────────────────────────────┤
│  Deploy                                                      │
│  ├─ Vercel (Fluid Compute, maxDuration: 800s Pro)            │
│  ├─ VPS (mastra build → node + PM2)                          │
│  └─ Docker (multi-stage, Alpine, healthcheck)                │
└──────────────────────────────────────────────────────────────┘

Conclusion

The Mastra 1.x + Next.js 15 + AI SDK v5/v6 ecosystem is, as of April 2026, the most cohesive and type-safe approach for building AI agents in TypeScript, surpassing LangChain/LangGraph.js in ergonomics, DX and native integration with the JavaScript runtime. The three architectural decisions that most affect scale and maintainability are: (1) choosing between Mastra embedded in Next.js (MVP, single frontend) and standalone (independent scaling, multiple clients); (2) migrating from local LibSQL to remote Postgres on day zero in serverless (without this, suspend/resume and traces are illusions); and (3) investing in Zod as a fourfold contract (compile-time, runtime, prompt to the LLM, semantic documentation) from the very first tool.

The counter-intuitive insight is that the biggest quality gain comes not from the most powerful model but from the granularity of the Zod schemas: well-written .describe() calls on outputSchema fields are prompt engineering in disguise, and .nullable() instead of .optional() eliminates entire classes of failures in OpenAI strict mode. Combined with durable workflows (suspend/resume/bail), agent.network() for dynamic routing, MCP for interop without reimplementation, and continuous scorers with stratified sampling, the stack delivers auditable, resilient and observable agents, all of which are non-negotiable requirements in production.

The framework's immediate roadmap (post-1.25) focuses on AI SDK v6 (native ToolLoopAgent), consolidation of MastraCompositeStore, expansion of providers in the Model Router and maturation of Agent Networks as the definitive replacement for the deprecated class. For architects deciding today: adopting Mastra 1.x is safe for production, with two caveats. The framework evolves quickly, so check the official changelog weekly for deprecations, and keep canonical snippets referenced against node_modules/@mastra/*/dist/docs/ or https://mastra.ai/llms.txt instead of dated blog posts.