Lazy Loading: load less to deliver more
The web has never been so fast, yet never swallowed so many megabytes. The way out isn't speeding up delivery, but deferring what doesn't matter now.

There is a paradox at the heart of the modern web: we have never had such fast connections, and pages have never been so heavy. The average weight of a web page surpassed 2 MB years ago and continues to grow, driven by high-resolution images, ever-larger JavaScript bundles, custom fonts, embedded videos, and third-party scripts. Software engineering's response to this problem was not merely to compress more or transmit faster — it was to question a fundamental premise: why load everything at once?
Lazy loading (or on-demand loading) is the materialization of that question. Instead of downloading, processing, and rendering all of an application's resources at the initial moment, lazy loading defers the loading of each resource until the instant it becomes necessary — when the user scrolls the page to an image, navigates to a route, opens a modal, or interacts with a component. What began as a niche optimization for photo galleries has become an architectural pattern present in virtually all modern frameworks, browsers, and platforms.
This article examines lazy loading in depth: its technical fundamentals, its relationship with the performance metrics that govern the current web, its pitfalls, and the trends that are redefining its limits.
Part 1 — The problem: the real cost of loading everything
The critical rendering path
When a browser loads a page, it traverses what is called the critical rendering path: it downloads the HTML, discovers the referenced resources (CSS, JavaScript, images, fonts), downloads them, builds the DOM tree and the CSSOM tree, executes scripts, calculates the layout, and finally paints the pixels on the screen. Each resource added to this path has three distinct costs:
Network cost — bytes transferred, which consume bandwidth, time, and, on mobile data plans, the user's money.
Processing cost — JavaScript needs to be parsed, compiled, and executed; images need to be decoded. On entry-level mobile devices, parsing 1 MB of JavaScript can take more than one second of blocked CPU.
Memory cost — each decoded image, each mounted component, and each registered listener occupies RAM, a scarce resource on modest devices.
The central insight of lazy loading is that a large part of these costs is wasted. Telemetry studies consistently show that a significant fraction of images loaded on web pages never enters the viewport — the user simply doesn't scroll to them. The same holds for code: most sessions in a SPA (Single Page Application) visit only two or three routes, but the traditional bundle loads the code for all of them.
The tyranny of the first impression
The problem is aggravated by the fact that the first seconds of loading are disproportionately important. User behavior research shows that the probability of abandonment rises sharply with each additional second of waiting. Google formalized this reality in the Core Web Vitals, metrics that directly influence search ranking:
LCP (Largest Contentful Paint) — how long it takes for the largest visible element to be rendered. Target: below 2.5 seconds.
INP (Interaction to Next Paint) — how responsive the page is to user interactions. Target: below 200 ms.
CLS (Cumulative Layout Shift) — how much the layout "shifts" during loading. Target: below 0.1.
Lazy loading directly affects all three. By removing non-critical resources from the initial path, it frees up bandwidth and CPU for the content that matters (improves LCP); by reducing the JavaScript executed during loading, it frees the main thread (improves INP); and, when poorly implemented, it can cause layout shifts (worsens CLS) — one of the pitfalls we'll see ahead.
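The three targets listed above can be written down as a small check. This is only a sketch: it assumes the metric values were already collected by some other means (a RUM script, for example), and the names TARGETS and meetsTargets are hypothetical.

```javascript
// Targets quoted in the list above: LCP and INP in milliseconds, CLS unitless.
const TARGETS = { lcp: 2500, inp: 200, cls: 0.1 };

// Returns, per metric, whether the measured value is at or under its target.
// `measured` is an assumed shape such as { lcp: 1800, inp: 250, cls: 0.05 }.
function meetsTargets(measured) {
  const result = {};
  for (const [name, limit] of Object.entries(TARGETS)) {
    result[name] = measured[name] <= limit;
  }
  return result;
}
```

A page with an LCP of 1.8 s, an INP of 250 ms, and a CLS of 0.05 would pass on LCP and CLS and fail on INP.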
Part 2 — Anatomy of lazy loading
Lazy loading of images: from scroll listener to native attribute
The history of image lazy loading is a good summary of the evolution of the web platform. In the first generation (2000s and early 2010s), developers used scroll listeners that manually calculated the position of each image relative to the viewport. The approach worked but was costly: each scroll event triggered synchronous layout calculations (getBoundingClientRect), a pattern known as layout thrashing.
Figure: from manual position calculation to native and automated browser delegation.
First generation: listeners calculated offsets on every scroll event, forcing synchronous layout calculations and causing jank.
Second generation: asynchronous delegation to the browser to detect viewport entry, with a rootMargin that anticipates the download.
Third generation: the loading="lazy" attribute eliminates scripts. The browser applies adaptive distance heuristics by connection type.
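For contrast with what came later, the first-generation approach can be sketched roughly as follows. The function names, the 200px margin, and the data-src convention are assumptions chosen for illustration, not a reconstruction of any particular library.

```javascript
// Pure visibility check: is the element within `margin` pixels of the viewport?
// `rect` has the shape returned by element.getBoundingClientRect().
function isNearViewport(rect, viewportHeight, margin = 0) {
  return rect.top <= viewportHeight + margin && rect.bottom >= -margin;
}

// First-generation wiring (browser only): every scroll event re-measures every pending image.
function attachScrollLazyLoader(images) {
  const pending = new Set(images);
  const onScroll = () => {
    for (const img of pending) {
      // getBoundingClientRect() can force a synchronous layout, once per image per event.
      if (isNearViewport(img.getBoundingClientRect(), window.innerHeight, 200)) {
        img.src = img.dataset.src;
        pending.delete(img);
      }
    }
    if (pending.size === 0) window.removeEventListener("scroll", onScroll);
  };
  window.addEventListener("scroll", onScroll);
  onScroll(); // load whatever is already near the viewport
}
```

The cost is visible in the loop: the work grows with the number of pending images and repeats on every scroll event.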
The second generation arrived with the Intersection Observer API, which delegates to the browser the task of observing when an element enters the viewport, asynchronously and efficiently:
```javascript
const observer = new IntersectionObserver((entries) => {
  entries.forEach((entry) => {
    if (entry.isIntersecting) {
      const img = entry.target;
      img.src = img.dataset.src; // swap the placeholder for the real image
      observer.unobserve(img);   // stop observing once loaded
    }
  });
}, {
  rootMargin: "200px 0px" // start loading 200px before the image enters the screen
});

document.querySelectorAll("img[data-src]").forEach((img) => observer.observe(img));
```

The rootMargin parameter is the soul of this technique: it creates a margin of anticipation, starting the download before the user reaches the image, so that the content is ready by the time it scrolls into view. It is the balance between laziness and anticipation that defines good lazy loading.
The third generation eliminated JavaScript completely. Since 2019/2020, browsers have supported native lazy loading:
```html
<img src="foto.webp" alt="Photo description" loading="lazy" width="800" height="600">
<iframe src="https://exemplo.com/video" loading="lazy" title="Video"></iframe>
```

A single attribute, zero dependencies, and distance heuristics managed by the browser itself (which considers the user's connection type). Note the width and height attributes: they allow the browser to reserve the image's space before loading, preventing layout shifts. It is a small detail with a large impact on CLS.
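Where older browsers still matter, a common pattern is to detect support for the attribute and fall back to a script-based approach otherwise. A minimal sketch, assuming the usual detection idiom of looking for a loading property on the image prototype; the function name is hypothetical.

```javascript
// Returns true when the given image prototype exposes the `loading` property,
// which is how browsers signal support for loading="lazy".
// The default argument resolves to undefined outside a browser.
function supportsNativeLazyLoading(
  imgProto = globalThis.HTMLImageElement && globalThis.HTMLImageElement.prototype
) {
  return Boolean(imgProto) && "loading" in imgProto;
}
```

When the function returns false, a page could fall back to an Intersection Observer over img[data-src] elements.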
Lazy loading of code: code splitting and dynamic import
If images were the first target, JavaScript became the most important. ECMAScript introduced dynamic import(), which returns a Promise and instructs the bundler (Webpack, Vite, Rollup, esbuild) to split the module into an independent chunk, downloaded only when the function is called:
```javascript
// Instead of importing at the top of the file (immediate loading):
// import { gerarRelatorioPDF } from "./relatorios";

// Import only when the user clicks the button:
botaoExportar.addEventListener("click", async () => {
  const { gerarRelatorioPDF } = await import("./relatorios");
  gerarRelatorioPDF(dados);
});
```

Frameworks built elegant abstractions on top of this mechanism. In React:
```jsx
import { lazy, Suspense } from "react";

// The Dashboard component is only downloaded the first time it is rendered
const Dashboard = lazy(() => import("./pages/Dashboard"));

// TelaDeCarregamento is a loading-screen component assumed to be defined elsewhere
function App() {
  return (
    <Suspense fallback={<TelaDeCarregamento />}>
      <Dashboard />
    </Suspense>
  );
}
```

In Vue, defineAsyncComponent plays the same role; in Angular, routes with loadChildren and loadComponent perform route-based code splitting declaratively. The dominant pattern is route-based splitting: each route of the application becomes a chunk, and users download only the code of the pages they actually visit.
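Outside any framework, route-based splitting comes down to a table of loaders that each run at most once. The sketch below is an illustration under that assumption; lazyOnce and createRouteLoader are hypothetical names, and in a real application each loader would typically be a function wrapping import().

```javascript
// Wraps a loader so it runs at most once. Later calls receive the same promise,
// whether it is still pending or already settled.
function lazyOnce(loader) {
  let promise = null;
  return () => {
    if (promise === null) promise = loader();
    return promise;
  };
}

// Builds a lookup function from a route table of the form { path: loader }.
function createRouteLoader(routes) {
  const loaders = new Map(
    Object.entries(routes).map(([path, loader]) => [path, lazyOnce(loader)])
  );
  return (path) => {
    const load = loaders.get(path);
    if (!load) return Promise.reject(new Error(`Unknown route: ${path}`));
    return load();
  };
}
```

A route table might look like createRouteLoader({ "/dashboard": () => import("./pages/Dashboard") }). Note that this sketch also caches a rejected promise; a production version would probably reset the cache after a failed download.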
Lazy loading of data: pagination, infinite scroll, and virtualization
The same philosophy applies to data. Rather than fetching a thousand records from an API, a page of twenty is fetched and the rest is loaded as the user advances — whether through explicit pagination, or through infinite scroll (frequently implemented with the same Intersection Observer, observing a sentinel element at the end of the list).
Virtualization (with libraries like TanStack Virtual or react-window) takes the concept to the extreme: even data already loaded is only rendered when visible. A list of ten thousand items keeps in the DOM only the few dozen that fit on the screen, recycling elements as the user scrolls. It is lazy loading applied not to the network, but to rendering.
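The core of virtualization is a calculation of which rows intersect the viewport. The sketch below assumes fixed-height rows, which is a simplification; it is not how TanStack Virtual or react-window are implemented, and the function name and the overscan default are illustrative.

```javascript
// Computes the inclusive index range of rows to keep in the DOM.
// Assumes every row has the same height; `overscan` adds a few extra rows
// on each side so fast scrolling does not reveal blank space.
function visibleRange({ scrollTop, viewportHeight, itemHeight, itemCount, overscan = 3 }) {
  const first = Math.floor(scrollTop / itemHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / itemHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount - 1, last + overscan),
  };
}
```

With ten thousand rows of 50px in a 600px viewport scrolled to 1000px, the range is rows 17 to 34: eighteen elements in the DOM instead of ten thousand.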
Lazy loading in other layers
The pattern permeates the entire stack. In ORMs like SQLAlchemy and Hibernate, relationships are loaded on demand by default (with the famous risk of the N+1 problem, when a loop fires a query per item). In programming languages, lazy generators and iterators (Python, Haskell, and Java streams) compute values only when consumed. In operating systems, demand paging loads memory pages from executables only when they are accessed. Lazy loading is not merely a front-end technique: it is an engineering principle, which is to defer work until it is demonstrably necessary.
Figure: lazy loading across the stack. Engineering principle: defer work until it is demonstrably necessary.
ORMs: relationships are loaded on demand, with the risk of N+1 queries when iterated in loops.
Languages: lazy generators and iterators compute values only at the moment of consumption.
Operating systems: demand paging brings in memory pages from executables only when they are accessed.
Virtualization: data already loaded is rendered only when visible in the viewport, recycling nodes on scroll.
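The same principle of computing values only when they are consumed exists in JavaScript through generator functions. A minimal sketch, with hypothetical helper names:

```javascript
// Infinite source: yields 1, 2, 3, ... but only as far as the consumer asks.
function* naturals() {
  let n = 1;
  while (true) yield n++;
}

// Lazy map: fn runs only for values that are actually pulled.
function* lazyMap(iterable, fn) {
  for (const value of iterable) yield fn(value);
}

// Pulls at most `count` values, then stops the underlying iterator.
function take(iterable, count) {
  const out = [];
  if (count <= 0) return out;
  for (const value of iterable) {
    out.push(value);
    if (out.length >= count) break;
  }
  return out;
}
```

Calling take(lazyMap(naturals(), (n) => n * n), 3) produces the first three squares and invokes the mapping function exactly three times, even though the source is infinite.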
Part 3 — The pitfalls: when laziness costs dearly
No technique is free, and poorly applied lazy loading can degrade exactly the metrics it intends to improve.
The cardinal error: lazy loading critical content
The most common and most harmful trap is applying loading="lazy" to the hero's main image — precisely the element that is usually the page's LCP. The browser, instructed to deprioritize the image, delays its download, and the LCP worsens drastically. The rule is crystal clear: above-the-fold content must never be lazy. For the LCP element, the correct approach is the opposite — prioritize it aggressively:
```html
<img src="hero.webp" alt="Main banner" fetchpriority="high" width="1200" height="600">
<link rel="preload" as="image" href="hero.webp">
```

Layout shifts and the importance of reserving space
When a lazy image loads and pushes content down, the user who was reading loses their position — and CLS spikes. Prevention is simple and mandatory: always declare dimensions (width/height or aspect-ratio in CSS) so that space is reserved before loading. The same applies to lazy components: the fallback of Suspense must have dimensions close to those of the final component (hence the popularity of skeleton screens).
Loading cascades (waterfalls)
Chained lazy loading creates serial waits: the route loads, which then discovers it needs a component, which then discovers it needs data. Each link adds a network round trip. Modern frameworks attack this problem with intelligent preloading: loading the next route when the user hovers over a link (as Remix does with prefetch="intent"), when the link enters the viewport (the approach of Google's quicklink), or in parallel with data (React Router loaders, for example).
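Preloading on hover can be sketched in a few lines without any framework. This is an illustration rather than the mechanism Remix or quicklink actually use; prefetchOnIntent is a hypothetical name, and the loader is assumed to be a function that wraps import().

```javascript
// Starts downloading a chunk the first time the user shows intent (hover or focus).
// `target` is anything with addEventListener, typically a link element.
// `loader` is assumed to return a promise, for example () => import("./some-route").
function prefetchOnIntent(target, loader) {
  let started = false;
  const start = () => {
    if (started) return;
    started = true;
    // If the download fails, allow a later hover to try again.
    Promise.resolve(loader()).catch(() => { started = false; });
  };
  target.addEventListener("mouseenter", start);
  target.addEventListener("focus", start);
}
```

Because modules are cached once loaded, the import() issued later by the actual navigation reuses the chunk that the hover already fetched.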
SEO and content discovery
Content loaded only after interaction may be invisible to crawlers. Googlebot renders JavaScript and processes Intersection Observer, but content behind clicks or custom scroll events may not be indexed. The recommendation is to keep content essential for SEO in the initial HTML (via SSR or SSG) and reserve lazy loading for what is complementary.
The cost of the transition
Lazy loading trades initial load time for interaction latency. If the checkout modal chunk takes two seconds to download when the user clicks "Buy", the optimization turned into a loss at the most critical moment of the funnel. The answer is a hierarchy of priorities: critical conversion paths deserve preload; what is peripheral can wait.
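One way to give a critical path that head start is to warm its chunk once the browser is idle, after the initial load has settled. A minimal sketch, assuming a loader that wraps import(); preloadWhenIdle and its schedule parameter are hypothetical, and the fallback to setTimeout covers browsers without requestIdleCallback.

```javascript
// Schedules a chunk download for a moment when the main thread is idle.
// `schedule` can be injected; by default it uses requestIdleCallback when
// available and a short timeout otherwise.
function preloadWhenIdle(loader, schedule) {
  const run = schedule || (
    typeof requestIdleCallback === "function"
      ? (cb) => requestIdleCallback(cb)
      : (cb) => setTimeout(cb, 1)
  );
  run(() => {
    // Errors are ignored here; the click handler's own import() is where they should surface.
    Promise.resolve(loader()).catch(() => {});
  });
}
```

For the checkout example, calling preloadWhenIdle(() => import("./checkout-modal")) shortly after page load (the module path is illustrative) means the click on "Buy" would usually find the chunk already downloaded.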
Part 4 — Trends: where on-demand loading is heading
1. From lazy loading to "lazy execution": resumability and Islands
The current frontier is no longer downloading less, but executing less. The cost of hydration — re-executing on the client all the JavaScript of a server-rendered page — has become the new bottleneck of SPAs. Two architectures tackle this problem:
The Islands Architecture (popularized by Astro) treats the page as static HTML with "islands" of interactivity. Each island is hydrated independently and can be lazy: client:visible hydrates the component only when it enters the viewport, client:idle when the main thread is idle. It's lazy loading applied to interactivity.
Resumability (the central concept of Qwik) goes further: it serializes the application state into HTML and eliminates hydration entirely. Each event handler is a tiny chunk, downloaded only when the event occurs. It's lazy loading taken to its logical conclusion, with granularity at the function level rather than the component or route level.
2. React Server Components and the shifting of code to the server
React Server Components (adopted by the Next.js App Router) represent another answer: components that run exclusively on the server send only the rendered result to the client, and their code is never downloaded. Instead of deferring JavaScript loading, the need for it is eliminated. Client-side lazy loading then applies only to the genuinely interactive fraction of the application.
3. Lazy loading driven by real data and machine learning
The heuristic "load when within 200px of the viewport" is static. The trend is to make it adaptive: libraries such as quicklink already factor in connection quality (via the Network Information API) and the user's data saver mode. Emerging research and tools use real browsing analytics to predict which routes a user will likely visit from the current page and prefetch them probabilistically. Loading ceases to be lazy or eager and becomes predictive.
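A connection-aware guard of this kind can be sketched as a small predicate. It is not quicklink's actual implementation; the function name and the choice of which connection types count as slow are assumptions. The saveData and effectiveType properties come from the Network Information API, which is not available in every browser.

```javascript
// Decides whether speculative prefetching is appropriate for this connection.
// `connection` is expected to be navigator.connection, which may be undefined.
function shouldPrefetch(connection) {
  if (!connection) return true;           // no signal available: do not block prefetch
  if (connection.saveData) return false;  // the user asked to save data
  const slowTypes = ["slow-2g", "2g"];    // assumed cutoff for "too slow to speculate"
  return !slowTypes.includes(connection.effectiveType);
}
```

In a browser the call would be shouldPrefetch(navigator.connection), placed in front of any prefetch logic.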
4. The platform absorbing the pattern
The history of image lazy loading, from a hack with scroll listeners to a native attribute, tends to repeat itself in other layers. Features such as content-visibility: auto in CSS (which defers rendering of off-screen sections without any JavaScript), on-demand loading of compression dictionaries, and the Speculation Rules API for declarative prefetch and prerender indicate the direction: the browser increasingly assumes responsibility for intelligent resource orchestration, and the developer shifts to declaring intentions rather than implementing mechanisms.
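Speculation rules are declared as JSON inside a script element of type speculationrules. The sketch below builds that JSON and injects it when the browser reports support; the helper names are hypothetical, and the URL list is assumed to come from the application.

```javascript
// Builds the JSON body for a <script type="speculationrules"> element.
// `action` is "prefetch" or "prerender"; `urls` is a list chosen by the caller.
function buildSpeculationRules(action, urls) {
  if (action !== "prefetch" && action !== "prerender") {
    throw new Error(`Unsupported action: ${action}`);
  }
  return JSON.stringify({ [action]: [{ source: "list", urls }] });
}

// Browser-only wiring: inject the rules when the browser reports support.
function injectSpeculationRules(action, urls) {
  if (!HTMLScriptElement.supports || !HTMLScriptElement.supports("speculationrules")) {
    return false;
  }
  const script = document.createElement("script");
  script.type = "speculationrules";
  script.textContent = buildSpeculationRules(action, urls);
  document.head.append(script);
  return true;
}
```

The developer states which URLs are likely next; when and whether to fetch them is left to the browser.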
5. Sustainability: bytes not transferred are carbon not emitted
An emerging motivation is environmental. Internet infrastructure accounts for a relevant slice of global electricity consumption, and every byte transferred has an energy cost in servers, networks, and devices. Sustainable web design positions lazy loading as an efficiency practice that is not merely economic but ecological: loading only what is necessary is also a way to waste less energy at a planetary scale.
Part 5 — Practical decision guide
The synthesis of everything discussed fits in one question per resource: is this resource necessary for the first impression or for the user's critical path?
If yes — hero image, critical CSS, initial route code, main text fonts — the resource should be loaded immediately and, ideally, prioritized (fetchpriority="high", preload). If not — images below the fold, secondary routes, modals, chat widgets, video players, export libraries — the resource is a natural candidate for lazy loading, with three non-negotiable precautions: reserve layout space, anticipate loading with margin or preload on hover, and measure impact with real field data (CrUX, RUM) instead of just lab tests.
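For images, the decision above can be condensed into a function that returns the attributes to emit. This is a sketch of the guide as stated in this article, not a general rule engine; the function name and the input flags are hypothetical.

```javascript
// Chooses loading attributes for an image according to the guide above.
// Dimensions are required in every case so that layout space is reserved.
function imageLoadingAttributes({ isLcpCandidate = false, aboveTheFold = false, width, height }) {
  if (!width || !height) {
    throw new Error("width and height are required to reserve layout space");
  }
  const attrs = { width, height };
  if (isLcpCandidate) {
    attrs.fetchpriority = "high"; // critical: load immediately and prioritize
  } else if (!aboveTheFold) {
    attrs.loading = "lazy";       // below the fold: natural candidate for lazy loading
  }
  return attrs;
}
```

An above-the-fold image that is not the LCP element gets neither attribute and simply loads with the browser's default priority.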
Conclusion
Lazy loading is the technical expression of an idea older than computing: don't do now what may never need to be done. Its trajectory — from photo gallery hack to native browser attribute, from image optimization to architectural principle that rewrites entire frameworks — illustrates how performance engineering evolves: identifying waste, deferring it, and finally eliminating it by design.
Trends point to a future where the term may disappear by becoming ubiquitous: when browsers orchestrate resources predictively, servers send only the result of computation, and interactivity is serialized into resumable fragments, all loading will be, to some degree, on demand. Laziness, well applied, will have won, not by doing less but by doing only what matters, at the exact moment it matters.
Article generated in June 2026. The techniques and APIs cited (Intersection Observer, loading="lazy", content-visibility, dynamic import(), React Server Components, Islands Architecture) reflect the state of the web platform on that date.


