Why Your await fetch Is Slow — and How to Fix It

A deep dive into nine common performance pitfalls of using await fetch() in JavaScript, from cold TCP connections and DNS/TLS overhead to JSON parsing bottlenecks and HTTP/2 head-of-line blocking, with practical fixes for each.

Today we'll look at why the innocent-looking line await fetch() unexpectedly becomes a performance bottleneck, where exactly it buries precious milliseconds — and what you can do about it.

Cold TCP Connections: 200 ms Out of Nowhere

Symptom: the first request to an API is consistently slower than subsequent ones, and during bursts, p95 skyrockets.

A fetch() to a host with no warm connection pays for the full setup: one DNS lookup, one TCP handshake, one TLS handshake. Average RTT in Europe is ~50 ms — multiply that out and you get hundreds of extra milliseconds.

Benchmarks of connection reuse typically show around a 3x throughput improvement over the "one connection per request" approach under load; exact numbers vary with RTT and pool size.

Fix

// Node 22+, global fetch is already available
import { Agent } from 'undici';

const api = new Agent({
  keepAliveTimeout: 30_000,   // keep socket alive for 30s
  connections: 100,           // connection pool
});

const res = await fetch('https://api.payments.local/v1/orders', { dispatcher: api });

In the browser you can't set the Connection header yourself (it's a forbidden header name); connection reuse happens automatically over HTTP/1.1 keep-alive, HTTP/2, or HTTP/3. What you can do on the frontend is add <link rel="preconnect" href="https://api.payments.local"> to shift DNS, TCP, and TLS ahead of the actual user click.

Undici maintains keep-alive natively and, in its own pipelining benchmarks, shows close to a 10x throughput advantage over the core http client.

DNS + TLS: Two Hidden Dragons

Even with keep-alive, the first request to a different domain is still slow.

In Node, dns.lookup runs on the libuv thread pool and can stall requests under load (up to ~100 ms on slow mobile-network resolvers); in the browser the lookup doesn't block JS, but it still delays the request. A full TLS 1.2 handshake adds two extra round trips on top of TCP's one (TLS 1.3 cuts this to one, and to zero with session resumption).

It's better to cache DNS lookups explicitly and raise the pool's connections limit if you're dealing with dozens of domain names.

Fix

# nginx.conf — cache resolved addresses for 5 minutes
resolver 9.9.9.9 valid=300s;

// Node — plug a caching lookup function into undici's connector
const agent = new Agent({ connect: { lookup: dnsCache.lookup } });
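The dnsCache object above is left undefined; here is a minimal sketch of what it could look like, built on node:dns with a simple TTL map (illustrative only — libraries like cacheable-lookup do this more robustly):

```javascript
import dns from 'node:dns';

// Minimal TTL-based cache around dns.lookup. The lookup signature matches
// what net.connect/tls.connect expect, so it can be passed to undici's connector.
const cache = new Map(); // hostname -> { address, family, expires }
const TTL_MS = 300_000;  // mirror the nginx valid=300s above

function lookup(hostname, options, callback) {
  if (typeof options === 'function') { callback = options; options = {}; }
  const hit = cache.get(hostname);
  if (hit && hit.expires > Date.now()) {
    // Serve from cache asynchronously to keep the callback contract consistent.
    return process.nextTick(callback, null, hit.address, hit.family);
  }
  dns.lookup(hostname, options, (err, address, family) => {
    if (!err) cache.set(hostname, { address, family, expires: Date.now() + TTL_MS });
    callback(err, address, family);
  });
}

const dnsCache = { lookup };
```

The first lookup per hostname pays the full resolver cost; every connection for the next five minutes skips it entirely.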

An alternative is QUIC/HTTP/3 with 0-RTT, but keep in mind the quirky behavior of proxies: benchmark results for HTTP/2 vs HTTP/1.1 can be unpredictable behind them.

response.json() Blocks the Event Loop

The server returns 5-10 MB of JSON — and your API route hangs while CPU jumps to 100%.

Response.prototype.json() buffers the entire stream into memory and only then calls JSON.parse — a single synchronous pass that monopolizes the event loop for the whole parse.

Fix: Stream Parsing

import { chain } from 'stream-chain';
import { parser } from 'stream-json';
import { Readable } from 'node:stream';
import { finished } from 'node:stream/promises';

const pipeline = chain([
  Readable.fromWeb(response.body),   // fetch() returns a WebStream; convert it for Node pipelines
  parser(),
  ({ name, value }) => { /* ...process tokens as they arrive... */ },
]);
await finished(pipeline);

Converting the WebStream to a native Node stream before heavy reads can speed things up considerably — reported gains range from 60% to 90%, depending on chunk sizes and Node version.

Response Weight: Compression and Formats

Network speed is fine, but downloads are still slow — especially on mobile 4G.

This happens when the API serves "raw" JSON — 2-3x excess bytes. The server doesn't support Brotli, the client doesn't request it. You're serving images as PNG instead of WebP/AVIF.

Fix

// Node client — browsers set Accept-Encoding automatically (it's a forbidden header there)
await fetch(url, {
  headers: { 'Accept-Encoding': 'br, gzip' }
});

// backend (Fastify)
const zlib = require('node:zlib');
fastify.register(require('@fastify/compress'), {
  brotliOptions: { params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 5 } }  // quality 5: good ratio/CPU trade-off
});

Brotli on text payloads typically saves another 15-25% of traffic beyond gzip.
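To sanity-check the gain on your own payloads, you can compare sizes locally with node:zlib (a rough sketch; real numbers depend on payload shape):

```javascript
import zlib from 'node:zlib';

// Compare raw vs gzip vs brotli sizes on a repetitive JSON payload.
const json = JSON.stringify(
  Array.from({ length: 1000 }, (_, i) => ({ id: i, sku: `item-${i}`, price: 9.99 }))
);

const gzip = zlib.gzipSync(json);
const brotli = zlib.brotliCompressSync(json, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 5 }, // same quality as the server config above
});

console.log({ raw: json.length, gzip: gzip.length, brotli: brotli.length });
```

Structured JSON with repeating keys compresses especially well; if your savings are small, the payload may already be mostly entropy (IDs, hashes, base64 blobs).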

Await in a Loop

You have 100 parallel tasks, but total execution time is roughly 100x the time of a single task.

The classic mistake:

const items = [];
for (const id of ids) {
  const res = await fetch(`/api/item/${id}`);
  items.push(await res.json());
}

You're linearizing all requests: each iteration waits for the previous response to download and parse before the next request even starts.

Fix

const concurrency = 10;        // be gentle with the backend
const queue = [...ids];
const results = [];            // note: filled in completion order, not input order

await Promise.all(
  Array.from({ length: concurrency }, async function worker() {
    while (queue.length) {
      const id = queue.pop();
      const res = await fetch(`/api/item/${id}`);
      results.push(await res.json());
    }
  })
);

Slow Client: fetch vs undici.request

Even with all optimizations, p95 is higher than you'd like.

Node's built-in fetch is implemented as a "wrapper on top of a wrapper": spec-compliant web compatibility costs extra allocations, GC pressure, and abstraction layers. Matt Pocock's benchmarks put undici.request at approximately 2-3x faster than built-in fetch.

Fix

import { request } from 'undici';

const { body } = await request('https://inventory.local/items', {
  method: 'POST',
  body: JSON.stringify(payload),
  headers: { 'content-type': 'application/json' },
});
const data = await body.json();        // still convenient

If your application is primarily HTTP calls, switching to undici.request can deliver that 2-3x speedup end to end.

CORS Preflight

Every first call to a third-party API consistently adds +150-250 ms. In the Performance tab, you see an extra OPTIONS request followed by your actual GET/POST.

The browser sends a preflight OPTIONS request for any non-simple request: methods other than GET/HEAD/POST, custom headers, or a Content-Type outside the simple set (application/x-www-form-urlencoded, multipart/form-data, text/plain). Until the server answers the preflight, the main request doesn't start — double RTT plus backend processing.

Fix

  1. Simplify the request: use GET or POST with Content-Type: text/plain (if acceptable) and no custom headers.
  2. Cache preflight on the server side:
// Express + cors
app.options('*', cors({
  origin: true,
  credentials: true,
  maxAge: 86_400,      // Access-Control-Max-Age: 24 h (Firefox honors it; Chromium caps at 2 h)
}));
  3. Consolidate domains — move the API to a subpath (/api) of the same origin so CORS disappears entirely.
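As a rule of thumb, you can check whether a request config will trigger a preflight with a small helper. This is a simplified sketch of the CORS "simple request" rules (needsPreflight is a hypothetical name, not a browser API):

```javascript
// Simplified check: does this fetch() config trigger a CORS preflight?
const SIMPLE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SIMPLE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded',
  'multipart/form-data',
  'text/plain',
]);
const SIMPLE_HEADERS = ['accept', 'accept-language', 'content-language'];

function needsPreflight({ method = 'GET', headers = {} } = {}) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const h = name.toLowerCase();
    if (h === 'content-type') {
      if (!SIMPLE_CONTENT_TYPES.has(value.split(';')[0].trim())) return true;
    } else if (!SIMPLE_HEADERS.includes(h)) {
      return true; // any custom header forces a preflight
    }
  }
  return false;
}

console.log(needsPreflight({ method: 'POST', headers: { 'Content-Type': 'application/json' } })); // true
console.log(needsPreflight({ method: 'POST', headers: { 'Content-Type': 'text/plain' } }));       // false
```

That first case is exactly why "just JSON over POST" always costs an extra OPTIONS round trip cross-origin.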

Head-of-Line Blocking in HTTP/2: One Fat Stream Slows Everyone Down

On an HTTP/2 backend, p99 spikes happen precisely during large uploads/downloads (files > 5 MB). Logs are clean, CPU is fine, network is fine — but the frontend freezes.

When a packet is lost, TCP pauses delivery of everything behind it until retransmission. In HTTP/2, all streams are multiplexed into a single TCP connection, so one dropped packet during a file upload stalls every parallel API call on that connection.

Fix

Separate large and small requests: maintain a separate agent/domain/port for heavy uploads.

const bulkAgent = new Agent({ allowH2: true, maxConcurrentStreams: 1 }); // dedicate a connection per heavy stream
await fetch(largeUrl, { method: 'PUT', dispatcher: bulkAgent, body: bigFile });

Enable HTTP/3 (QUIC) — it runs over UDP, making each stream independent. Tune prioritization: http2_max_concurrent_streams, priority hints, or separate WAF rules so a packet loss doesn't bring down the entire connection.
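If nginx terminates TLS in front of the backend, enabling HTTP/3 looks roughly like this (a sketch for nginx 1.25+; directives differ across versions and builds):

```nginx
# nginx.conf (nginx 1.25+ built with QUIC support)
server {
    listen 443 ssl;
    listen 443 quic reuseport;                  # HTTP/3 over UDP on the same port
    http2 on;
    add_header Alt-Svc 'h3=":443"; ma=86400';   # advertise HTTP/3 to clients
}
```

Clients connect over HTTP/2 first, see the Alt-Svc header, and switch to QUIC on subsequent requests — so the benefit shows up from the second connection onward.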

In testing, switching large upload handlers to HTTP/3 yielded -45% median and -79% p99 latency improvements.

JSON.stringify Before Sending

During bulk POST /bulk operations, you observe 100% CPU while network traffic starts late — the process is burning cycles before fetch() even fires.

The request won't send until you serialize the payload. The classic pattern:

await fetch(url, { body: JSON.stringify(hugeObject) });

converts tens of megabytes to a string in one synchronous shot, blocking the event loop. If you also deep-copy the object first, the time roughly doubles. structuredClone() (available since Node 17 and in all modern browsers) is faster than hand-rolled deep copies, but it's still synchronous.
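You can see the cost directly by timing serialization of a large object (a self-contained sketch; sizes and timings are illustrative):

```javascript
// The event loop is blocked for the entire duration of JSON.stringify:
// no timers, no I/O callbacks, no incoming requests are processed meanwhile.
const huge = Array.from({ length: 500_000 }, (_, i) => ({ id: i, v: Math.random() }));

const t0 = performance.now();
const body = JSON.stringify(huge);
const blockedMs = performance.now() - t0;

console.log(`serialized ${(body.length / 1e6).toFixed(1)} MB, event loop blocked ~${blockedMs.toFixed(0)} ms`);
```

On a typical laptop this is tens of milliseconds per run; at tens of megabytes per request under production load, those pauses stack into visible tail latency.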

Fix

1. Streaming multipart/NDJSON upload — serialize in chunks:

import { Readable } from 'node:stream';

const enc = new TextEncoder();
function* ndjson() {                       // serialize one record at a time, lazily
  for (const x of bigArray) yield enc.encode(JSON.stringify(x) + '\n');
}
await fetch('/bulk/ingest', {
  method: 'POST',
  headers: { 'content-type': 'application/x-ndjson' },
  body: Readable.toWeb(Readable.from(ndjson())),
  duplex: 'half',                          // required for streaming request bodies
});

This approach keeps memory stable at < 50 MB even for 1 GB of data.

2. Switch to a binary format (BSON/MessagePack), bypassing stringify entirely.

3. Skip unnecessary deep copies — instead of const safe = structuredClone(huge);, simple schema validation is often sufficient.

Summary

Measure. Use npx bench-rest, autocannon, or the browser Performance tab. Fix layer by layer: network > protocol > client > serialization > algorithm. Don't fear alternatives: GraphQL-over-HTTP/2, gRPC-web, msgpack instead of JSON. Automate. Add Artillery to CI, set alerts on p95 > 200 ms.
