
Performance

In the benchmarks on this page, zloop is faster than uvloop on the workloads that matter most for real servers: small-to-medium messages of 1 to 10 KiB. The page leads with uvloop's own benchmark (the fairest comparison available), then shows lower-level micro-benchmarks, and closes with the caveats.

uvloop's own benchmark

The most credible way to compare against uvloop is to run uvloop's benchmark. So that's what we do: uvloop ships an echo-server benchmark in examples/bench, with three server styles - proto (a raw asyncio.Protocol), buffered (a BufferedProtocol), and streams (the high-level streams API) - driven by a multi-process client that measures requests/second.
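
For readers unfamiliar with the three styles, here is a minimal sketch of each as an echo server, using only the standard asyncio API. It is an illustration, not uvloop's benchmark code: the class names, the buffer sizes, and the roundtrip helper are assumptions made for this example.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    """'proto' style: each chunk arrives as a new bytes object."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data)


class EchoBufferedProtocol(asyncio.BufferedProtocol):
    """'buffered' style: the loop reads into a buffer the protocol owns."""

    def connection_made(self, transport):
        self.transport = transport
        self.buffer = bytearray(64 * 1024)  # size is an arbitrary choice

    def get_buffer(self, sizehint):
        return self.buffer

    def buffer_updated(self, nbytes):
        # Copy the filled prefix so a later read cannot overwrite queued data.
        self.transport.write(bytes(memoryview(self.buffer)[:nbytes]))


async def echo_stream(reader, writer):
    """'streams' style: the high-level StreamReader/StreamWriter API."""
    while data := await reader.read(64 * 1024):
        writer.write(data)
        await writer.drain()
    writer.close()


async def roundtrip(payload, protocol_factory=None):
    """Hypothetical helper: start one server style and echo one payload."""
    loop = asyncio.get_running_loop()
    if protocol_factory is None:
        server = await asyncio.start_server(echo_stream, "127.0.0.1", 0)
    else:
        server = await loop.create_server(protocol_factory, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]  # port 0 = OS picks a free port
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(payload)
    await writer.drain()
    echoed = await reader.readexactly(len(payload))
    writer.close()
    await writer.wait_closed()
    server.close()
    return echoed


if __name__ == "__main__":
    payload = b"x" * 1024
    for factory in (EchoProtocol, EchoBufferedProtocol, None):
        assert asyncio.run(roundtrip(payload, factory)) == payload
```

The three differ in who owns the read buffer and how much Python runs per message. That is why the benchmark reports them separately.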

We run it unchanged, except for adding a --zloop flag to the server that mirrors the existing --uvloop one. The client is byte-for-byte uvloop's.

Results (macOS arm64, CPython 3.14, 3 workers, best of 3), requests/sec:

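| Message | Server mode | asyncio | uvloop | zloop | zloop vs uvloop |
|---------|-------------|---------|--------|-------|-----------------|
| 1 KiB   | proto       | 113k    | 113k   | 121k  | +7%             |
| 1 KiB   | buffered    | 115k    | 115k   | 123k  | +7%             |
| 1 KiB   | streams     | 83k     | 90k    | 103k  | +14%            |
| 10 KiB  | proto       | 105k    | 110k   | 113k  | +3%             |
| 10 KiB  | buffered    | 105k    | 105k   | 124k  | +18%            |
| 10 KiB  | streams     | 81k     | 86k    | 95k   | +11%            |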

Figure: Echo throughput, uvloop vs zloop, in requests/sec across server modes and message sizes (macOS arm64, best of 3). zloop leads in every configuration, with the widest margins on streams and buffered.

For the small-to-medium messages common in HTTP requests, WebSocket frames, and RPC calls (1 to 10 KiB), zloop is ahead of uvloop in every cell measured, with the biggest, most reproducible margins on the streams and buffered paths.

About large (100 KiB) messages

uvloop's benchmark also has a 100 KiB row. We don't report a winner there: at that size the test measures loopback bandwidth, not the event loop, and the numbers swing wildly between runs for all three loops (we saw asyncio's streams read 42k one run and 24k the next). To measure large-message behavior meaningfully, you need a real network link between machines, not loopback.

Reproduce it

The harness lives in bench_uvloop/:

$ NUM=50000 WORKERS=3 BEST_OF=3 bash bench_uvloop/run_matrix.sh

(The table above used NUM=50000; the script's default is 100000.)

Micro-benchmarks

Beyond echo throughput, these isolate individual loop operations (each in its own process; bench.py reports the best of several warmed-up runs):

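| Workload                   | asyncio   | uvloop    | zloop     | zloop vs uvloop |
|----------------------------|-----------|-----------|-----------|-----------------|
| call_soon (schedule + run) | 2.4 M/s   | 4.2 M/s   | 6.1 M/s   | +46%            |
| call_later (timers)        | 0.6 M/s   | 3.6 M/s   | 4.0 M/s   | +12%            |
| create_future              | ~0.04 M/s | ~0.04 M/s | ~0.04 M/s | tie             |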

create_future is a genuine tie because all three loops reuse CPython's C-accelerated asyncio.Future - there's nothing there to differentiate. See What zloop reuses.

Reproduce with python bench.py in the repository.
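
To make the call_soon and call_later workloads concrete, here is a minimal sketch of how such a measurement might be written against stock asyncio. It is not the repository's bench.py: the function names, the callback count, and the warm-up scheme are assumptions made for illustration.

```python
import asyncio
import time


async def measure(schedule, n):
    """Schedule n callbacks via `schedule`, wait for all; return (ran, ops/sec)."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    ran = 0

    def callback():
        nonlocal ran
        ran += 1
        if ran == n:
            done.set_result(None)

    start = time.perf_counter()
    for _ in range(n):
        schedule(loop, callback)
    await done
    return ran, n / (time.perf_counter() - start)


def call_soon(loop, callback):
    loop.call_soon(callback)


def call_later(loop, callback):
    # In stock asyncio a zero delay still goes through the timer heap.
    loop.call_later(0, callback)


async def best_of(schedule, n=100_000, runs=5):
    """Discard one warm-up run, then report the best ops/sec of several runs."""
    await measure(schedule, n)
    return max([(await measure(schedule, n))[1] for _ in range(runs)])


if __name__ == "__main__":
    for name, schedule in (("call_soon", call_soon), ("call_later", call_later)):
        rate = asyncio.run(best_of(schedule))
        print(f"{name}: {rate / 1e6:.2f} M/s")
```

For brevity this runs both workloads in one process. For numbers you intend to compare, run each workload in its own process.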

Where the speed comes from

It's not magic. The per-callback work is done in Zig, which avoids Python-level overhead on the hot paths:

  • The contextvars work goes through the raw C-API (PyContext_Enter / PyContext_Exit / PyContext_CopyCurrent) instead of the Python-level context.run() and contextvars.copy_context(). This was the single biggest win for timers and scheduling.
  • Reads are zero-copy: a socket read goes straight into a freshly allocated bytes object that's then shrunk to size, with no intermediate buffer copy.
  • The hot asyncio callables are cached (Future, Task, ensure_future) instead of being re-imported on every call.
  • I/O readiness callbacks are native Zig closures - a socket becoming readable doesn't make a round trip through Python just to learn a byte arrived.
  • The timer heap and ready queue live in Zig, with no per-operation Python allocation.
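
The contextvars bullet is easier to follow with the Python-level path in view. The sketch below shows, in pure Python, the semantics a loop has to provide for every scheduled callback: snapshot the context at scheduling time, then run the callback inside that snapshot. Stock asyncio handles do this with contextvars.copy_context() and Context.run(). SketchHandle and request_id are hypothetical names for illustration, not zloop code.

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default="unset")


class SketchHandle:
    """Hypothetical stand-in for a scheduled callback handle."""

    def __init__(self, callback, *args):
        self.callback = callback
        self.args = args
        # Snapshot the caller's context at scheduling time.
        self.context = contextvars.copy_context()

    def run(self):
        # Run the callback inside the snapshot, not the current context.
        return self.context.run(self.callback, *self.args)


def read_then_overwrite():
    seen = request_id.get()
    request_id.set("changed-inside-callback")  # stays inside the snapshot
    return seen


request_id.set("req-1")
handle = SketchHandle(read_then_overwrite)  # snapshot taken here
request_id.set("req-2")                     # later change, not visible to handle

seen = handle.run()
print(seen)              # req-1
print(request_id.get())  # req-2
```

Because this capture-and-enter pair happens once per callback and once per timer, its cost is paid on every scheduling operation.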

Caveats, stated plainly

Benchmarks are easy to get wrong, so:

  • Single machine, warm cache, macOS loopback - no Linux numbers yet. uvloop's published "2 to 4x faster than asyncio" is a Linux number; on macOS's loopback + kqueue stack the spread is much tighter (in the echo table above, even uvloop barely beats asyncio). We expect zloop's relative standing to carry over to Linux's epoll path, but we haven't measured it, so treat that as an expectation, not a result. Run the benchmark on your own target.
  • Run-to-run variance is real (~10%). The 1 to 10 KiB lead is reproducible; exact figures wobble.
  • Measure each metric in isolation. Running everything back-to-back in one process degrades the later measurements and produces misleading ratios.
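
One way to follow the isolation advice is to launch every measurement in a fresh interpreter and look at the spread across runs. The sketch below is an assumption about how that could be done, not the repository's harness: WORKLOAD is a placeholder snippet that times call_soon on stock asyncio, and run_isolated is a hypothetical helper.

```python
import subprocess
import sys

# Placeholder workload: times 200k loop.call_soon callbacks, prints ops/sec.
WORKLOAD = """
import asyncio, time

async def main(n=200_000):
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    left = [n]

    def cb():
        left[0] -= 1
        if not left[0]:
            done.set_result(None)

    start = time.perf_counter()
    for _ in range(n):
        loop.call_soon(cb)
    await done
    print(n / (time.perf_counter() - start))

asyncio.run(main())
"""


def run_isolated(code, runs=3):
    """Run `code` in a fresh interpreter per run; return every printed rate."""
    rates = []
    for _ in range(runs):
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, check=True,
        )
        rates.append(float(proc.stdout.strip()))
    return rates


if __name__ == "__main__":
    rates = run_isolated(WORKLOAD)
    spread = (max(rates) - min(rates)) / max(rates)
    print(f"best {max(rates):,.0f}/s, spread {spread:.0%}")
```

Reporting the spread next to the best figure shows whether a measured lead is larger than the run-to-run noise.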