Heap sizing, GC strategy, and why event-loop lag is the most honest metric in any Node.js fleet.
The defaults that ship with Node.js are tuned for a single developer machine. Production fleets need explicit choices about heap size, GC strategy, and concurrency.
For typical API workloads, set --max-old-space-size (a value in megabytes) to roughly 75% of the container's memory limit. The remaining 25% is headroom for off-heap buffers and other memory that lives outside the V8 heap.
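As a rough illustration of the 75% rule, the sketch below derives a flag value from a container limit. The helper name oldSpaceSizeMb and the 2 GiB limit are assumptions made for the example, not part of any Node.js API; v8.getHeapStatistics() is the standard call for checking the limit a running process ended up with.

```javascript
const v8 = require('node:v8');

// Hypothetical helper: turn a container memory limit (bytes) into a value
// for --max-old-space-size (megabytes), keeping the rest as off-heap headroom.
function oldSpaceSizeMb(containerLimitBytes, fraction = 0.75) {
  if (!Number.isFinite(containerLimitBytes) || containerLimitBytes <= 0) {
    throw new RangeError('containerLimitBytes must be a positive number');
  }
  return Math.floor((containerLimitBytes / (1024 * 1024)) * fraction);
}

const limitBytes = 2 * 1024 ** 3; // assumed 2 GiB container limit
const flag = `--max-old-space-size=${oldSpaceSizeMb(limitBytes)}`;
console.log(flag); // --max-old-space-size=1536

// Total heap limit of the current process, for comparison. This figure covers
// the whole V8 heap, so it will not match the old-space flag exactly.
const currentLimitMb = Math.round(
  v8.getHeapStatistics().heap_size_limit / (1024 * 1024)
);
console.log(`current heap_size_limit: ${currentLimitMb} MB`);
```

The computed value would typically be passed on the command line or through NODE_OPTIONS when the container starts.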
Watch for incremental marking pauses above 50 ms. They almost always show up as p99 latency spikes and stay invisible in averages.
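One way to catch such pauses is to subscribe to GC entries through perf_hooks. This is a minimal sketch, assuming Node.js 16 or later (where the GC kind is exposed on entry.detail); the isLongPause helper and the KIND_NAMES table are names invented for the example, and the 50 ms threshold is simply the figure quoted above.

```javascript
const { PerformanceObserver, constants } = require('node:perf_hooks');

const PAUSE_THRESHOLD_MS = 50; // threshold quoted in the text

// Map perf_hooks GC kind constants to readable labels.
const KIND_NAMES = {
  [constants.NODE_PERFORMANCE_GC_MAJOR]: 'major',
  [constants.NODE_PERFORMANCE_GC_MINOR]: 'minor',
  [constants.NODE_PERFORMANCE_GC_INCREMENTAL]: 'incremental',
  [constants.NODE_PERFORMANCE_GC_WEAKCB]: 'weakcb',
};

// Hypothetical helper: true when a GC entry ran longer than the threshold.
function isLongPause(durationMs, thresholdMs = PAUSE_THRESHOLD_MS) {
  return durationMs > thresholdMs;
}

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (!isLongPause(entry.duration)) continue;
    // entry.detail.kind on Node.js 16+; entry.kind is the older location.
    const kind = entry.detail ? entry.detail.kind : entry.kind;
    const label = KIND_NAMES[kind] ?? 'unknown';
    console.warn(`long GC pause: ${entry.duration.toFixed(1)} ms (${label})`);
  }
});
observer.observe({ entryTypes: ['gc'] });
```

In a real service the warning would more likely feed a histogram or counter than a log line, so that the pauses can be lined up against p99 latency.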
Event-loop lag is the most honest signal a Node.js process produces. CPU can be misleading; lag is not.
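Lag can be sampled with the built-in monitorEventLoopDelay histogram from perf_hooks, which reports values in nanoseconds. The sketch below is one possible setup: the summarizeLag helper, the 20 ms resolution, and the 10-second reporting interval are all assumptions chosen for illustration.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

const NS_PER_MS = 1e6; // histogram values are reported in nanoseconds

// Hypothetical helper: convert the histogram fields we care about to ms.
function summarizeLag(histogram) {
  return {
    meanMs: histogram.mean / NS_PER_MS,
    p99Ms: histogram.percentile(99) / NS_PER_MS,
    maxMs: histogram.max / NS_PER_MS,
  };
}

const lagHistogram = monitorEventLoopDelay({ resolution: 20 });
lagHistogram.enable();

// Report once per window, then reset so each window stands on its own.
const reportTimer = setInterval(() => {
  const { meanMs, p99Ms, maxMs } = summarizeLag(lagHistogram);
  console.log(
    `event-loop lag mean=${meanMs.toFixed(1)}ms ` +
      `p99=${p99Ms.toFixed(1)}ms max=${maxMs.toFixed(1)}ms`
  );
  lagHistogram.reset();
}, 10_000);
reportTimer.unref(); // do not keep the process alive just for reporting
```

Reporting the mean next to p99 and max makes it easy to see spikes that the average hides. It is safest to compare readings against an idle baseline for the same process rather than treating them as absolute numbers.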