The media is filled with hyperbolic claims that "Our network is the fastest!"
And there are many so-called "Speed Test" tools available on the Internet. Most are easily run in a web browser.
Should you trust those tools?
The popular speed testing tools provide a very narrow and limited measure of network "speed".
It is quite possible that a network that is rated as "fast" could actually deliver poor results to many applications.
Why is this so?
What's In Those Speed Tests?
Most speed test tools on the Internet run a limited regime of tests:
- ICMP Echo/Reply ("ping") to measure round-trip time (although most tools do not say whether they report the round-trip time or divide it by two to estimate one-way latency).
- HTTP GET (download) and PUT (upload) to measure TCP bandwidth.
Some more sophisticated tools may add things like:
- Traceroute (properly done with UDP packets, improperly done using ICMP Echo packets)
- DNS queries
Some speed test tools use IPv4, some use IPv6, some use whatever the underlying web browser and IP stack chooses.
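The two headline numbers from the basic tests above reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the tool reports decimal megabits per second and estimates one-way latency by halving the round-trip time, which is only valid when the path is symmetrical.

```python
def throughput_mbps(payload_bytes, elapsed_seconds):
    """Bulk-transfer rate in megabits per second (decimal, 10**6 bits)."""
    return payload_bytes * 8 / elapsed_seconds / 1_000_000


def one_way_estimate_ms(rtt_ms):
    """Naive one-way latency estimate: assumes both directions take equal time."""
    return rtt_ms / 2


# Hypothetical measurements: 25 MB downloaded in 2 seconds, 40 ms ping.
download = throughput_mbps(25_000_000, 2.0)
latency = one_way_estimate_ms(40.0)
print(f"download: {download:.1f} Mbit/s, estimated one-way latency: {latency:.1f} ms")
```

Note how much a report of "40 ms" versus "20 ms" depends on which of the two latency conventions the tool happens to use.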
Sounds Good, So What's Wrong With 'Em?
Network performance is highly related to the way that the devices on a network converse with one another.
- Does the application software (and its server) use UDP or TCP?
UDP is vulnerable to many network phenomena, such as IP fragmentation, highly variable latency (jitter), packet loss, and alteration of the sequence of packets (for example, the sender sends packets A, B, and C, but the receiver gets them in the order B, C, A).
TCP, on the other hand, although reliable, may withhold delivering data to the receiver while it internally tries to deal with packet losses, changes in end-to-end latency, and network congestion.
- Does the application's data have real-time constraints? For example, voice or video conferencing applications have very tight time constraints; if those are not met, the images may break up or freeze, or words may be lost.
- How big are the chunks of data being sent? Larger data, particularly very large high-definition video, is more vulnerable to loss on the network, transient congestion problems, or IP fragmentation issues than are small data packets.
The bandwidth number generated by most speed test tools is based on World-Wide-Web HTTP GET (download) and HTTP POST or PUT (upload) transactions. These are bulk transfers of large amounts of data over TCP connections.
Bandwidth numbers based on TCP bulk transfers tend to be good indicators of how long it may take to download a large web page. But those numbers can be weak indicators of performance for more interactive applications (e.g. Zoom).
Moreover, TCP tries to be a good citizen on the network by trying hard to avoid contributing to network congestion. TCP contains several algorithms that kick in when a new connection is started and when congestion is perceived. These algorithms cause the sending TCP stack to reduce its transmission rate and then slowly creep back up to full speed. This means that each new TCP connection begins with a "slow start". In addition, any lost packets or changes in perceived round-trip time may send the sending TCP stack into its congestion avoidance regime, during which traffic flows will be reduced.
Modern web pages tend to be filled with large numbers of subsidiary references. Each of those tends to engender a Domain Name System lookup (UDP) and a fresh TCP connection (each with its own slow-start penalty). As a consequence, modern web page performance is often limited less by network bandwidth than by protocol algorithms and network round-trip times.
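To see why round-trip time can dominate, here is a rough model of slow start. It is a sketch under stated assumptions, not a description of any particular TCP stack: the segment size of 1460 bytes, the initial window of 10 segments, the doubling of the window every round trip, the single handshake round trip, and the absence of loss are all assumed, and the function names are hypothetical.

```python
import math


def slow_start_round_trips(transfer_bytes, mss=1460, initial_window=10):
    """Round trips needed to deliver transfer_bytes when the congestion
    window starts at initial_window segments and doubles each round trip
    (idealized: no loss, no receiver-window or bandwidth limit)."""
    segments_left = math.ceil(transfer_bytes / mss)
    window = initial_window
    round_trips = 0
    while segments_left > 0:
        segments_left -= window   # send one full window this round trip
        window *= 2               # assumed exponential growth
        round_trips += 1
    return round_trips


def fetch_time_ms(transfer_bytes, rtt_ms, handshake_round_trips=1):
    """Approximate time for a fresh connection to fetch one object."""
    total = handshake_round_trips + slow_start_round_trips(transfer_bytes)
    return total * rtt_ms


# A 100 kB object over paths with 10 ms and 100 ms round-trip times.
for rtt in (10, 100):
    print(f"RTT {rtt} ms -> about {fetch_time_ms(100_000, rtt)} ms")
```

Link bandwidth does not appear anywhere in this model. For small objects on fresh connections, the estimate depends only on the number of round trips and how long each one takes.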
So What Do We Really Need?
Unfortunately, a full measure of the quality and speed of a network path includes a large number of often obscure numbers.
- Whether the path contains parallel elements due to load balancing or bonding of physical links. (In other words, it is good to know whether all the traffic follows the same path or whether it is divided among multiple paths with possibly quite different characteristics.)
- Whether the network path is symmetrical or whether each direction takes a different route. (This is very common.)
- Path MTU (Maximum Transmission Unit for the entire one-way path - a separate value is needed for each direction.)
- End-to-end latency, and often more importantly, a statistical measure of the packet-to-packet variation of that delay, often called "jitter".
- Packet loss rates and a measure of whether that loss occurs continuously or in bursts. (This is particularly important on paths that include technologies subject to outside interference and noise such as wireless links.)
- Buffering along the path (in other words, whether the path may suffer from "bufferbloat".)
- Packet re-sequencing rates and a measure of whether that is burst behavior or continuous.
- Whether there are "hidden" proxy devices (most likely HTTP/HTTPS or SIP proxies) that are relaying the traffic.
- Whether there are any rate limiters or data quotas on the path.
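Several of the items in this list can be computed from a simple packet trace. The sketch below assumes a hypothetical trace format (a list of per-packet latencies, a list of received/lost flags, and a list of sequence numbers in arrival order) and uses deliberately simple definitions: jitter as the mean absolute difference between consecutive latency samples, and re-sequencing as the count of packets that arrive after a higher-numbered packet. Other definitions exist and may suit a given application better.

```python
from statistics import mean


def jitter_ms(latencies_ms):
    """Mean absolute difference between consecutive latency samples."""
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return mean(diffs) if diffs else 0.0


def loss_rate(received_flags):
    """Fraction of packets lost (False means the packet never arrived)."""
    if not received_flags:
        return 0.0
    return received_flags.count(False) / len(received_flags)


def loss_bursts(received_flags):
    """Length of each run of consecutive lost packets."""
    bursts, run = [], 0
    for ok in received_flags:
        if ok:
            if run:
                bursts.append(run)
            run = 0
        else:
            run += 1
    if run:
        bursts.append(run)
    return bursts


def reordered_count(arrival_sequence):
    """Packets that arrive after a packet with a higher sequence number."""
    highest, count = -1, 0
    for seq in arrival_sequence:
        if seq < highest:
            count += 1
        else:
            highest = seq
    return count


flags = [True, False, False, True, False]
print(jitter_ms([20, 22, 21, 30]), loss_rate(flags), loss_bursts(flags))
print(reordered_count([1, 2, 0]))  # packets sent A, B, C arrive as B, C, A
```

Two paths with the same loss rate can produce very different burst lists, which is why the distinction between continuous and bursty loss matters.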
What Can A User Do?
Users are somewhat limited in their ability to control protocols and applications.
The user can check the following things:
- If you use a wi-fi network at home or work in conjunction with Bluetooth, make sure that you are attached to the wi-fi on the 2.4GHz band. Many user devices have only one radio. If such a device is connected to wi-fi on the 5GHz band, then that radio is being rapidly switched between Bluetooth on the 2.4GHz band and wi-fi on the 5GHz band. That's a recipe for destructive packet loss and jitter.
- Make sure your home wi-fi and router devices have some of the new anti-bufferbloat code. See What Can I Do About Bufferbloat?
- Be aware when you may be sharing your network resources with other users or other applications.
What Tools Do Developers Have To Make Sure Applications Behave Well Under Real Life Conditions? Enter the Network Emulator.
Speed test tools tend to give an optimistic report of how a network behaves for a highly constrained number of applications. Similarly, many network developers test their code only under optimal, laboratory conditions.
There are tools available to developers so that they can ensure that their code and products are robust and behave well in the face of inevitable sub-optimal network conditions.
Most of these tools come under the heading of "network emulators". These effectively act as a bothersome man-in-the-middle, delaying some packets, tossing others, perhaps re-sequencing packets or even duplicating them.
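The idea can be illustrated with a toy, in-memory model. This is only a sketch of the concept and not how any real emulator is implemented: it operates on a list of packets rather than live traffic, the function and parameter names are hypothetical, and the uniform jitter distribution and fixed send interval are assumptions chosen for simplicity.

```python
import random


def emulate(packets, delay_ms=50.0, jitter_ms=20.0, loss=0.05,
            duplicate=0.01, send_interval_ms=1.0, seed=None):
    """Return (arrival_time_ms, packet) pairs in order of arrival.

    Packet i is assumed to be sent at i * send_interval_ms. Each packet may
    be dropped, delivered once, or delivered twice, and every delivered copy
    gets the base delay plus a random extra delay of up to jitter_ms.
    """
    rng = random.Random(seed)
    arrivals = []
    for i, packet in enumerate(packets):
        send_time = i * send_interval_ms
        if rng.random() < loss:
            continue                                  # packet is tossed
        copies = 2 if rng.random() < duplicate else 1
        for _ in range(copies):
            arrival = send_time + delay_ms + rng.uniform(0, jitter_ms)
            arrivals.append((arrival, packet))
    # Sorting by arrival time is what produces re-sequencing: a later
    # packet with little extra delay can overtake an earlier one.
    arrivals.sort(key=lambda pair: pair[0])
    return arrivals


received = [p for _, p in emulate(list(range(20)), seed=1)]
print(received)
```

Re-sequencing appears here whenever the jitter exceeds the interval between packets. An application under test would be fed the `received` sequence and checked for how gracefully it copes.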
Network Emulators come in a variety of capabilities and accuracies:
- Simple emulators are built into some mobile phones.
- There are a couple of open-source packages that typically exist as kernel modules for Linux or FreeBSD. These usually must be used through an arcane command-line interface. And their accuracy can vary wildly depending on the underlying hardware.
- There are external devices that are inserted into an Ethernet link (as one would insert an Ethernet switch). These devices tend to have better accuracy and performance, and often have web-based graphical user interfaces. IWL's KMAX is in this category.
There are also mathematical emulators. These are aimed at people who design large networks and want to use queueing theory to analyze how a network might perform if links are added or removed.