Use a caching proxy with the Debian mirror redirector service

Debian’s package repository is distributed among many mirror servers. Picking the best mirror previously required you to manually test network speeds, find the shortest network route from your client, check that the mirror was kept up to date, and then regularly repeat these tests to confirm your mirror of choice was still the best option.

The Debian mirror redirector service at http://httpredir.debian.org/ takes the place of a mirror in the Apt repository listing and will automatically point you to the best available mirror. It also handles load-balancing between the different mirrors, and lets clients speed up downloads by fetching from multiple mirrors in parallel. These advantages are obtained by having the client send every request for a package to the redirector, which in turn returns a redirect pointing to the best choice for the requested package.

The redirector service does, however, interfere with any caching proxies on your local network or transparent proxies at the internet service provider level. A caching proxy receives HTTP requests, checks whether it has a fresh copy of the requested resource, and then either serves a copy directly from its cache or downloads the resource and stores it in the cache.
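The proxy behaviour described above can be sketched roughly as follows. This is a toy model, not how any particular proxy is implemented: the `CachingProxy` class, the `fetch_origin` callback, and the 300-second freshness window are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy cache keyed by URL. `fetch_origin` is a hypothetical stand-in
    for a real HTTP download from the upstream server."""

    def __init__(
        self,
        fetch_origin: Callable[[str], bytes],
        max_age: float = 300.0,  # assumed freshness window in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch_origin = fetch_origin
        self.max_age = max_age
        self.clock = clock
        self.cache: Dict[str, Tuple[float, bytes]] = {}
        self.origin_hits = 0  # number of upstream downloads performed

    def get(self, url: str) -> bytes:
        now = self.clock()
        entry = self.cache.get(url)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]  # fresh copy: serve straight from the cache
        body = self.fetch_origin(url)  # miss or stale: download and store
        self.origin_hits += 1
        self.cache[url] = (now, body)
        return body
```

Note that the cache is keyed by the full URL, so the same file fetched from two different hosts counts as two separate entries.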

This can be a great way to reduce external network bandwidth usage as well as speed up environments where multiple clients need to download the same Debian packages (for example, as part of a nightly or weekly maintenance window).

Unfortunately, the redirector’s responses aren’t cacheable, so a proxy has to ask the redirector every time a client requests the same package. The redirector will either redirect to the same mirror, in which case the package may already be cached by the proxy, or redirect to another mirror for the same package, requiring the proxy to download and store two copies of the same file from two different mirrors.

The redirector is almost, but not quite, doing its job as intended under these circumstances. It incorrectly returns an HTTP 301 Moved Permanently status code, indicating that the same package will forevermore be found at the new address it redirects to. Yet it doesn’t make the redirection cacheable, so clients and proxies can’t trust the permanent redirect. When ten clients behind a proxy update at the same time, the proxy has to check the redirect ten times and may serve a different copy of the same package to each of the clients.
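To make the ten-client scenario concrete, here is a small simulation. It is only a sketch under stated assumptions: the two mirror host names are made up, the package path is hypothetical, and the redirector is modelled as a simple round-robin, which is not how the real service chooses mirrors.

```python
from itertools import cycle
from typing import Dict

# Hypothetical mirrors; the real redirector picks by location and health.
MIRRORS = ["http://mirror-a.example", "http://mirror-b.example"]
PACKAGE_PATH = "/debian/pool/main/h/hello/hello.deb"  # made-up path


def simulate_uncacheable_redirects(clients: int) -> Dict[str, int]:
    """Model a proxy that cannot cache the redirect itself."""
    picks = cycle(MIRRORS)  # round-robin stand-in for the redirector
    package_cache: Dict[str, bytes] = {}  # proxy cache keyed by final URL
    redirector_requests = 0
    for _ in range(clients):
        # The redirect is not cacheable, so every client request
        # costs one round trip to the redirector.
        redirector_requests += 1
        target = next(picks) + PACKAGE_PATH
        if target not in package_cache:
            package_cache[target] = b"package bytes"  # download and store
    return {
        "redirector_requests": redirector_requests,
        "copies_stored": len(package_cache),
    }


print(simulate_uncacheable_redirects(10))
# {'redirector_requests': 10, 'copies_stored': 2}
```

With two mirrors in rotation, the proxy contacts the redirector ten times and ends up storing the same package twice.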

I’ve submitted a patch to the Debian http-redirector that, if accepted, will improve the situation somewhat for proxied networks where many clients update at around the same time. With the patch, the redirector will correctly send an HTTP 307 Temporary Redirect status code, and the redirects will be cacheable by proxies and clients for a few minutes.
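As a rough illustration of what "cacheable for a few minutes" could mean to a proxy, the sketch below decides how long a redirect response may be reused based on its `Cache-Control` header. The function name `redirect_lifetime` and the 300-second value in the usage example are assumptions, not taken from the patch. The logic is deliberately conservative and simplified: it honours only an explicit `max-age` and ignores `Expires`, `s-maxage`, and heuristic caching.

```python
import re
from typing import Dict, Optional

REDIRECT_STATUSES = (301, 302, 307, 308)


def redirect_lifetime(status: int, headers: Dict[str, str]) -> Optional[int]:
    """Return how many seconds a redirect may be reused, or None if this
    simplified model would not cache it at all."""
    if status not in REDIRECT_STATUSES:
        return None
    # Assumes the header key is spelled exactly "Cache-Control".
    directives = [
        d.strip() for d in headers.get("Cache-Control", "").lower().split(",")
    ]
    if "no-store" in directives or "no-cache" in directives:
        return None
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            seconds = int(match.group(1))
            return seconds if seconds > 0 else None
    return None  # no explicit lifetime: treat as uncacheable


# Hypothetical patched response: temporary redirect, reusable for 5 minutes.
print(redirect_lifetime(307, {"Cache-Control": "public, max-age=300"}))  # 300
# A redirect marked no-cache has to be re-checked on every request.
print(redirect_lifetime(301, {"Cache-Control": "no-cache"}))  # None
```

Under this model, a proxy that sees the first response can answer the other clients' requests for the same package from its own cache until the lifetime expires.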

Depending on the number of available mirrors in your client’s region and their cache controls, using the redirector service through a proxy will be slightly slower than using one dedicated mirror. If bandwidth conservation is your one and only concern, you ought to use a dedicated mirror known to support caching rather than the redirection service. However, if my patch is accepted into the redirector, updating many clients behind a proxy within a short time frame should go faster and conserve more external bandwidth.