Published on: January 27, 2021
Subtle differences in proxy setting implementations led to surprising problems for a GitLab customer. Here's how we got to the root of it.
If you've used a Web proxy server before, you're probably familiar with
the environment variables
http_proxy or
HTTP_PROXY. You may be less
familiar with
no_proxy, which provides a way to exclude traffic
destined to certain hosts from using the proxy. While HTTP is a
well-defined standard, no standard exists for how clients should handle
these variables. As a result, Web clients support these variables in
subtly different ways. For one GitLab customer, these differences led
to a weekend of troubleshooting that uncovered why certain services
stopped communicating.
A proxy server acts as an intermediary between your computer or network and the internet. When you send a request to access a website or other online resource, that request first goes to the proxy server. The proxy server then forwards the request to the actual destination and delivers the response back to you. Proxies can serve various purposes, including improving security, enhancing privacy, and controlling internet usage.
Let's now look at what proxy server environment variables are, and how to define exemptions and handle exclusions with
no_proxy.
Today, most Web clients support connection to proxy servers via environment variables:
http_proxy /
HTTP_PROXY
https_proxy /
HTTPS_PROXY
no_proxy /
NO_PROXY
These variables tell the client what URL should be used to access the
proxy servers and which exceptions should be made. For example, if you
had a proxy server listening on
http://alice.example.com:8080, you
might use it via:
export http_proxy=http://alice.example.com:8080
Which proxy server gets used if troublesome Bob also defines the
all-caps version,
HTTP_PROXY?
export HTTP_PROXY=http://bob.example.com:8080
The answer surprised us: it depends. In some cases, the Alice proxy wins, and in other cases Bob wins. We'll discuss the details later.
no_proxy
What happens if you want to make exceptions? For example, suppose you
want to use a proxy server for everything but
internal.example.com and
internal2.example.com. That's where the
no_proxy variable comes into
play. Then you would define
no_proxy as follows:
export no_proxy=internal.example.com,internal2.example.com
no_proxy
What if you want to exclude IP addresses? Can you use asterisks or
wildcards? Can you use CIDR blocks (e.g.
192.168.1.1/32)? The answer
again: it depends.
Let's dig into the evolution of proxy variables, and how they are used today.
In 1994, most Web clients used CERN's
libwww, which supported
http_proxy and the
no_proxy environment variables.
libwww only used the lowercase form of
http_proxy, and the
no_proxy syntax was
simple:
no_proxy is a comma- or space-separated list of machine
or domain names, with optional :port part. If no :port
part is present, it applies to all ports on that domain.
Example:
no_proxy="cern.ch,some.domain:8001"
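As a rough illustration of that syntax, here is a minimal Python sketch that parses such a list and checks a host and port against it. The names `parse_no_proxy` and `is_excluded` are made up for this example, and the plain suffix comparison is an assumption about how domain names were matched; this is not libwww's actual code.

```python
import re


def parse_no_proxy(value):
    """Split a comma- or space-separated list into (name, port) pairs.

    port is None when the entry has no :port part.
    """
    entries = []
    for item in re.split(r"[,\s]+", value.strip()):
        if not item:
            continue
        name, _, port = item.partition(":")
        entries.append((name.lower(), int(port) if port else None))
    return entries


def is_excluded(host, port, entries):
    """Return True if host:port should bypass the proxy."""
    host = host.lower()
    for name, entry_port in entries:
        # Assumption: an entry matches any host that ends with it.
        if not host.endswith(name):
            continue
        # An entry without a port applies to all ports on that domain.
        if entry_port is None or entry_port == port:
            return True
    return False


entries = parse_no_proxy("cern.ch,some.domain:8001")
print(is_excluded("info.cern.ch", 80, entries))
print(is_excluded("some.domain", 80, entries))
```

The sketch ignores IPv6 literals, whose colons would collide with the `:port` separator.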
New clients emerged that added their own HTTP implementations without
linking
libwww. In January 1996, Hrvoje Niksic released
geturl, the predecessor of what is now
wget. A month later,
geturl added support for
http_proxy in v1.1.
In May 1996,
geturl v1.3 added support for
no_proxy. Just as with
libwww,
geturl only supported the lowercase form.
In January 1998, Daniel Stenberg released
curl v5.1, which supported the
http_proxy and
no_proxy variables.
In addition,
curl allowed the uppercase forms,
HTTP_PROXY and
NO_PROXY.
Plot twist: In March 2009, curl v7.19.4 dropped support for the
uppercase form,
HTTP_PROXY, due to security concerns. However, while
curl ignores
HTTP_PROXY,
HTTPS_PROXY still works today.
Fast-forward to today. As my colleague Nourdin el Bacha found in his research, the handling of these proxy server variables varies depending on the language or tool you are using.
Knowing how proxy variables are handled across languages allows you to set them so that they work properly. Here’s a quick rundown.
http_proxy and
https_proxy
In the following table, each row represents a supported behavior, while
each column holds the tool (e.g.
curl) or language (e.g.
Ruby) to
which it applies:
| | curl | wget | Ruby | Python | Go |
|---|---|---|---|---|---|
| http_proxy | Yes | Yes | Yes | Yes | Yes |
| HTTP_PROXY | No | No | Yes (warning) | Yes (if REQUEST_METHOD not in env) | Yes |
| https_proxy | Yes | Yes | Yes | Yes | Yes |
| HTTPS_PROXY | Yes | No | Yes | Yes | Yes |
| Case precedence | lowercase | lowercase only | lowercase | lowercase | Uppercase |
| Reference | source | source | source | source | source |
Note that
http_proxy and
https_proxy are always supported across the
board, while
HTTP_PROXY is not always supported. Python (via
urllib) complicates
the picture even more:
HTTP_PROXY can be used as long as
REQUEST_METHOD is not defined in the environment.
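One way to see this for yourself is to ask urllib what it reads from a given environment. The sketch below is only an experiment: it assumes a recent Python 3 on a POSIX system (where variable names are case-sensitive), the helper name `proxies_for` is ours, and it temporarily swaps out `os.environ` with `unittest.mock.patch.dict` so nothing leaks into your shell.

```python
import os
from unittest import mock
from urllib.request import getproxies_environment


def proxies_for(env):
    """Return urllib's view of the proxy settings for the given variables."""
    with mock.patch.dict(os.environ, env, clear=True):
        return getproxies_environment()


# Both forms defined: which one does urllib pick?
both = proxies_for({
    "http_proxy": "http://alice.example.com:8080",
    "HTTP_PROXY": "http://bob.example.com:8080",
})

# Only the uppercase form defined.
upper_only = proxies_for({"HTTP_PROXY": "http://bob.example.com:8080"})

# Uppercase form plus REQUEST_METHOD, as in a CGI environment.
cgi = proxies_for({
    "HTTP_PROXY": "http://bob.example.com:8080",
    "REQUEST_METHOD": "GET",
})

for label, proxies in (("both", both), ("upper only", upper_only), ("cgi", cgi)):
    print(label, "->", proxies.get("http"))
```

If the behavior matches the table, the first case should report the Alice proxy, the second the Bob proxy, and the third no HTTP proxy at all.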
While you might expect environment variables to be all-caps,
http_proxy came first, so that's the de facto standard. When in doubt,
go with the lowercase form because that's universally supported.
Instead of environment variables, Java uses system properties. This avoids case issues entirely.
Unlike most implementations, Go tries the uppercase version before falling back to the lowercase version. We will see later why that caused issues for one GitLab customer.
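To make the precedence question concrete, here is a small Python model of the two lookup orders. It is a sketch of the behavior summarized in the table, not code taken from any of these clients; `resolve_proxy` and its `uppercase_first` flag are names invented for this example.

```python
def resolve_proxy(env, name, uppercase_first=False):
    """Pick the value of a proxy variable from env, a dict of variables.

    name is the lowercase variable name, e.g. "http_proxy".
    """
    if uppercase_first:
        order = [name.upper(), name.lower()]
    else:
        order = [name.lower(), name.upper()]
    for key in order:
        value = env.get(key)
        if value:
            return value
    return None


env = {
    "http_proxy": "http://alice.example.com:8080",
    "HTTP_PROXY": "http://bob.example.com:8080",
}

# Lowercase-first clients end up with Alice.
print(resolve_proxy(env, "http_proxy"))
# An uppercase-first client ends up with Bob.
print(resolve_proxy(env, "http_proxy", uppercase_first=True))
```

The same environment yields two different proxies, which is exactly the "it depends" answer to the Alice and Bob question.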
no_proxy format
Some users have discussed the lack of a
no_proxy specification in this issue. As
no_proxy specifies an exclusion list, many questions arise about
how it behaves. For example, suppose your
no_proxy configuration is defined as follows:
export no_proxy=example.com
Does this mean that the domain must be an exact match, or will
subdomain.example.com also match against this configuration? The
following table shows the state of various implementations. It turns out
that every implementation that reads no_proxy matches suffixes, as shown in the
Matches suffixes? row (Java, which uses a system property instead, is the exception):
| | curl | wget | Ruby | Python | Go | Java |
|---|---|---|---|---|---|---|
| no_proxy | Yes | Yes | Yes | Yes | Yes | No* |
| NO_PROXY | Yes | No | Yes | Yes | Yes | No* |
| Case precedence | lowercase | lowercase only | lowercase | lowercase | Uppercase | N/A |
| Matches suffixes? | Yes | Yes | Yes | Yes | Yes | No |
| Strips leading .? | Yes | No | Yes | Yes | No | No |
| * matches all hosts? | Yes | No | No | Yes | Yes | Yes |
| Supports regexes? | No | No | No | No | No | No |
| Supports CIDR blocks? | No | No | Yes | No | Yes | No |
| Detects loopback IPs? | No | No | No | No | Yes | No |
| Resolves IP addresses? | No | No | Yes | No | Yes | No |
| Reference | source | source | source | source | source | documentation |

* Java uses the http.nonProxyHosts system property.
However, if there is a leading
. in the
no_proxy setting, the
behavior varies. For example,
curl and
wget behave
differently.
curl will always strip the leading
. and match against
a domain suffix. This call bypasses the proxy:
$ env https_proxy=http://non.existent/ no_proxy=.gitlab.com curl https://gitlab.com
<html><body>You are being <a href="https://about.gitlab.com/">redirected</a>.</body></html>
However,
wget does not strip the leading
. and performs a literal
string match against the end of the hostname. As a result,
wget attempts to use a
proxy if the bare domain (here, gitlab.com) is requested:
$ env https_proxy=http://non.existent/ no_proxy=.gitlab.com wget https://gitlab.com
Resolving non.existent (non.existent)... failed: Name or service not known.
wget: unable to resolve host address 'non.existent'
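The difference is easy to model. The two functions below are a simplified Python sketch of the two behaviors, written from the observations above rather than from the curl or wget sources; both function names are ours.

```python
def matches_stripping_dot(host, entry):
    """curl-like: drop a leading dot, then match the domain or any subdomain."""
    entry = entry.lstrip(".")
    return host == entry or host.endswith("." + entry)


def matches_literal_suffix(host, entry):
    """wget-like: compare the entry, dot included, with the end of the host."""
    return host.endswith(entry)


entry = ".gitlab.com"
for host in ("gitlab.com", "about.gitlab.com"):
    print(host,
          "stripping:", matches_stripping_dot(host, entry),
          "literal:", matches_literal_suffix(host, entry))
```

With the leading dot in place, only the first function treats gitlab.com itself as excluded, while both agree on about.gitlab.com.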
No implementation supports regular expressions. I suspect that regexes would complicate matters further, because they come in several flavors (e.g. PCRE, POSIX). Using regexes also introduces potential performance and security issues.
In some cases, setting
no_proxy to
* effectively disables proxies
altogether, but this is not a universal rule.
Only Ruby performs a DNS lookup to resolve a hostname to an IP address when deciding if a proxy should be used. Be careful if you use IP addresses with Ruby, because a hostname may resolve to an excluded IP address. In general, do not specify IP addresses in the no_proxy variable unless you expect the client to use those IPs explicitly.
The same holds true for CIDR blocks, such as
18.240.0.1/24. CIDR
blocks only work when the request is directly made to an IP
address. Only Go and Ruby allow CIDR blocks. Unlike other
implementations, Go even automatically disables the use of a proxy if it
detects a loopback IP address.
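The following Python sketch shows why a CIDR block only helps when the request targets an IP address directly. It uses the standard `ipaddress` module; the function name `bypass_by_cidr` and the `detect_loopback` switch (modeled on the Go behavior described above) are assumptions made for this example.

```python
import ipaddress


def bypass_by_cidr(host, cidr_blocks, detect_loopback=True):
    """Return True if host is an IP literal that should skip the proxy."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Without a DNS lookup, CIDR blocks cannot apply.
        return False
    if detect_loopback and ip.is_loopback:
        return True
    for block in cidr_blocks:
        # strict=False accepts blocks such as 18.240.0.1/24 with host bits set.
        if ip in ipaddress.ip_network(block, strict=False):
            return True
    return False


blocks = ["18.240.0.1/24"]
print(bypass_by_cidr("18.240.0.77", blocks))
print(bypass_by_cidr("internal.example.com", blocks))
print(bypass_by_cidr("127.0.0.1", []))
```

A hostname never matches here, even if it would resolve to an address inside the block, which is the pitfall the paragraph above warns about.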
Discrepancies in proxy environment variable handling, particularly between Ruby and Go, led to a real-world issue for one GitLab customer: Git pushes worked from the command line but failed in the web UI. Understanding these inconsistencies is crucial for troubleshooting and configuring applications that span multiple languages within corporate networks that use proxy servers.
If you have an application written in multiple languages that needs to work behind a corporate firewall with a proxy server, you may need to pay attention to these differences. For example, GitLab is composed of a few services written in Ruby and Go. One customer set its proxy configuration to something like the following:
HTTP_PROXY: http://proxy.company.com
HTTPS_PROXY: http://proxy.company.com
NO_PROXY: .correct-company.com
The customer reported the following issue with GitLab:
git push from the command line worked
Changes made through the Web UI failed
Our support engineers discovered that due to a Kubernetes configuration issue, a few stale values lingered. The pod actually had an environment that looked something like:
HTTP_PROXY: http://proxy.company.com
HTTPS_PROXY: http://proxy.company.com
NO_PROXY: .correct-company.com
no_proxy: .wrong-company.com
The inconsistent definitions in
no_proxy and
NO_PROXY set off red
flags, and we could have resolved the issue by making them consistent or
removing the incorrect entry. But let's drill into what happened.
Remember from above that Ruby checks the lowercase form first, while Go checks the uppercase form first.
As a result, services written in Go, such as GitLab Workhorse, had the
correct proxy configuration. A
git push from the command line worked
fine because the Go services primarily handled this activity:
The gRPC call in step 2 never attempted to use the proxy because
no_proxy was configured properly to connect directly to Gitaly.
However, when a user makes a change in the UI, Gitaly forwards the
request to a
gitaly-ruby service, which is written in
Ruby.
gitaly-ruby makes changes to the repository and reports back
via a gRPC call to its parent process. However,
as seen in step 4 below, the reporting step didn't happen:
Because gRPC uses HTTP/2 as the underlying transport,
gitaly-ruby
attempted a CONNECT to the proxy since it was configured with the wrong
no_proxy setting. The proxy immediately rejected this HTTP request,
causing the failure in the Web UI push case.
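The mismatch can be reproduced in miniature. The Python sketch below feeds the pod's two conflicting values through an uppercase-first lookup (Go-like) and a lowercase-first lookup (Ruby-like). The hostname `gitaly.correct-company.com` is hypothetical, and the matching logic is a simplification rather than either language's real implementation.

```python
POD_ENV = {
    "NO_PROXY": ".correct-company.com",
    "no_proxy": ".wrong-company.com",
}

# Hypothetical internal host that should be reached without the proxy.
HOST = "gitaly.correct-company.com"


def no_proxy_value(env, uppercase_first):
    """Return the exclusion list a client would read from env."""
    keys = ("NO_PROXY", "no_proxy") if uppercase_first else ("no_proxy", "NO_PROXY")
    for key in keys:
        if env.get(key):
            return env[key]
    return ""


def goes_direct(host, no_proxy):
    """Return True if host matches an entry and so bypasses the proxy."""
    for entry in no_proxy.split(","):
        entry = entry.strip().lstrip(".")
        if entry and (host == entry or host.endswith("." + entry)):
            return True
    return False


go_direct = goes_direct(HOST, no_proxy_value(POD_ENV, uppercase_first=True))
ruby_direct = goes_direct(HOST, no_proxy_value(POD_ENV, uppercase_first=False))
print("Go-like client connects directly:", go_direct)
print("Ruby-like client connects directly:", ruby_direct)
```

In this model the Go-like client connects directly while the Ruby-like client is sent to the proxy, mirroring the split between command-line and Web UI pushes.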
Once we eliminated the lowercase
no_proxy from the environment, pushes
from the UI worked as expected, and
gitaly-ruby connected directly to
the parent Gitaly process. Step 4 worked properly in the diagram below:
We also discovered that gRPC does not support HTTPS proxies. This again subtly affects the behavior of the system depending on how
HTTPS_PROXY is set.
HTTPS_proxy
Note that the customer set
HTTPS_PROXY to an unencrypted HTTP proxy;
notice that
http:// is used instead of
https://. While this isn't
ideal from a security standpoint, some people do this to avoid the
hassle of clients failing due to TLS certificate verification issues.
Ironically, if an HTTPS proxy were specified, we would not have seen this problem. If an HTTPS proxy is used, gRPC will ignore this setting since HTTPS proxies are not supported.
I think we can all agree that one should never define inconsistent values with lowercase and uppercase proxy settings. However, if you ever have to manage a stack written in multiple languages, you might need to consider setting HTTP proxy configurations to the lowest common denominator.
http_proxy and
https_proxy
Use lowercase form. HTTP_PROXY is not always supported or recommended.
no_proxy
Use lowercase form.
Use comma-separated
hostname:port values.
IP addresses are okay, but hostnames are never resolved.
Suffixes are always matched (e.g.
example.com will match
test.example.com).
If the bare domain itself (e.g. example.com) needs to be matched, avoid using a leading dot (
.).
Avoid using CIDR matching since only Go and Ruby support that.
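These recommendations are simple enough to check automatically. Here is a minimal Python sketch of such a check; the function name `lint_proxy_env` and the warning messages are our own, and the rules only cover the points listed above.

```python
def lint_proxy_env(env):
    """Return a list of warnings for env, a dict of environment variables."""
    warnings = []
    for name in ("http_proxy", "https_proxy", "no_proxy"):
        lower, upper = env.get(name), env.get(name.upper())
        if upper is not None and lower is None:
            warnings.append(f"{name.upper()} is set without {name}")
        elif upper is not None and upper != lower:
            warnings.append(f"{name} and {name.upper()} differ")
    for entry in env.get("no_proxy", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        if entry.startswith("."):
            warnings.append(f"{entry}: leading dots are handled inconsistently")
        if "/" in entry:
            warnings.append(f"{entry}: CIDR blocks are not widely supported")
    return warnings


for warning in lint_proxy_env({
    "NO_PROXY": ".correct-company.com",
    "no_proxy": ".wrong-company.com,10.0.0.0/8",
}):
    print(warning)
```

Running a check like this against a pod's environment before deployment would have flagged the conflicting values in the customer case described earlier.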
no_proxy
Knowing the least common denominator can help avoid issues if these
definitions are copied for different Web clients. But should
no_proxy
and the other proxy settings have a documented standard rather than an
ad hoc convention? The list below may serve as a starting point for a
proposal:
Prefer the lowercase forms (e.g. http_proxy should be searched before HTTP_PROXY).
Use comma-separated hostname:port values.
Use * to match all hosts.
Strip leading dots (.) and match against domain suffixes.
no_proxy).
It's been over 25 years since the first Web proxy was released. While
the basic mechanics of configuring a Web client via environment
variables have not changed much, a number of subtleties have emerged
across different implementations. We saw that for one customer, erroneously
defining conflicting
no_proxy and
NO_PROXY variables led to hours of
troubleshooting due to differences in how Ruby and Go parse
these settings. We hope highlighting these differences will avoid future
issues in your production stack, and we hope that Web client maintainers
will standardize the behavior to avoid such issues in the first place.
