Is SOCKS5 Cool Now?
Kyoto, as of version v0.9.0, has a tor feature flag which enables a builtin tor client which helps protect against network metadata leaks. This protects the client’s IP address from the remote node. It also keeps the ISP in the dark as to who the client is talking to, although the ISP can probably tell they are using tor since the entry nodes are known. Sealing up metadata leaks is important, but in this case, there is a high cost.
First, tor makes the application slow as hell. Turns out bouncing around that onion routing system ain’t free. The second issue, more for maintainers, is that the tor client dependency is large and causes issues for the library. It forces the library’s MSRV up to 1.70.0 from 1.63.0. And there is a doozy of an issue with the sqlite dependency where, due to some underlying linking requirements, only one version of libsqlite3-sys can be specified. This is a high cost for a feature that very well might not be used at all.
We could just scrap the tor integration completely, but there might be a middle ground. Many applications offer tor integration through a SOCKS5 API. It requires that a tor daemon is running somewhere for the application to use, so tor is no longer nicely builtin to the app. But this greatly reduces the responsibilities and technical complexity of the app. In a desktop environment, it is pretty easy to fire up a tor daemon and there are actually performance benefits to sharing it between processes. In other environments, like mobile, getting a tor daemon might be a larger ask.
But in any case, why use SOCKS5? I have never looked into it much and for some reason it has a vague big-corporate feel to me, probably from my experience using it in the past…at big corps.
SOCKS5 is unsurprisingly the fifth version of SOCKS (SOCKet Secure) and is defined in RFC 1928. The RFC was published in 1996 so it’s been doing its thing for a while. And giving the RFC a read now, SOCKS5 is relatively lightweight compared to my assumptions. However, the original use case was exactly for big-corps, so my feelings weren’t entirely off.
Back in the early nineties, network firewalls were usually running on special purpose machines instead of on the routers themselves like nowadays. This was due to resource constraints. Routers were geared entirely to packet forwarding, but firewalls need more general-purpose CPUs for inspection. A corporate network would have a firewall between its users and the router which interfaced with the internet. Firewalls were still pretty limited, usually stateless IP/port based allow/deny rules. What if big-corp wanted to allow certain applications or users more access? This is the void SOCKS5 filled.
[Internal Clients] → [Firewall] → [SOCKS5 Server] → [Internet]
SOCKS dropped in after the firewall.
The firewall is updated to funnel traffic to the SOCKS5 server, to which it delegates further security. Only the SOCKS5 server can access the internet; internal users cannot directly reach out.
SOCKS5 is a session layer proxy, sitting below the application layer. It adds user authentication and then handles creating and maintaining TCP or UDP connections. The client performs a very compact handshake with the SOCKS server, just enough info to connect to the target server. After that the client just treats the socket as a TCP (or UDP) connection with the target server. The only complexity comes from the initial, optional, authentication.
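To get a feel for how compact that handshake is, here is a minimal sketch of the two messages a client sends, following the field layout in RFC 1928. The function names `greeting` and `connect_request` are made up for illustration and are not part of Kyoto or any library; the sketch only builds the bytes and skips authentication and the server replies.

```rust
// Field values from RFC 1928.
const VERSION: u8 = 0x05; // SOCKS protocol version
const METHOD_NO_AUTH: u8 = 0x00; // "no authentication required"
const CMD_CONNECT: u8 = 0x01; // open a TCP connection to the target
const ATYP_DOMAIN: u8 = 0x03; // target is given as a domain name

/// First client message: VER, NMETHODS, METHODS.
/// This sketch offers a single method, "no authentication".
fn greeting() -> [u8; 3] {
    [VERSION, 1, METHOD_NO_AUTH]
}

/// Second client message: VER, CMD, RSV, ATYP, DST.ADDR, DST.PORT.
/// The domain name is prefixed with a one-byte length, so it must be
/// between 1 and 255 bytes; anything else returns None.
fn connect_request(host: &str, port: u16) -> Option<Vec<u8>> {
    if host.is_empty() || host.len() > 255 {
        return None;
    }
    let mut buf = vec![VERSION, CMD_CONNECT, 0x00, ATYP_DOMAIN, host.len() as u8];
    buf.extend_from_slice(host.as_bytes());
    // The port travels in network byte order (big-endian).
    buf.extend_from_slice(&port.to_be_bytes());
    Some(buf)
}
```

For a target like `example.com:8333` the whole request is 18 bytes: a five byte header, the eleven byte name, and two bytes of port.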
OK, so SOCKS5 isn’t as big and scary as I had assumed. But that still leaves the question, why is tor using this old big-corp protocol? Well, because it is super simple and…old! If you toss the original big-corp use case, SOCKS sounds a lot like what tor provides. “Give me your end destination, I’ll handle the connection, just treat it as a dumb pipe”. A tor daemon exposes a SOCKS5 interface without any authentication. Tor could have designed its own protocol which would look very much like SOCKS5-minus-auth, but why not instead just use SOCKS5 so that every existing SOCKS5 client is also a tor client?
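As a rough illustration of the “dumb pipe” idea, the sketch below runs the no-auth exchange over any stream and then hands the socket back to the caller. It assumes a tor daemon listening on `127.0.0.1:9050` (the usual default `SocksPort`, but check your own config), and the names `socks5_connect` and `connect_via_tor` are hypothetical, not Kyoto’s API. It uses only the standard library and is a sketch under those assumptions, not a hardened client.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Assumed address of a locally running tor daemon's SOCKS port.
const TOR_SOCKS_ADDR: &str = "127.0.0.1:9050";

/// Runs the SOCKS5 no-auth handshake and a CONNECT request on `stream`.
/// On success the stream behaves like a direct connection to `host:port`.
fn socks5_connect<S: Read + Write>(stream: &mut S, host: &str, port: u16) -> io::Result<()> {
    if host.is_empty() || host.len() > 255 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "host must be 1 to 255 bytes"));
    }
    // Greeting: version 5, one method offered, "no authentication" (0x00).
    stream.write_all(&[0x05, 0x01, 0x00])?;
    let mut choice = [0u8; 2];
    stream.read_exact(&mut choice)?;
    if choice != [0x05, 0x00] {
        return Err(io::Error::new(io::ErrorKind::Other, "proxy did not accept no-auth"));
    }
    // CONNECT request using the domain name address type (0x03), so the
    // proxy resolves the name rather than the client.
    let mut req = vec![0x05, 0x01, 0x00, 0x03, host.len() as u8];
    req.extend_from_slice(host.as_bytes());
    req.extend_from_slice(&port.to_be_bytes());
    stream.write_all(&req)?;
    // Reply header: VER, REP, RSV, ATYP. REP 0x00 means success.
    let mut head = [0u8; 4];
    stream.read_exact(&mut head)?;
    if head[0] != 0x05 || head[1] != 0x00 {
        return Err(io::Error::new(io::ErrorKind::Other, format!("proxy reply code {}", head[1])));
    }
    // Drain the bound address and port so the stream is positioned at
    // the first byte from the target.
    let addr_len = match head[3] {
        0x01 => 4,  // IPv4
        0x04 => 16, // IPv6
        0x03 => {
            let mut len = [0u8; 1];
            stream.read_exact(&mut len)?;
            len[0] as usize
        }
        _ => return Err(io::Error::new(io::ErrorKind::InvalidData, "unknown address type")),
    };
    let mut bound = vec![0u8; addr_len + 2];
    stream.read_exact(&mut bound)?;
    Ok(())
}

/// Connects to the assumed local tor daemon and tunnels to `host:port`.
fn connect_via_tor(host: &str, port: u16) -> io::Result<TcpStream> {
    let mut stream = TcpStream::connect(TOR_SOCKS_ADDR)?;
    socks5_connect(&mut stream, host, port)?;
    Ok(stream)
}
```

Sending the target as a domain name rather than a resolved IP is a deliberate choice here: it leaves name resolution to the proxy, which avoids a local DNS lookup leaking who the client wants to talk to.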