
Analysis of the Principles of the SOCKS5 Protocol and Comparison of Implementations


SOCKS5 Tunnel Principle#

We often rely on the SOCKS protocol when bypassing network restrictions, yet we have never been very clear about how it actually works. I took advantage of the weekend to sort it out.

First, what is a network tunnel?

The definitions provided by various encyclopedias are summarized as follows:

A network tunnel is a new virtual network connection established on top of existing network protocols. By encapsulating the data packets of one network protocol within another, it enables data transmission between different networks. This method can isolate data transmission over public networks (such as the Internet) from private networks or other networks, thereby enhancing the security of data transmission.

That definition is hard to digest, so I started to wonder: why is it called a tunnel?

Think of the tunnels we see in daily life, such as a railway tunnel through a mountain. When a large mountain stands between point A and point B, a tunnel is dug from A to B so that trains and other vehicles can pass through.

The same idea applies to network protocols. For some reason, point A and point B cannot communicate directly (you know why), so we dig a tunnel (the SOCKS5 protocol) between A and B, and then send our train (HTTP data, since most of what we do online is HTTP communication) through it.

Isn't it easy to understand the SOCKS network tunnel this way? I'm quite clever!

Similarly, penetration testing has the HTTP tunnel, which uses certain features of the HTTP protocol (such as chunked transfer encoding) to carry HTTP traffic inside HTTP (no nesting jokes, please /doge). That is a topic for another time; this article only covers SOCKS network tunnels.


From the above analogy, we can see that the conditions for establishing a SOCKS network tunnel are as follows:

  • Destination, i.e., the target that the SOCKS proxy needs to connect to.
  • Construction team, i.e., the SOCKS proxy server.

In other words, client A needs a construction team and must tell it where to go; the construction team then digs the tunnel.

Finally, let’s provide a definition of a SOCKS5 tunnel; doesn’t it seem much easier to understand after the analogy?

A SOCKS5 tunnel is a network protocol tunnel used to transmit data between a client and a target server. SOCKS5 is the fifth version of the SOCKS protocol, which supports multiple authentication methods, as well as IPv4 and IPv6 addresses. SOCKS5 tunnels allow various protocols (such as HTTP, FTP, SMTP, etc.) to run on top of them and provide intermediate proxy services between the client and the target server.

The working principle of a SOCKS5 tunnel is to establish a proxy server between the client and the target server. The client does not communicate directly with the target server but sends data to the SOCKS5 proxy. The SOCKS5 proxy receives the data and then forwards it to the target server. The target server sends the response back to the SOCKS5 proxy, which then forwards the response to the client.

The main advantage of SOCKS5 tunnels is that they provide a universal network proxy solution that supports multiple protocols and address types. This allows SOCKS5 tunnels to be used to bypass firewalls and content filters, enabling access to restricted network resources.
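To make the client → proxy → target flow concrete, here is a minimal sketch of a Go program that sends an HTTP request through a SOCKS5 proxy. The helper name newSocks5HTTPClient is hypothetical, and the two addresses are assumptions borrowed from the test setup later in this article; the only library feature relied on is that net/http accepts a socks5:// proxy URL.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newSocks5HTTPClient is a hypothetical helper: it returns an HTTP client whose
// connections go through the SOCKS5 proxy at proxyAddr (host:port).
func newSocks5HTTPClient(proxyAddr string) (*http.Client, error) {
	proxyURL, err := url.Parse("socks5://" + proxyAddr)
	if err != nil {
		return nil, err
	}
	transport := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	return &http.Client{Transport: transport, Timeout: 5 * time.Second}, nil
}

func main() {
	// Both addresses are assumptions taken from the setup used later in this article.
	client, err := newSocks5HTTPClient("127.0.0.1:1080")
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}
	resp, err := client.Get("http://127.0.0.1:8082/ping")
	if err != nil {
		fmt.Println("request failed (is the proxy running?):", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

The target server only ever sees a TCP connection coming from the proxy, which is exactly the "train through the tunnel" picture.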

Implementing a SOCKS Proxy Service#

Here, we choose Go and Rust to compare the implementation of a SOCKS5 proxy server, i.e., the construction team of the tunnel, and briefly compare the performance to see which performs better in SOCKS5 proxying.

TCP Proxy Server Implementation#

First, let’s take a look at how to set up a general TCP server. Here is a diagram illustrating the transmission of communication data:

+-----------+       +--------------+       +--------------+
|  Browser  | <---> | TCP Proxy    | <---> | Target Server|
+-----------+       +--------------+       +--------------+

Note that the TCP proxy essentially just receives TCP data and forwards it. The SOCKS5 request itself is initiated by the browser, which is why we usually need to install a Chrome extension (such as Proxy SwitchyOmega) to choose the proxy method.

In Go, implementing a proxy server is very simple: use net.Listen to open a port, then keep calling Accept and spawn a goroutine for each incoming connection.


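package main

import (
	"fmt"
	"net"
)

func main() {
	// 1080 is the conventional SOCKS port and the one used in the tests below.
	server, err := net.Listen("tcp", ":1080")
	if err != nil {
		fmt.Printf("Listen failed: %v\n", err)
		return
	}

	for {
		client, err := server.Accept()
		if err != nil {
			fmt.Printf("Accept failed: %v\n", err)
			continue
		}
		// One goroutine per connection; process (not shown) handles that client.
		go process(client)
	}
}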

The client returned by Accept is a net.Conn. Reading from it yields the request the browser sent to us, and writing to it sends our response data back.

Rust has a counterpart to goroutines in Tokio, an asynchronous runtime whose tasks handle asynchronous IO. The basic code is as follows:

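use tokio::net::TcpListener;
use tokio::spawn;

#[tokio::main]
async fn main() {
    let listener = TcpListener::bind("127.0.0.1:1080").await.unwrap();
    loop {
        let (client, _) = listener.accept().await.unwrap();
        // One Tokio task per connection; handle_client (not shown) drives it
        spawn(handle_client(client));
    }
}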

Note that the most commonly used port for SOCKS5 proxies is 1080. By default, Wireshark also dissects traffic as SOCKS only on port 1080, so use that port if you want to capture packets (or use Decode As for another port).
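The Go listing above calls a process function that is not shown at this point. As a stand-in, here is a minimal sketch in which process simply echoes bytes back, to show that one net.Conn is used both for reading the request and for writing the response. The echo behaviour, the serve and echoOnce helpers, and the ephemeral loopback port are all assumptions made for illustration; the real handler runs the SOCKS5 phases described in the next section.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// process is a placeholder handler: it echoes every byte back to the sender.
func process(client net.Conn) {
	defer client.Close()
	io.Copy(client, client) // read from the connection, write to the same connection
}

// serve accepts connections until the listener is closed.
func serve(l net.Listener) {
	for {
		client, err := l.Accept()
		if err != nil {
			return
		}
		go process(client)
	}
}

// echoOnce starts a listener on an ephemeral loopback port, sends msg
// (which must be non-empty), and returns what comes back.
func echoOnce(msg string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()
	go serve(l)

	conn, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	reply, err := echoOnce("hello")
	if err != nil {
		fmt.Println("echo failed:", err)
		return
	}
	fmt.Println("reply:", reply)
}
```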

Implementing SOCKS5 Proxy#

The SOCKS5 protocol is essentially an application layer protocol, and the data will be packed into the payload of TCP packets. The SOCKS5 protocol can be divided into several parts, analogous to digging a tunnel:

  • socks5auth First, find the construction team.
  • socks5connect Start digging the tunnel.
  • socks5forward The tunnel is open!

socks5forward is the phase where the tunnel is in operation. At this stage, SOCKS5 is no longer involved because the tunnel has already been dug, allowing HTTP packets to roam freely!


SOCKS5 Auth: First, Find the Construction Team#

The SOCKS5 protocol is initiated by the client:

# Client sends
+----+----------+----------+
|VER | NMETHODS | METHODS  |
+----+----------+----------+
| 1  |    1     | 1 to 255 |
+----+----------+----------+

# Server responds
+----+--------+
|VER | METHOD |
+----+--------+
| 1  |   1    |
+----+--------+

The specific fields are as follows:

Client request:

  • VER Version number, 1 byte, 0x05 for SOCKS5
  • NMETHODS Number of available authentication methods, 1 byte
  • METHODS One byte per method; the length equals NMETHODS

Server response:

  • VER Version number, 1 byte
  • METHOD The authentication method the server selected; we use no authentication, so we fill in 0x00.

Thus, the first step is to read the request and then return 0x05, 0x00 to the client, meaning "version 5, no authentication required".

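func Socks5Auth(client net.Conn) error {
	buf := make([]byte, 256)

	// Read VER and NMETHODS
	if _, err := io.ReadFull(client, buf[:2]); err != nil {
		return errors.New("reading header: " + err.Error())
	}

	ver, nMethods := int(buf[0]), int(buf[1])
	if ver != 5 {
		return errors.New("invalid version")
	}

	// Read METHODS list (at most 255 bytes, so buf is large enough)
	if _, err := io.ReadFull(client, buf[:nMethods]); err != nil {
		return errors.New("reading methods: " + err.Error())
	}

	// No authentication required
	if _, err := client.Write([]byte{0x05, 0x00}); err != nil {
		return errors.New("write rsp: " + err.Error())
	}

	return nil
}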

Similarly, the Rust implementation is:

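use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn socks5_auth(client: &mut TcpStream) -> Result<(), Box<dyn std::error::Error>> {
    // Read VER and NMETHODS
    let mut buf = [0u8; 2];
    client.read_exact(&mut buf).await?;
    let ver = buf[0];
    let n_methods = buf[1];
    if ver != 0x05 {
        return Err("invalid version".into());
    }

    // Read the METHODS list (one byte per method)
    let mut methods = vec![0u8; n_methods as usize];
    client.read_exact(&mut methods).await?;

    // Select "no authentication required"
    client.write_all(&[0x05, 0x00]).await?;

    Ok(())
}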

That completes the first step of the SOCKS5 protocol: the construction team has been found and has told the client that it will help dig the tunnel!
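Both implementations above always answer 0x05, 0x00 without looking at the METHODS list. As a hedged sketch of a stricter variant, the hypothetical selectMethod below only picks "no authentication" when the client actually offered it, and otherwise answers 0xFF, which RFC 1928 defines as "no acceptable methods". The function and constant names are my own, not part of the code above.

```go
package main

import "fmt"

const (
	methodNoAuth       byte = 0x00 // NO AUTHENTICATION REQUIRED
	methodNoAcceptable byte = 0xFF // NO ACCEPTABLE METHODS
)

// selectMethod builds the 2-byte method-selection reply for the METHODS
// list of a client greeting. Only the no-auth method is supported here.
func selectMethod(offered []byte) []byte {
	for _, m := range offered {
		if m == methodNoAuth {
			return []byte{0x05, methodNoAuth}
		}
	}
	return []byte{0x05, methodNoAcceptable}
}

func main() {
	// Client offers no-auth (0x00) and username/password (0x02).
	fmt.Printf("% x\n", selectMethod([]byte{0x00, 0x02}))
	// Client offers only username/password, which this sketch does not support.
	fmt.Printf("% x\n", selectMethod([]byte{0x02}))
}
```

After sending 0xFF the server should close the connection, since the client has nothing left to negotiate with.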

SOCKS5 Connect: Start Digging the Tunnel#


The protocol details are as follows (numbers indicate byte lengths):

# Client sends
+----+-----+-------+------+----------+----------+
|VER | CMD |  RSV  | ATYP | DST.ADDR | DST.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | X'00' |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+

# Server responds
+----+-----+-------+------+----------+----------+
|VER | REP |  RSV  | ATYP | BND.ADDR | BND.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | X'00' |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+

Client request:

  • VER Version number, 1 byte, always 0x05
  • CMD Command, 1 byte; 0x01 means CONNECT
  • RSV Reserved, fixed at 0x00
  • ATYP Address type: 0x01 for IPv4, 0x03 for domain name, 0x04 for IPv6
  • DST.ADDR Destination address: 4 bytes for IPv4; for a domain name, the first byte is the length of the name, followed by the name itself; 16 bytes for IPv6 (not handled in this article).
  • DST.PORT Destination port, 2 bytes, in network byte order (big-endian)
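As a concrete instance of the request layout above, here is a small sketch that encodes a CONNECT request for a domain name (ATYP 0x03), the way a client would before sending it to the proxy. The helper name buildConnectRequest is hypothetical and not part of the server code in this article.

```go
package main

import (
	"errors"
	"fmt"
)

// buildConnectRequest encodes VER, CMD, RSV, ATYP, DST.ADDR and DST.PORT
// for a CONNECT to a domain name.
func buildConnectRequest(host string, port uint16) ([]byte, error) {
	if len(host) == 0 || len(host) > 255 {
		return nil, errors.New("domain name must be 1 to 255 bytes long")
	}
	// VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain), then the length byte.
	req := []byte{0x05, 0x01, 0x00, 0x03, byte(len(host))}
	req = append(req, host...)
	// The port goes on the wire high byte first (big-endian).
	req = append(req, byte(port>>8), byte(port))
	return req, nil
}

func main() {
	req, err := buildConnectRequest("example.com", 80)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Printf("% x\n", req)
}
```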

Server response:

  • VER Version number, 1 byte, always 0x05
  • REP Reply code; 0x00 means succeeded
  • RSV Reserved, fixed at 0x00

The following fields matter mainly for the BIND command. For the CONNECT command we are using, RFC 1928 has them carry the address and port the proxy bound for the outgoing connection, but in practice most clients ignore them, so we simply send 0.

  • ATYP Address type: 0x01 for IPv4, 0x03 for domain name, 0x04 for IPv6
  • BND.ADDR Bound address
  • BND.PORT Bound port
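The reply layout above can be sketched as a tiny helper that fills in REP and zeroes the bound address, which is also handy for reporting failures instead of silently dropping the connection. The constant names and buildReply are hypothetical; the numeric reply codes are the ones listed in RFC 1928.

```go
package main

import "fmt"

// A subset of the reply codes defined by RFC 1928.
const (
	repSucceeded            byte = 0x00
	repGeneralFailure       byte = 0x01
	repConnectionRefused    byte = 0x05
	repCommandNotSupported  byte = 0x07
	repAddrTypeNotSupported byte = 0x08
)

// buildReply returns a 10-byte reply: VER, REP, RSV, ATYP=IPv4,
// followed by an all-zero BND.ADDR (4 bytes) and BND.PORT (2 bytes).
func buildReply(rep byte) []byte {
	return []byte{0x05, rep, 0x00, 0x01, 0, 0, 0, 0, 0, 0}
}

func main() {
	fmt.Printf("success:          % x\n", buildReply(repSucceeded))
	fmt.Printf("unsupported atyp: % x\n", buildReply(repAddrTypeNotSupported))
}
```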

Since this step is about digging a tunnel, we need to know where the client wants us to dig the tunnel to, so this actually breaks down into two steps:

  • Parse the destination sent by the client (according to the protocol above).
  • Establish a TCP connection to the destination.

Client → SOCKS proxy

The reply from the SOCKS proxy to the client is a single line of code, which is marked with a comment below.

func Socks5Connect(client net.Conn) (net.Conn, error) {
	buf := make([]byte, 256)

	// The first four bytes: VER, CMD, RSV, ATYP
	if _, err := io.ReadFull(client, buf[:4]); err != nil {
		return nil, errors.New("read header: " + err.Error())
	}
	ver, cmd, _, atyp := buf[0], buf[1], buf[2], buf[3]
	if ver != 5 || cmd != 1 {
		return nil, errors.New("invalid ver/cmd")
	}

	addr := ""
	switch atyp {
	case 1: // Only the IPv4 case is handled here
		if _, err := io.ReadFull(client, buf[:4]); err != nil {
			return nil, errors.New("invalid IPv4: " + err.Error())
		}
		addr = fmt.Sprintf("%d.%d.%d.%d", buf[0], buf[1], buf[2], buf[3])
		//  ...

	default:
		return nil, errors.New("invalid atyp")
	}

	// Parse the port, note the byte order (network order, i.e. big-endian)
	if _, err := io.ReadFull(client, buf[:2]); err != nil {
		return nil, errors.New("read port: " + err.Error())
	}
	port := binary.BigEndian.Uint16(buf[:2])

	// Destination address obtained!
	destAddrPort := fmt.Sprintf("%s:%d", addr, port)

	// Start digging the tunnel
	dest, err := net.Dial("tcp", destAddrPort)
	if err != nil {
		return nil, errors.New("dial dst: " + err.Error())
	}

	// Response to the client, the tunnel is complete!
	if _, err := client.Write([]byte{0x05, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0}); err != nil {
		dest.Close()
		return nil, errors.New("write rsp: " + err.Error())
	}

	return dest, nil
}

Similarly, we implement it in Rust:

async fn socks5_connect(client: &mut TcpStream) -> Result<TcpStream, Box<dyn std::error::Error>> {
    // The first four bytes: VER, CMD, RSV, ATYP
    let mut buf = [0u8; 4];
    client.read_exact(&mut buf).await?;

    let ver = buf[0];
    let cmd = buf[1];
    let atyp = buf[3];
    if ver != 0x05 || cmd != 0x01 {
        return Err("Invalid ver/cmd".into());
    }

    let target_addr = match atyp {
        // Only the IPv4 case is handled here
        1 => {
            let mut addr = [0u8; 4];
            client.read_exact(&mut addr).await?;
            format!("{}.{}.{}.{}", addr[0], addr[1], addr[2], addr[3])
        }

        _ => return Err("Invalid atyp".into()),
    };

    // Parse the port (network byte order, i.e. big-endian)
    let mut port_buf = [0u8; 2];
    client.read_exact(&mut port_buf).await?;
    let port = u16::from_be_bytes(port_buf);

    // Start digging the tunnel!
    let target = TcpStream::connect(format!("{}:{}", target_addr, port)).await?;

    // Inform the client that the tunnel is complete!
    client
        .write_all(&[0x05, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0])
        .await?;

    Ok(target)
}
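Both handlers above only accept ATYP 0x01. Browsers frequently send domain names (0x03) instead, so here is a hedged sketch of how all three address types could be decoded. To keep it easy to test, the hypothetical parseDestination works on a byte slice that starts at ATYP rather than reading from a connection; it is an illustration, not a drop-in replacement for the code above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
)

// parseDestination decodes ATYP, DST.ADDR and DST.PORT from b (the bytes
// that follow VER, CMD and RSV) and returns the target as "host:port".
func parseDestination(b []byte) (string, error) {
	if len(b) < 1 {
		return "", errors.New("missing ATYP")
	}
	var host string
	rest := b[1:]
	switch b[0] {
	case 0x01: // IPv4: 4 bytes
		if len(rest) < 4 {
			return "", errors.New("short IPv4 address")
		}
		host, rest = net.IP(rest[:4]).String(), rest[4:]
	case 0x03: // Domain name: 1 length byte, then the name
		if len(rest) < 1 || len(rest) < 1+int(rest[0]) {
			return "", errors.New("short domain name")
		}
		n := int(rest[0])
		host, rest = string(rest[1:1+n]), rest[1+n:]
	case 0x04: // IPv6: 16 bytes
		if len(rest) < 16 {
			return "", errors.New("short IPv6 address")
		}
		host, rest = net.IP(rest[:16]).String(), rest[16:]
	default:
		return "", errors.New("unsupported ATYP")
	}
	if len(rest) < 2 {
		return "", errors.New("short port")
	}
	port := int(rest[0])<<8 | int(rest[1]) // big-endian
	// JoinHostPort adds the brackets that IPv6 literals need.
	return net.JoinHostPort(host, strconv.Itoa(port)), nil
}

func main() {
	req := append([]byte{0x03, 11}, "example.com\x00\x50"...)
	addr, err := parseDestination(req)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("destination:", addr)
}
```

The returned string can be passed straight to net.Dial, which resolves domain names on the proxy side.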

SOCKS5 Forward: The Tunnel is Open!#


At this point, we need to connect the client connection and the remote target connection, effectively stitching the two halves of the tunnel together. This is somewhat similar to the approach Zhan Tianyou used when digging the tunnels of the Beijing-Zhangjiakou railway, where both ends advanced at the same time.

In Go, we directly use io.Copy to implement it.

func Socks5Forward(client, target net.Conn) {
	forward := func(src, dest net.Conn) {
		defer src.Close()
		defer dest.Close()
		// io.Copy takes the destination first: copy from src to dest
		io.Copy(dest, src)
	}
	go forward(client, target)
	go forward(target, client)
}

In Rust, there is a similar API, tokio::io::copy.

// split() borrows each stream mutably, so client and target must be mut
let (mut cr, mut cw) = client.split();
let (mut tr, mut tw) = target.split();

let c_to_t = async {
    match tokio::io::copy(&mut cr, &mut tw).await {
        Ok(_) => {}
        Err(e) => {
            eprintln!("Error forwarding from client to target: {}", e);
        }
    }
};

let t_to_c = async {
    match tokio::io::copy(&mut tr, &mut cw).await {
        Ok(_) => {}
        Err(e) => {
            eprintln!("Error forwarding from target to client: {}", e);
        }
    }
};

// Futures are lazy: nothing is copied until both directions are awaited
tokio::join!(c_to_t, t_to_c);
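The Go Socks5Forward above starts two goroutines and returns immediately. If the caller wants to know when the tunnel has shut down, one option is to wait for both directions. Below is a minimal sketch of that idea; relay and demoRelay are hypothetical names, and the demo wires everything through in-memory net.Pipe connections with an echo goroutine standing in for the target server.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// relay copies data in both directions between a and b and returns
// once both directions have finished.
func relay(a, b net.Conn) {
	var wg sync.WaitGroup
	pipe := func(dst, src net.Conn) {
		defer wg.Done()
		io.Copy(dst, src)
		// Closing both ends unblocks the copy running in the other direction.
		dst.Close()
		src.Close()
	}
	wg.Add(2)
	go pipe(a, b)
	go pipe(b, a)
	wg.Wait()
}

// demoRelay sends msg (non-empty) through app -> relay -> echo target
// and returns what comes back.
func demoRelay(msg string) (string, error) {
	appSide, proxyClientSide := net.Pipe()
	proxyTargetSide, targetSide := net.Pipe()

	// Stand-in for the target server: echo everything back.
	go func() {
		defer targetSide.Close()
		io.Copy(targetSide, targetSide)
	}()
	go relay(proxyClientSide, proxyTargetSide)

	defer appSide.Close()
	if _, err := appSide.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(appSide, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	reply, err := demoRelay("ping")
	if err != nil {
		fmt.Println("relay failed:", err)
		return
	}
	fmt.Println("echoed through the tunnel:", reply)
}
```

Closing both connections as soon as one direction ends is the simplest policy; it gives up half-close semantics, which is an acceptable trade-off for a sketch.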

Thus, a SOCKS5 network tunnel has been established, and thereafter, HTTP packets (the train) can start to roam freely.

Wireshark Packet Capture Testing#

As mentioned earlier, Wireshark by default only parses traffic as SOCKS when it runs on port 1080.

The capture below marks the three phases of SOCKS5 (auth, connect, forward); you can open the individual packets yourself to see the details:

[Figure: Wireshark capture with the three SOCKS5 phases marked]
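To reproduce the first two phases of such a capture without a browser, here is a sketch of the client side of the handshake. Everything in it is an assumption made for illustration: clientHandshake, fakeProxy and demoHandshake are hypothetical names, the fake proxy answers like the servers in this article but never dials the target, and the exchange runs over an in-memory net.Pipe rather than a real socket.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// clientHandshake performs the auth and connect phases over conn for an
// IPv4 destination.
func clientHandshake(conn net.Conn, ip [4]byte, port uint16) error {
	// Phase 1: offer a single method, 0x00 (no authentication).
	if _, err := conn.Write([]byte{0x05, 0x01, 0x00}); err != nil {
		return err
	}
	sel := make([]byte, 2)
	if _, err := io.ReadFull(conn, sel); err != nil {
		return err
	}
	if sel[0] != 0x05 || sel[1] != 0x00 {
		return errors.New("server rejected the no-auth method")
	}

	// Phase 2: CONNECT request; the port is sent big-endian.
	req := []byte{0x05, 0x01, 0x00, 0x01, ip[0], ip[1], ip[2], ip[3], byte(port >> 8), byte(port)}
	if _, err := conn.Write(req); err != nil {
		return err
	}
	// Assumes a 10-byte reply with an IPv4 BND.ADDR, as sent in this article.
	rep := make([]byte, 10)
	if _, err := io.ReadFull(conn, rep); err != nil {
		return err
	}
	if rep[1] != 0x00 {
		return fmt.Errorf("CONNECT failed, REP=0x%02x", rep[1])
	}
	return nil
}

// fakeProxy answers both phases with success but never dials the target.
func fakeProxy(conn net.Conn) {
	defer conn.Close()
	buf := make([]byte, 256)
	if _, err := io.ReadFull(conn, buf[:2]); err != nil { // VER, NMETHODS
		return
	}
	if _, err := io.ReadFull(conn, buf[:int(buf[1])]); err != nil { // METHODS
		return
	}
	conn.Write([]byte{0x05, 0x00})
	if _, err := io.ReadFull(conn, buf[:10]); err != nil { // IPv4 CONNECT request
		return
	}
	conn.Write([]byte{0x05, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0})
}

// demoHandshake runs the client against the fake proxy over an in-memory pipe.
func demoHandshake() error {
	clientSide, proxySide := net.Pipe()
	defer clientSide.Close()
	go fakeProxy(proxySide)
	return clientHandshake(clientSide, [4]byte{127, 0, 0, 1}, 8082)
}

func main() {
	if err := demoHandshake(); err != nil {
		fmt.Println("handshake failed:", err)
		return
	}
	fmt.Println("handshake completed, the tunnel is ready for forwarding")
}
```

Pointing clientHandshake at a real connection to port 1080 instead of the pipe should produce the same two request and reply pairs in a capture.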

Performance Testing Comparison#

The idea behind the performance test is to set up an HTTP server and then use both the Go and Rust implementations of the SOCKS5 proxy to establish tunnels and initiate requests to see the actual QPS performance.

To facilitate this, we will use Gin to set up an HTTP server.

package main

import "github.com/gin-gonic/gin"

func main() {
	r := gin.Default()
	r.GET("/ping", func(c *gin.Context) {
		c.String(200, "pong")
	})
	r.Run(":8082")
}

To generate load, we use this benchmark tool, which supports the SOCKS5 protocol:

go install github.com/cnlh/benchmark@latest

First, let’s measure the QPS of the web server. Since the M1 Mac machine has relatively low specifications, we will use 100 concurrent connections to run 100,000 requests.

 benchmark -c 100 -n 100000 http://127.0.0.1:8082/ping -ignore-err
Running 100000 test @ 127.0.0.1:8082 by 100 connections
Request as following format:

GET /ping HTTP/1.1
Host: 127.0.0.1:8082

100000 requests in 4.45s, 11.46MB read, 4.20MB write
Requests/sec: 22464.38
Transfer/sec: 3.52MB
Error(s)    : 0
Percentage of the requests served within a certain time (ms)
    50%				2
    65%				2
    75%				3
    80%				4
    90%				8
    95%				17
    98%				29
    99%				51
   100%				82

The data here can be interpreted as follows:

  • A total of 100,000 requests were completed in 4.45 seconds.
  • The average requests per second is 22,464.38, i.e., the QPS is 22k.
  • The last segment of data provides the percentage of requests served within a certain time, along with the corresponding response times:
    • 50% of requests were responded to within 2ms.
    • 99% of requests were responded to within 51ms.
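The arithmetic behind these figures can be sketched in a few lines of Go. The two helpers below, qps and percentile, are hypothetical names of my own; the nearest-rank percentile method is an assumption, since I have not checked how the benchmark tool computes its table, and the latency sample in main is made up for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// qps returns the number of completed requests per second.
func qps(requests int, seconds float64) float64 {
	return float64(requests) / seconds
}

// percentile returns the smallest latency such that at least p percent
// of the samples are less than or equal to it (nearest-rank method).
func percentile(latenciesMs []int, p float64) int {
	if len(latenciesMs) == 0 {
		return 0
	}
	sorted := append([]int(nil), latenciesMs...) // copy, keep the input intact
	sort.Ints(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	fmt.Printf("QPS: %.2f\n", qps(100000, 4.45))

	// Made-up latencies in milliseconds, only to exercise percentile.
	sample := []int{2, 2, 3, 4, 8, 17, 29, 51, 82, 2}
	fmt.Println("p50:", percentile(sample, 50), "p99:", percentile(sample, 99))
}
```

Dividing 100,000 by the displayed 4.45 seconds gives roughly 22,472, slightly above the reported 22,464.38, presumably because the tool divides by the unrounded elapsed time.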

Next, let's bring on the two construction teams. First, the Go representative, goroutines. The QPS drops only slightly (from 22,464 to 22,296, under 1%), and the tail latency is, surprisingly, even lower (99% of requests within 35ms versus 51ms).

 benchmark -c 100 -n 100000 -proxy socks5://127.0.0.1:1080 http://127.0.0.1:8082/ping -ignore-err
Running 100000 test @ 127.0.0.1:8082 by 100 connections
Request as following format:

GET /ping HTTP/1.1
Host: 127.0.0.1:8082

100000 requests in 4.49s, 11.46MB read, 4.20MB write
Requests/sec: 22295.77
Transfer/sec: 3.49MB
Error(s)    : 0
Percentage of the requests served within a certain time (ms)
    50%				2
    65%				3
    75%				4
    80%				4
    90%				7
    95%				14
    98%				25
    99%				35
   100%				63

Next, let's bring on the Rust representative, Tokio. The QPS dropped by about 2k (from 22,464 to 20,218, roughly 10%); the median rose from 2ms to 3ms, and the slowest request took 92ms.

 benchmark -c 100 -n 100000 -proxy socks5://127.0.0.1:1080 http://127.0.0.1:8082/ping -ignore-err
Running 100000 test @ 127.0.0.1:8082 by 100 connections
Request as following format:

GET /ping HTTP/1.1
Host: 127.0.0.1:8082

100000 requests in 4.95s, 11.46MB read, 4.20MB write
Requests/sec: 20218.10
Transfer/sec: 3.17MB
Error(s)    : 0
Percentage of the requests served within a certain time (ms)
    50%				3
    65%				4
    75%				4
    80%				5
    90%				7
    95%				13
    98%				25
    99%				34
   100%				92

In this test, at least, the goroutine representative does come out ahead!

