Should You Use A Buffer?
// #Rust
The BIP-324 protocol follows a pattern once the ciphers of a channel are fired up. A reader on one end reads 3 bytes to get the length of a packet, then reads that many bytes to get the packet itself. And then loops back around. 3 bytes, packet, 3 bytes, packet…and so on. It is also heavily implied in the spec that each packet should contain just one bitcoin p2p message. So if a remote fires over 3 p2p messages, that could be 6 read syscalls on the local receiver. Now theoretically, if 3 packets are sitting in the queue, they could be read all at once and pieced apart later on. 6 syscalls into 1. This kinda sounds like the main selling point in the brochure for a BufReader.
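For concreteness, here is a rough sketch of that read loop with the ciphers elided (the real protocol decrypts the 3-byte length and the packet contents, and read_packet is just a made-up helper name). Without buffering, each read_exact call here bottoms out in at least one read syscall.

```rust
use std::io::{Read, Result};

// Hypothetical helper showing the shape of the loop: one read for the
// 3-byte little-endian length prefix, one read for the packet body.
// The real BIP-324 implementation decrypts both, which is elided here.
fn read_packet<R: Read>(reader: &mut R) -> Result<Vec<u8>> {
    let mut len_bytes = [0u8; 3];
    reader.read_exact(&mut len_bytes)?;
    let len = u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], 0]) as usize;

    let mut packet = vec![0u8; len];
    reader.read_exact(&mut packet)?;
    Ok(packet)
}
```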
BufReader in Rust is a struct that adds buffering to any reader, improving efficiency by reducing the number of system calls when reading data. It is particularly useful for reading small amounts of data repeatedly from files or network streams.
BufReaders aren’t free, else we would probably just use them by default everywhere. There is another memory allocation required. And the data needs to move at least one more time, an extra copy. So you have to measure the cost of a syscall against these. Obviously, if the reader is reading from a source which is already in a program’s memory (e.g. a vector), a buffer would be pure overhead, no syscalls to soak up.
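Wrapping a stream is a one-liner, which is part of the appeal. A minimal sketch, assuming a TCP connection to a placeholder address; the capacity is the extra allocation, and every byte takes one more hop through it.

```rust
use std::io::{BufReader, Read};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // Placeholder peer address, purely for illustration.
    let stream = TcpStream::connect("127.0.0.1:8333")?;

    // The internal buffer is the extra allocation (8 KiB here). Bytes are
    // copied from the socket into it by a read syscall, then copied again
    // out to the caller: the extra move mentioned above.
    let mut reader = BufReader::with_capacity(8 * 1024, stream);

    // Small reads like the 3-byte length prefix are now usually served out
    // of the buffer instead of each being its own syscall.
    let mut len_bytes = [0u8; 3];
    reader.read_exact(&mut len_bytes)?;
    Ok(())
}
```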
So what is the cost of a syscall? I think this is hard to say. By definition, a syscall (system call) is architecture dependent. They are low-level, providing an abstraction layer and security to the caller. When we write “high-level” Rust code, we don’t care about the exact network card a machine is using, we are coding against the Read interface which uses a read syscall under the hood. We also have a general expectation that no other programs are reading and/or writing bits from our read stream. This is why syscalls exist.
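To make that abstraction concrete, here is a small sketch (the helper name is made up). The same generic function accepts anything that implements Read: a TcpStream where each read is a syscall under the hood, or an in-memory Cursor where no syscalls happen at all, which is the pure-overhead case mentioned above.

```rust
use std::io::{Cursor, Read};

// Hypothetical helper: generic over any Read implementation, so the caller
// never touches the syscall layer directly.
fn read_length_prefix<R: Read>(reader: &mut R) -> std::io::Result<u32> {
    let mut len_bytes = [0u8; 3];
    reader.read_exact(&mut len_bytes)?;
    Ok(u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], 0]))
}

fn main() -> std::io::Result<()> {
    // An in-memory Cursor satisfies Read with zero syscalls, so wrapping
    // it in a BufReader would be pure overhead.
    let mut in_memory = Cursor::new(vec![0x05, 0x00, 0x00]);
    assert_eq!(read_length_prefix(&mut in_memory)?, 5);

    // A TcpStream satisfies the same interface, but each read bottoms out
    // in a read syscall:
    // let mut stream = std::net::TcpStream::connect("127.0.0.1:8333")?;
    // let len = read_length_prefix(&mut stream)?;
    Ok(())
}
```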
But this also means syscalls have to do some work, they are not a one-time mapping at startup or something (or at least, not usually). When a program requires a syscall, it hands over control to the kernel. It is a context switch. With how optimized things are at the CPU level these days, I am not sure of the cost of that switch. Are caches super busted or is it a quick switch back? Does the cost depend on other program activity on the machine? Or the device being used? I think the answer to all of these is “it depends”, which means the answer for when to use a BufReader is “it depends”. But I think for BufReader, if we set aside the memory cost, the worst-case scenario is a quiet machine. That is when the buffer is least helpful. So for the bip324 case, I think it will almost always help and at worst, break even.
I wrote a little example benchmark which tries to capture the idea. For each test scenario, there is a reader and a writer. The writer just dumps a bunch of messages into the connection. The reader reads them all. This doesn’t model the characteristics of a real-life bitcoin p2p connection (bursty, dormant), or latency, or partially written packets over a bad connection. But it does show that the buffer helps in “receive heavy” scenarios, and doesn’t really ever hurt in this case (unless you are super sensitive to memory usage and don’t want the extra buffer).
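The sketch below is not the exact benchmark, it just captures the shape of the scenario: a writer thread dumps a pile of length-prefixed messages over a local TCP connection, and the reader drains them, timed with and without a BufReader. The message count and sizes are arbitrary.

```rust
use std::io::{BufReader, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

const MESSAGES: usize = 10_000;
const MSG_LEN: usize = 64;

// Writer side: dump a stream of small length-prefixed messages and hang up.
fn spawn_writer(listener: TcpListener) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        let (mut socket, _) = listener.accept().unwrap();
        let len_prefix = (MSG_LEN as u32).to_le_bytes();
        let body = vec![0xAB_u8; MSG_LEN];
        for _ in 0..MESSAGES {
            socket.write_all(&len_prefix[..3]).unwrap();
            socket.write_all(&body).unwrap();
        }
    })
}

// Reader side: the 3-bytes-then-packet loop from earlier.
fn drain<R: Read>(reader: &mut R) {
    let mut len_bytes = [0u8; 3];
    for _ in 0..MESSAGES {
        reader.read_exact(&mut len_bytes).unwrap();
        let len = u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], 0]) as usize;
        let mut packet = vec![0u8; len];
        reader.read_exact(&mut packet).unwrap();
    }
}

fn run(buffered: bool) -> Duration {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    let writer = spawn_writer(listener);

    let mut stream = TcpStream::connect(addr).unwrap();
    let start = Instant::now();
    if buffered {
        drain(&mut BufReader::new(stream));
    } else {
        drain(&mut stream);
    }
    let elapsed = start.elapsed();
    writer.join().unwrap();
    elapsed
}

fn main() {
    println!("unbuffered: {:?}", run(false));
    println!("buffered:   {:?}", run(true));
}
```

On a run like this the buffered reader should generally come out ahead, since thousands of tiny reads collapse into a handful of syscalls, but the exact numbers will depend on the machine.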