Wrapping I/O

// #Rust #Craft

The I/O traits covered back in the Async I/O log are powerful due to their simplicity. To recap Read real quick: its only required method is fn read(&mut self, buf: &mut [u8]) -> Result<usize>, which reads bytes from a source and writes them into buf.

The contract for Read is that it reads some number of bytes; it is stream oriented. For some scenarios this is totally fine, and a wrapper or adapter approach can be taken where a type takes ownership of a Read source and implements Read itself. A higher level caller doesn’t care that the adapter has been spliced in, the interface hasn’t changed. Importantly, that interface is just spittin’ out bytes.
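As a sketch of that adapter shape, here is a toy Read wrapper. The type and its XOR transform are made up purely for illustration, they are not from any real crate, but they show how an adapter splices into a stream without changing the interface:

```rust
use std::io::{self, Read};

// Toy adapter: XORs every byte pulled from the inner source.
// The type and transform are hypothetical, just to show the shape.
struct XorReader<R: Read> {
    inner: R,
    key: u8,
}

impl<R: Read> Read for XorReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Delegate to the wrapped source, then transform in place.
        let n = self.inner.read(buf)?;
        for byte in &mut buf[..n] {
            *byte ^= self.key;
        }
        Ok(n)
    }
}
```

A caller holding a XorReader just sees another Read; the splice is invisible.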

Most use cases, even ones which compose well, have some sort of “minimum meaningful unit”. Think of a compression algorithm: there usually isn’t much to compress based on just a single byte of data. It needs to chunk it up somehow. So there is naturally going to be some buffering on both the read and write ends to build up enough data to work on. But compression is still just slinging bytes, the caller doesn’t need to be aware of the buffering under the hood. It is an implementation detail.
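To make that buffering concrete, here is a toy Write adapter, entirely hypothetical, with 4 bytes standing in for a real algorithm’s block size. It holds back partial blocks and only forwards complete ones to its sink:

```rust
use std::io::{self, Write};

// Toy adapter with a "minimum meaningful unit": buffers writes and
// only forwards complete 4-byte blocks to the sink, the way a block
// cipher or compressor needs enough data before it can do any work.
// Purely illustrative.
struct BlockWriter<W: Write> {
    inner: W,
    buf: Vec<u8>,
}

impl<W: Write> Write for BlockWriter<W> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buf.extend_from_slice(data);
        // Forward only whole blocks; the caller never sees this.
        let whole = self.buf.len() - self.buf.len() % 4;
        self.inner.write_all(&self.buf[..whole])?;
        self.buf.drain(..whole);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // A real implementation would pad or emit the partial block here.
        self.inner.flush()
    }
}
```

The caller keeps slinging bytes at a plain Write; the partial block parked in the buffer is invisible to them.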

Technically speaking, the BIP-324 protocol could be implemented as a Read/Write adapter even though it is packet oriented. Messages are serialized over the wire as a length prefix followed by the encrypted payload. This payload has an authentication tag which is used to authenticate the whole packet. I have the tag broken out in the structure below, but it is technically part of the AEAD output.

struct BIP324Packet {
    length: [u8; 3],            // Packet length.
    encrypted_payload: Vec<u8>, // Contains a header and the contents.
    tag: [u8; 16],              // Authentication tag for the packet.
}

The structure of a BIP-324 packet.

The BIP-324 spec strongly suggests that the contents should just be one application layer p2p message, with the first bytes describing the type of message and the rest assumed to be the message payload. BIP-324 does introduce some special serialization for some message types, but all messages are still written with their network serialization definition, which includes length prefixes. So theoretically, multiple messages could be parsed out of a single packet. But as of today, BIP-324 packets are always 1:1 with the application layer p2p message. If anything, I think this creates a stronger impedance mismatch between the packets and the I/O traits.

For the Write side, a chunk of bytes would be considered a packet and end up on the wire in the above structure. But there is nothing in the Write contract saying that bytes passed to it need to be written right away; what if it buffers some calls? It probably shouldn’t send them in one packet due to how the spec is worded, but there is also nothing in the Write contract which says “each byte array is a packet chunk”. An implementation would work, but it would be a little weird having this implicit semantic contract.

On the Read side, a wrapper could pull bytes from a source until it has the whole packet, authenticate and decrypt it, and then toss back the plaintext bytes. It would be doing the work to find the packet boundaries, but then kinda hiding it from the caller since it just returns a byte array. And there is nothing in the Read contract saying this array should be only one packet. Again, an implicit detail of the implementation.
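That read-side loop is easy to sketch with read_exact. This is a hypothetical, simplified sync version: it assumes the 3-byte length prefix is little-endian and already decrypted (on the real wire the length bytes are themselves encrypted), and the authentication and decryption steps are stubbed out:

```rust
use std::io::{self, Read};

// Sketch of the read side: pull exactly one packet's worth of bytes
// off the stream, rediscovering the boundary from the length prefix.
// Simplified: the real transport decrypts the length field and
// authenticates/decrypts the payload with the tag.
fn read_packet<R: Read>(source: &mut R) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 3];
    source.read_exact(&mut prefix)?;
    let length = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], 0]) as usize;

    let mut payload = vec![0u8; length];
    source.read_exact(&mut payload)?;

    let mut tag = [0u8; 16];
    source.read_exact(&mut tag)?;
    // Real code would authenticate `payload` against `tag` and decrypt here.

    Ok(payload)
}
```

If this were exposed as Read::read, all of that boundary work would be thrown away the moment the plaintext lands in the caller’s byte array.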

So in general, the traits could work, but information is lost or left implicit. The byte boundaries are not an implementation detail like they are in compression, they are actually protocol semantics. A caller cares about the boundaries, and it would just be a pain to have to re-discover them.

How to Wrap

Ok, I get it, we won’t be implementing an I/O adapter for the BIP-324 transport. But how should a wrapper over a Read and Write trait be implemented?

I think there are a few options.

  1. Just take ownership of the source and sink.
  2. Accept mutable references to the source and sink when needed.
  3. Implement as a source and sink itself…JK!
  4. Extend the source and sink traits.

The first iteration of the bip324 crate’s AsyncProtocol wrapper went with option #2. And to be honest, I am not sure why we do that over option #1. Option #1 allows for a simpler API for callers, they don’t have to pass around the references. It also allows the wrapper to optimize its usage of the source and sink, for example, wrapping it in a buffer to handle some of the edge cases of the handshake. The tradeoff shows up if the caller wants to share the source and sink with other things as well. But in the BIP-324 case, the connection is built on top of the complex handshake; I don’t see a reason to use it for something else.
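Option #1 roughly looks like this sketch, where a hypothetical Transport type (not the actual bip324 crate API) owns both halves and exposes packet methods directly. Toy little-endian length-prefix framing stands in for the real encryption:

```rust
use std::io::{self, Read, Write};

// Option #1 sketch: the wrapper owns the source and sink outright.
// All names and the toy framing are hypothetical; the real bip324
// wrapper would also hold the cipher state in here.
struct Transport<R: Read, W: Write> {
    source: R,
    sink: W,
}

impl<R: Read, W: Write> Transport<R, W> {
    fn new(source: R, sink: W) -> Self {
        Transport { source, sink }
    }

    // Write one packet: 3-byte little-endian length prefix, then contents.
    fn write_packet(&mut self, contents: &[u8]) -> io::Result<()> {
        let length = (contents.len() as u32).to_le_bytes();
        self.sink.write_all(&length[..3])?;
        self.sink.write_all(contents)
    }

    // Read one packet back, surfacing the boundary to the caller.
    fn read_packet(&mut self) -> io::Result<Vec<u8>> {
        let mut prefix = [0u8; 3];
        self.source.read_exact(&mut prefix)?;
        let length = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], 0]) as usize;
        let mut contents = vec![0u8; length];
        self.source.read_exact(&mut contents)?;
        Ok(contents)
    }
}
```

The caller hands over the connection once and from then on only thinks in packets; the wrapper is free to buffer internally however it likes.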

That leaves the more interesting option #4.

pub trait BIP324Ext: AsyncRead + AsyncWrite {
    async fn read_bip324_packet(&mut self) -> Result<Vec<u8>, Error> {
        // Frame, authenticate, and decrypt using our own AsyncRead methods.
    }
    
    async fn write_bip324_packet(&mut self, data: &[u8]) -> Result<(), Error> {
        // Encrypt and frame using our own AsyncWrite methods.
    }
}

// Blanket implementation across anything which
// can serve as a source and a sink.
impl<T: AsyncRead + AsyncWrite + Unpin> BIP324Ext for T {}

let mut stream = TcpStream::connect("127.0.0.1:8333").await?;
let packet = stream.read_bip324_packet().await?;

Magic with a trait extension…minus a few details.

This looks neat, the stream becomes the wrapper. But it is missing an important detail: the BIP-324 transport is stateful, the ciphers need to be used on every call to keep things in sync. Is it too much to ask the caller to pass some sort of “context” type to every call which holds the ciphers? That is an obvious downside compared to the simplicity of #1, but I think there are a few more potential issues. The first is that the caller could use the stream for something else by accident and forget about the ciphers. Or pass the wrong context. And they have to manage the lifetime of the context, which is tied to this particular source and sink. It is kinda the inverse of option #2, but we have traded the source/sink for a context. And I don’t think a context would ever be used on a different source/sink, so not sure the separation is worth it.
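For a feel of what that context-passing shape costs, here is a hypothetical sync sketch: toy length-prefix framing again, with a plain counter standing in for the real cipher state, and every call forced to thread the context through:

```rust
use std::io::{self, Read};

// Stand-in for the real cipher state: a counter instead of ciphers.
// Hypothetical, just to show the state that must stay in sync.
struct Context {
    packets_processed: u64,
}

// Extension trait sketch where the caller supplies the context on
// every call. Toy little-endian length-prefix framing, no encryption.
trait PacketExt: Read {
    fn read_packet_with(&mut self, ctx: &mut Context) -> io::Result<Vec<u8>> {
        let mut prefix = [0u8; 3];
        self.read_exact(&mut prefix)?;
        let length = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], 0]) as usize;
        let mut contents = vec![0u8; length];
        self.read_exact(&mut contents)?;
        // The state must advance on every packet; pass the wrong context
        // and a real cipher stream silently falls out of sync.
        ctx.packets_processed += 1;
        Ok(contents)
    }
}

// Blanket implementation, same trick as before.
impl<T: Read> PacketExt for T {}
```

Nothing ties a Context to the stream it was created for, which is exactly the footgun described above.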

And maybe just a question for me, but how discoverable are extension traits? I think I am just new to the concept, but the API surface of a type gets a little weird.