I have an interesting error design scenario that I have been struggling with while working on the encoding stuff. To set the stage, I am working on decoders, which read bytes and put them into some structures. Decoders are generally more complicated than encoders because there are more failure scenarios. Insufficient data, malformed data, invalid data, protocol violations, resource limits…none of these happen when encoding. And these error variants can be unique per decoder type, so it is tough to just enumerate a handful.
When decoding, especially in the bitcoin domain, we are often composing decoders together. A header decoder is made up of six inner decoders, which are themselves composed of more primitive decoders (e.g. a “4 byte array decoder”). So what is the type of the error returned by that high-level header decoder? Technically it is a large enumeration of all the inner decoder error types. If there are only two inner decoders, the error type is an enum of the first error and the second error.
    pub enum Either<L, R> {
        First(L),
        Second(R),
    }
A sum type which enumerates two error conditions.
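To make the composition concrete, here is a minimal sketch of a two-decoder combinator whose error type is an Either of the inner errors. The Decoder trait below is a simplified, hypothetical stand-in (one decode call over a byte slice), not the real trait, and ArrayDecoder, ArrayError and Decoder2 are illustrative names.

```rust
/// Simplified, hypothetical stand-in for the real trait: decode one value
/// from the front of `bytes`, advancing the slice past what was consumed.
pub trait Decoder {
    type Output;
    type Error;
    fn decode(&mut self, bytes: &mut &[u8]) -> Result<Self::Output, Self::Error>;
}

#[derive(Debug, PartialEq)]
pub enum Either<L, R> {
    First(L),
    Second(R),
}

/// Hypothetical primitive decoder: reads exactly `N` bytes.
pub struct ArrayDecoder<const N: usize>;

#[derive(Debug, PartialEq)]
pub struct ArrayError {
    pub needed: usize,
    pub available: usize,
}

impl<const N: usize> Decoder for ArrayDecoder<N> {
    type Output = [u8; N];
    type Error = ArrayError;

    fn decode(&mut self, bytes: &mut &[u8]) -> Result<[u8; N], ArrayError> {
        if bytes.len() < N {
            return Err(ArrayError { needed: N, available: bytes.len() });
        }
        let (head, tail) = bytes.split_at(N);
        *bytes = tail;
        let mut out = [0u8; N];
        out.copy_from_slice(head);
        Ok(out)
    }
}

/// Runs `A` then `B`; the error records which of the two failed.
pub struct Decoder2<A, B>(pub A, pub B);

impl<A: Decoder, B: Decoder> Decoder for Decoder2<A, B> {
    type Output = (A::Output, B::Output);
    type Error = Either<A::Error, B::Error>;

    fn decode(&mut self, bytes: &mut &[u8]) -> Result<Self::Output, Self::Error> {
        let a = self.0.decode(bytes).map_err(Either::First)?;
        let b = self.1.decode(bytes).map_err(Either::Second)?;
        Ok((a, b))
    }
}
```

Nesting Decoder2 inside another Decoder2 is what produces the Either-of-Either types: each layer of composition adds a layer to the error type.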
As the decoder nesting gets more complex, the type of the error gets more complex.
    type Decoder2Error = Either<ArrayError, NetworkError>;

    type Decoder4Error = Either<
        Either<ArrayError, NetworkError>,
        Either<ParseError, ValidationError>,
    >;

    type Decoder6Error = Either<
        Either<Either<ArrayError, NetworkError>, ParseError>,
        Either<Either<ValidationError, FormatError>, IoError>,
    >;
Either’s everywhere.
These are a pain to name, but also a pain for a caller to handle.
    match nested {
        Either::First(Either::First(array_err)) => handle_array_error(array_err),
        Either::First(Either::Second(net_err)) => handle_network_error(net_err),
        Either::Second(Either::First(parse_err)) => handle_parse_error(parse_err),
        Either::Second(Either::Second(val_err)) => handle_validation_error(val_err),
    }
Handling nested error types.
So are there any other options? The Decoder trait has an associated error type, and this makes sense. It would be kind of weird to make the trait generic over an error type, as in Decoder<E>. If that were the case, a decodable type could have multiple decoder implementations, each with its own error. This is clearly not what we want to model. How can we compose decoders, but keep the simple Decoder trait?
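The difference between the two trait shapes can be sketched in a few lines. Everything here is hypothetical and heavily simplified (a decoder that just validates one byte); the point is only that a generic trait admits several implementations per type, while an associated type pins exactly one error.

```rust
#[derive(Debug, PartialEq)]
pub struct ErrA;
#[derive(Debug, PartialEq)]
pub struct ErrB;

/// Hypothetical decoder that accepts any non-zero byte.
pub struct NonZero;

/// Generic over the error: one type can implement this many times.
pub trait GenericDecoder<E> {
    fn decode(&mut self, byte: u8) -> Result<u8, E>;
}

impl GenericDecoder<ErrA> for NonZero {
    fn decode(&mut self, byte: u8) -> Result<u8, ErrA> {
        if byte == 0 { Err(ErrA) } else { Ok(byte) }
    }
}

// A second implementation for the same type compiles fine, which is the
// "multiple decoder implementations, each with its own error" problem.
impl GenericDecoder<ErrB> for NonZero {
    fn decode(&mut self, byte: u8) -> Result<u8, ErrB> {
        if byte == 0 { Err(ErrB) } else { Ok(byte) }
    }
}

/// Associated error type: at most one implementation per decoder type,
/// so the error is a function of the decoder.
pub trait Decoder {
    type Error;
    fn decode(&mut self, byte: u8) -> Result<u8, Self::Error>;
}

impl Decoder for NonZero {
    type Error = ErrA;
    fn decode(&mut self, byte: u8) -> Result<u8, ErrA> {
        if byte == 0 { Err(ErrA) } else { Ok(byte) }
    }
}
```

With the generic version a caller has to say which error they want at every call site; with the associated type there is nothing to choose.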
Another option is to just shed information and have composed decoders return some sort of CompositeError. The decoder implementation maps inner errors to it, so the caller only has to deal with one error type. A static amount of information could live in the error type, such as the index of the inner decoder which failed and a display string. The caller’s interface is simpler, but failures in complex decoders might be more difficult to debug. The caller might also want to act on the failure, for example if more data or a larger buffer is required. Preferably this is still captured in the type system instead of shoved into a string.
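A minimal sketch of what such an information-shedding error could look like. The field names, FailureKind, and the two helper functions are assumptions made up for illustration; the message is a &'static str so the type needs no allocation.

```rust
use core::fmt;

/// The part a caller can act on, kept in the type system, not in a string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureKind {
    /// Input ended early; retrying with more bytes may succeed.
    NeedMoreData,
    /// The bytes were present but not acceptable.
    Invalid,
}

/// A fixed amount of information, no matter how deep the composition is.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CompositeError {
    /// Index of the inner decoder that failed.
    pub index: usize,
    pub kind: FailureKind,
    pub message: &'static str,
}

impl fmt::Display for CompositeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "inner decoder {} failed: {}", self.index, self.message)
    }
}

// Hypothetical inner errors; their detail is dropped during mapping.
#[derive(Debug)]
pub struct ShortInput;
#[derive(Debug)]
pub struct BadVersion(pub u32);

fn read_u32_le(bytes: &mut &[u8]) -> Result<u32, ShortInput> {
    if bytes.len() < 4 {
        return Err(ShortInput);
    }
    let (head, tail) = bytes.split_at(4);
    *bytes = tail;
    Ok(u32::from_le_bytes([head[0], head[1], head[2], head[3]]))
}

fn check_version(v: u32) -> Result<u32, BadVersion> {
    if v == 0 { Err(BadVersion(v)) } else { Ok(v) }
}

/// Composed decoder: every inner error is mapped to `CompositeError`.
pub fn decode_version_and_time(bytes: &mut &[u8]) -> Result<(u32, u32), CompositeError> {
    let short = |index| CompositeError {
        index,
        kind: FailureKind::NeedMoreData,
        message: "not enough bytes",
    };
    let raw = read_u32_le(bytes).map_err(|_| short(0))?;
    let version = check_version(raw).map_err(|_| CompositeError {
        index: 0,
        kind: FailureKind::Invalid,
        message: "version must be non-zero",
    })?;
    let time = read_u32_le(bytes).map_err(|_| short(1))?;
    Ok((version, time))
}
```

The caller matches on one small type, but the inner BadVersion value is gone by the time the error surfaces, which is the debugging cost described above.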
Some information could be brought back into the picture by hanging on to the source errors. If you keep things generic though, you get back to the super-huge-nested-type. So it might be a use case for dynamic dispatch, where the composite holds the source behind a Box<dyn Error>. A big upside here is that it could hook into Rust’s existing error conventions and implement Error::source to expose the chain of nested errors for debugging. Dynamic dispatch always brings a few corner cases to the table. In this case, Box is only available if alloc is enabled, and Error is only there if std is enabled (at least on toolchains older than Rust 1.81, which moved the trait into core). This matters in a no_std compatible crate like rust-bitcoin.
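A rough sketch of the boxed-source variant, assuming std is available. CompositeError, ArrayError and chain_depth are hypothetical names; Error::source and downcast_ref are the standard library’s.

```rust
use std::error::Error;
use std::fmt;

/// Composite error that keeps the inner error behind dynamic dispatch.
#[derive(Debug)]
pub struct CompositeError {
    index: usize,
    source: Box<dyn Error>,
}

impl CompositeError {
    pub fn new<E: Error + 'static>(index: usize, source: E) -> Self {
        CompositeError { index, source: Box::new(source) }
    }
}

impl fmt::Display for CompositeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "inner decoder {} failed", self.index)
    }
}

impl Error for CompositeError {
    // Hooks into the standard convention so tooling can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Hypothetical leaf error from a primitive decoder.
#[derive(Debug)]
pub struct ArrayError {
    pub needed: usize,
}

impl fmt::Display for ArrayError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "needed {} more bytes", self.needed)
    }
}

impl Error for ArrayError {}

/// Counts how many errors sit beneath `err` in its source chain.
pub fn chain_depth(err: &(dyn Error + 'static)) -> usize {
    let mut depth = 0;
    let mut current = err.source();
    while let Some(inner) = current {
        depth += 1;
        current = inner.source();
    }
    depth
}
```

The static type stays flat however deep the nesting goes, and a caller who needs the concrete leaf can still try downcast_ref on the end of the chain, at the cost of a runtime check instead of an exhaustive match.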
Maybe to look at it another way, some caller knows they are using decoders which only produce three different error variants. So the error of their composite decoder should just be an enum with three variants. I don’t think the “level” at which the error occurred is really all that helpful beyond a debug message. The caller could define this enum and some converters for the inner decoder errors (although that might get sticky with coherence rules). This becomes a “Composite Decoder which Returns This Error” type. The Composite Decoder is generic, not the Decoder trait. But since the error type is never actually held onto at any point by the Composite Decoder, some PhantomData is required to tag the type.
    use core::marker::PhantomData;

    struct CompositeDecoder<E> {
        // The error type is never stored, so PhantomData tags it.
        _phantom: PhantomData<E>,
    }

    impl<E> Decoder for CompositeDecoder<E>
    where
        // Make sure inner errors can be converted into the composite error.
        E: From<ArrayError>,
    {
        type Error = E;
    }
Connecting a PhantomData-held generic to the associated type.
I am not sure it is possible to get around having to name the inner error types. If it is not, this is very limiting for the caller, since they can’t really define their own composite error type.
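One direction that might be worth exploring is to make the composite generic over its inner decoders too, so the bounds can refer to their associated error types instead of concrete names. This is only a sketch under a simplified, hypothetical Decoder trait (a single decode call over a slice), and it may not carry over to the real trait; ByteDecoder, FlagDecoder and HeaderError are made-up names.

```rust
use core::marker::PhantomData;

/// Simplified, hypothetical stand-in for the real trait.
pub trait Decoder {
    type Output;
    type Error;
    fn decode(&mut self, bytes: &mut &[u8]) -> Result<Self::Output, Self::Error>;
}

pub struct ByteDecoder;
#[derive(Debug, PartialEq)]
pub struct EndOfInput;

impl Decoder for ByteDecoder {
    type Output = u8;
    type Error = EndOfInput;
    fn decode(&mut self, bytes: &mut &[u8]) -> Result<u8, EndOfInput> {
        let (&first, rest) = bytes.split_first().ok_or(EndOfInput)?;
        *bytes = rest;
        Ok(first)
    }
}

pub struct FlagDecoder;
#[derive(Debug, PartialEq)]
pub enum FlagError {
    Missing,
    Invalid(u8),
}

impl Decoder for FlagDecoder {
    type Output = bool;
    type Error = FlagError;
    fn decode(&mut self, bytes: &mut &[u8]) -> Result<bool, FlagError> {
        let (&flag, rest) = bytes.split_first().ok_or(FlagError::Missing)?;
        *bytes = rest;
        match flag {
            0 => Ok(false),
            1 => Ok(true),
            other => Err(FlagError::Invalid(other)),
        }
    }
}

/// Generic over the inner decoders and over the caller's error type `E`.
pub struct CompositeDecoder<A, B, E> {
    first: A,
    second: B,
    _error: PhantomData<E>,
}

impl<A, B, E> CompositeDecoder<A, B, E> {
    pub fn new(first: A, second: B) -> Self {
        CompositeDecoder { first, second, _error: PhantomData }
    }
}

impl<A, B, E> Decoder for CompositeDecoder<A, B, E>
where
    A: Decoder,
    B: Decoder,
    // Bounds mention the associated types, not concrete error names.
    E: From<A::Error> + From<B::Error>,
{
    type Output = (A::Output, B::Output);
    type Error = E;
    fn decode(&mut self, bytes: &mut &[u8]) -> Result<Self::Output, E> {
        let a = self.first.decode(bytes)?; // `?` converts through `From`
        let b = self.second.decode(bytes)?;
        Ok((a, b))
    }
}

/// Caller-defined flat error: only the variants the caller cares about.
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    Truncated,
    BadFlag(u8),
}

impl From<EndOfInput> for HeaderError {
    fn from(_: EndOfInput) -> Self {
        HeaderError::Truncated
    }
}

impl From<FlagError> for HeaderError {
    fn from(e: FlagError) -> Self {
        match e {
            FlagError::Missing => HeaderError::Truncated,
            FlagError::Invalid(b) => HeaderError::BadFlag(b),
        }
    }
}
```

Because HeaderError is local to the caller’s crate, the From impls should pass the orphan rule even when the inner error types are foreign. The sketch does nothing for deeper nesting on its own, and the caller still has to write a From impl per inner error, so it moves the naming burden rather than removing it.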