Fuzzing
I added some fuzz targets to the bip324 library. To be honest, I'm still not sure I have acquired effective fuzzing skills. But the targets did help me find a little DoS attack vector, so not a waste.
The idea behind fuzz testing is neat. Normal unit tests treat some code as a black box which you feed known inputs into and expect known outputs. Those outputs could be values or maybe expected errors. In any case, if something doesn’t match up, the box has been broken. Fuzzing involves throwing inputs at the same black box, but the input space is now anything. The goal is to find inputs the developer didn’t think of which can cause catastrophic crashes.
If your function has a very constrained interface, fuzzing might not be all that interesting. You could manually test every input. But take, for example, the receive_key(their_key: [u8; NUM_ELLIGATOR_SWIFT_BYTES]) function of the bip324 handshake. It takes a 64-byte array. And what really makes this a potentially great function to test is that those bytes are coming from the remote peer. It could be a bad actor looking to mess with you.
So the fuzzer harness goes to town and starts tossing bytes at receive_key. But now we have another problem: that input domain is huge. This might take forever. Fuzzers can be a little smarter though. They don’t treat the function as a black box, but more like a greybox. The code being fuzzed is instrumented so that the fuzzer can tell how far an input made it through the box, just like unit test code coverage. If an input makes it farther than all the previous ones, the fuzzer can run with that for a bit by making small mutations to it.
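To make the greybox idea concrete, here is a minimal sketch of a coverage-guided mutation loop. Everything in it is a toy made up for illustration: `depth` stands in for real coverage instrumentation by reporting how many staged checks an input passed, `XorShift` is a tiny deterministic PRNG, and `fuzz` keeps a mutated input only when it gets deeper than the best one so far. Real fuzzers are far more sophisticated, but the feedback loop has roughly this shape.

```rust
/// Toy target: returns how many staged checks the input passed.
/// This is a stand-in for coverage feedback from real instrumentation.
fn depth(input: &[u8]) -> usize {
    const MAGIC: &[u8] = b"FUZZ";
    input
        .iter()
        .zip(MAGIC)
        .take_while(|(a, b)| a == b)
        .count()
}

/// Tiny deterministic xorshift PRNG so that runs are reproducible.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Mutate one random byte of the best input so far and keep the mutant
/// only if it makes it deeper into the target than anything before it.
fn fuzz(iterations: usize, seed: u64) -> (Vec<u8>, usize) {
    let mut rng = XorShift(seed);
    let mut best = vec![0u8; 4];
    let mut best_depth = depth(&best);
    for _ in 0..iterations {
        let mut candidate = best.clone();
        let idx = (rng.next_u64() % candidate.len() as u64) as usize;
        candidate[idx] = (rng.next_u64() >> 32) as u8;
        let reached = depth(&candidate);
        if reached > best_depth {
            best = candidate;
            best_depth = reached;
        }
    }
    (best, best_depth)
}

fn main() {
    let (input, reached) = fuzz(100_000, 0x1234_5678_9abc_def0);
    println!("best input {:?} reached depth {}", input, reached);
}
```

Guessing all four magic bytes at once is a 1 in 2^32 shot per attempt. With the depth feedback, each byte can be found on its own, which is why the instrumentation matters so much for large input domains.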
The receive_key function is actually doing some cryptography under the hood, not a simple operation, so I couldn’t say with 100% confidence that every input will work fine. It seems like a great function to fuzz! I made a mistake the first time I wrote a target for it though. A target is just the wiring that feeds the random data the fuzz harness produces every cycle into the function under test. In my first attempt, I created a new handshake and called receive_key just like it would happen in real life. The problem is that the new call creates a random local secret key by default. This means each run would be unique and non-deterministic. And this is a big no-no when fuzzing. If the harness actually causes a crash, it saves everything it knows about the state into an artifact file so that the developer can go back, figure out what happened, and fix it. The harness doesn’t know about the random internal secret key, so a crash in this target would be hard to reproduce from the artifact.
I fixed the non-determinism by just using a hardcoded seed for the local secret. This feels a little weird since it is very much not how the code is used irl, but I think it still serves the purpose of fuzzing. The same code paths and input domain are used.
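Here is a rough sketch of what a deterministic target body can look like. It does not use the real bip324 API: `ToyHandshake`, `with_secret`, and `fuzz_one` are hypothetical names, and the "key agreement" is just an XOR fold standing in for the real cryptography. The point is only that the fuzzer's bytes are the sole source of variation, so the same artifact always replays the same way.

```rust
/// Hypothetical stand-in for a handshake that owns a local secret.
struct ToyHandshake {
    local_secret: [u8; 32],
}

impl ToyHandshake {
    /// Deterministic constructor: the secret is supplied by the caller
    /// instead of being drawn from a random number generator.
    fn with_secret(local_secret: [u8; 32]) -> Self {
        ToyHandshake { local_secret }
    }

    /// Toy "key agreement": XOR-fold the peer's 64 bytes into the secret.
    /// This is NOT real cryptography, just a placeholder computation.
    fn receive_key(&self, their_key: [u8; 64]) -> [u8; 32] {
        let mut shared = self.local_secret;
        for (i, byte) in their_key.iter().enumerate() {
            shared[i % 32] ^= *byte;
        }
        shared
    }
}

/// Body of a hypothetical fuzz target: map the raw fuzzer bytes onto the
/// fixed-size input and call the function under test with a fixed secret.
fn fuzz_one(data: &[u8]) -> Option<[u8; 32]> {
    // Skip inputs that are too short to fill the 64-byte key.
    let prefix = data.get(..64)?;
    let mut their_key = [0u8; 64];
    their_key.copy_from_slice(prefix);

    // Hardcoded secret so a saved input replays identically.
    let handshake = ToyHandshake::with_secret([0x42; 32]);
    Some(handshake.receive_key(their_key))
}

fn main() {
    let input = [7u8; 64];
    // Two runs over the same bytes give the same result.
    assert_eq!(fuzz_one(&input), fuzz_one(&input));
    println!("deterministic: {:?}", fuzz_one(&input).is_some());
}
```

In a real harness the body of `fuzz_one` would live inside whatever entry point the fuzzing framework provides, with the real handshake constructor swapped in.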
I also wrote a target for the receive_garbage function, which takes a whole slice of bytes from the remote peer as input. After running both targets for a few hours on my local machines, I haven’t seen any crashes (no artifacts created). But I did take a look at the coverage report produced for receive_garbage and noticed something funny. One of the code paths I was fully expecting to be exercised by at least some of the inputs had 0 hits. Taking a look at my code, I realized my implementation processed the whole slice no matter what, even though the spec only allows 4095 bytes of garbage to be sent. A bad actor could send a whole butt load of bytes my way and my machine would sit there and churn through it all. A quick slice of the slice fixed that attack vector.
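The "slice of the slice" idea can be sketched like this. It is not the library's actual code: `find_terminator` and the constant names are made up for illustration, and the only things taken from the spec are the 4095-byte garbage limit and the 16-byte garbage terminator. The scan is capped at the largest region where a valid terminator could possibly sit, so extra bytes from the peer cost nothing.

```rust
/// Maximum number of garbage bytes a peer may send under BIP324.
const MAX_GARBAGE_LEN: usize = 4095;
/// Length of the garbage terminator under BIP324.
const TERMINATOR_LEN: usize = 16;

/// Hypothetical helper: return the offset of the terminator, looking only
/// at the bytes a spec-compliant peer could have sent before it ends.
fn find_terminator(buffer: &[u8], terminator: &[u8; TERMINATOR_LEN]) -> Option<usize> {
    // Cap the scan window no matter how much data the peer sent.
    let limit = buffer.len().min(MAX_GARBAGE_LEN + TERMINATOR_LEN);
    buffer[..limit]
        .windows(TERMINATOR_LEN)
        .position(|window| window == &terminator[..])
}

fn main() {
    let terminator = [0xAAu8; TERMINATOR_LEN];

    // Garbage within the limit: terminator is found right after it.
    let mut ok = vec![0u8; 10];
    ok.extend_from_slice(&terminator);
    println!("within limit: {:?}", find_terminator(&ok, &terminator));

    // Garbage over the limit: the scan stops early and reports nothing.
    let mut too_long = vec![0u8; 5000];
    too_long.extend_from_slice(&terminator);
    println!("over limit: {:?}", find_terminator(&too_long, &terminator));
}
```

A caller would treat a `None` on a buffer that already holds the maximum number of bytes as a protocol violation and drop the peer, rather than waiting for more data.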
I considered adding targets for some of the other functions which receive data from the remote peer, like the length decryption or the packet decryptor. However, I found it surprisingly difficult to set these up in a way that gives any interesting data. Both implementations bail out fast on random data due to things like the AEAD tag not being valid. So they have large input domains without much help from the “greybox” coverage analysis. Maybe it is still worth tossing a target on them just in case, but I'm wondering if it’s a sign that the lower-level cryptography primitives are where the fuzzing should live instead.