Modules

// #Craft #Rust

kyoto is reaching a good spot in functionality, but I am having trouble wrapping my head around some of the data flows and internal dependencies. I have some suspicions about what I am tripping on, which I’ll dive into here.

Opinionated Runtime?

The crate’s external interface is a node and a client, but I am still struggling to see what the benefit is for the caller to deal with the node. BDK’s bdk-ffi library uses kyoto and has it wrapped up with some CbfNode and CbfClient structs. The CbfNode struct just wraps a node and tosses it in its own tokio runtime, multi-thread variant, on a separate operating system thread. I guess this is maybe a more opinionated approach for callers of bdk-ffi?

If client and node were the only high level tasks exposed by kyoto, I think I could understand the clean break. The caller can decide how to run these two tasks. However, as of kyoto v0.10.0, there are two internal spots which call spawn. They are both in the network module, which makes sense as the I/O heavy spot: peer_map#dispatch and peer#run. spawn implicitly reaches up to the runtime executor to add another root task. This is how concurrency is added to an app using tokio (or any async runtime I guess). Without it, everything would just be one big, sequential state machine. And it probably totally makes sense to do it for these network use cases. But if kyoto is already talking to the runtime, why not just toss the node on its own task too and only give the client to the caller?

Another thought I had was using an internal runtime (e.g. tokio::runtime::Builder::new_multi_thread()) and tossing node on that. It looks like some libraries in the rust space optionally provide this feature, like sqlx, but I am not sure of the benefits vs. just spawning a task, unless you really wanted to hide the tokio dependency from the caller.
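To make the "only give the client to the caller" shape concrete, here is a rough sketch. It swaps tokio tasks for a plain std thread and channels so it has no dependencies, and every name in it (start, Client, Request) is made up for illustration and is not kyoto's API.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical requests a client can send to the node.
enum Request {
    /// Ask for the tip height; the answer comes back on the enclosed channel.
    TipHeight(Sender<u32>),
    /// Ask the node loop to stop.
    Shutdown,
}

/// Hypothetical handle given to the caller. The node itself is never exposed.
pub struct Client {
    requests: Sender<Request>,
    node: Option<JoinHandle<()>>,
}

impl Client {
    /// Returns None if the node loop is no longer running.
    pub fn tip_height(&self) -> Option<u32> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.requests.send(Request::TipHeight(reply_tx)).ok()?;
        reply_rx.recv().ok()
    }

    /// Stops the node loop and waits for its thread to finish.
    pub fn shutdown(mut self) {
        let _ = self.requests.send(Request::Shutdown);
        if let Some(handle) = self.node.take() {
            let _ = handle.join();
        }
    }
}

/// The library starts the node loop itself and returns only the client.
pub fn start(initial_height: u32) -> Client {
    let (tx, rx) = mpsc::channel();
    let node = thread::spawn(move || {
        let height = initial_height;
        // The node's run loop: serve requests until shutdown or until
        // every sender has been dropped.
        for request in rx {
            match request {
                Request::TipHeight(reply) => {
                    let _ = reply.send(height);
                }
                Request::Shutdown => break,
            }
        }
    });
    Client { requests: tx, node: Some(node) }
}
```

A caller would only ever write something like `let client = start(0);` and talk to the client. With tokio the thread would presumably become a spawned task, which is where the question of whose runtime it lands on comes back in.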

Scope and the Database

[Diagram: kyoto and its high level modules: chain, network, node, database, and client.]

High level modules of kyoto. chain and network are the beefy ones.

There is some hand waving here, not listing some smaller modules, but this is how I see the high level modules of kyoto. I have chain and network called out as especially complex modules, each has a bunch of child modules.

  • chain // Validates and holds the state of the blockchain and any forks.
  • network // Manages peers and requesting data from the network.
  • node // Coordinates between chain data, network data, and client requests.
  • database // Persists data between runs.
  • client // User interface for requesting data.

As it stands, there are a lot of connections between these modules. This is complicated by the super dynamic nature of async code: there are a lot of possible flows depending on runtime things (e.g. when does a network request return). I think ideally, chain and network interfaces are simplified so it’s obvious to any contributors how data flows in and out of them. I haven’t sunk into network yet, but chain is complex enough to be its own crate (not that there is any demand for that). Giving it a scoped interface, even just for internal use, would make it easier to reason about kyoto as a whole.

Rob’s DETAILS.md notes on the structure are a little out of date now (stuff like filters has been folded into chain), but I can see that I am not totally off. There are some follow up questions I want to dive into more.

  • Can chain’s interface be made synchronous? If it doesn’t interact directly with the network or database, and offloads that to node, then I don’t think there is any need for it to be async, and it would be simpler to grasp how data flows in and out.
  • network calls must be async, but does it have to be aware of the database or can that be pushed over to node? Can the interface be simplified to data requests?
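As a rough sketch of what a synchronous chain interface could look like: chain takes data in, updates its own state, and hands back a list of things it wants done, leaving node to do the actual I/O. The types here (Header, Action, Chain) are hypothetical and heavily simplified, not kyoto's real ones.

```rust
/// Hypothetical header, reduced to just enough to link blocks together.
#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub hash: u64,
    pub prev_hash: u64,
}

/// What chain asks node to do next. chain never performs the I/O itself.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Ask node to write this header to the database.
    Persist(Header),
    /// Ask node to have network fetch the headers following this hash.
    RequestHeadersAfter(u64),
}

/// Hypothetical chain state: the current tip and its height.
pub struct Chain {
    tip: Header,
    height: u32,
}

impl Chain {
    /// Start from a known header at height zero.
    pub fn new(start: Header) -> Self {
        Chain { tip: start, height: 0 }
    }

    pub fn height(&self) -> u32 {
        self.height
    }

    /// Plain synchronous call: data in, state updated, actions out.
    pub fn accept_headers(&mut self, headers: Vec<Header>) -> Vec<Action> {
        let mut actions = Vec::new();
        for header in headers {
            if header.prev_hash != self.tip.hash {
                // Does not extend the tip. A real chain would handle
                // forks here; this sketch just stops.
                break;
            }
            self.tip = header.clone();
            self.height += 1;
            actions.push(Action::Persist(header));
        }
        actions.push(Action::RequestHeadersAfter(self.tip.hash));
        actions
    }
}
```

The nice property is that the data flow is readable from the signature alone, and chain can be tested without a runtime, a peer, or a database.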

My thinking is that chain and network may define how their primitives are serialized, but maybe don’t have to deal with the database calls themselves. Let node decide when things are persisted or loaded. For example, on startup node could pull the database and toss it at chain and network, kind of like a “start here”.
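Roughly, that split could look like the sketch below. The snapshot type owns its byte encoding, while only node touches the store and decides when to load or save. ChainSnapshot, Store, MemoryStore, and this Node are all hypothetical stand-ins, with an in-memory map in place of a real database.

```rust
use std::collections::HashMap;

/// Hypothetical chain state. chain owns the encoding, not the storage.
#[derive(Debug, Clone, PartialEq)]
pub struct ChainSnapshot {
    pub height: u32,
    pub tip_hash: u64,
}

impl ChainSnapshot {
    /// Encode as 4 bytes of height followed by 8 bytes of tip hash.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(12);
        out.extend_from_slice(&self.height.to_le_bytes());
        out.extend_from_slice(&self.tip_hash.to_le_bytes());
        out
    }

    /// Decode, returning None if the length is wrong.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 12 {
            return None;
        }
        let mut height = [0u8; 4];
        let mut tip_hash = [0u8; 8];
        height.copy_from_slice(&bytes[..4]);
        tip_hash.copy_from_slice(&bytes[4..]);
        Some(ChainSnapshot {
            height: u32::from_le_bytes(height),
            tip_hash: u64::from_le_bytes(tip_hash),
        })
    }
}

/// Hypothetical key-value store. Only node talks to it.
pub trait Store {
    fn load(&self, key: &str) -> Option<Vec<u8>>;
    fn save(&mut self, key: &str, value: Vec<u8>);
}

/// In-memory stand-in for a real database.
#[derive(Default)]
pub struct MemoryStore(HashMap<String, Vec<u8>>);

impl Store for MemoryStore {
    fn load(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
    fn save(&mut self, key: &str, value: Vec<u8>) {
        self.0.insert(key.to_string(), value);
    }
}

pub struct Node<S: Store> {
    store: S,
    chain: ChainSnapshot,
}

impl<S: Store> Node<S> {
    /// On startup, pull whatever was persisted and hand it over as the
    /// "start here" point, falling back to an empty chain.
    pub fn start(store: S) -> Self {
        let chain = store
            .load("chain")
            .and_then(|bytes| ChainSnapshot::from_bytes(&bytes))
            .unwrap_or(ChainSnapshot { height: 0, tip_hash: 0 });
        Node { store, chain }
    }

    pub fn height(&self) -> u32 {
        self.chain.height
    }

    /// Pretend a new block arrived from the network.
    pub fn advance(&mut self, tip_hash: u64) {
        self.chain.height += 1;
        self.chain.tip_hash = tip_hash;
    }

    /// node, not chain, decides when state is written out.
    pub fn checkpoint(&mut self) {
        self.store.save("chain", self.chain.to_bytes());
    }

    pub fn into_store(self) -> S {
        self.store
    }
}
```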

Kinda off topic, but it sure would be nice to have an MSRV of 1.75+ so we could use async functions in traits. It would make defining dependencies way cleaner.
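For reference, this is the kind of thing Rust 1.75 allows: an async fn written directly in a trait, with no boxing helper crate. HeaderStore and FixedStore are hypothetical names, and the tiny block_on is only there so the sketch runs without a runtime dependency. In kyoto, tokio would be driving the future instead.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical dependency expressed as a trait with an async method.
/// Requires Rust 1.75 or newer.
trait HeaderStore {
    async fn load_height(&self) -> u32;
}

/// Stand-in implementation that just returns a fixed value.
struct FixedStore(u32);

impl HeaderStore for FixedStore {
    async fn load_height(&self) -> u32 {
        self.0
    }
}

/// Code written against the trait, not a concrete database.
async fn next_height<S: HeaderStore>(store: &S) -> u32 {
    store.load_height().await + 1
}

/// Waker that does nothing. Enough for futures that never really wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

One caveat I am fairly sure of: a trait with async fn like this cannot be used as a dyn trait object, so it fits generic parameters better than boxed dependencies.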