Homa Projects
Interested in working on Homa? This page describes various possible projects. Some are large and very ambitious, while others are smaller; some could potentially lead to research papers, while others would have more practical impact in terms of making Homa useful to the world.
Reduced Buffer Utilization
Homa already uses less buffer space in network switches than TCP, but it should be possible to improve its buffer usage even more without sacrificing performance. Here are a few ideas:
Have receivers tune the amount of unscheduled data that can be sent in incoming messages. Right now this is set to “rttBytes”, the amount of data that can be sent in the time it takes for the first data packet to reach the receiver and the receiver to return a grant packet. However, consider a situation where 50% of a receiver’s incoming link bandwidth is taken by unscheduled packets. In this case, the unscheduled limit could be dropped to rttBytes/2: those packets will only be able to use about half of the receiver’s bandwidth, so by the time the last of those packets has been sent, the receiver should have had enough time to return a grant packet. It seems likely that some sort of adaptive mechanism could reduce buffer utilization, especially under high loads. Receivers could notify senders of the right amount of unscheduled data to send using the same mechanism used to announce priority cutoffs for unscheduled packets.
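The arithmetic above can be sketched as a tiny policy function. This is a hypothetical illustration, not Homa's implementation: the function name, the measured `unsched_fraction` input, and the clamping behavior are all assumptions.

```python
def unscheduled_limit(rtt_bytes: int, unsched_fraction: float) -> int:
    """Hypothetical policy: scale the unscheduled-data limit by the fraction
    of the receiver's downlink currently occupied by unscheduled traffic.

    When half the link carries unscheduled packets, those packets drain at
    roughly half speed, so rttBytes/2 of unscheduled data still covers a full
    RTT before a grant must arrive. A real policy would smooth the measured
    fraction over time and enforce a floor so the limit never collapses
    under light load.
    """
    f = min(1.0, max(0.0, unsched_fraction))
    if f == 0.0:
        return rtt_bytes  # no unscheduled load measured: keep full allowance
    return int(rtt_bytes * f)
```

The receiver would recompute this periodically and piggyback the result on the same packets that announce unscheduled priority cutoffs.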
Senders could also reduce the amount of unscheduled data in some situations without loss of performance. For example, suppose a sender is transmitting a high priority message A when a new lower priority message B is initiated. In the current protocol, no packets will be sent for B as long as packets can be transmitted for A; then, once A has completed (or run out of grants), a full load of unscheduled data will be sent for B. However, suppose the sender preempts A to send one unscheduled packet for B, then returns to A. If A continues transmitting for at least one RTT, there’s no need to send additional unscheduled packets for B, since the receiver will already have had plenty of time to receive the initial packet and return a grant. This approach has the disadvantage of delaying A slightly, but that effect might prove to be insignificant.
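The preemption idea reduces to a small decision rule on the sender. This is a sketch under stated assumptions (the function name and the idea of covering a "gap" with extra packets are hypothetical): after preempting A to send one unscheduled packet for B, the sender only needs more unscheduled packets for B if A will drain before a grant for B can arrive.

```python
def extra_unsched_packets(active_remaining_bytes: int,
                          rtt_bytes: int, mtu: int) -> int:
    """How many unscheduled packets message B needs beyond the first one,
    after preempting the active message A once.

    If A will keep the uplink busy for at least one RTT, the receiver has
    time to return a grant for B, so no further unscheduled data is needed.
    Otherwise, cover the remaining gap until a grant can arrive.
    """
    if active_remaining_bytes >= rtt_bytes:
        return 0  # grant for B arrives before A drains
    gap = rtt_bytes - active_remaining_bytes
    return -(-gap // mtu)  # ceiling division
```

For example, with 20 KB left in A and rttBytes of 10 KB, B needs no extra unscheduled packets; with only 4 KB left in A, B needs enough packets to cover the remaining 6 KB gap.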
The overcommitment mechanism could be modified to lower the total amount of granted-but-not-yet-received data that can exist at any given time. Right now it is set to the degree of overcommitment (typically 8) times rttBytes, but this could potentially be reduced, either by reducing the degree of overcommitment or by reducing the amount of granted data for each of the messages (perhaps the highest priority messages would be given more grants than lower priority ones?). This optimization could reduce the effectiveness of overcommitment at maintaining full bandwidth utilization, so it would need careful study.
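One concrete variant of the priority-weighted idea can be sketched as follows. Everything here is an assumption for illustration: the `shrink` factor, the linear weighting, and the function name are not part of Homa; the only fixed points are the current budget of degree × rttBytes and the notion of larger grants for higher priorities.

```python
def grant_windows(rtt_bytes: int, degree: int, shrink: float = 0.5) -> list:
    """Split a reduced total grant budget across the `degree` highest
    priority incoming messages, giving larger windows to higher priorities.

    The total outstanding-grant budget is shrink * degree * rtt_bytes
    (versus degree * rtt_bytes today). Message i (0 = highest priority)
    gets a share proportional to degree - i, so the top message receives
    the largest window and the lowest-priority message the smallest.
    """
    total = shrink * degree * rtt_bytes
    weights = list(range(degree, 0, -1))
    wsum = sum(weights)
    return [int(total * w / wsum) for w in weights]
```

Whether such a reduction preserves full link utilization under overcommitment is exactly the question that would need measurement.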
RPC Framework Integration
The best way to encourage widespread usage of Homa is to add Homa support to all of the major RPC frameworks. This should be done in a way that allows applications to switch to Homa with trivial changes. Work is already underway on gRPC integration, but it does not yet support all of the languages supported by gRPC. In addition, it would be great to add Homa support to Apache Thrift and any other widely used frameworks.
New NIC Architecture
At some point in the not-too-distant future, transport protocol implementations will need to move to the NIC. It no longer makes sense to run them in software, either in the operating system or in applications; see the ATC Homa paper for a discussion of the reasons. Existing “smart NICs” are woefully inadequate for this task: those based on general-purpose cores are far too slow (they merely run the same software in a different place), and those based on FPGAs are too hard to program. Thus we need a new NIC architecture with at least the following characteristics:
Fast enough to process incoming packets at line rate, even if the packets are small.
Programmable enough to support a variety of transport protocols as well as other functions such as network virtualization and management.
Supports kernel-bypass, where applications communicate directly with the NIC to send and receive messages, without involving the operating system in the common case.
Load balancing: the NIC must dispatch incoming messages across a collection of cooperating threads in order to support servers with very high throughput.
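The load-balancing requirement can be illustrated with a minimal dispatch sketch. This is purely hypothetical (the function, the CRC-based hash, and the key format are assumptions, and a real NIC would implement this in hardware): the key property is that all packets of one message land on one worker, while independent messages spread across all workers.

```python
import zlib

def dispatch_queue(rpc_id: int, peer_addr: str, num_workers: int) -> int:
    """Pick a worker queue for an incoming message.

    Hashing on (peer, rpc_id) keeps every packet of a given message on the
    same worker (preserving per-message ordering) while spreading unrelated
    messages across all workers for throughput.
    """
    key = f"{peer_addr}:{rpc_id}".encode()
    return zlib.crc32(key) % num_workers
```

A hardware design would also need to handle skew (a few hot peers) and worker overload, which a static hash alone does not address.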
This is a very large and very ambitious project, but it should also be a very interesting one.