Each of the benchmarks in the FireHose suite has two parts: a front-end generator that writes a stream of events (datums) to one or more UDP sockets in a text-based format, and a back-end analytic that reads the stream and processes the datums.
The generators create a stream of datums with the following attributes:
The computations performed by the analytics have the following attributes:
The FireHose tarball provides code for each generator; the generators are meant to be used as-is, without modification. The tarball also provides sample implementations of the analytics, but users are free to re-implement them in a manner optimized for their streaming framework or their machine, or with better algorithms, so long as they follow the rules of the benchmark and measure the relevant scoring metrics.
UDP sockets link the generator to the analytic so that the generator is not throttled by the analytic, just as in stream-processing scenarios where data must be processed as it arrives in real time. Because UDP provides no flow control or delivery guarantee, the analytic will drop datums if it cannot keep up with the generated stream, which is one of the effects we wish to measure.
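As a rough illustration of this decoupling, the following Python sketch sends text datums over a loopback UDP socket and counts how many the receiver actually sees. It is not the FireHose generator or analytic code: the function names (`make_receiver`, `generate`, `drain`), the `key,value` datum layout, and the one-datum-per-packet framing are all assumptions made for the example.

```python
import socket


def make_receiver(host="127.0.0.1", port=0):
    """Open the analytic-side UDP socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(0.2)  # stop reading once the stream goes quiet
    return sock


def generate(dest, n_datums):
    """Stand-in generator: fire n_datums text datums at dest without waiting.

    The "key,value" layout is hypothetical, not the real FireHose format.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(n_datums):
        # sendto never waits for the reader, so a slow analytic
        # cannot throttle the generator.
        sock.sendto(f"{i},value{i}\n".encode("ascii"), dest)
    sock.close()


def drain(sock):
    """Stand-in analytic: read datums until the socket times out."""
    received = []
    while True:
        try:
            packet, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        received.extend(packet.decode("ascii").splitlines())
    return received


if __name__ == "__main__":
    n_sent = 1000
    receiver = make_receiver()
    generate(receiver.getsockname(), n_sent)
    datums = drain(receiver)
    receiver.close()
    # Anything the receiver could not buffer in time is simply gone.
    print(f"sent={n_sent} received={len(datums)} dropped={n_sent - len(datums)}")
```

On a loopback interface with a modest number of datums the drop count will usually be zero. Drops appear once the sender outpaces the reader and the socket's receive buffer fills, which is the condition the benchmark is designed to expose.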