The framework currently traces binaries sequentially, one at a time, which is inherently slow. Compilation is relatively fast, but tracing each binary with Intel's PIN framework is by far the largest overhead. If feasible, parallelizing the tracing step, for example by multi-threading it, would substantially reduce the time needed to generate a larger dataset.
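
As a rough illustration of what concurrent tracing could look like, here is a minimal Python sketch that is not part of the framework. It assumes each trace runs as an independent child process and that concurrent traces do not write to the same output file. The names `pin_command`, `trace_one`, `trace_all`, and the tool path `./tracer.so` are hypothetical; the real invocation would need to be substituted.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def pin_command(binary, pintool="./tracer.so"):
    # Hypothetical invocation: the tool path is an assumption and must be
    # replaced with whatever command the framework actually uses.
    return ["pin", "-t", pintool, "--", binary]


def trace_one(binary, build_command):
    """Run one trace as a child process and return (binary, exit code)."""
    result = subprocess.run(build_command(binary), capture_output=True)
    return binary, result.returncode


def trace_all(binaries, build_command=pin_command, max_workers=4):
    """Trace several binaries concurrently; returns {binary: exit code}."""
    exit_codes = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(trace_one, b, build_command) for b in binaries]
        # Collect results in completion order, not submission order.
        for future in as_completed(futures):
            binary, code = future.result()
            exit_codes[binary] = code
    return exit_codes
```

Threads are sufficient in this sketch because each worker only waits on a child process, so the actual tracing work runs outside the Python interpreter. A sensible value for `max_workers` would depend on the number of cores and on how much memory each trace consumes.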