Getting Started
You can get started by evaluating your decision tree model's performance on Vollo Trees using the Vollo Trees compiler, which doesn't require an FPGA accelerator.
When you are ready, you can run inferences with your model on a Vollo FPGA accelerator using an evaluation license.
Performance estimation and model design with the Vollo Trees compiler
You can use the Vollo Trees compiler to compile and estimate the performance of your model in an ML user's environment without any accelerator.
The Vollo Trees compiler typically executes in seconds, enabling fast iteration when tuning models to meet a latency target.
To estimate performance of your model with the Vollo SDK:
- Download and extract the Vollo SDK.
- Install the Vollo Trees compiler Python libraries.
- Compile your model using the Vollo Trees compiler and evaluate the compiled program on inference data to generate an estimate of the compute latency that will be achieved on Vollo Trees. See the Vollo Trees compiler Example for a fully worked example, including performance estimation.
- Iterate on your model architecture to meet your combined latency and accuracy requirements.
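The download, extraction, and installation steps above might look like the following shell session. The archive name, directory layout, and wheel location are placeholders assumed for illustration; they are not confirmed by this document, so check the README inside your SDK release for the actual paths.

```sh
# All names below are placeholders; substitute the actual release you downloaded.
tar -xzf vollo-sdk-<version>.tar.gz
cd vollo-sdk-<version>

# Install the compiler's Python libraries into a virtual environment
# (assumed here to ship as wheel files inside the extracted SDK).
python3 -m venv .venv
source .venv/bin/activate
pip install python/*.whl
```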
Validating inference performance using the Vollo Trees FPGA accelerator
When you are ready to run inferences with your models on a Vollo Trees accelerator, you will need a compatible FPGA-based PCIe accelerator card and a Vollo Trees license.
Evaluation licenses can be provided free of charge by contacting vollo@myrtle.ai.
To validate inference performance on Vollo Trees:
- Compile your model and save it as a `.vollo` program file using the Vollo Trees compiler. See the Vollo Trees compiler Example for a fully worked example.
- Run and benchmark your model on the accelerator using the Vollo runtime C example. Make sure to pass the example application the path to your saved `.vollo` program when you invoke it on the command line.
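As a sketch, invoking the C example from the step above might look like this. The binary name is a placeholder, not confirmed by this document; use the path of the example application built from the SDK, and pass your own compiled program file.

```sh
# Placeholder names: substitute the built example binary and your .vollo program.
./vollo-example my_model.vollo
```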
Note that the Vollo Trees SDK includes prebuilt FPGA bitstreams for selected PCIe accelerator cards, so no FPGA compilation or configuration is required after initial accelerator setup. As a result, loading user models to run on Vollo takes under a second, enabling fast on-board iteration and evaluation of different models.