Rust’y predictions using gRPC.
A Rust-based gRPC client for TensorFlow Serving.
gRPC is a high-performance RPC framework used in a variety of scenarios. One of its main features is the ability to write efficient client libraries.
Rust has been the most loved programming language among developers for five years running (based on Stack Overflow’s 2020 survey). It helps you write performant and safe code, backed by a strict compiler.
TensorFlow is one of the most popular open-source machine learning platforms. TF Serving exposes trained models to clients for inference over REST or gRPC.
There are client libraries available in several languages for inference (prediction) on trained TF models. However, there are fewer resources that demonstrate this in Rust. This article aims to show how well Rust and gRPC work together as a TF Serving client.
For the demo, we use tonic (a gRPC library for Rust) with tokio to make async prediction calls to a trained TF model.
We’ll use the pre-trained Half Plus Two model available at https://github.com/tensorflow/serving/tree/master/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_tf2_cpu/00000123.
If you already have a TF model server configured, you can start it directly. Otherwise, use Docker to start the model service like so:
>docker run -d -p 8500:8500 --name=half_plus_two -v $TF_SERVING_ROOT/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_tf2_cpu:/models/half_plus_two -e MODEL_NAME=half_plus_two intel/intel-optimized-tensorflow-serving:2.3.0
>docker logs half_plus_two # check the logs to see the model fire up!
2020-11-25 18:51:57.428174: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: half_plus_two version: 123}
2020-11-25 18:51:57.454063: I tensorflow_serving/model_servers/server.cc:367] Running gRPC ModelServer at 0.0.0.0:8500 ...
2020-11-25 18:51:57.465917: I tensorflow_serving/model_servers/server.cc:387] Exporting HTTP/REST API at:localhost:8501 ...
Now that we have the service up, we’ll write the client part.
- gRPC clients are built on service stubs generated from *.proto files. Tonic is a well-known gRPC library for Rust. It uses prost as a dependency to generate the stub files.
- Before we can generate the stub files, we need to gather all the proto files required for the prediction call.
- For the purpose of this demo, I used this Python script to dump the proto files. The script collects all proto files from tensorflow and tensorflow.serving.
>python3 dump_tf_predict_protos.py pb #pb holds the protos.
>cargo build
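As a rough illustration of the gathering step, here is a minimal std-only sketch that lists every *.proto file under a directory such as pb. The function name collect_protos is hypothetical and is not part of the original project; a build script could hand a list like this to the stub generator, but how the original build wires that up is not shown here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `*.proto` file under `dir`.
/// The result is sorted so that repeated runs give the same order.
/// (Hypothetical helper, not taken from the original project.)
fn collect_protos(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Directories still waiting to be scanned.
    let mut pending = vec![dir.to_path_buf()];
    while let Some(current) = pending.pop() {
        for entry in fs::read_dir(&current)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "proto") {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}
```

Calling collect_protos(Path::new("pb")) after running the dump script should return the paths of the dumped proto files, which makes it easy to check that nothing needed for the prediction call is missing.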
The complete Rust code is available here.
The client's build step uses tonic build to generate the tensorflow.rs and tensorflow.serving.rs files (the tensorflowlib module).
(Note: tensorflow_serving.rs is the final merge of tensorflow.rs and tensorflow.serving.rs, with a few reference fixes.)
The prediction client uses the generated structs (stubs) to prepare the prediction request. They are made available by the tensorflowlib module.
use rand::distributions::{Distribution, Uniform};

// model spec.
let model_spec = ModelSpec {
    name: "half_plus_two".to_string(),
    signature_name: "serving_default".to_string(),
    version_choice: Some(model_spec::VersionChoice::Version(123)),
};
// fetch some random floats in the range 0..10 (inputs).
let step = Uniform::new(0.0, 10.0);
let input_vec: Vec<f32> = step
    .sample_iter(&mut rand::thread_rng())
    .take(total_inputs as usize)
    .collect();
// `inputs` maps each model input name to a tensor built from input_vec
// (its construction is not shown in this excerpt).
// prediction request.
let request = tonic::Request::new(PredictRequest {
    model_spec: Some(model_spec),
    inputs,
    output_filter: vec!["y".to_string()],
});
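The request takes a map from input names to tensors. A minimal sketch of packing a list of floats into such a map might look like the following. The structs here are simplified, hypothetical stand-ins for the generated types (the real ones live in tensorflowlib and carry more fields), the helper build_inputs is not from the original code, and the input name "x" is an assumption based on the Half Plus Two model's serving signature.

```rust
use std::collections::HashMap;

// Hypothetical, trimmed-down stand-ins for the generated protobuf structs.
// The real generated types have more fields than shown here.
#[derive(Debug, Default, Clone, PartialEq)]
struct Dim {
    size: i64,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct TensorShapeProto {
    dim: Vec<Dim>,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct TensorProto {
    dtype: i32,
    tensor_shape: Option<TensorShapeProto>,
    float_val: Vec<f32>,
}

// DT_FLOAT has the value 1 in TensorFlow's DataType enum.
const DT_FLOAT: i32 = 1;

/// Pack `values` into a rank-1 float tensor stored under `input_name`.
fn build_inputs(input_name: &str, values: &[f32]) -> HashMap<String, TensorProto> {
    let tensor = TensorProto {
        dtype: DT_FLOAT,
        // One dimension whose size is the number of input values.
        tensor_shape: Some(TensorShapeProto {
            dim: vec![Dim { size: values.len() as i64 }],
        }),
        float_val: values.to_vec(),
    };
    let mut inputs = HashMap::new();
    inputs.insert(input_name.to_string(), tensor);
    inputs
}
```

With the generated types in place of the stand-ins, something like build_inputs("x", &input_vec) would produce a map of the shape the request's inputs field expects.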
Each output item is the response from the Half Plus Two model service for the corresponding input item (computed as y = x * 0.5 + 2).
>cargo run
INPUTS: [0.24603248,8.930102,7.868023,]
Calling Half Plus Two Service (y = x*0.5 + 2):
Response from Half Plus Two Service:
OUTPUTS : [2.1230164,6.465051,5.9340115,]
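Since the model's formula is known, the response can be sanity-checked on the client side. The sketch below is one way to do that; both function names are hypothetical and not part of the original client, and the tolerance is an arbitrary choice to absorb f32 rounding.

```rust
/// The function the Half Plus Two model computes: y = x * 0.5 + 2.
fn half_plus_two(x: f32) -> f32 {
    x * 0.5 + 2.0
}

/// Return true when `outputs` has one value per input and each value
/// is within `tolerance` of the locally computed result.
fn outputs_match(inputs: &[f32], outputs: &[f32], tolerance: f32) -> bool {
    inputs.len() == outputs.len()
        && inputs
            .iter()
            .zip(outputs)
            .all(|(x, y)| (half_plus_two(*x) - y).abs() <= tolerance)
}
```

For the run shown above, outputs_match called with the three inputs, the three outputs, and a tolerance of 1e-5 should return true.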