# Resolve the output node names (strip the ":0" tensor suffixes) for freezing.
output_names = [node.split(":")[0] for node in outputs]

graph = tf.Graph()
with tf.Session(graph=graph) as sess:
    # Restore the SavedModel, then replace every variable with a constant
    # holding its current value.
    tf.saved_model.loader.load(
        sess, meta_graph.meta_info_def.tags, savedmodel_dir)
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, graph.as_graph_def(), output_names)

# Serialize the frozen GraphDef to disk and return it.
write_graph_to_file(_GRAPH_FILE, frozen_graph_def, output_dir)
return frozen_graph_def
Freezing a model means pulling the values for all the variables from the latest model file and then
replacing each variable op with a constant that has the numerical data for the weights stored in its
attributes. It then strips away all the extraneous nodes that aren't used for forward inference, and saves
the resulting GraphDef into just a single output file, which is easily deployable for production [14].
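As a minimal, self-contained sketch of what this conversion does (using a hypothetical one-variable toy graph rather than the CheXNet model), the variable-to-constant step can be exercised on its own:

import tensorflow as tf

# Hypothetical toy graph: one variable feeding a named output node.
toy_graph = tf.Graph()
with toy_graph.as_default():
    weights = tf.Variable([1.0, 2.0], name="weights")
    tf.identity(weights * 2.0, name="output")

with tf.Session(graph=toy_graph) as sess:
    sess.run(tf.global_variables_initializer())
    # Variable ops are replaced by Const nodes holding their current values.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, toy_graph.as_graph_def(), ["output"])

# After freezing, no Variable ops remain in the GraphDef.
print(sorted({node.op for node in frozen.node}))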
Load the frozen graph file from disk:
def get_frozen_graph(graph_file):
    """Read a frozen GraphDef protobuf file from disk."""
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def
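Once loaded, the GraphDef can be imported into a fresh graph and run directly. The sketch below shows one possible way to do that; the file name and the tensor names "input:0" / "output:0" are placeholders and depend on how the CheXNet graph was actually exported:

import numpy as np
import tensorflow as tf

frozen_graph_def = get_frozen_graph("chexnet_frozen_graph.pb")  # hypothetical path

inference_graph = tf.Graph()
with inference_graph.as_default():
    # Import the frozen nodes into an empty graph without a name prefix.
    tf.import_graph_def(frozen_graph_def, name="")

with tf.Session(graph=inference_graph) as sess:
    input_tensor = inference_graph.get_tensor_by_name("input:0")    # assumed name
    output_tensor = inference_graph.get_tensor_by_name("output:0")  # assumed name
    batch = np.random.rand(1, 224, 224, 3).astype(np.float32)       # dummy image batch
    predictions = sess.run(output_tensor, feed_dict={input_tensor: batch})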
Create and save the GraphDef for TensorRT™ inference using the TensorRT™ library:
def get_trt_graph(graph_name, graph_def, precision_mode, output_dir,
                  output_node, batch_size=128, workspace_size=2<<10):
    trt_graph = trt.create_inference_graph(
        input_graph_def=graph_def,
        outputs=[output_node],