The serializer special-cases on
Curious: in what context do you plan to load the serialized program? Are you loading it in an environment that doesn't have the torch_tensorrt library? In that case, do you not expect the ops/graph to be runnable?
Proposed workflow:
For a simple graph, input -> conv -> relu, here are the options:
a) Case 1: Where the output of TRT compilation is a TorchTRTModule, the graph looks as follows:
Serializing this graph using `GraphModuleSerializer(graph_signature, call_spec).serialize(graph_module)` fails because the `call_module` node cannot be serialized: `call_module` nodes are not currently handled in `GraphModuleSerializer`.
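To make the failure mode concrete, here is a toy sketch in pure Python (stand-in classes, not the real `torch._export.serde.serialize` API or its error types): an opcode-dispatching serializer can only emit nodes whose opcodes have registered handlers, and `call_module` has none.

```python
# Toy illustration only -- stand-ins for torch._export's GraphModuleSerializer;
# the real classes, signatures, and error type differ.
class SerializeError(Exception):
    pass

class ToyNode:
    def __init__(self, op, target):
        self.op = op          # fx opcode, e.g. "call_function" or "call_module"
        self.target = target  # the callee (function name or submodule path)

class ToyGraphModuleSerializer:
    def handle_call_function(self, node):
        return {"op": "call_function", "target": node.target}

    def serialize_node(self, node):
        if node.op == "call_function":
            return self.handle_call_function(node)
        # No handler exists for call_module nodes, so serialization fails here.
        raise SerializeError(f"serializing {node.op!r} nodes is unsupported")
```

In this sketch, serializing the `call_module` node that wraps the TorchTRTModule raises, mirroring the failure described above.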
b) Case 2: Where the output of TRT compilation is a Node, the graph looks as follows:
Now this `call_function` node can be serialized by handling it in `handle_call_function`. The following custom implementations are required for this:
- `TorchTRTExportedProgramSerializer`, which inherits from `ExportedProgramSerializer`
- `TorchTRTSerializer`, which inherits from `GraphModuleSerializer`
- `TorchTRTExportedProgramDeserializer`, which inherits from `ExportedProgramDeserializer`
- `TorchTRTDeserializer`, which inherits from `GraphModuleDeserializer`
- `torch_tensorrt.dynamo.serialize` - a custom version of `torch._export.serde.serialize` that will dump this `serialized_exp_program` into bytes
- `torch_tensorrt.dynamo.deserialize` - a custom version of `torch._export.serde.deserialize` to deserialize accordingly

So the workflow would look like:
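The subclassing above can be sketched roughly as follows. This is a hedged sketch with stand-in base classes and an assumed `tensorrt::execute_engine` target name, not the real torch_tensorrt internals; the point is the override pattern, where the subclass special-cases the TRT engine call and defers everything else to the parent.

```python
import json

# Stand-in for torch._export's GraphModuleSerializer (illustrative only).
class GraphModuleSerializerBase:
    def handle_call_function(self, node):
        return {"op": "call_function", "target": node["target"]}

class TorchTRTSerializer(GraphModuleSerializerBase):
    # Assumed target name for the TRT engine call node; illustrative only.
    TRT_ENGINE_OP = "tensorrt::execute_engine"

    def handle_call_function(self, node):
        out = super().handle_call_function(node)
        if node["target"] == self.TRT_ENGINE_OP:
            # Special-case: embed the serialized engine alongside the node.
            out["engine"] = node["engine_bytes"].hex()
        return out

def serialize(nodes):
    """Stand-in for torch_tensorrt.dynamo.serialize: dump the program to bytes."""
    ser = TorchTRTSerializer()
    return json.dumps([ser.handle_call_function(n) for n in nodes]).encode()

def deserialize(data):
    """Stand-in for torch_tensorrt.dynamo.deserialize: load it back."""
    return json.loads(data.decode())
```

A round trip through `serialize`/`deserialize` then preserves both ordinary `call_function` nodes and the engine payload on the TRT node.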
Downsides of these custom versions:
Deserialization has to go through `TorchTRTExportedProgramDeserializer` (with a custom implementation of `torch._export.serde.deserialize`). So to load an already compiled exported program, we need to import the entire `torch_tensorrt` library (which can pose memory constraints due to its size).

Question to Meta: