torch_tensorrt Reorganization and API #656
4 comments · 10 replies
-
@narendasan, what's the thought with …
-
What is the intended use of …
-
One question I had @ncomly-nvidia @peri044 is when a user gives us …
-
There's a default argument, but it exists so that in the future, if there are multiple paths, users can specify that one particular path should be used.
In reply to Vinh Nguyen's question:
"If we do the conversion anyway, regardless of a mis-specification on the user's part, then do we need users to specify 'ir' after all? Or should torch-tensorrt just detect what input it actually receives?"
-
torch_tensorrt package structure
Python
With the incoming rebrand, and needing the space to be able to support alternative intermediate representations, we need to define a package structure and high-level APIs. The philosophy here is that the top-level API will be two generalist wrapper functions:
compile and convert_method_to_trt_engine (now is a good time to change this name). These functions will take a module of some format (torch.nn.Module, torch.jit.ScriptModule, etc.) and a set of settings passed as kwargs (universal ones like inputs or enabled_precisions will be explicitly defined, but IR-specific options like fallback will be behind **kwargs). Internal to the top-level function will be a router that calls the compile functions for specific IR formats. This can be controlled explicitly by setting the ir flag to select a path. [Should an explicit selection be rigid, or should it try its best to convert? e.g. if the user sets the ir flag to torchscript but provides a torch.nn.Module, should we convert the torch.nn.Module?] In the case they set ir to default, it goes through our preference of IR to TRT. Default also means we will take whatever users provide in mod and do our best to compile; so in the case of a torch.nn.Module we would convert to TorchScript for people via torch.jit.script. The default option lets us change the default code path under the hood without shifting the API. The top-level package will also contain general-purpose classes that will be used across IR compilers. A minimal sketch of the routing follows.
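To make the routing concrete, here is a minimal sketch of how the top-level wrapper could dispatch on ir. The helper names (_ts_compile, _fx_compile) and the exact dispatch rules are assumptions for illustration, not the final implementation:

```python
import torch

# Hypothetical IR-specific entry points, standing in for the real
# TorchScript/FX compilers described in this proposal.
def _ts_compile(mod, **settings):
    print("compiling via the TorchScript path with settings:", settings)
    return mod

def _fx_compile(mod, **settings):
    print("compiling via the FX path with settings:", settings)
    return mod

def compile(module, ir="default", **settings):
    """Generalist wrapper: route a module to an IR-specific compile function."""
    if ir in ("torchscript", "ts"):
        if not isinstance(module, torch.jit.ScriptModule):
            # Open question from above: reject rigidly, or best-effort convert?
            module = torch.jit.script(module)
        return _ts_compile(module, **settings)
    if ir == "fx":
        return _fx_compile(module, **settings)
    if ir == "default":
        # "default" walks our preferred IR order and does its best with
        # whatever the user provides in mod.
        if isinstance(module, torch.jit.ScriptModule):
            return _ts_compile(module, **settings)
        try:
            scripted = torch.jit.script(module)
        except Exception as err:
            raise RuntimeError(
                "Could not script the module; convert it explicitly "
                "beforehand with torch.jit.script or torch.jit.trace"
            ) from err
        return _ts_compile(scripted, **settings)
    raise ValueError(f"Unknown ir: {ir!r}")
```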
Top level: torch_tensorrt

- ir="default": accepts a torch.jit.ScriptModule or nn.Module; a torch.nn.Module is run through torch.jit.script, and if that fails we print an error saying to convert explicitly beforehand by script or trace
- ir="torchscript" or ir="ts": a torch.nn.Module is run through torch.jit.script, and if that fails we print an error saying to convert explicitly beforehand by script or trace
- ir="fx": handling of a torch.jit.ScriptModule is ???
The top level will hold:

- torch_tensorrt.dump_build_info()
- torch_tensorrt.Input
- torch_tensorrt.Device
- torch_tensorrt.dtype
- torch_tensorrt.DeviceType
- torch_tensorrt.EngineCapability
- torch_tensorrt.TensorFormat
- torch_tensorrt.torchscript (aliased to torch_tensorrt.ts): expects only a torch.jit.ScriptModule (a usage sketch follows this list)
  - torch_tensorrt.ts.embed_engine_in_module(serialized_engine, device: trtorch.Device)
  - torch_tensorrt.ts.CompileSpec
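Putting the proposed pieces together, a hypothetical end-to-end flow might look like the following. The model, input shape, Device arguments, and the exact kwargs of convert_method_to_trt_engine are assumptions based on the proposal above, not a confirmed signature:

```python
import torch
import torch_tensorrt  # the proposed package name

# Tiny stand-in model for illustration.
class MyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

scripted = torch.jit.script(MyModel().eval())

# Proposed top-level generalist wrapper: build a serialized TensorRT engine
# from the module's forward method, routed through the TorchScript path.
engine_bytes = torch_tensorrt.convert_method_to_trt_engine(
    scripted,
    "forward",
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.half},
    ir="ts",
)

# Wrap the serialized engine back into a TorchScript module that can be
# saved and later loaded with just the runtime library present.
trt_module = torch_tensorrt.ts.embed_engine_in_module(
    engine_bytes, device=torch_tensorrt.Device(gpu_id=0)
)
torch.jit.save(trt_module, "trt_model.ts")
```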
C++

Lots of open questions here: there is a concept of a torch::nn::Module which can be defined and trained in C++ and which is not TorchScript-compatible (as far as I know). I believe we should treat this as a separate IR to be supported at a later date. Similar to Python, there will be a new sub-namespace for the API, torch_tensorrt::ts, which will hold compile and convert_graph_to_engine targeted at TorchScript.

We are naming the libraries libtorchtrt.so, libtorchtrt_runtime.so and libtorchtrt_plugins.so.

CLI
torchtrtc: I think the CLI basically stays the same. In the event of a new serialized IR to support, we can add an IR flag or auto-detect the IR and select the correct code path. This will primarily remain targeted at TorchScript for the time being.