onnx2tf
Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC).
You should use LiteRT Torch rather than onnx2tf. https://github.com/google-ai-edge/litert-torch
Note
Click to expand
- The TorchScript-based `torch.onnx.export` has already been moved to maintenance mode, and we recommend moving to the FX-graph-based `torch.onnx.dynamo_export` starting with PyTorch v2.2.0.
- The greatest advantage of ONNX generated by `torch.onnx.dynamo_export` is that it directly references the PyTorch implementation, allowing the conversion of any OP that was previously difficult to convert to ONNX.
- The maintainers of ONNX and PyTorch have assured us that they will not add new OPs after `opset=18` to the existing `torch.onnx.export`.
  - https://pytorch.org/docs/stable/onnx_dynamo.html#torch.onnx.dynamo_export
- Models can be converted directly into an ONNX graph with Pythonic code using `onnxscript`.
- For future model versatility, it would be a good idea to consider moving to `torch.onnx.dynamo_export` at an early stage.
- Google AI Edge Torch: AI Edge Torch is a Python library that supports converting PyTorch models into a `.tflite` format, which can then be run with TensorFlow Lite and MediaPipe. This enables applications for Android, iOS, and IoT that can run models completely on-device. AI Edge Torch offers broad CPU coverage, with initial GPU and NPU support. AI Edge Torch seeks to closely integrate with PyTorch, building on top of `torch.export()` and providing good coverage of Core ATen operators.
  https://github.com/google-ai-edge/ai-edge-torch?tab=readme-ov-file#pytorch-converter
```python
import torch
import torchvision
import ai_edge_torch

# Use resnet18 with pre-trained weights.
resnet18 = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
sample_inputs = (torch.randn(1, 3, 224, 224),)

# Convert and serialize the PyTorch model to a TFLite flatbuffer. Note that we
# set the model to evaluation mode prior to conversion.
edge_model = ai_edge_torch.convert(resnet18.eval(), sample_inputs)
edge_model.export("resnet18.tflite")
```
- Google for Developers Blog, MAY 14, 2024 - AI Edge Torch: High Performance Inference of PyTorch Models on Mobile Devices
- Considering the compatibility of Pythonic code with TensorFlow/Keras/TFLite and the clean conversion workflow, nobuco is the optimal choice going forward.
- The role of `onnx2tf` will end within the next one to two years. I don't intend to stop maintaining `onnx2tf` itself anytime soon, but I will continue to maintain it little by little as long as there is demand for it. The end of `onnx2tf` will come when TensorRT and other runtimes support porting from FX-graph-based models.
Model Conversion Status
https://github.com/PINTO0309/onnx2tf/wiki/model_status
Supported layers
Legend: Supported / Partial support / Help wanted (Pull Requests are welcome)
See the list of supported layers
OP Status Abs Acosh Acos Add AffineGrid And ArgMax ArgMin Asinh Asin Atanh Atan Attention AveragePool BatchNormalization Bernoulli BitShift BitwiseAnd BitwiseNot BitwiseOr BitwiseXor BlackmanWindow Cast Ceil Celu CenterCropPad Clip Col2Im Compress ConcatFromSequence Concat ConstantOfShape Constant Conv ConvInteger ConvTranspose Cosh Cos CumProd CumSum DeformConv DepthToSpace Det DequantizeLinear DFT Div Dropout DynamicQuantizeLinear Einsum Elu Equal Erf Expand Exp EyeLike Flatten Floor FusedConv GatherElements GatherND Gather Gelu Gemm GlobalAveragePool GlobalLpPool GlobalMaxPool GreaterOrEqual Greater GridSample GroupNormalization GRU HammingWindow HannWindow Hardmax HardSigmoid HardSwish Identity If ImageDecoder Input InstanceNormalization Inverse IsInf IsNaN LayerNormalization LeakyRelu LessOrEqual Less Log LogSoftmax Loop LpNormalization LpPool LRN LSTM MatMul MatMulInteger MaxPool Max MaxRoiPool MaxUnpool Mean MeanVarianceNormalization MelWeightMatrix Min Mish Mod Mul Multinomial Neg NegativeLogLikelihoodLoss NonMaxSuppression NonZero Optional OptionalGetElement OptionalHasElement Not OneHot Or Pad Pow PRelu QLinearAdd QLinearAveragePool QLinearConcat QLinearConv QGemm QLinearGlobalAveragePool QLinearLeakyRelu QLinearMatMul QLinearMul QLinearSigmoid QLinearSoftmax QuantizeLinear RandomNormalLike RandomNormal RandomUniformLike RandomUniform Range Reciprocal ReduceL1 ReduceL2 ReduceLogSum ReduceLogSumExp ReduceMax ReduceMean ReduceMin ReduceProd ReduceSum ReduceSumSquare RegexFullMatch Relu Reshape Resize ReverseSequence RNN RoiAlign RotaryEmbedding Round ScaleAndTranslate Scatter ScatterElements ScatterND Scan Selu SequenceAt SequenceConstruct SequenceEmpty SequenceErase SequenceInsert SequenceLength Shape Shrink Sigmoid Sign Sinh Sin Size Slice Softmax SoftmaxCrossEntropyLoss Softplus Softsign SpaceToDepth Split SplitToSequence Sqrt Squeeze STFT StringConcat StringNormalizer StringSplit Sub Sum Tan Tanh TensorScatter TfIdfVectorizer ThresholdedRelu Tile TopK 
Transpose Trilu Unique Unsqueeze Upsample Where Xor
Warning
`flatbuffer_direct` is an experimental backend. Behavior, supported patterns, and conversion quality may change between releases.
For production use, keep `tf_converter` as the baseline and validate `flatbuffer_direct` per model with `--report_op_coverage`.
[WIP / experimental] `flatbuffer_direct` support status for the ONNX ops in this list
The `flatbuffer_direct` conversion option exists to convert a QAT-quantized ONNX model to an optimized quantized tflite (LiteRT) model. The goal is to completely remove the dependency on the TensorFlow runtime in the future. Incidentally, if you want to generate a highly optimized quantized tflite from your ONNX model, I recommend this package: https://github.com/NXP/eiq-onnx2tflite
| INT8 ONNX | INT8 TFLite(LiteRT) |
|---|---|
- Scope: ONNX ops listed in the Supported layers table above.
- Source of truth: `onnx2tf/tflite_builder/op_registry.py` and `--report_op_coverage` output.
- Current summary:
  - Listed ONNX ops in this README section: 208
  - Policy counts are generated in `*_op_coverage_report.json` (`schema_policy_counts`).
  - Check each conversion run with `--report_op_coverage` for the latest numbers.

Notes:
- `flatbuffer_direct` supports only a subset of ONNX ops as TFLite builtins.
- Some ops are conditionally supported (rank/attribute/constant-input constraints).
- For model-specific results, use `--report_op_coverage` and check `*_op_coverage_report.json`.
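The coverage report can be inspected with a few lines of Python. The key name `schema_policy_counts` comes from the list above; the surrounding structure of the JSON file is an assumption for illustration:

```python
import glob
import json

# Look for op-coverage reports emitted next to the converted model.
for path in glob.glob("*_op_coverage_report.json"):
    with open(path) as f:
        report = json.load(f)
    # "schema_policy_counts" is the per-policy tally named in this README;
    # no other keys of the report are assumed here.
    print(path, report.get("schema_policy_counts"))
```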
Builtin supported (ONNX -> TFLite) in flatbuffer_direct
| ONNX OP | TFLite OP | Key constraints (flatbuffer_direct) |
|---|---|---|
| Abs | ABS | - |
| Acos | MUL + SUB + SQRT + ATAN2 | Input/output dtype must be FLOAT16 or FLOAT32 |
| Acosh | SUB + ADD + SQRT + MUL + LOG | Input/output dtype must be FLOAT16 or FLOAT32 |
| Add | ADD | - |
| And | LOGICAL_AND | - |
| ArgMax | ARG_MAX (+ optional RESHAPE for keepdims) | axis must be in range, keepdims must be 0 or 1, select_last_index=0, output dtype must be INT32 or INT64 |
| ArgMin | ARG_MIN (+ optional RESHAPE for keepdims) | axis must be in range, keepdims must be 0 or 1, select_last_index=0, output dtype must be INT32 or INT64 |
| Asin | MUL + SUB + SQRT + ATAN2 | Input/output dtype must be FLOAT16 or FLOAT32 |
| Asinh | MUL + ADD + SQRT + LOG | Input/output dtype must be FLOAT16 or FLOAT32 |
| Atan | ATAN2 | Input/output dtype must be FLOAT16 or FLOAT32 |