Setting Up the ONNX Parser
In this post, we’ll explore how to parse a simple ONNX model and print its operators. This is the first step in building a TVM-like compiler from scratch: it introduces ONNX parsing and sets the stage for converting ONNX models into a custom intermediate representation (IR).
Because the front-end is not the focus of the project, we will use an existing ONNX parser instead of writing one from scratch.
What is ONNX?
ONNX (Open Neural Network Exchange) is a framework-independent format for representing machine learning models. It allows models trained in one framework (such as PyTorch or TensorFlow) to be exported and used in another, and it is a commonly used representation for machine learning models.
Our goal in this post is to load an ONNX model and extract its operator names, a simple yet significant step toward creating a TVM-like front-end.
Later, we will translate the ONNX model into IRs.
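As a concrete picture of the operator-extraction goal, here is a minimal sketch of the data involved. In onnx.proto, a model (ModelProto) holds a graph (GraphProto), the graph holds a list of nodes (NodeProto), and each node carries an op_type string such as "Conv" or "Relu". The structs below are hypothetical, heavily simplified stand-ins for those messages, not the classes protoc generates, and the names collect_op_types and make_example_model are invented for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the ONNX messages.
// The real NodeProto, GraphProto, and ModelProto have many more fields.
struct Node {
    std::string name;                  // optional node name
    std::string op_type;               // operator name, e.g. "Conv"
    std::vector<std::string> inputs;   // names of input tensors
    std::vector<std::string> outputs;  // names of output tensors
};

struct Graph {
    std::string name;
    std::vector<Node> nodes;  // a valid ONNX graph lists nodes in topological order
};

struct Model {
    Graph graph;
};

// Collect the operator name of every node, in graph order.
std::vector<std::string> collect_op_types(const Model& model) {
    std::vector<std::string> ops;
    ops.reserve(model.graph.nodes.size());
    for (const Node& node : model.graph.nodes) {
        ops.push_back(node.op_type);
    }
    return ops;
}

// Build a tiny Conv -> Relu model by hand, purely for illustration.
Model make_example_model() {
    Model model;
    model.graph.name = "tiny";
    model.graph.nodes.push_back({"conv0", "Conv", {"x", "w"}, {"y"}});
    model.graph.nodes.push_back({"relu0", "Relu", {"y"}, {"z"}});
    return model;
}
```

Calling collect_op_types(make_example_model()) yields the two operator names in graph order. Once the generated Protobuf classes are available, the same loop shape should carry over to the real messages.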
Setting Up the Project
To keep the project modular (and to avoid the complexity of parsing ONNX models ourselves), we will use an existing ONNX parser as our front-end. The ONNX parser relies on Protobuf (Protocol Buffers), so its dependencies need careful handling. Instead of managing Protobuf and other dependencies by hand, we will use Conan, a C++ package manager, to automate this. We will also follow a clean directory structure to keep the project scalable and maintainable.
Project Directory Layout
Here’s the proposed directory structure for our project:
myTVM/
├── include/                    # Public headers
│   ├── mytvm/                  # Project namespace
│   │   └── onnx_parser.h       # ONNX parser header
│   └── onnx/                   # ONNX Protobuf headers
│       └── onnx.pb.h           # Generated by protoc
├── src/                        # Source files
│   ├── frontend/               # Front-end implementation
│   │   └── onnx_parser.cpp     # ONNX parser implementation
│   ├── onnx/                   # ONNX Protobuf implementation
│   │   └── onnx.pb.cc          # Generated by protoc
│   └── main.cpp                # Entry point for the project
├── CMakeLists.txt              # Top-level CMake configuration
├── conanfile.txt               # Conan dependencies
└── README.md                   # Project description
Setting Up Dependencies
Step 1: Install Conan
First, make sure Conan is installed.
Step 2: Define Dependencies in conanfile.txt
Create a conanfile.txt to specify the required dependencies:
[requires]
protobuf/3.21.4
[generators]
CMakeToolchain
CMakeDeps
This tells Conan to provide the Protobuf library, which the ONNX parser requires.
Step 3: Install Dependencies
Run the following command to install the dependencies:
conan install . --output-folder=_conan --build=missing
Generating ONNX Parser Files
The ONNX parser relies on Protobuf definitions. To generate the necessary onnx.pb.h and onnx.pb.cc files, we do the following:
Clone the ONNX repository to obtain the onnx.proto file:
git clone https://github.com/onnx/onnx.git
cd onnx
Use the protoc compiler (the same version that Conan installs) to generate the files. protoc does not create the output directory, so create generated/ first:
protoc --cpp_out=generated/ onnx/onnx.proto
Move the generated files to the locations shown in the directory structure.
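The advice to match the protoc version matters because the generated onnx.pb.h checks, at compile time, that the Protobuf headers it is built against are compatible with the protoc release that produced it. Protobuf headers expose their release as a single integer in the GOOGLE_PROTOBUF_VERSION macro, laid out as major * 1000000 + minor * 1000 + patch. The sketch below decodes such an integer so it can be compared with the protobuf/3.21.4 entry in conanfile.txt. The struct and function names are hypothetical helpers, and the code deliberately includes no Protobuf header so that it compiles on its own.

```cpp
#include <string>

// Hypothetical helper type holding a decoded Protobuf release number.
struct ProtobufVersion {
    int major_part;
    int minor_part;
    int patch_part;
};

// Decode an integer laid out as major * 1000000 + minor * 1000 + patch,
// the scheme used by the GOOGLE_PROTOBUF_VERSION macro.
ProtobufVersion decode_protobuf_version(int encoded) {
    ProtobufVersion v;
    v.major_part = encoded / 1000000;
    v.minor_part = (encoded / 1000) % 1000;
    v.patch_part = encoded % 1000;
    return v;
}

// Render the version in the dotted form used in conanfile.txt.
std::string format_version(const ProtobufVersion& v) {
    return std::to_string(v.major_part) + "." +
           std::to_string(v.minor_part) + "." +
           std::to_string(v.patch_part);
}

// Compare an encoded version against the version pinned in conanfile.txt.
bool matches_pinned_version(int encoded, const std::string& pinned) {
    return format_version(decode_protobuf_version(encoded)) == pinned;
}
```

In a source file that includes a Protobuf header, passing GOOGLE_PROTOBUF_VERSION to matches_pinned_version together with "3.21.4" is one way to confirm that the build picked up the Conan-provided library rather than a system copy.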
Integrating with CMake
To build the project, we’ll use CMake. Update your CMakeLists.txt as follows:
...
# Add libraries
add_library(frontend src/frontend/onnx_parser.cpp)
add_library(onnx src/onnx/onnx.pb.cc)
# Fail at configure time if the Conan-provided Protobuf package is missing
find_package(protobuf REQUIRED)
target_link_libraries(frontend PRIVATE protobuf::protobuf)
target_link_libraries(onnx PRIVATE protobuf::protobuf)
# Add subdirectories
add_subdirectory(src)
include_directories(${CMAKE_SOURCE_DIR}/include)
add_executable(myTvm src/main.cpp)
target_link_libraries(myTvm PRIVATE frontend onnx)
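At this stage the frontend target only needs a stub to prove that the build links. The sketch below shows one possible shape for it. The class name OnnxParser, its describe method, the run_frontend helper, and the default file name model.onnx are all hypothetical and are not taken from the project's repository. The stub does not touch onnx.pb.h yet, so it compiles without Protobuf.

```cpp
#include <string>
#include <utility>

namespace mytvm {

// Hypothetical placeholder front-end: it records the model path and
// reports what it would do, without reading the file.
class OnnxParser {
public:
    explicit OnnxParser(std::string model_path)
        : model_path_(std::move(model_path)) {}

    // Return a placeholder message instead of parsing anything.
    std::string describe() const {
        return "[frontend] would parse ONNX model: " + model_path_;
    }

    const std::string& model_path() const { return model_path_; }

private:
    std::string model_path_;
};

}  // namespace mytvm

// What main.cpp could do with the stub: take the model path from argv,
// falling back to a made-up default file name.
std::string run_frontend(int argc, const char* const* argv) {
    const std::string path = (argc > 1) ? argv[1] : "model.onnx";
    return mytvm::OnnxParser(path).describe();
}
```

A real main would forward its own argc and argv to run_frontend and print the returned string. Keeping the message in a return value rather than printing inside the class makes the stub easy to check before any parsing logic exists.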
Next Steps
Now that the ONNX parser is set up, we are ready to call it and print placeholder messages. Next, we will start parsing ONNX models and extracting operator names. In future posts, we’ll explore how to translate ONNX models into intermediate representations (IRs), optimize them, and integrate with back-end code generation.
The relevant GitHub commit for this part is https://github.com/ybllnl/myTVM/commit/11f9eff8496bea54d84ee4c8460f96d22707afb7