Learning ONNX for trading - page 4

 

What is ONNX Runtime (ORT)?

ONNX Runtime (ORT) is a library that optimizes and accelerates machine learning inference, allowing users to train models in any supported machine learning library, export them to ONNX format, and run inference in their preferred language. The speaker walks through an example of performing inference on a PyTorch-exported model using ONNX Runtime and points out that users can visit onnxruntime.ai to explore the different APIs and tools required for their preferred setup.
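
A minimal sketch of that export-then-infer flow, assuming a toy PyTorch model (file name, shapes, and input names below are illustrative):

```python
# Sketch: export a toy PyTorch model to ONNX, then run it with ONNX Runtime.
import numpy as np
import torch
import onnxruntime as ort

model = torch.nn.Linear(4, 2)   # stand-in for any trained model
model.eval()

torch.onnx.export(model, torch.randn(1, 4), "model.onnx",
                  input_names=["input"], output_names=["output"])

session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])
outputs = session.run(None, {"input": np.random.randn(1, 4).astype(np.float32)})
print(outputs[0].shape)         # -> (1, 2)
```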

 

2020 ONNX Roadmap Discussion #1 20200903

The ONNX roadmap document, which has been open to public contributions, is the key topic of this video. The discussion covers extending ONNX across the machine learning pipeline, including evolving data handling, pre-processing, and extending ONNX into horizontal pipelines such as Kubeflow. Suggestions from contributors include supporting data frames and adopting new operators for pre-processing. The speakers also discuss adopting the Python data API standard to broaden ONNX's support and guarantee interoperability with other libraries, as well as integrating ONNX into Kubernetes and Kubeflow to streamline ML development for users. The group plans to continue assessing the impact of the proposals and welcomes feedback through the roadmap or the steering committee.

  • 00:00:00 In this section, the speaker discusses the ONNX roadmap document that has been open to contributions from the public and highlights the importance of these contributions for implementing changes. The speaker mentions that there will be six roadmap discussions, hosted weekly, with a community meeting planned for October 14th. The discussion is divided into three main parts, with the second part focusing on extending ONNX across the machine learning pipeline. Specifically, the discussion covers evolving data and pre-processing it, and extending ONNX into horizontal pipelines like Kubeflow. The speaker also summarizes some of the suggestions made by contributors, such as supporting data frames and adopting new operators for pre-processing.

  • 00:05:00 In this section, the discussion focuses on the ONNX roadmap and various suggestions, such as supporting audio spectrogram processing and expanding data layout support. The conversation also covers the proposed extension of ONNX's influence on the ML pipeline and the potential benefits of supporting data frames within ONNX. The participants express their opinions, with one member sharing insights into the Python Data API Consortium's efforts to build a data API standard for array and data frame interoperability. The group seems to agree that expanding ONNX's capabilities in these areas is a good thing and aligns with broader industry initiatives.

  • 00:10:00 In this section, the speakers discuss adopting the Python data API standard as a way to expand ONNX's support and guarantee interoperability with other libraries that follow the same standard. The speaker notes that adopting the standard would make model exchange easier and that aligning the schedule with the wider consortium would benefit ONNX users. They also discuss how ONNX differs from traditional data structures like data frames, and the need for other libraries to adopt the same standard.

  • 00:15:00 In this section, Chen discusses how ONNX can be integrated into the Kubeflow pipeline to make ML development easier for end users. Chen describes the concept of a pipeline in which all components work together to orchestrate end-to-end ML development: Kubeflow combines model and data content with infrastructure to provide a seamless experience for end users. Chen is exploring how ONNX can be integrated into this pipeline to broaden its usage and make life easier for ML developers.

  • 00:20:00 In this section, the speakers discuss making it easier for users of Kubernetes and Kubeflow to leverage ONNX in their infrastructure and environments. The goal is to develop an easy-to-access API to take models from the model zoo and create an end-to-end pipeline using ONNX (a minimal sketch of such a model-zoo-to-inference step appears after this list). The speakers showcase an example where they use ONNX to describe the inference part of a machine learning process in Kubeflow and outline ideas for developing ONNX components that cover more steps, including data processing and distributed training. The idea is to leverage the power of Kubernetes while covering more of the machine learning process.

  • 00:25:00 In this section, the speakers discuss expanding Kubeflow with an ONNX job to enable distributed training, and adding data processing and transformation steps leading up to model training. ONNX Runtime already supports training transformer models from PyTorch today, but there is still progress to be made on ONNX model training. The speakers suggest starting with the models in the model zoo to see how the data needs to be pre-processed and transformed for those models, but note that this work is not purely inside the ONNX core project and requires a higher-level framework for defining components, such as Kubeflow.

  • 00:30:00 In this section, the participants discuss a proposal that was made for the ONNX roadmap and suggest linking the slides to the document. The group plans to continue assessing the impact of the proposal in subsequent meetings and hopes to have more closure on the implementation. They also welcome feedback and encourage users to submit it through the roadmap or steering committee. The discussion concludes with a farewell and invitation to future meetings.
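
As a concrete starting point for the model-zoo idea mentioned above, the onnx package ships a small hub client for the ONNX Model Zoo. A minimal sketch of a pipeline's inference step built on it (the model name is illustrative, and the call downloads the model over the network):

```python
# Sketch: fetch a model from the ONNX Model Zoo and run it with ONNX Runtime.
# This is the kind of step a Kubeflow component could wrap; names are illustrative.
import onnx
import onnxruntime as ort
from onnx import hub

model = hub.load("mnist")                      # downloads from the onnx/models repo
sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CPUExecutionProvider"])
print([(i.name, i.shape) for i in sess.get_inputs()])  # what upstream steps must feed
```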
 

2020 ONNX Roadmap Discussion #2 20200909

In the "ONNX Roadmap Discussion" video, the speakers discuss various topics related to ONNX's roadmap, including shape inference, operator definitions, reference implementations, and the ONNX spec. The speakers suggest building a generic shape inference infrastructure to improve shape inference optimization, reducing the number of primitive operators, adding reference implementations for every operator, and better-defined test cases to ensure proper implementation and testing of ONNX. The group plans to continue discussions within the operator SIG and on the GitHub discussions board for adding a new operator.

  • 00:00:00 In this section, the speakers discuss the ONNX roadmap and cover a few suggested topics, specifically shape inference, op definitions, and the IR, all previously raised in the roadmap document with comments from various individuals. The speakers ask whether Changming or Buchan are available to explain their proposals. Buchan shares his feedback on shape inference and the issues he has had with it in the past. He suggests building a generic shape inference infrastructure that recomputes shapes whenever the IR changes, so that shape inference and optimization go hand in hand and improve together. The speakers conclude that this applies more to optimization than directly to shape inference.

  • 00:05:00 In this section, the group works to understand the current capabilities of ONNX in terms of shape inference and optimization passes. The existing infrastructure already supports shape inference based on known input values and can be used to infer output shapes. There may be low-hanging fruit in updating the model checker, but other changes may require more discussion. The group debates whether these changes belong in ONNX or elsewhere. They also consider the idea of calling each operator's shape inference method in consecutive loops to achieve the desired results. Ultimately, the existing infrastructure already allows for optimization passes and shape inference, but changes and improvements could enhance these capabilities.

  • 00:10:00 In this section, the speakers discuss operator definitions and suggest reducing the number of primitive operators, since other operators can be composed from lower-level ones. They also discuss reference implementations and the need for infrastructure to evaluate a sequence of operators. The current reference implementation exists as Python code in the test case generator but is not organized in a way that makes it easy to evaluate a sequence of operators. The speakers suggest adding a runtime such as ONNX Runtime to the CI to verify function subgraphs, which can be used for validation.

  • 00:15:00 In this section, the speakers discuss the need for reference implementations for every operator to ensure that runtimes do not diverge from the author's expectations. They suggest using the reference implementation as a unit test to validate parity with the runtime, and as an interpreter mode to check interoperability and for validation (a minimal sketch of this unit-test pattern appears after this list). The speakers note that using ONNX Runtime to validate a function is possible under the assumption that the function is composed of primitive ops that already exist in ONNX. However, using ONNX Runtime for new operators in a subgraph that includes new primitive ops is not possible, since no runtime would have that implementation yet. They acknowledge that creating a reference implementation requires a lot of work, but argue it is mandatory for every op. They also emphasize the need for ONNX compliance testing to ensure that runtimes do not diverge from the author's expectations.

  • 00:20:00 In this section, the speakers discuss the use of reference implementations in the ONNX spec and the importance of clear and concise language in the specification. While some advocate for the use of reference implementations to remove ambiguity in the English text of the spec, others argue that the spec should be clear enough so that reference implementations are unnecessary. The speakers also discuss the importance of rigorous compliance testing to ensure that all possible corner cases are tested. Ultimately, the consensus seems to be that while reference implementations can be useful, they should not be required in the ONNX spec.

  • 00:25:00 In this section, there is a discussion about the requirements for implementing an operator in ONNX, specifically regarding the need for a reference implementation and testing procedures. While some argue that a reference implementation for the operator should be mandatory for generating test data, others disagree, stating that it is enough to provide either a Python function to generate test data or a fixed data set. However, it is noted that having a reference implementation is crucial for someone implementing the operator in a runtime to properly test their implementation, especially for complicated operators with many different attributes. The discussion also clarifies that while a reference runtime for ONNX is not necessary, a reference implementation for each operator is required.

  • 00:30:00 In this section of the video, the speakers discuss the importance of having a reference implementation and better-defined test cases to ensure proper implementation and testing of ONNX. They note that relying on generated test data can be insufficient and that having simple reference code available solves the problem for everyone. The conversation also touches on the need for a complete specification and the runtime's freedom to decide what to do in undefined-behavior cases. Some speakers express concern about burdening operator proposers with writing a reference implementation; they suggest minimizing the engineering effort and revisiting the current requirements for adding ops.

  • 00:35:00 In this section of the video, the speakers discuss the importance of having a complete and unambiguous spec for ONNX and ways to enforce it. They agree that a reference implementation in Python would be helpful to ensure that people implementing operators in the runtimes can verify all the test cases. However, they also acknowledge that implementing a spec is not simple and there are still issues to address. They discuss ways to clarify how the spec can be used and propose that practice and feedback should guide the proposal of new operators, rather than proposing an operator and then implementing it in a runtime. They also note that one requirement for adding a new op is that it should be implemented in a well-known framework.

  • 00:40:00 In this section of the ONNX Roadmap Discussion, the group discusses the process for adding new operators to the ONNX spec. The proposal is to change the policy for adding a new operator so that a reference implementation in Python is required. The discussion centers on reference implementations and compliance tests, and the group plans to continue the conversation within the operator SIG and on the GitHub discussions board, with the next meeting scheduled for the following week.
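
A minimal sketch of the reference-implementation-as-unit-test pattern discussed above, assuming NumPy semantics for LeakyRelu (chosen purely for illustration) and ONNX Runtime as the runtime under test:

```python
# Sketch: a NumPy reference implementation for one operator, used as a unit
# test against an actual runtime. Opset version and tolerances are illustrative.
import numpy as np
import onnx
import onnxruntime as ort
from onnx import TensorProto, helper

def leakyrelu_reference(x: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    """Reference semantics of LeakyRelu, written for clarity rather than speed."""
    return np.where(x >= 0, x, alpha * x)

# A single-node model exercising the operator.
node = helper.make_node("LeakyRelu", ["x"], ["y"], alpha=0.1)
graph = helper.make_graph(
    [node], "leakyrelu_test",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [3])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [3])])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)

x = np.array([-1.0, 0.0, 2.0], dtype=np.float32)
sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CPUExecutionProvider"])
np.testing.assert_allclose(sess.run(None, {"x": x})[0],
                           leakyrelu_reference(x, alpha=0.1), rtol=1e-6)
```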
 

2020 ONNX Roadmap Discussion #3 20200916

The discussion in this video centers on various topics related to ONNX, including improving error handling, adding a predefined metadata schema to describe how a model was created, the need for more quantization operators and optimization, and the possibility of updating ONNX models in the Model Zoo to the most recent versions. The team plans to prioritize these topics based on their impact and cost and to work on them after the 1.8 release. Additionally, the group considers the idea of creating different language bindings for the ONNX toolset, with particular interest in Java in order to support platforms such as Spark. The speakers also discuss the possibility of creating a Java wrapper around ONNX Runtime.

  • 00:00:00 In this section, the speaker suggests discussing three topics with the community: error handling, enhancing the model zoo, and implementing more operations or operators for quantization. They plan to utilize the next three sessions to talk about the cost and impact of these topics and to figure out prioritization based on the items with the highest impact and low cost. They also address a question about the impact of these topics on release 1.8 and explain that most of these changes will be made post-1.8. One community member suggests improving the error handling so that if the runtime encounters malformed protobufs, it won't terminate and instead will return an error code or throw an exception to ensure a better user experience.

  • 00:05:00 In this section, the discussion centers on improving the error handling of the loading code in ONNX in order to prevent crashes and improve functionality. The team has conducted fuzzing on the code and found that untrusted models have the potential to take down the entire process, making it a top priority to address. The ONNX runtime has a different checking process from the ONNX checker, and it's not yet clear if they can share the same checker. Additionally, the topic of better error handling during audits is raised, and the team plans to follow up on this suggestion.

  • 00:10:00 In this section, the speaker discusses their library, Tribuo, which interacts with the ONNX ecosystem and serves ONNX models. They propose adding a predefined metadata schema in ONNX to denote the timestamp when a model was created, the training algorithm used, and the source library that generated it. They suggest defining standard key names for the metadata field and typing the values as well (a sketch using ONNX's existing metadata_props field appears after this list). The speaker believes having a schema for the metadata field would be useful for libraries and other users serving ONNX models. The conversation then shifts to the need to extend model testing to cover all models in the Model Zoo and to provide good, high-quality examples.

  • 00:15:00 In this section, the discussion centers on the need for quantization-related optimization as well as expanding ONNX's Model Zoo to include quantized models. There have been several requests to include quantized models in the Model Zoo, and the team hopes to find contributors. They mention a blog post where a quantized ONNX model from Hugging Face performed well, but they would need permission from Hugging Face to post it. It was also suggested that the transformer library's top models could serve as examples for quantization, which both Microsoft and Hugging Face work on. Additionally, there was discussion about optimization, and some agreed that it is better left to the runtime, as it is beyond the scope of the ONNX spec.

  • 00:20:00 In this section, the participants discuss the possibility of updating the ONNX models from Model Zoo to the most recent versions using the Version Converter tool. However, they note that the Version Converter is not fully up to date and there is some uncertainty as to whether the conversion is necessary, as ONNX supports all previous versions. The group also considers the idea of different language bindings for the ONNX toolset, with a particular interest in Java, in order to support different platforms such as Spark. The addition of a Java API or bindings would facilitate the loading and validation of model files and making a converter from other libraries into the ONNX format.

  • 00:25:00 In this section, the speakers discuss the possibility of creating a Java wrapper around ONNX Runtime, which would make things easier for JVM-based machine learning projects such as Spark. Although it is a non-trivial undertaking, using the JavaCPP Presets to auto-generate stubs could be a good starting point. Backwards compatibility is crucial for large projects like Spark, and targeting Java 8 would require significant work. However, if there is enough interest and willingness from the community to contribute, it could be a worthwhile thing to explore.
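
The metadata proposal from the 00:10:00 bullet maps naturally onto the free-form metadata_props field ONNX already has; a sketch of tagging a model with such keys (the key names and values are illustrative, not a standard):

```python
# Sketch: attach provenance metadata to an ONNX model via metadata_props.
# The key names below follow the proposal's spirit; nothing here is standardized.
import onnx
from onnx import helper

model = onnx.load("model.onnx")  # placeholder path
helper.set_model_props(model, {
    "created": "2020-09-16T12:00:00Z",         # when the model was created
    "training_algorithm": "gradient_boosting",  # how it was trained
    "source_library": "skl2onnx",               # which library emitted it
})
onnx.save(model, "model_with_metadata.onnx")
```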
 

2020 ONNX Roadmap Discussion #4 20200923

The fourth part of the ONNX roadmap discussion covers the topics of data frame support, pre-processing, standardization, end-to-end machine learning pipeline, and tool recommendations. Data frame support is evaluated as valuable for classical machine learning models and could eliminate the need for pre-processing. The need for pre-processing to be captured within the ONNX model is highlighted to improve performance, with a focus on standardizing high-level categories like image-processing. The end-to-end pipeline is rated as a low priority, but gradually adding components to the pipeline is suggested. The discussion concludes with a recommendation to use a tool to aid further discussion and analysis of the agenda items.

  • 00:00:00 In this section, the speakers discuss the ONNX roadmap and the features suggested by the community. They have covered three areas so far, including the ML pipeline and data processing, op definitions and IR, and core robustness. The roadmap document includes a table of the suggested features, rated high, medium, or low priority. However, some of the topics are too generic, making it difficult to assess their importance. The speakers plan to spend the next 30 minutes discussing why some of these features were rated high and gathering feedback from the community on which features matter most.

  • 00:05:00 In this section, prompted by a question about how the ONNX roadmap is being prioritized, the discussion turns to the importance of a data frame support feature and how it could potentially solve other issues within the platform. The speaker explains that this feature would be valuable for data scientists and could potentially remove the need for a separate pre-processing feature. They also mention the need for an engineering cost estimate for each item on the roadmap in order to prioritize tasks effectively. Suggestions are welcomed, as this is the first time the roadmap has been presented this way.

  • 00:10:00 In this section of the ONNX Roadmap Discussion, the importance of data frame support for machine learning models is discussed. It is believed that data frame support is mainly for classical machine learning models rather than DNNs or other models. A data frame differs from a sequence in that it is a heterogeneous collection of tensors, effectively a relational table whose columns can have different types. The importance of each feature is evaluated based on the impact it will have, and engineering costs will be factored in. It is suggested that a comment per box be provided to explain why a feature is rated high or low in importance.

  • 00:15:00 In this section, the importance of capturing pre-processing within an ONNX model is discussed. The conversation highlights the need to have all necessary steps captured within the ONNX model rather than relying on external libraries, especially in the context of training, where pre-processing can have a significant impact on performance. Pre-processing can also be useful on the inferencing side, particularly outside Python-based environments. The discussion touches on the challenges of standardizing pre-processing given the heterogeneous nature of data types. Although pre-processing is a broad and complex topic, the conversation concludes that ONNX needs to consider its missing operators and types in order to standardize pre-processing (a small sketch of pre-processing captured as an ONNX graph appears after this list).

  • 00:20:00 In this section, the speakers discuss the broad scope of pre-processing and how it could include not just vision-related processing but also audio data. While pre-processing is important to consider, the speakers note that it may not be necessary to support every data type, and instead, standardizing on high-level categories like image-processing could be more beneficial for developers. However, the speakers caution that even seemingly simple pre-processing tasks like image resizing can have subtle edge case differences between libraries, making standardization an engineering challenge. Nonetheless, standardizing pre-processing tasks can be helpful, and the speakers suggest collecting common pre-processing steps for future consideration.

  • 00:25:00 In this section, the speakers discuss the priority of including the end-to-end machine learning pipeline in ONNX, with some stating that it is a low priority given the other items that need to be addressed. However, they recognize the usefulness of having an end-to-end example and illustration of how ONNX can be applied, particularly when ONNX Runtime is brought into the mix. The idea of gradually adding components to the pipeline is suggested, with a focus on the training part, fine-tuning ONNX, and eventually adding pre-processing into the mix. The discussion ends with the recommendation to use a tool to facilitate further discussion and impact analysis of the items on the agenda.

  • 00:30:00 In this section, the speaker thanks everyone for joining and informs the audience that they will try to post the discussion on social media and the ONNX website.
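
A small sketch of the "capture pre-processing in the model" idea from the 00:15:00 bullet: a (x − mean) / std normalization graph built with onnx.helper that could be merged in front of a trained classifier. Shapes, statistics, and file names are illustrative:

```python
# Sketch: normalization expressed as ONNX nodes, with statistics baked in
# as initializers, instead of living in external Python pre-processing code.
import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper

mean = numpy_helper.from_array(
    np.array([0.485, 0.456, 0.406], np.float32).reshape(1, 3, 1, 1), name="mean")
std = numpy_helper.from_array(
    np.array([0.229, 0.224, 0.225], np.float32).reshape(1, 3, 1, 1), name="std")

graph = helper.make_graph(
    [helper.make_node("Sub", ["raw", "mean"], ["centered"]),
     helper.make_node("Div", ["centered", "std"], ["normalized"])],
    "preprocess",
    [helper.make_tensor_value_info("raw", TensorProto.FLOAT, [1, 3, 224, 224])],
    [helper.make_tensor_value_info("normalized", TensorProto.FLOAT, [1, 3, 224, 224])],
    initializer=[mean, std])
pre_model = helper.make_model(graph)
onnx.checker.check_model(pre_model)

# Gluing it in front of a trained model (hypothetical file) could then use:
# combined = onnx.compose.merge_models(
#     pre_model, onnx.load("classifier.onnx"),
#     io_map=[("normalized", "input")])
```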
 

2020 ONNX Roadmap Discussion #5 20201001

During the ONNX Roadmap Discussion, the ONNX team discussed various features that were suggested by community members and scored by different people, including the steering committee. While some features were unanimously agreed upon, others split the community. The team discussed the possibility of changing ONNX IR to multiple IRs and centralized IR optimization libraries. They also discussed the idea of centralizing optimization libraries within ONNX and the requirement for ops to implement a standard interface and coding style. The team also debated the possibility of having a simple runtime for ONNX models and the use of custom Python ops for cases where the ONNX runtime is not available. Additionally, the team explored the relationship between pre-processing operations and the use of data frames, planning to turn their ideas into actionable proposals for future work.

  • 00:00:00 In this section, the ONNX team discusses the impact analysis spreadsheet that was set up to capture different people's thoughts on what is important for the project. They listed out all the different features that were suggested and had scores from different people, including the steering committee and other community members. They noticed that there were some features where everyone seems to agree that it's either really important or not important at all, and others where the community was split. They discussed the ones that were split and the next steps for the ones they agreed were important. They also talked about setting up criteria for what's considered high priority and how that depends on who is willing to commit the time to implement a feature.

  • 00:05:00 In this section of the ONNX Roadmap Discussion, the participants discuss the idea of changing ONNX IR to multiple IRs along with centralized IR optimization libraries. There is some debate about whether these two ideas should be grouped together, since optimization and IR are separate issues. The goal of having multiple IRs is to allow complex operators to be simplified into compositions of simpler operations, while optimization libraries would improve the ONNX core. There is further discussion about what exactly is meant by ONNX IR, and clarification is needed. Participants also discuss how these potential changes could affect their current scores on the ONNX roadmap.

  • 00:10:00 In this section, the team discusses the possibility of centralizing optimization libraries in ONNX, but ultimately agrees that optimization should be part of the runtime and that it's lower priority compared to other issues. They also discuss the requirement for ops to be implemented in a specific way, with a standard interface and coding style, which is already a requirement but may need tweaking. They suggest that if someone proposes a specific style, it can be accepted if it seems acceptable.

  • 00:15:00 In this section, the speakers discuss the idea of a simple runtime for ONNX models, which raises concerns about the complexity of requiring an execution flow and internal IR to process the model. However, there is value in being able to run and evaluate ONNX models for testing and establishing correctness, especially for revealing gaps in operator unit tests. While it is debatable how much effort and cost a simple runtime would take to implement, ONNX Runtime does have the capability of plugging in Python ops, which could be used for this purpose (the pure-Python reference evaluator that later shipped in the onnx package is sketched after this list).

  • 00:20:00 In this section, the participants of the ONNX Roadmap Discussion talked about the possibility of using a custom Python op for specific cases where ONNX runtime is not available. They discussed the limitations of the Python op and the need for a standard interface to ensure feasibility. Additionally, the group discussed the need for more pre-processing capabilities inside the ONNX graph to make models more self-contained and portable, especially for image-based pre-processing like scaling and handling bounding boxes. The group noted that text pre-processing, specifically tokenization, is a more complicated and comprehensive issue, but they may be able to abstract some common pre-processing scenarios.

  • 00:25:00 In this section, the participants discuss the relationship between pre-processing operations and the use of data frames. While they agree that pre-processing and data frames are linked, they see them as separate entities that require different types of work. Pre-processing is seen as an operator that works row-wise across a column of a data frame, while data frame extraction itself maps the pre-processing operator across the rows of a column. The group sees the two as closely linked and plans to turn their ideas into actionable proposals for future work.
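
The "simple runtime" debated in the 00:15:00 bullet later appeared in the onnx package itself as a pure-Python reference evaluator (onnx.reference, available in recent onnx releases); a minimal sketch:

```python
# Sketch: evaluate a one-node model with the onnx reference evaluator.
import numpy as np
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

graph = helper.make_graph(
    [helper.make_node("Relu", ["x"], ["y"])], "g",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [4])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [4])])
model = helper.make_model(graph)

ref = ReferenceEvaluator(model)
print(ref.run(None, {"x": np.array([-2.0, -1.0, 0.0, 3.0], np.float32)})[0])
# -> [0. 0. 0. 3.]
```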
 

2021 ONNX Roadmap Discussion #1 20210908

During the ONNX Roadmap Discussion, IBM Research introduced their proposal for a new machine learning pipeline framework that converts typical data preprocessing patterns on Pandas Dataframe into ONNX format. The framework, called Data Frame Pipeline, is open-sourced on GitHub and can be defined using their provided API, which runs on Python during the training phase. The speakers also discussed the need to make ONNX visible in languages other than Python, such as Java, C#, and C++, and the exporting of ONNX models and emitting them from other languages. Additionally, they discussed the current functionalities of the ONNX Python and C++ converters and the need for scoping, naming, and patching functionalities when writing ONNX models.

  • 00:00:00 In this section, Takuya Nakaike from IBM Research introduces their proposal for a new machine learning pipeline framework with new ONNX operators. The motivation for the proposal was the inability of existing pipeline frameworks to represent a typical pattern of data pre-processing. They prototyped a new pipeline framework called Data Frame Pipeline in Python, which converts typical data pre-processing patterns on pandas DataFrames into ONNX format, along with three new ONNX operators: a Date operator and two simpler string operators, StringConcatenator and StringSplitter. The pipeline framework is open-sourced on GitHub and can be defined using their provided API, which runs on Python during the training phase. The model is trained on the data output from the data frame pipeline, and their framework can consume already-converted ONNX machine learning models (a toy example of this kind of pandas-to-ONNX mapping appears after this list).

  • 00:05:00 In this section, the speaker discusses the ONNX format and how it can be used with ONNX Runtime, provided by Microsoft. In their prototype, they implemented 11 data frame transformers in Python and mapped them to ONNX operators; most were simple mappings, but some required analysis and conversion, such as the function transformer. They also describe their approach of generating ONNX operators with embedded properties rather than performing aggregation operations in ONNX. The speaker shares preliminary experimental results showing a significant speedup when running the pre-processing in ONNX, including a 300x performance improvement for categorical encoding. They also compare prediction accuracy, then open the floor for questions and comments on the proposed operators.

  • 00:10:00 In this section, Adam Pocock from Oracle Labs suggests that ONNX should be made usable from languages other than Python, as the current functionality is all wrapped in Python and it's not clear whether C++ is a valid target for binding. Pocock explains that the model checker should be exposed in other languages so that users can interact with it without needing a valid Python environment. Additionally, Pocock mentions that ONNX Runtime occasionally segfaults when consuming models due to parsing issues, and the model checker could be used to validate models and easily catch this issue.

  • 00:15:00 In this section, the speaker discusses the core functionality of model checking and how it would be useful to be exposed across other languages. While they would like to have it in Java, they understand that not everyone would write a Java API, so a C API is a better option for most languages to easily bind to. However, there needs to be a stable and appropriate target for people to bind to, and it is not immediately clear if the C++ API of any of these tools is considered to be an appropriate target for binding. The speaker is willing to participate in this effort, but it is not worth trying to galvanize a large effort unless there is interest from the community.

  • 00:20:00 In this section, the speaker discusses exporting ONNX models and emitting them from languages besides Python, such as C# and Java, with specific focus on ML.NET and Tribuo. The speaker urges the creation of a common API that all of these projects could use to generate ONNX models, especially given the three different implementations that currently exist with no shared code, which is prone to bugs. A common API would mean only one place to update and validate nodes and graphs, providing an opportunity to share strengths and making it easier for other machine learning libraries to emit ONNX models. The speaker acknowledges that this would be a lot of work, but the shared effort could grow the ONNX ecosystem beyond Python.

  • 00:25:00 In this section, the speakers discuss the ONNX Python and C++ converters and their current functionality. They note that the ONNX documentation is not specific enough, which makes certain functionality hard to understand. However, they assert that many of the functionalities necessary for ONNX export already exist in these converters but need to be exposed in the right way to other projects. Additionally, they discuss the need for scoping, naming, and patching functionality when writing ONNX models. Finally, they suggest that the converter work could benefit from being linked to the Architecture/Infra SIG so that it can be used easily by different people.
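
A toy illustration of the pandas-to-ONNX mapping Takuya describes, using the existing ai.onnx.ml LabelEncoder operator for a categorical encoding (categories, opset versions, and the choice of the reference evaluator are illustrative; ONNX Runtime also implements this op):

```python
# Sketch: a pandas-style categorical encoding expressed as an ONNX graph.
import numpy as np
import onnx
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

node = helper.make_node(
    "LabelEncoder", ["category"], ["code"], domain="ai.onnx.ml",
    keys_strings=["buy", "hold", "sell"],  # illustrative vocabulary
    values_int64s=[0, 1, 2],
    default_int64=-1)                      # code for unseen categories
graph = helper.make_graph(
    [node], "encode",
    [helper.make_tensor_value_info("category", TensorProto.STRING, [None])],
    [helper.make_tensor_value_info("code", TensorProto.INT64, [None])])
model = helper.make_model(
    graph, opset_imports=[helper.make_opsetid("", 18),
                          helper.make_opsetid("ai.onnx.ml", 2)])
onnx.checker.check_model(model)

ref = ReferenceEvaluator(model)
print(ref.run(None, {"category": np.array(["sell", "buy", "unknown"])})[0])
# -> [ 2  0 -1]
```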
 

2021 ONNX Roadmap Discussion #2 20210917

In the ONNX Roadmap Discussion #2 20210917, various speakers discussed several key areas where ONNX needs improvement, including quantization and fusion friendliness, optimizing kernels for specific hardware platforms, and adding model local functions to ONNX. Other topics included feedback on end-to-end pipeline support, challenges faced by clients on different platforms, and issues with converting GRU and LSTM graphs. Some suggested solutions included providing more information for backends to execute pre-quantized graphs, improving interoperability of different frameworks, and including a namespace related to the original graph to allow for both a general and optimized solution. Additionally, speakers discussed the need for better deployment of packages for wider adoption and the potential for more converters to be developed to support multi-modal models.

  • 00:00:00 In this section, Martin from GreenWaves Technologies discusses two areas where ONNX needs improvement: quantization and fusion friendliness. For quantization, Martin suggests providing more information so that backends can execute pre-quantized graphs, as it is impossible for ONNX to follow all the different ways customers want to implement specialized quantization schemes. To aid this, Martin suggests adding min/max, standard deviation, and mean information to tensors, with additional information such as outlier statistics, per-channel information, and distribution information as possible add-ons. For fusion friendliness, Martin suggests improving interoperability between frameworks through better importing/exporting features, making it easier to identify the right converters when importing/exporting different graphs.

  • 00:05:00 In this section, the speaker discusses the current use of functions for composed operators and the difficulty of optimizing kernels for specific hardware platforms when operators are broken up. The idea of grouping exported functions under a higher-level container, possibly a function, and mapping that container to an optimized kernel on a specific backend is suggested. The speaker also suggests including a namespace related to the original graph, allowing for both a general solution and an optimized solution. The ability to import model-local functions, added in the latest ONNX release, is also mentioned.

  • 00:10:00 In this section, the speakers discuss the addition of model-local functions to ONNX, which allows converters to include a function body in the model proto as a placeholder for operators that are not defined in the ONNX standard (a minimal sketch appears after this list). However, the speakers also note that it should be a best practice for converters to label and comment on what they are exporting in a machine-readable way. They also touch on how optimization can affect naming conventions and suggest continuing the discussion either in the Slack channel or in an extra meeting. The next presentation, about ONNX profiling, is introduced.

  • 00:15:00 In this section, feedback on end-to-end pipeline support is discussed, with ONNX seen as a great fit for lightweight deployments to different operating systems that don't demand heavy ecosystem requirements. The speakers express hope for enabling operators across both ONNX and ONNX-ML to execute not just models but also data preparation stages, covering other types of data preparation operations. They suggest that a simplified or common deployment artifact or model could add value, along with saving effort and gaining consistency by focusing on low-hanging fruit around standard conversions.

  • 00:20:00 In this section, the speaker discusses some of the challenges clients face on different platforms and notes the potential value in continuing to develop and broaden the ONNX platform. They touch on the issue of siloing and the need to simplify package deployment for better adoption. The conversation also includes comments from a participant who confirms facing similar issues and proposes options such as merging Linux server ONNX packages or finding better ways to help users convert custom code into ONNX. The speaker also touches on multi-modal support and the need for an ensemble of models to be representable as a single ONNX graph. They discuss the potential need for more converters and see a general movement in the right direction.

  • 00:25:00 In this section of the ONNX Roadmap Discussion, the team discusses proxy models that showcase the kinds of things customers run in enterprise environments for non-image use cases. One team member mentions a proxy for a fraud detection model that uses open data and is a relatively simple two-layer LSTM model. The team is investigating further and trying to gather more examples of proxy models to bring forward. They also discuss issues with GRU and LSTM graphs not being converted correctly and mention that they would like to add support for all cases.

  • 00:30:00 In this section, the speakers discuss the challenges of converting GRU (gated recurrent unit) graphs into a format that can be read by the converter of a back end. They mention that there are certain cases where the breakdown already occurs in TensorFlow, but it is challenging to turn it back into GRU. They suggest using the `--custom ops` flag and making a kernel that works for it, before moving on to the idea of making it a function or preserving it in terms of semantics. They note that the best option is to explicitly denote whether the user wants it broken down or not, and that using custom ops might be the only way to do it robustly.

  • 00:35:00 In this section, the speakers discuss whether it's better to have the full function body both in ONNX and at a high level, or just a TF-based one. For them, the TF base would be sufficient, as the ONNX version could serve as proof of result along the chain. However, they caution against making ONNX TensorFlow-centric, as ONNX models should be able to come from different places. They also touch on the attractiveness of having a named subgraph with semantic meaning, thinking of it almost as an operator that needs to be defined and generated by the various front ends. Finally, they agree to hold deeper presentations to continue the discussion with more knowledgeable people.
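
A minimal sketch of the model-local functions mentioned in the 00:10:00 bullet: a Swish-style composite kept as one named, semantically meaningful unit, so a backend can either map the whole thing to a fused kernel or expand its body. The "custom" domain and all names are illustrative:

```python
# Sketch: a model-local function whose body travels inside the model.
import onnx
from onnx import TensorProto, helper

swish = helper.make_function(
    domain="custom", fname="Swish",
    inputs=["x"], outputs=["y"],
    nodes=[
        helper.make_node("Sigmoid", ["x"], ["s"]),
        helper.make_node("Mul", ["x", "s"], ["y"]),  # y = x * sigmoid(x)
    ],
    opset_imports=[helper.make_opsetid("", 18)])

graph = helper.make_graph(
    [helper.make_node("Swish", ["input"], ["output"], domain="custom")],
    "g",
    [helper.make_tensor_value_info("input", TensorProto.FLOAT, [2])],
    [helper.make_tensor_value_info("output", TensorProto.FLOAT, [2])])
model = helper.make_model(
    graph,
    opset_imports=[helper.make_opsetid("", 18),
                   helper.make_opsetid("custom", 1)],
    functions=[swish])  # the function body ships with the model
onnx.checker.check_model(model)
```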
 

2021 ONNX Roadmap Discussion #3 20210922

During the ONNX Roadmap Discussion, speakers addressed the need to fix issues with ONNX's opset conversion tooling to improve adoption of ONNX with the latest optimized stacks for certain use cases. The speakers proposed better model coverage for testing opset conversion and resolving intermediate steps that are currently missing in operator and layer tests. They also discussed metadata and federated learning infrastructure, including the need for metadata in the ONNX spec for transfer-learning annotations, and the concept of federated learning to enable privacy, efficiency, and better use of computational resources. The speakers encouraged collaboration from the community and requested feedback for further discussion and implementation of these ideas. The next session is scheduled for October 1st.

  • 00:00:00 In this section, Manuj from Intel addresses gaps in opset conversions for ONNX models that have been causing issues for many customers. The basic problem lies in opset conversion, as many customers do not keep updating the opset once they deploy a model in production. Customers face multiple issues with opset conversion, such as moving from 7 to 10 to 13 for quantization, or being unable to convert older opsets to newer ones to take advantage of better performance and accuracy. Additionally, the unit tests covering every operator or layer are not yet at the point where ISVs are satisfied, and hence most customers are still on opset 10 or 9.

  • 00:05:00 In this section, the speakers discuss the need to resolve issues with ONNX's opset conversion tooling, as it is hindering adoption of ONNX with the latest optimized stacks for certain use cases. Developers who integrate AI and ship it in their applications are providing feedback on the need to fix conversion tools and make them seamless. They share examples of issues they faced, such as missing intermediate steps and missing adapter implementations, which prevent the transition to quantized, higher-performance models. The speakers emphasize the need for better coverage and more models to be tested to ensure better adoption of ONNX (a minimal use of the version converter is sketched after this list).

  • 00:10:00 In this section of the video, the speakers discuss the need for approval of at least one failing model from a top model-creator company before further improvements to ONNX. The discussion moves on to improvements in fp16 conversion, one of the gaps between ecosystems like mobile and Windows, which has lately been fixed with Microsoft converter tools. Responsibility for conversion remains unclear, and the discussion moves on to the next presentation concerning the model zoo, where the inclusion of ONNX training operators will help cover all model categories. They propose starting with transformer or NLP training samples and moving on to more models to showcase distributed training infrastructure and techniques applicable to ONNX.

  • 00:15:00 In this section, the speakers discuss the involvement of ONNX models in training, including the importance of quantization aware training and mixed precision usage. They request the original fp32 model to better compare accuracy and showcase mixed precision usage for training with ONNX models. They prioritize contributing to transformer samples but request help from the community in contributing other popular categories. They also discuss future proposals for better reflecting mixed precision usage within a model as part of metadata. Finally, Gabe Stevens presents a deployment configuration that Intel is starting to take a look at.

  • 00:20:00 In this section, the speaker discusses the concept of distributed and federated learning and its advantages in terms of privacy, latency, efficiency, and use of computational resources. The idea is to deploy models onto a fleet of devices, where some devices have a training group that enriches the model using the data they see. The proposed modifications to ONNX would allow it to facilitate federated learning, making it more likely that developers would use ONNX. The minimal set of API additions includes a way to query a participant for the parameters of its local model, update those parameters, and notify the server of how the model has changed so it can consolidate the devices' findings into a new model.

  • 00:25:00 In this section of the video, the speaker discusses the idea of including metadata in the ONNX spec to allow for transfer learning annotations and make it easier to train a smaller dataset with a model trained on a larger dataset. However, the implementation of such a system involves multiple design decisions that need to be left to the implementers. The speaker suggests three items that could facilitate the basic infrastructure of such a system without limiting flexibility needed for application developers. They also mention the need for consistency in model version deployment across a fleet of devices and the importance of not mandating that only ONNX models are allowed to participate in a federated learning system. The speaker solicits feedback on whether the spec designers are interested in paying attention to this type of configuration of learning and whether they would be open to further discussion. Another speaker suggests trying to do this with ONNX runtime, as it supports training and some proof of concepts have been built for doing federated learning using it.

  • 00:30:00 In this section, the speaker expresses their appreciation for the tremendous effort put into the presentation and thanks the community for their questions. The goal of the presentation is to present ideas to the relevant SIG for further discussion and eventual implementations. The last session will be held on October 1st, and the speaker looks forward to continuing involvement with these ideas.
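
A minimal sketch of the opset conversion discussed above, using the converter that ships with the onnx package (paths and the target opset are illustrative; as the discussion notes, conversion fails where adapter implementations are missing):

```python
# Sketch: upgrade a model's opset with the built-in version converter.
import onnx
from onnx import version_converter

model = onnx.load("model_opset9.onnx")                 # hypothetical older model
converted = version_converter.convert_version(model, 13)
onnx.checker.check_model(converted)
onnx.save(converted, "model_opset13.onnx")
```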
 

ONNX Community Meeting 20211021 – ONNX SC Welcome, Progress, Roadmap, Release

The ONNX workshop started with an introduction, where the organizers emphasized the importance of community participation in the growth of the ONNX ecosystem. They also provided an overview of the agenda, which included updates on ONNX statistics, community presentations, and the roadmap discussions of ONNX's Steering Committee. The roadmap proposals are aimed at improving the support, robustness, and usability of the ONNX framework, and include pre-processing operators, C APIs, federated learning, and better integration of data processing and inference. The recent release of version 1.10 of the ONNX specs was also discussed, and attendees were encouraged to ask questions and participate in the ONNX Slack channel to continue the conversation.

  • 00:00:00 In this section of the workshop, the organizers provide an overview and welcome to all the attendees. They mention the vast array of products available for AI and urge attendees to check it out. The overall goals of the workshop are to get the latest updates on ONNX, its processes, roadmap, and releases, as well as learn from community participants on how ONNX is being used. They encourage attendees to share their feedback, get more involved with the ONNX Steering Committee, SIGs, and Working Groups. They provide an overview of the agenda, which includes the logistics of the ONNX Working Group, the State of the State presentation by Wenming Yi, followed by Alex, and community presentations. Finally, they present exciting updates on the statistics of ONNX, including an almost 400% increase in monthly downloads to 1.6 million per month, showing healthy growth in the ecosystem.

  • 00:05:00 In this section, the speaker discusses the progress and growth of the ONNX ecosystem, emphasizing the importance of contributions from companies in the community. The speaker mentions the Deep Java Library project from Amazon, which has built a good experience for the Java community and has seen a lot of growth. Several commercial companies such as IBM, AMD, and Sony are providing support for the ecosystem and helping ONNX become the industry standard. The speaker also talks about the governance of the community and the new members of the steering committee, and invites participation in the roadmap discussions, the Slack channel, Q&A on GitHub, and contributions to documentation and blogs. The next speaker follows up with the roadmap, which is crucial to moving in the right direction and to lowering ONNX models so they run efficiently on CPUs and accelerators.

  • 00:10:00 In this section, the speaker discusses the roadmap discussions of ONNX's Steering Committee, which took place over the summer. The proposals received from different members are divided into four groups of three proposals each, and each group is presented to the respective SIGs for approval and implementation. The proposals range from pre-processing operators, C APIs, model checking, and support for emitting models from other languages, to adding metadata information in tensors, better integrating data processing and inference, defining concepts for federated learning, and defining metadata processing properties to improve the integrity of data and models. The goal is better support, robustness, and usability of the ONNX framework for all users.

  • 00:15:00 In this section, the speaker discusses the recent release of version 1.10 of the ONNX spec and thanks the contributors for their hard work. Further discussion and details about the latest changes will be available on the onnx.ai website. The speaker invites the audience to post questions in the chat or in the ONNX general Slack channel to continue the conversation.