With recent breakthroughs in deep learning and related technologies, Machine Learning (ML)
algorithms have improved drastically in accuracy, breadth of application, and performance. While ML is typically thought of as a server-side technology, the inference stage of a machine learning model can run on device as well. Developing a
machine learning application usually involves two stages:
- The developer first trains the model by creating a skeleton framework and then iterating on it with a large dataset
- The developer then ports the model to a production environment so that it can infer insights from user input
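The two stages above can be sketched in miniature. The toy linear model below (plain TypeScript, no ML library) is purely illustrative: "training" fits a single weight with gradient descent, and "deployment" freezes that weight behind a cheap inference function.

```typescript
// Stage 1: "training" — fit a tiny linear model y ≈ w * x with gradient
// descent. (Illustrative only: real training runs in the cloud on large
// datasets and far bigger models.)
const xs = [1, 2, 3, 4];
const ys = [2, 4, 6, 8]; // ground truth: y = 2x

let w = 0;
const lr = 0.01;
for (let epoch = 0; epoch < 500; epoch++) {
  let grad = 0;
  for (let i = 0; i < xs.length; i++) {
    grad += 2 * (w * xs[i] - ys[i]) * xs[i]; // d/dw of squared error
  }
  w -= lr * (grad / xs.length);
}

// Stage 2: "deployment" — the trained weight is frozen and shipped;
// inference is just a cheap forward pass over user input.
const infer = (x: number): number => w * x;

console.log(w.toFixed(2), infer(10).toFixed(1)); // → "2.00 20.0"
```

The asymmetry this illustrates is exactly why the split matters: the loop in stage 1 is data- and compute-hungry, while stage 2 is a single multiply that a phone or browser can run locally.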
Though training typically takes place in the cloud, because it requires a significant
amount of data and computing power, inference can take place either in the cloud or on the device. Running inference on the
device has a number of appealing properties, such as the performance boost
of edge computing, resilience to poor or
absent network connectivity, and stronger security and privacy protection.
Although platforms for native applications have all shipped APIs to support machine learning
inference on device, similar functionality has been missing on the web platform. To fill the gap, we could provide an API set including:
- WebAssembly with GPU and multi-thread support
- A WebML (Web Machine Learning) API with a pre-defined set of mathematical functions that the platform can optimize for
- A WebNN (Web Neural Network) API that provides a high-level abstraction to run neural networks efficiently.
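To make the third item concrete, here is a sketch of what a high-level graph-building API could feel like. Every name in it (`GraphBuilder`, `constant`, `matmul`, `relu`) is a hypothetical stand-in for illustration, not the proposed API surface; the point is that the developer describes the network as a graph and the platform is free to execute it on the best available backend.

```typescript
// Hypothetical sketch of a graph-style neural network API. Here each node
// is just a deferred computation over 2-D arrays; a real implementation
// would compile the graph to CPU, GPU, or a dedicated accelerator.
type Tensor = number[][]; // 2-D only, for brevity

class GraphBuilder {
  constant(value: Tensor): () => Tensor {
    return () => value;
  }
  matmul(a: () => Tensor, b: () => Tensor): () => Tensor {
    return () => {
      const [A, B] = [a(), b()];
      return A.map((row) =>
        B[0].map((_, j) => row.reduce((sum, v, k) => sum + v * B[k][j], 0))
      );
    };
  }
  relu(x: () => Tensor): () => Tensor {
    return () => x().map((row) => row.map((v) => Math.max(0, v)));
  }
}

// One dense layer with a ReLU activation: y = relu(x · W)
const builder = new GraphBuilder();
const input = builder.constant([[1, -2]]);
const weights = builder.constant([[1, 0], [0, 1]]);
const output = builder.relu(builder.matmul(input, weights));

console.log(output()); // → [ [ 1, 0 ] ]
```

Because the graph is described declaratively rather than executed eagerly, the platform can fuse operations, pick precision, and schedule work on specialized hardware without the web developer changing any code.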
Please take a look at the explainer for more detailed information, such as use cases, the problem statement, the proposal, and related research. Feedback is welcome; I would love to hear what you think!