Edge ML: Automated Machine Learning ... without brute force!

 

 

White Paper

 

 

1 - Introduction

Machine Learning (ML) is a scientific and technical field that aims to teach a computer a behaviour by exploiting data. These techniques are generic and can be applied in various application areas (e.g. churn, fraud detection, marketing, medical diagnosis).

Edge ML is an Auto ML library which dramatically reduces the cost of Machine Learning projects:

  • Time saving: automation of manual tasks and a very fast algorithm
  • Standard hardware: no parameter to be optimized by brute force
  • Standard skills: methodological safeguard which avoids overfitting

 

 

2 - Edge ML is a disruptive tool

 


Boost your Data Scientists!

 

 

  • Unleash their creativity
  • Finish your projects in no time
  • Benefit from methodological safeguards


Use common hardware 

 

 

  • Push the limits of your hardware
  • Quickly process large amounts of data
  • Without optimizing your models by brute force!


Earn your customers' trust

 

 

  • More time to understand business indicators
  • Provide accurate and interpretable models
  • Keep them robust in production


Focus your efforts

 

 

  • Quickly get a baseline model
  • Detect impossible projects (fail fast)
  • Continuously evaluate the contribution of manual work

 

 

3 - Why do other approaches resort to brute force?

All Machine Learning algorithms aim to adjust the inner parameters of a model by using the data. Once learned, a model is able to predict the output variable (target) from the input variables. Overfitting is a trap to avoid: the model learns the data by rote, which makes it unable to properly predict the target on new data.

 

 

In practice, most Machine Learning algorithms are unable to adjust all the parameters of the model by themselves. The parameters whose role is to fight overfitting (often called hyperparameters) are optimized by brute force. This optimization involves two steps: A) grid-search; B) cross-validation.

 

A - grid-search

This step aims to find the best combination of values for the parameters to be adjusted. For each parameter, a set of values to be tested is defined beforehand. The grid consists of all combinations of values.
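As a minimal sketch of how such a grid can be built, the Python snippet below enumerates every combination of candidate values. The parameter names (`max_depth`, `learning_rate`) and their values are hypothetical and serve only as an illustration; they do not come from any specific library.

```python
from itertools import product

# Hypothetical parameters and candidate values (illustrative assumption).
candidate_values = {
    "max_depth": [2, 4, 6, 8, 10],
    "learning_rate": [0.01, 0.03, 0.1, 0.3, 1.0],
}


def build_grid(candidates):
    """Return every combination of candidate values as a list of dicts."""
    names = list(candidates)
    value_lists = (candidates[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


grid = build_grid(candidate_values)
print(len(grid))  # 5 values x 5 values = 25 combinations
print(grid[0])    # first combination of the grid
```

Each entry of `grid` is one combination that then has to be evaluated, which is where the cost of the approach comes from.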

 

 

B - cross-validation

Cross-validation evaluates the quality of the model for a given combination of values. In order to avoid overfitting, the model must be evaluated on data that has not been used during the learning process. To do this, the data is cut into K subsets (for example K = 10). Then, the Machine Learning algorithm is repeated K times, each time leaving aside one of the K subsets to evaluate the model. Finally, the average evaluation is used to characterize the combination of parameters.
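The procedure described above can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the implementation of any particular tool: the folds are taken in order without shuffling, and the toy learner `fit_majority` (a hypothetical helper) simply predicts the most frequent training label.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(fit, score, X, y, k=10):
    """Train k times, score each model on its held-out fold, return the mean."""
    scores = []
    for train, test in k_fold_splits(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(score(model, [X[i] for i in test], [y[i] for i in test]))
    return sum(scores) / len(scores)


def fit_majority(X_train, y_train):
    """Toy learner (assumption): the 'model' is the most frequent label."""
    return max(set(y_train), key=y_train.count)


def accuracy(model, X_test, y_test):
    """Fraction of held-out labels equal to the predicted label."""
    return sum(1 for label in y_test if label == model) / len(y_test)


X = list(range(20))
y = [0] * 14 + [1] * 6
mean_accuracy = cross_validate(fit_majority, accuracy, X, y, k=10)
print(mean_accuracy)  # 0.7
```

The learning function is called K times for a single combination of parameter values, so the cost of one evaluation is K times the cost of one training run.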

 

 

Let's take a concrete example in which we define 5 values to be tested for each parameter and cut the data into 10 subsets. To optimize only 2 parameters, a standard Machine Learning algorithm is repeated 5 × 5 × 10 = 250 times! This gets worse as the number of parameters increases (e.g. for 4 parameters, the algorithm is repeated 6250 times). As a result, parameter optimization requires ever greater hardware resources as the data size and the complexity of the models increase.
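The arithmetic behind these figures can be checked with a short Python sketch. The function name `training_runs` is hypothetical; it assumes the same number of candidate values for every parameter and one training run per fold.

```python
def training_runs(values_per_parameter, n_parameters, k_folds):
    """Number of training runs for a full grid-search with K-fold cross-validation."""
    grid_size = values_per_parameter ** n_parameters
    return grid_size * k_folds


print(training_runs(5, 2, 10))  # 250
print(training_runs(5, 4, 10))  # 6250
```

The grid size grows exponentially with the number of parameters, while the number of folds only multiplies it by a constant.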

 

Edge ML stands out by having no parameter to adjust: the learning algorithm is executed only once. Edge ML is based on the MODL approach, which avoids overfitting problems altogether through a clever mathematical formalization.

 

 

4 - The MODL approach

MODL (Minimum Optimized Description Length) was invented by Marc Boullé, a Machine Learning researcher at Orange Labs. In summary, the MODL approach turns Machine Learning into a model selection problem. The objective is to select the most probable model given the data. This Bayesian approach naturally manages the "performance vs. robustness" compromise by finding a trade-off between: i) simple models that are very robust but not informative enough; ii) complex models that accurately describe the data but are not robust enough. In the end, the MODL approach provides accurate and very robust models.
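Selecting "the most probable model given the data" can be written in generic Bayesian terms as follows. The notation ($M$ for a candidate model, $D$ for the data) is introduced here for illustration only; the specific prior and likelihood used by MODL are not given in this paper.

```latex
M^{*} = \arg\max_{M} P(M \mid D)
      = \arg\max_{M} P(M)\, P(D \mid M)
      = \arg\min_{M} \bigl[ -\log P(M) - \log P(D \mid M) \bigr]
```

The second equality follows from Bayes' rule, since $P(D)$ does not depend on $M$. In the last form, the term $-\log P(M)$ grows with model complexity when the prior favours simple models, while $-\log P(D \mid M)$ shrinks when the model describes the data accurately, which is the trade-off described above.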

 

 

 

 

The implementation of MODL is technically hard and requires the programming of specific algorithms, which must be optimized in depth to handle large amounts of data. Edge ML is an original implementation of MODL which is optimized and parallelized.

 

 

5 - Conclusion

Edge ML is a fast and reliable approach that efficiently leverages hardware resources. Models are learned automatically, which minimizes manual tasks.

Edge ML offers a simplified use of Machine Learning. Users do not need to have advanced mathematical skills, because the robustness of the MODL approach plays the role of a methodological safeguard.

The advanced features of Edge ML enable data scientists to accelerate their projects.

 

Edge ML is distributed as shareware on this website. Do not hesitate to test it :-)
