11 min readOct 15, 2021
H2O in practice: a Data Scientist feedback

Automated machine learning (AutoML) platforms are gaining popularity and becoming a new important tool in the data scientists’ toolbox. A few months ago, I introduced H2O, an open-source platform for AutoML, with tutorial. Recently, I had a great opportunity to test it in practice. I had been working with our client, EDF Lab, together with ENEDIS. Both companies are major players in the French electricity market, ENEDIS being the network maintainer and EDF the major electricity supplier. The use-case covered the preventive maintenance of an underground low-tension electrical network. More specifically, the question was how to detect the segments of cables (feeders) that need to be replaced to prevent power cuts. The use-case was extensively studied and published previously.

Several models were developed beforehand. Therefore, the purpose of the mission was not the modeling itself but rather to estimate the added value of AutoML. Will it shorten the time of modeling and/or produce better models? We started with the comparison of existing frameworks. A great benchmark can be found here. Probably the most restrictive criterion for the selection is the open source nature of the framework. Namely, many AutoML frameworks are proprietary. H2O was estimated to be the most suitable for the task. Our task was to understand its functioning, its entry barrier, its performance on real-world data and to compare the model quality built with AutoML with the in-house reference…


Open Source consulting - Big Data, Data Science, Node.js