Data Iceberg Model for Machine Learning

bigThinking Data Iceberg Model

One of the pitfalls in developing production-ready machine learning solutions is failing to identify the appropriate data assets. When evaluating the data assets for your project, use the Data Iceberg Model to determine the underlying (i.e., not visible) structures that triggered the creation of the dataset.

The Iceberg Model is a useful tool for discovering the underlying patterns, structures, and behaviors that cause an observable event. Approximately 90% of an iceberg is underwater, and that submerged 90% is what produces the “event” observed in the 10% above the surface.

The following Data Iceberg Model can be used to evaluate the quality and limitations of the data used to train your machine learning models.

bigThinking Data Iceberg Model (bT Data Iceberg Model PDF Version)
Related Posts

Characteristics of Machine Learning solutions

It is estimated that 87% of data science projects never reach production. One of the pitfalls in developing a production-ready machine learning solution is failing to determine whether machine learning is an appropriate tool for solving the problem. Not every problem should be solved with machine learning.