Data Iceberg Model for Machine Learning

bigThinking Data Iceberg Model

One of the pitfalls in developing production-ready machine learning solutions is failing to identify the appropriate data assets. When evaluating the data assets for your project, use the Data Iceberg Model to uncover the underlying (i.e., not visible) structures that triggered the creation of the dataset.

The Iceberg Model is a useful tool for discovering the underlying patterns, structures, and behaviors that cause an observable event. Approximately 90% of an iceberg is underwater, and that 90% below the surface is what creates the “event” seen in the 10% above the surface.

The following Data Iceberg Model can be used to evaluate the quality and limitations of the data used to train your machine learning models.

bigThinking Data Iceberg Model (bT Data Iceberg Model PDF Version)
Related Posts

Systems Thinking Resources

Systems thinking is a discipline for understanding systems in order to produce a desired effect. It provides methods for "seeing wholes and a framework for seeing interrelationships rather than things, for seeing patterns of change rather than static snapshots." The intent is to increase understanding and identify the points of “highest leverage”, the places in the system where small changes can make a big impact.

Characteristics of Machine Learning solutions

It is estimated that 87% of data science projects never reach production. One of the pitfalls in developing a production-ready machine learning solution is failing to determine whether machine learning is an appropriate tool for solving the problem. Not every problem should be solved with machine learning.