Data Iceberg Model for Machine Learning

Identifying the Appropriate Data Assets

One of the pitfalls in developing production-ready machine learning solutions is the failure to identify the appropriate data assets. When evaluating the data assets for your project, use the Data Iceberg Model to determine the underlying (i.e., not visible) structures that led to the creation of the dataset.

The Iceberg Model is a useful tool for discovering the underlying patterns, structures, and behaviors that cause an observable event. Approximately 90% of an iceberg is underwater. The 90% that lies below the surface is what creates the “event” seen in the 10% above the surface.

The following Data Iceberg Model can be used to evaluate the quality and limitations of the data used to train your machine learning models.

bigThinking Data Iceberg Model (bT Data Iceberg Model PDF Version)

Kishau Rogers is the editor and founder of The bigThinking Project. The bigThinking Project is a resource center and collaborative innovation project that promotes the principles of systems thinking. Our mission is to empower the next generation of innovators to think bigger, to think better, and to create solutions that make a significant impact in the areas that matter. Kishau Rogers is an award-winning entrepreneur with over twenty years of experience in the computer science industry. She is a serial entrepreneur, having founded and co-founded companies such as Websmith Group, TimeStudy, PeerLoc Inc., and Websmith Studio.
