BigThinking.io

Resources for Solving Problems with Data : Getting Started

Fundamental Skills To Develop

  1. Python – A general-purpose, object-oriented programming language that is extensible through a wide range of libraries.
    • 66% of data scientists use Python daily, and 84% of them use it as their main language
    • Top Python libraries: TensorFlow (ML solutions), Keras (deep learning), scikit-learn (ML), NumPy (numerical computing for data analysis/ML), PyTorch (deep learning)
  2. R – An open-source programming language used for statistical analysis, with robust visualization libraries (ggplot2, plotly)
    • 47% of data scientists use R
    • 70% of data miners use R
    • Good for: statistical analysis and modeling, analyzing structured and unstructured data
  3. Structured Query Language (SQL) – The standard language for querying and managing data in relational databases, which support both analytical queries and transactions.
    • 32% of data scientists use SQL
    • Good for: data management, querying, transactional workloads
  4. Data Visualization – for exploration, storytelling and communicating quick insights.
  5. Statistics – To support a general understanding of probability, distributions, sampling, hypothesis testing, confidence intervals, and variables
  6. Spreadsheets – The all-purpose tool for reviewing data and running quick calculations.
  7. Algebra & Calculus – To support a general understanding of how algorithms work under the hood.
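
To make a few of these skills concrete, here is a minimal sketch that combines Python, SQL, and basic statistics using only the Python standard library. The order data, the `orders` table, and the helper names are hypothetical and invented for illustration. The confidence interval uses a normal approximation (z = 1.96), which is rough for a sample this small; a t-distribution would be the more careful choice.

```python
import sqlite3
import statistics
from math import sqrt

# Hypothetical sample data (region, order total), invented for illustration.
ROWS = [
    ("north", 120.0),
    ("north", 80.0),
    ("south", 200.0),
    ("south", 100.0),
    ("south", 150.0),
]


def load_orders(rows):
    """Create an in-memory SQLite database and insert the rows in one transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, total REAL)")
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def totals_by_region(conn):
    """Use a SQL aggregate to sum order totals per region."""
    cur = conn.execute(
        "SELECT region, SUM(total) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


def mean_with_ci(values, z=1.96):
    """Return (mean, low, high) using a normal-approximation confidence interval."""
    m = statistics.mean(values)
    standard_error = statistics.stdev(values) / sqrt(len(values))
    return m, m - z * standard_error, m + z * standard_error


conn = load_orders(ROWS)
region_totals = totals_by_region(conn)
all_totals = [row[0] for row in conn.execute("SELECT total FROM orders")]
mean, low, high = mean_with_ci(all_totals)

print(region_totals)
print(f"mean={mean:.1f}, approx. 95% CI=({low:.1f}, {high:.1f})")
```

The SQL handles storage and aggregation, while Python handles the statistics; in practice libraries such as NumPy or scikit-learn would take over as the analysis grows.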

Your Development Space : IDEs & Dev Tools

Tools for writing software.

  • Jupyter notebooks – Provides an interactive programming interface in a notebook environment. Good for: rapid prototyping, visualization.
  • PyCharm – Python IDE that supports single-file or multi-file, multi-language projects. Good for: writing production code.
  • VS Code for Python – Microsoft's Visual Studio Code editor with Python support added through an extension
  • GitHub – hosting for Git repositories; version control, tracking and recording code changes
  • RStudio – IDE for R.

Safe Spaces to Practice : Communities & Open Datasets

Finding open communities & data sources for practice projects.

  • Google Dataset Search – over 25 million datasets indexed
  • Data.gov – Open data catalog provided by the U.S. Government
  • Kaggle – Real-world datasets provided to the Kaggle community for collaborative problem solving
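
Many datasets from sources like these are distributed as CSV files. As a rough sketch of a first look at one, the snippet below counts rows and missing values per column using only the Python standard library. The inline `SAMPLE_CSV` text and the `summarize` helper are hypothetical stand-ins for a file you have downloaded yourself.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a CSV file downloaded from an open-data source.
SAMPLE_CSV = """city,year,population
Springfield,2020,1000
Springfield,2021,
Shelbyville,2020,800
"""


def summarize(csv_text):
    """Return column names, row count, and per-column counts of missing values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if column is None:
                continue  # extra fields beyond the header; ignored in this sketch
            if value is None or value.strip() == "":
                missing[column] += 1
    return {
        "columns": reader.fieldnames,
        "row_count": len(rows),
        "missing": dict(missing),
    }


summary = summarize(SAMPLE_CSV)
print(summary)
```

For a real file, replacing `io.StringIO(csv_text)` with an opened file handle would give the same kind of summary.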

Thinking About Solving Problems With Data

Here are more resources and considerations for people who want to solve problems with data.
