Python

Automated Exploratory Data Analysis

Exploratory data analysis (EDA) is a critical first step in building a machine learning model. Understanding your data better makes outlier discovery, feature engineering, and ultimately modeling more effective. Some parts of EDA, such as reviewing feature histograms and missing values, can be automated. This article walks through an open source library I created that runs some basic automated EDA processes.
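
To make the idea concrete, here is a minimal sketch of the two checks mentioned above, missing-value counts and a feature histogram, using only the Python standard library. It does not use the library from the article; the sample records and the helper names `missing_counts` and `histogram` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 58000},
    {"age": 38, "income": None},
]


def missing_counts(records):
    """Count missing (None) values per column."""
    counts = Counter()
    for record in records:
        for column, value in record.items():
            # Adding a bool adds 1 for a missing value and 0 otherwise.
            counts[column] += value is None
    return dict(counts)


def histogram(values, bin_width):
    """Bucket non-missing numbers into fixed-width bins keyed by lower edge."""
    bins = Counter((v // bin_width) * bin_width for v in values if v is not None)
    return dict(sorted(bins.items()))


missing = missing_counts(rows)
age_hist = histogram([r["age"] for r in rows], bin_width=10)

print(missing)
print(age_hist)
```

On a real dataset the same two summaries would typically be computed per feature and plotted rather than printed.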

Python

Favorite Places to Find Datasets

Interesting datasets can make personal machine learning projects more fun and exciting. Here are some of my favorite places to go looking for datasets to hone my data science and ML skills.

Python

Detecting Outliers Using Python

Detecting outliers is an important part of exploring your data before building any type of machine learning model. Common causes of outliers include data collection issues, measurement errors, and data input errors. Outlier detection is one step in checking data points for potential errors that may need to be removed prior to model training. Removing them helps keep a machine learning model from learning incorrect relationships, which could otherwise lower its accuracy.
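
As a rough illustration, the sketch below flags outliers with the interquartile range (IQR) rule, one common approach; the article itself may cover other methods. The `readings` data, the function name `iqr_outliers`, and the multiplier `k=1.5` are assumptions chosen for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(readings)

print(outliers)
```

A flagged value is only a candidate for removal; whether it is a genuine error or a rare but valid observation still needs a manual check.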

Python

Predicting Wine Prices with Hyperparameter Tuning

Hyperparameter tuning can improve on the default configurations of many popular machine learning libraries. This article walks through how to apply Optuna, a hyperparameter optimization library, to predict wine prices!
