Python

Classification with Imbalanced Data

Building classification models on data with heavily imbalanced classes can be difficult. Techniques such as oversampling, undersampling, combined resampling, and custom filtering can improve model performance, particularly on the minority class.
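As a rough illustration of the oversampling idea, the sketch below duplicates minority-class rows at random until every class matches the largest one. The function name `random_oversample` and the toy data are made up for this example; it uses only the standard library and is not the article's own implementation.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate rows of smaller classes at random until all classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to make up the shortfall (zero for the largest class)
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * (target - n))
    return out_rows, out_labels


# Hypothetical toy data: five rows of class 0, one row of class 1
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution.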

Python

A Straightforward Guide to A/B Testing

A/B testing can be extremely useful during experimentation, adding statistical rigor to situations where you compare one option against another. It is one step that helps guard against drawing faulty conclusions.
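One common way to add that rigor when comparing two conversion rates is a two-proportion z-test. The sketch below is a minimal standard-library version; the function name and the visitor and conversion counts are hypothetical, and the article itself may use a different test.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: variant A converts 200 of 2000 visitors, variant B 250 of 2000
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely under the assumption of equal rates; the significance threshold should be chosen before the experiment runs.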

Python

Preprocessing Text Data for Machine Learning

Unstructured text data needs its own preprocessing steps before it can be used for machine learning. This article walks through some of those steps, including tokenization, stopword removal, punctuation removal, lemmatization, stemming, and vectorization.
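A few of these steps can be sketched with the standard library alone. The example below covers lowercasing, tokenization, punctuation removal, stopword removal, and a simple bag-of-words vectorization; the stopword list and function names are illustrative assumptions, and lemmatization and stemming are left out because they usually rely on an NLP library.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are much longer
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}


def preprocess(text):
    """Lowercase, tokenize on word characters (dropping punctuation), remove stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def vectorize(docs):
    """Build a sorted vocabulary and one term-count vector per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


docs = ["The cat sat in the hat.", "A cat is a cat!"]
vocab, vectors = vectorize(docs)
print(vocab)
print(vectors)
```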

Python

Filling Gaps in Time Series Data

Time series data does not always come perfectly clean: some days may be missing entirely or contain missing values. Many machine learning models require a series without gaps, so filling missing values becomes part of the data analysis and cleaning process. This article walks through how to identify and fill those gaps using the pandas resample method.
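A minimal sketch of that workflow, assuming a small daily series with two missing days: resampling to a daily frequency exposes the gaps as NaN, which can then be filled. The sample values are made up, and forward fill and linear interpolation are just two of several reasonable fill strategies.

```python
import pandas as pd

# Hypothetical series with no observations for 2024-01-03 and 2024-01-04
s = pd.Series(
    [10.0, 12.0, 18.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
)

# Resample to a regular daily index; days without data become NaN
daily = s.resample("D").asfreq()

# Identify the gaps
missing_days = daily[daily.isna()].index

# Two ways to fill them
filled_ffill = daily.ffill()                         # carry the last known value forward
filled_interp = daily.interpolate(method="linear")   # straight line between known points

print(missing_days)
print(filled_interp)
```

Which fill method is appropriate depends on the data: forward fill suits values that persist until changed, while interpolation suits quantities that vary smoothly.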
