My contribution to the tidymodels ecosystem - implementing supervised discretization step with XGBoost backend

28 May 2020 in predictive modelling

The tidymodels ecosystem already offers a lot of cool features that make data science and predictive modelling easy and effective, but at times I missed one in particular: automated, supervised discretization of numeric variables as a preprocessing step. In this blog post I’d like to present a new step that I implemented with Max Kuhn in the embed package, which recently became officially available on CRAN!

step_shadow_missing - implementing a custom {recipes} step to account for missing data patterns

17 November 2019 in predictive modelling

Have you always wanted to seamlessly account for missing data patterns when modelling data in R? In this blog post I will provide you with a ready-to-use, custom recipes step that lets you incorporate this technique easily and quickly into all your machine learning projects.

Testing the tune package from tidymodels - analysing the relationship between the upsampling ratio and model performance

11 October 2019 in predictive modelling

Have you ever found yourself dealing with an imbalanced classification problem without being quite sure how much upsampling to apply? Or what exactly the impact of correcting the imbalance is on model performance? In this post I will explore the relationship between the upsampling ratio and model performance, using the brand new tidymodels tune package.

Caret vs. tidymodels - comparing the old and new

6 August 2019 in predictive modelling

In this post I will compare caret, the most popular ML framework available for R to date (by number of monthly downloads from GitHub), with its successor packages, which are written by the same author (Max Kuhn) and wrapped together in the so-called tidymodels framework.

Konrad Semsch