k-fold target encoding

Table of Contents

bayesian target encoding but split the training set to k equal size parts, and encode each part’s feature with all other parts target values.

So for

name favorite color height net worth
james red 1.7 5000
josh blue 1.8 4000
johnathan red 1.7 7000
joe blue 1.8 6000
joel blue 1.8 5000
johnas red 1.7 5000

with k = 3,

name favorite color height net worth
james red 1.7 5000
josh blue 1.8 4000
johnathan red 1.7 7000
joe blue 1.8 6000
joel blue 1.8 5000
johnas red 1.7 5000

James’ red value is encoded with only

name favorite color height net worth
       
       
johnathan red 1.7 7000
joe blue 1.8 6000
joel blue 1.8 5000
johnas red 1.7 5000

And joe’s blue with only

name favorite color height net worth
james red 1.7 5000
josh blue 1.8 4000
       
       
joel blue 1.8 5000
johnas red 1.7 5000

1. refrencde

statquest. k=5 works?

Backlinks

leave-one-out target encoding

k-fold target encoding in which each fold contains only 1 row, so that row’s feature’s encoding is calculated with bayesian target encoding for that feature of all other rows.

Author: Linfeng He

Created: 2024-04-03 Wed 23:23