discrete-valued feature

Variables whose set of possible values has a one-to-one correspondence with the natural numbers (or a subset of them).

Variables that have a finite (or countably infinite) set of possible values.

For example, a categorical feature like color with the set of options {red, yellow, blue}, a boolean feature like “love troll 2” with options {yes, no}, or the number of days with options 1, 2, 3, 4, and so on.

Backlinks

target encoding

Encode a discrete-valued feature using the target value we are trying to predict.
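
One common way to do this is to replace each value with the mean of the target over the rows that share that value. The note does not say which variant is intended, so the following is only a minimal sketch under that assumption. The function name `target_encode` and the two extra rows of data are hypothetical, and net worth is assumed to be the target.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Replace each category with the mean target observed for that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # Mean target per category.
    means = {category: sums[category] / counts[category] for category in sums}
    return [means[category] for category in categories], means


# Hypothetical data: favorite color is the feature, net worth is the target.
colors = ["red", "blue", "red", "blue"]
net_worth = [5000, 4000, 7000, 2000]

encoded, mapping = target_encode(colors, net_worth)
print(mapping)   # mean net worth per color
print(encoded)   # the color column rewritten as those means
```

In practice the means would usually be computed on training rows only, since they are derived from the target itself.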

one-hot vector encoding

To represent a discrete-valued feature, one can create one feature per value and put 1 in it if the discrete-valued feature takes that value and 0 otherwise, e.g.

name	favorite color	height	net worth
james	red	1.7	5000
josh	blue	1.8	4000

name	favorite red	favorite blue	height	net worth
james	1	0	1.7	5000
josh	0	1	1.8	4000
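
The transformation between the two tables above can be sketched in plain Python. This is an illustrative sketch, not a library API: the helper `one_hot_encode` is a hypothetical name, and columns are ordered by first appearance so that the result lines up with the table.

```python
def one_hot_encode(values):
    """Return the distinct values (in order of first appearance) and one 0/1 row per input."""
    categories = list(dict.fromkeys(values))  # de-duplicate, keep first-seen order
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# The favorite color column for james and josh.
favorite_color = ["red", "blue"]

columns, rows = one_hot_encode(favorite_color)
print(columns)  # one new feature per distinct color
print(rows)     # james and josh as 0/1 indicator rows
```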

label encoding

Map each value of the discrete-valued feature to a natural number, e.g.

name	favorite color	height	net worth
james	red	1.7	5000
josh	blue	1.8	4000

(red -> 1, blue -> 2)

name	favorite color	height	net worth
james	1	1.7	5000
josh	2	1.8	4000
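
The same mapping can be sketched in a few lines of Python. The helper `label_encode` is a hypothetical name, and numbering values from 1 in order of first appearance is an assumption chosen to reproduce the mapping above.

```python
def label_encode(values, start=1):
    """Assign each distinct value a natural number, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = start + len(mapping)
    return [mapping[value] for value in values], mapping


# The favorite color column for james and josh.
favorite_color = ["red", "blue"]

encoded, mapping = label_encode(favorite_color)
print(mapping)  # which number each color received
print(encoded)  # the encoded column
```

Note that the numbers introduce an ordering (1 < 2) that the original colors do not have, which some models may treat as meaningful.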

feature

In data science and machine learning, a feature is a measurable property of a phenomenon.

In raw data, it normally refers to a single column of the dataset, as in the following:

name	favorite color	height	net worth
james	red	1.7	5000
josh	blue	1.8	4000

In this dataset, favorite color is a feature, and height is another one. They both describe some measurable property of people like james and josh.

favorite color would be referred to as a discrete-valued feature, while height is a continuous feature, and the whole row

james	red	1.7	5000

is a feature vector.