discrete-valued feature

Variables whose set of possible values can be put in one-to-one correspondence with a subset of the natural numbers.

Variables that have a finite (or countably infinite) set of possible values.

For example, a categorical feature such as color with options {red, yellow, blue}, a boolean feature such as “love troll 2” with options {yes, no}, or a number of days with options 1, 2, 3, 4, ....

Backlinks

one-hot vector encoding

To represent a discrete-valued feature, one can create one new feature per possible value, and set it to 1 if the discrete-valued feature takes that value and 0 otherwise, e.g.

name   favorite color  height  net worth
james  red             1.7     5000
josh   blue            1.8     4000

to

name   favorite red  favorite blue  height  net worth
james  1             0              1.7     5000
josh   0             1              1.8     4000
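
The transformation above can be sketched in plain Python. This is a minimal illustration rather than any library's API: the helper name `one_hot_encode`, its `prefix` parameter, and the one-dict-per-row layout are all assumptions made for the example.

```python
def one_hot_encode(rows, column, prefix):
    """Replace `column` with one 0/1 column per distinct value (hypothetical helper)."""
    # Collect the distinct values in first-seen order so the new columns are stable.
    values = []
    for row in rows:
        if row[column] not in values:
            values.append(row[column])

    encoded = []
    for row in rows:
        new_row = {}
        for key, val in row.items():
            if key == column:
                # Expand the discrete-valued column in place into 0/1 columns.
                for v in values:
                    new_row[f"{prefix} {v}"] = 1 if val == v else 0
            else:
                new_row[key] = val
        encoded.append(new_row)
    return encoded


people = [
    {"name": "james", "favorite color": "red", "height": 1.7, "net worth": 5000},
    {"name": "josh", "favorite color": "blue", "height": 1.8, "net worth": 4000},
]

encoded = one_hot_encode(people, "favorite color", prefix="favorite")
for row in encoded:
    print(row)
```

Each encoded row has exactly one 1 among the new columns, because every person has exactly one favorite color.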

label encoding

Map each value of the discrete-valued feature to a natural number, e.g.

name   favorite color  height  net worth
james  red             1.7     5000
josh   blue            1.8     4000

(red -> 1, blue -> 2)

name   favorite color  height  net worth
james  1               1.7     5000
josh   2               1.8     4000
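
The mapping above can be sketched in plain Python. This is a minimal illustration: the helper name `label_encode` is hypothetical, and assigning codes in first-seen order starting at 1 is an assumption chosen to match the example (red -> 1, blue -> 2).

```python
def label_encode(rows, column):
    """Replace each value in `column` with a natural number (hypothetical helper)."""
    mapping = {}
    encoded = []
    for row in rows:
        value = row[column]
        if value not in mapping:
            # Assign the next unused code; codes start at 1 as in the example.
            mapping[value] = len(mapping) + 1
        # Copy the row, overwriting only the encoded column.
        encoded.append({**row, column: mapping[value]})
    return encoded, mapping


people = [
    {"name": "james", "favorite color": "red", "height": 1.7, "net worth": 5000},
    {"name": "josh", "favorite color": "blue", "height": 1.8, "net worth": 4000},
]

encoded, mapping = label_encode(people, "favorite color")
print(mapping)
for row in encoded:
    print(row)
```

The integers are only labels: the fact that blue gets a larger number than red is an artifact of the order in which the values were seen, not a property of the colors.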

feature

In data science and machine learning, a feature is a measurable property of a phenomenon.

In raw data, a feature normally corresponds to a single column of the data set, as in the following:

name   favorite color  height  net worth
james  red             1.7     5000
josh   blue            1.8     4000

In this dataset, favorite color is a feature, and height is another. Both describe a measurable property of people like james and josh.

favorite color would be referred to as a discrete-valued feature, height as a continuous feature, and the whole row

james  red  1.7  5000

as a feature vector.
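
The column/row distinction can be sketched in plain Python. This is a minimal illustration that assumes the data set is stored as a list of rows; the variable names are chosen for the example only.

```python
columns = ["name", "favorite color", "height", "net worth"]
data = [
    ["james", "red", 1.7, 5000],
    ["josh", "blue", 1.8, 4000],
]

# A feature is one column: the same property measured for every row.
height_index = columns.index("height")
height = [row[height_index] for row in data]

# A feature vector is one row: every property measured for a single person.
james = data[0]

print(height)
print(james)
```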

Author: Linfeng He

Created: 2024-04-03 Wed 23:18