DSC441: Fundamentals of Data Science
Assignment 2
Jesse Johnson 10/12/2020
1.
Age + %Fat
a.
Age and percent fat are not skewed; both look approximately normal. There are two
outliers in percent fat, 8.8 and 10.5, and both come from younger people. The boxplot
of age has a mean of 50, and the boxplot of percent fat has a mean of 31.

Boxplot of age and percent fat combined: when both are drawn in one boxplot, age and
percent fat are skewed to the left.
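
As a rough illustration of how a boxplot flags points as outliers, the sketch below applies the common 1.5 x IQR rule. It is an assumption that this is the rule behind the plots described above, and the function name and sample values are hypothetical, not the assignment data.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return the values outside the 1.5 * IQR fences (assumed boxplot rule)."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical percent fat values, for illustration only.
sample_fat = [9.0, 11.0, 26.0, 27.0, 28.0, 29.0,
              30.0, 31.0, 32.0, 33.0, 34.0, 35.0]
outliers = iqr_outliers(sample_fat)
print(outliers)  # the two low values fall below the lower fence
```

Different tools compute quartiles slightly differently, so the exact fences can vary from one plotting library to another.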
b.
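Z-score normalization is computed as v' = (v - mean) / std, where v is the original
value, mean is the mean of the attribute, and std is its standard deviation.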
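
The z-score formula can be sketched in a few lines of Python. This is a minimal example under two assumptions: it uses the population standard deviation (some tools use the sample standard deviation instead, which gives slightly different scores), and the function name and ages are hypothetical rather than the assignment data.

```python
from statistics import mean, pstdev

def z_score(values):
    """Return (v - mean) / std for each v, using the population std."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - mu) / sigma for v in values]

# Hypothetical ages, for illustration only.
ages = [20, 30, 40, 50, 60]
scores = z_score(ages)
# Values below the mean (40) get negative scores, values above get positive ones.
print([round(s, 3) for s in scores])
```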

c.
Normalization
i.
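Min-max normalization puts all features on the same scale: the minimum
value maps to 0 and the maximum value maps to 1. Its weakness is that it
does not handle outliers well, because a single extreme value compresses
the remaining values into a narrow part of the range.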
ii.
Z-score normalization makes it easy to see whether a value is above or
below the mean: values below the mean are negative, and values above the
mean are positive. It handles outliers better than min-max, but the
features are not guaranteed to share the same bounded range. After
normalization the mean is 0.
iii.
Decimal scaling depends on the maximum absolute value. Based on that
value, we move the decimal point far enough that every value has an
absolute value less than 1, so non-negative data such as age and percent
fat fall between 0 and 1.
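
Min-max normalization and decimal scaling can be sketched in plain Python as follows. The function names and sample values are hypothetical and chosen only to show the behavior described above; they are not the assignment data.

```python
def min_max(values):
    """Rescale values so the minimum maps to 0 and the maximum maps to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def decimal_scale(values):
    """Divide by the smallest power of 10 that makes every |value| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j

# Hypothetical values, for illustration only.
data = [10.0, 20.0, 30.0, 40.0]
scaled = min_max(data)                    # spans the full range from 0 to 1
with_outlier = min_max(data + [400.0])    # one extreme value squeezes the rest

shifted, j = decimal_scale([23, 47, 61])  # max is 61, so j is 2
print(scaled, with_outlier, shifted, j)
```

In the outlier case, the four original values all land below 0.1, which is the compression that makes min-max sensitive to extreme values.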