---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
# [ source("http://macosa.dima.unige.it/r.R") ]
# For more depth see en.wikipedia.org/wiki/Skewness or mathworld.wolfram.com/Skewness.html.
# If we call M the mean of the data X, the most widely used measure of skewness
# (asymmetry) is M((X-M(X))^3)/sigma^3: the mean of the cubes of the deviations divided
# by the cube of the standard deviation (so that the dispersion of the data is factored out).
# It is due to Pearson and Fisher (as are other indices).
# If the data are symmetrical with respect to the mean, the index is zero; if they have
# a tail to the right, the index is positive; if they have a tail to the left, it is
# negative. It can vary from −∞ to ∞. Some examples:
#
sk = function(da) sum((da-mean(da))^3/Sd(da)^3)/length(da)
x = c(1,rep(2,2),rep(3,4),rep(4,9),rep(5,16)); y = -x
Histogram(x, 0.5,5.5, 1); Histogram(y, -0.5,-5.5, -1)
sk(x); sk(y)
# -1.248292  1.248292
#
# Another index is Bowley's index: ((Q3-Q2)-(Q2-Q1))/(Q3-Q1), where Qi is the i-th
# quartile. Its geometric meaning is quite evident, but it disregards the tails (the data
# before the 1st quartile and after the 3rd quartile). For the previous data it is
# uninformative:
sk2 = function(da) ((Percentile(da,75)-median(da))-(median(da)-Percentile(da,25))) /
                   (Percentile(da,75)-Percentile(da,25))
sk2(x); sk2(y)
# 0  0
# I get a zero index!
#
# Excel uses another index, with a sample-size correction (note: Excel's SKEW divides by
# the sample standard deviation, so its value differs slightly from the one below, which
# uses the same Sd as sk):
sk3 <- function(da) sk(da)*length(da)^2/((length(da)-1)*(length(da)-2))
sk3(x); sk3(y)
# -1.374464  1.374464
#
# Another index (also due to Pearson and Fisher) is:
sk4 <- function(da) (mean(da)-median(da))/Sd(da)
sk4(x)
# -0.3231104
# This index is between -1 and 1.
#
# In conclusion, a single asymmetry coefficient can be used in particular cases to compare
# data of the same type. But using it meaningfully requires considerable skill and care:
# it is usually better to rely on graphical representations or on one's "own" indices
# based on percentiles!
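#
# A minimal base-R sketch of three of the indices above, for readers who cannot load r.R.
# Assumptions: pop_sd (a hypothetical helper) stands in for Sd and divides by n, which is
# consistent with the outputs reported above; quantile() with its default method stands
# in for Percentile, whose exact rule may differ. The names skew_moment, skew_bowley and
# skew_median are illustrative and are not part of r.R.

```r
# Population standard deviation (divides by n); assumed equivalent to Sd() from r.R
pop_sd <- function(da) sqrt(mean((da - mean(da))^2))

# Moment coefficient: mean of the cubed deviations over the cube of the standard deviation
skew_moment <- function(da) mean((da - mean(da))^3) / pop_sd(da)^3

# Bowley's quartile index; quantile()'s default interpolation (type 7) is an assumption
skew_bowley <- function(da) {
  q <- unname(quantile(da, c(0.25, 0.50, 0.75)))
  ((q[3] - q[2]) - (q[2] - q[1])) / (q[3] - q[1])
}

# (mean - median) / standard deviation
skew_median <- function(da) (mean(da) - median(da)) / pop_sd(da)

x <- c(1, rep(2, 2), rep(3, 4), rep(4, 9), rep(5, 16))
skew_moment(x)   # about -1.2483, in line with sk(x)
skew_bowley(x)   # 0, in line with sk2(x)
skew_median(x)   # about -0.3231, in line with sk4(x)
```

# On other data skew_bowley may differ from sk2 if Percentile uses a different
# interpolation rule than quantile()'s default.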