What is a high leverage value

Purpose

Leverage is a measure of the effect of a particular observation on the regression predictions due to the position of that observation in the space of the inputs. In general, the farther a point is from the center of the input space, the more leverage it has. Because the sum of the leverage values is p, an observation i can be considered as an outlier if its leverage substantially exceeds the mean leverage value, p/n, for example, a value larger than 2*p/n.

Definition

The leverage of observation i is the value of the ith diagonal term, hii, of the hat matrix, H, where

H = X(XTX)–1XT.

The diagonal terms satisfy

where p is the number of coefficients in the regression model, and n is the number of observations. The minimum value of hii is 1/n for a model with a constant term. If the fitted model goes through the origin, then the minimum leverage value is 0 for an observation at x = 0.

It is possible to express the fitted values, , by the observed values, y, since

Hence, hii expresses how much the observation yi has impact on . A large value of hii indicates that the ith case is distant from the center of all X values for all n cases and has more leverage. is an n-by-1 column vector in the table.