# What is a high leverage value

#### Purpose

Leverage is a measure of the effect of a particular observation on the regression predictions due to the position of that observation in the space of the inputs. In general, the farther a point is from the center of the input space, the more leverage it has. Because the sum of the leverage values is *p*, the mean leverage value is *p*/*n*. An observation *i* can be considered an outlier if its leverage substantially exceeds this mean, for example, if it is larger than 2*p*/*n*.
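The 2*p*/*n* rule of thumb can be sketched in a few lines of Python. The function name `high_leverage_indices`, the `factor` parameter, and the five leverage values are illustrative assumptions, not part of any particular library; the values are chosen so that they sum to *p* = 2.

```python
def high_leverage_indices(leverages, p, factor=2.0):
    """Return the indices whose leverage exceeds factor * p / n.

    `leverages` holds one leverage value per observation and `p` is the
    number of model coefficients. `factor=2.0` gives the 2p/n rule of thumb.
    """
    n = len(leverages)
    cutoff = factor * p / n
    return [i for i, h in enumerate(leverages) if h > cutoff]


# Illustrative leverage values for n = 5 observations and p = 2 coefficients.
leverages = [0.38, 0.28, 0.22, 0.20, 0.92]
flagged = high_leverage_indices(leverages, p=2)
print(flagged)  # [4], since the cutoff is 2 * 2 / 5 = 0.8
```

The factor of 2 is a convention rather than a hard boundary, so it is exposed as a parameter here.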

#### Definition

The leverage of observation *i* is the value of the *i*th diagonal term, *h*_{ii}, of the hat matrix, *H*, where

*H* = *X*(*X*^{T}*X*)^{–1}*X*^{T}.

Here, *X* is the *n*-by-*p* design matrix, *p* is the number of coefficients in the regression model, and *n* is the number of observations. For a model with a constant term, the minimum value of *h*_{ii} is 1/*n*. If the fitted model goes through the origin, then the minimum leverage value is 0, attained by an observation at *x* = 0.
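As a minimal sketch of this definition, the following NumPy code computes the diagonal of the hat matrix for a small straight-line model with a constant term. The function name `leverage` and the input values are assumptions made for illustration. The code solves a linear system instead of forming the inverse explicitly, which is a numerical choice of this sketch and not something the definition requires.

```python
import numpy as np


def leverage(X):
    """Return the diagonal of H = X (X^T X)^{-1} X^T for design matrix X."""
    X = np.asarray(X, dtype=float)
    # Solve (X^T X) B = X^T, so that H = X B, without an explicit inverse.
    B = np.linalg.solve(X.T @ X, X.T)      # shape (p, n)
    # h_ii is row i of X dotted with column i of B.
    return np.einsum("ij,ji->i", X, B)


# Illustrative inputs: four clustered points and one far from the center.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
X = np.column_stack([np.ones_like(x), x])  # constant term plus slope
n, p = X.shape

h = leverage(X)
print(np.round(h, 2))        # approximately 0.38, 0.28, 0.22, 0.20, 0.92
print(np.isclose(h.sum(), p))  # the leverages sum to p
print(h.min() >= 1 / n - 1e-12)  # with a constant term, no leverage is below 1/n
```

The observation at *x* = 10 lies farthest from the mean of the inputs and has by far the largest leverage, while the observation at the mean (*x* = 4) attains the minimum, 1/*n*.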

It is possible to express the fitted values, *ŷ*, in terms of the observed values, *y*, since

*ŷ* = *Hy*, that is, *ŷ*_{i} = *h*_{ii}*y*_{i} + Σ_{j≠i} *h*_{ij}*y*_{j}.

Hence, *h*_{ii} expresses how much impact the observation *y*_{i} has on its own fitted value, *ŷ*_{i}. A large value of *h*_{ii} indicates that the *i*th case is distant from the center of all *X* values for all *n* cases and has more leverage. Leverage is an *n*-by-1 column vector in the table.
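The relationship between observed and fitted values can be checked numerically. The sketch below, with made-up data, forms the hat matrix, confirms that applying it to *y* reproduces the least-squares fitted values, and shows that shifting one observed value *y*_{i} by a given amount moves its fitted value by *h*_{ii} times that amount. All variable names and data values are assumptions for illustration.

```python
import numpy as np

# Illustrative data: a straight-line model with a constant term.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 10.5])
X = np.column_stack([np.ones_like(x), x])

# Hat matrix H = X (X^T X)^{-1} X^T, formed via a linear solve.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Fitted values from the hat matrix agree with an ordinary least-squares fit.
y_hat = H @ y
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(y_hat, X @ beta))

# Shift one observed value and see how far its own fitted value moves.
i, delta = 4, 1.0
y_shifted = y.copy()
y_shifted[i] += delta
shift = (H @ y_shifted - y_hat)[i]
print(np.isclose(shift, H[i, i] * delta))  # the move equals h_ii * delta
```

Forming the full *n*-by-*n* matrix is fine for a small example like this one; for large *n*, computing only the diagonal is usually preferable.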

