# Background Part 2 — Uses of the Least-Squares Regression Coefficients and Determining Which Variables Matter

In this article, I focus on the most important section of the regression output: the table of regression coefficients and the statistics that accompany them. This article is Part 2 of a 5-part series reviewing regular least-squares regression. In Part 1, I briefly discussed the three sections of the least-squares regression output (the coefficient table, the goodness of fit, and the ANOVA) using data from the Kid Creative example.

The least-squares regression coefficients and the regression coefficient table are used primarily for four key purposes:

1. To determine which $x$-variables matter.
2. To determine the effects of each of the $x$-variables.
3. To compute the regression equation in order to make predictions.
4. To assess the uncertainty about the regression coefficients (and predictions).

I will now discuss each of these uses in the context of regular linear least-squares regression using the Kid Creative Household income example.

### Coefficient Table Use #1: Which Variables Matter? Interpreting the $p$-Values

To determine which $x$-variables matter, we generally look at each coefficient’s $p$-value. We could look at the $t$-statistics, but it is simpler to look at the $p$-values. The smaller a coefficient’s $p$-value, the more evidence there is that the corresponding $x$-variable matters. To say that an $x$-variable matters is to say that there is evidence that the variable has an effect on the $y$-variable.

Specifically, when the $p$-value for a coefficient is less than the significance level (usually $\alpha = 0.05$, or 5%), we take this to mean that there is evidence that the $x$-variable matters. More precisely, when the $p$-value is less than the significance level $\alpha$, the null hypothesis that the regression coefficient is zero is rejected (meaning that there is statistical evidence that the regression coefficient is not $0$). Note that to say that a variable does not matter is the same as saying that the regression coefficient for that variable is zero (i.e., $\beta = 0$). Thus, when I say that an $x$-variable matters, I mean that there is a real association between that $x$-variable and the $y$-variable, Household Income.
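
As a compact restatement of the test described above, the decision rule can be written out as follows. The notation is an assumption added for illustration and does not come from the regression output: $\beta_j$ is the true coefficient of the $j$-th $x$-variable, $b_j$ is its least-squares estimate, $\mathrm{SE}(b_j)$ is its standard error, $n$ is the number of observations, and $k$ is the number of $x$-variables in a model that includes an intercept.

$$
\begin{aligned}
H_0 &: \beta_j = 0 \quad \text{versus} \quad H_1 : \beta_j \neq 0, \\
t_j &= \frac{b_j}{\mathrm{SE}(b_j)}, \\
p_j &= 2\,P\left(T_{n-k-1} \geq |t_j|\right), \\
\text{reject } H_0 &\iff p_j < \alpha .
\end{aligned}
$$

Here $T_{n-k-1}$ denotes a Student $t$ random variable with $n-k-1$ degrees of freedom. Because $p_j$ shrinks as $|t_j|$ grows, reading the $p$-values and reading the $t$-statistics lead to the same conclusion; the $p$-value simply saves the step of looking up a critical value.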

The table below shows just the regression coefficients and their corresponding $p$-values from the Kid Creative least-squares regression output from the regression of Household Income on the $x$-variables discussed in Part 1. (Click here to see the entire regression output).

Looking at the table, we see that there is evidence that 9 of the variables matter:

• IsFemale — strong evidence: $p$-value = 0.6%
• IsMarried — very strong evidence: $p$-value = 0.0%
• HasCollege — very strong evidence: $p$-value = 0.0%
• IsProfessional — very strong evidence: $p$-value = 0.0%
• Unemployed — strong evidence: $p$-value = 1.0%
• ResLength — strong evidence: $p$-value = 0.8%
• Own — very strong evidence: $p$-value = 0.0%
• White — very strong evidence: $p$-value = 0.0%
• PrevChild (= 1 if previously purchased a children’s magazine) — strong evidence: $p$-value = 1.5%

There is no evidence that the following variables matter:

• Dual — no evidence: $p$-value = 16.9%
• Children — no evidence: $p$-value = 65.2%
• PrevParent — no evidence: $p$-value = 13.5%

There is some suggestion that the following variables might matter:

• House — weak evidence: $p$-value = 8.9%
• English — very weak evidence: $p$-value = 11.0%
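
The sorting done in the lists above can be reproduced mechanically. The following Python sketch is only an illustration: the dictionary simply transcribes the $p$-values quoted in this article (as fractions rather than percentages), and the names `p_values`, `ALPHA`, and `split_by_significance` are hypothetical, not part of any regression package.

```python
# Significance level discussed in the text (5%).
ALPHA = 0.05

# p-values transcribed from the bullet lists in this article, as fractions.
# Values shown as 0.0% in the article are rounded, so 0.000 here means
# "smaller than the rounding precision", not exactly zero.
p_values = {
    "IsFemale": 0.006,
    "IsMarried": 0.000,
    "HasCollege": 0.000,
    "IsProfessional": 0.000,
    "Unemployed": 0.010,
    "ResLength": 0.008,
    "Own": 0.000,
    "White": 0.000,
    "PrevChild": 0.015,
    "Dual": 0.169,
    "Children": 0.652,
    "PrevParent": 0.135,
    "House": 0.089,
    "English": 0.110,
}


def split_by_significance(p_values, alpha=ALPHA):
    """Split variable names by whether their p-value is below alpha."""
    significant = [name for name, p in p_values.items() if p < alpha]
    not_significant = [name for name, p in p_values.items() if p >= alpha]
    return significant, not_significant


significant, not_significant = split_by_significance(p_values)
print("Evidence that the variable matters (p < alpha):", significant)
print("No evidence at this level (p >= alpha):", not_significant)
```

With a 5% significance level, this rule flags the same nine variables listed as mattering. House and English fall on the "not significant" side of the cutoff, which is why they are described only as showing weak or very weak evidence.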

Determining which variables matter is one of the most important uses of the regression coefficients and their associated statistics. It is often a central part of a model-building process; that is, a process to determine which $x$-variables should be in the regression equation and which should be omitted.

In the next part of this background series (Part 3), I discuss another very important use of regression coefficients: assessing the impact of each of the $x$-variables.