# Logistic Regression FAQ (Frequently Asked Questions)

This page contains answers to frequently asked questions that have emerged as I have developed the material for this web site. In some cases, the questions have been motivated by search engine queries that have led people to the web site.

Have a specific question not on the list? Please contribute it using the comment form below and I will try to get you an answer.

The answers to the FAQ questions are intended to be brief (or even terse) and to the point. This is in contrast to the main material of the web site, which is intended to be gentle and thorough.

### 6 Responses to Logistic Regression FAQ (Frequently Asked Questions)

1. Vik says:

Thank you so much for this! I’ve been taking Analytics Edge and Machine Learning on Edx and Coursera respectively, and this was exactly what I needed to fill in the gaps (along with Ben Lambert’s Under/Graduate courses in econometrics on youtube).

Thank you for your clear explanations and simple examples. You are a fantastic teacher and your students are very lucky. What school is it?! I want to apply! 🙂

2. yaorui says:

Hi prof, I am wondering whether you can explain how to read or understand the output of a logistic regression model effectively? And what is the AIC for?

• StatsProf says:

Thank you for your question. I have not yet added discussion of model fit to this website. It is on my to do list. But let me quickly give you an idea of the AIC and how to use it.

The AIC (Akaike Information Criterion) is a measure of how well your logistic regression model fits the data. So it plays a similar role to something like the R-squared in regular linear least-squares regression. The AIC is based on information theory, so it has a clear theoretical basis. I mention this because it is the reason you should consider the AIC to be a very good measure for model selection.

The AIC is used to compare the fits of several models, and thus it can be used in variable selection to pick the model with the best set of X-variables. For the AIC, smaller is better, so the best model is the one with the smallest AIC. There is not really a good way to interpret the magnitude of the AIC, so only use it to compare 2 or more models. This is different from R-squared in linear least-squares regression, where you can interpret the magnitude (e.g., an R-squared of 95% is a very good fit). This lack of ability to interpret the magnitude of the AIC may be disconcerting at first. But again, it is used to compare models, and you will get used to not trying to interpret its magnitude.
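As a rough illustration of "smaller is better," here is a minimal Python sketch. It assumes you have already read the AIC of each candidate model from your software's output. The model names and AIC values below are made up for illustration, and `best_by_aic` is a hypothetical helper, not part of any library.

```python
# Hypothetical AIC values read from the output of three candidate
# logistic regression models fitted to the SAME data set.
candidate_aics = {
    "x1 only": 152.3,
    "x1 + x2": 147.9,
    "x1 + x2 + x3": 149.6,
}


def best_by_aic(aics):
    """Return the name of the model with the smallest AIC."""
    return min(aics, key=aics.get)


best = best_by_aic(candidate_aics)
print("Preferred model:", best)

# Only the differences between models are meaningful, not the raw magnitudes.
for name, value in sorted(candidate_aics.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {value:.1f}, difference from best = {value - candidate_aics[best]:.1f}")
```

Note that the comparison only makes sense when every model was fitted to the same observations.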

The AIC trades off bias (due to leaving X-variables that matter out of the model) against variance and overfitting (due to having extra X-variables in the model). In this regard, it is similar to adjusted R-squared in regression or Mallows Cp.

One other comment. You should probably really be using the AICc. The AIC is an asymptotic estimate of the information loss given by the model you are considering. AIC can be improved in small samples, and this is what the AICc does. Specifically, you should use AICc unless n/K > 40 where n is the number of observations and K is the number of parameters (in logistic regression, the number of X’s plus 1). AICc = AIC + 2K(K+1)/(n-K-1).
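The small-sample correction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AIC value is taken as given from your regression output, the numbers in the example are invented, and the function names `aicc` and `should_use_aicc` are hypothetical.

```python
def aicc(aic_value, n, k):
    """Small-sample corrected AIC: AICc = AIC + 2K(K+1)/(n-K-1).

    aic_value: AIC reported by the software
    n: number of observations
    k: number of parameters (in logistic regression, number of X's plus 1)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return aic_value + 2 * k * (k + 1) / (n - k - 1)


def should_use_aicc(n, k):
    """Rule of thumb from the text: use AICc unless n/K > 40."""
    return n / k <= 40


# Made-up example: 3 X-variables plus an intercept (K = 4), 50 observations.
n_obs, n_params, reported_aic = 50, 4, 100.0
if should_use_aicc(n_obs, n_params):
    print("Use AICc:", round(aicc(reported_aic, n_obs, n_params), 3))
else:
    print("AIC is adequate:", reported_aic)
```

With n = 50 and K = 4, n/K is 12.5, so the rule of thumb points to the AICc. As n grows relative to K, the correction term shrinks toward zero and the two criteria agree.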

Apart from the AICc correction above, I am not providing any formulas in my answer because I have not developed the required background in this website yet. So I am assuming that you are reading the AIC and AICc from your logistic regression output and therefore do not need the formulas. For a great but much more theoretical discussion of the AIC, see the following link:

For interpreting the logistic regression coefficient table output, see my other posts starting with .

I hope this helps,

Regards,

StatsProf

3. tridib says:

Thanks for such nice lectures. They are very useful.

4. miku says:

Hello!

I apologize if this message is not appropriate for this page but Mr. StatsProf, I would like to extend my thanks. This website helped me so much because I am using Logistic Regression in my undergraduate thesis.

Is there any way I can cite you properly, prof? You can email me at (e-mail address deleted).

• StatsProf says:

Miku,

Thank you so much for your comment. I really appreciate your taking the time to let me know that this website was helpful to you! Hearing from you helps energize me to keep working on it.

I haven’t decided yet whether or not I want to make my actual identity visible or keep the low profile of using the pseudonym “StatsProf.” I am a statistics professor at a major U.S. university who teaches in the business school. I may change my mind later and “go public.” I just haven’t decided yet. I think I will wait and see how things develop.

I appreciate your citing this website. The general form of the citation should be as follows:

Lastname, Firstname, “Page Title,” WebsiteName.com, Page Date, URL.

When a pseudonym is used, you replace Lastname, Firstname with the pseudonym.

So, to cite the page that you have commented on before, you would use:

StatsProf, “Understanding Logistic Regression Output: Part 3 — Assessing the Effects of the X-Variables,” LogisticRegressionAnalysis.com, July 18, 2013, http://logisticregressionanalysis.com/817-understanding-logistic-regression-output-part-3-assessing-the-effects-of-the-x-variables/.

By the way, I used the information from Purdue Online Writing Lab – Web Sources to answer your question about citation.

Regards,

StatsProf