Directions

Work through all of the directions in this lab/assignment and answer every question. Submit your responses as a Word (or other word-processed) document. Report regression results in a table similar to the one on page 4 of these lecture slides. You can use the blank table provided here to report your regression results for each question. Submit your assignment on UM Learn.

Open RStudio

Click on the home button, and scroll down until you find the RStudio directory. Click on RStudio.

Create a script file

Click on “File”, “New File”, “R Script”.

Download a CPS data set

Download the data from the website using:

cps <- read.csv("http://home.cc.umanitoba.ca/~godwinrt/3040/data/lab5.csv")

This is another version of the “Current Population Survey” data from the US. Take a good look at the data set either by clicking on the spreadsheet icon next to its object name in the top-right window, or by using the command:

View(cps)

Our dependent variable will be ahe (average hourly earnings). bachelor takes the value 1 if the individual has a university degree and 0 otherwise. Similarly, female is a dummy variable that takes the value 1 if the individual is female and 0 otherwise. age measures the age of the individual in years.

Start your assignment


Q1) Regress ahe on bachelor. Report the results in a table.

summary(lm(ahe ~ bachelor, data = cps))
a) Interpret the estimated coefficient on bachelor.
b) Why is it important to add more variables to the model?


Q2) Run a regression of average hourly earnings on age, gender, and education. Report the results in a table.

summary(lm(ahe ~ age + female + bachelor, data = cps))
a) Compare the \(R^2\) from this regression to the \(R^2\) from the regression in question 1. Why has it increased?
b) If Age increases from 25 to 26, how are earnings expected to change?
c) If Age increases from 50 to 51, how are earnings expected to change?


Q3) Run a regression of the logarithm of average hourly earnings on Age, Female, and Bachelor. Report the results in a table.

summary(lm(log(ahe) ~ age + female + bachelor, data = cps))
a) If Age increases from 25 to 26, how are earnings expected to change?
b) If Age increases from 50 to 51, how are earnings expected to change?
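
As a reminder of how a coefficient is read when the dependent variable is in logs (the notation \(\hat{\beta}_{age}\) for the estimated coefficient on age is an assumption here, not taken from the lecture slides), holding female and bachelor fixed:

\[ \Delta \log(ahe) = \hat{\beta}_{age} \, \Delta age, \qquad \%\Delta ahe \approx 100 \times \hat{\beta}_{age} \, \Delta age \]

The percentage reading is an approximation that works well for small changes; the exact percentage change is \(100 \times (e^{\hat{\beta}_{age} \Delta age} - 1)\).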


Q4) Run a regression of the logarithm of average hourly earnings on the logarithm of Age, Female, and Bachelor. Report the results in a table.

summary(lm(log(ahe) ~ log(age) + female + bachelor, data = cps))
a) What is the estimated effect of Age on average hourly earnings in this regression?
b) If Age increases from 25 to 26, how are earnings expected to change?
c) If Age increases from 50 to 51, how are earnings expected to change?
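
When both earnings and age are in logs, the change in log earnings depends on the proportional change in age. A short worked sketch, writing \(\hat{\beta}_{\log(age)}\) for the estimated coefficient on log(age) (our notation) and holding female and bachelor fixed:

\[ \Delta \log(ahe) = \hat{\beta}_{\log(age)} \left[ \log(26) - \log(25) \right] = \hat{\beta}_{\log(age)} \log\!\left(\tfrac{26}{25}\right) \]

Here \(\log(26/25) \approx 0.0392\), while the corresponding term for a move from 50 to 51 is \(\log(51/50) \approx 0.0198\). Multiplying the result by 100 gives an approximate percentage change in ahe.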


Q5) Run a regression of the logarithm of average hourly earnings on Age, Age\(^2\), Female, and Bachelor. Report the results in a table.

age2 <- cps$age^2
summary(lm(log(ahe) ~ age + age2 + female + bachelor, data = cps))
a) If Age increases from 25 to 26, how are earnings expected to change?
b) If Age increases from 50 to 51, how are earnings expected to change?
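
With a squared term in the model, the predicted change in log(ahe) depends on the starting age: moving from age \(a_0\) to \(a_1\) changes the prediction by \(\hat{\beta}_{age}(a_1 - a_0) + \hat{\beta}_{age2}(a_1^2 - a_0^2)\). A minimal sketch of that arithmetic follows. The function name log_change_quadratic is hypothetical, and the coefficient values are made up for illustration; they are not estimates from the lab data.

```r
# Predicted change in log(ahe) when age moves from `from` to `to`,
# holding female and bachelor fixed, in a model with age and age^2.
# b_age, b_age2: coefficients on age and age2
log_change_quadratic <- function(b_age, b_age2, from, to) {
  b_age * (to - from) + b_age2 * (to^2 - from^2)
}

# Hypothetical coefficients for illustration only (NOT estimates from the lab data)
b_age <- 0.08
b_age2 <- -0.001

change_25 <- log_change_quadratic(b_age, b_age2, from = 25, to = 26)
change_50 <- log_change_quadratic(b_age, b_age2, from = 50, to = 51)

# Multiply by 100 for an approximate percentage change in ahe
print(100 * c(change_25, change_50))
```

If the regression is stored in an object, say fit (a hypothetical name), the estimated coefficients can be pulled out with coef(fit)["age"] and coef(fit)["age2"] instead of being typed in by hand.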


Q6) Run a regression of the logarithm of average hourly earnings on Age, Age\(^2\), Female, Bachelor, and the interaction term Female \(\times\) Bachelor. Report the results in a table.

You need to create the interaction term:

fem_bach <- cps$female * cps$bachelor

and then include it in the model:

summary(lm(log(ahe) ~ age + age2 + female + bachelor + fem_bach, data = cps)) 
a) What is the estimated effect of education (a university degree) on earnings for men, and for women?
b) What does the coefficient on the interaction term measure?
c) Do women earn a different amount than men? Use an F-test, with the above model as the “unrestricted” model.

The restricted model does not allow for any difference in earnings between men and women. Starting with the above unrestricted model, remove all variables that involve the female dummy (female and fem_bach), which imposes two restrictions. This results in the following restricted model:

summary(lm(log(ahe) ~ age + age2 + bachelor, data = cps)) 

Make sure that you can calculate the F-statistic of 162.8468. Do you reject or fail to reject the null hypothesis that men and women have the same earnings?

d) Does education have a different effect on earnings for women than it does for men?

Check whether the coefficient on fem_bach is statistically significant, using its t-statistic or p-value from the unrestricted regression output.
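
The squared and interaction terms in this lab are built by hand, but R's formula syntax can also create them inside lm(). The sketch below compares the two approaches on simulated data. The data frame toy and its coefficients are invented for illustration and are not the lab data set.

```r
# Simulated data for illustration only; this is NOT the lab data set
set.seed(1)
n <- 200
toy <- data.frame(
  age = sample(25:60, n, replace = TRUE),
  female = rbinom(n, 1, 0.5),
  bachelor = rbinom(n, 1, 0.4)
)
toy$ahe <- exp(1.5 + 0.05 * toy$age - 0.0005 * toy$age^2 -
  0.2 * toy$female + 0.4 * toy$bachelor +
  0.05 * toy$female * toy$bachelor + rnorm(n, sd = 0.4))

# Approach used in this lab: build the extra regressors by hand
toy$age2 <- toy$age^2
toy$fem_bach <- toy$female * toy$bachelor
fit_manual <- lm(log(ahe) ~ age + age2 + female + bachelor + fem_bach, data = toy)

# Same model written with formula operators: I() protects the arithmetic,
# and female:bachelor is the product of the two dummies
fit_formula <- lm(log(ahe) ~ age + I(age^2) + female + bachelor + female:bachelor,
                  data = toy)

# Compare the two sets of estimates (names differ, values should agree)
print(all.equal(unname(coef(fit_manual)), unname(coef(fit_formula))))
```

Either form is fine for this assignment; creating the variables by hand keeps the regression table labels short and makes each regressor explicit.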


Q7) Is the effect of Age on earnings different for men than for women? Specify and estimate a regression that you can use to answer this question.

We need two new interaction terms:

fem_age <- cps$female * cps$age
fem_age2 <- cps$female * age2

We then estimate the equation:

summary(lm(log(ahe) ~ age + age2 + fem_age + fem_age2 + female + bachelor + fem_bach, data = cps))

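Calculate the F-statistic for the null hypothesis of “no different effect of age on earnings for men and for women”, using the \(R^2\) from the unrestricted and restricted models. The unrestricted model is the one above; the restricted model drops fem_age and fem_age2, which gives the model from Q6.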
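
The F-statistics in Q6 c) and Q7 can be computed from the two \(R^2\) values. A minimal sketch follows, assuming the homoskedasticity-only formula \(F = \frac{(R^2_U - R^2_R)/q}{(1 - R^2_U)/(n - k_U - 1)}\) is the one intended here. The function name f_stat_r2 and its argument names are hypothetical, and the numbers in the example are made up rather than taken from the lab data.

```r
# Homoskedasticity-only F-statistic from the R-squared of two nested models.
# r2_u, r2_r: R-squared of the unrestricted and restricted models
# q:   number of restrictions (regressors dropped to get the restricted model)
# n:   number of observations
# k_u: number of regressors in the unrestricted model (intercept not counted)
f_stat_r2 <- function(r2_u, r2_r, q, n, k_u) {
  ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k_u - 1))
}

# Illustrative numbers only (NOT taken from the lab data)
f_example <- f_stat_r2(r2_u = 0.5, r2_r = 0.4, q = 2, n = 103, k_u = 2)
print(f_example)

# p-value from the F distribution with q and n - k_u - 1 degrees of freedom
p_example <- pf(f_example, df1 = 2, df2 = 103 - 2 - 1, lower.tail = FALSE)
print(p_example)
```

If each regression is stored in an object, say fit (a hypothetical name), the inputs can be read off with summary(fit)$r.squared and nobs(fit). Calling anova() on the restricted and unrestricted lm objects reports the same kind of F-test directly, which is a useful check on the hand calculation.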