Sunday, November 14, 2021

Time to Graduation

How long does it take to complete a bachelor's degree? Furman's four-year graduation rate runs around 75% and the six-year rate is about 81%, so take a guess what the average time to graduation is.

The answer is an average of 3.8 years from start to graduation, a figure that has been steady for years. Surprised? I was, because the math doesn't seem to add up. Wouldn't it have to be more than four years?

Here's the code I used to calculate time to graduation.

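library(dplyr)      # mutate() and %>%
library(lubridate)  # ymd()

grads <- grads %>%
  mutate(GradDate  = ymd(GradDate),                    # text to Date; ymd() expects year-month-day order
         StartDate = ymd(paste0(Cohort, "/8/20")),     # approximate start: August 20 of the cohort year
         Time      = as.numeric((GradDate - StartDate) / 365.25))  # elapsed days converted to years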

This relies on the lubridate library to convert text strings into a date format (ymd() expects year-month-day order, so month-first strings like "05-07-2021" would need mdy() instead) and to perform the difference calculation in the last line. Subtracting two dates gives the difference in days, which I divided by 365.25 to get years. I approximated the actual start dates with August 20 of the year students enrolled, which won't be off by more than a week (less than 1% of four years).

The first thing to notice is that August 2016 to May 2020 is three months short of four years, so a "four-year graduation rate" is really more like a 3.75-year graduation rate for the majority of our graduates. Additionally, students who take longer than four years typically do not take longer than five, so a six-year rate is more like a 4.75-year rate in practice.

Using just those two facts we could estimate the time to graduation with 

$$ \frac{3.75(.75) + 4.75(.81 - .75)}{.81} = 3.82 $$

But there are a few three-year graduates too, which brings the average down to 3.8. 
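The two-rate estimate is easy to reproduce. Here is a minimal sketch in R, assuming (as in the text) that on-time graduates take about 3.75 years and everyone else who finishes within six years takes about 4.75. The function name estimate_grad_time is mine, made up for illustration.

```r
# Rough average time to graduation from four- and six-year graduation rates.
# Assumption: four-year graduates take 3.75 years, later graduates 4.75 years.
estimate_grad_time <- function(rate4, rate6) {
  (3.75 * rate4 + 4.75 * (rate6 - rate4)) / rate6
}

# Furman's approximate rates from the top of the post
furman_estimate <- estimate_grad_time(rate4 = 0.75, rate6 = 0.81)
print(round(furman_estimate, 2))
```

This ignores three-year graduates, so it should land slightly above the observed average.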

IPEDS reports four-, five-, and six-year graduates, and the code in the next section extracts those counts from the CSV files.

Figure 0. Cumulative graduation rates by quintile of four-year rate.

Just eyeballing the figure shows that lower four-year rates are associated with longer average times to graduation.

Estimating Time to Graduation

As argued in “Most college students don’t graduate in four years, so college and the government count six years as ‘success’,” time to graduation is a big deal because extra time (1) costs money to attend college and (2) limits students' income. It's hard to start a career before finishing the degree, and those extra years are lost earnings and lost experience in the job market.
 
Pundits are already calling curtains for liberal arts colleges, and part of the problem is cost. The benefits of small classes and broad learning goals are a hard sell when a private college costs ten to twenty thousand dollars more per year. But that doesn't factor in the time to graduation! If it takes a year or two longer at a "cheaper" college, that cost needs to be averaged in to make a fair comparison.

The College Scorecard uses a 200% time to graduation (!), meaning eight years for a bachelor's degree. Fortunately, the IPEDS database has more detailed information, which we can retrieve from the grYYYY tables.

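library(readr)  # read_csv()
library(dplyr)  # filter(), select(), mutate(), %>%
library(tidyr)  # spread()

grad_time <- read_csv("gr2019.csv") %>%
  # keep the bachelor's cohort (8) and completers in four (13), five (14), and six (15) years
  filter(GRTYPE %in% c(8,13,14,15)) %>% 
  select(UNITID,
         GRTYPE,
         N =     GRTOTLT) %>%
  # one row per institution, with columns GRTYPE_8, GRTYPE_13, GRTYPE_14, GRTYPE_15
  spread(GRTYPE, N, sep = "_") %>%
  mutate(Grad4    = GRTYPE_13 / GRTYPE_8,
         Grad5    = GRTYPE_14 / GRTYPE_8,
         Grad6    = GRTYPE_15 / GRTYPE_8,
         # weighted average of approximate years to degree, over graduates only
         GradTime = (Grad4*3.75 + Grad5*4.75 + Grad6*5.75)/(Grad4 + Grad5 + Grad6)) %>%
  na.omit()  # drop institutions with missing counts or an empty cohort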


This code snippet reads a local copy of the 2019 IPEDS graduation file (in reality, my code hits a data warehouse instead and averages over three years) and grabs the counts of undergraduates who finish in four years or less, five years, or six years. A GradTime estimate follows from that.
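To make the arithmetic concrete, here is the same GradTime calculation for a single made-up institution. The counts are hypothetical, not IPEDS data, and grad_time_from_counts is a name invented for this sketch.

```r
# Estimated time to graduation from a cohort size and completer counts.
# cohort plays the role of GRTYPE 8; n4, n5, n6 play the roles of
# GRTYPE 13, 14, and 15 (finishing in four years or less, five, or six).
grad_time_from_counts <- function(cohort, n4, n5, n6) {
  grad4 <- n4 / cohort
  grad5 <- n5 / cohort
  grad6 <- n6 / cohort
  (grad4 * 3.75 + grad5 * 4.75 + grad6 * 5.75) / (grad4 + grad5 + grad6)
}

# Hypothetical institution: 1,000 in the cohort; 500, 150, and 50 completers
toy_time <- grad_time_from_counts(cohort = 1000, n4 = 500, n5 = 150, n6 = 50)
print(round(toy_time, 2))
```

Note that the denominator is graduates only, so the cohort size cancels out; the estimate depends only on how graduates are split across the three groups.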
 
Figure 1. Estimated time to graduation by institution predicted by four-year graduation rates.

We can estimate the average time to graduation from the four-year graduation rate. The regression line is shown in the figure; it's a cubic polynomial in Grad4 with an R-squared of .81.
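A cubic fit like this can be sketched with base R's lm() and poly(). The data below are synthetic, generated only so the example runs on its own; they are not IPEDS values, and the coefficients and R-squared will not match the figure.

```r
# Synthetic stand-in for the institution-level table (NOT real IPEDS values)
set.seed(1)
toy <- data.frame(Grad4 = runif(200, min = 0.05, max = 0.90))
toy$GradTime <- 4.6 - 1.0 * toy$Grad4 + rnorm(200, sd = 0.08)

# Cubic polynomial in Grad4; raw = TRUE keeps coefficients on the original scale
fit <- lm(GradTime ~ poly(Grad4, 3, raw = TRUE), data = toy)
r_squared <- summary(fit)$r.squared

# Predicted time to graduation at a few four-year rates
new_rates <- data.frame(Grad4 = c(0.20, 0.50, 0.80))
predicted <- predict(fit, newdata = new_rates)
print(round(predicted, 2))
```

With the real table, replacing toy with grad_time from the code above should be the only change needed.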

The black line can be used as a cost-scaling factor in addition to a risk assessment. The lower the graduation rate, the riskier it is to enroll because of drop-outs, and on average the longer it will take to graduate as well. I called out two colleges for illustration. St. John's College in Annapolis has a 67% four-year grad rate in the IPEDS survey and, according to collegeresults.org, a net cost of about $30K/year. Youngstown State University, in Ohio, has a net cost on the Scorecard of about $12K/year and a four-year graduation rate of about 19%.

Note that the y-axis isn't zeroed, which makes the gap seem larger than it is. The time-to-graduation difference between these two schools represents about an 18% extra cost for YSU graduates, so a comparative yearly cost of about $14K instead of $12K. This doesn't count opportunity cost (the earnings lost by not being in a job earlier), so it's a conservative estimate.
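The cost scaling is a single multiplication. A small sketch, using the $12K net cost and the 18% figure from the text; the function name effective_cost is hypothetical.

```r
# Effective yearly cost: net yearly cost scaled up by the extra time to graduation.
# extra_time_share is the proportional increase in time to degree (0.18 = 18%).
effective_cost <- function(net_cost, extra_time_share) {
  net_cost * (1 + extra_time_share)
}

ysu_effective <- effective_cost(net_cost = 12000, extra_time_share = 0.18)
print(ysu_effective)
```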

Career Lag

The analysis so far is unsatisfactory because it doesn't consider the fates of students who do not graduate. I'll illustrate how this might be done. Students who do not graduate may transfer to another college and finish there, or may drop out. Either way, it sets them behind because of lost time and credits. For the sake of argument, assume that the resulting career lag penalty for dropping out of the first school attended is ten years. A normal lag is 3.75 years--the time a first-time full-time freshman takes to earn a bachelor's degree if all goes as planned. For non-graduates, this hypothetical extra lag is intended to reflect the earnings penalty for not getting the diploma, or for having to wait because of transfer mechanics. It's a guess--don't take it too seriously.

The formula for career lag is then

CareerLag = (Grad4*3.75 + Grad5*4.75 + Grad6*5.75 + (1-Grad4-Grad5-Grad6)*nongrad_career_lag) # nongrad_career_lag = 10

We can run the same analysis as before to get the graph below.

Figure 2. Hypothesized career lag by institution predicted by four-year graduation rates.
 

This model has an R-squared of .91; you can see it's pretty linear. Now the gap between the two highlighted schools is more than 30%. We will get different values if we pick different non-grad penalties.
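To see how much the non-grad penalty matters, here is a sketch of the career lag formula as a function, evaluated at three penalties. The rates are Furman's approximate figures from the top of the post, with the simplifying assumption (mine, following the earlier estimate) that all late graduates finish in the fifth year. The 6- and 14-year penalties are arbitrary alternatives to the 10-year guess.

```r
# Career lag: graduates contribute their time to degree,
# non-graduates contribute a hypothetical fixed penalty.
career_lag <- function(grad4, grad5, grad6, nongrad_career_lag = 10) {
  grad4 * 3.75 + grad5 * 4.75 + grad6 * 5.75 +
    (1 - grad4 - grad5 - grad6) * nongrad_career_lag
}

# Assumed rates: 75% in four years, 6% in five, none in six (simplification)
penalties <- c(6, 10, 14)
lags <- career_lag(0.75, 0.06, 0, nongrad_career_lag = penalties)
print(data.frame(penalty = penalties, career_lag = lags))
```

Because the penalty enters linearly, each extra year of penalty adds the non-graduate share (here 19%) of a year to the career lag.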

Discussion

I'm not that familiar with the labor economics work in higher ed. Bryan Caplan's book has a nice primer, but I don't recall it saying much about time to graduation. A quick google turns up these related articles:

My scan of these suggests that the situation is complicated (e.g. are students working while in college, explaining the delay to graduation?), but that taking longer to graduate is associated with lower earnings, and that this may be partly due to signalling to the labor market. Students who take longer to graduate may be seen as less capable.

In 2020, Sara Vanovac and I published a validity analysis of student writing assessments and discovered a Matthew effect wherein students with the highest levels of assessed writing ability were also seen to develop the most quickly. We went looking for other examples of this type of divergence and found one in Raj Chetty et al.'s work on equality of opportunity.

Here's figure 8 from our paper, showing an inter-generational divergence in earnings.

Figure 3. Left: selectivity of children's universities is associated with the incomes of their parents. Right: incomes of children who graduate are linked to the selectivity of the institution.

The figure uses the mammoth data set Chetty scoured from tax records to suggest that lower incomes persist despite college opportunity partly because of the effect of college selectivity (which includes the type of students they select). The incomes for those attending non-selective institutions barely exceed parent incomes--for those who graduate. This may be partly due to longer graduation times. The figure doesn't take into account the reality that lower selectivity entails lower graduation rates as well, so the true picture on the right is more divergent than the one shown.

Conclusions

We can add time to graduation to the list of student success measures we track, like retention and graduation rates. I've started using four-year rates for bachelor's programs, instead of the more common six-year rates, as a proxy for institutional quality (outcomes for students).

It may be that delaying graduation is beneficial to a student, for example when working to pay for college in a job that benefits their career. There's no way to know without studying the matter at your institution: what types of students leave before graduation, and what types of students take longer to graduate? Are these the same groups? With data from the National Student Clearinghouse we can calculate time to graduation for transfer-outs (and transfer-ins).

Resources

The code I used to make the first two figures is here. You can modify it to read CSV files downloaded from IPEDS if you don't have those tables in a database.

Updates

After the initial post, I went back and added Figure 0. The github code is updated too.
