Mistakes happen. I should know—I make more than my fair share of them (including on this blog). But some mistakes are a little more noticeable than others, such as when your mistake has been viewed more than a million times. That is what happened to the U.S. Department of Education recently, when they found a coding error in the popular College Scorecard website and dataset.
Here is a description of the coding error from the Department of Education’s announcement:
“Repayment rates measure the percentage of undergraduate borrowers who have not defaulted and who have repaid at least one dollar of their principal balance over a certain period of time (1, 3, 5, or 7 years after entering repayment). An error in the original college scorecard coding to calculate repayment rates led to the undercounting of some borrowers who had not reduced their loan balances by at least one dollar, and therefore inflated repayment rates for most institutions. The relative difference—that is, whether an institution fell above, about, or below average—was modest. Over 90 percent of institutions on the College Scorecard tool did not change categories (i.e., above, about, or below average) from the previously published rates. However, in some cases, the nominal differences were significant.”
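To make the definition concrete, here is a minimal sketch of how a repayment rate of this kind could be computed, and how leaving some non-repaying borrowers out of the count would push the rate up. The `Borrower` class, the toy cohort, and the way the bug is simulated are all assumptions for illustration; the Department's actual code and the exact mechanism of the error are not described in the announcement.

```python
from dataclasses import dataclass


@dataclass
class Borrower:
    defaulted: bool
    original_balance: float  # principal when entering repayment
    current_balance: float   # principal at the measurement point


def is_repaying(b: Borrower) -> bool:
    """Not in default and principal reduced by at least one dollar."""
    return (not b.defaulted) and (b.original_balance - b.current_balance >= 1.0)


def repayment_rate(cohort: list[Borrower]) -> float:
    """Share of the cohort that meets the repayment definition."""
    return sum(is_repaying(b) for b in cohort) / len(cohort)


# Made-up cohort of 10 borrowers (not real data).
cohort = (
    [Borrower(False, 10_000, 9_000)] * 5     # paying down principal
    + [Borrower(False, 10_000, 10_400)] * 3  # not in default, but balance grew
    + [Borrower(True, 10_000, 10_000)] * 2   # defaulted
)

correct = repayment_rate(cohort)

# Hypothetical bug: two of the three borrowers whose balance did not fall
# are dropped from the count, shrinking the denominator.
undercounted = cohort[:6] + cohort[8:]
inflated = repayment_rate(undercounted)

print(f"correct rate:  {correct:.1%}")
print(f"inflated rate: {inflated:.1%}")
```

In this toy cohort the same five repaying borrowers are divided by eight instead of ten, so the rate rises even though nobody's repayment behavior changed.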
As soon as I learned about the error, I started digging in to see how much it affected loan repayment rates. After both my trusty computer and I made a lot of noise trying to process the large files in a short period of time, I was able to come up with some top-level results. It turns out that the changes in loan repayment rates are very large. Three-year repayment rates fell from 61% to 41% (20 percentage points), five-year repayment rates fell from 61% to 47% (14 points), and seven-year repayment rates fell from 66% to 57% (9 points). These changes were quite similar across sectors.
Difference between corrected and previous loan repayment rates (pct).
Source: College Scorecard.
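For readers who want to restate the top-level numbers themselves, the short sketch below takes the old and corrected rates quoted above and reports each drop both in percentage points and relative to the old rate. The `change` helper is a hypothetical name introduced here, and the figures are only the rounded rates from this post, not the underlying Scorecard data.

```python
# Old and corrected repayment rates (percent), as quoted in this post.
rates = {
    "3-year": (61, 41),
    "5-year": (61, 47),
    "7-year": (66, 57),
}


def change(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point drop, drop as a percent of the old rate)."""
    point_drop = old - new
    relative_drop = 100 * point_drop / old
    return point_drop, relative_drop


for horizon, (old, new) in rates.items():
    points, relative = change(old, new)
    print(f"{horizon}: {old}% -> {new}% "
          f"({points} points, {relative:.1f}% relative drop)")
```

Reporting both measures matters because a 20-point drop from 61% means roughly a third of the borrowers previously counted as repaying no longer are.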
For those who wish to dig into individual colleges’ repayment rates, here is a spreadsheet of the new and old 3, 5, and 7-year repayment rates.
Fixing the coding error made a big difference in the percentage of students who are making at least some progress repaying their loans. (And ED’s announcement yesterday that it will create a public microdata file from the National Student Loan Data System will help make these errors less likely in the future, since researchers will be able to spot discrepancies.) This change is likely to get a lot of discussion in the coming days, particularly as the new Congress and the incoming Trump administration get ready to consider potential changes to the federal student loan system.