Abraham Lincoln once expressed a desire to maintain a government that was “of the people, by the people, for the people” in a time of civil war. What he didn’t say was that such a government has always been of the data, by the data, and sometimes for the data as well. Democratic governance has long been fundamentally data-driven. Representation in the US is subject to a constitutional requirement, established at inception, for an “actual census” of the population every 10 years: a census designed to ensure that the people are represented accurately, in their proper places, and in proportion to their relative numbers.
A full nationwide census is always a monumental task, but the most recent actual census faced unprecedented challenges. The 2020 census first had to overcome the Trump administration’s misguided attempt to add a citizenship question. Then it spent half a year in the field counting everyone during a pandemic that made it particularly difficult to knock on strangers’ doors. A series of devastating hurricanes and wildfires added to the challenge. And yet, in late April 2021, the professional staff of the US Census Bureau succeeded in fulfilling the Constitution’s mandate, releasing state-level population totals and translating them into an apportionment of the 435 seats of the US House and a corresponding number of votes in the electoral college. (The division was done automatically according to an algorithm called “equal proportions” or “Huntington-Hill,” which is required by law.) Now, as of last month, we have learned that some of those numbers were most likely wrong.
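For readers curious how the “equal proportions” division mentioned above works, here is a minimal sketch in Python. It assumes the standard textbook description of Huntington-Hill: every state starts with one seat, and each remaining seat goes to the state with the highest priority value, its population divided by the square root of n(n + 1), where n is the number of seats it already holds. The function name `apportion` and the three toy populations are hypothetical illustrations, not Census Bureau code or real counts.

```python
import heapq
from math import sqrt


def apportion(populations, seats):
    """Divide `seats` among states using the equal proportions method.

    `populations` maps a state name to its population. Illustrative
    sketch only; names and inputs here are made up for the example.
    """
    if seats < len(populations):
        raise ValueError("every state must receive at least one seat")

    # Each state is guaranteed one seat before any priorities are compared.
    allocation = {state: 1 for state in populations}

    # heapq is a min-heap, so priorities are stored as negatives.
    # A state holding n seats has priority population / sqrt(n * (n + 1)).
    heap = [(-pop / sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    # Hand out the remaining seats one at a time to the highest priority.
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        allocation[state] += 1
        n = allocation[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))

    return allocation


# Toy example with made-up populations and a five-seat chamber.
print(apportion({"A": 1000, "B": 500, "C": 100}, 5))
# → {'A': 3, 'B': 1, 'C': 1}
```

Because each seat is awarded by comparing priority values, a small change in one state's count can change which state has the higher priority for the final seat, which is why counting errors of the size discussed here matter.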
The Census Bureau’s Post-Enumeration Survey (PES) went back into the field, surveying a sample of people from across the country and comparing the new, more in-depth survey with the results of the census. From that comparison, the agency now estimates that the 2020 census overcounted the population in eight states and undercounted it in six. To give an idea of the magnitude of these errors, the PES reported with 90 percent confidence that New York State’s population was overcounted by somewhere between 400,000 and more than 1 million people, or 1.89 to 4.99 percent of the population. Given the circumstances of the census, such low error rates must be considered impressive, and yet such differences can have major repercussions when, since 1940, the last seat in the US House has been decided by as few as 89 people and never by more than 17,000. Much of the initial commentary on the PES results has focused on the horse-race implications of the errors, noting that more of the overcounted states were blue states, while more of the undercounted states were red. The mistakes, which apparently favor one side over the other, have even been labeled “a scandal,” and the census has been written off as “a failure.”
These are overreactions, and yet the question remains: What should we do about these small but statistically and politically significant errors?
This is a conundrum our nation’s leaders have struggled with since its inception. Over the course of the last century, two distinct approaches have dominated. One depends on funneling money and energy into mobilizing more census takers and into other system reforms that preemptively reduce errors. The other involves statisticians who have worked to develop techniques that can accurately measure errors and then make corrections to the counts. Both approaches remain important, yet the magnitude of the 2020 miscounts suggests that an older method of dealing with census errors needs to be revived: We need to expand the House and the electoral college so that few, if any, states lose representation in the face of an uncertain count. We should try to count better and fix the mistakes that we can, but our democracy will be more robust if we also lower the stakes of each count. Representation doesn’t have to be a zero-sum game.
The earliest known reference to a census undercount came from Thomas Jefferson, then secretary of state, who wrote in 1791 about the previous year’s census, the country’s first. Jefferson wrote to his correspondents in Europe, assuring them that the American population was a few percentage points larger than officially stated. It’s hard to say whether this was indeed the case, but the story makes it clear that concerns about omissions and undercounts began more than two centuries ago. In the decades that followed, disasters and administrative failures caused serious omissions, such as when the official in charge of counting the people of Alabama died in office before completing his work on the 1820 census, or when much of the data from California (including all of San Francisco County) burned after the 1850 census.
This post, “Democracy demands too much of its data,” was originally published at “https://www.wired.com/story/census-algorithm-politics/”