Winter Bridge on Frontiers of Engineering
December 14, 2018 Volume 48 Issue 4

Human Factors of Democracy

Thursday, December 13, 2018

Authors: Guru Madhavan and Charles E. Phelps

Voting is a powerful instrument in the civic arsenal. The ballot is mightier than the bullet, Abraham Lincoln once observed, and Theodore Roosevelt likened votes to rifles. Individual expressions are transformed to public choices, thus “poll-vaulting” the fates and futures of societies. Taxpayers vote for their representatives in government. Shareholders vote for directors to oversee the business of corporations. People vote to decide who joins a club, who leads religious and social organizations, or what investments an association should make. Ubiquitous social media amplify the importance of judicious voting.

Society’s collective choices also shape the conduct and impact of the sciences and the development of related policies. Because the economic consequences and public accountability of science policy are often high and funds are limited, decisions almost always require priority setting. For example, public and private funds are typically granted for research after a competitive evaluation and scoring of proposals. Scientific prizes, medals, and honor society memberships are awarded based on committee decisions. Public health recommendations, weapons acquisition options, missions for space exploration, trade treaties, university rankings, and hiring decisions at all levels also rely on various choice mechanisms. All of these scenarios involve some form of voting to determine the outcome.

But different rules yield different results for the same choices available to voters.

Inconsistent “Rules” of Voting

The related literature on social choice has two “truths”: first, no voting rule is perfect (Arrow 1950; Balinski and Laraki 2011; Sen 2017); second, all known methods can be manipulated by strategic (versus sincere) voting (Gibbard 1973; Satterthwaite 1975). In addition, voting methods are poorly understood and often obscure (sometimes perhaps deliberately so). They vary greatly in comprehensibility, ease of use, and voters’ ability to express themselves, all of which can influence voter participation (Madhavan et al. 2017). Yet very few organizations give even momentary thought to their voting methods and their relevance.

The widely used Robert’s Rules of Order strives to reduce all complex decisions to pairwise choices. As is well understood, this approach offers numerous opportunities to predetermine outcomes. When voters confront three or more options, voting becomes much more complicated, and attempts to simplify it to a sequence of pairwise votes are rife with hazards.

Here, as is typically done in engineering design, testing, and refinement of products (Madhavan 2015; Norman 2013), we consider human factors to better understand the meaningfulness of voting methods from a user’s standpoint.

Choosing How to Choose

A principal challenge emerges from collective choices when more than two options exist. This relates to the impossibility of finding a voting rule that satisfies seemingly simple requirements, such as fairness, universality, efficiency, and no change in other pairwise rankings when a candidate is added or removed (Arrow 1950; Balinski and Laraki 2011).

Six Approaches to Selection

Most approaches use voters’ ranked (ordinal) preferences as the basic input, but other options include the following:

  • Vote for one: the winner either obtains a plurality of the votes cast or is determined by a runoff (this is the most commonly used tool worldwide).
  • Approval voting: voters vote for each candidate they endorse (i.e., anything from none to all of the choices); the priority order is determined by total approvals.
  • Rank order lists: voters individually produce ranked lists that can be combined through many approaches, each of which may yield different voting results.
  • Dotmocracy: voters distribute, say, 20 or 100 points across available candidates, with ranks determined by the sum of allocated points.
  • Range voting: voters assign a score from 0 to 99 to each candidate, with ranks determined by sums of scores.
  • Majority judgment: voters categorically grade each candidate using a standard vocabulary, as in academic grading (A = “excellent,” B = “very good,”… F = “unacceptable”); median values are used for final scores with lexicographic rules for breaking ties.

It is not always clear, however, that the scoring or ranking inputs have the same meaning for each voter. Without such a shared meaning, aggregating the inputs is meaningless.
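To make the tallying rules above concrete, the following is a minimal sketch of four of them applied to the same electorate. Everything in it is an assumption made for illustration: the ballots and option names are hypothetical, the approval threshold of 50 is arbitrary, and the Borda count stands in for just one of the many ways ranked lists can be combined. Dotmocracy is omitted, and majority judgment is not covered here.

```python
from collections import Counter

# Hypothetical electorate: seven voters score three made-up options from 0 to 99.
# The other ballot types are derived from these scores purely for illustration.
BALLOTS = (
    [{"A": 90, "B": 60, "C": 10}] * 3
    + [{"A": 5, "B": 80, "C": 70}] * 2
    + [{"A": 0, "B": 55, "C": 85}] * 2
)

def vote_for_one(ballots):
    """Plurality: each voter names only their top-scored option."""
    return Counter(max(b, key=b.get) for b in ballots)

def approval(ballots, threshold=50):
    """Approval: a voter endorses every option scored at or above the threshold."""
    tally = Counter()
    for b in ballots:
        tally.update(c for c, score in b.items() if score >= threshold)
    return tally

def borda(ballots):
    """Ranked lists combined by Borda count: last place earns 0 points, next 1, ..."""
    tally = Counter()
    for b in ballots:
        ranked = sorted(b, key=b.get, reverse=True)  # best option first
        for points, c in enumerate(reversed(ranked)):
            tally[c] += points
    return tally

def range_voting(ballots):
    """Range voting: sum the raw scores given to each option."""
    tally = Counter()
    for b in ballots:
        tally.update(b)  # adds each voter's scores to the running totals
    return tally

def winner(tally):
    """Option with the highest total (the example has no ties for first place)."""
    return max(tally, key=tally.get)

for rule in (vote_for_one, approval, borda, range_voting):
    print(rule.__name__, winner(rule(BALLOTS)))
```

With these made-up ballots, vote for one selects option A while the other three rules select option B, even though the voters and their underlying opinions are identical.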

Assessing Selection Methods

Most research on social choice presumes that a vote represents a statement of the satisfaction of the voter if different candidates win, and hence (following a long tradition in economics) limits such representations to ordinal rankings (Arrow 1950; Balinski and Laraki 2011).

In sharp contrast, majority judgment specifies a common language (words such as excellent, very good, good, fair, poor, unacceptable) that voters use to grade candidates. The grades refer not to the satisfaction of the voter but to the voter’s assessment of the merits of the candidates. Range voting seems to have a similar structure and, indeed, assumes that the 0–99 score range has a common meaning to all voters; however, this has not been tested, and is almost certainly invalid. Similarly, “approval voting” implicitly assumes that “approve” has the same meaning for each voter, which is also an untested presumption.
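A minimal sketch of how graded ballots might be aggregated under majority judgment follows. The grade vocabulary is taken from the text; the candidates and their grades are hypothetical. The tie-break shown, repeatedly removing the median grade and comparing the resulting sequences lexicographically, is one reading of the rule described by Balinski and Laraki (2011) and should be treated as an assumption rather than a definitive implementation.

```python
# Common grading language, ordered from worst to best.
GRADES = ["unacceptable", "poor", "fair", "good", "very good", "excellent"]
RANK = {grade: i for i, grade in enumerate(GRADES)}

def majority_value(grades):
    """Grade ranks in the order obtained by repeatedly removing the median."""
    remaining = sorted(RANK[g] for g in grades)
    value = []
    while remaining:
        # Lower-middle element, so an even number of voters takes the lower grade.
        value.append(remaining.pop((len(remaining) - 1) // 2))
    return value

def majority_grade(grades):
    """The median grade, i.e., the first entry of the majority value."""
    return GRADES[majority_value(grades)[0]]

def rank_candidates(ballots):
    """Order candidates from best to worst by lexicographic majority value."""
    return sorted(ballots, key=lambda c: majority_value(ballots[c]), reverse=True)

# Hypothetical grades from five voters for three made-up candidates.
ballots = {
    "X": ["excellent", "good", "good", "fair", "poor"],
    "Y": ["very good", "good", "good", "good", "unacceptable"],
    "Z": ["fair", "fair", "good", "poor", "excellent"],
}
for name in rank_candidates(ballots):
    print(name, majority_grade(ballots[name]))
```

In this example X and Y share the median grade "good", so the order between them is settled by the next grade uncovered once the median is removed.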

Table 1

Another important human factor concerns the richness of vocabulary that each method offers voters to express themselves, varying from minimal expressiveness to millions of “words.” Table 1 shows the number of expressions available using various methods for K candidates.

The standard US voting method (pick one candidate) has the least possible expressiveness. Approval voting is next to last, but notably better. Widely used rank-order lists have a richer vocabulary, but ranking limits the extent of expression since ties are typically not allowed and the gaps between ranks are not revealed; for example, candidates ranked 1 and 2 might be very close together or far apart, but the information is masked by the rank order. Grading methods (majority judgment) and point spread methods (dotmocracy) offer far richer vocabularies, and point scoring methods (range voting) offer nearly endless expressions. But it is not known how much these matter to the real electorate.
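The size of each method's vocabulary can be estimated by counting the distinct ballots a single voter could cast. The sketch below does this under stated assumptions that are not taken from Table 1: abstentions are ignored, rankings allow no ties, majority judgment uses six grades without + or – modifiers, dotmocracy requires all 20 points to be allocated in whole units, and range voting uses the 100 integer scores from 0 to 99.

```python
from math import comb, factorial

def ballot_counts(k, points=20, grades=6, score_levels=100):
    """Number of distinct ballots one voter can cast on k candidates."""
    return {
        "vote for one": k,                           # name exactly one candidate
        "approval voting": 2 ** k,                   # approve or not, per candidate
        "rank order": factorial(k),                  # strict orderings, no ties
        "majority judgment": grades ** k,            # one grade per candidate
        "dotmocracy": comb(points + k - 1, k - 1),   # ways to split the points
        "range voting": score_levels ** k,           # one score per candidate
    }

for method, count in ballot_counts(6).items():
    print(f"{method}: {count:,}")
```

For six candidates these assumptions give 6, 64, 720, 46,656, 53,130, and 10^12 possible ballots, respectively. The ordering depends on the number of candidates and on the parameters chosen; with only three candidates, for instance, approval voting allows 8 ballots and strict ranking only 6.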

It is important to consider how comprehensible various methods are to the general voter. One of the few merits of the US system is that most voters can easily describe the standard voting processes (majority, plurality, or runoff election), although few likely understand the full consequences of their use. At the other extreme, some proposed voting methods require advanced mathematics to explain. Since two thirds of US adults have at most a high school education, it is unlikely that they can understand voting methods that require higher-level mathematics, and voters are likely not to trust or participate in elections using voting methods that they do not understand.

Standard methods for evaluating voting rules test them against a set of criteria. A voting method fails if any distribution of votes creates a violation of the criteria (Balinski and Laraki 2011), but many of these mathematical violations rapidly diminish or disappear as the number of voters increases. A human factors approach considers how often and under what circumstances violations occur, rather than simply demonstrating the possibility of a violation.
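One way to ask "how often" rather than "whether" is simulation. The sketch below estimates how frequently one well-known anomaly occurs: the absence of any option that beats every other option head-to-head. It assumes every voter's ranking is drawn uniformly at random, a modeling convenience that is not drawn from the article and that real electorates need not resemble; the function names, trial count, and seed are likewise illustrative.

```python
import random
from itertools import combinations

def condorcet_winner(rankings):
    """Return the option that wins every pairwise contest, or None if none does."""
    options = rankings[0]
    wins = {c: 0 for c in options}
    for a, b in combinations(options, 2):
        # Count the voters who rank a ahead of b.
        a_over_b = sum(r.index(a) < r.index(b) for r in rankings)
        if 2 * a_over_b > len(rankings):
            wins[a] += 1
        elif 2 * a_over_b < len(rankings):
            wins[b] += 1
    for c, w in wins.items():
        if w == len(options) - 1:
            return c
    return None

def no_winner_rate(n_voters, n_options=3, trials=2000, seed=0):
    """Share of random electorates in which no option wins every pairwise contest."""
    rng = random.Random(seed)
    options = list(range(n_options))
    misses = 0
    for _ in range(trials):
        # Each voter's ranking is a uniformly random ordering (an assumption).
        rankings = [rng.sample(options, n_options) for _ in range(n_voters)]
        if condorcet_winner(rankings) is None:
            misses += 1
    return misses / trials

for n in (3, 11, 101):
    print(n, no_winner_rate(n))
```

The same loop could be pointed at other criteria, or fed ballots from actual elections instead of random ones, which would speak more directly to the circumstances under which violations occur.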

The Human Element: Examples of Impacts

An Exercise in Funding Prioritization

In a small study that we believe is unique for this line of inquiry, we asked respondents to use six different voting methods to evaluate a set of choices that were professionally relevant to them. We compared the results, focusing on how the participants evaluated the voting methods.

Our 21 respondents were scholars and professionals attending our symposium at an international conference on health outcomes research and technology assessment.[1] We asked them to act as advisors to a national health system and to choose from the following six options for investing public funds: (1) new medical imaging capability (MRI, CT, PET scanners); (2) intensified research on a drug to delay the onset of Alzheimer’s disease; (3) interventions to reduce obesity and smoking; (4) development, testing, and transition to market of a gene-based therapy for Huntington’s disease (which affects 1 per 10,000 people); (5) accelerated research and development on a vaccine to prevent Ebola; and (6) increased hiring and improved retention of clinicians and nurses (by 10 percent) to enhance everybody’s access to care.

Although synthetic, these options reflected real-world constraints and were accompanied by an estimate (from 20 to 55 percent) of available funds that would be consumed by a given option. Participants were told that available funds would cover only a portion of the choices (all six together would consume 150 to 220 percent of the available budget), so prioritization was essential. The six voting methods tested (vote for one, approval voting, rank order, dotmocracy, range voting, and majority judgment, the last allowing but not requiring + or – designations on letter grades) were described in detail; some were familiar to the participants, others novel.

The results confirmed that different voting methods create different priority lists when used by the same people. All six voting methods selected the same two options as the least favorite (a cure for Huntington’s and a vaccine for Ebola), but they differed considerably in prioritizing the remaining four, so each method would lead to different funding decisions if actually employed in practice. Two of the methods—vote for one and range voting—appeared to produce more outlier results than the other four methods.

Of more interest and consequence is the participants’ feedback on the voting methods themselves. We asked them to evaluate the methods on four dimensions: (1) ease of use, (2) expressiveness, (3) enjoyableness, and (4) likelihood of future use in professional settings. The preliminary results are telling. Of the 21 participants, 15 disliked (on several dimensions) the vote for one method and, considerably more than any other option, would prefer not to use it in the future. With less unanimity, participants found majority judgment and dotmocracy the most expressive and to some extent the most enjoyable. They moderately preferred rank ordering for future use, noting its ease of use, which their comments indicated was due to their greater familiarity with the method.

Many of our participants had never encountered approval voting, range voting, dotmocracy, or majority judgment. Perhaps with more familiarity and experience people might actually prefer dotmocracy or majority judgment, the methods our participants often identified as most expressive and enjoyable for use.

Honorific Society Election Processes

In addition to this experiment, we reviewed the member election processes of leading honorific societies as described on their websites, including procedures for selecting Nobel prizewinners. We observed two distinct features of the selection processes:

  • They invariably involve an initial narrowing of the candidate pool in several steps—none of which are clearly described and all of which appear to involve considerable subjectivity.
  • They commonly involve a final majority vote on the candidates by a defined body of voters, but nothing describes what they actually vote on. It may be an approval voting process (which could lead to multiple winners, as commonly occurs with the Nobel prizes), or approval of a slate created by a committee, or some other mechanism.

These preliminary steps of “elimination”—also common in the review of grant proposals—require further scrutiny to fully understand how important scientific funding decisions are made.

Making the Right Choice

A question that animates the scientific community is how to improve the selection processes that lead to federal and other grant awards. Since divergent views and disagreements cannot be eliminated, it is necessary to determine how best to consolidate the differing opinions of reviewers and decision makers into meaningful group choices.

First, reviewers need a common language to describe their views. While some may believe that numeric scoring systems provide this language, such systems can mean quite different things to different reviewers. For instance, does a score of 1 (or 9) on a scale mean the best (or worst) of the current group of proposals, or proposals that reviewers have ever seen, or that they can imagine? Some may even confuse whether 1 or 9 is best. If reviewers apply different meanings for these scores, averaging them has no more meaning than averaging Fahrenheit and Celsius temperature readings.

Strategic manipulation is another source of incongruity: reviewers shift scores depending on how various proposals relate to their own (preferred) line of research or worldview. This practice has been extensively analyzed in the scoring of athletic events (such as figure skating and gymnastics) that, like scientific reviews, require subjective judgment (Balinski and Laraki 2011). The opportunity for strategic manipulation looms large in competitive grant review processes, even with blinded reviews.

Ultimately the only way to assess the value of various voting methods is to test them head-to-head in various voting situations. Only with such direct comparisons is it possible to ascertain how often the methods agree or disagree in rankings and resulting decisions, which seem better suited to various settings, and which create the greatest sense of trust among the participants.

Vox Populi

Political, scientific, business, and social institutions rely on voting methods that appear to fail on important human factor dimensions. The most widely used method—voting for one candidate—is the most constraining, offers the lowest possible expressiveness, is the least informative, and was the most disliked by our respondents.

Although more experiments and field testing are necessary to develop this area of research, it would be beneficial to improve public familiarity—and comfort level—with diverse voting methods and how they affect outcomes across settings. The use of tools of engineering design can not only enable more expressive public opinion for important decisions but also lead the way toward scientifically informed choice systems, thereby improving public policy decisions and, ultimately, democracy.

Acknowledgment

We are grateful to Cameron Fletcher, whose comments and feedback improved this article.

References

Arrow KJ. 1950. A difficulty in the concept of social welfare. Journal of Political Economy 58:328–346.

Balinski M, Laraki R. 2011. Majority Judgment: Measuring, Ranking, and Electing. Cambridge MA: MIT Press.

Gibbard A. 1973. Manipulation of voting schemes: A general result. Econometrica 41:587–601.

Madhavan G. 2015. Applied Minds: How Engineers Think. New York: W.W. Norton & Company.

Madhavan G, Phelps C, Rappuoli R. 2017. Compare voting systems to improve them. Nature 541:151–153.

Norman D. 2013. The Design of Everyday Things (rev ed). New York: Basic Books.

Satterthwaite M. 1975. Strategy-proofness and Arrow’s conditions: Existence and correspondence theorems for voting procedures and social welfare functions. Journal of Economic Theory 10:187–217.

Sen A. 2017. Collective Choice and Social Welfare (expanded ed). Cambridge MA: Harvard University Press.


[1]  The 20th Annual European Congress of the International Society for Pharmacoeconomics and Outcomes Research, Glasgow, November 4–8, 2017. The focus of the conference was the evolution of value in health care.

About the Authors: Guru Madhavan is a senior program officer with the National Academies of Sciences, Engineering, and Medicine. Charles Phelps (NAM) is University Professor and provost emeritus of the University of Rochester.