How secret is your secret ballot?
Posted: 2018-08-25T19:04:03
We rely on the secret ballot to prevent vote selling and voter intimidation, but the “secret” ballot is not very secret. In this post I will discuss problems that very few people know about or understand—problems that allow us to give ourselves away using the very choices we make!
Note: Originally posted as a three part series on 2008-06-19 22:41:52, 2008-06-26 20:15:26, and 2008-07-10 09:57:56. Reposted because it is election season! :-)
Pattern Voting
Pattern Voting (PV) occurs when there are enough choices on a ballot to allow voters to identify themselves using a predetermined voting pattern. Whether or not this is possible is a function of the the number of unique choices on the ballot, the number of voters, and how ballots are counted.
The simplest PV example is an election with one voter. That voter identifies her choices simply by voting, but more realistic scenarios are simple to construct. Consider an election with 10 voters and 3 races with 2 candidates each. Assuming a two-party system, let us say the choices for each race are the democrat (D), republican(R), or no vote (N). If voters follow the rules, this situation leads to the following 27 possible voting patterns:
DDD, DDR, DDN, DRD, DRR, DRN, DND, DNR, DNN,
RDD, RDR, RDN, RRD, RRR, RRN, RND, RNR, RNN,
NDD, NDR, NDN, NRD, NRR, NRN, NND, NNR, NNN
This is a permutation with repetition (33). To identify a voter, all that is necessary is to agree before the election on an unlikely voting combination. Up to 9 voters could vote for the same candidate in a select race using unique patterns between them.
As a coercer or vote buyer, all I need to do is give the voter a unique combination (e.g. DNR), and look for that pattern in the ballots during counting or whenever they become publicly available. The voter can either vote the way I told her, guaranteeing that unique pattern in the output, or vote the way she wants hoping the pattern will appear anyway.
The chance of the latter happening is pretty low given the number of voters. Assuming each voter votes randomly, there is less than a 30% ((1-(26/27)9), see the birthday paradox) chance that a random voter will share the same vote as the coerced voter.
The worst part about this situation is that what I gave above is a best case scenario. Chances decrease if the other voters do not vote randomly, are also being coerced, or do not follow the rules. Unless there’s a particularly bad or good candidate, the likely patterns are straight party (DDD or RRR).
Pattern Voting on a Real Ballot
To make this seem more real, I decided to take Maryland’s 2006 sample ballot I got and calculate the number of unique patterns you could make on it. Note that Maryland used DREs w/out VVPAT, so this is not directly applicable, but it does point out a potential problem when we switch back to optical scan.
There are 30 contests on this ballot. 16 of them have 2 options or 3 choices (yes/no/none), yielding 316 patterns. 3 of the races are “choose x” elections, for which the logic is explained in the next section. The rest of the races are detailed below (assuming voters follow the rules):
- Governor: 6 patterns
- Comptroller: 4
- AG: 4
- US Senator: 5
- District 3 Congressional Rep: 5
- State Senator: 4
- House of Delegates (vote for 2 of 6): C(6, 2)+C(6,1)+C(6,0) = 15+6+1 = 22
- County Executive: 4
- County Council District 1: 4
- Circuit Court Judge (vote for 4 of 8): C(8, 4)+C(8,3)+C(8,2)+C(8,1)+C(8,0) = 70 + 56 + 28 + 8 + 1 = 163
- States Attorney: 4
- Circuit Court Clerk: 4
- Judge of the Orphans Court (vote for 3 of 9): C(9,3)+C(9,2)+C(9,1)+C(9,0) = 84 + 36 + 9 + 1 = 130
- Sheriff: 4
To get the total number of patterns, we multiply it all together:
316 * 6 * 4 * 4 * 5 * 5 * 4 * 22 * 4 * 4 * 163 * 4 * 4 * 130 * 4 = 1.97271752x1020 = 197,271,752,498,675,712,000
There were only 5 to 6 million people in Maryland, and fewer in the county. Not all of these people were registered to vote. If you counted at each polling place, the numbers would be noticeably worse. Also remember that this is a conservative number. You could easily sell over half the ballot and have plenty of patterns left over!
Calculating Your Ballot’s Secrecy
It’s not too hard. Each race has a certain number of choices, and all you have to do is calculate these numbers and multiply them together. If you want to see the number of unique choices after targeting a specific race, for 1 choice election methods you remove that race from the multiplication. For rank choices, n out of m, or range/approval voting you simply remove the candidate you want to win from the calculation.
Below is a guide to help you figure out how many unique patterns appear on your ballot. n
is the number of candidates in the election, r
is the range or number of choices you can make.
- Choose 1: n+1
- Choose r: sum(C(n,r), 0, r) – this can express choose 1, too.
- Approval: 2^n
- Range: (r+1)^n – this can also express approval
- Ranked Choice: P(n,r) ==> n!/(n-r)!
Of course, this is assuming the voters follow the rules. Otherwise, the answer is 2^(number of dots) (because each dot can either be chosen or not). You can see wikipedia’s combinatorics page for more.
Fighting the Pattern Vote
The bad news is that few people pay attention to this problem, but the good news is that it can be mitigated. To defeat pattern voting, you have to reduce the number of choices that are associated with each other. Except for Ranked Choice, which is special, the key is treating each race separately, and in some election methods you need to treat each candidate separately. This is (sometimes) easier said than done.
In paper ballot systems you have a few choices. You could keep the ballots secret, and use only trusted counters (machines or people). You could have one ballot per race. You could also have a machine that cuts ballots after they are used. DREs w/ VVPAT would need a different mechanism than a paper rolltape to work. Because DREs w/out VVPAT can report results in aggregate, they avoid the PV problem.
As far as I know, every E2E system can handle PV, and some can handle PV with ranked choice. My colleague Stefan Popoveniuc wrote a paper about how this is accomplished in Punchscan and Scantegrity.
The problem with ranked choice is that you cannot hide the relationship between rankings. You need to know it to do the counting. In this scenario, the only choice for traditional systems is secret counting. Digital systems have the possibility of zero knowledge proofs to prove that the counting was correct, however.
Identifying Marks
Identifying Marks (IMs) occupy a middle ground because the voter may or may not knowingly be giving away his identity. Many states make IMs on ballots illegal but they rarely give a clear definition. In some cases it means serial numbers. In others it means writing outside of acceptable spaces. For our purposes, the definition of an IM is anything on the ballot that could potentially identify a voter after her ballot has been cast. As far as I know, only a small subset of such IMs could be considered illegal under most laws.
Simple Identifying Marks
Simple IMs are obvious and generally require voter complicity. These include marks that would generally be considered illegal under an IM law, such as arbitrarily signing your name or writing your address on the ballot.
Because they are legal, write-in candidate slots are the worst kind of simple IM. Voter’s can easily identify their ballots by voting for an agreed upon candidate. It might also be possible to identify voters through a handwriting recognition program (unlikely at this point, but possible in the future).
Serial numbers can also be an IM. If the voter knows the serial number, she can write it down and tell people what it is. This is easy to fix, however, by making the serial number unreadable to the voter, or adding said serial numbers after the voter casts her vote. Some places have serial numbers on ballots that are removed when casting a ballot.
Covert Identifying Marks With Voter Cooperation
There are endless possibilities for IMs when the voter cooperates. A voter could mark her ballot in a specific way. In an optical scan system the voter could make little flags on the circled choices. Since some people will do this accidentally (but not in a specific pattern), it is hard to detect. Some optical scan systems make voters draw an arrow, and a voter could do the same thing by drawing predictably crooked arrows.
Marking patterns are not the only way to make identifying marks. Voters could make creases in the paper. The coercer could give the voter a particular marking device (and it could be invisible except under blacklight). The other end of the pen used for marking could make a barely visible indentation in the paper. A particular colored grease could be put on the voter’s hands as they are using the ballot. The voter could write something on the other side of the ballot that is not checked by the scanner.
IMs without Voter Cooperation
As in the write-ins example, it is possible to identify voter’s choices without their knowledge. The attack I am most familiar with can be done with lever machines and grease. Levers for the candidates are marked with various colors of grease or ink. Voter’s who vote for those candidates must pull on the levers, and they will unwittingly get the grease on their hands. As the voter leaves the polling place, the attacker shakes her hand, and he can check the transfer to see how the voter voted.
The opposite of the grease attack is also possible. A voter could shake hands with the attacker before she votes, and the attacker could identify the ballot after the election by checking for grease. There’s also genetic material and finger prints on the ballot. A sophisticated attacker could scan all the ballots and identify voters if he knew their DNA or fingerprints (again, this is something that is probably not possible now, but might be in the future).
On absentee ballots, voters are required to sign the envelope that contains the ballot. Pressure on the envelope could transfer the signature to the ballot. Of course, if an attacker controls the receipt of the absentee ballots, he can get the identity anyway. Likewise, if an attacker has a poll worker on his side, the poll worker could put identifying marks on the ballot during casting time by helping the voter put the ballot into the ballot box.
Defeating IM Attacks
Unfortunately, there are no easy answers here for traditional paper systems, and as technology gets more powerful the situation gets worse. You can’t detect all IMs before casting without violating voter privacy, but you might be able to get a machine to do it in a limited way.
One way to prevent IM would be to create a machine that makes a pristine copy of the ballot and destroys the other copy. Only the valid marks would be transferred to the new copy, and any identifying marks would not. The problem here, though, is that voters might not always check the copy very carefully before casting their ballots.
As with PV, DREs mostly avoid this problem, because the voter doesn’t have the opportunity to make IMs. However, the logging might still make it possible, particularly if it records interaction with the machine (e.g. how the voter moves through the ballot, or when the voter marks and unmarks candidates). Even simply storing the choices in order could identify voter choices if you correlated it with poll book data, and I remember a story of this being done successfully in Ohio. You might also be able to do the grease attack, if you could make the grease undetectable. Surveillance is much easier on DREs, too.
E2E systems, again, do a great job solving these problems. That’s because the ballot you submit, the receipt, is public knowledge. That you put identifying information on it matters a lot less, because a copy is made without those marks and posted online then you walk out with what you used to vote.
Surveillance
The previous sections dealt with interface problems between the voter and a paper ballot, machine, or computer that records her vote. For this last segment, Surveillance, we discuss the ways the voter can be watched to determine her choices. Because the attacker or a device must be present to carry out these attacks, they are generally considered more expensive to carry out than what we have discussed so far.
Using the same strategy as seen in the previous segment, we will start with simple examples of this attack, move on to more elaborate examples, and end our discussion with how you could defend against these attacks. Again, as we’ve already seen, different flavors of these attacks may or may not require voter cooperation to work.
Simple Surveillance
The simplest paper ballot scenario is the following: the local union boss sits in the polling place. You flash your ballot to him as you take it from the booth to the ballot box or scanner. He checks your name off on his list.
Another, that works for DREs as well, is to take a cell phone picture or video of your ballot just before or as you are casting it. If the DRE has an audio interface, you may also be able to hook up an audio recorder and record your vote casting experience on tape.
Another class is the “over the shoulder” attack. The voter or poll workers may or may not have to cooperate for it to work. In some cases you may be able to succeed at a significant distance.
Hacking the Machine
The optical scanner or computer (or even lever machine), by definition, records voter choices. It could be modified to keep a serial record this input. The attacker can record the serialization to each voter by recording the order of who uses the machine, and retrieve the record after the election.
Because of the trail it would leave, this class of attacks is undesirable. However, our current testing practices and laws are such that this information might be public record, as seen in Ohio after the 2006 election.
Going High Tech
Mini wireless spy cameras sell for less than $100. That is well within the range of affordability. In addition, the relative predictability of how polling places are set up means the cameras could be there days before the election begins. A bag or pen equipped with this technology would have no problem recording voter choices.
The camera does not have to be limited to the visible light spectrum. An infrared or other kind of camera might be much easier to hide. In some cases, your body might not be enough to block its vision.
It may not even need to be a camera. Sensors or microphones in the polling booth might be enough to correlate voter choices. You can recover typed text using audio, it’s not a huge jump to do it for voting.
TEMPEST Attacks
A TEMPEST attack is one which records electronic emanations that reveal information being processed by the computer. A dutch group created a great video showing how this works. Take a look:
My favorite TEMPEST hack, from what I have seen, is an MP3 player for CRT monitors. Just tune your AM radio and enjoy.
Defeating Surveillance
In general, it’s an arms race. As technology progresses and becomes ever more affordable, the situation gets worse. Unless you can strip each voter and scan for optical eye and other types of implants, election officials will eventually lose.
The strategy here should be to drive up costs and take precautions. Make machines that meet the TEMPEST standards. Go to each polling place and do a scan for wireless emissions. Look for cameras and sensors when you set up the polling place. Do not allow voters to take cell phone cameras or bags into the voting booth. As long as it is prohibitively expensive, the laws are harsh, and there is the threat of being caught, it is hopefully not worth it.
Special thanks to my proof readers: Taral, Emily, Jeremy, Scott, and Ben.
Keywords: voting, security, privacy, steemstem