There have been a number of polls released on support for political parties ahead of the 2019 national and provincial elections. The Institute of Race Relations (IRR), for one, has conducted two over the past three months, and there will be more to come as election day approaches.
Market research of this kind is often misunderstood and poorly reported on. It might be helpful, then, to generate a rough guide as to the strengths and weaknesses of polls, so that those people interested in them can get a better idea of how to read them, what their limitations are, and what value can be drawn from them. These are some general observations.
There are three broad points to make upfront.
The first thing to understand is that any poll can only tell you what it can tell you. They are not crystal balls; they cannot predict the future and no poll, ever, can tell you with 100% certainty what the current situation is. Every poll has a set of limitations built into it. That said, within those limitations, if properly conducted, a poll can be incredibly helpful. But extend its significance beyond those parameters and, naturally, it becomes less reliable.
So, how does a polling company ensure a poll is as accurate as possible? Let’s use voting intention (or support for political parties) as an example.
To be 100% accurate, you would need to capture the honest view of every single person who intends to vote on election day. There is a name for that: it is called an election — the ultimate political poll.
Given that you cannot conduct an election every time you want to survey public opinion, the question becomes how close can you come to being as accurate as possible by using a sample of the general group?
Representivity
The key to accuracy is randomness. To avoid bias, you must ensure every member of the general population you are measuring has an equal chance of being selected into the sample (that is, the pool of people to be questioned) that you wish to be representative of the general group. This is important for one reason above all others: randomness ensures representivity.
Representivity is a big, all-consuming concept. You want your sample to be representative of the general population. That covers everything: representative in terms of attitudes, demographics, geography — you name it. And the only way to ensure you achieve that is to ensure your sample is randomly selected.
How polling companies go about doing this is complex. The thing to look for is how well any given company is able to ensure that its sample is as random as possible.
There are many natural constraints (someone selected might not want to take the survey, for example). The better you are able to minimise these constraints on randomness, the better the poll.
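To make the mechanics concrete, here is a minimal sketch of random selection, assuming a hypothetical voters’ roll (the list and sample size are invented for illustration):

```python
import random

# Hypothetical sampling frame: in practice this would be a full
# voters' roll, not a generated list of placeholder names.
voters_roll = [f"voter_{i}" for i in range(100_000)]

# random.sample draws without replacement, giving every voter an
# equal chance of selection -- the randomness that underpins
# representivity.
sample = random.sample(voters_roll, k=400)

print(len(sample))  # 400 randomly selected respondents
```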
Like for like
The second thing to understand is that if any poll wants to tell you something about a general population, the sample needs to be drawn from that specific population. So it is like drawn from like.
For example, if you want to test what all voters think, your sample should only comprise voters. If you ask people who are not registered to vote what they think, together with registered voters, the view of non-registered voters might skew the results. This matters because, on election day, their view will not be recorded — obviously, only registered voters can vote.
Confidence and margins of error
The third and final broad point to understand is that, once you understand the limitations of any given poll, there are two different kinds of information you can glean from it.
Every poll has one natural limitation that applies to the information we can read from it: a margin of error. Various things affect the margin of error, the biggest of which is sample size. Remarkably, it does not matter how big the general population is: a sample size of about 400 will give you a margin of error of about 5% or 6%, if the poll is random and the methodology sound.
Just to put that in context, because it is a mathematical marvel, if you were able to generate a random sample of 400 people in China (population 1.3-billion), you could generate findings with about a 5% margin of error for the whole of China.
The confidence level (typically about 95% for a good poll) indicates how certain a polling company is that its findings fall within the margin of error. So, when you read: “The margin of error is 5% with a confidence level of 95%,” the polling company is saying that it believes that, 95% of the time, its findings will vary by no more than 5% in either direction from reality.
One could understand confidence level another way: a 95% confidence level means, if the same poll was conducted 100 times, the relevant company believes the findings in 95 of them would fall within the margin of error. For the other five, the findings might fall outside it.
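That second reading can be tested with a simple simulation. The sketch below (an illustration, not any polling company’s method) assumes a hypothetical population in which true support is exactly 50%, “polls” it 1,000 times with samples of 1,000, and counts how often the estimate lands within the margin of error:

```python
import random

TRUE_SUPPORT = 0.50  # assumed true support in the population
SAMPLE_SIZE = 1_000
MARGIN = 0.03        # roughly the margin of error at n = 1,000
RUNS = 1_000

within = 0
for _ in range(RUNS):
    # Simulate one poll: each respondent backs the party with
    # probability TRUE_SUPPORT.
    hits = sum(random.random() < TRUE_SUPPORT for _ in range(SAMPLE_SIZE))
    if abs(hits / SAMPLE_SIZE - TRUE_SUPPORT) <= MARGIN:
        within += 1

# Roughly 95 in every 100 simulated polls land within the margin.
print(f"{100 * within / RUNS:.0f}% of polls within 3 points of reality")
```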
As you grow your sample size, the margin of error tends to drop (the confidence level stays pretty much the same if the poll is conducted properly). At about 1,000 respondents, the margin of error is generally about 3%. At 3,500 respondents, it is about 1.5%. After that, the drop in the margin of error becomes so incrementally small that it is really not worth the financial cost of boosting your sample any further.
This range then, from around 1,000 respondents to 3,500, is the Goldilocks zone, so to speak, and most polls will fall within this bracket.
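The arithmetic behind those numbers is the standard margin-of-error formula at a 95% confidence level. The sketch below (an approximation that assumes the worst-case 50/50 split, not any company’s proprietary method) reproduces the ballpark figures quoted above and shows the diminishing returns of ever-larger samples:

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a random sample of size n.

    p = 0.5 is the worst case (largest margin); z = 1.96 is the
    standard score for a 95% confidence level.
    """
    return z * sqrt(p * (1 - p) / n)

for n in (400, 1_000, 3_500, 10_000):
    print(f"n = {n:>6}: margin of error = {margin_of_error(n):.1%}")

# n =    400: margin of error = 4.9%
# n =   1000: margin of error = 3.1%
# n =   3500: margin of error = 1.7%
# n =  10000: margin of error = 1.0%
```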
How does this help you read a poll? Here is an example: you have a simple question, put to a random, representative sample of 1,200 registered voters, with a margin of error of 3% and a confidence level of 95%. The question is: “If an election were held tomorrow, which party would you vote for: party X or party Y?” The answer you get is that party X comes out with 64% and party Y with 36%.
What this tells us is that party X has a clear majority. The gap between it and party Y is 28 percentage points, far greater than the margin of error (3%). Party X’s true support could be up to 3 points lower and, because there are only two options, party Y’s up to 3 points higher, so you can be 95% sure that, in reality, party X enjoys at the very least a 22 percentage point advantage over its competitor, party Y. That is one way of reading information: a direct interpretation that tells you something specific about a particular issue.
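Here is a sketch of that arithmetic (the figures are the example’s, not real polling data):

```python
MARGIN = 3.0                   # margin of error, in percentage points
party_x, party_y = 64.0, 36.0  # the poll's findings, in percent

observed_gap = party_x - party_y  # 28 points

# Worst case at 95% confidence: party X's true support is 3 points
# lower and, because there are only two options, party Y's is 3
# points higher, so the gap narrows from both sides.
minimum_gap = (party_x - MARGIN) - (party_y + MARGIN)

print(f"Observed gap: {observed_gap:.0f} points")  # 28 points
print(f"Minimum gap:  {minimum_gap:.0f} points")   # 22 points
```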
Trends
Polls are also able to reveal trends and patterns, over both a range of questions and over time. This is a more indirect way of reading polls.
Using the example above, by looking at other questions in the poll (party favourability, for example) you could cross-check the trend identified in the voting intention question. If the poll finds that favourability towards party X is also in the 60% range, and favourability towards party Y in the 30% range, this reinforces the accuracy of the voting intention finding, because the same attitudes underpin both sets of answers.
Likewise, if the poll in question is carried out more than once, say every six months or so, it becomes possible to compare results, which show trends over time. In turn, that would allow you to contextualise any given result. And context can add insight and understanding.
A great many other things need to be done to protect the integrity of a poll. These are less obvious when the findings are presented but no less important. For example, you need to ensure the people conducting the survey are professional (that they record information accurately; that they are able to converse in the first language of the respondent, so that nothing is lost in translation; and that they rotate questions, so that you don’t subconsciously create a patterned response in the way questions are interpreted). These, and a great many other small details, are essential to ensuring any poll’s credibility.
So, what are some of the problems inherent to bad polling?
Artificial population
First, there can be the mistaken perception that an artificial population represents the general population. This is the primary problem with Twitter polls, and the belief that they are meaningful is fairly widespread in SA. But Twitter polls are in no way, shape or form representative of the South African voting population.
People on Twitter (about 10% of the South African voting population) are a very specific sub-sample. They don’t share the same demographic characteristics as the general population, or the same views, and cannot, under any circumstances, be used to accurately gauge voter sentiment. There are myriad other problems: they are not all registered voters, they are not all over 18, they can vote multiple times if they have more than one account, and they tend to exist in bubbles, surrounded by people who reflect their own worldview.
Twitter polls are fun and entertaining, but not serious. They help generate debate and gauge sentiment in small, unrepresentative and segmented online communities, but that is all.
The media
Second, the media is generally very bad at reporting on market research. It often doesn’t report the methodology, or misreports it. This makes it harder for the reader to tell what they can and cannot take from a poll (if you don’t know the margin of error, you cannot properly gauge a finding). In the pursuit of sensation, the media also often turns a description of the status quo into a prediction of the future (fuelling distrust in polling), and sometimes conflates different findings.
Third, there have been some significant polling failures in recent times, most notably in the US (Donald Trump) and the UK (Brexit). Although, to be fair, some of this confusion was exacerbated by the press, who failed to properly explain the limits of some of those polls (there were some polls on both issues that were very accurate). Nevertheless, this is obviously a problem.
But these failures are actually no different in scale and seriousness from many of the other component parts of public debate and analysis. Many news stories get key facts wrong. Many opinion writers offer flawed analysis. The fact is, readers of market research should apply the same standards to polls as they do to anything they read: examine the integrity of the evidence and logic presented and, based on its veracity, arrive at an opinion. Anyone who takes anything at face value, without thinking, deserves what they get if it proves to be wrong or flawed.
Fourth, the public and politicians are also often to blame. People are desperate for “truth”, absolute and unarguable. And so, because market research is a scientific endeavour, defined as it is by much talk of methodology and scientific jargon, findings are often exaggerated and misrepresented to win an argument or make a point.
No-one likes information that must be presented with care and context if it is to be properly understood. That requires time, space and caveats, and few would seem to have the inclination for that in the modern age. The desire is for sound bites and certainty. And, with platforms such as Twitter and Facebook, which spread misinformation like wildfire, any poll can quickly be turned into something it is not, its findings distorted and warped.
The bottom line is that a good poll that is properly conducted, random, representative and clearly presented to the public can hold invaluable insights. The better the public and the media become at reading polls, and at learning what they can and cannot deduce from them, the more informed they will be.
Not every poll will be a good poll, just as not every political speech or news story is well argued or grounded in evidence. To distinguish the good from the bad, the public and the media need to actively educate themselves on this front. If they do, the rewards will be great.
• Van Onselen is the head of politics and governance at the South African Institute of Race Relations.