Thursday 12 February 2015

Statistics

It's more and more common for linguists of all types to use quantitative methods in their research. This used to be something that only certain people did, because of the nature of their method or subject matter. Now I increasingly get the feeling that those who don't are seen by some people as doing work that is somehow less valid. I'm still pretty firmly in the theoretical linguistics camp (which doesn't mean we don't use data, interestingly, just that it's not quantitative data). This means that my ability to wrangle statistical packages and interpret complex results is close to nonexistent, but even I could spot some clangers in a recent episode of More Or Less (a BBC World Service programme).

First, there was an item about the apparent rapid increase in antisemitic attacks. The organisation Campaign Against Antisemitism had carried out a survey which revealed that a worryingly high proportion of British Jewish people were concerned about their long-term future in this country. It's not in question that there is antisemitism to some extent, but the presenter, Tim, noted that it's hard to sample the Jewish population of this country in a fully representative way. In response to Tim asking the reasonable question 'How do you know your respondents aren't disproportionately worried about antisemitism?', the spokesman for CAA said 'If you look at the results, they represent a range of views'. Well. Maybe so, but I think it's quite obvious that you can't judge how representative your sample is just from the responses of that same sample, if you don't have anything to compare it to.

Then there was an item in which someone (I think a Manchester police spokesperson but I could be wrong) talked about 60 men found in canals over the last few years and put this high number of deaths down to an as yet unidentified killer. The programme's researchers looked into how many deaths from accidental drowning one might expect over a similar period. When this chap was told that one would expect 61 accidental drownings, he said this: 'You can't ignore the statistics - well if you want to ignore the statistics...' and went on to speculate further about these deaths being linked. But he is the one ignoring the statistics in this case, and speculating on the basis of misleading numbers: 60 deaths is no more than you would expect from accidents alone.
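To put a rough number on that, here is a minimal sketch of my own (not the programme's calculation). It assumes, purely for illustration, that accidental drownings over the period follow a Poisson distribution with a mean of 61, and asks how likely it is to see 60 or more deaths by accident alone. The Poisson model and the function names are my assumptions; only the figures 60 and 61 come from the programme.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson model (assumed here)."""
    # Work in logs so that large factorials do not overflow.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def prob_at_least(k, mean):
    """Probability of k or more events: one minus P(0), P(1), ..., P(k-1)."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))

expected_drownings = 61   # figure quoted by the programme's researchers
observed_deaths = 60      # figure quoted by the spokesperson

p = prob_at_least(observed_deaths, expected_drownings)
print(f"P(at least {observed_deaths} accidental deaths) = {p:.2f}")
```

Under that assumption the probability comes out at more than a half, so a count of 60 is entirely unremarkable and needs no killer to explain it.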

I find More Or Less and similar 'behind the numbers' programmes really interesting, because I'm fascinated by how easy it is to confuse ourselves and others with statistics. I remember one particular example from Bang Goes The Theory where Dr Yan demonstrated (with bacon sandwiches) how nearly everyone fails to spot that 'bacon increases your risk of bowel cancer by 20%' and 'bacon increases your risk of bowel cancer from 5% to 6%' are making exactly the same claim: the first gives the relative increase, the second the absolute change. We are apparently very bad at this kind of thing.
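The arithmetic behind the bacon example is short enough to check by hand, or with a few lines of Python. This is just my own sketch using the 5% and 6% figures quoted above; the variable and function names are made up for illustration.

```python
def relative_increase(baseline, new):
    """Change in risk expressed as a fraction of the baseline risk."""
    return (new - baseline) / baseline

baseline_risk = 0.05     # risk without the bacon, as quoted: 5%
risk_with_bacon = 0.06   # risk with the bacon, as quoted: 6%

rel = relative_increase(baseline_risk, risk_with_bacon)
abs_points = (risk_with_bacon - baseline_risk) * 100  # percentage points

print(f"Relative increase: {rel:.0%}")
print(f"Absolute increase: {abs_points:.0f} percentage point")
```

One percentage point on top of a 5% baseline is a fifth of the baseline, which is where the scarier-sounding 20% comes from.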
