A and an and innateness

I tutor students on writing skills in order to not starve to death in a garret while I write my PhD. Most of the time they have problems with punctuation, argumentation, structure and so on. Most of the time I patiently explain how to do it right, and then they have to work on improving. Sometimes that's hard, like if they have difficulty in writing critically, and they have to really practise. Sometimes it's simple: honestly, the number of times I've explained how to use a semi-colon and the student has said 'Oh, that's quite easy, isn't it? I wish someone had told me that before'.

Every now and then, a student has a problem and I can tell them, 'You already know how to do this; just do it the way you say it'. I love it when that happens. The trouble is, students with poor grammar are used to not trusting their own knowledge. They say I seen it happen, but they must write I saw it happen. They say We don't know the reasons for this but they must write We do not know the reasons for this. They say /ðɛə/ and /ðɛə/ but write there and their. It's a minefield out there.

So I do understand when a student isn't sure about using a and an. It's a simple enough rule, on the face of it: use a before consonants and an before vowels. But then what about words like university or hour? And abbreviations like MP and USB? If you follow the rule, you end up with something that conflicts with what you feel it should be. Here, the key is to trust your instincts. Of course, the rule works off the sound of the word, not the spelling. So university sounds like you- and hour sounds like our. MP sounds like empee and USB sounds like you-ess-bee.
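For anyone who thinks better in code, here is a minimal sketch of the difference between the spelling version of the rule and the sound version. The little table of first sounds is hand-written by me for these few words (an assumption for illustration, not taken from any pronunciation dictionary), and the function names are made up.

```python
# Hypothetical mini-table: the first SOUND of each word, in rough IPA.
# Hand-written for illustration; a real version would need a pronunciation dictionary.
FIRST_SOUND = {
    "university": "j",   # sounds like "you-"
    "hour": "aʊ",        # sounds like "our"
    "MP": "ɛ",           # sounds like "empee"
    "USB": "j",          # sounds like "you-ess-bee"
    "apple": "æ",
    "banana": "b",
}

# The vowel sounds that occur in the table above (not a full inventory).
VOWEL_SOUNDS = {"aʊ", "ɛ", "æ"}


def article_by_spelling(word):
    """The naive rule: look at the first letter."""
    return "an" if word[0].lower() in "aeiou" else "a"


def article_by_sound(word):
    """The rule speakers actually follow: look at the first sound."""
    return "an" if FIRST_SOUND[word] in VOWEL_SOUNDS else "a"


for word in FIRST_SOUND:
    print(word, "| spelling:", article_by_spelling(word), "| sound:", article_by_sound(word))
```

The two functions agree on apple and banana and disagree on exactly the four troublesome words, which is the whole point: the sound version is the one that matches what you'd say.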

But the clever thing is that I could say to that student, just do it the way you say it and you will always be right. That's quite exciting. A native speaker of English, once they're no longer a very young child, will never ever get that wrong (apart from slips of the tongue and the like - but that's different). They will never be unsure which they should use. It's obviously not innate, as it's a language-specific rule, and children do say Can I have a apple, but I can't recall if it's something you're explicitly taught. I have a feeling it's not. But it is an example of the worry caused by learning that there are rules and that you're probably not doing it right.

Implicit goodness

Consider the following exchange:
A: Morning.
B: Glad you think so.
(I heard this said on TV; perhaps on Last of the summer wine.)

There is so much pragmatic implicature in here it's outrageous. B, by saying glad you think so (with stress on you), implies that he does not agree. But what is it that he does not agree with? Morning is not something you can agree or disagree with; it's not a proposition and as such has no propositional content and no truth-value. And it can't be interpreted as meaning that 'morning is here', or 'it is morning' - that's definitely not what he's disagreeing with.

Obviously, Morning is a truncated form of the greeting Good morning. The good is understood. This kind of shortening is something we do a lot, because it's a set phrase, said often, and we know what's meant even if half of it is missing. But in itself, this is not something you can (dis)agree with either: good morning is also not a proposition.

What we really mean when we say Good morning (or Morning) is I wish a good morning to you. We are expressing our hope that our interlocutor has a pleasant day. But again, you can't disagree with what someone hopes - it really isn't your place to do so.

B, in the exchange above, has correctly inserted the implicit Good, but then wilfully misinterpreted Good morning to be a shortened form of It is a good morning. That is a proposition, and can certainly have a truth value - it is either true that it is a good morning, or it is false.

If English still had a case system, incidentally, this kind of deliberate reanalysis couldn't happen. The phrase Good morning would be accusative in the speaker's intended interpretation, but nominative in the hearer's reanalysis. So presumably those languages that do have overt case lack this kind of hilarious repartee.

tl;dr

I learnt what that means just recently. I've seen it loads and couldn't be bothered to find out. Fittingly, it turns out to mean 'too long; didn't read', for use on the internet when someone posts something that just looks too long and boring to bother reading. Is it the only initialism (or even abbreviation) with a semi-colon in it?

Grammar... sort of.

Ok, so like, I am not a prescriptivist, blah blah blah, but I do tell people how to use apostrophes for a living, so I found this momentarily funny. 
But then I found it annoying because that ain't grammar. It's punctuation, and perhaps also spelling. If it was grammar, people who make this mistake would be genuinely mixing up
CP[ shit[ADJ]] (a clause with a second person subject, the copula and an adjectival complement)
NP[2sg.poss shit[NOUN]] (a noun phrase consisting of a possessive second person determiner and a noun).
Methinks they are not. They just don't know how to spell the right version of two homophonous but otherwise distinct forms.

That's how to kill a joke.

Greenberg's diversity index

Well, this is interesting. It's an article from the Economist, showing language diversity in several countries. It gives the probability of two people selected at random from any one country speaking different languages. So in Papua New Guinea, two people are almost certain to speak different languages, whereas in North Korea, they will definitely both speak the same language.

The UK isn't on this list, but the fuller list at Ethnologue gives us a probability value of 0.133, so around the same as Mexico or Australia. However, we have many fewer indigenous languages spoken than either of them (Ethnologue says 12) so it looks like we must have more speakers of our minority languages than they do.
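If you want to see where a number like that comes from, the index is usually computed as one minus the sum of the squared proportions of speakers of each language: that is, the chance that two people picked at random have different first languages. Here is a minimal sketch, assuming each person is counted under exactly one language and the population is large enough that picking the same person twice doesn't matter. The function name and all the figures are invented for illustration; none of them are real census or Ethnologue data.

```python
def greenberg_diversity(speaker_counts):
    """Chance that two randomly chosen people have different first languages.

    speaker_counts: number (or share) of speakers for each language.
    """
    total = sum(speaker_counts)
    proportions = [count / total for count in speaker_counts]
    return 1.0 - sum(p * p for p in proportions)


# Invented populations, purely to show how the index behaves.
one_language = greenberg_diversity([100])            # everyone shares a language
four_equal = greenberg_diversity([25, 25, 25, 25])   # four equal-sized groups
one_dominant = greenberg_diversity([93, 4, 3])       # one big language, two small

print(one_language, four_equal, round(one_dominant, 4))
```

With these made-up figures, a country where 93% of people share one language comes out at roughly 0.13, so a value in that region mostly tells you how dominant the biggest language is, not how many languages there are.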

No word for wool in Tagalog

I was listening to The unbelievable truth the other day, which is a BBC radio programme in which contestants have to deliver a lecture made up of mostly lies, but with some truths slipped in. The trick is to hide the truths so that your opponents think they are lies (and meanwhile they're trying to spot the truths). It's quite enlightening.

In the episode I was listening to, one 'fact' offered was that there is no word for 'wool' in Tagalog (spoken in the Philippines). The other contestants all twitched, wondering if they should go for their buzzers. No word for wool! It could so easily be true! There's no way of knowing, of course, if you don't speak Tagalog. Do they have sheep in the Philippines? If not, they might not have wool, and therefore no word for it! Or maybe they just don't have one word that just means wool, they might have a word that encompasses wool and cotton! Or maybe it's a phrase, not a word, like sheep's hair!

No, it was a lie, I'm afraid. Tagalog for 'wool' is lana.

I've got a couple of questions

To me, a couple means two. If I say I'd like a couple of poppadoms, I want two poppadoms. It can sometimes mean about two rather than exactly two - I might for instance say that I met a friend a couple of days ago, and not be unduly put out if I realised that it was actually three days ago. And equally, if I ask for a couple of chips and you give me three or four, I'm not going to complain or pull you up on your failure to carry out my request. If, however, you gave me eight chips, then I'd have to stop you and tell you I only wanted a couple. It means two, though there's some leeway for a bit more (not less though - a couple can never ever mean one).

But it doesn't necessarily mean two for everyone. As far as I can see, there are some people for whom it always means two, and exactly two, and those people are to be found all over the internet referring to their outdated rule books and invoking etymological arguments. You can ignore those people. There are also people who, similar to me, perhaps, mean around two but maybe three or four. And there are some people who mean three, or four, or a few, or more than two anyway.

I had an encounter when I told someone of this persuasion (with an Irish dialect) that I had a couple of questions. I asked precisely two questions, and after she answered them she waited patiently for the next. When I said that was it, she said Oh, you meant two when you said a couple. For her, a couple definitely meant more than two.

On the TV programme Two and a half men (I like it, OK? It makes me laugh) Charlie Sheen responds to another character saying I only had a couple of [something - drinks, maybe?] with the line,
Jack the Ripper only killed a couple of prostitutes.
Jack the Ripper killed at least five women (and they were still women, by the way, despite being prostitutes). I wonder if Charlie sees a couple as always being more than two, or 2-5, or if he was deliberately expanding the meaning of the phrase for the sake of the joke.

Correlations in linguistic data

Geoff Pullum at Language Log recently, and reluctantly (because it's not yet published), commented on a paper by a Yale economist, Keith Chen. In this paper, Chen argues that if your language has a grammatical future tense marker, you are less likely to save money, live healthily etc because the future seems like some other time, not to be worried about now. If your language uses present tense to refer to the future, you treat it as an extension of the present and you'll be much more sensible about it. Pullum is guardedly sceptical about these claims, for reasons which you can read about yourself.

He is also sceptical about this kind of claim (made based on correlations found in large amounts of data), saying:
I also worry that it is too easy to find correlations of this kind, and we don't have any idea just how easy until a concerted effort has been made to show that the spurious ones are not supportable. For example, if we took "has (vs. does not have) pharyngeal consonants", or "uses (vs. does not use) close front rounded vowels", would we find correlations there too? I have some colleagues here at the University of Edinburgh, within Simon Kirby's research group, who have run some informal experiments on the data Chen uses to see if dredging up spurious correlations of this kind is easy or hard, and so far they have found it jaw-droppingly easy.
He doesn't comment further on these experiments, but it reminded me of the talk Martin Haspelmath gave when our university's linguistics research centre opened a few years ago, and he told us about the World Atlas of Language Structures (WALS). After telling us what a wonderful, useful tool it is (and it is, I've found it invaluable), he ended on a note of caution. It's easy, he said, to find false correlations. For example, you can show a map of languages which have a different word for hand and arm or use the same word for both. That map shows that the languages that don't distinguish are, broadly speaking, around the warmer areas of the globe (yellow dots) and the ones that do distinguish are in colder areas (red dots):
(Map from WALS, feature 129A)
Now might one not hypothesise, asked Haspelmath, that this language fact is due to the climate? In colder countries the distinction is important, in that one wears items of clothing that cover only the hands (gloves), or sleeves that come down to the wrist. In warm countries, sleeves are not so long and gloves are not worn, so a separate word for hands never becomes necessary. A far-fetched example, but a lesson in not putting too much faith in correlations.
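To get a feel for how easy it is, here is a small simulation. It is my own toy sketch, not the experiment Pullum's colleagues ran and not anything done on Chen's data or on WALS. It invents 30 'languages', gives each a random 'outcome' score, then generates 1,000 yes/no 'features' by coin toss and counts how many correlate with the outcome at |r| of 0.4 or more. Every number in it (30, 1,000, 0.4, the random seed) is an arbitrary assumption.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either list has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the run is repeatable
N_LANGUAGES = 30         # arbitrary
N_FEATURES = 1000        # arbitrary
THRESHOLD = 0.4          # arbitrary cut-off for a "strong" correlation

# A made-up outcome (think "savings rate") with no connection to anything.
outcome = [rng.gauss(0, 1) for _ in range(N_LANGUAGES)]

# Made-up binary features (think "has pharyngeal consonants"), assigned by coin toss.
correlations = []
for _ in range(N_FEATURES):
    feature = [rng.randint(0, 1) for _ in range(N_LANGUAGES)]
    correlations.append(pearson(feature, outcome))

strong = [r for r in correlations if abs(r) >= THRESHOLD]
print(len(strong), "of", N_FEATURES, "random features look strongly correlated")
```

Nothing here has anything to do with anything else, by construction, and yet some of the coin-toss features will clear the bar purely because there are so many of them and so few languages.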

100 posts and no grumblems

This is my 100th post. I thought, what can I do to commemorate this momentous occasion? I'll have to post something brilliant, original, thought-provoking and inspiring. So here it is. As seen on a review website when I was deciding whether to put my broadband life in the hands of Plusnet...

No grumblems!

Please, adopt it, use it, spread the coinage. It's a fantastically useful word, conveying exactly the sort of thing that happens when companies do things wrong and get you annoyed. And conversely, as used here, to express the fact that nothing has gone wrong, they're doing their job, and you have no complaints. Lovely nuanced word, I think you'll agree.

Chocolate bar gives evidence for syntactic constraint

I bought myself a Cadbury's Boost (other chocolate bars are available, though none are as good if you want to ingest maximum calories). It describes itself thus:
2 x milk chocolate with caramel and biscuit filling bars
(It was a Boost Duo, OK? Don't judge me.)

Does that sound at all odd to you? Grammatically, I mean; it obviously sounds perfect in terms of content, though it does sell itself short, in my opinion. It's so much more than just 'caramel and biscuit'. But syntactically, it's really awkward.

Happily, the reason for its awkwardness can be attributed to the constraint that I work on. It's called the Final-Over-Final Constraint (FOFC), and it's about what phrases can be combined in what order. It specifically states that one order is not allowed, and it's the one instantiated in the description of my delicious chocolatey snack (although it's so many calories, it might have to be my delicious chocolatey tea).

This is the (partial) structure, as far as I can work it out:
[[milk chocolate [with caramel and biscuit filling]] bars]
Bars is the head of the phrase. It's at the end, as you can see. This means that it's a head-final phrase. FOFC states (basically) that a head-final phrase should not immediately 'dominate' (i.e. have as its immediate constituent) a head-initial phrase (that would be one where the head is at the start). The phrase milk chocolate with caramel and biscuit filling is just such a head-initial phrase, with chocolate as its head (we're not going to talk about milk now - it doesn't affect the argument). So we have precisely the relationship that FOFC doesn't like.
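For the programmers, here is a rough sketch of the simplified version of FOFC given above: flag any head-final phrase that has a head-initial phrase as an immediate constituent. The class and function names are made up, the tree only models phrases (not individual words), and the real constraint is stated more carefully than this, so treat it as an illustration and not as a definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phrase:
    label: str
    head_final: bool                      # True if the head comes last
    children: List["Phrase"] = field(default_factory=list)  # phrasal constituents only


def fofc_violations(phrase):
    """Return (parent, child) labels where a head-final phrase
    immediately dominates a head-initial phrase."""
    found = []
    for child in phrase.children:
        if phrase.head_final and not child.head_final:
            found.append((phrase.label, child.label))
        found.extend(fofc_violations(child))
    return found


# The Boost description, following the bracketing above.
pp = Phrase("with caramel and biscuit filling", head_final=False)
chocolate = Phrase("milk chocolate with ...", head_final=False, children=[pp])
boost = Phrase("... bars", head_final=True, children=[chocolate])

# A plain compound for comparison: head-final, but with nothing phrasal inside it.
plain = Phrase("chocolate bars", head_final=True)

print(fofc_violations(boost))
print(fofc_violations(plain))
```

The check picks out the bars phrase sitting on top of the chocolate phrase and nothing else: the PP inside the chocolate phrase is head-initial inside head-initial, which is fine.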

This type of construction is sometimes found: the quick-off-the-mark athlete, his out-of-the-blue question. These are marginal, for most people, and it's not a very productive construction: *a happy in his job employee is not good at all. The ones that are accepted are often taken to contain a lexicalised or Spelt-Out element - that is, the first part is not interpreted as having any internal structure, so any structural constraints don't apply to its parts, only to it as a whole, as if it was one word. The Boost description, I think we can agree, is definitely compositional (that is, it's built by the syntax, not interpreted as a single unit), so that explanation doesn't hold and we get a decidedly dodgy bundle of words.

In fact, this is such a mangled piece of syntax that there are at least two other reasons why this is bad. It leaves us hanging a long time before we get to the head, which we English speakers are none too keen on, and furthermore, the PP with caramel and biscuit filling modifies bars, not milk chocolate, so it should follow bars. Why it's where it is at all is beyond me. So we don't need FOFC to write this off as a bad job. But as we have FOFC for other reasons anyway, we can add it to the long list of Things Cadbury's Has Bungled.

(P.S. You may have noticed that Final-Over-Final Constraint violates itself. Its originators are quite proud of this, although it was accidental - the observation is attributed to Gertjan Postma. They note that all the best generalisations do so.)

Hong Kong dollar

Here in the Northeast of England (and in some other British dialects), it's common to hear phrases like ten year or twenty pound, with nouns quantified by a numeral lacking the plural morpheme you'd find in Standard English. It's one of the features of the Geordie dialect I quite like, though I don't do it myself. (Although you'll notice that the colloquial terms quid, nicker and so on tend to be singular - you don't ask someone to lend you twenty quids or twenty nickers. In fact, you'd get some decidedly funny looks if you did ask for that, and possibly a bumper pack of underwear.)

Andrew Graham Dixon presented a Culture Show special called Cash in China's Attic on Friday night, about the booming antiques market in China. He kept on telling us the price of various things, and sometimes he said It's thirty thousand Hong Kong dollars, but sometimes he said It's thirty thousand Hong Kong dollar. He seemed to alternate between using the plural and singular forms. He's not a Geordie, or a speaker of any dialect that has this feature, as far as I know. Nor is it the case that dollar is always singular for this currency, with his plurals being L1 influence from English: a Chinese man used the plural form.

What I think was going on is that he had an interesting case of assimilation to the dialect of many of the people he would have met in Hong Kong. Chinese (Cantonese or any of its other varieties) doesn't have much in the way of inflection, and it doesn't have plural morphology. This means that, although people from Hong Kong mostly seem to speak excellent English, they sometimes make mistakes like not using plural forms. So a lot of antiques dealers and shopkeepers probably told Andrew Graham Dixon the price of things by saying It's one thousand dollar. Either he's just got used to it and started saying it the way everyone else does, or he's assumed that that's the way it's meant to be said and done it consciously. I don't know which, but it sounded decidedly odd.