Saturday, 25 August 2012

Thursday, 23 August 2012

Spoffles, flanges and pobbling

Words get into the language in lots of funny ways. Most of them have been there so long we don't know, or we just know that we brought them with us when we left Germany. Lots of them were borrowed, from Latin and French in the case of English. Lots of words we make ourselves, with our word-making tools like suffixes (brightness) and compounding (carwash). My favourite words are the ones that someone makes up and that then catch on.

Quiz was supposedly an early example of this, when (according to anecdote) a Dublin man had a bet that he could introduce a new word into the language. Sadly, as with so many such anecdotes, it's probably bunk, as it was in use already by the alleged time of this bet. Still, there are some words that we know were invented this way, as the people who did it either did so on film or are around to tell us.

Spoffle, quite apart from being a great word, is a useful one, if you ever have anything to do with microphones, as I sometimes do. It's this thing, in front of Stephen Fry:

It's the foamy cover for a mic that prevents 'popping' and cuts down wind noise. Essential item, apparently unnamed (people seem to use the very dull microphone cover) until Hugh Laurie called it a spoffle (in Stephen Fry's hearing, which is partly why he's in the photo above). This is what I always call it, and lots of others do too. At my old job it was the common name for such things. But, astonishingly, it's not in the OED and it doesn't have a Wikipedia article. Apparently it used to but it was deleted, and now it's not even mentioned in the section on microphone covers.

Two more words that came out of British comedy are the collective nouns for baboons, a flange, and for gorillas, a whoop. Flange has caught on remarkably well and, although it too doesn't get into the OED (they're so cautious, those lexicographers), it's used by people, including (allegedly) academics (although I have not read the papers or books in which it's used). Here's the sketch, with Mel Smith and Rowan Atkinson talking to Pamela Stephenson:

Literature has been a source of some good words. Lewis Carroll invented the word chortle as a blend of chuckle and snort, although not many of his more creative efforts from Jabberwocky have caught on. This OED blog post talks about Edward Lear, who made up a lot of words (runcible spoon being a particularly well-known example) and also used obscure real words. It has a charming anecdote about the blog-writer's doctor friend who was certain that pobble meant 'to amputate toes', until the blog-writer introduced her to the rhyme The Pobble Who Has No Toes. Pobble's Bay also seems to be a place in Wales, which does rather remind one of The Meaning of Liff by Douglas Adams and John Lloyd (who created QI, presented by Stephen Fry - we have come full circle so it's time to stop waffling).

Monday, 20 August 2012

Pussy Riot and swearing in the media

The other day, Ben Zimmer at Language Log covered the way that Pussy Riot (the name of a Russian punk band currently in the news) can be rendered into Russian (the band is Russian, but the name is in English). Then Arnold Zwicky blogged about the way that the New York Times is coping with having to write naughty words in its articles. He takes this Guardian blog post as a jumping-off point.

I was wondering about this myself. The thing is that it's a sort of pun, or at least a double entendre. It was the whole point of Mrs Slocombe in 'Are You Being Served?':

So that means that the word itself can be said quite freely without causing offence, provided that it means 'cat' rather than... its other meaning. After all, here it was being said on a pre-watershed sitcom as long ago as the 1970s. But as Zwicky notes, the 'cat' meaning is pretty much non-existent these days. We all know it means that, but it's rarely used to mean 'cat' (perhaps because of its rude meaning). Furthermore, the band presumably mean it to have its naughty connotations, as they're a punk band and that's what punk bands do. They have a handy get-out by being able to say that it's simply a name about cats, which is apparently what they told the police it meant. So if the word is ambiguous, and has these two separate meanings, does it mean the rude one if that's what its authors intended it to mean? Does it mean 'female genitalia' here? Or is it a word that can mean that, but not necessarily? I don't know. I kind of feel like it does.

Anyway, according to the blog posts mentioned above, the US papers are struggling a bit and basically not banning it, but trying not to mention it more than absolutely necessary, and definitely not in headlines. From what I can tell, in the UK, the media are more than happy to use the name in the papers, on the radio and on TV (and everywhere else). This means that the lovely lunchtime newsreaders have to say 'pussy' quite a lot, and are essentially using rude words in the news.

Wednesday, 15 August 2012

Functional and lexical words

Words are split up into two major classes, which we can call functional/grammatical and lexical/content words.

Functional, or grammatical, words are the ones whose meaning is hard to define, but which have some grammatical function in the sentence. The, for instance. What does it mean? Well, that's hard to say. But its function is easy: it's the definite article. It makes things definite (says that you're talking about a particular instance of whatever follows). Or could: its meaning is hard to describe, but its function is clear. Prepositions like on or at or if are also functional. Functional words are a closed class, which means we can't add new ones very easily. Try and remember the last time you heard a new preposition or article.

Lexical words, however, do have meaning: cat and armchair and toilet-brush and velociraptor all have clear meanings that you could describe to someone. They're also all nouns, which is one type of lexical word. Verbs can be lexical too, like fly, arrange and steal. Lexical words are open class, and we can make up new ones willy-nilly, by all the different word-formation rules we can muster. You can probably invent a new word now: just noun a verb or add -ify to something.

Did you know all that already? Maybe you did. If you think you didn't, well, I'm here to tell you that your subconscious knowledge of language includes this.

In Jabberwocky, Lewis Carroll invents a whole stack of nonsense words. But every single one of them is a lexical word. If they weren't, you wouldn't be able to understand the poem the way you can.
I've emboldened the nonsense words in the first verse here, and they are all easily interpretable by the reader as nouns, adjectives and verbs (which are all generally lexical). The English words are all function words.
'Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
All mimsy were the borogoves,
  And the mome raths outgrabe.
We can understand it because although we don't know what the unfamiliar words mean, we can recognise them for the right part of speech. We can tell that slithy is an adjective and toves is a noun. How? Well, there's a definite article just before that phrase, and we know that in English, a definite article comes at the start of a noun phrase. We expect it to occur with a noun, which we expect to find at the end of the phrase, and if there's another word in there we expect it to be an adjective and to come before the noun. We also know that nouns can have a plural -s ending, like tove-s, and that adjectives can have a -y ending, like slith-y.
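The reasoning above can be mimicked, very crudely, in a few lines of Python. This is only a toy sketch: the list of function words is hand-picked from the verse, the two suffix rules are the ones just mentioned (-y for adjectives, -s for plural nouns), and the name guess_category is made up for illustration. Real part-of-speech tagging is far more involved.

```python
# Toy illustration only: a hand-picked list of the function words
# that appear in the first verse (an assumption of this sketch).
FUNCTION_WORDS = {"twas", "and", "the", "did", "in", "all", "were"}

def guess_category(word):
    """Guess a rough category for one word using two suffix rules."""
    w = word.lower().strip("',.:;!?")
    if w in FUNCTION_WORDS:
        return "function"
    if w.endswith("y"):
        return "adjective?"   # like slith-y
    if w.endswith("s"):
        return "noun?"        # like tove-s
    return "lexical (unknown)"

line = "Twas brillig, and the slithy toves"
for word in line.split():
    print(word, "->", guess_category(word))
```

Notice that the sketch only gets anywhere because it already knows the function words; take those away and the suffix rules alone would have nothing to hang on to.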

The function words are what give us the sentence structure, so if you turn them into nonsense syllables, you're left with no clue. You've just got a string of words, some of which you know the lexical meaning of, but no idea how they fit together.

If you're still not sure that your brain knows this stuff, here's proof. I was watching an episode of slightly naff 00s game show 'Win, Lose or Draw Late' (presented by Liza Tarbuck) the other day (don't judge me, I like game shows, OK?). Paul Tonkinson, who had to draw the book title Life of Pi, began by drawing lines to signify the words in the title. He drew something like this:
____ __ ____
Now, if he was representing the length of the words, he would have drawn the last one the same length as the middle one. But he didn't; it's clearly much longer. He is not representing the length of the words but rather their status as functional versus content words.

Tuesday, 14 August 2012

I got a job!

Part of the reason why I was so busy a couple of weeks ago is that I had a job interview, and I got it! I'll be starting at the University of Kent in September, for ten months. I'll be teaching semantics, morphology and research skills.

I'm really excited. It's going to be weird leaving Newcastle after being a student here for 8 years and a person here since 1993, but Canterbury seems like a really nice place and the department very friendly. And I'm thrilled to have got a job, with a proper salary, so quickly.

But all this means that my PhD thesis really needs to get a wiggle on and start writing itself. I'd been expecting it to do it before now but perhaps it's waiting till the very last moment. In the meantime, I'm helping it along. I'm up to the last chapter of substantial writing/finishing, and then will be able to go back and do the fiddly bits. I'm hoping to get the majority done before I go, otherwise there'll be some very late nights till I submit.

Here's a screenshot, in case I wake up in the night and need proof:

Monday, 6 August 2012

Replace all

I've just finished reading the Hunger Games trilogy by Suzanne Collins. It was all right, not great literature but a fun enough read. Odd typos in the last book though.

One thing was that a space was very often missing between sentences. I have no explanation for that. Another thing was three misspellings, all related. The first was evapourate. The second was elabourate. The third was labouratory. In each case, the error is an extra u where there shouldn't be one. Can you see why?
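One guess at the mechanism, which the title of this post hints at but which is not confirmed anywhere: someone converting American spellings to British ones with a blind 'replace all'. Here is a minimal Python sketch of how that would go wrong. The substitution table and the function names are invented for illustration.

```python
import re

# Hypothetical US-to-UK substitution table, invented for this sketch.
US_TO_UK = {"labor": "labour", "vapor": "vapour"}

def blind_replace_all(text):
    """Replace every occurrence, even inside longer words."""
    for us, uk in US_TO_UK.items():
        text = text.replace(us, uk)
    return text

def whole_word_replace_all(text):
    """Replace only when the US spelling stands alone as a word."""
    for us, uk in US_TO_UK.items():
        text = re.sub(rf"\b{us}\b", uk, text)
    return text

sample = "evaporate elaborate laboratory labor"
print(blind_replace_all(sample))
print(whole_word_replace_all(sample))
```

The blind version produces exactly the three misspellings above, while the whole-word version only touches labor on its own.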

Friday, 3 August 2012

Search for nouns

Some people have noticed that the search bar on Facebook looks like this, and suggested an improvement:

They think it should read 'Search for nouns'. 

I'm not having a go at these people; the OED defines 'noun' as follows:
A word used as the name or designation of a person, place, or thing.

Wikipedia sensibly adds or idea to its definition, however, as liberty and disappointment are both nouns, and further down expands this to: 
person, place, thing, event, substance, quality, quantity, or idea, etc.
Because Wikipedia is often written by experts, it's a lot more detailed than the OED and goes on to tell us that this definition is not very useful, on account of it being a bit fuzzy and not telling us much, and that it's better to have a formal definition based on the properties of the elements we class as nouns.

One way to do this is to use such characteristics as 'can occur with a determiner like a or the', or 'can take plural inflection -s'. These characteristics are often not cross-linguistic, but can help to identify nouns if you know the rules for the languages you're interested in. In beginning syntax classes I hand out a cheat sheet with just such tests. 

As I think I've mentioned before, the distinguishing characteristic of nouns that linguists generally use is whether they behave like nouns. Does that sound circular? What I mean is, the name 'noun' is used to refer to a class of words that all behave in the same way. They can be the subject of a sentence, for instance, or the object of a verb or preposition. These are grammatical characteristics. The 'people, places and things' definition is a semantic definition: it describes what nouns mean or refer to.

We can use both to identify nouns, but the semantic definition is not appropriate to denote what Facebook wishes you to search for; it does not think you should search in that box for love, peace and understanding; it will not bring you success (or at least, not as an abstract noun).

Wednesday, 1 August 2012

Lost 'lost ''lost' sign' sign' sign

A wonderful example of centre embedding from a ridiculously silly blog, via my friend Valdemar:

The image shows a lost sign, and the lost thing that it's advertising is another lost sign. And the thing that sign was advertising (before it got lost) was a lost sign... and so on. 
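The 'and so on' in that description is a recursion, and it can be written down as one. This is just a sketch in Python; the function name lost_sign and the optional quote marks are my own choices for illustration, not anything from the poster.

```python
def lost_sign(depth, quoted=False):
    """Build the poster's phrase with `depth` layers of embedding."""
    if depth == 0:
        return "lost sign"
    # The smaller phrase sits in the middle, with words from the
    # bigger phrase ("lost" ... "sign") on either side of it.
    inner = lost_sign(depth - 1, quoted)
    if quoted:
        inner = f"'{inner}'"
    return f"lost {inner} sign"

for depth in range(3):
    print(lost_sign(depth).capitalize())
```

Each call wraps the previous phrase in a new 'lost ... sign' pair, so the lost items pile up at the front and the signs pile up at the back.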

It's called centre embedding because, unsurprisingly, it means embedding a phrase in the centre of another one. By 'in the centre' we don't mean that it's precisely central, but rather that words from the higher-up phrase are on both sides of the embedded one. Here's an example of the more common type of embedding we find in English:
I really hate people [who don't think of others]
The bracketed part is a relative clause, which means that it tells you more about people, and it's embedded in the main clause. It's at the end, which is nice and easy to understand. We can go on for a surprisingly long time like this:
This is the farmer sowing his corn
That kept the cock that crowed in the morn
That waked the priest all shaven and shorn
That married the man all tattered and torn
That kissed the maiden all forlorn
That milked the cow with the crumpled horn
That tossed the dog that worried the cat
That killed the rat that ate the malt
That lay in the house that Jack built!
Every line in that rhyme is a new embedded clause, but we can keep track of it all and it's not terribly remarkable. We actually do it quite a lot in normal speech. This example, which inspired Language Log's Trent Reznor Prize for Tricky Embedding, contains a whole stack of embedded clauses and other stuff but is completely understandable, and was produced in natural speech in an interview:
"When I look at people that I would like to feel have been a mentor or an inspiring kind of archetype of what I'd love to see my career eventually be mentioned as a footnote for in the same paragraph, it would be, like, Bowie."
The thing with centre embedding is that it is totally grammatical (it does not break any of the rules of English, by which I mean the rules that speakers intuitively know and that cannot be broken, rather than the prescriptive rules that we all break in our everyday speech) but not acceptable (i.e. speakers don't say things like this and, if asked, don't think they are good sentences at all). This is very different from most other grammatical puzzles that we (linguists) have, which are far more often of the type 'this is ungrammatical in most dialects but some speakers produce it - why?' or 'this theory predicts this to be ungrammatical but it's not, because it occurs in language X - why?'.

It's really striking how quickly examples of centre embedding get impossible to parse (work out the grammar of). In the poster, we can of course easily understand the phrase with no embedding at all:
Lost sign
But then even just one layer of embedding, equivalent to I hate people who don't think of others, is a bit hard to work out:
Lost lost sign sign
And then when you get just one more, it's too hard:
Lost lost lost sign sign sign
The quotation marks help a bit here, but not much, and that's obviously no good in spoken language. This example is obviously designed for humour, and some examples are easier to work out than others. Wikipedia (yeah, I'm being lazy today - I've got a PhD to write) cites this example of double embedding, attributing it to De Roeck et al. (1982):
Isn't it true [that example-sentences [that people [that you know] produce] are more likely to be accepted]?
The double-embedded part that might cause trouble is the that people that you know produce part, but here it's not too difficult, perhaps because we're used to hearing know+verb constructions. But the Wikipedia page also says (summarising Karlsson 2007) that three is the maximum degree of embedding in written language, and even two is vanishingly rare in spoken language. It gives this example of super-tricky centre embedding, where the first one (with one level of embedding, and not centre embedding) is fine, but adding just one centre-embedded clause makes it incredibly difficult to parse:
A man [that a woman loves]

A man [that a woman [that a child knows] loves]
It means a man who is loved by a woman, who in turn is known by a child. But you try working that out while you're in full conversational flow. The explanation is supposed to be, basically, that while we're super-good at keeping track of relations and actions, we're really, really bad at keeping track of a whole load of subjects without linking them to their predicates (what they did).
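To see how the subjects stack up before any of their verbs arrive, here is a small Python sketch that builds phrases like the two above. The function name centre_embed and the way the nouns and verbs are passed in are assumptions made for this illustration, not anything from the sources cited.

```python
def centre_embed(nouns, verbs):
    """Nest relative clauses: verbs[i] is what nouns[i + 1] does to nouns[i]."""
    if len(verbs) != len(nouns) - 1:
        raise ValueError("need exactly one verb fewer than nouns")
    if len(nouns) == 1:
        return nouns[0]
    # The rest of the clause is built first and dropped into the middle,
    # so this clause's own verb has to wait until the very end.
    inner = centre_embed(nouns[1:], verbs[1:])
    return f"{nouns[0]} [that {inner} {verbs[0]}]"

print(centre_embed(["A man", "a woman"], ["loves"]))
print(centre_embed(["A man", "a woman", "a child"], ["loves", "knows"]))
```

All the nouns come out first and the verbs come out in reverse order afterwards, which is exactly the 'whole load of subjects without their predicates' problem.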

Finally, this completely incomprehensible paragraph from SpecGram

An apparently new speech disorder a linguistics department our correspondent visited was affected by has appeared. Those affected our correspondent a local grad student called could hardly understand apparently still speak fluently. The cause experts the LSA sent investigate remains elusive. Frighteningly, linguists linguists linguists sent examined are highly contagious. Physicians neurologists psychologists other linguists called for help called for help called for help didn’t help either. The disorder experts reporters SpecGram sent consulted investigated apparently is a case of pathological center embedding.