Tuesday, December 24, 2013

Has Amazon lost its way?

We've leant fairly heavily on Amazon for the last few years. We aren't happy about their approach to corporate tax - though it's inevitable that a corporation will do what it's entitled to do to avoid paying more tax than necessary.

Incidentally, I think there's a way to fix this. At the moment, corporation tax is paid on profits. If a corporation with the ability to declare income in different countries were instead to pay a lower rate of corporation tax plus a small sales tax (of, say, 1-2%), I'd have thought this wouldn't discourage international expansion, but it would mean the tax revenue was created in the country the money was earned in. This isn't a fully thought-out idea - but I'm sure that there's something that could be worked on there.

Back to Amazon. Their sales operation has become more and more complex. They now offer fulfilment on behalf of other companies (where they hold and ship warehouse stock), they offer a sales front end for companies, and doubtless provide other mechanisms as well. The scale of the operation is such that they are probably using just about every shipping company that there is. Sometimes, this works incredibly well. For example, DPD are amazing. We get an email beforehand giving a one hour window in which the delivery will take place, and alternative delivery options.

But sometimes - perhaps it's my imagination, but it seems increasingly common - it doesn't work so well. This year, we've had a bunch of stuff that should have arrived next day turn up three or four days later. Several of the delivery companies don't have a tracking process, let alone one as good as DPD's. One item, specifically ordered for Christmas delivery, was postponed on 23rd December to the 27th. Another item was picked wrongly - somewhat incongruously, we got the French version of something.

None of these are that big a deal. I've little doubt that Amazon UK could argue that these are the faults of its contractors. But in previous years, I had the feeling that Amazon had all this stuff tied down, whereas now these things seem to slip under their radar. Customer service is still exemplary - informative, polite and efficient. But when you find yourself needing customer service more regularly, you can't help but think that problems may be surfacing.

Tuesday, December 10, 2013

Wikipedia on Stephen Meyer

"Darwin's Doubt" is the latest book by Stephen Meyer. In it, he explains why the Cambrian explosion is a problem to a purely naturalistic understanding of evolution.

In the Wikipedia article on Meyer, this criticism is included:
In a review published by The Skeptics Society titled Stephen Meyer's Fumbling Bumbling Amateur Cambrian Follies,[40] paleontologist Donald Prothero points out the number of errors, cherry-picking, misinterpretation and misinformation in Meyer's book. The center of Meyer's argument for intelligent design, Cambrian Explosion, has been deemed an outdated concept after recent decades of fossil discovery. 'Cambrian diversification' is a more consensual term now used in paleontology to describe the 80 million year time frame where the fossil record shows the gradual and stepwise emergence of more and more complicated animal life, just as predicted in Darwin's evolution. Prothero explains that the early Cambrian period is divided into three stages: Nemakit-Daldynian, Tommotian and Atdabanian. Meyer ignores the first two stages and the fossil discoveries from these two periods, instead he focuses on the later Atdabanian stage to present the impression that all Cambrian life forms appeared abruptly without predecessors. To further counter Meyer's argument that the Atdabanian period is too short for evolution process to take place, Prothero cites paleontologist B.S. Lieberman that the rates of evolution during the 'Cambrian explosion' are typical of any adaptive radiation in life's history. He quotes another prominent paleontologist Andrew Knoll that '20 million years is a long time for organisms that produce a new generation every year or two' without the need to invoke any unknown processes. Going through a list of topics in modern evolutionary biology Meyer used to bolster his idea in the book, Prothero asserts that Meyer, not a paleontologist nor a molecular biologist, does not understand these scientific disciplines, therefore he misinterprets, distorts and confuses the data, all for the purpose of promoting the 'God of the gaps' argument: 'anything that is currently not easily explained by science is automatically attributed to supernatural causes', i.e. intelligent design.
To this, I added the following comment:

However, Bethell, writing in a review in The American Spectator, points out that the criticisms Prothero raises demonstrate that he hadn't read the sections of the book where Meyer had already addressed them.

Hardly a big deal, or an attempt to close down the debate, one would have thought. It did not, however, get past the gate-keepers (or at least one of them), who deleted it, commenting:
The referenced article written by Bethell is not a critical review of the book, rather an advocacy of Meyer’s philosophy. Given Bethell’s history of supporting fringe science (AIDS denialism, man-made global warming denial and intelligent design), I view his article as biased and should not be included per WP:NPOV Giving "equal validity".
So, basically, the review might be flawed, but criticism of the review is not allowed because the reviewer doesn't disagree with the author, and has some controversial opinions. Personally, I'd have thought that Wikipedia's NPOV (neutral point of view) policy ought to mean that neither side of the debate is favoured - and yet it seems that regardless of the negative review's provenance, it is allowed to stand unchallenged.

Unfortunately, there are many more naturalist gate-keepers than me, and this is a pointless battle for me to embark on. I do look forward to the day when the discussion of these books becomes focussed on science, rather than what seems to amount to political attempts to suppress fair consideration. But I'm not holding my breath.

In the meantime, if you are interested in an interesting, thought-provoking book on why the appearance of substantial amounts of biological information challenges a naturalistic understanding of evolution, and also some insight into the Kitzmiller vs Dover and Sternberg cases, then I'd recommend Meyer's book. Unfortunately, given how quickly any dissent to naturalism is struck from Wikipedia's pages, I can't really recommend those links ....

Saturday, December 07, 2013

Whilst we're on the subject of maths ...

Another thing that was never mentioned (as far as I can remember) was the interesting phenomenon in the multiplication square - you know, this thing ...

  X   1   2   3   4   5   6 ...
  1   1   2   3   4   5   6
  2   2   4   6   8  10  12
  3   3   6   9  12  15  18
  4   4   8  12  16  20  24
  5   5  10  15  20  25  30
  6   6  12  18  24  30  36

etc.

If you look down the diagonal axis from top left to bottom right, then you get a list of the square numbers - 1, 4, 9, 16, 25 ... What I noticed was that, if you go "northeast" and "southwest" from those numbers, you always get a number exactly one less. That is, if you take a number and multiply together the numbers one more and one less than it, you get one less than the number squared. Or ...

(n - 1)(n + 1) = n² - 1

It turns out to be pretty trivial once you expand out the expression, of course ...

(n - 1)(n + 1) = n² - n + n - 1 = n² - 1

But nobody ever bothered to point it out, and I felt a gram or so of smug when I proved it for myself.

There's actually a more general thing lurking here ...

(n - k)(n + k) = n² - k²

... which means that if you look at the differences as you continue "northeast" and "southwest" from numbers on the diagonal, you are going to get another series of square numbers.
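If you'd rather check the pattern than take my word for it, here's a tiny Python sketch (the range of 12 is arbitrary - it just mirrors a slightly bigger multiplication square):

    # Moving k steps "northeast"/"southwest" from the diagonal entry n x n in the
    # multiplication square gives (n - k) x (n + k), which should always fall short
    # of n squared by exactly k squared.
    for n in range(1, 13):
        for k in range(1, n):
            assert n * n - (n - k) * (n + k) == k * k

    print("Checked: (n - k)(n + k) = n² - k² for n up to 12")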

Thursday, December 05, 2013

Happy Pythagoras Day

Not that this is a particularly well-known observance - actually, I just made it up, though it seems the idea did already exist before I thought of it.

We grown-ups have a kind of abiding folk memory of Pythagoras's Theorem - "For a right-angled triangle, the square on the hypotenuse is equal to the sum of the squares on the other two sides." In studying Maths O-level and A-level, we had thrown at us over and over again triangles with sides having particular ratios. Most notably, 3:4:5, because

3² + 4² = 5²

Less commonly, we were also exposed to triangles with sides in the proportion 5:12:13 and 7:24:25, because

5² + 12² = 13²,

and

7² + 24² = 25².

These are known as Pythagorean triples, and they form a fairly exclusive group. Conventionally (in the UK!) our shorthand for writing dates is dd/mm/yy. There are only two Pythagorean triples that, in their lowest form and written from lowest to highest, encode a date - namely, 3/4/5 and 5/12/13. (Technically, 6/8/10 and 9/12/15 are also Pythagorean triples - but they don't really count, as they are multiples of 3/4/5.) Thus, for the people who take nerdy notice of quirky numbers, 5/12/13 (i.e. today!) is the last time for a long while that we will see a date that is a Pythagorean triple. Hence Pythagoras Day.

I didn't have as much fun with this stuff as I might have done at school. (Yes, yes, I know that those of you who take pride in your mathematical ignorance will be appalled at the concept of maths being fun). I discovered for myself relatively recently that the gaps between successive square numbers are the odd numbers:

1 to 4 gap is 3
4 to 9 gap is 5
9 to 16 gap is 7
16 to 25 gap is 9

and so on. A series of Pythagorean triples can be built from this. The square of an odd number is itself odd, and the gap between two successive square numbers is always an odd number - namely, the sum of the two numbers:

The gap between 2² and 3² is 2 + 3
The gap between 3² and 4² is 3 + 4

So when the gap between two squares is equal to a square number, hey presto, you have a Pythagorean triple:

The gap between 4² and 5² is 9, which is 3²
The gap between 12² and 13² is 25, which is 5²
The gap between 24² and 25² is 49, which is 7²

Those were the ones I knew about - but then I could see that 9:40:41 would be a Pythagorean triple, as would 11:60:61 and 13:84:85. Pretty neat.

However, Wikipedia takes the sense of achievement away by introducing Euclid's formula, which permits us to generate all Pythagorean triples. It's even more neat, but a bit soul-destroying. I just wish someone had shown me this stuff when I was at school!
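If you fancy generating them yourself, here's a minimal Python sketch of Euclid's formula - the function name primitive_triples and the limit of 100 are just for illustration:

    from math import gcd

    # Euclid's formula: for integers m > n > 0 with m, n coprime and of opposite
    # parity, (m² - n², 2mn, m² + n²) is a primitive Pythagorean triple.
    def primitive_triples(limit):
        """Yield primitive triples (a, b, c) with c <= limit, smallest side first."""
        m = 2
        while m * m + 1 <= limit:
            for n in range(1, m):
                if (m - n) % 2 == 1 and gcd(m, n) == 1:
                    a, b, c = m * m - n * n, 2 * m * n, m * m + n * n
                    if c <= limit:
                        yield tuple(sorted((a, b)) + [c])
            m += 1

    for triple in sorted(primitive_triples(100)):
        print(triple)   # (3, 4, 5), (5, 12, 13), (7, 24, 25), (8, 15, 17), ...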

Monday, November 25, 2013

Language stuff - types of sentence

Sentences function in different sorts of ways, and we can classify them accordingly. The most obvious type is declarative - this conveys information:

  • You are looking at the cat in the basket.
However, by reorganising the elements of the sentence, we can find the other sentence types. An interrogative sentence (question) is one which requests information.
  • Are you looking at the cat in the basket?
An imperative sentence (command) is one where the subject of the sentence is being instructed to do something.
  • Look at the cat in the basket.
Another class of sentence is exclamatory. Here, the sentence is intended to convey emotion through emphasis - so:
  • Awww! Look at the cat in the basket!
does not function as an imperative, although the words are the same as the previous example.

Word order and punctuation aren't sufficient to determine the type of sentence. For example, parents might say to their children:
  • Are you going to tidy up the floor?
in a way which acted as a command, rather than a question. Similarly, a declarative sentence can be used to ask a question through intonation. This might be represented in writing using a question mark, but the word order would be as for the first example above:
  • You are looking at the cat in the basket?
We can also distinguish major sentences from minor sentences. A minor sentence is an irregular sentence, in that it doesn't contain a finite verb (a process). These have various roles - here are some examples.
  • Yes.
  • Wow!
  • Hello?
Wikipedia notes that sentences consisting of a single word are called word sentences, and the words in these sentences are called sentence words.

Thursday, November 14, 2013

More song lyric language research

There's a pretty definite "north/south" divide apparent amongst some rock musicians. Following the research project I did for my OU module, I'm interested in seeing whether this is reflected linguistically in the lyrics. But I'd like some help ....

What I'm looking for are singers, and albums, that represent definitive "north" or "south" music. That is, music that self-consciously identifies itself as belonging to either "north" or "south". My plan would then be to create a language corpus from the lyrics of these songs, and carry out the sort of corpus analysis that is hinted at in my posts below about language stuff.

In a sense, I suspect that by "south" I may really mean London. The list I've been drawing up so far consists of:

  • South - Lily Allen, Madness, Blur, The Kinks
  • North - Oasis, The Housemartins/Beautiful South, The Smiths
Other ideas? And if you could pick one archetypal album from each band, which would it be? Your thoughts, please.

Wednesday, November 13, 2013

Unintended consequences

Putting AdSense on here resulted in the blog advertising Wonga. I think I've worked out how to stop this. Please don't use them - at least try looking at peer-to-peer lending first. P2P lenders don't charge punitive interest rates. Have a look at Zopa, for example ...

Monday, November 11, 2013

Cultural dumbing down ...

... as expressed in TV and radio quizzes.

What used to happen was that you'd be invited to answer a pretty tricky question - perhaps on a postcard, or phoning in. The first answer drawn out of the hat would win. But the programme makers wouldn't want their job to be harder than necessary. They wanted people to "have a go" - but they didn't want to be completely flooded with right answers.

This even extended to children's programmes. I remember one mini-quiz - perhaps it was on Swap Shop or something like that - which showed a picture of penguins, and invited people to phone in and say what sort they were. There weren't three options to choose from - just a photo of a penguin. Like this ...

... and a phone number.

All that changed with the arrival of two things - premium rate phone lines, and call automation. Now, the winner of the prize draw could be determined entirely automatically, without tying up any staff in the company. And also, the more people who phoned in, the more money the company got. So the above photo would be accompanied by three options:

What type of penguin is this?
  1. A macaroni penguin 
  2. A beefburger penguin
  3. A Chinese takeaway penguin
You'd actually have to be pretty determinedly stupid to get the wrong answer.

What's the effect? Well, all - or rather, anyone - could have a prize. But the prize is no longer worth having. It doesn't represent any achievement, and if you win it, it's just down to luck.

Monday, October 28, 2013

"Ender's Game"

We went to watch the film Ender's Game on Friday. I had been looking forward to the film for a long time.

Orson Scott Card published the book Ender's Game in 1985. To get things in historical perspective, this was the year that Windows 1.0 was released. It looked, according to Wikipedia, like this.

The internet was not commercialised until ten years later. In this context, Orson Scott Card was talking about something which behaved like the internet - not just from the point of view of technology, but as a medium of cultural exchange - and also virtual reality.

It would be easy to assume that large sections of the film were 2013 glosses on the 1985 book. But the fact is, Card's view of the future as portrayed in his books was close enough to where we are thirty years later that it is basically recognisable. If you read the book today, there's nothing terribly anachronistic about the world it portrays - and there's much SF that hasn't aged nearly as well.

Other reasons that I was impressed by the book were:
  • Rather than a bland philosophical naturalist, or artificially pantheistic world in which everyone lacks any sort of cultural identity, in Card's world, people have national and religious identities, within the context of what seems to be something like a world government. This is reflected in the film.
  • Children are portrayed as morally complex. Probably a bit too morally complex - the idea of Ender being a small child (i.e. about 6!) doesn't make it into the film - I guess he's supposed to be closer to 12. And the language and imagery that shape the children in the book are omitted as well - the alien race, the "buggers", becomes the "formics", for example. But the book portrays a childhood world that is much closer to reality than the one children tend to present to adults. This in itself makes Ender's Game an interesting subject for children's literature.
  • Some of the ideas in the book are really life-shaping. Look up Demosthenes' "hierarchy of foreignness." Again, think of how xenophobic and insular the world was in 1985 - Europe was still divided by the Iron Curtain. In this context, Card was making the case that, if other people shared our humanity - or even if they were inhuman but communication was possible (!) - then there ought to be an alternative to war and destruction.
Orson Scott Card is a controversial figure, due to his opinions on homosexuality. How should you deal with a writer who has opinions about issues which you strongly disagree with? Obviously when people are saying things in public, it's reasonable to disagree with them in public - and they need to be prepared to defend their beliefs. However, ought we to disregard someone's teaching on childhood and education because he persuaded the mother of the multiple illegitimate children he fathered to put them into a foundling hospital? And yet, look at the influence Rousseau continues to have on education. Many of Card's perspectives are hugely positive. Is it really not possible to take and affirm the good, whilst opposing the things you consider to be bad?

So, what of the film? Technically, very good. It was a real shame that the complex political and social aspects of the relationship between Valentine and Peter aren't developed - but you have to sacrifice a lot when you go from your average book to your average film. And I'm sure the people who read the book as children or teens would love to have seen more games in the battle room. However, what was brought out much more strongly in the film was the climax - not the final battle, but what happens after. I'm trying to avoid spoilers here - but suffice it to say that Card's message about the need for tolerance, almost pacifism, is made clearer in the film than it was in the book.

Friday, October 25, 2013

Agent de-emphasis and naturalism

In my previous post, I talked about three different ways in which English could be used to draw attention away from the subject of a verb - the agent that is carrying out a particular process. These are:
  • short passive verbs;
  • nominalisation;
  • ergative verbs.
I guess my aim in highlighting this is that I'd like an awareness of it to become part of more people's critical thinking repertoire - "It was said..." By whom? "Research has shown..." Who did the research?

Naturalism is, according to the Oxford English Dictionary Online, "the idea or belief that only natural (as opposed to supernatural or spiritual) laws and forces operate in the world." It says that everything in the universe is the outcome of time and chance - the universe itself has no designer; the contents of the universe (including animals and us) don't have a designer either. This is a little bit problematic, because lots of things in the universe look designed. Richard Dawkins coined the term "designoid" to refer to complex objects which are neither simple nor, he believes, designed - or rather not designed by an intelligent agent.

Another way of thinking about naturalism is to talk about telos - a word that comes from Greek, meaning "ultimate purpose or aim". The universe of the naturalist is atelic - it has no ultimate purpose or aim. Specifically, evolution, to a naturalist, is atelic. Any particular outcome of the evolutionary process - whether it's humans, multicellular life or antibiotic resistance - isn't designed, it just happens to arise.

This causes problems when it comes to language use in the context of evolutionary processes. The sort of processes that change stuff in the world are material processes. I listed the possible participants in material processes as being "actor, goal, scope, attribute, client, recipient" - with the key ones being the actor (the participant carrying out the process) and the goal (the participant affected by the process). So:
  • Sam (participant, actor) eats (process, material) some sushi (participant, goal).
But when it comes to considering evolutionary processes, grammar really struggles. In Darwin's Doubt, Stephen Meyer gives examples of the way in which neo-Darwinist writers use a "word salad" of scientific-sounding phrases, effectively as "just-so stories", to explain how evolution must have occurred. But it's worthwhile looking at these phrases from a grammar point of view as well.

Meyer gives examples of people talking about exons being "recruited" or "donated". These are short passives - remember above that the short passive is a form that allows agent de-emphasis. So, who or what is the actor associated with these processes? The same with "radical change in the structure" - here we have a nominalisation ("change") - again, the question that is begged is who or what has changed the structure? The actor can only really be "evolution":

  • exons (participant, goal) were recruited (process, material) (by evolution - participant, actor, de-emphasised)
But evolution is not allowed telos. In other contexts, people would squirm if we talked about evolution "doing" something - evolution just happens. And yet, through agent de-emphasis, we can slip in the concept of evolution as the actor in material processes.

The effect of this is that neo-Darwinism smuggles in the idea, and the categories, of purposeful, telic activity through agent de-emphasis. I would suggest that this is misleading - it is difficult to talk about evolutionary processes as being blind and purposeless: however, it's also wrong to use purposeful categories for something which has been defined as purposeless. If it is impossible to work on the basis that evolution is genuinely atelic, then maybe this belief was wrong in the first place.


Thursday, October 24, 2013

Language stuff - process types

Verbs are "doing words". However, as I suggested in my discussion of lexical density, not every verb is a doing word all the time - sometimes verbs behave as function words. When a verb is a lexical verb - that is, when it's describing something that is actually happening, it can also be referred to as a process. In fact, we can divide clauses up into processes, participants and circumstances - and, with a clause being effectively an indivisible quantum of meaning, it usually has exactly one process.

Blerk again. What does that mean? Let's take some of the clauses above and break them down.
  • Verbs (participant) are (process) "doing words". (participant)
  • However (circumstance) not every verb (participant) is (process) a doing word (participant) all the time (circumstance)
  • sometimes (circumstance) verbs (participant) behave (process) as function words. (circumstance)
What happened to my suggestion, you may be asking? We can see, since it has exactly one process, that it is a clause itself:
  • as (conjunction) I (participant) suggested (process) in my discussion of lexical density (circumstance)
However, it doesn't make any sense without the context of the second clause above which surrounded it - hence it is a subordinate clause.

Also, why is "as function words" a circumstance, not a participant? Effectively the preposition followed by the noun is behaving like an adverb - it is describing how the verbs behave, not what they are.

There are different types of processes. In the OU course, we divided processes into five sorts:
  • material - a participant acts upon the material world or is acted upon in some way ("I ate sushi");
  • mental - processes of consciousness and cognition ("We thought it didn't matter");
  • verbal - processes of communication ("I told him so.");
  • relational - being, having, consisting of, locating ("He has no father.");
  • existential - indicating the existence of an entity ("There is a problem").
In grammatical terms, we can talk about subjects, direct and indirect objects and so forth. However, these different types of processes have been assigned different types of participants - it seems to make the whole thing pretty complicated, but in actual fact, when we reflect on what is going on in a sentence, the types of participant associated with a process help to clarify the sort of process we are looking at in some cases. This summary comes from here:

  • Material - actor, goal, scope, attribute, client, recipient
  • Mental - sensor, phenomenon
  • Verbal - sayer, receiver, verbiage
  • Relational - token, value
This provides us with a more comprehensive way of analysing processes.
  • Verbs (participant, token) are (process, relational) "doing words". (participant, value)
  • as (conjunction) I (participant, sayer) suggested (process, verbal) to you (participant, receiver) in my discussion of lexical density (circumstance)
As the page I just linked to makes clear, it's also possible to go into more detail about different types of circumstance - but that's quite enough for one blog post!

Wednesday, October 23, 2013

Language stuff - modality

Prior to studying E303, my experience of modal verbs had basically come from learning foreign languages - most specifically, the verbs pouvoir, devoir and vouloir which we learnt in O-level French. I had never been given grammatical categories for the same things in English, although obviously I could see how il peut mapped onto he can; voulez-vous onto do you want, and so on. They work as forms of auxiliary verbs - that is, they don't function as the main process in a sentence.

There are two categories of modal verbs - epistemic, which are modal verbs that relate to the likelihood of something being true, and deontic, which are modal verbs relating to possibility or necessity of action. They can be ranked according to their strength - O'Halloran, in the E303 textbooks, offers the following scale of epistemic modal verbs, from strongest to weakest:
  • will
  • would
  • must (in the sense of "he must be there" - "surely he's there")
  • may
  • might
  • could
  • can
and of deontic modal verbs:
  • has to
  • must (in the sense of "he must do it" - "if he doesn't do it, he's doomed")
  • had better
  • ought
  • should
  • needs to
  • is supposed to
Modal verbs are used differently in different forms of discourse. If we consider conversation, we tend to hedge - that is, we tend to make statements less assertively than if we were writing them down. Strong modality tends to come across as being forceful, and thus rude. There are other means of toning down the modality - for example, by personalising statements - I don't think that's true or even I'm sure that's not true both have less strong modality than That's not true.

In song lyrics, the dominant epistemic modal verbs in the 33000 word corpus I constructed were:
  • will (also counting 'll, I'll, won't) (307 occurrences)
  • can (can't) (290 occurrences)
  • would (I'd, wouldn't) (101 occurrences)
  • could (couldn't) (59 occurrences)
The use of deontic modality is much less common, and the most common verbs were:
  • had to (have to, has to, got to) (51 occurrences)
  • should (33 occurrences)
  • need to (needed to, needs to) (18 occurrences)
  • must (15 occurrences)
The frequency of use of strong deontic modality was very similar to what is found in the fiction corpus. However, in fiction, the use of the verb "must" is much more common than it is in song lyrics, which lean much more on "need to" and "have to".
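For what it's worth, here's roughly how counts like these can be pulled out of a plain-text corpus with a few lines of Python. The file name lyrics_corpus.txt is made up, and the lists of surface forms are only an approximation of the groupings I used, so treat it as a sketch rather than the actual method:

    import re

    # Tally occurrences of each modal-verb group. Contracted forms are matched
    # directly; "can" uses a lookahead so that "can't" isn't counted twice.
    EPISTEMIC = {
        "will":  [r"\bwill\b", r"'ll\b", r"\bwon't\b"],
        "can":   [r"\bcan\b(?!')", r"\bcan't\b", r"\bcannot\b"],
        "would": [r"\bwould\b", r"\bwouldn't\b"],
        "could": [r"\bcould\b", r"\bcouldn't\b"],
    }

    def count_modals(text, groups):
        text = text.lower()
        return {head: sum(len(re.findall(pattern, text)) for pattern in patterns)
                for head, patterns in groups.items()}

    with open("lyrics_corpus.txt", encoding="utf-8") as f:  # hypothetical corpus file
        print(count_modals(f.read(), EPISTEMIC))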

Tuesday, October 22, 2013

Language stuff - agent de-emphasis

Normally when we think about describing an event, we think in terms of who or what is actually doing it - that is, the agent.
David broke the plate.
We may, for various reasons, not wish to draw attention to the agent. English language allows us several options for doing this. The most obvious one is to use a "short passive":
The plate was broken. 
The passive voice is used, and the person who actually does the breaking is not specified. It is possible to include the agent when using the passive voice:
The plate was broken by David.
but there may be reasons for de-emphasising the agent by omitting it - for example, when the parents come downstairs to discover the reason for a loud noise, a child might choose to draw attention to the fact that the plate has been broken, rather than admitting that it was him rather than the dog that did it.

Another option for agent de-emphasis is the use of nominalisation. This is converting a verb into a noun. I have to come up with a more complicated sentence now, as nominalisation of "to break" will leave it without a process (verb).
David broke the plate. We glued it back together.
We can de-emphasise David by nominalising the verb:
After the breakage of the plate, we glued it back together.
Yes, I know, it's a little artificial, but hopefully you get the idea.

There's a third option, and this is to use what is known as an ergative verb. This is quite subtle. An ergative verb is one that can be transitive or intransitive - that is, it can either be used with a subject and object, or just with a subject - but the object when it is used transitively becomes the subject when it is used intransitively. Blerk. The easiest way of explaining this is to give an example.
The government closed the mines.
Here, "the government" is the subject of the verb, and "the mines" is the object - or to use different grammatical categories, "the government" is the actor and "the mines" is the goal. It's possible to write this using a short passive, as explained above:
The mines were closed (omitting "by the government").
The agent/actor/subject can be omitted, which means we don't need to mention who actually closed the mines. Or we can use a nominalisation, and talk about the closure of the mines - again, the agent disappears. But we have a third option, because "close" here is an ergative verb - that is, the object of the sentence above (the mines) actually becomes the subject if we use the verb intransitively. If we want to use this verb intransitively, then the sentence becomes:
The mines closed.
(rather than "The government closed.") Once again, the agent/actor has disappeared.

There are various reasons why agent de-emphasis might be considered desirable. Those of us who have been using computers for more than ten years probably remember earlier versions of Word for Windows nagging us about using the passive voice. In my case, it was because I was often writing about science - and a standard feature of science writing is the use of the passive voice to de-emphasise the person actually doing the work. Using nominalised verbs allows the writer/speaker to increase the lexical density - that is, to convey more information in less space. This is valuable in media where word count and space are at a premium - like journalism.

More significantly, as the example above suggests, there may be political reasons for de-emphasising the agent. And I'd like to return to this in a future post ....

Monday, October 21, 2013

Language stuff - field, tenor, mode

One of the aspects of studying language that interested me is how recently much of linguistic theory has been developed. When I was studying computer science in the late 80s, many of the theoretical foundations were pretty new - Dijkstra's algorithm, for example, that we were taught about in my degree (which is now part of the Further Maths A Level syllabus!) was published in 1959. However, Michael Halliday's seminal book on language An Introduction to Functional Grammar was not even first published until 1985. Both Halliday and Noam Chomsky, perhaps the most famous linguistic theorist, are still alive.

The features of a use of language may be analysed by considering its field, tenor and mode. The field is also referred to as the experiential metafunction. It is how language is used to make meaning about the world - in other words, the actual content of what is being communicated.

One might naively think that this was all that language was - the communication of information - but it is more subtle than that. Every language event takes place between one or more participants, and in addition to communicating information, language events are used as part of the process of enacting interpersonal relations. This is tenor, also referred to as the interpersonal metafunction. Additionally, language events can take place in many different forms (conversation, email, a sermon ...), and these are themselves largely detached from both field and tenor. This textual metafunction is the mode.

So, what does this mean in practice? Let's take this blog post.

  • Field - I am attempting to explain, in fairly simple terms, information about the theoretical use of language.
  • Tenor - I'm addressing unknown readers (who are you? Say hello!), but I'm writing in a fairly informal style - I'm assuming that the average reader will just have happened across this, and wants to read something that's engaging, friendly and not too heavy. Frankly, that's how I like communicating anyway, and since this is my space, really written for my own amusement, I guess I do what I want.
  • Mode - a blog post. Written language can be more planned and deliberate than spoken language, for a start - there's a definite structure, and I've assumed in writing it that people will start at the beginning and read through hopefully to the end!
You can imagine changing each of those individually would change the way in which language was used. For example - suppose (field) I was writing about something else, maybe a film review? Or suppose (tenor) I knew that the people reading this were children aged 12? Or suppose (mode) I was delivering this as a talk? Each of the metafunctions, then, has a bearing on how language is used, and this linguistic structure is something that has only really been described in the last 30 years or so.

Monday, October 14, 2013

Language stuff - lexical density

Words in a text can be divided into content words and function words. Content words refer to some object, action or other non-linguistic meaning. The sorts of words that are content words are nouns, adjectives, adverbs and most verbs. These are "open classes" of words - that is, it's possible to invent more content words. If I said "The abdef ghijk was mnoping stuvly over there", even though you'd never come across a sentence like that, you'd probably conclude that "ghijk" was a thing (noun), of which you can get "abdef" ones (adjective), that "to mnope" was a verb, and that it is something which can be done "stuvly" (adverb). If someone feels like drawing a picture of this happening, I'll include it in this post!

Function words are words that have little meaning in themselves, but express grammatical relationships with other words in the sentence. In the sentence above, "the", "was" and "over there" are all function words. They include conjunctions, prepositions, modal and auxiliary verbs, and pronouns. These are all "closed classes" - that is, there isn't "space" in the English language to add new ones, Dr. Dan Streetmentioner notwithstanding. All the new words that come along are content words, not function words.

It is possible to work out the proportion of content words compared to the total number of words. This is the lexical density. Different sorts of texts will have different lexical densities. On our OU course, we used the Longman Student Grammar of Spoken and Written English. This is a descriptive grammar (in other words, it describes how English is used, rather than saying how it ought to be used), and it analyses four different styles of discourse. Based on large corpora, it gives the lexical density of different sorts of discourse as:

  • Conversation - 35%
  • Fiction - 47%
  • Academic prose - 51%
  • News - 54%
Conversation is low for several reasons. The first is that, unlike written discourse, in most conversations, there is a shared context. This means that it's possible to use pronouns to a greater extent than nouns, for example. Also, conversation is improvised to a greater extent than written discourse. This means that there are likely to be dysfluencies - such as hesitators and repetition - which have the effect of decreasing the lexical density. As part of the OU course, I did my own analysis of lyrics from pop records. Their lexical density turns out to be almost exactly the same as that of fiction.
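As a toy illustration, here's how a rough lexical density figure could be computed in Python. Real analyses rely on part-of-speech tagged corpora; the little stoplist of function words below is made up for the example, so the percentages it produces are only indicative:

    import re

    # Lexical density = content words / total words, here approximated by treating
    # everything outside a small stoplist of function words as a content word.
    FUNCTION_WORDS = {
        "the", "a", "an", "and", "or", "but", "if", "of", "to", "in", "on", "at",
        "for", "with", "over", "there", "was", "were", "is", "are", "be", "been",
        "i", "you", "he", "she", "it", "we", "they", "this", "that", "not", "do",
    }

    def lexical_density(text):
        tokens = re.findall(r"[a-z']+", text.lower())
        content = [t for t in tokens if t not in FUNCTION_WORDS]
        return 100 * len(content) / len(tokens)

    print(round(lexical_density("The abdef ghijk was mnoping stuvly over there"), 1))  # 50.0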


Tuesday, October 08, 2013

Language stuff - corpus (pl. corpora)

A new(ish) tool for the systematic examination of the English language is the language corpus. Corpora are collections of samples of English gathered into a "body", which can then be examined using computer programs.

According to Wikipedia, the first corpus used for language investigation was the Brown Corpus. It consisted of around a million words, gathered from about 500 samples of American English, and was used in the preparation of the benchmark work Computational Analysis of Present-Day American English by Kucera and Francis. This was as recently as 1967.

The size of corpora has increased with increasing computer power. The British National Corpus currently contains 100 million words. The Oxford English Corpus - used by the makers of Oxford Dictionaries amongst others - contains 2 billion words of English. The Cambridge English Corpus is a "multi-billion word" corpus. In addition to the texts that make up the corpora, the words they include can be tagged for parts of speech - for example, whether "love" as it appears in a text is being used as a noun ("His love was so great...") or a verb ("I love you"). A corpus can be examined using concordancing software, which will search for specific words, phrases or instances of grammar, and can do things like highlight words that are frequently collocated. This can be used to identify patterns in the language that might otherwise go unnoticed.
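To give a feel for what a concordancer does, here's a bare-bones keyword-in-context (KWIC) sketch in Python - real tools like AntConc add sorting, collocation statistics and part-of-speech filtering on top of this:

    import re

    def kwic(text, keyword, window=4):
        """Print each occurrence of keyword with a few words of context either side."""
        tokens = re.findall(r"\w+", text.lower())
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                print(f"{left:>30} [{keyword}] {right}")

    kwic("His love was so great that I love you more than you love him", "love")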

Corpora can be produced from particular classes of text - for example, transcribed conversations, newspaper articles, academic journals, fiction. For E303, the Open University undergraduate course that introduces corpus linguistics, we were provided with a 4 million word corpus, with a million words drawn from each of these classes. It's also possible to produce your own corpus. I created a corpus of pop song lyrics - only 33000 words or so, but still enough to look for trends and patterns of language use. There is software available that can tag a text with parts of speech - for example, CLAWS4. And for analysing the corpus, the AntConc concordancing software is freely available.

Monday, October 07, 2013

Language stuff - type/token ratio

I just finished the Open University module E303 - English Grammar in Context - it sounds pretty deadly, but I loved it. Language is inherent to who we are as human beings - we all communicate. And yet, it's only relatively recently that the resources have been available to examine language in a systematic, large-scale way. A lot of the underlying theory is actually newer than the computer science theory that felt pretty new when I was doing my first degree.

I've promised blog series before, and they rarely amount to much, but I'd like to see whether I can write about some of the ideas we covered, and maybe get across some of the reason that I found the material so fascinating.

The first concept is type/token ratio. "Tokens" are the number of words in a piece of text - if I do a word count, it tells me the number of tokens. But not all of them are unique. I have now used the most common word, "the", ten times so far in this text (don't hold me to that - it's likely to have been edited in a highly non-linear manner - but you get the idea). You can get some insight into a text by dividing the number of unique words (the "types") by the total number of words, and expressing it as a percentage. So for the text up to the start of this sentence, there were 229 words and 134 types - giving a type/token ratio of 59%.
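Here's the calculation as a couple of lines of Python, using a made-up sentence rather than this post's text:

    import re

    def type_token_ratio(text):
        """Return (tokens, types, ratio%) for a piece of text."""
        tokens = re.findall(r"[a-z']+", text.lower())
        types = set(tokens)
        return len(tokens), len(types), 100 * len(types) / len(tokens)

    tokens, types, ratio = type_token_ratio("the cat sat on the mat and the dog sat too")
    print(f"{tokens} tokens, {types} types, type/token ratio {ratio:.0f}%")  # 11 tokens, 8 types, 73%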

A couple of things about type/token ratio. The first is that as a piece of text gets longer, the type/token ratio is likely to fall. The number of words is clearly increasing, but the number of types is increasing more slowly - it's more likely that you will be using the same words again. What that means is that if you want to compare type/token ratio of two different texts, they need to be about the same size.

The next thing is that different sorts of text will have different type/token ratios, as they are a measure of the diversity of the vocabulary being used. For my final assignment, I looked at pop song lyrics. I had a database of around 34000 words, and this had a type/token ratio of just under 10%. I compared this with a slightly larger database of words from a work of fiction, and this had a higher type/token ratio - just over 12%. A slightly smaller database of words from transcribed conversations had a lower type/token ratio - about 6.5%.

One might assume that the language used in pop music was pretty narrow in its range. But it turns out that it is quite diverse - almost as diverse as fictional writing, and much more so than the sort of language that's used in everyday conversation.

Thursday, March 21, 2013

Human factors in medicine

I just watched the latest BBC Horizon programme. It was about the need for developing human factors training and practice in medicine. It drew on the civil aviation industry, and also fire fighting and Formula 1.

My wife could tell you that the need for human factors training within medicine is something that I've been banging on about for years. It's been apparent that, whilst there are many very good and well-intentioned practitioners of all sorts within the NHS, there are also too many occasions when issues that really boil down to human factors end up having a negative clinical outcome. This is something that has been a focus of training and safety within aviation since Kegworth, and over and over again, when I hear descriptions of events that have taken place in hospitals, I can think of ways in which human factors training could have helped.

Aspects drawn from aviation in this episode included the use of checklists (although checklists themselves are an innovation, the idea of following "protocols" and procedures is already widely established within medical practice), a strong focus on situational awareness, mention of the need to understand the impact of authority gradients, and the impact of too much stress.

It's easy to add other aspects of human factors issues from aviation (now generally called CRM - crew resource management). For example, fatigue, error chains (mentioned passim without explanation) and communication. One of the major contributors to improvements in the human factors environment within aviation has been CHIRP - the Confidential Human Factors Incident Reporting Programme - which has been extended to Air Traffic Control, cabin crew, aviation engineers and also the maritime industry. Again, for years, I've been saying that if the NHS were to take human factors seriously, such a publication would be of great benefit.

Dr Kevin Fong, the presenter, makes a very good and articulate case for the development of human factors within medical practice. If Sir David Nicholson, the Chief Executive of the NHS, is keen to see substantial improvements in clinical outcomes, I'm pretty convinced that this is one of the places that he should be looking.

Monday, March 18, 2013

Francis Schaeffer on logical positivism

Logical positivism claims to lay the foundation for each step as it goes along, in a rational way. Yet in reality it puts forth no theoretical universal to validate its very first step. Positivists accept (though they present no logical reason why this should be so) that what reaches them from the "outside" may be called "data"; i.e., it has objective validity. 
This dilemma was well illustrated by a young man who had been studying logical positivism at Oxford. He was with us in Switzerland as a student ... and he said one day, "I'm confused about some of these things. ... when this data reaches you ..." 
At once I said, "How do you know, on the basis of logical positivism, that it is data?" 
He started again, and went on for another sentence or two, and then said a second time, "When this data reaches you ..." 
...I had to say, "No, you must not use the word data. It is loaded with all kinds of meaning; it assumes there is objectivity, and your system has never proved it." 
"What do I say then?" he replied. 
So I said, "Just say blip. You don't know what you mean by data, so substitute blip." 
He began once more, "When blip reaches you ..." and the discussion was over. On the basis of their form of rationalism, there is just as much logic in calling something "blip" as "data." 
Thus, in its own way, though it uses the title of positivism and operates using reason, it is just as much a leap of faith as existentialism - since it has no postulated circle within which to act which validates reason nor gives a certainty that what we think is data is indeed data. 
Michael Polanyi's (1891-1976) work showed the weakness of all forms of "positivism" and today positivism in theory is dead. However, it must be said that the materialistic, rationalistic scientists have shut their eyes to its demise and continue to build their work upon it as though it were alive and well. They are doing their materialistic science with no epistemological base. In the crucial area of knowing, they are not operating on facts but faith.
Francis Schaeffer, "The God who is there", emphasis mine.

The trouble is that there are many non-scientists who have accepted the epistemological assertion of the "materialistic, rationalistic scientists" who "have shut their eyes" to the demise of their epistemological foundation - the assertion that science is an adequate philosophical foundation for not believing in God. "Well, we know so much more than we used to know. It used to be necessary to believe in God to explain the world around us. But nowadays, we are much better informed, and belief in God is not necessary."

Science as a philosophy - "scientism", if you like - is not built on a solid foundation. For example, Richard Dawkins said: "Although atheism might have been logically tenable before Darwin, Darwin made it possible to be an intellectually fulfilled atheist." This is not a logical statement. Firstly, although Darwin provided a naturalistic and gradualistic explanation of how life might arise, this actually has no bearing on whether or not there is a god (which is, in effect, what Dawkins is claiming). Secondly, what is absent from Darwin's (and Dawkins') work is reference to an epistemological foundation. It is a justification of this which would provide the possibility to be an intellectually fulfilled atheist, rather than a description of phenomena. Questions such as: how does life differ from non-life? what is consciousness? what is communication? why do the things that matter so much to us - truth, love, beauty, justice - seem to have so little to do with the physical nature of the universe?

This isn't to say that science is bunk. On the contrary, the achievements of science in explaining the nature of the universe are immense and wonderful. Also, some scientists have made sincere attempts to answer these questions. But like the student that Schaeffer talked to, their answers are not philosophically complete.

Science is not the sole preserve of logical positivists. In fact, the foundations of modern science were laid by people with a very different philosophical framework - Christians, who believed that the foundation for belief in the objective validity of data was the existence of a deity, an external absolute reference point. Christians still do science today. It's uncommon for their books to be as successful as those of the logical positivists who haven't comprehended their mislaid foundation yet, though.

Wednesday, March 06, 2013

Frustration with "AirPort" on Mac

We have a wireless network, and one of the frustrations we've had with our Macbook is that, pretty much whenever it entered "sleep" mode, it would fail to reconnect properly to the network. Resetting this was taking up to half an hour a day - particularly bad since the major selling feature of Apple products is that they should "just work".

This was getting gradually worse as time went on. I'd come to the conclusion that it was something to do with IP address conflicts - switching an extra computer on would quite often trigger a period of misbehaviour. Our house now has four phones, a TV, five computers, a wireless printer, a couple of e-readers and several games consoles which drop on and off the network. The potential for IP address conflicts was getting steadily worse, and the Macbook in particular was getting more and more flaky. The problem has been known about for years, but Apple have been singularly poor at working out a fix. Something had to be done - or the Macbook was likely to experience defenestration.

A further complication, which may or may not have made the situation worse, was the addition of a wireless network extender. This had the benefit of making the internet available in the furthest reaches of the house (which is not actually that huge!), but created the complication of (apparently) having two different networks with the same SSID. This had various implications - it was not possible to tell whether a computer and the wireless printer were actually on the same network until data failed to make it to the printer.

It was all pretty bad. But I think we may have found a way forward.

Devices on the network are identified by a MAC address. This is a number which is six two-digit hexadecimal numbers in a row, separated by colons, and every device capable of accessing a network has a different one (I suppose). Some wireless routers have the option of "reserving" an IP address for a particular device. So I found the Macbook's MAC address, entered the configuration page for the wireless router (a Virgin Media Superhub), and asked it to reserve the IP address for that Macbook. A slight additional complication was that I had to change the name in the Macbook "Sharing" section of its System Preferences - it had a name, but until I changed it, this wasn't being made available on the network, and the wireless router needed a name to reserve the address.
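Incidentally, if you'd rather not dig through System Preferences, this little Python snippet (standard library only) prints a machine's MAC address in the colon-separated form the router's reservation page wants - though uuid.getnode() just picks one interface, so on a machine with several adapters it may not be the Wi-Fi one:

    import uuid

    # uuid.getnode() returns the hardware (MAC) address as a 48-bit integer; format
    # it as six colon-separated two-digit hex numbers, most significant byte first.
    mac = uuid.getnode()
    print(":".join(f"{(mac >> shift) & 0xff:02x}" for shift in range(40, -8, -8)))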

In doing this, I discovered that the Superhub was somewhat more super than I'd expected. It actually has the option of running multiple wireless networks. So I then set up a second wireless network - I now have networks with the SSIDs virginmedia12345678 and extension12345678. The printer and the Macbook and most things within reach of the router use the first one. But the network extender uses the second one, and this means that most devices can ignore its behaviour altogether. So people on "that" side of the house get wireless internet, and it doesn't interfere with devices on "this" side of the house.

I'll let you know if this turns out not to resolve the issue long term, but it looks like we've found a workaround for Apple's inadequacies.

In the meantime, I've come across a new problem. The Superhub has an ongoing issue that after a while, it stops being possible to access the administration screens using the standard IP address. Like the Airport fault on Apple, this has been flagged up for years, and Virgin have apparently failed to achieve much by way of fixes. Apparently, restarting it allows access again - and fortunately we will hopefully not have to access the configuration pages on a very frequent basis. So I think I can live with this.

Monday, February 11, 2013

Ethnology

My OU course, AA100, is uncovering various interesting things.

It's possible to look at a wiki based on the 1911 Encyclopaedia Britannica online, here. Interestingly, an article that was included for reference in one of our course books is omitted from the online version. It's on "Negro" - and its absence can be spotted if you scroll to the end of the article on "Ethnology". Look at where it says:
 For a detailed discussion of the branches of these three main divisions of Man the reader must refer to articles under race headings, and to Negro; Negritos; Mongols; Malays; North American Indians; Australia; Africa; &c., &c.
 "Negro" could have hyperlinked to the relevant article, were it present in the wiki. But it doesn't.

This is understandable. Here are some quotes from the article on "Negro":
In certain of the characteristics mentioned ... the negro would appear to stand on a lower evolutionary plane than the white man ...
Mentally the negro is inferior to the white ... it is not fair to judge of his mental capacity by tests in mental arithmetic; skill in reckoning is necessary to the white man, and it has cultivated this faculty; but it is not necessary to the negro.
Offensive nonsense. There are parts of the article which aren't quite so offensive, but plenty of it is. It's understandable that this should not be given any disk space.

However, in some ways, it's not a good thing that these shameful ideas should be omitted from the text. Not because they are or ever were true, but because it reveals something about the intellectual mindset of the time. How on earth could the Encyclopaedia Britannica, of all publications regarded as the ultimate repository of knowledge at the time, have included this sort of stuff? The answer is - can only be - that such attitudes genuinely represented an uncontroversial consensus opinion. It was derived from the naturalistic presupposition that humanity represented the end point of the evolutionary process, and "white men" represented a point closer to the end than "black men". It is (or should be!) unnecessary to say that such an understanding of Darwinistic processes has been completely discredited and is no longer given the time of day.

In the course, this ethnological perspective is contrasted to an anthropological one - but it is interesting to note that the basis on which the British Museum was established was ethnological, and assumed the cultural superiority of Britain and Western Europe, and that "more primitive" cultures were ones which were either stalled, or should be moving towards them - and the intellectual understanding was that this view was bolstered by Darwinism. The course talks about how the bronzes from Benin (here be pictures) unsettled this idea. It also talks about how "primitivism" in art, a reaction to modernism, still reinforced the idea that other non-European cultures were actually more primitive.

I've talked in other contexts about how other naturalistic assumptions turned out to be false - the idea that the universe was infinitely old ("Big Bang" was originally a dismissive term for the idea that the universe may have had a starting point); the idea that life in its lowest form was simple; the idea that there was nothing remarkable about the earth as an environment in which life could appear; and so on. To this we can add another - the idea that "white" people are superior to "black" people. Of course, naturalism has moved on, and accommodated the fact that reality didn't turn out as expected. But it's interesting the way in which beliefs based on presuppositions can so seriously misdirect people. Who knows what we might have learnt anthropologically about cultures we squashed in the imperial/colonial era had we regarded them all along as our equivalent rather than our inferior?

Now, let's reflect for a moment on our own culture. Just like the Victorians/Edwardians, we are thoroughly convinced of our own absolute rightness. Is it possible that any of our presuppositions are leading us to beliefs about the world that in thirty years time will cause people to gasp as much as that 1911 Encyclopaedia Britannica article makes us gasp?

Tuesday, February 05, 2013

Gay marriage


What will be will be. But just as with the electoral reform referendum, political entities, with the collusion of much of the press, have their own agenda which has little to do with the interests or will of the electorate.

For myself personally ... I don't believe it is the responsibility of the government to legislate definitions of words, or assume them, unless there is a consensus. If the former consensus as to what marriage actually is no longer exists, then I don't believe it is for the state to decide what the new definition should be, even if an overwhelming majority of the electorate are happy with it (and that case hasn't been made). Language is not the responsibility of the state.

I also don't believe that it is the job of a government to introduce, within a parliament, legislation of this sort which hasn't been anticipated in a manifesto.

Also, whilst the legislation may allow for freedom of conscience, this was the case for working on Sunday when the legislation for that was introduced. But 20 years down the line, it is pretty much assumed. It's hard to remember today just how big an issue working on Sunday was at the time. Does that matter, or doesn't it? Who can say? But the point is, regardless of the protections that are included, big social changes can follow from such "tidying up" and "making more equal" of the law. The government has said that it intends to change the law regardless of the outcome of consultation, and that's the point at which we stand now. Is that democracy? Is it wise? Does it reflect a reasoned, or reasonable, position?

Sunday, January 20, 2013

I (e)published a book!

The book that I scanned last year (Short Papers on Church History, Vol. 1 by Andrew Miller) is in the process of being made available through the Kindle Direct Publishing program. It should take a day or two to make its way across the various Amazon servers, I understand, but should then be available worldwide.

The scanning of the book itself, thanks to the new scanner (an Epson Perfection V500) was actually pretty quick - and ABBYY FineReader software converted it into a single file with few glitches. What took the time was the process of effectively proof-reading the whole book. What had to be done?

  • I had to try and pick up what are known as "scannoes" - the OCR equivalent of typos;
  • I wanted to impose a consistent style across the whole book that would work with things like the generation of a table of contents, and how it would be presented on a Kindle;
  • I wanted to make sure that all italicised text was converted properly;
  • I wanted to consistently replace (for example) ' with ‘ or ’ as appropriate, and where necessary ae with æ;
  • There were the odd bursts of Greek characters!
So it turned into a time-consuming process - and as I say in the foreword, I'm pretty sure that I've not done it perfectly, so I may need to re-upload at some stage.
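For the curious, the search-and-replace side of it looks something like the Python sketch below. These aren't the exact rules I used - the word list and heuristics are made up for illustration, and some cases are genuinely ambiguous, which is part of why the proof-reading took so long:

    import re

    LIGATURES = {"mediaeval": "mediæval", "encyclopaedia": "encyclopædia"}  # illustrative only

    def tidy(text):
        text = re.sub(r"(\w)'(\w)", "\\1\u2019\\2", text)   # apostrophe inside a word
        text = re.sub(r"'(\w)", "\u2018\\1", text)           # opening single quote
        text = re.sub(r"(\w)'", "\\1\u2019", text)           # closing single quote
        text = re.sub(r'"(\w)', "\u201c\\1", text)            # opening double quote
        text = re.sub(r'([\w.,!?])"', "\\1\u201d", text)      # closing double quote
        for plain, ligature in LIGATURES.items():
            text = text.replace(plain, ligature)
        return text

    print(tidy("\"It's a mediaeval text,\" he said."))  # prints the sentence with curly quotes and æ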

Anyway, if you're interested, the first place that you can find a link to the book is here. I'll add the UK address once it becomes available. I am intending to charge for the book, by the way - but only about £1.