## Can a Monkey be Taught to Type Shakespeare?

Posted on: December 13, 2009

**Mathematical Linguistics by Jack Reichman, Ph.D.**

There are some who would believe that given enough time and energy, a monkey could be taught to type, get lucky, and write some memorable prose. These are probably the same people who believe that luck plays the major role in all art. Let’s see if that is really true by doing a thought experiment.

We will start by assuming that a keyboard has exactly 27 keys, one for each letter plus one for the ‘space’ key. We won’t be fussy about punctuation or capitalization, because if the monkey writes something that is even halfway decent, I am sure that an editor could put it into good grammatical and punctuational shape.

In the first experiment, we will also assume that the monkey will type any key randomly, i.e., each key has an equal chance of being pushed by the monkey. In particular, this means that the monkey is never offered an incentive to learn which key to type. We’ll come back to that in our second thought experiment.

A quick primer in probability theory: If there are 27 keys, each with an equally likely chance of being selected (i.e., pushed by the monkey), then there is a 1 in 27 chance that the monkey will select any given key. If you pick a two key sequence (e.g., ab, or th, or ss – the same letter is allowed to be doubly selected and a space is considered as one of the keys), then there are 27*27 or 729 pairs of keys. Note that the order of the keystrokes is important, so that th and ht are two different sequences.

In our first experiment, with random choices of each keystroke, this means that each pair of keys is equally likely to be selected (because each keystroke is “independent” in probabilistic parlance) and therefore there is a one in 729 chance of the monkey typing any particular two keystroke sequence in order. Similarly, there are 27*27*27 (=19,683) different three keystroke sequences and 27*27*27*27 (=531,441) different four stroke sequences.

The spaces might be a bit confusing for our purposes, so let’s consider only four-keystroke sequences that contain no spaces. There are 26*26*26*26 (=456,976) of these. This fits the assumption that we have a slightly trained monkey: assume that we have taught the monkey to press the space key only when he/she has completed a “word”.
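The counts quoted so far can be checked with a few lines of Python. This is only a minimal sketch of the counting rule described above (equally likely keys, order matters, repeats allowed); the function name `sequence_count` is made up for illustration.

```python
# Count ordered keystroke sequences when repeats are allowed.
# Assumption from the text: 27 keys (26 letters plus space),
# or 26 keys when spaces are excluded.
KEYS_WITH_SPACE = 27
LETTERS_ONLY = 26


def sequence_count(num_keys: int, length: int) -> int:
    """Number of distinct ordered sequences of `length` keystrokes."""
    return num_keys ** length


# Sequences of length 1 through 4 on the full 27-key keyboard.
for length in (1, 2, 3, 4):
    print(length, sequence_count(KEYS_WITH_SPACE, length))

# Four-letter sequences with no spaces.
print(sequence_count(LETTERS_ONLY, 4))
```

Because each keystroke is independent, the chance of any one particular sequence is simply 1 divided by the corresponding count.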

Now, of the 456,976 four-letter key sequences, how many are actually words? There are 2006 pages of definitions in Webster’s New 20th Century Dictionary of the English Language (1951). It has been estimated that a typical dictionary page contains about 13 four-letter words. This would mean that there are approximately 13*2006, or 26,078, four-letter words in the English language. This estimate is probably on the high side, because a longer dictionary such as this one will tend to devote more of its pages to complex words of greater than average length than a dictionary meant for typical high school use. Some have estimated that the number of four-letter words in the English language is more likely in the 6000-8000 range.
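To see how much the choice of estimate matters, here is a small sketch that computes what share of the 456,976 four-letter sequences would be real words under each estimate. The page count and words-per-page figures are the ones quoted above; the helper name `hit_rate` is just an illustrative choice.

```python
# Share of four-letter sequences that are real words, under the
# estimates quoted in the text.
PAGES = 2006                      # pages of definitions in the dictionary
WORDS_PER_PAGE = 13               # estimated four-letter words per page
FOUR_LETTER_SEQUENCES = 26 ** 4   # 456,976 sequences with no spaces

high_estimate = PAGES * WORDS_PER_PAGE


def hit_rate(word_count: int) -> float:
    """Chance that four random letters spell one of `word_count` words."""
    return word_count / FOUR_LETTER_SEQUENCES


estimates = (
    ("dictionary-based estimate", high_estimate),
    ("low end of alternative range", 6000),
    ("high end of alternative range", 8000),
)
for label, count in estimates:
    print(f"{label}: {count} words, hit rate {hit_rate(count):.2%}")
```

The dictionary-based figure gives the monkey a hit rate several times better than the 6000-8000 range would.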

In any case, let’s go with the higher number of four-letter words (26,078), because that will increase the monkey’s chances of getting a word when randomly typing. This would mean that, based on purely random choices of 4 consecutive letters, a monkey has approximately a 5.7% chance of typing an actual word (26,078/456,976). In turn, this would mean that the chance that a monkey could type two four-letter words in a row is approximately 33 out of 10,000, or a probability of .0033 (.057 x .057). The chance that a monkey could type four four-letter words in a row is therefore approximately .0033 squared, or .000011 (i.e., 11 chances out of a million). What then would the chances be of a monkey typing two consecutive sentences, each of four four-letter words? It would be approximately the square of .000011, or about 121 times out of a trillion. If the monkey were expected to type some five-letter and six-letter words as well as four-letter words, the chances would be even lower, because it is less likely to randomly type a five-letter word, etc. The longer the word, the less likely it is that the monkey will hit on it.

So let’s take stock of where we are right now: In a trillion tries, a monkey would be likely to type two sentences, each of four words of length four, in about 121 of these tries. In a mere billion tries, the expected number of successes is only about 0.12, so the monkey would most likely not succeed even once. If a monkey types one word per second, or sixty words per minute (which would put that monkey into contention for a good secretarial job), how long would it take the monkey to type 8 billion words (each set of 2 sentences consists of 8 words, and we are giving the monkey one billion chances)? It would take the monkey slightly longer than 92,592 and a half days to type 8 billion words, or approximately 253 years. The lifespan of a monkey depends on the particular species, but for our purposes, we can assume that 20 years is a typical monkey’s lifespan. So, if we took 12 or 13 monkeys, trained them to type, and set them to doing so 24 hours a day, 7 days a week, for their entire lives, we would still most likely not get even one set of 8 consecutive four-letter words. To expect about 121 such sets, we would need a trillion tries, which works out to roughly 253,000 years of typing, or something like 12,700 monkey lifetimes. Now, not every collection of consecutive words makes sense or is good grammar. We could try to estimate that, but let’s not bother, because we can next ask ourselves how good the chances are of getting four consecutive sentences, each of four four-letter words. The answer is that it is about 1.5 out of 10 to the twentieth power, which is so low that our collection of monkeys has no realistic hope of accomplishing it.
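The chain of multiplications above is easy to re-run. The sketch below uses the unrounded word probability, so its figures can differ slightly from hand-rounded ones. The typing speed and the 20-year lifespan are the assumptions stated in the text; the 365.25-day year and the function names are illustrative choices, not part of the original argument.

```python
# Probability of typing n real four-letter words in a row, and how
# long a given number of tries would take at one word per second.
P_WORD = 26_078 / 26 ** 4        # chance that four random letters form a word

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25           # assumption: average calendar year
MONKEY_LIFESPAN_YEARS = 20       # assumption taken from the text


def p_consecutive_words(n_words: int, p_word: float = P_WORD) -> float:
    """Independent attempts multiply: p_word raised to the n_words power."""
    return p_word ** n_words


def years_to_type(n_words: int, words_per_second: float = 1.0) -> float:
    """Years of nonstop typing needed to produce n_words words."""
    seconds = n_words / words_per_second
    return seconds / SECONDS_PER_DAY / DAYS_PER_YEAR


for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} words in a row: probability {p_consecutive_words(n):.3e}")

tries = 1_000_000_000            # one billion attempts at an 8-word set
years = years_to_type(8 * tries)
print(f"years of typing: {years:.1f}")
print(f"monkey lifetimes: {years / MONKEY_LIFESPAN_YEARS:.1f}")
print(f"expected 8-word hits: {tries * p_consecutive_words(8):.3f}")
```

Changing `tries` or the word count shows how quickly the required typing time outruns any plausible troop of monkeys.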

What should now be obvious is that the likelihood of a monkey randomly typing a Shakespearean play is so low as to be impossible for all practical purposes. Even if you believed that a monkey could succeed in randomly typing thousands of words, that monkey would then face the challenge of getting these words to have the deep and rich structure that we find in Shakespeare’s works. If you were to continue to calculate these odds, the number of tries needed would be so astronomically large that there wouldn’t be enough time in the universe to produce such a work.

So, in short, the answer is “No, a monkey could not type a work of Shakespeare by randomly picking keys on the keyboard.” If you’ve gotten this far, you may feel like saying a few four letter words yourself!
