My project was a Random Text Generator. The whole idea was essentially based on Markov chains, which are named after Andrei Markov, the mathematician who produced the first results for these processes. Dr. Sabieh Anwar says mathematicians aren’t humans, they are superhumans. So, even though Markov chains are related to some fundamental physical concepts such as Brownian motion and the Ergodic hypothesis, Markov’s motivation was purely mathematical, placing them squarely in probability theory. Thus, to me the subject looks intensely complicated.
Formally, a Markov chain is defined as a “discrete random process with the Markov property that goes on forever.” By now, I suppose we all know very well what discrete means. A discrete random process, then, is one in which the system changes between ‘discrete’ states ‘randomly’. Having the Markov property means that, given the present state, the future states do not depend on the past states: the current state carries ‘all’ the information necessary for the evolution of the process. And what happens next (the future states) is reached through a purely probabilistic process.
A good example I found to explain this is the comparison between board games, say snakes and ladders, and card games, say blackjack. In snakes and ladders, the future state depends only on the current state of the board and the number the die will give. It doesn’t matter ‘how’ the current state of the board was reached, and it doesn’t affect what will happen next in the game. So the process is a Markov chain. In blackjack, however, ‘how’ the current state was reached matters a lot. The cards serve as a memory: a player who remembers which cards have been shown and which are still in the deck gains an advantage over other players, so the game is not independent of its past states.
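The snakes-and-ladders intuition can be sketched in a few lines of Python. The board below is a made-up miniature (the jump positions are my own invention, not a real layout); the point is that the `step` function needs only the current square, never the history of the game:

```python
import random

# Hypothetical miniature board: ladders send you up, snakes send you down.
JUMPS = {3: 11, 6: 17, 9: 2, 18: 7}
LAST = 20  # winning square

def step(position):
    """One turn: the next position depends only on the current square
    and the die roll -- not on how the player got there (Markov property)."""
    roll = random.randint(1, 6)
    new = position + roll
    if new > LAST:
        return position          # overshoot: stay where you are
    return JUMPS.get(new, new)   # climb a ladder or slide down a snake

# Play one game (bounded, just to keep the demo finite).
pos, turns = 0, 0
while pos != LAST and turns < 1000:
    pos = step(pos)
    turns += 1
print(f"finished at square {pos} after {turns} rolls")
```

Nothing about the player’s past appears anywhere in `step` — that is exactly what makes the game a Markov chain, and what blackjack violates.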
Markov chains have uses in many disciplines, including Physics, Chemistry, Mathematical Biology and Economics, as well as gambling, music and more. In Physics, Markovian systems appear extensively in Thermodynamics and Statistical Mechanics. And for anyone who took the Biochemistry or Mathematical Biology course this semester, the classical model of enzymatic activity, Michaelis-Menten kinetics, is a Markov chain. But what is even more amazing, and slightly confusing to me, is that Markov chains can also be used to simulate brain function, such as that of the mammalian neocortex, which is involved in higher brain functions like sensory perception, motor commands, conscious thought and language.
But the most interesting thing you can do with Markov chains is to make a Markov text generator. The text generator produces superficially ‘real-looking’ text based on a sample text. It works like this: to begin with, it picks a word randomly from the original sample text, suppose it’s “the”. The program knows which words occur after “the” in the original text and how many times each of them does, so each candidate word has a different probability. The next word produced depends on these probabilities, and the process keeps occurring for every word produced. But this is only a first-order Markov chain, meaning that word (n) depends only on word (n-1). An order-2 Markov chain would have word (n) depending on word (n-1) ‘and’ word (n-2). The new text is thus produced by a completely probabilistic process and is absolute gibberish, but it looks amazingly like the original sample text. The higher the order of the Markov chain, the greater the similarity between the new text and the sample. I heard that some MIT students produced a research paper using many other research papers as their sample texts and it actually got accepted in one of the journals.
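A minimal sketch of such an order-1 generator in Python (the sample sentence and function names are my own, purely for illustration):

```python
import random

def build_chain(text):
    """Map each word to the list of words that follow it in the sample.
    Repeats in the list encode the frequencies the generator samples from."""
    words = text.split()
    chain = {}
    for current, nxt in zip(words, words[1:]):
        chain.setdefault(current, []).append(nxt)
    return chain

def generate(chain, start, length):
    """Walk the chain: each new word depends only on the word before it."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:                 # dead end: word only appears last
            break
        word = random.choice(followers)   # probability proportional to frequency
        out.append(word)
    return " ".join(out)

sample = "the cat sat on the mat and the cat ran"
chain = build_chain(sample)
print(generate(chain, "the", 8))
```

An order-2 version would key the dictionary on pairs of words instead of single words, which is why higher orders reproduce the sample more faithfully.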
My personal favorite was an Oscar Wilde text. The result was hilarious! It really gives a new dimension to ‘playing with words’.
While doing the research for my project, I found out that it was closely related to the ‘Infinite Monkey Theorem’. “The infinite monkey theorem revolves around the idea that a monkey hitting random keys on a typewriter for an infinite amount of time will almost surely type a given text, usually defined as the complete works of William Shakespeare”.
This theorem had incredible and far-reaching consequences. The physicist Arthur Eddington wrote in The Nature of the Physical World (1928):
If I let my fingers wander idly over the keys of a typewriter it might happen that my screed made an intelligible sentence. If an army of monkeys were strumming on typewriters they might write all the books in the British Museum. The chance of their doing so is decidedly more favourable than the chance of the molecules returning to one half of the vessel.
Eddington wanted us to consider the great improbability of a large but finite number of monkeys working for a large but finite amount of time producing a great body of work, and to compare it with the even greater improbability of certain physical events (here, a spontaneous decrease in entropy). What is significant is that anything even less probable than the monkeys succeeding is, in effect, impossible.
Another argument involves Evolution. Reverend John F. MacArthur claimed:
The genetic mutations necessary to produce a tapeworm from an amoeba are as unlikely as a monkey typing Hamlet's soliloquy, and hence the odds against the evolution of all life are impossible to overcome.
But the argument has also been used in favour of Evolution, so there’s really no telling.
The Argentine writer Jorge Luis Borges wrote in his essay The Total Library (1939) (Also mentioning the same idea in his short story The Library of Babel later):
Everything would be in its blind volumes. Everything: the detailed history of the future, Aeschylus' The Egyptians, the exact number of times that the waters of the Ganges have reflected the flight of a falcon, the secret and true nature of Rome, the encyclopedia Novalis would have constructed, my dreams and half-dreams at dawn on August 14, 1934, the proof of Pierre Fermat's theorem, the unwritten chapters of Edwin Drood, those same chapters translated into the language spoken by the Garamantes, the paradoxes Berkeley invented concerning Time but didn't publish, Urizen's books of iron, the premature epiphanies of Stephen Dedalus, which would be meaningless before a cycle of a thousand years, the Gnostic Gospel of Basilides, the song the sirens sang, the complete catalog of the Library, the proof of the inaccuracy of that catalog. Everything: but for every sensible line or accurate fact there would be millions of meaningless cacophonies, verbal farragoes, and babblings. Everything: but all the generations of mankind could pass before the dizzying shelves—shelves that obliterate the day and on which chaos lies—ever reward them with a tolerable page.
But everything about the theorem is so subtle, and the concepts of infinity, probability and time so far beyond average human experience and practical comprehension, that you can’t really encompass the whole meaning of it. I hope they explain this in our Probability course next semester. Fat chance, though.