I recently came across an interesting post on the Powerset Blog about garden path sentences. Garden path sentences lead you down the wrong path through a string of words with multiple meanings. For example,
In this case, most readers would probably think complex was an adjective that modified the plural noun houses. The post ended with a challenge: how easy would it be to create a program that automatically generates these sentences? Since school is out and I have some free time, I tried it myself. I found a decent free XML dictionary and wrote a Ruby script to parse the important bits (each word's part of speech and its alternate forms) into an SQL database. I cross-checked all the words against a word frequency table to weed out obscure words. I then wrote a Python script to put the words together into a sentence (hopefully a meaningful one, though it often isn't).
April 2009 Update: I removed the live demo as part of a server move.
As you can see, the sentences it comes up with are far from meaningful. However, in most cases you can see how a reader could be taken down the wrong path (at least in the cases where there is a right path). In the above example, concrete could be an adjective or a noun, and spheres could be a noun or a verb (to form a sphere). Foster could be an adjective or a noun depending on the context, but I couldn’t imagine a reader taking it as an adjective here. The sentence generator certainly leaves a lot to be desired (especially considering that this was one of the better sentences), but I got about as far with it as I expected to. I think it could be improved further with a few modifications:
I noticed later that Ero Carrera had taken an approach similar to mine, but with his linguistics experience he better anticipated the problems I ran into. He has some good ideas, and his post is an interesting read.
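For anyone curious about the general idea, here is a minimal sketch of the word-assembly step. It is not the script described above: the tiny hand-written lexicon, the single sentence template, and the function names (`words_with`, `words_only`, `generate`) are all made up for illustration, and the frequency filtering is left out. It picks a subject that can be read as an adjective or a noun and a verb that can be read as a plural noun, so the reader is tempted to parse the opening as adjective plus noun.

```python
import random

# Hypothetical toy lexicon (not the XML dictionary from the post):
# each word maps to the set of parts of speech it can take.
LEXICON = {
    "complex": {"adj", "noun"},
    "concrete": {"adj", "noun"},
    "houses": {"noun", "verb"},
    "spheres": {"noun", "verb"},
    "married": {"adj"},
    "small": {"adj"},
    "students": {"noun"},
    "pebbles": {"noun"},
}


def words_with(lexicon, *tags):
    """Return, sorted, the words that can take every tag in `tags`."""
    return sorted(w for w, pos in lexicon.items() if set(tags) <= pos)


def words_only(lexicon, tag):
    """Return, sorted, the words whose only part of speech is `tag`."""
    return sorted(w for w, pos in lexicon.items() if pos == {tag})


def generate(lexicon, rng=random):
    """Fill the template 'The <noun> <verb> <adj> <noun>.'

    The first two slots are filled with ambiguous words, so the opening
    can be misread as 'The <adj> <noun> ...'.
    """
    subject = rng.choice(words_with(lexicon, "adj", "noun"))
    verb = rng.choice(words_with(lexicon, "noun", "verb"))
    adjective = rng.choice(words_only(lexicon, "adj"))
    obj = rng.choice(words_only(lexicon, "noun"))
    return f"The {subject} {verb} {adjective} {obj}."


if __name__ == "__main__":
    for _ in range(3):
        print(generate(LEXICON))
```

Like the real generator, this only guarantees the grammatical ambiguity, not that the result makes sense. Checking that the intended reading is actually plausible would need something beyond part-of-speech tags.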