## Shakespeare’s Birthday and Evolution

William Shakespeare was baptized April 26, 1564. His birthday is traditionally commemorated on April 23 (incidentally, that is also the date of his death, in 1616). One interesting connection between Shakespeare and evolution was made by Richard Dawkins in his book The Blind Watchmaker: I am talking about the Weasel program. Weasel is an elegant illustration of the process that drives evolution by natural selection: random variation in offspring, coupled with non-random selection. The connection to Shakespeare is in the quote from Hamlet: “Methinks it is like a weasel”. Achieving this exact sentence through random processes alone (a monkey hacking away at a typewriter)  would take and estimated 1040 generations. But achieving it through random mutations coupled with targeted selection makes for a much shorter number of generations: less than 100, usually.

If you are unfamiliar with the Weasel program, read the Wikipedia entry before reading on.

In honor of The Bard’s birthday, I have penned a short Python program which performs Weasel. You may need to tweak it for Windows (remove the first #! line which is a Linux thing). When running the program, the user selects 3 parameters: the target net result (in the form of a string), the number of offspring per generation, and the mutation rate.

Here is a demo run, with 300 offspring per generation, and a mutation rate of 0.05 per position:

```   ./weasel.py "methinks it is like a weasel" 300 0.05
1 `,<o?[}= g!yq[}?*v=.+ :y<<:g
2 `,<o?[}= gjyq[}?*v=.m :y<<eg
3 `,<o?[}= gjyq[ ?*v=.m :y<<eg
4 `,<o?[}= ijyq[ ?*v=.m :y<<eg
5 `,<o?[k= ijyq[ ?*v=.m :y<<eg
6 `g<o?[k= ijyq[ l*v=.m :y<<eg
7 `g<o?[k= ijyq[ l*ve.m :y<<eg
8 `g<o?[k= ij q[ l*ve.m :y<<eg
9 `v<o?[k= ij qs l*ve.m :p<<eg
10 ev<o?[k= ij qs l*ve.v :paieg
11 ev<o?[k( ij qs l*ke.v :paieg
12 ev<o?[k( it qs l*ke.v :paieg
13 ev<o?[k( it qs l_ke.v wpaieg
14 ev<oi[k( it qs l_ke.v wpaieg
15 ev<oi[k( it qs l_ke.a wpaieg
16 ev<oi[k( it [s l_ke.a wpaiel
17 ev<oi[k( it is l_ke.a wpaiel
18 ev<oi[k( it is l_ke.a wea*el
19 ev<oi[k( it is l_ke.a wea*el
20 mv<oi[k( it is l_ke.a wea*el
21 mv<oi[k( it is llke\a weasel
22 mv<,i[k( it is llke\a weasel
23 mv<hi[k( it is llke\a weasel
24 mvzhi[k( it is like\a weasel
25 mvzhi[k( it is like\a weasel
26 mv(hi[k( it is like\a weasel
27 mv(hi[ks it is like\a weasel
28 mv(hi[ks it is like\a weasel
29 mv(hi[ks it is like\a weasel
30 mvthi[ks it is like&a weasel
31 mvthi[ks it is like&a weasel
32 mvthimks it is like&a weasel
33 mvthimks it is like&a weasel
34 mvthimks it is like&a weasel
35 mvthimks it is like&a weasel
36 mvthimks it is like&a weasel
37 mxthimks it is like&a weasel
38 mxthimks it is like&a weasel
39 mxthimks it is like&a weasel
40 mxthimks it is like&a weasel
41 mxthimks it is like&a weasel
42 mxthimks it is like&a weasel
43 mxthimks it is like a weasel
44 mxthimks it is like a weasel
45 mxthilks it is like a weasel
46 mxthilks it is like a weasel
47 mxthilks it is like a weasel
48 mxthinks it is like a weasel
49 mxthinks it is like a weasel
50 mxthinks it is like a weasel
51 mxthinks it is like a weasel
52 mxthinks it is like a weasel
53 mxthinks it is like a weasel
54 mxthinks it is like a weasel
55 mxthinks it is like a weasel
56 mxthinks it is like a weasel
57 mxthinks it is like a weasel
58 methinks it is like a weasel```

And here is the Python code:

```#!/usr/bin/python
import string
import random
import sys
import copy
# Copyright(C) 2011 Iddo Friedberg
# Do not remove this comment

ALLCHARS = string.lowercase+' '+string.punctuation
def loopweasel(target_string,n_offspring,mut_rate):
i = 1
target_string = target_string.lower()
current_string = list(''.join(random.choice(ALLCHARS)
for i in range(len(target_string))))
print "    %s" % target_string
while target_string != ''.join(current_string):
print "%4d %s" % (i,''.join(current_string))
i += 1
offsprings = create_offspring(current_string,
n_offspring,mut_rate)
current_string = evolve_string(offsprings, target_string)
print "%4d %s" % (i,''.join(current_string))

def create_offspring(current_string,n_offspring,mut_rate):
offspring_list = []
for i in (range(n_offspring)):
offspring = []
for c in current_string:
if random.random() < mut_rate:
offspring.append(random.choice(ALLCHARS))
else:
offspring.append(c)
offspring_list.append(offspring)
return offspring_list

def diffseq(a,b):
diffcount = 0
for i,j in zip(a,b):
if i != j:
diffcount += 1
return diffcount

def evolve_string(offspring_list, target_string):
best_match = (2000,'')
for offspring in offspring_list:
diffscore = diffseq(offspring, target_string)
if diffscore < best_match[0]:
best_match = (diffscore,offspring)
return best_match[1]

if __name__ == '__main__':
if len(sys.argv) < 4:
print "Usage: weasel target_string n_offspring mutation_rate"
print
print "target_string: the string you would eventually evolve into"
print "n_offspring: number of offspring per generation"
print "mutation_rate rate of mutation per position, 0=< m <1"
sys.exit(1)
target_string = sys.argv[1]
n_offspring = int(sys.argv[2])
mut_rate = float(sys.argv[3])
for i in target_string:
if i not in ALLCHARS:
print "Error, string can only contain %s" % ALLCHARS
sys.exit(1)
if mut_rate >= 1.0 or mut_rate < 0:
print "Error: 0 =< mutation rate < 1"
sys.exit(1)
loopweasel(target_string,n_offspring, mut_rate)```

A quick guide: “loopweasel” (line 11) is the main bit that loops through generations. In each loop iteration it first calls “create_offspring” (line 25) which creates a list with the “offspring” strings (300 in the above example). Mutations are inserted (or not) in line 30. Control returns to loopweasel, which calls evolve_string (line 44). All offspring strings are score to the target (“methinks it is like a weasel”) string, using the “diffseq” code (line 37). (Is there a built-in way in Python for doing that? I could not find any.) The best scoring match is not he chosen offspring. It is printed, and the loop is repeated. Repeat until best-fitted offspring matches the target string.

It is actually quite fun to play with different mutation rates and offspring / generation numbers. Try it. Then rent a good Shakespeare movie. As a horror fan, I’m going to watch Titus tonight.

Titus Andronicus. Source: Wikipedia

Share and Enjoy:

### 10 Responses to “Shakespeare’s Birthday and Evolution”

1. Nick says:

Wow, so I’m pretty glad I found out about the Weasel program. I thought it was an amazing concept. So amazing, in fact, that I wrote my own in Java. Thought I might as well share it here: http://write.fm/weasel

In the current version, instead of having a constant mutation probability per character, it mutates a pre-determined amount of characters per generation. I should probably change that to make it more biology-y.

Also, it has an adaptive character set. That means it only uses the characters present in the destination string. So it won’t waste time with punctuation or uppercase letters if they aren’t used in the destination string.

2. What is the fitness function, i.e. the reward for a correct letter?
Are the correct letters locked in once found?

3. Iddo says:

@Bjørn

The fitness function is in lines 37-42. It is the Hamming distance between “methinks it is like a weasel” and any of the current offspring character strings.

The correct letters are not locked in. In fact, if you increase the mutation rate enough, you will find you go through many more generations (or until you give up and hit ctrl-C) to get to “methinks it is like a weasel”.

4. Okay, got it.

In other words, you choose the one most fit individual each generation to have 300 offspring, so there is no drift.

If you use Wright-Fisher selection you’ll have drift due to the stochastic selection, which will make the simulation more realistic, but will take much longer to find the target. How large population sizes can you handle?

5. Iddo says:

“In other words, you choose the one most fit individual each generation to have 300 offspring, so there is no drift.”

Well, not me. The weasel program was written by Dawkins, this is only a Python implementation.

Also, the Weasel program was never intended to model evolution accurately. “[It] was only meant to demonstrate the power of cumulative selection as compared to random selection, and show the complete unrealism of the popular notion of natural selection as ‘monkeys pounding on typewriters’. ” (Wikipedia

That said, it would be interesting to jack up this program and add some genetic drift model. I might do it some time, or are there any takers?

6. Actually, I made one a couple of years ago that can use either of these two methods of selection.

With the same params as you (N=300, mu=0.05, and 100 replicates), I get:
Winner-takes-all: mean updates to target = 39.98, stdev = 6.5056
Wright-Fisher does not converge with those parameters. The mutation rate is too high, so the population suffers from mutational meltdown (hovers around 17 correct out of 28 chars).

But with mu = 0.01 I get this:
Winner-takes-all: mean updates to target = 58.26, stdev = 14.833
Wright-Fisher: mean = 705.73, stdev = 141.99

Fun!

7. Update: Actually, I just re-read my code, and it turns out I used a fitness function that was 1.2^hits, where hits is the number of correct chars. That function goes up faster than linear, which makes the individuals fitter than just the number of hits when hits>15.

If I use fitness = hits, winner-takes-all, N=300, mu=0.05, I get mean = 41.95, std = 8.807. Not much different.

8. There is a one-line Hamming distance function in Python:
a=”string of 12″
b=”-fling of –”
print sum([i!=j for i,j in zip(a,b)])

This will work with either strings or lists—anything zippable.

9. Iddo says:

@gasstationwithoutpumps
Ha! Nice one-liner. Thanks Kevin!

10. eryksun says:

@gasstationwithoutpumps,

You don’t need a list. sum will work with a generator. You could also handle sequences of different length:

>>> a = “string of 12″
>>> b = “-fling of –”
>>> c = “-fling of –0123456789″
>>> print(abs(len(a) – len(b)) + sum(i != j for i, j in zip(a, b)))
5
>>> print(abs(len(a) – len(c)) + sum(i != j for i, j in zip(a, c)))
15