March 01, 2008
Quines
I've never read ‘Gödel, Escher, Bach’. (I've been lent it on occasion, but I never took to it.) Consequently, I only vaguely know about quines.
A quine is a program that produces its own source code as output. Such a program will contain a representation of the program. However, this representation is part of the program, and so must itself be represented. This is where it gets a bit fiddly.
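To make the fiddly part concrete, here is a minimal sketch of a well-known style of quine, written in Python rather than Perl and not taken from any particular source. The string s is the representation of the program; the %r placeholder is where a quoted copy of s itself gets substituted, which is how the representation ends up being represented. There are no comments in it, since any comment would also have to be reproduced in the output.

```python
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Saved as a two-line file and run, this should print those same two lines.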
Quines had always seemed a bit mysterious, but I've just read David Madore's account and it turns out that they are almost disappointingly straightforward.
The trouble is that quines I've seen have often been difficult to read. There are two reasons for this. One is that they are usually very terse. (The shorter a quine's source code is, the less work it has to do in producing it: there is some advantage in having the source consist of a single line.) The other is that a typical quine will contain text data that is very similar to the rest of the program: this doubling can often be visually confusing. Both of these factors are evident in this example.
I've tried to write a quine that avoids both of these problems.
#!/usr/bin/perl
$d='
H4sIAG+KyEcAA1WPUUvDMBSF3/Mrrl3LEgidW0vElL4oOvYwffDNMWQ1cQba
puQmyPz1piuD+XI598C593yzm0VAt2hMvxhO/tv2hKg6SZIM4yAkQ9MN1nlo
DqhFyeHNO9MfN68cjr9m4IAnJAHq85av43g2raYvttccEpdwmOQlll8EnQ7m
jSiV/rRKU8UYy50+KMpIvJqjVzb4/McZr2nYLe/v5Gol9pABVRzCLi6yEGLP
GJldQWjXEpKqep7hvBoBAmrYbrZPUj6cX1ZkdB5tNziNKOV7a5qKpCPEf1PK
Tnfr0EcoOnX8mErTNHatyBBJ/BdgaNA7mgYOtxyKgnFIY8Eruyg4LEUZI38o
ePPObgEAAA==
';
use MIME::Base64;
use Compress::Zlib;
$u = Compress::Zlib::memGunzip(decode_base64($d));
printf substr($u, 0, 33), $d, substr($u, 33, 164);
In a quine there are typically two important parts: some data, and some logic which decodes this data. In my program, the data is in the variable $d, and the last four lines are the decoder.
The data is base64 encoded: partly to disguise its content (avoiding the 'doubling' problem), and partly as it means that the data won't have any special characters (quotes, backslashes, etc) that would complicate the decoder. The important thing is that the four lines of the decoder are stored within it.
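As a small illustration of the 'no special characters' point, here is a sketch in Python (not the Perl used above). The decoder fragment in it is hypothetical and chosen only because it is full of quotes and backslashes; once encoded, it contains nothing that needs escaping inside a quoted string.

```python
import base64
import string

# A hypothetical decoder fragment (an assumption for illustration only),
# containing both kinds of quote and some backslashes.
decoder = 'print("it\'s a \\"decoder\\"")\n'

# Encode it the way the data in a quine might be stored.
encoded = base64.b64encode(decoder.encode("ascii")).decode("ascii")

# Standard base64 output uses only letters, digits, '+', '/' and '=' padding.
safe = set(string.ascii_letters + string.digits + "+/=")

print(encoded)
print(set(encoded) <= safe)  # no quotes or backslashes survive
print(base64.b64decode(encoded).decode("ascii") == decoder)  # round trip
```

The price is that the data is no longer human-readable, which in this case is rather the point.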
The decoder has to print out the whole program: both data and decoder. This means it has to process the data twice over: once ‘raw’ and once decoded. This is why the decoder prints both $d itself, and substr($u, 33, 164), which is derived from the data in $d. In writing this program I used a ‘skeleton’ version, where the data was missing. I wrote another program to work out what the correct data should be.
To tie everything together there is also a formatting string. For convenience I've also stored that within the data in $d.
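The whole scheme, including the 'skeleton' step, can be sketched in Python. This is not the program I actually used, and it leaves out the gzip compression. The names FORMAT, DECODER, build_quine and run are all made up for the sketch. The generator encodes the formatting string plus the decoder as base64, drops the result into the skeleton, and then runs the finished program to check that it reproduces itself.

```python
import base64
import contextlib
import io

# Hypothetical skeleton: a formatting string with two holes, one for the
# raw data and one for the decoder lines.
FORMAT = "d = '%s'\n%s"

# The decoder has to know where the formatting string ends inside the
# decoded data, so that length is baked into it as a constant.
DECODER = (
    "import base64\n"
    "u = base64.b64decode(d).decode()\n"
    f"print(u[:{len(FORMAT)}] % (d, u[{len(FORMAT)}:]), end='')\n"
)


def build_quine() -> str:
    """Work out the data and fill in the skeleton."""
    payload = FORMAT + DECODER
    data = base64.b64encode(payload.encode("ascii")).decode("ascii")
    return FORMAT % (data, DECODER)


def run(source: str) -> str:
    """Execute a program held as text and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    quine = build_quine()
    print(quine, end="")
    print("# output matches source:", run(quine) == quine)
```

The generated program uses d twice, just as described above: once raw, as the first argument to the formatting string, and once decoded, to recover both the formatting string and the decoder.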
Is that really all there is to it?
Posted by robin2 at 08:47 PM