Cellular Automata and Artificial Intelligence III

ORDER	LIFE	DISORDER
----------------	----------------	----------------
pattern	fractal	random
trivial	computable	incomputable
I(P)=O(1)	I(P)=\|p\|	I(p)=oo
tone	music	noise
physics	biology	religion
determinism	free will	chaos
0	1/4	1/2

Life (and intelligence) exists on the border between order and disorder. What do I mean by this? And why do I say it? This article, third in a series, will attempt to justify this statement.

Each row in the table above represents an example from a specific discipline that illustrates this principle. The rows are from, respectively, dynamical systems theory, theory of computation, algorithmic information theory, theory of music, philosophy of science, ethics, and finally, the theory of cellular automata. My thesis is that each of these trichotomies result from a common cause -- a universal truth about self-organizing systems.

I will use the (I hope) least familiar example, cellular automata, to explain all the others. Suppose you build the simplest possible cellular automaton -- an infinite row of binary-valued cells. The value of each cell at step N is determined from the value of its nearest neighbors and itself at step N-1 -- the simplest possible non-trivial program. The program for each cell takes three binary inputs and outputs one binary value, so it can be represented by eight binary digits (bits) or one byte. For further simplicity, suppose all the cells are running the same program.

Suppose we want to investigate self-organizing behavior in this cellular automaton. We can do this by giving the CA (cellular automaton) a problem to solve that requires global cooperation. For example, if the CA starts in a random state (the binary values are randomly set to zero or one in the cells), we can ask it to compute whether there are more zeros or more ones in the initial state. This clearly requires global communication between the cells in an organized way. How does the CA "compute" the answer? We have a lot of choices here, so suppose we choose this: if there are more ones in the initial state, we want the CA to become all ones; more zeros, the CA should become all zeros.

As you might expect, if we randomly choose a program for the CA and see how it does on our problem, it fares pretty poorly. So we borrow a page from nature's book, and use simulated evolution. We randomly select a few programs (read "gene pool"), test them out and keep the ones that are better than the others ("natural selection") with a few random changes ("mutation"). We run this process over and over, and soon we have evolved a program thit does the job -- it solves the more-zeros-or-ones problem.

So, we change the problem and run the whole procedure over, and over, and after a while a certain pattern begins to emerge. It always seems to work out that the programs that evolve have densities in their programs near 1/4. In other words, of the eight bits in the surviving programs, on average either two are zero and six one, or vice versa. You never see a surviving program with high densities of zeros or ones, nor do you see a surviving program with equal densities of zero and ones.

Upon further experimentation, the reason for this becomes clear. Programs with densities near 0 (0 ones or 0 zeros) can only produce CAs that exhibit very regular behavior -- simple patterns or constants. Programs with densities near 1/2 (4 ones and 4 zeros) produce CAs that exhibit random, chaotic, unpredictable behavior. Only programs on the edge of order and disorder produce CAs that exhibit self-organizing behavior.

Further experimentation shows that this principle generalizes to more complex CAs, with more complex programs, higher dimensionality, etc. This principle is much more than an empty tautology (in fact, as far as I know, there is no mathematical proof of it), since it provides strong guidance in selecting initial programs for CAs. Such guidance is essential, since the number of such programs possible grows exponentially with the number of neighbors. The rate of convergence of genetic algorithms is very slow, and without this guidance practical programs for many problems could not be found in a reasonable time.

My thesis is that this same principle explains the table at the beginning of this article. Can this principle be put to practical use outside the CA field? That remains to be seen....