Friday, September 26, 2008

Living in a simulation

There has been a lot of press lately (summarized here) on the possibility that we are living inside a computer simulation. Much of the attention has focused on whether you could know that you live in a simulation. Here, I will focus on what you could know (or compute) if you were running the simulation. Before I proceed, I'll review some points from the theory of computation. I've alluded to these ideas in several of my recent posts but have never formally introduced them. This is obviously a very deep area, so I'll only summarize the most important points.

The theory of computation is basically about repeated operations on a finite set of symbols according to some rule. The paradigm is the Turing machine, which consists of a finite number of internal states and a tape on which symbols can be written or erased. The Turing machine makes transitions from state to state based on its current state and the symbol it reads on the tape. One of the states tells the machine to halt, which signals the end of a computation. The important thing to take away is the Church-Turing thesis, which states that all forms of computation on finite symbol sets are essentially equivalent. For example, the computer on your desk is equivalent to a Turing machine. (Actually, it is even less powerful because its memory is finite, but I digress.)
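
To make this concrete, here is a minimal Turing machine simulator in Python (my own sketch, not part of the press coverage). The example machine, the state names, and the table format are just illustrative conventions; the toy machine computes the successor of a number written in unary.

```python
# Minimal Turing machine simulator (illustrative sketch).
# The transition table maps (state, symbol) to (new_symbol, move, new_state).

def run_turing_machine(transitions, tape, state="start", max_steps=1000):
    cells = dict(enumerate(tape))          # sparse tape; missing cells read as blank "_"
    head = 0
    for _ in range(max_steps):
        if state == "halt":                # the distinguished halting state
            break
        symbol = cells.get(head, "_")
        new_symbol, move, state = transitions[(state, symbol)]
        cells[head] = new_symbol
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)), state

# A toy machine (hypothetical example): skip over a block of 1s, append one more 1, halt.
unary_successor = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("1", "R", "halt"),
}

print(run_turing_machine(unary_successor, "111"))   # ('1111', 'halt')
```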

One of the things that Turing proved is that there is a universal Turing machine, which can emulate any other Turing machine. The input to a Turing machine is a set of symbols on the tape, i.e. an input string. Some of the symbols code for instructions or rules and the others code for an initial condition. Turing also showed that there are problems that a Turing machine can never solve. The most famous is the Halting Problem: there does not exist a Turing machine that can decide whether another Turing machine will halt on a given input. Turing actually showed that it is impossible to produce a general algorithm to decide whether a given input string to a Turing machine will ever cause it to print a given symbol. In other words, there is no general algorithm to decide if a computation will run in an infinite loop or perform some specific task. This doesn't imply that you couldn't prove that a specific program has this property, just that there isn't a way to do it in general. The proof of the Halting Problem is similar to Cantor's diagonal proof that the set of real numbers is uncountable.
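
The flavour of the diagonal argument can be captured in a few lines. The function halts below is a purely hypothetical oracle; the point of the sketch is that its mere existence would lead to a contradiction.

```python
# Sketch of the diagonal argument. `halts` is a hypothetical oracle assumed to decide
# whether program(arg) ever halts; Turing showed no such function can exist.

def halts(program, arg):
    raise NotImplementedError("assumed for contradiction; cannot be implemented in general")

def diagonal(program):
    # Do the opposite of whatever the oracle predicts about the program run on itself.
    if halts(program, program):
        while True:        # loop forever if the oracle says we would halt
            pass
    return                 # halt if the oracle says we would loop

# diagonal(diagonal) is contradictory either way: if it halts, it loops; if it loops,
# it halts. Hence the oracle `halts` cannot exist.
```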

One of the consequences of Turing's work is that the total number of possible computations is countable. You simply take all strings of length 1, then length 2, and so on, and feed them to a universal Turing machine. Every possible computation or numerical simulation will be included. Thus, any simulation of the universe is coded in one of these strings. Some of these inputs will lead to computations that halt and some will run forever. Some will make the Turing machine print a particular symbol and some will not. However, there is no general way to decide which input strings belong on which of these lists.
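
As a sketch of how this enumeration looks in practice, the snippet below lists every binary string in order of length and dovetails over (program, step budget) pairs. The runner passed in is just a stand-in for a universal machine; dovetailing is the standard trick for eventually visiting every halting computation without getting stuck on one that runs forever.

```python
from itertools import count, product

def all_binary_strings():
    """Yield '', '0', '1', '00', '01', ... : a countable list containing every input."""
    yield ""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def dovetail(run_for, rounds=4):
    """Interleave programs and step budgets; `run_for` stands in for a universal machine."""
    programs, gen = [], all_binary_strings()
    for budget in range(1, rounds + 1):
        programs.append(next(gen))         # admit one more input string each round
        for p in programs:
            run_for(p, budget)             # run every admitted string for `budget` steps

dovetail(lambda program, steps: print(steps, repr(program)))
```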

The question then is: given an input string, can you determine whether it will produce a universe that has some property, such as supporting life? There are actually two separate issues here. The first is how you would even define life, or recognize it in the computation; I will deal with this in a future post. The second is, given that you do have a definition of life, can you know ahead of time whether your simulation will produce it? The answer is no, because if it were yes you could solve the Halting Problem. This is easy to see because any definition of life must involve some pattern of symbols being printed on the tape, and there is no way to decide whether an input string will ever produce a given symbol, much less a pattern. This doesn't mean that a simulator couldn't come up with a simulation of our universe; it just means that she could never come up with a general algorithm to guarantee it. So, in the infinite but countable list of possible computations, some produce simulations of universes, perhaps even ours, but we can never know for sure which ones.
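
For completeness, the reduction can be written out explicitly. Everything here is hypothetical: produces_pattern is the assumed life detector, and the wrapper shows how it would hand you a halting decider for free.

```python
def produces_pattern(program, arg, pattern):
    """Hypothetical oracle: would running program(arg) ever print `pattern`?"""
    raise NotImplementedError("cannot exist in general, by the argument above")

def halts(program, arg):
    # Wrap the program so that a sentinel pattern is printed exactly when it finishes.
    def wrapped(x):
        program(x)
        print("HALTED")
    return produces_pattern(wrapped, arg, "HALTED")   # deciding this decides halting
```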

Saturday, September 20, 2008

Complexity of the brain

The Kolmogorov complexity of an object is the length of the minimal description of that object. In terms of the brain, it would correspond to the length of the smallest computer program that could reproduce the brain. It could also be thought of as the amount of information necessary to model the brain. Computing the Kolmogorov complexity exactly is impossible, since it is an uncomputable quantity, but we can estimate it. If we presume that molecular biology is computable, then one estimate of the Kolmogorov complexity of the brain is given by the length of the genome, which is 3 billion base pairs, or 6 billion bits. To be conservative, we could also include the genomes of both the mother and the baby, which gives 12 billion bits. This corresponds to less than two billion bytes and easily fits on a DVD. Hence, in principle, we could potentially grow a brain with less than 12 billion bits of information, and this is probably an upper bound.
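
For the record, here is that arithmetic as a few lines of Python, using the numbers quoted above:

```python
base_pairs = 3_000_000_000                  # human genome length in base pairs
bits_per_base = 2                           # four possible bases, so 2 bits each
genome_bits = base_pairs * bits_per_base    # 6 billion bits
bound_bits = 2 * genome_bits                # mother plus baby: 12 billion bits
print(genome_bits, bound_bits, bound_bits / 8 / 1e9)   # about 1.5 GB, well under a DVD
```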

However, this would not imply that we could describe a brain with this amount of information, since it ignores the modifications due to external inputs. For example, the visual system cannot fully develop if the brain does not receive visual inputs. So we also need to estimate how much input the brain receives during development. The amount of information available in the external world is immense, so it is safe to assume that the amount received is limited by the brain and not the source. However, there is no principled way to estimate this, since we don't know how the brain actually works. Depending on what you assume the neural code to be (see previous post), you could end up with a very wide range of answers. Nonetheless, let's suppose that it is a rate code with a window of 10 ms. Neurons generally fire at rates below 100 Hz, so the code amounts to the presence or absence of a spike in each 10 ms window, or 100 bits per neuron per second. The brain has about 10^11 neurons, so the maximum amount of information that could be input to the brain is about 10^13 bits per second. There are over 30 million seconds in a year, so that is a lot of information and can easily dwarf the genomic contribution.
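
Again the arithmetic, using the figures above (the per-year total is just the straightforward extrapolation):

```python
neurons = 1e11                     # rough neuron count
bits_per_neuron_per_s = 100        # one bit per 10 ms window under the assumed rate code
seconds_per_year = 3.15e7

bits_per_second = neurons * bits_per_neuron_per_s        # about 1e13 bits per second
bits_per_year = bits_per_second * seconds_per_year       # about 3e20 bits per year
print(bits_per_year / 12e9)        # ratio to the 12-billion-bit genomic bound
```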

However, this does suggest a potential way to quantify the genes-versus-environment debate about intelligence and behaviour. If the complexity of the brain is less than 12 billion bits, then we are basically genetically determined. If it is greater, then we are mostly shaped by the environment. So what do you think?

Saturday, September 13, 2008

Contrasting worldviews - Part 2

Compared to biologists and mathematicians, physicists are much more uniform in their worldview. There is a central canon of physics that everyone is taught. When it comes to how physicists approach biology, especially with regard to how many details to include, their ideas are shaped by the concepts of universality and the renormalization group, which I will explain below. This is partly what gives physicists the confidence that complex phenomena can have simple explanations, although this hegemony of the physics worldview is starting to break as physicists become more immersed in biology. In fact, I've even noticed a backlash of some physicists cum biologists against colleagues who espouse the notion that details can be dispensed with. Some physicists are very much of the opinion that biologically detailed modeling is necessary to make progress. In this sense, what I'm describing might more appropriately be called the old physics worldview.

The concept of universality arose from the study of phase transitions and critical phenomena, with inspiration from quantum field theory. In a nutshell, it says that for certain systems in regimes where there is no obvious length scale (usually indicated by power law scaling), such as at the critical point of a second order phase transition, the large scale behavior of the system is independent of the microscopic details and only depends on general properties such as the dimension of the space and the symmetries of the system. Hence, systems can be classified into what are called universality classes. Although the theory was developed for critical phenomena in phase transitions, it has since been generalized to a wide range of dynamical situations such as earthquakes, avalanches, flow through porous media, reaction-diffusion systems, and so forth.

The paradigmatic system for critical phenomena is magnetism. Bulk (ferro)magnetism arises when the atoms (each of which has a small magnetic moment) align and produce a macroscopically observable magnetization. However, this only occurs at low temperatures. At high enough temperatures, the random motions of the atoms destroy the alignment and the magnetization is lost (the material becomes paramagnetic). The change from ferromagnetism to paramagnetism is called a phase transition and occurs at a critical temperature (the Curie temperature).

These systems are understood by considering the energy associated with different states. The probability of occupying a given state is given by its Boltzmann weight, exp(-H(m)/kT), where H(m) is the internal energy of the state with magnetization m (also called the Hamiltonian), T is the temperature, and k is the Boltzmann constant. From the Boltzmann weights, the partition function (the sum of the Boltzmann weights over all states) can be constructed, and from it all quantities of interest can be obtained. This particular system was studied over a century ago by notables such as Pierre Curie, who, using the known microscopic laws of magnetism and mean field theory, found that below a critical temperature Tc, m is nonzero, and above Tc, m is zero.
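
As a toy illustration of Boltzmann weights and the partition function (my own example, with the temperature measured in units where k = 1), here is an exact enumeration for a small one-dimensional Ising chain:

```python
from itertools import product
from math import exp

def mean_abs_magnetization(n=8, J=1.0, T=1.0):
    """Exact enumeration of an open Ising chain with H = -J * sum_i s_i s_{i+1}."""
    Z = 0.0
    m_acc = 0.0
    for spins in product((-1, 1), repeat=n):
        H = -J * sum(spins[i] * spins[i + 1] for i in range(n - 1))
        w = exp(-H / T)                       # Boltzmann weight of this configuration
        Z += w                                # partition function accumulates all weights
        m_acc += abs(sum(spins)) / n * w
    return m_acc / Z                          # thermal average of |m|

print(mean_abs_magnetization(T=0.5), mean_abs_magnetization(T=5.0))   # aligned vs. disordered
```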

However, the modern way of thinking about phase transitions starts with Landau, who first applied it to the onset of superfluidity in helium. Instead of trying to derive the energy from first principles, Landau said: let's write out a general form based on the symmetries of an order parameter, which in this example is the magnetization m(x) at spatial location x. Since the energy must be a scalar, it can only depend on terms like |m|^2 or (grad m)^2. The first few terms obey H ~ \int dx [ q (grad m)^2 + (T-Tc)/2 m^2 + u m^4 + ... ], for parameters q, Tc, and u. The (grad m)^2 term accounts for spatial fluctuations. If fluctuations are ignored, we get mean field theory, in which case H ~ (T-Tc)/2 m^2 + u m^4. The partition function can be estimated using a saddle point approximation, which in the mean field limit amounts to evaluating the critical points of H; these are m = 0 and m^2 = (Tc-T)/(4u). They correspond to the equilibrium states of the system, so if T is greater than Tc the only solution is m = 0, and if T is less than Tc the magnitude of the magnetization is nonzero.
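
Here is a minimal sketch of that mean field calculation, with made-up parameters Tc = 1 and u = 1/4; it minimizes the free energy on a grid of m values and compares the result with the analytic critical point:

```python
from math import sqrt

def equilibrium_m(T, Tc=1.0, u=0.25):
    """Minimize f(m) = (T - Tc)/2 m^2 + u m^4 by brute force and compare with theory."""
    f = lambda m: 0.5 * (T - Tc) * m**2 + u * m**4
    grid = [i / 1000 for i in range(-3000, 3001)]
    m_numeric = abs(min(grid, key=f))
    m_analytic = sqrt(max(0.0, (Tc - T) / (4 * u)))
    return m_numeric, m_analytic

for T in (0.5, 0.9, 1.0, 1.5):
    print(T, equilibrium_m(T))     # nonzero magnetization below Tc = 1, zero above
```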

The partition function cannot be explicitly computed in the presence of fluctuations. This is where Ken Wilson and the renormalization group come in. What Wilson said, following people before him like Murray Gell-Mann, Francis Low and Leo Kadanoff, is: suppose we have scale invariance, which is true near a critical point. Then if we integrate out small length scales (or high spatial frequencies), rescale in x, and renormalize m, we end up with a new partition function of the same form but with slightly different parameters. These operations form a group action (i.e. a dynamical system) on the parameters of the partition function. Thus, a scale invariant system should sit at a fixed point of the renormalization group action. In other words, if you keep applying the renormalization group, the parameters flow to a fixed point, and the location of the fixed point only depends on the symmetry of the order parameter and the dimension of the space. Many different systems can flow to the same fixed point. The most important element of the renormalization group, in terms of the physics worldview, is that the terms in the Hamiltonian are renormalized in different ways. Some grow (these are called relevant operators), some stay the same (marginal operators), and some shrink (irrelevant operators). For critical systems, only a small number of terms in the Hamiltonian are relevant, and this is why microscopic details do not matter at large scales.
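
To see a renormalization group flow in action, here is the standard textbook toy (not something discussed above): decimating every other spin of the one-dimensional Ising model maps the coupling K = J/kT to K' = atanh(tanh(K)^2). Iterating drives any starting K to the fixed point at K = 0, consistent with the absence of a finite temperature transition in one dimension.

```python
from math import atanh, tanh

def rg_step(K):
    """One decimation step for the 1D Ising chain: K -> atanh(tanh(K)^2)."""
    return atanh(tanh(K) ** 2)

K = 2.0                         # start at strong coupling (low temperature)
for step in range(10):
    print(step, round(K, 6))
    K = rg_step(K)              # each step removes half the spins and renormalizes K
```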

Now, these ideas were originally developed for behavior near a critical point, which is pretty specialized. If they were only applicable to equilibrium phase transitions, then physicists really wouldn't have a leg to stand on in terms of ignoring details. However, the ideas were later generalized to dynamical systems with critical behavior. What also motivates them is that power laws (also called 1/f or fractal scaling) seem to be ubiquitous. They can be found in the size distribution of earthquakes, thermal noise in resistors, the size of river meanders, the coastline of Norway, the size of hubs in the Internet, the connectivity of protein networks, and even neural firing patterns, to name a few. Although there is no agreement on why these systems exhibit power laws (many theories have been proposed), the spectre of the renormalization group and universality permeates the air and influences the physicist's worldview.

My personal view is that some details matter immensely while others do not. However, there is no a priori systematic way of deducing which is which; there are only rules of thumb and experience to assist us. Hence, even if you buy into the details-may-not-matter worldview, there is no prescription for how to implement it. What it does do is give me less confidence that there is such a thing as the "correct" theory for a system. I'm more inclined to believe that, given the current state of knowledge and a specific set of questions, some theories perform better than others. With more information, we can refine our theories. However, I don't think this process ever converges to "the" theory, because specifying what a system is is itself somewhat arbitrary. Nothing is completely isolated from its surroundings, so drawing a boundary always involves a choice. These can be very logical and well informed choices, but choices nonetheless. Also, we can never have full control of all the external inputs that can affect a system. In this way, I have a Bayesian viewpoint: we only make progress by updating our priors.

Saturday, September 06, 2008

Contrasting worldviews - Part 1

In my previous post, I talked about how we probably need a new worldview before we will be prepared to understand the brain. What I thought I would do here is introduce what I think forms the worldviews of people who do dynamical systems (a sizable contingent of the mathematical neuroscience community) and physics (in particular statistical mechanics and field theory). Having trained in physics and applied mathematics, served on the faculty of a math department, and worked on biology, I've had a chance to see how these different groups view science. The interesting thing is that in many ways biologists and mathematicians can sometimes understand each other better than physicists and mathematicians can. I was led to this belief after hearing Alla Borisyuk, an applied mathematician, remark at a conference for young researchers that I helped organize in 2000 that she had no trouble talking to biologists but had no idea what the physicists were talking about.

The reason biologists may have more in common with mathematicians than with physicists is that, unlike physics, biology has no guiding laws other than evolution, which is not quantitative. Biologists will rarely say, "Oh, that can't be true because it would violate conservation of energy," which is how Pauli predicted the neutrino. Given that there are no sweeping generalizations to make, they are forced to pay attention to all the details. They apply deductive logic to form hypotheses and try to show their hypotheses are true by constructing new experiments. Pure mathematicians are trained to take some axiomatic framework and prove things based on it. Except for a small set of mathematicians and logicians, most mathematicians don't take a stance on the "moral value" of their axioms; they just deduce conclusions within some well defined framework. Hence, in a collaboration with a biologist, a mathematician may take everything the biologist says with equal weight and go on from there. On the other hand, a physicist may bring a lot of preconceived notions to the table. (Applied mathematicians are a heterogeneous group, and their worldviews lie on a continuum between physicists and pure mathematicians.) Physicists also don't need to depend as much on deductive logic, since they have laws and equations to rely on. This may be what frustrates biologists (and mathematicians) when they talk to physicists: they can't understand how the physicists can be so cavalier with the details and yet so confident.

However, when physicists (and applied mathematicians) are cavalier with details, it is not because of Newton or Maxwell or even Einstein. The reason they feel that they can sometimes dispense with details is that their worldviews are shaped by Poincare, Landau and Ken Wilson. What do I mean by this? I'll cover Poincare (and here I use Poincare to stand for several mathematicians working around the turn of the twentieth century) in this post and get to Landau and Wilson in the next one. Poincare, among his many contributions, showed how dynamical systems can be understood in terms of geometry and topology. Prior to Poincare, dynamical systems were treated with the tools of analysis. The question was: given an initial condition, what are the analytical properties of the solutions as a function of time? Poincare said, let's not focus on the notion of movement with respect to time but instead look at the shape of the trajectories in phase space. For a dynamical system with smooth enough properties, the families of solutions map out a surface in phase space, with tiny arrows pointing in the direction the solutions move on this surface (i.e. a vector field). The study of dynamical systems then becomes the study of differential geometry and topology.

Hence, any time dependent system, including those in biology, that can be described by a (nice enough) system of differential equations is represented by a surface in some high dimensional space. Now, given some differential equation, we can always make a change of variables, and if this transformation is smooth then the result is just a smooth change in the shape of the surface. Thus, what is really important is the topology of the surface, i.e. how many singularities or holes it has. The singularities are the places where the vector field vanishes, in other words the fixed points. Since the vector field is smooth away from the fixed points, the global dynamics can be reconstructed by carefully examining the dynamics near the fixed points. The important things to keep track of when changing parameters are the appearance and disappearance of fixed points and changes in the dynamics (stability) of the fixed points. These discrete changes are called bifurcations. The dynamics near fixed points and bifurcations can be classified systematically in terms of normal form equations. Even for a very complicated dynamical system, the action is focused at the bifurcations. These bifurcations and the equations describing them are standardized (e.g. pitchfork, transcritical, saddle node, Hopf, homoclinic) and do not depend on all the details of the original system. Thus, when a dynamical systems person comes to a problem, she immediately views things geometrically. She also believes that there may be underlying structures that capture the essential dynamics of the system. This is what gives her confidence that some details are more important than others. Statistical mechanics and field theory take this idea to another level, and I'll get to that in the next post.
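
As a small aside before then, here is the simplest normal form, the pitchfork dx/dt = r x - x^3, worked out in a few lines (an illustrative example of my own, not tied to any particular model). Sweeping r through zero shows the fixed point at the origin losing stability as two new stable branches appear:

```python
from math import sqrt

def pitchfork_fixed_points(r):
    """Fixed points of dx/dt = r*x - x**3 and their stability from f'(x) = r - 3x^2."""
    points = [0.0] + ([sqrt(r), -sqrt(r)] if r > 0 else [])
    return {x: ("stable" if r - 3 * x**2 < 0 else "unstable") for x in points}

for r in (-1.0, -0.1, 0.1, 1.0):
    print(r, pitchfork_fixed_points(r))
```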