Scientific Clearing House

Saturday, October 11, 2008

Complexity of art and science

There seems to be a consensus that art cannot be compressed. A plot summary of Hamlet is not the same as Hamlet. A photo of Picasso's Guernica is not the same as the actual painting in Madrid. Music is a particulary interesting case. A Bach partita can be written down in a few thousand bits but reading the music is not the same as hearing it played by Heifetz or Menuhin. One could even argue that one performance by the same artist is not the same as a recording or even another performance.

This is in contrast to Science. We all know the theory of evolution but most of us have never read Darwin's On the Origin of Species. The three volumes of Newton's Principia Mathematica can now be reduced to F = ma and F = G m1 m2/r^2. Obviously, it takes some concerted study to understand these equations but one doesn't need to read Newton to do so. It is interesting that scientists tend to worry a lot about priority of a discovery while they are alive but unless their name is directly associated with a concept, theorem or equation, the provenance of many scientific ideas tend to get lost. Quantum mechanics is often taught before classical mechanics now so most starting students have no idea why the energy function is called a Hamiltonian. The concept of the conservation of energy is so natural to scientists now that most people don't realize how long it took to be established and who were the main players.

If art is not compressible then we can interpret the complexity of the brain in terms of the complexity of art. The complete works of Shakespeare runs a little over 1200 pages. Estimating 5000 characters per page and 8 bits per character leads to a total size of less than 50 million bits, which is not very much compared to the hard drive on your computer. Charles Dickens was much more prolific in terms of words generated. Bleak House alone is over 1000 pages. I haven't counted all the pages of all twenty plus novels but let's put his total output at say a billion bits.

If art is incompressible then that means there could not be an algorithm smaller than a billion bits that could have generated the work of Dickens. This would put a lower bound on the complexity of the "word generation" capabilities of the brain. Now perhaps if you are uncharitable (like some famous authors have been), you could argue that Dickens had a formula to generate his stories and so the complexity is actually less. One way to do this would be to take a stock set of themes, plots, characters, phrases and so on and then randomly assemble them. Some supermarket romances are supposedly written this way. However, no one would argue that they compare in anyway to Dickens, much less Shakespeare. Given that the Kolmogorov complexity is uncomputable we can never know for sure if art is compressible. So a challenge to computer scientists is to write a program that can generate literature with a program shorter than the work itself.

Friday, October 03, 2008

Modeling the financial crisis

There is an interesting op-ed piece in the New York Times this week by physicist and science writer Mark Buchanan on predicting the current financial crisis. His argument is that traditional economists were unable to predict or handle the current situation (Nouriel Rubini notwithstanding) since their worldviews are shaped by equilibrium theorems, which unfortunately are either incomplete or wrong. Buchanan writes:

Well, part of the reason is that economists still try to understand markets by using ideas from traditional economics, especially so-called equilibrium theory. This theory views markets as reflecting a balance of forces, and says that market values change only in response to new information — the sudden revelation of problems about a company, for example, or a real change in the housing supply. Markets are otherwise supposed to have no real internal dynamics of their own. Too bad for the theory, things don’t seem to work that way.
Nearly two decades ago, a classic economic study found that of the 50 largest single-day price movements since World War II, most happened on days when there was no significant news, and that news in general seemed to account for only about a third of the overall variance in stock returns. A recent study by some physicists found much the same thing — financial news lacked any clear link with the larger movements of stock values.
Certainly, markets have internal dynamics. They’re self-propelling systems driven in large part by what investors believe other investors believe; participants trade on rumors and gossip, on fears and expectations, and traders speak for good reason of the market’s optimism or pessimism. It’s these internal dynamics that make it possible for billions to evaporate from portfolios in a few short months just because people suddenly begin remembering that housing values do not always go up.
Really understanding what’s going on means going beyond equilibrium thinking and getting some insight into the underlying ecology of beliefs and expectations, perceptions and misperceptions, that drive market swings.

He then goes on to describe the work of some pioneers who are trying to model the actual dynamics of markets. A Yale economist with two physicists (Doyne Farmer being one of them) used an agent-based model to simulate a credit market. They found that as the leverage (amount of money borrowed to amplify gains) increases there is a phase transition or bifurcation from a functioning credit market to an unstable situation that results in a financial meltdown.

I found this article interesting on two points. The first is the attempt to contrast two worldviews: the theorem proving mathematician economist versus the computational physicist modeler. The second is the premise that the collective dynamics of a group of individuals can be simpler than the behavior of a single individual. A thousand brains may have a lower Kolmogorov complexity than a single brain. My guess is that biologists (Jim Bower?) may not buy this. Although, my worldview is more in line with Buchanan's, in many ways his view is on less stable ground than traditional economics. With an efficient market of rational players, you can at least make some precise statements. Whereas with the agent-based model there is little understanding as to how the models scale and how sensitive the outcomes depend on the rules. Sometimes it is better to be wrong with full knowledge than be accidentally right.

I've always been intrigued by agent-based models but have never figured how to use them effectively. My work has tended to rely on differential equation models (deterministic and stochastic) because I generally know what to expect from them. With an agent-based model, I don't have a feel for how they scale or how sensitive they are to changes in the rules. However, this lack of certainty (which also exists for nonlinear differential equations, just look at Navier-Stokes for example), may be inherent in the systems they describe. It could simply be that some complex problems are so intractable that any models of them will rely on having good prior information (gleened from any and all sources) or plain blind luck.

Friday, September 26, 2008

Living in a simulation

There has been a lot of press lately (summarized here) on the possibility that we are living inside of a computer simulation. Much of the attention has been focused on whether or not you could know if you lived in a simulation. Here, I will focus on what you could know (or compute) if you were running the simulation. Before I proceed, I'll briefly summarize some points about the theory of computation. I've alluded to these ideas in several of my recent posts but have never formally introduced them. Obviously, this is very deep area so I'll just briefly summarize some important points.

The theory of computation is basically about repeated operations on a finite set of symbols according to some rule. The paradigm is the Turing machine, which consists of a finite number of internal states and a tape on which symbols can be written or erased. The Turing machine then makes transitions from state to state based on its current state and the symbols on the tape. One of the states is for the Turing machine to halt, which signals the end of a computation. The important thing to take away is the Church-Turing thesis, which basically states that all forms of computation on finite symbol sets are basically equivalent. For example, the computer on your desk is equivalent to a Turing machine. (Actually, it is even less powerful because it is finite but I digress).

One of the things that Turing proved was that there is a universal Turing machine, which can emulate any other Turing machine. The input to a Turing machine is a set of symbols on the tape, i.e. an input string. Some of the symbols code for instructions or rules and the others code for an initial condition. Turing also showed that there are problems for which a Turing machine can never solve. The most famous is the Halting Problem, which states that there does not exist a Turing machine that can decide if another Turing machine will halt given some input. Turing actually showed that it was impossible to produce a general algorithm to decide if a given input string to a Turing machine will ever cause it to print a given symbol. In other words, there is no general algorithm to decide if a computation will have an infinite loop or perform some specific task. This doesn't imply that you couldn't prove that a specific program has this property, just that there isn't a way to do it generally. The proof of the Halting problem is similar to Cantor's diagonal proof that the set of real numbers is uncountable.

One of the consequences of Turing's work is that the total number of possible computations is countable. You simply take all strings of length 1, then 2, etc and feed it to a Universal Turing machine. Every possible computation or numerical simulation will be included. Thus, any simulation of the universe is coded in one of these strings. Some of these inputs will lead to computations that will halt and some will run forever. Some will make the Turing machine print a particular symbol and some will not. However, there is no way to decide, which of the input strings are on any of these lists.

The question is then, given an input string, can you determine if it will produce a universe that has some property such as supporting life. There are actually two separate issues regarding this question. The first is, how would you even define life or recognize it in the computation. I will deal with this in a future post. The second is, given that you have a definition of life, then can you know ahead of time whether or not your simulation will produce it. The answer to this question is no because if it were yes then you could solve the Halting problem. This is easy to see because any definition of life must involve some pattern of symbols to be printed on the tape and there is no way to decide if an input string will ever produce a symbol much less a pattern. This doesn't mean that a simulator couldn't come up with a simulation of our universe, it just means that she could never come up with a general algorithm to guarantee it. So, in the infinite but countable list of possible compuations, some produce simulations of universes, perhaps even ours, but we can never know for sure which.

Saturday, September 20, 2008

Complexity of the brain

The Kolmogorov complexity of an object is the length of the minimal description of that object. In terms of the brain, it would correspond to the length of the smallest computer program that could reproduce the brain. It could also be thought of as the amount of information necessary to model the brain. Computing the Kolmogorov complexity is not possible since it is an undecidable problem but we can estimate it. If we presume that molecular biology is computable then one estimate of the Kolmogorov complexity of the brain is given by the length of the genome, which is 3 billion base pairs long or 6 billion bits. To be conservative, we could also include the genome of the mother and baby, which implies 12 billion bits. This corresponds to less than two billion bytes and easily fits on a DVD. Hence in principle, we could potentially grow a brain with less than 12 billion bits of information and this is probably an upper bound.

However, this would not imply that we could describe a brain with this amount of information since it ignores the modifications due to external inputs. For example, the visual system cannot fully develop if the brain does not receive visual inputs. So we also need to estimate how much input the brain receives during development. The amount of information available in the external world is immense so it is safe to assume that the amount received is limited by the brain and not the source. However, there is no way to estimate this in a principled way since we don't know how the brain actually works. Depending on what you assume to be the neural code (see previous post), you could end up with a very wide range of answers. Nonetheless, let's suppose that it is a rate code with a window of 10 ms. Generally, neurons fire at rates less than 100 Hz so this corresponds to the presence or absence of a spike in a 10 ms window. This corresponds to 100 bits per neuron per second. The brain has about 10^11 neurons so the maximum amount of information that could be input to the brain is 10^13 bits per second. There are over 30 million seconds in a year, so that is a lot of information and can easily dwarf the genomic contribution.

However, this does lead us to a potential means to quantify the influence of genes versus environment on intelligence and behaviour debate. If the complexity of the brain is less than 12 billion bits then we are basically genetically determined. If it is greater, then we are mostly shaped by the environment. So what do you think?

Saturday, September 13, 2008

Contrasting worldviews - Part 2

Compared to biologists and mathematicians, physicists are much more uniform in their worldview. There is a central canon of physics that everyone is taught. With respect to how physicists approach biology especially with regards to how many details to include, their ideas are shaped by the concept of universality and the renormalization group, which I will explain below. This is partly what gives physicists the confidence that complex phenomena can have simple explanations although this physics worldview hegemony is starting to break as physicists become more immersed in biology. In fact, I've even noticed a backlash of some physicists cum biologists towards their colleagues that espouse the notion that details can be dispensed with. Some physicists are very much of the notion that biologically detailed modeling is necessary to make progress. In this sense, what I'm describing might be more appropriately called the old physics world view.

The concept of universality arose from the study of phase transitions and critical phenomena with inspiration from quantum field theory. In a nutshell, it says that for certain systems in regimes where there is no obvious length scale (usually indicated by power law scaling), such as at the critical point of a second order phase transition, the large scale behavior of the system is independent of the microscopic details and only depends on general properties such as the number of dimensions of the space and symmetries in the system. Hence, systems can be classified into what are called universality classes. Although, the theory was developed for critical phenomena in phase transitions, it has since been generalized to apply to a wide range of dynamical situations such as earthquakes, avalanches, flow through porous media, reaction diffusion systems and so forth.

The paradigmatic system for critical phenomena is magnetism. Bulk (ferro)magnetism arises when the atoms (each of which have a small magnetic moment) align and produce macroscopically observable magnetization. However, this only occurs for low temperatures. For high enough temperature, the random motions of the atoms can destroy the alignment and magnetization is lost (material becomes paramagnetic). The change from a state of ferromagnetism to paramagnetism is called a phase transition and occurs at a critical temperature (the Curie temperature).

These systems are understood by considering the energy associated with different states. The probability of occupying a given state is then given by the Boltzmann weight, which is exp(-H(m)/kT), where H(m) is the internal energy of the state with magnetization m (also called the Hamiltonian), T is the temperature, and k is the Boltzmann constant. Given the Boltzmann factor, the partition function (sum of Boltzmann weight over all states) can be constructed from which all quantities of interest can be obtained. Now this particular system was studied over a century ago by notables such as Pierre Curie, who using known microscopic laws of magnetism and mean field theory, found that below a critical temperature Tc, m is nonzero and above Tc m is zero.

However, the modern way of how we think of phase transitions starts with Landau, who first applied it to the onset of superfluidity of helium. Instead of trying to derive the energy from first principles, Landau said let's write out a general form based on the symmetries of an order parameter, which in this example is the magnetization m(x), at spatial location x. Since the energy must be a scalar, it can only depend on terms like |m|^2 or (grad m)^2. The first few terms then obey H ~ \int dx q (grad m)^2 - (T-Tc) m^2 + u m^4 + ..., for parameters q, T, and u. The (grad m)^2 term is due to fluctuations. If fluctuations are ignored, then this is called mean field theory, in which case H ~ -(T -Tc)/2 m^2 + u m^4. The partition function can be estimated by using a saddle point approximtion, which in the mean field limit amounts to evaluting the critical points of H, which are m=0 and m^2=(Tc-T)/4 u. They correspond to the equilibrium states of the system, so if T is greater than Tc then the only solution is m=0 and if T is less than Tc then the magnitude of magnetization is nonzero.

The partition function cannot be explicitly computed in the presence of fluctuations. This is where Ken Wilson and the renormalization group comes in. What Wilson said, following people before him like Murray Gell-Mann, Francis Low and Leo Kadanoff, is suppose we have scale invariance, which is true near a critical point. Then if we integrate out small length scales (or high spatial frequency scales), rescale in x, and then renormalize m, we will end up with a new partition function, but with slightly different parameters. These operations form a group action (i.e dynamical system) on the parameters of the partition function. Thus, a scale invariant system should be at a fixed point of the renormalization group action. In other words, if you keep applying the renormalization group, the parameters can flow to a fixed point and the location of the fixed point only depends on the symmetry of the order parameter and dimension of the space. Many different systems can flow to the same fixed point. The most important element of the renormalization group in terms of the physics worldview is that terms in the Hamiltonian are renormalized in different ways. Some grow, these are called the relevant operators, some stay the same, these are marginal operators, and some decrease, these are called irrelevant operators. For critical systems, only a small number of terms in the Hamiltonian are relevant and this is why microscopic details do not matter at large scales.

Now, these ideas were originally developed just for behavior near a critical point, which is pretty specialized. If it were only applicable to an equilibrium phase transition, then physicists really wouldn't have a leg to stand on in terms of ignoring details. However, these ideas were later generalized to dynamical systems with critical behavior. What also motivates them is that power laws (also called 1/f or fractal scaling) seem to be ubiquitous. They can been found in the size distribution of earthquakes, thermal noise in resistors, size of river meanders, the coastline of Norway, size of hubs in the Internet, connectivity of protein networks, and even neural firing patterns, to name a few. Although there is not an agreement as to why these systems exhibit power laws (many theories have been proposed), the spectre of the renormalization group and universality permeates the air and influences the physicist world view.

My personal view is that some details matter immensely while others do not. However, there is no a priori systematic way of deducing which is which. There are only rules of thumb and experience that can assist us. Hence, even if you buy into the details-may-not-matter worldview, there is no prescription for how to implement it. What it does do is give me less confidence that there is such a thing as the "correct" theory for a system. I'm more inclined to believe that given the current state of knowledge and a specific set of questions, some theories perform better than others. With more information, we can refine our theories. However, I don't think this process ever converges to "the" theory because specifying what a system is is somewhat arbitrary. Nothing is purely isolated from its surroundings, so drawing a boundary is always going to involve a choice. These could be very logical and well informed choices but choices nonetheless. Also, we can never have full control of all the external inputs that can affect a system. In this way, I have a Bayesian viewpoint in that we only make progress by updating our priors.

Saturday, September 06, 2008

Contrasting worldviews - Part 1

In my previous post, I talked about how we probably needed a new worldview before we would be prepared to understand the brain. What I thought I would do here is to introduce what I think forms the worldviews of people who do dynamical systems (which forms a sizable contingent of the mathematical neuroscience community) and physics (in particular statistical mechanics and field theory). Having trained in physics and applied mathematics, served on the faculty in a math department and worked on biology, I've gotten a chance to see how these different groups view science. The interesting thing is that in many ways biologists and mathematicians can sometimes understand each other better than physicists and mathematicians. I was led to this belief after hearing Alla Borisyuk, who is an applied mathematician, exclaiming at a conference I helped organize in 2000 for young researchers, that she had no trouble talking to biologists but had no idea what the physicists were talking about.

The reason why biologists may have more in common with mathematicians than physicists is because unlike physics, biology has no guiding laws other than evolution, which is not quantitative. They rarely will say, "Oh, that can't be true because it would violate conservation of momentum," which was how Pauli predicted the neutrino. Given that there are no sweeping generalizations to make they are forced to pay attention to all the details. They apply deductive logic to form hypotheses and try to prove their hypotheses are true by constructing new experiments. Pure mathematicians are trained to take some axiomatic framework and prove things are true based on them. Except for a small set of mathematicians and logicians, most mathematicians don't take a stance on the "moral value" of their axioms. They just deduce conclusions within some well defined framework. Hence, in a collaboration with a biologist, a mathematician may take everything a biologist says with equal weight and then go on from there. On the other hand, a physicist may bring a lot of preconceived notions to the table (Applied mathematicians are a heterogeneous group and their world views lie on a continuum between physicists and pure mathematicians.) Physicists also don't need to depend as much on deductive logic since they have laws and equations to rely on. This may be what frustrates biologists (and mathematicians) when they talk to physicists. They can't understand why the physicists can be so cavalier with the details and be so confident about it.

However, when physicists (and applied mathematicians) are cavalier with details, it is not because of Newton or Maxwell or even Einstein. The reason they feel that they can sometimes dispense with details is because their worldviews are shaped by Poincare, Landau and Ken Wilson. What do I mean by this? I'll cover Poincare (and here I use Poincare to represent several mathematicians near the turn of the penultimate century) in this post and get to Landau and Wilson in the next one. Poincare, among his many contributions, showed how dynamical systems can be understood in terms of geometry and topology. Prior to Poincare, dynamical systems were treated using the tools of analysis. The question was: Given an initial condition, what are the analytical properties of the solutions as a function of time? Poincare said, let's not focus on the notion of movement with respect to time but look at the shape of trajectories in phase space. For a dynamical system with smooth enough properties, the families of solutions map out a surface in phase space with tiny arrows pointing in the direction the solutions would move on this surface (i.e. vector field). The study of dynamical systems becomes the study of differential geometry and topology.

Hence, any time dependent system including those in biology that can be described by a (nice enough) system of differential equations, is represented by a surface in some high dimensional space. Now, given some differential equation, we can always make some change of variables and if this variable transformation is smooth then the result will just be a smooth change of shape of the surface. Thus, what is really important is the topology of the surface, i.e. how many singularities or holes are in them. The singularities are defined by places where the vector field vanishes, in other words the fixed points. Given that the vector field is smooth outside of the fixed points then the global dynamics can be reconstructed by carefully examining the dynamics near to the fixed points. The important thing to keep track of when changing parameters is the appearance and disappearance of fixed points or the change of dynamics (stability) of fixed points. These discrete changes are called bifurcations. The dynamics near fixed points and bifurcations can be classified systematically in terms of normal form equations. Even for some very complicated dynamical system, the action is focused at the bifurcations. These bifurcations and the equations describing them are standardized (e.g. pitchfork, transcritical, saddle node, Hopf, homoclinic) and do not depend on all the details of the original system. Thus, when a dynamical systems person comes to a problem, she immediately views things geometrically. She also believes that there may be underlying structures that capture the essential dynamics of the system. This is what gives her confidence that some details are more important than others. Statistical mechanics and field theory takes this idea to another level and I'll get to that in the next post.

Sunday, August 31, 2008

Understanding the brain

I think that sometimes philosophy is important and this may be true for neuroscience right now. I don't mean ivory tower, "what is life?" type philosophy (although that is important too) but trying to pin down what it would mean to say "we understand the brain." When would we know that the game is won? I think this is important for neuroscience now to help to guide research. What should we be doing?

So what does it mean to understand something? I would say there are two aspects. One is predictive power, which would mean that we would be able to know what drugs or therapies would be useful to cure a brain disorder. The second aspect is more difficult to pin down but would basically mean incorporating something seamlessly into your worldview. The simplest example I can give is a mathematical theorem. Predictive understanding would correspond to the ability to follow all the steps of the proof of the theorem and use the theorem to prove new theorems. Incorporative understanding would be the ability to summarize the proof in a way that relates it in a highly compressed form to things you already know. For example, we can understand bifurcations of complicated dynamical systems by reducing them to the behavior of solutions of simple polynomial equations.

Sometimes the two views can clash. Consider the proof of the Kepler Conjecture for sphere packing by my friend and former colleague Tom Hales. The theorem is difficult because there are an infinite number of ways to pack spheres in 3 dimensions. Hales made this manageable by showing that this could be reduced to solving a finite (albeit large) optimization problem. He then proceeded to solve the finite problem computationally. To some people, the proof is a done deal. The trick was to reduce it to a finite problem, after that it is just details. Even if you don't believe Hales's computation you could always repeat it. Others would say, it is not done until you have a complete pen and pencil proof. To me, I think the proof is understandable because Hales was able to reduce it to an algorithm. However this is not a view that everyone shares.

Now we come back to the brain. What would you consider understanding to entail? I'm not sure that we, namely people working in the field today, will ever have that satisfying incorporating understanding of the brain because we don't have anything in our current worldview that could encapsulate that understanding. We will never be able to say, "Oh right, I understand, the brain is like X." In that sense, it is like quantum mechanics (QM). This is a theory that is highly successful in the predictive sense. As a predictive theory, it is quite simple. There are just a few rules to apply and much of our modern technology like lasers and electronics rely on it. However, no one who has ever thought about it would claim any understanding of QM in the incorporation sense. The Copenhagen interpretation is basically a "Don't ask, don't tell" policy for the theory.

In this sense, trying to understand any complex system, no matter how unrelated it is to the brain, could help in the long run to provide a foundation for an incorporating understanding of the brain. That is not to say that I believe there are laws of complex system similar to classical and quantum mechanics. My own view is that there are no laws in complex systems such as the global climate, economics or the brain; there are just effective theories that sort of work in limited circumstances. However, it is by slowly creating effective theories and models that we will form a new worldview of what it means to understand complex systems like the brain. In the meantime, we should continue to try to build a predictive understanding so that we can cure diseases and treat disorders.

Thursday, August 21, 2008

Materialism and meaning

Let me first say that I am a die hard materialist in that I do believe that there is nothing beyond the physical world. I also believe that physics and hence the physical world is computable in that it can be simulated on a computer. However, helped along by Stuart Kauffman's new book Reinventing the Sacred, I have been gradually edging towards accepting that even in a purely materialistic world there is some amount of arbitrariness in our perception of reality. Kauffman argues that this arbitrariness is not "mathematizable". I will argue here that the question can be formulated mathematically and can be shown to be undecidable or at best intractable. Kauffman's thesis is that we should take advantage of this arbitrariness and make it the foundation of a new concept of the sacred.

The problem arises from how we assign meaning to things in the world. Philosophers like Wittgenstein and Saul Kripke have thought very deeply on this topic and I'll just barely scratch the surface here. The simple question to me is what do we consider to be real. I look outside my window and I see some people walking. To me the people are certainly real but is "walking" real as well? What exactly is walking? If you write a simple program that makes dots move around on a computer screen and show it to someone then depending on what the dots are doing, they will say the dots are walking, running, crawling and so forth. These verbs correspond to relationships between things rather than things themselves. Are verbs and relationships real then? They are certainly necessary for our lives. It would be hard to communicate with someone if you didn't use any verbs. I think they are necessary for an animal to survive in the world as well. A rabbit needs to classify if a wolf is running or walking to respond appropriately.

Now, once we accept that we can ascribe some reality or at least utility to relationships then this can lead to an embarrassment of riches. Suppose we live in a world with N objects that you care about. This can be at any level you want. The number of ways to relate objects in a set is the number of subsets you can form out of those objects. This is called the power set and has cardinality (size) 2^N. But it can get bigger than that. We can also build arbitrarily complex arrangements of things by using the objects more than once. For example, even if you only saw a single bird, you could still invent the term flock to describe a collection of birds. Another way of saying this is that given a finite set of things, there are an infinite number of ways to combine them. This then gives us a countable infinity of items. Now you can take the power set of that set and end up with an uncountable number of items and you can keep on going if you choose. (Cantor's great achievement was to show that the power set of a countable set is uncountable and the power set of an uncountable set is even bigger and so forth). However, we can probably only deal with a finite number of items or at most a countable list (if we are computable ). This finite or countable list encapsulates your perception of reality and if you believe this argument then the probability of obtaining our particular list is basically zero. In fact, given that the set of all possible lists is uncountable, this implies that not all lists can even be computed. Our perception of reality could be undecidable. To me this implies an arbitrariness in how we interact with the physical world which I call our prior. Kauffman calls this the sacred.

Now you could argue that the laws of the material world will lead us to a natural choice of items on our list. However, if we could rerun the universe with a slightly different initial condition would the items on the list be invariant? I think arbitrarily small perturbations will lead to different lists. An argument supporting this idea is that even among different world cultures we have slightly different lists. There are concepts in some languages that are not easily expressible in others. Hence, even if you think the list is imposed by the underlying laws of the physical world, in order to derive the list you would need to do a complete simulation of the universe making this task intractable.

This also makes me have to back track on my criticism of Montague's assertion that psychology can affect how we do physics. While I still believe that we have the capability to compute anything the universe can throw at us, our interpretation of what we see and do can depend on our priors.

Saturday, August 16, 2008

Realistic versus abstract neural modeling

There is a very interesting discourse running on the comp-neuro email list. I've only caught the past week but it seems to be a debate between the benefits of "abstract" versus biological "realistic" models. (Let me caveat that everything here is my interpretation of the two points of view). Jim Bower, who is a strong proponent of realistic modeling, argues that abstract models (an example is a network of point neurons) add biases that lead us astray. He thinks that only through realistic modeling can we set down all the necessary constraints to discover how the system works. In a side remark, he also said that he thought the most important problem to understand is what information a given neuron transmits to another and the rest is just clean up. Bower believes that biology is pre-Copernican and that abstract modeling is akin to Ptolemy adding epicycles to explain planetary motion and realistic modeling is closer to the spirit of Kepler and Newton.

I don't want to debate the history and philosophy of science here but I do want to make some remarks about these two approaches. There are actual several dichotomies at work. One of the things it seems that Bower believes is that a simulation of the brain is not the same as the brain. This is in line with John Searle's argument that you have to include all the details to get it right. In this point of view, there is no description of the brain that is smaller than the brain. I'll call this viewpoint Kolmogorov Complexity Complete (a term I just made up right now). On the other hand, Bower seems to be a strict reductionist in that he does believe that understanding how the parts work will entirely explain the whole, a view that Stuart Kauffman argued vehemently against in his SIAM lecture and new book Reinventing the Sacred. Finally, in an exchange between Bower and Randy O'Reilly, who is a computational cognitive scientist and connectionist, Bower rails against David Marr and the top down approach to understanding the brain. Marr gave an abstract theory of how the cerebellum worked in the late sixties and Bower feels that this has been leading the field astray for forty years.

I find this debate interesting and amusing on several fronts. When I was at Pitt, I remember that Bard Ermentrout used to complain about connectionism because he thought it was too abstract and top down whereas using Hodgkin-Huxley-like models for spiking neurons with biologically faithful synaptic dynamics was the bottom up approach. At the same time, I think Bard (and I use Bard to represent the set of mathematical neuroscientists that mostly focus on the dynamics of interacting spiking neurons; a group to which I belong) was skeptical that the fine detailed realistic modeling of single neurons that Bower was attempting would enlighten us on matters of how the brain worked at the multi-neuron scale. One man's bottom up is another man's top down!

I am now much more agnostic about modeling approaches. My current view is that there are effective theories at all scales and that depending on the question being asked there is a level of detail and class of models that are more useful to addressing that question. In my current research program, I'm trying to make the effective theory approach more systematic. So if you are interested in how a single synaptic event will influence the firing of a Purkinje cell then you would want to construct a multi-compartmental model of that cell that respected the spatial structure. On the other hand if you are interested in understanding how a million neurons can synchronize, then perhaps you would want to use point neurons.

One of the things that I do believe is that complexity at one scale may make things simpler at higher scales. I'll give two examples. Suppose a neuron wanted to do coincidence detection of its inputs, e.g. it would collect inputs and fire if the inputs arrived at the same time. Now for a spatially extended neuron, inputs arriving at different locations on the dendritic tree could take vastly different amounts of time to arrive at the soma where spiking is initiated. Hence simultaneity at the soma is not simultaneity of arrival. It thus seemed that coincidence detection was a hard problem for a neuron to do. Then it was discovered that dendrites have active ion channels so that signals are not just passively propagated, which is slow, but actively propagated quickly. In addition, the farther away you are the faster you go so that no matter where a synaptic event occurs, it takes about the same amount of time to reach the soma. The dendritic complexity turns a spatially extended neuron into a point neuron! Thus, if you just focused on understanding signal propagation in the dendrites, your model would be complicated but if you only cared about coincidence detection, your model could be simple. Another example is in how inputs affect neural firing. For a given amount of injected current a neuron will fire at a given frequency giving what is known as an F-I curve. Usually in slice preparations, the F-I curve of a neuron will be some nonlinear function. However, in these situations not all neuromodulators are present so some of the slower adaptive currents are not active. When everything is restored, it was found (both theoretically by Bard and experimentally) that the F-I curve actually becomes more linear. Again, complexity at one level makes it more simple at the next level.

Ultimately, this "Bower" versus "Bard" debate can never be settled because the priors (to use a Bayesian term) of the two are so different. Bower believes that the brain is Kolmogorov complexity complete (KCC) and Bard doesn't. In fact, I think that Bard believes that higher level behavior of networks of many neurons may be simpler to understand than sets of just a few neurons. That is why Bower is first trying to figure out how a single neuron works whereas Bard is more interested in explaining a high level cognitive phenomenon like hallucinations in terms of pattern formation in an integro-differential system of equations (i.e. Wilson-Cowan equations). I think most neuroscientists believe that there is a description of the brain (or some aspect of the brain) that is smaller than the brain itself. On the other hand, there seems to be a growing movement towards more realistic characterization and modeling of the brain at the genetic and neural circuit levels (in addition to the neuron level) as evidenced by the work at Janelia Farm and EPFL Lausanne, of which I'll blog about in the future.

Friday, August 15, 2008

New Paper on insulin's effect on free fatty acids

A paper I've been trying to get published for two years will finally appear in the American Journal of Physiology - Regulatory, Integrative, and Comparative Physiology. The goal of this paper was to develop a quantitative model for how insulin suppresses free fatty acid levels in the blood. A little background for those unfamiliar with human metabolism. All of the body's cells burn fuel and for most cells this can be fat, carbohydrate or protein. The brain, however, can only burn glucose, which is a carbohydrate, and ketone bodies, which are made when the body is short of glucose. Why the brain can't burn fat is still a mystery. It is not because fat cannot cross the blood brain barrier as is sometimes claimed. Thus, the body has a reason to regulate glucose levels in the blood. It does this through hormones, the most well known of which is insulin.

Muscle cells cannot uptake glucose unless insulin is present. So when you eat a meal with carbohydrates, insulin is released by the pancreas and your body utilizes the glucose that is present. In between meals, muscle cells mostly burn fat in the form of free fatty acids that are released by fat cells (adipocytes) through a process called lipolysis. The glucose that is circulating is thus saved for the brain. When insulin is released, it also suppresses lipolysis. Basically, insulin flips a switch that causes muscle and other body cells to switch from burning fat to glucose and in addition switches off the fuel supply for fat.

If your pancreas cannot produce enough insulin then your glucose levels will be elevated and this is diabetes mellitus. Fifty years ago, diabetes was usually the result of an auto-immune disorder that destroyed pancreatic beta cells that produce insulin. This is known as Type I diabetes. However, recently the most prevalent form of diabetes, called Type II, arises from a drawn out process attributed to overweight or obesity. In Type II diabetes, people first go through a phase called insulin resistance where more insulin is required for glucose to be taken up by muscle cells. The theory is that after prolonged insulin resistance, the pancrease eventually wears out and this leads to diabetes. Insulin resistance is usually reversible by losing weight.

Thus, a means to measure how insulin resistant or sensitive you are is important. This is usually done through a glucose challenge test, where glucose is either ingested or injected and then the response of the body is measured. I don't want to get into all the methods used to assess insulin sensitivity but one of the methods uses what is known as the minimal model of glucose disposal, which was developed in the late seventies by Richard Bergman, Claudio Cobelli and colleagues. This is a system of 2 ordinary differential equations that model insulin's affect on blood glucose levels. The model is fit to the data and an insulin sensitivity index is one of the parameters. Dave Polidori, who is a research scientist at Johnson and Johnson, claims that this is the most used mathematical model in all of biology. I don't know if that is true but it does have great clinical importance.

The flip side to glucose is the control of free fatty acids (FFAs) in the blood and this aspect has not been as well quantified. Several groups have been trying to develop an analogous minimal model for insulin's action on FFA levels. However, none of these models have been validated or even tested against each other on a single data set. In this paper, we used a data set of 102 subjects and tested 23 different models that included previously proposed models and several new ones. The models have the form of an augmented minimal model with compartments for insulin, glucose and FFA. Using Bayesian model comparison methods and a Markov chain Monte Carlo algorithm, we calculated the Bayes factors for all the models. We found that a class of models distinguished themselves from the rest with one model performing the best. I've been using Bayesian methods quite a bit lately and I'll blog about it sometime in the future. If you're interested in the details of the model, I encourage you to read the paper.

Saturday, August 09, 2008

SIAM Lifesciences '08

I've just returned from the Society of Industrial and Applied Mathematics (SIAM) Lifesciences meeting in Montreal. I haven't traveled to a meeting since my baby was born so it was nice to catch up with old friends and the field. I thought that all of the plenary talks were excellent and I commend the organizing committee for doing a great job. Particularly interesting was a public lecture given by Stuart Kauffman on his new book "Reinventing the Sacred". That talk was full of many ideas that I've been directly interested in and I'll blog about them soon.

One of the things I took away from this meeting was that the field seems more diverse than when I organized it in 2004. In particular, I thought that the first two renditions of this meeting (2002, 2004) were more like an offshoot of the very well attended SIAM dynamical systems meeting held in Snowbird, Utah on odd numbered years. Now, I think that the participation base is more diverse and in particular there is much more overlap with the systems biology community. One of the unique things about this meeting is that people interested in systems neuroscience and systems biology both attend. These two communities generally don't mix even though some of the problems and methods have similarities and would benefit from interacting. Erik De Shutter wrote a nice article recently in PLoS Computational Biology exploring this topic. I thus particularly enjoyed the fact that there were sessions that included talks on both neural and genetic/biochemical networks. In addition, there were sessions on cardiac dynamics, metabolism, tissue growth, imaging, fluid dynamics, epidemiology and many other areas. Hence, I think that this meeting does play a useful and unique role bringing together mathematicians and modelers from all fields.

I gave a talk on my work on the dynamics of human body weight change. In addition to summarizing my PLoS Computational Biology paper, I also showed that because humans have such a long time constant to achieve energy balance when on a fixed rate of energy intake (i.e. a year or more), we can tolerate a wide amount of fluctuations in our energy intake rate and still have a small variance in our body weight. This answers the "paradox" that nutritionists seem to believe, namely that if a change of as small as 20 kcals/day (a cookie is ~150 kcal) can lead to a weight change of a kilogram then how do we maintain our body weights if we consume over a million kcals a year. Part of their confusion stems from conflating average with standard deviation. Given that we only eat finite amounts of food per day then no matter what you eat in a year you will have some average body weight. The question is why the standard deviation is so small; we generally don't fluctuate by more than a few kilos per year. The answer is simply that with a long time constant, we average over fluctuations. My back of the envelope calculation shows that the coefficient of variation (standard deviation divided by mean) of body weight suppresses by a factor of 15 or more the coefficient of variation in the food intake. This also points to correlations in food intake rate leading to weight gain, as was addressed in my paper with Vipul Periwal (Periwal and Chow, AJP-EM, 291:929 (2006)).

Karen Ong, from my group, also went and presented our work on steroid mediated gene expression. She won the poster competition for undergraduate students. I'll blog about this work in the future. While I was sitting in the sessions on computational neuroscience and gene regulation, I regretted not having more people in my lab attend and present our current ideas on these and other topics.

Saturday, August 02, 2008

Penrose redux

In 2006, I posted my thoughts on Roger Penrose's argument that human thought must be noncomputable. Penrose's argument follows from the fact that Godel's incompleteness theorem states that there exist true statements in a consistent formal system (e.g. arithmetic with integers) that cannot be proved within that system. The proof basically boils down to showing statements like "this statement cannot be proved" are true but cannot be proved because if they could be proved then there would be an inconsistency with the system. Turing later showed that this was equivalent to saying that there are problems, known as undecidable or uncomputable problems, that a computer could not solve. From these theorems, Penrose draws the conclusion that since we can recognize unprovable statements are true then we must not be a computer.

My original argument refuting Penrose's claim was that we didn't really know what formal system we were using or whether or not it remained fixed so we couldn't know if we were recognizing true statements that we can't prove. However, I now have a simpler argument, which is simply that no human has ever solved an uncomputable problem and hence has not shown they are more than a computer. The fact that they know about uncomputability is not an example. A machine could also have the same knowledge since Godel's and Turing's proofs (as are all proofs) are computable. Another way of staying this is that any proof or thing that can be written down in a finite number of symbols could also be done by a computer.

An example is the fact that you can infer the existence of real numbers using only integers. Thus, even though real numbers are uncountable and thus uncomputable, we can prove lots of properties about then just using integers. The Dedekind cut can be used to prove the completeness of real numbers without resorting to the axiom of choice. Humans and computers can reason about real numbers and physical theories based on real numbers without actually ever having to deal directly with real numbers. To paraphrase, reasoning about uncomputable problems is not the same as solving uncomputable problems. So until a human being can reliably tell me whether or not any Diophantine equation (polynomial equation with integer coefficients) has a solution in integers (i.e. Hilbert's tenth problem) or always know if any program will ever halt (i.e. the halting problem), I'll continue to believe that a computer can do whatever we can do.

Sunday, July 27, 2008

Limits to thought and physics

In a recent post, I commented on Reed Montague's proposal that it may be necessary to account for the limits of psychology when developing theories of physics. I disagreed with his thesis because if we accept that physics is computable and the brain is described by physics then any property of physics should be discernible by the brain. However, I should have been more careful in my statements. Given the theorems of Godel and Turing, we must also accept that there may be certain problems or questions that are not decidable or computable. The most famous example is that there is no algorithm to decide if an arbitrary computation will ever stop. In computer science this is known as the halting problem. (Godel's incompleteness theorems are direct consequences of the halting problem, although historically they came first). The implication is quite broad for it also implies that there is no sure fire way of knowing if a given computation will do a particular thing (i.e. print out a particular symbol). This is also why there is no certain way of ever knowing if a person is insane or if a criminal will commit another crime, as I claimed in my post on crime and neuroscience.

Hence, it may be possible that some theories in physics, like the ultimate super-duper theory of everything, may in fact be undecidable. However, this is not just a problem of psychology but also a problem of physics. Some, like British mathematical physicist Roger Penrose, would argue that the brain and physics are actually not computable. Penrose's arguments are outlined in two books - The Emperor's New Mind and Shadows of the Mind. It could be that the brain is not computable but I (and many others for various reasons) don't buy Penrose's argument for why it is not. I posted on this topic previously although I've refined my ideas considerably since that post.

However, even if the brain and physics were not computable there would still be a problem because we can only record and communicate ideas with a finite number of symbols and this is limited by the theorems of Turing and Godel. It could be possible that a single person could solve a problem or understand something that is undecidable but she would not be able to tell anyone else what it is or write about it. The best she could do is to teach someone else how to get into such a mental state to "see" it for themselves. One could argue that this is what religion and spiritual traditions are for. Buddhism in a crude sense is a recipe for attaining nirvana (although one of the precepts is that trying to attain nirvana is a surefire way of not attaining it!). So, it could be possible that throughout history there have been people that have attained a level of, dare I say, enlightenment but there is no way for them to tell us what that means exactly.

Friday, July 18, 2008

Neural Code

An open question in neuroscience is: what is the neural code? By that it is meant, how is information represented and processed in the brain. I would say that the majority of neuroscientists, especially experimentalists, don't worry too much about this problem and implicitly assume what is called a rate code, which I will describe below. There is then a small but active group of experimentalists and theorists who are keenly interested in this question and there is a yearly conference, usually at a ski resort, devoted to the topic. I would venture to say that within this group - who use tools from statistics, Bayesian analysis, machine learning and information theory to analyze data obtained from in vivo multi-electrode recordings of neural activity in awake or sedated animals given various stimuli - there is a larger amount of skepticism towards a basic rate code than the general neural community.

For the beneficiary of the uninitiated, I will first give a very brief and elementary review of neural signaling. The brain consists of 10^11 or so neurons, which are intricately connected to one another. Each neuron has a body, called the soma, an output cable, called the axon, and input cables, called the dendrites. Axons "connect" to dendrites through synapses. Neurons signal each other with a hybrid electro-chemical scheme. The electrical part involves voltage pulses called action potentials or spikes. The spikes propagate down axons through the movement of ions across the cell membrane. When the spikes reach a synapse, they trigger a release of neurotransmitters, which diffuse across the synaptic cleft, bind to receptors on the receiving end of the synapses and induce either a depolarizing voltage pulse (excitatory signal) or a hyperpolarizing voltage pulse (inhibitory signal). In that way, spikes from a given neuron can either increase or decrease the probability of spikes in a connected neuron.

The neuroscience community is basically all in agreement that neural information is carried by the spikes. So the question of the neural code becomes: how is information coded into spikes? For example, if you look at an apple, something in the spiking pattern of the neurons in the brain is representing the apple. Does this change involve just a single neuron? This is called the grandmother cell code, from the joke that there is a single neuron in the brain that represents your grandmother. Or does it involve a population of neurons, known not surprisingly, as a population code. How did the spiking pattern change? Neurons have some background spiking rate, so do they simply spike faster when they are coding for something, or does the precise spiking pattern matter. If it is just a matter of spiking faster then this is called a rate code, since it is just the spiking rate of the neuron that contains information. If the pattern of the spikes matter then it is called a timing code.

The majority of neuroscientists, especially experimentalists, implicitly assume that the brain uses a population rate code. The main reason they believe this is because in most systems neuroscience experiments, an animal will be given a stimulus, and then neurons in some brain region are recorded to see if any respond to that particular stimulus. To measure the response they often count the number of spikes in some time window, say 500 ms, and see if it exceeds some background level. What seems to be true from almost all of these experiments is that no matter how complicated a stimulus you want to try, a group of neurons can usually be found that respond to that stimulus. So, the code must involve some population of neurons and the spiking rate must increase. What is not known is which and how many neurons are involved and whether or not the timing of the spikes matter.

My sense is that the neural code is a population rate code but the population and time window change and adapt depending on context. Thus understanding the neural code is no simpler than understanding how the brain computes. In molecular biology, deciphering the genetic code ultimately led to understanding the mechanisms behind gene transcription but I think in neuroscience it may be the other way around.

Tuesday, July 08, 2008

Crime and neuroscience

The worry in the criminal justice system is that people will start trying to use neuroscience in their defense. For example, they will start to claim that their brain was faulty and it committed the crime. I think some have already tried this. I think the only way out of this predicament is to completely reframe how we administer justice. Currently, the intent of the perpetrator to commit the crime (negligence included as a crime) is required to establish criminal activity. This is why insanity is a viable defense. I think this notion will gradually become obsolete as the general public comes to accept the mechanistic explanation of mind. When the discontinuity between man and machine is finally accepted by most people, the question of intent is going to be problematic. That is why we need to start rethinking this whole enterprise now.

My solution is that we should no longer worry about intent or even treat justice as a form of punishment. What we should do in a criminal trial is to determine if the defendant a) actually participated in the crime and b) if they are dangerous to society. By this reasoning, putting a person in jail is only necessary if they are dangerous to society. For example, violent criminals would be locked up and they would only be released if it could be demonstrated that they were no longer dangerous. This schema would also eliminate the concept of punishment. The duration of a jail sentence should not be about punishment but only about benefit to society. I don't believe that people should be absolved of their crimes. For nondangerous criminals, some form of retribution in terms of garnished wages or public service could be imposed. Also, if some form of punishment can be shown to be a deterrent then that would be allowed as well. I'm sure there are many kinks to be worked out but what I am proposing is that a fully functional legal system could be established without requiring a moral system to support it.

This type of legal system will be necessary when machines become sentient. Due to the theorems of Godel and Turing, proving that a machine will not be defective in some way will be impossible. Thus, some machines may commit crimes. Given that they are sentient also means that we cannot simply go around and disconnect machines at will. Each machine deserves a fair hearing to establish guilt and sentencing. Given that there will not be any algorithmic way to establish with certainty whether or not the machine will repeat the crime, justice for machines will have to be administered in the same imperfect way it is administered for humans.

Friday, July 04, 2008

Oil Consumption

I thought it would be interesting to convert the amount of oil consumed by the world each year into cubic kilometres to give a sense of scale of how much oil we use and how much could possibly be left. The world consumes about 80 million barrels of oil a day. Each barrel is 159 litres and a cubic kilometre corresponds to 10^12 litres, so the amount of oil the world consumes in a year is equivalent to 4.6 cubic kilometres. This would correspond to a cubic pit with sides that are about 1.45 km long. The US consumes a little over a quarter of the world's oil, which is about 1.2 cubic kilometres. The Arctic National Wildlife Refuge is estimated to have about 10 billion barrels of recoverable oil, about a half of year's supply.

Proven world reserves amount to about 1.3 trillion barrels of oil or about 200 cubic kilometres. If we continue at current rates of consumption, then we have about 40 years of oil left. If the rest of the world decided to use oil like American's then the world's yearly consumption could increase by a factor of 5, which would bring us down to 8 years worth of reserve. However, given that the surface of the earth is about 500 million square kilometres, it seems plausible that there is a lot more oil out there that hasn't been found, especially under the deep ocean. The main constraint is cost and greenhouse gas emissions. We may not run out of oil anytime soon but we may have run out of cheap oil already.

Tuesday, June 17, 2008

Physics and psychology

Montague in the epiologue of his book, which I blogged about recently, argued that a marriage of psychology and physics is in order. His thesis is that our intuitions about the world are based on flawed systems that were only designed to survive and reproduce. There are several responses to his argument. The first is that while we do rely on intuition to do math and physics, the intuition is based on learned concepts more than our "primitive" intuitions. For example, logical inference itself is completely nonintuitive. Computer scientist Scott Aaronson gives a nice example. Consider cards with a letter on one side and a number on the other. You are given 2, K, 5, J and the statement every J has a 5 on the back. Which cards do you need to turn over to see if the statement is true? (Hint: If A implies B then the only conclusion you can draw is not B implies not A). Most college freshmen get this wrong. Quantum mechanics and thermodynamics are notoriously counter intuitive and difficult to understand. We were led to these theories only after doing careful experiments that could marginalize away our prior beliefs.

However, that is not to say that perhaps we're at a stumbling block over questions like what is dark matter or why do we remember the past and not the future because of some psychological impediment. A resolution to this issue could reside again on whether or not physics is computable. Montague doesn't think so but his examples do not constitute a proof. Now if physics is computable and the brain is governed by the natural laws of physics, then the brain is also computable. In fact, this is the simplest argument to refute all those that doubt that machines can ever think. If they believe that the brain is in the natural world and physics can be simulated then we can always simulate the brain and hence the brain is computable. Now if the brain is computable, then any phenomenon in physics can be understood by the brain or at least computed by the brain. In other words, if physics is computable then given any universal Turing machine, we can compute any physical phenomenon (given enough time and resources).

There is one catch to my argument and that is the fact that if we believe the brain is computable then we must also accept that it is finite and thus less powerful than a Turing machine. In that case, there could be computations in physics that we can't understand with our finite brains. However, we could augment our brains with extra memory (singularity anyone) to complete a computation if we ever hit our limit. The real question is again tractability. It could be possible that some questions about physics are intractable from a purely computational point of view. The only way to "understand" these things is to use some sort of meta-reasoning or some probabilistic algorithm. It may then be true that the design of our brains through evolution may impede our ability to understand concepts that are outside of the range it was designed for.

Personally, I feel that the brain is basically a universal Turing machine with a finite tape so that it can do all computations up to some scale. We can thus only understand things with a finite amount of complexity. The way we approach difficult problems is to invent a new language to represent complex objects and then manipulate the new primitives. Thus our day to day thinking uses about the same amount of processing but accumulated over time we can understand arbitrarily difficult concepts.

Tuesday, June 10, 2008

Why so slow?

John Tierney of the New York times shows a figure from Ray Kurzweil of a log-log plot of the time between changes in history, such as the appearance of life multicellular organisms to new technologies like televisions and computers. His graph shows power law scaling with an exponent of negative one, which I obtained by eyeballing the curve. In other words, if dT is the time between the appearance of the next great change then it scales as 1/T where T is the time. I haven't read Kurzweil's book so maybe I'm misinterpreting the graph. The fact that there is scaling over such a long time is interesting but I want to discuss a different point. Let's take the latter part of the curve regarding technological innovation. Kurzweil's argument is that the pace of change is accelerating so we'll soon be enraptured in the Singularity (see previous post). However, the rate of appearance of new ideas seems to be only increasing linearly with T. So the number of new ideas are accumulating as T^2, which is far from exponential. Additionally, the population is increasing exponentially (at least in the last few hundred years). Hence the number of ideas per person is obeying t^2 Exp(-t). I'm not sure where we are on the curve but after an initial increase, the number of ideas per person actually decreases exponentially. I was proposing in the last post that the number of good ideas was scaling with the population but according to Kurzweil I was being super optimistic. Did I make a mistake somewhere?

Friday, June 06, 2008

The singularity

Steve Hsu points to an IEEE Spectrum special report on The Singularity, which is variously described as the point in time when machine intelligence surpasses human intelligence so that machines can then go on to improve themselves without our help. The super optimists like Ray Kurzweil think this will happen in 15 years and when it happens machine intelligence will increase at doubling times that will be hours, minutes or shorter. The machines will then solve all of the world's problems instantly. He also thinks we can upload our minds into the machine world and be immortal.

Personally, I think that there is no question that humans are computable (in the Turing sense) so there is no reason that a machine that can think like we do will someday exist (we being an existence proof). I have no idea when this will happen. Having studied computational neuroscience for about 15 years now, I can say that we aren't that much closer to understanding how the brain works then we were back then. I have a personal pet theory, which I may expound on sometime in the future, but it's probably no better than anyone else's. I'm fairly certain machine intelligence won't happen in 15 years and may not in my lifetime.

The argument that the singularity enthusiasts use is a hyper-generalization of Moore's law of exponential growth in computational power. They apply the law to every thing and then extrapolate. For example, anything that doubles each year (which is a number Kurzweil sometimes uses) will improve by a factor of 2^15=32,000 in 15 years. To Kurzweil, we are just a factor of 2^15 away from singularity.

There are two quick responses to such a suggestion, the first is where did 2^15 come from and the second is nonlinear saturation. I'll deal with the second issue first. In almost every system I've ever dealt with there is usually some form of nonlinear saturation. For example, some bacteria can double every 20 minutes. If it weren't for the fact that they run out of food eventually and stop growing (i.e. nonlinear saturation) a single colony would cover the earth in less than a week. Right now components on integrated circuits are less than 100 nm in size. Thus in less than 10 doublings they will be smaller than atoms. Hence, Moore's law as we know it can only go on for another 20 years at most. To continue the pace beyond that, we will require a technological paradigm shift and there is no successor on the horizon. The singularists believe in the Lone Ranger hypothesis so something will come to the rescue. However, even if computers do get faster and faster, software is not improving at anything near the same pace. Arguably, Word is worst now then it was 10 years ago. My computer still takes a minute to turn on. The problem is that good ideas don't seem to be increasing exponentially. At best they only seem to scale linearly with the population.

That leads us to the first point. How far away are we from building a thinking machine? The answer is that we haven't a clue. We may just need a single idea or it might be several. Over the past 50 years or so we've really only had a handful of truly revolutionary ideas about neural functioning. We understand a lot about the machinery that the brain runs on but very little about how it all works together to create human intelligence. We are making progress but it's slow. However, nonlinearity could help us here because we may be near a bifurcation to take us to a new level of understanding. However, this is not predictable by exponential growth.

The other thing about the singularity is that the enthusiasts seem to think that intelligence is unlimited, so that thinking machines can instantly solve all of our problems. Well if physics is computable (see previous post), then no amount of intelligence can solve the Halting problem or Hilbert's tenth. If we believe that P is not equal to NP, then no hyper intelligence can solve intractable problems, unless the claim extends to the ability to compute infinitely fast. I would venture that no amount of intelligence will ever settle the argument of who was the greatest athlete of all time. Many of our problems are due to differences in prior beliefs and that can't be solved by more intelligence. We have enough wealth to feed everyone on the planet yet people still starve. Unless the singularity implies that machines control all aspects of our lives, there will be human problems that will not be solved by extra intelligence.

The example enthusiasts sometimes give of a previous singularity is the emergence of humans. However, from the point of view of the Dodo bird, humpback whale, buffalo, polar bear, American elm tree, and basically most other species, life got worse following the rise of the humans. We're basically causing the greatest extinction event since the fall of the dinosaurs. So who's to say that when the machine singularity strikes, we won't be left behind similarly. Machines may decide to transform the earth to suit their needs and basically ignore ours.

Friday, May 30, 2008

Are humans computable?

I picked up Read Monatgue's book on decision making - Why Choose This Book?: How We Make Decisions, this evening at Barne's and Noble and read the epilogue with the provacative title: Are Humans Computable? In this chapter, Montague puts forward the argument that protein-protein interactions are not computable and hence neither is the brain or physics for that matter. He argues that proteins and worldly stuff have physical properties that are not computable but can be strung together algorithmically to achieve an end, such as the brain. As an example, he suggests that writing down the equations that govern a nuclear reactor do not suddenly create energy because the equations lack the "physical properties" necessary for a chain reaction. His final point is that our intuitions are based on a brain that is only designed to survive and reproduce and thus physics is ineluctably intertwined with psychology. He proposes that the next frontier is to incorporate the limitations of human perception into new physical theories.

I have been reading and thinking about the theory of computation lately. I have a host of incomplete and inchoate ideas on the topic that are not ready for prime time but after reading Montague's chapter I thought it would be useful to put some things down before I forget them. Montague has touched on some interesting and deep questions but I believe his particular point of view is flawed. He incorrectly conflates intractability with uncomputability and an algorithm with a computation. This post will only touch briefly on the very many issues related to this topic.

On protein-protein interactions, Montague writes that the totality of possible interactions is unimaginably large and thus could never fit on a realizable computer. He equates this with being uncomputable. Protein-protein interactions may not be computable but not because it can't fit on a computer. A problem is deemed uncomputable or undecidable if a computer (i.e. Turing machine) cannot solve (decide) it. This means that the problem cannot be solved by any algorithm. Perhaps Montague knows this but his editor forced him to tone down the technical details. The most famous undecidable problem is the halting problem, which says that there is no algorithm that can tell if a computation will ever stop. Hilbert's tenth problem on the solvability of diophantine equations with integers is also undecidable. It is not known if protein-protein interactions and by implication all of physics is decidable but almost everyone in physics and applied math believes it is so, whether they know this or not. One notable exception is Roger Penrose who believe that quantum gravity is uncomputable, but that is another story for another post.

The question is not as trivial as it sounds. Kurt Godel, Alonzo Church, Alan Turing, and many other twentieth century mathematicians established the criteria for computability and I hope to get to their ideas in more depth in future posts. However, for now in a nutshell, the question of computability comes down to the cardinality of the set you are trying to compute. If the set of possible outcomes of a problem is countable, then the problem is computable. If it is not countable, then it is not. Now to physics: If we believed that space-time were a true continuum (i.e. described by real numbers) then the set of all possible configurations of two proteins would be uncountable and Montague would be correct that the dynamics are uncomputable.

However, there are two very plausible responses to this problem. The first is that although real numbers are uncomputable, they can be approximated arbitrarily closely by countable rational numbers. This is the foundation of numerical analysis, which shows that any continuous dynamics can be approximated arbitrarily well with a discretized system. That's how we do numerical simulations for weather prediction, airflow over a an airplane wing, and even protein-protein interactions. The reason that numerical simulations work is that there is an underlying smoothness to the dynamics so we can approximate it by a finite set of points. In essence, we can predict what will happen next for short enough times and distances. Now, this need not be true. It could be that space-time is chaotic at small scales so that no discretization can approximate it. This is proposed in theories of quantum gravity. However, even if that were the case there is probably some averaging over larger scales that effectively smooths the underlying turbulence, (think the uncertainty principle), to make physics effectively computable and that is what we deal with on a day to day basis.

The second way to argue for the computability of the universe is that the entropy of the observable universe is finite. Entropy is basically the logarithm of the number of microstates. Thus if entropy is finite then the number of possible configurations of the universe is countable (actually finite) so again physics is computable. Why is the entropy of the universe finite? Again, think quantum mechanics. If space-time is quantized then there will be a smallest scale, namely the Planck length which is about 10^-35 metres.

However, Montague does have a point that a simulation of the interactions of two or more proteins could be intractable in that it would take an immense amount of memory or time to do a computation. The field of algorithmic complexity examines questions of intractability of which the most famous problem is whether or not P = NP. I don't have time to go into that one but I hope to post more on that later as well. So, while we may never be able to simulate the brain, that doesn't mean the brain is not computable, it's just intractable. However, intractability doesn't imply that we couldn't build an artificial brain.

I think I'll save my comments on Montague's other points in a future post.