Sunday, November 30, 2008
Wednesday, November 26, 2008
I like to visualize this as a problem of flux balance. For example, let x_ij be a CDS payout from party j to i. Then the total net gain or loss (i.e. flux) for party i is given by F_i = sum_j (x_ij - x_ji).
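The bookkeeping is simple enough to sketch in a few lines of code. Here is a toy Python illustration (the payout matrix is made up for the example):

```python
# Hypothetical payout matrix: x[i][j] is the CDS payout from party j to party i.
x = [[0.0, 2.0, 0.5],
     [1.0, 0.0, 0.0],
     [0.0, 3.0, 0.0]]

n = len(x)
# Net flux for party i: total received (row i) minus total paid out (column i).
flux = [sum(x[i]) - sum(row[i] for row in x) for i in range(n)]
print(flux)  # one entry per party; the entries always sum to zero
```

Since every payout is received by one party and paid by another, the fluxes sum to zero over the whole system, which is what makes it a balance problem.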
Saturday, November 22, 2008
However, Tom was so frustrated with the whole episode that he decided he needed to prove the Kepler conjecture formally, so that the asterisk could be removed. He began what he calls the Flyspeck project, a stylized acronym for Formal Proof of Kepler. He has already recruited a number of mathematicians around the world to work on Flyspeck, and nearly half of the computer code used to prove the Kepler conjecture is now certified. He estimates that the project may take as long as twenty work-years, so if he gets enough people interested it could be finished fairly quickly in calendar time.
A formal proof essentially involves translating mathematics into symbol manipulation that is encapsulated within a foundational system. Tom uses a system called HOL Light (an acronym for lightweight implementation of Higher Order Logic), which I'll summarize here. The details are fairly technical; they involve an axiomatic or logical system built around types (similar to those used in computer languages like C). HOL Light differs slightly from the Zermelo-Fraenkel axioms with Choice (ZFC) used in traditional mathematics. The use of types means that certain incongruous operations that any mathematician would deem nonsensical (like taking the union of a real number and a function) are automatically disallowed. The system manipulates mathematical statements or objects called terms, which are built from symbols using logical operations or inference rules. A theorem is expressed as a set of hypothesis terms, called the sequent, that imply the truth (or more accurately the provability) of another term called the conclusion. A proof is a demonstration that, using the allowed inference rules and axioms, it is possible to arrive at the conclusion.
What I like about formal proofs is that they reduce mathematics to dynamical systems. Each term is a point in the space of all terms (whatever that means). A proof is then a trajectory between the initial condition (the sequent) and the conclusion. Technically, a formal proof is not a true dynamical system because the next step is not uniquely specified by the current state. The fact that there are multiple choices at each step is why theorem proving is hard. Interestingly, this is connected to the famous computer science problem of whether or not P=NP. Theorem proving is in the complexity class NP because any proof can be verified in polynomial time. The question is whether or not a proof can be found in polynomial time. If it can be shown that this is possible then you would have a proof of P=NP and get a million dollars from the Clay foundation. It would also mean that you could prove the Riemann Hypothesis and all the other Clay Millennium problems and collect for those as well. In fact, if P=NP you could prove all theorems with reasonably short proofs in polynomial time. This is one of the reasons why most people (including myself) don't think that P=NP.
I think it would be very interesting to analyze the dynamics of formal proof. So many questions immediately come to my mind. For example, what are the properties of theorem space? We know that the set of all theorems is countable but the set of possible terms is uncountable. A formal system consists of the space of all points reachable from the axioms. What does this look like? Can we define a topology on the space of all terms? I suppose a metric could be the fewest number of steps to get from one term to another term, which might be undecidable. Do people even think about these questions?
Sunday, November 16, 2008
The current crisis first became public knowledge when Bear Stearns went under in March of this year. Federal Reserve Chairman Ben Bernanke quickly engineered a buyout of Bear by JPMorgan and the market calmed for a while. Then in quick succession starting in September came the bailout of Fannie Mae and Freddie Mac, the sale of Merrill Lynch to Bank of America, the bankruptcy of Lehman Brothers, and the bailout of A.I.G. Shortly afterwards, Treasury Secretary Hank Paulson went to Congress to announce that the entire financial system was in jeopardy and requested 700 billion dollars for a bailout. The thinking was that banks and financial institutions had stopped lending to each other because they weren't sure which banks were sound and which were on the verge of collapse. The money was originally intended to purchase suspect financial instruments in an attempt to restore confidence. The plan has changed since then and you can read Steve Hsu's blog for the details.
What I want to point out here is that a narrative of what happened is not the same as understanding the system. There were certainly a lot of key events and circumstances starting in the 1980's that may have contributed to this collapse. There was the gradual deregulation of the financial industry including the Gramm-Leach-Bliley Act in 1999 that allowed investment banks and commercial banks to coalesce and the Commodity-Futures-Modernization Act in 2000 that ensured that financial derivatives remained unregulated. There was the rise of hedge funds and the use of massive amounts of leveraging by financial institutions. There were low interest rates following the internet bubble that fueled the housing bubble. There was the immense trade deficit with China (and China's interest in keeping the US dollar high) that allowed low interest rates to persist. There was the general world savings glut that allowed so much capital to flow to the US. There was the flood of physicists and mathematicians to Wall Street and so on.
Anyone can create a nice story about what happened, and depending on their prior beliefs the stories can be dramatically different; compare George Soros to Phil Gramm. We really would like to understand in general how the economy and financial markets operate but we only have one data point. We can never rerun history and obtain a distribution of outcomes. Thus, although we may be able to construct a plausible and consistent story for why an event happened, we can never know if it is correct, and even more importantly we don't know if it can tell us how to prevent the event from happening again. It could be that no matter what we had done, a crisis would still have ensued. My father warned of a collapse of the capitalist system his entire life. I'm not sure how he would have felt had he lived to see the current crisis but he probably would have said it was inevitable. Or perhaps, if interest rates had been a few points higher, nothing would have happened. The truth is probably somewhere in between.
Another way of saying this is that we have a very large complex dynamical system and we have one trajectory. What we want to know about are the attractors, the basins of attraction, and the structural stability of the system. These are things that are difficult to determine even when we have full knowledge of the underlying dynamical system. Here we are trying to construct the dynamical system and infer all of these properties from the observation of a single trajectory. I'm not sure if this task is impossible (i.e. undecidable) but it is certainly intractable. I don't know how we should proceed but I do know that conventional economic dogma about efficient markets needs to be updated. Theorems are only as good as their axioms and we definitely don't know what the axioms are. I think the sooner economists own up to the fact that they really don't know and can't know what is going on, the better off we will be.
Saturday, November 08, 2008
Sunday, November 02, 2008
Let me first summarize some positions currently held by the US right: 1) low taxes, 2) small government, 3) deregulation of industries, 4) free trade, 5) gun rights, 6) strong military, 7) anti-abortion, 8) anti-gay rights, and 9) anti-immigration. I would say positions 1) through 5) seem consistent with the historical notion of the right (although regulation can be consistent with the right if it makes markets more transparent), position 6) is debatable, while positions 7) through 9) seem dissonant. The left generally but not always takes the opposite positions, except possibly on point 6), which is mixed. The strong military position was understood as a right wing position during the Cold War because of the opposition to communism. The rationale for a strong military waned after the fall of the Soviet Union but 9/11 changed the game again and now the military is justified as a bulwark against terrorism.
The question is how to reconcile positions 7) through 9), and we could add pro-death penalty and anti-evolution into the mix as well. These positions are aligned on the right because of several historical events. The first was that many of the early settlers to the United States came to escape religious persecution at home, and this is why there is a significant US Christian fundamentalist population. The second is slavery and the Civil Rights movement. The third is that middle class whites fled the cities for the suburbs in the 50's and 60's. These people were probably religious but a genetic mix between left and right.
In the early 20th century, the Republicans were an economic right wing party while the Democrats under FDR veered to the left although they were mostly Keynesian and not socialist. The South had been Democratic because Lincoln was a Republican. The Civil Rights movement in the 60's angered and scared many suburban and southern whites and this was exploited by Nixon's "Southern Strategy", which flipped the South to the Republicans. This was also a time of economic prosperity for the middle class so they were more influenced by issues regarding crime, safety, religion, and keeping their communities "intact". Hence, as long as economic growth continued, there could be a coalition between the economic right and the religious right since the beliefs of both sides didn't really infringe on each other. Hence, culturally liberal New York bankers could coexist with culturally conservative southern factory workers.
Let me now go through each point and see if we can parse them rationally. I will not try to ascribe any moral or normative value to the positions, only consider whether or not each would be consistent with a right or left worldview. I think low taxes, small government, deregulation and free trade certainly belong on the right without much argument. Gun rights seem consistent with the right since they reflect an anti-regulatory sentiment. Strong military is not so clear cut to me. It certainly helps to ensure that foreign markets remain open, which would help the right. However, it could also enforce rules on other people, which is more left. It is also a big government program, which is not so right. So my sense is that a strong military is neither right nor left. Abortion is quite difficult. From the point of view of the woman, I think being pro-choice is consistent with being on the right. Even if you believe that life begins at conception (and I've argued before that defining when life begins is problematic), the fetus is also a part of the woman's body. From the point of view of the fetus, I think it's actually a left wing position to be pro-life. Gay rights seem to be clearly a right wing position in that there should not be any regulation of personal choices between consenting adults. However, if you view gay behavior as being very detrimental to society then as a left winger you could possibly justify disallowing it. So interestingly, I think being against gay rights is only viable from a left wing point of view. Anti-immigration is probably more consistent with the left since immigrants could be a competitive threat to one's job; a right winger should encourage immigration and more competition. Interestingly, anti-evolution used to be a left wing position. William Jennings Bryan, who argued against evolution in the Scopes Monkey trial, was a populist Democrat. He was worried that evolution theory would be used to justify why some people had more than others. Survival of the fittest is a very right wing concept.
I doubt that political parties will ever be completely self-consistent in their positions given accidents of history. However, the current economic crisis is forcing people to make economic considerations more of a priority. I think what will happen is that there will be a growth in socially conservative economic populism, which as I argued is probably more self-consistent. Republican presidential candidate Mike Huckabee is an example of someone in that category. The backlash against Johnson's Great Society and anti-poverty measures was largely racially motivated. However, as the generation that lived through the Cold War and the Civil Rights era shrinks in influence, I think a slow rationalizing realignment in the issues among the political parties may take place.
I think it is appropriate to add on Election Day that I don't think either of the two major American political parties fall into the right or left camp as I've defined it. There are elements of both right and left (as well as a royalist bent) in both party's platforms.
Friday, October 24, 2008
What then is the theory of everything? Well, one answer would be the set of physical laws that the programmer put in. Now suppose that the programmer didn't come up with any laws but just started off a cellular automaton (CA) with some rules and an initial condition. An example of a CA, which Stephen Wolfram's book "A New Kind of Science" describes in great detail, is a one dimensional grid of "cells" that can each be in one of two states. At each time step, every cell is updated according to the states of itself and its two nearest neighbors. There are thus 2^8=256 possible rule sets, since each of the 8 configurations that 3 contiguous cells can have yields two possible updated states for the middle cell. Wolfram has enumerated all of them in his book.
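These elementary CAs are easy to play with. Here is a minimal Python sketch of the update rule, using Wolfram's numbering convention (bit k of the rule number gives the new state for neighborhood pattern k), run on a small periodic grid:

```python
def step(cells, rule):
    """Advance a one-dimensional binary CA one time step.
    rule is 0-255; bit k of rule gives the new state for the
    neighborhood whose bits (left, center, right) spell out k."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Rule 110 started from a single live cell.
cells = [0] * 30
cells[15] = 1
for _ in range(10):
    print("".join(".#"[c] for c in cells))
    cells = step(cells, 110)
```

Changing the rule number from 110 to any of the other 255 values gives the rest of Wolfram's catalogue.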
One of Wolfram's former employees, Matthew Cook, proved that rule 110 is a universal computer. Hence, all possible computations (simulations) can be obtained by running through all possible initial conditions. One of these initial conditions corresponds to the program with the same physical laws that the inspired programmer came up with. However, in this case the physical laws will be an emergent phenomenon of the CA. What then is the theory of everything? Is it the set of rules of the cellular automaton? Is it the combination of the rules and the initial condition? Is it still the set of emergent physical laws? In all likelihood, the elementary constituents of the emergent theory will be composed of some number of cells. Below this scale, the emergent theory will no longer hold and at the very lowest level, there will be rule 110.
Finally, as has been pointed out previously, the inhabitants of a simulation can never know that they are in a simulation. Thus, there is really no way for us to know if we are living in a simulation. So what does that say about our theory of everything? Will it be an uber string theory or the CA rule and initial condition? Would we want to have a theory of everything that included a theory of the programmer and the programmer's world? This is why I've come to adopt the notion that a theory of everything is a theory of computation and that doesn't really tell us much about our universe.
Sunday, October 19, 2008
However, the current economic turmoil is uncovering a more complex (or maybe obvious) interaction at play. The anti-correlation between the performance of the economy and the likelihood of a Democratic US president seems to indicate that there is a threshold effect for wealth. Happiness does not go up appreciably above this threshold but certainly goes down a great deal below it. For people above this threshold, other factors start to play a role in their political decisions and sense of well being. However, when you are below this threshold then the economy is the dominant issue.
This may be why the growth of income disparity did not create that much outcry over the past decade or so. When the majority of the population was above their wealth comfort threshold, they didn't particularly care about the new gilded age since the rich were largely isolated from them. It mostly caused unease among the rich that weren't keeping up with the super-rich. However, when the majority finally fell below their comfort threshold, the backlash came loud and strong. Suddenly, everyone was a populist. However, when (if?) the economy rights itself again then this regulatory fervor will subside in kind. The general public will tune out once again and the forces that pushed for policies favourable to unequal growth will dominate the political discourse.
The system may always be inherently unstable. Suppose that fervor for political activism and where you sit on the left-right divide are uncorrelated but have approximately equal representation. We can then divide people into four types: Active/Left, Active/Right, Nonactive/Left and Nonactive/Right. Also assume that when the economy is doing poorly all the left are motivated, but when the economy is doing well all the right are motivated. A graph in the New York Times yesterday showed that stock market growth has been higher when Democrats are in office, even when you don't count Herbert Hoover, who was in power during the crash of 1929. So let's assume that when the left is in power there is more total economic growth and less income disparity, and when the right is in power there is less total growth but more income disparity. The nonactive fractions of the left and right determine the policy. When things are going well in the economy, the Nonactive/Left relax but the Nonactive/Right become motivated. Thus we have half the population pushing for more right leaning policies countered by only a quarter of the population, namely the Active/Left. This then leads to the right attaining power, resulting in a widening of income disparity. When enough people fall below the wealth threshold, the Nonactive/Left become engaged while the Nonactive/Right disengage, which then allows the left to come back into power. The interesting thing is that the only way to break this cycle is for the right to enact policies that keep everyone above threshold. The other stable fixed point, namely left wing policies failing completely and keeping everyone down, would eventually lead to a breakdown of the system.
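The cycle described above behaves like a relaxation oscillation, which a few lines of code can illustrate. This is a toy caricature with made-up numbers, not a calibrated model: disparity drifts up under the right and down under the left, and power flips when disparity crosses a comfort threshold on either side.

```python
def simulate(steps=40, disparity=0.5, low=0.2, high=0.8):
    """Toy activism cycle: the party in power moves income disparity,
    and crossing a threshold mobilizes the other side's nonactive wing."""
    in_power = "right"
    history = []
    for _ in range(steps):
        disparity += 0.1 if in_power == "right" else -0.1
        if disparity >= high:
            in_power = "left"    # hard times mobilize the Nonactive/Left
        elif disparity <= low:
            in_power = "right"   # good times mobilize the Nonactive/Right
        history.append(in_power)
    return history

print(simulate())  # alternating runs of "right" and "left"
```

Unless one side keeps disparity pinned inside the comfort band (the escape route noted above), the oscillation never dies out.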
Friday, October 17, 2008
Following Fowler, I can imagine how early humans could take two approaches to how to divide up a downed mastodon. The paleo-leftists would argue that the meat should be shared equally among everyone in the tribe. The rightwingers would argue that each tribe member's share should be based solely on how much they contributed to that hunt. My guess is that any ancient group that had approximately equal representation of these two opposing views would outcompete groups that had unanimous agreement of either viewpoint. In the rightwing society, the weaker members of the group simply wouldn't eat as much and hence would have a lesser chance of survival reducing the population and diversity of the group. The result may be a group of excellent hunters but perhaps they won't be so good at adapting to changing circumstances. Now in the proto-socialist group, the incentive to go out and hunt would be reduced since everyone would eat no matter what. This might make hunts less frequent and again weaken the group. The group with political tension may compromise on a solution where everyone gets some share of the spoils but there would be incentives or peer pressure to contribute. This may be why genes for left and right leanings have both persisted.
If this is true, then it would imply that we may always have political disagreement and the pendulum will continuously swing back and forth between left and right. However, this doesn't imply that progress can't take place. No one in a modern society tolerates slavery even though that was the central debate a hundred and fifty years ago. Hence, progress is made by moving the center, and arguments between the left and the right lead to fluctuations around this center. A shrewd politician can take advantage of this fact by focusing on how to frame an issue instead of trying to win an argument. If she can create a situation where two sides argue about a matter tangential to the pertinent issue, then the goal can still be achieved. For example, suppose a policy maker wanted to do something about global warming. Then the strategy should not be to go out and try to convince people on what to do. Instead, it may be better to find a person on the opposite end of the political spectrum (who also wants to do something about global warming) and then stage debates on their policy differences. One side could argue for strict regulations and the other could argue for tax incentives. They then achieve their aim by getting the country to take sides on how to deal with global warming, instead of arguing about whether or not it exists.
Saturday, October 11, 2008
This is in contrast to Science. We all know the theory of evolution but most of us have never read Darwin's On the Origin of Species. The three volumes of Newton's Principia Mathematica can now be reduced to F = ma and F = G m1 m2/r^2. Obviously, it takes some concerted study to understand these equations but one doesn't need to read Newton to do so. It is interesting that scientists tend to worry a lot about priority of a discovery while they are alive but unless their name is directly associated with a concept, theorem or equation, the provenance of many scientific ideas tend to get lost. Quantum mechanics is often taught before classical mechanics now so most starting students have no idea why the energy function is called a Hamiltonian. The concept of the conservation of energy is so natural to scientists now that most people don't realize how long it took to be established and who were the main players.
If art is not compressible then we can interpret the complexity of the brain in terms of the complexity of art. The complete works of Shakespeare runs a little over 1200 pages. Estimating 5000 characters per page and 8 bits per character leads to a total size of less than 50 million bits, which is not very much compared to the hard drive on your computer. Charles Dickens was much more prolific in terms of words generated. Bleak House alone is over 1000 pages. I haven't counted all the pages of all twenty plus novels but let's put his total output at say a billion bits.
If art is incompressible then there could not be an algorithm smaller than a billion bits that could have generated the work of Dickens. This would put a lower bound on the complexity of the "word generation" capabilities of the brain. Now perhaps if you are uncharitable (as some famous authors have been), you could argue that Dickens had a formula to generate his stories and so the complexity is actually less. One way to do this would be to take a stock set of themes, plots, characters, phrases and so on and then randomly assemble them. Some supermarket romances are supposedly written this way. However, no one would argue that they compare in any way to Dickens, much less Shakespeare. Given that the Kolmogorov complexity is uncomputable, we can never know for sure whether art is compressible. So a challenge to computer scientists is to generate literature with a program shorter than the work itself.
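One crude way to probe compressibility is to run a text through an off-the-shelf compressor. This only gives an upper bound on the Kolmogorov complexity (which, as noted, is uncomputable), but it makes the formula-fiction point concrete: repetitive text compresses dramatically. A toy Python illustration, with a made-up "formulaic" text:

```python
import zlib

# A maximally formulaic "novel": the same sentence repeated 200 times.
# Compressed size is only an upper bound on Kolmogorov complexity.
formulaic = ("The hero met a stranger, lost a fortune, and found love. " * 200).encode()
compressed = zlib.compress(formulaic, 9)
ratio = len(compressed) / len(formulaic)
print(f"{len(formulaic)} bytes -> {len(compressed)} bytes (ratio {ratio:.3f})")
```

Run the same experiment on actual Dickens and the ratio is far less favorable, which is at least consistent with the claim that his output is not formula all the way down.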
Friday, October 03, 2008
He then goes on to describe the work of some pioneers who are trying to model the actual dynamics of markets. A Yale economist with two physicists (Doyne Farmer being one of them) used an agent-based model to simulate a credit market. They found that as the leverage (amount of money borrowed to amplify gains) increases there is a phase transition or bifurcation from a functioning credit market to an unstable situation that results in a financial meltdown.
Well, part of the reason is that economists still try to understand markets by using ideas from traditional economics, especially so-called equilibrium theory. This theory views markets as reflecting a balance of forces, and says that market values change only in response to new information — the sudden revelation of problems about a company, for example, or a real change in the housing supply. Markets are otherwise supposed to have no real internal dynamics of their own. Too bad for the theory, things don’t seem to work that way.
Nearly two decades ago, a classic economic study found that of the 50 largest single-day price movements since World War II, most happened on days when there was no significant news, and that news in general seemed to account for only about a third of the overall variance in stock returns. A recent study by some physicists found much the same thing — financial news lacked any clear link with the larger movements of stock values.
Certainly, markets have internal dynamics. They’re self-propelling systems driven in large part by what investors believe other investors believe; participants trade on rumors and gossip, on fears and expectations, and traders speak for good reason of the market’s optimism or pessimism. It’s these internal dynamics that make it possible for billions to evaporate from portfolios in a few short months just because people suddenly begin remembering that housing values do not always go up.
Really understanding what’s going on means going beyond equilibrium thinking and getting some insight into the underlying ecology of beliefs and expectations, perceptions and misperceptions, that drive market swings.
I found this article interesting on two points. The first is the attempt to contrast two worldviews: the theorem proving mathematician economist versus the computational physicist modeler. The second is the premise that the collective dynamics of a group of individuals can be simpler than the behavior of a single individual. A thousand brains may have a lower Kolmogorov complexity than a single brain. My guess is that biologists (Jim Bower?) may not buy this. Although my worldview is more in line with Buchanan's, in many ways his view is on less stable ground than traditional economics. With an efficient market of rational players, you can at least make some precise statements, whereas with an agent-based model there is little understanding of how the models scale and how sensitively the outcomes depend on the rules. Sometimes it is better to be wrong with full knowledge than to be accidentally right.
I've always been intrigued by agent-based models but have never figured out how to use them effectively. My work has tended to rely on differential equation models (deterministic and stochastic) because I generally know what to expect from them. With an agent-based model, I don't have a feel for how they scale or how sensitive they are to changes in the rules. However, this lack of certainty (which also exists for nonlinear differential equations; just look at Navier-Stokes for example) may be inherent in the systems they describe. It could simply be that some complex problems are so intractable that any models of them will rely on having good prior information (gleaned from any and all sources) or plain blind luck.
Friday, September 26, 2008
The theory of computation is basically about repeated operations on a finite set of symbols according to some rule. The paradigm is the Turing machine, which consists of a finite number of internal states and a tape on which symbols can be written or erased. The Turing machine makes transitions from state to state based on its current state and the symbol it is reading on the tape. One special state tells the Turing machine to halt, which signals the end of a computation. The important thing to take away is the Church-Turing thesis, which states that all forms of computation on finite symbol sets are essentially equivalent. For example, the computer on your desk is equivalent to a Turing machine. (Actually, it is even less powerful because it is finite, but I digress.)
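A Turing machine takes only a few lines of code to simulate, which makes the definition above concrete. Here is a minimal Python sketch; the rule format and the toy machine are my own choices for illustration:

```python
def run_turing(rules, tape, state="start", max_steps=10_000):
    """Minimal Turing machine. rules maps (state, symbol) ->
    (new_state, new_symbol, move), with move in {-1, +1}. Returns the
    final tape if the machine reaches 'halt', else None when the step
    budget runs out (we cannot decide halting in general, so we give up)."""
    tape = dict(enumerate(tape))  # sparse tape; unwritten cells read as 0
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            return [tape[i] for i in sorted(tape)]
        state, tape[pos], move = rules[(state, tape.get(pos, 0))]
        pos += move
    return None

# Toy machine: zero out every 1 it reads, moving right; halt on the first 0.
rules = {
    ("start", 1): ("start", 0, 1),
    ("start", 0): ("halt", 0, 1),
}
print(run_turing(rules, [1, 1, 1]))
```

The `max_steps` cutoff is an honest admission of the Halting Problem: the simulator can report that a machine halted, but it can never certify that one won't.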
One of the things that Turing proved was that there is a universal Turing machine, which can emulate any other Turing machine. The input to a Turing machine is a set of symbols on the tape, i.e. an input string. Some of the symbols code for instructions or rules and the others code for an initial condition. Turing also showed that there are problems that a Turing machine can never solve. The most famous is the Halting Problem: there does not exist a Turing machine that can decide if another Turing machine will halt given some input. Turing actually showed that it is impossible to produce a general algorithm to decide if a given input string to a Turing machine will ever cause it to print a given symbol. In other words, there is no general algorithm to decide if a computation will have an infinite loop or perform some specific task. This doesn't imply that you couldn't prove that a specific program has this property, just that there isn't a way to do it in general. The proof of the Halting Problem is similar to Cantor's diagonal proof that the set of real numbers is uncountable.
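The diagonal argument can even be sketched in code. The sketch below is illustrative, not a proof script: given any candidate halting decider, we build a program that does the opposite of whatever the decider predicts about it, so no candidate can be correct on every input.

```python
def make_diagonal(halts):
    """Given any candidate halting-decider (a function that claims to
    return True iff its argument, called with no arguments, halts),
    construct a program that the candidate must get wrong on itself."""
    def diagonal():
        if halts(diagonal):
            while True:      # decider said "halts", so loop forever
                pass
        # decider said "loops", so halt immediately

    return diagonal

# Try a (necessarily wrong) candidate that always answers "loops".
candidate = lambda prog: False
d = make_diagonal(candidate)
d()  # returns immediately, so the candidate's answer "loops" was wrong
print("candidate said 'loops', but diagonal halted: contradiction")
```

A candidate that answered True would fail the opposite way (diagonal would loop forever), which is exactly the dichotomy that rules out a correct decider.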
One of the consequences of Turing's work is that the total number of possible computations is countable. You simply take all strings of length 1, then 2, and so on, and feed them to a universal Turing machine. Every possible computation or numerical simulation will be included. Thus, any simulation of the universe is coded in one of these strings. Some of these inputs will lead to computations that halt and some will run forever. Some will make the Turing machine print a particular symbol and some will not. However, there is in general no way to decide which input strings are on which of these lists.
The question is then, given an input string, can you determine whether it will produce a universe that has some property, such as supporting life? There are actually two separate issues regarding this question. The first is how you would even define life or recognize it in the computation. I will deal with this in a future post. The second is, given that you have a definition of life, can you know ahead of time whether or not your simulation will produce it? The answer is no, because if it were yes then you could solve the Halting Problem. This is easy to see because any definition of life must involve some pattern of symbols being printed on the tape, and there is no way to decide if an input string will ever produce a given symbol, much less a pattern. This doesn't mean that a simulator couldn't come up with a simulation of our universe, it just means that she could never come up with a general algorithm to guarantee it. So, in the infinite but countable list of possible computations, some produce simulations of universes, perhaps even ours, but we can never know for sure which.
Saturday, September 20, 2008
However, this would not imply that we could describe a brain with this amount of information, since it ignores the modifications due to external inputs. For example, the visual system cannot fully develop if the brain does not receive visual inputs. So we also need to estimate how much input the brain receives during development. The amount of information available in the external world is immense, so it is safe to assume that the amount received is limited by the brain and not the source. However, there is no way to estimate this in a principled way since we don't know how the brain actually works. Depending on what you assume the neural code to be (see previous post), you could end up with a very wide range of answers. Nonetheless, let's suppose that it is a rate code with a window of 10 ms. Since neurons generally fire at rates below 100 Hz, this amounts to the presence or absence of a spike in each 10 ms window, or 100 bits per neuron per second. The brain has about 10^11 neurons, so the maximum amount of information that could be input to the brain is 10^13 bits per second. There are over 30 million seconds in a year, so that is a lot of information and can easily dwarf the genomic contribution.
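The arithmetic is easy to check. A back-of-envelope sketch in Python, using the rough assumptions from the text (1 bit per neuron per 10 ms window, 10^11 neurons, a genomic estimate of about 12 billion bits):

```python
# Back-of-envelope numbers; all of these are rough assumptions, not data.
bits_per_neuron_per_s = 100     # 1 bit per 10 ms window under a rate code
neurons = 1e11                  # order-of-magnitude neuron count
seconds_per_year = 3.15e7       # a bit over 30 million seconds

input_bits_per_s = bits_per_neuron_per_s * neurons      # 1e13 bits/s
input_bits_per_year = input_bits_per_s * seconds_per_year

genome_bits = 12e9              # ~12 billion bits for the genomic contribution

print(f"max input per year: {input_bits_per_year:.2e} bits")
print(f"ratio to genome:    {input_bits_per_year / genome_bits:.2e}")
```

Even a single year of input exceeds the genomic figure by about ten orders of magnitude, which is the sense in which the environmental channel "can easily dwarf" the genome.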
However, this does lead us to a potential means to quantify the genes-versus-environment debate over intelligence and behaviour. If the complexity of the brain is less than 12 billion bits then we are basically genetically determined. If it is greater, then we are mostly shaped by the environment. So what do you think?
Saturday, September 13, 2008
The concept of universality arose from the study of phase transitions and critical phenomena, with inspiration from quantum field theory. In a nutshell, it says that for certain systems in regimes where there is no obvious length scale (usually indicated by power law scaling), such as at the critical point of a second order phase transition, the large scale behavior of the system is independent of the microscopic details and only depends on general properties such as the number of dimensions of the space and symmetries in the system. Hence, systems can be classified into what are called universality classes. Although the theory was developed for critical phenomena in phase transitions, it has since been generalized to apply to a wide range of dynamical situations such as earthquakes, avalanches, flow through porous media, reaction diffusion systems and so forth.
The paradigmatic system for critical phenomena is magnetism. Bulk (ferro)magnetism arises when the atoms (each of which have a small magnetic moment) align and produce macroscopically observable magnetization. However, this only occurs for low temperatures. For high enough temperature, the random motions of the atoms can destroy the alignment and magnetization is lost (material becomes paramagnetic). The change from a state of ferromagnetism to paramagnetism is called a phase transition and occurs at a critical temperature (the Curie temperature).
These systems are understood by considering the energy associated with different states. The probability of occupying a given state is given by the Boltzmann weight exp(-H(m)/kT), where H(m) is the internal energy of the state with magnetization m (also called the Hamiltonian), T is the temperature, and k is the Boltzmann constant. Given the Boltzmann weight, the partition function (the sum of the Boltzmann weight over all states) can be constructed, from which all quantities of interest can be obtained. This particular system was studied over a century ago by notables such as Pierre Curie, who, using known microscopic laws of magnetism and mean field theory, found that below a critical temperature Tc the magnetization m is nonzero and above Tc it is zero.
However, the modern way we think of phase transitions starts with Landau, who first applied it to the onset of superfluidity in helium. Instead of trying to derive the energy from first principles, Landau said let's write out a general form based on the symmetries of an order parameter, which in this example is the magnetization m(x) at spatial location x. Since the energy must be a scalar, it can only depend on terms like |m|^2 or (grad m)^2. The first few terms then give H ~ \int dx [q (grad m)^2 + (T-Tc)/2 m^2 + u m^4 + ...], for parameters q, Tc, and u. The (grad m)^2 term accounts for fluctuations. If fluctuations are ignored, then this is called mean field theory, in which case H ~ (T-Tc)/2 m^2 + u m^4. The partition function can then be estimated by a saddle point approximation, which in the mean field limit amounts to evaluating the critical points of H, which are m = 0 and m^2 = (Tc-T)/(4u). These correspond to the equilibrium states of the system: if T is greater than Tc then the only solution is m = 0, and if T is less than Tc then the magnitude of the magnetization is nonzero.
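To make the mean field picture concrete, here is a small numerical check: minimizing H(m) = (T-Tc)/2 m^2 + u m^4 over a grid of m values recovers m = 0 above Tc and |m| = sqrt((Tc-T)/(4u)) below it. The parameter values are illustrative round numbers, not tied to any material.

```python
import numpy as np

# Mean field Landau free energy H(m) = (T - Tc)/2 * m^2 + u * m^4.
# Tc and u are illustrative, not fitted to any physical system.
Tc, u = 1.0, 0.25

def equilibrium_m(T):
    """Analytic minimizer: m = 0 above Tc, |m| = sqrt((Tc - T)/(4u)) below."""
    return 0.0 if T >= Tc else np.sqrt((Tc - T) / (4 * u))

def brute_force_m(T):
    """Minimize H directly over a fine grid of m values."""
    m = np.linspace(-2, 2, 200001)
    H = (T - Tc) / 2 * m**2 + u * m**4
    return abs(m[np.argmin(H)])

for T in [0.5, 0.9, 1.5]:
    print(T, round(brute_force_m(T), 3), round(equilibrium_m(T), 3))
```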
The partition function cannot be explicitly computed in the presence of fluctuations. This is where Ken Wilson and the renormalization group come in. What Wilson said, following people before him like Murray Gell-Mann, Francis Low and Leo Kadanoff, is: suppose we have scale invariance, which is true near a critical point. Then if we integrate out small length scales (or high spatial frequencies), rescale in x, and renormalize m, we end up with a new partition function with slightly different parameters. These operations form a group action (i.e. a dynamical system) on the parameters of the partition function. Thus, a scale invariant system should sit at a fixed point of the renormalization group action. In other words, if you keep applying the renormalization group, the parameters can flow to a fixed point, and the location of the fixed point only depends on the symmetry of the order parameter and the dimension of the space. Many different systems can flow to the same fixed point. The most important element of the renormalization group for the physics worldview is that terms in the Hamiltonian are renormalized in different ways. Some grow (these are called relevant operators), some stay the same (marginal operators), and some decrease (irrelevant operators). For critical systems, only a small number of terms in the Hamiltonian are relevant, and this is why microscopic details do not matter at large scales.
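As a toy illustration of a renormalization group flow (a standard textbook example, not one discussed above): decimating every other spin in the one-dimensional Ising model gives an exact recursion for the dimensionless coupling K = J/kT.

```python
import math

# Exact decimation RG for the 1D Ising model: tracing out every other spin
# maps the coupling K = J/kT to K' = (1/2) ln cosh(2K). The only fixed points
# are K = 0 (infinite temperature) and K = infinity, and every finite coupling
# flows to K = 0 -- the 1D Ising model has no finite-temperature transition.
def rg_step(K):
    return 0.5 * math.log(math.cosh(2 * K))

K = 2.0
for _ in range(20):
    K = rg_step(K)
print(K)  # flows toward the trivial fixed point K = 0
```

In higher dimensions the analogous flow has a nontrivial fixed point, which is what underlies the universality of critical exponents.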
Now, these ideas were originally developed just for behavior near a critical point, which is pretty specialized. If they were only applicable to an equilibrium phase transition, then physicists really wouldn't have a leg to stand on in terms of ignoring details. However, these ideas were later generalized to dynamical systems with critical behavior. What also motivates this view is that power laws (also called 1/f or fractal scaling) seem to be ubiquitous. They can be found in the size distribution of earthquakes, thermal noise in resistors, the size of river meanders, the coastline of Norway, the size of hubs in the Internet, the connectivity of protein networks, and even neural firing patterns, to name a few. Although there is no agreement as to why these systems exhibit power laws (many theories have been proposed), the spectre of the renormalization group and universality permeates the air and influences the physicist's worldview.
My personal view is that some details matter immensely while others do not. However, there is no a priori systematic way of deducing which is which. There are only rules of thumb and experience that can assist us. Hence, even if you buy into the details-may-not-matter worldview, there is no prescription for how to implement it. What it does do is give me less confidence that there is such a thing as the "correct" theory for a system. I'm more inclined to believe that given the current state of knowledge and a specific set of questions, some theories perform better than others. With more information, we can refine our theories. However, I don't think this process ever converges to "the" theory because specifying what a system is is somewhat arbitrary. Nothing is purely isolated from its surroundings, so drawing a boundary is always going to involve a choice. These could be very logical and well informed choices but choices nonetheless. Also, we can never have full control of all the external inputs that can affect a system. In this way, I have a Bayesian viewpoint in that we only make progress by updating our priors.
Saturday, September 06, 2008
The reason biologists may have more in common with mathematicians than physicists is that, unlike physics, biology has no guiding laws other than evolution, which is not quantitative. They rarely say, "Oh, that can't be true because it would violate conservation of momentum," which is how Pauli predicted the neutrino. Given that there are no sweeping generalizations to make, they are forced to pay attention to all the details. They apply deductive logic to form hypotheses and try to prove their hypotheses true by constructing new experiments. Pure mathematicians are trained to take some axiomatic framework and prove things based on it. Except for a small set of mathematicians and logicians, most mathematicians don't take a stance on the "moral value" of their axioms. They just deduce conclusions within some well defined framework. Hence, in a collaboration with a biologist, a mathematician may take everything the biologist says with equal weight and go on from there. On the other hand, a physicist may bring a lot of preconceived notions to the table. (Applied mathematicians are a heterogeneous group and their worldviews lie on a continuum between physicists and pure mathematicians.) Physicists also don't need to depend as much on deductive logic since they have laws and equations to rely on. This may be what frustrates biologists (and mathematicians) when they talk to physicists. They can't understand how the physicists can be so cavalier with the details and so confident about it.
However, when physicists (and applied mathematicians) are cavalier with details, it is not because of Newton or Maxwell or even Einstein. The reason they feel that they can sometimes dispense with details is because their worldviews are shaped by Poincare, Landau and Ken Wilson. What do I mean by this? I'll cover Poincare (and here I use Poincare to represent several mathematicians near the turn of the penultimate century) in this post and get to Landau and Wilson in the next one. Poincare, among his many contributions, showed how dynamical systems can be understood in terms of geometry and topology. Prior to Poincare, dynamical systems were treated using the tools of analysis. The question was: Given an initial condition, what are the analytical properties of the solutions as a function of time? Poincare said, let's not focus on the notion of movement with respect to time but look at the shape of trajectories in phase space. For a dynamical system with smooth enough properties, the families of solutions map out a surface in phase space with tiny arrows pointing in the direction the solutions would move on this surface (i.e. vector field). The study of dynamical systems becomes the study of differential geometry and topology.
Hence, any time dependent system, including those in biology, that can be described by a (nice enough) system of differential equations is represented by a surface in some high dimensional space. Given some differential equation, we can always make a change of variables, and if this transformation is smooth then the result is just a smooth change of shape of the surface. Thus, what is really important is the topology of the surface, i.e. how many singularities or holes are in it. The singularities are the places where the vector field vanishes, in other words the fixed points. Given that the vector field is smooth away from the fixed points, the global dynamics can be reconstructed by carefully examining the dynamics near the fixed points. The important thing to keep track of when changing parameters is the appearance and disappearance of fixed points or changes in their dynamics (stability). These discrete changes are called bifurcations. The dynamics near fixed points and bifurcations can be classified systematically in terms of normal form equations. Even for some very complicated dynamical systems, the action is focused at the bifurcations. These bifurcations and the equations describing them are standardized (e.g. pitchfork, transcritical, saddle node, Hopf, homoclinic) and do not depend on all the details of the original system. Thus, when a dynamical systems person comes to a problem, she immediately views things geometrically. She also believes that there may be underlying structures that capture the essential dynamics of the system. This is what gives her confidence that some details are more important than others. Statistical mechanics and field theory take this idea to another level, and I'll get to that in the next post.
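As a concrete example of a normal form, here is a sketch of the supercritical pitchfork, one of the standard bifurcations listed above. The equation dx/dt = r*x - x^3 captures the essential behavior: for r < 0 the origin is the only (stable) fixed point, while for r > 0 it loses stability and two stable fixed points appear at x = ±sqrt(r).

```python
# Supercritical pitchfork normal form dx/dt = r*x - x**3, integrated with
# forward Euler. Step size and duration are chosen purely for illustration.
def integrate(r, x0, dt=0.01, steps=20000):
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x**3)
    return x

print(integrate(r=-1.0, x0=0.5))  # decays to the stable fixed point x = 0
print(integrate(r=1.0, x0=0.5))   # settles onto the stable branch x = +1
```

The point of normal form theory is that any system near a pitchfork bifurcation reduces to this equation after a smooth change of variables, regardless of its microscopic details.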
Sunday, August 31, 2008
So what does it mean to understand something? I would say there are two aspects. One is predictive power, which would mean that we would be able to know what drugs or therapies would be useful to cure a brain disorder. The second aspect is more difficult to pin down but would basically mean incorporating something seamlessly into your worldview. The simplest example I can give is a mathematical theorem. Predictive understanding would correspond to the ability to follow all the steps of the proof of the theorem and use the theorem to prove new theorems. Incorporative understanding would be the ability to summarize the proof in a way that relates it in a highly compressed form to things you already know. For example, we can understand bifurcations of complicated dynamical systems by reducing them to the behavior of solutions of simple polynomial equations.
Sometimes the two views can clash. Consider the proof of the Kepler conjecture for sphere packing by my friend and former colleague Tom Hales. The theorem is difficult because there are an infinite number of ways to pack spheres in 3 dimensions. Hales made this manageable by showing that the problem could be reduced to solving a finite (albeit large) optimization problem. He then proceeded to solve the finite problem computationally. To some people, the proof is a done deal. The trick was to reduce it to a finite problem; after that it is just details. Even if you don't believe Hales's computation, you could always repeat it. Others would say it is not done until you have a complete pencil and paper proof. To me, the proof is understandable because Hales was able to reduce it to an algorithm. However, this is not a view that everyone shares.
Now we come back to the brain. What would you consider understanding to entail? I'm not sure that we, namely people working in the field today, will ever have that satisfying incorporating understanding of the brain because we don't have anything in our current worldview that could encapsulate that understanding. We will never be able to say, "Oh right, I understand, the brain is like X." In that sense, it is like quantum mechanics (QM). This is a theory that is highly successful in the predictive sense. As a predictive theory, it is quite simple. There are just a few rules to apply and much of our modern technology like lasers and electronics rely on it. However, no one who has ever thought about it would claim any understanding of QM in the incorporation sense. The Copenhagen interpretation is basically a "Don't ask, don't tell" policy for the theory.
In this sense, trying to understand any complex system, no matter how unrelated it is to the brain, could help in the long run to provide a foundation for an incorporating understanding of the brain. That is not to say that I believe there are laws of complex system similar to classical and quantum mechanics. My own view is that there are no laws in complex systems such as the global climate, economics or the brain; there are just effective theories that sort of work in limited circumstances. However, it is by slowly creating effective theories and models that we will form a new worldview of what it means to understand complex systems like the brain. In the meantime, we should continue to try to build a predictive understanding so that we can cure diseases and treat disorders.
Thursday, August 21, 2008
The problem arises from how we assign meaning to things in the world. Philosophers like Wittgenstein and Saul Kripke have thought very deeply on this topic and I'll just barely scratch the surface here. The simple question to me is what do we consider to be real. I look outside my window and I see some people walking. To me the people are certainly real but is "walking" real as well? What exactly is walking? If you write a simple program that makes dots move around on a computer screen and show it to someone then depending on what the dots are doing, they will say the dots are walking, running, crawling and so forth. These verbs correspond to relationships between things rather than things themselves. Are verbs and relationships real then? They are certainly necessary for our lives. It would be hard to communicate with someone if you didn't use any verbs. I think they are necessary for an animal to survive in the world as well. A rabbit needs to classify if a wolf is running or walking to respond appropriately.
Now, once we accept that we can ascribe some reality, or at least utility, to relationships, this can lead to an embarrassment of riches. Suppose we live in a world with N objects that you care about. This can be at any level you want. The number of ways to relate objects in a set is the number of subsets you can form out of those objects. This is called the power set and has cardinality (size) 2^N. But it can get bigger than that. We can also build arbitrarily complex arrangements of things by using the objects more than once. For example, even if you only saw a single bird, you could still invent the term flock to describe a collection of birds. Another way of saying this is that given a finite set of things, there are an infinite number of ways to combine them. This gives us a countable infinity of items. Now you can take the power set of that set and end up with an uncountable number of items, and you can keep on going if you choose. (Cantor's great achievement was to show that the power set of a countable set is uncountable, and the power set of an uncountable set is bigger still, and so forth.) However, we can probably only deal with a finite number of items, or at most a countable list (if we are computable). This finite or countable list encapsulates your perception of reality, and if you believe this argument then the probability of obtaining our particular list is basically zero. In fact, given that the set of all possible lists is uncountable, this implies that not all lists can even be computed. Our perception of reality could be undecidable. To me this implies an arbitrariness in how we interact with the physical world, which I call our prior. Kauffman calls this the sacred.
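The counting claim above is easy to verify for small N (the labels here are arbitrary examples):

```python
from itertools import chain, combinations

# A set of N objects has 2^N subsets (its power set).
def power_set(items):
    return list(chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)))

objects = ["wolf", "rabbit", "bird"]
print(len(power_set(objects)))  # 2**3 = 8 subsets, including the empty set
```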
Now you could argue that the laws of the material world will lead us to a natural choice of items on our list. However, if we could rerun the universe with a slightly different initial condition would the items on the list be invariant? I think arbitrarily small perturbations will lead to different lists. An argument supporting this idea is that even among different world cultures we have slightly different lists. There are concepts in some languages that are not easily expressible in others. Hence, even if you think the list is imposed by the underlying laws of the physical world, in order to derive the list you would need to do a complete simulation of the universe making this task intractable.
This also makes me have to back track on my criticism of Montague's assertion that psychology can affect how we do physics. While I still believe that we have the capability to compute anything the universe can throw at us, our interpretation of what we see and do can depend on our priors.
Saturday, August 16, 2008
I don't want to debate the history and philosophy of science here, but I do want to make some remarks about these two approaches. There are actually several dichotomies at work. One of the things Bower seems to believe is that a simulation of the brain is not the same as the brain. This is in line with John Searle's argument that you have to include all the details to get it right. In this point of view, there is no description of the brain that is smaller than the brain. I'll call this viewpoint Kolmogorov Complexity Complete (a term I just made up right now). On the other hand, Bower seems to be a strict reductionist in that he does believe that understanding how the parts work will entirely explain the whole, a view that Stuart Kauffman argued vehemently against in his SIAM lecture and new book Reinventing the Sacred. Finally, in an exchange between Bower and Randy O'Reilly, who is a computational cognitive scientist and connectionist, Bower rails against David Marr and the top down approach to understanding the brain. Marr gave an abstract theory of how the cerebellum works in the late sixties, and Bower feels that it has been leading the field astray for forty years.
I find this debate interesting and amusing on several fronts. When I was at Pitt, I remember that Bard Ermentrout used to complain about connectionism because he thought it was too abstract and top down whereas using Hodgkin-Huxley-like models for spiking neurons with biologically faithful synaptic dynamics was the bottom up approach. At the same time, I think Bard (and I use Bard to represent the set of mathematical neuroscientists that mostly focus on the dynamics of interacting spiking neurons; a group to which I belong) was skeptical that the fine detailed realistic modeling of single neurons that Bower was attempting would enlighten us on matters of how the brain worked at the multi-neuron scale. One man's bottom up is another man's top down!
I am now much more agnostic about modeling approaches. My current view is that there are effective theories at all scales and that depending on the question being asked there is a level of detail and class of models that are more useful to addressing that question. In my current research program, I'm trying to make the effective theory approach more systematic. So if you are interested in how a single synaptic event will influence the firing of a Purkinje cell then you would want to construct a multi-compartmental model of that cell that respected the spatial structure. On the other hand if you are interested in understanding how a million neurons can synchronize, then perhaps you would want to use point neurons.
One of the things that I do believe is that complexity at one scale may make things simpler at higher scales. I'll give two examples. Suppose a neuron wanted to do coincidence detection of its inputs, i.e. collect inputs and fire if they arrive at the same time. For a spatially extended neuron, inputs arriving at different locations on the dendritic tree could take vastly different amounts of time to reach the soma, where spiking is initiated. Hence simultaneity at the soma is not simultaneity of arrival, and coincidence detection seemed like a hard problem for a neuron. Then it was discovered that dendrites have active ion channels, so signals are not just passively propagated, which is slow, but actively propagated quickly. In addition, the farther away a synapse is, the faster its signal travels, so that no matter where a synaptic event occurs, it takes about the same amount of time to reach the soma. The dendritic complexity turns a spatially extended neuron into a point neuron! Thus, if you focused on understanding signal propagation in the dendrites, your model would be complicated, but if you only cared about coincidence detection, your model could be simple.

Another example is in how inputs affect neural firing. For a given amount of injected current a neuron will fire at a given frequency, giving what is known as an F-I curve. Usually in slice preparations, the F-I curve of a neuron is some nonlinear function. However, in these situations not all neuromodulators are present, so some of the slower adaptive currents are not active. When everything is restored, it was found (both theoretically by Bard and experimentally) that the F-I curve actually becomes more linear. Again, complexity at one level makes things simpler at the next level.
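For readers unfamiliar with F-I curves, here is what one looks like for the textbook leaky integrate-and-fire neuron (a generic illustration, not the specific models discussed in the post). With threshold Vth, membrane time constant tau, and reset to rest, a constant input I (in units where the membrane resistance is 1) charges the membrane to threshold in time T = tau*ln(I/(I - Vth)), so the firing rate is f = 1/T above threshold and zero below, which is a nonlinear curve.

```python
import math

# F-I curve of a leaky integrate-and-fire neuron with membrane time constant
# tau, threshold Vth, resting/reset potential 0, and unit membrane resistance.
# The interspike interval is T = tau * ln(I / (I - Vth)) for I > Vth.
def firing_rate(I, tau=0.01, Vth=1.0):
    if I <= Vth:
        return 0.0
    return 1.0 / (tau * math.log(I / (I - Vth)))

for I in [0.5, 1.1, 2.0, 4.0]:
    print(f"I = {I:.1f}  ->  f = {firing_rate(I):.1f} Hz")
```

Note the sharp, concave rise just above threshold: this is the kind of nonlinearity that adaptive currents can linearize.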
Ultimately, this "Bower" versus "Bard" debate can never be settled because the priors (to use a Bayesian term) of the two are so different. Bower believes that the brain is Kolmogorov complexity complete (KCC) and Bard doesn't. In fact, I think that Bard believes that the higher level behavior of networks of many neurons may be simpler to understand than that of just a few neurons. That is why Bower is first trying to figure out how a single neuron works, whereas Bard is more interested in explaining a high level cognitive phenomenon like hallucinations in terms of pattern formation in an integro-differential system of equations (i.e. the Wilson-Cowan equations). I think most neuroscientists believe that there is a description of the brain (or some aspect of the brain) that is smaller than the brain itself. On the other hand, there seems to be a growing movement towards more realistic characterization and modeling of the brain at the genetic and neural circuit levels (in addition to the neuron level), as evidenced by the work at Janelia Farm and EPFL Lausanne, which I'll blog about in the future.
Friday, August 15, 2008
Muscle cells cannot take up glucose unless insulin is present. So when you eat a meal with carbohydrates, insulin is released by the pancreas and your body utilizes the glucose that is present. Between meals, muscle cells mostly burn fat in the form of free fatty acids, which are released by fat cells (adipocytes) through a process called lipolysis. The circulating glucose is thus saved for the brain. When insulin is released, it also suppresses lipolysis. Basically, insulin flips a switch that causes muscle and other body cells to shift from burning fat to burning glucose, and at the same time shuts off the release of fat as fuel.
If your pancreas cannot produce enough insulin then your glucose levels will be elevated, and this is diabetes mellitus. Fifty years ago, diabetes was usually the result of an auto-immune disorder that destroys the pancreatic beta cells that produce insulin. This is known as Type I diabetes. Today, however, the most prevalent form, called Type II, arises from a drawn out process attributed to overweight or obesity. In Type II diabetes, people first go through a phase called insulin resistance, where more insulin is required for glucose to be taken up by muscle cells. The theory is that after prolonged insulin resistance, the pancreas eventually wears out and this leads to diabetes. Insulin resistance is usually reversible by losing weight.
Thus, a means to measure how insulin resistant or sensitive you are is important. This is usually done through a glucose challenge test, where glucose is either ingested or injected and the response of the body is measured. I don't want to get into all the methods used to assess insulin sensitivity, but one of them uses what is known as the minimal model of glucose disposal, which was developed in the late seventies by Richard Bergman, Claudio Cobelli and colleagues. This is a system of 2 ordinary differential equations that model insulin's effect on blood glucose levels. The model is fit to the data, and an insulin sensitivity index is one of the parameters. Dave Polidori, who is a research scientist at Johnson and Johnson, claims that this is the most used mathematical model in all of biology. I don't know if that is true, but it does have great clinical importance.
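For the curious, the minimal model has the following standard form, sketched here with a simple forward Euler integration. The parameter values are round illustrative numbers, not fitted values from any study: G is plasma glucose, X is "remote" insulin action, and the insulin sensitivity index is SI = p3/p2.

```python
# Bergman minimal model of glucose disposal (two ODEs):
#   dG/dt = -(p1 + X) * G + p1 * Gb      glucose concentration G
#   dX/dt = -p2 * X + p3 * (I(t) - Ib)   remote insulin action X
# Parameter values below are illustrative, not fitted.
def simulate(p1=0.03, p2=0.02, p3=1e-5, Gb=90.0, Ib=10.0,
             insulin=lambda t: 10.0, dt=0.1, t_end=180.0):
    G, X, t = Gb + 100.0, 0.0, 0.0   # start just after a glucose bolus
    while t < t_end:
        dG = -(p1 + X) * G + p1 * Gb
        dX = -p2 * X + p3 * (insulin(t) - Ib)
        G += dt * dG
        X += dt * dX
        t += dt
    return G

# With insulin held at its basal level Ib, X stays at zero and glucose
# simply relaxes back toward the basal value Gb
print(round(simulate(), 1))
```

In a real glucose challenge test, a measured insulin time course is fed in for insulin(t) and the parameters are fit to the observed glucose curve.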
The flip side to glucose is the control of free fatty acids (FFAs) in the blood, and this aspect has not been as well quantified. Several groups have been trying to develop an analogous minimal model for insulin's action on FFA levels. However, none of these models had been validated or even tested against each other on a single data set. In this paper, we used a data set of 102 subjects and tested 23 different models, including previously proposed models and several new ones. The models have the form of an augmented minimal model with compartments for insulin, glucose and FFA. Using Bayesian model comparison methods and a Markov chain Monte Carlo algorithm, we calculated the Bayes factors for all the models. We found that a class of models distinguished itself from the rest, with one model performing best. I've been using Bayesian methods quite a bit lately and I'll blog about them sometime in the future. If you're interested in the details of the model, I encourage you to read the paper.
Saturday, August 09, 2008
One of the things I took away from this meeting was that the field seems more diverse than when I organized it in 2004. In particular, I thought that the first two renditions of this meeting (2002, 2004) were more like an offshoot of the very well attended SIAM dynamical systems meeting held in Snowbird, Utah in odd numbered years. Now the participation base is more diverse, and in particular there is much more overlap with the systems biology community. One of the unique things about this meeting is that people interested in systems neuroscience and systems biology both attend. These two communities generally don't mix, even though some of their problems and methods are similar and they would benefit from interacting. Erik De Schutter recently wrote a nice article in PLoS Computational Biology exploring this topic. I thus particularly enjoyed the fact that there were sessions that included talks on both neural and genetic/biochemical networks. In addition, there were sessions on cardiac dynamics, metabolism, tissue growth, imaging, fluid dynamics, epidemiology and many other areas. Hence, I think this meeting plays a useful and unique role in bringing together mathematicians and modelers from all fields.
I gave a talk on my work on the dynamics of human body weight change. In addition to summarizing my PLoS Computational Biology paper, I also showed that because humans have such a long time constant for achieving energy balance at a fixed rate of energy intake (i.e. a year or more), we can tolerate wide fluctuations in our energy intake rate and still have a small variance in our body weight. This answers the "paradox" that nutritionists seem to believe in, namely: if a change as small as 20 kcal/day (a cookie is ~150 kcal) can lead to a weight change of a kilogram, then how do we maintain our body weights while consuming over a million kcal a year? Part of the confusion stems from conflating the average with the standard deviation. Given that we only eat finite amounts of food per day, no matter what you eat in a year you will have some average body weight. The question is why the standard deviation is so small; we generally don't fluctuate by more than a few kilos per year. The answer is simply that with a long time constant, we average over the fluctuations. My back of the envelope calculation shows that the coefficient of variation (standard deviation divided by mean) of body weight is suppressed by a factor of 15 or more relative to the coefficient of variation of food intake. This also points to correlations in food intake rate leading to weight gain, as was addressed in my paper with Vipul Periwal (Periwal and Chow, AJP-EM, 291:929 (2006)).
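The averaging argument can be illustrated with a caricature (a deliberately simplified low-pass-filter model, not the model in the paper): treat body weight as relaxing toward an intake-determined level with a time constant of about a year, feed it independent daily intake fluctuations, and compare the coefficients of variation.

```python
import random, statistics

# Body weight as a first-order low-pass filter of daily energy intake with a
# time constant of roughly a year. All numbers are illustrative.
random.seed(0)
tau = 365.0                  # time constant in days
a = 1.0 / tau                # daily relaxation rate
target = 70.0                # equilibrium weight in kg at balanced intake

intakes, weights = [], []
w = target
for day in range(20 * 365):               # simulate 20 years
    intake = random.gauss(1.0, 0.25)      # intake in units of balanced intake
    w += a * (target * intake - w)        # weight relaxes toward set level
    intakes.append(intake)
    weights.append(w)

cv_intake = statistics.stdev(intakes) / statistics.mean(intakes)
cv_weight = (statistics.stdev(weights[5 * 365:])
             / statistics.mean(weights[5 * 365:]))    # discard transient
print(round(cv_intake / cv_weight, 1))  # weight CV is far below intake CV
```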
Karen Ong, from my group, also went and presented our work on steroid mediated gene expression. She won the poster competition for undergraduate students. I'll blog about this work in the future. While I was sitting in the sessions on computational neuroscience and gene regulation, I regretted not having more people in my lab attend and present our current ideas on these and other topics.