Cosmologists are very troubled by the fact that they can't account for (depending on whom you ask) 90% to 99% of the mass and energy of the universe. The nature of this "Dark Matter" is the most pressing problem of their field. However, biologists don't seem nearly as perturbed by the fact that the purpose of a similar fraction of the mammalian genome is completely unknown. Only a small fraction of the genome lies in the genes that code for proteins, and biologists are so unconcerned about the rest that much of the non-coding region is simply called junk DNA.
It has always perplexed me why most of our DNA would be junk. I can't believe that 90% of the DNA has no use whatsoever. It seems much more likely that this so-called junk DNA is necessary for genetic regulation. After all, the main difference between me and another person lies not in the proteins we carry but in how and when they are expressed. Darwin himself recognized that much of the variation in nature must be due to regulation.
A very nice paper by Peter Andolfatto in the October 20 issue of Nature shows that in the fruit fly between 40% and 70% of the DNA nucleotides situated between genes are under selection pressure by evolution. He showed this in a very clever way. He analyzed the DNA of two species of Drosophila, D. melanogaster and D. simulans, and looked at the level of polymorphism (differences within a species) and divergence (differences between species) in the genome. As a control he looked at synonymous sites (positions in the coding region of DNA where a change in the nucleotide does not change the amino acid it codes for, because of redundancies in the nucleotide triplet code).
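To make the idea of a synonymous site concrete, here is a minimal sketch. It uses only a handful of entries from the standard genetic code, and the helper name `is_synonymous` is my own invention for illustration, not anything from the paper.

```python
# A few entries from the standard genetic code (DNA codons).
# Only the codons needed for this illustration are listed.
CODON_TABLE = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAT": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
}


def is_synonymous(codon, position, new_base):
    """Return True if changing one base leaves the amino acid unchanged.

    `position` is 0-based within the codon. This is a hypothetical helper
    and only works for codons present in the partial table above.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


# Any third-position change in a glycine codon is silent.
print(is_synonymous("GGT", 2, "A"))
# For aspartate, T -> C is silent but T -> A switches to glutamate.
print(is_synonymous("GAT", 2, "C"))
print(is_synonymous("GAT", 2, "A"))
```

Because a silent change like GGT to GGA leaves the protein untouched, such sites are assumed to be close to free of selection, which is what makes them a reasonable yardstick.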
Andolfatto found that the level of nucleotide variation in non-coding regions is slightly lower than at synonymous sites, indicating that these regions have undergone negative selection pressure. Additionally, he found that divergence at the selected sites was elevated relative to polymorphism, indicating that they also experience positive selection pressure. In other words, most mutations in these regions are deleterious and thus are selected against, but every once in a while a nucleotide substitution confers some advantage and is selected for. The bottom line is that these non-coding regions are crucial for the survival of the organism.
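The logic of comparing polymorphism and divergence against a neutral reference can be sketched in a few lines. This follows the standard McDonald-Kreitman-style ratio rather than necessarily the exact procedure in the paper, and every number below is invented purely for illustration. The function names `constraint` and `alpha` are my own.

```python
def constraint(test_level, neutral_level):
    """Estimated fraction of mutations removed by negative selection.

    Both arguments are per-site levels of variation (polymorphism or
    divergence). If the test sites were neutral, the ratio would be
    about 1 and the constraint about 0.
    """
    return 1.0 - test_level / neutral_level


def alpha(d_test, p_test, d_neutral, p_neutral):
    """McDonald-Kreitman-style estimate of the fraction of divergence
    at the test sites that was driven by positive selection.

    d_* are between-species differences and p_* are within-species
    polymorphisms. Under pure neutrality d_test / p_test should match
    d_neutral / p_neutral, which gives alpha = 0.
    """
    return 1.0 - (d_neutral * p_test) / (d_test * p_neutral)


# Hypothetical values, NOT data from the paper.
# Non-coding sites vary half as much per site as synonymous sites.
print(constraint(test_level=0.06, neutral_level=0.12))

# Non-coding sites show excess divergence relative to polymorphism.
print(alpha(d_test=60, p_test=30, d_neutral=40, p_neutral=40))
```

With these made-up inputs both quantities come out to one half: half of the mutations would have been weeded out, and half of the fixed differences would have been adaptive. The real estimates depend on the actual counts and on corrections that this sketch ignores.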
What these non-coding regions are for is unknown. The current dogma says that gene expression is controlled by sets of transcription factors that act on various promoter regions. According to Alex Kondrashov in the accompanying News and Views piece, current estimates put the fraction of functionally important segments of mammalian non-coding DNA at less than 15%. Although an equivalent study still needs to be done in mammals, I'm betting that a significant portion of what is thought of as junk DNA is used for regulation, and in a completely novel way.