For decades, genetic studies have estimated that humans and chimpanzees possess genomes that are about 98.5 percent similar. In other words, of the three billion base pairs along the DNA helix, nearly 99 of every 100 would be exactly identical.
However, new work by one of the co-developers of the method used to analyze genetic similarities between species says the figure should be revised downward to 95 percent.
Roy Britten, a biologist at the California Institute of Technology, reports in the current issue of the journal Proceedings of the National Academy of Sciences that the large amount of sequencing done in recent years on both the human and chimp genomes, along with improvements in the techniques themselves, makes it possible to revisit the issue. In the article, he describes the method he used, which involved writing a special computer program to compare nearly 780,000 base pairs of the human genome with a similar number from the chimp genome.
To describe exactly what Britten did, it is helpful to explain the old method as it was originally used to determine genetic similarities between two species. Called hybridization, the method involved collecting tiny snips of the DNA helix from the chromosomes of the two species to be studied, then breaking the ladder-like helixes apart into single strands. Strands from one species would be radioactively labeled, and then strands from the two species would be recombined.
The helix at this point would contain one strand from each species, and from there it was a fairly straightforward matter to "melt" the strands to infer the number of good base pairs. The lower the melting temperature, the less compatibility between the two species because of the lower energy required to break the bonds.
In the case of chimps and humans, numerous studies through the years have shown a base substitution rate of 1.2 to 1.76 percent. That is, at these positions along the helix the bases (adenine, thymine, guanine, and cytosine) do not correspond and hence do not form a bond.
The problem with the old studies is that the methods did not recognize differences due to events of insertion and deletion that result in parts of the DNA being absent from the strands of one or the other species. These are different from the aforementioned substitutions. Such differences, called "indels," are readily recognized by comparing sequences, if one looks beyond the missing regions for the next regions that do match.
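As a rough illustration of the distinction between substitutions and indels, the sketch below counts both in a pair of sequences that have already been aligned. It is not Britten's Fortran program. The function name, the use of "-" as a gap character, the toy sequences, and the choice to count every gapped position as one differing base are all assumptions made for this example.

```python
def divergence_from_alignment(human: str, chimp: str) -> dict:
    """Count matches, substitutions, and indel positions in two pre-aligned sequences.

    Assumption: both strings have the same length and '-' marks a gap,
    i.e. a base present in one species and absent from the other.
    """
    if len(human) != len(chimp):
        raise ValueError("aligned sequences must have the same length")

    matches = substitutions = indel_positions = 0
    for h, c in zip(human, chimp):
        if h == "-" and c == "-":
            continue  # a column that is a gap in both carries no information
        if h == "-" or c == "-":
            indel_positions += 1  # base missing from one strand: indel
        elif h == c:
            matches += 1  # identical base in both species
        else:
            substitutions += 1  # different base at the same position

    total = matches + substitutions + indel_positions
    return {
        "matches": matches,
        "substitutions": substitutions,
        "indel_positions": indel_positions,
        "substitution_pct": 100.0 * substitutions / total if total else 0.0,
        "indel_pct": 100.0 * indel_positions / total if total else 0.0,
    }


# Toy aligned fragments, invented for illustration only (not real genome data).
result = divergence_from_alignment("ACGTTAGC--TAGGCT", "ACGATAGCGGTAG-CT")
print(result)
```

A method that only detects mismatched bases would report the single substitution in this toy pair and miss the three gapped positions, which is the kind of undercount the sequence comparison is meant to correct.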
To carry out the more complete survey, Britten wrote a Fortran program that did custom comparisons of strands of human and chimp DNA available from GenBank. With nearly 780,000 suitable base pairs available, he could compare several long strands of DNA and better infer where the mismatches would actually be seen if an extremely long strand could be studied.
As expected, he found a base substitution rate of about 1.4 percent, well in keeping with earlier reported results, but also an incidence of 3.9 percent divergence attributable to the presence of indels. Together, the two give the revised figure of roughly 5 percent.
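The arithmetic behind the revised figure can be checked in a few lines. The variable names are illustrative, and the sketch simply assumes the two percentages quoted above can be added directly.

```python
# Percentages as reported in the article.
substitution_pct = 1.4  # divergence from base substitutions
indel_pct = 3.9         # divergence attributed to indels

# Assumption: the two kinds of divergence are simply additive.
total_divergence_pct = substitution_pct + indel_pct
similarity_pct = 100.0 - total_divergence_pct

print(f"{total_divergence_pct:.1f}% divergent, {similarity_pct:.1f}% identical")
# prints "5.3% divergent, 94.7% identical"
```

Rounded to whole numbers, this corresponds to about 5 percent divergence and 95 percent similarity.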
As for the implications, Britten says the new work should help biologists in future studies of precisely how species branch off from each other, and why. "The basic question you would like to answer is what makes the chimp different from humans—what were the basic changes in the genome that mattered.
"A large number of these 5 percent of variations are relatively unimportant. But what matters, according to everyone's idea, is regulation of the genes, which is controlled by the genes that are actually expressed. So to address this issue, you first have to know how different the genomes are, and second, where the differences are located."
The article is available from PNAS by contacting Jill Locantore, the public information officer, at [email protected], or by calling 202-334-1310.
Contact: Robert Tindol (626) 395-3631