Category: Genomics

We traditionally think about the tree of life in terms of Kingdoms: plants, animals, fungi, bacteria, etc. Genetics has really revolutionized the way we think about the tree of life and, because our classifications should reflect ancestry (that is, who is more closely related to whom), it has actually called into question a lot of our traditional classifications. Most biologists split up life into three domains: Archaea, Bacteria, and Eucarya (the last of which includes animals, plants, fungi, etc.).

The three domains of life. From Carl Zimmer's blog The Loom.

Science writer Carl Zimmer has an interesting post on his blog about how the newest genetic data may even call this classification into question by adding a fourth domain. From his post:

There’s a lot of debate about whether eukaryotes actually split off from within the archaea, or just branched off from a common ancestor. Nevertheless, the two forms of life are quite distinct. For one thing, the common ancestor of living eukaryotes acquired oxygen-consuming bacteria that became a permanent part of their cells, called mitochondria. They’re keeping you alive right now.

A lot of scientists wonder how all the new species that scientists are discovering are going to change the shape of this tree. Will its three-part structure endure, with each part simply growing denser with new branches? Or have we been missing entire swaths of the tree of life?

It’s possible–but just possible at this point–that we have missed a big part of it.

Genetic data suggest that this fourth domain could be represented by Nucleocytoplasmic Large DNA Viruses (NCLDV), or, as Zimmer comically refers to them, “Giant Viruses” (because they’re big and they have many more genes, >1,000, than “regular” viruses, which only have around 10 or so). It turns out that the genes from these “Giant Viruses” are so different that some researchers suggest they should be grouped in their own domain.

New research is looking at tons of genes from these sorts of organisms. Here is a video based on the work of Jonathan A. Eisen and his colleagues:


Wu, D., et al. (2011). Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in marker gene phylogenetic trees. PLoS ONE 6.

As a story in the New York Times reports, some people respond well to aerobic exercise, while others seem to benefit less or not at all. Studies show that there is a genetic component to this: various exercise traits (and the drive to exercise at all) do certainly run in families. A new study has scanned the genomes of 473 individuals subjected to the same 5-month exercise regime and found that particular SNPs (pronounced “snips”; we’ve talked about these before, see this post) are associated with a robust response to exercise. From the New York Times story:

The researchers looked at 324,611 individual snippets over all. Each of the volunteers had already completed a carefully supervised five-month exercise program, during which participants pedaled stationary bicycles three times a week, at controlled and identical intensities. Some wound up much fitter, as determined by the increase in the amount of oxygen their bodies consumed during intense exercise, a measure called maximal oxygen capacity, or VO2 max. In others, VO2 max had barely budged. No obvious, consistent differences in age, gender, body mass or commitment marked those who responded well and those who continued to huff and struggle during their workouts, even after five months.

But there was a divergence in their genomes. The researchers identified 21 specific SNPs, out of the more than 300,000 examined, that differed consistently between the two groups. SNPs come in pairs, since each of us receives one paternal copy and one maternal copy. So there were 42 different individual versions of the 21 SNPs. Those exercisers who had 19 or more of these SNPs improved their cardiorespiratory fitness three times as much as those who had nine or fewer.
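To make the counting in that passage concrete, here is a minimal Python sketch of the allele tally. It assumes (this is not from the study itself) that each person's genotype is coded as 0, 1, or 2 favorable copies at each of the 21 SNPs. The function names and the example genotype are hypothetical, and the cutoffs of 19 and 9 are simply the ones quoted above.

```python
# Hypothetical illustration of the allele tally described in the quote above.
# Assumption: each of the 21 SNPs is coded as 0, 1, or 2 favorable copies
# (one possible copy from each parent), so a score can range from 0 to 42.

N_SNPS = 21
MAX_SCORE = 2 * N_SNPS  # the 42 individual versions mentioned in the story


def favorable_allele_score(genotype):
    """Sum the favorable copies across all 21 SNPs."""
    if len(genotype) != N_SNPS:
        raise ValueError(f"expected {N_SNPS} SNPs, got {len(genotype)}")
    if any(copies not in (0, 1, 2) for copies in genotype):
        raise ValueError("each SNP must carry 0, 1, or 2 favorable copies")
    return sum(genotype)


def response_group(score):
    """Bin a score using the cutoffs quoted from the New York Times story."""
    if score >= 19:
        return "high responder"
    if score <= 9:
        return "low responder"
    return "in between"


# Made-up genotype: 10 SNPs with two favorable copies, 11 with none.
example = [2] * 10 + [0] * 11
score = favorable_allele_score(example)
print(score, response_group(score))
```

Note that this only sorts people into the groups the story describes; it says nothing about how strongly any single SNP contributes.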

One interesting question raised by this research is: if people find that they do not have the advantageous SNPs, will they simply not try to exercise at all?

Our keynote speaker John Hawks describes this study and harps on the New York Times reporting on his blog.


Bouchard, C., et al. Genomic predictors of maximal oxygen uptake response to standardized exercise training programs. Journal of Applied Physiology in press.

Researchers at the Royal Institute of Technology in Stockholm have now set the world record for number of simultaneous DNA sequence analyses: 5,000. Now, we’re not talking about whole-genome sequencing here; they’re just sequencing parts of an individual’s DNA sequence, but it’s impressive nonetheless. From the summary in ScienceDaily:

“Today the great majority of samples are run ten at a time. This yields a cost of SEK 10,000 (USD $1,600) per sample. We have run 5,000 samples at the same time at the same cost, that is, SEK 100,000. This computes to SEK 20 (USD $3) per sample,” says Peter Savolainen.

He points out several areas where his and his colleagues’ new method can have a great impact. One of them is cancer research, where there is a great need to scan numerous cell samples from many individuals. This is to see which cells and genes are involved in the cancer.

“Another field where our method can be of huge importance is in organ transplants. Many DNA analyses are needed to create a database for matching organ donors with transplant recipients. This will be of major importance to DNA research,” says Peter Savolainen.
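The per-sample figures Savolainen quotes are easy to check. The short Python sketch below just redoes that division. It assumes, as the quote implies, that a run costs the same SEK 100,000 no matter how many samples go into it; the variable and function names are made up for illustration.

```python
# Figures quoted by Peter Savolainen in the ScienceDaily summary above.
RUN_COST_SEK = 100_000   # cost of one sequencing run
SAMPLES_OLD = 10         # samples per run with the usual approach
SAMPLES_NEW = 5_000      # samples per run with the new method


def cost_per_sample(run_cost, samples_per_run):
    """Spread the fixed cost of one run evenly over its samples."""
    return run_cost / samples_per_run


old = cost_per_sample(RUN_COST_SEK, SAMPLES_OLD)
new = cost_per_sample(RUN_COST_SEK, SAMPLES_NEW)
print(f"old: SEK {old:,.0f} per sample")
print(f"new: SEK {new:,.0f} per sample")
print(f"reduction: {old / new:,.0f}-fold")
```

So the drop from SEK 10,000 to SEK 20 per sample is a 500-fold reduction, which comes entirely from packing 500 times as many samples into a run of the same price.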

Pretty cool stuff…

The FDA held a meeting on March 8th and 9th about direct-to-consumer (DTC) genetic testing. According to the FDA’s executive summary, DTC is:

…clinical genetic tests that are marketed directly to consumers (DTC clinical genetic tests), where a consumer can order tests and receive test results without the involvement of a clinician.

As Dan Vorhaus of the Genomics Law Report describes it, the main issue of the meeting was to decide how (and if) the FDA will regulate DTC genetic tests. There were really two perspectives:

1. Those who oppose DTC genetic testing worry that incorrect or misinterpreted tests could produce harmful outcomes, and they even question whether anything of value is actually gained from the tests in the first place.

2. Those who support DTC genetic testing argue that the information empowers patients to explore their “genetic selves” without any ill effects.

The meeting will sum up with recommendations for the FDA from the Molecular and Clinical Genetics Panel (MCGP), which is an FDA committee that “reviews and evaluates data concerning the safety and effectiveness of marketed and investigational in vitro devices for use in clinical laboratory medicine including clinical and molecular genetics and makes appropriate recommendations to the Commissioner of Food and Drugs.” Vorhaus suspects that the MCGP will recommend:

that clinical (as defined by the FDA, which is itself a separate issue) direct-to-consumer genetic testing, when offered without a requirement that a clinician participate in the ordering, receipt and interpretation of the test, be removed from the marketplace. At least for the time being.

Our keynote speaker, John Hawks, blogs about this issue and considers himself a “genetic libertarian.” He describes his position:

I believe that I have a fundamental right to my own biological information. What I mean is that, if anybody has biological information about me, I should be able to access and use it. Additionally, I think it is immoral for anyone to charge me excessive rates to access my own information. So that’s where I’m coming from. I’m a genetic libertarian. 

For more info see the FDA’s website for the event.

What do you think about DTC genetic testing? Do you think it’s a good idea? How much regulation (if any) should be provided by government agencies?

Continuing with our exploration of the vignettes in Science’s 10th anniversary celebration of the human genome project, we run across an interview with Eric Green, who just recently became the director of the National Human Genome Research Institute. As with all of these pieces, there’s lots of interesting stuff here. A couple of highlights from the interview:

Q: Why did you set 2020 for when genomics will begin affecting health care? Why is it going to take so long?

Eric Green: When we talk to people who have a historic view of medical advances, they have pointed out that truly changing medical care takes a substantial amount of time. Often decades. And I’ve grown sensitive to the criticisms of genomics by some who believe that since 2003, when the genome project ended, we haven’t sufficiently improved human health 7 years later. So part of the reason is just to be a little bit more realistic and a little more cautious.  

Q: Where are you hoping we will be by 2020?

Eric Green: I’m hoping that by 2020 we will have this incredible mountain of information about how genetic variants play a role in disease, that it will just provide an entirely new venue for really thinking about how to both predict disease, maybe prevent disease, and certainly treat disease.

Notice that Dr. Green seems pretty confident in our ability to use genomics to predict and treat disease, but puts a “maybe” in front of prevention.


Kaiser, J., Green, E. (2011). The genome project: what will it do as a teenager? Science 331: 660.

There is a gene, with the exciting name of Tp53, that, among other things, regulates apoptosis (programmed cell death). Apoptosis is actually good for a number of reasons, but one thing that it does is get rid of damaged cells before they cause problems. There are, however, two variants of this gene: one has the amino acid arginine, in which apoptosis proceeds normally, and the other has the amino acid proline, in which apoptosis is curtailed. The arginine variant has been shown to protect individuals from the development and spread of cancer cells. This is a good thing, of course. However, apoptosis of brain cells occurs in the aftermath of a stroke, and if it is not checked, this can delay or prevent an individual’s recovery. It makes sense, then, that a new study finds that people with the arginine variant do not recover as well from strokes as those with the proline variant. From a summary in ScienceNews:

Of people who had a poor prognosis after a stroke, about 81 percent carried two copies of the arginine variant. About 91.5 percent of people with a poor outcome after a hemorrhage had the arginine variant. None of the people with two copies of the proline variant had bad outcomes after either stroke or hemorrhage. People with one copy of each variant tended to have good prognosis after either type of brain injury.

One commentator suggests that this may not apply equally well across racial groups. Again, from the summary:

“We know already that there’s no way this is going to hold up in African Americans,” says Maureen Murphy, a cancer biologist at the Fox Chase Cancer Center in Philadelphia. African Americans tend to have the proline version of p53, but also have high rates of stroke, often with very poor outcomes, she says. It will be important to repeat the study in other ethnic groups to determine the variants are good predictors of stroke outcome for everyone.


Gomez-Sanchez, J.C., et al. (2011). The human Tp53 Arg72Pro polymorphism explains different functional prognosis in stroke. Journal of Experimental Medicine. 209: online.

When humans are nursing, we all have an enzyme, lactase, that allows us to break down the milk sugar lactose. However, in our early ancestors, the activity of lactase decreased or stopped entirely after weaning. Those modern humans that retain this ancestral pattern are lactose intolerant as adults. However, as we know, some people are able to safely consume milk (and thus lactose) into adulthood. Sarah Tishkoff and her colleagues summarized their recent findings on lactase persistence a couple of weeks ago at the American Association for the Advancement of Science meetings in D.C. From one of Tishkoff et al.’s papers on the subject (see reference below):

These individuals have the ‘lactase persistence’ trait. The frequency of lactase persistence is high in northern European populations (>90% in Swedes and Danes), decreases in frequency across southern Europe and the Middle East (~50% in Spanish, French and pastoralist Arab populations) and is low in non-pastoralist Asian and African populations (~1% in Chinese, ~5%–20% in West African agriculturalists). Notably, lactase persistence is common in pastoralist populations from Africa (~90% in Tutsi, ~50% in Fulani)

What do all these populations with high frequencies of the lactase persistence trait have in common? You guessed it….they all have a long history of cattle domestication. What’s cool about this new study is they show that the genetic mutation that gave rise to lactase persistence in modern Europeans is different from that of modern Africans. So, basically, this trait evolved independently at least twice. It also appears as if the evolution and spread of lactase persistence is consistent with a selective sweep (see this post for more info) that began about 7,000 years ago. So, in other words, it’s spread really fast, which means that it conferred a pretty big advantage to those individuals that had it. For more info see this podcast from Scientific American.
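To get a feel for why a fast spread implies a big advantage, here is a toy Python sketch. It is not the analysis in Tishkoff et al.; it assumes a textbook one-locus haploid selection model, and the starting frequency (1%), target frequency (90%), and selection coefficients are made-up numbers chosen only for illustration.

```python
# Toy model of a selective sweep (NOT the analysis in Tishkoff et al.).
# Assumption: a simple one-locus haploid model in which carriers of the
# advantageous variant leave (1 + s) offspring for every 1 left by others.


def next_frequency(p, s):
    """Frequency of the favored variant after one generation of selection."""
    return p * (1 + s) / (1 + p * s)


def generations_to_reach(start, target, s, max_generations=100_000):
    """Count generations for the variant to rise from start to target."""
    p = start
    for generation in range(max_generations + 1):
        if p >= target:
            return generation
        p = next_frequency(p, s)
    raise RuntimeError("target frequency not reached")


# Made-up numbers for illustration: start at 1%, end at 90%.
for s in (0.01, 0.05, 0.10):
    gens = generations_to_reach(0.01, 0.90, s)
    print(f"s = {s:.2f}: about {gens} generations")
```

If you assume roughly 25 years per generation (another assumption), 7,000 years works out to about 280 generations. In this toy model a weak advantage takes far longer than that to carry a rare variant to high frequency, while a larger one fits comfortably inside the window.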

Participate in our poll below…are you lactose intolerant? Can you trace your ancestry back to populations that practiced cattle domestication?


Tishkoff, S.A., et al. (2006). Convergent adaptation of human lactase persistence in Africa and Europe. Nature Genetics 39: 31-40.

We’re starting to go through some of the interesting vignettes in Science’s 10th anniversary celebration of the human genome project. One of these papers takes a realistic view of how genomic research has benefited human health over the past 10 years. A few areas that the authors touch on:

1. Identifying risk:  The predictive power of most genetic variants associated with diseases is not very high. This means that the potential benefits of separating patients even into gross categories such as “high” and “low” risk based on the presence/absence of disease-risk genes are in many cases outweighed by the cost of potentially misclassifying (and thus mistreating) them.

2. The difficulty of changing behavior: When someone is told they are at a genetically higher risk of developing a particular disease, there is really no evidence to indicate that they change their dietary or exercise habits (see also this post on the blog). Altering an individual’s environment (regardless of the presence/absence of disease-risk genes) is probably a better, and more lasting, way of convincing them to be less lazy, or to eat better and not smoke.    

3. False hope: Scientists and the press are both responsible for creating false hopes for genomic research in human health.

The authors do suggest that the following are realistic expectations:

1. The genes responsible for most Mendelian disorders will be identified. This will permit quick diagnoses, particularly for diseases that are caused by a single gene. 

2. Pharmacogenomics (the study of the influence of genetic variation on drug response) will enhance the safety and efficacy of treatments. However, because a lot of variability in drug response is tied to non-genetic factors, we can’t expect genomics to completely solve this issue.    

They make the interesting suggestion that because most mortality in high-income countries results from things like smoking, sedentary behavior, and excessive food and alcohol consumption, the diseases associated with these factors are best (or at least as effectively) researched via the social and behavioral sciences (i.e., how do we change these behaviors?) rather than through genomics (i.e., how do we identify individuals at genetic risk for these diseases?).  


Evans, J.P., Meslin, E.M., Marteau, T.M., Caulfield, T. (2011). Deflating the genomic bubble. Science 331: 861-862.

UPDATE 2.23.2011. Dr. Hawks blogs about this issue here.

Ok, so we all know what the genome is: the entire sequence of As, Ts, Gs, and Cs (all 6 billion or so of them) that make up the DNA sequence in each of our cells. As you all now know, we’ve had the complete human genome mapped out for ten years now (of course, we’re still trying to figure out what it all does). We also know that whenever there is a change in the DNA molecule (a mutation), that change can become permanent and, in some cases, will be passed down to offspring.  

Well, there are ways that changes can be heritable WITHOUT actually changing the underlying DNA sequence…which is where the epigenome comes into play (“epi” comes from the Greek for “above” or “over”). The epigenome is the host of non-genetic factors that cause genes to change the way that they behave. The classic example is cell differentiation: how is it that a single fertilized egg can differentiate into heart cells, liver cells, skin cells, etc., even though the DNA molecule remains exactly the same in all the cells? What happens is that epigenetic factors turn on only the genes that are needed for each cell type to carry out its specific functions.

Now, most of these epigenetic changes occur only within an individual’s lifetime and are thus only passed from one cell to the next as they divide. However, and this is where it gets really cool, if these changes occur in a sperm or egg cell, then some could be inherited from one generation to the next. This should sound familiar because, in effect, this would be the inheritance of acquired characteristics, which is Jean-Baptiste Lamarck’s oft-ridiculed mechanism for evolution!

Scientists are now on the hunt for a map of the epigenome. For starters, they have been mapping the relatively small epigenomes of the fruit fly and the roundworm, and an ongoing study now has a basic epigenomic map for both species. What can this tell us? From an interview with one of the team leaders on the project, Dr. Sarah C.R. Elgin (Washington University, St. Louis), from ScienceDaily:

“We learned many things from the Human Genome Project,” Elgin says, “but of course it didn’t answer every question we had!

“Including one of the oldest: We all start life as a single cell. That cell divides into many cells, each of which carries the same DNA. So why are we poor, bare, forked creatures, as Shakespeare put it, instead of ever-expanding balls of identical cells?

“This work,” says Elgin, “will help us learn the answer to this question and to many others. It will help us to put meat on the bones of the DNA sequences.”

There is actually a conference on human epigenetics, Environmental Epigenetics and Disease Susceptibility, March 27-April 1 in Asheville, NC. Just a smattering of the papers that will be given:

“Epigenetics, Brain Evolution and Behavior”; “Transgenerational Epigenetic Inheritance”; “Epigenetics at the Interface of Genetic and Environmental Risk Factors for Autism Spectrum Disorders”; “The Imprinted Brain Theory: How Genes Set the Balance between Autism and Psychosis”

It was waaaaaaayyy back in February, 2001, when the human genome sequence was first published in the journal Science. This month, Science is running a series of vignettes about the impact of this revolution on genetics, culture, society, policy, and everything else. Take some time and look them over.

Science’s most recent podcasts also deal heavily with human genomics. Check these ones out: Ten years of the human genome sequence; Releasing genomic data; Forensic genetics

Check back for specific posts on these vignettes, because there’s a lot of interesting stuff to discuss here…