National Children's Study

Last week, the NY Times printed a short article by Kate Murphy marking the beginning of the National Children’s Study (Official site) this coming January:

After nearly a decade of planning, researchers will begin recruiting pregnant women in January for an ambitious nationwide study that will follow more than 100,000 children from before birth until age 21.
The goal of the federally financed project, the National Children’s Study, is to gain a better understanding of the effects of a wide array of factors on children’s health.

At a total cost of $2.7 billion, the study has been controversial – it looks like a giant fishing expedition, and its planning involved decisions not only about science but about which congressional districts would be home to study sites. In other words, it may be science but it’s larded with a lot of pork. Still, as fishing expeditions go, this one has a lot more potential than many high-energy physics or space projects that have comparable budgets.

The study was conceived at a time when substantial genetic information would not have been expected. But the study has been retrofitted with genomics, to some extent. Over the 21-year duration, the cost of full genetic sampling of each study participant (and parents) will be trivial relative to the total cost of the study. They’re already planning intrusive biological sampling for chemical agents:

Participating mothers and children (fathers will be encouraged but not required to take part) will be given periodic interviews and questionnaires. They will further be asked to submit samples of blood, urine and hair. Air, water and dust from their environments will also be sampled and tested.

Heck, if all the participants got 23andMe today, it would be a drop in the bucket compared to the total budget. It will be interesting to see how they alter the study protocol to include whole-genome sequencing when it becomes feasible. It shouldn’t be that hard to convince a few congressmen…

As presently described, their genetic methods are sort of rudimentary – they have the usual problem of a very large number of comparisons, and need ways to deal with it. You can believe that people who solve problems in this area will be in high demand. Today, the methods for dealing with genome-wide data have to fall back on widespread tools like PHASE and STRUCTURE, which really don’t solve the problems at the level of information resolution that might be available. This is increasingly an anthropological problem, since fine-scale information about human history and prehistory can contribute to the statistical power of association studies, if the researchers take this information into account.
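To make the "very large number of comparisons" problem concrete, here is a minimal sketch of two standard corrections, Bonferroni and Benjamini-Hochberg. Nothing here comes from the study's protocol: the function names, the 0.05 significance level, and the one-million-marker figure are all assumptions chosen for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of tests rejected while controlling the false discovery rate at alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank whose p-value falls under its stepped threshold.
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical scan of one million markers (an assumed figure, not from the study).
    print(bonferroni_threshold(0.05, 1_000_000))
    # Made-up p-values, purely to show the call.
    print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

Neither correction uses population history or the correlation among nearby markers, which is the kind of information the paragraph above suggests could add statistical power.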

Also of interest: the entire cohort will include around 3,000 sets of twins.