john hawks weblog

paleoanthropology, genetics and evolution

A standard variation file format for human genome sequences

Sun, 2011-07-31 22:09 -- John Hawks
Title: A standard variation file format for human genome sequences
Publication Type: Journal Article
Year of Publication: 2010
Authors: Reese, M; Moore, B; Batchelor, C; Salas, F; Cunningham, F; Marth, G; Stein, L; Flicek, P; Yandell, M; Eilbeck, K
Journal: Genome Biology
Volume: 11
Pagination: R88+
Date Published: Aug
ISSN: 1465-6906
Keywords: 2010-09-18, bioinformatics, data access, dataset, open access, sequencing, software
Abstract

Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and from an Amazon elastic block storage (EBS) snapshot for use in Amazon's EC2 cloud computing environment.
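Since the abstract describes GVF as a tab-delimited extension of GFF3, a short sketch may help show what reading one record could look like. This is a minimal illustration, not the authors' own tooling: it assumes the standard nine GFF3 columns and semicolon-separated `key=value` attributes, and the function name `parse_gvf_line`, the sample line, and its values (`example_source`, `var1`, the coordinates) are all made up for the example.

```python
from urllib.parse import unquote

# The nine tab-separated columns defined by GFF3, which GVF inherits.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gvf_line(line):
    """Parse one GVF/GFF3 feature line into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Column 9 holds key=value pairs separated by ';'.
    # A value may itself be a comma-separated list, and GFF3 uses
    # URL-style percent-encoding for reserved characters.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Illustrative line only; the values are invented, not taken from 10Gen.
example = (
    "chr1\texample_source\tSNV\t1000\t1000\t.\t+\t.\t"
    "ID=var1;Variant_seq=A,G;Reference_seq=G"
)
record = parse_gvf_line(example)
print(record["type"], record["start"], record["attributes"]["Variant_seq"])
```

The `type` column is where a Sequence Ontology term such as `SNV` would appear, and the variant-specific details live in the attributes column. A real reader would also need to handle comment lines and the `##` pragmas at the top of the file, which this sketch skips.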

URL: http://dx.doi.org/10.1186/gb-2010-11-8-r88
DOI: 10.1186/gb-2010-11-8-r88
Citation Key: Reese:format:2010