Open-source Analysis Tool Enables Rapid Annotation of Metagenomes

20 Nov 2008
Emily
Student / Graduate

The study of microbial communities from environmental samples, known as metagenomics, has been propelled forward by technological advancements in high-throughput, low-cost sequencing such as 454’s Genome Sequencer FLX System.

With the recent use of sequencing approaches for metagenomic analysis, the major research challenges have shifted from generating to analyzing data. To resolve these analysis bottlenecks, a team of researchers from Argonne National Laboratory, The University of Chicago, and San Diego State University has developed an open-source system for automated processing of metagenomic sequence data. The resource, designed specifically for data files generated by the 454 Sequencing platform, generates phylogenetic and functional summaries of the metagenomes by comparing the sequences against protein and nucleotide databases.

The metagenomics-RAST server, available via the link on the company's article web page, is based on the SEED framework for comparative genomics. Researchers can upload data sets directly in the file format generated by the GS FLX instrument, either as raw reads or assembled contigs. “We built this analysis tool with specific consideration for 454 Sequencing data sets,” explained Rob Edwards, the project lead. “Only the long sequence reads from the GS FLX System ensure the specificity needed to compare data against DNA or protein databases for functional annotation, making it the platform of choice for metagenomic analysis.”

Metagenomes uploaded to the high-throughput pipeline are compared against a variety of known sequence databases, including rRNA and mitochondrial databases, and are screened for protein-encoding genes. The tool provides the data types needed for phylogenetic comparisons, functional annotations, binning of sequences, phylogenomic profiling, and metabolic reconstructions.
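As a rough illustration of how a phylogenetic summary might be derived from database comparisons, the minimal Python sketch below keeps the best-scoring hit for each read and counts reads per label. It is not the server's actual code: the hit records, labels, scores, and the function names `best_hit_per_read` and `summarize` are all hypothetical, and a real pipeline would apply significance cutoffs and more careful assignment rules.

```python
from collections import Counter

# Hypothetical similarity-search hits: (read_id, label, bit_score).
# The labels and scores are invented for illustration only.
hits = [
    ("read1", "Proteobacteria", 210.0),
    ("read1", "Firmicutes", 95.5),
    ("read2", "Firmicutes", 180.2),
    ("read3", "Proteobacteria", 150.0),
    ("read3", "Proteobacteria", 120.0),
]


def best_hit_per_read(hits):
    """Keep the highest-scoring (label, score) pair for each read."""
    best = {}
    for read_id, label, score in hits:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (label, score)
    return best


def summarize(hits):
    """Count how many reads are assigned to each label by best hit."""
    return Counter(label for label, _ in best_hit_per_read(hits).values())


summary = summarize(hits)
for label, count in summary.most_common():
    print(f"{label}\t{count}")
```

The same tallying approach could in principle be applied to functional categories instead of taxonomic labels, which would give a functional summary.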

“We are excited to see the development of this new online annotation tool,” said Ulrich Schwoerer, Head of Global Marketing at 454 Life Sciences, a Roche company. “As we continue to improve our sequencing technology with higher throughput and longer reads, we support efforts to build collaborative networks for sharing and analyzing data within the metagenomic research community.”

The study, titled “The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes,” appears in the journal BMC Bioinformatics.

454 Life Sciences, a center of excellence of Roche Applied Science, develops and commercializes the innovative 454 Sequencing system for ultra-high-throughput DNA sequencing. Specific applications include de novo sequencing and re-sequencing of genomes, metagenomics, RNA analysis, and targeted sequencing of DNA regions of interest. The hallmarks of the 454 Sequencing system are its simple, unbiased sample preparation and long, highly accurate sequence reads, including paired-end reads. The technology of the 454 Sequencing system has enabled hundreds of peer-reviewed studies in diverse research fields, such as cancer and infectious disease research, drug discovery, marine biology, anthropology, paleontology and many more.
