Simultaneously Sequence Long Contiguous DNA Regions in Thousands of Samples with Population Genetics' Innovative Reflex™ Technology
17 Apr 2013
Researchers at Population Genetics Technologies Ltd (Cambridge, UK) have developed and validated an innovative technology, Reflex™, for efficient targeted sequencing of long DNA regions in large numbers of genomic DNA samples.
Targeted sequencing is used to study specific parts of a genome that may be involved in disease or other relevant clinical traits. Unveiling the role of these genomic regions usually requires interrogating long contiguous DNA sequences, such as a gene or genes, and doing so in many hundreds or thousands of people. Samples from these individuals can be pooled to take advantage of the high capacity of current sequencing platforms, but current targeting approaches require processing of each sample separately to generate the multiplicity of small fragments required for next generation sequencing. Reflex™ starts with pools of large genomic regions from hundreds or thousands of samples, performing fragmentation on the pool yet retaining the initial sample identity, thus greatly increasing the efficiency and decreasing the cost of targeted sequencing of contiguous genomic regions in large sample numbers.
This Reflex™ technology uses an intramolecular reaction to derive the shorter, sequencer-ready, daughter products from a pooled population of barcoded long-range PCR products while preserving the cognate DNA barcodes. This allows the large targeted region from many thousands of samples to be processed simultaneously in a pool, while allowing the derived sequences to be matched back to each individual. The size of the targeted region depends on the desired design of the long-range PCR, but typically will span 7-10 kilobases.
The Reflex workflow enables uniform sequence coverage of long contiguous sequence targets in large numbers of samples at low cost on desktop next-generation sequencers. The method requires small amounts of input genomic DNA and can be used to target members of multi-gene families with high specificity. The technique is platform-agnostic, having been used successfully on Roche 454, Ion Torrent and Illumina platforms.
Current next-generation sequencing (NGS) platforms require that adaptors be added to the ends of the short target DNA fragments to be sequenced. Adding a multiplex identifier (MID), a short DNA barcode that identifies the sample, along with the sequencing adaptor allows multiple DNA samples to be processed in a single sequencing run. Typically, individual samples are prepared separately and pooled only at the sequencing step, which requires expensive and labor-intensive preparation methods. As a result, sample preparation costs dominate the overall cost of targeted re-sequencing.
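To illustrate how an MID lets pooled reads be matched back to individual samples, here is a minimal Python sketch. It is not Population Genetics' software or the Reflex workflow itself. The MID sequences, the MID length, the sample names and the `demultiplex` function are hypothetical, and the sketch assumes the MID sits at the very start of each read and must match exactly.

```python
from collections import defaultdict

# Hypothetical MID-to-sample table (illustrative only). Real MID sequences
# and lengths depend on the platform and on the barcode set in use.
MID_TO_SAMPLE = {
    "ACGTAC": "sample_001",
    "TGCATG": "sample_002",
    "GATCGA": "sample_003",
}
MID_LENGTH = 6  # assumed fixed MID length


def demultiplex(reads, mid_to_sample=MID_TO_SAMPLE, mid_length=MID_LENGTH):
    """Group pooled reads by the MID found at the start of each read.

    Returns a dict mapping sample name to a list of reads with the MID
    trimmed off. Reads whose prefix matches no known MID are kept,
    untrimmed, under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        mid = read[:mid_length]
        sample = mid_to_sample.get(mid)
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            by_sample[sample].append(read[mid_length:])
    return dict(by_sample)


# Toy pooled reads: the first six bases of each read are the MID.
pooled_reads = [
    "ACGTACGGATTACA",
    "TGCATGCCCTTTAA",
    "ACGTACTTTGGGCC",
    "NNNNNNAAAA",
]
assignments = demultiplex(pooled_reads)
```

Exact prefix matching keeps the sketch short. Production demultiplexers usually tolerate a small number of sequencing errors in the barcode.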
Population Genetics CEO Alan Schafer said that this issue motivated the company's founder, Nobel Laureate Sydney Brenner, to invent a technique that performs sample preparation on a pooled population of long ‘parent’ DNA fragments, already appended with adaptors and MIDs, to generate smaller, sequencer-ready ‘daughter’ amplicons that preserve the adaptors and MIDs.
“Many laboratories are interrogating the same genomic regions in many hundreds, if not thousands, of samples and can benefit from the sample-scale efficiencies of Reflex,” he said. “When coupled with sequencers that allow an extra indexing run, the method can be used to simultaneously sequence thousands of samples in a single run.” The company has already used the Reflex workflow in this way to extract and sequence a gene target from 3000 human genomic DNA samples as part of an ongoing disease susceptibility collaboration.
Reflex technology also has the potential to generate long reads within and beyond each starting long-range PCR product by propagating molecular identifiers across a contiguous region (a capability in development at Population Genetics). The resulting data can inform haplotyping, genome phasing and RNA isoform identification using short-read NGS platforms, extending the technology's value in providing coverage of clinically important genes and genomes.