Next generation sequencing: Where are we now?
By: Roche Life Science
Posted: June 29, 2015 | Lab Life - Next-Gen Sequencing
The last decade of advancements in genetic technologies has revolutionized massively parallel DNA sequencing (also known as next- or second-generation sequencing), opening the floodgates for unprecedented scientific inquiry and diagnostic capabilities.
Indeed, the rapid evolution of next-generation sequencing (NGS) platforms has driven the cost of DNA sequencing down by multiple orders of magnitude compared to the nearly 3 billion dollars it cost to sequence the first human genome. This changes the landscape of genetic information analysis, placing sequencing capabilities once attainable only by select scientists at major genetic institutes into the hands of individual investigators all over the world. And while NGS has the potential to shift genetic and biomedical research into a new era of comprehensive studies with unparalleled biological and clinical implications, significant challenges remain to be addressed.
In this article, we will provide a snapshot of NGS technology and briefly discuss the advantages, limitations, and availability of NGS platforms.
Sanger sequencing at a glance
For more than twenty years, automated Sanger sequencing, or 'first-generation' sequencing, dominated genome analysis technology. In its high-throughput form, the process uses bacterial cloning (or PCR) followed by template purification. DNA fragments are then end-labeled with fluorescently tagged dideoxynucleotides (ddNTPs), which terminate chain extension, and the sequence is determined by high-resolution capillary electrophoresis coupled to fluorescence detection. The resulting four-color emission spectra are translated by computer software into DNA sequence.
While this method achieved unprecedented success, including the complete sequencing of the first human genome, its limited ability to process large numbers of genomes efficiently and cost-effectively has led to the emergence of new DNA sequencing technologies. These second- or next-generation technologies are high-throughput methods for DNA sequencing that apply massively parallel processing on miniaturized platforms.
Next generation sequencing strategies
These NGS strategies sequence millions (in some cases more than a billion) of short reads per instrument run, with each read on the order of 30 to 400 bases long. There are multiple NGS platforms commercially available today, differing in their sequencing chemistry and configuration, although they tend to share the major steps of massively parallel sequencing using spatially separated, clonally amplified DNA templates.
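To put these read counts and read lengths in perspective, the short Python sketch below estimates the average depth of coverage for a run, using the common approximation that coverage equals the number of reads times the read length divided by the genome size. The function names and the example figures (a genome of roughly 3.1 billion bases, 100-base reads) are illustrative assumptions, not specifications of any particular platform.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is expected to be sequenced.

    Uses the simple approximation coverage = N * L / G and ignores
    duplicates, unmapped reads, and uneven coverage.
    """
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Assumed, illustrative values: one billion 100-base reads on a ~3.1 Gb genome.
    depth = mean_coverage(1_000_000_000, 100, 3_100_000_000)
    print(f"Approximate average coverage: {depth:.1f}x")
    print("Reads needed for 30x:", reads_needed(30, 100, 3_100_000_000))
```

In practice the usable depth is lower than this estimate, because some reads fail quality filters or do not map uniquely.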
The first common step is library preparation. In cyclic-array methods, this is generally accomplished by clonal amplification of DNA fragments using various PCR-based methods, such as in situ PCR colony (polony) formation, emulsion PCR, or bridge PCR, each of which results in an array of millions of spatially immobilized, clonally clustered amplicons.
The second major step is sequencing by synthesis: the complementary strand is extended serially, one addition per cycle, by a polymerase or a ligase (as opposed to chain termination in Sanger biochemistry).
Lastly, data are acquired in a massively parallel manner by imaging-based detection of the fluorescent labels incorporated with each extension. Successive imaging of the full, spatially segregated array of amplified DNA at each cycle thus builds a contiguous sequencing read for every cluster.
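The idea of building reads from successive images can be sketched in a few lines of Python. This is a deliberately simplified toy model, not any vendor's base-calling algorithm: it assumes one fluorescence channel per base, represents each imaging cycle as a list of four intensities per cluster, and calls the brightest channel. The names `CHANNELS` and `call_reads` and the intensity values are hypothetical.

```python
CHANNELS = "ACGT"  # assumed channel order: one fluorescent label per base


def call_reads(cycles):
    """Turn per-cycle intensity measurements into one read per cluster.

    cycles[c][k] holds the four channel intensities measured for
    cluster k during imaging cycle c. Each cycle adds one base to
    every cluster's read, chosen here as the brightest channel.
    """
    if not cycles:
        return []
    reads = [[] for _ in cycles[0]]
    for cycle in cycles:
        for k, intensities in enumerate(cycle):
            brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
            reads[k].append(CHANNELS[brightest])
    return ["".join(bases) for bases in reads]


if __name__ == "__main__":
    # Three imaging cycles over two clusters (made-up intensities).
    example_cycles = [
        [(0.9, 0.1, 0.0, 0.0), (0.0, 0.0, 0.1, 0.8)],
        [(0.1, 0.7, 0.1, 0.1), (0.2, 0.1, 0.6, 0.1)],
        [(0.0, 0.0, 0.9, 0.1), (0.8, 0.1, 0.0, 0.1)],
    ]
    print(call_reads(example_cycles))
```

Real base callers also correct for signal cross-talk between channels, for clusters falling out of phase, and for signal decay, and they attach a quality score to every base.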
Importantly, although whole genome sequencing is more readily available with today's NGS technology, it may be more efficient and cost-effective, depending on the experimental goals of a study, to isolate high-priority segments of genomes using target enrichment strategies. For instance, when examining specific panels of genes, performing genotyping studies, or profiling genetic recombination, accumulating large and at times unnecessary amounts of genetic information may impede or complicate data processing and analysis.
This has served as the impetus for significant recent advancements in custom-capture hybridization strategies to enrich specific genetic regions of interest prior to sequencing. These methods have enhanced the efficiency of NGS and increased the availability of NGS platforms to researchers by reducing the cost of running multiple samples and facilitating the use of multiplexing.
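A rough calculation shows why enrichment makes multiplexing practical. The Python sketch below estimates how many samples fit in a single run for a given target size and sequencing depth. The function name `samples_per_run`, the run yield, the panel size, the depths, and the on-target fraction are all assumed, illustrative values rather than figures from any specific platform or capture kit.

```python
def samples_per_run(run_yield_bases, target_size_bases, depth, on_target_fraction=1.0):
    """Estimate how many samples can share one run.

    run_yield_bases:    total bases produced by the run
    target_size_bases:  size of the region sequenced per sample
    depth:              desired average coverage of that region
    on_target_fraction: share of bases that land in the target
                        (below 1.0 for hybridization capture)
    """
    usable_bases = run_yield_bases * on_target_fraction
    bases_per_sample = target_size_bases * depth
    return int(usable_bases // bases_per_sample)


if __name__ == "__main__":
    run_yield = 100_000_000_000  # assumed 100 Gb run

    # Whole genome (~3 Gb) at 30x: roughly one sample per run.
    print(samples_per_run(run_yield, 3_000_000_000, 30))

    # Assumed 1 Mb gene panel at 500x with 50% of bases on target.
    print(samples_per_run(run_yield, 1_000_000, 500, on_target_fraction=0.5))
```

Under these assumptions the same run that holds a single whole genome holds on the order of a hundred panel samples, which is the cost advantage the enrichment strategies above are aiming for.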
The major advantage of NGS platforms is the ability to amass an enormous volume of genetic data in a relatively cost-effective manner compared to high-throughput Sanger methods. Importantly, as more NGS platforms become commercialized, the technology is becoming cheaper and more readily available to researchers. This means that researchers are quickly beginning to establish large databases of genetic data unlike any acquired in the past. As with any rapid evolution in biotechnology, this growth also highlights the current major challenges in NGS, namely the efficient and cost-effective processing and storage of large volumes of sequencing data.
This requires both a robust bioinformatics approach for high-throughput analyses and streamlined data storage and retrieval strategies. While these limitations have been the bottleneck to mainstreaming NGS technologies over the past few years, major advances in bioinformatics, supercomputer processing capabilities, and commercially available cloud-based storage infrastructure are helping to overcome these barriers.
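To give a sense of the storage scale involved, the Python sketch below estimates the uncompressed size of a run's reads stored as FASTQ, a widely used text format with four lines per read (header, sequence, a separator line, and one quality character per base). The function name `fastq_bytes`, the 50-character header length, and the run size are illustrative assumptions.

```python
def fastq_bytes(num_reads, read_length, header_length=50):
    """Approximate uncompressed FASTQ size in bytes.

    Each record has four newline-terminated lines:
    header, sequence, '+', and quality string.
    header_length is an assumed average, including the leading '@'.
    """
    per_record = (
        (header_length + 1)   # header line
        + (read_length + 1)   # sequence line
        + 2                   # '+' separator line
        + (read_length + 1)   # quality line, one character per base
    )
    return num_reads * per_record


if __name__ == "__main__":
    # Assumed run: one billion reads of 100 bases each.
    size = fastq_bytes(1_000_000_000, 100)
    print(f"About {size / 1e9:.0f} GB uncompressed")
```

Compression reduces this considerably, but aligned reads, variant calls, and backups add to the total, so the per-run figure is only a starting point for capacity planning.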
This is an exciting time for genetics research, and we are only just beginning to harness the potential of NGS technologies in applications ranging from cancer genomics to evolutionary biology to personal health genetics. In short, NGS is a rapidly moving field that is increasingly within the reach of individual investigators.