DNA sequencing
== High-throughput methods ==
[[File:Mapping Reads.png|thumb|right|Multiple, fragmented sequence reads must be assembled together on the basis of their overlapping areas.]]
High-throughput sequencing, which includes next-generation "short-read" and third-generation "long-read" sequencing methods,<ref group="nt">"Next-generation" remains in broad use as of 2019. For instance, {{cite journal|vauthors=Straiton J, Free T, Sawyer A, Martin J|date=February 2019|title=From Sanger Sequencing to Genome Databases and Beyond|journal=BioTechniques|volume=66|issue=2|pages=60–63|doi=10.2144/btn-2019-0011|pmid=30744413|quote=Next-generation sequencing (NGS) technologies have revolutionized genomic research. (opening sentence of the article)|doi-access=free}}</ref> applies to [[exome sequencing]], genome sequencing, genome resequencing, [[transcriptome]] profiling ([[RNA-Seq]]), DNA-protein interactions ([[ChIP-sequencing]]), and [[epigenome]] characterization.<ref name="pmid19900591">{{cite journal | vauthors = de Magalhães JP, Finch CE, Janssens G | title = Next-generation sequencing in aging research: emerging applications, problems, pitfalls and possible solutions | journal = [[Ageing Research Reviews]] | volume = 9 | issue = 3 | pages = 315–23 | year = 2010 | pmid = 19900591 | pmc = 2878865 | doi = 10.1016/j.arr.2009.10.006 }}</ref> The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that [[multiplex (assay)|parallelize]] the sequencing process, producing thousands or millions of sequences concurrently.<ref name="pmid23856935">{{cite journal | vauthors = Grada A | title = Next-generation sequencing: methodology and application | journal = J Invest Dermatol | volume = 133 | issue = 8 | pages = e11 | date = August 2013 | pmid = 23856935 | doi = 10.1038/jid.2013.248 | doi-access = free }}</ref><ref name=hall2007>{{cite journal | vauthors = Hall N | title = Advanced sequencing technologies and their wider impact in 
microbiology | journal = [[J. Exp. Biol.]] | volume = 210| issue = Pt 9 | pages = 1518–25 | date = May 2007 | pmid = 17449817 | doi = 10.1242/jeb.001370 | doi-access = free | bibcode = 2007JExpB.210.1518H }}{{open access}}</ref><ref name=church2006>{{cite journal | vauthors = Church GM | title = Genomes for all | journal = [[Sci. Am.]] | volume = 294 | issue = 1 | pages = 46–54 | date = January 2006 | pmid = 16468433 | doi = 10.1038/scientificamerican0106-46 | author-link1 = George M. Church | bibcode = 2006SciAm.294a..46C | s2cid = 28769137 }}{{subscription required}}</ref> High-throughput sequencing technologies are intended to lower the cost of DNA sequencing beyond what is possible with standard dye-terminator methods.<ref name="pmid18165802">{{cite journal | vauthors = Schuster SC | title = Next-generation sequencing transforms today's biology | journal = Nat. Methods | volume = 5 | issue = 1 | pages = 16–18 | date = January 2008 | pmid = 18165802 | doi = 10.1038/nmeth1156 | s2cid = 1465786 }}</ref> In ultra-high-throughput sequencing as many as 500,000 sequencing-by-synthesis operations may be run in parallel.<ref name=kalb1992>{{cite book | title = Massively Parallel, Optical, and Neural Computing in the United States | first1 = Gilbert | last1 = Kalb | first2 = Robert | last2 = Moxley | publisher = [[IOS Press]] | year = 1992 | isbn = 978-90-5199-097-3 }}{{Page needed|date=June 2013}}</ref><ref name=tenBosch2008>{{cite journal | vauthors = ten Bosch JR, Grody WW | title = Keeping Up with the Next Generation | journal = The Journal of Molecular Diagnostics | volume = 10 | issue = 6 | pages = 484–92 | year = 2008 | pmid = 18832462 | pmc = 2570630 | doi = 10.2353/jmoldx.2008.080027 }}{{open access}}</ref><ref name=Tucker2009>{{cite journal | vauthors = Tucker T, Marra M, Friedman JM | title = Massively Parallel Sequencing: The Next Big Thing in Genetic Medicine | journal = The American Journal of Human Genetics | volume = 85 | issue = 2 | pages = 142–54 | year 
= 2009 | pmid = 19679224 | pmc = 2725244 | doi = 10.1016/j.ajhg.2009.06.022 }}{{open access}}</ref> Such technologies led to the ability to sequence an entire human genome in as little as one day.<ref name=":2">{{cite journal | vauthors = Straiton J, Free T, Sawyer A, Martin J | title = From Sanger sequencing to genome databases and beyond | journal = BioTechniques | volume = 66 | issue = 2 | pages = 60–63 | date = February 2019 | pmid = 30744413 | doi = 10.2144/btn-2019-0011 | publisher = Future Science | doi-access = free }}</ref> {{As of|2019||alt=As of 2019|lc=|since=}}, corporate leaders in the development of high-throughput sequencing products included [[Illumina, Inc.|Illumina]], [[Qiagen]] and [[Thermo Fisher Scientific|ThermoFisher Scientific]].<ref name=":2" />
{| class="wikitable" style="font-size:0.9em;"
|+ Comparison of high-throughput sequencing methods<ref name=quail2012>{{cite journal | vauthors = Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Gu Y | title = A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and illumina MiSeq sequencers | journal = [[BMC Genomics]] | volume = 13 | issue = 1 | page = 341 | date = 1 January 2012 | pmid = 22827831 | pmc = 3431227 | doi = 10.1186/1471-2164-13-341 | doi-access = free }}{{open access}}</ref><ref name=lin2012>{{cite journal | vauthors = Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M | title = Comparison of Next-Generation Sequencing Systems | journal = Journal of Biomedicine and Biotechnology | volume = 2012 | pages = 251364 | date = 1 January 2012 | pmid = 22829749 | doi = 10.1155/2012/251364 | pmc=3398667| doi-access = free }}{{open access}}</ref>
! Method !! '''Read length''' !! '''Accuracy (single read not consensus)''' !! '''Reads per run''' !! '''Time per run''' !! '''Cost per 1 billion bases (in US$)''' !! '''Advantages''' !! 
'''Disadvantages'''
|-
| '''Single-molecule real-time sequencing (Pacific Biosciences)''' ||30,000 bp ([[N50 statistic|N50]]); maximum read length >100,000 bases<ref name="sequel21">{{cite web|url=https://www.pacb.com/blog/new-software-polymerase-sequel-system-boost-throughput-affordability/|title=New Software, Polymerase for Sequel System Boost Throughput and Affordability – PacBio|date=7 March 2018}}</ref><ref name="autogenerated1">{{cite web |url=http://www.genomeweb.com/sequencing/after-year-testing-two-early-pacbio-customers-expect-more-routine-use-rs-sequenc |title=After a Year of Testing, Two Early PacBio Customers Expect More Routine Use of RS Sequencer in 2012 |author=<!--Staff writer(s); no by-line.--> |date=10 January 2012 |publisher=GenomeWeb }}{{registration required}}</ref><ref>{{cite press release|url=http://globenewswire.com/news-release/2013/10/03/577891/10051072/en/Pacific-Biosciences-Introduces-New-Chemistry-With-Longer-Read-Lengths-to-Detect-Novel-Features-in-DNA-Sequence-and-Advance-Genome-Studies-of-Large-Organisms.html|title=Pacific Biosciences Introduces New Chemistry With Longer Read Lengths to Detect Novel Features in DNA Sequence and Advance Genome Studies of Large Organisms|year=2013}}</ref>
| 87% raw-read accuracy<ref name="pmid23644548">{{cite journal | vauthors = Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, Turner SW, Korlach J | title = Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data | journal = Nat. 
Methods | volume = 10 | issue = 6 | pages = 563–69 | year = 2013 | pmid = 23644548 | doi = 10.1038/nmeth.2474 | s2cid = 205421576 }}</ref>|| 4,000,000 per Sequel 2 SMRT cell, 100–200 gigabases<ref name="sequel21" /><ref name="flxlexblog.wordpress.com">{{cite web|url=http://flxlexblog.wordpress.com/2013/07/05/de-novo-bacterial-genome-assembly-a-solved-problem/|title=De novo bacterial genome assembly: a solved problem?|date=5 July 2013}}</ref><ref name=rasko2011>{{cite journal | vauthors = Rasko DA, Webster DR, Sahl JW, Bashir A, Boisen N, Scheutz F, Paxinos EE, Sebra R, Chin CS, Iliopoulos D, Klammer A, Peluso P, Lee L, Kislyuk AO, Bullard J, Kasarskis A, Wang S, Eid J, Rank D, Redman JC, Steyert SR, Frimodt-Møller J, Struve C, Petersen AM, Krogfelt KA, Nataro JP, Schadt EE, Waldor MK | title = Origins of the Strain Causing an Outbreak of Hemolytic–Uremic Syndrome in Germany | journal = [[N Engl J Med]] | volume = 365 | issue = 8 | pages = 709–17 | date = 25 August 2011 | pmid = 21793740 | doi = 10.1056/NEJMoa1106920 | pmc=3168948}}{{open access}}</ref>|| 30 minutes to 20 hours<ref name="sequel21"/><ref name=tran2012>{{cite journal | vauthors = Tran B, Brown AM, Bedard PL, Winquist E, Goss GD, Hotte SJ, Welch SA, Hirte HW, Zhang T, Stein LD, Ferretti V, Watt S, Jiao W, Ng K, Ghai S, Shaw P, Petrocelli T, Hudson TJ, Neel BG, Onetto N, Siu LL, McPherson JD, Kamel-Reid S, Dancey JE | title = Feasibility of real time next generation sequencing of cancer genes linked to drug response: Results from a clinical trial | journal = [[Int. J. Cancer]] | volume = 132 | issue = 7 | pages = 1547–55 | date = 1 January 2012 | pmid = 22948899 | doi = 10.1002/ijc.27817 | s2cid = 72705 | author-link18 = Thomas J. Hudson | author-link10 = Lincoln Stein | doi-access = free }}{{subscription required}}</ref> ||$7.2–$43.3
| Fast. 
Detects 4mC, 5mC, 6mA.<ref>{{cite journal | vauthors = Murray IA, Clark TA, Morgan RD, Boitano M, Anton BP, Luong K, Fomenkov A, Turner SW, Korlach J, Roberts RJ | title = The methylomes of six bacteria | journal = Nucleic Acids Research | volume = 40 | issue = 22 | pages = 11450–62 | date = 2 October 2012 | pmid = 23034806 | pmc = 3526280 | doi = 10.1093/nar/gks891 }}</ref> || Moderate throughput. Equipment can be very expensive.
|-
| '''Ion semiconductor (Ion Torrent sequencing)''' || up to 600 bp<ref>{{cite web|url=https://www.thermofisher.com/order/catalog/product/A30670|title=Ion 520 & Ion 530 ExT Kit-Chef – Thermo Fisher Scientific|website=thermofisher.com}}</ref> || 99.6%<ref>{{Cite web |url=http://129.130.90.13/ion-docs/GUID-C6419130-57D8-4DE2-BCF8-47157CB3C9A2.html |title=Raw accuracy |access-date=29 March 2018 |archive-url=https://web.archive.org/web/20180330075720/http://129.130.90.13/ion-docs/GUID-C6419130-57D8-4DE2-BCF8-47157CB3C9A2.html |archive-date=30 March 2018 |url-status=dead}}</ref> || up to 80 million || 2 hours || $66.8–$950 || Less expensive equipment. Fast. || Homopolymer errors.
|-
| '''Pyrosequencing (454)''' || 700 bp || 99.9% || 1 million || 24 hours || $10,000 || Long read size. Fast. || Runs are expensive. Homopolymer errors. 
|-
| '''Sequencing by synthesis (Illumina)''' ||MiniSeq, NextSeq: 75–300 bp; MiSeq: 50–600 bp; HiSeq 2500: 50–500 bp; HiSeq 3/4000: 50–300 bp; HiSeq X: 300 bp
| 99.9% (Phred30) || MiniSeq/MiSeq: 1–25 million; NextSeq: 130-00 Million; HiSeq 2500: 300 million – 2 billion; HiSeq 3/4000: 2.5 billion; HiSeq X: 3 billion
| 1 to 11 days, depending upon sequencer and specified read length<ref name=vliet2010>{{cite journal | vauthors = van Vliet AH | title = Next generation sequencing of microbial transcriptomes: challenges and opportunities | journal = [[FEMS Microbiology Letters]] | volume = 302 | issue = 1 | pages = 1–7 | date = 1 January 2010 | pmid = 19735299 | doi = 10.1111/j.1574-6968.2009.01767.x | doi-access = free }}{{open access}}</ref> || $5 to $150 || Potential for high sequence yield, depending upon sequencer model and desired application. || Equipment can be very expensive. Requires high concentrations of DNA.
|-
|'''Combinatorial probe anchor synthesis (cPAS- BGI/MGI)'''
|BGISEQ-50: 35–50 bp; MGISEQ 200: 50–200 bp; BGISEQ-500, MGISEQ-2000: 50–300 bp<ref>{{cite web|url=http://en.mgitech.cn/product/30.html|title=BGI and MGISEQ|website=en.mgitech.cn|access-date=2018-07-05}}</ref>
|99.9% (Phred30)
|BGISEQ-50: 160M; MGISEQ 200: 300M; BGISEQ-500: 1300M per flow cell; MGISEQ-2000: 375M FCS flow cell, 1500M FCL flow cell per flow cell.
|1 to 9 days depending on instrument, read length and number of flow cells run at a time.
|$5–$120
|
|
|-
| '''Sequencing by ligation (SOLiD sequencing)''' || 50+35 or 50+50 bp || 99.9% || 1.2 to 1.4 billion || 1 to 2 weeks || $60–130 || Low cost per base. || Slower than other methods. 
Has issues sequencing palindromic sequences.<ref name="Yu-Feng Huang, Sheng-Chung Chen, Yih-Shien Chiang, Tzu-Han Chen & Kuo-Ping Chiu 2012 S10">{{cite journal | vauthors = Huang YF, Chen SC, Chiang YS, Chen TH, Chiu KP | title = Palindromic sequence impedes sequencing-by-ligation mechanism | journal = [[BMC Systems Biology]] | volume = 6 | pages = S10 | year = 2012 | issue = Suppl 2 | pmid = 23281822 | doi = 10.1186/1752-0509-6-S2-S10 | pmc=3521181 | doi-access = free }}</ref>
|-
| '''Nanopore Sequencing''' || Dependent on library preparation, not the device, so user chooses read length (up to 2,272,580 bp reported<ref>{{Cite bioRxiv |biorxiv=10.1101/312256| title=Whale watching with BulkVis: A graphical viewer for Oxford Nanopore bulk fast5 files | date=3 May 2018| last1=Loose| first1=Matthew| last2=Rakyan| first2=Vardhman| last3=Holmes| first3=Nadine| last4=Payne| first4=Alexander}}</ref>). || ~92–97% single read || Dependent on read length selected by user || Data streamed in real time; run time chosen by user, from 1 min to 48 hrs || $7–100 || Longest individual reads. Accessible user community. Portable (palm-sized). || Lower throughput than other machines. Single-read accuracy in the 90% range.
|-
|'''GenapSys Sequencing'''
|Around 150 bp single-end
|99.9% (Phred30)
|1 to 16 million
|Around 24 hours
|$667
|Low instrument cost ($10,000)
|
|-
|'''Chain termination (Sanger sequencing)'''|| 400 to 900 bp || 99.9% || N/A || 20 minutes to 3 hours || $2,400,000 || Useful for many applications. || More expensive and impractical for larger sequencing projects. This method also requires the time-consuming step of plasmid cloning or PCR.
|}

=== Long-read sequencing methods ===
{{Further|Long-read sequencing}}

==== Single molecule real time (SMRT) sequencing ====
{{Main|Single-molecule real-time sequencing}}
SMRT sequencing is based on the sequencing by synthesis approach. 
The DNA is synthesized in zero-mode wave-guides (ZMWs) – small well-like containers with the capturing tools located at the bottom of the well. The sequencing is performed using an unmodified polymerase (attached to the ZMW bottom) and fluorescently labelled nucleotides flowing freely in the solution. The wells are constructed so that only the fluorescence occurring at the bottom of the well is detected. The fluorescent label is detached from the nucleotide upon its incorporation into the DNA strand, leaving an unmodified DNA strand. According to [[Pacific Biosciences]] (PacBio), the SMRT technology developer, this methodology allows detection of nucleotide modifications (such as cytosine methylation) through the observation of polymerase kinetics. This approach allows reads of 20,000 nucleotides or more, with average read lengths of 5 kilobases.<ref name="flxlexblog.wordpress.com"/><ref>{{cite web|url=http://www.genomeweb.com/sequencing/pacbio-sales-start-pick-company-delivers-product-enhancements|title=PacBio Sales Start to Pick Up as Company Delivers on Product Enhancements|date=12 February 2013}}</ref> In 2015, Pacific Biosciences announced the launch of a new sequencing instrument called the Sequel System, with 1 million ZMWs compared to 150,000 ZMWs in the PacBio RS II instrument.<ref>{{cite web|url=http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx|title=Bio-IT World|website=bio-itworld.com|access-date=16 November 2015|archive-date=29 July 2020|archive-url=https://web.archive.org/web/20200729220749/http://www.bio-itworld.com/2015/9/30/pacbio-announces-sequel-sequencing-system.aspx|url-status=dead}}</ref><ref>{{cite web|url=https://www.genomeweb.com/business-news/pacbio-launches-higher-throughput-lower-cost-single-molecule-sequencing-system|title=PacBio Launches Higher-Throughput, Lower-Cost Single-Molecule Sequencing System|date=October 2015}}</ref> SMRT sequencing is referred to as "[[Third-generation 
sequencing|third-generation]]" or "long-read" sequencing.

==== Nanopore DNA sequencing ====
{{Main|Nanopore sequencing}}
The DNA passing through the nanopore changes its ion current. This change is dependent on the shape, size and length of the DNA sequence. Each type of the nucleotide blocks the ion flow through the pore for a different period of time. The method does not require modified nucleotides and is performed in real time. Nanopore sequencing is referred to as "[[Third-generation sequencing|third-generation]]" or "long-read" sequencing, along with SMRT sequencing. Early industrial research into this method was based on a technique called 'exonuclease sequencing', where the readout of electrical signals occurred as nucleotides passed by [[hemolysin|alpha(α)-hemolysin]] pores covalently bound with [[cyclodextrin]].<ref>{{cite journal | vauthors = Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S, Bayley H | title = Continuous base identification for single-molecule nanopore DNA sequencing | journal = Nature Nanotechnology | volume = 4 | issue = 4 | pages = 265–70 | date = April 2009 | pmid = 19350039 | doi = 10.1038/nnano.2009.12 | bibcode = 2009NatNa...4..265C }}</ref> However, the subsequent commercial method, 'strand sequencing', sequenced DNA bases in an intact strand. Two main areas of nanopore sequencing in development are solid-state nanopore sequencing and protein-based nanopore sequencing. 
Protein nanopore sequencing utilizes membrane protein complexes such as α-hemolysin, MspA (''[[Mycobacterium smegmatis]]'' Porin A) or CsgG, which show great promise given their ability to distinguish between individual nucleotides and groups of nucleotides.<ref name="Torre 2012">{{cite journal|year=2012|title=Fabrication and characterization of solid-state nanopore arrays for high-throughput DNA sequencing|journal=Nanotechnology|volume=23|issue=38|page=385308|bibcode=2012Nanot..23L5308D|doi=10.1088/0957-4484/23/38/385308|pmc=3557807|pmid=22948520|vauthors=dela Torre R, Larkin J, Singer A, Meller A}}</ref> In contrast, solid-state nanopore sequencing utilizes synthetic materials such as silicon nitride and aluminum oxide and it is preferred for its superior mechanical ability and thermal and chemical stability.<ref name="Pathak 2012">{{cite journal|year=2012|title=Double-functionalized nanopore-embedded gold electrodes for rapid DNA sequencing|url=https://zenodo.org/record/890231|journal=Applied Physics Letters|volume=100|issue=2|page=023701|doi=10.1063/1.3673335|vauthors=Pathak B, Lofas H, Prasongkit J, Grigoriev A, Ahuja R, Scheicher RH|bibcode=2012ApPhL.100b3701P}}</ref> The fabrication method is essential for this type of sequencing given that the nanopore array can contain hundreds of pores with diameters smaller than eight nanometers.<ref name="Torre 2012" /> The concept originated from the idea that single-stranded DNA or RNA molecules can be electrophoretically driven in a strict linear sequence through a biological pore that can be less than eight nanometers, and can be detected given that the molecules release an ionic current while moving through the pore. 
The pore contains a detection region capable of recognizing different bases, with each base generating various time-specific signals corresponding to the sequence of bases as they cross the pore, which are then evaluated.<ref name="Pathak 2012" /> Precise control over the DNA transport through the pore is crucial for success. Various enzymes such as exonucleases and polymerases have been used to moderate this process by positioning them near the pore's entrance.<ref name="Korlach 2008">{{cite journal|year=2008|title=Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures|journal=Proceedings of the National Academy of Sciences|volume=105|issue=4|pages=1176–81|bibcode=2008PNAS..105.1176K|doi=10.1073/pnas.0710982105|pmc=2234111|pmid=18216253|vauthors=Korlach J, Marks PJ, Cicero RL, Gray JJ, Murphy DL, Roitman DB, Pham TT, Otto GA, Foquet M, Turner SW|doi-access=free}}</ref>

=== Short-read sequencing methods {{Anchor|Next-generation methods}} ===
{{Further|Short-read sequencing}}
<!-- NB. Next-generation sequencing redirects to this section -->

==== Massively parallel signature sequencing (MPSS) ====
The first of the high-throughput sequencing technologies, [[massively parallel signature sequencing]] (or MPSS, also called next generation sequencing), was developed in the 1990s at Lynx Therapeutics, a company founded in 1992 by [[Sydney Brenner]] and [[Applied Biosystems#History|Sam Eletr]]. MPSS was a bead-based method that used a complex approach of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. This method made it susceptible to sequence-specific bias or loss of specific sequences. Because the technology was so complex, MPSS was only performed 'in-house' by Lynx Therapeutics and no DNA sequencing machines were sold to independent laboratories. 
Lynx Therapeutics merged with Solexa (later acquired by [[Illumina (company)|Illumina]]) in 2004, leading to the development of sequencing-by-synthesis, a simpler approach acquired from [[Manteia Predictive Medicine]], which rendered MPSS obsolete. However, the essential properties of the MPSS output were typical of later high-throughput data types, including hundreds of thousands of short DNA sequences. In the case of MPSS, these were typically used for sequencing [[cDNA]] for measurements of [[gene expression]] levels.<ref name="Brenner_2000"/>

==== Polony sequencing ====
{{Main|Polony sequencing}}
The [[polony sequencing]] method, developed in the laboratory of [[George M. Church]] at Harvard, was among the first high-throughput sequencing systems and was used to sequence a full ''[[E. coli]]'' genome in 2005.<ref name=Shendure2005>{{cite journal | vauthors = Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, Wang MD, Zhang K, Mitra RD, Church GM | title = Accurate multiplex polony sequencing of an evolved bacterial genome. | journal = Science | volume = 309 | issue = 5741 | pages = 1728–32 | date = 9 September 2005 | pmid = 16081699 | doi = 10.1126/science.1117389 | bibcode = 2005Sci...309.1728S | s2cid = 11405973 | doi-access = free }}</ref> It combined an in vitro paired-tag library with emulsion PCR, an automated microscope, and ligation-based sequencing chemistry to sequence an ''E. coli'' genome at an accuracy of >99.9999% and a cost approximately 1/9 that of Sanger sequencing.<ref name=Shendure2005 /> The technology was licensed to Agencourt Biosciences, subsequently spun out into Agencourt Personal Genomics, and eventually incorporated into the [[Applied Biosystems]] SOLiD platform. Applied Biosystems was later acquired by [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]], now part of [[Thermo Fisher Scientific]]. 
==== 454 pyrosequencing ====
{{Main|454 Life Sciences#Technology}}
A parallelized version of [[pyrosequencing]] was developed by [[454 Life Sciences]], which has since been acquired by [[Roche Diagnostics]]. The method amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many [[picoliter]]-volume wells each containing a single bead and sequencing enzymes. Pyrosequencing uses [[luciferase]] to generate light for detection of the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence [[read (biology)|reads]].<ref name="Margulies_2005"/> This technology provides intermediate read length and price per base compared to Sanger sequencing on one end and Solexa and SOLiD on the other.<ref name="pmid18165802"/>

==== Illumina (Solexa) sequencing ====
{{Main|Illumina dye sequencing}}
[[Solexa]], now part of [[Illumina (company)|Illumina]], was founded by [[Shankar Balasubramanian]] and [[David Klenerman]] in 1998, and developed a sequencing method based on reversible dye-terminator technology and engineered polymerases.<ref name = "Bentley_2008"/> The reversible-terminator chemistry concept was invented by Bruno Canard and Simon Sarfati at the Pasteur Institute in Paris.<ref>{{Citation|last1=Canard|first1=Bruno|last2=Sarfati|first2=Simon | name-list-style = vanc |title=Novel derivatives usable for the sequencing of nucleic acids|date=13 October 1994|url=http://www.google.ge/patents/CA2158975A1|access-date=2016-03-09}}</ref><ref>{{cite journal | vauthors = Canard B, Sarfati RS | title = DNA polymerase fluorescent substrates with reversible 3'-tags | journal = Gene | volume = 148 | issue = 1 | pages = 1–6 | date = October 1994 | pmid = 7523248 | doi = 10.1016/0378-1119(94)90226-7 }}</ref> It was developed internally at Solexa by those named on the relevant 
patents. In 2004, Solexa acquired the company [[Manteia Predictive Medicine]] in order to gain a massively parallel sequencing technology invented in 1997 by [[Pascal Mayer]] and Laurent Farinelli.<ref name=DNA_colony_patents /> It is based on "DNA clusters" or "DNA colonies", which involves the clonal amplification of DNA on a surface. The cluster technology was co-acquired with Lynx Therapeutics of California. Solexa Ltd. later merged with Lynx to form Solexa Inc. [[File:Illumina HiSeq 2500.jpg|thumb|An Illumina HiSeq 2500 sequencer]] [[File:Illumina NovaSeq 6000 flow cell.jpg|thumb|Illumina NovaSeq 6000 flow cell]] In this method, DNA molecules and primers are first attached on a slide or flow cell and amplified with [[polymerase]] so that local clonal DNA colonies, later coined "DNA clusters", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the [[Fluorescent labeling|fluorescently labeled]] nucleotides. Then the dye, along with the terminal 3' blocker, is chemically removed from the DNA, allowing for the next cycle to begin. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera. [[File:Illumina MiSeq sequencer.jpg|thumb|An Illumina MiSeq sequencer]] Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity. With an optimal configuration, the ultimately reachable instrument throughput is thus dictated solely by the analog-to-digital conversion rate of the camera, multiplied by the number of cameras and divided by the number of pixels per DNA colony required for visualizing them optimally (approximately 10 pixels/colony). 
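The throughput relation described above (camera A/D conversion rate, times number of cameras, divided by pixels per colony) can be checked with a short calculation. The following Python sketch is illustrative only: the 10 MHz conversion rate and the roughly 3.1 billion base human genome size are assumptions chosen for the example, each imaged colony is assumed to yield one base per cycle, and chemistry and fluidics time are ignored.

```python
# Back-of-envelope imaging throughput for a camera-based sequencer.
# Assumptions (not instrument specifications): one base per colony per image,
# no time spent on chemistry or fluidics, ~3.1e9 bases in a human genome.

HUMAN_GENOME_BASES = 3.1e9  # assumed approximate genome size


def imaging_throughput(adc_rate_hz, cameras=1, pixels_per_colony=10.0):
    """Bases per second = A/D rate * cameras / pixels needed per colony."""
    return adc_rate_hz * cameras / pixels_per_colony


def hours_for_coverage(bases_per_second, coverage, genome=HUMAN_GENOME_BASES):
    """Hours of imaging needed to reach the given fold coverage."""
    return coverage * genome / bases_per_second / 3600.0


rate = imaging_throughput(10e6)  # assumed 10 MHz camera, 10 pixels per colony
print(f"{rate:.0f} bases/second")
print(f"1x coverage:  {hours_for_coverage(rate, 1):.2f} hours")
print(f"30x coverage: {hours_for_coverage(rate, 30):.1f} hours")
```

Under these assumptions a single camera reads about one million bases per second, so 1x coverage of a human genome takes a little under an hour and 30x coverage takes roughly a day.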
In 2012, with cameras operating at more than 10 MHz A/D conversion rates and available optics, fluidics and enzymatics, throughput could reach multiples of 1 million nucleotides/second, corresponding roughly to 1 human genome equivalent at 1x [[Coverage (genetics)|coverage]] per hour per instrument, and 1 human genome re-sequenced (at approx. 30x) per day per instrument (equipped with a single camera).<ref name="pmid18576944">{{cite journal | vauthors = Mardis ER | title = Next-generation DNA sequencing methods | journal = Annu Rev Genom Hum Genet | volume = 9 | pages = 387–402 | year = 2008 | pmid = 18576944 | doi = 10.1146/annurev.genom.9.081307.164359 }}</ref>

====Combinatorial probe anchor synthesis (cPAS)====
This method is an upgraded version of the combinatorial probe anchor ligation technology (cPAL) described by [[Complete Genomics]]<ref name=":0">{{cite journal | vauthors = Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, Carnevali P, Nazarenko I, Nilsen GB, Yeung G, Dahl F, Fernandez A, Staker B, Pant KP, Baccash J, Borcherding AP, Brownley A, Cedeno R, Chen L, Chernikoff D, Cheung A, Chirita R, Curson B, Ebert JC, Hacker CR, Hartlage R, Hauser B, Huang S, Jiang Y, Karpinchyk V, Koenig M, Kong C, Landers T, Le C, Liu J, McBride CE, Morenzoni M, Morey RE, Mutch K, Perazich H, Perry K, Peters BA, Peterson J, Pethiyagoda CL, Pothuraju K, Richter C, Rosenbaum AM, Roy S, Shafto J, Sharanhovich U, Shannon KW, Sheppy CG, Sun M, Thakuria JV, Tran A, Vu D, Zaranek AW, Wu X, Drmanac S, Oliphant AR, Banyai WC, Martin B, Ballinger DG, Church GM, Reid CA | display-authors = 6 | title = Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays | journal = Science | volume = 327 | issue = 5961 | pages = 78–81 | date = January 2010 | pmid = 19892942 | doi = 10.1126/science.1181498 | bibcode = 2010Sci...327...78D | s2cid = 17309571 | doi-access = free }}</ref> which has since become part of Chinese genomics company [[Beijing 
Genomics Institute|BGI]] in 2013.<ref>{{cite web|url=http://www.completegenomics.com/|title=About Us – Complete Genomics|last=brandonvd|website=Complete Genomics|access-date=2018-07-02}}</ref> The two companies have refined the technology to allow for longer read lengths, reaction time reductions and faster time to results. In addition, data are now generated as contiguous full-length reads in the standard FASTQ file format and can be used as-is in most short-read-based bioinformatics analysis pipelines.<ref name=":1">{{cite journal | vauthors = Huang J, Liang X, Xuan Y, Geng C, Li Y, Lu H, Qu S, Mei X, Chen H, Yu T, Sun N, Rao J, Wang J, Zhang W, Chen Y, Liao S, Jiang H, Liu X, Yang Z, Mu F, Gao S | display-authors = 6 | title = A reference human genome dataset of the BGISEQ-500 sequencer | journal = GigaScience | volume = 6 | issue = 5 | pages = 1–9 | date = May 2017 | pmid = 28379488 | pmc = 5467036 | doi = 10.1093/gigascience/gix024 }}</ref>{{citation needed|date=July 2018}} The two technologies that form the basis for this high-throughput sequencing technology are [[DNA nanoball sequencing|DNA nanoballs]] (DNB) and patterned arrays for nanoball attachment to a solid surface.<ref name=":0" /> DNA nanoballs are formed by denaturing double-stranded, adapter-ligated libraries and ligating the forward strand only to a splint oligonucleotide to form a ssDNA circle. Faithful copies of the circles containing the DNA insert are produced utilizing Rolling Circle Amplification that generates approximately 300–500 copies. The long strand of ssDNA folds upon itself to produce a three-dimensional nanoball structure that is approximately 220 nm in diameter. 
Making DNBs replaces the need to generate PCR copies of the library on the flow cell and as such can remove large proportions of duplicate reads, adapter-adapter ligations and PCR-induced errors.<ref name=":1" />{{citation needed|date=July 2018}} [[File:MGISEQ-2000RS.jpg|thumb|A BGI MGISEQ-2000RS sequencer]] The patterned array of positively charged spots is fabricated through photolithography and etching techniques followed by chemical modification to generate a sequencing flow cell. Each spot on the flow cell is approximately 250 nm in diameter and is separated from its neighbours by 700 nm (centre to centre); this allows easy attachment of a single negatively charged DNB to each spot, reducing under- or over-clustering on the flow cell.<ref name=":0" />{{citation needed|date=July 2018}} Sequencing is then performed by addition of an oligonucleotide probe that attaches in combination to specific sites within the DNB. The probe acts as an anchor that then allows one of four single reversibly inactivated, labelled nucleotides to bind after flowing across the flow cell. Unbound nucleotides are washed away; laser excitation then causes the attached labels to emit fluorescence, and the signal is captured by cameras and converted to a digital output for base calling. The attached base has its terminator and label chemically cleaved at completion of the cycle. The cycle is repeated with another flow of free, labelled nucleotides across the flow cell to allow the next nucleotide to bind and have its signal captured. This process is completed a number of times (usually 50 to 300 times) to determine the sequence of the inserted piece of DNA at a rate of approximately 40 million nucleotides per second as of 2018.{{citation needed|date=July 2018}}

==== SOLiD sequencing ====
[[File:Library preparation for the SOLiD platform.svg|right|thumb|Library preparation for the SOLiD platform]]
{{Main|ABI Solid Sequencing}}
[[File:Two-base encoding scheme.pdf|thumb|Two-base encoding scheme. 
In two-base encoding, each unique pair of bases on the 3' end of the probe is assigned one out of four possible colors. For example, "AA" is assigned to blue, "AC" is assigned to green, and so on for all 16 unique pairs. During sequencing, each base in the template is sequenced twice, and the resulting data are decoded according to this scheme.]]
[[Applied Biosystems]]' (now a [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]] brand) SOLiD technology employs [[sequencing by ligation]]. Here, a pool of all possible oligonucleotides of a fixed length is labeled according to the sequenced position. Oligonucleotides are annealed and ligated; the preferential ligation by [[DNA ligase]] for matching sequences results in a signal informative of the nucleotide at that position. Each base in the template is sequenced twice, and the resulting data are decoded according to the [[2 base encoding]] scheme used in this method. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each carrying copies of the same DNA molecule, are deposited on a glass slide.<ref name="pmid18477713">{{cite journal | vauthors = Valouev A, Ichikawa J, Tonthat T, Stuart J, Ranade S, Peckham H, Zeng K, Malek JA, Costa G, McKernan K, Sidow A, Fire A, Johnson SM | title = A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning | journal = Genome Res.
| volume = 18 | issue = 7 | pages = 1051–63 | date = July 2008 | pmid = 18477713 | pmc = 2493394 | doi = 10.1101/gr.076463.108 }}</ref> The result is sequences of quantities and lengths comparable to Illumina sequencing.<ref name="pmid18165802"/> This [[sequencing by ligation]] method has been reported to have some difficulty sequencing palindromic sequences.<ref name="Yu-Feng Huang, Sheng-Chung Chen, Yih-Shien Chiang, Tzu-Han Chen & Kuo-Ping Chiu 2012 S10"/>

==== Ion Torrent semiconductor sequencing ====
{{Main|Ion semiconductor sequencing}}
Ion Torrent Systems Inc. (now owned by [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]]) developed a system based on using standard sequencing chemistry, but with a novel, semiconductor-based detection system. This method of sequencing is based on the detection of [[hydrogen ion]]s that are released during the [[DNA polymerase|polymerisation]] of [[DNA]], as opposed to the optical methods used in other sequencing systems. A microwell containing a template DNA strand to be sequenced is flooded with a single type of [[nucleotide]]. If the introduced nucleotide is [[complementarity (molecular biology)|complementary]] to the leading template nucleotide, it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If [[homopolymer]] repeats are present in the template sequence, multiple nucleotides will be incorporated in a single cycle.
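The flow-by-flow logic of this detection scheme can be illustrated with a minimal Python sketch. It is an idealized model and not the vendor's base-calling software: the function names, the cyclic flow order <code>TACG</code>, and the noise-free integer signal are all assumptions made for illustration, and the template is assumed to be written in the order in which the polymerase reads it.

```python
# Idealized model of flow-based (ion semiconductor style) sequencing.
# Assumptions for illustration only: nucleotides are flowed in a fixed
# cyclic order, and the signal equals the exact number of bases added.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flow_signals(template, flow_order="TACG", max_flows=100):
    """Return a list of (nucleotide, bases_incorporated) for each flow."""
    # Bases the polymerase must add, in order, to copy the template.
    to_add = [COMPLEMENT[base] for base in template]
    position = 0
    signals = []
    for flow in range(max_flows):
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A homopolymer run is extended completely within a single flow,
        # so the signal grows with the length of the run.
        while position < len(to_add) and to_add[position] == nucleotide:
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
        if position == len(to_add):
            break
    return signals


def call_bases(signals):
    """Reconstruct the synthesized strand from the per-flow signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)


if __name__ == "__main__":
    signals = flow_signals("TAGGCT")
    print(signals)
    print(call_bases(signals))
```

For the template <code>TAGGCT</code>, the flow of C reports a signal of 2 because the two adjacent G bases in the template are copied in the same flow. In practice the measured signal is analog and noisy, which is why long homopolymer runs are harder to call than this sketch suggests.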
Such multiple incorporation leads to a corresponding number of released hydrogen ions and a proportionally higher electronic signal.<ref name="rusk">{{cite journal | vauthors = Rusk N | year = 2011 | title = Torrents of sequence | journal = Nat Methods | volume = 8 | issue = 1| page = 44 | doi=10.1038/nmeth.f.330| s2cid = 41040192 | doi-access = free }}</ref>
[[File:From second to fourth-generation sequencing, illustration on TAGGCT template.svg|thumb|right| Sequencing of the TAGGCT template with IonTorrent, PacBioRS and GridION]]

==== DNA nanoball sequencing ====
{{Main|DNA nanoball sequencing}}
[[DNA nanoball sequencing]] is a type of high-throughput sequencing technology used to determine the entire [[genomic sequence]] of an organism. The company [[Complete Genomics]] uses this technology to sequence samples submitted by independent researchers. The method uses [[rolling circle replication]] to amplify small fragments of genomic DNA into DNA nanoballs. Unchained sequencing by ligation is then used to determine the nucleotide sequence.<ref name = "Drmanac_2010" /> This method of DNA sequencing allows large numbers of DNA nanoballs to be sequenced per run and at low [[reagent]] costs compared to other high-throughput sequencing platforms.<ref>{{cite journal | vauthors = Porreca GJ | title = Genome Sequencing on Nanoballs | journal = Nature Biotechnology | volume = 28 | issue = 1 | pages = 43–44 | year = 2010 | pmid = 20062041 | doi = 10.1038/nbt0110-43 | s2cid = 54557996 }}</ref> However, only short sequences of DNA are determined from each DNA nanoball, which makes mapping the short reads to a [[reference genome]] difficult.<ref name = "Drmanac_2010"/>

==== Heliscope single molecule sequencing ====
Heliscope sequencing is a method of [[Single-molecule magnetic sequencing|single-molecule sequencing]] developed by [[Helicos Biosciences]]. It uses DNA fragments with added poly-A tail adapters, which are attached to the flow cell surface.
The next steps involve extension-based sequencing with cyclic washes of the flow cell with fluorescently labeled nucleotides (one nucleotide type at a time, as with the Sanger method). The reads are performed by the Heliscope sequencer.<ref>{{cite web|url=http://www.helicosbio.com/Products/HelicosregGeneticAnalysisSystem/HeliScopetradeSequencer/tabid/87/Default.aspx|archive-url=https://web.archive.org/web/20091102041828/http://www.helicosbio.com/Products/HelicosregGeneticAnalysisSystem/HeliScopetradeSequencer/tabid/87/Default.aspx|archive-date=2009-11-02|title=HeliScope Gene Sequencing / Genetic Analyzer System : Helicos BioSciences|date=2 November 2009}}</ref><ref>{{cite journal | vauthors = Thompson JF, Steinmann KE | title = Single molecule sequencing with a HeliScope genetic analysis system | journal = Current Protocols in Molecular Biology | volume = Chapter 7 | pages = Unit7.10 | date = October 2010 | pmid = 20890904 | pmc = 2954431 | doi = 10.1002/0471142727.mb0710s92 }}</ref> The reads are short, averaging 35 bp.<ref>{{cite web |url=http://seqll.com/technical-description/ |archive-url=https://web.archive.org/web/20140808055229/http://seqll.com/technical-description/ |url-status=dead |archive-date=8 August 2014 |publisher=SeqLL |access-date=9 August 2015 |title=tSMS SeqLL Technical Explanation}}</ref> The technology was notable as the first of its class to sequence non-amplified DNA, thus avoiding read errors associated with amplification steps.<ref>{{cite journal |last1=Heather |first1=James M.
|last2=Chain |first2=Benjamin |title=The sequence of sequencers: The history of sequencing DNA |journal=Genomics |date=January 2016 |volume=107 |issue=1 |pages=1–8 |doi=10.1016/j.ygeno.2015.11.003 |pmc=4727787 |pmid=26554401 }}</ref> In 2009 a human genome was sequenced using the Heliscope; however, in 2012 the company went bankrupt.<ref>{{cite book|author1=Sara El-Metwally |title=Next Generation Sequencing Technologies and Challenges in Sequence Assembly |volume=7 |author2=Osama M. Ouda |author3=Mohamed Helmy |publisher=Next Generation Sequencing Technologies and Challenges in Sequence Assembly, Springer Briefs in Systems Biology Volume 7 |year=2014 |pages=51–59|doi=10.1007/978-1-4939-0715-1_6 |chapter=New Horizons in Next-Generation Sequencing |series=SpringerBriefs in Systems Biology |isbn=978-1-4939-0714-4 }}</ref>

==== Microfluidic Systems ====
There are two main microfluidic systems that are used to sequence DNA: [[Droplet-based microfluidics|droplet-based microfluidics]] and [[digital microfluidics]]. Microfluidic devices address many of the limitations of current sequencing arrays.

Abate et al. studied the use of droplet-based microfluidic devices for DNA sequencing.<ref name=":3">{{cite journal | vauthors = Abate AR, Hung T, Sperling RA, Mary P, Rotem A, Agresti JJ, Weiner MA, Weitz DA | display-authors = 6 | title = DNA sequence analysis with droplet-based microfluidics | journal = Lab on a Chip | volume = 13 | issue = 24 | pages = 4864–9 | date = December 2013 | pmid = 24185402 | pmc = 4090915 | doi = 10.1039/c3lc50905b }}</ref> These devices have the ability to form and process picoliter-sized droplets at the rate of thousands per second. The devices were created from [[Polydimethylsiloxane|polydimethylsiloxane (PDMS)]] and used [[Förster resonance energy transfer]] (FRET) assays to read the sequences of DNA encompassed in the droplets.
Each position on the array tested for a specific 15-base sequence.<ref name=":3" />

Fair et al. used digital microfluidic devices to study DNA [[pyrosequencing]].<ref name=":4">{{Cite journal| vauthors = Fair RB, Khlystov A, Tailor TD, Ivanov V, Evans RD, Srinivasan V, Pamula VK, Pollack MG, Griffin PB, Zhou J |date= January 2007 |title=Chemical and Biological Applications of Digital-Microfluidic Devices |journal=IEEE Design & Test of Computers|volume=24|issue=1|pages=10–24|doi=10.1109/MDT.2007.8 |hdl= 10161/6987 |citeseerx=10.1.1.559.1440 |s2cid= 10122940 }}</ref> Significant advantages include the portability of the device, low reagent volume, speed of analysis, mass manufacturing abilities, and high throughput. This study provided a proof of concept showing that digital devices can be used for pyrosequencing; the study used sequencing by synthesis, which involves enzymatic extension and the addition of labeled nucleotides.<ref name=":4" />

Boles et al. also studied pyrosequencing on digital microfluidic devices.<ref name=":5">{{cite journal | vauthors = Boles DJ, Benton JL, Siew GJ, Levy MH, Thwar PK, Sandahl MA, Rouse JL, Perkins LC, Sudarsan AP, Jalili R, Pamula VK, Srinivasan V, Fair RB, Griffin PB, Eckhardt AE, Pollack MG | display-authors = 6 | title = Droplet-based pyrosequencing using digital microfluidics | journal = Analytical Chemistry | volume = 83 | issue = 22 | pages = 8439–47 | date = November 2011 | pmid = 21932784 | pmc = 3690483 | doi = 10.1021/ac201416j }}</ref> They used an electro-wetting device to create, mix, and split droplets. The sequencing uses a three-enzyme protocol and DNA templates anchored with magnetic beads. The device was tested using two protocols and resulted in 100% accuracy based on raw pyrogram levels.
The advantages of these digital microfluidic devices include size, cost, and achievable levels of functional integration.<ref name=":5" />

Microfluidic DNA sequencing research can also be applied to the [[RNA-Seq|sequencing of RNA]] using similar droplet microfluidic techniques, such as the inDrops method.<ref>{{cite journal | vauthors = Zilionis R, Nainys J, Veres A, Savova V, Zemmour D, Klein AM, Mazutis L | title = Single-cell barcoding and sequencing using droplet microfluidics | journal = Nature Protocols | volume = 12 | issue = 1 | pages = 44–73 | date = January 2017 | pmid = 27929523 | doi = 10.1038/nprot.2016.154 | s2cid = 767782 }}</ref> This suggests that many of these DNA sequencing techniques can be applied further to understand more about genomes and transcriptomes.