DNA sequencing
=== Short-read sequencing methods {{Anchor|Next-generation methods}} ===
{{Further|Short-read sequencing}}
<!-- NB. Next-generation sequencing redirects to this section -->

==== Massively parallel signature sequencing (MPSS) ====
The first of the high-throughput sequencing technologies, [[massively parallel signature sequencing]] (or MPSS, also called next generation sequencing), was developed in the 1990s at Lynx Therapeutics, a company founded in 1992 by [[Sydney Brenner]] and [[Applied Biosystems#History|Sam Eletr]]. MPSS was a bead-based method that used a complex approach of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. This method made it susceptible to sequence-specific bias or loss of specific sequences. Because the technology was so complex, MPSS was only performed 'in-house' by Lynx Therapeutics and no DNA sequencing machines were sold to independent laboratories. Lynx Therapeutics merged with Solexa (later acquired by [[Illumina (company)|Illumina]]) in 2004, leading to the development of sequencing-by-synthesis, a simpler approach acquired from [[Manteia Predictive Medicine]], which rendered MPSS obsolete. However, the essential properties of the MPSS output were typical of later high-throughput data types, including hundreds of thousands of short DNA sequences. In the case of MPSS, these were typically used for sequencing [[cDNA]] for measurements of [[gene expression]] levels.<ref name="Brenner_2000"/>

==== Polony sequencing ====
{{Main|Polony sequencing}}
The [[polony sequencing]] method, developed in the laboratory of [[George M. Church]] at Harvard, was among the first high-throughput sequencing systems and was used to sequence a full ''[[E. coli]]'' genome in 2005.<ref name=Shendure2005>{{cite journal | vauthors = Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, Wang MD, Zhang K, Mitra RD, Church GM | title = Accurate multiplex polony sequencing of an evolved bacterial genome. | journal = Science | volume = 309 | issue = 5741 | pages = 1728–32 | date = 9 September 2005 | pmid = 16081699 | doi = 10.1126/science.1117389 | bibcode = 2005Sci...309.1728S | s2cid = 11405973 | doi-access = free }}</ref> It combined an in vitro paired-tag library with emulsion PCR, an automated microscope, and ligation-based sequencing chemistry to sequence an ''E. coli'' genome at an accuracy of >99.9999% and a cost approximately 1/9 that of Sanger sequencing.<ref name=Shendure2005 /> The technology was licensed to Agencourt Biosciences, subsequently spun out into Agencourt Personal Genomics, and eventually incorporated into the [[Applied Biosystems]] SOLiD platform. Applied Biosystems was later acquired by [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]], now part of [[Thermo Fisher Scientific]].

==== 454 pyrosequencing ====
{{Main|454 Life Sciences#Technology}}
A parallelized version of [[pyrosequencing]] was developed by [[454 Life Sciences]], which has since been acquired by [[Roche Diagnostics]]. The method amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many [[picoliter]]-volume wells each containing a single bead and sequencing enzymes.
Pyrosequencing uses [[luciferase]] to generate light for detection of the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence [[read (biology)|reads]].<ref name="Margulies_2005"/> This technology provides intermediate read length and price per base compared to Sanger sequencing on one end and Solexa and SOLiD on the other.<ref name="pmid18165802"/>

==== Illumina (Solexa) sequencing ====
{{Main|Illumina dye sequencing}}
[[Solexa]], now part of [[Illumina (company)|Illumina]], was founded by [[Shankar Balasubramanian]] and [[David Klenerman]] in 1998, and developed a sequencing method based on reversible dye-terminator technology and engineered polymerases.<ref name = "Bentley_2008"/> The reversible-terminator chemistry concept was invented by Bruno Canard and Simon Sarfati at the Pasteur Institute in Paris.<ref>{{Citation|last1=Canard|first1=Bruno|last2=Sarfati|first2=Simon | name-list-style = vanc |title=Novel derivatives usable for the sequencing of nucleic acids|date=13 October 1994|url=http://www.google.ge/patents/CA2158975A1|access-date=2016-03-09}}</ref><ref>{{cite journal | vauthors = Canard B, Sarfati RS | title = DNA polymerase fluorescent substrates with reversible 3'-tags | journal = Gene | volume = 148 | issue = 1 | pages = 1–6 | date = October 1994 | pmid = 7523248 | doi = 10.1016/0378-1119(94)90226-7 }}</ref> It was developed internally at Solexa by those named on the relevant patents. In 2004, Solexa acquired the company [[Manteia Predictive Medicine]] in order to gain a massively parallel sequencing technology invented in 1997 by [[Pascal Mayer]] and Laurent Farinelli.<ref name=DNA_colony_patents /> It is based on "DNA clusters" or "DNA colonies", which involve the clonal amplification of DNA on a surface. The cluster technology was co-acquired with Lynx Therapeutics of California. Solexa Ltd. later merged with Lynx to form Solexa Inc.
[[File:Illumina HiSeq 2500.jpg|thumb|An Illumina HiSeq 2500 sequencer]]
[[File:Illumina NovaSeq 6000 flow cell.jpg|thumb|Illumina NovaSeq 6000 flow cell]]
In this method, DNA molecules and primers are first attached on a slide or flow cell and amplified with [[polymerase]] so that local clonal DNA colonies, later termed "DNA clusters", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the [[Fluorescent labeling|fluorescently labeled]] nucleotides. Then the dye, along with the terminal 3' blocker, is chemically removed from the DNA, allowing for the next cycle to begin. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera.
[[File:Illumina MiSeq sequencer.jpg|thumb|An Illumina MiSeq sequencer]]

Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity. With an optimal configuration, the ultimately reachable instrument throughput is thus dictated solely by the analog-to-digital conversion rate of the camera, multiplied by the number of cameras and divided by the number of pixels per DNA colony required for visualizing them optimally (approximately 10 pixels/colony). As of 2012, with cameras operating at A/D conversion rates above 10 MHz and the available optics, fluidics and enzymatics, throughput could reach multiples of 1 million nucleotides/second, corresponding roughly to 1 human genome equivalent at 1x [[Coverage (genetics)|coverage]] per hour per instrument, and 1 human genome re-sequenced (at approx.
30x) per day per instrument (equipped with a single camera).<ref name="pmid18576944">{{cite journal | vauthors = Mardis ER | title = Next-generation DNA sequencing methods | journal = Annu Rev Genom Hum Genet | volume = 9 | pages = 387–402 | year = 2008 | pmid = 18576944 | doi = 10.1146/annurev.genom.9.081307.164359 }}</ref>

====Combinatorial probe anchor synthesis (cPAS)====
This method is a refinement of the combinatorial probe anchor ligation technology (cPAL) described by [[Complete Genomics]],<ref name=":0">{{cite journal | vauthors = Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, Carnevali P, Nazarenko I, Nilsen GB, Yeung G, Dahl F, Fernandez A, Staker B, Pant KP, Baccash J, Borcherding AP, Brownley A, Cedeno R, Chen L, Chernikoff D, Cheung A, Chirita R, Curson B, Ebert JC, Hacker CR, Hartlage R, Hauser B, Huang S, Jiang Y, Karpinchyk V, Koenig M, Kong C, Landers T, Le C, Liu J, McBride CE, Morenzoni M, Morey RE, Mutch K, Perazich H, Perry K, Peters BA, Peterson J, Pethiyagoda CL, Pothuraju K, Richter C, Rosenbaum AM, Roy S, Shafto J, Sharanhovich U, Shannon KW, Sheppy CG, Sun M, Thakuria JV, Tran A, Vu D, Zaranek AW, Wu X, Drmanac S, Oliphant AR, Banyai WC, Martin B, Ballinger DG, Church GM, Reid CA | display-authors = 6 | title = Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays | journal = Science | volume = 327 | issue = 5961 | pages = 78–81 | date = January 2010 | pmid = 19892942 | doi = 10.1126/science.1181498 | bibcode = 2010Sci...327...78D | s2cid = 17309571 | doi-access = free }}</ref> which became part of the Chinese genomics company [[Beijing Genomics Institute|BGI]] in 2013.<ref>{{cite web|url=http://www.completegenomics.com/|title=About Us – Complete Genomics|last=brandonvd|website=Complete Genomics|access-date=2018-07-02}}</ref> The two companies have refined the technology to allow for longer read lengths, reaction time reductions and faster time to results.
In addition, data are now generated as contiguous full-length reads in the standard FASTQ file format and can be used as-is in most short-read-based bioinformatics analysis pipelines.<ref name=":1">{{cite journal | vauthors = Huang J, Liang X, Xuan Y, Geng C, Li Y, Lu H, Qu S, Mei X, Chen H, Yu T, Sun N, Rao J, Wang J, Zhang W, Chen Y, Liao S, Jiang H, Liu X, Yang Z, Mu F, Gao S | display-authors = 6 | title = A reference human genome dataset of the BGISEQ-500 sequencer | journal = GigaScience | volume = 6 | issue = 5 | pages = 1–9 | date = May 2017 | pmid = 28379488 | pmc = 5467036 | doi = 10.1093/gigascience/gix024 }}</ref>{{citation needed|date=July 2018}}

The two technologies that form the basis for this high-throughput sequencing technology are [[DNA nanoball sequencing|DNA nanoballs]] (DNB) and patterned arrays for nanoball attachment to a solid surface.<ref name=":0" /> DNA nanoballs are formed by denaturing double-stranded, adapter-ligated libraries and ligating only the forward strand to a splint oligonucleotide to form a ssDNA circle. Faithful copies of the circles containing the DNA insert are produced by rolling circle amplification, which generates approximately 300–500 copies. The long strand of ssDNA folds upon itself to produce a three-dimensional nanoball structure that is approximately 220 nm in diameter. Making DNBs replaces the need to generate PCR copies of the library on the flow cell and as such can remove large proportions of duplicate reads, adapter-adapter ligations and PCR-induced errors.<ref name=":1" />{{citation needed|date=July 2018}}
[[File:MGISEQ-2000RS.jpg|thumb|A BGI MGISEQ-2000RS sequencer]]

The patterned array of positively charged spots is fabricated through photolithography and etching techniques followed by chemical modification to generate a sequencing flow cell.
The spots on the flow cell are each approximately 250 nm in diameter and are separated by 700 nm (centre to centre). Each spot allows attachment of a single negatively charged DNB, which reduces under- or over-clustering on the flow cell.<ref name=":0" />{{citation needed|date=July 2018}}

Sequencing is then performed by addition of an oligonucleotide probe that attaches in combination to specific sites within the DNB. The probe acts as an anchor that then allows one of four single reversibly inactivated, labelled nucleotides to bind after flowing across the flow cell. Unbound nucleotides are washed away; laser excitation then causes the attached labels to fluoresce, and the signal is captured by cameras and converted to a digital output for base calling. The attached base has its terminator and label chemically cleaved at completion of the cycle. The cycle is repeated with another flow of free, labelled nucleotides across the flow cell to allow the next nucleotide to bind and have its signal captured. This process is repeated a number of times (usually 50 to 300) to determine the sequence of the inserted piece of DNA at a rate of approximately 40 million nucleotides per second as of 2018.{{citation needed|date=July 2018}}

==== SOLiD sequencing ====
[[File:Library preparation for the SOLiD platform.svg|right|thumb|Library preparation for the SOLiD platform]]
{{Main|ABI Solid Sequencing}}
[[File:Two-base encoding scheme.pdf|thumb|Two-base encoding scheme. In two-base encoding, each unique pair of bases on the 3' end of the probe is assigned one out of four possible colors. For example, "AA" is assigned to blue, "AC" is assigned to green, and so on for all 16 unique pairs.
During sequencing, each base in the template is sequenced twice, and the resulting data are decoded according to this scheme.]]
[[Applied Biosystems]]' (now a [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]] brand) SOLiD technology employs [[sequencing by ligation]]. Here, a pool of all possible oligonucleotides of a fixed length is labeled according to the sequenced position. Oligonucleotides are annealed and ligated; the preferential ligation by [[DNA ligase]] for matching sequences results in a signal informative of the nucleotide at that position. Each base in the template is sequenced twice, and the resulting data are decoded according to the [[2 base encoding]] scheme used in this method. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each carrying clonal copies of a single DNA molecule, are deposited on a glass slide.<ref name="pmid18477713">{{cite journal | vauthors = Valouev A, Ichikawa J, Tonthat T, Stuart J, Ranade S, Peckham H, Zeng K, Malek JA, Costa G, McKernan K, Sidow A, Fire A, Johnson SM | title = A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning | journal = Genome Res. | volume = 18 | issue = 7 | pages = 1051–63 | date = July 2008 | pmid = 18477713 | pmc = 2493394 | doi = 10.1101/gr.076463.108 }}</ref> The result is sequences of quantities and lengths comparable to Illumina sequencing.<ref name="pmid18165802"/> This [[sequencing by ligation]] method has been reported to have difficulty sequencing palindromic sequences.<ref name="Yu-Feng Huang, Sheng-Chung Chen, Yih-Shien Chiang, Tzu-Han Chen & Kuo-Ping Chiu 2012 S10"/>

==== Ion Torrent semiconductor sequencing ====
{{Main|Ion semiconductor sequencing}}
Ion Torrent Systems Inc. (now owned by [[Life Technologies (Thermo Fisher Scientific)|Life Technologies]]) developed a system based on using standard sequencing chemistry, but with a novel, semiconductor-based detection system.
This method of sequencing is based on the detection of [[hydrogen ion]]s that are released during the [[DNA polymerase|polymerisation]] of [[DNA]], as opposed to the optical methods used in other sequencing systems. A microwell containing a template DNA strand to be sequenced is flooded with a single type of [[nucleotide]]. If the introduced nucleotide is [[complementarity (molecular biology)|complementary]] to the leading template nucleotide, it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If [[homopolymer]] repeats are present in the template sequence, multiple nucleotides will be incorporated in a single cycle. This leads to a corresponding number of released hydrogen ions and a proportionally higher electronic signal.<ref name="rusk">{{cite journal | vauthors = Rusk N | year = 2011 | title = Torrents of sequence | journal = Nat Methods | volume = 8 | issue = 1| page = 44 | doi=10.1038/nmeth.f.330| s2cid = 41040192 | doi-access = free }}</ref>
[[File:From second to fourth-generation sequencing, illustration on TAGGCT template.svg|thumb|right| Sequencing of the TAGGCT template with IonTorrent, PacBioRS and GridION]]

==== DNA nanoball sequencing ====
{{Main|DNA nanoball sequencing}}
[[DNA nanoball sequencing]] is a type of high-throughput sequencing technology used to determine the entire [[genomic sequence]] of an organism. The company [[Complete Genomics]] uses this technology to sequence samples submitted by independent researchers. The method uses [[rolling circle replication]] to amplify small fragments of genomic DNA into DNA nanoballs.
Unchained sequencing by ligation is then used to determine the nucleotide sequence.<ref name = "Drmanac_2010" /> This method of DNA sequencing allows large numbers of DNA nanoballs to be sequenced per run and at low [[reagent]] costs compared to other high-throughput sequencing platforms.<ref>{{cite journal | vauthors = Porreca GJ | title = Genome Sequencing on Nanoballs | journal = Nature Biotechnology | volume = 28 | issue = 1 | pages = 43–44 | year = 2010 | pmid = 20062041 | doi = 10.1038/nbt0110-43 | s2cid = 54557996 }}</ref> However, only short sequences of DNA are determined from each DNA nanoball, which makes mapping the short reads to a [[reference genome]] difficult.<ref name = "Drmanac_2010"/>

==== Heliscope single molecule sequencing ====
Heliscope sequencing is a method of [[Single-molecule magnetic sequencing|single-molecule sequencing]] developed by [[Helicos Biosciences]]. It uses DNA fragments with added poly-A tail adapters which are attached to the flow cell surface. The next steps involve extension-based sequencing with cyclic washes of the flow cell with fluorescently labeled nucleotides (one nucleotide type at a time, as with the Sanger method).
The reads are performed by the Heliscope sequencer.<ref>{{cite web|url=http://www.helicosbio.com/Products/HelicosregGeneticAnalysisSystem/HeliScopetradeSequencer/tabid/87/Default.aspx|archive-url=https://web.archive.org/web/20091102041828/http://www.helicosbio.com/Products/HelicosregGeneticAnalysisSystem/HeliScopetradeSequencer/tabid/87/Default.aspx|archive-date=2009-11-02|title=HeliScope Gene Sequencing / Genetic Analyzer System : Helicos BioSciences|date=2 November 2009}}</ref><ref>{{cite journal | vauthors = Thompson JF, Steinmann KE | title = Single molecule sequencing with a HeliScope genetic analysis system | journal = Current Protocols in Molecular Biology | volume = Chapter 7 | pages = Unit7.10 | date = October 2010 | pmid = 20890904 | pmc = 2954431 | doi = 10.1002/0471142727.mb0710s92 }}</ref> The reads are short, averaging 35 bp.<ref>{{cite web |url=http://seqll.com/technical-description/ |archive-url=https://web.archive.org/web/20140808055229/http://seqll.com/technical-description/ |url-status=dead |archive-date=8 August 2014 |publisher=SeqLL |access-date=9 August 2015 |title=tSMS SeqLL Technical Explanation}}</ref> What made this technology especially novel was that it was the first of its class to sequence non-amplified DNA, thus avoiding read errors associated with amplification steps.<ref>{{cite journal |last1=Heather |first1=James M. |last2=Chain |first2=Benjamin |title=The sequence of sequencers: The history of sequencing DNA |journal=Genomics |date=January 2016 |volume=107 |issue=1 |pages=1–8 |doi=10.1016/j.ygeno.2015.11.003 |pmc=4727787 |pmid=26554401 }}</ref> In 2009 a human genome was sequenced using the Heliscope; however, the company went bankrupt in 2012.<ref>{{cite book|author1=Sara El-Metwally |title=Next Generation Sequencing Technologies and Challenges in Sequence Assembly |volume=7 |author2=Osama M.
Ouda |author3=Mohamed Helmy |publisher=Next Generation Sequencing Technologies and Challenges in Sequence Assembly, Springer Briefs in Systems Biology Volume 7 |year=2014 |pages=51–59|doi=10.1007/978-1-4939-0715-1_6 |chapter=New Horizons in Next-Generation Sequencing |series=SpringerBriefs in Systems Biology |isbn=978-1-4939-0714-4 }}</ref>

==== Microfluidic Systems ====
Two main types of microfluidic system are used to sequence DNA: [[Droplet-based microfluidics|droplet-based microfluidics]] and [[digital microfluidics]]. Microfluidic devices address many of the limitations of current sequencing arrays.

Abate et al. studied the use of droplet-based microfluidic devices for DNA sequencing.<ref name=":3">{{cite journal | vauthors = Abate AR, Hung T, Sperling RA, Mary P, Rotem A, Agresti JJ, Weiner MA, Weitz DA | display-authors = 6 | title = DNA sequence analysis with droplet-based microfluidics | journal = Lab on a Chip | volume = 13 | issue = 24 | pages = 4864–9 | date = December 2013 | pmid = 24185402 | pmc = 4090915 | doi = 10.1039/c3lc50905b }}</ref> These devices can form and process picoliter-sized droplets at a rate of thousands per second. The devices were made from [[Polydimethylsiloxane|polydimethylsiloxane (PDMS)]] and used [[Förster resonance energy transfer|Förster resonance energy transfer (FRET) assays]] to read the sequences of DNA contained in the droplets. Each position on the array tested for a specific 15-base sequence.<ref name=":3" />

Fair et al.
used digital microfluidic devices to study DNA [[pyrosequencing]].<ref name=":4">{{Cite journal| vauthors = Fair RB, Khlystov A, Tailor TD, Ivanov V, Evans RD, Srinivasan V, Pamula VK, Pollack MG, Griffin PB, Zhou J |date= January 2007 |title=Chemical and Biological Applications of Digital-Microfluidic Devices |journal=IEEE Design & Test of Computers|volume=24|issue=1|pages=10–24|doi=10.1109/MDT.2007.8 |hdl= 10161/6987 |citeseerx=10.1.1.559.1440 |s2cid= 10122940 }}</ref> Significant advantages include the portability of the device, small reagent volumes, speed of analysis, suitability for mass manufacturing, and high throughput. This study provided a proof of concept showing that digital devices can be used for pyrosequencing; it included sequencing by synthesis, which involves enzymatic extension and the addition of labeled nucleotides.<ref name=":4" />

Boles et al. also studied pyrosequencing on digital microfluidic devices.<ref name=":5">{{cite journal | vauthors = Boles DJ, Benton JL, Siew GJ, Levy MH, Thwar PK, Sandahl MA, Rouse JL, Perkins LC, Sudarsan AP, Jalili R, Pamula VK, Srinivasan V, Fair RB, Griffin PB, Eckhardt AE, Pollack MG | display-authors = 6 | title = Droplet-based pyrosequencing using digital microfluidics | journal = Analytical Chemistry | volume = 83 | issue = 22 | pages = 8439–47 | date = November 2011 | pmid = 21932784 | pmc = 3690483 | doi = 10.1021/ac201416j }}</ref> They used an electro-wetting device to create, mix, and split droplets. The sequencing uses a three-enzyme protocol and DNA templates anchored with magnetic beads. The device was tested using two protocols and resulted in 100% accuracy based on raw pyrogram levels.
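
As an illustration of how raw pyrogram levels relate to a base sequence, the following is a minimal sketch and not the procedure used in the cited study. It assumes an idealised pyrogram in which the light signal for each nucleotide dispensation is proportional to the number of bases incorporated, so that a homopolymer run of length ''n'' gives a level of about ''n''. The function names, the cyclic dispensation order <code>ACGT</code>, and the use of the example sequence TAGGCT as the synthesised strand are all assumptions made for the example.

```python
from typing import List


def simulate_pyrogram(read: str, order: str = "ACGT", max_flows: int = 64) -> List[float]:
    """Return idealised signal levels, one per nucleotide dispensation.

    `read` is the strand being synthesised (hypothetical input). Each
    dispensed nucleotide is incorporated as many times as it matches the
    next bases of `read`, so a homopolymer run of length n yields level n.
    """
    levels: List[float] = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        nucleotide = order[flow % len(order)]
        run = 0
        while pos < len(read) and read[pos] == nucleotide:
            run += 1
            pos += 1
        levels.append(float(run))
        flow += 1
    return levels


def decode_pyrogram(levels: List[float], order: str = "ACGT") -> str:
    """Convert per-dispensation levels back to bases by rounding.

    Rounding to the nearest integer is an assumed, simplistic base-calling
    rule; real instruments apply signal normalisation and error models.
    """
    bases = []
    for flow, level in enumerate(levels):
        count = max(0, round(level))
        bases.append(order[flow % len(order)] * count)
    return "".join(bases)


if __name__ == "__main__":
    ideal = simulate_pyrogram("TAGGCT")
    print(ideal)
    print(decode_pyrogram(ideal))
```

In this sketch a level near 2 in the G dispensation is read as the homopolymer "GG". Because run length is inferred from signal amplitude alone, small amounts of noise matter more as homopolymers get longer.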
The advantages of these digital microfluidic devices include size, cost, and achievable levels of functional integration.<ref name=":5" />

DNA sequencing research using microfluidics can also be applied to the [[RNA-Seq|sequencing of RNA]], using similar droplet microfluidic techniques such as the inDrops method.<ref>{{cite journal | vauthors = Zilionis R, Nainys J, Veres A, Savova V, Zemmour D, Klein AM, Mazutis L | title = Single-cell barcoding and sequencing using droplet microfluidics | journal = Nature Protocols | volume = 12 | issue = 1 | pages = 44–73 | date = January 2017 | pmid = 27929523 | doi = 10.1038/nprot.2016.154 | s2cid = 767782 }}</ref> This suggests that many of these DNA sequencing techniques can be applied further to learn more about genomes and transcriptomes.