Sequence analysis
=== Variant calling ===
Identifying variants is a common goal of sequence analysis because variants often carry biological significance, for example by explaining the mechanism of drug resistance in an infectious disease. Variants may be single nucleotide variants (SNVs), small insertions or deletions (indels), or large [[structural variation|structural variants]]. The read alignments are sorted using [[SAMtools]], after which variant callers such as GATK<ref>{{cite journal |last1=McKenna |first1=Aaron |last2=Hanna |first2=Matthew |last3=Banks |first3=Eric |display-authors=2 |title=The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data |journal=Genome Research |date=July 2010 |volume=20 |issue=9 |pages=1297–1303 |doi=10.1101/gr.107524.110 |pmid=20644199 |pmc=2928508 |url=https://doi.org/10.1101/gr.107524.110}}</ref> are used to identify differences from the reference sequence. The choice of variant calling tool depends heavily on the sequencing technology: GATK is often used with short reads, while long reads are typically analyzed with tools such as DeepVariant<ref>{{cite journal |last1=Poplin |first1=R |last2=Chang |first2=PC |last3=Alexander |first3=D |display-authors=2 |title=A universal SNP and small-indel variant caller using deep neural networks |journal=Nature Biotechnology |date=September 2018 |volume=36 |issue=10 |pages=983–987 |doi=10.1038/nbt.4235 |pmid=30247488 |url=https://doi.org/10.1038/nbt.4235}}</ref> and Sniffles.<ref>{{cite journal |last1=Sedlazeck |first1=F.J. |last2=Rescheneder |first2=P |last3=Smolka |first3=M |display-authors=2 |title=Accurate detection of complex structural variations using single-molecule sequencing |journal=Nature Methods |date=April 2018 |volume=15 |issue=6 |pages=461–468 |doi=10.1038/s41592-018-0001-7 |pmid=29713083 |pmc=5990442 |url=https://doi.org/10.1038/s41592-018-0001-7}}</ref> Tools may also differ by organism (prokaryotes or eukaryotes), source of the sequence data (cancer vs. [[metagenomics|metagenomic]] samples), and variant type of interest (SNVs or structural variants). The output of variant calling is typically in [[Variant Call Format|VCF]] and can be filtered by allele frequency, quality score, or other criteria suited to the research question at hand.<ref name=sequence_analysis/>
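
Filtering a VCF file by quality score and allele frequency can be illustrated with a minimal sketch. The function names (<code>parse_info</code>, <code>passes</code>, <code>filter_vcf</code>) and the thresholds (<code>QUAL</code> of at least 30, <code>AF</code> of at least 0.05) are assumptions chosen for illustration, not recommended values. The sketch also assumes the caller wrote an <code>AF</code> entry into the <code>INFO</code> column, which not every caller does. In practice a dedicated tool or library would normally be used instead of hand-parsing.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=30;AF=0.5' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag-style entries (no '=') get an empty string
    return fields


def passes(record, min_qual=30.0, min_af=0.05):
    """Return True if a tab-separated VCF record meets both thresholds.

    The thresholds are illustrative assumptions, not recommendations.
    """
    cols = record.rstrip("\n").split("\t")
    qual = cols[5]                # column 6 of a VCF record is QUAL
    info = parse_info(cols[7])    # column 8 of a VCF record is INFO
    if qual == "." or float(qual) < min_qual:
        return False              # missing or low quality score
    # AF holds one value per alternate allele, separated by commas.
    values = [float(x) for x in info.get("AF", "").split(",") if x not in ("", ".")]
    if not values:
        return False              # no usable allele frequency
    return max(values) >= min_af


def filter_vcf(lines, min_qual=30.0, min_af=0.05):
    """Keep header lines (starting with '#') and records that pass the filter."""
    return [
        line for line in lines
        if line.startswith("#") or passes(line, min_qual, min_af)
    ]


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=40;AF=0.5",       # kept
    "chr1\t200\t.\tC\tT\t10\t.\tDP=12;AF=0.5",       # dropped: low QUAL
    "chr1\t300\t.\tG\tA\t60\t.\tDP=35;AF=0.01",      # dropped: low AF
    "chr1\t400\t.\tT\tC,G\t45\t.\tAF=0.02,0.3",      # kept: one allele passes
]
kept = filter_vcf(example)
```

Here a multi-allelic record is kept if any of its alternate alleles reaches the frequency threshold; a stricter analysis might instead split such records and evaluate each allele separately.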