=== Expression data ===

Studies of differential gene expression from [[RNA-Seq]] data, like those based on [[Real-time polymerase chain reaction|RT-qPCR]] and [[microarrays]], require a comparison of conditions. The goal is to identify genes whose abundance changes significantly between conditions. Experiments are therefore designed accordingly, with replicates for each condition or treatment, and with randomization and blocking where necessary. In RNA-Seq, expression is quantified from mapped reads that are summarized over some genetic unit, such as the [[exon]]s that make up a gene sequence. Whereas [[microarray]] results can be approximated by a normal distribution, RNA-Seq count data are better explained by other distributions. The first distribution used was the [[Poisson distribution|Poisson]], but it underestimates the variation between samples, leading to false positives. Currently, biological variation is accounted for by methods that estimate the dispersion parameter of a [[negative binomial distribution]].
[[Generalized linear model]]s are used to perform the tests for statistical significance and, because the number of genes is high, correction for multiple testing has to be considered.<ref>{{cite journal| doi =10.1186/gb-2010-11-12-220| pmid =21176179| pmc =3046478| title =From RNA-seq reads to differential expression results| journal =Genome Biology| volume =11| issue =12| pages =220| year =2010| last1 =Oshlack| first1 =Alicia| last2 =Robinson| first2 =Mark D| last3 =Young| first3 =Matthew D| doi-access =free}}</ref> Other examples of analysis of [[genomics]] data come from microarray or [[proteomics]] experiments,<ref>{{cite book|title=Statistical Analysis of Gene Expression Microarray Data|author1=Helen Causton |author2=John Quackenbush |author3=Alvis Brazma |publisher=Wiley-Blackwell|year=2003}}</ref><ref>{{cite book|title=Microarray Gene Expression Data Analysis: A Beginner's Guide|author=Terry Speed|publisher=Chapman & Hall/CRC|year=2003}}</ref> often concerning diseases or disease stages.<ref>{{cite book|title=Medical Biostatistics for Complex Diseases|author1=Frank Emmert-Streib |author2=Matthias Dehmer |publisher=Wiley-Blackwell|year=2010|isbn= 978-3-527-32585-6}}</ref>
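
Two of the ideas above can be illustrated with a minimal sketch. It assumes the common negative binomial parameterisation in which the variance equals <math>\mu + \alpha\mu^2</math> (so that <math>\alpha = 0</math> recovers the Poisson case), and it uses the Benjamini–Hochberg procedure as one possible multiple-testing correction; the text above does not name a specific method. The function names <code>nb_dispersion_moments</code> and <code>benjamini_hochberg</code> are hypothetical, and the simple method-of-moments estimate shown here is not what dedicated RNA-Seq packages use in practice.

```python
from statistics import mean, variance


def nb_dispersion_moments(counts):
    """Method-of-moments estimate of a negative binomial dispersion.

    Assumes Var = mu + alpha * mu**2, so alpha = 0 corresponds to a
    Poisson model (variance equal to the mean). Illustrative only.
    """
    mu = mean(counts)
    var = variance(counts)  # unbiased sample variance across replicates
    if mu == 0:
        return 0.0
    # Negative estimates (variance below the mean) are truncated to zero.
    return max((var - mu) / mu**2, 0.0)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical read counts for one gene across four replicates.
    replicate_counts = [10, 12, 30, 8]
    print("estimated dispersion:", nb_dispersion_moments(replicate_counts))

    # Hypothetical raw p-values for four genes.
    raw_p = [0.01, 0.04, 0.03, 0.005]
    print("adjusted p-values:", benjamini_hochberg(raw_p))
```

A dispersion estimate well above zero indicates that the replicates vary more than a Poisson model would predict, which is the situation in which Poisson-based tests tend to produce false positives.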