High-throughput screening
== Experimental design and data analysis ==
Because it can rapidly screen diverse compounds (such as [[small molecule]]s or [[siRNA]]s) to identify active ones, HTS has led to an explosion in the rate of data generated in recent years.<ref name=HoweNature2008>{{cite journal |vauthors=Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, Hill DP, Kania R, Schaeffer M, Pierre SS, Twigger S, White O, Rhee SY |title=Big data: The future of biocuration |journal=Nature |volume=455 |issue= 7209|pages=47–50 |year=2008 |pmid= 18769432|doi=10.1038/455047a |bibcode = 2008Natur.455...47H |pmc= 2819144 }}</ref> Consequently, one of the most fundamental challenges in HTS experiments is to glean biochemical significance from large volumes of data, which relies on the development and adoption of appropriate experimental designs and analytic methods for both quality control and hit selection.<ref name= ZhangBook2011> {{cite book |author= Zhang XHD |year=2011 |title= Optimal High-Throughput Screening: Practical Experimental Design and Data Analysis for Genome-scale RNAi Research |publisher =Cambridge University Press |isbn=978-0-521-73444-8}}</ref> HTS research is one of the fields to which an observation by John Blume, Chief Science Officer for Applied Proteomics, Inc., applies: soon, if a scientist does not understand some statistics or rudimentary data-handling technologies, he or she may not be considered to be a true molecular biologist and, thus, will simply become "a dinosaur."<ref name=EisensteinNature2006>{{cite journal |author=Eisenstein M |title=Quality control |journal=Nature |volume=442 |issue= 7106|pages=1067–70 |year=2006 |pmid= 16943838|doi=10.1038/4421067a |bibcode = 2006Natur.442.1067E |doi-access=free}}</ref>

=== Quality control ===
High-quality HTS assays are critical in HTS experiments. The development of high-quality HTS assays requires the integration of both experimental and computational approaches for quality control (QC).
Three important means of QC are (i) good plate design, (ii) the selection of effective positive and negative chemical/biological controls, and (iii) the development of effective QC metrics to measure the degree of differentiation so that assays with inferior data quality can be identified.<ref name=ZhangetalJBS2008>{{cite journal |vauthors=Zhang XH, Espeseth AS, Johnson EN, Chin J, Gates A, Mitnaul LJ, Marine SD, Tian J, Stec EM, Kunapuli P, Holder DJ, Heyse JF, Strulovici B, Ferrer M |title=Integrating experimental and analytic approaches to improve data quality in genome-scale RNAi screens |journal=Journal of Biomolecular Screening |volume=13 |issue= 5|pages=378–89 |year=2008 |pmid= 18480473|doi=10.1177/1087057108317145 |s2cid=22679273 |doi-access=free}}</ref> A good plate design helps to identify systematic errors (especially those linked with well position) and to determine which normalization should be used to remove or reduce the impact of systematic errors on both QC and hit selection.<ref name="ZhangBook2011" />

Effective analytic QC methods serve as a gatekeeper for excellent quality assays. In a typical HTS experiment, a clear distinction between a positive control and a negative reference, such as a negative control, indicates good quality. Many quality-assessment measures have been proposed to measure the degree of differentiation between a positive control and a negative reference. Signal-to-background ratio, signal-to-noise ratio, signal window, assay variability ratio, and [[Z-factor]] have been adopted to evaluate data quality.
<ref name="ZhangBook2011" /><ref name=ZhangJHetalJBS1999>{{cite journal |vauthors=Zhang JH, Chung TD, Oldenburg KR |title=A simple statistical parameter for use in evaluation and validation of high throughput screening assays |journal=Journal of Biomolecular Screening |volume=4 |issue= 2|pages=67–73 |year=1999 |pmid= 10838414|doi=10.1177/108705719900400206 |s2cid=36577200 |doi-access=free}}</ref> Strictly standardized mean difference ([[SSMD]]) has also been proposed for assessing data quality in HTS assays.<ref name=ZhangGenomics2007>{{cite journal |author=Zhang, XHD |title=A pair of new statistical parameters for quality control in RNA interference high-throughput screening assays |journal=Genomics |volume=89 |issue= 4|pages=552–61 |year=2007 |pmid= 17276655|doi=10.1016/j.ygeno.2006.12.014 |doi-access=}}</ref><ref name=ZhangJBS2008>{{cite journal |author=Zhang XHD |title=Novel analytic criteria and effective plate designs for quality control in genome-scale RNAi screens |journal=Journal of Biomolecular Screening |volume=13 |issue= 5|pages=363–77 |year=2008 |pmid= 18567841|doi=10.1177/1087057108317062 |s2cid=12688742 |doi-access=free}}</ref>

=== Hit selection ===
A compound with a desired size of effect in an HTS is called a hit. The process of selecting hits is called hit selection. The analytic methods for hit selection in screens without replicates (usually primary screens) differ from those for screens with replicates (usually confirmatory screens). For example, the z-score method is suitable for screens without replicates, whereas the [[t-statistic]] is suitable for screens with replicates. The calculation of SSMD for screens without replicates also differs from that for screens with replicates.<ref name="ZhangBook2011" />

For hit selection in primary screens without replicates, the easily interpretable metrics are average fold change, mean difference, percent inhibition, and percent activity. However, they do not capture data variability effectively.
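
The point about variability can be illustrated with a minimal sketch. It assumes a common definition of percent inhibition relative to control means and a z-score computed against negative-reference wells; the well values, the positive-control mean, and the function names are hypothetical and chosen only for illustration. The same compound reading yields the same percent inhibition on a quiet plate and on a noisy plate, while the z-score differs by an order of magnitude.

```python
from statistics import mean, stdev


def percent_inhibition(x, neg_mean, pos_mean):
    """Percent inhibition of a reading x, scaled between the negative-control
    mean (0 %) and the positive-control mean (100 %). Assumed definition."""
    return 100.0 * (neg_mean - x) / (neg_mean - pos_mean)


def z_score(x, reference):
    """z-score of a reading x against the negative-reference wells."""
    return (x - mean(reference)) / stdev(reference)


# Hypothetical negative-reference wells: same mean, different spread.
quiet_plate = [98.0, 100.0, 102.0, 99.0, 101.0]
noisy_plate = [80.0, 100.0, 120.0, 90.0, 110.0]
pos_mean = 20.0   # hypothetical positive-control mean
compound = 70.0   # hypothetical reading of one test compound

results = {}
for name, ref in (("quiet", quiet_plate), ("noisy", noisy_plate)):
    pi = percent_inhibition(compound, mean(ref), pos_mean)
    z = z_score(compound, ref)
    results[name] = (pi, z)
    print(f"{name}: percent inhibition = {pi:.1f}, z-score = {z:.2f}")
```

Percent inhibition cannot distinguish the two plates, whereas the z-score shrinks as the negative reference becomes noisier.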
The z-score method and SSMD can capture data variability, based on the assumption that every compound has the same variability as a negative reference in the screen.<ref name=ZhangJBS2007>{{cite journal |author=Zhang XHD |title=A new method with flexible and balanced control of false negatives and false positives for hit selection in RNA interference high-throughput screening assays |journal=Journal of Biomolecular Screening |volume=12 |issue= 5|pages=645–55 |year=2007 |pmid= 17517904|doi=10.1177/1087057107300645 |doi-access=free}}</ref><ref name=ZhangetalJBS2007>{{cite journal |vauthors=Zhang XH, Ferrer M, Espeseth AS, Marine SD, Stec EM, Crackower MA, Holder DJ, Heyse JF, Strulovici B |title=The use of strictly standardized mean difference for hit selection in primary RNA interference high-throughput screening experiments |journal=Journal of Biomolecular Screening |volume=12 |issue= 4|pages=645–55 |year=2007 |pmid= 17435171|doi=10.1177/1087057107300646 |s2cid=7542230 |doi-access=free}}</ref> However, outliers are common in HTS experiments, and methods such as the z-score are sensitive to outliers and can be problematic.
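
The sensitivity to outliers can be sketched as follows. The example assumes that most sample wells on a plate are inactive, so the sample wells themselves serve as the negative reference, and it compares the ordinary z-score with a median/MAD-based variant, which is the idea behind robust scores such as the z*-score. The plate values, the function names, and the use of the conventional 1.4826 scaling constant are assumptions made for illustration, not a definitive implementation of any published method.

```python
from statistics import mean, median, stdev


def z_scores(values):
    """Ordinary z-scores: centre on the mean, scale by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def robust_z_scores(values):
    """Median/MAD-based scores. The factor 1.4826 makes the MAD comparable
    to a standard deviation for normally distributed data."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


# Hypothetical plate: eight inactive wells near 100, one real inhibitor (60),
# and one artefact well (400), e.g. a dispensing error.
plate = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.0, 60.0, 400.0]
HIT_INDEX = 8

z = z_scores(plate)
rz = robust_z_scores(plate)
print(f"z-score of the inhibitor:        {z[HIT_INDEX]:.2f}")
print(f"robust score of the inhibitor:   {rz[HIT_INDEX]:.2f}")
```

The single artefact well inflates both the mean and the standard deviation, so the inhibitor's ordinary z-score falls well short of a typical cutoff of 3 in absolute value, while the median/MAD-based score still flags it clearly.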
As a consequence, robust methods such as the z*-score method, SSMD*, the B-score method, and quantile-based methods have been proposed and adopted for hit selection.<ref name="caraus974"/><ref name="ZhangBook2011" /><ref name=ZhangPharmacogenomics2006>{{cite journal |vauthors=Zhang XH, Yang XC, Chung N, Gates A, Stec E, Kunapuli P, Holder DJ, Ferrer M, Espeseth AS |title=Robust statistical methods for hit selection in RNA interference high-throughput screening experiments |journal=Pharmacogenomics |volume=7 |issue= 3|pages=299–09 |year=2006 |pmid= 16610941|doi=10.2217/14622416.7.3.299 }}</ref><ref name=BrideauJBS2003>{{cite journal |vauthors=Brideau C, Gunter G, Pikounis B, Liaw A |title=Improved statistical methods for hit selection in high-throughput screening |journal=Journal of Biomolecular Screening |volume=8 |issue= 6|pages=634–47 |year=2003 |pmid= 14711389|doi=10.1177/1087057103258285 |doi-access=free}}</ref>

In a screen with replicates, variability can be estimated directly for each compound; consequently, SSMD or the t-statistic, neither of which relies on the strong assumption underlying the z-score and z*-score, should be used. One issue with the use of the t-statistic and associated p-values is that they are affected by both sample size and effect size.<ref name=Cohen1994>{{cite journal |author=Cohen J|title=The Earth Is Round (P-Less-Than.05)|journal=American Psychologist |volume=49 |issue= 12|pages=997–1003|year=1994 |doi=10.1037/0003-066X.49.12.997 |issn=0003-066X }}</ref> They come from testing for no mean difference, and thus are not designed to measure the size of compound effects. For hit selection, the major interest is the size of effect in a tested compound.
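
The dependence of the t-statistic on sample size can be made concrete with a small sketch. It assumes a paired design in which each replicate yields a difference between the compound and its negative reference, and it uses a simple plug-in estimate of SSMD (mean difference divided by the standard deviation of the differences) that omits any bias-correction factor. The replicate values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(diffs):
    """One-sample t-statistic for testing a mean difference of zero."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def paired_ssmd(diffs):
    """Plug-in SSMD estimate for paired differences (no bias correction)."""
    return mean(diffs) / stdev(diffs)


# Hypothetical replicate differences (compound minus negative reference).
few = [-1.2, -0.8, -1.0]   # three replicates
many = few * 4             # same pattern observed with twelve replicates

t_few, t_many = paired_t(few), paired_t(many)
ssmd_few, ssmd_many = paired_ssmd(few), paired_ssmd(many)

print(f"n = {len(few):2d}: t = {t_few:.2f}, SSMD estimate = {ssmd_few:.2f}")
print(f"n = {len(many):2d}: t = {t_many:.2f}, SSMD estimate = {ssmd_many:.2f}")
```

Under these assumptions the t-statistic equals the SSMD estimate multiplied by the square root of the number of replicates, so it more than doubles when the replicates are quadrupled, while the SSMD estimate changes only modestly.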
SSMD directly assesses the size of effects.<ref name=ZhangPharmacogenomics2009>{{cite journal |author=Zhang XHD|title= A method for effectively comparing gene effects in multiple conditions in RNAi and expression-profiling research |journal=Pharmacogenomics |volume=10 |issue= 3|pages=345–58 |year=2009 |pmid= 20397965|doi=10.2217/14622416.10.3.345}}</ref> SSMD has also been shown to be better than other commonly used effect sizes.<ref name=ZhangSBR2010>{{cite journal |author=Zhang XHD|title= Strictly standardized mean difference, standardized mean difference and classical t-test for the comparison of two groups |journal= Statistics in Biopharmaceutical Research |volume=2 |issue= 2|pages=292–99 |year=2010 |doi=10.1198/sbr.2009.0074 |s2cid= 119825625 }}</ref> The population value of SSMD is comparable across experiments; thus, the same cutoff for the population value of SSMD can be used to measure the size of compound effects.<ref name=ZhangPharmacogenomics2010>{{cite journal |author=Zhang XHD |title= Assessing the size of gene or RNAi effects in multifactor high-throughput experiments |journal=Pharmacogenomics |volume=11 |issue= 2|pages=199–213 |year=2010 |pmid= 20136359|doi=10.2217/PGS.09.136 }}</ref>