Chemical database
== Similarity ==
{{Main|Chemical similarity}}
There is no single definition of molecular similarity; however, the concept may be defined according to the application and is often described as an [[inverse element|inverse]] of a [[distance|measure of distance]] in descriptor space. For instance, two molecules might be considered more similar to each other than to other molecules if the difference between their [[molecular weight]]s is smaller. A variety of other measures could be combined to produce a multivariate distance measure. Distance measures are often classified into [[Euclidean distance|Euclidean measure]]s and non-Euclidean measures depending on whether the [[triangle inequality]] holds.

Maximum common subgraph ([[Maximum common subgraph isomorphism problem|MCS]]) based substructure search,<ref name="SMSD09">{{cite journal|first1=S. Asad |last1=Rahman|first2= M. |last2=Bashton |first3= G. L. |last3= Holliday |first4= R. |last4=Schrader |first5=J. M. |last5=Thornton |year=2009 |title=Small Molecule Subgraph Detector (SMSD) toolkit|journal= Journal of Cheminformatics|volume=1|issue=1|page=12|doi=10.1186/1758-2946-1-12|pmid=20298518|pmc=2820491|url=http://www.ebi.ac.uk/thornton-srv/software/SMSD/ |doi-access=free }}</ref> used as a similarity or distance measure, is also very common. MCS is also used for screening drug-like compounds by finding molecules that share a common subgraph (substructure).<ref name="SMSD09" />

Chemicals in the databases may be [[cluster (computing)|cluster]]ed into groups of 'similar' molecules based on their similarities.
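The idea of similarity as an inverse of distance in descriptor space can be illustrated with a minimal Python sketch. Its choices are assumptions made for illustration only: the two descriptors (molecular weight and heavy-atom count), the unscaled Euclidean distance, and the transform <code>1 / (1 + d)</code> are just one of many possible combinations, and in practice descriptors are usually scaled before being combined.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map a distance in [0, inf) to a similarity in (0, 1].

    The 1 / (1 + d) transform is an illustrative assumption, not a standard.
    """
    return 1.0 / (1.0 + euclidean_distance(a, b))


# Descriptor vectors: (molecular weight in g/mol, number of heavy atoms).
descriptors = {
    "ethanol": (46.07, 3),
    "1-propanol": (60.10, 4),
    "benzene": (78.11, 6),
}

# Rank the other molecules by decreasing similarity to the query.
query = "ethanol"
ranked = sorted(
    (name for name in descriptors if name != query),
    key=lambda name: similarity(descriptors[query], descriptors[name]),
    reverse=True,
)
print(ranked)  # ['1-propanol', 'benzene']
```

Because molecular weight dominates the unscaled distance here, the ranking mirrors the molecular-weight example in the text: ethanol is closer to 1-propanol than to benzene.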
Both hierarchical and non-hierarchical clustering approaches can be applied to chemical entities with multiple attributes. These attributes or molecular properties may be either determined empirically or derived computationally as [[Molecular descriptor|descriptors]]. One of the most popular clustering approaches is the [[Jarvis-Patrick algorithm]].<ref>{{cite journal|last=Butina|first= Darko |year=1999|title= Unsupervised Data Base Clustering Based on Daylight's Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets|journal= J. Chem. Inf. Comput. Sci. |volume=39|issue= 4 |pages=747–750|doi=10.1021/ci9803381}}</ref> In [[pharmacological]]ly oriented chemical repositories, similarity is usually defined in terms of the biological effects of compounds ([[ADME]]/tox), which can in turn be semi-automatically inferred from similar combinations of physico-chemical descriptors using [[QSAR]] methods.
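As a rough illustration of non-hierarchical clustering, the following Python sketch implements a simplified Jarvis-Patrick style procedure. Two items are placed in the same cluster when each appears in the other's list of <code>k</code> nearest neighbours and the two lists share at least <code>k_min</code> members. The function name, the parameter names, and the one-dimensional toy data are hypothetical; real applications would typically use a distance between molecular descriptors or fingerprints.

```python
def jarvis_patrick(items, distance, k, k_min):
    """Simplified Jarvis-Patrick clustering (illustrative sketch).

    items    -- list of objects to cluster
    distance -- function(a, b) returning a non-negative number
    k        -- size of each nearest-neighbour list (item itself excluded)
    k_min    -- minimum number of shared neighbours required to merge
    """
    n = len(items)

    # Build the k-nearest-neighbour list (as a set of indices) for each item.
    neighbours = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: distance(items[i], items[j]),
        )
        neighbours.append(set(others[:k]))

    # Union-find structure to merge items into clusters.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in neighbours[i] and i in neighbours[j]
            shared = len(neighbours[i] & neighbours[j])
            if mutual and shared >= k_min:
                parent[find(i)] = find(j)

    # Collect items by the root of their cluster, in input order.
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(items[i])
    return list(clusters.values())


# Toy one-dimensional "descriptor" values (hypothetical data).
values = [0, 1, 2, 3, 50, 100, 101, 102, 103]
result = jarvis_patrick(values, lambda a, b: abs(a - b), k=3, k_min=2)
print(result)  # [[0, 1, 2, 3], [50], [100, 101, 102, 103]]
```

The isolated value 50 ends up as a singleton because it does not appear in any other item's neighbour list, which is the behaviour that makes shared-neighbour methods relatively robust to outliers.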