CATH database
{{Infobox biodatabase
|title = CATH
|logo = [[file:CATH - Protein Structure Classification Database.png|200px]]
|description = Protein Structure Classification
|scope =
|organism =
|center = [[University College London]]
|laboratory = Institute of Structural and Molecular Biology
|author =
|citation = Dawson et al. (2016) <ref name=Cathv4.1>{{cite journal | vauthors = Dawson NL, Lewis TE, Das S, Lees JG, Lee D, Ashford P, Orengo CA, Sillitoe I | display-authors = 6 | title = CATH: an expanded resource to predict protein function through structure and sequence | journal = Nucleic Acids Research | volume = 45 | issue = D1 | pages = D289–D295 | date = January 2017 | pmid = 27899584 | pmc = 5210570 | doi = 10.1093/nar/gkw1098 }}</ref>
|released = 1997
|standard =
|format =
|url = {{URL|cathdb.info}}
|download = {{URL|cathdb.info/download}}
|webservice =
|sql =
|sparql =
|webapp =
|standalone =
|license =
|versioning =
|frequency = CATH-B is released daily. Official releases are approximately annual.
|curation =
|bookmark =
|version = 4.3
}}
{{Redirect|CATH|other uses|Cath (disambiguation){{!}}Cath}}
[[Image:CATH hierarchy.png|thumb|262px|Schematic representation of the three top levels of the CATH classification scheme.<ref name="Orengo_1997">{{cite journal | vauthors = Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM | title = CATH--a hierarchic classification of protein domain structures | journal = Structure | location = London, England | volume = 5 | issue = 8 | pages = 1093–108 | date = August 1997 | pmid = 9309224 | doi = 10.1016/s0969-2126(97)00260-8 | doi-access = free }}</ref>]]
The '''CATH Protein Structure Classification database''' is a free, publicly available online resource that provides information on the evolutionary relationships of [[protein domain]]s.
It was created in the mid-1990s by Professor [[Christine Orengo]] and colleagues including [[Janet Thornton]] and [[David Tudor Jones|David Jones]],<ref name="Orengo_1997" /> and continues to be developed by the Orengo group at [[University College London]]. CATH shares many broad features with the [[Structural Classification of Proteins|SCOP]] resource, but the detailed classification differs greatly in many areas.<ref>{{cite web|url=http://www.cathdb.info |title=CATH: Protein Structure Classification Database at UCL |website=Cathdb.info |access-date=2017-03-09}}</ref><ref>{{cite web|url=http://www.cathdb.info/wiki/doku/?id=tutorials:index |title=CATH |website=Cathdb.info |access-date=2017-03-09}}</ref><ref>{{cite web|url=https://twitter.com/CATHDatabase |title=CATH Database (@CATHDatabase) |publisher=[[Twitter]] |access-date=2017-03-09}}</ref><ref name= Pearl2003>{{cite journal | vauthors = Pearl FM, Bennett CF, Bray JE, Harrison AP, Martin N, Shepherd A, Sillitoe I, Thornton J, Orengo CA | display-authors = 6 | title = The CATH database: an extended protein family resource for structural and functional genomics | journal = Nucleic Acids Research | volume = 31 | issue = 1 | pages = 452–455 | date = January 2003 | pmid = 12520050 | pmc = 165509 | doi = 10.1093/nar/gkg062 }}</ref>

==Hierarchical organization==
Experimentally determined protein three-dimensional structures are obtained from the [[Protein Data Bank]] and split into their consecutive [[polypeptide chains]], where applicable. Protein domains are identified within these chains using a mixture of automatic methods and manual curation.<ref>{{Cite web |title=CATH |url=http://cathdb.info/wiki/doku/?id=faq |access-date=2024-09-14 |website=cathdb.info |language=en}}</ref>

The domains are then classified within the CATH structural hierarchy. At the Class (C) level, domains are assigned according to their [[Protein secondary structure|secondary structure]] content, i.e. all [[Alpha helix|alpha]], all [[Beta sheet|beta]], a mixture of alpha and beta, or little secondary structure. At the Architecture (A) level, assignment is based on the arrangement of the secondary structure in three-dimensional space. At the Topology/fold (T) level, assignment is based on how the secondary structure elements are connected and arranged. Domains are assigned to the same [[Protein superfamily|Homologous superfamily]] (H) level if there is good evidence that they are related by evolution,<ref name="Orengo_1997" /> i.e. they are homologous.

{| class="wikitable"
|+The four main levels of the CATH hierarchy:
!#
!Level
!Description
|-
|1 || '''C'''lass || the overall secondary-structure content of the domain. (Equivalent to the [[Structural Classification of Proteins database|SCOP]] [[Protein fold class|Class]])
|-
|2 || '''A'''rchitecture || high structural similarity but no evidence of [[homology (biology)|homology]].
|-
|3 || '''T'''opology/fold || a large-scale grouping of topologies which share particular structural features. (Equivalent to the 'fold' level in SCOP)
|-
|4 || '''H'''omologous superfamily || indicative of a demonstrable evolutionary relationship. (Equivalent to SCOP [[protein superfamily|superfamily]])
|}

Additional sequence data for domains with no experimentally determined structures are provided by CATH's sister resource, Gene3D, and are used to populate the homologous superfamilies. Protein sequences from UniProtKB and Ensembl are scanned against CATH HMMs to predict domain sequence boundaries and make homologous superfamily assignments.

==Releases==
The CATH team releases new data both as daily snapshots and as official releases, which appear approximately annually.
The latest release of CATH-Gene3D (v4.3) was published in December 2020 and consists of:<ref>{{Cite web |title=CATH |url=http://cathdb.info/wiki/doku/?id=release_notes |access-date=2024-09-14 |website=cathdb.info}}</ref>
* 500,238 structural protein domain entries
* 151 million non-structural protein domain entries
* 5,481 homologous superfamily entries
* 212,872 functional family entries

==Open-source software==
CATH is an [[open source software]] project whose developers maintain a number of open-source tools,<ref>{{cite web|url=http://www.cathdb.info/wiki/doku/?id=cath_tools|title=Tools|website=cathdb.info|access-date=2016-12-18}}</ref> which are publicly available on [[GitHub]].<ref>{{Citation |title=UCLOrengoGroup/cath-tools |date=2024-09-09 |url=https://github.com/UCLOrengoGroup/cath-tools |access-date=2024-09-14 |publisher=UCLOrengoGroup}}</ref>

== References ==
{{Reflist}}

{{Use dmy dates|date=April 2017}}

[[Category:Protein structure databases]]
[[Category:Protein structure]]
[[Category:Protein folds]]
[[Category:Protein classification]]
[[Category:Protein superfamilies]]
[[Category:University College London]]