Distributed hash table
== Implementations ==
The most notable differences encountered in practical DHT implementations include at least the following:
* The address space is a parameter of the DHT. Several real-world DHTs use a 128-bit or 160-bit key space.
* Some real-world DHTs use hash functions other than [[SHA-1]].
* In the real world, the key {{var serif|1=k}} could be a hash of a file's ''content'' rather than a hash of a file's ''name'' to provide [[content-addressable storage]], so that renaming the file does not prevent users from finding it.
* Some DHTs may also publish objects of different types. For example, key {{var serif|1=k}} could be the node {{var serif|1=ID}} and the associated data could describe how to contact this node. This allows publication of presence information and is often used in instant messaging applications. In the simplest case, {{var serif|1=ID}} is just a random number that is directly used as key {{var serif|1=k}} (so in a 160-bit DHT, {{var serif|1=ID}} will be a 160-bit number, usually randomly chosen). In some DHTs, publishing of nodes' IDs is also used to optimize DHT operations.
* Redundancy can be added to improve reliability. The {{var serif|1=(k, data)}} pair can be stored in more than one node corresponding to the key. Usually, rather than selecting just one node, real-world DHT algorithms select {{var serif|1=i}} suitable nodes, with {{var serif|1=i}} being an implementation-specific parameter of the DHT. In some DHT designs, nodes agree to handle a certain keyspace range, the size of which may be chosen dynamically rather than hard-coded.
* Some advanced DHTs like [[Kademlia]] first perform iterative lookups through the DHT to select a set of suitable nodes, and send {{var serif|1=put(k, data)}} messages only to those nodes. This drastically reduces useless traffic, since published messages are only sent to nodes that seem suitable for storing the key {{var serif|1=k}}, and the iterative lookups cover just a small set of nodes rather than the entire DHT, which reduces useless forwarding. In such DHTs, forwarding of {{var serif|1=put(k, data)}} messages may only occur as part of a self-healing algorithm: if a target node receives a {{var serif|1=put(k, data)}} message, but believes that {{var serif|1=k}} is out of its handled range and a closer node (in terms of DHT keyspace) is known, the message is forwarded to that node. Otherwise, the data are indexed locally. This leads to somewhat self-balancing DHT behavior. Such an algorithm requires nodes to publish their presence data in the DHT so that the iterative lookups can be performed.
* Since on most machines sending messages is much more expensive than local hash table accesses, it makes sense to bundle many messages concerning a particular node into a single batch. Assuming each node has a local batch consisting of at most {{var serif|1=b}} operations, the bundling procedure is as follows. Each node first sorts its local batch by the identifier of the node responsible for the operation. Using [[bucket sort]], this can be done in {{var serif|1=O(b + n)}} time, where {{var serif|1=n}} is the number of nodes in the DHT. When there are multiple operations addressing the same key within one batch, the batch is condensed before being sent out. For example, multiple lookups of the same key can be reduced to one, or multiple increments can be reduced to a single add operation. This reduction can be implemented with the help of a temporary local hash table. Finally, the operations are sent to the respective nodes.<ref>{{Cite book|url=https://www.springer.com/gp/book/9783030252083|title=Sequential and Parallel Algorithms and Data Structures: The Basic Toolbox|last1=Sanders|first1=Peter|last2=Mehlhorn|first2=Kurt|last3=Dietzfelbinger|first3=Martin|last4=Dementiev|first4=Roman|date=2019|publisher=Springer International Publishing|isbn=978-3-030-25208-3|language=en|access-date=2020-01-22|archive-date=2021-08-17|archive-url=https://web.archive.org/web/20210817105142/https://www.springer.com/gp/book/9783030252083|url-status=live}}</ref>
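
The content-based keys and redundancy described in the list above can be illustrated with a minimal Python sketch. It assumes a 160-bit key space with SHA-1 as the hash function, and it assumes that "suitable" nodes are those closest to the key under the XOR metric, as in Kademlia. Other DHTs use different hash functions and different notions of closeness. The function names <code>content_key</code> and <code>closest_nodes</code> are hypothetical and chosen only for this illustration.

```python
import hashlib


def content_key(content: bytes) -> int:
    """Derive a 160-bit key from the content itself, not from the file name.

    Assumption: SHA-1 is the hash function; real DHTs may use others.
    """
    return int.from_bytes(hashlib.sha1(content).digest(), "big")


def closest_nodes(key: int, node_ids: list[int], i: int) -> list[int]:
    """Return the i node IDs closest to the key.

    Assumption: closeness is XOR distance (Kademlia-style); this is one
    possible choice, not the rule used by every DHT.
    """
    return sorted(node_ids, key=lambda node_id: node_id ^ key)[:i]


if __name__ == "__main__":
    data = b"example file contents"
    k = content_key(data)          # the same bytes give the same key under any file name
    nodes = [1, 4, 7, 12]          # toy node IDs; real IDs would be 160-bit numbers
    print(closest_nodes(5, nodes, 2))
```

Because the key depends only on the bytes of the content, renaming the file leaves the key unchanged, and storing the pair on the {{var serif|1=i}} selected nodes provides the redundancy mentioned above.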
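
The bundling procedure from the last item of the list can be sketched as follows. This is a simplified illustration rather than the algorithm as given in the cited book: it assumes only two kinds of operation (<code>"lookup"</code> and <code>"add"</code>), node identifiers numbered 0 to {{var serif|1=n}} − 1, and a caller-supplied function <code>responsible_node</code> that maps a key to its node. All of these names are hypothetical. The sketch also places lookups before adds within a batch, which a real system might not be allowed to do if the relative order of operations on a key matters.

```python
def build_batches(ops, responsible_node, n):
    """Group and condense a local batch of operations per destination node.

    ops: list of (kind, key, value) tuples, kind is "lookup" or "add".
    responsible_node: hypothetical function mapping a key to a node index in 0..n-1.
    n: number of nodes in the DHT.
    """
    # Bucket sort by destination node: O(b + n) for b operations.
    buckets = [[] for _ in range(n)]
    for op in ops:
        buckets[responsible_node(op[1])].append(op)

    batches = {}
    for node, bucket in enumerate(buckets):
        if not bucket:
            continue
        # Temporary local hash tables used to condense the batch.
        lookups = {}  # key -> None; duplicates collapse to one lookup
        adds = {}     # key -> summed increment
        for kind, key, value in bucket:
            if kind == "lookup":
                lookups[key] = None
            elif kind == "add":
                adds[key] = adds.get(key, 0) + value
            else:
                raise ValueError(f"unknown operation kind: {kind!r}")
        batches[node] = (
            [("lookup", key, None) for key in lookups]
            + [("add", key, total) for key, total in adds.items()]
        )
    return batches


if __name__ == "__main__":
    ops = [("add", 4, 1), ("lookup", 2, None), ("add", 4, 2),
           ("lookup", 2, None), ("add", 1, 5)]
    # Toy assignment of keys to 3 nodes; a real DHT would use its keyspace partitioning.
    for node, batch in build_batches(ops, lambda key: key % 3, 3).items():
        print(node, batch)  # each batch would be sent as a single message
```

Each resulting batch can then be sent to its node as one message instead of one message per operation.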