Editing Distributed hash table (section)

=== Keyspace partitioning ===
Most DHTs use some variant of [[consistent hashing]] or [[rendezvous hashing]] to map keys to nodes. The two algorithms appear to have been devised independently and simultaneously to solve the distributed hash table problem.

Both consistent hashing and rendezvous hashing have the essential property that removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs, and leaves all other nodes unaffected. Contrast this with a traditional [[hash table]] in which addition or removal of one bucket causes nearly the entire keyspace to be remapped. Since any change in ownership typically corresponds to bandwidth-intensive movement of objects stored in the DHT from one node to another, minimizing such reorganization is required to efficiently support high rates of [[Churn rate|churn]] (node arrival and failure).

==== Consistent hashing ====
{{further|Consistent hashing}}

Consistent hashing employs a function <math>\delta(k_1, k_2)</math> that defines an abstract notion of the distance between the keys <math>k_1</math> and <math>k_2</math>, which is unrelated to geographical distance or [[network latency]]. Each node is assigned a single key called its ''identifier'' (ID). A node with ID <math>i_x</math> owns all the keys <math>k_m</math> for which <math>i_x</math> is the closest ID, measured according to <math>\delta(k_m, i_x)</math>.

For example, the [[Chord (peer-to-peer)|Chord DHT]] uses consistent hashing, which treats nodes as points on a circle, and <math>\delta(k_1, k_2)</math> is the distance traveling clockwise around the circle from <math>k_1</math> to <math>k_2</math>. Thus, the circular keyspace is split into contiguous segments whose endpoints are the node identifiers. If <math>i_1</math> and <math>i_2</math> are two adjacent IDs, with a shorter clockwise distance from <math>i_1</math> to <math>i_2</math>, then the node with ID <math>i_2</math> owns all the keys that fall between <math>i_1</math> and <math>i_2</math>.

==== Rendezvous hashing ====
{{further|Rendezvous hashing}}

In rendezvous hashing, also called highest random weight (HRW) hashing, all clients use the same hash function <math>h()</math> (chosen ahead of time) to associate a key to one of the ''n'' available servers.
Each client has the same list of identifiers {{math|{{mset|''S''<sub>1</sub>, ''S''<sub>2</sub>, ..., ''S''<sub>''n''</sub> }}}}, one for each server.
Given some key ''k'', a client computes ''n'' hash weights {{math|1=''w''<sub>1</sub> = ''h''(''S''<sub>1</sub>, ''k''), ''w''<sub>2</sub> = ''h''(''S''<sub>2</sub>, ''k''), ..., ''w''<sub>''n''</sub> = ''h''(''S''<sub>''n''</sub>, ''k'')}}.
The client associates that key with the server corresponding to the highest hash weight for that key.
A server with ID <math>S_x</math> owns all the keys <math>k_m</math> for which the hash weight <math>h(S_x, k_m)</math> is higher than the hash weight of any other node for that key.

==== Locality-preserving hashing ====
{{further|Locality-preserving hashing}}

Locality-preserving hashing ensures that similar keys are assigned to similar objects. This can enable a more efficient execution of range queries, however, in contrast to using consistent hashing, there is no more assurance that the keys (and thus the load) is uniformly randomly distributed over the key space and the participating peers. DHT protocols such as Self-Chord and Oscar<ref>{{Cite journal|last1=Girdzijauskas|first1=Šarūnas|last2=Datta|first2=Anwitaman|last3=Aberer|first3=Karl|date=2010-02-01|title=Structured overlay for heterogeneous environments|journal=ACM Transactions on Autonomous and Adaptive Systems|volume=5|issue=1|pages=1–25|doi=10.1145/1671948.1671950|s2cid=13218263|issn=1556-4665|url=http://infoscience.epfl.ch/record/134972|access-date=2020-03-12|archive-date=2020-07-12|archive-url=https://web.archive.org/web/20200712230838/https://infoscience.epfl.ch/record/134972|url-status=live}}</ref> address such issues. Self-Chord decouples object keys from peer IDs and sorts keys along the ring with a statistical approach based on the [[swarm intelligence]] paradigm.<ref>{{cite journal |last1=Forestiero |first1=Agostino |last2=Leonardi |first2=Emilio |last3=Mastroianni |first3=Carlo |last4=Meo |first4=Michela |title=Self-Chord: A Bio-Inspired P2P Framework for Self-Organizing Distributed Systems |journal=IEEE/ACM Transactions on Networking |date=October 2010 |volume=18 |issue=5 |pages=1651–1664 |doi=10.1109/TNET.2010.2046745 |s2cid=14797120 |url=http://porto.polito.it/2370172/ |access-date=2019-07-28 |archive-date=2012-07-01 |archive-url=https://web.archive.org/web/20120701163158/http://porto.polito.it/2370172/ |url-status=live }}</ref> Sorting ensures that similar keys are stored by neighbour nodes and that discovery procedures, including [[Range query (data structures)|range queries]], can be performed in logarithmic time. Oscar constructs a navigable [[small-world network]] based on [[random walk]] sampling also assuring logarithmic search time.