Distributed data store

Revision as of 22:40, 24 May 2025 by imported>GrinningIodize (Replace Freenet with Hyphanet following name change)


A distributed data store is a computer network in which information is stored on more than one node, often in a replicated fashion.<ref>Template:Citation</ref> The term usually refers either to a distributed database, in which users store information on a number of nodes, or to a computer network in which users store information on a number of peer network nodes.<ref name="urlDistributed Data Storage - an overview | ScienceDirect Topics">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

Distributed databases

Distributed databases are usually non-relational databases that enable quick access to data spread over a large number of nodes. Some distributed databases expose rich query capabilities, while others are limited to key-value store semantics. Examples of the latter include Google's Bigtable, which is much more than a distributed file system or a peer-to-peer network,<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> Amazon's Dynamo,<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> and Microsoft Azure Storage.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>
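To make "key-value store semantics" concrete, the following is a minimal sketch of a store that spreads keys over several nodes and keeps a copy of each key on more than one of them. The class name `KeyValueCluster`, the in-memory dictionaries standing in for nodes, and the simple hash-modulo placement rule are all assumptions of this illustration; they are not the design of Bigtable, Dynamo, or Azure Storage.

```python
import hashlib


class KeyValueCluster:
    """Toy key-value store spread over several in-memory 'nodes' (plain dicts)."""

    def __init__(self, node_count: int, replicas: int = 2) -> None:
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas

    def _owners(self, key: str) -> list[int]:
        # Hash the key to pick a primary node, then use the following
        # nodes (wrapping around) as additional replicas.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        primary = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key: str, value: str) -> None:
        # Write the value to every node that owns this key.
        for index in self._owners(key):
            self.nodes[index][key] = value

    def get(self, key: str, failed=frozenset()):
        # Read from the first owner that is not marked as failed.
        for index in self._owners(key):
            if index not in failed:
                return self.nodes[index].get(key)
        raise RuntimeError("all replicas for this key are unavailable")


cluster = KeyValueCluster(node_count=4, replicas=2)
cluster.put("user:1", "alice")
primary = cluster._owners("user:1")[0]
# The value is still readable when the primary owner is treated as down.
value_after_failure = cluster.get("user:1", failed={primary})
```

The only operations are `put` and `get` on a single key, which is what limits such a store compared with one that supports arbitrary queries.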

Because the ability to run arbitrary queries is less important than availability, designers of distributed data stores have increased availability at the expense of consistency. High-speed read/write access comes with reduced consistency because, as the CAP theorem states, it is not possible to guarantee both consistency and availability on a partitioned network.
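The trade-off can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a model of any real system: `TwoReplicaStore`, its `partitioned` flag, and the `prefer_availability` switch are hypothetical names introduced here. During a simulated partition, the store either keeps accepting writes on one replica (so the other replica serves stale data) or refuses writes (so the replicas stay identical but the store is unavailable for writes).

```python
class Replica:
    """One copy of a single value."""

    def __init__(self) -> None:
        self.value = None


class TwoReplicaStore:
    """Two replicas of one value; `partitioned` simulates a network split."""

    def __init__(self, prefer_availability: bool) -> None:
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False
        self.prefer_availability = prefer_availability

    def write(self, value) -> bool:
        """Send a write to replica A; return True if it was accepted."""
        if self.partitioned and not self.prefer_availability:
            # Consistency first: refuse a write that cannot reach replica B.
            return False
        self.a.value = value
        if not self.partitioned:
            # Replication only succeeds while the network is whole.
            self.b.value = value
        return True

    def read_from_b(self):
        """Read whatever replica B currently holds."""
        return self.b.value


available = TwoReplicaStore(prefer_availability=True)
available.write("v1")
available.partitioned = True
available.write("v2")            # accepted, but replica B never sees it
stale = available.read_from_b()  # replica B still holds the old value

consistent = TwoReplicaStore(prefer_availability=False)
consistent.write("v1")
consistent.partitioned = True
accepted = consistent.write("v2")  # refused while partitioned
```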

Peer network node data stores

In peer network data stores, the user can usually reciprocate and allow other users to use their computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network.

Most peer-to-peer networks do not have distributed data stores in that the user's data is only available when their node is on the network. However, this distinction is somewhat blurred in a system such as BitTorrent, where it is possible for the originating node to go offline but the content to continue to be served. Still, this is only the case for individual files requested by the redistributors, as contrasted with networks such as Hyphanet, Winny, Share and Perfect Dark where any node may be storing any part of the files on the network.
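The idea that any node may hold any part of a file can be sketched as follows. This is an illustration only: the functions `scatter` and `gather`, the dictionaries standing in for peers, the random placement, and the use of SHA-256 digests as chunk identifiers are assumptions made for the example, not a description of how Hyphanet, Winny, Share, or Perfect Dark actually place data.

```python
import hashlib
import random


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the data into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def scatter(data: bytes, nodes: list[dict], chunk_size: int = 4, seed: int = 0) -> list[str]:
    """Store each chunk on a randomly chosen node; return the ordered chunk IDs."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    manifest = []
    for chunk in split_into_chunks(data, chunk_size):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        rng.choice(nodes)[chunk_id] = chunk
        manifest.append(chunk_id)
    return manifest


def gather(manifest: list[str], nodes: list[dict]) -> bytes:
    """Reassemble the data by looking up every chunk ID on whichever node holds it."""
    parts = []
    for chunk_id in manifest:
        holder = next((node for node in nodes if chunk_id in node), None)
        if holder is None:
            raise KeyError(f"chunk {chunk_id[:8]} is not available on any node")
        parts.append(holder[chunk_id])
    return b"".join(parts)


peers = [dict() for _ in range(3)]
original = b"distributed data store"
chunk_ids = scatter(original, peers)
restored = gather(chunk_ids, peers)
```

Because the uploader's own node is not needed by `gather`, the content stays retrievable after the originating node leaves, as long as every chunk remains on some peer.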

Distributed data stores typically use an error detection and correction technique. Some (such as Parchive over NNTP) use forward error correction to recover the original file when parts of it are damaged or unavailable. Others retry the download from a different mirror.
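Both approaches can be sketched in a few lines. The example below uses a single XOR parity block, which is a much simpler code than the ones real tools such as Parchive rely on and can repair only one missing block; it is swapped in here purely for illustration. The function names, the dictionaries standing in for mirrors, and the SHA-256 checksum used for error detection are likewise assumptions of this sketch.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(blocks: list, parity: bytes) -> list[bytes]:
    """Rebuild at most one missing block (marked None) from the rest plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one missing block")
    repaired = list(blocks)
    if missing:
        present = [block for block in blocks if block is not None]
        # XOR of all surviving blocks and the parity yields the lost block.
        repaired[missing[0]] = xor_blocks(present + [parity])
    return repaired


def fetch_with_fallback(mirrors: list[dict], name: str, expected_sha256: str) -> bytes:
    """Try each mirror in turn; accept the first copy whose checksum matches."""
    for mirror in mirrors:
        data = mirror.get(name)
        if data is not None and hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    raise LookupError(f"no mirror holds an intact copy of {name!r}")


# Forward error correction: one block is lost and rebuilt locally.
blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(blocks)
repaired = recover_missing([b"abcd", None, b"ijkl"], parity)

# Retry strategy: the first mirror is corrupt, the second lacks the file.
good = b"payload"
checksum = hashlib.sha256(good).hexdigest()
mirrors = [{"file.bin": b"pay1oad"}, {}, {"file.bin": good}]
fetched = fetch_with_fallback(mirrors, "file.bin", checksum)
```

Forward error correction repairs the data without another transfer, whereas the fallback approach needs at least one intact copy to exist elsewhere.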

Examples

Distributed non-relational databases

Product            License      High availability  Notes
-----------------  -----------  -----------------  ----------------------------------
Apache Accumulo    Free
Aerospike          Free
Apache Cassandra   Free         Yes                formerly used by Facebook
Apache Ignite      Free
Bigtable           Proprietary                     used by Google
Couchbase          Free                            used by LinkedIn, PayPal, and eBay
CrateDB            Free         Yes
Apache Druid       Free                            used by Netflix and Yahoo
Dynamo             Proprietary                     used by Amazon
etcd               Free         Yes
Hazelcast          Proprietary
HBase              Free         Yes                formerly used by Facebook
Hypertable         Free                            used by Baidu
MongoDB            Proprietary
MySQL NDB Cluster  Free         Yes                SQL and NoSQL APIs
Riak               Free         Yes
Redis              Free         Yes
ScyllaDB           Free
Voldemort          Free                            used by LinkedIn

Peer network node data stores

See also

References

Template:Reflist
