
== Storage and replication ==

=== Replication ===
PostgreSQL includes built-in binary replication based on shipping the changes ([[write-ahead logging|write-ahead logs]] (WAL)) to replica nodes asynchronously, with the ability to run read-only queries against these replicated nodes. This allows splitting read traffic among multiple nodes efficiently. Earlier replication software that allowed similar read scaling normally relied on adding replication triggers to the master, increasing load.

PostgreSQL includes built-in synchronous replication<ref name="H Online" /> that ensures that, for each write transaction, the master waits until at least one replica node has written the data to its transaction log. Unlike in some other database systems, the durability of a transaction (whether it is committed asynchronously or synchronously) can be specified per database, per user, per session or even per transaction. This is useful for workloads that do not require such guarantees for all data, because waiting for confirmation that a transaction has reached the synchronous standby slows performance.
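
As a minimal sketch of these scopes, the {{code|synchronous_commit}} setting can be changed at each level. The database {{code|app_db}}, the role {{code|reporting}} and the table {{code|audit_log}} are hypothetical names used only for illustration:

<syntaxhighlight lang="sql">
-- Per-database default (app_db is a hypothetical database)
ALTER DATABASE app_db SET synchronous_commit = on;

-- Per-user default (reporting is a hypothetical role)
ALTER ROLE reporting SET synchronous_commit = off;

-- Per-session: applies until the session ends or the setting is changed again
SET synchronous_commit = remote_write;

-- Per-transaction: SET LOCAL reverts automatically at COMMIT or ROLLBACK
BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO audit_log (message) VALUES ('low-value event');  -- hypothetical table
COMMIT;
</syntaxhighlight>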

Standby servers can be synchronous or asynchronous. The configuration lists which standby servers are candidates for synchronous replication. The first server in the list that is actively streaming is used as the current synchronous standby. If it fails, the next server in the list takes over that role.
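
A possible configuration, run on the primary, might look like the following sketch. It assumes PostgreSQL 10 or later (for the {{code|FIRST}} keyword) and two standbys that connect with the hypothetical application names {{code|standby_a}} and {{code|standby_b}}:

<syntaxhighlight lang="sql">
-- Candidate synchronous standbys, in priority order (names are hypothetical)
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (standby_a, standby_b)';

-- Reload the configuration so the new setting takes effect
SELECT pg_reload_conf();

-- Inspect which standby is currently synchronous ('sync'),
-- which are waiting in line ('potential'), and which are 'async'
SELECT application_name, state, sync_state, sync_priority
FROM pg_stat_replication
ORDER BY sync_priority;
</syntaxhighlight>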

Synchronous [[multi-master replication]] is not included in the PostgreSQL core. Postgres-XC, which is based on PostgreSQL, provides scalable synchronous multi-master replication.<ref name="Postgres-XC" /> It is licensed under the same license as PostgreSQL. A related project is called [[Postgres-XL]]. Postgres-R is yet another [[Fork (software development)|fork]].<ref name=postgres-r /> Bidirectional replication (BDR) is an asynchronous multi-master replication system for PostgreSQL.<ref name=bdr />

Tools such as repmgr make managing replication clusters easier.

Several asynchronous trigger-based replication packages are available. These remain useful even after the introduction of the expanded core abilities, for situations where binary replication of a full database cluster is inappropriate:

* [[Slony-I]]
* Londiste, part of SkyTools (developed by [[Skype]])
* Bucardo multi-master replication (developed by [[Backcountry.com]])<ref name="Fischer" />
* [[SymmetricDS]] multi-master, multi-tier replication

=== Indexes ===
PostgreSQL includes built-in support for regular [[B-tree]] and [[hash table]] indexes, and four additional index access methods: generalized search trees ([[GiST]]), generalized [[inverted index]]es (GIN), Space-Partitioned GiST (SP-GiST)<ref name="SP-GiST" /> and [[Block Range Index]]es (BRIN). In addition, user-defined index methods can be created, although this is quite an involved process. Indexes in PostgreSQL also support the following features:

* [[Expression index]]es index the result of an expression or function, instead of simply the value of a column.
* [[Partial index]]es, which only index part of a table, can be created by adding a WHERE clause to the end of the CREATE INDEX statement. This allows a smaller index to be created.
* The planner is able to use multiple indexes together to satisfy complex queries, using temporary in-memory [[bitmap index]] operations (useful for [[data warehouse]] applications for joining a large [[fact table]] to smaller [[dimension table]]s such as those arranged in a [[star schema]]).
* [[k-nearest neighbors algorithm|''k''-nearest neighbors]] (''k''-NN) indexing (also referred to as KNN-GiST<ref name="KNN-GiST" />) provides efficient searching of "closest values" to that specified, useful for finding similar words, or close objects or locations with [[Geographic data and information|geospatial]] data. This is achieved without exhaustive matching of values.
* Index-only scans often allow the system to fetch data from indexes without ever having to access the main table.
* [[Block Range Index]]es (BRIN).
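
The following sketch illustrates several of these index features. The {{code|orders}} table and all index names are hypothetical:

<syntaxhighlight lang="sql">
-- Hypothetical table used only for illustration
CREATE TABLE orders (
    id          bigserial PRIMARY KEY,
    email       text NOT NULL,
    status      text NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    location    point
);

-- Expression index: indexes lower(email) rather than the raw column value
CREATE INDEX orders_email_lower_idx ON orders (lower(email));

-- Partial index: only rows matching the WHERE clause are indexed
CREATE INDEX orders_pending_idx ON orders (created_at)
    WHERE status = 'pending';

-- GiST index that can serve k-nearest-neighbor searches
CREATE INDEX orders_location_idx ON orders USING gist (location);

-- The five orders closest to the origin, ordered by the distance operator <->
SELECT id
FROM orders
ORDER BY location <-> point '(0,0)'
LIMIT 5;

-- BRIN index: a compact summary per range of table blocks
CREATE INDEX orders_created_brin ON orders USING brin (created_at);
</syntaxhighlight>

Whether the planner actually uses a given index depends on table size and statistics; {{code|EXPLAIN}} shows the chosen plan.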

=== Schemas ===
PostgreSQL schemas are [[namespace]]s, allowing objects of the same kind and name to co-exist in a single database.
They are not to be confused with a [[database schema]], the abstract, structural, organizational specification which defines how every table's data relates to data within other tables.
All PostgreSQL database objects, except for a few global objects such as [[#Security|roles]] and [[tablespace]]s, exist within a schema.
Schemas cannot be nested: a schema cannot contain another schema.
The permission system controls access to schemas and their content.
By default, newly created databases have only a single schema called ''public'', but other schemas can be added, and the public schema is not mandatory.

A {{code|search_path}} setting determines the order in which PostgreSQL checks schemas for unqualified objects (those without a prefixed schema). By default, it is set to {{code|"$user", public}} ({{code|$user}} refers to the currently connected database user). This default can be set on a database or role level, but as it is a session parameter, it can be freely changed (even multiple times) during a client session, affecting that session only.

Non-existent schemas, or other schemas not accessible to the logged-in user, that are listed in search_path are silently skipped during object lookup.

New objects are created in whichever valid schema (one that can be accessed) appears first in the search_path.
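
The lookup and creation rules described above can be sketched as follows. The schema {{code|sales}}, the tables, and the role {{code|app_user}} are hypothetical:

<syntaxhighlight lang="sql">
-- Hypothetical schema and table
CREATE SCHEMA sales;
CREATE TABLE sales.invoices (id integer PRIMARY KEY);

-- Show the search path in effect for this session
SHOW search_path;

-- Change it for the current session only
SET search_path TO sales, public;

-- The unqualified name is now looked up in sales first, then in public
SELECT count(*) FROM invoices;

-- New unqualified objects go into the first valid schema in the path (sales)
CREATE TABLE receipts (id integer PRIMARY KEY);

-- Change the default for future sessions of a (hypothetical) role
ALTER ROLE app_user SET search_path = sales, public;
</syntaxhighlight>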

=== Data types ===
A wide variety of native [[data type]]s are supported, including:

* Boolean
* [[Arbitrary-precision arithmetic|Arbitrary-precision]] numerics
* Character (text, varchar, char)
* Binary
* Date/time (timestamp/time with/without time zone, date, interval)
* Money
* Enum
* Bit strings
* Text search type
* Composite
* HStore, an extension-enabled key–value store within PostgreSQL<ref>{{Cite web |url=https://www.linuxjournal.com/content/postgresql-nosql-database |title=PostgreSQL, the NoSQL Database &#124; Linux Journal |website=www.linuxjournal.com}}</ref>
* Arrays ([[dynamic array|variable-length]] and can be of any data type, including text and composite types) up to 1&nbsp;GB in total storage size
* Geometric primitives
* [[IPv4]] and [[IPv6]] addresses
* [[Classless Inter-Domain Routing]] (CIDR) blocks and [[MAC address]]es
* [[XML]] supporting [[XPath]] queries
* [[Universally unique identifier]] (UUID)
* JavaScript Object Notation ([[JSON]]), and a faster [[binary code|binary]] JSONB (not the same as [[BSON]]<ref name="jsonb" />)

In addition, users can create their own data types which can usually be made fully indexable via PostgreSQL's indexing infrastructures{{snd}} GiST, GIN, SP-GiST. Examples of these include the [[geographic information system]] (GIS) data types from the [[PostGIS]] project for PostgreSQL.

There is also a data type called a ''domain'', which is the same as any other data type but with optional constraints defined by the creator of that domain. This means any data entered into a column using the domain will have to conform to whichever constraints were defined as part of the domain.
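
For instance, a domain that restricts integers to positive values might be defined as in this sketch (the names {{code|positive_int}} and {{code|line_items}} are hypothetical):

<syntaxhighlight lang="sql">
-- A domain over integer that only admits values greater than zero
CREATE DOMAIN positive_int AS integer
    CHECK (VALUE > 0);

-- Hypothetical table with a column of the domain type
CREATE TABLE line_items (
    quantity positive_int NOT NULL
);

INSERT INTO line_items VALUES (3);  -- accepted
INSERT INTO line_items VALUES (0);  -- rejected: violates the domain's check constraint
</syntaxhighlight>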

Range types represent a range of values of some underlying data type. These can be discrete ranges (e.g., all integer values 1 to 10) or continuous ranges (e.g., any time between {{nowrap|10:00 am}} and {{nowrap|11:00 am}}). The built-in range types available include ranges of integers, big integers, decimal numbers, timestamps (with and without time zone) and dates.

Custom range types can be created to make new types of ranges available, such as IP address ranges using the inet type as a base, or float ranges using the float data type as a base. Range types support inclusive and exclusive range boundaries using the {{kbd|[]}} and {{kbd|()}} characters, respectively; for example, {{code|[4,9)}} represents all integers from 4 up to but not including 9. Range types are also compatible with existing operators used to check for overlap, containment, whether one range lies to the right of another, and so on.
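
A brief sketch of the built-in {{code|int4range}} type, followed by a custom range type over {{code|float8}}. The type name {{code|floatrange}} is chosen here only for illustration:

<syntaxhighlight lang="sql">
-- Containment: [4,9) contains 8 but not 9
SELECT int4range(4, 9) @> 8;   -- true
SELECT int4range(4, 9) @> 9;   -- false (upper bound is exclusive)

-- Overlap: [4,9) and [8,12) share the value 8
SELECT int4range(4, 9) && int4range(8, 12);   -- true

-- Strictly left of
SELECT int4range(1, 3) << int4range(5, 9);    -- true

-- Ranges can also be written as literals with explicit bounds
SELECT '[4,9)'::int4range;

-- A custom range type over double precision values
CREATE TYPE floatrange AS RANGE (
    subtype = float8,
    subtype_diff = float8mi
);

SELECT '[1.5, 2.5]'::floatrange @> 2.0::float8;  -- true
</syntaxhighlight>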

=== User-defined objects ===
New types of almost all objects inside the database can be created, including:

* Casts
* Conversions
* Data types
* [[Data domain]]s
* Functions, including aggregate functions and window functions
* Indexes including custom indexes for custom types
* Operators (existing ones can be [[Operator overloading|overloaded]])
* Procedural languages
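
As one small illustration, a user-defined function can back a new operator. The function {{code|approx_eq}}, the operator name {{code|~{{=}}{{=}}}} and the tolerance are arbitrary choices made for this sketch, which assumes PostgreSQL 11 or later (earlier versions spell the {{code|FUNCTION}} clause as {{code|PROCEDURE}}):

<syntaxhighlight lang="sql">
-- Hypothetical helper: approximate equality for floating-point values
CREATE FUNCTION approx_eq(a double precision, b double precision)
RETURNS boolean
LANGUAGE sql
IMMUTABLE
AS $$ SELECT abs(a - b) < 1e-9 $$;

-- Hypothetical operator built on that function
CREATE OPERATOR ~== (
    LEFTARG  = double precision,
    RIGHTARG = double precision,
    FUNCTION = approx_eq
);

-- Use the new operator like any built-in one
SELECT (0.1::float8 + 0.2::float8) ~== 0.3::float8;
</syntaxhighlight>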

=== Inheritance ===
Tables can be set to inherit their characteristics from a ''parent'' table. Data in child tables will appear to exist in the parent tables, unless data is selected from the parent table using the ONLY keyword, i.e. {{code|lang="sql" | SELECT * FROM ONLY parent_table;}}. Adding a column in the parent table will cause that column to appear in the child table.

Inheritance can be used to implement table partitioning, using either triggers or rules to redirect rows inserted into the parent table to the proper child tables.

Inheritance has limitations regarding constraints. All check constraints and not-null constraints on a parent table are automatically inherited by its children. Other types of constraints (unique, primary key, and foreign key constraints) are not inherited.
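
The behavior described in this section can be sketched with two hypothetical tables; the rows and figures are purely illustrative:

<syntaxhighlight lang="sql">
-- Parent table with a primary key and a check constraint
CREATE TABLE cities (
    name        text PRIMARY KEY,
    population  integer CHECK (population >= 0)
);

-- Child table: inherits name and population, adds country
CREATE TABLE capitals (
    country text NOT NULL
) INHERITS (cities);

INSERT INTO cities   VALUES ('Lyon', 500000);
INSERT INTO capitals VALUES ('Paris', 2100000, 'France');

SELECT name FROM cities;        -- rows from cities and from capitals
SELECT name FROM ONLY cities;   -- rows stored in cities itself

-- A column added to the parent also appears in the child
ALTER TABLE cities ADD COLUMN elevation integer;

-- The check constraint is inherited, so this insert is rejected
INSERT INTO capitals (name, population, country)
VALUES ('Nowhere', -1, 'Nowhere');

-- The primary key is not inherited, so this duplicate name is accepted
INSERT INTO capitals (name, population, country)
VALUES ('Paris', 2100000, 'France');
</syntaxhighlight>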

Inheritance provides a way to map the features of generalization hierarchies depicted in [[entity–relationship model|entity–relationship diagrams]] (ERDs) directly into the PostgreSQL database.

=== Other storage features ===
* [[Referential integrity]] constraints including [[foreign key]] constraints, column [[Constraint satisfaction|constraints]], and row checks
* Binary and textual large-object storage
* [[Tablespace]]s
* Per-column collation
* Online backup
* Point-in-time recovery, implemented using write-ahead logging
* In-place upgrades with pg_upgrade for less downtime<!-- (supports upgrades from 8.3.x<ref>{{Cite web|title=PostgreSQL: Documentation: 9.0: pg_upgrade|url=https://www.postgresql.org/docs/9.0/pgupgrade.html|access-date=2020-06-09|website=www.postgresql.org}}</ref> and later) -->