Backup and Restore in PostgreSQL With Examples

Backup and Restore

PostgreSQL provides strong backup and restore features that are necessary for disaster recovery, data persistence, and system upgrades. There are three main methods for backing up PostgreSQL data: SQL Dump, File System Level Backup, and Continuous Archiving with Point-in-Time Recovery (PITR). Each has its own advantages and disadvantages.

SQL Dump (Logical Backup)

An SQL dump is a file of SQL commands that, when replayed, recreates the database in the state it was in at the time of the dump. PostgreSQL offers two primary tools for this:

pg_dump: pg_dump creates a logical backup of a single database. It is a regular PostgreSQL client application, so it can take backups remotely from any computer that has access to the database. It has no special permissions: it needs read access to every table it dumps, so backing up an entire database almost always means running it as a database superuser. By default, pg_dump outputs a SQL script that recreates the database as it was when the dump was taken.

Formats: Besides plain SQL text, pg_dump can write custom (-Fc), directory (-Fd), and tar (-Ft) archives. The custom format is advised since it is compressed by default and provides more functionality than raw SQL, enabling selective restores and parallel processing with pg_restore.
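As a minimal sketch of the custom format in practice, the wrapper below dumps one database with -Fc. The function name dump_custom is hypothetical, as are the database and file names in the usage note; only the pg_dump flags (-Fc, -f) come from PostgreSQL itself.

```shell
# Sketch: dump a single database in pg_dump's custom format.
# dump_custom is a hypothetical helper name, not part of PostgreSQL.
# $1 = database name, $2 = output file
dump_custom() {
    # -Fc selects the custom archive format; -f names the output file
    pg_dump -Fc -f "$2" "$1"
}
```

For example, dump_custom mydb mydb.dump would write an archive that pg_restore can read. Connection details are taken from the usual defaults or from environment variables such as PGHOST and PGUSER.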

Advantages

Version Compatibility: An SQL dump is the only backup method that works when moving a database between different machine architectures (e.g., 32-bit to 64-bit), and its output can generally be reloaded into more recent PostgreSQL versions.

Remote backup: Because pg_dump and pg_dumpall are ordinary client programs, they can run from any remote host that has network access to the server and suitable database privileges. Use the -h and -p options to give the target server’s hostname or IP address and port. Standard client authentication applies.

Consistency & Concurrency: Dumps are internally consistent, representing a snapshot of the database taken when pg_dump started, and pg_dump generally does not block other database operations while it runs.

Selectivity: Dump only specific schemas (-n) or tables (-t), or exclude them (-N, -T). It is also possible to dump data only (-a) or schema only (-s).

Options: Options include omitting object ownership (-O) and privileges (-x), including CREATE DATABASE commands (-C), and including large objects (-b). In releases before PostgreSQL 12, -o preserved OIDs; the option was removed in version 12.
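The remote and selective options can be combined in one call. The sketch below dumps only one table’s definition from a remote server, without ownership or privileges. The helper name dump_table_schema and the host name in the usage note are assumptions made for illustration.

```shell
# Sketch: dump only one table's definition from a remote server.
# dump_table_schema is a hypothetical helper name.
# $1 = host, $2 = port, $3 = database, $4 = table
dump_table_schema() {
    # -h/-p pick the remote server, -t limits the dump to one table,
    # -s dumps the schema only, -O and -x omit ownership and privileges
    pg_dump -h "$1" -p "$2" -t "$4" -s -O -x "$3"
}
```

A call such as dump_table_schema db.example.com 5432 mydb employees > employees_schema.sql would write plain SQL to the redirected file, since no -f option is given.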

pg_dumpall: A specialised PostgreSQL client application, pg_dumpall, performs a system-wide logical backup of a database cluster. Instead of focussing on one database, pg_dumpall collects all databases in the cluster and important global objects like role definitions (users), tablespace definitions, and privileges.

Superuser Access: pg_dumpall reads every database in the cluster, so it usually has to be run as a superuser, and restoring its output needs superuser rights to recreate roles and tablespaces. Superusers bypass all permission checks except the right to log in. When a cluster is initialised, initdb creates a default superuser role named after the operating system user who ran it, conventionally postgres.

Selectivity: pg_dumpall can restrict its output to global objects (-g), roles (-r), or tablespaces (-t). This is a different sense of the word from the query planner’s selectivity, which is the predicted fraction of rows that meet a WHERE or JOIN condition.

Configuration Files: PostgreSQL’s behaviour, security, and operation are controlled by plain-text configuration files such as postgresql.conf, pg_hba.conf, and pg_ident.conf, which by default live in the cluster’s data directory. postgresql.conf is formatted as key-value pairs, and lines can be commented with a hash symbol (#). SQL dumps do not include these files, so back them up separately.
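A cluster-wide dump might look like the following sketch, which writes one full dump and one file containing only the global objects. The helper name dump_cluster and the file names in the usage note are hypothetical.

```shell
# Sketch: back up a whole cluster, and separately just its global objects.
# dump_cluster is a hypothetical helper name.
# $1 = output file for the full cluster dump, $2 = output file for globals
dump_cluster() {
    # First call: every database plus roles and tablespaces, as plain SQL.
    # Second call: --globals-only writes roles and tablespaces without any
    # database contents.
    pg_dumpall -f "$1" &&
    pg_dumpall --globals-only -f "$2"
}
```

Running dump_cluster cluster.sql globals.sql as a superuser would produce two plain SQL scripts. Such output is restored with psql, for example psql -f cluster.sql postgres.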

Restoring SQL Dumps

Plain Text SQL: Plain-text dumps are restored with psql. The target database must be created manually first, ideally from template0 so that it contains nothing beyond what the dump will recreate. For example, run createdb -T template0 dbname followed by psql dbname < dumpfile. Use -1 (--single-transaction) to wrap the restoration in one transaction so that everything is rolled back if an error occurs.

Custom/TAR/Directory Formats: Custom, directory, and TAR dumps are restored with pg_restore. It is flexible: it can restore objects selectively, and in parallel with the -j (--jobs) option for custom and directory dumps. When -C (--create) is used to create the database during restoration, the database named with -d is only used for the initial connection, so postgres is frequently supplied.

Efficient Restoration: For large databases, restoring is faster if you disable autocommit, create indexes and foreign key constraints after the data is loaded rather than before, increase maintenance_work_mem and max_wal_size (checkpoint_segments in versions before 9.5), and temporarily disable WAL archiving and streaming replication. Always run ANALYZE afterwards so that the query planner has up-to-date statistics.
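The restore paths described above can be sketched as three small helpers. All three function names are hypothetical, and the database and file names in the usage note are placeholders; the commands and flags are the ones named in the text.

```shell
# Sketch: restore a plain-text dump and a custom-format dump.
# All helper names here are hypothetical.

# $1 = new database name, $2 = plain SQL dump file
restore_plain() {
    # Create an empty database from template0, then replay the dump in a
    # single transaction (-1) so that a failure rolls everything back
    createdb -T template0 "$1" &&
    psql -1 -d "$1" -f "$2"
}

# $1 = custom-format dump file, $2 = number of parallel jobs
restore_custom() {
    # -C creates the database named in the dump; -d postgres is only the
    # database used for the initial connection; -j restores in parallel
    pg_restore -C -d postgres -j "$2" "$1"
}

# $1 = database whose planner statistics should be refreshed
analyze_db() {
    psql -d "$1" -c "ANALYZE"
}
```

A typical sequence might be restore_custom mydb.dump 4 followed by analyze_db mydb. The -f form of psql is used in restore_plain because -1 only takes effect together with -f or -c.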

Code Example:

-- Create a sample table and insert data
CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT NOT NULL
);

INSERT INTO employees (name, department) VALUES
('Alice', 'HR'),
('Bob', 'Finance'),
('Charlie', 'IT');

-- Verify the inserted rows
SELECT * FROM employees;

Output:

CREATE TABLE
INSERT 0 3
 id |  name   | department 
----+---------+------------
  1 | Alice   | HR
  2 | Bob     | Finance
  3 | Charlie | IT
(3 rows)
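To connect the sample table to the dump tools, the sketch below copies the employees table from one database into another by piping pg_dump straight into psql, with no intermediate file. The helper name copy_table and the database names in the usage note are assumptions; the target database is assumed to exist already and not to contain an employees table.

```shell
# Sketch: copy one table between databases through a pipe.
# copy_table is a hypothetical helper name.
# $1 = source database, $2 = target database, $3 = table name
copy_table() {
    # pg_dump writes plain SQL for the table to stdout; psql replays it
    # in the target database
    pg_dump -t "$3" "$1" | psql -d "$2"
}
```

For instance, copy_table hr_db hr_copy employees would recreate the table and its three rows in hr_copy.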

File System Level Backup

Using this method, the files used by PostgreSQL to hold data are directly copied.

Procedure: The complete data directory can be copied using programs like tar or cpio.

Restrictions: To get a usable backup, the database server must be shut down. The method only works for complete backup and restoration of an entire database cluster, not for individual tables or databases, because the data files cannot be used without the commit log files, which hold the commit status of all transactions. The backup needs to include the WAL files as well.
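A cold backup along these lines could be scripted as in the sketch below, which stops the server, archives the data directory with tar, and starts the server again. The helper name cold_backup and the paths in the usage note are hypothetical, and tablespaces stored outside the data directory are assumed not to exist.

```shell
# Sketch: offline (cold) file system backup of a data directory.
# cold_backup is a hypothetical helper name.
# $1 = data directory, $2 = tar archive to create
cold_backup() {
    # Stop the server so that the files on disk are consistent
    pg_ctl stop -D "$1" -m fast || return 1
    # Archive the whole data directory, including pg_wal
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
    status=$?
    # Restart the server whether or not tar succeeded
    pg_ctl start -D "$1"
    return "$status"
}
```

It might be called as cold_backup /var/lib/pgsql/data /backup/pgdata.tar.gz, run as the operating system user that owns the data directory.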

Alternatives:

Consistent Snapshot: If the file system supports it, a “frozen snapshot” of the data directory can be taken while the server is running. When a server is started on the restored snapshot, it treats the situation as a crash and replays WAL. If the database is spread across several file systems, take snapshots of all of them simultaneously.

Rsync: To limit downtime, run rsync while the server is running, then shut the server down and run rsync --checksum again to pick up last-minute changes.
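The two-pass rsync approach might be sketched as follows. The helper name rsync_backup and the paths in the usage note are hypothetical; the second pass relies on rsync’s --checksum option to catch files that changed without a visible timestamp difference.

```shell
# Sketch: two-pass rsync backup with a short shutdown window.
# rsync_backup is a hypothetical helper name.
# $1 = data directory, $2 = backup destination directory
rsync_backup() {
    # Pass 1: bulk copy while the server is still running. The result is
    # not consistent, and the exit status is ignored because files may
    # change or vanish under a live server.
    rsync -a "$1/" "$2/"
    # Pass 2: stop the server briefly and copy only what differs
    pg_ctl stop -D "$1" -m fast || return 1
    rsync -a --checksum "$1/" "$2/"
    status=$?
    # Bring the server back regardless of the rsync result
    pg_ctl start -D "$1"
    return "$status"
}
```

For example, rsync_backup /var/lib/pgsql/data /backup/data keeps the server offline only for the duration of the second pass.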

Performance: File system backups are typically larger than SQL dumps, because a dump stores only the command that recreates an index rather than the index contents, but they may be faster to take.

Continuous Archiving and Point-in-Time Recovery (PITR)

PITR combines a file-system-level base backup and WAL archiving for PostgreSQL. Because of its excellent reliability, this method is recommended.

WAL: PostgreSQL records every modification made to its data files in a write-ahead log (WAL). The log exists primarily for crash safety (after a crash, changes can be redone by replaying the log entries), and it is also what makes PITR possible.

Benefits:

No Perfectly Consistent Base Backup Needed: The base backup does not have to be perfectly consistent, because WAL replay corrects any internal inconsistencies in it.

Continuous Backup: Once a base backup exists, backup continues simply by archiving WAL files as they are produced. This is especially useful for large databases, where taking frequent full backups is impractical.

Point-in-Time Recovery: WAL replay does not have to run to the end of the archive. Stopping it at a chosen point restores the database to its state at any time since the base backup was taken, not simply to the time of the last full backup.

Warm/Hot Standby: Warm standby and hot standby use Write-Ahead Log (WAL) shipping for high availability (HA) and disaster recovery. A standby server continuously replays WAL records from the primary to stay current. A warm standby cannot be queried until it is promoted, whereas a hot standby also accepts read-only queries. Because the standby holds almost all of the primary’s data, it can be promoted quickly to take over as the primary.

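Setting Up WAL Archiving: In postgresql.conf, set wal_level to replica or higher, set archive_mode to on, and provide either a shell command in archive_command or an archive library in archive_library. In archive_command, %p is replaced by the path of the file to archive and %f by its file name. The command must return a zero exit status only when it succeeds, and it must not overwrite an existing archive file.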
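An archive command has to exit with zero only on success and should refuse to overwrite a file that is already archived. The function below is a minimal sketch of that contract, meant as the core of a script that archive_command could call. The names archive_wal and ARCHIVE_DIR, the script path in the comment, and the default directory are all assumptions for illustration.

```shell
# Sketch: core of a WAL archiving script.
# postgresql.conf settings this sketch assumes:
#   wal_level = replica
#   archive_mode = on
#   archive_command = '/path/to/archive_wal.sh %p %f'
# archive_wal, ARCHIVE_DIR, and the paths are hypothetical.
ARCHIVE_DIR="${ARCHIVE_DIR:-/mnt/server/archivedir}"

# $1 = path of the WAL file to archive (%p), $2 = its file name (%f)
archive_wal() {
    # Refuse to overwrite an existing archive file: non-zero exit
    if [ -f "$ARCHIVE_DIR/$2" ]; then
        return 1
    fi
    # cp's exit status becomes the result: zero only on success
    cp "$1" "$ARCHIVE_DIR/$2"
}
```

A wrapper script would end with archive_wal "$1" "$2" so that the function’s exit status is what PostgreSQL sees. A non-zero status makes the server keep the segment and retry later.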

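Making a Base Backup: Using pg_basebackup is easiest. It can write the backup as regular files or as tar archives, and it can include the WAL files needed to make the backup usable. Several backups can run concurrently. Alternatively, use the low-level API: call pg_start_backup() (pg_backup_start() from PostgreSQL 15), copy the data directory with a tool such as tar, then run SELECT pg_stop_backup() (pg_backup_stop() from PostgreSQL 15). The copy can leave out the files in pg_wal/ (easy to arrange if pg_wal/ is a symlink to a location outside the data directory), postmaster.pid, postmaster.opts, the contents of pg_replslot/, pg_dynshmem/, pg_notify/, pg_serial/, pg_snapshots/, pg_stat_tmp/, and pg_subtrans/, and any files or directories whose names begin with pgsql_tmp.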
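A pg_basebackup call in tar format could look like the sketch below. The helper name take_base_backup and the target directory in the usage note are hypothetical, and the connecting role is assumed to have replication privileges.

```shell
# Sketch: take a compressed base backup with pg_basebackup.
# take_base_backup is a hypothetical helper name.
# $1 = directory to write the base backup into
take_base_backup() {
    # -Ft writes tar archives, -z gzips them, -X stream includes the WAL
    # generated while the backup runs, -P reports progress
    pg_basebackup -D "$1" -Ft -z -X stream -P
}
```

For example, take_base_backup /backup/base would leave the compressed data directory and WAL archives in /backup/base.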

Recovering with Continuous Archive:

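Stop the server if it is running.
If space allows, copy the cluster data directory and any tablespaces to a temporary location in case they are needed later.
Remove all existing files and subdirectories under the data directory and the tablespace root directories.
Restore the database files from the base backup, verifying ownership and permissions.
Remove any files in pg_wal/ that came from the base backup, since they are probably obsolete. If pg_wal/ was not part of the backup, recreate it with the proper permissions.
Copy any saved unarchived WAL segments to pg_wal/.
Set the recovery options: restore_command and, if needed, recovery_target_* parameters such as recovery_target_name, recovery_target_time, and recovery_target_timeline. In PostgreSQL 12 and later, these go in postgresql.conf and a recovery.signal file must exist in the data directory. Earlier versions read them from a separate recovery.conf file.
Start the server. It enters recovery mode and replays the WAL files it needs.
Verify the database contents and, once satisfied, allow users to connect.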
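The recovery configuration step can be sketched for PostgreSQL 12 or later as follows. The helper name configure_recovery, the archive path, and the timestamp in the usage note are hypothetical. The restore_command shown assumes the archive is a plain directory readable with cp.

```shell
# Sketch: write recovery settings for PostgreSQL 12 or later.
# configure_recovery is a hypothetical helper name.
# $1 = data directory, $2 = WAL archive directory,
# $3 = recovery target time (optional)
configure_recovery() {
    # restore_command fetches archived segments: %f is the file name,
    # %p is the path the server wants it copied to
    echo "restore_command = 'cp $2/%f %p'" >> "$1/postgresql.conf"
    # Without a target, recovery replays to the end of the archive
    if [ -n "${3:-}" ]; then
        echo "recovery_target_time = '$3'" >> "$1/postgresql.conf"
    fi
    # An empty recovery.signal makes the server start in recovery mode
    touch "$1/recovery.signal"
}
```

For example, configure_recovery /var/lib/pgsql/data /mnt/server/archivedir "2024-01-01 12:00:00" would prepare a recovery that stops at the given time. The server is then started as usual.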

Timelines: Whenever an archive recovery completes, a new timeline is created to distinguish the WAL records generated after the recovery from those of the original history. This prevents the new WAL from overwriting archived files and permits trial-and-error recovery.

DBeaver Features for Backup and Restore

DBeaver’s graphical backup and restore interface uses PostgreSQL’s native tools pg_dump, pg_dumpall, and pg_restore.

Backup: DBeaver supports both single-database and global (cluster-wide) PostgreSQL backups. It lets you choose the objects to dump and define the format (custom, plain, or tar), compression, and encoding.

Restore: DBeaver can restore a PostgreSQL database or cluster from a backup, which is essential for disaster recovery, data migration, and testing. The format of the original backup determines how it is restored.

Shell Commands: DBeaver can directly perform shell commands for backup and restore. Automatic pg_dump operations can execute before disconnecting from a database.

Tasks: Tasks are saved, reusable configurations of database operations that automate routine work and can be run with a single click. A Database Tasks view lets users browse, organise, run, edit, and delete these tasks.

In summary, PostgreSQL offers logical SQL dumps for flexibility and upgrades, physical file system backups for speed, and continuous archiving with PITR for dependability and granular recovery points. The choice of technique depends on recovery time, recovery point, and operational needs.
