Overview

For database security purposes, openGauss provides three backup and restoration types, multiple backup and restoration solutions, and data reliability assurance mechanisms during backup and restoration.

Backup and restoration can be classified into logical backup and restoration, physical backup and restoration, and flashback.

  • Logical backup and restoration: backs up data by exporting it logically. The export captures the data as of a single point in time, and data can be restored only to that point. Changes made between the last backup and a failure are not captured. This method suits data that rarely changes; if such data is damaged by a misoperation, it can be restored quickly from a logical backup. To restore an entire database from a logical backup, rebuild the database and import the backup data. Because a full restoration takes a long time, logical backup is not recommended for databases that require high data availability. Because it can be performed on any platform, logical backup is a major approach to migrating and transferring data.

  • Physical backup and restoration: backs up a database by copying its physical files, in units of disk blocks, from the primary node to the standby node. The database can be restored from backup files such as data files and archive log files. Physical backup is useful when the complete database must be backed up and restored within a short time. With proper planning, backup and restoration can be implemented at low cost.

  • Flashback: restores dropped tables from the recycle bin. As in Windows, information about dropped tables is kept in the recycle bin of the database. Flashback also uses the MVCC mechanism to restore data to a specified point in time or change sequence number (CSN).
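As a rough illustration of the logical backup path, the Python sketch below assembles gs_dump, gs_restore, and gsql command lines without executing them. The port 8000, the database names, and the file names are placeholders, and both helper functions are hypothetical. The flags used (-p, -F, -f, -d) should be checked against the tool reference for your openGauss version.

```python
import shlex


def gs_dump_command(dbname, outfile, port, archive=True):
    """Build (but do not run) a gs_dump command line.

    archive=True requests the custom archive format (-F c), which is
    restored with gs_restore. archive=False requests plain text (-F p),
    which is restored with gsql.
    """
    fmt = "c" if archive else "p"
    return ["gs_dump", "-p", str(port), dbname, "-F", fmt, "-f", outfile]


def restore_command(dbname, infile, port, archive=True):
    """Build the restore command that matches the dump format."""
    if archive:
        return ["gs_restore", "-p", str(port), "-d", dbname, infile]
    return ["gsql", "-p", str(port), "-d", dbname, "-f", infile]


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    dump = gs_dump_command("postgres", "backup/postgres.dmp", 8000)
    restore = restore_command("backupdb", "backup/postgres.dmp", 8000)
    print(shlex.join(dump))
    print(shlex.join(restore))
```

Keeping the format choice in one argument makes the pairing explicit: an archive dump goes to gs_restore, and a plain-text dump goes to gsql.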

    The three backup and restoration types supported by openGauss are compared in Table 1. The method for restoring data after an exception differs from one type to another.

    Table 1 Comparison of three backup and restoration types

    Backup type: Logical backup and restoration

    Application scenario: A small volume of data needs to be processed. You can back up a single table, multiple tables, a single database, or all databases. The backup data must be restored using gsql or gs_restore. When the data volume is large, restoration takes a long time.

    Media: Disk: SSD

    Tool name: gs_dump

    • Recovery time: Restoring data in plain-text format takes a long time. Restoring data in archive format takes a moderate time.
    • Advantage and disadvantage: This tool exports database information. You can export a database or its objects (such as schemas, tables, and views). The database can be the default postgres database or a user-specified database. The exported file can be in plain-text format or archive format. Data in plain-text format can be restored only by using gsql, which takes a long time. Data in archive format can be restored only by using gs_restore, which takes less time than restoring plain-text data.

    Tool name: gs_dumpall

    • Recovery time: Data recovery takes a long time.
    • Advantage and disadvantage: This tool exports all information of the openGauss database, including the data of the default postgres database, the data of user-specified databases, and the global objects of all openGauss databases. Only plain-text format can be exported. The exported data can be restored only by using gsql, which takes a long time.

    Backup type: Physical backup and restoration

    Application scenario: A huge volume of data needs to be processed. Physical backup is mainly used for full backup and restoration, and for backing up all WAL archive logs and run logs in the database.

    Tool name: gs_backup

    • Recovery time: The data volume is small and data recovery is fast.
    • Advantage and disadvantage: This OM tool exports database information, including database parameter files and binary files. It helps openGauss back up and restore important data, and it can display help and version information. During backup, you can select the type of content to back up. During restoration, ensure that the backup file exists in the backup directory of each node. During cluster restoration, the cluster information in the static configuration file is used. Restoring only parameter files takes a short time.

    Tool name: gs_basebackup

    • Recovery time: During restoration, you can directly copy the backup files over the original files, or directly start the database on the backup database. The restoration takes a short time.
    • Advantage and disadvantage: This tool makes a full copy of the binary files of the server database. It can back up the database only as of a single point in time. With PITR, you can restore data to a point in time after the full backup.

    Tool name: gs_probackup

    • Recovery time: Data can be restored directly to a backup point, and the database can be started on the backup database. The restoration takes a short time.
    • Advantage and disadvantage: gs_probackup is a tool for managing openGauss backup and restoration. It periodically backs up openGauss instances. It supports the physical backup of a standalone database or of a primary database node in a cluster. It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files. It supports incremental backup, periodic backup, and remote backup. An incremental backup takes less time than a full backup because only the modified files are backed up. Currently, the data directory is backed up by default. If a tablespace is not in the data directory, you must manually specify the tablespace directory to be backed up. Currently, data can be backed up only on the primary node.

    Backup type: Flashback restoration

    Application scenario: Applicable when:

    1. A table is deleted by mistake.

    2. Data in a table needs to be restored to a specified time point or CSN.

    Tool name: None

    • Recovery time: You can restore a table to its state at a specified time point or to its state before the table structure was deleted. The restoration takes a short time.
    • Advantage and disadvantage: Flashback can selectively and efficiently undo the effects of a committed transaction and recover from a human error. Without flashback, a committed database modification can be undone only by restoring a backup or by using PITR, which takes several minutes or even hours. With flashback, restoring data to its state before the committed modification takes only seconds, and the restoration time is independent of the database size.

    Flashback supports two recovery modes:

    • Multi-version data restoration based on MVCC: applicable to querying and restoring data that was deleted, updated, or inserted by mistake. You configure the retention period of old versions and then run the corresponding query or restoration command to query or restore data as of a specified time point or CSN.
    • Recovery based on the recycle bin (similar to the recycle bin in Windows): applicable to recovering tables that were dropped or truncated by mistake. You enable the recycle bin switch and then run the corresponding restoration command to restore the dropped or truncated tables.
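    The two flashback modes can be sketched as SQL statements. The Python helpers below only build statement text; they are hypothetical names, and the TIMECAPSULE syntax shown is an assumption based on the openGauss flashback feature that should be verified against the SQL reference for your version. The table name and CSN value are placeholders.

```python
import re

# Accept only simple unquoted identifiers so the sketch never
# interpolates arbitrary text into a statement.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    if not _IDENT.match(name):
        raise ValueError(f"unsupported table name: {name!r}")
    return name


def query_at_csn(table, csn):
    """MVCC mode: read the table as it was at a given CSN."""
    return f"SELECT * FROM {_check(table)} TIMECAPSULE CSN {int(csn)};"


def restore_to_csn(table, csn):
    """MVCC mode: rewind the table contents to a given CSN."""
    return f"TIMECAPSULE TABLE {_check(table)} TO CSN {int(csn)};"


def restore_dropped(table):
    """Recycle bin mode: bring back a table that was dropped."""
    return f"TIMECAPSULE TABLE {_check(table)} TO BEFORE DROP;"


def restore_truncated(table):
    """Recycle bin mode: bring back the rows removed by TRUNCATE."""
    return f"TIMECAPSULE TABLE {_check(table)} TO BEFORE TRUNCATE;"


if __name__ == "__main__":
    # Placeholder table name and CSN, assumed for illustration only.
    for stmt in (query_at_csn("orders", 9617),
                 restore_to_csn("orders", 9617),
                 restore_dropped("orders"),
                 restore_truncated("orders")):
        print(stmt)
```

    Both modes depend on settings mentioned above (the old-version retention period and the recycle bin switch), so these statements succeed only when the corresponding setting was in place before the mistake occurred.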

    When planning backup and restoration, consider the following aspects:

    • Whether the impact of data backup on services is acceptable

    • Database restoration efficiency

      To limit the impact of a database fault, keep the restoration time as short as possible.

    • Data restorability

      Minimize data loss after a database failure.

    • Database restoration cost

    Selecting a backup policy for the live network depends on many factors, such as backup objects, data volume, and network configuration. Table 2 lists the available backup policies and the scenarios that each one suits.

    Table 2 Backup policies and scenarios

    Backup policy: Database instance backup

    Key performance factors:

    • Data amount
    • Network configuration

    Typical data volume:

    • Data volume: PB level
    • Object quantity: about 1 million

    Performance specifications (backup):

    • Data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk)
    • Disk I/O rate (SSD/HDD): about 90%

    Backup policy: Table backup

    Key performance factors:

    • Schema where the table to be backed up resides
    • Network configuration (NBU)

    Typical data volume:

    • Data volume: 10 TB level

    Performance specifications (backup): depends on query performance and I/O rate

    NOTE:

    For multi-table backup, the backup time is calculated as follows:

    Total time = Number of tables x Starting time + Total data volume/Data backup speed

    In this formula:

    • The starting time of a disk is about 5 s. The starting time of an NBU is longer than that of a disk (depending on the NBU deployment).
    • The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.)

    The smaller the tables are, the lower the backup performance is, because the fixed starting time per table accounts for a larger share of the total time.
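    The formula in this note can be tried out with a small Python sketch. The default starting time (5 s) and backup speed (50 MB/s) are the disk figures quoted above. The function name and the example workloads are hypothetical, and 1 GB is taken as 1024 MB.

```python
def estimate_backup_seconds(num_tables, total_mb, start_s=5.0, speed_mb_s=50.0):
    """Total time = number of tables x starting time
                    + total data volume / data backup speed."""
    if num_tables < 0 or total_mb < 0 or speed_mb_s <= 0:
        raise ValueError("counts and volumes must be non-negative, speed positive")
    return num_tables * start_s + total_mb / speed_mb_s


if __name__ == "__main__":
    total_mb = 100 * 1024  # 100 GB in MB (assumed example volume)
    for num_tables in (100, 10_000):
        seconds = estimate_backup_seconds(num_tables, total_mb)
        effective = total_mb / seconds  # MB/s including starting overhead
        print(f"{num_tables} tables: {seconds:.0f} s, {effective:.1f} MB/s effective")
```

    Under these assumptions, the same 100 GB takes about 2,548 s when split into 100 tables and about 52,048 s when split into 10,000 tables, which shows how the starting time per table dominates for small tables.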

    openGauss 2024-04-22 00:47:24