Backup and Restore

As with any database system that contains valuable data, Sedna databases should be backed up regularly. This section presents several approaches to backing up Sedna data and describes how to migrate data between Sedna releases.

2.5.1 Export/Import Utility

The se_exp utility exports and imports data. It generates a set of XML files and XQuery scripts from which the database can be restored to the state it had at the time of the export.

Important note: the current version of the se_exp utility doesn’t support exporting or importing triggers, documents with multiple root elements, or empty documents. Empty document nodes and document nodes with multiple root elements are allowed by the XQuery data model, but they cannot be serialized as is without modification.

Note that se_exp is a regular Sedna client application, so you can export data from any remote host that has access to Sedna. However, se_exp needs special permissions: it must have read access to all documents in the database, including system metadata. This means that you should run it as a user that has the DBA role.

se_exp [options] command dbname path
options:
  -help               display this help and exit
  --help              display this help and exit
  -version            display product version and exit
  -verbose on/off     verbose output (default off)
  -host host          hostname of the machine with Sedna running
                      (default localhost)
  -port-number port   socket listening port  (default 5050)
  -name name          user name
  -pswd password      user password
  command             export | restore | import
  dbname              database name
  path                path to exported/imported data

se_exp provides three commands for manipulating data: export, restore and import. Each of them is described in detail below.

Export

The export command exports data from the specified database. The basic usage of this command is:

se_exp export dbname path

Note 1 If the directory that path refers to contains files with the same names as the files created by se_exp, they will be replaced.

During the export, se_exp generates a set of XML files and a set of XQuery scripts that recreate the state of the database. For each XML document in the database, including XML documents in collections, se_exp generates an XML file. It also generates several XML files with system metadata. Note that the security metadata is exported insecurely: the file contains unencrypted user names and passwords.

Data exported by se_exp is transactionally consistent, that is, updates made to the database while se_exp is running do not appear in the exported data.

To specify which database server se_exp should contact, use the command line option -host host. The default host is the local host. As a Sedna client application, se_exp requires a user name and password to connect to the database. You can either specify them with the -name and -pswd options or type them when se_exp prompts for them.
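The export step can be wrapped in a small script. The following is a minimal sketch, not part of Sedna: the function name export_db, the SE_EXP variable and the dated directory layout are hypothetical, and it assumes that se_exp is on the PATH. Only the options shown in the usage text above are used.

```shell
# Sketch: export a database into a dated directory with se_exp.
# SE_EXP may point at a specific se_exp binary (hypothetical variable).
SE_EXP="${SE_EXP:-se_exp}"

# export_db HOST USER PASSWORD DBNAME TARGET_ROOT
# Exports into TARGET_ROOT/DBNAME-YYYYMMDD and prints that path.
export_db() {
    host=$1; user=$2; pswd=$3; db=$4; root=$5
    target="$root/$db-$(date +%Y%m%d)"
    mkdir -p "$target" || return 1
    # The user must have the DBA role. Omitting -name and -pswd makes
    # se_exp ask for the credentials interactively instead.
    "$SE_EXP" -host "$host" -name "$user" -pswd "$pswd" \
        export "$db" "$target" >&2 || return 1
    # The export holds unencrypted user names and passwords,
    # so remove access for group and others.
    chmod -R go-rwx "$target" || return 1
    echo "$target"
}
```

Writing each export into its own directory avoids the file replacement described in Note 1.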

Restore

The restore command restores data created by the export command into an empty database. It is intended for migration between different releases of Sedna and for backing up your data in XML format. The basic usage of this command is:

se_exp restore dbname path

The parameter dbname specifies the database in Sedna to restore data into. The path parameter specifies the directory with data to restore.

This command does not create the database dbname. The target database must already exist and be running before the restore process starts: create it yourself with the se_cdb command and start it with the se_sm command. The target database must also be empty, i.e. it must not contain any data, users or roles except the default ones.
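The preparation and restore steps can be chained as follows. This is a sketch under stated assumptions: the function name restore_db is hypothetical, the Sedna governor is assumed to be running already, and no database with the given name is assumed to exist.

```shell
# restore_db DBNAME EXPORT_DIR
# Assumes the Sedna governor is already running and that no database
# named DBNAME exists yet.
restore_db() {
    db=$1; dir=$2
    if [ ! -d "$dir" ]; then
        echo "no such export directory: $dir" >&2
        return 1
    fi
    se_cdb "$db" || return 1      # create the empty target database
    se_sm "$db" || return 1       # start it
    # se_exp asks for the user name and password of a DBA user.
    se_exp restore "$db" "$dir"
}
```

A call such as restore_db mydb /exports/mydb stops at the first failing step, so se_exp never runs against a database that could not be created or started.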

Import

The import command imports data created by the export command into an existing database, which need not be empty. The basic usage of this command is:

se_exp import dbname path

The parameter dbname specifies the database in Sedna to import data to. The path parameter specifies the directory with data to import.

This command does not create the database dbname. The target database must already exist and be running before the import process starts.

You can import data into a database that already contains data and has its own security policies. The only restriction is that the names of XML documents, collections and indices must not conflict.

The main difference between the restore and import commands is that import doesn’t import any security information. All data is imported by the user who runs the se_exp utility, i.e. by the user whose name and password are specified in the se_exp parameters. Of course, this user must have sufficient rights to create collections, load documents and create indices.

2.5.2 File system level backup

An alternative strategy for backing up a database is to copy directly the directories that Sedna uses to store the data of the database. Read Section 2.1 to find out where Sedna stores databases. You can use whatever method you prefer for ordinary file system backups. To restore a database, copy the corresponding backup directory to the location where Sedna stores databases.

The target database must be stopped in order to get a usable backup. Half-way measures such as disallowing all connections will not work.

Note that a file system backup will not necessarily be smaller than a backup made via export. On the contrary, it will most likely be larger.

Note 2 A database directory copied to a different machine or to a different version of the same operating system might not work properly. If you want to restore a database on another machine or OS installation, use the se_exp utility instead.
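A file system level backup could be scripted along these lines. This is a sketch built on assumptions that are not stated in this section: the function name cold_backup is hypothetical, the database is stopped with the se_smsd command, and its files are assumed to live in data/<dbname>_files and cfg/<dbname>_cfg.xml under the Sedna directory. Check Section 2.1 for the actual layout before relying on it.

```shell
# cold_backup DBNAME SEDNA_DIR ARCHIVE
cold_backup() {
    db=$1; sedna=$2; archive=$3
    se_smsd "$db" || return 1     # stop the database (assumed command)
    # Assumed layout: data/<dbname>_files plus cfg/<dbname>_cfg.xml.
    tar -czf "$archive" -C "$sedna" "data/${db}_files" "cfg/${db}_cfg.xml"
    status=$?
    se_sm "$db" || return 1       # restart even if tar failed
    return "$status"
}
```

The database is unavailable for the whole duration of the tar run, which is the price of this method compared with hot backup.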

2.5.3 Hot Backup

Another alternative is to back up a database while it is still running. This procedure is called hot backup. Its purpose is to create a consistent backup copy while users are still issuing requests. The copy can later be restored by copying the corresponding backup directory to the directory where Sedna stores databases. Hot backups can also be made in incremental mode, which allows database changes to be archived more efficiently.

In a nutshell, when hot backup is invoked, Sedna copies all database files needed to restore a consistent database state in case of failure. The main difference between file system level backup (described in Section 2.5.2) and hot backup is that the target database does not have to be stopped. As a tradeoff, restoring from a hot backup copy may be slower, depending on the recency of the copy. Note that a hot backup copy guarantees durability of all transactions that had been committed at the moment the hot backup process started.

To make a hot backup, use the provided se_hb utility. The usage is as follows:

Usage: se_hb [options] dbname path

options:
  -help                        display this help and exit
  --help                       display this help and exit
  -checkpoint                  make checkpoint before backup
  -time-dir                    create timestamp-subdir
  -make-dir                    create directory if it doesn’t exist
  -incr-mode <increment_mode>  increment mode (start, add, stop)
  -port port-number            port number to connect to Governor

  dbname                       the name of the database
  path                         the name of the backup directory

Use the -checkpoint option to make sure that a checkpoint is made before the hot backup process starts. This may speed up restoration, since a checkpoint fixes a consistent state of the database and this state is reflected in the hot backup copy. At the same time, the backup itself may take longer, depending on user activity at the time of the backup.

If you want the destination directory (specified as path on the command line) to be created, you must specify the -make-dir option.

Note that se_hb may overwrite some of the previous backup files if you specify a nonempty destination directory. So, if you make two consecutive hot backups of the same database in the same directory, the older backup will be lost. The -time-dir option prevents this by creating a subdirectory named with the current date and time within the path directory. In this case the destination of the hot backup copy will be: <path>/backup-<dbname>-<current date>-<current time>/. It is recommended that you either use the -time-dir option or provide a directory that contains no previous backups.

You can specify the port number for connecting to the governor with the -port option. If the port number is not specified on the command line, the hot backup process tries to find the sednaconf.xml file and uses the port number given there as the listener_port parameter. If it still cannot find a port number, it tries the default value 5050.

Note 3 You must run the se_hb utility on the same machine where the target database is running. It will try to connect to the target database on localhost through the specified port.

Incremental hot backup and the corresponding -incr-mode option are explained below.
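The options described above can be combined in one call. The following sketch uses a hypothetical wrapper function, hot_backup, with an optional port argument. It relies only on options listed in the se_hb usage text.

```shell
# hot_backup DBNAME BACKUP_ROOT [PORT]
# Must run on the machine where the database is running (see Note 3).
hot_backup() {
    db=$1; root=$2; port=${3:-}
    if [ -n "$port" ]; then
        se_hb -checkpoint -make-dir -time-dir -port "$port" "$db" "$root"
    else
        # Without -port, se_hb falls back to sednaconf.xml, then to 5050.
        se_hb -checkpoint -make-dir -time-dir "$db" "$root"
    fi
}
```

Because -time-dir places every copy in its own subdirectory, repeated calls such as hot_backup mydb /backup do not overwrite one another.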

Incremental Hot Backup

Let us assume that you have made a hot backup using a command such as se_hb mydb /backup. If you later want to make another one so that updated data is reflected in the backup copy, you can issue the same command. However, if the amount of changes is small, it is desirable to copy only these changes without copying the entire database again. This is where incremental mode becomes useful.

First of all, you must create a primary copy, which is essentially a copy of the entire database. You can do this by specifying -incr-mode start on the command line (for example, se_hb -incr-mode start mydb /backup). Then, when you need to make subsequent hot backups of the same database, specify -incr-mode add on the command line (for example, se_hb -incr-mode add mydb /backup). If the amount of changes is small, such a backup will take much less time.

Note, however, that the -incr-mode start option switches the original (active) database into incremental mode. This means that the database starts to store more files in order to allow “-incr-mode add” backups, so it can grow in size more rapidly when it is updated. To switch off incremental mode, specify -incr-mode stop on the command line. This allows the original database to drop unnecessary files, but it also makes the “-incr-mode add” option impossible. Thus, to start another incremental backup cycle you must repeat the whole process from the beginning (an -incr-mode start call and additional -incr-mode add calls when needed). Note that se_hb with the -incr-mode stop option does not make any additional hot backup copies; it only switches off incremental mode.

Note 4 The database switches off incremental mode when a new nonincremental hot backup (without the -incr-mode option) is created. This is similar to the -incr-mode stop option, except that in this case a hot backup copy of the entire database is created.

If you make a new primary copy (i.e. with the -incr-mode start option specified) while the database is still in incremental mode, the -incr-mode add option will archive increments valid only for this last primary copy. Unless the database is rarely updated, it is recommended that you periodically make a new primary copy with the -incr-mode start option.

With incremental hot backup you have two options: you can archive all backups in the same directory or in different directories. If you archive all backups in the same directory (as in /backup in the example above), you can restore only the state corresponding to the last incremental backup. On the other hand, storing incremental backups in different directories makes a form of point-in-time recovery possible, i.e. you can restore the state corresponding to any of the incremental backups. For details see the next section.
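One way to schedule incremental backups into separate directories is sketched below. The weekly scheme, the function name incr_backup and the directory naming are all hypothetical: a primary copy is made every Monday and increments on the other days, each in its own directory so that any day's state can be restored later. The sketch does not handle a missed Monday run, in which case the increments would belong to the previous primary copy.

```shell
# incr_backup DBNAME ROOT [DAY] [WEEK]
# DAY and WEEK default to today; they are parameters only to make
# the function easy to try out.
incr_backup() {
    db=$1; root=$2
    day=${3:-$(date +%u)}          # 1 = Monday ... 7 = Sunday
    week=${4:-$(date +%G-W%V)}     # ISO week label, e.g. 2024-W07
    if [ "$day" -eq 1 ]; then
        mode=start; target="$root/$week/p"     # new primary copy
    else
        mode=add; target="$root/$week/$day"    # increment for this day
    fi
    mkdir -p "$target" || return 1
    se_hb -incr-mode "$mode" "$db" "$target"
}
```

Starting a fresh primary copy each week follows the recommendation above and keeps the chain of increments needed for a restore short.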

Restore from Hot Backup Copy

Note 5 Since hot backup is made at the file system level, the same note as in the “File system level backup” section applies here too, i.e. recovery of a hot backup copy on a different machine or on a different version of the same operating system cannot be guaranteed.

Note 6 Recovery of a hot backup copy on a different release of Sedna cannot be guaranteed. See Section 2.5.4 for further details.

To restore the backed up database, you must copy the saved files to the directory where Sedna stores database files. For information about the Sedna directory structure read Section 2.1. For example, let us assume that a hot backup has been made in the /backup directory and that Sedna stores its files in the SEDNA_INSTALL directory. In this case you will find cfg and data subdirectories in the /backup directory. To restore the database you must copy these directories into the SEDNA_INSTALL directory. Note that you should remove the SEDNA_INSTALL/data/<dbname>_files/ directory (where <dbname> is the name of the backed up database) before you copy the backup files, since old files may interfere with the restoration process. After you have copied the backup files into the corresponding directories, use the se_sm command to start the database. On the first start, SM runs a recovery process that restores the database state corresponding to the moment the hot backup took place. This can take some time, depending on the recency of the copy.

For a database backed up incrementally the process may be different. If you have made all backups (the primary copy and additional “-incr-mode add” copies) in the same directory, the process is the same. However, if some of the “-incr-mode add” copies are in different directories, you must copy files from all these directories, in the order in which the corresponding hot backups were taken, to fully restore the database state. This also makes it possible to restore an older state of the database. For example, let us assume that the primary copy is stored in the /backup/p directory and additional -incr-mode add copies are stored in the /backup/1 and /backup/2 directories, in the order of creation. Then you can restore the database to the state corresponding to any of the hot backups by copying only those directories that you need. For example, by copying the /backup/p and /backup/1 directories you can restore the database state corresponding to the moment when the -incr-mode add backup into /backup/1 was made. Of course, you cannot “skip” directories. For example, copying only /backup/p and /backup/2 would not work, since in this case /backup/1 is also needed.
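The restore procedure, for a single hot backup as well as for a chain of incremental ones, can be sketched as a short function. The name restore_chain is hypothetical, and the sketch assumes that the database is not running while its files are replaced and that each backup directory directly contains the cfg and data subdirectories (with -time-dir they sit one level deeper).

```shell
# restore_chain SEDNA_DIR DBNAME PRIMARY_DIR [INCREMENT_DIR ...]
# Backup directories must be given in the order the backups were taken.
restore_chain() {
    sedna=$1; db=$2
    shift 2
    if [ "$#" -lt 1 ]; then
        echo "no backup directory given" >&2
        return 1
    fi
    rm -rf "$sedna/data/${db}_files"    # old files may interfere
    for dir in "$@"; do
        # Merge the contents of each backup into SEDNA_DIR; files from
        # later backups overwrite those copied earlier.
        cp -R "$dir/." "$sedna/" || return 1
    done
    se_sm "$db"                         # the first start runs recovery
}
```

For the example above, restore_chain "$SEDNA_INSTALL" mydb /backup/p /backup/1 would restore the state as of the first increment. Passing only /backup/p restores the primary copy.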

2.5.4 Migration Between Releases

This section discusses how to migrate your data from one release of Sedna to another. Because the internal data storage format is subject to change between releases of Sedna, migrating data accurately is a frequently required task.

It is recommended that you use the se_exp utility for this task. The process consists of four steps.
