New to Stash? Please start here.
`BackupSession` is a Kubernetes `CustomResourceDefinition` (CRD) that represents a backup run of the target(s) referenced by a `BackupConfiguration` or `BackupBatch` in a Kubernetes-native way.
The Stash operator creates a Kubernetes `CronJob` according to the schedule defined in a `BackupConfiguration` or `BackupBatch`. On each scheduled run, this `CronJob` creates a `BackupSession` object that points to the respective `BackupConfiguration` or `BackupBatch`. The controller that runs inside the backup sidecar (or the Stash operator itself, when the backup runs as a job) watches this `BackupSession` object and starts taking the backup immediately.
You can also create a `BackupSession` object manually to trigger a backup at any time.
Like any official Kubernetes resource, a `BackupSession` has `apiVersion`, `kind`, `metadata`, `spec`, and `status` fields. A sample `BackupSession` created for backing up a WordPress application and its components is shown below:
```yaml
apiVersion: stash.appscode.com/v1beta1
kind: BackupSession
metadata:
  creationTimestamp: "2020-07-25T17:41:28Z"
  labels:
    app: stash
    stash.appscode.com/invoker-name: wordpress-backup
    stash.appscode.com/invoker-type: BackupBatch
  name: wordpress-backup-1578458376
  namespace: demo
spec:
  invoker:
    apiGroup: stash.appscode.com
    kind: BackupBatch
    name: wordpress-backup
status:
  conditions:
  - lastTransitionTime: "2020-07-25T17:41:31Z"
    message: Repository exist in the backend.
    reason: BackendRepositoryFound
    status: "True"
    type: BackendRepositoryInitialized
  - lastTransitionTime: "2020-07-25T17:41:48Z"
    message: Successfully applied retention policy.
    reason: SuccessfullyAppliedRetentionPolicy
    status: "True"
    type: RetentionPolicyApplied
  - lastTransitionTime: "2020-07-25T17:41:50Z"
    message: Repository integrity verification succeeded.
    reason: SuccessfullyVerifiedRepositoryIntegrity
    status: "True"
    type: RepositoryIntegrityVerified
  - lastTransitionTime: "2020-07-25T17:41:50Z"
    message: Successfully pushed repository metrics.
    reason: SuccessfullyPushedRepositoryMetrics
    status: "True"
    type: RepositoryMetricsPushed
  phase: Succeeded
  sessionDuration: 22.575920065s
  sessionDeadline: "2020-07-25T17:46:28Z"
  targets:
  - phase: Succeeded
    preBackupActions:
    - InitializeBackendRepository
    ref:
      apiVersion: apps/v1
      kind: Deployment
      name: wordpress
    stats:
    - duration: 831.018039ms
      hostname: app
      phase: Succeeded
      snapshots:
      - fileStats:
          modifiedFiles: 0
          newFiles: 1
          totalFiles: 1
          unmodifiedFiles: 0
        name: b54ee4a0
        path: /var/www/html
        processingTime: "0:00"
        totalSize: 0 B
        uploaded: 711 B
    totalHosts: 1
  - phase: Succeeded
    postBackupActions:
    - ApplyRetentionPolicy
    - VerifyRepositoryIntegrity
    - SendRepositoryMetrics
    ref:
      apiVersion: appcatalog.appscode.com/v1alpha1
      kind: AppBinding
      name: wordpress-mysql
    stats:
    - duration: 1.147010638s
      hostname: db
      phase: Succeeded
      snapshots:
      - fileStats:
          modifiedFiles: 0
          newFiles: 1
          totalFiles: 1
          unmodifiedFiles: 0
        name: b30beb44
        path: dumpfile.sql
        processingTime: "0:00"
        totalSize: 0 B
        uploaded: 3.408 MiB
    totalHosts: 1
```
Here, we are going to describe the various sections of a `BackupSession`.
`metadata.name` indicates the name of the `BackupSession`. This name is generated automatically by the respective `CronJob` and follows the pattern `<BackupConfiguration/BackupBatch name>-<creation timestamp in Unix epoch seconds>`.
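As a rough illustration of this naming pattern, the Python sketch below builds a name of that form and splits it back into its parts. The helper names `backup_session_name` and `split_backup_session_name` are hypothetical and not part of Stash; the sketch assumes nothing beyond the pattern described above.

```python
import time


def backup_session_name(invoker_name, created_at=None):
    """Build a name of the form <invoker name>-<Unix epoch seconds>."""
    epoch = int(time.time() if created_at is None else created_at)
    return f"{invoker_name}-{epoch}"


def split_backup_session_name(name):
    """Split a generated name back into (invoker name, epoch seconds).

    The invoker name may itself contain dashes, so split on the last one.
    """
    invoker_name, _, epoch = name.rpartition("-")
    return invoker_name, int(epoch)


# Values taken from the sample BackupSession shown earlier.
print(backup_session_name("wordpress-backup", 1578458376))
print(split_backup_session_name("wordpress-backup-1578458376"))
```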
`metadata.namespace` indicates the namespace of the `BackupSession`. It is the same as the namespace of the respective `BackupConfiguration` or `BackupBatch`.
`metadata.labels` holds the kind and name of the respective `BackupConfiguration` or `BackupBatch` as labels. The Stash backup sidecar container uses these labels to watch only the `BackupSession` objects of that invoker.
If you create a `BackupSession` manually to trigger a backup instantly, make sure that you have added the `stash.appscode.com/invoker-type: <BackupConfiguration/BackupBatch kind>` and `stash.appscode.com/invoker-name: <BackupConfiguration/BackupBatch name>` labels to your `BackupSession`. Otherwise, it will not trigger a backup for workloads (the resources that are backed up using a sidecar).
A `BackupSession` object has the following fields in the `.spec` section:
`spec.invoker` specifies the `apiGroup`, `kind`, and `name` of the object that is responsible for invoking this backup session.
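To make the manual case concrete, here is a minimal Python sketch that assembles a `BackupSession` manifest carrying both invoker labels and a matching `spec.invoker`. The function name `manual_backup_session` and the object name `wordpress-backup-manual` are hypothetical; the API version, label keys, and invoker fields are copied from the sample object shown earlier.

```python
import json


def manual_backup_session(name, namespace, invoker_kind, invoker_name):
    """Return a BackupSession manifest as a plain dict."""
    return {
        "apiVersion": "stash.appscode.com/v1beta1",
        "kind": "BackupSession",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Both labels are needed so the backup sidecar picks up the session.
            "labels": {
                "stash.appscode.com/invoker-type": invoker_kind,
                "stash.appscode.com/invoker-name": invoker_name,
            },
        },
        "spec": {
            "invoker": {
                "apiGroup": "stash.appscode.com",
                "kind": invoker_kind,
                "name": invoker_name,
            }
        },
    }


manifest = manual_backup_session(
    name="wordpress-backup-manual",  # hypothetical name chosen for this example
    namespace="demo",
    invoker_kind="BackupBatch",
    invoker_name="wordpress-backup",
)
print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` to trigger the backup.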
The `.status` section of a `BackupSession` shows the stats and progress of the backup process in this session. A backup sidecar container or job updates the respective fields under the `.status` section after it completes its task. The `.status` section consists of the following fields:
`status.phase` indicates the overall phase of the backup process for this `BackupSession`. `status.phase` will be `Succeeded` only if the phase of every target is `Succeeded`. If any target fails to complete its backup, `status.phase` will be `Failed`.
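The rule for the overall phase can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not Stash's implementation; the function name `aggregate_phase` is hypothetical, and the sketch returns `None` for any state it does not cover, such as targets that are still in progress.

```python
def aggregate_phase(phases):
    """Derive an overall phase from a list of per-target phases.

    Returns "Failed" if any entry failed, "Succeeded" only if every entry
    succeeded, and None otherwise (intermediate phases are outside the
    scope of this sketch).
    """
    phases = list(phases)
    if any(p == "Failed" for p in phases):
        return "Failed"
    if phases and all(p == "Succeeded" for p in phases):
        return "Succeeded"
    return None


print(aggregate_phase(["Succeeded", "Succeeded"]))
print(aggregate_phase(["Succeeded", "Failed"]))
```

The same shape of rule applies one level down, where a target's `phase` is derived from the phases of its hosts.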
status.sessionDuration indicates the total time taken to complete the backup of all targets in this session.
`status.sessionDeadline` indicates the deadline of the backup process. The `BackupSession` will be considered `Failed` if the backup does not complete within this deadline.
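For a client that wants to reason about the deadline, the comparison could look like the Python sketch below. It uses the timestamps from the sample object and assumes they always have the `YYYY-MM-DDTHH:MM:SSZ` form seen there. The helper names are hypothetical, and how Stash itself enforces the deadline is not shown here.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value):
    """Parse a UTC timestamp such as "2020-07-25T17:46:28Z"."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def deadline_exceeded(session_deadline, now=None):
    """Return True if `now` is later than the session deadline."""
    now = datetime.now(timezone.utc) if now is None else now
    return now > parse_timestamp(session_deadline)


# creationTimestamp and sessionDeadline from the sample BackupSession.
created = parse_timestamp("2020-07-25T17:41:28Z")
deadline = parse_timestamp("2020-07-25T17:46:28Z")
print((deadline - created).total_seconds())  # 300.0
```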
status.conditions shows the conditions of different operations/steps of the backup process. The following conditions are set by the Stash operator on a BackupSession.
| Condition Type | Usage |
|---|---|
| `BackendRepositoryInitialized` | Indicates whether the backend repository was initialized or not. |
| `RetentionPolicyApplied` | Indicates whether the retention policies were applied or not. |
| `RepositoryIntegrityVerified` | Indicates whether the repository integrity check succeeded or not. |
| `RepositoryMetricsPushed` | Indicates whether the repository metrics for this backup session were pushed or not. |
| | Indicates whether the global PreBackupHook was executed successfully or not. Only available during backup using BackupBatch. |
| | Indicates whether the global PostBackupHook was executed successfully or not. Only available during backup using BackupBatch. |
| | Indicates whether the session deadline was exceeded or not. |
The `status.targets` field contains an array holding the status of each individual target of a backup run. Each target's status consists of the following sub-fields:
`totalHosts`: Not every pod or replica of a target is subject to backup, so we refer to the entities that are subject to backup as hosts. `totalHosts` specifies the total number of hosts of the target that will be backed up for this `BackupSession`. For details on how many hosts are backed up for each type of workload, see the description of backup models and host identifiers later in this document.
`preBackupActions`: Specifies a list of actions that the backup process should execute before taking a backup. For example, the backend repository must be initialized by one of the targets before taking a backup. Stash automatically assigns which target should execute this action. `preBackupActions` should not be confused with the `preBackup` hook: hooks are meant to be configured by users, whereas `preBackupActions` are configured by Stash itself.
`postBackupActions`: Similar to `preBackupActions`, it specifies a list of actions that the backup process should execute after taking the backup. For example, when all the targets complete their backup, one target must apply the retention policy to the repository. Stash automatically selects which target should execute these actions.
`ref` refers to the target whose backup stats are presented by this array entry.
`phase` indicates the backup phase of the target. `phase` will be `Succeeded` only if the phase of every host is `Succeeded`. If any host fails to complete its backup, `phase` will be `Failed`.
The `stats` section is an array of backup statistics for the individual hosts of the target. Each host adds its statistics to this array after completing its backup process.
Each stats entry consists of the following fields:
`hostname` indicates the name of the host.
`phase` indicates the backup phase of this host.
`duration` indicates the total time taken to complete the backup for this host.
The `snapshots` field holds statistics for each individual snapshot. Each snapshot's statistics have the following fields:
`name` indicates the name of the snapshot.
`path` indicates the file path that was backed up in this snapshot.
`totalSize` indicates the size of the data to back up from this path.
`uploaded` indicates the size of the data that was uploaded to the backend for this snapshot. This could be much smaller than `totalSize` if some data was already uploaded to the backend in previous backup sessions.
`processingTime` indicates the time taken to process the data of the target path.
The `fileStats` field shows statistics of the files that were backed up in this snapshot:
`totalFiles` shows the total number of files that were backed up in this snapshot.
`newFiles` shows the number of new files that were backed up in this snapshot.
`modifiedFiles` shows the number of files that were modified since the last backup of this directory.
`unmodifiedFiles` shows the number of files that haven't changed since the last backup of this path.
`error` shows the reason for failure if the backup process failed for this host.
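The nesting of targets, hosts, and snapshots is easier to see when walked in code. The Python sketch below uses a trimmed copy of the sample `status` shown earlier, held as a plain dict, and produces one summary row per host. The function names `summarize_hosts` and `pending_hosts` are hypothetical; `pending_hosts` relies on the statement that each host adds its entry to `stats` only after completing its backup.

```python
# Trimmed copy of the sample status; only the fields used below are kept.
status = {
    "targets": [
        {
            "phase": "Succeeded",
            "ref": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "wordpress"},
            "totalHosts": 1,
            "stats": [
                {
                    "hostname": "app",
                    "phase": "Succeeded",
                    "snapshots": [{"name": "b54ee4a0", "path": "/var/www/html"}],
                }
            ],
        },
        {
            "phase": "Succeeded",
            "ref": {
                "apiVersion": "appcatalog.appscode.com/v1alpha1",
                "kind": "AppBinding",
                "name": "wordpress-mysql",
            },
            "totalHosts": 1,
            "stats": [
                {
                    "hostname": "db",
                    "phase": "Succeeded",
                    "snapshots": [{"name": "b30beb44", "path": "dumpfile.sql"}],
                }
            ],
        },
    ]
}


def summarize_hosts(status):
    """Yield one (target, hostname, phase, snapshot paths) tuple per host."""
    for target in status.get("targets", []):
        ref = target["ref"]
        target_id = f'{ref["kind"]}/{ref["name"]}'
        for host in target.get("stats", []):
            paths = [snap["path"] for snap in host.get("snapshots", [])]
            yield target_id, host["hostname"], host["phase"], paths


def pending_hosts(target):
    """Number of hosts that have not yet added an entry to `stats`."""
    return target["totalHosts"] - len(target.get("stats", []))


for row in summarize_hosts(status):
    print(row)
```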
Stash uses two different backup models depending on the target type: the sidecar model for Kubernetes workloads and the job model for all other targets. In the sidecar model, Stash injects a sidecar into the targeted workload, and the sidecar is responsible for taking the backup. In the job model, Stash launches a job to take a backup of the target.
Stash uses an identifier called host to separate the backed-up data of different subjects in the backend. How a host is identified depends on the backup model and the target type. The backup strategy and host identification strategy for the different target types are explained below.
Stash uses the sidecar model to back up Kubernetes workloads. However, not every sidecar takes a backup. How many sidecars take a backup depends on the type of the workload. We can divide them into the following categories:
The `alias` provided in the BackupConfiguration/BackupBatch is used as the host identifier. If the `alias` was not provided, it defaults to `host-0`. The total number of hosts for these types of workload is 1.
If the `alias` was not provided in the BackupConfiguration/BackupBatch, then the host identifiers are generated as `host-0`, `host-1`, `host-2`, etc. The total number of hosts for a StatefulSet is the number of replicas.
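The default host identifiers described above can be sketched as follows. Both function names are hypothetical, and the StatefulSet helper assumes that numbering starts at 0 and runs up to one less than the replica count. Identifiers for a StatefulSet that does have an `alias` are not covered here.

```python
def single_host_identifier(alias=None):
    """Host identifier when exactly one host is backed up."""
    return alias if alias else "host-0"


def statefulset_host_identifiers(replicas):
    """Host identifiers for a StatefulSet when no alias was provided.

    Assumes numbering starts at 0 and ends at replicas - 1.
    """
    return [f"host-{i}" for i in range(replicas)]


print(single_host_identifier())
print(single_host_identifier("app"))
print(statefulset_host_identifiers(3))
```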
Stash uses the job model to back up a stand-alone PVC: it launches a job to back up the targeted PVC. The `alias` provided in the BackupConfiguration/BackupBatch is used as the host identifier. If the `alias` was not provided, it defaults to `host-0`. The total number of hosts for a stand-alone PVC backup is 1.
Stash uses the job model to back up a database: it launches a job to back up the targeted database. In this case, the number of hosts depends on the database type.
`alias` and the total number of hosts is 1.
`alias` and the total number of hosts is 1.
Stash uses the job model for taking volume snapshots. Each volume is considered a separate host and is identified by its name. Hence, the total number of hosts for a VolumeSnapshot is the number of targeted volumes. However, since VolumeSnapshot is handled by the respective CSI driver, the host identifier plays no role in separating their data.