Stash Architecture
Stash is a Kubernetes operator for restic. At its heart, Stash is a Kubernetes controller. It uses Custom Resource Definitions (CRDs) to specify the targets and behavior of the backup and restore processes in a Kubernetes-native way. A simplified architecture of Stash is shown below:
Components
Stash consists of various components that implement backup and restore logic. This section will give you a brief overview of such components.
Stash Operator
When a user installs Stash, it creates a Kubernetes Deployment typically named stash-operator. This deployment controls the entire backup and restore process. The stash-operator deployment runs two containers. One of them is called operator, which performs the core functionality of Stash, and the other one is pushgateway, which is a Prometheus pushgateway.
Operator
The operator container runs all the controllers as well as an Aggregated API Server.
Controllers
Controllers watch various Kubernetes resources as well as the custom resources introduced by Stash. They apply the backup or restore logic to a target resource when users request it.
Aggregated API Server
The Aggregated API Server self-hosts the validating and mutating webhooks and runs an Extension API Server for the Snapshot resource.
Mutating Webhook: Stash uses a Mutating Webhook to inject a backup sidecar or a restore init-container into a workload if a backup or restore process is configured for it. It is also used for defaulting custom resources.
Validating Webhook: A Validating Webhook is used to validate the custom resource objects.
Snapshot Server: Stash uses a Kubernetes Extension API Server to provide view and list capabilities for backed-up snapshots. When a user requests Snapshot objects, the Snapshot server reads the respective information directly from the backend repository and returns an object representation in a Kubernetes-native way.
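As a rough illustration of the translation step the Snapshot server performs, the sketch below wraps one record, shaped like an entry from `restic snapshots --json`, in a Kubernetes-style object. The function name `to_snapshot_object`, the object naming scheme, the `apiVersion` string, and the `status` layout are assumptions made for this example and may not match what Stash actually returns.

```python
import json


def to_snapshot_object(repository: str, namespace: str, record: dict) -> dict:
    """Wrap one restic snapshot record in a Kubernetes-style object (illustrative only)."""
    short_id = record["id"][:8]  # first 8 characters of the restic snapshot ID
    return {
        # Assumed API group/version; verify against your Stash release.
        "apiVersion": "repositories.stash.appscode.com/v1alpha1",
        "kind": "Snapshot",
        "metadata": {
            # Hypothetical naming scheme: <repository>-<short snapshot id>
            "name": f"{repository}-{short_id}",
            "namespace": namespace,
            "labels": {"repository": repository},
            "creationTimestamp": record["time"],
        },
        "status": {
            "hostname": record.get("hostname", ""),
            "paths": record.get("paths", []),
            "tree": record.get("tree", ""),
        },
    }


# A made-up record using field names that restic emits in its JSON listing.
restic_record = json.loads("""
{"id": "02b0ed42a1f6c3d5", "time": "2019-07-25T17:41:31Z",
 "hostname": "host-0", "paths": ["/source/data"], "tree": "abc123"}
""")

snapshot = to_snapshot_object("local-repo", "demo", restic_record)
print(snapshot["metadata"]["name"])
```

Because the object is built on request from backend data, nothing has to be persisted in the cluster for a snapshot to be listed.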
Pushgateway
The pushgateway container runs a Prometheus pushgateway. All the backup sidecars/jobs and restore init-containers/jobs send Prometheus metrics to this pushgateway after completing their backup or restore process. A Prometheus server can then scrape those metrics from the pushgateway.
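To make the push step concrete, here is a minimal sketch of how a client could prepare a push to a Prometheus Pushgateway. It relies on the gateway's documented `/metrics/job/<job_name>` endpoint and the Prometheus text exposition format. The gateway address, the job name, and the metric names are hypothetical; the metrics Stash really emits are not listed in this document.

```python
from urllib.parse import quote


def build_push_request(base_url: str, job: str, metrics: dict) -> tuple:
    """Return (url, body) for pushing gauge metrics to a Prometheus Pushgateway."""
    # The job name becomes part of the grouping key in the URL path.
    url = f"{base_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The text exposition format requires a trailing newline.
    body = "\n".join(lines) + "\n"
    return url, body


url, body = build_push_request(
    "http://pushgateway.example:9091",  # hypothetical address
    "demo-backup",                      # hypothetical job name
    {
        "example_backup_success": 1,               # hypothetical metric
        "example_backup_duration_seconds": 12.5,   # hypothetical metric
    },
)
print(url)
print(body, end="")
```

To actually send the metrics, the pair could be passed to `urllib.request.Request(url, data=body.encode(), method="POST")` and opened with `urllib.request.urlopen`. The sketch stops short of that so it runs without a gateway.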
Backend
Backend is the storage where Stash stores backed-up files. It can be cloud storage such as a GCS bucket, AWS S3, or Azure Blob Storage, or a Kubernetes volume such as NFS or a PersistentVolumeClaim. To learn more about backends, please visit here.
CronJob
When a user creates a BackupConfiguration object, Stash creates a CronJob with the schedule specified in it. At each scheduled slot, this CronJob triggers a backup for the targeted workload.
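The relationship between a BackupConfiguration and its CronJob can be sketched as a simple mapping. The example below models both objects as plain dictionaries. The field names under `spec`, the `stash.appscode.com/v1beta1` API version, and the CronJob naming scheme are assumptions for illustration, and the job template the operator generates is left out entirely.

```python
def cronjob_for(backup_config: dict) -> dict:
    """Derive a CronJob-like manifest from a BackupConfiguration-like dict (illustrative)."""
    meta = backup_config["metadata"]
    return {
        "apiVersion": "batch/v1",  # older clusters serve CronJob from batch/v1beta1
        "kind": "CronJob",
        "metadata": {
            "name": f"stash-backup-{meta['name']}",  # hypothetical naming scheme
            "namespace": meta["namespace"],
        },
        # The schedule is copied verbatim from the BackupConfiguration.
        "spec": {"schedule": backup_config["spec"]["schedule"]},
    }


backup_config = {
    "apiVersion": "stash.appscode.com/v1beta1",  # assumed API version
    "kind": "BackupConfiguration",
    "metadata": {"name": "deployment-backup", "namespace": "demo"},
    "spec": {
        "schedule": "*/5 * * * *",  # standard cron syntax: every five minutes
        "repository": {"name": "local-repo"},
        "target": {
            "ref": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "stash-demo"}
        },
    },
}

cron_job = cronjob_for(backup_config)
print(cron_job["metadata"]["name"], cron_job["spec"]["schedule"])
```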
Backup Sidecar / Backup Job
When a user creates a BackupConfiguration object, Stash injects a sidecar into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet etc.). This sidecar takes a backup when the respective CronJob triggers one. If the target is a database or a stand-alone volume, Stash creates a job to take the backup at each trigger.
Restore Init-Container / Restore Job
When a user creates a RestoreSession object, Stash injects an init-container into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet etc.). This init-container performs the restore process when the workload restarts. If the target is a database or a stand-alone volume, Stash creates a job to restore the target.
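The choice between injecting a container and launching a job, for both backup and restore, depends only on the kind of target. A minimal sketch of that decision follows. The function names are hypothetical, the workload set contains only the kinds named in the text, and `PersistentVolumeClaim` and `AppBinding` are used here as assumed stand-ins for a stand-alone volume and a database.

```python
# Workload kinds named in the text; the list there ends with "etc.",
# so this set is not exhaustive.
WORKLOAD_KINDS = {"Deployment", "DaemonSet", "StatefulSet"}


def backup_mechanism(target_kind: str) -> str:
    """Workloads get a backup sidecar; any other target is backed up by a job."""
    return "sidecar" if target_kind in WORKLOAD_KINDS else "job"


def restore_mechanism(target_kind: str) -> str:
    """Workloads get a restore init-container; any other target is restored by a job."""
    return "init-container" if target_kind in WORKLOAD_KINDS else "job"


for kind in ("Deployment", "PersistentVolumeClaim", "AppBinding"):
    print(f"{kind}: backup via {backup_mechanism(kind)}, restore via {restore_mechanism(kind)}")
```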
Custom Resources
Stash uses Custom Resource Definitions (CRDs) to specify the targets and behavior of the backup and restore processes in a Kubernetes-native way. This section gives a brief overview of the custom resources used by Stash.
Repository
A Repository specifies the backend storage system where the backed-up data will be stored. A user has to create a Repository object for each backup target. Only one target can be backed up into one Repository. For details about Repository, please visit here.
BackupConfiguration
A BackupConfiguration specifies the backup target, its behavior (schedule, retention policy etc.), and the Repository object that holds the backend information. A user has to create one BackupConfiguration object for each backup target. When a user creates a BackupConfiguration, Stash creates a CronJob for it and injects a backup sidecar into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet etc.). For more details about BackupConfiguration, please visit here.
BackupSession
A BackupSession object represents a backup run of a target. It is created by the respective CronJob at each scheduled time slot. It refers to a BackupConfiguration object for the necessary configuration. The controller that runs inside the backup sidecar (or the Stash operator itself, when the backup runs via a job) watches this BackupSession object and starts the backup immediately. A user can also create a BackupSession object manually to trigger an instant backup. For more details about BackupSessions, please visit here.
RestoreSession
A RestoreSession specifies what to restore and the source of the data. A user has to create a RestoreSession object to restore a target. When a RestoreSession is created, Stash injects an init-container into the target workload (or launches a job if the target is not a workload) to perform the restore. For more details about RestoreSession, please visit here.
Function
A Function is a template for a container that performs only a specific action. For example, the pg-backup function only dumps the database and uploads the dumped file to the backend, whereas the update-status function updates the status of the respective BackupSession and Repository and sends Prometheus metrics to pushgateway based on the output of another function. For more details about Function, please visit here.
Task
A complete backup or restore process may consist of several steps. For example, to back up a PostgreSQL database we first need to dump the database, upload the dumped file to the backend, and then update the Repository and BackupSession status and send Prometheus metrics. We represent such individual steps via Function objects. An entire backup or restore process needs an ordered execution of one or more functions. A Task specifies an ordered collection of functions along with their parameters. Function and Task enable users to extend or customize the backup/restore process. For more details about Task, please visit here.
BackupBlueprint
A BackupBlueprint enables users to provide a blueprint for Repository and BackupConfiguration objects. The user then only needs to add some annotations to the workload to be backed up, and Stash automatically creates the respective Repository and BackupConfiguration according to the blueprint. In this way, users can create a single blueprint for all similar types of workloads and back them up simply by applying annotations to them. In Stash parlance, we call this process Auto Backup. For more details about BackupBlueprint, please visit here.
BackupBatch
Sometimes, a single stateful component may not meet the requirements of your application. For example, to deploy WordPress, you will need a Deployment for WordPress itself and another Deployment for the database that stores its contents. You may want to back up both the WordPress deployment and the database under a single configuration, as they are parts of a single application.
A BackupBatch is a Kubernetes CustomResourceDefinition (CRD) which lets you configure backup for multiple co-related stateful components (workload, database etc.) under a single configuration. For more details, please visit here.
AppBinding
An AppBinding holds the information necessary to connect to a database. For more details about AppBinding, please visit here.
Snapshot
A Snapshot is a representation of a backup snapshot in a Kubernetes-native way. Stash uses a Kubernetes Extension API Server for handling Snapshots. For more details about Snapshots, please visit here.