Stash Architecture
Stash is a Kubernetes operator for restic. At its heart, Stash is a Kubernetes controller. It uses Custom Resource Definitions (CRDs) to specify the targets and behavior of the backup and restore process in a Kubernetes-native way. A simplified architecture of Stash is shown below:
Components
Stash consists of several components that implement the backup and restore logic. This section gives a brief overview of these components.
Stash Operator
When a user installs Stash, the installation creates a Kubernetes Deployment, typically named stash-operator. This Deployment controls the entire backup and restore process. The stash-operator Deployment runs two containers: operator, which performs the core functionality of Stash, and pushgateway, which is a Prometheus Pushgateway.
Operator
The operator container runs all the controllers as well as an Aggregated API Server.
Controllers
Controllers watch various Kubernetes resources as well as the custom resources introduced by Stash. They apply the backup or restore logic to a target resource when a user requests it.
Aggregated API Server
The Aggregated API Server self-hosts the validating and mutating webhooks and runs an Extension API Server for the Snapshot resource.
- Mutating Webhook: Stash uses the Mutating Webhook to inject a backup sidecar or a restore init-container into a workload if a backup or restore process is configured for it. It is also used for defaulting custom resources.
- Validating Webhook: The Validating Webhook is used to validate the custom resource objects.
- Snapshot Server: Stash uses a Kubernetes Extension API Server to provide view and list capability for backed up snapshots. When a user requests Snapshot objects, the Snapshot server reads the respective information directly from the backend repository and returns an object representation in a Kubernetes-native way.
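The injection performed by a mutating webhook can be pictured as a small, idempotent patch on the workload's pod template. The following is only a sketch under stated assumptions: the workload is modeled as a plain dictionary, and the function name `inject_sidecar`, the container name, and the image are hypothetical, not Stash's actual implementation.

```python
import copy


def inject_sidecar(workload: dict, sidecar: dict) -> dict:
    """Return a copy of `workload` with `sidecar` added to its pod template."""
    patched = copy.deepcopy(workload)  # never mutate the caller's object
    pod_spec = patched["spec"]["template"]["spec"]
    containers = pod_spec.setdefault("containers", [])
    # Idempotent: skip injection if a container with that name already exists.
    if all(c.get("name") != sidecar["name"] for c in containers):
        containers.append(sidecar)
    return patched


# Hypothetical workload and sidecar, for illustration only.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [{"name": "app", "image": "nginx"}]}}},
}
sidecar = {"name": "backup-sidecar", "image": "example/backup:latest"}

once = inject_sidecar(deployment, sidecar)
twice = inject_sidecar(once, sidecar)  # a second admission pass changes nothing
```

Making the patch idempotent matters because an admission webhook may see the same object more than once, for example on every update of the workload.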
Pushgateway
The pushgateway container runs a Prometheus Pushgateway. All the backup sidecars/jobs and restore init-containers/jobs send Prometheus metrics to this Pushgateway after completing their backup or restore process. A Prometheus server can then scrape those metrics from the Pushgateway.
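As a rough illustration of what such a push involves, the sketch below builds (but does not send) an HTTP request against the Pushgateway's `/metrics/job/<job>` endpoint using the Prometheus text exposition format. The gateway address, the job name, and the metric names are assumptions made for the example; they are not the metrics Stash actually emits.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url: str, job: str, metrics: dict) -> urllib.request.Request:
    """Build a PUT request that would push `metrics` for `job` to a Pushgateway."""
    # Text exposition format: one "<name> <value>" sample per line.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this job
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical address and metric names, for illustration only.
req = build_push_request(
    "http://pushgateway:9091/",
    "demo-backup",
    {"backup_success": 1, "backup_duration_seconds": 12.5},
)
```

Passing `req` to `urllib.request.urlopen` would perform the push; it is left out here so the example runs without a live Pushgateway.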
Backend
The backend is the storage where Stash stores backed up files. It can be a cloud storage service such as a GCS bucket, AWS S3, or Azure Blob Storage, or a Kubernetes volume such as NFS or a PersistentVolumeClaim. To learn more about backends, please visit here.
CronJob
When a user creates a BackupConfiguration object, Stash creates a CronJob with the schedule specified in it. At each scheduled slot, this CronJob triggers a backup for the targeted workload.
Backup Sidecar / Backup Job
When a user creates a BackupConfiguration object, Stash injects a sidecar into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet). This sidecar takes a backup whenever the respective CronJob triggers one. If the target is a database or a stand-alone volume, Stash creates a job to take the backup at each trigger.
Restore Init-Container / Restore Job
When a user creates a RestoreSession object, Stash injects an init-container into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet). This init-container performs the restore process when the workload restarts. If the target is a database or a stand-alone volume, Stash creates a job to restore the target.
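The choice between the injected container and the job, as described in the two sections above, can be summarized as a lookup on the target kind. This is a simplified sketch: the function names are hypothetical, and the set of workload kinds lists only the three examples named in the text, not every kind Stash may treat as a workload.

```python
# Workload kinds named in the text; the real list may be longer.
WORKLOAD_KINDS = {"Deployment", "DaemonSet", "StatefulSet"}


def backup_mechanism(target_kind: str) -> str:
    """Workloads get a sidecar; databases and stand-alone volumes get a job."""
    return "sidecar" if target_kind in WORKLOAD_KINDS else "job"


def restore_mechanism(target_kind: str) -> str:
    """Workloads get an init-container; other targets get a job."""
    return "init-container" if target_kind in WORKLOAD_KINDS else "job"


mechanisms = {
    kind: (backup_mechanism(kind), restore_mechanism(kind))
    for kind in ("Deployment", "StatefulSet", "PersistentVolumeClaim")
}
```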
Custom Resources
Stash uses Custom Resource Definitions (CRDs) to specify the targets and behavior of the backup and restore process in a Kubernetes-native way. This section gives a brief overview of the custom resources used by Stash.
- Repository - A Repository specifies the backend storage system where the backed up data will be stored. A user has to create a Repository object for each backup target. Only one target can be backed up into one Repository. For details about Repository, please visit here.
- BackupConfiguration - A BackupConfiguration specifies the backup target, its behavior (schedule, retention policy, etc.), and the Repository object that holds the backend information. A user has to create one BackupConfiguration object for each backup target. When a user creates a BackupConfiguration, Stash creates a CronJob for it and injects a backup sidecar into the target if it is a workload (e.g. Deployment, DaemonSet, StatefulSet). For more details about BackupConfiguration, please visit here.
- BackupSession - A BackupSession object represents a backup run of a target. It is created by the respective CronJob at each scheduled time slot and refers to a BackupConfiguration object for the necessary configuration. The controller running inside the backup sidecar (or the Stash operator itself, when the backup runs as a job) watches this BackupSession object and starts the backup immediately. A user can also create a BackupSession object manually to trigger an instant backup. For more details about BackupSession, please visit here.
- RestoreSession - A RestoreSession specifies what to restore and the source of the data. A user has to create a RestoreSession object to restore a target. When a RestoreSession is created, Stash injects an init-container into the target workload (or launches a job if the target is not a workload) to perform the restore. For more details about RestoreSession, please visit here.
- Function - A Function is a template for a container that performs only a specific action. For example, the pg-backup function only dumps the database and uploads the dumped file to the backend, whereas the update-status function updates the status of the respective BackupSession and Repository and sends Prometheus metrics to the pushgateway based on the output of another function. For more details about Function, please visit here.
- Task - A complete backup or restore process may consist of several steps. For example, to back up a PostgreSQL database we first need to dump the database and upload the dumped file to the backend, and then update the Repository and BackupSession status and send Prometheus metrics. Such individual steps are represented by Function objects. An entire backup or restore process needs an ordered execution of one or more functions. A Task specifies an ordered collection of functions along with their parameters. Function and Task enable users to extend or customize the backup and restore process. For more details about Task, please visit here.
- BackupBlueprint - A BackupBlueprint enables users to provide a blueprint for the Repository and BackupConfiguration objects. The user then only needs to add some annotations to the workload to be backed up, and Stash automatically creates the respective Repository and BackupConfiguration according to the blueprint. In this way, users can create a single blueprint for all similar types of workloads and back them up just by applying annotations to them. In Stash parlance, this process is called Auto Backup. For more details about BackupBlueprint, please visit here.
- BackupBatch - Sometimes, a single stateful component may not meet the requirements of your application. For example, to deploy WordPress, you need a Deployment for WordPress itself and another Deployment for the database that stores its contents. You may want to back up both the Deployment and the database under a single configuration, as they are parts of a single application. A BackupBatch is a Kubernetes CustomResourceDefinition (CRD) which lets you configure backup for multiple related stateful components (workload, database, etc.) under a single configuration. For more details, please visit here.
- AppBinding - An AppBinding holds the information necessary to connect to a database. For more details about AppBinding, please visit here.
- Snapshot - A Snapshot is a representation of a backup snapshot in a Kubernetes-native way. Stash uses a Kubernetes Extension API Server for handling Snapshots. For more details about Snapshot, please visit here.
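The relationship between Function and Task described above, an ordered collection of single-purpose steps with their parameters, can be modeled in a few lines. This is only a conceptual sketch: the Python classes, the `Step` wrapper, the shared `state` dictionary, and the task name are assumptions made for illustration and do not mirror the actual Stash CRD schema. Only the function names pg-backup and update-status come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Function:
    """A single-purpose action: (params, state) -> updated state."""
    name: str
    action: Callable[[dict, dict], dict]


@dataclass
class Step:
    """A Function paired with the parameters a Task passes to it."""
    function: Function
    params: dict = field(default_factory=dict)


@dataclass
class Task:
    """An ordered collection of steps, executed one after another."""
    name: str
    steps: List[Step]

    def run(self) -> dict:
        state: dict = {}
        for step in self.steps:  # order matters: later steps read earlier output
            state = step.function.action(step.params, state)
        return state


def pg_backup(params: dict, state: dict) -> dict:
    # Stand-in for "dump the database and upload the dump to the backend".
    return {**state, "uploaded": f"{params['database']}.sql", "succeeded": True}


def update_status(params: dict, state: dict) -> dict:
    # Stand-in for "update status based on the output of another function".
    phase = "Succeeded" if state.get("succeeded") else "Failed"
    return {**state, "phase": phase}


task = Task(
    name="pg-backup-task",  # hypothetical task name
    steps=[
        Step(Function("pg-backup", pg_backup), {"database": "demo"}),
        Step(Function("update-status", update_status)),
    ],
)
result = task.run()
```

Keeping each function small and letting the task define the order is what makes the process extensible: a custom step can be added to the list without changing the existing functions.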







