Function

What is a Function

A complete backup or restore process may consist of several steps. For example, to back up a PostgreSQL database, we first need to dump the database and upload the dumped file to a backend. Then we need to update the status of the respective Repository and BackupSession and send Prometheus metrics. In Stash, each such individual step is called a Function.

A Function is a Kubernetes CustomResourceDefinition (CRD) that specifies a template for a container that performs one specific action. For example, the postgres-backup-* Function only dumps the database and uploads the dumped file to the backend, whereas the update-status Function updates the status of the respective BackupSession and Repository and sends Prometheus metrics to the pushgateway based on the output of the postgres-backup-* Function.

When you install Stash, some Functions are pre-installed for supported targets such as databases. You can also create your own Function to customize or extend the backup/restore process.

Function CRD Specification

Like any official Kubernetes resource, a Function has TypeMeta, ObjectMeta and Spec sections. However, unlike other Kubernetes resources, it does not have a Status section.

A sample Function object to back up a PostgreSQL database is shown below:

apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: postgres-backup-11.2
spec:
  image: stashed/postgres-stash:11.2
  args:
  - backup-pg
  - --provider=${REPOSITORY_PROVIDER:=}
  - --bucket=${REPOSITORY_BUCKET:=}
  - --endpoint=${REPOSITORY_ENDPOINT:=}
  - --path=${REPOSITORY_PREFIX:=}
  - --secret-dir=/etc/repository/secret
  - --scratch-dir=/tmp
  - --hostname=${HOSTNAME:=host-0}
  - --pg-args=${pgArgs:=}
  - --namespace=${NAMESPACE:=default}
  - --app-binding=${TARGET_NAME:=}
  - --retention-keep-last=${RETENTION_KEEP_LAST:=0}
  - --retention-prune=${RETENTION_PRUNE:=false}
  - --output-dir=${outputDir:=}
  - --enable-cache=${ENABLE_CACHE:=true}
  - --max-connections=${MAX_CONNECTIONS:=0}
  volumeMounts:
  - name: ${secretVolume}
    mountPath: /etc/repository/secret
  runtimeSettings:
    container:
      resources:
        requests:
          memory: 256M
        limits:
          memory: 256M
      securityContext:
        runAsUser: 5000
        runAsGroup: 5000

A sample Function that updates the BackupSession and Repository status and sends metrics to the Prometheus pushgateway is shown below:

apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: update-status
spec:
  image: appscode/stash:pg
  args:
  - update-status
  - --namespace=${NAMESPACE:=default}
  - --repository=${REPOSITORY_NAME:=}
  - --backup-session=${BACKUP_SESSION:=}
  - --restore-session=${RESTORE_SESSION:=}
  - --output-dir=${outputDir:=}

The following sections describe the fields of a Function CRD.

Function Spec

A Function object has the following fields in the spec section:

spec.image

spec.image specifies the Docker image used to create a container from the template specified in this Function.

spec.command

spec.command specifies the command to be executed by the container. If no command is specified, the Docker image’s ENTRYPOINT is executed.

spec.args

spec.args specifies a list of arguments that will be passed to the entrypoint. You can templatize this section using envsubst-style variables. Stash resolves all the variables before creating the respective container. A variable must follow one of these patterns:

  • ${VARIABLE_NAME:=default-value}
  • ${VARIABLE_NAME:=}

In the first case, if Stash can’t resolve the variable, the default value will be used in place of this variable. In the second case, if Stash can’t resolve the variable, an empty string will be used to replace the variable.
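
As a rough illustration of the two patterns, the fragment below reuses two arguments from the sample Function above. The comments restate the substitution rules described in the text; the resolved value demo is a hypothetical example, not output captured from Stash.

```yaml
# Fragment of a Function spec, for illustration only.
args:
# First pattern: if NAMESPACE resolves to, say, "demo" (hypothetical value),
# the argument becomes --namespace=demo; if it cannot be resolved,
# the default is used and the argument becomes --namespace=default.
- --namespace=${NAMESPACE:=default}
# Second pattern: if pgArgs cannot be resolved, the variable is replaced
# by an empty string and the argument becomes --pg-args=
- --pg-args=${pgArgs:=}
```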

Stash Provided Variables

The Stash operator provides the following built-in variables based on BackupConfiguration, BackupSession, RestoreSession, Repository, Task, Function, BackupBlueprint, etc.

| Environment Variable | Usage |
|---|---|
| NAMESPACE | Namespace of backup or restore job/workload |
| BACKUP_SESSION | Name of the respective BackupSession object |
| RESTORE_SESSION | Name of the respective RestoreSession object |
| REPOSITORY_NAME | Name of the Repository object that holds respective backend information |
| REPOSITORY_PROVIDER | Type of storage provider, e.g. gcs, s3, aws, local etc. |
| REPOSITORY_SECRET_NAME | Name of the secret that holds the credentials to access the backend |
| REPOSITORY_BUCKET | Name of the bucket where backed up data will be stored |
| REPOSITORY_PREFIX | A prefix of the directory inside bucket where backed up data will be stored |
| REPOSITORY_ENDPOINT | URL of S3-compatible Minio/Rook server |
| REPOSITORY_URL | URL of the REST server for REST backend |
| HOSTNAME | An identifier for the backed up data. If multiple pods back up to the same Repository (e.g. StatefulSet or DaemonSet), this host name is used to identify the data of the individual host. |
| SOURCE_HOSTNAME | An identifier of the host whose backed up data will be restored |
| TARGET_NAME | Name of the target of backup or restore |
| TARGET_API_VERSION | API version of the target of backup or restore |
| TARGET_KIND | Kind of the target of backup or restore |
| TARGET_NAMESPACE | Namespace of the target object for backup or restore |
| TARGET_MOUNT_PATH | Directory where target PVC will be mounted in stand-alone PVC backup or restore |
| TARGET_PATHS | Array of file paths that are subject to backup |
| RESTORE_PATHS | Array of file paths that are subject to restore |
| RESTORE_SNAPSHOTS | Name of the snapshot that will be restored |
| TARGET_APP_VERSION | Version of the application pointed by an AppBinding |
| TARGET_APP_GROUP | The application group where the app pointed by an AppBinding belongs |
| TARGET_APP_RESOURCE | The resource kind under an application group that the app pointed by an AppBinding works with |
| TARGET_APP_TYPE | The full type of the application, in the form TARGET_APP_GROUP/TARGET_APP_RESOURCE |
| TARGET_APP_REPLICAS | Number of replicas of an application targeted for backup or restore |
| RETENTION_KEEP_LAST | Number of latest snapshots to keep |
| RETENTION_KEEP_HOURLY | Number of hourly snapshots to keep |
| RETENTION_KEEP_DAILY | Number of daily snapshots to keep |
| RETENTION_KEEP_WEEKLY | Number of weekly snapshots to keep |
| RETENTION_KEEP_MONTHLY | Number of monthly snapshots to keep |
| RETENTION_KEEP_YEARLY | Number of yearly snapshots to keep |
| RETENTION_KEEP_TAGS | Keep only those snapshots that have these tags |
| RETENTION_PRUNE | Specifies whether to remove data of old snapshots completely from the backend |
| RETENTION_DRY_RUN | Specifies whether to run cleanup in test mode |
| ENABLE_CACHE | Specifies whether to use cache during backup or restore |
| MAX_CONNECTIONS | Number of parallel connections used to upload/download data to/from the backend |
| NICE_ADJUSTMENT | Adjustment value to configure nice to throttle the load on the CPU |
| IONICE_CLASS | Name of the ionice class |
| IONICE_CLASS_DATA | Value of the ionice class data |
| ENABLE_STATUS_SUBRESOURCE | Specifies whether the CRD has the status subresource enabled |
| PROMETHEUS_PUSHGATEWAY_URL | URL of the Prometheus pushgateway that collects the backup/restore metrics |
| INTERIM_DATA_DIR | Directory to store backed up or restored data temporarily before uploading to the backend or injecting into the target |

If you want to use a variable that is not present in this table, you have to provide its value in the spec.task.params section of the BackupConfiguration CRD.
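
For instance, the pgArgs variable used by the sample postgres-backup-11.2 Function is not a built-in variable. A minimal sketch of supplying it might look like the fragment below. The object name, namespace, Task name, and the parameter value are assumptions made for illustration, and the name/value layout of each params entry is assumed as well.

```yaml
# Illustrative fragment of a BackupConfiguration; not a complete object.
apiVersion: stash.appscode.com/v1beta1
kind: BackupConfiguration
metadata:
  name: sample-postgres-backup   # hypothetical name
  namespace: demo                # hypothetical namespace
spec:
  task:
    name: postgres-backup-11.2   # assumed name of a Task that uses the Function above
    params:
    - name: pgArgs               # resolves ${pgArgs:=} in the Function's args
      value: "--clean"           # example value only
  # Other fields (schedule, repository, target, ...) are omitted here.
```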

spec.workDir

spec.workDir specifies the container’s working directory. If this field is not specified, the container’s runtime default will be used.

spec.ports

spec.ports specifies a list of the ports to expose from the respective container that will be created for this function.

spec.volumeMounts

spec.volumeMounts specifies a list of volume names and their mountPath that will be mounted into the container that will be created for this function.

spec.volumeDevices

spec.volumeDevices specifies a list of the block devices to be used by the container that will be created for this function.

spec.runtimeSettings

spec.runtimeSettings.container lets you configure the runtime environment of a backup job at the container level. You can configure the following container-level parameters:

| Field | Usage |
|---|---|
| resources | Compute resources required by the sidecar container or backup job. See the Kubernetes documentation on managing resources for containers. |
| livenessProbe | Periodic probe of the backup sidecar/job container’s liveness. The container will be restarted if the probe fails. |
| readinessProbe | Periodic probe of the backup sidecar/job container’s readiness. The container will be removed from service endpoints if the probe fails. |
| lifecycle | Actions that the management system should take in response to container lifecycle events. |
| securityContext | Security options that the backup sidecar/job’s container should run with. See the Kubernetes documentation on security contexts. |
| nice | Sets the CPU scheduling priority for the backup process. See the nice manual page for details. |
| ionice | Sets the I/O scheduling class and priority for the backup process. See the ionice manual page for details. |
| env | A list of the environment variables to set in the container that will be created for this function. |
| envFrom | Sets environment variables in the container that will be created for this function from a Secret or ConfigMap. |
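
A sketch of how several of these fields might be combined is shown below. The resources and securityContext values are copied from the sample Function above. The sub-field names under nice and ionice are assumptions inferred from the NICE_ADJUSTMENT, IONICE_CLASS, and IONICE_CLASS_DATA variables, and the environment variable is hypothetical.

```yaml
# Fragment of a Function spec, for illustration only.
spec:
  runtimeSettings:
    container:
      resources:
        requests:
          memory: 256M
        limits:
          memory: 256M
      securityContext:
        runAsUser: 5000
        runAsGroup: 5000
      nice:
        adjustment: 5          # assumed sub-field name; example value
      ionice:
        class: 2               # assumed sub-field name; example value
        classData: 4           # assumed sub-field name; example value
      env:
      - name: EXAMPLE_VAR      # hypothetical environment variable
        value: "example"
```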

spec.podSecurityPolicyName

If you are using a PSP-enabled cluster and the function needs a specific permission, you can specify the PSP name using the spec.podSecurityPolicyName field. Stash will add this PSP to the respective RBAC roles that will be created for this function.

Note that the Stash operator cannot grant a backup job permission to use a PSP that the operator itself does not have permission to use. So, if you specify a PSP name in this field, make sure to add that PSP to the stash-operator ClusterRole too.
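
The two pieces could be sketched as follows. The Function name, image, and PSP name are hypothetical. The rule in the second document uses the standard Kubernetes RBAC form for granting use of a PodSecurityPolicy; where exactly it belongs in your stash-operator ClusterRole depends on how Stash was installed.

```yaml
# Hypothetical Function that references a PSP by name.
apiVersion: stash.appscode.com/v1beta1
kind: Function
metadata:
  name: my-function                    # hypothetical name
spec:
  image: example.com/my-backup:latest  # hypothetical image
  podSecurityPolicyName: my-psp        # hypothetical PSP name
---
# Rule to append to the "rules" list of the stash-operator ClusterRole
# so that the operator itself is allowed to use the same PSP.
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  resourceNames: ["my-psp"]
  verbs: ["use"]
```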

Next Steps

  • Learn how to use a Function to create a Task.