Redis + PVC: A Match Made in Data Persistence Heaven
Introduction
Recently, I developed several standalone applications and deployed them in the cloud. Each of these applications needed some form of data storage, and choosing a suitable storage solution involved weighing multiple factors.
Considerations
When selecting a database, there are several factors to take into account, and the most suitable database varies from case to case. But in general, these were the key factors I considered:
- The structure of the data: Is the data separated into multiple models with relations between them, requiring a more complex form of retrieval? In my case the applications didn't need anything like that, so a simple key/value store would suffice.
- Ease of use: The cognitive cost of learning and setting up a working database that fulfils the requirements.
- Cost: The price of running a database can vary, and minimising costs without impacting performance was important.
- Community: The documentation, support and number of users. Using a database with frequent breaking changes was not an option.
Selecting a database
In theory, a simple key-value object like a JavaScript dictionary would fulfil the minimum requirements of the application. But performance was important, and the applications could have periods of high load requiring multiple replicas, which means they would need some form of shared memory. The choice fell on Redis, an in-memory database. There were a number of reasons why, but the main ones were:
- The applications required no complex query language and only needed simple set and get operations (see the client sketch after the compose example below).
- Redis is very fast and has low latency.
- Redis has a large community and is an industry standard when it comes to in-memory databases.
- Developing with Redis locally is also very simple and can be done through a docker-compose.yml file. An example compose file that runs a Node application and a local Redis instance:
docker-compose.yml
version: '3.9'
services:
  cache:
    image: redis:alpine
    ports:
      - '6379:6379'
  app:
    env_file:
      - .env
    image: node:18
    working_dir: /app
    environment:
      - REDIS_URI=redis://cache:6379
    volumes:
      - ./:/app
    command: npm run dev
    ports:
      - '8080:8080'
    depends_on:
      - cache
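To show just how little client code the set/get use case needs, here is a minimal sketch of how the app service above could talk to Redis. It assumes the node-redis client (the redis npm package) and reads the REDIS_URI variable from the compose file; the key name is made up for the example, not taken from my applications.

import { createClient } from 'redis';

async function main() {
  // REDIS_URI is injected by docker-compose (redis://cache:6379).
  const client = createClient({ url: process.env.REDIS_URI });
  client.on('error', (err) => console.error('Redis error', err));
  await client.connect();

  // Simple set and get is all the applications needed.
  await client.set('last-deploy', new Date().toISOString());
  console.log(await client.get('last-deploy'));

  await client.quit();
}

main();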
Deployment and persistence
For deployment, Kubernetes was chosen due to its powerful ecosystem and community (and the fact that a cluster was readily available to deploy more artifacts to).
As this article focuses on Redis and data persistence, I will not include the manifests related to the application here. As mentioned briefly earlier, Redis is an in-memory database, which means that the default behaviour is to wipe all data upon system restarts or crashes. Luckily, there is a way around this tradeoff by making use of a brilliant Kubernetes feature called Persistent Volume Claims (or PVC for short).
What are Persistent Volume Claims (PVC)?
A PVC is a way to request specific storage resources, such as disk space or IOPS, and have the cluster automatically provision the storage and make it available to the pod(s) the PVC is bound to. When a pod is deleted, the PVC is not deleted automatically; the storage resources still exist in the cluster. This makes it possible to keep the data even after the pod is gone and to reuse it later.
Creating the manifest
The manifest I used to persist data is included below. It is a fairly simple manifest that uses most of the default configuration without any key eviction strategy (due to the relatively small amount of data to be stored). Another option for persisting the data would be a managed Redis instance, although that would cost a lot more 💵!
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-service-config
  namespace: casper-prod
  labels:
    app: redis-service
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pv-claim
  namespace: casper-prod
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-service
  namespace: casper-prod
  labels:
    app: redis-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-service
  template:
    metadata:
      labels:
        app: redis-service
    spec:
      containers:
        - name: redis-service
          image: redis:4.0.11-alpine
          args: ['/usr/local/etc/redis/redis.conf']
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/redis/redis.conf
              readOnly: true
              subPath: redis.conf
            - name: redis-persistent-storage
              mountPath: /data/redis
          ports:
            - containerPort: 6379
          resources:
            limits:
              cpu: '1000m'
              memory: '2000Mi'
            requests:
              cpu: '1000m'
              memory: '2000Mi'
      volumes:
        - name: config
          configMap:
            defaultMode: 0666
            name: redis-service-config
        - name: redis-persistent-storage
          persistentVolumeClaim:
            claimName: redis-pv-claim
---
kind: Service
apiVersion: v1
metadata:
  name: redis-service
  namespace: casper-prod
  labels:
    app: redis-service
spec:
  selector:
    app: redis-service
  ports:
    - name: redis-service
      port: 6379
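One thing to note: the ConfigMap above is shown without its data, and the PVC is mounted at /data/redis rather than /data, which is where the official Redis image writes by default. For the mounted redis.conf to actually persist data onto the volume, it needs to point Redis at that directory and enable some form of persistence. Below is a minimal sketch of what such a config could look like; the exact contents of my redis.conf are not reproduced in this article, so treat the dir and appendonly settings as illustrative assumptions rather than the original configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-service-config
  namespace: casper-prod
  labels:
    app: redis-service
data:
  redis.conf: |
    # Write data to the directory backed by the PVC mount
    # (assumption, matching the mountPath in the Deployment above).
    dir /data/redis
    # Enable append-only-file persistence so writes survive restarts
    # (illustrative; RDB snapshots via `save` rules would also work).
    appendonly yes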
With this setup, data is persisted even when the application crashes, Kubernetes updates or the pod shuts down. It is also possible to mount the PVC to multiple Redis instances, although this is not recommended: writing to the same data leads to locking and data consistency problems, and with the ReadWriteOnce access mode the volume can only be mounted by pods on a single node anyway. If the application requires sharing data between multiple Redis instances, check out Redis Cluster instead.
Conclusion
In conclusion, when creating standalone applications and deploying them in the cloud, data storage is an important factor to consider. Applications have different needs, and selecting the right storage can make or break an application. When selecting Redis, a fast, low-latency in-memory database, one has to be aware that it does not persist data across restarts by default. This can be addressed with PVCs, which provide dedicated, persistent storage to pods, making them a great tool to know about and something I will consider as an alternative to managed database instances in the future.