RBD Volume Mount Fencing in Kubernetes

Kubernetes can use RBD images as persistent block storage for Linux containers. However, only one container can safely mount an RBD volume in read-write mode. If multiple containers write to the same RBD volume without higher-level coordination, data corruption is likely, as reported in a recent case.

One intuitive solution is to have the persistent block storage provider restrict client mounts. For instance, Google Compute Engine’s Persistent Disk allows only one read-write mount.

Another approach is fencing. An RBD image writer must hold an exclusive lock on the image for as long as the image is mounted. If the writer fails to acquire the lock, it is safe to assume the image is in use by another writer, so it should not attempt to mount the RBD volume. As a result, only one writer can use the image at a time, which prevents the corruption.
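The lock-then-mount rule can be sketched with a small in-memory model. This is only an illustration of the idea, not the code from the pull request: `FakeImage`, `fenced_mount`, and `fenced_unmount` are hypothetical names, and no real RBD or Kubernetes API is called.

```python
class ImageLockedError(Exception):
    """Raised when another client already holds the exclusive lock."""


class FakeImage:
    """In-memory stand-in for an RBD image (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.locker = None      # client holding the exclusive lock, if any
        self.mounted_by = None  # client that has the image mounted, if any

    def take_lock(self, client):
        # Exclusive means at most one holder: refuse if anyone holds it.
        if self.locker is not None:
            raise ImageLockedError(f"{self.name} is locked by {self.locker}")
        self.locker = client

    def drop_lock(self, client):
        # Only the current holder may release the lock.
        if self.locker == client:
            self.locker = None


def fenced_mount(image, client):
    """Mount read-write only after taking the lock; return True on success."""
    try:
        image.take_lock(client)
    except ImageLockedError:
        return False  # someone else is writing, so do not mount
    image.mounted_by = client
    return True


def fenced_unmount(image, client):
    """Unmount first, then release the lock so another writer can proceed."""
    if image.mounted_by == client:
        image.mounted_by = None
    image.drop_lock(client)
```

In this model, a first call such as `fenced_mount(image, "host-a")` succeeds, and a second call from `"host-b"` returns `False` until `fenced_unmount(image, "host-a")` has run.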

This is what the RBD volume mount fencing pull request does for Kubernetes. I ran the following test and found that it fixes the mount race.

I have two Fedora 21 hosts. Each has my fix loaded and runs as a local cluster:


# ./hack/local-up-cluster.sh

# start the rest of the kubectl routines

Then each local cluster creates a Pod that uses the same RBD volume:


# ./cluster/kubectl.sh create -f examples/rbd/rbd.json

Check the lock on the RBD image:


# rbd lock list foo --pool kube
There is 1 exclusive lock on this image.
Locker       ID                       Address
client.4494  kubelet_lock_magic_host  10.16.154.78:0/1026846
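If you want to check the lock holder from a script rather than by eye, the listing above can be parsed with a few lines of Python. This is a rough sketch that assumes the three-column layout shown here and that lock IDs contain no whitespace; `parse_lock_list` is a hypothetical helper, and the sample text is copied from the output above.

```python
def parse_lock_list(text):
    """Return a list of (locker, lock_id, address) tuples from lock-list text."""
    locks = []
    for line in text.splitlines():
        fields = line.split()
        # Lock rows have exactly three fields and a locker like "client.4494";
        # this skips the summary line and the column header.
        if len(fields) == 3 and fields[0].startswith("client."):
            locks.append(tuple(fields))
    return locks


SAMPLE = """\
There is 1 exclusive lock on this image.
Locker ID Address
client.4494 kubelet_lock_magic_host 10.16.154.78:0/1026846
"""
```

Calling `parse_lock_list(SAMPLE)` yields a single tuple whose last field is the address of the host that currently holds the lock.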


On both clusters, get the Pod status. One cluster has a running Pod, while the other shows its Pod as pending.

Running Pod:


# ./cluster/kubectl.sh get pod
NAME   READY   REASON    RESTARTS   AGE
rbd    1/1     Running   0          5m


The other Pod:


# ./cluster/kubectl.sh get pod
NAME   READY   REASON                                                    RESTARTS   AGE
rbd3   0/1     Image: kubernetes/pause is ready, container is creating   0          4m

When I delete the running Pod, the second one immediately starts running.
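The hand-off between the two hosts can be modeled in a few lines. This is a toy simulation, not the kubelet's actual logic: I am assuming the pending host simply retries the lock on each sync, and `ExclusiveLock`, `sync_pod`, and the host names are hypothetical.

```python
class ExclusiveLock:
    """In-memory stand-in for the exclusive lock on one RBD image."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, client):
        # Take the lock if it is free; report whether this client holds it.
        if self.holder is None:
            self.holder = client
        return self.holder == client

    def release(self, client):
        # Only the current holder may release the lock.
        if self.holder == client:
            self.holder = None


def sync_pod(lock, client):
    """One sync attempt: the Pod runs only if its host holds the lock."""
    return "Running" if lock.try_acquire(client) else "Pending"


def simulate():
    """Replay the test: two hosts race, then the running Pod is deleted."""
    lock = ExclusiveLock()
    history = []
    history.append((sync_pod(lock, "host-a"), sync_pod(lock, "host-b")))
    # Deleting the running Pod unmounts the volume and releases the lock
    # (an assumption of this model).
    lock.release("host-a")
    history.append(("Deleted", sync_pod(lock, "host-b")))
    return history
```

In this model the first round leaves one Pod running and the other pending, and the pending Pod starts running on its next sync after the lock is released, which matches what I observed.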

So with this fix, Pods do get fenced off.
