Vault 2015 Notes: Second Day Morning

Maxim’s FUSE improvement talk. The writeback cache was on the first slide when I arrived. The writeback cache reduced write latency and allowed parallel writeback processing. Writes accumulated in the page cache, and kernel writeback would kick off the actual I/O. I vaguely heard “tripled”. The performance comparison showed both baseline vs. improved (~30% better) and commodity hardware vs. Dell EQL SAN (mixed results). Future improvements included eliminating the global lock, variable message size, multi-queue, and NUMA affinity. The FUSE daemon might be able to talk to multiple queues in /dev/fuse and thus avoid contention. Oracle was said to be submitting patches to do just those things, and the patches were said to improve performance quite a bit. Ben England from Red Hat asked about zero copy inside FUSE. Maxim pondered kernel bypass for a second but hesitated to come to a conclusion. Jeff Darcy asked whether FUSE API changes were needed to take advantage of these features; the answer seemed to be “not much”. The second and following questions were about invalidating the writeback cache while one client still held the cached data; the answer seemed to be “it depends” (expect “stale” data). The writeback cache could be disabled, but only at the volume level.
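To make the writeback idea concrete for myself, here is a minimal Python sketch. It is a toy model, not FUSE or kernel code: the `WritebackCache` class, the `backend_write` callback and the 4096-byte page size are all my own assumptions. It only shows the shape of the idea: small writes return after dirtying cached pages, and a later flush turns adjacent dirty pages into fewer, larger I/O requests.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class WritebackCache:
    """Toy model: writes dirty pages in memory, a later flush does the I/O."""

    def __init__(self, backend_write):
        # backend_write(offset, data) is a hypothetical callback standing in
        # for the request that would go to the userspace daemon.
        self.backend_write = backend_write
        self.dirty = {}  # page index -> bytearray(PAGE_SIZE)

    def write(self, offset, data):
        # Copy data into cached pages and return without touching the backend.
        pos = 0
        while pos < len(data):
            page, start = divmod(offset + pos, PAGE_SIZE)
            chunk = data[pos:pos + PAGE_SIZE - start]
            # Simplification: a new page starts zero-filled instead of being
            # read from the backend first.
            buf = self.dirty.setdefault(page, bytearray(PAGE_SIZE))
            buf[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def flush(self):
        # Coalesce each run of adjacent dirty pages into one backend request.
        pages = sorted(self.dirty)
        i = 0
        while i < len(pages):
            j = i
            while j + 1 < len(pages) and pages[j + 1] == pages[j] + 1:
                j += 1
            data = b"".join(bytes(self.dirty[p]) for p in pages[i:j + 1])
            self.backend_write(pages[i] * PAGE_SIZE, data)
            i = j + 1
        self.dirty.clear()


requests = []
cache = WritebackCache(lambda off, data: requests.append((off, len(data))))
for n in range(8):  # eight small sequential 1 KiB writes
    cache.write(n * 1024, b"x" * 1024)
cache.flush()
print(requests)  # one coalesced request instead of eight small ones
```

The same picture also hints at the “stale” data answer: until the flush runs, the new data exists only in this one cache, so anyone reading through another path sees the old contents.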

Anand’s talk on Glusterfs and NFS Ganesha. Ganesha has become much better since the last time I worked on it: stackable FSALs, RDMA (libmooshika), dynamic exports. His focus was on CMAL (cluster manager abstraction layer), i.e. making active/active NFS heads possible. And you don’t need to have a clustered filesystem to use the CMAL framework. CMAL is able to migrate the service IP. The clustered Ganesha with Glusterfs used VIP and Pacemaker/Corosync (could it scale?). Each Ganesha node is notified by a DBUS message to initiate migration. The active/active tricks seemed to be embedded in the protocols: NLM (for v3, via SM_NOTIFY) and STALE_CLIENTID/STALE_STATEID (for v4). Jeff Layton didn’t object to such an architecture. Anand’s next topic was pNFS with Glusterfs, File Layout of course; anonymous FD was mentioned. This appeared to be a more economical and scalable alternative. Questions were on Ganesha vs. in-kernel NFS server performance and on cluster scalability.
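The v4 half of that trick can be sketched in a few lines of Python. This is a toy model of the general NFSv4 recovery idea, not Ganesha or CMAL code: `NfsHead`, `Client` and the epoch counter are hypothetical names of mine, and status codes are plain strings. A client ID is only valid on the server instance that issued it, so after the service IP moves, the new head answers with a stale-client-ID error and the client re-establishes its state there.

```python
import itertools


class NfsHead:
    """Toy NFSv4 head: a client ID is tied to the instance that issued it."""

    _epochs = itertools.count(1)  # stand-in for a per-instance boot verifier

    def __init__(self, name):
        self.name = name
        self.epoch = next(NfsHead._epochs)

    def set_clientid(self, owner):
        # The returned ID embeds this instance's epoch.
        return (self.epoch, owner)

    def renew(self, clientid):
        # An ID issued by another instance is rejected as stale.
        if clientid[0] != self.epoch:
            return "NFS4ERR_STALE_CLIENTID"
        return "NFS4_OK"


class Client:
    def __init__(self, owner, head):
        self.owner = owner
        self.head = head  # whichever head currently answers on the VIP
        self.clientid = head.set_clientid(owner)
        self.recoveries = 0

    def renew(self):
        status = self.head.renew(self.clientid)
        if status == "NFS4ERR_STALE_CLIENTID":
            # The old state is gone with the old head: get a new client ID.
            # A real client would then reclaim its opens and locks.
            self.clientid = self.head.set_clientid(self.owner)
            self.recoveries += 1
            status = self.head.renew(self.clientid)
        return status


node_a, node_b = NfsHead("a"), NfsHead("b")
client = Client("host1", node_a)
first = client.renew()   # served by node_a
client.head = node_b     # the VIP migrated to node_b
second = client.renew()  # stale on node_b, then recovered
print(first, second, client.recoveries)
```

The client never needs to know which node it is talking to; the error code alone is enough to trigger recovery, which is presumably why no clustered filesystem is required for the failover itself.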

Venky’s Glusterfs compliance topic started in a low-key tone. But think about it, there are many opportunities in his framework: BitRot detection, tiering, dedupe, and compression were quickly covered. It is easy to double that list and point to a use case. The new Glusterfs journal features a callback mechanism and supports a richer format. The “log mining” is done on individual bricks, so it could require some programming to get the volume-level picture, especially for distributed volumes. The metadata journals contain enough information that if, say, you would like to run forensics utilities, the journals could be very helpful for plotting the data lifecycle.
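As a rough idea of the “some programming” involved, here is a small Python sketch that merges per-brick records into a volume-level timeline and then groups them per object. The record layout (timestamp, brick, operation, object) and the sample data are made up for illustration; the real journal format is richer and I did not note it down.

```python
import heapq
from collections import defaultdict

# Hypothetical per-brick journals, each already ordered by timestamp.
brick1 = [(100, "brick1", "CREATE", "file-a"), (140, "brick1", "WRITE", "file-a")]
brick2 = [(120, "brick2", "CREATE", "file-b"), (150, "brick2", "UNLINK", "file-b")]


def volume_timeline(*brick_journals):
    """Merge sorted per-brick journals into one time-ordered stream."""
    return list(heapq.merge(*brick_journals, key=lambda rec: rec[0]))


def lifecycle(timeline):
    """Group operations by object so each object's history reads in order."""
    history = defaultdict(list)
    for ts, brick, op, obj in timeline:
        history[obj].append((ts, op, brick))
    return dict(history)


timeline = volume_timeline(brick1, brick2)
for record in timeline:
    print(record)
print(lifecycle(timeline)["file-b"])
```

The merge assumes brick clocks are comparable, which is a big assumption on a real cluster; a forensics tool would need to deal with clock skew and replicated copies of the same object.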


