GlusterFS is a scale-out clustered filesystem for highly available
storage. It's open source, dual-licensed under the GPL v2 or the
LGPL v3. It uses either Ethernet or InfiniBand for the interconnect
between the storage nodes, which means it supports TCP/IP, RDMA and SDP.

GlusterFS is set up using servers and clients, where clients use the FUSE
driver to communicate with the servers. Servers communicate with each
other, and are set up by exporting "bricks" to the cluster. It uses
eventual consistency, rather than strong consistency, for data integrity.

Rather than using a centralized metadata server, as MooseFS and Ceph
do, it uses an elastic hashing algorithm to determine where data is
stored and where to retrieve it. GlusterFS also supports geo-replication,
which keeps a mirror of the cluster elsewhere for disaster recovery.
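To make the idea of hash-based placement concrete, here is a minimal sketch in Python. It is not GlusterFS's actual algorithm: the MD5 stand-in hash, the even split of a 32-bit hash space, and the brick names are all assumptions chosen for illustration. The point is only that any client can compute a file's location from its name, with no metadata server to ask.

```python
import hashlib

HASH_SPACE = 2**32  # assumed 32-bit hash range, split evenly among bricks


def assign_ranges(bricks):
    """Give each brick one contiguous slice of the hash space."""
    step = HASH_SPACE // len(bricks)
    ranges = []
    for i, brick in enumerate(bricks):
        start = i * step
        # The last brick absorbs any remainder from the integer division.
        end = HASH_SPACE - 1 if i == len(bricks) - 1 else start + step - 1
        ranges.append((start, end, brick))
    return ranges


def hash_name(filename):
    """Map a file name to a 32-bit integer (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def locate(filename, ranges):
    """Return the brick whose range contains the file name's hash."""
    h = hash_name(filename)
    for start, end, brick in ranges:
        if start <= h <= end:
            return brick
    raise ValueError("hash outside every range")


# Hypothetical brick names for a 3-node cluster.
bricks = ["server1:/pool/brick", "server2:/pool/brick", "server3:/pool/brick"]
ranges = assign_ranges(bricks)
for name in ["report.pdf", "photo.jpg", "notes.txt"]:
    print(name, "->", locate(name, ranges))
```

Because the lookup depends only on the file name and the range table, every client that holds the same table arrives at the same brick independently.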

I'll show how to set up a basic 3-node storage cluster and discuss some
server topologies. I'll cover the differences between:

* Distributed volumes
* Replicated volumes
* Striped volumes
* Distributed replicated volumes
* Distributed striped volumes
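
As a rough way to see how these volume types relate, the sketch below models a volume as a list of bricks plus a replica or stripe count. It is a simplified model, not GlusterFS code: the function name `describe_volume` and the brick names are hypothetical. It assumes that the brick count must be a multiple of the replica or stripe count, and that consecutive bricks in the list form one replica or stripe set.

```python
def describe_volume(bricks, replica=1, stripe=1):
    """Classify a volume and group its bricks (simplified model)."""
    if replica > 1 and stripe > 1:
        raise ValueError("this sketch does not model combining replica and stripe")
    group_size = max(replica, stripe)
    if len(bricks) % group_size != 0:
        raise ValueError("brick count must be a multiple of the replica or stripe count")
    # Consecutive bricks form one replica set or stripe set.
    groups = [bricks[i:i + group_size] for i in range(0, len(bricks), group_size)]
    if group_size == 1:
        kind = "distributed"
    else:
        base = "replicated" if replica > 1 else "striped"
        # More than one set means files are also distributed across the sets.
        kind = base if len(groups) == 1 else "distributed " + base
    return kind, groups


# Hypothetical brick names.
bricks = [f"server{n}:/pool/brick" for n in range(1, 5)]

print(describe_volume(bricks[:3]))             # three single-brick groups
print(describe_volume(bricks[:3], replica=3))  # one set holding every brick
print(describe_volume(bricks, replica=2))      # two sets of two bricks
print(describe_volume(bricks, stripe=2))       # two stripe sets of two bricks
```

In this model a "distributed" prefix simply means there is more than one set, so each file lands on one set and is then replicated or striped within it.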

I'll only cover the features in version 3.3, as 3.4 has not yet been
released, although I'll mention some things we should see when it is.
I'll also be making some light comparisons to other clustered storage
technologies, in case anyone is familiar with those already. I'll be
using ZFS on Linux as the underlying filesystem of choice in the
presentation.



Excellent presentation. It explained what GlusterFS is, what it isn't, and how to set it up.