Since moving to a laptop as my primary device, I have had an underutilized desktop sitting around. That, along with a box full of old hard drives, made me want to build a home NAS to store my ever-growing collection of data. The machine already had Arch Linux installed on it. I would have installed FreeBSD so ZFS would be better supported, but I wanted to use this machine as a Docker host as well, and Docker is still pretty experimental on FreeBSD. In this post I will go over the process of setting up my new ZFS-based file server.
Intro to ZFS
ZFS was originally developed as a proprietary file system for Solaris by Sun Microsystems. Since then it has been open-sourced, then forked back to closed source (boo Oracle). The current open source version is known as OpenZFS, and the Linux port is called ZFS on Linux (ZoL). All references to ZFS in this post are references to ZoL. Because ZFS is licensed under the CDDL, a free (but GPL-incompatible) license, it is not included in the mainline Linux kernel and has to be installed as a separate module.
There are quite a few interesting features of ZFS. One pretty unique one is that it combines the volume manager and the file system, which makes the storage easier to manage. ZFS is a copy-on-write (CoW) file system. This means that new data is written to a new block instead of overwriting the old data, so if an error occurs during a write, the old data is still available. ZFS also places a large emphasis on data integrity. By storing checksums of the data in the metadata, it can detect errors and, when a redundant copy is available, automatically repair them.
Some terms that come up throughout this post:
- vdev: a grouping of drives
- zpool: the data structure that sits atop one or more vdevs
- dataset: a file system or snapshot of the data
- snapshot: a read-only, point-in-time copy of a dataset that can be rolled back to
- scrub: the process of verifying data against the checksums stored in the metadata
Preparing the Hard Drives
The first step was to prepare the hard drives that I was going to use. Much of this process is documented in some way on the Arch Wiki.
All the drives I have are pretty old, and I didn’t want to put a faulty drive into a new pool, so I tested each drive using the built-in SMART tools. Note that many of the following commands will require you to run as a privileged user.
$ smartctl -t long /dev/whatever
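The long self-test runs in the background on the drive itself, so the command above returns right away and the results have to be read back later. A minimal sketch of how that could look, assuming smartmontools is installed; the function name smart_report is made up for this example:

```shell
# Hypothetical helper (not from the original setup): print the health
# verdict and the self-test log for a single drive.
smart_report() {
    disk="$1"
    # -H prints the drive's overall health self-assessment
    smartctl -H "$disk"
    # -l selftest lists finished and in-progress self-tests
    smartctl -l selftest "$disk"
}
```

Run it as a privileged user once the test has had time to finish, for example smart_report /dev/whatever.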
I also tested with hdparm to check the speeds of each disk, as one slow disk can throw off the speed of the whole array.
$ hdparm -Tt /dev/whatever
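After testing, I cleared any old software RAID superblock from each disk, so there wouldn’t be the possibility of errors from stale array information floating around on the drives. Note that this command only removes md RAID metadata; it does not erase partition tables or the data itself.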
$ mdadm --misc --zero-superblock /dev/whatever
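If old file system or partition-table signatures are a concern as well, wipefs from util-linux is one option. This is a sketch of an alternative, not what I ran for this build, and the function name clear_signatures is made up. It is destructive, so double-check the device path first:

```shell
# Hypothetical helper: show, then erase, every signature wipefs can find.
# DESTRUCTIVE: the disk will no longer be recognized as holding a
# file system, RAID member, or partition table.
clear_signatures() {
    disk="$1"
    wipefs "$disk"          # with no options, only lists the signatures
    wipefs --all "$disk"    # erases all signatures found on the device
}
```

Usage would be clear_signatures /dev/whatever, run as a privileged user.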
Everything on the hardware side is now ready to go. After redoing my cable management, it was time to install the software.
My machine runs Arch Linux, and someone is kind enough to maintain a custom repository for ZFS installs.
To add the user repository to the pacman.conf file:
$ printf '[archzfs]\nServer = http://archzfs.com/$repo/x86_64\n' >> /etc/pacman.conf
To add the unofficial signing key:
$ pacman-key -r <keyid>
$ pacman-key -f <keyid>
$ pacman-key --lsign-key <keyid>
There is a lag between kernel updates and ZFS driver updates, so it was necessary to downgrade my kernel for this. If you need to downgrade your kernel, the Arch Wiki has a good resource.
From there I installed ZFS:
$ pacman -S zfs-linux
This will pull in any other dependencies as necessary.
Setting up the Storage Pool
I had room for four disks on my machine, and I wanted to have more storage than what was allowed by a 2x2 mirror, so I went with RAIDZ. For a detailed discussion on the different RAID levels ZFS supports, see the discussion linked at the end of this post. It is also important to note that growing the size of your array requires either adding an entirely new vdev (good luck if you used all your SATA ports) or replacing the drives one by one and rebuilding the array. This is something you should be aware of if you ever plan on upgrading your storage capacity.
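To make the capacity trade-off concrete, here is a rough sketch of the arithmetic for equal-sized disks. The 2 TB disk size is a made-up figure for illustration, and the numbers ignore ZFS metadata and padding overhead, so real usable space will be somewhat lower:

```shell
# Rough usable capacity, in TB, for N equal disks of a given size.
# Illustrative only: ignores metadata, padding, and reserved space.
usable_tb() {
    layout="$1"; disks="$2"; size_tb="$3"
    case "$layout" in
        mirror) echo $(( disks / 2 * size_tb )) ;;    # striped two-way mirrors
        raidz)  echo $(( (disks - 1) * size_tb )) ;;  # one disk's worth of parity
        *)      echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

usable_tb mirror 4 2   # 2x2 mirror
usable_tb raidz 4 2    # single-parity RAIDZ
```

With four 2 TB disks this prints 4 for the mirror layout and 6 for RAIDZ, which is the extra space I was after. The price is that RAIDZ survives only a single disk failure.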
In the following example, I will be creating a pool named zstore.
Load the kernel module:
$ modprobe zfs
$ echo zfs >> /etc/modules-load.d/zfs.conf
Enable ZFS system services:
$ systemctl enable --now zfs.target
$ systemctl enable --now zfs-import-cache
$ systemctl enable --now zfs-mount
Don’t use the standard /dev/sdX names, as these can change between reboots, causing ZFS to fail when it tries to use the wrong disks. To get the list of disk IDs:
$ ls -lh /dev/disk/by-id
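The by-id listing also contains one entry per partition, which you don't want to hand to zpool create by accident. A small sketch of a filter that keeps only whole-disk entries, assuming the usual udev naming where partitions carry a -partN suffix; the function name and the sample IDs below are made up:

```shell
# Hypothetical filter: drop partition entries (names ending in -partN)
# and pass whole-disk IDs through unchanged.
whole_disks() {
    grep -v -- '-part[0-9][0-9]*$'
}

# Demo with made-up IDs; only the first and last line survive the filter.
printf '%s\n' ata-EXAMPLE_DISK_A ata-EXAMPLE_DISK_A-part1 wwn-0x1234 | whole_disks
```

On a real machine you would feed it the directory listing instead: ls /dev/disk/by-id | whole_disks. The same disk usually shows up under more than one name (for example an ata- and a wwn- entry), so pick one naming scheme and stick with it.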
Creating the zpool:
$ mkdir /mnt/zstore
$ zpool create -m /mnt/zstore zstore raidz all-your-disk-ids
This will create your main storage pool. If you run zpool status, you should get output displaying stats on your newly created pool. And that is pretty much it. You can now add files to your pool, snapshot it, scrub it, etc.
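For reference, snapshotting and scrubbing could look roughly like this. It is a sketch rather than a record of exactly what I ran; the function name and the snapshot label are made up:

```shell
# Hypothetical helper: snapshot the pool's root dataset, then start a scrub.
snapshot_and_scrub() {
    pool="$1"; label="$2"
    zfs snapshot "${pool}@${label}"   # read-only point-in-time copy
    zfs list -t snapshot              # confirm the snapshot exists
    zpool scrub "$pool"               # verify data against its checksums
    zpool status "$pool"              # shows scrub progress and any errors
}
```

For example, snapshot_and_scrub zstore before-import takes a snapshot named zstore@before-import and kicks off a scrub. To go back to that state later, zfs rollback zstore@before-import discards everything written since the snapshot.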
Instead of having one big pool of data, I recommend creating separate datasets depending on what you are putting there. For example, have a separate dataset for your media files:
$ zfs create zstore/media
That way you can granularly control settings such as compression depending on the file type and intended use case.
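As an illustration of per-dataset settings, here is a sketch that creates a dataset with a chosen compression algorithm and reads the property back. The function name and the zstore/documents dataset are assumptions made for this example:

```shell
# Hypothetical helper: create a dataset with a given compression setting
# and print the property to confirm it took effect.
create_dataset() {
    name="$1"; compression="$2"
    zfs create -o compression="$compression" "$name"
    zfs get compression "$name"
}
```

Something like create_dataset zstore/documents lz4 makes sense for text-heavy files. Media files are usually compressed already, so for a dataset that exists you could turn compression off with zfs set compression=off zstore/media.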
Sharing the Data
The next step for me was being able to access the files easily from my laptop or other devices.
I decided to go with NFS.
$ pacman -S nfs-utils
Create a bind mount to the ZFS dataset (the target directory, /srv/nfs/media, has to exist first):
$ echo "/mnt/zstore/media /srv/nfs/media none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0" >> /etc/fstab
Add desired shares to /etc/exports:
$ echo "/srv/nfs/media *(rw)" >> /etc/exports
Note that this can be done natively with ZFS as well:
$ zfs set sharenfs=on zstore
Enable the service:
$ systemctl enable --now nfs-server
Install nfs-utils on your client machine as well, and you should now be able to access your ZFS shares over your network.
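On the client, mounting the share could look something like the sketch below. The server name and the local mount point are assumptions; substitute your file server's hostname or IP address. The remote path matches the /etc/exports entry from earlier:

```shell
# Hypothetical client-side helper: mount the media share exported above.
mount_share() {
    server="$1"; target="$2"
    mkdir -p "$target"                                  # local mount point
    mount -t nfs "${server}:/srv/nfs/media" "$target"   # path from /etc/exports
}
```

For example, mount_share nas.local /mnt/media, run as a privileged user. If the mount fails, showmount -e followed by the server name lists what the server is actually exporting.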
It has been a couple of weeks since I set all of this up, and the storage pool has been working great so far. I find that I am mostly limited on the network side of things (need to go to gigabit).
One thing I wish were supported is native encryption. There is an experimental patch, but I am waiting until it is more fleshed out. Right now I am just encrypting any private files with GPG before storing them.
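A minimal sketch of that GPG step, assuming passphrase-based (symmetric) encryption; public-key encryption with --encrypt and --recipient works just as well. The function name and the paths in the usage example are made up:

```shell
# Hypothetical helper: encrypt a file with a passphrase and write the
# result into a directory on the pool, leaving the original untouched.
encrypt_for_storage() {
    src="$1"; dest_dir="$2"
    gpg --symmetric --output "${dest_dir}/$(basename "$src").gpg" "$src"
}
```

For example, encrypt_for_storage ~/taxes.pdf /mnt/zstore/private prompts for a passphrase and writes /mnt/zstore/private/taxes.pdf.gpg. Decrypting later is gpg --decrypt --output taxes.pdf taxes.pdf.gpg.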
-  ZFS wiki page
-  ZFS on Linux
-  The FreeBSD handbook has a pretty good section on ZFS.
-  The Arch Wiki page on ZFS. Even if you don’t use Arch Linux, the documentation there is typically pretty great.
-  demizerone, the Arch ZFS maintainer.
-  archzfs user repository graciously maintained by demizer
-  Arch Linux Downgrading a Package
-  A discussion on ZFS Raid Levels
-  A discussion on the importance of planning for your ZFS storage needs.
-  ArchWiki NFS Page