Replace Disk for Lustre on ZFS JBODs

by Andrew Wagner — last modified Mar 10, 2014 11:42 AM

 

Important Note: The definitive source for Lustre documentation is the Lustre Operations Manual available at https://wiki.hpdd.intel.com/display/PUB/Documentation.

These documents are copied from internal SSEC working documentation that may be useful for some, but we provide no guarantee of accuracy, correctness, or safety. Use at your own risk.

 

How-To for Replacing Disks for Lustre on ZFS

Note that this procedure is specifically for ZFS with JBOD-attached disks. If you use a RAID card, your procedure will be somewhat different.

Identify Disk

The correct enclosure and slot should be listed in your monitoring system's warning about the problem (we use nagios/check_mk).

For example:

arch01e37s9

That translates to the Arch01 rack, the enclosure at rack unit 37, and the disk in slot 9. To identify the correct slot, check the numbering on the side of the front of the enclosure. Before going upstairs to pull the disk, note the path to the disk in /etc/zfs/vdev_id.conf and save it. We're still trying to determine whether new disks change that path.
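Before heading upstairs, it can help to confirm the alias's current mapping. A minimal check, assuming the standard ZFS on Linux alias file; the by-path value shown is purely hypothetical:

grep arch01e37s9 /etc/zfs/vdev_id.conf

# expected output (path is illustrative only):
# alias arch01e37s9       /dev/disk/by-path/pci-0000:03:00.0-sas-phy8-lun-0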

Replace Disk Physically

Pull the correct disk out of the slot and insert the replacement. Its activity lights will not come on immediately, so don't wait for it to initialize.

Replace Disk in the ZFS Pool

ZFS will note the absence of the disk. Check the pool state with:

zpool status
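The missing disk should show up as UNAVAIL or REMOVED under its alias. A trimmed sketch of what the output might look like (the raidz2 layout and counters are hypothetical; the pool name matches the example later in this document):

  pool: server1-ost19
 state: DEGRADED
config:

        NAME             STATE     READ WRITE CKSUM
        server1-ost19    DEGRADED     0     0     0
          raidz2-0       DEGRADED     0     0     0
            arch01e37s8  ONLINE       0     0     0
            arch01e37s9  UNAVAIL      0     0     0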

We need to make the alias arch01e37s9 match up with the new disk's path. List the disks by path:

ls -l /dev/disk/by-path/
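A trimmed, hypothetical listing; your controller paths and device names will differ:

pci-0000:03:00.0-sas-phy8-lun-0        -> ../../sdaf
pci-0000:03:00.0-sas-phy9-lun-0        -> ../../sdag
pci-0000:03:00.0-sas-phy9-lun-0-part1  -> ../../sdag1
pci-0000:03:00.0-sas-phy9-lun-0-part9  -> ../../sdag9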

The new disk will not have any partitions on it (it will show up as /dev/sdaf or a similar name with no partition links hanging off it, whereas disks already in the pool show two). Copy the new path and use it to replace the old path for that disk in /etc/zfs/vdev_id.conf if the two differ. Afterwards, run

udevadm trigger

That enables the alias path.
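To double-check before rebuilding, you can verify that the alias now resolves to the new device through the by-vdev links that the vdev_id udev helper maintains (the device name here is illustrative):

ls -l /dev/disk/by-vdev/ | grep arch01e37s9

# expected output, trimmed (hypothetical device name):
# arch01e37s9 -> ../../sdaf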

Now you can issue the rebuild command to ZFS. If this disk were in the pool "server1-ost19", for example:

zpool replace server1-ost19 arch01e37s9
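The one-argument form works here because the replacement disk took over the old disk's alias and path; zpool replace then looks for the new device at the same location as the old one. If the new disk had appeared at a different path, you would name both devices, i.e. zpool replace server1-ost19 <old-device> <new-device>.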

If you run zpool status again, you'll see the disk resilvering into the pool, with a rate and an estimated time to completion.
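An illustrative snippet of that output (the timestamps, sizes, and rates are made up):

  scan: resilver in progress since Mon Mar 10 11:42:00 2014
        1.21T scanned out of 21.8T at 350M/s, 16h20m to go
        55.3G resilvered, 5.55% done

Once the resilver completes, the pool should return to the ONLINE state.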