Properly Partitioning a HW RAID Volume over 2TB in size
- November 4th, 2012
Like most geeks, I’ve got lots of files and a need for spacious network shares. However, as soon as you go over 2TB on a single partition, you run into questions about how to correctly partition a drive or RAID volume, especially with the newer 4096-byte sector spec known simply as Advanced Format. I initially stumbled upon a series of articles written by Roderick W. Smith over on IBM’s DeveloperWorks while searching for how to properly check partition alignment under Linux for SSDs, but I quickly had an “Uh Oh” moment when I realized SSDs weren’t my only problem: I had in fact been flying blind on larger-than-2TB partitions under Linux. Thankfully I didn’t have systems in production where this was a problem, but I was literally building a system at work where this could bite me in the rear!
To bottom-line it for you: if you’re using software RAID or a true HW RAID setup, you need to align your file system to the RAID stripe, based on the number of disks involved, the type of RAID, and the stripe size in KB. My example below is strictly for Areca ARC RAID cards and might not be applicable to your setup.
I’m using an ARC-1680ix-24 with 4GB of RAM on board. I’ve got 24 500GB HDs, with one volume using all of them in a RAID 6 raidset. So for me, the math when using an EXT4 filesystem works out like this:
chunk size = 128KB (For Areca ARC-1680s it's your Stripe size.)
block size = 4KB (My desired partition is over 2TB. 6.5TB to be exact.)
stride = chunk / block = 128KB / 4KB = 32
stripe-width = stride * ((# of disks in RAID) - (# of RAID parity disks))
             = 32 * ((using all 24 disks) - (RAID 6 uses 2 parity disks))
             = 32 * (24 - 2) = 32 * 22 = 704
In other words, once I’ve created the basic partition on my device and I’m ready to format it as EXT4, I’ll end up using:
mkfs.ext4 -v -m .1 -b 4096 -E stride=32,stripe-width=704 /dev/sdd1
I’m not going to go into every detail of the line so please read RAID Setup over on Kernel.org for the skinny.
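If you want to redo the arithmetic for a different array, here is a minimal sketch of the same calculation as a shell script. The variable names are my own invention, and the values are the ones from my Areca setup, so swap in your own. It only prints the resulting mkfs.ext4 line; it doesn't format anything.

```shell
#!/bin/sh
# Sketch: derive ext4 stride and stripe-width from the RAID geometry.
# Variable names are illustrative only; values match the example above.

chunk_kb=128      # RAID stripe (chunk) size per disk, in KB
block_kb=4        # ext4 block size in KB (mkfs.ext4 -b 4096)
total_disks=24    # disks in the raidset
parity_disks=2    # RAID 6 uses two disks' worth of parity

stride=$((chunk_kb / block_kb))
data_disks=$((total_disks - parity_disks))
stripe_width=$((stride * data_disks))

echo "stride=$stride stripe-width=$stripe_width"

# Print the command instead of running it, so nothing gets formatted by accident.
echo "mkfs.ext4 -v -m .1 -b 4096 -E stride=$stride,stripe-width=$stripe_width /dev/sdd1"
```

For a RAID 5 set you would set parity_disks=1; everything else stays the same.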
So from Soup to Nuts:
- Use parted to create the partition table on my new volume & create a massive single partition at the right offset:
parted /dev/sdd
- Once in the parted tool:
mklabel gpt
unit s
mkpart primary 2048s 100%
name 1 BFS
quit
- Next, you’ll need to create an EXT4 filesystem inside your newly created partition from above:
mkfs.ext4 -v -m .1 -b 4096 -E stride=32,stripe-width=704 /dev/sdd1
- Now you’ll want to add it to your fstab file so let’s grab the UUID from blkid next:
blkid -o list
- And edit your /etc/fstab file so it will automount it at boot time
UUID=8e0a7d10-blah-blah-tomatoes-are-yummy-b4a0f6a13c15 /bfs ext4 defaults 1 2
- Finally, create the mount point and mount it.
sudo mkdir /bfs
sudo mount /bfs
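As a sanity check on the 2048s starting offset used above, here is a small sketch that tests whether a partition's start lands on a chunk boundary. The is_aligned function is a hypothetical helper of my own, and it assumes 512-byte logical sectors unless you pass a third argument. The commented commands at the bottom are the real tools you'd point at the device itself.

```shell
#!/bin/sh
# is_aligned START_SECTOR UNIT_KB [SECTOR_BYTES]
# Hypothetical helper: succeeds when the partition's starting byte offset
# is a whole multiple of UNIT_KB.
is_aligned() {
    start_sector=$1
    unit_kb=$2
    sector_bytes=${3:-512}   # assumption: 512-byte logical sectors
    [ $(( (start_sector * sector_bytes) % (unit_kb * 1024) )) -eq 0 ]
}

# 2048 sectors * 512 bytes = 1MiB, checked against the 128KB chunk size.
if is_aligned 2048 128; then
    echo "start sector 2048 is aligned to the 128KB chunk"
else
    echo "start sector 2048 is NOT aligned to the 128KB chunk"
fi

# On the live system (needs root, device names from the example above):
#   parted /dev/sdd align-check optimal 1   # ask parted whether partition 1 is aligned
#   cat /sys/block/sdd/sdd1/start           # partition start, in 512-byte sectors
#   tune2fs -l /dev/sdd1 | grep -i raid     # read back the stride / stripe width
```

The old DOS-era default of starting at sector 63 fails this check, which is exactly the kind of misalignment the articles linked below warn about.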
Here are a few links I found noteworthy while going down this rabbit hole:
https://raid.wiki.kernel.org/index.php/RAID_setup#Calculation (shows the actual formula shown above & values)
http://insights.oetiker.ch/linux/raidoptimization.html (a great read)
http://ubuntuforums.org/showthread.php?t=1715375 (Rod weighs in with helpful advice in the Ubuntu forums)
http://www.gnu.org/software/parted/manual/parted.html (parted’s man page over on GNU.ORG)
http://en.wikipedia.org/wiki/Ext4
http://lwn.net/Articles/377897/ (talks about 4KB sector size disks and let the panic ensue!)
http://whattheit.wordpress.com/2011/08/23/linux-aligning-partitions-to-a-hardware-raid-stripe/ (lots of theory but looks incomplete)
Last 4 links are from Rod Smith:
http://www.rodsbooks.com/gdisk/advice.html (using gdisk but applicable to parted)
http://www.rodsbooks.com/gdisk/index.html (main gdisk site!)
http://www.ibm.com/developerworks/linux/library/l-4kb-sector-disks/ (talks about the severe performance effects if you gloss over this stuff!)
http://www.ibm.com/developerworks/linux/library/l-gpt/ (good overview of GPT & understanding why they’re moving away from MBR)
(Personal Note: By finally posting this up on my blog I can close 10 tabs I’ve had open since middle of 2011!)