Wednesday, 12 March 2014

mdadm with 4TB HDDs + kernel tweaks for improved speeds pt2 (tweaking)

Creating a Linux software RAID with HDDs greater than 4TB and kernel tweaking for increased speeds pt2 (tweaking)

 Continuing from pt1

I've created a RAID device (/dev/md0), and it's currently rebuilding the array, which has a whopping 16TB capacity.
You can examine one of the member devices like this:

mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6fc77ab1:cb9bdc74:2d2f3938:9ba4b4e7
           Name : HAL:HAL  (local to host HAL)
  Creation Time : Tue Mar 11 23:31:15 2014
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 7813771264 (3725.90 GiB 4000.65 GB)
     Array Size : 15627540480 (14903.58 GiB 16002.60 GB)
  Used Dev Size : 7813770240 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 77542ec8:7c4efdea:9510ca31:fd4bb187

    Update Time : Wed Mar 12 00:16:15 2014
       Checksum : e038ac94 - correct
         Events : 14

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA ('A' == active, '.' == missing)
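
The Array Size above follows from the per-device size: RAID5 keeps one device's worth of parity, so the usable space is (devices - 1) x device size. A minimal sketch of that arithmetic, assuming (as in this output) that Used Dev Size is in 512-byte sectors and Array Size is in KiB; the function name raid5_capacity_kib is just something I made up for illustration:

```shell
#!/bin/sh
# Usable RAID5 capacity in KiB: (devices - 1) * per-device size.
# Assumption: the device size is given in 512-byte sectors, so /2 gives KiB.
raid5_capacity_kib() {
    devices=$1          # "Raid Devices" from mdadm --examine
    used_sectors=$2     # "Used Dev Size" in 512-byte sectors
    echo $(( (devices - 1) * used_sectors / 2 ))
}

# Values taken from the --examine output above
raid5_capacity_kib 5 7813770240   # prints 15627540480
```

That matches the Array Size line (15627540480 KiB, about 16TB).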


At the default settings, the rebuild runs at a baseline speed while using up to around 40% of your CPU (YMMV). To speed up the rebuild, I've used this kernel tweak:

echo 1024 > /sys/block/md0/md/stripe_cache_size
This has raised my rebuild speed by roughly 70%, from ~56,000K/sec to around 96,000K/sec.
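
A bigger stripe cache costs RAM. The kernel's md documentation gives the memory used as page size x number of disks x stripe_cache_size, and lists 256 as the default. A rough sketch of that formula, assuming a 4096-byte page (typical for x86_64); stripe_cache_bytes is a hypothetical helper name, not part of mdadm:

```shell
#!/bin/sh
# Memory consumed by the md stripe cache, in bytes:
#   entries * number of member disks * page size
# Assumption: 4096-byte pages unless a third argument says otherwise.
stripe_cache_bytes() {
    entries=$1              # value written to stripe_cache_size
    disks=$2                # number of member devices in the array
    page_size=${3:-4096}    # system page size in bytes (assumed)
    echo $(( entries * disks * page_size ))
}

stripe_cache_bytes 1024 5   # prints 20971520 (20 MiB) for this 5-disk array
```

So on this box the tweak costs about 20 MiB, which is nothing next to 8GB of RAM. The value lives in sysfs, so it will not survive a reboot unless you set it again at boot.
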
root@HAL:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sde1[5] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      15627540480 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUUU_]
      [========>............]  recovery = 42.7% (1669179940/3906885120) finish=388.2min speed=96060K/sec
     
unused devices: <none>
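
The finish= estimate in the recovery line is just the remaining blocks divided by the current speed. A small sketch that redoes the sum, assuming the block counts are in KiB and the speed in K/sec as printed; resync_eta_min is a name I invented for this example:

```shell
#!/bin/sh
# Estimated minutes left in an md resync:
#   (total blocks - completed blocks) / speed, converted to minutes.
resync_eta_min() {
    done_kib=$1       # completed blocks from the recovery line
    total_kib=$2      # total blocks from the recovery line
    speed_kib_s=$3    # speed=...K/sec from the recovery line
    LC_ALL=C awk -v d="$done_kib" -v t="$total_kib" -v s="$speed_kib_s" \
        'BEGIN { printf "%.1f\n", (t - d) / s / 60 }'
}

# Values from the recovery line above
resync_eta_min 1669179940 3906885120 96060   # prints 388.2
```

That agrees with finish=388.2min, so at this speed the last 57% takes about six and a half hours.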
CPU usage during rebuild.
top - 07:57:38 up  8:30,  2 users,  load average: 1.29, 1.23, 1.15
Tasks: 107 total,   2 running, 105 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us, 28.0%sy,  0.0%ni, 67.0%id,  0.0%wa,  0.0%hi,  3.8%si,  0.0%st
Mem:   8047068k total,   528472k used,  7518596k free,      672k buffers
Swap:        0k total,        0k used,        0k free,   339700k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2046 root      20   0     0    0    0 S   39  0.0 121:38.13 md0_raid5
 2048 root      20   0     0    0    0 D   23  0.0  73:15.66 md0_resync
 
iostat during rebuild
(notice the odd tps for sda, roughly double that of the other drives - not sure what that is about ....)
Every 1.0s: iostat -k 1 2    Wed Mar 12 08:18:06 2014

Linux 3.11.0-15-generic (HAL)   12/03/14        _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.62    0.19    5.44    0.50    0.00   93.25

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             219.66     56178.28         0.00 1789479436         74
sdb             110.27     56178.21         0.00 1789477229         74
sdc             110.26     56178.13         0.00 1789474572         74
sdd             110.27     56177.87         0.00 1789466360         74
sde             111.24         0.08     56176.92       2529 1789436170
sdf               0.45         7.86        13.15     250385     418979
md0               0.01         0.02         0.00        704          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.10    0.00   30.39    0.00    0.00   68.51

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             343.00     87048.00         0.00      87048          0
sdb             172.00     85000.00         0.00      85000          0
sdc             177.00     87560.00         0.00      87560          0
sdd             193.00     95240.00         0.00      95240          0
sde             191.00         0.00     95240.00          0      95240
sdf               0.00         0.00         0.00          0          0
md0               0.00         0.00         0.00          0          0

Now, the file system I'd recommend is XFS, as mkfs.xfs will auto-calculate the stripe size you need to get the best performance from the md device.

When I tried this, mkfs.xfs complained that /dev/md0 already contained a partition table, so I forced an overwrite with -f.

root@HAL:~# mkfs.xfs /dev/md0
mkfs.xfs: /dev/md0 appears to contain a partition table (gpt).
mkfs.xfs: Use the -f option to force overwrite.
root@HAL:~# mkfs.xfs /dev/md0 -f
log stripe unit (524288 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md0               isize=256    agcount=32, agsize=122090240 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=3906885120, imaxpct=5
         =                       sunit=128    swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
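
The sunit and swidth values mkfs.xfs picked can be checked by hand: sunit is the RAID chunk expressed in file system blocks, and swidth is sunit times the number of data disks. A quick sketch of that calculation, assuming one parity disk (RAID5) and the 4096-byte block size shown as bsize above; xfs_stripe_geometry is a made-up helper name:

```shell
#!/bin/sh
# Stripe geometry in file system blocks, as mkfs.xfs reports it:
#   sunit  = chunk size / block size
#   swidth = sunit * data disks (member devices minus parity devices)
xfs_stripe_geometry() {
    chunk_kib=$1              # md chunk size in KiB (512 here)
    raid_devices=$2           # member devices in the array (5 here)
    parity=${3:-1}            # parity devices: assumed 1 for RAID5
    block_bytes=${4:-4096}    # file system block size: assumed 4096
    sunit=$(( chunk_kib * 1024 / block_bytes ))
    swidth=$(( sunit * (raid_devices - parity) ))
    echo "sunit=$sunit swidth=$swidth"
}

xfs_stripe_geometry 512 5   # prints sunit=128 swidth=512
```

That is the same sunit=128 swidth=512 that mkfs.xfs worked out on its own. If it ever guesses wrong, the same geometry can be passed by hand with the su and sw suboptions of -d (here that would be su=512k,sw=4).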



Once mounted with mount /dev/md0 /mnt/<your dir>, you should be able to see the array when you do a df -h
 root@HAL:/etc/samba# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       7.5G  1.3G  6.2G  17% /
udev            3.8G  8.0K  3.8G   1% /dev
tmpfs           1.6G  912K  1.6G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            3.9G     0  3.9G   0% /run/shm
/dev/md0         15T   95G   15T   1% /mnt/md0



Don't forget to use blkid to find the UUID of the newly mounted RAID array and add it to fstab!
root@HAL:/etc/samba# blkid
/dev/sda1: UUID="6fc77ab1-cb9b-dc74-2d2f-39389ba4b4e7" UUID_SUB="77542ec8-7c4e-fdea-9510-ca31fd4bb187" LABEL="HAL:HAL" TYPE="linux_raid_member"
/dev/sdb1: UUID="6fc77ab1-cb9b-dc74-2d2f-39389ba4b4e7" UUID_SUB="20a6a078-6ae1-309a-408e-44bb1f2bb18c" LABEL="HAL:HAL" TYPE="linux_raid_member"
/dev/sdc1: UUID="6fc77ab1-cb9b-dc74-2d2f-39389ba4b4e7" UUID_SUB="f4fa566f-7bf7-aecc-db62-3db7e33ee637" LABEL="HAL:HAL" TYPE="linux_raid_member"
/dev/sdd1: UUID="6fc77ab1-cb9b-dc74-2d2f-39389ba4b4e7" UUID_SUB="d55da20b-16e7-e3b1-14b7-4d4069ca14f7" LABEL="HAL:HAL" TYPE="linux_raid_member"
/dev/sde1: UUID="6fc77ab1-cb9b-dc74-2d2f-39389ba4b4e7" UUID_SUB="65c1c74a-94f4-a448-69f2-0cee3b237402" LABEL="HAL:HAL" TYPE="linux_raid_member"
/dev/sdf1: LABEL="HAL" UUID="ec1e225b-ea2b-41f8-a2c9-199bdcb8d541" TYPE="xfs"
/dev/md0: UUID="8cc6eaad-e7c1-4e0d-820c-3b6334406011" TYPE="xfs"
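
For the fstab entry, it is the UUID on the /dev/md0 line (the XFS file system) that you want, not the linux_raid_member one. A minimal sketch that formats such a line, assuming the /mnt/md0 mount point from the df output and plain "defaults" options; fstab_line is a hypothetical helper, and the options are something to adjust to taste:

```shell
#!/bin/sh
# Build an /etc/fstab line that mounts a file system by UUID.
# The trailing "0 0" disables dump and the boot-time fsck pass.
fstab_line() {
    uuid=$1                  # file system UUID from blkid
    mountpoint=$2            # where to mount it
    fstype=$3                # file system type, e.g. xfs
    options=${4:-defaults}   # mount options (assumed: defaults)
    printf 'UUID=%s %s %s %s 0 0\n' "$uuid" "$mountpoint" "$fstype" "$options"
}

# Prints: UUID=8cc6eaad-e7c1-4e0d-820c-3b6334406011 /mnt/md0 xfs defaults 0 0
fstab_line 8cc6eaad-e7c1-4e0d-820c-3b6334406011 /mnt/md0 xfs
```

Check the line before appending it to /etc/fstab, then run mount -a to make sure it mounts cleanly before you reboot.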


I've already installed Samba and am copying files from another NAS to my new one.
With this, transfers through my Windows machine run at around 88MB/s, which is getting close to the limit of my gigabit network (125MB/s in theory, less in practice).

iostat -k 1 2 during the copy:
Every 1.0s: iostat -k 1 2    Wed Mar 12 20:05:50 2014

Linux 3.11.0-15-generic (HAL)   12/03/14        _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.57    0.07   13.06    1.13    0.00   85.17

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             207.90     52663.69       412.26 3913864509   30638377
sdb             105.77     52663.40       412.90 3913842698   30686017
sdc             105.76     52664.02       412.61 3913888889   30664137
sdd             105.81     52664.60       413.06 3913931749   30698045
sde             105.43        13.69     53063.40    1017416 3943569905
sdf               0.35         4.31         9.65     320101     716898
md0               6.60         0.05      1623.15       3551  120629217

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.62    0.00   12.42   46.58    0.00   40.37

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              59.00         0.00     14764.00          0      14764
sdb              41.00         0.00     18976.00          0      18976
sdc              46.00       288.00     19240.00        288      19240
sdd              44.00       264.00     18976.00        264      18976
sde              41.00         0.00     18976.00          0      18976
sdf               0.00         0.00         0.00          0          0
md0             154.00         0.00     38916.00          0      38916


Those are the speeds I'm getting. Although I haven't tested the RAID speed using bonnie, I am confident that the array can saturate my gigabit Ethernet link, and once I've bonded the dual NICs I have, we'll see how much more I can push to my NAS.

Final stats from vnstat, using a test copy of roughly 100GB:
                           rx         |       tx
--------------------------------------+------------------
  bytes                   109.78 GiB  |      768.67 MiB
--------------------------------------+------------------
          max          779.37 Mbit/s  |     6.16 Mbit/s
      average          593.73 Mbit/s  |     4.06 Mbit/s
          min               0 kbit/s  |        0 kbit/s
--------------------------------------+------------------
  packets                   80111899  |         9964991
--------------------------------------+------------------
          max              67732 p/s  |        8634 p/s
      average              51651 p/s  |        6424 p/s
          min                  3 p/s  |           0 p/s
--------------------------------------+------------------
  time                 25.85 minutes
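
vnstat reports bits per second, while copy dialogs usually show bytes, so it helps to divide by 8 before comparing numbers. A tiny sketch of the conversion, taking vnstat's Mbit/s figures at face value; mbit_to_mbyte is a name invented for this example:

```shell
#!/bin/sh
# Convert a rate in Mbit/s to MB/s (8 bits per byte), one decimal place.
mbit_to_mbyte() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 8 }'
}

mbit_to_mbyte 593.73   # average rx above, prints 74.2
mbit_to_mbyte 779.37   # max rx above, prints 97.4
mbit_to_mbyte 1000     # gigabit line rate, prints 125.0
```

So the test averaged about 74MB/s and peaked at about 97MB/s, against a theoretical 125MB/s for the link.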







