Close but not yet 120 drives - ARG!!!!!

# Another change of design

As I was cabling this, space for cabling was also at a premium, even with SFF cables. So I changed tack on the design for the last time, and brought my expectations down a little.

{{< image src="images/IMG_2883.jpg" width="150x" class="is-pulled-right" height="200x" title="">}}

So I did have this:

{{< image src="images/IMG_2892.jpg" width="150x" class="is-pulled-right" height="200x" title="">}}

I am now going to do this:

{{< image src="images/IMG_2905.jpg" width="150x" class="is-pulled-right" height="200x" title="">}}

Instead of using the 9 locations for drives, I am going for 6. The last 3 I can use for the RES240 cards, which allows some space for cabling, air flow and, as it turns out, some boot space as well.

In this picture you can see 84 drives, but I did mention 72 - where did the other 12 come from? Remember the 3 original controller cards? As it turns out, they fit into the remaining PCI slots.

{{< image src="images/IMG_2937 (1).jpg" width="150x" class="is-pulled-right" height="200x" title="">}}

We now end up with this:

{{< image src="images/IMG_2906.jpg" width="150x" class="is-pulled-right" height="200x" title="">}}

To show that it all works under Linux - it does take about 4 minutes to boot though!

```
- it boots - success
- you have loads of disks
- it is low power
- quite quiet
- how to use!!
```
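As a quick sanity check that the kernel really does see all of the drives, something like this works (my addition, not output from the original build - expect 84 plus whatever you boot from):

```
# list every disk the kernel found, with size, model and transport
lsblk -d -o NAME,SIZE,MODEL,TRAN

# count only devices of type "disk"
lsblk -dn -o TYPE | grep -c disk
```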
# Let's RAID!!!!

Do not RAID 84 disks together, for the main reason that if you lose more than about 3 you lose the lot!

You could create 4 x 21-disk RAID sets to get some redundancy. I am not going into the pros and cons of how you configure RAID.

## My testing

I got 84 SSDs from eBay, 120GB each, reasonably cheap at ~£5 each - of which many did not work or did not last very long.

I raided 21 disks, using the following structure:

- 3 controller cards with 24 disks each ( 1 x RES240 with 20 drives, plus 4 drives direct )
- 3 controller cards with 4 disks each

Take 6 disks from each of the three controller cards with 24 disks, and one each from the others = 21.

Linux will tell you which disk is attached to which controller!!
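For example, the /dev/disk/by-path names encode the PCI address of the controller each disk hangs off (a sketch of the sort of thing I mean - exact paths and host numbers will differ on other hardware):

```
# symlinks named after the controller's PCI address; filter out the per-partition entries
ls -l /dev/disk/by-path/ | grep -v part

# or per device: the first field of HCTL is the SCSI host, which maps back to a controller
lsblk -S -o NAME,HCTL,MODEL
```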
I partitioned the drives and set a partition UUID on each; some already had one, so I just reused that.

----------- NEED TO ADD LINUX COMMANDS FOR THAT HERE - THE OUTPUT IS MASSIVE, SO PERHAPS A LINK TO ANOTHER PAGE -----------------
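In the meantime, a rough sketch of the kind of commands involved (GPT partitioning via sgdisk is my assumption here, not necessarily what was used - device names and the resulting PARTUUIDs will obviously differ):

```
# one partition spanning the whole drive (repeat / loop for each disk)
sgdisk --zap-all /dev/sdb
sgdisk -n 1:0:0 /dev/sdb

# list the partition UUIDs to feed to mdadm
lsblk -o NAME,PARTUUID /dev/sdb
ls -l /dev/disk/by-partuuid/
```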
## RAID Setup

I set up a RAID5 set with the following command:

```
mdadm --create /dev/md1 --level=5 --raid-devices=21 \
    /dev/disk/by-partuuid/1ea8d03a-01 \
    /dev/disk/by-partuuid/4860020c-71cc-3042-95fc-5410011ce0b4 \
    /dev/disk/by-partuuid/104eb35f-b56a-cd44-a7c7-08c4a8d14b88 \
    /dev/disk/by-partuuid/dce51c65-31dd-304e-8181-d5f6a114a5d0 \
    /dev/disk/by-partuuid/e0598cbf-e288-c84c-b9c2-3bbcd375cc5d \
    /dev/disk/by-partuuid/63bfd06e-01 \
    /dev/disk/by-partuuid/318f464f-01 \
    /dev/disk/by-partuuid/ed30e572-01 \
    /dev/disk/by-partuuid/52dd6fef-01 \
    /dev/disk/by-partuuid/9902088f-7465-784b-8679-66fd69587999 \
    /dev/disk/by-partuuid/84aca13e-01 \
    /dev/disk/by-partuuid/ed61aa88-01 \
    /dev/disk/by-partuuid/71c2c981-01 \
    /dev/disk/by-partuuid/b1d7726c-01 \
    /dev/disk/by-partuuid/9ef9ee1c-01 \
    /dev/disk/by-partuuid/a0f92d92-01 \
    /dev/disk/by-partuuid/b4d2d687-0e82-584c-93c8-59fde62012a0 \
    /dev/disk/by-partuuid/eb95d63d-01 \
    /dev/disk/by-partuuid/465e0077-01 \
    /dev/disk/by-partuuid/abbcfdad-01 \
    /dev/disk/by-partuuid/d0131bfc-01
```

Then, using this command:

```
mdadm --detail /dev/md1
```

I checked the status until the sync was completed:

```
root@test1:~# mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Sat Jan 27 23:25:15 2024
Raid Level : raid5
Array Size : 2343055360 (2.18 TiB 2.40 TB)
Used Dev Size : 117152768 (111.73 GiB 119.96 GB)
Raid Devices : 21
Total Devices : 21
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Sat Jan 27 23:26:00 2024
State : clean, degraded, recovering
```
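As an aside, /proc/mdstat shows the same recovery as a progress bar with an ETA, which is handy to watch while waiting:

```
# refresh the resync progress every 10 seconds
watch -n 10 cat /proc/mdstat
```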
Finally, 23 minutes later the sync had finished - which is pretty fast:

```
/dev/md1:
Version : 1.2
Creation Time : Sat Jan 27 23:25:15 2024
Raid Level : raid5
Array Size : 2343055360 (2.18 TiB 2.40 TB)
Used Dev Size : 117152768 (111.73 GiB 119.96 GB)
Raid Devices : 21
Total Devices : 21
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Sat Jan 27 23:49:04 2024
State : clean
Active Devices : 21
Working Devices : 21
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Consistency Policy : bitmap

Name : test1:1 (local to host test1)
UUID : 77a16172:a32f8d3b:a54cbaef:ae6062ac
Events : 284

Number Major Minor RaidDevice State
0 65 129 0 active sync /dev/sdy1
1 65 161 1 active sync /dev/sdaa1
2 65 177 2 active sync /dev/sdab1
3 65 193 3 active sync /dev/sdac1
4 65 209 4 active sync /dev/sdad1
5 65 225 5 active sync /dev/sdae1
6 67 17 6 active sync /dev/sdax1
7 67 33 7 active sync /dev/sday1
8 67 49 8 active sync /dev/sdaz1
9 67 65 9 active sync /dev/sdba1
10 67 81 10 active sync /dev/sdbb1
11 67 97 11 active sync /dev/sdbc1
12 8 1 12 active sync /dev/sda1
13 8 17 13 active sync /dev/sdb1
14 8 33 14 active sync /dev/sdc1
15 8 49 15 active sync /dev/sdd1
16 8 65 16 active sync /dev/sde1
17 8 81 17 active sync /dev/sdf1
18 68 145 18 active sync /dev/sdbv1
19 68 209 19 active sync /dev/sdbz1
21 69 17 20 active sync /dev/sdcd1
```
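The speed tests below run against a filesystem mounted at /mnt/test1. That step is not shown in the article, so here is a sketch of roughly what it looks like - ext4 and the Debian-style mdadm.conf path are my assumptions:

```
# filesystem on the array and a mount point for the tests
mkfs.ext4 /dev/md1
mkdir -p /mnt/test1
mount /dev/md1 /mnt/test1

# record the array so it assembles automatically on boot (Debian/Ubuntu layout assumed)
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u
```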
# Speed testing

I started with some simple tests: a linear 20GiB write.

21 disk array write speed:

```
root@test1:/mnt/test1# time dd if=/dev/zero of=test1.img bs=1G count=20
20+0 records in
20+0 records out
21474836480 bytes (21 GB, 20 GiB) copied, 39.6505 s, 542 MB/s

real 0m39.700s
user 0m0.000s
sys 0m25.679s
```

It is doing around 4.3Gbit/s - which is not bad, given 3 of the controllers are limited to 3Gbit/s.
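One caveat, which is my note rather than part of the original testing: dd without a direct or sync flag partly measures the page cache, so for a more conservative write figure I would normally use one of these variants:

```
# bypass the page cache entirely
dd if=/dev/zero of=test1.img bs=1G count=20 oflag=direct

# or flush to disk at the end and include that in the timing
dd if=/dev/zero of=test1.img bs=1G count=20 conv=fdatasync
```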
Let's do a read speed test.

21 disk array read speed:

```
root@test1:/mnt/test1# time dd if=test1.img of=/dev/null
41943040+0 records in
41943040+0 records out
21474836480 bytes (21 GB, 20 GiB) copied, 49.0952 s, 437 MB/s

real 0m49.103s
user 0m20.529s
sys 0m28.248s
```

It is doing around 3.4Gbit/s - which is not bad, given 3 of the controllers are limited to 3Gbit/s.

These results suggest the 3Gbit/s controllers are the bottleneck, but even so the numbers are relatively good - certainly enough to saturate the dual GigE on the motherboard if the data were leaving the host.

Let's try a few more tools, starting with good old hdparm:

```
root@test1:/mnt/test1# hdparm -tT /dev/md1

/dev/md1:
Timing cached reads: 21896 MB in 1.99 seconds = 10996.23 MB/sec
Timing buffered disk reads: 6050 MB in 3.00 seconds = 2013.94 MB/sec
```

These seem very wrong - if the buffered disk read is to be believed we are doing 16Gbit/s, which is doubtful! (The cached read figure only measures memory/cache throughput, so ignore that one.)

But for comparison, a single HDD:

```
/dev/sda:
Timing cached reads: 13594 MB in 1.96 seconds = 6939.87 MB/sec
Timing buffered disk reads: 308 MB in 3.01 seconds = 102.22 MB/sec
```
## Running FIO

I have never used fio before, although I have seen some numbers quoted. I am not sure what all of the output means, but comparing against answers on the forums suggests it performs very well.

I used this command:

```
root@test1:/mnt/test1# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=105000M --readwrite=randrw --rwmixread=80
```

The resulting output:

```
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 105000MiB)

Jobs: 1 (f=1): [m(1)][100.0%][r=107MiB/s,w=26.3MiB/s][r=27.3k,w=6743 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=5841: Sun Jan 28 00:26:31 2024
read: IOPS=21400, BW=83.7MiB/s (87.8MB/s)(82.0GiB/1003444msec)
bw ( KiB/s): min= 16, max=222240, per=100.00%, avg=86497.62, stdev=65547.58, samples=1990
iops : min= 4, max=55560, avg=21624.22, stdev=16386.89, samples=1990
write: IOPS=5358, BW=20.9MiB/s (21.9MB/s)(20.5GiB/1003444msec); 0 zone resets
bw ( KiB/s): min= 7, max=54776, per=100.00%, avg=22629.65, stdev=16075.92, samples=1902
iops : min= 1, max=13694, avg=5657.23, stdev=4018.95, samples=1902
cpu : usr=8.15%, sys=30.35%, ctx=6181617, majf=0, minf=9
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=21503052,5376948,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=83.7MiB/s (87.8MB/s), 83.7MiB/s-83.7MiB/s (87.8MB/s-87.8MB/s), io=82.0GiB (88.1GB), run=1003444-1003444msec
WRITE: bw=20.9MiB/s (21.9MB/s), 20.9MiB/s-20.9MiB/s (21.9MB/s-21.9MB/s), io=20.5GiB (22.0GB), run=1003444-1003444msec

Disk stats (read/write):
md1: ios=21495384/5378585, merge=0/0, ticks=31171400/31594164, in_queue=62765564, util=98.75%, aggrios=1535273/512707, aggrmerge=19497/20772, aggrticks=2230727/289715, aggrin_queue=2522157, aggrutil=67.55%
sdcd: ios=1535876/513140, merge=21689/23237, ticks=2654830/534888, in_queue=3190903, util=61.87%
sdbz: ios=1538111/515020, merge=20991/21733, ticks=2125317/432538, in_queue=2558543, util=59.88%
sdbv: ios=1537666/514959, merge=18396/18476, ticks=464324/124285, in_queue=589472, util=58.81%
sdbc: ios=1532266/510557, merge=19268/20329, ticks=3051970/591808, in_queue=3646077, util=63.18%
sdy: ios=1533226/509737, merge=20233/21591, ticks=2234145/408149, in_queue=2643231, util=60.97%
sdbb: ios=1533713/511182, merge=21722/23302, ticks=4517761/823997, in_queue=5354193, util=67.55%
sdba: ios=1535677/513601, merge=22025/23731, ticks=341058/37258, in_queue=378959, util=58.52%
sdf: ios=1536464/513569, merge=16959/17704, ticks=2528700/421838, in_queue=2954362, util=61.62%
sdaz: ios=1536534/513980, merge=19586/20512, ticks=370571/36670, in_queue=407738, util=58.58%
sde: ios=1533479/510022, merge=18606/20941, ticks=280942/40412, in_queue=323263, util=58.49%
sday: ios=1534307/512167, merge=17272/17549, ticks=270661/38194, in_queue=310634, util=58.45%
sdd: ios=1535438/512562, merge=20708/23502, ticks=363458/39116, in_queue=402964, util=58.52%
sdax: ios=1532955/511012, merge=18014/19189, ticks=410740/34215, in_queue=445357, util=58.49%
sdc: ios=1537377/515147, merge=20142/22924, ticks=409537/121937, in_queue=531866, util=58.89%
sdb: ios=1538098/515180, merge=18736/19803, ticks=368787/35413, in_queue=404619, util=58.49%
sda: ios=1535075/512676, merge=16968/17633, ticks=370662/34674, in_queue=405753, util=58.52%
sdae: ios=1532895/510419, merge=21198/22907, ticks=1018891/54725, in_queue=1074198, util=58.79%
sdad: ios=1534688/512528, merge=22364/23787, ticks=13922220/1049834, in_queue=14974170, util=64.80%
sdac: ios=1536754/514816, merge=19230/21200, ticks=6133168/506412, in_queue=6640614, util=58.98%
sdab: ios=1536332/513772, merge=17682/17898, ticks=2471157/365254, in_queue=2838303, util=62.80%
sdaa: ios=1533806/510807, merge=17657/18275, ticks=2536377/352416, in_queue=2890093, util=62.51%
```

Interesting things to note: the read IOPS was 21k and the write IOPS was 5k - these are huge.
# Conclusions

- Cheap and fast storage servers are relatively straightforward to build
- Physical space, regardless of how much you have, should be valued
- For more than 10 drives, use a SAS controller and a SAS expander card
- For the best performance spread your drives across your controllers - it's a little work, but worth it!
- SSDs are low on power, noise and heat - they are ££££s but certainly worth it
- Do not RAID 84 drives into one array; at most put 21 into one array, and use different controllers!!!
- Capacity at current rates: 8TB drives @ £500 each is approx £42,000 for 672TB of raw storage, which is about 6.7 pence per gigabyte as a fixed cost - not monthly, quarterly etc.
- Cost to build - time is the biggest factor, but hardware-wise, if it were all priced up:
  - PSU - £120 - never be cheap with a PSU
  - M/B - £350 - never be cheap with a M/B
  - LSI cards - 3x - £100
  - RES240 cards - 3x - £100
  - Additional controllers - no cost - but £20 each - £60 ( if anyone wants one I have about 100 )
  - Cables - too much - £300
  - Chassis - you can get them built by Protocase - the specs can be downloaded - £600
  - Custom PCBs - £600 - it's the connectors - the PCBs are about £30, but each one has 30 connectors, from Digikey at about £1.50 each on average
- For £1500 you too could build one!

I need to add in the power pictures and some other things etc.