The disks on one of our servers show 100% utilization, but no reads or writes are taking place. The output of iostat -xc 1, the related errors from syslog, and the status info from the RAID card (sudo tw-cli /c6 show) are all included below.
The latest firmware version is installed on the RAID card. What could be the problem?
All of the logs mentioned above:
iostat -xc 1
avg-cpu: %user %nice %system %iowait %steal %idle
7.34 0.00 2.78 89.87 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 2.00 0.00 3.00 0.00 16.50 11.00 0.02 6.67 0.00 6.67 6.67 2.00
sdb 0.00 2.00 0.00 3.00 0.00 16.50 11.00 0.02 6.67 0.00 6.67 6.67 2.00
md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 4.00 0.00 16.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 154.00 0.00 0.00 0.00 0.00 100.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 6.00 0.00 0.00 0.00 0.00 100.00
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 140.00 0.00 0.00 0.00 0.00 100.00
sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 9.00 0.00 0.00 0.00 0.00 100.00
sdn 0.00 0.00 0.00 0.00 0.00 0.00 0.00 6.00 0.00 0.00 0.00 0.00 100.00
sdo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdp 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
sdq 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdr 0.00 0.00 0.00 0.00 0.00 0.00 0.00 7.00 0.00 0.00 0.00 0.00 100.00
sds 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
sdt 0.00 0.00 0.00 0.00 0.00 0.00 0.00 161.00 0.00 0.00 0.00 0.00 100.00
sdu 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdv 0.00 0.00 0.00 0.00 0.00 0.00 0.00 6.00 0.00 0.00 0.00 0.00 100.00
sdw 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdx 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
sdy 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 100.00
sdz 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 0.00 0.00 0.00 0.00 100.00
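To make the pattern in this output easier to check across many samples, here is a minimal sketch that flags devices reporting full utilization with zero throughput. It assumes the 14-column `iostat -x` layout shown above; the function name `find_stalled` and the 99% threshold are illustrative choices, not part of any tool.

```python
# Hypothetical helper: flag devices that look "stuck" in iostat -x output,
# i.e. %util near 100 while r/s and w/s are both zero.
# Assumes the 14-field layout shown above (device name + 13 numeric columns).

def find_stalled(iostat_text, util_threshold=99.0):
    stalled = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 14:
            continue  # skips the avg-cpu block and blank lines
        try:
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # skips the "Device:" header row
        reads_per_s, writes_per_s = values[2], values[3]
        queue_size = values[7]   # avgqu-sz
        util = values[12]        # %util
        if util >= util_threshold and reads_per_s == 0 and writes_per_s == 0:
            stalled.append((fields[0], queue_size))
    return stalled


SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 2.00 0.00 3.00 0.00 16.50 11.00 0.02 6.67 0.00 6.67 6.67 2.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 100.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 154.00 0.00 0.00 0.00 0.00 100.00
"""

if __name__ == "__main__":
    for device, queue in find_stalled(SAMPLE):
        print(device, "queue size:", queue)
```

In the sample above, every device from sdc to sdz would be flagged: requests are queued (avgqu-sz is non-zero) but none are completing.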
syslog
Nov 19 13:52:21 localhost kernel: [437390.892018] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=0.
Nov 19 13:52:21 localhost kernel: [437390.892400] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=1.
Nov 19 13:52:21 localhost kernel: [437390.892782] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=2.
Nov 19 13:53:52 localhost kernel: [437481.824239] sd 6:0:20:0: WARNING: (0x06:0x002C): Command (0x2a) timed out, resetting card.
Nov 19 13:54:49 localhost kernel: [437538.800015] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=2.
Nov 19 13:54:49 localhost kernel: [437538.912011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=4.
Nov 19 13:54:49 localhost kernel: [437539.024012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=5.
Nov 19 13:54:49 localhost kernel: [437539.136018] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=7.
Nov 19 13:54:49 localhost kernel: [437539.248013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=8.
Nov 19 13:54:50 localhost kernel: [437539.360011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=10.
Nov 19 13:54:50 localhost kernel: [437539.472018] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=11.
Nov 19 13:54:50 localhost kernel: [437539.584018] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=12.
Nov 19 13:54:50 localhost kernel: [437539.696014] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=13.
Nov 19 13:54:50 localhost kernel: [437539.808013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=15.
Nov 19 13:54:50 localhost kernel: [437539.920030] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=16.
Nov 19 13:54:50 localhost kernel: [437540.032011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=17.
Nov 19 13:54:50 localhost kernel: [437540.144010] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=19.
Nov 19 13:54:50 localhost kernel: [437540.256011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=20.
Nov 19 13:54:51 localhost kernel: [437540.368016] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=21.
Nov 19 13:55:01 localhost CRON[19106]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov 19 13:55:01 localhost CRON[19107]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
Nov 19 13:55:24 localhost kernel: [437573.717341] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=0.
Nov 19 13:55:24 localhost kernel: [437573.717726] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=1.
Nov 19 13:55:24 localhost kernel: [437573.718104] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=2.
Nov 19 13:56:48 localhost kernel: [437657.824255] sd 6:0:5:0: WARNING: (0x06:0x002C): Command (0x2a) timed out, resetting card.
Nov 19 13:57:35 localhost kernel: [437704.612013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=5.
Nov 19 13:57:35 localhost kernel: [437704.724015] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=7.
Nov 19 13:57:35 localhost kernel: [437704.836011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=10.
Nov 19 13:57:35 localhost kernel: [437704.948012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=12.
Nov 19 13:58:27 localhost kernel: [437756.720447] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=0.
Nov 19 13:58:27 localhost kernel: [437756.721151] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=1.
Nov 19 13:58:27 localhost kernel: [437756.721586] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=2.
Nov 19 13:59:48 localhost kernel: [437837.824227] sd 6:0:3:0: WARNING: (0x06:0x002C): Command (0x2a) timed out, resetting card.
Nov 19 14:00:01 localhost CRON[21249]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
Nov 19 14:00:49 localhost kernel: [437899.088012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=0.
Nov 19 14:00:49 localhost kernel: [437899.200012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=1.
Nov 19 14:00:50 localhost kernel: [437899.312010] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=2.
Nov 19 14:00:50 localhost kernel: [437899.424014] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=3.
Nov 19 14:00:50 localhost kernel: [437899.536013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=4.
Nov 19 14:00:50 localhost kernel: [437899.648012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=5.
Nov 19 14:00:50 localhost kernel: [437899.760014] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=6.
Nov 19 14:00:50 localhost kernel: [437899.872010] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=7.
Nov 19 14:00:50 localhost kernel: [437899.984010] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=8.
Nov 19 14:00:50 localhost kernel: [437900.096012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=9.
Nov 19 14:00:50 localhost kernel: [437900.208012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=10.
Nov 19 14:00:51 localhost kernel: [437900.320011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=11.
Nov 19 14:00:51 localhost kernel: [437900.432013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=12.
Nov 19 14:00:51 localhost kernel: [437900.544012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=13.
Nov 19 14:00:51 localhost kernel: [437900.656011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=14.
Nov 19 14:00:51 localhost kernel: [437900.768012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=15.
Nov 19 14:00:51 localhost kernel: [437900.880011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=16.
Nov 19 14:00:51 localhost kernel: [437900.992010] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=17.
Nov 19 14:00:51 localhost kernel: [437901.104011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=18.
Nov 19 14:00:51 localhost kernel: [437901.216011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=19.
Nov 19 14:00:52 localhost kernel: [437901.328011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=20.
Nov 19 14:00:52 localhost kernel: [437901.440011] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=21.
Nov 19 14:00:52 localhost kernel: [437901.552013] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=22.
Nov 19 14:00:52 localhost kernel: [437901.664012] 3w-9xxx: scsi6: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=23.
Nov 19 14:01:30 localhost kernel: [437939.895004] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=0.
Nov 19 14:01:30 localhost kernel: [437939.895388] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=1.
Nov 19 14:01:30 localhost kernel: [437939.895768] 3w-9xxx: scsi6: AEN: INFO (0x04:0x0029): Verify started:unit=2.
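The syslog excerpt repeats the same cycle about every three minutes: verify starts, a command times out, the card resets. A rough sketch for pulling the timeout events out of a longer log follows. The regular expression is written against the message format shown above and may need adjusting for other driver versions; the name `find_card_resets` is hypothetical. Opcode 0x2a is the SCSI WRITE(10) command, so each reset here was triggered by a write that never completed.

```python
import re

# Matches lines such as:
#   sd 6:0:20:0: WARNING: (0x06:0x002C): Command (0x2a) timed out, resetting card.
# The pattern is an assumption based on the log lines above.
TIMEOUT_RE = re.compile(
    r"sd (\d+):(\d+):(\d+):(\d+): WARNING: .*?"
    r"Command \((0x[0-9a-fA-F]+)\) timed out, resetting card"
)


def find_card_resets(syslog_text):
    """Return one dict per 'timed out, resetting card' line."""
    resets = []
    for line in syslog_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if not match:
            continue
        host, channel, target, lun = (int(g) for g in match.groups()[:4])
        resets.append({
            "timestamp": line[:15],   # e.g. "Nov 19 13:53:52"
            "host": host,
            "target": target,         # SCSI target id of the device that timed out
            "opcode": match.group(5),
        })
    return resets
```

Run against the excerpt above, this should report three resets (targets 20, 5 and 3), which suggests the timeouts are not tied to a single disk.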
Status info from the RAID card (sudo tw-cli /c6 show)
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
u0 SINGLE VERIFY-PAUSED - 5% - 2793.96 RiW ON
u1 SINGLE VERIFY-PAUSED - 3% - 2793.96 RiW ON
u2 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u3 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u4 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u5 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u6 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u7 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u8 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u9 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u10 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u11 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u12 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u13 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u14 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u15 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u16 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u17 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u18 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u19 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u20 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u21 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u22 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
u23 SINGLE VERIFY-PAUSED - 0% - 2793.96 RiW ON
VPort Status Unit Size Type Phy Encl-Slot Model
p0 VERIFYING u0 2.73 TB SATA 0 - ST3000DM001-9YN166
p1 VERIFYING u1 2.73 TB SATA 1 - ST3000DM001-9YN166
p2 VERIFYING u2 2.73 TB SATA 2 - ST3000DM001-9YN166
p3 VERIFYING u3 2.73 TB SATA 3 - ST3000DM001-9YN166
p4 VERIFYING u4 2.73 TB SATA 4 - ST3000DM001-9YN166
p5 VERIFYING u5 2.73 TB SATA 5 - ST3000DM001-9YN166
p6 VERIFYING u6 2.73 TB SATA 6 - ST3000DM001-9YN166
p7 VERIFYING u7 2.73 TB SATA 7 - ST3000DM001-9YN166
p8 VERIFYING u8 2.73 TB SATA 8 - ST3000DM001-9YN166
p9 VERIFYING u9 2.73 TB SATA 9 - ST3000DM001-9YN166
p10 VERIFYING u10 2.73 TB SATA 10 - ST3000DM001-9YN166
p11 VERIFYING u11 2.73 TB SATA 11 - ST3000DM001-9YN166
p12 VERIFYING u12 2.73 TB SATA 12 - ST3000DM001-9YN166
p13 VERIFYING u13 2.73 TB SATA 13 - ST3000DM001-9YN166
p14 VERIFYING u14 2.73 TB SATA 14 - ST3000DM001-9YN166
p15 VERIFYING u15 2.73 TB SATA 15 - ST3000DM001-9YN166
p16 VERIFYING u16 2.73 TB SATA 16 - ST3000DM001-9YN166
p17 VERIFYING u17 2.73 TB SATA 17 - ST3000DM001-9YN166
p18 VERIFYING u18 2.73 TB SATA 18 - ST3000DM001-9YN166
p19 VERIFYING u19 2.73 TB SATA 19 - ST3000DM001-9YN166
p20 VERIFYING u20 2.73 TB SATA 20 - ST3000DM001-9YN166
p21 VERIFYING u21 2.73 TB SATA 21 - ST3000DM001-9YN166
p22 VERIFYING u22 2.73 TB SATA 22 - ST3000DM001-9YN166
p23 VERIFYING u23 2.73 TB SATA 23 - ST3000DM001-9YN166
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
bbu On Yes OK OK OK 0 xx-xxx-xxxx