Dell EqualLogic PS4000: Hands-on Review Part 2

By TechHead guest contributor, James Pearce:  EqualLogic’s PS4000 iSCSI storage arrays are targeted at the SME sector, particularly for VMware virtualisation, as well as “branch office” use for larger companies.  The model under review is the XV, with 15k SAS drives, dual controllers and dual power supplies.

This is quite a complicated product, so I’ve split this into four sections:

Part 1 – An Introduction to the PS4000

Part 2 – EqualLogic Networking with Force10

Part 3 – System Management and Monitoring

Part 4 – Performance

This part is verging on a deep-dive but bear with it… the results are worth it!

 

Part 2 – EqualLogic Networking with Force10

Being iSCSI based, the PS4000 depends heavily on decent GbE switches.  iSCSI traffic should always run on dedicated switches: it tolerates essentially no loss, is sensitive to latency, and has high throughput.  It is also security-sensitive traffic – with physical access to a network carrying iSCSI and a little ARP cache poisoning, an attacker can, given sufficient time, assemble a replica of entire LUNs as they are accessed.

ESX also makes some demands of the switching for iSCSI – specifically flow control and ‘well behaved’ port buffers, both with the aim of minimising packet loss.  Jumbo frames are also supported on paid-for ESX(i) installations, but they provide only a marginal throughput increase even for heavy sequential IO: with a standard 1500-byte MTU, around 1,460 of every 1,500 bytes in each IP packet are payload, so IP is already roughly 97% efficient.
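If you do enable jumbo frames, note that on classic ESX 4 both the vSwitch MTU and the VMkernel port MTU must be set from the service console, since a VMkernel port cannot have its MTU changed in place – it has to be created with the jumbo MTU.  A minimal sketch (the vSwitch name, port group name and addressing below are purely illustrative):

esxcfg-vswitch -m 9000 vSwitch1
esxcfg-vmknic -a -i 192.168.1.10 -n 255.255.255.0 -m 9000 "iSCSI1"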

Force10

EqualLogic recommend Force10 S25N switches for the PS series, positioned just above Dell PowerConnect.  At least two are required, and they must be configured with a decent interconnect (preferably stacked).

The S25s are somewhat purist – HyperTerminal and a serial port are the order of the day, and the CLI is sufficiently different from Cisco’s IOS to make configuration time-consuming and frustrating for the uninitiated.  There is also no support for them in Dell’s OpenManage suite, so most SMEs, lacking something like OpenView, will have no way to monitor them.
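One partial workaround for the monitoring gap is to enable read-only SNMP on the switches, so any generic poller can at least watch the interface counters.  A minimal sketch, with the community string purely illustrative:

Force10#configure
Force10(conf)#snmp-server community techhead ro
Force10(conf)#exit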

Still, what they lack in ease of management they make up for in performance – these 24-port units have a 95Mpps backplane (twice that of the Cisco 3750G) and a switching latency specified as “<5 µs”.  They also have (separately licensed) line-rate layer-3 capabilities.

The S25s are fitted with two internal PSUs and are rated at 100W, but drew only about 50W during testing.

Force10 Switch Configuration

Rather unhelpfully, all interfaces on the S25s ship administratively down, so manual configuration from the serial interface is mandatory.  The basic configuration is at least reasonably straightforward:

  • enable all ports
  • create a 4-port LAG (or configure stacking)
  • enable flow control on all ports
  • configure global 1Q buffer policy

A simple configuration looks something like this (I’ve condensed this by showing only one port, but all ports will have the same configuration applied):

interface GigabitEthernet 0/1
no ip address
mtu 9252
switchport
flowcontrol rx on tx on
spanning-tree rstp edge-port
no shutdown
!
interface Port-channel 1
no ip address
switchport
channel-member GigabitEthernet 0/21-24
no shutdown
!
interface Vlan 1
!
interface Vlan 10
ip address 192.168.1.1/24
untagged GigabitEthernet 0/1-20
untagged Port-channel 1
no shutdown
!
buffer-profile global 1Q
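Once entered, remember that FTOS, like IOS, does not persist the running configuration automatically – review it and save it to the startup configuration before any reboot:

Force10#show running-config
Force10#copy running-config startup-config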

The buffer-profile policy line of the configuration always displays a message that a reboot is needed (a bug in the v7.x firmware), but its correct implementation (dynamic port buffers) can be confirmed using show buffer-profile:

Force10#show buffer-profile detail fp-uplink
Global Pre-defined buffer policy: 1Q
Stack-Unit 0 Port-set 0
Buffer-profile : -
Dynamic Buffer 1603.75 KB (Current), 389.75 KB (Configured)

EqualLogic Configuration

The PS4000 series has a dedicated management interface, but even so this option needs to be set manually to be sure the system does not route iSCSI traffic through that interface.  Other than this and the basic IP configuration (one IP per interface, plus a virtual group IP), there’s not much to tweak.

The system is configured through the console port via a text-mode ‘wizard’, a simple process that takes only a few minutes.  Once done, the system is available for use immediately, undertaking background initialisation in its own time.  Performance is of course impacted until the initialisation is complete – I measured about 30MB/s sequential throughput whilst it was initialising.

VMware ESX iSCSI Configuration

There is some complexity to the ‘optimal’ ESX configuration for EqualLogic storage, as detailed in Dell’s ESX multipath configuration guide.  Although the PS series can simply be connected up and used almost immediately, significant performance gains can be made by taking the time to get this right.

There are a couple of tweaks that Dell don’t give away, although their excellent support teams will offer the information when pushed hard enough.  One is to disable Nagle for connections through guest OS initiators (like the Microsoft iSCSI Initiator), since it is counter-productive with iSCSI; the other is to set the round-robin IOPS value on a per-LUN basis in ESX to get proper load balancing between connections.

With this configuration applied, the vSphere multi-path capabilities can provide the full bandwidth right to any particular VM – over 200MB/s read performance with this array.

It’s worth noting that there are some issues at present: firstly, custom IOPS balancing settings are lost between host reboots; and secondly, LUNs with low IO levels will periodically trigger lost-path warnings, which will be addressed by ESX patch 5, due later this month.

Nagle
The Nagle algorithm is designed to stop badly written applications from flooding a network with 1-byte packets, by saving up the small writes and pushing them out once there is enough data to fill a packet.  Unfortunately, when combined with delayed ACK, the inevitably incomplete packet at the end of a transmission may not be sent immediately: if the transfer ends on an odd-numbered packet there is a standoff between the two ends, one waiting for an ACK before sending the final partial packet and the other delaying its ACK until a second packet arrives, until a timer expires.

For iSCSI, the impact is to add significant latency to small writes for applications that work with shallow queue depths.

There is no option in the ESX software initiator to directly control Nagle (delayed ACK can be controlled, hidden in the initiator advanced option “Delayed ACK”, which is enabled by default).  On Linux, Nagle is disabled by setting the socket option TCP_NODELAY; in Windows, delayed ACK is controlled on a per-interface basis with the registry parameter TcpAckFrequency (the default is 2; 1 disables delayed ACK), as documented in KB Q328890.
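As a sketch of the Windows-side change: the interface GUID below is a placeholder (it is unique to each machine and NIC – find the iSCSI-facing interface under the Interfaces key), and a reboot is generally needed for the value to take effect:

reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{<iSCSI NIC GUID>}" /v TcpAckFrequency /t REG_DWORD /d 1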

It’s worth noting that EqualLogic recommend using guest initiators for performance-sensitive applications instead of VMFS volumes.  But in my testing, VMFS-provided storage had more than adequate performance for all the applications I tested, with the exception of the ancient Oracle import job, which we’ll see later.

Round-Robin IOPS
By default, VMware’s iSCSI initiator will balance requests across multiple paths when LUNs are configured for round-robin, but it only changes paths every 1,000 commands.  As a result, performance is still limited to the capacity of one link, since only one link is really active at a time.  A host startup script can be used to change this to a lower value (the trade-off being higher CPU utilisation to calculate the paths more often), as below; in testing I found that a value of 3 gave the highest performance.

/etc/rc3.d/S99_nmpfix.sh:

#!/bin/bash
# Wait for the storage stack to settle after boot
sleep 60
# Find each EqualLogic device (naa.*) and switch paths every 3 IOs
for i in `esxcli nmp device list | grep -i --before-context=1 "Device Display Name: EQLOGIC" | grep -i "naa." | grep -i -v "Device Display Name"` ; do
  esxcli nmp roundrobin setconfig --device $i --type "iops" --iops=3
done

nano can be used on ESX to type this in; note that the whole ‘for’ line, through to ‘; do’, must be entered as one line.  This code was developed from a VMware Communities posting.  Once in, set its permissions to allow execution (‘chmod 777 S99_nmpfix.sh’), then run it and check the result using a second script:

nmpcheck.sh:

#!/bin/bash
# Report the current round-robin settings for each EqualLogic device
for i in `esxcli nmp device list | grep -i --before-context=1 "Device Display Name: EQLOGIC" | grep -i "naa." | grep -i -v "Device Display Name"` ; do
  esxcli nmp roundrobin getconfig --device $i
done
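Both scripts assume the LUNs have already been set to the round-robin path selection policy; if any are still on the default fixed policy, they can be switched per device (the device ID below is a placeholder – take the real naa. IDs from esxcli nmp device list):

esxcli nmp device setpolicy --device naa.<id> --psp VMW_PSP_RR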

Coming Next

The networking side does take some time to get right, but since the EqualLogic platform is easily capable of taking centre-stage for many companies deploying storage at this price point, it’s something that warrants the time taken to fully understand and test.

Next, I’ll look at the web management interface and the other configuration and monitoring tools EqualLogic provide.

_________________________________________________

About the Author

James is a regular guest contributor to TechHead and a Kent-based qualified accountant, currently working in information security and technical architecture, with most of his time “being spent on virtualisation and business continuity at the moment”.

 
