Example Architectural Decision – Jumbo Frames for IP Storage (Do not use Jumbo Frames)

Problem Statement

When using IP based storage over a converged 10GB network, should Jumbo Frames be used?

Requirements

1. Fully Supported storage

2. Maximum vSphere environment availability

3. Maximize performance where possible

Assumptions

1. Converged 10GB Network which is highly available

2. Two (or more) 10GB connections per ESXi host

Constraints

1. No dedicated network for IP storage traffic

Motivation

1. Simplify the environment

Architectural Decision

Do not use Jumbo Frames

Justification

1. Reduce the complexity in the environment for initial implementation

2. Simplify ongoing support / troubleshooting

3. For a Jumbo Frame to be transmitted without fragmentation, All devices end to end must support and be configured for Jumbo Frames

4. While there can be performance benefits of Jumbo Frames for IP Storage this is not generally seen across the board and depends on I/O types

5. Ensure IP storage packets are not fragmented or dropped by mis-configured devices or devices that do not support Jumbo Frames

6. Storage performance for the virtual environment will generally be constrained by the storage array controllers not the storage area network

7. Ensure packet fragmentation does not occur as all devices support a default MTU of 1500

8. Increasing the MTU will decrease the number of packets required for the same bandwidth but where the bottleneck is throughput (bytes) there will be minimal/no benefit

9. Jumbo Frames will only assist where the network is constrained at an interrupt level

Implications

1. IP Storage may have reduced performance in some circumstances compared to what Jumbo Frames may offer

Alternatives

1. Use Jumbo Frames

Relates Articles

1. Example Architectural Decision – Jumbo Frames for IP Storage (Use Jumbo Frames)

 Contributors

Thanks to Rob McNab (IBM) and Peter McCrystal (IBM) for their input into this example architectural decision.

 

 

Example Architectural Decision – Storage I/O Control for Clusters Protected by SRM (Example 2 – Use SIOC)

Problem Statement

In an environment with one or more clusters with virtual machines protected by SRM, What is the most appropriate configuration of Storage I/O control?

Requirements

1. SRM solution must not be impacted

Assumptions

1. vSphere Version 4.1 or later

2. FC (Block) Based Storage OR NFS (File) based Storage

3. Number of datastores is fairly static

Constraints

1. Storage I/O control can prevent unmounting of datastore during a Recovery which can lead to errors being reported by SRM

Motivation

1. Where possible ensure consistent storage performance for all virtual machines

Architectural Decision

Enable and Configure Storage I/O control for all datastores.

Set the congestion threshold to 20ms

Leave the shares value default

Add a Step to each SRM recovery Plan as Step 1 and Select the Step Placement of “Before selected step”.

Configure step type as “Command of SRM Server” and execute the Scheduled Task which will disable SIOC prior to executing a SRM recovery

Justification

1. The benefits of Storage I/O control can still be achieved without impact to the SRM solution

2. SIOC will not impact SRM failover as it will be disabled automatically as part of the SRM recovery plan

3. In the event the Protected site or is lost, SIOC will not prevent failover

Implications

1. Increased complexity for the SRM solution

2. An additional step to excecute a “Command of SRM Server” is required

3. A Scheduled Task will need to be setup and configured with setting “Allow task to be ran on demand”

4. A script to disable SIOC will need to be prepared and configured with all datastores

Alternatives

1. Enable Storage I/O control and leave default settings

2. Enable storage I/O control and set share values on virtual machines

3. Enable Storage I/O control and set a lower “congestion threshold”

4. Enable Storage I/O control and set a higher “congestion threshold”

5. Disable Storage I/O control

Relates Articles

1. Example Architectural Decision –  Storage I/O Control for Clusters Protected by SRM (Example 2 – Don’t Use SIOC)

 

Example Architectural Decision – Storage I/O Control for Clusters Protected by SRM (Example 1 – Don’t Use SIOC)

Problem Statement

In an environment with one or more clusters with virtual machines protected by SRM, What is the most appropriate configuration of Storage I/O control?

Requirements

1. SRM solution must not be impacted

Assumptions

1. vSphere Version 4.1 or later

2. FC (Block) Based Storage OR NFS (File) based Storage

Constraints

1. Storage I/O control can prevent unmounting of datastore during a Recovery which can lead to errors being reported by SRM

Motivation

1. Where possible ensure consistent storage performance for all virtual machines

2. Simplicity

Architectural Decision

Do not use Storage I/O control for datastores protected by SRM

Justification

1. Storage I/O control can prevent unmounting of datastore during a Recovery which can lead to errors being reported by SRM

2. Storage I/O control can prevent re-mounting of datastore/s during a failback which can lead to errors being reported by SRM and prevent failback without manual intervention

3. Solution does not require any custom steps added to SRM to facilitate a successful recovery

Implications

1. Storage I/O control cannot be used for Datastores protected by SRM

2. In the event of storage contention, SIOC will not be able to ensure fairness between virtual machines based on their share values

3. Storage Performance may degrade significantly during contention

Alternatives

1. Enable Storage I/O control and leave default settings

2. Enable storage I/O control and set share values on virtual machines

3. Enable Storage I/O control and set a lower “congestion threshold”

4. Enable Storage I/O control and set a higher “congestion threshold”

5. Enable Storage I/O control and as part of the DR runsheet, disable SIOC prior to executing a SRM recovery

Relates Articles

1. Example Architectural Decision –  Storage I/O Control for Clusters Protected by SRM (Example 2 – Use SIOC)