10 Nutanix Networking Best Practices

If you're looking to set up a Nutanix network, here are 10 best practices to follow for optimal performance.

As a leading hyperconverged infrastructure (HCI) solution, Nutanix provides a single platform for compute, storage, and networking. This article provides an overview of the key networking best practices for Nutanix deployments.

1. Create a dedicated network for Nutanix

Nutanix generates a lot of network traffic for storage replication and other internal cluster operations. By creating a dedicated network, you can ensure that this internal traffic doesn’t compete for bandwidth with your applications and users.

Additionally, having a dedicated network simplifies troubleshooting and helps to prevent issues from spreading. If there’s a problem with the Nutanix cluster, you can isolate it to the dedicated network without affecting other parts of your infrastructure.

2. Use jumbo frames (MTU 9000) on all networks except management

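Jumbo frames allow more data to be carried in each Ethernet frame, which reduces per-frame header overhead and the number of frames each device has to process. When using jumbo frames, it’s important to make sure that every device in the path (NICs, virtual switches, and physical switches) is configured for the larger MTU; otherwise, oversized frames will be dropped or fragmented along the way.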

Nutanix recommends using jumbo frames on all networks except for the management network, which should use standard frames (MTU 1500).
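To put a rough number on the overhead savings, the sketch below compares MTU 1500 and MTU 9000 for a plain TCP-over-IPv4 stream. The per-frame byte counts are assumptions for an untagged Ethernet link with no TCP options; VLAN tags, TCP timestamps, or overlay headers would change the figures slightly.

```python
# Assumed per-frame costs in bytes. Real traffic may add VLAN tags,
# TCP options, or tunnel headers, which are ignored here.
ETHERNET_WIRE_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20                  # IPv4 header + TCP header, no options


def payload_per_frame(mtu: int) -> int:
    """Application bytes carried by one full-size frame."""
    return mtu - IP_TCP_HEADERS


def payload_efficiency(mtu: int) -> float:
    """Fraction of bytes on the wire that are application payload."""
    return payload_per_frame(mtu) / (mtu + ETHERNET_WIRE_OVERHEAD)


def frames_needed(total_bytes: int, mtu: int) -> int:
    """Number of frames needed to carry total_bytes (integer ceiling division)."""
    per_frame = payload_per_frame(mtu)
    return -(-total_bytes // per_frame)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: efficiency {payload_efficiency(mtu):.2%}, "
            f"frames per GB {frames_needed(10**9, mtu)}"
        )
```

Under these assumptions the efficiency gain is a few percentage points, while the number of frames to process drops by roughly a factor of six, which is where most of the CPU savings come from.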

3. Enable flow control on the physical switch ports

When flow control is disabled, the switch can drop packets if its buffers fill up during a burst of traffic. Dropped packets force retransmissions, which can hurt the storage traffic that the Nutanix cluster depends on.

Enabling flow control lets a congested port ask its link partner to pause transmission briefly, which reduces the likelihood of dropped packets and of the disruptions they cause to the Nutanix cluster.

4. Disable Spanning Tree Protocol (STP) or Rapid Spanning Tree Protocol (RSTP)

STP and RSTP are link layer protocols that prevent loops by creating a logical topology of the network. However, in a Nutanix environment, there is no need for STP or RSTP because the physical topology is already loop-free.

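Disabling STP or RSTP also removes the protocol’s CPU and memory overhead on the network devices. Keep in mind that this argument depends on the topology really being loop-free: with STP and RSTP turned off, nothing will block a loop that is cabled in by accident.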

5. Configure VLANs and IP subnets to support multicast traffic

Multicast traffic is used by features in the Nutanix platform, such as Acropolis File Services (AFS) and Prism Central. If your network is not configured to support multicast traffic, these features will not work properly.

To configure your network for multicast traffic, you’ll need to create a VLAN and an IP subnet that are both dedicated to multicast traffic. You can then configure your switches and routers to allow multicast traffic on those VLANs and subnets.

By following this best practice, you can be sure that your Nutanix platform will work properly and take full advantage of all its features.

6. Ensure that your switches are configured with IGMP snooping enabled

When IGMP snooping is enabled, the switch will listen in on IGMP traffic and only forward it to the ports that have hosts that have joined the multicast group. This prevents unnecessary flooding of multicast traffic to all ports, which can cause performance issues.
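The forwarding difference can be illustrated with a toy model. This is a simplified sketch, not how any particular switch is implemented: the `ToySwitch` class and its methods are hypothetical, and it ignores multicast router ports, queriers, and membership timeouts.

```python
from collections import defaultdict


class ToySwitch:
    """Hypothetical switch model that tracks multicast group membership per port."""

    def __init__(self, ports, snooping=True):
        self.ports = set(ports)
        self.snooping = snooping
        self.members = defaultdict(set)  # group address -> ports that joined

    def join(self, port, group):
        """Record an IGMP join seen on a port."""
        self.members[group].add(port)

    def leave(self, port, group):
        """Record an IGMP leave seen on a port."""
        self.members[group].discard(port)

    def egress_ports(self, group, ingress_port):
        """Ports a multicast frame for `group` is sent out of."""
        if self.snooping:
            targets = self.members.get(group, set())
        else:
            targets = self.ports  # no snooping: flood to every port
        return set(targets) - {ingress_port}


if __name__ == "__main__":
    group = "239.1.1.1"  # example group address, chosen for illustration
    snooping = ToySwitch(range(1, 9), snooping=True)
    snooping.join(3, group)
    snooping.join(5, group)
    flooding = ToySwitch(range(1, 9), snooping=False)
    print(sorted(snooping.egress_ports(group, ingress_port=1)))
    print(sorted(flooding.egress_ports(group, ingress_port=1)))
```

With snooping on, only the two ports that joined the group receive the frame; with it off, the same frame goes out of every port except the one it arrived on.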

It’s also important to configure your switches for jumbo frames (see practice 2), since every device in the path has to accept the larger frame size for jumbo frames to work.

Finally, you should make sure that your switch ports are configured for the correct speed and duplex settings. Incorrectly configured ports can lead to a number of problems, including packet loss and latency.

7. Do not use link aggregation groups (LAGs) in active/active mode

When LAGs are used in active/active mode, each host in the cluster will send traffic to every other host in the cluster over all of the links in the LAG. This can lead to suboptimal use of bandwidth and can cause increased latency.

It’s much better to use LAGs in active/passive mode, where only one link in the LAG is active at any given time. This ensures that traffic is only sent over one link at a time, which leads to more efficient use of bandwidth and lower latency.
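As a rough illustration of the active/passive behavior described above, the sketch below picks a single active link from a priority-ordered list and fails over when that link goes down. The `Link` class, the `active_link` function, and the interface names are hypothetical; real bonding drivers add link monitoring and configurable failback policies that this ignores.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical uplink with a name and an up/down state."""
    name: str
    up: bool = True


def active_link(links):
    """Return the name of the first healthy link in priority order, or None."""
    for link in links:
        if link.up:
            return link.name
    return None


if __name__ == "__main__":
    uplinks = [Link("eth0"), Link("eth1")]  # example interface names
    print(active_link(uplinks))  # eth0 carries all traffic
    uplinks[0].up = False        # simulate a failure on the primary link
    print(active_link(uplinks))  # traffic moves to eth1
```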

8. Set up redundant links between each node and the top-of-rack (ToR) switch

If one of the links between a node and ToR switch fails, traffic will still be able to flow over the other link. This ensures that there is no single point of failure in the network.

Nutanix also recommends using jumbo frames (MTU 9000) to improve network performance. Jumbo frames can help reduce CPU utilization on the nodes because fewer packets need to be processed.

Finally, Nutanix recommends enabling flow control on all ports to limit the effects of network congestion. Flow control lets a receiving device signal the sender to pause transmission when it cannot process incoming data fast enough. This helps to avoid dropped packets and makes delivery more reliable.

9. Connect each NIC port to different ToR switches

If you connect all of a node’s NIC ports to the same ToR switch, you’re creating a single point of failure. If that switch goes down, every node that depends on it loses network connectivity.

By connecting each NIC port to a different ToR switch, you’re ensuring that your network will stay up even if one of the switches goes down. This is a simple but important best practice that can save you a lot of headaches down the road.
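A cabling plan can be checked for this mistake before anything is racked. The sketch below assumes the plan is available as a simple mapping of node name to NIC port to switch name; that data layout, the function name, and the node and switch names are all hypothetical.

```python
def single_switch_nodes(cabling):
    """Return sorted names of nodes whose NIC ports do not span two or more switches.

    `cabling` maps node name -> {nic port name -> switch name}.
    Nodes with no ports listed are flagged as well.
    """
    return sorted(
        node
        for node, ports in cabling.items()
        if len(set(ports.values())) < 2
    )


if __name__ == "__main__":
    plan = {
        "node-a": {"eth0": "tor-1", "eth1": "tor-2"},
        "node-b": {"eth0": "tor-1", "eth1": "tor-1"},  # both ports on one switch
    }
    for node in single_switch_nodes(plan):
        print(f"{node}: all NIC ports depend on a single ToR switch")
```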

10. Avoid using LACP, MLAG, vPC, or other similar features

When using any of these features, traffic that is supposed to go over one link may instead be sent over another link. This can cause problems because the return traffic may not come back over the same link, leading to a “hairpinning” situation.

Hairpinning can cause all sorts of problems, including decreased performance, increased latency, and even dropped packets. In some cases, it can even lead to a complete loss of connectivity.

The best way to avoid these problems is to simply not use any of these features. If you need to use them for some reason, make sure that you understand the potential risks and take steps to mitigate them.
