
Talking about "Pinning the wrong way" – another smart UCS idea!!


This totally surprised me. If I understood it right, it looks like Cisco got it wrong big time with this “pinning” feature in the UCS IO Module (FEX)! They have fixed configurations for traffic flow from server slots, depending on how many uplinks are used. So if you ever change the number of connected uplinks, you need to relocate servers to the appropriate slots based on their bandwidth requirements, following the pinning table below.

That’s really not a smart idea!

Read this from the UCS configuration guide:

Pinning Server Traffic to Server Ports

All server traffic travels through the I/O module to server ports on the fabric interconnect. The number of links for which the chassis is configured determines how this traffic is pinned.

The pinning determines which server traffic goes to which server port on the fabric interconnect. This pinning is fixed. You cannot modify it.

As a result, you must consider the server location when you determine the appropriate allocation of bandwidth for a chassis.

You must review the allocation of ports to links before you allocate servers to slots. The cabled ports are not necessarily port 1 and port 2 on the I/O module.

If you change the number of links between the fabric interconnect and the I/O module, you must reacknowledge the chassis to have the traffic rerouted.

Note

All port numbers refer to the fabric interconnect-side ports on the I/O module.

Chassis with One I/O Module (FEX)

So you can have two servers per uplink, a 2:1 oversubscription, if all four links are used, but the traffic flows are fixed. For example, traffic from servers 1 and 5 shares uplink 1. You have no choice to change that (remember, pinning is fixed and you cannot modify it). Your only choice is placement: if two servers A and B both have high bandwidth requirements, you don’t want them together in slots 1 & 5, 2 & 6, 3 & 7, or 4 & 8, to avoid bandwidth starvation. So you may want to pair server A with server C, because server C needs little bandwidth. And if all the servers need high bandwidth, then all of them will starve now and then…
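To make the placement problem concrete, the fixed mapping described above (slots 1 & 5 on uplink 1, 2 & 6 on uplink 2, and so on with four links) can be modeled as a simple round-robin of slots over links. Note this formula is my own reconstruction from the 4-link example; for other link counts, Cisco’s pinning table is the authoritative source.

```python
# Sketch of the fixed UCS pinning scheme for an 8-slot chassis.
# Reconstructed from the 4-link example above (slots 1 & 5 share uplink 1,
# 2 & 6 share uplink 2, etc.); consult Cisco's pinning table for other counts.

def pinned_uplink(slot: int, num_links: int) -> int:
    """Return the uplink (1-based) a server slot's traffic is pinned to."""
    return ((slot - 1) % num_links) + 1

# With all four links in use, two slots share each uplink (2:1 oversubscription):
mapping = {slot: pinned_uplink(slot, 4) for slot in range(1, 9)}
print(mapping)  # {1: 1, 2: 2, 3: 3, 4: 4, 5: 1, 6: 2, 7: 3, 8: 4}
```

You can’t change which slot goes to which uplink, so this function is effectively your only planning tool: pick slot pairs so that two bandwidth-hungry servers never land on the same uplink.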

What happens if your uplink bandwidth needs change?

For example, say you had 2 uplinks connected and you made sure the servers were plugged into the appropriate slots so that none of them starved for uplink bandwidth. You finally got that optimized.

Then later you decide to connect more uplinks to support your new bandwidth needs, say 4. Since the server slots pinned to a specific uplink change once you connect 4 uplinks, you have to figure out all over again which slot each server should go into so they don’t compete for uplink bandwidth and starve.
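A quick sketch of why that re-planning is needed: the set of slots sharing an uplink changes with the link count. Assuming the same round-robin pinning rule (again, my reconstruction; Cisco’s pinning table is the authority), here is how the groups shift between 2 and 4 uplinks:

```python
from collections import defaultdict

# Group chassis slots by the uplink their traffic is pinned to, assuming a
# round-robin pinning rule (a reconstruction; Cisco's pinning table is authoritative).

def uplink_groups(num_links: int, num_slots: int = 8) -> dict:
    groups = defaultdict(list)
    for slot in range(1, num_slots + 1):
        groups[((slot - 1) % num_links) + 1].append(slot)
    return dict(groups)

print(uplink_groups(2))  # {1: [1, 3, 5, 7], 2: [2, 4, 6, 8]}
print(uplink_groups(4))  # {1: [1, 5], 2: [2, 6], 3: [3, 7], 4: [4, 8]}
# With 2 links, slots 1, 3, 5, and 7 all competed for uplink 1; with 4 links
# the groups are reshuffled, so a placement tuned for 2 links has to be
# re-evaluated to actually benefit from the added capacity.
```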

I think Cisco “Pinned” themselves wrong with UCS!! Don’t you??

Cheers

Sri


Do you see a need to manage bandwidth to avoid traffic contention on the 10 Gigabit super highway in virtualized datacenters? Here is my take on it… what’s yours?


First of all, 10Gig is a huge jump in the bandwidth available for server connectivity compared to today’s 1GE options. Remember, we run our business today with about 8 or 12 1GE NICs and a couple of 4G HBAs for network and storage connectivity in a server. Two 10Gig NICs bring 20Gig to play with for the same purposes, which in fact equals or exceeds what we are used to today. If we need more in the future, we can always add a few more 10Gig NICs to share tomorrow’s traffic loads.
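To put numbers on that comparison (using the example NIC and HBA counts above; your own server mix may differ):

```python
# Aggregate server bandwidth: a typical 1GE-era server vs. a dual-port 10GE server.
# Adapter counts come from the example in the text; adjust for your own servers.
legacy_gbps = 8 * 1 + 2 * 4       # eight 1GE NICs + two 4G FC HBAs = 16 Gbps
ten_gig_gbps = 2 * 10             # two 10GE ports = 20 Gbps
print(legacy_gbps, ten_gig_gbps)  # 16 20
```

So even the low end of the example (8 NICs rather than 12) is already covered, with room to spare, by just two 10GE ports.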

Since bandwidth is a constraint in the 1Gig world, where today’s applications have outgrown the bandwidth available, we needed to manage traffic congestion, and all these technologies that promise dedicated bandwidth per traffic type made much sense…

But when you move to 10GE connectivity, that is a different story! Since you have more bandwidth than today’s traffic needs, which in turn guarantees congestion-free traffic flows, traffic management tools such as NIC partitioning techniques like HP Flex-10, VMware rate limiting, and Network IO Control (NetIOC) are not needed at all.

At this point, all these bandwidth management schemes are unnecessary and should not be used, to avoid the complexity they bring along with the features they support.

10GE brings much-needed port consolidation (especially in compact form factors like blades) and a bandwidth capacity that is more than today’s applications and traffic need.

We have to make that point clear to technology consumers and shift them from the old-school thinking of dedicated physical ports with specific bandwidths for traffic isolation and bandwidth needs, to the new-school thinking of sharing the same wire using VLANs and other techniques.

To sell their stuff, what some companies (selling proprietary techniques like Flex-10) are doing is fanning the existing paranoia that critical traffic won’t have enough bandwidth on the wire, while not highlighting that 10Gig actually solves bandwidth needs by itself, on its own, with no need for special techniques, period!

If your core infrastructure is not ready for 10Gig connectivity, then you should look at high port density solutions such as Dell 1GE quad-port NICs with PowerConnect M6348 high port density switches for the M1000e blade chassis, to serve the need for more ports to support virtualization.

The right industry-standard technologies are evolving to improve efficiency in IO virtualization, like Single Root IO Virtualization (SR-IOV), which is much needed to bring the benefits of virtualization to every type of computing, including HPC.

SR-IOV is not about carving up bandwidth; in fact, it does not do that at all. It virtualizes a physical NIC and presents it as multiple devices to the hypervisor, something the hypervisor does in software today. So the SR-IOV technique offloads that work to the NIC, leaving more cycles for the hypervisor to do important things like supporting more VMs 🙂 and running them more efficiently. It also supports direct access to these devices from guest VMs/OSs, bypassing the hypervisor and drastically improving efficiency, while still allowing vMotion since the hypervisor retains access to the control plane of these devices. All these devices share the same wire, so there is no dedicated bandwidth per device here. Remember, 10Gig is enough!

In the future, if users run into bandwidth contention even with 10Gig NICs, because apps have grown and started generating larger traffic, they can add more NICs, just as they do today with 1Gig NICs.

In a “worst case”, where they still run into bandwidth contention because they have run out of room to add more 10Gig NICs, we still have good tools like NetIOC or rate limiting for traffic shaping. And I don’t see that need arising in the near future.
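For the curious, the kind of traffic shaping that rate limiters and NetIOC-style shares perform boils down to something like a token bucket. Below is a minimal, illustrative sketch of the general technique (not VMware’s implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket traffic shaper: permits bursts up to `capacity`
    tokens and refills at `rate` tokens per second. Illustrative only."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # refill rate, tokens per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; otherwise reject (shape) the packet."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)   # 5 units/sec sustained, burst of 10
burst = [bucket.allow() for _ in range(12)]
print(burst.count(True))   # the initial burst of ~10 passes; the rest are shaped
```

The point of the post stands, though: with congestion-free 10GE, this machinery sits idle, which is exactly why it shouldn’t be a purchasing criterion today.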

Hope this makes sense and provides the bigger picture on 10GE for virtualization.

Let’s get the perception of these 10GE super highways right: you don’t need traffic management when there is no congestion. In my words, “simply don’t add complexity when not needed”.

In the meantime, we are working hard to make the technologies that are actually needed, like SR-IOV, happen in real products, as that makes better sense.

That’s my take on this topic…what’s yours?

Here are a few papers that dig deep into this subject:

  1. Simplify VMware vSphere* 4 Networking with 10 Gigabit Server Adapters

  2. “Optimizing QoS for 10GE in Virtual Networks”


  3. Virtual Switches Demand Rethinking Connectivity for Servers

    Signing off for now…..Sri
