Nov 18, 2011
 

A big thanks to my colleague Stuart Street for helping put this one together.

 

The Scenario:

Existing VMware 4.1 cluster, 5 blades

UCS environment 1.3(1t)

BIOS S5500.1.4.1f.0.120820101100

We needed to add a further 3 blades to the existing cluster.

The plan was either to clone an existing service profile and create a firmware and boot policy for the new blades, or to create a new unique service profile from the service profile template.

Next we diligently checked the various pools for enough spare entries (management IP, UUID, MAC address, WWPN, WWNN). Every pool had enough.
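For anyone who wants to script that headroom check, here is a minimal sketch in Python. The pool sizes, the assigned counts and the per-blade requirements are all made-up numbers for illustration; the real figures would come from UCS Manager, and the function name is my own.

```python
# Hypothetical figures: (total entries, entries already assigned) per pool.
# None of these numbers come from the real cluster.
pools = {
    "mgmt-ip": (16, 5),
    "uuid": (8, 5),
    "mac": (32, 20),
    "wwpn": (16, 10),
    "wwnn": (8, 5),
}

# Assumed number of entries each blade consumes; pools not listed need 1.
per_blade = {"mac": 4, "wwpn": 2}


def pools_short_of_capacity(pools, new_blades, per_blade):
    """Return {pool name: shortfall} for pools that cannot cover the new blades."""
    short = {}
    for name, (size, assigned) in pools.items():
        free = size - assigned
        needed = new_blades * per_blade.get(name, 1)
        if free < needed:
            short[name] = needed - free
    return short


# An empty dict means every pool has enough free entries on paper.
print(pools_short_of_capacity(pools, 3, per_blade))
```

Bear in mind that a count like this only shows the pools look fine on paper. As the rest of this post shows, a pool can pass this check and service profile creation can still fail.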

Trying to create the service profile failed. The error messages were cryptic at first and led us to believe the problem was related to the management IP pool or the vNIC pool. The second error message does refer to a pool-based issue.

SP creation method: cloning a service profile

Error cloning

lsServer: can’t contain: object of class vnicIpV4PooledAddr with RN ipv4-pooled-addr, DN os org-root/ls-SERVERNAME/ipv4-pooled-addr.

SP creation method: creating a service profile from a template

ucs error creating service profiles from template. Cause: pooled address is unknown

There were no clues elsewhere: service profile creation simply fails, so the profile never takes anything from any of the pools.

So we manually rolled back the firmware on the new blades before trying to create a new SP. The BIOS cannot be rolled back the same way, however, unless there is a spare existing SP you can use to create a BIOS policy. In our case there was, so we tried the BIOS version downgrade as well. It did not fix the problem.

Resolution:

The exact fix: create a new UUID pool, add enough UUIDs to it, and point the service profile at the new pool. Result: the clone and template methods now both work.
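When adding a block of UUIDs to a new pool you need a first and a last suffix. The sketch below works out the last suffix from a starting suffix and a count. It assumes the suffix uses the nnnn-nnnnnnnnnnnn hex layout (4 plus 12 hex digits); the starting value is made up, the function name is my own, and nothing here talks to UCS Manager.

```python
def uuid_suffix_block(start, count):
    """Return (first, last) suffixes for a block of `count` UUIDs.

    Assumes the suffix is 16 hex digits written as nnnn-nnnnnnnnnnnn.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    head, tail = start.split("-")
    if len(head) != 4 or len(tail) != 12:
        raise ValueError("expected a suffix like 0000-000000000001")

    first = int(head + tail, 16)
    last = first + count - 1
    if last >= 16 ** 16:
        raise ValueError("block runs past the end of the suffix space")

    def fmt(value):
        # Zero-pad to 16 hex digits, then restore the 4-12 split.
        digits = f"{value:016X}"
        return f"{digits[:4]}-{digits[4:]}"

    return fmt(first), fmt(last)


# Hypothetical starting suffix, sized for 8 blades.
print(uuid_suffix_block("0001-000000000001", 8))
```

Picking a starting suffix well away from the old pool's range keeps the two pools from overlapping, which matters here because we left the original pool in place.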

The “explanation” is that rolling the firmware back on the blades somehow leaves the normally eligible entries in the existing UUID pool stranded. Below is the link to the Cisco TAC explanation. It does not really match the scenario we saw, and the labs at Cisco have not reproduced it, but the fix worked! We did not delete the existing UUID pool, as we didn’t see the point in taking any risk. Also, the Fabric Interconnects were at 1.3(1t) throughout.

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtr74080

(NOTE: you will need a Cisco ID to see the above)

Thoughts:

So if you run into trouble with service profile creation, my first step would be to create a new UUID pool and reference that new pool in the SP. I would probably do the same with the other pools, one by one, in case that leads to a resolution. You can always delete the new pools if it turns out they don’t help.
