Update hardware requirements doc. Add compute node UUID work around #49

Open: wants to merge 3 commits into base: master (changes from 2 commits shown)
62 changes: 62 additions & 0 deletions docs/hardware-requirements.md
@@ -58,3 +58,65 @@ There are a few known hardware related issues with illumos.
- There have been several issues with Intel CPUs regarding
their C-States. SmartOS has worked around them, but you should
consider disabling them in your BIOS.

SmartOS/Triton

- SmartOS depends on the hardware BIOS serial number to generate a
UUID on boot, which is then assigned to the compute node (as displayed
by the `sysinfo` command). In some rare cases, such as the "Dell
PowerEdge C6100" blade-type line of servers, the chassis incorrectly
assigns the same serial number to all of the blades installed in the
same unit. This can cause issues for software such as "Triton
DataCenter" that relies on the chassis serial number, and on the UUID
that SmartOS generates from it on boot, being unique.

Example:

In the case of "Triton DataCenter" and the aforementioned "Dell
C6100" chassis, one compute node (blade/sled) is properly detected
by "cnapi", and consequently by the "Operator Portal", on boot, while
the other three quietly PXE boot without being detected by Triton. To
determine whether a duplicate server UUID is the cause of your issue,
SSH into each of the compute nodes in question and run: `sysinfo | json UUID`.
Member

Make sysinfo command a code block
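
The per-node `sysinfo | json UUID` check can be scripted once there are more than a few nodes to compare. The following is a minimal sketch, not part of the proposed doc change: the function name `find_duplicate_uuids` and the hostnames in the usage comment are hypothetical, and it assumes you have already collected one `<hostname> <uuid>` pair per line.

```shell
#!/bin/sh
# find_duplicate_uuids (hypothetical helper): reads "<hostname> <uuid>"
# pairs on stdin and prints each UUID reported by more than one host.
find_duplicate_uuids() {
    awk '{ count[$2]++ }
         END { for (u in count) if (count[u] > 1) print u }' | sort
}

# Usage sketch (hostnames are placeholders; assumes SSH access to each node):
# for h in cn01 cn02 cn03 cn04; do
#     printf '%s %s\n' "$h" "$(ssh "$h" 'sysinfo | json UUID')"
# done | find_duplicate_uuids
```

Any UUID printed by the function is shared by at least two of the listed hosts; empty output means every collected UUID was distinct.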

If more than one compute node shares the same UUID, a duplicate
serial number is likely the cause of the issue. You can also verify
the duplicate serial numbers on each node with the following:

`ipmitool fru print 0`
Member

Make this a code block


You should receive output resembling this:

Chassis Type : Rack Mount Chassis
Chassis Part Number :
Chassis Serial :
Board Mfg Date : Wed Nov 7 02:43:00 2012
Board Mfg : Dell Inc.
Board Product : PowerEdge
Board Serial : CN0D61XP747512B60255A08
Board Part Number : 282BNP0616
Product Manufacturer : Dell Inc.
Product Name : C6100
Product Part Number :
Product Version :
Product Serial : DB3KYV1
Product Asset Tag :
Comment on lines +92 to +105
Member

This block needs to be further indented to be a code block

Author

It must be something with the browser cache when reviewing with 'make serve'. They were all code blocks, but refreshes would break the formatting. I'll put them back in.
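
When comparing several nodes, it can help to reduce the `ipmitool fru print 0` output to just the serial fields. The sketch below is an illustration only: `fru_serials` is a hypothetical helper name, and it assumes the `Field : value` layout shown in the sample output above.

```shell
#!/bin/sh
# fru_serials (hypothetical helper): reads `ipmitool fru print 0` output
# on stdin and prints each serial field as "Name=value".
# Empty serial fields are kept so that missing values stay visible.
fru_serials() {
    awk -F' *: *' '/Serial/ {
        sub(/^ +/, "", $1)   # drop leading indentation from the field name
        sub(/ +$/, "", $2)   # drop trailing spaces from the value
        print $1 "=" $2
    }'
}

# Usage sketch, run on a compute node:
# ipmitool fru print 0 | fru_serials
```

Running this on each node and comparing the `Product Serial` lines is one way to confirm that several blades report the same value.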


To work around the issue, you must set a unique serial number for
each compute node using `ipmitool`. SmartOS compute nodes come with
`ipmitool` preinstalled, so the steps are:

1. SSH to the affected compute node
2. On your local desktop, randomly generate, as unicast as possible,
Member

Unicast? Did you mean unique?

Author

haha whoops...will update tomorrow

a new serial number. In my case I used `pwgen` on my Mac to
generate a random 7-character alphanumeric string, but you can
use "/dev/urandom", Python, OpenSSL, or any number of other
tools to achieve the same result.
`pwgen -sB 7 1`
3. On each node run the following three commands:
`ipmitool fru edit 0 field c 1 <NEW_SERIAL>`
`ipmitool fru edit 0 field b 2 <NEW_SERIAL>`
Member

This should be a command block. Double indent under a list item to get a command block.

Also, should the same number be put in each field? Should they be different numbers? Just want to double check on this. It’s not what I would have expected, but if you confirm this is correct, I’ll trust you.

Author

Yep, same number on all 3.

`ipmitool fru edit 0 field p 4 <NEW_SERIAL>`
4. Double check that the new serial number has been set:
`ipmitool fru print 0`
Member

Also make this a code block

5. Reboot the compute node; it should now be detected by Triton.
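
The steps above can be sketched as a small script. This is an illustration under stated assumptions, not the author's tested procedure: `gen_serial` and `print_fru_edit_commands` are hypothetical names, the serial is drawn from "/dev/urandom" rather than `pwgen` (so it does not exclude ambiguous characters the way `pwgen -B` does), and the second function only prints the three `ipmitool fru edit` commands from step 3 so they can be reviewed before being run on the node.

```shell
#!/bin/sh
# gen_serial (hypothetical helper): prints a random 7-character string of
# uppercase letters and digits, read from /dev/urandom.
# 1024 random bytes are far more than needed to yield 7 matching characters.
gen_serial() {
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Z0-9' | cut -c1-7
}

# print_fru_edit_commands (hypothetical helper): dry run only. Prints the
# three commands from step 3 with the same serial in each field, as
# confirmed in the review thread. It does not modify the FRU.
print_fru_edit_commands() {
    serial=$1
    printf 'ipmitool fru edit 0 field c 1 %s\n' "$serial"
    printf 'ipmitool fru edit 0 field b 2 %s\n' "$serial"
    printf 'ipmitool fru edit 0 field p 4 %s\n' "$serial"
}

# Usage sketch:
# new_serial=$(gen_serial)
# print_fru_edit_commands "$new_serial"
```

Printing the commands instead of executing them keeps the generation step (local desktop) separate from the FRU write (compute node), and leaves a record of which serial was assigned to which node.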
314 changes: 313 additions & 1 deletion package-lock.json
