Cisco UCS Log Fullness due to ECC Memory Errors

Greetings, everyone! I recently had a customer whose vCenter Server was reporting Cisco UCS System Event Log (SEL) fullness.

Looking at the host’s SEL Logs tab in UCS Manager, we could see that the SEL had filled up due to a significant number of ECC errors on a particular set of DIMMs. Typically, we could just clear the SEL and move on, but I’ve found that the following steps not only clear the SEL but may also reset the ECC memory error state, which helps determine whether a DIMM truly is flaky.

  1. Open your SSH client of choice and connect to the Cisco UCS Manager.


  2. Log in to UCS Manager. In this particular environment, the customer had to log on using their domain credentials in this format:
    ucs-DOMAIN\USERID


  3. Run this set of commands to connect to the particular blade (if applicable), reset the memory errors, and clear the SEL.

    In this example, connect to Chassis #3, Blade #2:
    scope server 3/2

    Then, reset all ECC memory errors being reported in the SEL:
    reset-all-memory-errors

    Commit the changes to UCS Manager:
    commit-buffer

    The next step is to reset or clear the SEL:
    clear sel

    Again, commit the changes to UCS Manager:
    commit-buffer

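  4. I believe this last step is optional, but in my experience, it doesn’t hurt: reset the CIMC, just to be safe. From the server scope, enter the CIMC scope first so that the reset applies to the CIMC:
    scope cimc
    reset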

    As usual, commit the changes:
    commit-buffer

    Doing so will drop any connection to the CIMC for that server (including the SSH session that was established earlier in this post).


  5. Ping or try to connect to the CIMC address after a few minutes to ensure connectivity and remote management.
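
For anyone who wants to script or document the procedure, the steps above can be sketched in Python. This is only an illustration under a few assumptions: the function names are made up for this example, the CIMC reset is assumed to be issued from the server's cimc scope, and the reachability check simply attempts a TCP connection to whatever port your CIMC address actually exposes. It does not talk to UCS Manager itself; it only builds the command list so you can paste it or feed it to your own SSH tooling.

```python
from __future__ import annotations

import socket


def build_sel_cleanup_commands(chassis: int, blade: int,
                               reset_cimc: bool = True) -> list[str]:
    """Return the UCS Manager CLI commands from the steps above, in order."""
    commands = [
        f"scope server {chassis}/{blade}",  # connect to the blade
        "reset-all-memory-errors",          # reset the ECC memory errors
        "commit-buffer",
        "clear sel",                        # clear the System Event Log
        "commit-buffer",
    ]
    if reset_cimc:
        # Assumption: the optional CIMC reset runs from the cimc scope.
        commands += ["scope cimc", "reset", "commit-buffer"]
    return commands


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Chassis 3, blade 2, as in the example above.
    for line in build_sel_cleanup_commands(3, 2):
        print(line)
```

After a CIMC reset, calling `is_reachable` in a loop with a short sleep is one way to cover step 5 without watching a ping window.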

And that’s pretty much all there is to it! Hopefully you found this post helpful. As always, thanks for stopping by!

PowerCLI: Find Host Profiles and Versions in vCenter

As part of our planned upgrade to vSphere 6.7, we needed the ability to quickly scan the various vCenter Servers for host profiles that may be configured for version 5.5 or older. According to the vSphere 6.7 Release Notes, if these older host profiles are found, the vCenter pre-upgrade check will fail.
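
The version check at the heart of that scan can be sketched in a few lines of Python. This is not the PowerCLI script from the post; the function names and the sample profile list are hypothetical, and it assumes each host profile's version is available as a dotted string such as "5.5.0".

```python
def parse_version(text):
    """Turn a dotted version string such as '5.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_outdated_profiles(profiles, max_version="5.5"):
    """Return names of profiles whose version is max_version or older.

    profiles is an iterable of (name, version_string) pairs.
    """
    limit = parse_version(max_version)
    outdated = []
    for name, version in profiles:
        # Compare only as many components as the limit has,
        # so '5.5.0' and '5.5.3' both count as 5.5.
        if parse_version(version)[:len(limit)] <= limit:
            outdated.append(name)
    return outdated


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real vCenter.
    sample = [("Prod-Cluster", "6.0.0"), ("Legacy-Cluster", "5.5.0")]
    print(find_outdated_profiles(sample))
```

Comparing tuples of integers rather than raw strings avoids the classic trap where "10.0" sorts before "5.5".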


PowerCLI: Find BIOS-Enabled VMs

This script is an idea that spun off of my previous post, PowerCLI: Find UEFI-Enabled VMs. If you’re preparing to enable Secure Boot in a VMware environment, it may be helpful to identify the VMs that cannot be upgraded. As you might recall, enabling secure boot requires the following:

  • VMware vSphere 6.5 or higher
  • Virtual hardware version 13 or higher
  • VMs need to be configured with EFI boot firmware
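
The three requirements above boil down to a simple check per VM. Here is a minimal Python sketch of that logic, not the PowerCLI script itself; the `VmInfo` class, its field names, and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmInfo:
    name: str
    hardware_version: int  # e.g. 13 for virtual hardware version 13
    firmware: str          # "bios" or "efi"


def secure_boot_blockers(vm, vsphere_version=(6, 5)):
    """Return the reasons (if any) why Secure Boot cannot be enabled."""
    blockers = []
    if vsphere_version < (6, 5):
        blockers.append("vSphere is older than 6.5")
    if vm.hardware_version < 13:
        blockers.append("virtual hardware version is below 13")
    if vm.firmware.lower() != "efi":
        blockers.append("boot firmware is not EFI")
    return blockers


if __name__ == "__main__":
    # Hypothetical VM used only to show the call.
    print(secure_boot_blockers(VmInfo("legacy-app01", 11, "bios")))
```

An empty list means the VM meets all three requirements; anything else tells you what to fix first.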


PowerCLI: Find UEFI-Enabled VMs

With all the news regarding the Spectre and Meltdown CPU vulnerabilities over the past several months, there’s been a greater focus on getting VMware virtual machines to virtual hardware version 9 or higher, as noted by Andrea Mauro’s post regarding these vulnerabilities. In addition to that, several companies and organizations may be looking to enable Secure Boot, a feature first introduced with vSphere 6.5. However, to enable Secure Boot, the virtual machine needs to be configured with EFI boot firmware AND be on virtual hardware version 13 or higher.


Finding NICs That Aren’t VMXNET3

Earlier this week, someone on our team received a request to change a VMware virtual machine’s NIC from e1000 to VMXNET3. While the change was a bit manual in nature due to the Guest OS configuration changes, it got us thinking… How many other VMs might still have e1000 adapters? So, I started working on a script to find out.


PowerCLI: Create New VM Port Groups in a Cluster

Hello again, everyone! Recently, I’ve been working on a script that will create new VM Port Groups on a virtual standard switch (vSS) in a given cluster. While this could probably be avoided by using a virtual distributed switch (vDS), let’s assume that you have a need to stick with vSS for whatever reason (licensing, company standards, etc.).

The script validates that the VLAN number is in fact a whole number within the range of 1 through 4095. At the end, it asks whether you’d like to add another port group to the same cluster. I found this to be very handy when standing up a new cluster that only uses vSS, or when simply adding more port groups to an existing cluster.
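
The VLAN validation can be sketched as follows. This is a Python illustration rather than the PowerCLI code from the post; the function name is made up, and the bounds are parameters whose default upper limit of 4095 is an assumption (the highest VLAN ID a standard switch port group accepts).

```python
def parse_vlan_id(text, lowest=1, highest=4095):
    """Return the VLAN ID as an int, or None if the input is invalid.

    Valid input is a whole number between lowest and highest, inclusive.
    """
    text = text.strip()
    # Reject empty input, signs, decimals, and non-ASCII digits.
    if not (text.isascii() and text.isdigit()):
        return None
    vlan = int(text)
    if lowest <= vlan <= highest:
        return vlan
    return None


if __name__ == "__main__":
    for candidate in ("100", "12.5", "5000"):
        print(candidate, "->", parse_vlan_id(candidate))
```

Returning `None` for bad input makes it easy to keep prompting in a loop until the user enters a usable value.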


PowerCLI: Get or Set VAAI Settings for VMware Hosts

During a recent technical engagement with a vendor, my team was asked to verify that VAAI was disabled for all hosts attached to that vCenter. There are several different ways to go about doing this, so I figured I would put this blog post together to showcase some of the different ways in which this can be accomplished. There are three settings that need to be reviewed (or changed). They are: DataMover.HardwareAcceleratedMove, DataMover.HardwareAcceleratedInit, and VMFS3.HardwareAcceleratedLocking. A value of 1 means the setting is enabled, and a value of 0 means the setting is disabled.
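
To make the 1/0 logic concrete, here is a small Python sketch that takes the three values for one host and reports which ones are still enabled. It is not one of the PowerCLI methods from the post; the function names are hypothetical, and the settings are assumed to arrive as a plain dictionary keyed by setting name.

```python
VAAI_SETTINGS = (
    "DataMover.HardwareAcceleratedMove",
    "DataMover.HardwareAcceleratedInit",
    "VMFS3.HardwareAcceleratedLocking",
)


def enabled_vaai_settings(host_settings):
    """Return the VAAI setting names whose value is 1 (enabled).

    Raises KeyError if a setting is missing, so that a gap in the
    data is never mistaken for 'disabled'.
    """
    return [name for name in VAAI_SETTINGS if int(host_settings[name]) == 1]


def vaai_fully_disabled(host_settings):
    """Return True only when all three settings are 0."""
    return not enabled_vaai_settings(host_settings)


if __name__ == "__main__":
    # Hypothetical values for a single host.
    example = {
        "DataMover.HardwareAcceleratedMove": 0,
        "DataMover.HardwareAcceleratedInit": 0,
        "VMFS3.HardwareAcceleratedLocking": 1,
    }
    print(enabled_vaai_settings(example))
```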
