2011-09-23

Oracle Appliance - Memory Uncorrectable Error

Product: Oracle Appliance

Bug description:
There is a bug in the Oracle Appliance firmware during BIOS POST initialization that throws the following error during boot:

b | 9/23/2011 | 13:48 | Memory | Uncorrectable Error | Asserted | OEM Data-2 0x11 OEM Data-3 0x8c
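
To pull the two OEM data bytes out of an event line like the one above, a small parser is enough. This is only a sketch: the function name and the field names (`id`, `date`, `time`, `sensor`, `event`, `state`, `oem`) are my own labels for the pipe-separated columns, not ILOM terminology, and the layout is assumed to match the single line shown here.

```python
import re

# Hypothetical helper: the field names below are assumptions, not ILOM terminology.
FIELDS = ("id", "date", "time", "sensor", "event", "state", "oem")

def parse_event_line(line):
    """Split one pipe-delimited event line and pull out the OEM data bytes."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Map each byte number to its value, e.g. "OEM Data-2 0x11" -> {2: 0x11}.
    pairs = re.findall(r"OEM Data-(\d+)\s+0x([0-9a-fA-F]+)", record["oem"])
    record["oem"] = {int(n): int(v, 16) for n, v in pairs}
    return record

line = ("b | 9/23/2011 | 13:48 | Memory | Uncorrectable Error | Asserted | "
        "OEM Data-2 0x11 OEM Data-3 0x8c")
record = parse_event_line(line)
print(record["event"], {n: hex(v) for n, v in record["oem"].items()})
```

The resulting `record["oem"]` dictionary holds Data Byte 2 and Data Byte 3 as integers, ready for bit-level decoding.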

Solution:
Run the following command in the Oracle ILOM CLI:
set /SYS/MB/Px/Dy/ clear_fault_action = true

where Px/Dy is the DIMM reported as faulty (Px is the CPU, Dy is the DIMM).

For the example above:
set /SYS/MB/8/d/ clear_fault_action = true

Avoid restarting the server during BIOS initialization. If a restart is necessary, wait until the boot has passed VGA initialization, which is the stage that triggers this bug.

Interpreting the DIMM error message
  1. Log in to ILOM
  2. Use a web browser to open the System Event Log
  3. Data-2 = Data Byte 2
  4. 0x11 decodes as: bits 6-7 = 00 means ECC memory error, bits 4-5 = 1 means memory branch 1, bits 0-3 = 1 means CPU 1
  5. Data-3 = Data Byte 3
  6. 0x8c decodes as: bits 4-7 = 8 means DIMM 0 of the RAM pair with the ECC error is DIMM8, bits 0-3 = 12 means DIMM 1 of the pair is DIMM12. These numbers are the labels on the memory modules, which are installed in pairs
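
The bit fields described in the list can be extracted with shifts and masks. The sketch below uses the two bytes from the event line in this post (0x11 and 0x8c). The function names and dictionary keys are hypothetical, chosen only for illustration, and the bit layout is taken from the description above rather than from any official Oracle documentation.

```python
def decode_data2(value):
    """Decode Data Byte 2 using the bit layout described in this post."""
    return {
        "error_type": (value >> 6) & 0x3,  # bits 6-7: 0 means ECC memory error
        "branch": (value >> 4) & 0x3,      # bits 4-5: memory branch number
        "cpu": value & 0xF,                # bits 0-3: CPU number
    }

def decode_data3(value):
    """Decode Data Byte 3 into the labels of the two DIMMs of the pair."""
    return {
        "dimm0_label": (value >> 4) & 0xF,  # bits 4-7: DIMM 0 of the pair
        "dimm1_label": value & 0xF,         # bits 0-3: DIMM 1 of the pair
    }

d2 = decode_data2(0x11)
d3 = decode_data3(0x8C)
print(d2)
print(d3)
```

For 0x11 this gives error type 0, branch 1, CPU 1, and for 0x8c it gives DIMM labels 8 and 12, matching the manual decoding.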
