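Contents
1 Fault summary
1.1 Affected system and configuration
1.2 Brief handling timeline
2 Fault symptoms (overall description of the fault)
3 Fault analysis and handling (record of the analysis and handling)
4 Current status
5 Summary and recommendations (from handling this fault)
6 Outstanding issues (issues still pending after the fault was handled)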
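1 Fault summary
1.1 Affected system and configuration
Hardware: SUN M4000
Software: XSCF, Solaris 10
1.2 Brief handling timeline
2011-06-08: memory alarms were found on the M4000.
2 Fault symptoms (overall description of the fault)
Log in to XSCF remotely: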
XSCF> showstatus
    MBU_A Status:Normal;
*      MEMB#0 Status:Deconfigured;
*          MEM#0A Status:Deconfigured;
*          MEM#1A Status:Deconfigured;
*          MEM#2A Status:Degraded;
*          MEM#3A Status:Faulted;
XSCF> showhardconf
*      MEMB#0 Status:Deconfigured; Ver:0101h; Serial:BF0947H3W8  ;
            + FRU-Part-Number:CF00541-0545 09  /541-0545-09          ;
*          MEM#0A Status:Deconfigured;
                + Code:2c000000000000000818HTF25672PY-667G10100-d36165a1;
                + Type:2A; Size:2 GB;
*          MEM#1A Status:Deconfigured;
                + Code:ce0000000000000001M3 93T5660QZA-CE6 4151-5224785b;
                + Type:2A; Size:2 GB;
*          MEM#2A Status:Degraded;
                + Code:ce0000000000000001M3 93T5660QZA-CE6 4151-5224785e;
                + Type:2A; Size:2 GB;
*          MEM#3A Status:Faulted;
                + Code:ce0000000000000001M3 93T5660QZA-CE6 4151-52247833;
                + Type:2A; Size:2 GB;
        MEMB#1 Status:Normal; Ver:0101h; Serial:BF0947H3P2  ;
            + FRU-Part-Number:CF00541-0545 09  /541-0545-09          ;
            MEM#0A Status:Normal;
                + Code:2c000000000000000818HTF25672PY-667G10100-d3616807;
                + Type:2A; Size:2 GB;
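
The status lines above can also be filtered mechanically. The following is a minimal Python sketch, not part of the original procedure, that collects every component whose status is not Normal and builds its path from the indentation. The function name abnormal_components and the assumption that each nesting level adds about four columns (as in the transcript above) are illustrative only.

```python
import re

# Matches lines such as "*      MEMB#0 Status:Deconfigured;".
# A leading "*" marks a component that is not Normal.
STATUS_LINE = re.compile(
    r"^(?P<flag>\*?)(?P<indent>\s*)(?P<name>\S+) Status:(?P<status>\w+);"
)


def abnormal_components(output):
    """Return (path, status) pairs for components whose status is not Normal."""
    stack = []   # (level, name) of the enclosing components
    result = []
    for line in output.splitlines():
        m = STATUS_LINE.match(line)
        if not m:
            continue  # prompts and "+ Code:..." detail lines are ignored
        column = len(m.group("flag")) + len(m.group("indent"))
        # Assumption: one nesting level is roughly four columns wide.
        level = (column + 2) // 4
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, m.group("name")))
        if m.group("status") != "Normal":
            path = "/".join(name for _, name in stack)
            result.append((path, m.group("status")))
    return result


SAMPLE = "\n".join([
    "XSCF> showstatus",
    "    MBU_A Status:Normal;",
    "*      MEMB#0 Status:Deconfigured;",
    "*          MEM#0A Status:Deconfigured;",
    "*          MEM#1A Status:Deconfigured;",
    "*          MEM#2A Status:Degraded;",
    "*          MEM#3A Status:Faulted;",
])

for path, status in abnormal_components(SAMPLE):
    print(path, status)
```

The resulting paths have the same MBU_A/MEMB#0/... shape that the clearfault commands in this report take as an argument.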
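3 Fault analysis and handling (record of the analysis and handling)
This was a memory fault. It was handled as follows.
1.
First power off the system, then replace on site the memory modules whose status is Degraded or Faulted. After startup the output was as follows: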
XSCF> showstatus
*  MBU_A Status:Degraded;
*      MEMB#0 Status:Degraded;
2.
Request a service password from the vendor, for example:
TOW  JOT  RAID KAHN STAG SICK
NOW  CAFE VERY NASH FONT BORN
GLOM FUSE OTT  EAST SOOT ABLE
JAB  MORT BULL BAR  DUTY MERT
CURD SHUN SLIT DEE  AX  ABE
Then enable service mode:
XSCF> enableservice
Service Password:
  **** **** ***  **** **** ***
  **** **** **** **** **** ****
  **** **** **** **** **** ***
  **** **** ***  **** **** ****
  **** **** **** **** ***  ***
Mode password is: JOEY DISH OILY
XSCF> service
Mode password: **** **** ****
Enter service mode and run the following commands:
service> clearfault MBU_A
service> clearfault MBU_A/MEMB#0
Check whether the memory is now normal:
service> showstatus
    MBU_A Status:Normal;
*      MEMB#1 Status:Deconfigured;
*          MEM#0A Status:Deconfigured;
*          MEM#1A Status:Faulted;
*          MEM#2A Status:Deconfigured;
*          MEM#3A Status:Deconfigured;
3.
Power off and replace the memory module whose status is Faulted. After reboot the output was as follows:
XSCF> showstatus
    MBU_A Status:Normal;
*      MEMB#1 Status:Degraded;
*          MEM#0A Status:Degraded;
Clear the fault:
service> clearfault MBU_A/MEMB#1
XSCF> showstatus
    MBU_A Status:Normal;
        MEMB#1 Status:Normal;
            MEM#0A Status:Degraded;
service> clearfault /MBU_A/MEMB#1/MEM#0A
clearfault: Fault cannot be cleared for this FRU.
FRU will be marked to clear fault on next circuit breaker off and on.
Continue? [y|n]: Y
Fault will be cleared after circuit breaker off and on
After the third power-off and restart, the yellow LED went out and the output was as follows:
XSCF> showstatus
No failures found in System Initialization.
Enter the operating system:
XSCF> console -d 0
Connect to DomainID 0?[y|n] :y
After boot, check the system hardware information:
root@TJANACOL1 # prtdiag -v
couldn't set locale correctly
System Configuration:  Sun Microsystems  sun4u Sun SPARC Enterprise M4000 Server
System clock frequency: 1012 MHz
Memory size: 16384 Megabytes    # the memory is recognized
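
The final check above (reading the Memory size line from prtdiag -v) could be scripted along these lines. This is only a sketch: it assumes the output contains a line of the form "Memory size: N Megabytes" as shown in this report, and the expected value is supplied by the operator. The function names are hypothetical.

```python
import re


def memory_megabytes(prtdiag_output):
    """Return N from a 'Memory size: N Megabytes' line, or None if absent."""
    m = re.search(r"^Memory size:\s*(\d+)\s*Megabytes",
                  prtdiag_output, re.MULTILINE)
    return int(m.group(1)) if m else None


def memory_matches(prtdiag_output, expected_mb):
    """True when the reported memory equals the operator-supplied expectation."""
    return memory_megabytes(prtdiag_output) == expected_mb


SAMPLE = "\n".join([
    "System Configuration:  Sun Microsystems  sun4u Sun SPARC Enterprise M4000 Server",
    "System clock frequency: 1012 MHz",
    "Memory size: 16384 Megabytes",
])

print(memory_megabytes(SAMPLE))
print(memory_matches(SAMPLE, 16 * 1024))
```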
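4 Current status
The fault has been resolved and system services are running normally again.
5 Summary and recommendations (from handling this fault)
  Two points deserve attention:
  1. Hardware must be replaced with the power off.
  2. After replacing hardware, the faults must be cleared and the system power-cycled 2-3 times as needed.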
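
The two points above, together with the steps in section 3, amount to a simple decision loop. The sketch below is a simplified illustration of that loop in Python, not an official procedure: the function name next_action and the three action labels are hypothetical. It treats any Faulted component as needing replacement and any other abnormal status as needing clearfault plus a power cycle. In this report, the Degraded module was also replaced in the first round, so an operator may choose to replace more than this sketch suggests.

```python
import re

REPLACE = "power off and replace faulted hardware"
CLEAR = "run clearfault, then power cycle"
DONE = "done"


def next_action(showstatus_output):
    """Suggest the next step from the text printed by showstatus."""
    # Message printed in this report once no failures remain.
    if "No failures found" in showstatus_output:
        return DONE
    statuses = re.findall(r"Status:(\w+);", showstatus_output)
    if "Faulted" in statuses:
        return REPLACE
    if any(status != "Normal" for status in statuses):
        return CLEAR
    return DONE


print(next_action("*          MEM#3A Status:Faulted;"))
print(next_action("*  MBU_A Status:Degraded;\n*      MEMB#0 Status:Degraded;"))
print(next_action("No failures found in System Initialization."))
```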
6 Outstanding issues (issues still pending after the fault was handled)
