What are these false hardware errors?

Tags:
AS/400
Pat ran into this problem when applying PTFs. He writes, "We are running V5R2. When applying the CUMs, I always do the HIPER and DATABASE PTFs before I do the CUM. I IPL between the two groups. During my last two CUM applies, I got false hardware errors that halted the IPL. The first time I got the code that indicated there was a bad disk controller. There was not. The second time I got the code that indicated two disk drives were missing. They were not. Both times I was able to manually IPL from the B side, then re-IPL unattended from the B side and everything was fine. Both times I called IBM and was told the 'microcode must have gotten confused.' With the next CUM being released next week, I really don't want to have it happen again. Coincidence or actual problem?" What do you think? -- Michelle Davidson, editor, Search400.com

Answer Wiki


With any upgrade of code there is always the chance of the microcode getting “confused”. Have there been any recent changes in the way the system is accessed? Have programmers changed any of the system values or tried to modify any native IBM commands (RSTLIB, etc.) with code of their own? I worked in Disaster Recovery for 7 years, and for certain clients we had to wipe our test system 3 times to get it back to a normal state because they had made modifications in areas where IBM recommends against them.

===========================================================

<i>The first time I got the code that indicated there was a bad disk controller.
The second time I got the code that indicated two disk drives were missing.</i>

I assume that when you say you “got the code” you mean an SRC (system reference code), that you called IBM hardware (not software) support at that time, and that you have logged two separate support calls.

At this point, ensure that all problem log issues are cleared, that all suggested PTFs have been applied and that all corrective actions have been taken. If parts are called out, send the problem reports to IBM through the problem log.

Run DSPPTF to an outfile. Query it for any PTFs that have actions that still need to be taken. Perform the actions.
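As a rough illustration of that query step, here is a minimal Python sketch. It assumes the DSPPTF outfile has already been exported to CSV text. The column names (`ptf_id`, `status`, `action_pending`) and the sample rows are hypothetical placeholders, not the real outfile field names, so map them to whatever your outfile actually contains.

```python
import csv
import io

# Hypothetical column names; the real DSPPTF outfile has its own field names.
PTF_ID = "ptf_id"
ACTION_PENDING = "action_pending"


def ptfs_needing_action(csv_text):
    """Return the IDs of PTFs whose row shows an action still pending."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        # Treat any case variation of "yes" as an outstanding action.
        if row[ACTION_PENDING].strip().lower() == "yes":
            flagged.append(row[PTF_ID].strip())
    return flagged


# Invented sample rows, for illustration only.
SAMPLE = """ptf_id,status,action_pending
SI00001,Temporarily applied,No
SI00002,Not applied,Yes
MF00003,Permanently applied,No
"""

if __name__ == "__main__":
    print(ptfs_needing_action(SAMPLE))
```

The same filter could just as well be a query run directly on the system; the point is only to end up with a short list of PTFs whose actions are still outstanding.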

If you have HiPers to apply, load and apply them at least a week before the cume. You want to run clean for a few days, and a week is a good length of time. If any HiPers apply, you’ll want to ensure again that no actions are missed.

Plan the cume for at least 6-8 weeks after it’s available. Most issues will have been found by someone else by then. Relevant HiPers will then become available, so that’s when you’ll order them, load them and apply them. (And run for a week.)
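The timing above can be sketched as simple date arithmetic. This is only an illustration of the suggested schedule: the function name `plan_cume` and its parameters are made up for the example, and the 6-week wait and 7-day clean run are the minimums from the advice above.

```python
from datetime import date, timedelta


def plan_cume(released, wait_weeks=6, clean_run_days=7):
    """Return suggested dates for the HiPer apply and the cume apply.

    released       -- date the cume became available
    wait_weeks     -- how long to let others find problems first (6-8 suggested)
    clean_run_days -- days to run clean between the HiPers and the cume
    """
    cume_apply = released + timedelta(weeks=wait_weeks)
    hiper_apply = cume_apply - timedelta(days=clean_run_days)
    return {"hiper_apply": hiper_apply, "cume_apply": cume_apply}


if __name__ == "__main__":
    plan = plan_cume(date(2024, 1, 1))
    print(plan["hiper_apply"])  # 2024-02-05
    print(plan["cume_apply"])   # 2024-02-12
```

Passing `wait_weeks=8` gives the later end of the suggested window.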

The plan should include enough time to deal with problems when the cume applies. If an SRC appears, call hardware support. With two previous incidents, ‘confused’ microcode is not going to be an acceptable answer. The only acceptable resolution is a method of clearing it up.

Or replacing your disk controller (or whatever is sending the intermittent error report).

Tom
