Graybeard52
2405 pts. | Jul 7 2009 11:52AM GMT
Whether the system restarts after power loss is not controlled by 01BN, but by a system value. Open Navigator and check the power settings. You certainly could have a power supply issue; it might also be an overheating problem. You don’t get messages in the log if the system just crashes. Almost any other cause would have something logged. I would go with both suggestions already made: (1) get a UPS, even if it’s only good for a few minutes. It’s an absolute requirement for any AS/400. And (2) use GO PROBLEM and see what you can find.
Djac
685 pts. | Jul 7 2009 1:30PM GMT
Any chance of it being a thermal shutdown? Perhaps the building management have been having work done on the air con while people are out of the office? Just a thought.
Hafwhit
630 pts. | Jul 7 2009 2:09PM GMT
What model is your system? If it is an older system then you may have a power supply problem. Check the problem log and see if anything is listed. If you do not have a UPS connected then you may be having a power issue. You may need to look at the Product Activity Log of System Service Tools and check the Power Subsystem Log.
Good luck and let us know what you find out.
Florenjm
70 pts. | Jul 7 2009 6:52PM GMT
Just to reinforce the colleagues’ answers, I would emphasize the following:
1: Always use a UPS (an AS/400-certified one, for that matter). I use APC (only the Smart UPS type) and they work great. Make sure it is connected (by serial port) to the AS/400 for orderly shutdown and restart control.
2: Ensure good round-the-clock ventilation, preferably Air Conditioning.
3: Ensure external power is round the clock, and direct from breakers. And,
4: Ensure the System Values are appropriate; as well as the Power on/off Schedule.
After these, if problems persist, then you may have a hardware problem (or perhaps some program); then proceed with the other suggestions …
Mcl
2500 pts. | Jul 7 2009 9:37PM GMT
Not an answer to your question - but a related story I heard somewhere. Just can’t remember where.
Seems that someone had a similar problem where a system was shutting down mysteriously. Nothing in the logs - and always when the office staff was out.
Well… The cleaning folks would come in and unplug the system (not knowing what it was) so they could plug in their floor cleaner. Of course, when they were done they put all the plugs back into the right places…
Regards
Mike
ScottCoffey
10 pts. | Jul 8 2009 4:58PM GMT
I’ve seen a system power *on* at unexpected times due to a bad CMOS battery. I suppose it could power off for the same reason.
Lovemyi
1455 pts. | Jul 8 2009 6:03PM GMT
You might also want to check the authority of the PWRDWNSYS command as someone may have changed it from being secured by the QSYSOPR and QSECOFR IDs. Also anyone with *ALLOBJ authority can run the command. So I would check the log to see if someone else may have issued the command with the default status *CNTRLD which will end the system in 3600 seconds (60 minutes) by default.
Gmil494
270 pts. | Jul 9 2009 12:44AM GMT
All,
The troubleshooting and answers are much appreciated. An update -
AS/400 9406-170-2290 model, about 85 GB DASD with 832MB memory. No UPS. I considered the APC 1500VA, but was not sure it would work on the AS/400. I have an IBM 4247-001 printer on the 4-port twinax interface. There are 2 unused ports, one of which I believe is for a UPS. Presently there is no UPS to provide enough power for an orderly shutdown (I notice there is a Q-system file for that value also).
Background: There may have been a storm in the area about 6/25/2009. There is, or should be, a CMOS battery, but where it is in the box and how to access or change it I’m not sure. I think the CMOS battery is good, but I can’t validate that (is there no way to test it, as in a notebook computer?).
There is an IBM 4247-001 printer (the system printer; I tried to configure it as PRT01 with the umpteen parameters, and I don’t have the required parameters correct) and an IBM InfoWindow II 3487 as the hardwired console/terminal (2 of 4 ports used on the twinax interface). The 3487 had earlier been popping (in a home at the time) when I would IPL the machine. Since the machine was moved to an office I have not experienced this problem. (The office is not a raised-floor data center; I used to be a computer operator on an IBM 3090-600J in 1989 in a large data center.) However (background info again), I thought the hardwired console might have been causing the shutdowns, so I unplugged the 3487 from electrical power but left it connected, unpowered, to the twinax controller/interface. It seemed to run without the shutdown “syndrome.” Then one day I came into the office and it was silent again. Okay, so maybe the 3487 hardwired console was not the cause. I have now had it operational with no unscheduled shutdowns (knock on wood) since I changed the QSTGLOWACN system value, which I had configured as *PWRDWNSYS, to *MSG. So far it has been running since, but there is no guarantee that was the problem.
There are no entries in the scheduled shutdown screen (the list of IPL/shutdown times); the system is meant to run 24x7x365/366.
The office is air conditioned. I have an atomic/satellite clock with a built-in thermometer, and the temperature shown is usually between 66 and 70 degrees. The sun rises in the morning and comes in the only window in the small office the AS/400 is in. That is the only time the room heats up, so I try to keep the temperature lower (by changing the thermostat) in the morning; the rest of the time the office temperature stays relatively steady, and blinds keep direct sunlight from coming into the office through the window.
The machine is a development and production system, but most of the time there are only one or two people on the system, so it is not being overworked (probably underworked if anything) and is probably idling about 75% of the time.
All ideas and suggestions in your responses are appreciated.
Mcl
2500 pts. | Jul 9 2009 4:25PM GMT
UPS port - it is a DB-9 connector on the back of the 170. Your UPS will need to have a “dry” switch contact output for the interface to the 170. All it does is tell OS/400 that you are on battery power. What you do from there is based on how your system is set up. I believe the recommendation is to set up a message queue and a program to monitor that message queue. You set the QUPSMSGQ system value to tell the OS about the message queue. Your monitoring program has to determine how long the power has been out and do an orderly power down at the appropriate time. The time is based on the capacity of the UPS and the load on the UPS. IBM has some good suggestions on the programming - I just don’t know the location offhand.
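For what it’s worth, the timing decision such a monitoring program has to make can be sketched roughly as below. This is Python purely for illustration; on OS/400 the real thing would be a program reading the message queue named in QUPSMSGQ. The function and parameter names (should_power_down, ups_runtime_s, shutdown_time_s) are my own, not IBM’s, and the runtime figure is an assumption you would have to take from your UPS spec and actual load.

```python
from datetime import datetime
from typing import Optional


def should_power_down(power_failed_at: Optional[datetime],
                      now: datetime,
                      ups_runtime_s: int,
                      shutdown_time_s: int) -> bool:
    """Decide whether an orderly power down should start now.

    power_failed_at: when utility power was lost, or None if power is present.
    ups_runtime_s:   assumed battery runtime at the current load (hypothetical value).
    shutdown_time_s: time reserved for the power down itself (hypothetical value).
    """
    if power_failed_at is None:
        return False  # on utility power, nothing to do
    # Ride out the outage until only the shutdown reserve is left on the battery.
    wait_limit_s = max(0, ups_runtime_s - shutdown_time_s)
    outage_s = (now - power_failed_at).total_seconds()
    return outage_s >= wait_limit_s
```

With a 15 minute runtime and 10 minutes reserved for the power down, this rides out outages shorter than 5 minutes and starts the shutdown after that.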
CMOS battery - well, you’ll have to research that one.
The 3487 had been “popping”?? Can you be more specific?
3487 console should be device zero on port zero of the twinax interface. I believe it should normally be powered on at all times. I’ve seen issues where the console was not powered on, resulting in a hardware-type SRC code displayed on the front panel.
Only other thing I can tell you is to make sure the 170 has its own dedicated power circuit - NOTHING else plugged into the same outlet or on the same circuit breaker. That will eliminate any issues with possible surges from something else on the same circuit - but of course it does not eliminate surges on the supply to your breaker panel.
If the problem persists you may need to invest in something to monitor the power that will record any outages/surges or brownouts.
Gmil494
270 pts. | Jul 9 2009 8:55PM GMT
The 3487 InfoWindow II made a snapping/popping noise some time ago, and hasn’t done so recently. A person who heard it said the CRT/monitor is probably nearing the end of its service life and may require replacement.
Appreciate the info on the UPS and connection DB-9 on back of AS/400. UPS will soon be purchased.
The AS/400 ran well for about 2 days and shut down again this morning (per the QHST log) at about 0130 hrs. This time I noticed the message “Previous ending abnormal, reason code 9.” I have yet to find in the documentation exactly what that code signifies.
I’ve moved the power cords to another electrical outlet and will see if that resolves the problem. As far as I know this is a multi-office building, but I believe each office is on a separate breaker. Unfortunately the thermostat for air conditioning/heating controls 3 separate offices, but I have a key so I can adjust it as needed. Usually on a sunny day the office gets warm in the morning until about 10-11am; the rest of the day the sun is on a different part of the building, and the air conditioning has to be adjusted to lessen the cooling.
Gmil494
270 pts. | Jul 9 2009 11:10PM GMT
And also related to the 3487 terminal (console): within the last few days to a week, the left shift key on the keyboard no longer generates the appropriate upper case character (I noticed this when trying to use the WRKUSRJOB *ALL command). The right shift key still works correctly, though.
I’m not sure if the keyboard or the logic unit is causing the failure (I don’t have an extra keyboard to swap in and test further).
MrObvious
140 pts. | Jul 10 2009 7:53PM GMT
You need one of these.
http://stlouis.craigslist.org/sys/1234080925.html
Mcl
2500 pts. | Jul 10 2009 10:34PM GMT
From what I remember, my 170 plugged into a 110V AC power source - single phase.
That UPS in the Craigslist ad is for a three-phase power source. That would likely require some electrical work. Add to that, the batteries are five years old. Six years is about end of life for most UPS batteries to be considered reliable. Yeah, you can stretch them out longer, but not for that price.
There are smaller UPS units that are rated for that system’s requirements.
Got a digital clock radio? Or something with a digital clock that blinks on and off when it loses power? Plug it into the same outlet as the 170. The next time the 170 goes down, see if the clock is blinking.
On the power down with reason code 9 - I would do a WRKPRB and see if anything was there.
If this machine is under service with IBM, they can probably tell you more.
Oh, that 3487 popping noise - it didn’t get dropped when it was moved, did it? That sounds typically like high-voltage discharge from the CRT circuitry - could be age related - the things stay on forever and get warm. Find another one (a 3196 would work) - you may need it.
Turn the keyboard over and shake out the aardvark clippings and other junk that might have accumulated under the keys. Well, do that before you go looking for another keyboard!
Regards
Mike
Gmil494
270 pts. | Jul 11 2009 3:55AM GMT
I think I saw an APC UPS at Office Depot (an APC 1500VA, the largest they stocked, which I would think would be sufficient for a 170). If it is equipped with a DB-9 connector (I’ll have to check that out), it might do the job without adding a 350 lb load to the 2nd-floor office floor.
A capacitor (IIRC) could be going out on the CRT. I might try flipping the keyboard over and see what happens with that. It worked up until the last few weeks. I actually bumped the keyboard drawer (on the computer desk) and the keyboard fell to the floor. I’m not sure that did it, though. The keyboard and AS/400 had performed flawlessly until about 6/25/2009. The popping/snapping may be a capacitor (CRTs have them and they quit, I guess); it happened last year when the machine was installed in another location. It could be the trucking company (motor carrier) that delivered it: I was not there when it was delivered, but was told the driver got it to the back door of the semi and that was it. It was well packaged, yet the back portion (plastic case) of the AS/400 had one small cracked area. The CRT was visibly undamaged, as was the rest of the equipment. But also, when CRTs get old, I guess capacitors get old and fail. I can get a 3488 logic unit (and keyboard if necessary), add a new flat panel monitor to the 3488, and still have a good console that uses less energy. I may invest in that soon.
I have a clock, but (unfortunately) it is a battery-powered Skyscan atomic/satellite clock, so it won’t work for this test. But I have a radio and some other equipment I might be able to plug into the outlet to see what happens. Fortunately, I have a number of electrical outlets in the office. (It is on the 2nd floor and air conditioned, with air filters changed once every three months. It is not a raised-floor data center or a very controlled, clean-room type environment, but it is still good as far as offices generally go.) The two power cords on the AS/400 are long enough, and with 6 or 7 different electrical outlets in the office, I have plugged the machine into a different outlet, and so far, so good, no problems. I originally thought that if the 3487 was shorting out (it made the snapping/popping noise when I turned it on and off in the previous location; I have left it on since relocating to the office), it might have been causing a problem as the hardwired terminal/console, so I isolated it to troubleshoot by process of elimination. I have left it connected but powered off. I may plug the 3487 console into a different outlet and turn it on again to see what happens, but I’ll wait and see how long the AS/400 runs before powering up the 3487 again. Presently I have a notebook computer connected via ethernet/TCP/IP (IP 192.168.0.108) to the AS/400, not as a hardwired console, but I can still sign on. I have not limited QSECOFR or other user IDs to signing on only at the console, for the same reason I didn’t set the IPL on power restoration in the respective Q config file: in case of storm/water/fire damage. If there were a failure in the roof, etc., water and electricity don’t mix (not well, anyway), and I don’t want an auto IPL after power restoration until I have physically determined that it is safe to IPL the machine, after inspecting the reasons for the shutdown in the first place.
Florenjm
70 pts. | Jul 13 2009 3:55PM GMT
I own an AS/400 9406-170-2291, pretty much fully loaded (large tower with 2 power supplies), and am using an APC Smart UPS Model 1400. I connect only the AS/400 box and the InfoWindow II display to it. It has power to spare. Do not connect anything else to it!
There are APC UPS models (referred to as XL) with an extended runtime option that allows an extra battery module to be attached. This is an additional tower or box containing just batteries, connected to the main unit by a heavy cable to a special connector in the back.
The DB9 cable for the signal connection from server to UPS can be purchased on eBay, as can the UPS. This is what I did and I have no complaints. You may need to get your own batteries or battery module, which I have also bought through eBay (same with the server), in case the UPS is sold without batteries or with old batteries.
The reason for insisting on the APC Smart UPS is that its design is much better: the AC it produces is the closest to a sine wave, like that produced by the local power utility, and the transfer is smoother.
Of course, when buying on eBay, always check the feedback.
You do not need to buy a brand new UPS; they are very high quality and last for years, same as the 170.
Jose
Mid11
65 pts. | Jul 14 2009 3:12PM GMT
Message CPI091D is generated during an abnormal IPL (after an abnormal shutdown).
Note for CPI091D: “If more than one reason code applies, the reason with the highest number is reported.”
CPI091D Previous ending abnormal, reason code &1.
Technical description: The reason code for the previous system ending was 9
1 - The system ended a job abnormally.
2 - The Power Down System (PWRDWNSYS) command did not complete.
3 - The system ended unexpectedly.
4 - The system ended before the previous IPL completed.
5 - The system ended before database recovery completed.
6 - The system either ended with a reference code, ended while on auxiliary power, or was not ended with the PWRDWNSYS command.
7 - PWRDWNSYS did not complete within the time limit specified by system value QPWRDWNLMT.
8 - Some data could not be written to auxiliary storage while the system was ending.
9 - Some data could not be written to auxiliary storage since the previous IPL.
If more than one reason code applies, the reason with the highest number is reported.
This could be due to application data in memory not getting written to the database during a power failure.
You could have data loss.
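If you ever want to decode these from a script, the list above drops straight into a lookup table. A minimal sketch in Python; the descriptions are copied from the CPI091D text, while the helper names (describe_reason, reported_code) are just mine for illustration:

```python
# Reason codes for CPI091D, copied from the message text quoted above.
CPI091D_REASONS = {
    1: "The system ended a job abnormally.",
    2: "The Power Down System (PWRDWNSYS) command did not complete.",
    3: "The system ended unexpectedly.",
    4: "The system ended before the previous IPL completed.",
    5: "The system ended before database recovery completed.",
    6: "The system either ended with a reference code, ended while on "
       "auxiliary power, or was not ended with the PWRDWNSYS command.",
    7: "PWRDWNSYS did not complete within the time limit specified by "
       "system value QPWRDWNLMT.",
    8: "Some data could not be written to auxiliary storage while the "
       "system was ending.",
    9: "Some data could not be written to auxiliary storage since the "
       "previous IPL.",
}


def describe_reason(code: int) -> str:
    """Return the CPI091D explanation for a reason code (hypothetical helper)."""
    return CPI091D_REASONS.get(code, "Unknown reason code")


def reported_code(applicable_codes) -> int:
    """If more than one reason applies, the highest number is the one reported."""
    return max(applicable_codes)
```

Keep in mind what the highest-number rule implies: a reported 9 does not rule out that a lower-numbered reason (3 or 6, say) also applied.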
Look for message…
CPF3124
Message . . . . : File <filename> in <library> in use at abnormal system end.
Cause . . . . . : Member <member> file <filename> in library <library> was in use when the system
abnormally ended. <recordnumber> is the last relative record number.
For power failures while on a UPS you should see in the log…
CPF1816 System utility power failed at <datetime>.
CPF1817 System utility power restored at <datetime>.
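Once the UPS is connected and those two messages start showing up, pairing them tells you how long each outage lasted. A rough sketch in Python, assuming you have the log text exported as plain lines; the timestamp layout (TS_FORMAT) is purely my assumption, since the real datetime text depends on your system’s date and time format, and the function name outages is made up for this example:

```python
import re
from datetime import datetime

# Assumed timestamp layout (hypothetical); adjust to match your system's format.
TS_FORMAT = "%m/%d/%y %H:%M:%S"

# Matches the CPF1816 / CPF1817 message texts quoted above.
PATTERN = re.compile(
    r"(CPF1816|CPF1817) System utility power (?:failed|restored) at (.+?)\.?$"
)


def outages(lines):
    """Pair each CPF1816 (failed) with the next CPF1817 (restored).

    Returns a list of (failed_at, duration) tuples. A failure with no
    matching restore is left out of the result.
    """
    result = []
    failed_at = None
    for line in lines:
        match = PATTERN.search(line.strip())
        if not match:
            continue  # some other message, ignore it
        when = datetime.strptime(match.group(2), TS_FORMAT)
        if match.group(1) == "CPF1816":
            failed_at = when
        elif failed_at is not None:
            result.append((failed_at, when - failed_at))
            failed_at = None
    return result
```

A CPF1816 with no CPF1817 after it is worth a separate look, since it may mean the system went down while still on battery.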
Gmil494
270 pts. | Jul 18 2009 3:27PM GMT
I appreciate the professional help (you can tell the pros from the wannabes!!). I have had uninterrupted operations (what I was used to prior to this glitch…) since buying and plugging into the UPS (knocking on wood as I write this… okay… someday I’ll be an AS/400 pro!). So far the system is running in line with my previous experience and the AS/400’s reputation for reliability! Much appreciated, all who contributed (the real pros)!






