Hanging Transactions in v10
This is just a shot in the dark... PP v10 is crashing mid-afternoon with 10-15 transactions using ~10% CPU. Earlier in the day one transaction may reach 80%. The database is not crashing. Any ideas? 10.0.4/Unix HP
By Anonymous | The Rough Guide
Not much to hang onto
Hi, Wow. Is this recent, or how long has it been occurring? Does the appearance of the problem correspond with any recent customizations, Ventyx hotfixes, database or operating system changes? Do you consider your site more or less vanilla? What is the possibility that a more obscure PassPort product is the culprit? Are there any data loading jobs going on around the time of the runaway transactions?
Is Portal/J running on Apache/Tomcat? Is there anything interesting in the catalina.out log? Any chance you can get Unix "ps -elf" output for the runaway transaction? The goal is to see the tranID. If you can get "ps -elf" output, can you also get "ptree PID" output?
Here's my guess. Data is being loaded into PassPort (as opposed to being entered by a user using the panels). The data is not quite right, and when it's being accessed a .ccp panel program is looping. If the loop contains an SQL statement, either from the .ccp or a .csm subroutine, you'll be able to track it down via a database monitor. But this doesn't sound like what you have. You use the word "crash", so it sounds like a tight loop in a .ccp without SQL.
Do particular users complain that PassPort hangs? If so, track down what business entity they were working on and how it originated or has been updated. The problem with my guess is that PassPort is based on IBM CICS thinking, and it would be a strange situation where a .ccp looped without user interaction.
Do you debug in production using Animator? I have seen runaway processes with Animator, but I can't think of how to do it at this moment. Is your PassPort application server running production only? Is your Portal/J server running production only? Are there other applications that communicate in near real time with PassPort? Is FI Connect for API messaging in use? Are there any indications that tigfisub.csm (not sure of the program name; it's the program that allows outside applications to invoke a PassPort tranID) is involved?
Things to try to eliminate a database issue:
1. A really good clue would be if you are able to monitor the database when the problem is occurring and see if a lot of database hits are occurring and what they are. A lot of database hits would indicate program looping that contains SQL.
2. If in the database monitor a SQL statement executes for a really long time, it could be an index issue or a database statistics problem. I realize this wouldn't account for a PassPort application server process clocking CPU.
You have me curious.
By webmaster
Not much to hang onto - reply
I posted a rather lengthy reply but apparently it never posted. Anyway, the hanging transactions have, for the most part, subsided. I was thinking some of the users were doing things wrong, some printers were not defined, etc. I know, not much to go on, but things have calmed down. We have been adjusting the memory allocations in the PROD.INI file and are considering bouncing the server on a nightly basis. It may also be a memory leak, but someone else is pursuing that.
By Anonymous
A little more info
I did an experiment of putting a loop with a sleep into a panel's .ccp program. The sleep was done by waiting for keyboard input from a bogus location - see www.codecomments.com/archive266-2004-9-276676.html (ACCEPT SOME-NUM-FIELD AT 2479 WITH SIZE 1 AUTO TIMEOUT SLEEP-TIME). The sleep was implemented as a .csm program without CICS, and the .ccp called it with COBOL CALL "CALL JUNK". This resulted in the PassPort panel "hanging" for the user.
I was able to find the Unix process with ps -elf, but the output did not contain much more useful information. The process showed up as "tigrts TIGLSTN region". My idea of the Tomcat catalina.out log doesn't look promising, as Portal/J writes very little information there. The Portal/J servlet.log and pjserv.log might be useful.
The process that hung because of the sleep did not consume CPU. It appears you have a .ccp or .csm that is in a tight code loop consuming CPU without any SQL in the loop. I just can't think of how this situation would come about.
By webmaster
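To tell a process that is merely waiting (like the sleep experiment) from one that is spinning, it may help to filter the ps listing by the CPU column. The following is only a sketch: busy_tigrts is a made-up helper name, the default threshold of 5 is arbitrary, and it assumes the usual "ps -ef" layout where field 4 (C) is the processor utilization figure. That figure is a rough scheduling indicator, not a true percentage.

```shell
# busy_tigrts [MIN_C]   (hypothetical helper, not part of PassPort)
# Reads "ps -ef" output on stdin and prints the lines for tigrts
# processes whose C column (field 4) is at or above MIN_C.
# The [t] in the pattern keeps this awk command from matching itself
# if it shows up in the process listing.
busy_tigrts() {
  awk -v min="${1:-5}" '/[t]igrts/ && ($4 + 0) >= (min + 0) { print }'
}
```

Typical use would be "ps -ef | busy_tigrts 10". Running it a few times a minute apart and watching whether the same PID keeps appearing with a growing TIME value would point at a tight loop rather than a blocked panel.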
Batch job being launched from online?
When transactions are "hanging" it's possible that "tigrts TIBSHELL" processes are running. To look for them, use the Unix command:
    ps -ef | egrep "TIBSHELL|userid"
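If this check gets run often, the same search could be wrapped in a small function. This is just a sketch: tibshell_procs is a made-up name, and unlike the egrep above it matches the user ID only against the first column of the ps output (the UID column) rather than anywhere in the line.

```shell
# tibshell_procs [USERID]   (hypothetical helper, not part of PassPort)
# Reads "ps -ef" output on stdin and prints lines that mention TIBSHELL,
# plus any process owned by USERID when one is given.
# The [T] in the pattern keeps this awk command from matching itself.
tibshell_procs() {
  awk -v u="$1" '/[T]IBSHELL/ || (u != "" && $1 == u) { print }'
}
```

Typical use would be "ps -ef | tibshell_procs jsmith", where jsmith stands in for the ID of the user reporting the hang; leave the argument off to list only the TIBSHELL processes.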
"tigrts TIBSHELL" indicates batch jobs launched from online PassPort panels. These might indicate a problem with MicroFocus COBOL licenses.