Using IPCS to diagnose abends
This appendix describes certain abend types and how to analyze them. Following are the procedures to analyze the dumps:
First symptoms
Messages indicate a system or user abend. For example, message IEA995I has been issued to the operator console. A dump was produced. An error was recorded in the logrec data set.
Information and tools needed for analysis
 – IPCS installed
 – SVC dump, SYSUDUMP, SYSMDUMP, or SYSABEND dump
 – Logrec error record
 – Master trace
 – Job log
Types of dumps to be analyzed:
 – ABEND0C1: PSW, REGS, and some basics
 – ABEND0C4
 – High CPU
 – Enqueue Contention
 – Wait
 – Storage
 – LE
B.1 Lab exercises
There are xxxx dumps that you can work on. You do not need to go through each sequentially. An index to the dumps follows:
An “Introduction to IPCS tools” dump.
An ABEND0C1: PSW, REGS, and some basics
Lab setup instructions
At the IPCS primary options panel, shown in Figure B-1, choose Option 0 for defaults.
------------------- z/OS 01.08.00 IPCS PRIMARY OPTION MENU -------------------
OPTION ===>
********************
0 DEFAULTS - Specify default dump and options * USERID - ROGERS
1 BROWSE - Browse dump data set * DATE - 07/02/05
2 ANALYSIS - Analyze dump contents * JULIAN - 07.036
3 UTILITY - Perform utility functions * TIME - 11:52
4 INVENTORY - Inventory of problem data * PREFIX - ROGERS
5 SUBMIT - Submit problem analysis job to batch * TERMINAL- 3278T
6 COMMAND - Enter subcommand, CLIST or REXX exec * PF KEYS - 24
T TUTORIAL - Learn how to use the IPCS dialog ********************
X EXIT - Terminate using log and list defaults
Enter END command to terminate IPCS dialog
Figure B-1 IPCS PRIMARY OPTION MENU
Lab exercise #1:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.AB047.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose a SLIP dump.
When selecting Option 0, Figure B-2 on page 339 is displayed. Add the dump data set name to the Source field to initialize the dump, as follows:
Source ==> DSNAME('ITSO.ABCVOL8.AB047')
------------------------- IPCS Default Values -------------------------------
Command ===>
You may change any of the defaults listed below. The defaults shown before
any changes are LOCAL. Change scope to GLOBAL to display global defaults.
Scope ==> LOCAL (LOCAL, GLOBAL, or BOTH)
If you change the Source default, IPCS will display the current default
Address Space for the new source and will ignore any data entered in
the Address Space field.
  Source ==> DSNAME('ITSO.ABCVOL8.AB047')
Address Space ==>
Message Routing ==> NOPRINT TERMINAL
Message Control ==> CONFIRM VERIFY FLAG(WARNING)
Display Content ==> NOMACHINE REMARK REQUEST NOSTORAGE SYMBOL
Press ENTER to update defaults.
Use the END command to exit without an update.
Figure B-2 Default panel after selecting Option 0
Use IPCS commands
The IP ST REGS command shows the register contents at the time of the dump, as follows:
For a SLIP dump: the registers at the time the SLIP trap matched.
For a console dump: typically all zeros.
For an abend dump: in theory, the registers at the time of the abend.
For a stand-alone dump: use IP CPU REGS to get the registers from each CPU.
The IP ST FAILDATA command formats the SDWA if it is present. It generally gives a better overall picture, but the SDWA is not always present, and its contents can differ from the ST REGS output because of recovery actions.
Information from the IP ST REGS command
If the calling program is in AR mode, all addresses that it passes, whether they are in a GPR or in a parameter list, must be ALET-qualified. A parameter list can be in an address space other than the calling program's primary address space or in a data space, but it cannot be in the calling program's secondary address space.
What does an AR contain? An AR contains a token, an access list entry token (ALET). An ALET is an index to an entry on the access list. An access list is a table of entries, each of which points to an address space, data space, or hiperspace to which a program has access.
The following questions can all be answered by using the IP ST REGS command.
Questions
1. What dump is it? Console or slip dump_____
2. What abend does this dump show?______
3. Was this dump in AR mode at the time of the failure? _______________
4. What was the failing PSW address? _________
5. What ASID is this failing code executing in? _________
6. What was the failing TCB address? ________
7. What is the value in R14? ________
8. Where does register 14 point to? When you browse the address, remember to strip the high-order bit, which indicates the addressing mode. __________
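The PSW-related questions above come down to picking a few bit fields out of two 32-bit words. The following Python sketch illustrates this, assuming the 8-byte ESA/390-style PSW layout (key in bits 8-11, problem state in bit 15, address space control in bits 16-17, addressing mode in bit 32). The function name decode_psw is hypothetical, and the sample value is the OPSW that appears later in Figure B-4.

```python
def decode_psw(word1: int, word2: int) -> dict:
    """Decode an 8-byte ESA/390-style PSW given as two 32-bit words.

    Assumed layout: bit 0 is the leftmost bit of word1.
    """
    asc_modes = {
        0b00: "primary",
        0b01: "access register",
        0b10: "secondary",
        0b11: "home",
    }
    return {
        "key": (word1 >> 20) & 0xF,                   # PSW bits 8-11
        "problem_state": bool((word1 >> 16) & 1),     # PSW bit 15
        "asc_mode": asc_modes[(word1 >> 14) & 0b11],  # PSW bits 16-17
        "amode": 31 if word2 & 0x80000000 else 24,    # PSW bit 32
        "address": word2 & 0x7FFFFFFF,                # strip the addressing-mode bit
    }


psw = decode_psw(0x078D0000, 0xA4B000C2)
print(f"address={psw['address']:08X} asc={psw['asc_mode']} amode={psw['amode']}")
```

The same mask (X'7FFFFFFF') applies to a register such as R14 when it holds a 31-bit return address with the addressing-mode bit set.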
IP SYSTRACE
Use this command to determine what else was happening on the system at the time of the dump.
Options to use:
IP SYSTRACE ALL - formats all ASIDs.
IP SYSTRACE TIME(LOCAL) - converts the time to local time (readable).
IP SYSTRACE ASID(X'nn') - formats only trace records associated with the requested ASID.
Things to look for in the SYSTRACE:
If a WAIT entry is found in SYSTRACE, the system is not running at 100% CPU.
EXT 1005 entries for the same ASID may indicate a loop.
SYSTRACE records only traceable events, such as SVCs and PCs.
 
Note: See Chapter 8 in z/OS MVS Diagnosis: Tools and Service Aids, SY28-1085 for examples and details of SYSTRACE. See “SYSTRACE definitions” on page 315 for sample output.
Questions
1. Looking at the system trace for ASID X'20', at what time do we get the first *RCVY ABT entry? Use IP SYSTRACE TIME(LOCAL) ASID(X'20'). _______________
2. What was the purpose of the preceding SVC? _______________
3. Do we call FRR service?____
4. What does ABEND047 mean? ____________
5. Could we recover from the error? _______
Some Key Fields in IP SUMM FORMAT
The IP SUMM FORMAT ASID(X'nn') command formats a large amount of data about the specified address space. In this lab we are interested in the following fields:
RBOPSW - (contained in the RB under the TCB of interest) - Can be found by going to the bottom and issuing F 'TCB: 00nnnnnn' PREV, then F ACTIVE to find the most recently active RB. This field shows the last running PSW address at the time the dump was taken, or the address at which the TCB entered a wait.
WLIC - (found in the same manner as RBOPSW above) shows the last interrupt that occurred on a given RB.
GPR values - Show the register values at the time of the interrupt in the *previous* RB. That means that the RB with the WLIC value stores its registers in the next RB, or in the TCB if there is not a following RB.
TCB summary at the very bottom of the output - Contains a CMP field that shows the last completion code issued for a TCB.
Figure B-3 shows an example of a TCB summary where the last TCB shows a completion code of ABEND 047.
JOB PHILGER1 ASID 0020 ASCB 00FCAE80 FWDP 00FCAB80 BWDP 00FB8000 PAGE
00000006
TCB AT CMP NTC OTC LTC TCB BACK PAGE
007FE040 00000000 00000000 00000000 007FF890 007FD0C0 00000000 00000056
007FD0C0 00000000 00000000 007FE040 00000000 007FF890 007FE040 00000062
007FF890 00000000 007FD0C0 007FE040 007FF130 007FF130 007FD0C0 00000067
007FF130 04822000 00000000 007FF890 007FF3A0 007FF3A0 007FF890 00000074
007FF3A0 80047000 00000000 007FF130 00000000 00000000 007FF130 00000084
Figure B-3 TCB summary
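The CMP values in Figure B-3 can be split into their parts mechanically. The following Python sketch assumes the usual completion code layout: a flag byte, followed by a 12-bit system completion code and a 12-bit user completion code. The function name decode_cmp is hypothetical.

```python
def decode_cmp(cmp_word: int) -> dict:
    """Split a 4-byte TCB completion code word into its parts.

    Assumed layout: flag byte, 12-bit system code, 12-bit user code.
    """
    return {
        "flags": (cmp_word >> 24) & 0xFF,
        "system_code": f"{(cmp_word >> 12) & 0xFFF:03X}",  # shown in hex, as in Sxxx
        "user_code": cmp_word & 0xFFF,                     # shown in decimal, as in Unnnn
    }


# CMP values of the last two TCBs in Figure B-3
for cmp_word in (0x04822000, 0x80047000):
    print(f"{cmp_word:08X} ->", decode_cmp(cmp_word))
```

For the last TCB in the figure, the system code field is 047, which matches the ABEND047 discussed in the text.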
Figure B-4 shows the result of issuing the BOTTOM command followed by the F 'ACTIVE' previous command to locate the TOP RB of the Last Task in the address space. Note that this task is issuing an SVC 6B.
 
Note: The WLIC field shows 0002006B, which means the last SVC this task issued was SVC 6B (MODESET).
PRB: 007FF020
-0020 XSB...... 7FFFDD60 FLAGS2... 00 RTPSW1... 078D0000
-0014 A4B000C2 RTPSW2... 0002006B 00000000
-0008 FLAGS1... 00000000 WLIC..... 0002006B
+0000 RSV...... 00000000 00000000 SZSTAB... 00110082
+000C CDE...... 007FE000 OPSW..... 078D0000 A4B000C2
+0018 SQE...... 00000000 LINK..... 007FF3A0
+0020 GPR0-3... FD000008 00006000 00000040 007D19D4
+0030 GPR4-7... 007D19B0 007FF130 007BAFC8 FD000000
+0040 GPR8-11.. 007FCAC8 007CF8F0 00000000 007FF130
+0050 GPR12-15. 8759F022 00006008 007FCB14 007FCAF8
Figure B-4 PRB layout
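The WLIC value in Figure B-4 can be taken apart in the same way. The following Python sketch assumes that the second byte of WLIC carries the instruction length and the low-order halfword carries the interrupt code, which for an SVC interrupt is the SVC number. The function name decode_wlic and the small SVC name table are illustrative only; the two names in the table are the ones mentioned in this appendix.

```python
# SVC names taken from the text of this appendix; not a complete table.
SVC_NAMES = {0x06: "LINK", 0x6B: "MODESET"}


def decode_wlic(wlic: int) -> dict:
    """Split a 4-byte WLIC value (assumed layout, see the lead-in)."""
    code = wlic & 0xFFFF
    return {
        "instruction_length": (wlic >> 16) & 0xFF,
        "interrupt_code": code,
        "svc_name": SVC_NAMES.get(code, "unknown"),
    }


print(decode_wlic(0x0002006B))
```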
SUMMARY FORMAT exercises
Questions
1. Use IP SUMM FORMAT ASID(X'20') followed by the BOTTOM command. Looking at the TCB summary, what are the TCBs ending with a non-zero completion code: ________________
2. Could we recover from the errors? ______________________
3. What shows that we could not recover from ABEND047? ___________________
4. Use F 'TCB: 00' PREV command to find the TCB that took the ABEND047 then issue F 'ACTIVE' to find the top RB.
 – From that RB what are the values of OPSW _______________________
 – And the WLIC value _______________
 – What does the WLIC field tell you? _____________
5. Where does the EP point to? __________________________
6. What is the start area pointed to by ENTPT?____________________
7. Does the task run secure? ________
8. Where do we find the registers at time of OPSW shown in PRB: 007FF020? ________
Diagnosing an ABEND0C1 dump
The exercises on the following pages are designed to demonstrate how to diagnose an ABEND0C1. An ABEND0C1 is an attempt by the processor to execute an instruction that is not valid or not coded correctly.
Typically the abend will occur when a program executes a bad branch. Thus, often the PSW where the abend occurs is less important than where the last valid instruction was executed. There are a couple of ways to determine that.
Find a base register. Many programs use a base register to establish addressability. This may be one or more registers but typically R12 is chosen. Thus looking at R12 may point to code that was last in control.
Find the source of the branch. By convention often the BALR 14,15 instruction is used to get from one program to another. If this is the case, R14 will point to the source of the call.
Look at the TCB/RBs of the abending task. In some cases the previous RB can give a clue as to what program was to get control next. For instance, perhaps the previous RB has a WLIC of 00010006 which would be a LINK SVC and will enable you to look at the parmlist for the link to find the information about what program got control as a result.
Examine SYSTRACE for the ASID/TCB that abended. Perhaps there was a traceable event that occurred prior to the abend that will give you a clue as to what program was in control leading up to the abend.
Use any details you get from the above to search problem databases for a known fix for a vendor problem or to feed back to the programmer for a customer-written program.
Lab exercise #2:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.AB0C1.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose an ABEND0C1 ABEND dump.
Questions
1. Determine what this dump is all about. Issue the IP LIST TITLE command. ______
2. Using the IP SYSTRACE ALL command and issuing F '*PGM', at what PSW address was the PGM 001 (a.k.a. ABEND0C1) taken? _______________________
3. Fill in the abend code in the *RCVY entry below based on the *RCVY entry that immediately follows the *PGM 001:
 – *RCVY PROG 94 __ __ __ 000 (fill in the 3 missing characters)
4. Use the IP ST REGS command to get the relevant information about the abend 0C1. Record the following:
 – PSW _______________________________________
 – R14 ________________________________________
 – Primary ASID (PASN) __________________________
 – Abending JOBNAME _______________________
 – Failing TCB address _______________________
5. Use the =1 command to get into IPCS browse:
 – Browse the PSW address. What 'instruction' does the PSW point to? ________________
6. Often, branches are accomplished with BALR 14,15, making R14 point to the caller. Check R14 in this dump and see what instruction register 14 points to. Browse the address in R14. ______
7. Get the name of the module that issues the SVC 3 instruction. Use IP WHERE 00FDCA98. __________________________
8. Get the name of the module pointed to by the PSW at the time of the error. ______________
9. Because we know that the ABEND0C1 PSW points to the failing instruction, what does this area show? _________
10. What offset is it in the module? In our case the module starts with 90ECD00C. ______
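The offset calculation in question 10 can be sketched in Python as follows. The sketch assumes that a copy of the browsed storage is available as a byte string, and that the module start is marked by the 90ECD00C pattern (STM 14,12,12(13)) mentioned above. The function name, the base address, and the sample storage are all hypothetical; they are not values from the lab dump.

```python
ENTRY_PATTERN = bytes.fromhex("90ECD00C")  # STM 14,12,12(13)


def offset_in_module(storage: bytes, base: int, failing_addr: int) -> int:
    """Return the offset of failing_addr from the nearest preceding entry pattern.

    storage holds the bytes starting at address `base`.
    """
    end = failing_addr - base
    if not 0 <= end <= len(storage):
        raise ValueError("failing address is outside the captured storage")
    start = storage.rfind(ENTRY_PATTERN, 0, end)
    if start < 0:
        raise ValueError("entry pattern not found before the failing address")
    return end - start


# Hypothetical storage: 16 bytes of padding, then a module of 32 bytes
sample = bytes(16) + ENTRY_PATTERN + bytes(28)
print(hex(offset_in_module(sample, base=0x1000, failing_addr=0x1030)))
```

A module does not have to begin with this instruction, so treat the result as a hint and confirm it with IP WHERE.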
Diagnosing a USS ICH408I security violation
This exercise is designed to show how to diagnose an ICH408I-related abend. The dump we will look at was taken because a USS file access was denied. From the dump we will get the user's RACF definitions and the USS-related UID, which is defined in the OMVS segment for this user.
Lab exercise #3:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.SECURUSS.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose an ICH408I slip dump.
Questions:
1. Have a look at the IP VERBX MTRACE output and get the last ICH408I message.
2. Which user failed to access a file? ______________
3. What is the name of the file that the user tried to access? _________
4. According to the permission bits, the user was not allowed to work with this file. To get the user's RACF and OMVS security-related definitions, we need to check the ACEE. The ACEE is pointed to by the SENV field in the ASXB control block, which is pointed to by the ASCB. If a TCB has its own ACEE, the ACEE address is shown in the TCB's own SENV field. In this dump we have only one SENV, pointed to by the ASXB. Browse this address and get the user name and default group. See the RACF data areas manual for the ACEE layout. Be sure to look at the correct address space. ____________________
5. To find out which RACF groups the user ID is connected to, have a look at ACEE offset X'74'. Browse this address and you will get the group-related information. ______________
6. If a user requests a USS service, a USP (User Security Packet) is provided, which shows the user's permissions. Locate the following control blocks in the dump:
a. The ACEE points to the ACEX at offset X'98'. Browse the address and check the eye catcher. ___________
b. The ACEX points to the USP at offset X'48'. Browse the area pointed to by the address at offset X'48'. ___________
Answers to questions: See “Lab exercise #3 - Answers diagnosing ICH408I” on page 356. For the RACF control block layouts, see z/OS V1R13.0 Security Server RACF Data Areas, GA22-7680-13.
RACF provides trace facilities for collecting access violation information. Figure B-5 on page 345 shows how to trace ck_access using service number 6.
1. Add the following member to your SYS1.PROCLIB
//GTFRACF PROC MEMBER=GTFPRMUS
//BR14 EXEC PGM=IEFBR14,REGION=512K
//SYSPRINT DD SYSOUT=*
//D DD DISP=(OLD,DELETE),UNIT=3380,VOL=SER=VSMI04,
// DSN=HILG.TRACE
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,SA=100K,AB=100K',
// REGION=2880K,TIME=NOLIMIT
//IEFRDER DD DSNAME=HILG.TRACE,UNIT=3380,VOL=SER=VSMI04,
// DISP=(NEW,CATLG),SPACE=(TRK,(100))
//SYSLIB DD DSNAME=SYS1.IBM.PARMLIB(&MEMBER),DISP=SHR
2. Allocate a parm member such as GTFPRMUS in one of your parmlibs.
Add:
TRACE=USRP
USR=(F44),END
3. Start GTFRACF with the following command:
S GTFRACF.GTFR
**
GTFRACF will be submitted and the GTF name will be GTFR.
4. Activate the trace options using the SDSF command:
#set trace(callable(TYPE(6)) jobname(xxxx)) list
***************************************************************
You need to check the prefix for the command in the IEFSSNxx
Parmlib member.
My IEFSSN member shows RACF entry
SUBSYS SUBNAME(RACF) INITRTN(IRRSSI00) INITPARM(#)
Which leads to # command prefix.(on my system)
**************************************************************
5. Stop USS Ctrace
TRACE CT,OFF,COMP=SYSOMVS enter
6. Start USS CTRACE
TRACE CT,64M,COMP=SYSOMVS enter
Replyid,OPTIONS=(ALL),END enter
7. Activate the following slip:
SLIP SET,MSGID=ICH408I,
JL=OMVS,ID=yyyy,
DSPNAME=('OMVS'.*),
SDATA=(ALLNUC,PSA,CSA,LPA,TRT,SQA,RGN,LSQA,SUM),END
**************************************************************
8. Recreate problem
Figure B-5 RACF trace
 
Note: Service number information can be found in z/OS Security Server RACF Diagnosis Guide Tracing the Callable Services, RACROUTE, and RACF Database Manager Request calls, GA22-7689-14.
Diagnosing storage problems - ABEND878
To diagnose storage problems with a dump, it is best to use the VERBX VSMDATA 'SUMMARY' command in IPCS. The output of this command contains a wealth of information. Chapter 29 of z/OS MVS Diagnosis: Reference, GA22-7588 provides details.
In general the approach is to determine whether this is a common or local storage problem. The exercise that follows details a common storage shortage problem. The steps for diagnosing a local (ASID) storage problem are similar.
SSRV trace entries
For virtual storage management, use the following information:
For SSRV 132 (Storage Obtain)
SSRV 133 (Storage Release)
SSRV requests for VSM
For an SSRV request to virtual storage management, the data is:
Under UNIQUE-1: Information input to the VSM storage service, the bytes are as follows:
0 Flags:
X... .... RESERVED
.1.. .... KEY was specified
..1. .... AR 15 is in use
..0. .... AR 15 is not in use
...1 .... LOC=(nnn,64) was specified. Storage can be backed above the bar
.... 1... CHECKZERO=YES was specified
.... 0... CHECKZERO=NO was specified explicitly, or by default
.... .1.. TCBADDR was specified on STORAGE OBTAIN or RELEASE
.... ..00 OWNER=HOME was specified explicitly, or by default
.... ..01 OWNER=PRIMARY was specified
.... ..10 OWNER=SECONDARY was specified
.... ..11 OWNER=SYSTEM was specified
1 Storage key (bits 8 through 11)
2 Subpool number
3 Request flags:
1... .... ALET operand specified
.1.. .... Storage can be backed anywhere
..00 .... Storage must have caller's residency
..01 .... Storage must have a 24-bit address
..10 .... The request is for an explicit address
..11 .... Storage can have a 24- or 31-bit address
.... 1... Maximum and minimum request
.... .1.. Storage must be on a page boundary
.... ..1. Unconditional request
.... ...0 OBTAIN request
.... ...1 FREEMAIN request
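The UNIQUE-1 bit layout listed above can be decoded mechanically. The following Python sketch follows the bit positions given in the list; the function name decode_unique1, the dictionary keys, and the sample word are hypothetical, and the sample word is not taken from any lab dump.

```python
OWNER = {0b00: "HOME", 0b01: "PRIMARY", 0b10: "SECONDARY", 0b11: "SYSTEM"}
RESIDENCY = {
    0b00: "caller's residency",
    0b01: "24-bit address",
    0b10: "explicit address",
    0b11: "24- or 31-bit address",
}


def decode_unique1(word: int) -> dict:
    """Decode the UNIQUE-1 word of an SSRV entry for a VSM storage request."""
    flags, key_byte, subpool, req = word.to_bytes(4, "big")
    return {
        "key_specified": bool(flags & 0x40),
        "ar15_in_use": bool(flags & 0x20),
        "loc_64": bool(flags & 0x10),
        "checkzero": bool(flags & 0x08),
        "tcbaddr_specified": bool(flags & 0x04),
        "owner": OWNER[flags & 0x03],
        "storage_key": key_byte >> 4,  # bits 8 through 11 of the word
        "subpool": subpool,
        "alet_specified": bool(req & 0x80),
        "backed_anywhere": bool(req & 0x40),
        "residency": RESIDENCY[(req >> 4) & 0x03],
        "max_and_min": bool(req & 0x08),
        "page_boundary": bool(req & 0x04),
        "unconditional": bool(req & 0x02),
        "request": "FREEMAIN" if req & 0x01 else "OBTAIN",
    }


# Hypothetical UNIQUE-1 value: subpool X'E7', request flags X'72'
print(decode_unique1(0x0000E772))
```

Decoding the word this way answers the subpool and above-or-below-the-line questions directly from the trace entry.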
SSRV storage size
Under UNIQUE-2, the following information is needed for question # 2:
In an SSRV trace entry for a VSM STORAGE OBTAIN or GETMAIN, one of the following:
The length of the storage successfully obtained
The minimum storage requested, if the storage was not obtained
ABEND878 - finding the request
Lab exercise #4:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.AB878CSA.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose an ABEND878 ABEND dump.
Questions
1. What kind of dump is this? Use IP ST to get title. __________
2. According to the dump type we should use IP ST REGS to get the abend and reason code. Look at register 1 for abend and register 15 for reason code. _______________
3. Have a look at z/OS V1R13.0 MVS System Codes, SA22-7626-23 to get the error code information. Where do we have the storage problem? RGN, CSA, or SQA? __________
4. Issue the IP SYSTRACE command and then F '*S' to find the failing SVC. Have a look at the preceding SVC 78.
 – What request does SVC 78 represent? _____________________________________
 – Note: If this request had been SVC entered, the mapping you need is found in z/OS MVS Diagnosis: Reference, GA22-7588 under SVC 10 (0A0A) or SVC 120 (0A78).
 – What was the PSW address of the request? _______________________________
 – Note: if it was PC entered as this one was you will need to get the PSW address from the PC entry, which in this case is a 30B (storage obtain) and use the information provided above.
 – What subpool was requested? ______________________
 – Was storage requested above or below the line? ________________________
 – What was the size requested for the storage? _________________________
5. Looking backward in the system trace, is there an apparent pattern? To check, issue F '78' PREV. __________________________
ABEND878 - analyzing storage use
Using the same dump, issue the VERBX VSMDATA 'SUMMARY' command.
Questions
1. Issue the F 'GLOBAL DATA' command. Using the table found, fill in the following information from the Global Data Area:
 – SIZE OF:
 – CSA _______________________________
 – SQA_______________________________
 – ECSA______________________________
 – ESQA______________________________
 – Was any of CSA or ECSA converted to SQA in this dump? This information is provided in CSACV field. ______
If large amounts of CSA have been converted to SQA, suspect an SQA problem.
2. Use the F 'CSA TOTAL' command to find the total current usage of CSA/ECSA (note: CSA is the lower number and ECSA is the upper number). Use SQA Total to get the SQA information and fill in the information below:
 – Current usage of:
 – CSA _______________________________
 – SQA_______________________________
3. Do we have sufficient storage available in CSA below the line? Check the IP VERBX VSMDATA output for CSASZ and CSAALLO. ________
ABEND878 - CSA/SQA tracker
Enter the VERBEXIT VSMDATA OWNCOMM command to display information about jobs or address spaces that hold storage in the common service area (CSA), extended CSA, system queue area (SQA), or extended SQA. The dump being analyzed with VERBEXIT VSMDATA OWNCOMM must contain the SQA and ESQA subpools. If you use the SDUMP or SDUMPX macro or the DUMP command to obtain the dump, make sure to specify the SQA option of the SDATA parameter.
Enter the VERBEXIT VSMDATA 'OWNCOMM DETAIL' command to obtain a report that displays a list of storage ranges owned by one or more jobs.
Lab exercise #5:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.S2895.DUMP2.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose an ABEND878 ABEND dump.
Questions
Use z/OS MVS Diagnosis: Reference, GA22-7588, which describes the output of the VERBX VSMDATA OWNCOMM command to complete this exercise.
1. Using the same dump as on the previous page, issue the IP VERBX VSMDATA 'OWNCOMM SUMMARY' command.
 – What jobname consumed the most CSA below the line? _________________
 – How much CSA was allocated to that jobname? ________________
2. Issue the IP VERBX VSMDATA 'OWNCOMM DETAIL ASIDLIST(32)' command. Answer the following questions about the storage:
 – What jobname allocated this storage? _________________________________
 – What was the length of the storage requested? __________________________
 – What was the return address of the storage request in the first entry? ___________________
 – What were the first 16 bytes of the storage area in question? ___________________________________________________
 – Is there an obvious pattern here? ___________
Diagnosing local storage shortage
This exercise will shorten the process by:
Understanding the failing request
Getting a picture of current local storage usage
Using that picture to evaluate where (high private or user region) the problem lies.
Using VSM control blocks to specifically identify the problem pattern
Using IPCS tools to identify the problem program
Lab exercise #5:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.AB878.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose local storage shortages.
SSRV trace entries
For this exercise, in the SYSTRACE, use the following information. An example of a SYSTRACE entry is shown in Figure A-21 on page 315 and Figure A-22 on page 315.
Under UNIQUE-1:
Byte 2 - Contains the subpool number.
Byte 3 - Request flags.
1... .... ALET operand specified
.1.. .... Storage can be backed anywhere
..00 .... Storage must have caller's residency
..01 .... Storage must have a 24-bit address
..10 .... The request is for an explicit address
..11 .... Storage can have a 24- or 31-bit address
.... 1... Maximum and minimum request
.... .1.. Storage must be on a page boundary
.... ..1. Unconditional request
.... ...0 OBTAIN request
.... ...1 FREEMAIN request
Under UNIQUE-2:
In an SSRV trace entry for a VSM STORAGE OBTAIN or GETMAIN, one of the following:
 – The length of the storage successfully obtained
 – The minimum storage requested, if the storage was not obtained
Under UNIQUE-3:
In an SSRV trace entry for a VSM STORAGE OBTAIN or GETMAIN, one of the following:
 – The address of the storage successfully obtained, if you specified address; otherwise, zero.
 – The maximum storage requested, if the storage was not obtained
In an SSRV trace entry for a VSM STORAGE RELEASE or FREEMAIN:
The address of the storage to be released.
Under UNIQUE-4:
Left 2 bytes: ASID of the target address space
Next byte: Reserved
Right byte:
If the GETMAIN/FREEMAIN/STORAGE OBTAIN/STORAGE RELEASE is unconditional, an abend will be issued and the SSRV trace entry 3rd byte of UNIQUE-4 will contain X'FF'. If the GETMAIN/FREEMAIN/STORAGE OBTAIN/STORAGE RELEASE is conditional, no abend will be issued and the SSRV trace entry 3rd byte of UNIQUE-4 will contain the actual return code from the storage service.
Questions
1. Issue the IP SYSTRACE ALL followed by the F *SVC command to find the SVC D request for this error. Back up a couple of lines with the UP 5 command. Use the mappings provided in z/OS MVS Diagnosis: Tools and Service Aids, SY28-1085 (SSRV trace entries) to fill in the following information:
 – What was the ASID where the failure occurred? ____________
 – What was the size requested of the failing GETMAIN? ______________________
 • Does this seem excessive? ________________
 – What was the requested subpool? ___________________
 – Based on the Subpool requested, is this a global or local problem? ____________
 • SP 0-127 are low private (Region) subpools.
2. Issue the IP VERBX VSMDATA 'SUMMARY NOG ASIDLIST(32)' command, go to the bottom of this output and find the local storage map. Fill in the following values from the map:
 – _________________ <- Max Ext. User Region Address
 – _________________ <- Ext. User Region Top
 – __________________ <- Ext. User Region Start
3. Extended private storage grows down until it reaches the current top of region; subsequent local storage may then fail as a result. Based on the storage map, did this happen? _____
4. The user region grows up until the current top of the region approaches the maximum user region. Subsequent region requests that would push the current top of the region over the max will fail. Did this occur in this case? ___________
At this point we can assume that the problem is with the user region. This is not always as obvious when REGION and PRIVATE storage “collide.” To determine whether the user region is exhausted or is instead fragmented, look for FBQEs that describe storage in the USER REGION range, if there are any. Get the FBQEs for storage below the line. How many bytes are free? Do a find for 00006000. ___________________________________________
5. Find a pattern in the user region subpools. Look at the Local Subpool Summary near the bottom of the report. What TCB has the largest storage allocation in total? __________
6. Is the storage getmained by this TCB in the same subpool?
7. How much storage did this TCB allocate below? Search for “Total allocation to TCB at address 7FF3A0”.
8. Even though we requested a size of X'EA60', how much storage did we get for each of these allocation requests? Check the DQEs for this TCB. ________________
9. Pick any one of the addresses to browse and record the data you found:
 – ________________________________________________
10. Go back to SYSTRACE ALL and determine the PSW address where the GETMAIN was issued from. Browse that storage and record the eyecatcher of the offending module:
 – Have a look at the previous instruction. Does it show the getmain SVC? __________
 – Module name _______
11. Use the SUMMARY FORMAT ASID(X'20') command to find the EP name (under the RB that took the abend). Max to the bottom using PF8. Select the TCB address with a completion code. Find the TCB above with the command F 'TCB: NNNNNNNN' PREV (address must be 8 digits). Then find the first active RB with F 'ACTIVE'. What is the EP name under the RB? ___________________________________
Diagnosing LE U4083 abend
The most important step in capturing useful dumps is to use the correct LE runtime options. Contact the IBM Support Center and discuss which runtime options (runopts) to use for your error scenario. For example, the way to get a dump differs between z/OS and CICS errors. This exercise guides you through an LE dump. It shows that the abend for which the dump was taken may not be the original abend. It is important to check whether ZMCH control blocks are available. Look for CAA and DSA areas. If they are present, we will get useful LE information.
Lab exercise #6:
Switch dumps by typing =0 (zero) on the IPCS command line.
Change the DSNAME to ITSO.ABCVOL8.ABU4083.
Press Enter and proceed back to IPCS Option 6 (commands) by typing =6 on the command line. Proceed with the exercise.
The Problem: Diagnose LE abend dump
If you look at the LE TraceBack in the dump, which shows the module calling sequence, you need to read it from bottom to top. The last accessed module is at the top of the output and the first called module at the end of the listing. As shown in Example B-1, module CEEBBEXT was called first.
Example: B-1 TraceBack output
Traceback:
DSA Entry E Offset Load Mod Program
 
1 EQA00HKS -19EF4530 EQA00EVH EQA00HKS
2 EQA00EVH +0000616A EQA00EVH EQA00EVH
3 CEEZIDT +000004BC CEEPLPKA CEEZIDT
4 CEEBBEXT -0AE90C8C CEEPLPKA CEEBBEXT
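Because the traceback lists the most recently called routine first, reversing the entries gives the call order. The following Python sketch does this for the rows of Example B-1. The function name call_order is hypothetical, and the parsing assumes that each entry line starts with a DSA number followed by the entry name.

```python
TRACEBACK = """\
1 EQA00HKS -19EF4530 EQA00EVH EQA00HKS
2 EQA00EVH +0000616A EQA00EVH EQA00EVH
3 CEEZIDT +000004BC CEEPLPKA CEEZIDT
4 CEEBBEXT -0AE90C8C CEEPLPKA CEEBBEXT
"""


def call_order(traceback: str) -> list:
    """Return entry names from first called to last called."""
    entries = []
    for line in traceback.splitlines():
        fields = line.split()
        # Keep only lines that start with a DSA number and have an entry name
        if len(fields) >= 2 and fields[0].isdigit():
            entries.append(fields[1])
    return entries[::-1]


print(" -> ".join(call_order(TRACEBACK)))
```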
Questions
1. What is the dump title? _________________
2. Get the abend and reason code from IP ST or IP ST W output. _______
3. Get the load module name. _________________
4. Get the csect name. __________________
5. Does the breaking event address, which provides the last branch instruction, provide a non zero value? _____
6. What does register 1 show? ____________ Is this our abend code? _________
7. What does the abend and reason code mean? Have a look at Debugging Guide and Runtime Messages GC33-6681. _________________________________________
8. Do you find an abend entry in the systrace? _______
9. We need the CAA and DSA pointers. Do we have these control blocks pointed to by register 12 (CAA) and register 13 (DSA)? Get the register values from IP ST or IP ST W. The CAA should show '0800' in bytes 2 and 3, and the save area can be checked using the information provided in Table B-1 on page 353. ________________
10. Get the following runtime options from the IP VERBX LEDATA output.
a. ABTERMENC ________________
b. TERMTHDACT _______________
c. TRAP ______________________
11. Get the LE traceback with IP VERBX LEDATA 'CEEDUMP'. Which module was called
 – first? _____________
 – last? ______________
 – Which module got the exception? ___________
12. Enter IP VERBX LEDATA 'CM' to get the condition information. Can you find the ZMCH control block that belongs to our abend? ______
13. What abend information do the ZMCH control blocks show? Check the preceding CIBH and look for ABCD: and ABRC:
14. The ZMCH points to the original error. If you do not find any ZMCH information in the IPCS LE data output, do a find for ZMCH in the browsed storage. You may find one. Because we have two ZMCH control blocks, where do the PSWs point to? Enter 'IP W psw address'. ____________
15. To check whether we have HEAP storage overlays, you can use: IP VERBX LEDATA 'HEAP'
16. To check whether we have STACK storage overlays, you can use: IP VERBX LEDATA 'STACK'
17. Have a look at the save area chain. Browse to the register 13 address at the time of the error. Use the address at offset 4 to get the previous save area. Browse the register 15 address. What eye catcher do you find? ________
18. To get the original module name, you need the value at offset X'C'. Add this value to the module start address or do an L +2C60 (assuming that you found the correct module; compare offset X'C' with 2C60). Offset X'14' will show the name. See Figure B-6 on page 353. ________
19. Not all module names can be found using the offset at X'C'. If the value is negative, which means that the first bit is on, the module name precedes this line. See Figure B-7 on page 354.
 
 
Note: There are different ways to save register information. The one shown in Table B-1 is the most common one.
Table B-1 Save area layout
RES(0) (reserved area)
HSA (previous sa ptr)
LSA (next sa ptr)
Register 14
Register 15
Register 0
Register 1
Register 2
Register 3
Register 4
Register 5
Register 6
Register 7
Register 8
Register 9
Register 10
Register 11
Register 12
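The rows of Table B-1 can be converted into byte offsets by counting 4-byte words from the start of the save area. The following Python sketch assumes that every field in the table is one fullword, which gives the usual 72-byte save area; the names SAVE_AREA_FIELDS and save_area_offsets are illustrative only and are not part of IPCS.

```python
# Field order taken from Table B-1; each field is assumed to be one 4-byte word.
SAVE_AREA_FIELDS = (
    ["RES(0)", "HSA", "LSA", "R14", "R15"]
    + [f"R{n}" for n in range(13)]  # R0 through R12
)


def save_area_offsets():
    """Return a dict that maps each field name to its byte offset."""
    return {name: 4 * index for index, name in enumerate(SAVE_AREA_FIELDS)}


if __name__ == "__main__":
    # Print the layout with hexadecimal offsets, one field per line.
    for name, offset in save_area_offsets().items():
        print(f"+{offset:02X}  {name}")
```

Under this assumption the HSA and LSA pointers sit at offsets 4 and 8, which are the values passed to LINK in the runchain commands.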
 
 
To get the save area chain starting from the last save area you can use the runchain command:
IP RUNC ADDR(addr of previous save area ptr) LINK(4)
To get the save area chain starting from the first save area you can use the runchain command:
IP RUNC ADDR(addr of next save area ptr) LINK(8)
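The idea behind the two runchain commands can be sketched in Python. This is only an illustration of following a pointer at a fixed offset until the chain ends; it is not how IPCS implements RUNCHAIN. The function run_chain, the STORAGE dictionary, and the addresses in it are hypothetical.

```python
def run_chain(read_word, start, link_offset, max_blocks=100):
    """Follow a chain of control blocks and return their addresses.

    read_word(address) must return the 4-byte value stored at address.
    The walk stops at a zero pointer, a repeated address, or max_blocks.
    """
    chain = []
    seen = set()
    address = start
    while address != 0 and address not in seen and len(chain) < max_blocks:
        chain.append(address)
        seen.add(address)
        address = read_word(address + link_offset)
    return chain


# Hypothetical storage: three save areas chained through HSA (+4) and LSA (+8).
STORAGE = {
    0x1000 + 4: 0x0000, 0x1000 + 8: 0x2000,
    0x2000 + 4: 0x1000, 0x2000 + 8: 0x3000,
    0x3000 + 4: 0x2000, 0x3000 + 8: 0x0000,
}


def read_word(address):
    """Return the word at address, or zero if nothing is stored there."""
    return STORAGE.get(address, 0)


if __name__ == "__main__":
    backward = run_chain(read_word, 0x3000, link_offset=4)  # like LINK(4)
    forward = run_chain(read_word, 0x1000, link_offset=8)   # like LINK(8)
    print([f"{a:08X}" for a in backward])
    print([f"{a:08X}" for a in forward])
```

The repeated-address check is a design choice for damaged chains: a save area that points back to itself would otherwise loop until max_blocks is reached.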
 
Note: For question 10, the output shows the LE runtime options. It also shows where each option was set and whether it can be overridden, using the following indicators:
PROGRAMMER DEFAULT
INSTALLATION DEFAULT
DD:CEEOPTS
OVERRIDE
IGNORED
DEFAULT SETTING
Command ===> l +2C60
2580B1E8 47F0F014 00C3C5C5 ! .00..CEE !
2580B1F0 00000358 00002C60 47F0F001 90ECD00C ! .......-.00...ü. !
**********
2580DE48 10CEB000 2580DE64 ! ........
2580DE50 00000000 00000000 0008C5D8 C1F0F0C8 ! ..........EQA00H
2580DE60 D2E20000 06000001 00000000 00000000 ! KS..............
Figure B-6 CEE module name
257FBBF0 E014A7F4 00090700 C5D8C1F0 F0C7C6E4 ! Ö.x4....EQA00GFU !
257FBC00 C3C5C500 18F358E0 D00C980C D01407FE ! CEE..3.Öü.q.ü... !
257FBC10 90ECD12C 5880C2F4 58608010 BF9F6218 ! ..J...B4.-...... !
Figure B-7 CEE module name preceding
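The arithmetic behind Figure B-6 and Figure B-7 can be checked with a short Python sketch. It adds the offset found at x'C' to the module start address, tests whether an offset word has its first bit on, and translates the hexadecimal name bytes into text. The function names are illustrative, and the use of EBCDIC code page 037 is an assumption; letters and digits translate the same way in the common EBCDIC code pages.

```python
def locate(module_start, offset):
    """Return module_start + offset as a 32-bit address."""
    return (module_start + offset) & 0xFFFFFFFF


def is_negative_offset(word):
    """Return True if the first (high-order) bit of the 32-bit word is on."""
    return bool(word & 0x80000000)


def ebcdic_text(hex_string):
    """Translate hexadecimal dump bytes into text, assuming code page 037."""
    return bytes.fromhex(hex_string.replace(" ", "")).decode("cp037")


if __name__ == "__main__":
    # Module start and offset value as displayed in Figure B-6.
    print(f"{locate(0x2580B1E8, 0x00002C60):08X}")
    # Name bytes as displayed in Figure B-6 and Figure B-7.
    print(ebcdic_text("C5D8C1F0 F0C8D2E2"))
    print(ebcdic_text("C5D8C1F0 F0C7C6E4"))
```

The first call reproduces the address that L +2C60 scrolls to in Figure B-6.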
Lab exercise #1 - Answers IP ST REGS
The following questions can all be answered by using the IP ST REGS command.
1. What dump is it? Console or SLIP dump? __SLIP___
2. What abend does this dump show?__047____
3. Was this dump in AR mode at the time of the failure? __NO_
4. What was the failing PSW address? __24B000C2_
5. What ASID is this failing code executing in? _20__
6. What was the failing TCB address? _007FF3A0_
7. What is the value in R14? _80FDCA98_
8. Where does register 14 point to? _0A03 in module IEAVCVT_
Lab exercise #1 - Answers IP SYSTRACE
1. Looking at the systrace for asid x’20’, at what time do we get the first *RCVY ABT entry? Use IP SYSTRACE TIME(LOCAL) ASID(x'20') __09:46:16.733432_
2. What was the preceding SVC good for?__SVC 6B modeset__
3. Do we call FRR service?_NO_
4. What does the abend047 mean? ___An unauthorized program issued a restricted Supervisor Call (SVC) instruction_
 – Could we recover the error? __NO__
Lab exercise #1 - Answers Summary Format
1. Use IP SUMM FORMAT ASID(X'20') followed by the BOTTOM command. Looking at the TCB summary, what are the TCBs ending with a non-zero completion code: 007FF130>>> 04822000 and 007FF3A0>>>> 80047000_
2. Could we recover the errors? __822 was recovered, 047 not____
3. What shows that we could not recover abend047? __RTM2 work area________
4. Use F 'TCB: 00' PREV command to find the TCB that took the ABEND047 then issue F 'ACTIVE' to find the top RB.
 – From that RB what is the address pointed to by OPSW_078D0000 A4B000C2_
 – And the WLIC value _WLIC..... 0002006B___
5. What does the WLIC field tell us? ______SVC 6B, length 2 bytes
6. Where does the EP point to? _EP....... ABEND0C1__
7. What is the start area pointed to by ENTPT?__ENTPT.... A4B00000__
8. Does the task run secure? _NO__
9. Where do we find the registers at the time of OPSW shown in PRB: 007FF020? ________
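The completion codes recorded in the TCB summary (04822000 and 80047000) can be split into their parts with a small Python sketch. It assumes the conventional layout of the completion code word: one flag byte, followed by 12 bits of system completion code and 12 bits of user completion code. The function names are illustrative only.

```python
def decode_completion_code(word):
    """Split a 32-bit completion code word into (flags, system, user)."""
    flags = (word >> 24) & 0xFF    # first byte: flag bits
    system = (word >> 12) & 0xFFF  # next 12 bits: system completion code
    user = word & 0xFFF            # last 12 bits: user completion code
    return flags, system, user


def describe(word):
    """Return a short label such as S047 or U4083 for a completion code."""
    _, system, user = decode_completion_code(word)
    if system:
        return f"S{system:03X}"   # system codes are written in hexadecimal
    if user:
        return f"U{user:04d}"     # user codes are written in decimal
    return "none"


if __name__ == "__main__":
    for value in (0x04822000, 0x80047000):
        print(f"{value:08X} -> {describe(value)}")
```

System codes are conventionally shown in hexadecimal and user codes in decimal, which is why the two branches format their results differently.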
Lab exercise #2 - Answers diagnosing an ABEND0C1
1. Determine what this dump is all about: Issue the IP LIST TITLE command. _SLIP DUMP_
2. Using the IP SYSTRACE ALL command and issuing a F '*PGM', what PSW address was the PGM 001 (a.k.a. ABEND0C1) taken at? _24B00008__
3. Fill in the abend code in the *RCVY entry below based on the *RCVY entry that immediately follows the *PGM 001:
 – *RCVY PROG 94 0c1 000 (fill in the 3 missing characters)
4. Use the IP ST REGS command to get the relevant information about the abend 0C1. Record the following:
 – PSW __24B00008___
 – R14 ___80FDCA98__
 – Primary ASID (PASN) _0020_
 – Abending JOBNAME _PHILGER1__
 – Failing TCB address _007FF3A0__
5. Use the =1 command to get into IPCS browse:
 – Browse the PSW address. What 'instruction' does the PSW point to? __0000___
6. Often, branches are accomplished with BALR 14,15, making R14 point to the caller. Check R14 in this dump and see what instruction reg 14 points to: Browse the address in R14.____0A03 >>> SVC 3_____
7. Get the module name that issues the SVC 3 instruction. Use IP WHERE 00FDCA98 __IEAVCVT+90__
8. Get the module name pointed to by PSW at time of error _ABEND0C1_____________
9. Because we know that abend 0C1 PSW points to the failing instruction, what does this area show? _00000000________
10. What offset is it in the module? In our case the module starts with 90ECD00C. _6__
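Some of the recorded values differ only in the high-order bit: R14 is recorded as 80FDCA98, but IP WHERE is issued against 00FDCA98. In 31-bit addressing the high-order bit of the word is not part of the address. The following Python sketch shows the masking and the resulting offset calculation; the function names are illustrative, and the OPSW and ENTPT values are the ones recorded in the Summary Format answers for exercise #1.

```python
AMODE31_MASK = 0x7FFFFFFF  # keeps the low-order 31 bits of a word


def address31(word):
    """Return the 31-bit address contained in a 32-bit word."""
    return word & AMODE31_MASK


def offset_in_module(psw_word, entry_word):
    """Return the distance from the module entry point to the PSW address."""
    return address31(psw_word) - address31(entry_word)


if __name__ == "__main__":
    # R14 as recorded by IP ST REGS, and the address used with IP WHERE.
    print(f"{address31(0x80FDCA98):08X}")
    # OPSW address word and ENTPT from the Summary Format answers.
    print(f"+{offset_in_module(0xA4B000C2, 0xA4B00000):X}")
```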
Lab exercise #3 - Answers diagnosing ICH408I
1. Have a look at the IP VERBX MTRACE and get the last ICH408I message. _ICH408I USER(HILGER ) GROUP(SYS1 ) NAME(PETER HILGER__
2. Which user failed to access a file? _HILGER__
3. What is the file name he/she would like to access? _/u/philger/secure__
4. According to the permission bits the user was not allowed to work with this file. To get the user’s RACF and OMVS security related definitions we need to check the ACEE. The ACEE is pointed to by SENV that will be provided in our ASXB control block, which is pointed to by ASCB. If a TCB has its own ACEE security, it will show the ACEE address in its own SENV field. In this dump we have only one SENV pointed to by ASXB. Browse this address and get user name and default group. You need to look at RACF data area manual to get the ACEE layout. __HILGER____ SYS1___
5. To get the information to which RACF groups the userid is connected, have a look at ACEE offset x’74’. Browse this address and you will get the group-related information. __ACEX___
6. If a user is requesting USS service, a User Security Packet (USP) is provided, which shows the user permission. Get the following control blocks in the dump:
a. ACEE points to ACEX at offset x'98'. Browse the address and check the eyecatcher. _ACEX_
b. ACEX points to USP at offset x'48'. Browse the area pointed to by the address at offset x'48' __________.
7F6447C0 E4E2D740 01000038 000004BA 000004BA ! USP
7F6447D0 000004BA 00000002 00000002 00000002 ! .........
7F6447E0 000001FF 00000000 00000000 00000000 ! .........
7F6447F0 00000001 00000002
Figure B-8 USS User security packet
Lab exercise #5 - Answers diagnosing storage - ABEND878
1. What kind of dump is this? Use IP ST to get the title. _SLIP DUMP ID=A878_
2. According to the dump type we should use IP ST REGS to get the abend and reason code. Look at register 1 for abend and register 15 for reason code. _878 / 8__
3. Have a look at z/OS V1R13.0 MVS System Codes, SA22-7626-23, to get the error code information. Where do we get the storage problem? RGN, CSA, or SQA?
4. Issue the IP SYSTRACE command and then F '*S' to find the failing SVC. Have a look at the preceding SVC 78.
 – What request does SVC 78 represent? ___getmain__
 – Note: If this request was SVC entered, the mapping you need to use is found in z/OS MVS Diagnosis: Reference, GA22-7588, under SVC 10 (0A0A) or SVC 132 (0A78).
 – What was the PSW address of the request? __0000710C_____
 – Note: If it was PC entered, as this one was, you need to get the PSW address from the PC entry, which in this case is a 30B (storage obtain), and use the information provided above.
 – What subpool was requested? __F1 >>> 241__
 – Was storage requested above or below the line? ___below__
 – What was the size requested for the storage? __EA60 >>> 60000__
5. Looking backward in the system trace, is there an apparent pattern? To do this, issue F ‘78’ prev .___Yes....seems to be a loop__
Lab exercise #5 - Answers ABEND878 - Analyzing storage use
Using the same dump as on the previous page, issue the VERBX VSMDATA 'SUMMARY' command.
1. Issue the F 'GLOBAL DATA' command. Using the table found, fill in the following information from the Global Data Area:
 – SIZE OF:
 • CSA ______235000_____
 • SQA______3C7000_____
 • ECSA____190AB000____
 • ESQA______4D0C000__
 – Was any of CSA or ECSA converted to SQA in this dump? __NO___
If large amounts of CSA have been converted to SQA, suspect an SQA problem.
2. Use the F 'CSA TOTAL' command to find the total current usage of CSA/ECSA (note CSA is the below number and ECSA is the above number). Use SQA Total to get the SQA information and fill in the information below:
Current usage of:
 – CSA ____227000____
 – SQA___484B000____
3. Do we have sufficient storage available in our CSA below? Check IP VERBX VSMDATA output for CSASZ and CSAALLO. __NO__
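The answer to question 3 can be cross-checked with simple hexadecimal arithmetic on the values recorded above: CSA size 235000, current CSA usage 227000, and a request length of EA60. The Python sketch below is a rough check only. It ignores fragmentation, so the largest contiguous free area may be smaller than the figure it prints, and the function names are illustrative.

```python
def free_bytes(size_hex, used_hex):
    """Return size minus usage, both given as hexadecimal strings."""
    return int(size_hex, 16) - int(used_hex, 16)


def request_fits(size_hex, used_hex, request_hex):
    """Return True if the request is no larger than the free byte count."""
    return int(request_hex, 16) <= free_bytes(size_hex, used_hex)


if __name__ == "__main__":
    free = free_bytes("235000", "227000")
    print(f"free below-the-line CSA: {free:X} hex, {free} decimal")
    print("EA60 request fits:", request_fits("235000", "227000", "EA60"))
```

With these values the free amount is smaller than one request of x'EA60' bytes, which agrees with the recorded answer.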
Lab exercise #5 - Answers ABEND878 - CSA/SQA tracker
Use the output of the VERBX VSMDATA ‘OWNCOMM’ command to complete this exercise.
1. Using the same dump as on the previous page, issue the IP VERBX VSMDATA 'OWNCOMM SUMMARY' command.
 – What jobname consumed the most below CSA? __PHILGER1___
 – How much CSA was allocated to that jobname? ___1A8CE0___
2. Issue the IP VERBX VSMDATA 'OWNCOMM DETAIL ASIDLIST(32)' command. Answer the following questions about the storage:
 – What jobname allocated this storage? ___PHILGER1______
 – What was the length of the storage requested? _____EA60 >>> 60000__
 – What was the return address of the storage request in the first entry? _000070F4__
 – What were the first 16 bytes of the storage area in question? __00000000 00000000 00000000 00000000_____
 – Is there an obvious pattern here? __YES___
Lab exercise #5 - Answers diagnosing local storage shortages
1. Issue IP SYSTRACE ALL followed by the F *SVC command to find the SVC D request for this error. Back up a couple of lines with the UP 5 command. Use the mappings provided in z/OS MVS Diagnosis: Tools and Service Aids, SY28-1085 (SSRV trace entries) to fill in the following information.
 – What was the ASID where the failure occurred? ___20____
 – What was the size requested of the failing GETMAIN? ___EA60 >>> 60000___
 – Does this seem excessive? ___YES__
 – What was the requested subpool? __12__
 – Based on the Subpool requested, is this a global or local problem? __Local_____
SP 0-127 are low private (Region) subpools.
2. Issue IP VERBX VSMDATA 'SUMMARY NOG ASIDLIST(32)', go to the bottom of this output and find the local storage map. Fill in the following values from the map:
 – __27400000_________ <- Max Ext. User Region Address
 – __273FF000_________ <- Ext. User Region Top
 – __25400000__________ <- Ext. User Region Start
3. Extended private storage grows down until it reaches the current top of the region and subsequent local storage may then fail as a result. Based on the storage map, did this happen? __NO___
4. The user region grows up until the current top of the region approaches the Max user region. Subsequent region requests that would push the current top of the region over the Max will fail. Did this occur in this case? ___YES____
5. At this point we can assume that the problem is with the user region. This isn't always as obvious when REGION and PRIVATE storage “collide”. To determine whether the user region is exhausted or instead fragmented, look for FBQEs that describe storage in the USER REGION range. Are there any? Get the FBQEs for the storage. How many bytes are free? Do a find for 00006000.
 – ____NO_____
 – Does this suggest fragmentation or storage exhaustion? ___Exhaustion__________
6. Find a pattern in the user region subpools. Look at the Local Subpool Summary near the bottom of the report. What TCB has the largest storage allocation in total: _7FF3A0_
7. Is the storage getmained by this TCB in the same subpool? _NO____
8. How much storage did this TCB allocate below? Search for: ‘Total allocation to TCB at address 7FF3A0’ _7AE000_
9. Although we requested x’EA60’ bytes, how much storage did we get for each of these allocation requests? Check the DQEs for this TCB. __F000__
10. Pick any one of the addresses to browse and record the data you found:
 – __00000000___
11. Go back to SYSTRACE ALL and determine the PSW address where the GETMAIN was issued from. Browse that storage and record the eyecatcher of the offending module:
 – Have a look at the previous instruction. Does it show the getmain SVC? _YES 0A78_
 – This module actually does not show an eyecatcher. You may use IP W xxxxxxxx__
12. Use the SUMMARY FORMAT ASID(X'20') command to find the EP name (under the RB: that took the abend). Max to the bottom using PF8. Select the TCB address with a completion code. Find the TCB above with the command F 'TCB: NNNNNNNN' PREV (address must be 8 digits). Then find the first active RB with F ‘ACTIVE’. What is the EP...... name under the RB? _______GETMAIN____
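Two of the recorded answers can be related with simple arithmetic: the distance between the extended user region top (273FF000) and its maximum (27400000), and the F000 bytes seen in the DQEs for each EA60 request. The Python sketch below assumes that the F000 figure results from rounding the request up to a 4 KB page boundary. That assumption is consistent with the recorded answer but is not stated in the exercise, and the function names are illustrative.

```python
PAGE = 0x1000  # 4 KB page size


def round_up_to_page(length):
    """Round a length up to the next multiple of the page size."""
    return (length + PAGE - 1) // PAGE * PAGE


def headroom(region_max, region_top):
    """Return the bytes left between the current top and the maximum."""
    return region_max - region_top


if __name__ == "__main__":
    per_request = round_up_to_page(0xEA60)
    left = headroom(0x27400000, 0x273FF000)
    print(f"storage per request: {per_request:X}")
    print(f"room left in region: {left:X}")
    print("another request fits:", per_request <= left)
```

With the recorded values only one page is left below the maximum, which is less than a single rounded request.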
Lab exercise #6 - Answers diagnosing U4083 LE abend
1. What is the dump title? __JOBNAME BCDRUN STEPNAME JAVA USER 4083__
2. Get the abend and reason code from IP ST or IP ST W output. _U4083 / 2______
3. Get the load module name. __CEEPLPKA___
4. Get the CSECT name. ___UNKNOWN________
5. Does the breaking event address, which provides the last branch instruction, provide a non zero value? __NO___
6. What does register 1 show? _84000FF3__ Is this our abend code? _YES hex value_
7. What does the abend and reason code mean? Have a look at Debugging Guide and Runtime Messages, GC33-6681. _The back chain was found in error. / Traversal of the back chain resulted in a program check __
8. Do you find an abend entry in the SYSTRACE? __YES___*SVC D entry___
9. We need the CAA and DSA pointers. Are these control blocks pointed to by register 12 (CAA) and register 13 (DSA)? Get the register values from IP ST or IP ST W. The CAA should show ‘0800’ in bytes 2 and 3, and the save area can be checked using the information provided in Table B-1 on page 353. ____YES____
10. Get the following runtime options from IP VERBX LEDATA output.
 – ABTERMENC __(NONE)_______
 – TERMTHDACT _(TRACE,CESE,00000096)__
 – TRAP ___(ON,SPIE)___
11. Get the LE TraceBack. IP VERBX LEDATA ‘CEEDUMP’. Which module was called
 – first? _CEEBBEXT__
 – last? _EQA00HKS__
 – Which module got the exception? __EQA00EVH__
12. Enter IP VERBX LEDATA ‘CM’ to get condition. Can you find the ZMCH control block that belongs to our abend?__NO__
13. What abend information do the ZMCH control blocks show? Check the preceding CIBH and look for ABCD: and ABRC: _ABCD:940C4000 ABRC:00000004_
14. ZMCH points to the original error. If you do not have any ZMCH information in any of the IPCS LE data outputs, do a find for ZMCH in the browsed storage. You may find one. Because we have 2 ZMCH control blocks, where do the PSWs point to? Enter ‘IP W psw address’ __EQACSUTP+026A__and _EQA00EVH+1227D6_
15. To check whether we have HEAP storage overlays you can use: IP VERBX LEDATA ‘HEAP’
16. To check whether we have STACK storage overlays you can use: IP VERBX LEDATA ‘STACK’
17. Have a look at the save area chain. Browse to register 13 address area at the time of error. Use the address at offset 4 to get the previous save area. Browse the register 15 address. What eyecatcher do you find? __CEE
18. To get the original module name you need to get the value at offset x’C’. Add this value to the module start address or do a L +2C60 (assuming you found the correct module; compare offset x’C’ with 2C60). Offset x’14’ will show the name. See Figure B-6 on page 353 __EQA00HKS_
 