| Halt Error Codes | Index Level | Unix Signal Usage |
| Syntax | |
| Category | Unix |
| Type | Definition |
| Description |
tool for recovery in case of a system crash, following, for instance, a power failure.
The Monitor Debugger allows the user to:
- Display and change Pick virtual memory.
- Display and change 'real' memory (remember that 'real' memory is, in fact, Unix virtual memory).
- Force a flush of the Pick memory space back to disk.
- Display and change the status of the system semaphores, to remove a deadlock situation.
- Get access to a locked system.
- Terminate processes, including doing a shutdown.
- Trace modifications to a memory area.
- Put low-level break points in the virtual code.

IMPORTANT: When the Monitor Debugger is entered unexpectedly, first try typing g<return> to see if the system restarts. If not, hit the <BREAK> key again to examine the problem. There are cases when the debugger is entered wrongfully. For instance, when some especially long, unbreakable tape operations (like a rewind) are running, hitting the <BREAK> key several times on line 0 may enter the Monitor Debugger with a 'tight loop' condition, which means the process was engaged in a 'long' operation that prevented it from servicing the <BREAK> key.

Entering the Debugger

The debugger can be entered:
- Voluntarily, by hitting the <BREAK> key on a Pick process which has been started with the -D option, which enables the Monitor Debugger.
- Voluntarily, by setting a monitor trace on real or virtual memory.
- Voluntarily, by setting a monitor break point in virtual memory.
- Following a system abort. When a serious system abort occurs, the debugger is entered. The user cannot continue from such a condition.
- Following a Monitor HALT. When a process cannot continue execution, a HALT is executed on this process. This normally does not affect the other processes. The faulty process enters the Monitor Debugger and waits. Type 'x' to display the hardware registers, note them so that they can be transmitted to Technical Support, and type 'g' to try restarting the process. If the process aborts again, try a logoff and/or reset-user from another terminal before trying 'g'. If it fails again, type 'q'.
- By hitting the <BREAK> key 5 times in less than 5 or 6 seconds on line 0 when the system does not respond (system stuck in a tight loop, in a semaphore deadlock, or line 0 comatized). When a semaphore is left hanging, or when the processor enters a short tight loop (due to an ABS corruption, for example), the process no longer responds and is incapable of going to the Virtual Debugger. The fifth time the <BREAK> key is pressed, the signal handler checks whether the first occurrence of the break was serviced normally. If it was not, the debugger is entered. On a busy system, it might be necessary to try the <BREAK> sequence several times to get to the Monitor Debugger.
- By hitting the <BREAK> key on line 0 when it has been waiting for a system lock (overflow lock, spooler lock, etc.) for more than approximately 5 seconds.

The cause of entry into the debugger is shown by a message on entry and a special prompt, as defined in the table below:

CONDITION                             MESSAGE      PROMPT
Break key on line started with -D     <BRK>        B!
System abort                          <ABT>        A!
Monitor trace                         <TRC> addr   C!
Break point                           <BPT> bp#    I!
Monitor HALT                          <HLT> code   H!
Break key on line 0 on tight loop     <TLP>        T!
Break key on line 0 on virtual lock   <VLK>        V!

Referencing Data

Data can be referenced from the Monitor Debugger either in the virtual space or in the real memory space.

Data Specifications

Data location is defined by the following format:

address{;window}

The data is always displayed in hexadecimal.

Virtual Address Specification

The address of a virtual element can be represented by:

[r reg|{.}fid][.|,]disp

The base FID is either the content of the register reg or a FID number fid, given in decimal or, when prefixed by a dot, in hexadecimal. The displacement is expressed either in decimal, prefixed by a comma, or in hexadecimal, prefixed by a dot. For example:

1.300    Offset x'300' in frame 1.
.12,16   Offset 16 in frame x'12'.
r3.100   Offset x'100' off the location pointed at by register 3.
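The virtual address syntax above is compact, so a small parser may help make the rules concrete. The following Python sketch is illustrative only and is not part of the Monitor Debugger. The names VirtualAddress and parse_virtual_address are hypothetical, and the sketch assumes that an optional ;window suffix follows the same convention as the rest of the address (a leading dot means hexadecimal) and that an omitted window means 4 bytes.

```python
import re
from typing import NamedTuple, Optional


class VirtualAddress(NamedTuple):
    """Hypothetical container for a parsed virtual address."""
    register: Optional[int]  # register number when the base is 'r reg'
    fid: Optional[int]       # frame id when the base is a FID
    disp: int                # displacement within the frame
    window: int              # number of bytes to display


# Mirrors the documented form: [r reg|{.}fid][.|,]disp{;window}
_SPEC = re.compile(
    r"(?:r\s*(?P<reg>\d+)|(?P<fid>\.[0-9A-Fa-f]+|\d+))"
    r"(?P<sep>[.,])(?P<disp>[0-9A-Fa-f]+)"
    r"(?:;(?P<window>\.[0-9A-Fa-f]+|\d+))?"
)


def _number(text: str) -> int:
    """A leading dot marks hexadecimal; otherwise the value is decimal."""
    return int(text[1:], 16) if text.startswith(".") else int(text, 10)


def parse_virtual_address(spec: str, default_window: int = 4) -> VirtualAddress:
    match = _SPEC.fullmatch(spec.strip())
    if match is None:
        raise ValueError(f"not a virtual address: {spec!r}")
    # The separator selects the base of the displacement:
    # '.' means hexadecimal, ',' means decimal.
    disp = int(match["disp"], 16 if match["sep"] == "." else 10)
    reg = int(match["reg"]) if match["reg"] is not None else None
    fid = _number(match["fid"]) if match["fid"] is not None else None
    # Assumption: an omitted window falls back to 4 bytes.
    window = _number(match["window"]) if match["window"] else default_window
    return VirtualAddress(reg, fid, disp, window)
```

Under these assumptions, parse_virtual_address("1.300") yields frame 1 with displacement x'300', and parse_virtual_address("r3.100") yields register 3 with displacement x'100', matching the examples in the text.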
Monitor Address Specification

The address of a Monitor element can be represented by either of the two following forms:

{ [l|g] }.hexaddress{ [+|-] {.}offset}
/symbol{ [+|-] {.}offset}

The l prefix is used for local data. The g prefix is used for global data. The second form requires the presence of the file sdb.sym in the current directory or in /usr/lib/pick. This file is normally not shipped with the system; it is reserved for development purposes. The optional offset, which is added to or subtracted from the base address, is expressed either in decimal, or in hexadecimal if prefixed by a dot. For example:

.40000100    Absolute address.
l.0          First address in the local data space.
g.100+.10    Offset x'10' off the address x'100' in global data space.
/sys.time    Address of symbol sys.time.
/tcb0+.100   Offset +x'100' off the symbol tcb0.

Window Specification

The window specifies the number of bytes to display. The window is expressed in decimal, or in hexadecimal if prefixed by a dot. The default window size is 4. When using a symbolic name, the window is set automatically.

Changing Data

When a window of data is displayed, it is followed by an equal sign (=). Hitting Ctrl-N displays the next window, if available, and Ctrl-P the previous one. New data can then be entered as follows:

'char      Character insertion. A character string is preceded by a single quote. The characters in the display window are replaced by those in the input string, beginning from the left.
.hex       Hexadecimal string insertion. A hexadecimal string is preceded by a dot. It must contain only hexadecimal characters and an even number of nibbles. The characters in the display window are replaced by those in the input string, beginning from the left.
{+|-}int   Integer. The display window is treated as a numeric element. The window must be 1, 2 or 4 bytes long. The new integer replaces all data in the window.
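The three data entry forms can be modelled in a few lines, which may help when checking what a change will do before typing it. This Python sketch is not the debugger's implementation, and the names apply_entry and _overlay are hypothetical. It assumes integers are stored most significant byte first and that negative values use two's complement (the source does not state the byte order), and it rejects input longer than the window, a case the text does not describe.

```python
import re


def _overlay(window: bytes, new: bytes) -> bytes:
    """Replace the leftmost bytes of the window with the new bytes."""
    if len(new) > len(window):
        # Assumption: the text does not say what happens in this case.
        raise ValueError("input is longer than the display window")
    return new + window[len(new):]


def apply_entry(window: bytes, entry: str) -> bytes:
    """Return the window contents after applying one data entry."""
    if entry.startswith("'"):
        # 'char: character insertion, beginning from the left.
        return _overlay(window, entry[1:].encode("ascii"))
    if entry.startswith("."):
        # .hex: only hexadecimal characters, even number of nibbles.
        digits = entry[1:]
        if not re.fullmatch(r"[0-9A-Fa-f]+", digits) or len(digits) % 2:
            raise ValueError("hex string needs an even number of hex digits")
        return _overlay(window, bytes.fromhex(digits))
    # {+|-}int: the whole window is treated as one numeric element.
    if len(window) not in (1, 2, 4):
        raise ValueError("integer entry needs a 1, 2 or 4 byte window")
    value = int(entry, 10)
    # Assumption: big-endian storage, two's complement for negatives.
    return value.to_bytes(len(window), "big", signed=value < 0)
```

For instance, under these assumptions, entering -1 in a 2 byte window gives x'FFFF', and entering .ff in a 1 byte window holding x'7E' gives x'FF'.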
Debugger Commands

The debugger prompts for a command with a one-character code followed by an exclamation mark (!). Commands are terminated by a carriage return.

!shell
Submit a Shell command. The command is submitted to the shell.

?
Display help information. The Unix file /usr/lib/pick/sdb.help is displayed using pg.

ba{.}fid[.|,]disp
bao{.}offset
ba+{.}offset
b{n}
Add a break point. The effective address can be specified in different ways:
- Form 1: The effective address is computed by adding the argument fid to the fid break point offset, and disp to the displacement break point offset defined by the bo command.
- Form 2: The effective address is specified by the fid defined in the break point offset, with offset added to the break point offset displacement set by the bo command.
- Form 3: The effective address is defined by the current value of R1, to which the offset is added.
- Form 4: The effective address is defined by the current value of R1, to which n * 4 is added. If n is not specified, 1 is assumed. This form is used for architectures which have a fixed 4 byte instruction size (RISC).
The address must be at a virtual address boundary. Break points are global for the whole virtual machine. Once the break point is set, any process which hits it will stop. Break points are removed once they are encountered. A break point should not be set into an ABS frame which has been write required (mloaded into). Up to three break points can be set simultaneously. Upon successful setting, a '+' is displayed and the break point is displayed. Break points remain in action until they are explicitly removed or until the virtual machine is shut down.

bd[*|n]
Delete break points. If * is used, all break points are removed. If n, from 0 to 2, is used, the specified break point is deleted.

bl
List break points. List the Monitor break points.

bo{.}fid[.|,]disp
Break point offset.
Define a fid and displacement which are used in computing the effective address of a break point in the ba command.

d?
Show/Change default dump string. In case of system abort (bus error, segmentation violation, ...), the system automatically dumps some critical elements in the file "/usr/tmp/ap.core". This file can be examined with the apcrash utility. After the display, the string can be changed by typing the new dump string after the '=' sign. The dump string can have up to 15 characters, one-character codes and arguments. See the Monitor Debugger command d below for the description of each code. The dump string can be different for each process. See the Pick Systems Reference Manual documentation for the description of 'apcrash'.

dcommand.string
Dump Pick core memory to the Unix file /usr/tmp/ap.core. The result of the dump can be examined with the utility apcrash. This command can be used, after an incident, to dump selected elements of the current Pick memory to be able to investigate the problem. See the Pick Systems Reference Manual documentation for the description of 'apcrash'. The Unix file can be dumped to tape using a Unix utility like tar (e.g. tar cv /usr/tmp/ap.core).

The content of the dump is controlled by command.string, which is composed of one-character codes, some followed by arguments. Codes can be separated by commas for readability. The order of the codes in command.string is unimportant. This dump utility is automatically invoked in case of system abort (bus error, segmentation violation, etc.) before entering the Monitor Debugger. What is dumped in this case is controlled by a default dump string (see the Monitor Debugger command d? above).

Valid codes are:
a : Dump 'all'. This option is equivalent to the command string "l,g,b,p,r,c,0,f1". See the description of each code below.
0 : Dump the PCB of the current process. If the PCB is not attached at this point, this performs no operation.
b : Dump the buffer table.
c : Dump the current process context. All the frames currently attached to the process, their forward and backward links are dumped.
f{.}n : Dump the fid n. If the specified frame is not in memory, it is read from disk.
g : Dump the global space.
l : Dump the local (private) memory, not including the stack.
p : Dump the pibs.
r : Dump the hardware registers. The registers are dumped in the same order as described in the Monitor Debugger 'x' command described later in this appendix.
s{.}start;[*|{.}size] : Dump the main shared memory segment. Start is the starting offset, expressed in bytes, and size is the size expressed in KILOBYTES. If * is used instead of size, the entire shared memory segment, starting at the specified offset, is dumped.
v{.}n : Dump n virtual buffers.

e
Toggle the debugger ON/OFF. When OFF, prevent the entry to the debugger with the <BREAK> key. On line 0, though, the debugger will be entered in some special cases (see section 'Entering the Debugger' above) even when the debugger is disabled.

f{!}
Flush memory. All frames modified in memory are written back to disk. If a disk error occurs, a minus sign is displayed. If the '!' option is specified, all frames in memory, even if they are not write required, are written to disk.

g {fid.disp}
Go. Without any argument, the process resumes execution. If fid.disp is specified, control is transferred to the specified mode. fid is expressed as a relative offset in the current abs.

gl{-}
This command displays or removes group locks. With no options, the monitor prints the status of the global group lock (with a "G+" or a "G-"), and scans memory for any frames which are marked as locked. For each locked frame, the monitor displays the fid, the address of the buffer table entry, and all group lock information held in the frame itself. To clear all group locks, type "gl-". To clear a specific group lock, type "gl-{fid}".
Note that locks cleared from the Monitor Debugger may still display with the "list-locks" command. Such locks should be cleared with a "clear-locks" command when the virtual machine becomes accessible. In general, group locks should always be cleared by TCL commands only. The Monitor Debugger should only be used when system access is denied due to a lock set on a critical file (like the "mds,," file).

h fid
Hash Fid. This command displays internal information about the specified FID if it is in memory, or the message <NIM> if the fid is not in memory. The content of the buffer table can be altered. Input is terminated by:
carriage return : Return to debugger.
^N : Next buffer table entry, following the age queue forward link.
^P : Previous buffer table entry, following the age queue backward link.
^F : Next buffer table entry, following the hash queue forward link.

k{w}[f| pib ]
Kill. Terminate the process associated with the PIB pib, or the flusher if used with the key f, by sending a SIGTERM to it. The w key waits up to 10 seconds for the process to terminate. If it does not terminate, a SIGKILL is sent to it. Note that if the target process is in the Monitor Debugger or stuck on a semaphore, the SIGTERM signal will have no effect until it leaves the Monitor Debugger or the semaphore is released. Killing the flusher ('k{w}f') will unconditionally log all processes off and shut down the virtual machine.

l fid
Display/Modify Link fields. Displays in hexadecimal the link fields of the frame fid, in the following format:
nncf:frmn:frmp:npcf:clnk=
nncf: number of next contiguous frame(s)
frmn: forward link
frmp: backward link
npcf: number of prior contiguous frame(s)
clnk: core link
New values for the fields can then be entered, separated by commas, with an empty field to leave a field untouched.

m {*}monitor.address{;window}
Display/Change Real Memory.
Display the specified window at the real address specified (see 'Monitor Address Specification' above in this section). If an asterisk (*) is used, the address is considered as a pointer and its content is used as the monitor address. The length and window specification applies to the area pointed at by the pointer.

IMPORTANT: Access to an illegal address will cause a segmentation violation or a bus error, sending control back to the Monitor Debugger with an abort condition, from which it is impossible to recover. It is strongly advised to avoid absolute addresses, since they vary from implementation to implementation.

p {pib}{[.|,]offset}{;window}
Display/Change PIB. Display window bytes in the pib specified by pib (current pib if pib is omitted), at the optional offset.

q{!}
Quit. Quit the Monitor Debugger. Confirmation is asked. Leaving the Monitor Debugger terminates the Pick process. When asked to confirm, the user must type y (no return). The optional '!' by-passes the normal Pick termination, and terminates the process abruptly. This form should be used only in extreme situations where even quitting from the Monitor Debugger aborts.

r reg{.disp}{;window}
Display data through register. Displays data pointed at by the register reg, from 0 through 15. If specified, disp is added to the register displacement.

s [*|sem]{[?|+|-]}
Display/Change semaphore status. Display or change the semaphore specified by sem, expressed from 1 through 3, or all semaphores if the key * is used. The key + sets (locks) the specified semaphore. The key - resets (unlocks) the specified semaphore. The key ?
displays the information as in the example below:

00: O pid=0985
01: O pib=0023 W
02:
03: O pib=001A

where semaphore 0 is owned by the process with the pid number x'0985' (only semaphore 0 is displayed with the Unix pid number instead of the Pick pib number); semaphore 1 is busy, owned by the pib x'23', and has at least one process waiting on it (W); semaphore 2 is free; semaphore 3 is busy, owned by the pib x'1A', but has no process waiting on it. If the owner pib is not set up yet when the command is executed, "pib=?" is displayed. Re-enter the command to see the owner pib.

S{f}{h}{i}{m}{s}{w}{-}
Scan buffer table bits. This command displays and/or clears buffer table bits depending upon certain criteria. The options are as follows:
f   Referenced bit
h   Hold bit
i   iobusy bit (disk read)
m   Temporary mlock bit
s   Suppress detail output. Show only total.
w   Write-required bit
-   Clear instead of display
The user specifies which bits to search for using the above options. When a buffer is found, it is displayed in a manner similar to that of the "h" command. The user may go backwards or forwards in the selection list with Ctrl-P and Ctrl-N. At the completion, the total count of items is indicated. Note that the count is only accurate if only forward movement is used.

t{[mmonitor.address|fid.disp]{;window}}
Set/remove Monitor trace. Without any argument, any pending monitor trace is removed. A minus sign is displayed in acknowledgment of the removal. Otherwise, a trace is set on the specified area of memory, starting at monitor.address or at the area of memory associated with fid.disp, with a length equal to window. If no window is specified, the default window or the size of the monitor element is used. The maximum window size on a monitor address is 32767. The maximum window on a virtual address is the frame size. A plus sign is displayed in acknowledgment of the setting. The memory is checked for any change at every virtual branch or call, and every frame fault.
If the memory is changed, the Monitor Debugger is entered. When setting a trace on a virtual address, the frame is locked in memory. Removing the trace unlocks the frame if it was not locked when the trace was set.

v{code}
Enter the Virtual Debugger with a code 'code'. If 'code' is not specified, it just enters the debugger as if the <BREAK> key had been hit. 'code=14' will log off the process. This command will display 'ADDR' and fail if the Virtual Debugger PCB is not set up.

x
Display hardware registers. Display on the first line the program counter, followed by a variable number of 32 bit registers. The information is implementation dependent:
AIX: Registers r3 through r31
SCO: Registers edi esi ebp esp ebx edx ecx eax
HP-UX: Registers r2 through r30
SINIX (MIPS): Registers r1 through r30
SVS (ICL DRS6000): Registers %pc %npc %g1 %o %l %i

y{!}
Toggle Lock By-Pass. When ON, the process under the debugger will by-pass all locks in the system, monitor and virtual. This option should be used very carefully, since it can create extensive damage if used on a live system. To be used safely, all other processes should be either logged off or stopped by setting a semaphore (see section 'Usage Hints' below). Unless used with the key !, the user has to confirm the activation of this by-pass.

Usage Hints

This section shows how to use the Monitor Debugger to perform some unusual actions. Extreme care must be exercised when using the debugger to remove a lock or a semaphore. This may cause data loss. It is strongly recommended to contact Technical Support when a system gets locked.

- Running in single user. To prevent all users from running, except one terminal:
At shell, activate a Pick process in the Debugger:
ap -D <return>
Once in the Monitor Debugger:
B! s1+ <return>
B! y <return>    and confirm by typing 'y'
B! g <return>
All processes will now be locked, except the one running under the debugger and the flusher.
This is useful when patching some critical virtual structures, like the Overflow table.

To restart the multiuser activity: Break into the Monitor Debugger (either on the line which has the lock set, or on line 0 by hitting <BREAK> twice, since line 0 should be locked by the semaphore). This enters the Monitor Debugger.
B! y <return>    to toggle the by-pass off
B! s1- <return>
B! g <return>
All processes will now be unlocked.

- Removing a deadlock. When a process has been killed by the system, it may have left a semaphore or a virtual lock behind. To remove the lock, make sure all users are inactive, and do the following:
On line 0, hit <BREAK> up to six times, in less than 5 or 6 seconds. The process should drop into the Monitor Debugger, with the message <TLP> in case of a tight loop, or <VLK> in case of a virtual lock. Do the following, depending on the case:
T! f <return>
T! s*? <return>
Note which semaphore number is set and remove it with the command:
T! s semnum - <return>
T! g <return>

If the lock is a virtual lock, it may have to be cleared. This may have disastrous results if done without some inside information. It is strongly advised to contact technical support.
V! f <return>
V! r15;2 <return>
The system will display a number. If it is non-zero, zero it. Otherwise, there is another problem. Contact technical support.
V! r15;2 .001A= 0 <return>
V! g <return>

A virtual lock might also be one of the system wide locks. Do the following to identify it:
V! f <return>
V! 1.100;2 <return>
The system will display a number, normally 0. Type <ctrl> N 10 times or until a non-zero value shows. If no non-zero value shows, there may be another problem. Contact technical support.
V! 1.100 .0000= Ctrl N
V! 1.102 .001A= 0 <return>
V! g <return>

A virtual lock might also be an item lock (release 6.1 and above only). Do the following to identify it:
V! 1.15a;6 <return> <return>
The system should display a frame number. Now type the following:
V!
.(frame number displayed before).0;4 <return>
The system displays an 8 digit hex number. The first 4 digits indicate the global lock, while the next 4 digits indicate the number of item locks. To zero both of these, type "0" at the "=" prompt followed by a <return>.

A virtual lock might also be a group lock (release 6.1 and above only). Do the following to identify it:
V! gl <return>
If the system prints anything other than "G-" then group locks are set. To clear them, type the following:
V! gl- <return>
V! gl <return>
The system should report "G-" after the last command. Note that the virtual machine may still report locks set if the "list-locks" command is used. To fix this, go back to tcl as follows:
V! g <return>
If the system does not go to a tcl prompt, then some other lock condition exists and the user should contact technical support. If a tcl prompt appears, type the following:
: clear-locks (g <return>
Allow some time for "clear-locks" to complete as it can be delayed by processes which have been terminated abnormally. After completion, the group locks should be cleared. If any further deadlocks are encountered, contact technical support.

If only the line 0 is stuck, it might be because it has been accidentally comatized. The WHERE command shows the first two characters of the status field as FE or 7E. A logoff from another port will un-comatize the line 0, or do the following:
T! p.0;1 <return>
T! p.0;1 .7E= .ff <return>
T! g <return>

- Enabling the debugger on line 0. If the line 0 has been started without the -D option, it is impossible to get in the Monitor debugger, unless there is a lock. To enable the debugger, proceed as follows, to set temporarily a virtual lock to be able to drop in the debugger, enable it and restart:
At TCL:
: debug <return>
! 1.102;2 <return>
! 1.102;2= -1 <return>    This sets the overflow lock
! g <return>
Line 0 (and the WHOLE system) is now locked. Wait 5 seconds and hit the <BREAK> key. This drops in the debugger.
V!
1.102;2 <return>
V! 1.102;2 .FFFF= 0 <return>
V! e <return>
V! g <return>
The Monitor Debugger is now enabled on line 0.

- Finding your line number. To determine the PIB number on which the debugger is running, do the following:
Break into the debugger.
B! p.18;2 <return>
This displays in hexadecimal the pib number +1

- Restarting the virtual machine after an abort early in the boot stage. If the virtual machine boot aborts very early (during or right after the 'Diagnostic' message), after having corrected the error when possible, the Boot can be restarted quickly by doing (the line 0 must have been started with the '-D' option):
! Hit the <BREAK> key to enter the Monitor debugger
B! g3.0 <return>
This should redisplay the message 'Diagnostic ...' and proceed.

Precautions

A process started with the -D option has some special privileges, which might lead to data destruction if used indiscriminately:
- Access to the virtual machine will always be granted, even if the Initialization lock is set. In particular, it would be possible to start a user process while the line 0 is in the process of initializing the virtual machine. Therefore, always make sure that the line 0 has reached, at least, the 'Diagnostics...' stage before starting a process.
- A process with debugger privilege may by-pass all locks, including virtual locks! When changing data structures, be sure that nobody is accessing the virtual machine, or, better, set a semaphore, as shown above, to prevent concurrent access. For the same reason, use debugger only on one line at a time.
- When the memory is full, the debugger can abort with the message 'MEM FULL'. Do a flush and retry until it succeeds.
- When the system runs in 'single user' (with a lock set by the command s1+) and with the lock by-pass activated, do not try to shut down the system with the TCL command shutdown. Instead, go to the monitor debugger, do a flush (f) and kill the flush process (kf). This terminates all active processes.
- When changing virtual memory with the debugger, it is a good idea to flush memory frequently, using the f command. |
| Options | |
| See Also | |
| Example | |
| Warnings | |
| Compatibility |