IBM Parallel Environment for AIX IBM Messages Version 2 Release 4 GC28-1982-02
Note: Before using this information and the product it supports, be sure to read the general information under “Notices”.

Third Edition (October 1998)

This edition applies to Version 2, Release 4, Modification 0 of IBM Parallel Environment for AIX, program number 5765-543, and to all subsequent releases and modifications until otherwise indicated in new editions or technical newsletters. Order publications through your IBM representative or the IBM branch office serving your locality.
Contents
Notices
Trademarks
About This Book
    Who Should Use This Book
    How to Use This Book
    Overview of Contents
    Typographic Conventions
    Abbreviated Names
iv IBM PE for AIX V2R4.
Notices References in this publication to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only IBM's product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any of IBM's intellectual property rights may be used instead of the IBM product, program, or service.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries:
    AIX
    ESCON
    IBM
    LoadLeveler
    Micro Channel
    RISC System/6000
    RS/6000
    SP
Adobe, Acrobat, Acrobat Reader, and PostScript are trademarks of Adobe Systems, Incorporated.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States and/or other countries.
About This Book
This book is designed to help any user of IBM Parallel Environment for AIX (PE) who needs to know what a message means and what should be done in response to that message. This book lists all of the error messages generated by the PE software and components and describes a likely solution. This book assumes that AIX and the X-Windows system are already installed, if required. For information on installing AIX and X-Windows, consult IBM AIX for RS/6000 Installation Guide, SC23-2341.
Type Style        Used For

bold              Bold words or characters represent system elements that you must use literally, such as command names, program names, file names, and flag names. Bold words also indicate the first use of a term included in the glossary.

italic            Italic words or characters represent variable values that you must supply. Italics are also used for book titles and for general emphasis in text.

Constant width    Examples and information that the system displays appear in constant width typeface.
Short Name    Full Name
STDIN         standard input
STDOUT        standard output
US            User Space
VT            Visualization Tool

Related Publications

Parallel Environment (PE) Publications
As an alternative to ordering the individual books, you can use SBOF-8588 to order the entire PE library.
IBM Parallel System Support Programs for AIX: Installation and Migration Guide, GA22-7347
IBM Parallel System Support Programs for AIX: Diagnosis Guide, GA22-7350
IBM Parallel System Support Programs for AIX: Command and Technical Reference, SA22-7351
IBM Parallel System Support Programs for AIX: Messages Guide, GA22-7352
As an alternative to ordering the individual books, you can use GBOF-8587 to order the entire IBM RS/6000 SP software library.
Online Information Resources
If you have a question about the SP, PSSP, or a related product, the following online information resources make it easy to find the information:
Access the new SP Resource Center by issuing the command:
    /usr/lpp/ssp/bin/resource_center
Note that the ssp.resctr fileset must be installed before you can do this.
If you have the Resource Center on CD-ROM, see the readme.txt file for information on how to run it.
Access the RS/6000 Web Site at: http://www.
Enhanced Job Management Function
In earlier releases of PE, POE relied on the SP Resource Manager for performing job management functions. These functions included keeping track of which nodes were available or allocated and loading the switch tables for programs performing User Space communications. LoadLeveler, which had only been used for batch job submissions in the past, is now replacing the Resource Manager as the job management system for PE.
Chapter 1.
Chapter 2. pdbx Messages

0029-0101 Your program has been loaded.
Explanation: This message is issued when your program has been loaded into the tasks in the partition. This message indicates all the functions available in pdbx are available for you to use.
User Response: When this message is displayed, you can start debugging the tasks in the partition.
0029-1003 Missing or invalid argument following the -d flag. For information on the correct syntax to use when invoking pdbx, type: pdbx -h
Explanation: The -d flag requires an integer argument that specifies the nesting depth limit of program blocks.
User Response: Specify an integer. Note that this overrides the default nesting depth limit of 25 program blocks.

0029-1005 Unable to read command file specified by the -c flag.
0029-2002 Could not add the groups events (breakpoints or tracepoints) to task: number, because this task is RUNNING.
Explanation: Since the task was RUNNING and not available for debug commands, pdbx could not add the group events (breakpoints or tracepoints) for this task. It is possible to continue but the group breakpoints will not have been set for this task.
User Response: Issue the group list or tasks command to check the state of the tasks.
0029-2015 Could not open socket for debugger to communicate with poe.
Explanation: The socket() call failed when the debugger tried to set up communications with POE.
User Response: Debugging can continue except that the information about synchronized exit will not be passed back to the debugger from the POE job. Please note that the debugger will most likely not be able to re-attach to this POE job after detaching.

0029-2016 Could not make socket connection to poe.
0029-2022 Task: number has already been loaded with a program.
Explanation: The task number that you specified has already been loaded.
User Response: Specify another task that has not been loaded. Issue the group list or tasks command to check the state of the tasks. The tasks in NOT LOADED state are the ones that still need to be loaded with a program.
0029-2030 The correct syntax is: 'group add group_name member_list'. A member list can contain space or comma-separated task numbers, or ranges of task numbers separated by colons or dashes. Specify the group name as a string of alphanumeric characters that starts with an alphabetic character.
Explanation: Invalid syntax for the pdbx group add command.
User Response: Consult the man pages for the pdbx group command and re-specify the command.
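The member list syntax described in message 0029-2030 can be illustrated with a short Python sketch. This is not pdbx code: the function name parse_member_list is hypothetical, and rejecting a descending range such as 6:4 is an assumption, since the message text does not say how pdbx treats it.

```python
import re


def parse_member_list(text):
    """Expand a member list such as '0 2,4:6 8-9' into sorted task numbers."""
    tasks = set()
    # Members are separated by spaces or commas.
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        # A member is one task number, or two joined by a colon or a dash.
        match = re.fullmatch(r"([0-9]+)(?:[:-]([0-9]+))?", token)
        if match is None:
            raise ValueError(f"invalid member: {token!r}")
        first = int(match.group(1))
        last = first if match.group(2) is None else int(match.group(2))
        if last < first:
            # Assumption: a descending range is treated as an error.
            raise ValueError(f"descending range: {token!r}")
        tasks.update(range(first, last + 1))
    return sorted(tasks)
```

For instance, parse_member_list("0 2,4:6 8-9") expands both range forms and returns the task numbers 0, 2, 4, 5, 6, 8, and 9.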
0029-2037 Cannot add task: number, because it is already in group string.
Explanation: The task specified on the group add command is already included in the group specified.
User Response: Retry the command specifying only task(s) that are not already included within the specified group.

0029-2038 No action has been taken because a task number is out of range.
Explanation: The task specified on the group or on command is not an acceptable value.
0029-2045 Group string has been renamed to string.
Explanation: You have given a new group name to a previously existing group.
User Response: Note that the old group name no longer exists.

0029-2046 The correct syntax is: 'group delete group_name [member_list]'. A member list can contain space or comma-separated task numbers, or ranges of task numbers separated by colons or dashes. Specify the group name as a string of characters.
0029-2052 Internal error in string : number - No action was taken because the group has no members.
Explanation: You issued the group list on an empty group.
User Response: This is an internal error, retry the command. If the problem persists, gather information about it and follow local site procedures for reporting hardware and software problems.

0029-2053 Valid group actions are: add, change, delete and list.
Explanation: You issued a group command with invalid syntax.
0029-2060 The correct syntax is: 'source filename'.
Explanation: The source command requires exactly one argument. It was issued with no arguments or with more than one.
User Response: Re-issue the source command with only one argument.

0029-2061 Cannot open the command file that was specified on the source command.
Explanation: The source command has been issued with a filename that either does not exist or has no read permission.
User Response: Make sure the file exists and has read permission.
0029-2069 Reading command file string.
Explanation: The debugger has started reading the command file specified by the -c command line flag, the source command or as a result of having a .pdbxinit file in the current working directory or your home directory.
User Response: None. This is an informational message.

0029-2070 command file line number: string
Explanation: The debugger displays each line of the command file as it is read showing the line number and the text.
0029-2076 There are no tasks in DEBUG READY state (active).
Explanation: The response to the active command is that there are no tasks that are ready to be debugged. This is to say that there are no tasks that are active with respect to the debugger.
User Response: None. This is an informational message.

0029-2077 Command string is not valid when using pdbx.
Explanation: pdbx does not allow the use of this command.
0029-2085 The dbx prompt modifier is too long; the maximum length is number.
Explanation: The dbx prompt modifier string that you specified using the command line -dbxpromptmod flag or the MP_DBXPROMPTMOD environment variable was too long.
User Response: Reset the MP_DBXPROMPTMOD environment variable or retry the pdbx command with a shorter string following the -dbxpromptmod flag.

0029-2086 Event: number cannot be deleted because it does not exist in the specified or current context.
0029-2102 The sh command with no arguments is not allowed.
Explanation: You issued the sh command with no arguments, which is not allowed.
User Response: Issue the sh command with a specific executable name supplied. For example: sh ls.

0029-2103 The requested command could not be executed on the specified context because at least one task in that context is currently RUNNING.
0029-2109 No action taken on task(s): string, because they have either been stopped by the debugger, finished executing, or have been unhooked.
Explanation: The tasks listed were not RUNNING. These tasks may already be under the control of the debugger because of a breakpoint or step command. They could also have finished execution or be unhooked.
User Response: None. This is an informational message.
0029-2117 Group string has been deleted.
Explanation: You issued the group delete command and the group has been successfully deleted.
User Response: None. This is an informational message.

0029-2118 No action was taken because task(s): string are currently RUNNING, and because the specified group has breakpoints or tracepoints set for it. Only tasks in the DEBUG READY state can be added to a group which has group breakpoints or tracepoints set.
0029-2123 This event cannot be set because some task(s) in the group are unhooked.
Explanation: You issued a trace or stop command against a group which contains some task(s) that are unhooked.
User Response: The hook command can be used to regain debugger control of previously unhooked tasks. Alternatively, you can create another group which does not contain any tasks that are in the unhooked state.

0029-2124 Could not add event to task: number, because it is in state: string.
0029-2130 No action was taken because the group name specified was null.
Explanation: You issued one of the group commands, but no group name was provided.
User Response: Choose a group name of at most 32 characters that starts with an alphabetic character and is followed by alphanumeric characters.

0029-2131 All tasks have exited. Issue quit then restart the debugger if you wish to continue debugging.
Explanation: All the tasks of the partition have exited.
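The group name rule given in the User Response for message 0029-2130 can be checked before a group command is issued. The following Python sketch is illustrative only: the function name is hypothetical, and treating "alphabetic" and "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re

# Limit quoted in the User Response for message 0029-2130.
MAX_GROUP_NAME_LENGTH = 32


def is_valid_group_name(name):
    """Return True if name starts with a letter, continues with letters
    or digits only, and is at most 32 characters long."""
    if len(name) > MAX_GROUP_NAME_LENGTH:
        return False
    # Assumption: only ASCII letters and digits count as alphanumeric.
    return re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", name) is not None
```

An empty name fails the pattern, which corresponds to the null group name case that message 0029-2130 reports.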
-F This flag can be used to turn off lazy reading mode. Turning lazy reading mode off forces the remote dbx sessions to read all symbol table information at startup time. By default, lazy reading mode is on. Lazy reading mode is useful when debugging large executable files, or when paging space is low. With lazy reading mode on, only the required symbol table information is read upon initialization of the remote dbx sessions.
0029-9041 Cannot locate attach configuration file "string".
Explanation: pdbx was unable to locate the attach configuration file.
User Response:
1. Make sure that the correct POE process id was used when invoking the debugger.
2. Check the /tmp directory for the existence of a configuration file containing the POE process id. (For example, check for /tmp/.ppe.34192.attach.cfg).

0029-9042 No tasks listed in attach configuration file.
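The second check in the User Response for message 0029-9041 can be scripted. The Python sketch below assumes that the file name always follows the pattern of the single example given (/tmp/.ppe.34192.attach.cfg, where 34192 is the POE process id). The helper names are hypothetical.

```python
import os


def attach_config_path(poe_pid, tmp_dir="/tmp"):
    """Build the expected attach configuration file name for a POE process id.

    Assumption: the name follows the pattern of the example in message
    0029-9041, that is .ppe.<pid>.attach.cfg under /tmp.
    """
    return f"{tmp_dir}/.ppe.{poe_pid}.attach.cfg"


def attach_config_exists(poe_pid, tmp_dir="/tmp"):
    """Return True if the expected attach configuration file is present."""
    return os.path.isfile(attach_config_path(poe_pid, tmp_dir))
```

If attach_config_exists returns False for the process id you passed to the debugger, first confirm that the id really belongs to the POE job, as the first step of the User Response suggests.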
0029-9049 The following environment variables have been ignored since they are not valid when starting the debugger in attach mode - string
Explanation: Some of the environment variables the user has set are not valid when starting pdbx in attach mode. A message is given indicating what variables have been ignored. The debugger continues.
User Response: Note any environment variables of interest that have been ignored.
Chapter 3. pedb Messages

0030-0002 string : Data Display data is not attached to data window [number][number].
Explanation: Cannot access information to update the data window.
User Response: Further data viewing will be limited.

0030-0013 Range index value number is out of bounds. The index value must be within the range between number and number.
Explanation: You have entered an index that is not within the range of acceptable values for the array selected.
0030-0034 No source file is available to edit.
Explanation: pedb could not locate a source file to edit. Pressing the edit button causes an edit window to be displayed containing the file that is currently displayed in the pedb source window. Since there is no source file to edit, the edit window will not be displayed.
User Response: Normal pedb processing will continue.

0030-0035 No task(s) selected.
Explanation: The context has not been set to a task or a task group.
0030-0050 An invalid value: string was specified for the Play Delay. Please enter non-negative integer value. If you click on Cancel, the new delay field will be reset to the previous value of number.
Explanation: An invalid value for the play delay has been entered. Only non-negative integers are valid.
User Response: Specify a non-negative integer value for the play delay in tenths of seconds.

0030-0051 string number: Could not resolve mouse selection to a stack entry.
0030-0057 Task number has been detached.
Explanation: A reply was received from the debug engine (dbe) that indicated the specified task has been detached.
User Response: None. This is an informational message.

0030-0058 Attached to task number.
Explanation: The specified task has been attached by the debugger.
User Response: None. This is an informational message.

0030-0059 Debugger attached and ready.
0030-0065 Could not open socket for debugger to communicate with poe.
Explanation: The socket() call failed when the debugger tried to set up communications with POE.
User Response: Debugging can continue except that the information about synchronized exit will not be passed back to the debugger from the POE job. Please note that the debugger will most likely not be able to re-attach to this POE job after detaching.

0030-0066 Could not make socket connection to poe.
0030-0072 All tasks have exited. Select Ok to detach.
Explanation: All the tasks in the partition have completed program execution. Selecting Ok causes pedb to detach from the program and exit. An alternative would be to click on Cancel and then select the Quit option from the File pulldown menu. Please note that this method would kill the POE job as well as causing pedb to exit.
0030-0109 string searched to the top/end of the file and did not find string
Explanation: This message is formatted dynamically from the string you are searching for, and the direction of the search. Message format is: Searched to the limit of the file and did not find string. For example: User specifies a string of my_variable in this find window. If using the First or Next option, the message text will be: Searched to the end of the file and did not find my_variable.
0030-0114 Array string on task string, thread string has a different number of dimensions. It is excluded from the export.
Explanation: The array with a matching array name on the specified task and thread does not meet the match criteria and is excluded from the export.
User Response: The user must be aware of the match criteria when trying to allow multiple matching arrays to be exported at the same time.
0030-0121 The MPI application has not been run in debug mode; therefore, there will be no data on blocking calls and no timestamp information.
Explanation: Some MPI debugging data is only collected when MPI is run in DEBUG mode.
User Response: See the documentation concerning the setting of the MP_EUIDEVELOP environment variable.

0030-0122 Could not create a new request record.
0030-0130 Could not get message group information.
Explanation: An error occurred while attempting to retrieve group information for an MPI message record.
User Response: If the error persists, cancel and restart the message queue debugging feature.

0030-0131 Could not get message details for task task.
Explanation: An error occurred while attempting to retrieve message detail information for an MPI message record.
0030-2209 Task number has requested exit.
Explanation: The indicated task has attempted to exit. The program terminates when all tasks have requested exit.
User Response: None. This is an informational message.

0030-2212 The group was not added because the first character in the group name specified was not an alphabetic character.
Explanation: The new group name specified in the Add Group Window started with a character that was not alphabetic.
0030-2219 No members were chosen.
Explanation: When attempting to add a new group, you did not choose any tasks as its members.
User Response: Select members for the new group.

0030-2220 Too many members were specified.
Explanation: When attempting to add a new group, there were too many members chosen.
User Response: Select fewer members for the new group.

0030-2221 Cannot delete group ALL.
Explanation: Removing the group ALL is not allowed.
User Response: None.
0030-2230 No Items were selected.
Explanation: The user selected Apply or Ok on the Variable Selection window without choosing any variables to be displayed.
User Response: None. This is an informational message.

0030-2232 Could not locate source file: string for task: number.
Explanation: pedb could not locate a source file to correspond with the current program state in this task. Consequently no source file for this task will appear in the source file window.
0030-2241 Task number loaded with string string.
Explanation: Describes what executable and arguments were loaded for a particular task.
User Response: None. This is an informational message.

0030-2242 Unable to send command to task 'number '.
Explanation: An error occurred in sending a pedb command to the indicated task. The remote node is probably no longer accessible.
User Response: Verify that the remote node in the partition can be contacted by other means.
0030-2259 Unable to write to the directory string.
Explanation: pedb was not able to write to the directory specified. This is the directory that is used to write the temporary files used in visualization.
User Response: Check the permissions of the directory. pedb uses this directory for temporary files. The default is /tmp. This can be overridden using the MP_TMPDIR environment variable.

0030-2260 Unable to parse the stack trace, placing task: number in exited state.
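The permission check suggested for message 0030-2259 can be sketched in Python as follows. The helper names are hypothetical. The only facts taken from the message are the /tmp default and the MP_TMPDIR override. How pedb itself resolves the directory is not described here, so treat this as an approximation.

```python
import os


def visualization_tmpdir(environ=None):
    """Return the directory named by MP_TMPDIR, or /tmp if it is unset."""
    environ = os.environ if environ is None else environ
    return environ.get("MP_TMPDIR", "/tmp")


def is_writable_directory(path):
    """Return True if path is a directory in which the current user
    can create files (write and search permission)."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)
```

Calling is_writable_directory(visualization_tmpdir()) before starting pedb gives an early indication of whether message 0030-2259 is likely to appear.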
0030-2267 HDF Failure: Unable to write array slice.
Explanation: An error has occurred while trying to write the array slice to the HDF file. You may encounter this error when the HDF file is corrupted or when your file system is full.
User Response: Select a different file to export to and check file system space.

0030-2268 HDF Failure: Unable to close writing to file string.
Explanation: An error has occurred while trying to close writing of array slices to the HDF file.
0030-2276 A non-integer value has been entered for the stride.
Explanation: A non-integer value was entered in the text field that specifies the stride value.
User Response: Enter an integer value.

0030-2277 Zero has been entered for the stride. Enter a non-zero integer value.
Explanation: The stride value must be a non-zero integer.
User Response: Enter an integer value that is non-zero.
0030-2285 Task number is not in DEBUG state. It is excluded from the export.
Explanation: A task must be in DEBUG state to be able to participate in an export.
User Response: If the user does not care that the task was excluded from the multi array export, the message can be ignored. If the user wants the array from the task to be included in the export, the user must put the task in DEBUG state prior to exporting.
0030-2292 You cannot Export at this time because the program stack has changed since you created this window. The chosen array is out of scope.
Explanation: The array that was chosen in the Export window is no longer within scope. The program stack has changed due to an execution command, such as step or continue. The array chosen may no longer exist due to scoping rules.
0030-2297 Please specify a filename in the Export Filename field.
Explanation: No file name has been specified in the Export Filename field of the Export window. It may be that the field is empty or that the field contains only a directory path.
User Response: Please type a file name into the Export Filename field of the Export window before pressing the Export button.
0030-3014 Task number: ReplyExpression(): Internal error returned from unknown callee.
Explanation: Received an error code from a routine that ReplyExpression() called but there was no additional information to pass on.
User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0030-3015 Task: number encountered signal: number - string
Explanation: The application encountered a signal of the type specified.
0030-3021 Play mode has been stopped.
Explanation: Play mode has been terminated by the halt or stop button.
User Response: None. This is an informational message.

0030-3022 Play mode has been started.
Explanation: Play mode has been initiated by the play button.
User Response: None. This is an informational message.

0030-3023 The halt button had no effect on task: number, because it was not running.
Explanation: The halt button was selected.
0030-3028 Task number: Remote debug engine was unable to set the initial breakpoint.
Explanation: The remote debug engine was unable to set the initial breakpoint.
User Response: Check that the file containing the main routine or the program statement has been compiled with the -g option. Check that the MP_DEBUG_INITIAL_STOP environment variable, if used, is set to an executable line of source code.
0030-3035 Task number: The breakpoint request failed. An invalid source line or invalid condition was specified.
Explanation: A source line in the source code window has been selected, and a breakpoint request has been made for that line. The line selected may not have generated any executable code when compiled. If a condition was specified, it may have been invalid. No action has been taken.
User Response: Select another source line or specify a different condition.
0030-3043 Task number: The executable chosen for debugging did not have execute permission.
Explanation: The remote debugger attempted to find the program to execute on a task.
User Response: Update the permissions on the program file on the remote node.

0030-3044 Task number: The executable chosen for debugging is not a RS/6000 executable.
Explanation: The remote debugger could not find the program to execute on a task.
Flags:
-a Attaches to a running POE job by specifying its process id. The debugger must be executed from the node from which the POE job was initiated. Note that when using the debugger in attach mode there are some debugger command line arguments that should not be used. In general, any arguments that control how the partition is set up, or specify program names and arguments, should not be used.
-d Sets the limit for the nesting of program blocks.
Chapter 4. POE Messages

0031-001 No man page available for poe
Explanation: User has requested that the poe man page be displayed (via the -h option), but the /usr/man/cat1/poe.1 file does not exist, or some directory in the path leading to the file is not searchable.
User Response: Check that the file exists and that all directories in the path leading to the file are searchable. The pedocs fileset may need to be installed if the file doesn't exist.
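The check described in the User Response for message 0031-001 (every directory leading to the file must be searchable) can be sketched in Python as follows. The function name is hypothetical. The sketch reports the first directory, starting from the root, that is missing or lacks search (execute) permission for the current user.

```python
import os


def first_unsearchable_directory(file_path):
    """Return the first directory on the way to file_path that is missing
    or not searchable by the current user, or None if all are searchable."""
    directory = os.path.dirname(os.path.abspath(file_path))
    ancestors = []
    # Collect the directory and all of its parents, ending at the root.
    while True:
        ancestors.append(directory)
        parent = os.path.dirname(directory)
        if parent == directory:
            break
        directory = parent
    # Walk from the root downward so the outermost problem is reported.
    for candidate in reversed(ancestors):
        if not os.access(candidate, os.X_OK):
            return candidate
    return None
```

For example, first_unsearchable_directory("/usr/man/cat1/poe.1") returns None when every directory on the path is searchable. In that case the remaining possibility named in the message is that the file itself does not exist.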
0031-019 pm_contact: connect failed
Explanation: The Partition Manager terminates.
User Response: The Partition Manager is unable to connect to a remote node. Message 0031-020 follows. Probable PE system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-020 Couldn't connect to task number (string)
Explanation: The Partition Manager terminates.
0031-029 Caught signal number (string), sending to tasks...
Explanation: The indicated signal is not used specifically by Partition Manager, and is being passed on to each remote task.
User Response: Verify that the signal was intended.

0031-031 task number is alive
Explanation: The message is sent from the indicated task in response to signal SIGUSR2.
User Response: Verify that the signal was intended.

0031-032 exiting...
0031-041 sigaction(SIGIOT)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-042 sigaction(SIGEMT)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-043 sigaction(SIGFPE)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
0031-052 sigaction(SIGCONT)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-053 sigaction(SIGCHLD)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-054 sigaction(SIGTTOU)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
0031-063 sigaction(SIGDANGER)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-064 sigaction(SIGVTALRM)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
Cause: The return from sigaction for the indicated signal is negative.

0031-065 sigaction(SIGMIGRATE)
Explanation: An explanatory sentence follows. The Partition Manager terminates.
0031-078 invalid retrytime
Explanation: The -retrytime option was neither a 0 nor a positive number.
User Response: Correct the flag.

0031-079 invalid pmlights
Explanation: The -pmlights option was neither a 0 nor a positive number.
User Response: Correct the flag.

0031-080 invalid usrport
Explanation: The -usrport option was neither a 0 nor a positive number less than 32768.
User Response: Correct the flag.
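The rules stated in messages 0031-078 through 0031-080 can be expressed as one small check. This Python sketch is illustrative, not POE code: the function name is hypothetical, and reading "number" as a decimal integer written with ASCII digits is an assumption.

```python
def is_zero_or_positive(text, upper_bound=None):
    """Return True if text is 0 or a positive decimal integer and,
    when upper_bound is given, is less than upper_bound."""
    # Assumption: only plain ASCII digits are accepted (no sign, no spaces).
    if not (text.isascii() and text.isdigit()):
        return False
    value = int(text)
    return upper_bound is None or value < upper_bound
```

A value for -retrytime or -pmlights would be tested with is_zero_or_positive(value), and a value for -usrport with is_zero_or_positive(value, 32768), matching the limit quoted in message 0031-080.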
0031-092 MP_PROCS not set correctly
Explanation: The MP_PROCS environment variable is not a positive number.
User Response: Correct the variable.

0031-093 MP_INFOLEVEL not set correctly
Explanation: The MP_INFOLEVEL environment variable is neither 0 nor a positive number less than 32768.
User Response: Correct the variable.

0031-094 MP_TRACELEVEL not set correctly
Explanation: The MP_TRACELEVEL environment variable is neither 0 nor a positive number less than 32768.
0031-103 Invalid MP_TTEMPSIZE
Explanation: The MP_TTEMPSIZE environment variable specifies too large a trace file (or an invalid number).
User Response: Reduce or correct the size.

0031-104 Incorrect MP_TTEMPSIZE unit
Explanation: The MP_TTEMPSIZE environment variable is not of the form numberM or numberG.
User Response: Correct the flag.

0031-105 Invalid MP_TPERMSIZE
Explanation: The MP_TPERMSIZE environment variable specifies too large a trace file (or an invalid number).
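The numberM or numberG form named in message 0031-104 can be checked with a short pattern match. The Python sketch below is illustrative: the function name is hypothetical, and it assumes that the number is a whole decimal number and that only the uppercase suffixes M and G are accepted. The maximum size behind message 0031-103 is not stated here, so no upper limit is checked.

```python
import re


def parse_trace_size(text):
    """Split a value such as '10M' into (10, 'M').

    Raises ValueError when the value is not of the form numberM or numberG.
    """
    # Assumption: whole decimal number followed by an uppercase M or G.
    match = re.fullmatch(r"([0-9]+)([MG])", text)
    if match is None:
        raise ValueError(f"not of the form numberM or numberG: {text!r}")
    return int(match.group(1)), match.group(2)
```

A value such as 10K, or 10 with no unit, raises ValueError, which corresponds to the condition reported by message 0031-104.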
0031-117 Unable to contact Resource Manager
Explanation: The Partition Manager was unable to contact the Resource Manager to allocate nodes of the SP.
User Response: Check that the Resource Manager is running. Otherwise, gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-125 Fewer nodes (number) specified in string than tasks (number).
Explanation: There was a larger number of nodes specified than what is defined in the host.list file.
User Response: Check that you haven't specified a number of nodes greater than the number of physical compute nodes in your RS/6000 SP or RS/6000 network cluster. Otherwise, wait until later when the needed number of nodes is available.
0031-135 Invalid labelio option, should be YES or NO
Explanation: A labelio other than YES or NO was entered.
User Response: Re-specify labelio with either YES or NO.

0031-136 Invalid MP_NOARGLIST option, should be YES or NO
Explanation: The Partition Manager terminates.
User Response: Enter YES or NO for MP_NOARGLIST.
0031-143 Could not read message from debug socket.
Explanation: The call to read() failed when attempting to read a message from the debug socket.
User Response: None.

0031-144 error creating directory for core files, reason:
Explanation: A corefile directory could not be created for the given reason.
User Response: Correct the problem given as the reason and rerun the job.
0031-150 Unable to load shared objects required for Resource Manager
Explanation: The execution environment specified use of the Resource Manager, but one or more of the following shared objects did not exist in /usr/lpp/ssp/lib or /usr/lib:
  jm_client.shr.o
  libjm_client.a
  libSDRs.a
See IBM Parallel Environment for AIX: Operation and Use, Volume 1 for more information about the execution environment.
0031-156 Unexpected return code number from ll_get_data (number).
Explanation: An internal error has occurred.
User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-157 Couldn't flush VT traces
Explanation: At termination, the Partition Manager was unable to successfully terminate trace processing for VT. The program continues.
User Response: Check any messages issued by VT.
0031-172 I/O buffer overflow
Explanation: The stdout or stderr string overflows the output buffer (8K). The excess is discarded.
User Response: Probable internal error. Normally, the output is automatically flushed if it exceeds the buffer length. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-207 pmd: sigaction
Explanation: Error when setting up to handle a signal.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-208 pmd: fork
Explanation: The pm daemon is unable to fork to execute the user application.
User Response: Probable system error.
0031-217 POE (number), pmd (number), and dbe (number) versions are incompatible.
Explanation: The versions of POE, pmd, and the debug engine (dbe) are incompatible.
User Response: You should check that POE, pmd, and dbe are at compatible PE version levels. If necessary, install compatible versions.

Partition manager daemon not started by LoadLeveler on node string.
0031-252 task number stopped: string
Explanation: The indicated task has been stopped. The second variable in this message indicates the signal that stopped the task.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-253 Priority adjustment call failed: rc = number, errno = number
Explanation: The call to start the priority adjustment process failed.
0031-260 Invalid entry in /etc/poe.priority file for user string, class string; priority adjustment function not started
Explanation: In attempting to start the dispatching priority adjustment function, there was no entry for the user and class found in the /etc/poe.priority file for this task. Most likely, the entry is missing or in error. Normal application execution continues, although the priority adjustment function will not be run.
0031-307 remote child: error restoring stdin.
Explanation: The previously closed stdin cannot be restored.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-308 Invalid value for string: string
Explanation: Indicated value is not a valid setting for the indicated environment variable or command line option.
User Response: Set to a valid value and rerun.
0031-312 The checkpoint file string already exists in the working directory.
Explanation: While attempting to checkpoint the program, an existing version of the checkpoint file was found in the working directory. Execution is terminated.
User Response: Check the name of the file specified by the MP_CHECKFILE and MP_CHECKDIR environment variables. If necessary, remove the previous version of the file.

0031-313
Explanation: The internal routine setExecInfo failed.
0031-320 Error occurred saving file information during checkpointing. Return code is number.
Explanation: An error occurred attempting to save the file information for the data segment while checkpointing the program.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-327
Explanation: An error occurred saving the stack data while checkpointing the program.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-328
Explanation: An error occurred writing the footer data while checkpointing the program.
User Response: Probable system error.
0031-335 SSM subtype not what was expected
Explanation: An internal error was detected where an unexpected message type was returned. The remote node terminates.
User Response: Probable PE error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-336 Error with VT_trc_set_params_c.
Explanation: An internal error was detected after trying to set up the VT trace parameters. The remote node terminates.
0031-343
Explanation: An error occurred during opening the checkpoint file directory while checkpointing the program.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-344
Explanation: Local checkpoint/restart files were not found. As a result, restart of the program is not possible.
0031-350 Error occurred comparing environment variables during restart. Return code is number.
Explanation: The original POE and MPI environment variables do not match those contained in the program to be restarted. As a result, the program cannot be restarted.
User Response: Make sure the contents of the checkpoint files specified by the MP_CHECKDIR and MP_CHECKFILE environment variables are valid for the previously checkpointed parallel program.
0031-357
Explanation: An error occurred during opening the checkpoint file directory while restoring the program.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-358
Explanation: An internal error in pm_SSM_read occurred while trying to read the messages during the synchronization of POE tasks, while restoring a previously checkpointed file.
0031-364 Contacting LoadLeveler to string information for string job.
Explanation: Informational message to user to indicate that LoadLeveler is being used for the interactive or batch job.
User Response: None required.

0031-365
Explanation: LoadLeveler either could not run the interactive job for the reason indicated, or, LoadLeveler terminated the interactive job for the reason indicated.
0031-371 Conflicting specification for -msg_api, using string.
Explanation: A batch job using POE was submitted to LoadLeveler with a network statement in the Job Command File that contained a specification for messaging API that was different than the specification provided to POE via the MP_MSG_API environment variable or the -msg_api command line option. The specification used in this case will be that which appeared in the network statement.
0031-377 Using string for euidevice.
Explanation: Informational message to indicate the message passing device being used for the batch POE job submitted to LoadLeveler.
User Response: None required.

0031-378
Explanation: User has submitted a POE job in batch mode under LoadLeveler and the SP_NAME environment variable or associated command-line option was set.
0031-403 Forcing dedicated adapter for User Space job
Explanation: User explicitly specified User Space job using -euilib us or MP_EUILIB=us and poe is making sure the adapter usage requested from the Resource Manager is dedicated. This can also occur if no euilib was specified and the execution environment resulted in an implicit User Space job.
User Response: None required.
3 -- Could not get hostname
4 -- Nameserver could not resolve host
5 -- Socket error
6 -- Could not connect to host
7 -- Could not send command to remote startd
User Response: Check pathname and permissions for /etc/pmdv2. Retry; if problem persists, gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-416 string: no response; rc = number
Explanation: An error occurred on reading data from remote node to home node.
User Response: This is an IP communication error between home and remote node. No acknowledgement of startup was received from the pmd daemon running on the indicated node. Check for error message from that node.
0031-605 Unexpected EOF on allocation file for task number
Explanation: There were not enough entries in the hostfile for the number of processes specified.
User Response: Lower the number of processes or add more entries to the hostfile.
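The condition behind message 0031-605 is that the host file runs out of entries before every task has been assigned a node. The Python sketch below illustrates that check under a simplifying assumption that is not taken from this manual: each non-blank line counts as one entry. The function name is hypothetical, and the real host list syntax may allow more than this.

```python
def hosts_for_tasks(hostfile_lines, nprocs):
    """Return one host entry per task, or raise if the file has too few entries.

    hostfile_lines is an iterable of text lines; blank lines are skipped.
    """
    entries = [line.strip() for line in hostfile_lines if line.strip()]
    if len(entries) < nprocs:
        raise ValueError(
            f"only {len(entries)} host entries for {nprocs} processes"
        )
    return entries[:nprocs]
```

When the check fails, the two remedies are the ones the message gives: lower the number of processes or add entries to the file.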
0031-613 Unable to send command to task number
Explanation: An error occurred in sending the poe command to the indicated task. Probably the remote node is no longer accessible. POE terminates.
User Response: Verify that the remote node in the partition can be contacted by other means. If problem persists, gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-620 pm_SSM_write failed in sending the user/environment for taskid number
Explanation: The internal pm_SSM_write function failed. The system terminates.
User Response: Probable PE error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-621 pm_SSM_write failed in sending the partition map information for taskid number
Explanation: The internal pm_SSM_write function failed. The system terminates.
0031-627 Task number connection blocked. Task will be abandoned.
Explanation: While shutting down the partition, POE was unable to write to the indicated task, because the socket was blocked. The socket and task are subsequently ignored and the shutdown continues.
User Response: Often this means that a remote node is not responding. The tasks running on this node must be terminated manually. Verify that the node can be contacted by other means.
0031-632 Can't connect to PM Array. errno = number
Explanation: POE tried to connect to the Program Marker Array tool, but was unsuccessful. The system error number is returned. Most likely, the Program Marker Array has not been started.
User Response: If the Program Marker Array is not being used, ignore this message. Otherwise, terminate POE, start the pmarray, and restart POE.
0031-639 Exit status from pm_respond = number
Explanation: The pm_respond function exited with the indicated status.
User Response: If other error messages occurred, perform corrective action indicated for the message(s); otherwise, no action is required.

0031-641 Unrecoverable failure in Resource Manager, terminating partition...
Explanation: A non-zero return code was returned from the SP Resource Manager message interpretation function.
0031-646 PM Array is trying to tell us something ...
Explanation: A message from PM Array is directed to the Home Node. At present there are no Home Node functions responding to the PM Array, so the message text is just printed out.
User Response: Verify that the PM Array tool is working correctly.

0031-647 string
Explanation: This is the message buffer text from PM Array as described in message 0031-646.
User Response: Verify that the PM Array tool is working correctly.
0031-653 Couldn't route data from STDIN to task number
Explanation: An error occurred routing STDIN to the indicated task. The partition is terminated and POE exits.
User Response: Verify that the remote task is active. If the problem persists, gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-660 Partition Manager stopped ...
Explanation: The Home Node (POE) has stopped in response to a SIGTSTP (Ctrl-Z) signal. The remote nodes have been stopped.
User Response: To resume the job, issue SIGCONT, or use the shell job control commands fg or bg.

0031-661 signal_sent = number not recognized
Explanation: The indicated signal was recorded as being sent to the remote nodes, but is not recognized by POE. Execution continues.
User Response: Probable POE internal error.
0031-668 pm_io_command: error in pm_SSM_write, rc = number
Explanation: An error occurred while responding to a STDIO MODE QUERY message. The response is abandoned.
User Response: Probable POE internal error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-669 Can't acknowledge IO_command sync
Explanation: A socket error occurred while broadcasting a synchronization request acknowledgment.
0031-676 Invalid value string for mp_euidevice
Explanation: The mp_euidevice specified on the command line with -euidevice or in the environment with MP_EUIDEVICE is not valid.
User Response: Refer to IBM AIX Parallel Environment Operation and Use for valid choices and rerun.

0031-677 Unexpected return code number from _mp_stdoutmode
Explanation: An error may have occurred in a lower level function.
0031-688 Incorrect subtype number received in structured socket message
Explanation: Internal error has occurred.
User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-689 Unexpected return code number from _mp_stdoutmode_query
Explanation: An error may have occurred in a lower level function.
User Response: If earlier error messages exist, perform whatever corrective action is indicated for these.
0031-703 invalid nprocs argument
Explanation: The nprocs argument received by the pm daemon is invalid.
User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-704 invalid newjob argument
Explanation: The newjob argument received by the pm daemon is invalid.
User Response: Probable system error.
0031-712 parent error reading STDIN, rc = number
Explanation: pm daemon parent was unable to read STDIN.
User Response: Probable system error.

0031-713 pmd parent: error w/ack for sig req to home
Explanation: pm daemon parent had error sending ack for sig request.
User Response: Probable system error.

0031-714 pmd parent: error writing to child's STDIN
Explanation: pm daemon parent was not able to write to its child's STDIN.
User Response: Probable system error.
0031-723 userid =
Explanation: userid is set to the given userid.
User Response: No response needed.

0031-724 Executing program:
Explanation: The child is executing the given program.
User Response: No response needed.

0031-725 Failed to exec program string; errno = number
Explanation: The child failed to execute the given program.
User Response: Probable system error. POE's /usr/lpp/ppe.poe/lib/libc.a may not be up to date.
0031-731 Error getting and setting DFS credentials.
Explanation: The PMD called the poe_dce_set function to get and set the current context for establishing the DFS/DCE credentials when it encountered an error. poe_dce_set should have issued additional messages describing the errors.
User Response: Contact the system administrator to ensure the DFS/DCE credentials are properly set up.
0031-804 -pgmmodel string ignored in remote child
Explanation: -pgmmodel interpreted only in parent code.
User Response: No response needed.

0031-805 Invalid programming model specified: string
Explanation: -pgmmodel should be either SPMD or MPMD.
User Response: Re-enter -pgmmodel with either SPMD or MPMD.

0031-806 Invalid retry count string
Explanation: Retry count should be an integer.
User Response: Re-enter -retry followed by an integer.
0031-903 Can't confirm profiling for task number
Explanation: A communication failure has occurred.
User Response: Retry; if problem persists, gather information about the problem and follow local site procedures for reporting hardware and software problems.

0031-904 Can't rename profiling file to string
Explanation: A communication failure may have occurred, or the profiling file could not be opened.
User Response: Check path name and permissions.
0031-A401 Error in binding socket
Explanation: The program pmarray terminates. An explanatory sentence is appended.
User Response: Probable system error. Check the condition(s) given in the explanatory sentence.

0031-A402 Error in listen
Explanation: The program pmarray terminates. An explanatory sentence is appended.
User Response: Probable system error. Check the condition(s) given in the explanatory sentence.

0031-A403 Error in accept
Explanation: The program pmarray terminates.
Chapter 5. MPI Messages

0032-001 Invalid source task (number) in string, task number
Explanation: The value of src (source task ID) is out of range.
User Response: Make sure that the source task id is within the range 0 to N-1, where N is the number of tasks in the partition.

0032-002 Invalid destination task (number) in string, task number
Explanation: The value of dest (destination task id) is out of range.
0032-011 Invalid qtype value (number) in string, task number
Explanation: The value specified for qtype is invalid.
User Response: Make sure that qtype is either 1, 2, or 3.

0032-012 Invalid nelem value (number) in string, task number
Explanation: The value specified for nelem is invalid.
User Response: Make sure that nelem is not less than 0.

0032-013 Out of memory, task number
Explanation: There is insufficient memory available to continue.
0032-020 Invalid task id (number) in string, task number
Explanation: The value specified for taskid is out of range.
User Response: Make sure that taskid is within the range 0 to N-1, where N is the number of tasks in the partition.

0032-021 Invalid task rank (number) in string, task number
Explanation: The value specified for rank is out of range.
User Response: Make sure that rank is within the range 0 to N-1, where N is the number of tasks in the group referenced by gid.
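The range rule shared by messages 0032-020 and 0032-021 (a value from 0 to N-1) can be checked before a call is made. The helper below is a hypothetical Python sketch of that rule only. It is not part of any MPI or PE interface, and N must be supplied by the caller as the size of the partition or of the group, whichever applies.

```python
def check_task_id(value, ntasks, name="taskid"):
    """Return value if it lies in 0..ntasks-1; raise ValueError otherwise."""
    if not 0 <= value < ntasks:
        raise ValueError(f"{name} {value} is outside 0..{ntasks - 1}")
    return value
```

With four tasks, the identifiers 0 through 3 pass, while 4 and any negative value are rejected.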
0032-030 Inconsistent flag value in string, task number
Explanation: The same value of flag was not specified by each task in the group.
User Response: Make sure that each task specifies the same flag value.

0032-031 Inconsistent gsize value in string, task number
Explanation: The same value of gsize was not specified by each task in the group.
User Response: Make sure that each task specifies the same gsize value.
0032-051 Invalid count argument
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_COUNT

0032-052 Invalid datatype argument
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-058 Invalid group
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_GROUP

0032-059 Invalid operation
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-065 Known error not in this list
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_OTHER

0032-066 Internal MPI error
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-072
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_INFO

0032-073
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-079 File exists.
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_FILE_EXISTS

0032-080
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-086
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
User Response: See the entry for the specific error code returned by the MPI function.
Error Class: MPI_ERR_DUP_DATAREP

0032-087
Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred.
0032-105 Invalid group handle (number) in string, task number
Explanation: The specified group handle is undefined or NULL.
User Response: Make sure that the group handle is either predefined or was returned by an MPI function.
Error Class: MPI_ERR_GROUP

0032-106 Negative length or position for buffer (number) in string, task number
Explanation: The values specified for buffer size and position must be positive.
0032-113 Out of memory in string, task number
Explanation: There is insufficient memory available to continue.
User Response: Reduce the size of user storage required per task.
Error Class: MPI_ERR_INTERN

0032-114 Internal error: string in string, task number
Explanation: An internal software error occurred during execution.
User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0032-120 Declaration has upper bound < lower bound (number) in string, task number
Explanation: No datatype can be defined with negative extent (upper bound less than lower bound).
User Response: Make sure any MPI_LB or MPI_UB argument to MPI_Type_struct is consistent with the layout being defined.
Error Class: MPI_ERR_ARG

0032-121 Invalid rank (number) in string, task number
Explanation: The value specified for rank is out of range.
0032-128 Inconsistent root node (number) in string, task number
Explanation: The participants in a collective operation did not all specify the same value for root.
User Response: Make sure that root is identical for all tasks making the call.
Error Class: MPI_ERR_NOT_SAME

0032-129 Can't use an intercommunicator (number) in string, task number
Explanation: Only intra-communicators are valid with this function.
User Response: Make sure that comm is a handle for an intra-communicator.
0032-136 Invalid communicator (number) in string, task number
Explanation: The value used for communicator is not a valid communicator handle.
User Response: Make sure that the communicator is valid (predefined or created by an MPI function) and has not been freed by MPI_Comm_free.
Error Class: MPI_ERR_COMM

0032-137 Invalid keyval (number) in string, task number
Explanation: The value used for keyval is not a valid attribute key handle.
0032-143 Invalid dimension count (number) in string, task number
Explanation: The value specified for ndims is invalid.
User Response: Make sure that the number of dimensions is greater than zero.
Error Class: MPI_ERR_DIMS

0032-144 There is no solution in string, task number
Explanation: There is no set of dimensions which satisfies the conditions required by a call to MPI_Dims_create.
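Messages 0032-143 and 0032-144 both concern the arguments to MPI_Dims_create. The MPI standard describes the call as erroneous when nnodes is not a multiple of the product of the non-zero entries of dims. The Python sketch below applies that rule with a hypothetical function name. Whether the PE library applies exactly this test is an assumption, since this section does not spell out the conditions.

```python
from math import prod


def dims_have_solution(nnodes, dims):
    """Return True if the zero entries of dims can be filled in for nnodes.

    dims uses 0 for "let the library choose" and a positive value for a
    dimension the caller has fixed.
    """
    if len(dims) < 1:
        raise ValueError("the number of dimensions must be greater than zero")
    if any(d < 0 for d in dims):
        raise ValueError("dims entries must not be negative")
    fixed = prod(d for d in dims if d > 0)  # product of caller-fixed dimensions
    return nnodes % fixed == 0
```

Under this reading, 12 nodes with dims (0, 3) can be completed, while 7 nodes with dims (2, 0) cannot.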
0032-150 MPI is not initialized in string, task number
Explanation: A call to an MPI function other than MPI_Init or MPI_Initialized was made before MPI was initialized.
User Response: Call MPI_Init before any other MPI function other than MPI_Initialized.
Error Class: MPI_ERR_OTHER

0032-151 MPI is already finalized in string, task number
Explanation: A call to an MPI function was made when MPI was in the finalized state.
0032-157 Invalid request handle (number) in string, task number
Explanation: The value specified is not a valid request handle.
User Response: Make sure that the request handle was returned by an MPI function.
Error Class: MPI_ERR_REQUEST

0032-158 Persistent request already active (number) in string, task number
Explanation: An attempt was made to start a persistent request when the request was already active.
User Response: Complete the request by calling MPI_Wait, MPI_Test, etc.
0032-164 Delete callback failed in string, task number
Explanation: A non-zero return code was returned by the delete callback function associated with an attribute keyval. The specific value returned by the delete callback function is not available via MPI.
User Response: Make sure that user-defined delete callback functions are functioning correctly, and are returning MPI_SUCCESS upon successful completion.
0032-172 Invalid color (number) in string, task number
Explanation: A negative value was used for color.
User Response: Make sure that color is greater than or equal to zero, or is MPI_UNDEFINED.
Error Class: MPI_ERR_ARG

0032-173 Invalid node degree (number) in string, task number
Explanation: A negative value was used for an element of the index array.
User Response: Make sure that the index array contains only non-negative entries.
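The color rule in message 0032-172 comes from communicator splitting, where tasks that pass the same color end up in the same new communicator, ordered by key and then by their old rank. The following Python model illustrates that grouping without calling MPI. The `UNDEFINED` sentinel and the function name are hypothetical stand-ins; the actual value of MPI_UNDEFINED is implementation defined.

```python
UNDEFINED = object()  # hypothetical stand-in for MPI_UNDEFINED


def split_by_color(requests):
    """Group old ranks by color.

    requests[rank] is the (color, key) pair supplied by that rank.
    Returns {color: [old ranks ordered by key, ties broken by old rank]}.
    """
    groups = {}
    for rank, (color, key) in enumerate(requests):
        if color is UNDEFINED:
            continue  # this task joins no new communicator
        if color < 0:
            raise ValueError(f"invalid color {color} from rank {rank}")
        groups.setdefault(color, []).append((key, rank))
    return {color: [rank for _, rank in sorted(members)]
            for color, members in groups.items()}
```

A task that should belong to no new communicator passes the sentinel rather than a negative color.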
0032-178 A negative number of triplets was specified (number) in string, task number
Explanation: The number of range triplets specified must be positive. A zero is accepted as a valid number though calling the range include or exclude function with zero ranges is probably not useful.
User Response: Correct the number of ranges argument.
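For context on message 0032-178, each range triplet passed to the range include or exclude functions is a (first, last, stride) tuple that expands to the ranks first, first+stride, and so on up to last. The Python sketch below models that expansion with a hypothetical function name. It takes the count n explicitly, as the C interface does, so that the negative-count case can be shown.

```python
def expand_ranges(n, ranges):
    """Expand the first n (first, last, stride) triplets into a list of ranks."""
    if n < 0:
        raise ValueError(f"a negative number of triplets was specified ({n})")
    ranks = []
    for first, last, stride in ranges[:n]:
        if stride == 0:
            raise ValueError("stride must not be zero")
        # range() excludes its stop value, so step one past `last`
        # in the direction of travel to make the triplet inclusive.
        stop = last + 1 if stride > 0 else last - 1
        ranks.extend(range(first, stop, stride))
    return ranks
```

With n set to zero the result is simply an empty list, which matches the note that zero ranges are accepted but rarely useful.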
0032-184 MPI was not finalized in string, task number
Explanation: An MPI program exited without calling MPI_Finalize. The parallel job is terminated with an error exit code.
User Response: Correct the program and/or determine if the program terminated abnormally, perhaps via a library routine calling exit(0) after encountering an error condition.
0032-189 Datatype extent cannot be expressed as an integer or MPI_Aint in string, task number
Explanation: A call to create a user-defined datatype would create a type with an extent or true extent set by MPI_LB or MPI_UB whose magnitude is too great to be expressed by an integer or MPI_AINT.
User Response: Restructure the program to use datatypes of smaller magnitude.
0032-254 MP_SINGLE_THREAD is set in a multi-threaded program, detected in string, task number
Explanation: The MP_SINGLE_THREAD environment variable is set, but multiple threads are executing.
User Response: Unset the MP_SINGLE_THREAD environment variable and rerun the program.
Error Class: MPI_ERR_OTHER

0032-255
Explanation: The datatype given is a named predefined datatype which cannot be decoded.
0032-282 Invalid info key number (number) in string, task number.
Explanation: The info key number specified must be between 0 and N-1, where N is the number of keys currently defined in the info argument.
User Response: Correct the info key number argument.
Error Class: MPI_ERR_ARG

0032-283
Explanation: The info handle provided does not represent a valid MPI_Info object.
0032-306 Unclosed files when finalizing string, task number.
Explanation: There are still open files when MPI_FINALIZE is called.
User Response: Make sure that all files are closed prior to calling MPI_FINALIZE.
Error Class: MPI_ERR_OTHER

0032-307
Explanation: You did not specify a documented MP_ environment variable.
User Response: Contact IBM service.
0032-313 Invalid grid size (number) in string, task number.
Explanation: The cartesian grid of processes defined by arguments ndims and array_of_psizes to MPI_TYPE_CREATE_DARRAY() has a size different from argument size.
User Response: Correct either the value of the size argument or the values of the array_of_psizes elements.
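The consistency rule behind message 0032-313 is that the process grid described by array_of_psizes must contain exactly size processes. A minimal Python sketch of that arithmetic follows. The function name is hypothetical and the check is a pre-call sanity test, not the library's own validation.

```python
from math import prod


def check_grid_size(size, array_of_psizes):
    """Return the grid size if it equals size; raise ValueError otherwise."""
    grid = prod(array_of_psizes)  # number of processes in the cartesian grid
    if grid != size:
        raise ValueError(
            f"grid of {grid} processes does not match size {size}"
        )
    return grid
```

For instance, a 2 by 3 grid is consistent only with a size argument of 6.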
0032-320 Invalid displacement (number) in string, task number.
Explanation: A negative displacement has been specified.
User Response: Modify the value of the disp argument.
Error Class: MPI_ERR_ARG

0032-321
Explanation: The user does not have the required access permissions on the file.
User Response: Modify file access permissions.
0032-329 Pending I/O operations when setting file size string, task number.
Explanation: The file size is being set while there are still pending I/O operations on the file.
User Response: Modify the program so that all I/O operations are complete prior to setting the file size.
Error Class: MPI_ERR_OTHER

0032-330
Explanation: A negative offset has been specified.
User Response: Modify the value of the offset argument.
0032-338 Inconsistent elementary datatypes string, task number
Explanation: The file view is being set and the elementary datatypes specified by the participating processes do not have the same extent.
User Response: Modify the elementary datatypes and make sure they have the same extent on all processes.
Error Class: MPI_ERR_NOT_SAME

0032-339
Explanation: The file being opened does not reside in a file system of a supported type.
0032-405 Internal fsync failed (number) in string, task number.
Explanation: An internal call to fsync() failed.
User Response: Check the error number and take appropriate action.
Error Class: MPI_ERR_IO

0032-406
Explanation: An internal call to lseek() failed.
User Response: Check the error number and take appropriate action.
Error Class: MPI_ERR_IO

0032-407
Explanation: An internal call to read() failed.
Chapter 6. VT Messages

0033-1001 Node is inactive
Explanation: The node selected for monitoring is not active.
Error Class: The selected square does not represent a node that is communicating with the performance monitor.
User Response: Select a different square.

0033-1002 Monitoring is currently ON. Changes will not be effective until monitoring is stopped and restarted.
Explanation: You have tried to add or remove a node for monitoring while monitoring was in progress.
0033-1008 Accept failed for the PM data collector
Explanation: A connection on the socket for communicating between the Performance Monitor and the dug program could not be accepted.
Error Class: The accept() function failed.
User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-1015 Error number reading node list file string Explanation: Internal Program error. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-1016 Error writing socket during allocation Explanation: Internal Program error. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-1023 dug: socket() failed. Error is string Explanation: The socket for communicating between the Performance Monitor and the dug program could not be created by dug. Error Class: The socket() function failed for the indicated reason. User Response: Correct the specified problem. 0033-1024 dug: socket read failed string, Node= string. Error is string Explanation: The dug statistics collection program encountered an error reading the socket from the statistics gathering daemon.
0033-1030 pm_connect_dug() : tmpnam() failed. Unable to get Unix stream socket pathname. Error is string Explanation: The Performance Monitor was not able to establish a communication channel with the performance statistics collection program, dug. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-1031 pm_connect_dug() Select Failed for the Unix socket.
0033-1037 pm_connect_digq() Connection with digq timed out after number seconds. Explanation: The Performance Monitor did not receive a response from the dig query program, digq, after a reasonable amount of time. Error Class: The digq program may not have been started successfully or may have been terminated after starting. User Response: Ensure that the digq program can be invoked and continues to execute. 0033-1038 pm_read_msg: Socket read failed string.
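The timeout behavior described for 0033-1037 (waiting a bounded number of seconds for a peer to respond) can be sketched with select() on a socket. This is an illustration only, not the Performance Monitor's code; the function name read_with_timeout and its parameters are hypothetical.

```python
import select
import socket
from typing import Optional


def read_with_timeout(sock: socket.socket, timeout_seconds: float,
                      max_bytes: int = 4096) -> Optional[bytes]:
    """Return data received within timeout_seconds, or None if the wait timed out."""
    readable, _, _ = select.select([sock], [], [], timeout_seconds)
    if not readable:
        # Nothing arrived in time: the peer may not have started or may have terminated.
        return None
    return sock.recv(max_bytes)
```

A caller that receives None would report the timeout and check that the peer program can be invoked and is still running, as the User Response suggests.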
0033-1043 digq():Error in getting the Internet address of the local host string. Error is string Explanation: The performance statistics query program, digq, was unable to get the Internet address of the local host. Error Class: The gethostbyname() function failed for the indicated reason. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-1044 digq():Cannot send broadcast packet.
0033-1049 digq() was unable to establish a communication channel to the Performance Monitor. Error is string Explanation: The performance statistics query program, digq, was unable to establish a communication channel to the Performance Monitor. Error Class: The bind() function failed for the indicated reason. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-1055 digq()::broadcast: Unable to get socket interface flags. Error is string Explanation: The performance statistics query program, digq, was unable to get the interface flags of the socket in order to locate broadcast devices. Error Class: The ioctl() function failed for the indicated reason. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-1062 dug: socket read for string failed Error is string Explanation: The dug statistics collection program encountered an error reading the socket connection to the monitor program. Error Class: The monitor program has probably terminated. User Response: Make sure the monitor program is still running and it did not experience a problem trying to communicate with the dug program. 0033-1063 dug: Monitor program closed the socket connection before string was read.
0033-2004 AddHostname() could not malloc number bytes for the first element of the history buffer Explanation: The Performance Monitor could not allocate sufficient storage for the internal representation of the first node to be monitored. Error Class: Insufficient memory is currently available on the system for Performance Monitoring to operate. User Response: Remove some of the concurrently executing processes or add more memory to the system.
0033-2011 Cannot get color for string spectrum Explanation: A color for the indicated spectrum could not be obtained. Error Class: Either the X server where VT is running does not support a named color or no free color cells remain in the colormap. User Response: If the spectrum identified used resources to select the colors, ensure all the colors specified (either by default or in a resource file) are supported by the X server.
0033-2029 Cannot close previous trace file because string Explanation: An error occurred while trying to close the previous trace file. Error Class: The fclose() function failed for the indicated reason. User Response: If the problem appears to be correctable, do so. Otherwise gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-2041 Internal Error make_menu_item: Invalid Item Type Explanation: An internal program error occurred. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-2042 string is not valid for the sampling interval number will be used Explanation: An invalid sampling frequency was specified and the default will be used instead. Error Class: The sampling frequency must be between 1 and 999 seconds.
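The rule behind 0033-2042 (a sampling interval outside 1 to 999 seconds is rejected and a default is used instead) can be sketched as follows. The default value is not stated in this book, so it is passed in as a parameter; the function name parse_sampling_interval is hypothetical.

```python
from typing import Tuple


def parse_sampling_interval(text: str, default_seconds: int) -> Tuple[int, bool]:
    """Return (interval_seconds, used_default) for a user-supplied interval string."""
    try:
        value = int(text.strip())
    except ValueError:
        # Not a whole number at all: fall back to the default.
        return default_seconds, True
    if 1 <= value <= 999:
        return value, False
    # Out of range: fall back to the default, as the message describes.
    return default_seconds, True
```

The boolean in the result tells the caller whether a warning such as 0033-2042 should be shown.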
0033-2049 Event widgets must accept multiple ports. Configuration cannot be created. Explanation: An event/display chain was being created but the specified event widget cannot handle multiple processes. Error Class: The 'single' attribute of the event widget is set to TRUE which indicates it cannot handle multiple processes. User Response: Ensure that the specified event widget can handle multiple processes. 0033-2050 Incompatible events selected. Configuration cannot be created.
0033-2056 Request to open another display failed. Only 20 displays may be open at any time. Close some displays and try again. Explanation: Only 20 displays can be opened at a time. Error Class: An attempt was made to open another display while 20 were already opened. User Response: Close a display before attempting to open another.
0033-2062 Command line option string is not recognized or is missing a required parameter. It will be ignored Explanation: The indicated option is not recognized and the Visualization Tool does not know what to do with it other than ignore it. Error Class: The option may have been misspelled or may just be wrong. User Response: Ensure that valid options are passed to the Visualization Tool as command line options.
0033-2068 Unable to map trace file "string" into memory. Explanation: During an attempt to load a tracefile, although the file existed, was a regular file, and was successfully opened, the file could not be mapped into memory. Error Class: Available data space insufficiently large to hold mapped file data structure. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-2074 Post processing of tracefile string is complete. Details may be found in string. Explanation: During post processing of the named tracefile, information about the tracefile and post processing was logged to the file. See the log file for specific information. Error Class: Post processing of the tracefile completed without reporting any errors. User Response: None. This is an informational message.
0033-3005 Could not obtain current time for timestamp file because string Explanation: The program could not determine the current time of day to write into the timestamp file. Error Class: The gettimeofday() function failed for the indicated reason. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-3022 Client: Cannot open stream socket for Dig Daemon, Err=string Explanation: The parallel application was attempting to create a unix socket with which to talk to the AIX statistics daemon but failed. Error Class: The socket() function failed. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-3068 VT_integrate() Could not open output file "string" Error is string Explanation: While integrating the intermediate trace files, the program could not open an intermediate output file. Error Class: The fopen() function failed for the indicated reason. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-3074 write_data_to_usd_file(), setgid() failed from root to user_gid= number, Can't create AIX trace file, Err=string Explanation: While writing the AIX statistics file to the current working directory, the program was unable to change its group id to the group id of the user that submitted the application program. Error Class: The setgid() function failed for the indicated reason.
0033-3081 PMdig::write() on socket failed in sending version to string. errno= number Explanation: dig disconnected client because of version mismatch. Error Class: May be a version problem. User Response: Check with the system administrator to ensure the same version of the dig program is installed on each node, for the respective version of VT or POE.
0033-3088 VT_trrtn::write_trc_data(), Tracing continued after reducing the max size for temp file from number to number Explanation: Tracing continued even though the write to the temp disk failed. Error Class: Temp Disk full. User Response: Increase the space in the temp disk. 0033-3089 VT_trc_init::get_dir_stat failed for cwd directory string, Error is string Explanation: Cannot access current working directory. Error Class: System Error.
0033-3095 VT_trc_capture write_buffusd_data() Insufficient disk space to write, Tracing stopped. Space left is number, required is number Explanation: Not enough disk space left. Error Class: Insufficient disk space. User Response: Clean the disk. 0033-3096 VT_trc_init() HPSOclk_init string failure, Tracing is disabled. Error is string Explanation: Clock initialization failed during trace initialization. Tracing cannot continue and is disabled at this time.
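Message 0033-3095 compares the space left on disk with the space required. A minimal sketch of the same comparison, useful before starting a run that generates traces, is shown below. It uses Python's standard shutil.disk_usage; the function name has_room_for is hypothetical and this is not the tracing library's own check.

```python
import shutil


def has_room_for(path: str, required_bytes: int) -> bool:
    """Return True if the file system holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes
```

For example, has_room_for("/tmp", 50 * 1024 * 1024) reports whether 50 MB is available in the file system that holds /tmp.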
0033-3101 VT_trc_set_params(): Setting Temp File size to threshold size. Set Size = number, Minimum size = number Explanation: This happens when the user tries to set the size of the temporary file to less than the minimum threshold size. The program then automatically sets the size to the minimum size and continues trace generation. User Response: Change the parameters in the VT_trc_set_params call to be above the threshold value. 0033-3103 VT_trc_set_params(): Setting Buffer size to Threshold size.
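The behavior described for 0033-3101 (a requested size below the minimum threshold is raised to the minimum, and tracing continues) amounts to a simple clamp. The sketch below is illustrative only; the threshold values are not given in this book, so the minimum is a parameter, and the function name effective_size is hypothetical.

```python
from typing import Tuple


def effective_size(requested: int, minimum: int) -> Tuple[int, bool]:
    """Return (size_to_use, was_raised) after applying the minimum threshold."""
    if requested < minimum:
        # Too small: use the minimum and flag that a warning should be issued.
        return minimum, True
    return requested, False
```

A caller can avoid the warning by passing a size at or above the threshold in the first place.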
0033-3109 connect_dig() Select Failed for the Unix socket. Error is string Explanation: The Tracing routine experienced an internal program error on its communication channel. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-3110 Accept failed for the connection from DIG executable.
0033-3116 DIG(), Connection with Application program timed out after number seconds. Explanation: The DIG daemon did not receive a response from the Trace client within a reasonable amount of time. Error Class: The trace client may have died, or there may be a system delay. User Response: Rerun the application. 0033-3117 DIG(), Version Mismatch. Dig version=hex number, Trace Version=hex number Explanation: The version of the DIG daemon does not match the trace routine version of the application program.
0033-3124 Internal program error occurred during trace integration Explanation: During the trace integration portion of the POE job, a required structure was not initialized properly. Error Class: A program error occurred that prevents trace integration. User Response: POE will attempt to continue but all trace data for this job is lost. Report this problem using local support procedures.
0033-3130 Unable to allocate space to store "string", which is the name of the temporary kernel statistics trace file. Explanation: The dig program (which is spawned from the parallel application) could not save the name of the file that it was supposed to write kernel trace records into. Error Class: Probably low paging space. User Response: Check paging space on the remote application during execution. If there is a low amount of available paging space, increase the paging space.
0033-3135 Unable to create temporary name for kernel statistics file. Error is string Explanation: VT trace generation could not create a temporary name for the file to record kernel statistics. The message gives the reason the system call failed. Error Class: Unknown. User Response: Attempt to correct the cause identified in the message. If that is not possible, follow local site procedures for reporting software problems.
0033-4003 string History buffer position should be 0 but is number History buffer will be reset Explanation: During visualization, the internal semaphores of the display have become unsynchronized. Error Class: Internal program error. User Response: For the current visualization session, the display will reset its history buffer so that any previous data values will be lost. Visualization will attempt to continue but the results are suspect.
0033-4100 string Internal program error. The meter height of number is less than the minimum meter height of number. Explanation: During visualization, the program attempted to reallocate the pixmaps used to display the processor label numbers. Error Class: The meter height received was not initialized correctly. User Response: The processing operation will be terminated and VT will exit. Report the problem to local support. 0033-4101 string Internal program error.
0033-4351 Pixmap not created for compressed rectangles. Internal Error in StripGraph::SetXhatchGC(). Explanation: During visualization, an internal error occurred in the StripGraph display. Error Class: Internal Program Error when calling XCreateGC. User Response: The StripGraph display may not display hatched pattern as intended. Visualization will continue but the results may be suspect. The information presented in the message should be reported to local support.
0033-4356 Time index value incorrect. Internal Error in StripGraph::back_in_time_start_x_pos(). stripgraph->back_time_indx is greater than or equal to HIST_BUFF_LEN. HIST_BUFF_LEN = number. Explanation: During visualization, an internal error occurred in the StripGraph display. Error Class: Internal Program Error when checking internal variables. User Response: The StripGraph display cannot locate time in history buffer to start drawing. Visualization cannot continue.
Chapter 7. Xprofiler Messages 2537-0002 No file was specified in Binary Executable File dialog. Explanation: When you are trying to load one or more gmon.out files, you are required to also specify the name of the binary executable file that was executed to produce the gmon.out file(s).
2537-0007 You must first select a function from the list. Explanation: Before using the Utility->Locate in Graph option in either the Flat Profile or Function Index report window, a function in the report must be selected first. This same rule also applies to the Code Display menu options in Flat Profile report. User Response: Before using these options, you must first select a function from the report window.
2537-0011 The selected function's source file name is not available. Explanation: Internal error. There is no source file name associated with the selected function, so no file can be opened. User Response: If you continue to get this error message, gather information about the problem and follow local site procedures for reporting hardware and software problems. 2537-0012 shmat() failed to attach a shared memory segment or a mapped file. errno = number.
2537-0017 There must be at least one space separating the runtime option string and its corresponding value. Explanation: At least one space must be typed between an Xprofiler command-line option and its associated value. For example, -e foo is an acceptable format, with one or more spaces before foo, but -efoo is not. Any command-line options that were specified incorrectly are ignored. User Response: Insert a space between any Xprofiler command-line option and its associated value.
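The spacing rule in 2537-0017 can be sketched as a small argument scanner: an option followed by a separate value is accepted, while an option fused to its value (such as -efoo) is set aside and ignored. This is an illustration under stated assumptions, not Xprofiler's parser; the function name split_options is hypothetical, and only the -e option from the example above is used.

```python
from typing import Dict, Iterable, List, Tuple


def split_options(argv: List[str],
                  value_options: Iterable[str]) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Return (accepted option values, ignored arguments, other arguments)."""
    options = set(value_options)
    accepted: Dict[str, str] = {}
    ignored: List[str] = []
    others: List[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in options and i + 1 < len(argv):
            # Option and value arrived as separate words: accept the pair.
            accepted[arg] = argv[i + 1]
            i += 2
            continue
        if arg in options or any(arg.startswith(opt) for opt in options):
            # Option with no value, or fused to its value (for example -efoo): ignore it.
            ignored.append(arg)
        else:
            others.append(arg)
        i += 1
    return accepted, ignored, others
```

Because the shell collapses runs of spaces between words, one space and several spaces between -e and foo reach the program as the same two arguments.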
2537-0024 Cannot open file string for reading. Check for valid path and file specification and permissions. Explanation: An attempt to read data from the file in the directory that you specified failed, due to the fact that the file cannot be opened for reading. This is because either the file name or a directory in the specified path is invalid, or the file is missing read permission, or a directory in the path is missing execute permission.
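The three causes listed for 2537-0024 correspond to distinct error numbers on a POSIX system, so a failed open can be classified as in the sketch below. The mapping assumes standard errno values (ENOENT, ENOTDIR, EACCES); the function name why_unreadable is hypothetical and this is not Xprofiler's own logic.

```python
import errno
import os
from typing import Optional


def why_unreadable(path: str) -> Optional[str]:
    """Return None if path can be opened for reading, else a likely reason."""
    try:
        with open(path, "rb"):
            return None
    except OSError as exc:
        if exc.errno in (errno.ENOENT, errno.ENOTDIR):
            return "the file name or a directory in the path is invalid"
        if exc.errno == errno.EACCES:
            return ("read permission on the file or execute permission "
                    "on a directory in the path is missing")
        # Any other cause: report the system's own description.
        return os.strerror(exc.errno) if exc.errno is not None else str(exc)
```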
2537-0031 A severe error was detected, and file processing has stopped. Refer to the message window below this window for more details. Explanation: A function involving the symbol tables for your application failed to perform correctly while Xprofiler was trying to process your input files. Xprofiler will not proceed any further, and its main display will be empty. More details regarding the exact nature of the problem appear in a message window below the window for this message.
2537-0037 The gmon.out file count data in the Xprofiler internal table is incorrect. Explanation: Internal error. Xprofiler has an internal table that contains an entry for each gmon.out file that you specified to be loaded for the application you are analyzing. In addition, there is a record for this table that contains the number of gmon.out file entries.
2537-0042 Number of CPU sampling data records in the gmon.out file string is greater than the value in the associated header. Explanation: AIX error. The gmon.out file contains more records of CPU sampling data for one of the functions in your program than indicated by the value in the header data. Header data immediately precedes each set of CPU sampling records. User Response: Execute your program again to generate a new gmon.
2537-0049 Failed to obtain file information about string. Explanation: Failed to obtain information about the specified directory. This is because either the specified path contains a non-existent directory, or one of the parent directories in the path does not have execute permission. User Response: Verify that all directory names in the specified path are valid. If they are, then check to be sure that all directories in the path have execute permission.
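The User Response for 2537-0049 (verify each directory name, then check execute permission on each directory) can be carried out mechanically by walking the path from the root down. The following is a minimal sketch using Python's standard os module; the function name first_bad_component is hypothetical.

```python
import os
from typing import Optional, Tuple


def first_bad_component(path: str) -> Optional[Tuple[str, str]]:
    """Return (prefix, problem) for the first failing path prefix, or None if all pass."""
    target = os.path.abspath(path)
    chain = []
    current = target
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the root
            break
        current = parent
    for prefix in reversed(chain):  # check the root first, the target last
        if not os.path.exists(prefix):
            return prefix, "does not exist"
        is_parent_dir = prefix != target and os.path.isdir(prefix)
        if is_parent_dir and not os.access(prefix, os.X_OK):
            return prefix, "no execute (search) permission"
    return None
```

The first prefix reported is the one to correct; components below it cannot be examined until it is fixed.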
2537-0054 NarcCorrelate() had a negative return value. Explanation: Internal error. The Function Call Tree was unable to be reconstructed by the NARC library, because the node and arc data that this library uses has either been overwritten, or placed in the wrong shared memory location, by Xprofiler. User Response: Exit and re-start Xprofiler, and try to display the Function Call Tree again.
2537-0061 An attempt to access call count information in the gmon.out file, string, failed. Explanation: Internal error. Xprofiler was unable to access the call count data in this gmon.out file. This problem should have been detected during an earlier stage of file processing, instead of being encountered at this time. All CPU sampling data in this file will be included in the information in the main display and all report windows, but the call count data will not.
2537-0066 The following string(s) in the specified configuration file do not follow configuration file syntax key=value: string Explanation: The string in the specified configuration file does not follow configuration file syntax key=value, where key is a configuration keyword such as PROG or FUNC and value is the keyword's corresponding value, such as program-name or function-name.
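The key=value rule named in 2537-0066 can be checked before a configuration file is loaded. The sketch below lists the lines that do not fit the pattern. It makes assumptions this book does not confirm: blank lines are skipped, spaces around the equal sign are tolerated, and a key is a word of letters, digits, and underscores. The function name find_bad_lines is hypothetical.

```python
import re
from typing import List

# Assumed shape of a valid line: KEY=value with a non-empty value.
KEY_VALUE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(\S.*?)\s*$")


def find_bad_lines(text: str) -> List[str]:
    """Return the non-blank lines of text that do not follow key=value syntax."""
    bad = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are skipped (an assumption)
        if not KEY_VALUE.match(stripped):
            bad.append(stripped)
    return bad
```

For a file containing PROG=a.out and FUNC=main followed by a line without an equal sign, only the last line is returned.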
2537-0073 The specified file string is in an un-recognized format, the file can not be processed further. Explanation: The specified file is not in a recognized format; the file can not be processed further. User Response: Make sure that all files used in Xprofiler are in a supported file format.
Glossary of Terms and Abbreviations This glossary includes terms and definitions from: The Dictionary of Computing, New York: McGraw-Hill, 1994. The American National Standard Dictionary for Information Systems, ANSI X3.172-1990, copyright 1990 by the American National Standards Institute (ANSI). Copies can be purchased from the American National Standards Institute, 1430 Broadway, New York, New York 10018. Definitions are identified by the symbol (A) after the definition.
buffer. A portion of storage used to hold input or output data temporarily. C C. A general purpose programming language. It was formalized by ANSI standards committee for the C language in 1984 and by Uniforum in 1983. C++. A general purpose programming language, based on C, which includes extensions that support an object-oriented programming paradigm. Extensions include: strong typing; data abstraction and encapsulation; polymorphism through function overloading and templates; class inheritance.
distributed shell (dsh). A Parallel System Support Programs command that lets you issue commands to a group of hosts in parallel. See IBM Parallel System Support Programs for AIX: Command and Technical Reference for details. an error in, or enhance, a previously installed product. 2) One or more separately installable, logically grouped units in an installation package. See also Licensed Program Product and package. foreign host. See remote host. domain name.
gprof. A UNIX command that produces an execution profile of C, Pascal, Fortran, or COBOL programs. The execution profile is in a textual and tabular format. It is useful for identifying which routines use the most CPU time. See the man page on gprof. GUI (Graphical User Interface). A type of computer interface consisting of a visual metaphor of a real-world scene, often of a desktop. Within that scene are icons, representing actual objects, that the user can access and manipulate with a pointing device.
latency. The time interval between the instant at which an instruction control unit initiates a call for data transmission, and the instant at which the actual transfer of data (or receipt of data at the remote end) begins. Latency is related to the hardware characteristics of the system and to the different layers of software that are involved in initiating the task of packing and transmitting the data. Licensed Program Product (LPP).
P package. A number of filesets that have been collected into a single installable image of program products, or LPPs. Multiple filesets can be bundled together for installing groups of software together. See also fileset and Licensed Program Product. parallelism. The degree to which parts of a program may be concurrently executed. parallelize. To convert a serial program for parallel execution. Parallel Operating Environment (POE).
remote shell (rsh). A command supplied with both AIX and the Parallel System Support Programs that lets you issue commands on a remote host. Report. In Xprofiler, a tabular listing of performance data that is derived from the gmon.out files of an application. There are five types of reports that are generated by Xprofiler, and each one presents different statistical information for an application. Resource Manager. A server that runs on one of the nodes of an IBM RS/6000 SP (SP) machine.
debugger to print information about the state of the program. V trace record. In PE, a collection of information about a specific event that occurred during the execution of your program. For example, a trace record is created for each send and receive operation that occurs in your program (this is optional and may not be appropriate). These records are then accumulated into a trace file which allows the Visualization Tool to visually display the communications patterns from the program. variable.
Communicating Your Comments to IBM IBM Parallel Environment for AIX Messages Version 2 Release 4 Publication No. GC28-1982-02 If you especially like or dislike anything about this book, please use one of the methods listed below to send your comments to IBM. Whichever method you choose, make sure you send your name, address, and telephone number if you would like a reply. Feel free to comment on specific errors or omissions, accuracy, organization, subject matter, or completeness of this book.
Reader's Comments — We'd Like to Hear from You IBM Parallel Environment for AIX Messages Version 2 Release 4 Publication No. GC28-1982-02 You may use this form to communicate your comments about this publication, its organization, or subject matter, with the understanding that IBM may use or distribute whatever information you supply in any way it believes appropriate without incurring any obligation to you.
IBM Program Number: 5765-543 Printed in the United States of America on recycled paper containing 10% recovered post-consumer fiber.