Friday, February 1, 2013



DataStage Common Errors and Solutions



1.      Failed to authenticate the current user against the selected Domain: Could not connect to server.

RC:     Client has invalid entry in host file
Server listening port might be blocked by a firewall
Server is down

SOL:   Update the hosts file on the client system so that the server hostname can be resolved from the client.
Make sure the WebSphere TCP/IP ports are opened on the firewall.
Make sure the WebSphere application server is running. (OR)
Restart the WebSphere services.
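
A quick way to verify each of these causes from the command line (a sketch; the hostname is a placeholder, and 9080 is assumed here as the default WebSphere HTTP port on your install):

           ping dsserver.example.com          # hypothetical server hostname; confirms the client hosts file / DNS entry resolves
           telnet dsserver.example.com 9080   # assumed default WebSphere port; a refused or hanging connection points to a firewall or a stopped server
           ps -ef | grep WebSphere            # on the server, confirm the WebSphere application server process is running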

2.      The connection was refused or the RPC daemon is not running (81016)

RC:     The dsrpcd process must be running in order to be able to log in to DataStage.
If you restart DataStage but the socket used by dsrpcd (default is 31538) was busy, dsrpcd will fail to start. The socket may be held by dsapi_slave processes that were still running, or had only recently been killed, when DataStage was restarted.

SOL:   Run "ps -ef | grep dsrpcd" to confirm the dsrpcd process is not running.
Run "ps -ef | grep dsapi_slave" to check if any dsapi_slave processes exist. If so, kill them.
Run "netstat -a | grep dsprc" to see if any processes have sockets that are ESTABLISHED, FIN_WAIT, or CLOSE_WAIT. These will prevent the dsprcd from starting. The sockets with status FIN_WAIT or CLOSE_WAIT will eventually time out and disappear, allowing you to restart DataStage.
Then Restart DSEngine.       (if above doesn’t work) Needs to reboot the system.
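
Put together, a minimal check-and-restart sequence (a sketch; paths assume the default DSEngine location used elsewhere in this post, and the PID comes from the ps output):

           ps -ef | grep dsrpcd              # confirm dsrpcd is not running
           ps -ef | grep dsapi_slave         # list leftover slave processes
           kill -9 <pid>                     # kill any dsapi_slave still holding the socket
           netstat -a | grep dsrpc           # wait until no sockets remain in FIN_WAIT or CLOSE_WAIT
           cd /opt/ibm/InformationServer/server/DSEngine
           . ./dsenv                         # source the DataStage environment
           ./bin/uv -admin -stop             # stop the DSEngine
           ./bin/uv -admin -start            # start the DSEngine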


3.      To save DataStage logs in Notepad or another readable format

SOL:   a) /opt/ibm/InformationServer/server/DSEngine  (go to this directory)
                ./bin/dsjob  -logdetail project_name job_name >/home/dsadm/log.txt
           b) In the Director client: Project tab → Print → select the "Print to file" option and save it to a local directory.

4.      "Run time error '457'. This Key is already associated with an element of this collection."
SOL:   Rebuild the repository objects:

a)     Login to the Administrator client
b)     Select the project
c)      Click on Command
d)     Issue the command ds.tools
e)     Select option ‘2’
f)       Keep clicking next until it finishes.
g)     All objects will be updated.

5.      To stop DataStage jobs at the Linux level

SOL:   ps -ef | grep dsadm
           (to check the process IDs of phantom jobs)
           kill -9 process_id
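
For example (the PID is hypothetical, taken from the ps output):

           ps -ef | grep dsadm    # note the process IDs of the phantom jobs
           kill -9 24680          # hypothetical PID from the ps listing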

6.      To run DataStage jobs from the command line

SOL:   cd  /opt/ibm/InformationServer/server/DSEngine
           ./bin/dsjob -server $server_nm -user $user_nm -password $pwd -run $project_nm $job_nm
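
A worked example with placeholder values (the server, credentials, project and job names are all hypothetical; -jobinfo is the standard dsjob option for checking the result afterwards):

           cd /opt/ibm/InformationServer/server/DSEngine
           ./bin/dsjob -server dsserver -user dsadm -password secret -run MyProject MyJob
           ./bin/dsjob -server dsserver -user dsadm -password secret -jobinfo MyProject MyJob    # check the job status after the run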



7.      Failed to connect to JobMonApp on port 13401.

SOL:   Restart the jobmoninit script (in /opt/ibm/InformationServer/Server/PXEngine/Java):
           Type    sh jobmoninit start $APT_ORCHHOME
           Add a "127.0.0.1 localhost" entry to the /etc/hosts file
           (without the localhost entry, the Job Monitor will be unable to use the ports correctly)
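
A minimal restart-and-verify sequence (the PXEngine path is taken from above; the netstat check simply confirms that something is now listening on 13401):

           cd /opt/ibm/InformationServer/Server/PXEngine/Java
           sh jobmoninit start $APT_ORCHHOME
           netstat -an | grep 13401      # confirm JobMonApp is now listening
           grep 127.0.0.1 /etc/hosts     # confirm the localhost entry exists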

8.   While running the ./NodeAgents.sh start command, the following error appears: "LoggingAgent.sh process stopped unexpectedly"

SOL:   Kill the LoggingAgentSocketImpl process:
              ps -ef | grep LoggingAgentSocketImpl   (OR)
              ps -ef | grep Agent   (to find the process ID of the above)
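
A sketch of the kill-and-retry sequence (the PID shown is hypothetical):

           ps -ef | grep LoggingAgentSocketImpl   # note the PID
           kill -9 13579                          # hypothetical PID from the ps listing
           ./NodeAgents.sh start                  # then retry starting the node agents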

9.     Warning: A sequential operator cannot preserve the partitioning of input data set on input port 0
SOL:    Clear the preserve partition flag before Sequential file stages.


10.     Warning: A user defined sort operator does not satisfy the requirements.
  SOL:   Check the order of the sort key columns, and make sure the same column order is used in the Join stage when it follows the Sort stage to join the two inputs.

11.     Conversion error calling conversion routine timestamp_from_string data may have been lost. xfmJournals,1: Conversion error calling conversion routine decimal_from_string data may have been lost

SOL:    Check for the correct date or decimal format, and for null values, in the date or decimal fields before passing them to the DataStage StringToDate, DateToString, DecimalToString or StringToDecimal functions.

12.     To display all jobs from the command line

SOL:   cd /opt/ibm/InformationServer/Server/DSEngine/bin
./dsjob -ljobs <project_name>

13.      “Error trying to query dsadm[]. There might be an issue in database server”
SOL:   Check XMETA connectivity:
db2 connect to xmeta   (this may fail with "A connection to or activation of database "xmeta" cannot be made because of BACKUP pending")
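
If db2 reports the BACKUP pending condition, taking a backup of xmeta clears it. A sketch, assuming you run it as the DB2 instance owner and that /backup is a writable target directory:

           db2 connect to xmeta              # fails while the database is in BACKUP PENDING state
           db2 backup db xmeta to /backup    # hypothetical backup target path
           db2 connect to xmeta              # should now succeed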

14.      “DSR_ADMIN: Unable to find the new project location”
SOL:   The Template.ini file might be missing from /opt/ibm/InformationServer/Server.
           Copy the file from another server.

15.      “Designer LOCKS UP while trying to open any stage”
SOL:   Double-click on the stage that locks up DataStage
           Press ALT+SPACE
           When the window menu pops up, select Restore
           The properties window should now be visible
           Click "X" to close this window
           Now double-click the stage again and check whether the properties window appears.

16.      “Error setting up internal communications (fifo RT_SCTEMP/job_name.fifo)”
SOL:   Remove the locks and try to run the job (OR)
           Restart the DSEngine and try to run it (OR)
Go to /opt/ibm/InformationServer/server/Projects/proj_name/
ls RT_SCT*  then
           rm -f RT_SCTEMP
           then try to run the job again.

17.      While attempting to compile a job:  "failed to invoke GenRunTime using Phantom process helper"
RC:     /tmp space might be full
           The job status is incorrect
           There are format problems with the project's uvodbc.config file
SOL:   a)         Clean up the /tmp directory
           b)        DS Director → Job → Clear Status File
           c)         Confirm uvodbc.config has the following entry/format:
                       [ODBC SOURCES]
                       <local uv>
                       DBMSTYPE = UNIVERSE
                       Network  = TCP/IP
                       Service =  uvserver
                       Host = 127.0.0.1
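
Quick command-line checks for (a) and (c) above (the project path is a placeholder):

           df -k /tmp                                                               # confirm /tmp is not full
           cat /opt/ibm/InformationServer/server/Projects/proj_name/uvodbc.config   # compare against the entry shown above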

DataStage Client/Server Connectivity


Connection from a DataStage client to a DataStage server is managed through a mechanism based upon UNIX remote procedure calls. DataStage uses a proprietary protocol called DataStage RPC,
which consists of an RPC daemon (dsrpcd) listening on TCP port 31538 for connection requests from DataStage clients.
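
On the server, a quick check that the daemon is up and listening on its default port (a sketch; 31538 is the default quoted above):

           ps -ef | grep dsrpcd          # the RPC daemon process
           netstat -an | grep 31538      # should show a socket in LISTEN state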

Before dsrpcd gets involved, the connection request goes through an authentication process. Prior to version 8.0 this was the standard operating system authentication, based on a supplied user ID and password
(an option existed on Windows-based DataStage servers to authenticate using Windows LAN Manager, supplying the same credentials as those used on the DataStage client machine; this option was removed in
version 8.0). From version 8.0 onwards, authentication is handled by the Information Server through its login and security service.

Each connection request from a DataStage client asks for connection to the dscs (DataStage Common Server) service. The dsrpcd (the DataStage RPC daemon) checks its dsrpcservices file to determine whether there is an entry for that service and, if there is, establishes whether the requesting machine's IP address is authorized to request the service. If all is well, the executable associated with the dscs service (dsapi_server) is invoked.

DataStage Processes and Shared Memory

Each dsapi_server process acts as the "agent" on the DataStage server for its own particular client connection, among other things managing traffic and the inactivity timeout.  If the client requests access to the Repository, then the dsapi_server process will fork a child process called dsapi_slave to perform that work.
Typically, therefore, one would expect to see one dsapi_server and one dsapi_slave process for each connected DataStage client.  Processes may be viewed with the ps -ef command (UNIX) or with Windows Task Manager.
Every DataStage process attaches to a shared memory segment that contains lock tables and various other inter-process communication structures. Further, each DataStage process is allocated its own private
shared memory segment. At the discretion of the DataStage administrator there may also be shared memory segments for routines written in the DataStage BASIC language and for character maps used for National
Language Support (NLS). Shared memory allocation may be viewed using the ipcs command (UNIX) or the shrdump command (Windows). The shrdump command ships with DataStage; it is not a native Windows command.
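
On UNIX, for example, the per-client processes and the shared memory segments can be listed with:

           ps -ef | grep dsapi    # one dsapi_server and one dsapi_slave per connected client
           ipcs -m                # shared memory segments attached by DataStage processes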

RUN-TIME

Now let us turn our attention to run time, when DataStage jobs are executing. The concept is straightforward: DataStage jobs can run even when no clients are connected (there is a command line interface, dsjob, for requesting job execution and for performing various
other tasks).

However, server jobs and parallel jobs execute totally differently.  A job sequence is a special case of a server job, and executes in the same way as a server job.

Server Job Execution

        Server jobs execute on the DataStage server (only), in a shell called uvsh (or dssh, a synonym). The main process that runs the job executes a DataStage BASIC routine called DSD.RUN; the name of this program shows in a ps -ef listing (UNIX). This program interrogates the local Repository to determine the runtime configuration of the job, what stages are to be run and their interdependencies. When a server job includes a Transformer stage, a child process is forked from uvsh, also running uvsh but this time executing a DataStage BASIC routine called DSD.StageRun. Server jobs only ever have uvsh processes at run time, except where the job design specifies opening a new shell (for example sh in UNIX or DOS in Windows) to perform some specific task; these will be child processes of uvsh.
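
So while a server job is running, a process listing on the server shows the uvsh processes and the BASIC programs they are executing (a sketch):

        ps -ef | grep uvsh       # the main job process running DSD.RUN, plus one per Transformer stage running DSD.StageRun
        ps -ef | grep DSD.RUN    # narrows the listing to the main job processes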

Parallel Job Execution

        Parallel job execution is rather more complex. When the job is initiated, the primary process (called the “conductor”) reads the job design, which is a generated Orchestrate shell (osh) script. The conductor also reads the parallel execution configuration file specified by the current setting of the APT_CONFIG_FILE environment variable. Based on these two inputs, the conductor process composes the “score”, another osh script that specifies what will actually be executed. (Note that the degree of parallelism is not determined until run time: the same job might run on 12 nodes in one run and 16 nodes in another.)

        This automatic scalability is one of the features of the parallel execution technology underpinning Information Server (and therefore DataStage). Once the execution nodes are known (from the configuration file), the conductor causes a coordinating process called a “section leader” to be started on each node, either by forking a child process if the node is on the same machine as the conductor, or by remote shell execution if it is on a different machine (things are a little more dynamic in a grid configuration, but essentially this is what happens). Each section leader process is passed the score and executes it on its own node, and is visible as a process running osh. Section leaders’ stdout and stderr are redirected to the conductor, which is solely responsible for logging entries from the job.

        The score contains a number of Orchestrate operators. Each of these runs in a separate process, called a “player” (the metaphor clearly is one of an
orchestra). Player processes’ stdout and stderr are redirected to their parent section leader. Player processes also run the osh executable. Communication between the conductor, section leaders and player
processes in a parallel job is effected via TCP. The port numbers are configurable using environment variables. By default, communication between the conductor and section leader processes uses port 10000 (APT_PM_STARTUP_PORT), and communication between player processes on different nodes uses port 11000 (APT_PLAYER_CONNECTION_PORT).
To find all the processes involved in executing a parallel job (they all run osh), you need to know the configuration file that was used. This can be found from the job's log, which is viewable using the Director client or the dsjob command line interface.
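
For example, while a parallel job is running you can list its processes, and the default ports can be relocated through the environment variables mentioned above (the values shown are simply the defaults):

           ps -ef | grep osh                         # conductor, section leaders and players all run osh
           export APT_PM_STARTUP_PORT=10000          # conductor <-> section leader port (default)
           export APT_PLAYER_CONNECTION_PORT=11000   # player <-> player port (default)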



To Release Locks in DATASTAGE



There are three methods to unlock DataStage jobs:

     – Using the DataStage Administrator tool

     – Using the UV utility

     – Using DataStage Director


Unlock jobs -Using DataStage Administrator Tool:

      • Log into Administrator (as dsadm)

      • Open the Command line for the project

      • Execute “LIST.READU EVERY”

      • Identify the values for INODE and USER columns for the job for which the locks need to be released.

      • Execute UNLOCK INODE <inode number> ALL

      • Execute UNLOCK USER <user number> ALL (see the worked example below)
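
A worked example of the full sequence (the INODE and USER numbers are hypothetical, read from the LIST.READU EVERY output):

      LIST.READU EVERY
      UNLOCK INODE 48291 ALL
      UNLOCK USER 31002 ALL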


Unlock jobs -Using UV Utility:

     • Log on to the Unix server

     • su to dsadm

     • Go to the corresponding project directory

     • Type “uv” (to start the UV shell)


     • Type “DS.TOOLS”

     • Select the option “4” (4. Administer processes/locks >>)

     • Again select option “4” (4. List all locks)

     • Identify the values for PID columns for the job for which the locks need to be released.

     • Then select option “7” (7. Clear locks held by a process)

     • Give the PID value at the prompt (Enter pid#=) and press Enter (see the sketch below).
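
The command-line part of this, as a sketch (paths assume the default install locations used elsewhere in this post; proj_name is a placeholder):

      su - dsadm
      cd /opt/ibm/InformationServer/server/Projects/proj_name
      . /opt/ibm/InformationServer/server/DSEngine/dsenv      # source the DataStage environment (assumed default location)
      /opt/ibm/InformationServer/server/DSEngine/bin/uv       # start the UV shell, then type DS.TOOLS at the prompt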

Unlock jobs -Using DataStage Director:

    – Open DataStage Director

    – Go to the Job

    – Cleanup Resources

    – In the Job Resources window,

    – Select “Show All” (Processes)

    – Find your User Name and click on “Logout”

