davidressman edited this page Mar 21, 2013 · 6 revisions

Distributed Resource Management Application API

This guide is a tutorial for getting started with DRMAA programming in Python. It is essentially a one-to-one translation of the original C tutorial for Grid Engine. It assumes that you already know what DRMAA is and that you have drmaa-python installed. If not, have a look at Installing. The following code segments are also included in the repository.

Starting and Stopping a Session

The following code segments (example1.py and example1.1.py) show the most basic programs that use the DRMAA Python binding.

#!/usr/bin/env python

import drmaa

def main():
    """Create a drmaa session and exit"""
    s = drmaa.Session()
    s.initialize()
    print('A session was started successfully')
    s.exit()

if __name__ == '__main__':
    main()

The first thing to notice is how errors are reported. In the C original, every call to a DRMAA function returns an error code. The Python binding instead reports failures by raising exceptions. In this tutorial, we do not handle them.

Now let's look at the functions being called. First, on line 7, we create a Session object by calling drmaa.Session(). On the next line we call initialize(), which creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once initialize() has been called successfully, it is the responsibility of the calling application to also call exit() before terminating. If an application does not call exit() before terminating, session state may be left behind in the user's home directory, and the queue master may be left with a dead event client handle, which can decrease queue master performance.

At the end of our program, on line 10, we call exit(). exit() cleans up the session and stops the event client listener thread. Most other DRMAA functions must be called before exit(). Some operations, such as reading the contact attribute, can be performed after exit(), but these only provide general information. Any function that does work, such as runJob() or wait(), must be called before exit() is called. If such a function is called after exit(), it will fail with an error.
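Because exit() has to run even when the work between initialize() and exit() fails, one way to guarantee the cleanup is a try/finally wrapper. The following is a minimal sketch and is not part of the drmaa-python repository: run_in_session is a hypothetical helper name, and it relies only on the initialize() and exit() methods used in Example 1. The session is passed in as an argument rather than created inside the helper.

```python
def run_in_session(session, work):
    """Initialize `session`, call work(session), and always call exit().

    `run_in_session` is a hypothetical helper, not part of drmaa-python.
    `session` is assumed to be a drmaa.Session() (or any object with
    initialize() and exit() methods); `work` is a callable that takes
    the session as its only argument.
    """
    session.initialize()
    try:
        return work(session)
    finally:
        # Runs whether work() returned normally or raised, so the
        # session is cleaned up in both cases.
        session.exit()
```

With drmaa-python installed, a call could look like run_in_session(drmaa.Session(), lambda s: s.contact). If initialize() itself fails, exit() is not called, since no session was started.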

The second program (example1.1.py) exits a session and then reconnects to it:

#!/usr/bin/env python

import drmaa

def main():
    """Create a session, show that each session has an id,
    use session id to disconnect, then reconnect. Then exit"""
    s = drmaa.Session()
    s.initialize()
    print('A session was started successfully')
    response = s.contact
    print('session contact returns: ' + response)
    s.exit()
    print('Exited from session')

    s.initialize(response)
    print('Session was restarted successfully')
    s.exit()

if __name__ == '__main__':
    main()

This example builds on Example 1. The difference is that it uses the Grid Engine feature of reconnectable sessions. The DRMAA concept of a session is translated into a session tag in the Grid Engine job structure. That means that every job knows to which session it belongs. With reconnectable sessions, it is possible to initialize the DRMAA library to a previous session, giving the library access to that session's job list. The one limitation is that jobs which end between the calls to exit() and initialize() are lost: the reconnected session no longer sees them, and so does not know about them.

Through line 10, this example does the same as Example 1. On line 11, however, we read the contact attribute to get the contact information for this session. On line 13 we then exit the session. On line 16, we use the stored contact information to reconnect to the previous session. Had we submitted jobs before calling exit(), those jobs would now be available again for operations such as wait() and synchronize(). Finally, on line 18 we exit the session a second time.
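Reconnecting is most useful when the contact string outlives the process that created the session. As a rough sketch of that idea, the helpers below write the contact string to a file and later use it to reinitialize a session. The names save_contact and reconnect, and the file path argument, are hypothetical and not part of drmaa-python. Only the contact attribute and the initialize(contact) call come from the example above.

```python
def save_contact(session, path):
    """Write the contact string of an initialized session to `path`.

    Hypothetical helper: `session` is assumed to expose the `contact`
    attribute, as drmaa.Session does in the example above.
    """
    contact = session.contact
    with open(path, 'w') as f:
        f.write(contact)
    return contact


def reconnect(session, path):
    """Reinitialize `session` with the contact string stored at `path`."""
    with open(path) as f:
        contact = f.read()
    # Same call as s.initialize(response) in the example above.
    session.initialize(contact)
    return contact
```

A first run would call save_contact(s, path) before s.exit(), and a later run would call reconnect(drmaa.Session(), path). The limitation described above still applies: jobs that finish between the two runs are not visible to the reconnected session.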
