NSG-R User Guide

Introduction

The goal of the NSG REST API (NSG-R) is to let users access and run computational neuroscience software on high performance computing (HPC) resources supported by NSG, outside the confines of a point-and-click browser interface. Unlike the NSG Portal, where the user must log in through a web browser to launch a job or retrieve simulation results, the NSG-R is intended to be a convenient way to run neuroscience applications programmatically on large HPC resources from the familiar environment of the researcher's laptop or from other Neuroscience Community Projects (NCP).

See the Quick Start Guide for a simple example to demonstrate submitting jobs and getting results quickly using curl commands.

Register

To use the NSG-R, you must register as a user, and register any application(s) you wish to develop, as well. An application can be a software application or the input model you have developed.

To get started, sign in or register for an NSG-R account. Once you've signed in, you can visit "My Profile" to change your account information and password. To register an application (necessary to run jobs), use the "Application Management" console, found under the "Developer" drop-down menu.

An application is registered with one of two authentication types, DIRECT or UMBRELLA. DIRECT is the more common choice, and the choice you want if you wish to use the API from your application immediately. DIRECT authentication means that the username and password of the person running the application will be sent in HTTP basic authentication headers, and jobs will be submitted on behalf of the authenticated user only.

UMBRELLA is a special case used by web applications that submit jobs on behalf of multiple registered users. Web applications that use UMBRELLA authentication also authenticate with a username and password, that of the person who registered the application. The UMBRELLA application provides the identity of the user that submitted a given job using custom request headers. As a result, users registered with an UMBRELLA application need not register with the NSG-R. Because UMBRELLA authentication involves a trust relationship (i.e. we are trusting you to accurately identify the individual who submits each request), we will need to talk to you before activating your UMBRELLA application to ensure all of our requirements are met.

If you are interested in registering an UMBRELLA application, please contact us.

NCP should register as an UMBRELLA application.

The examples shown in this guide are for DIRECT applications, but with minor changes, they will also work for UMBRELLA Applications, as shown in UMBRELLA Application Examples.

The base URL for the API is https://nsgr.sdsc.edu:8443/cipresrest/v1.

The examples in this guide use the unix curl command and assume you have registered with the NSG-R and have set the following environment variables:

  • URL - https://nsgr.sdsc.edu:8443/cipresrest/v1
  • PASSWORD - Your NSG-R password
  • KEY - The application ID assigned to you, when you registered the application

For example, using the bash shell:

$ # Remember to replace "MyPassWord" and "insects-..." with your information
$ export URL=https://nsgr.sdsc.edu:8443/cipresrest/v1
$ export PASSWORD=MyPassWord
$ export KEY=insects-095D20923FAE439982B6D5EBD2E339C9

curl is of course just one of many ways to interact with a REST API. There are numerous Java, PHP, Perl, Python, etc., libraries that make it easy to use REST services.

Authenticate

The API requires you to send a username and password in HTTP Basic Authentication headers with each request. The use of SSL ensures that the information is transmitted securely.

In addition to sending a username and password, you must send your application ID in a custom request header named cipres-appkey.
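
The two headers described above can be sketched in Python using only the standard library. This is a minimal illustration, not the official client: the function name build_headers is hypothetical, and the credentials are the placeholders used elsewhere in this guide.

```python
import base64


def build_headers(username, password, app_key):
    """Build the two headers this guide says every authenticated request needs."""
    # HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        # Custom header carrying the application ID.
        "cipres-appkey": app_key,
    }


# Placeholder credentials, replace with your own.
headers = build_headers("tom", "MyPassWord", "insects-095D20923FAE439982B6D5EBD2E339C9")
for name in sorted(headers):
    print(name)
```

Most HTTP libraries can generate the Authorization header for you; building it by hand here only makes explicit what is sent with each request.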

List Jobs

Let's get started using the API. Suppose your username is tom, you've registered a DIRECT application named insects, and set URL, PASSWORD and KEY environment variables as shown before. Here's how you would get a list of the jobs you've submitted:

$ curl -u tom:$PASSWORD \
        -H cipres-appkey:$KEY \
        $URL/job/tom

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<joblist>
    <title>Submitted Jobs</title>
    <jobs>
        <jobstatus>
            <selfUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</url>
                <rel>jobstatus</rel>
                <title>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</title>
            </selfUri>
        </jobstatus>
        <jobstatus>
            <selfUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4</url>
                <rel>jobstatus</rel>
                <title>NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4</title>
            </selfUri>
        </jobstatus>
    </jobs>
</joblist>

To get more information about a specific job in the list, use its jobstatus.selfUri.url. For example, to retrieve the full jobstatus of the first job in the list above:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobstatus>
	. . .
</jobstatus>

Alternatively, when you ask for the list of jobs, use the expand=true query parameter to request full jobstatus objects.

When there are no submitted jobs, the list will be empty and will look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<joblist>
    <title>Submitted Jobs</title>
    <jobs/>
</joblist>

TIP: Throughout the API, XML elements named selfUri link to the full version of the containing object. All Uri elements, including selfUri, contain a url which gives the actual url, a rel which describes the type of data that the url returns and a title. It's good practice to navigate through the API by using the Uris the API returns instead of constructing urls to specific objects yourself.
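
As a rough sketch of following the returned Uris rather than constructing URLs, the snippet below pulls each job's selfUri out of a joblist document with Python's xml.etree.ElementTree. The function name job_links is hypothetical, and the sample document is an abbreviated copy of the response shown earlier, with $URL expanded to the base URL.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample shaped like the joblist response shown in this guide.
SAMPLE_JOBLIST = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<joblist>
    <title>Submitted Jobs</title>
    <jobs>
        <jobstatus>
            <selfUri>
                <url>https://nsgr.sdsc.edu:8443/cipresrest/v1/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</url>
                <rel>jobstatus</rel>
                <title>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</title>
            </selfUri>
        </jobstatus>
    </jobs>
</joblist>"""

EMPTY_JOBLIST = b"<joblist><title>Submitted Jobs</title><jobs/></joblist>"


def job_links(joblist_xml):
    """Return one dict per job with the title, rel and url of its selfUri."""
    root = ET.fromstring(joblist_xml)
    return [
        {
            "title": uri.findtext("title"),
            "rel": uri.findtext("rel"),
            "url": uri.findtext("url"),
        }
        for uri in root.findall("./jobs/jobstatus/selfUri")
    ]


links = job_links(SAMPLE_JOBLIST)
for link in links:
    print(link["rel"], link["url"])
```

An empty joblist simply yields an empty Python list, so callers do not need a special case for users with no submitted jobs.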

Submit Jobs

Now that we know how to list jobs, let's consider job submission. You can submit a job by issuing a POST request to $URL/job/username with multipart/form-data. Remember to replace username with your username, or the username of the person running your application. Most tools can be run minimally using only two fields: a tool identifier and a file to be processed.

Below is an example of a minimal job submission:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom \
    -F tool=NEURON_EXPANSE \
    -F input.infile_=@./sampleinput

In this example, the fields used are:

tool=NEURON_EXPANSE
The tool field identifies the tool to be used, in this case, NEURON_EXPANSE. Job submissions must always include a tool. You can find a list of computational neuroscience application IDs by using the Tool API.
 
input.infile_=@./sampleinput
The input.infile_ field is also mandatory; it identifies the main data file to be operated on. In this example, we're sending the contents of the file named sampleinput. The '@' tells curl to send sampleinput as an attached file. For all NSG jobs, the input file should be in zip format.

A submission like this will succeed for most tools, and will cause the application to run a job with whatever defaults NSG has for that particular tool. Of course, many job submissions will require setting command line options to non-default values, and often submitting auxiliary files as well. The appendix of this guide has a section that explains how to configure tool specific parameters.
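
For readers not using curl, the multipart/form-data body of a minimal submission can be assembled by hand. The sketch below uses only the Python standard library; encode_multipart is a hypothetical helper, and the file name sampleinput.zip and its bytes are placeholders. It builds the request body but does not send it.

```python
import io
import uuid


def encode_multipart(fields, files):
    """Encode text fields and file attachments as a multipart/form-data body.

    fields: {name: text value}
    files:  {name: (filename, bytes)}
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    for name, value in fields.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(f"{value}\r\n".encode())
    for name, (filename, data) in files.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'.encode()
        )
        body.write(b"Content-Type: application/octet-stream\r\n\r\n")
        body.write(data)
        body.write(b"\r\n")
    # The closing boundary ends the body.
    body.write(f"--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


# The two mandatory fields of a minimal submission.
fields = {"tool": "NEURON_EXPANSE"}
# Placeholder input: the 22 bytes of an empty zip archive stand in for a real model.
files = {"input.infile_": ("sampleinput.zip", b"PK\x05\x06" + b"\x00" * 18)}
body, content_type = encode_multipart(fields, files)
print(content_type.split(";")[0], len(body) > 0)
```

To submit, the body would be POSTed to $URL/job/username with the returned value as the Content-Type header, alongside the authentication headers. Libraries such as Requests do this encoding for you.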

Before you submit a job, you may want to check whether the request is composed correctly. How to Make a Test Run explains how to verify the fields.

Optional Metadata

A job submission may include the following optional metadata fields:

metadata.clientJobId
Your application's unique ID for the job. We highly recommend that you use this field. You may encounter situations where it isn't clear whether or not a submission reached the NSG-R, in which case the best thing to do is request a list of your jobs and see whether or not it includes one with the clientJobId you just tried to submit.
 
metadata.clientJobName
A name that the user of your application will recognize the job by.
 
metadata.clientToolName
A name that the user will recognize the tool by.
 
metadata.statusEmail
If "true", email will be sent on job completion. (Delivery, of course, depends upon a valid email address and functioning delivery infrastructure.)
 
metadata.emailAddress
Use this along with statusEmail to override the default email destination. By default, job completion emails are sent to the user's registered email address or, in the case of UMBRELLA applications, to the address in the cipres-eu-email header of the job submission request. Use this property to direct the email somewhere else.
 
metadata.statusUrlPut
Use this field to specify a URL in your web application where NSG will PUT a notification when the job is finished. NSG can't guarantee that your application will receive the PUT request so you may still need to poll occasionally. Not implemented yet.
 
metadata.statusUrlGet
Use this field to specify a URL in your web application that NSG will GET, with a jh=jobhandle query parameter, when the job is finished. NSG can't guarantee that your application will receive the request so you may still need to poll occasionally. Not implemented yet.

All metadata fields are limited to 100 characters, and all are optional. Metadata will be returned with the rest of the information about the job when you later ask for the job's status.
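
Since the 100-character limit is easy to exceed with generated names, a client may want to check metadata before submitting. The following is a small sketch under that assumption; metadata_fields is a hypothetical helper, and the set of field names is copied from the list above.

```python
# Field names taken from the list of optional metadata fields in this guide.
KNOWN_METADATA = {
    "clientJobId",
    "clientJobName",
    "clientToolName",
    "statusEmail",
    "emailAddress",
    "statusUrlPut",
    "statusUrlGet",
}
MAX_LENGTH = 100  # documented limit for every metadata field


def metadata_fields(metadata):
    """Validate metadata and return form fields named metadata.<key>."""
    fields = {}
    for key, value in metadata.items():
        if key not in KNOWN_METADATA:
            raise ValueError(f"unknown metadata field: {key}")
        # Booleans are sent as the strings "true" / "false".
        text = str(value).lower() if isinstance(value, bool) else str(value)
        if len(text) > MAX_LENGTH:
            raise ValueError(f"metadata.{key} exceeds {MAX_LENGTH} characters")
        fields[f"metadata.{key}"] = text
    return fields


form = metadata_fields({"clientJobId": 101, "statusEmail": True})
print(form)
```

Checking locally avoids a round trip to the server for a mistake the client can detect itself; the server remains the authority on what it accepts.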

In the following example, Tom uses some of the metadata fields described above to supply a job ID, generated by his application, and to request email notification of job completion.

$ curl -u tom:$PASSWORD \
	-H cipres-appkey:$KEY \
	$URL/job/tom \
	-F tool=NEURON_EXPANSE \
	-F input.infile_=@./sampleinput \
	-F metadata.clientJobId=101 \
	-F metadata.statusEmail=true
	

As noted above, many runs will be more complicated than this because of the need to configure the precise command line. We suggest that you continue through this guide to learn how to check job status, download results, and handle errors, and then read about configuring Tool Specific Parameters in the Appendix to learn how to create customized runs.

Understand Job Status

Successful job submission returns a jobstatus object that looks like this:


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobstatus>
    <selfUri>
        <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</url>
        <rel>jobstatus</rel>
        <title>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</title>
    </selfUri>
    <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
    <jobStage>QUEUE</jobStage>
    <terminalStage>false</terminalStage>
    <failed>false</failed>
    <metadata>
        <entry>
            <key>clientJobId</key>
            <value>101</value>
        </entry>
    </metadata>
    <dateSubmitted>2014-09-10T15:54:58-07:00</dateSubmitted>
    <resultsUri>
        <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output</url>
        <rel>results</rel>
        <title>Job Results</title>
    </resultsUri>
    <workingDirUri>
        <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/workingdir</url>
        <rel>workingdir</rel>
        <title>Job Working Directory</title>
    </workingDirUri>
    <messages>
        <message>
            <timestamp>2014-09-10T15:54:59-07:00</timestamp>
            <stage>QUEUE</stage>
            <text>Added to cipres run queue.</text>
        </message>
    </messages>
    <minPollIntervalSeconds>60</minPollIntervalSeconds>
</jobstatus>

Elements of particular interest are:

jobHandle
A unique, NSG-assigned job identifier. It has the format: NGBW-JOB-toolID-unique_identifier
 
jobStage
Unfortunately, the current version of NSG sets jobstatus.jobStage in a way that's somewhat inconsistent and difficult to explain. You're better off using jobstatus.messages to monitor the progress of a job.
 
messages
NSG adds a message at each major processing point, as well as when problems are encountered. Each message has a timestamp, processing stage, and textual description. A job progresses through the following stages:
  • QUEUE - The job has been validated and placed in NSG's queue.
  • COMMANDRENDERING - The job has reached the head of the queue and NSG has created the command line that will be run.
  • INPUTSTAGING - NSG has created a temporary working directory for the job on the execution host and copied the input files over.
  • SUBMITTED - The job has been submitted to the scheduler on the execution host.
  • LOAD_RESULTS - The job has finished running on the execution host and NSG has begun to transfer the results.
  • COMPLETED - Results successfully transferred and available.
 
terminalStage
If true, NSG has finished processing the job. If false, there is more to do.
 
failed
This will be set to true only when the job is finished (i.e. terminalStage=true) and the job has failed. NSG has a narrow definition of failure that does not take the tool's output or exit code into consideration. A job will only have failed=true if a network or system error prevents the tool from being run or prevents NSG from being able to obtain the job's results.
 
minPollIntervalSeconds
If you poll for the status of this job, this is the minimum polling interval (in seconds) that you may use.
 

The jobstatus also includes several urls:

selfUri
Use this to poll for updated job status.
 
workingDirUri
Use this to monitor the files in the job's working directory, while the job is running.
 
resultsUri
Use this to get the list of result files, once the job has finished.
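
The elements above can be read out of a jobstatus document with a few lines of Python. This is an illustrative sketch: summarize_jobstatus and the keys of the dictionary it returns are hypothetical, the sample is an abbreviated jobstatus, and the code assumes (the guide does not say so explicitly) that messages are listed in chronological order.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample shaped like the jobstatus object shown in this guide.
SAMPLE_JOBSTATUS = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobstatus>
    <selfUri>
        <url>https://nsgr.sdsc.edu:8443/cipresrest/v1/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</url>
        <rel>jobstatus</rel>
    </selfUri>
    <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
    <terminalStage>false</terminalStage>
    <failed>false</failed>
    <resultsUri>
        <url>https://nsgr.sdsc.edu:8443/cipresrest/v1/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output</url>
        <rel>results</rel>
    </resultsUri>
    <messages>
        <message>
            <timestamp>2014-09-10T15:54:59-07:00</timestamp>
            <stage>QUEUE</stage>
            <text>Added to cipres run queue.</text>
        </message>
    </messages>
    <minPollIntervalSeconds>60</minPollIntervalSeconds>
</jobstatus>"""


def summarize_jobstatus(jobstatus_xml):
    """Extract the fields a client usually needs from a jobstatus document."""
    root = ET.fromstring(jobstatus_xml)
    # Assumption: messages appear oldest first, so the last one is the latest.
    stages = [m.findtext("stage") for m in root.findall("./messages/message")]
    return {
        "handle": root.findtext("jobHandle"),
        "finished": root.findtext("terminalStage") == "true",
        "failed": root.findtext("failed") == "true",
        "latest_stage": stages[-1] if stages else None,
        "min_poll_seconds": int(root.findtext("minPollIntervalSeconds")),
        "self_url": root.findtext("./selfUri/url"),
        "results_url": root.findtext("./resultsUri/url"),
    }


summary = summarize_jobstatus(SAMPLE_JOBSTATUS)
print(summary["latest_stage"], summary["finished"])
```

The sketch reads the stage from messages rather than jobStage, following the recommendation above.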
 

Is the Job Finished

The job is finished when jobstatus.terminalStage=true. Use jobstatus.selfUri.url to check the status of the job, like this:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90

Alternatively, you can check the status of multiple jobs in a single GET of endpoint $URL/job by using multiple instances of the jh=jobhandle query parameter. In this case the URL does not include the username (so that UMBRELLA applications can check on jobs for all their end users with a single query).

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/?jh=NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90\&jh=NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4
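
When building the multi-job query in code, the repeated jh parameter is easiest to produce from a list of pairs. A minimal sketch using urllib.parse from the standard library follows; multi_status_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://nsgr.sdsc.edu:8443/cipresrest/v1"


def multi_status_url(base_url, job_handles):
    """Build the $URL/job/ query with one jh parameter per job handle."""
    # A list of (name, value) pairs lets the same parameter name repeat.
    query = urlencode([("jh", handle) for handle in job_handles])
    return f"{base_url}/job/?{query}"


status_url = multi_status_url(
    BASE_URL,
    [
        "NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90",
        "NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4",
    ],
)
print(status_url)
```

Unlike the shell example, no backslash is needed before the ampersand here, because the URL never passes through a shell.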

We ask users to keep polling frequency as low as possible to avoid overloading NSG. As a rule, jobstatus.minPollIntervalSeconds specifies the shortest polling interval that you may use. However, we encourage you to poll much less frequently when possible. For example, if you aren't returning intermediate results to your users and you submit a job with a maximum run time of more than an hour, please consider increasing the polling interval to 15 minutes. As an alternative to frequent polling, consider using metadata.statusEmail=true in your job submission so that NSG will email you when the job is finished. Showing courtesy here will allow us to avoid having to enforce hard limits.

If you poll for the status of multiple jobs in a single call, please use jobstatus.minPollIntervalSeconds of the most recently submitted job as your minimum polling interval.
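
One way to honour these polling rules in code is sketched below. It is an assumption-laden illustration rather than a reference client: wait_for_job is a hypothetical function, fetch_status stands for whatever code GETs and parses the jobstatus, and the dictionary keys finished and min_poll_seconds are invented for this example. The demonstration at the end uses canned responses instead of the network.

```python
import time


def wait_for_job(fetch_status, preferred_interval=900, sleep=time.sleep, max_polls=None):
    """Poll until the job reaches a terminal stage.

    fetch_status() must return a dict with:
      "finished"         - True once terminalStage is true
      "min_poll_seconds" - the value of minPollIntervalSeconds
    The wait between polls is never shorter than min_poll_seconds.
    """
    polls = 0
    while True:
        status = fetch_status()
        polls += 1
        if status["finished"]:
            return status
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("job did not finish within the allowed number of polls")
        # Respect the server's minimum, but prefer the longer, more polite interval.
        sleep(max(status["min_poll_seconds"], preferred_interval))


# Demonstration with canned responses and a fake sleep that records the waits.
responses = iter(
    [
        {"finished": False, "min_poll_seconds": 60},
        {"finished": True, "min_poll_seconds": 60},
    ]
)
waits = []
result = wait_for_job(lambda: next(responses), preferred_interval=30, sleep=waits.append)
print(result["finished"], waits)
```

The default preferred_interval of 900 seconds mirrors the 15 minute suggestion for long jobs; passing a smaller value still cannot take the wait below the server's minimum.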

List Results

Once jobstatus.terminalStage=true, you can list and then retrieve the final results. Issue a GET request to the URL specified by jobstatus.resultsUri.url, like this:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<results>
    <jobfiles>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1544</url>
                <rel>fileDownload</rel>
                <title>STDOUT</title>
            </downloadUri>
            <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
            <filename>STDOUT</filename>
            <length>1243</length>
            <parameterName>PROCESS_OUTPUT</parameterName>
            <outputDocumentId>1544</outputDocumentId>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1545</url>
                <rel>fileDownload</rel>
                <title>STDERR</title>
            </downloadUri>
            <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
            <filename>STDERR</filename>
            <length>0</length>
            <parameterName>PROCESS_OUTPUT</parameterName>
            <outputDocumentId>1545</outputDocumentId>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1550</url>
                <rel>fileDownload</rel>
                <title>infile.aln</title>
            </downloadUri>
            <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
            <filename>infile.aln</filename>
            <length>1449</length>
            <parameterName>aligfile</parameterName>
            <outputDocumentId>1550</outputDocumentId>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1551</url>
                <rel>fileDownload</rel>
                <title>term.txt</title>
            </downloadUri>
            <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
            <filename>term.txt</filename>
            <length>338</length>
            <parameterName>all_outputfiles</parameterName>
            <outputDocumentId>1551</outputDocumentId>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1552</url>
                <rel>fileDownload</rel>
                <title>batch_command.cmdline</title>
            </downloadUri>
            <jobHandle>NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90</jobHandle>
            <filename>batch_command.cmdline</filename>
            <length>48</length>
            <parameterName>all_outputfiles</parameterName>
            <outputDocumentId>1552</outputDocumentId>
        </jobfile>
        . . .
    </jobfiles>
</results>
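
A results listing like the one above can be turned into a simple list of files to download. The sketch below is illustrative only: result_files is a hypothetical helper and the sample is cut down to two of the entries shown, with $URL expanded to the base URL.

```python
import xml.etree.ElementTree as ET

# Two entries from a results listing, abbreviated.
SAMPLE_RESULTS = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<results>
    <jobfiles>
        <jobfile>
            <downloadUri>
                <url>https://nsgr.sdsc.edu:8443/cipresrest/v1/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1544</url>
                <rel>fileDownload</rel>
                <title>STDOUT</title>
            </downloadUri>
            <filename>STDOUT</filename>
            <length>1243</length>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>https://nsgr.sdsc.edu:8443/cipresrest/v1/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1545</url>
                <rel>fileDownload</rel>
                <title>STDERR</title>
            </downloadUri>
            <filename>STDERR</filename>
            <length>0</length>
        </jobfile>
    </jobfiles>
</results>"""


def result_files(results_xml):
    """Return filename, length in bytes and download URL for each result file."""
    root = ET.fromstring(results_xml)
    return [
        {
            "filename": jobfile.findtext("filename"),
            "length": int(jobfile.findtext("length")),
            "url": jobfile.findtext("./downloadUri/url"),
        }
        for jobfile in root.findall("./jobfiles/jobfile")
    ]


files = result_files(SAMPLE_RESULTS)
# Skip empty files such as an empty STDERR.
non_empty = [f["filename"] for f in files if f["length"] > 0]
print(non_empty)
```

Each url value is the jobfile.downloadUri.url of that file, so the client never has to construct a download URL itself.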

Download Results

Use the jobfile.downloadUri.url links to download individual result files, like this:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    -O -J \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/output/1544

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1243    0  1243    0     0    178      0 --:--:--  0:00:06 --:--:--   313
curl: Saved to filename 'STDOUT'

List Working Directory

If you are interested in monitoring the progress of a job while it is running, you can use jobstatus.workingDirUri.url to retrieve the list of files in the job's working directory. The job only has a working directory after it has been staged to the execution host and is waiting to run, is running, or is waiting to be cleaned up. If you use this URL at other times, it will return an empty list. Furthermore, if you happen to use this URL while NSG is in the process of removing the working directory, you may receive a transient error. Because of this possibility, be prepared to retry the operation.

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-3957CC6EBF5E448095A5666B41EDDF90/workingdir

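If the job has no working directory at the time of the request, the list is empty:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workingdir>
    <jobfiles/>
</workingdir>

While the working directory exists, the response lists its files:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>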
<workingdir>
    <jobfiles>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-0171A3F1BFA0477CAF35B79CE075DF9C/workingdir/scheduler.conf</url>
                <rel>fileDownload</rel>
                <title>scheduler.conf</title>
            </downloadUri>
            <filename>scheduler.conf</filename>
            <length>11</length>
            <dateModified>2014-09-20T16:18:05-07:00</dateModified>
            <parameterName></parameterName>
            <outputDocumentId>0</outputDocumentId>
        </jobfile>
        <jobfile>
            <downloadUri>
                <url>$URL/job/tom/NGBW-JOB-NEURON_EXPANSE-0171A3F1BFA0477CAF35B79CE075DF9C/workingdir/infile.dnd</url>
                <rel>fileDownload</rel>
                <title>infile.dnd</title>
            </downloadUri>
            <filename>infile.dnd</filename>
            <length>137</length>
            <dateModified>2014-09-20T16:18:13-07:00</dateModified>
            <parameterName></parameterName>
            <outputDocumentId>0</outputDocumentId>
        </jobfile>
		. . .
    </jobfiles>
</workingdir>

Download Working Directory Files

To retrieve a file from the working directory list, use its jobfile.downloadUri.url. Be prepared to handle transient errors, as well as a permanent 404 NOT FOUND error, once the working directory has been removed.

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    -O -J \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-0171A3F1BFA0477CAF35B79CE075DF9C/workingdir/infile.dnd

curl: Saved to filename 'infile.dnd'
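
The advice to retry transient errors but treat 404 as permanent can be sketched as a small retry wrapper. Everything here is illustrative: fetch_with_retry, NotFound and TransientError are hypothetical names, and the guide does not say which HTTP statuses are transient, so the caller's fetch function is assumed to raise NotFound for a 404 and TransientError for anything it considers retryable.

```python
import time


class NotFound(Exception):
    """The working directory (or file) no longer exists: do not retry."""


class TransientError(Exception):
    """A temporary failure, e.g. the directory was being removed: retry."""


def fetch_with_retry(fetch, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fetch() until it succeeds, retrying only on TransientError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            sleep(delay)
        # NotFound is deliberately not caught: it propagates immediately.


# Demonstration: the first call fails transiently, the second succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] == 1:
        raise TransientError("working directory is being removed")
    return b"file contents"


data = fetch_with_retry(flaky_fetch, sleep=lambda seconds: None)
print(calls["count"], data)
```

Passing sleep as a parameter keeps the example fast to run and makes the wrapper easy to test.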

Delete and Cancel

Once a job has finished and you've downloaded the results, it's a good idea to delete the job. You may also want to delete a job that hasn't finished yet if you, or the user of your application, realize you made a mistake and don't want to waste the compute time.

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    -X DELETE \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4

There is no data returned from a successful DELETE.

If the job is scheduled to run or is running at the time you delete it, it will be cancelled. Either way, all information associated with the job will be removed. You can verify that the job has been deleted by doing a GET of its jobstatus URL: HTTP status 404 (NOT FOUND) will be returned along with an error object. We demonstrate this below by using curl's -i option, which tells curl to include the HTTP response headers in its output.

$ curl -i -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom/NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4

HTTP/1.1 404 Not Found
Server: Apache-Coyote/1.1
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Thu, 11 Sep 2014 21:43:54 GMT

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<error>
    <displayMessage>Job not found.</displayMessage>
    <message>Job Not Found Error: org.ngbw.sdk.jobs.JobNotFoundException: NGBW-JOB-NEURON_EXPANSE-CC460782E5FF464CB96791B1E6053AA4</message>
    <code>4</code>
</error>

Handle Errors

HTTP status codes are used to indicate whether an API request succeeded or failed. When the HTTP status indicates failure (a status other than 200), an error object is returned. A basic error object looks like this:

<error>
    <displayMessage>Job Not Found</displayMessage>
    <message>Job Not Found Error: org.ngbw.sdk.jobs.JobNotFoundException: NGBW-JOB-NEURON_EXPANSE-261679BE83E245AD8EEECB4592A52B81
    </message>
    <code>4</code>
</error>

The displayMessage is a user-friendly description of the error. The contents of the message are not meant for end users, but may be helpful in debugging. The code indicates the type of error; for example, code 4 means "not found", as shown in the source code for ErrorData.java.

A job validation error may contain a list of field errors. For example:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom \
    -F tool=NEURON_EXPANSE \
    -F metadata.clientJobId=110 \
    -F input.infile_=@./sampleinput \
    -F vparam.runtime_="one hour" \
    -F vparam.foo_=bar

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<error>
    <displayMessage>Form validation error.</displayMessage>
    <message>Validation Error: </message>
    <code>5</code>
    <paramError>
        <param>runtime_</param>
        <error>Must be a Double.</error>
    </paramError>
    <paramError>
        <param>foo_</param>
        <error>Does not exist.</error>
    </paramError>
</error>
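
An error object, including any per-field validation errors, can be unpacked as follows. This is a sketch using Python's standard XML parser; parse_error and the dictionary keys it returns are hypothetical, and the sample is the validation error shown above.

```python
import xml.etree.ElementTree as ET

# The validation error from the example above.
SAMPLE_ERROR = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<error>
    <displayMessage>Form validation error.</displayMessage>
    <message>Validation Error: </message>
    <code>5</code>
    <paramError>
        <param>runtime_</param>
        <error>Must be a Double.</error>
    </paramError>
    <paramError>
        <param>foo_</param>
        <error>Does not exist.</error>
    </paramError>
</error>"""


def parse_error(error_xml):
    """Return the display message, debug message, code and field errors."""
    root = ET.fromstring(error_xml)
    return {
        "display_message": root.findtext("displayMessage"),
        "message": root.findtext("message"),
        "code": int(root.findtext("code")),
        # Only direct children named paramError; empty for other error types.
        "param_errors": {
            p.findtext("param"): p.findtext("error") for p in root.findall("paramError")
        },
    }


error = parse_error(SAMPLE_ERROR)
print(error["display_message"])
for field, problem in error["param_errors"].items():
    print(f"  {field}: {problem}")
```

A client would typically show display_message to the end user and log message for debugging, in line with the description above.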

Data Types

The XML documents or data structures returned by the API, such as jobstatus, results, jobfile, error, etc., are not fully documented yet, however the basic schema is available. You can also view the java source code for these data structures. NSG maps the java classes to XML using JAXB. If you happen to be implementing in java you may want to use the java source code, linked to above, with JAXB, to unmarshall the XML documents that the NSG-R returns.

We may find it necessary to add elements to the schema as time goes by but your application should continue to work provided it ignores any elements it doesn't recognize.

Tool API

The tool API provides information about the computational neuroscience tools that can be run on NSG. It's public: no credentials and no special headers are required, so it's easy to use a browser or curl to explore it. You can use the Tool API to learn the IDs of the tools you're interested in running and to download their PISE XML descriptions.

Definition: Strictly speaking, an NSG tool is an interface for configuring command line job submissions. It is defined by a PISE XML document found in the Tool API. Each tool deploys jobs for one installed application (e.g. NEURON_EXPANSE).

Go to $URL/tool in the browser, or use curl, as shown below, to see a list of the available tools:

$ curl $URL/tool
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tools>
<tool>
    <toolId>PGENESIS_TG</toolId>
    <toolName>PGENESIS 2.3 on Comet</toolName>
    <selfUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG</url>
        <rel>tool</rel>
        <title>PGENESIS_TG</title>
    </selfUri>
    <piseUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG/doc/pise</url>
        <rel>Pise XML</rel>
        <title>PGENESIS_TG pise</title>
    </piseUri>
    <portal2Uri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG/doc/portal2</url>
        <rel>Html Web Page</rel>
        <title>PGENESIS_TG type</title>
    </portal2Uri>
    <exampleUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG/doc/example</url>
        <rel>Html Web Page</rel>
        <title>PGENESIS_TG type</title>
    </exampleUri>
    <parameterUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG/doc/param</url>
        <rel>Html Web Page</rel>
        <title>PGENESIS_TG type</title>
    </parameterUri>
</tool>

	. . .
</tools>

Each tool description includes the toolId, toolName, and a number of "Uri" elements, which are links to various documents for the specific tool.

As we mentioned earlier, it's good practice to navigate through the API using these returned links rather than hardcoding the URLs. For example, all the URLs in the table below can be extracted from the data returned by the top level resource at $URL/tool.

Summary

GET $URL/tool
Use this to get a list of the available tools.
 
GET $URL/tool/toolId
Use this to get the URLs that link to the tool's documents (i.e. the documents listed below).
 
GET $URL/tool/toolId/doc/pise
Use this URL to download the tool's PISE XML file.
 
GET $URL/tool/toolId/doc/portal2
Use this URL, in a browser, to read a detailed description of the tool. This URL returns HTTP status 303 and a Location header that redirects to the HTML page on the NSG that gives a detailed, human readable description of the tool.
 
GET $URL/tool/toolId/doc/example
Not implemented yet. Will give examples showing how to submit jobs to use this tool.
 
GET $URL/tool/toolId/doc/param
Not implemented yet. Will give a human readable description of each of the tool's parameters.
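
To illustrate extracting these links from the top level tool list, the sketch below builds an index from toolId to the tool's name and PISE URL. tool_index is a hypothetical helper, and the sample is an abbreviated copy of the tool list shown earlier in this section.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample shaped like the response from $URL/tool.
SAMPLE_TOOLS = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tools>
<tool>
    <toolId>PGENESIS_TG</toolId>
    <toolName>PGENESIS 2.3 on Comet</toolName>
    <selfUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG</url>
        <rel>tool</rel>
        <title>PGENESIS_TG</title>
    </selfUri>
    <piseUri>
        <url>https://nsgr.sdsc.edu:8444/cipresrest/v1/tool/PGENESIS_TG/doc/pise</url>
        <rel>Pise XML</rel>
        <title>PGENESIS_TG pise</title>
    </piseUri>
</tool>
</tools>"""


def tool_index(tools_xml):
    """Map each toolId to its display name and the URL of its PISE XML."""
    root = ET.fromstring(tools_xml)
    return {
        tool.findtext("toolId"): {
            "name": tool.findtext("toolName"),
            "pise_url": tool.findtext("./piseUri/url"),
        }
        for tool in root.findall("tool")
    }


tools = tool_index(SAMPLE_TOOLS)
for tool_id, info in tools.items():
    print(tool_id, "->", info["pise_url"])
```

Because the Tool API is public, the real list can be fetched without credentials or the cipres-appkey header before being passed to a parser like this one.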

Sample Code

To get all the examples described below, run

$ svn export https://svn.sdsc.edu/repo/scigap/tags/rest-R8.28/rest_client_examples/examples

To build the java examples, you will first need to build and install the restdatatypes jar by running

$ svn export https://svn.sdsc.edu/repo/scigap/tags/rest-R8.28/rest/datatypes
$ cd datatypes
$ mvn clean install

Python Example

This example shows how to communicate with the NSG-R from python, using the Requests library and DIRECT authentication.

Java Umbrella Example

This is a very bare bones, Struts based web application that communicates with the NSG-R by using the Jersey REST Framework. It employs the NSG-R UMBRELLA authentication model and lets users login, submit jobs, monitor their progress, download results, etc.

Unlike a real web application, this simple example doesn't have a registration form and doesn't validate anything you enter on the login screen or the "Create Task" form. Whatever you enter on the login form will be sent to REST API in the cipres-eu headers when you choose to "List Tasks" or "Create Task". In a real application that uses UMBRELLA authentication, you would look up the user's email address, institution and country in your user management module or database in order to populate the headers.

When you choose "List Tasks" or "Create Task", and the example application contacts the NSG-R, it is possible that you will see an Authentication Error if, for example, you've entered an invalid country code or you enter the same email address as another user. The NSG-R looks up the application name/username pair in its database, and if it doesn't find an entry, it creates an account for the user on the fly. Thus there is no need for users of UMBRELLA applications to register with the NSG-R.

You can see the java_umbrella demo in action or export the code from the svn link and follow the instructions in Readme.txt to build and run it yourself. The maximum job runtime is capped at 1 min in the demo.

Perl Example

This is a perl script that makes use of libwww-perl to access the NSG-R. It repeatedly prompts the user to retrieve a list of supported tools, submit a job, show the user's jobs, show a job's results or download a job's results.

Examples of a javascript client, php client and java client (using the DIRECT authentication model) will be posted here soon.

iPython Client Example

This example shows how to access the NSG-R from an IPython notebook. Load the notebook into IPython to run it. Please email [email protected] with any questions.

Usage Limits

NSG has the following per user limits:

CONCURRENT_LIMIT
The number of concurrent REST API requests.
XSEDE_SU_LIMIT
The number of XSEDE SUs, where 1 SU = 1 hour of computing time on one CPU.
OTHER_SU_LIMIT
The number of SUs on non-XSEDE resources. Not currently applicable to NSG-R users.
SUBMITTED_TODAY_LIMIT
The number of jobs a user may submit in a single day.
ACTIVE_LIMIT
The number of active jobs allowed, where an active job is any job that isn't fully completed (i.e. jobstatus.terminalStage is still false). This includes jobs that are queued, running, or awaiting cleanup.

When a request is rejected due to a usage limit, the http status will be 429 (Too Many Requests). The error.code will be 103, which is the NSG generic "UsageLimit" error code. The error will contain a nested limitStatus element which has type and ceiling fields.

<error>
    <displayMessage>Too many active jobs.  Limit is 1</displayMessage>
    <message>org.ngbw.sdk.UsageLimitException: Too many active jobs.  Limit is 1</message>
    <code>103</code>
    <limitStatus>
        <type>active_limit</type>
        <ceiling>1</ceiling>
    </limitStatus>
</error>
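
As an illustration of handling such a rejection in a client, here is a minimal Python sketch that checks for status 429 and code 103, then reads the limit type and ceiling from the error body. The function name parse_usage_limit_error is hypothetical and not part of NSG-R; the element names are taken from the sample error above.

```python
import xml.etree.ElementTree as ET

# Sample error body, copied from the example in this section.
SAMPLE_ERROR = """<error>
    <displayMessage>Too many active jobs.  Limit is 1</displayMessage>
    <message>org.ngbw.sdk.UsageLimitException: Too many active jobs.  Limit is 1</message>
    <code>103</code>
    <limitStatus>
        <type>active_limit</type>
        <ceiling>1</ceiling>
    </limitStatus>
</error>"""


def parse_usage_limit_error(status_code, body):
    """Return (limit_type, ceiling) for a usage-limit rejection, else None."""
    if status_code != 429:
        return None
    root = ET.fromstring(body)
    # 103 is the generic "UsageLimit" error code described in this guide.
    if root.tag != "error" or root.findtext("code") != "103":
        return None
    limit_type = root.findtext("limitStatus/type")
    ceiling_text = root.findtext("limitStatus/ceiling")
    ceiling = int(ceiling_text) if ceiling_text is not None else None
    return limit_type, ceiling


print(parse_usage_limit_error(429, SAMPLE_ERROR))  # ('active_limit', 1)
```

A client could use the returned limit type to decide whether to wait (for example, on active_limit) or to stop submitting for the day.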

Currently, the limits are

  • concurrent_limit=10
  • active_limit=50
  • other_su_limit=30,000
  • xsede_su_limit=30,000

These limits can be modified for specific applications and users. If you have a problem with the default limits, please contact us to discuss your needs.

A future release of the REST API will

  • Provide a way to programmatically determine the limits that are in effect.
  • Provide a way to programmatically determine the number of SUs a user has consumed.

Appendix


Configure Params

It is impossible to explain job configuration in NSG without first explaining the basic method for command line generation. The code for creating command lines and configuring jobs evolved from the Pasteur Institute Software Environment (PISE). PISE is an XML-based standard designed to permit scalable generation of web forms that can be used to create Unix command lines. Each tool offered by NSG has a PISE XML document that describes the options supported by that tool. (Please see the Tool API section for the definition of a NSG "tool".)

In the NSG website, the PISE XML documents are used to create the browser-based forms that let a user configure a job. In the NSG-R, they define the fields that may be POSTed in a job submission. In both systems the PISE files are also used to validate job submissions and to create the command line, based on the user supplied fields.

We have already seen that a NSG-R job submission must include a tool and a primary input.infile_, and may include optional metadata fields. In this section of the guide we explain how to modify the default values in job submissions. In doing so, we explore the relationship between the command line options offered by a given program, the parameters in a tool's PISE XML files and the input. and vparam. fields you can use to configure a job submission.

The two types of fields derived from the PISE XML files are:

  • Input Files: these field names have the form input.parameter_name_. Each such field corresponds to a <parameter> in the tool's PISE XML file, where the name of the parameter is parameter_name and the parameter's type is InFile. Every PISE file defines one special InFile parameter that is, by convention, named infile. This parameter has the attribute isinput=1, which means that it is the primary input, and must always be included in any run of this tool. Other InFile parameters allow you to submit optional files containing constraints, guide trees, etc. (as appropriate for the tool, and the particular analysis).
  • Visible Parameters: these field names have the form vparam.parameter_name_. Each such field corresponds to a <parameter> in the tool's PISE XML file where the name of the parameter is parameter_name and the parameter's type is Switch, String, Integer, etc. These parameters are used to configure the command line and certain other aspects of a run, such as how long the job is allowed to run. They are called visible parameters, because in the NSG website, they correspond to textareas, radio buttons and other visible form controls.

Syntax Recap: Except for the tool field, all field names are of the form prefix.name, where the allowed values for prefix are metadata, input, and vparam. All input and vparam field names have a trailing underscore.

Continuing with the job submission example used earlier in this guide, here's how Tom could submit a NEURON_EXPANSE job with a limited maximum run time:

$ curl -u tom:$PASSWORD \
    -H cipres-appkey:$KEY \
    $URL/job/tom \
    -F tool=NEURON_EXPANSE \
    -F metadata.clientJobId=102   \
    -F metadata.statusEmail=true \
    -F input.infile_=@./sampleinput \
    -F vparam.runtime_=1

-F vparam.runtime_=1
Configures a maximum run time of 1 hour, using the runtime parameter, found in neuron73_tg.xml. Typically, vparams are specific to a particular tool, but, by convention, runtime is found in every tool's PISE XML file. If left unspecified, the maximum run time would have been set to the default value specified in the PISE XML file, typically 0.5 h.

Note: In general, only parameters that differ from the defaults specified in the tool's PISE file need to be specified in the job submission.
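
The field naming rules in this section can be sketched in Python as follows. This is only an illustration: build_job_fields is a hypothetical helper, not part of NSG-R, and it only assembles field names and values without sending anything.

```python
def build_job_fields(tool, metadata=None, inputs=None, vparams=None):
    """Map plain parameter names to NSG-R form field names."""
    fields = {"tool": tool}  # the tool field takes no prefix
    for name, value in (metadata or {}).items():
        fields[f"metadata.{name}"] = str(value)  # no trailing underscore
    for name, path in (inputs or {}).items():
        fields[f"input.{name}_"] = path          # trailing underscore
    for name, value in (vparams or {}).items():
        fields[f"vparam.{name}_"] = str(value)   # trailing underscore
    return fields


# The same fields as Tom's curl submission above.
fields = build_job_fields(
    "NEURON_EXPANSE",
    metadata={"clientJobId": 102, "statusEmail": "true"},
    inputs={"infile": "./sampleinput"},
    vparams={"runtime": 1},
)
for name, value in fields.items():
    print(f"{name}={value}")
```

The input values here are local file paths. An HTTP client would still need to upload the file contents as multipart form parts, as curl's @ syntax does.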

Strategies for Using PISE XML Files

  1. Find the tool in Tools: How to Configure Specific Tools. The "REST Tool Info" links on that page explain how to use the tool specific parameters to configure different types of analysis. We are still writing these pages so you may find that there is documentation for some tools but not for others. Please let us know which tools you need documented.
  2. Download and examine the PISE XML file of interest, and identify the elements that control the command line flags you are interested in using. The utility of this strategy will depend on the complexity of the interface (in terms of preconds and ctrls) and how many non-default values you need to use. This strategy should work fine for simpler PISE XML documents.
  3. Display the functioning parameter web form in the NSG website, and note names of the fields, the default values, the interdependency of the fields, and the logical organization of the form. This information can help identify the parameters of interest within the PISE XML document. To find the PISE parameter names, you'll have to create, configure and save a task in the NSG, then use the links on the Task Details or Tasks (list) pages to view the inputs and parameters.

It may be comforting to know that many commonly used job configurations require only one or two parameter fields to be set manually.

Please contact us if you have questions about how to configure specific tools.

How to Make a Test Run

Once you have a job submission ready, you can validate it by POSTing it to $URL/job/username/validate instead of POSTing it to $URL/job/username. NSG will validate the parameters but won't actually submit the job. If your submission is fine, NSG will return a jobstatus object with a commandline element that shows the Linux command line that would be run if the job were submitted. On the other hand, if there are errors in the submission, NSG will return an error object that explains the problems.
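
A client might distinguish the two outcomes along these lines. This is a hedged sketch: validate_url and interpret_validation are hypothetical names, the base URL is a placeholder, and it assumes that validation errors carry a displayMessage element like the usage limit error shown earlier in this guide.

```python
import xml.etree.ElementTree as ET


def validate_url(base_url, username):
    """Build the validation endpoint: $URL/job/username/validate."""
    return f"{base_url.rstrip('/')}/job/{username}/validate"


def interpret_validation(body):
    """Return (ok, detail) from the XML body of a validation response.

    detail is the command line on success, or the error's display
    message on failure (assumed element name: displayMessage).
    """
    root = ET.fromstring(body)
    if root.tag == "jobstatus":
        return True, root.findtext("commandline")
    if root.tag == "error":
        return False, root.findtext("displayMessage")
    raise ValueError(f"unexpected root element: {root.tag}")


# Placeholder base URL, for illustration only.
print(validate_url("https://example.org/rest", "tom"))
```

The job fields themselves are POSTed exactly as for a real submission; only the URL changes.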

Umbrella Application Examples

Umbrella applications can use the commands in this guide, with additional request headers that identify the end user. Behind the scenes, NSG creates an account for the user with a username of the form application_name.cipres_eu_header and it is this qualified username that goes in the URLs.

Request Headers

All requests to the job API use request headers. The required headers depend on the type of authentication the application uses, as noted below.

Each entry below gives the header, the application type that must send it (ALL or UMBRELLA), and a description.

  • Basic authentication credentials (ALL): DIRECT applications send the user's NSG REST username and password. UMBRELLA applications send the username and password of the person who registered the application. See Authentication.
  • cipres-appkey (ALL): Application ID generated by NSG when you registered the application. It can be changed later if necessary.
  • cipres-eu (UMBRELLA): Uniquely identifies the user within your application. Up to 200 characters. Single quotes are not allowed within the name.
  • cipres-eu-email (UMBRELLA): End user's email address. Up to 200 characters. You can't have 2 users with the same email address.
  • cipres-eu-institution (UMBRELLA): End user's home institution, if any.
  • cipres-eu-country (UMBRELLA): Two letter, upper case, ISO 3166 country code for the end user's institution.
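
As a rough illustration, an UMBRELLA application could assemble and pre-check these headers before sending a request. The helper umbrella_headers below is hypothetical. Treating the institution and country as optional is an assumption based on the descriptions above; the checks only mirror the stated constraints and do not replace validation by NSG.

```python
import re


def umbrella_headers(app_key, eu, eu_email, eu_institution=None, eu_country=None):
    """Build the end user headers for an UMBRELLA request."""
    if not eu or len(eu) > 200 or "'" in eu:
        raise ValueError("cipres-eu must be 1-200 characters with no single quotes")
    if not eu_email or len(eu_email) > 200:
        raise ValueError("cipres-eu-email must be 1-200 characters")
    headers = {
        "cipres-appkey": app_key,
        "cipres-eu": eu,
        "cipres-eu-email": eu_email,
    }
    if eu_institution:
        headers["cipres-eu-institution"] = eu_institution
    if eu_country:
        # Two letter, upper case code; membership in ISO 3166 is not checked here.
        if not re.fullmatch(r"[A-Z]{2}", eu_country):
            raise ValueError("cipres-eu-country must be two upper case letters")
        headers["cipres-eu-country"] = eu_country
    return headers


print(umbrella_headers("KEY", "harry", "harry@ucsddd.edu", "UCSD", "US"))
```

The basic authentication credentials are sent separately, using whatever mechanism your HTTP library provides.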

List Jobs

For example, suppose your username is mary and you're integrating an existing web application with the NSG-R API. You've registered the application with the name neuromorph and set the authentication method to UMBRELLA.

Now suppose a user named harry logs into your application and your application needs to get a list of jobs that harry has submitted to NSG. First, you go to your database or user management component and retrieve harry's email address, institutional affiliation, and, optionally, the two-letter ISO 3166 country code. Now you're ready to issue this curl command (or the equivalent statement in the language you're using):

$ curl -i -u mary:password \
    -H cipres-appkey:$KEY \
    -H cipres-eu:harry \
    -H cipres-eu-email:harry@ucsddd.edu \
    -H cipres-eu-institution:UCSD \
    -H cipres-eu-country:US \
    $URL/job/neuromorph.harry

Notice that although the value of the cipres-eu header is harry, in the URL, you must use neuromorph.harry.
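
The qualified username rule can be captured in a small helper. The sketch below is illustrative only: umbrella_job_url is a hypothetical name, the base URL is a placeholder, and percent-encoding the path segments is a precaution assumed here rather than something this guide specifies.

```python
from urllib.parse import quote


def umbrella_job_url(base_url, app_name, eu, job_handle=None):
    """Build $URL/job/application_name.end_user[/job_handle]."""
    qualified = f"{app_name}.{eu}"  # e.g. neuromorph.harry
    url = f"{base_url.rstrip('/')}/job/{quote(qualified, safe='')}"
    if job_handle:
        url += f"/{quote(job_handle, safe='')}"
    return url


# Placeholder base URL, for illustration only.
print(umbrella_job_url("https://example.org/rest", "neuromorph", "harry"))
# https://example.org/rest/job/neuromorph.harry
```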

Submit a Job

Here you submit a basic NEURON job for harry and get back a jobstatus object.

$ curl -u mary:password \
    -H cipres-appkey:$KEY \
    -H cipres-eu:harry \
    -H cipres-eu-email:harry@ucsddd.edu \
    -H cipres-eu-institution:UCSD \
    -H cipres-eu-country:US \
    $URL/job/neuromorph.harry \
    -F tool=NEURON_EXPANSE \
    -F input.infile_=@./sampleinput

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobstatus>
    <selfUri>
        <url>$URL/cipresrest/v1/job/neuromorph.harry/NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A</url>
        <rel>jobstatus</rel>
        <title>NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A</title>
    </selfUri>
    <jobHandle>NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A</jobHandle>
    <jobStage>QUEUE</jobStage>
    <terminalStage>false</terminalStage>
    <failed>false</failed>
    <metadata>
        <entry>
            <key>clientJobId</key>
            <value>010007AQ</value>
        </entry>
    </metadata>
    <dateSubmitted>2014-09-12T12:36:31-07:00</dateSubmitted>
    <resultsUri>
        <url>$URL/cipresrest/v1/job/neuromorph.harry/NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A/output</url>
        <rel>results</rel>
        <title>Job Results</title>
    </resultsUri>
    <workingDirUri>
        <url>$URL/cipresrest/v1/job/neuromorph.harry/NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A/workingdir</url>
        <rel>workingdir</rel>
        <title>Job Working Directory</title>
    </workingDirUri>
    <messages>
        <message>
            <timestamp>2014-09-12T12:36:31-07:00</timestamp>
            <stage>QUEUE</stage>
            <text>Added to NSG run queue.</text>
        </message>
    </messages>
    <minPollIntervalSeconds>60</minPollIntervalSeconds>
</jobstatus>
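
A client will usually want a few fields from this response: the job handle, whether the job has reached a terminal stage, and how long to wait between status checks. The following Python sketch shows one way to do that. The names parse_jobstatus and poll_until_terminal are hypothetical, fetch_status stands for whatever function your application uses to GET the job's selfUri, and falling back to 60 seconds when minPollIntervalSeconds is absent is an assumption based on the sample above.

```python
import time
import xml.etree.ElementTree as ET


def parse_jobstatus(body):
    """Extract commonly used fields from a jobstatus XML document."""
    root = ET.fromstring(body)
    return {
        "handle": root.findtext("jobHandle"),
        "stage": root.findtext("jobStage"),
        "terminal": root.findtext("terminalStage") == "true",
        "failed": root.findtext("failed") == "true",
        "self_url": root.findtext("selfUri/url"),
        # Assumed fallback of 60 seconds if the element is missing.
        "min_poll_seconds": int(root.findtext("minPollIntervalSeconds", default="60")),
    }


def poll_until_terminal(fetch_status, sleep=time.sleep, max_polls=1000):
    """Poll until terminalStage is true, honoring minPollIntervalSeconds.

    fetch_status is a caller-supplied function returning the jobstatus XML.
    """
    for _ in range(max_polls):
        status = parse_jobstatus(fetch_status())
        if status["terminal"]:
            return status
        sleep(status["min_poll_seconds"])
    raise TimeoutError("job did not reach a terminal stage")
```

Passing sleep as a parameter keeps the loop easy to test without real delays.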

Check Job Status

You can check the status of a single job, using the jobstatus.selfUri.url that was returned when the job was submitted, like this:

$ curl -u mary:password \
    -H cipres-appkey:$KEY \
    -H cipres-eu:harry \
    -H cipres-eu-email:harry@ucsddd.edu \
    -H cipres-eu-institution:UCSD \
    -H cipres-eu-country:US \
    $URL/job/neuromorph.harry/NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A
 

Alternatively, you can get the status of multiple jobs, submitted on behalf of multiple users, with a single GET of $URL/job. Indicate which jobs you're interested in with query parameters named jh (for "job handle"), using a separate jh parameter for each job. With this request, the cipres-appkey header is required, but end user headers are not. For example:

$ curl -u mary:password \
    -H cipres-appkey:$KEY \
    $URL/job/?jh=NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A\&jh=NGBW-JOB-NEURON_EXPANSE-553D534D355C4631BBDCF217BB792A01

If you're using curl in a typical Unix shell, you must place a backslash before the & that separates the query parameters, so that the shell does not interpret it.
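
When the request is built in code rather than in a shell, no escaping is needed; the repeated jh parameters just have to be encoded into the query string. A minimal Python sketch, where multi_status_url is a hypothetical helper and the base URL is a placeholder:

```python
from urllib.parse import urlencode


def multi_status_url(base_url, job_handles):
    """Build $URL/job/?jh=...&jh=... with one jh parameter per job handle."""
    query = urlencode([("jh", handle) for handle in job_handles])
    return f"{base_url.rstrip('/')}/job/?{query}"


handles = [
    "NGBW-JOB-NEURON_EXPANSE-CB8D053F9033487E9B4F9BAF8A3AA47A",
    "NGBW-JOB-NEURON_EXPANSE-553D534D355C4631BBDCF217BB792A01",
]
# Placeholder base URL, for illustration only.
print(multi_status_url("https://example.org/rest", handles))
```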

Other Operations

The other things you may need to do are 1) retrieve files from a job's working directory while it's running, 2) retrieve final results once a job has finished, 3) cancel and/or delete a job. The DIRECT application examples in this guide are applicable to UMBRELLA applications too. Just remember to add the appropriate NSG end user headers and prefix the username in the URL with the application name and a period.