Services on LINDAT

  1. Proxies setup

I took the liberty of reorganizing the way we define the proxies. Each proxy setup has a separate file. All definitions related to that proxy should be in that file and nowhere else. The proxy definitions are in the directory `proxies-available`, and enabled proxies are symlinked into `proxies-enabled`.

You don't have to create the symlinks by hand; you can use `` [script](

├── proxies-available
│   ├── cesilko
│   ├── ....
│   ├── pmltq
│   └── treex-web
├── proxies-enabled
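
The available/enabled layout above can be sketched in a few lines of Python. This is not the script mentioned earlier, only a minimal illustration of the idea; the function names `enable_proxy` and `disable_proxy` and the `config_root` argument are assumptions made for the example.

```python
from pathlib import Path


def enable_proxy(config_root, name):
    """Symlink proxies-available/<name> into proxies-enabled/<name>."""
    root = Path(config_root)
    source = root / "proxies-available" / name
    if not source.is_file():
        raise FileNotFoundError(f"no such proxy definition: {source}")
    enabled_dir = root / "proxies-enabled"
    enabled_dir.mkdir(exist_ok=True)
    link = enabled_dir / name
    if link.is_symlink() or link.exists():
        return link  # already enabled, nothing to do
    # A relative target keeps the link valid if the config tree is moved.
    link.symlink_to(Path("..") / "proxies-available" / name)
    return link


def disable_proxy(config_root, name):
    """Remove the symlink; the file in proxies-available is left untouched."""
    link = Path(config_root) / "proxies-enabled" / name
    if link.is_symlink():
        link.unlink()
```

Nginx only picks up an enabled or disabled proxy after its configuration is reloaded.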

  1. Add new proxy

The file skeleton:

    # who is responsible
    # email

    location /services/name {
        rewrite /services/name(.*) /proxied/path$1 break;

        include service_proxy;
        proxy_pass http://quest;
    }

  1. `location`

The crucial directive here is `location`; see [Nginx documentation for details](

The most important part is how locations are matched to URLs and which location gets used.

1. Exact match (note `=`)
location = /services/exact { ... }
If an exact match is found, the matching process terminates and the matched location is used.

2. The longest matching prefix
location /services/name { config A }
location /services/name/images { config B }
The longest matching prefix is selected and remembered, then the matching continues with the regular expressions.

3. Regular expressions are checked after looking for the longest prefix
```nginx
# case-sensitive match
location ~ \.(gif)$ { }
# case-insensitive match
location ~* \.(php)$ { }
```

Regular expressions are checked in the order they appear in the config file, and the first one that matches is used. If none matches, the longest matching prefix found earlier is used instead.

4. To make things even more complicated, you can use the `^~` modifier: if the longest matching prefix location carries it, the regular expression search is skipped. This works in a similar way to `=`.
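
The four rules above can be summarized as a small Python model. It is a simplified sketch, not nginx's implementation: it ignores URI normalization and nested locations, and Python's `re` is only an approximation of the PCRE syntax nginx uses. The function name `select_location` and the tuple representation are made up for the example.

```python
import re


def select_location(uri, locations):
    """Pick a location for `uri`.

    `locations` is a list of (modifier, pattern) tuples in config-file order,
    where modifier is one of "", "=", "^~", "~", "~*".
    """
    # Rule 1: an exact match ends the search immediately.
    for modifier, pattern in locations:
        if modifier == "=" and uri == pattern:
            return (modifier, pattern)

    # Rule 2: remember the longest matching prefix location.
    best = None
    for modifier, pattern in locations:
        if modifier in ("", "^~") and uri.startswith(pattern):
            if best is None or len(pattern) > len(best[1]):
                best = (modifier, pattern)

    # Rule 4: `^~` on the longest prefix skips the regular expression search.
    if best is not None and best[0] == "^~":
        return best

    # Rule 3: regular expressions are tried in order; the first match wins.
    for modifier, pattern in locations:
        if modifier == "~" and re.search(pattern, uri):
            return (modifier, pattern)
        if modifier == "~*" and re.search(pattern, uri, re.IGNORECASE):
            return (modifier, pattern)

    # No regular expression matched: fall back to the remembered prefix.
    return best
```

For example, with the prefix locations `/services/name` and `/services/name/images` plus the `\.(gif)$` regex location, a request for `/services/name/images/a.gif` ends up in the regex location, while `/services/name/images/a.png` ends up in `/services/name/images`.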

  1. `rewrite`

Use `rewrite` to change the base URL if required. [See documentation for details](
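
As a rough illustration of what the `rewrite` line in the file skeleton does to a request path, here is a Python approximation. It assumes that a matching regular expression replaces the whole URI with the replacement string and that `$1` refers to the first capture group; query-string handling and the `break` flag (which stops further rewrite processing in the location) are not modeled. The helper name `rewrite` is just for the example.

```python
import re


def rewrite(uri, regex, replacement):
    """Return the rewritten URI, or the original URI if `regex` does not match."""
    match = re.search(regex, uri)
    if match is None:
        return uri
    # nginx writes captures as $1, $2, ...; Python's template syntax is \g<1>.
    template = re.sub(r"\$(\d)", r"\\g<\1>", replacement)
    return match.expand(template)


# The path seen by the proxied service for a request to /services/name/api/version.
proxied = rewrite("/services/name/api/version",
                  r"/services/name(.*)", "/proxied/path$1")
```

Here `proxied` is `/proxied/path/api/version`: the `/services/name` base is swapped for `/proxied/path` and the rest of the path is carried over by the capture group.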

  1. `proxy_pass`

Every location defined for a service should end with `proxy_pass`. It's recommended to use `include service_proxy;` to pass useful headers to the service endpoint.

The endpoint can be a full URL or a so-called `upstream`. The following upstreams are defined:

- quest
- apache
- tomcats

You can't define an `upstream` in a service file, because the file is not included in a context where `upstream` definitions are allowed (nginx accepts them only in the `http` context).

[See documentation]( for more details.

  1. Shibboleth authentication

Shibboleth authentication only works for `https`. To keep things as simple as possible, I have made a file you can include to make Shibboleth work out of the box. Otherwise the following directives would have to be repeated every time.

    more_clear_input_headers 'Variable-*' 'Shib-*' 'Remote-User' 'REMOTE_USER' 'Auth-Type' 'AUTH_TYPE';

    # Add your attributes here. They get introduced as headers
    # by the FastCGI authorizer so we must prevent spoofing.
    more_clear_input_headers 'displayName' 'mail' 'persistent-id';

    # Require https and redirect otherwise
    if ($https != "on") {
        return 301 https://$http_host$request_uri;
    }

    shib_request /shibauthorizer;

Instead use `include shibboleth_auth;`.


    location = /services/name/shibboleth-login-url {
        include shibboleth_auth;

        rewrite /services/name(.*) /proxied/path$1 break;
        include service_proxy;
        proxy_pass http://quest;
    }

Note the `=` in the `location` directive. It ensures that Shibboleth authentication is triggered only for this exact location.
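
To illustrate why the include clears certain headers before the authorizer runs, the sketch below models the effect in Python: any client-supplied header whose name matches one of the patterns is dropped, so a client cannot forge attributes such as `mail`. The pattern list is copied from the directives above; the function name, the use of `fnmatch` for the `*` wildcard, and the host name in the usage example are assumptions of this sketch. Names are compared case-insensitively because HTTP header names are case-insensitive.

```python
from fnmatch import fnmatchcase

# Patterns copied from the include above; extend with your own attributes.
CLEARED = [
    "Variable-*", "Shib-*", "Remote-User", "REMOTE_USER",
    "Auth-Type", "AUTH_TYPE", "displayName", "mail", "persistent-id",
]


def clear_input_headers(headers, patterns=CLEARED):
    """Return a copy of `headers` without entries matching any pattern."""
    lowered = [p.lower() for p in patterns]
    return {
        name: value
        for name, value in headers.items()
        if not any(fnmatchcase(name.lower(), p) for p in lowered)
    }


# A request that tries to smuggle in Shibboleth attributes (made-up values).
incoming = {
    "Host": "lindat.example",
    "Accept": "*/*",
    "Shib-Session-ID": "forged",
    "mail": "someone@else.example",
}
safe = clear_input_headers(incoming)
```

After the call, `safe` contains only `Host` and `Accept`; the attribute headers the service later sees can then only come from the authorizer.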

Python REST

Python REST is an implementation of a REST-like (Representational State Transfer) server written in Python. It is being implemented by UFAL. The main objectives of this implementation are:

  • to provide consistent web-based interaction with the existing and future NLP tools developed at UFAL.
  • to aid web developers by providing GET/POST-like API methods for accessing the NLP tools.
  • to help NLP tool developers at UFAL port their standalone applications to the web easily.

How to use Python REST

Clone the REST repository

The REST server can be run from any machine that supports Python. REST can be made available on the local machine by cloning the source code repository. To clone the repository, you need Git. Use the following Git command to populate the repository locally:

git clone REST

Once the repository is cloned into the REST directory, you will see the following files in it:

$ cd REST
$ ls -F
applications/  nohup.out                    pyrest@    settings/
cherrypy/          README
logs/          project_settings.pyc         _scripts/  utils.pyc*  server/

Run the REST server

To start the REST service, you need to do two things:

  • Server Settings
  • Expose Plugins

Server settings

  • at least set the server name and port number in which the REST server should run.

The server settings belong in the settings file in the top-level directory.
The minimum settings for the REST service are given below; copy them into that file:

# coding=utf-8
# See main file for licence
"""
  Project specific settings overriding the default ones in `settings` directory.
"""

settings = {

    "server": {
        "host": "",
        "port": 8280,
    },
}
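
The settings file says that project settings override the defaults from the `settings` directory. How the server actually combines the two is not described here, so the following is only a sketch of one plausible approach, a recursive dictionary merge. The `merge_settings` helper and the default values are hypothetical.

```python
import copy


def merge_settings(defaults, overrides):
    """Recursively overlay `overrides` on a copy of `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults; the real ones live in the `settings` directory.
defaults = {"server": {"host": "127.0.0.1", "port": 8080, "threads": 4}}
# The minimum project settings from above.
project = {"server": {"host": "", "port": 8280}}

effective = merge_settings(defaults, project)
```

With this approach the project file only needs to list the values that differ; everything else (here the made-up `threads` key) keeps its default.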

Expose an application/service to web via REST plugins

A plugin is a piece of Python code that interacts with the actual application. It acts as a middle layer: it accepts HTTP requests from the web for a particular exposed application/service (for example, the NLP application Cesilko), passes them to the application for processing, and returns the application's results to the HTTP clients. The plugin offers a list of APIs that HTTP clients can call through HTTP methods (such as GET, POST, DELETE and PUT).

  • Each plugin defines the list of APIs that REST should expose to the outside world for a particular application/service.
  • Each plugin defines methods that process the API requests and return the output in JSON format.

There's a simple plugin in the applications/ directory. It defines the following things:

  • The name of the service: myservice
  • The list of APIs it provides:

| # | API name | API parameters |
|---|----------|----------------|
| 1 | myapi | "list" (required) |
| 2 | version | |
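
To make the plugin idea more concrete, here is a self-contained sketch of what a plugin for `myservice` might look like. It does not use the real Python REST plugin interface, which is not shown on this page: the class layout, the `handle` method, and the error messages are all hypothetical. It also assumes that `myapi` echoes its `list` parameter back as `input`.

```python
import json


class MyService:
    """Hypothetical plugin for the service named `myservice`."""

    name = "myservice"

    def __init__(self):
        # API name -> (handler, names of required parameters)
        self.apis = {
            "myapi": (self.myapi, ("list",)),
            "version": (self.version, ()),
        }

    def myapi(self, params):
        return {"input": params["list"], "result": "this is an example"}

    def version(self, params):
        return {"version": "myservice version 1"}

    def handle(self, api, params):
        """Dispatch one request and return the response body as JSON text."""
        if api not in self.apis:
            return json.dumps({"error": "unknown API: " + api})
        handler, required = self.apis[api]
        missing = [p for p in required if p not in params]
        if missing:
            return json.dumps({"error": "missing parameter: " + ", ".join(missing)})
        return json.dumps(handler(params))


service = MyService()
body = service.handle("myapi", {"list": "hello"})
```

The point of the design is that the plugin declares its APIs and their required parameters in one place, so the server can validate a request before the application is called.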

Start the REST server

python pyrest 

Accessing the REST service APIs

Once the REST server is started, the applications (via the plugins in the applications/ directory) are exposed on the port defined in the server settings. The applications/services can then be accessed via HTTP requests. The examples below show how to access the API methods defined in applications/ via HTTP requests.


The expected output for the above API request is:

{
  "input": "hello",
  "result": "this is an example"
}


The expected output for the above API request is:

{
  "version": "myservice version 1"
}

API output format

At present, Python REST supports only the JSON format: responses to all HTTP GET/POST requests to REST services are returned as JSON objects. JSON is widely supported by web tools, so the outputs can be used by API developers or anyone else who wants to access the REST services programmatically.
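
On the client side, a JSON response can be turned into a native data structure with a few lines of code. The helper below is a minimal sketch using only the Python standard library; the name `decode_response` is made up, and it assumes the response body is UTF-8 encoded.

```python
import json


def decode_response(raw):
    """Decode a REST response body (bytes or str) into a dict."""
    # Assumption: the server sends UTF-8 encoded JSON.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object, got " + type(data).__name__)
    return data


# Body as it would arrive for the `version` API of the example service.
reply = decode_response(b'{"version": "myservice version 1"}')
version = reply["version"]
```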

Detailed Documentation on Python REST

Here you can read more about the Python REST and how to define/expose your own applications via plugins: Python REST detailed documentation

Step-by-Step guide to adding LINDAT/CLARIN service

