
Introduction to Grid for Apps


Understanding Event-Driven Storage

This is the first in a series of articles about the event-driven framework that is part of our SDS object storage solution. This framework allows users to process data at scale; we call it Grid for Apps.

There are many use cases for Grid for Apps. Today, this technology is used for video transcoding, metadata enrichment, image recognition and manipulation, pattern recognition in images and data files, real-time watermarking, and more. But if you think of the future and the quantity of data we expect to produce, the number of use cases is even bigger, with applications in fields like industrial IoT, artificial intelligence, and big data; the only limit is your imagination.

# Let’s give it a try

Let’s start with a very simple use case: adding a new metadata field to an object right after its upload. We will tackle more complex use cases in the coming weeks.

To deploy an OpenIO SDS cluster, we will use the Docker container that we provide as a quick and easy way to use the software. But you can use the same steps to implement OpenIO Grid for Apps and use it on a very large platform with hundreds of nodes and billions of objects.

Retrieve the OpenIO SDS Docker container:

# docker pull openio/sds

Start your new OpenIO SDS environment:

# docker run -ti openio/sds

You should now be at the prompt with an OpenIO SDS instance up and running.

Next, we will configure the trigger, so that every time you add a new object, the data is processed and a new metadata field is added.

Add the following content to the file /etc/oio/sds/OPENIO/oio-event-agent-0/oio-event-handlers.conf:

[handler:storage.content.new]
pipeline = process

[handler:storage.content.deleted]
pipeline = content_cleaner

[handler:storage.container.new]
pipeline = account_update

[handler:storage.container.deleted]
pipeline = account_update

[handler:storage.container.state]
pipeline = account_update

[handler:storage.chunk.new]
pipeline = volume_index

[handler:storage.chunk.deleted]
pipeline = volume_index

[filter:content_cleaner]
use = egg:oio#content_cleaner

[filter:account_update]
use = egg:oio#account_update

[filter:volume_index]
use = egg:oio#volume_index

[filter:process]
use = egg:oio#notify
tube = oio-process
queue_url = beanstalk://

As you can see in the configuration file, there are many events that can be triggered (such as storage.content.deleted), but for this tutorial we will focus on the storage.content.new event.

According to the configuration file, each time we put new content into the object store (the storage.content.new event), the pipeline “process” is used (pipeline = process).

The pipeline “process” will then take the event and put it in the tube oio-process in the local beanstalk instance, as described at the end of the configuration file:

use = egg:oio#notify
tube = oio-process
queue_url = beanstalk://
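For reference, what the notify filter pushes into the tube is a JSON document. Here is a minimal, hypothetical sketch of decoding one; the field names (a "url" key carrying ns, account, user, and path) match what our processing script will consume, but the exact payload is an assumption and real events carry additional fields that may vary between versions:

```python
import json

# Hypothetical storage.content.new event payload (structure assumed for
# illustration; real events carry extra fields such as chunk locations)
raw = """
{
  "event": "storage.content.new",
  "url": {
    "ns": "OPENIO",
    "account": "myaccount",
    "user": "mycontainer",
    "path": "fstab"
  }
}
"""

meta = json.loads(raw)
url = meta["url"]
print(url["ns"], url["account"], url["user"], url["path"])
# → OPENIO myaccount mycontainer fstab
```

These four fields are all a worker needs to locate the object it must update.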

Then, restart the openio event agent to enable the modification:

# gridinit_cmd restart @oio-event-agent

Your event-driven system is now up and running. The next step is to write a small script that will take the events stored in the beanstalk to process the object.

Let’s create a script (we’ll call it process-events.py here) with the following content:

#!/usr/bin/env python
import json
from oio.api import object_storage
from oio.event.beanstalk import Beanstalk, ResponseError

# Initiate a connection to beanstalk and watch the tube oio-process
b = Beanstalk.from_url("beanstalk://")
b.watch("oio-process")

# Waiting for events
while True:
    try:
        # Reserve the event when it appears
        event_id, data = b.reserve()
    except ResponseError:
        # Or continue waiting for the next one
        continue
    # Retrieve the information from the event (namespace, bucket, object name ...)
    meta = json.loads(data)
    url = meta["url"]
    # Initiate a connection with the OpenIO cluster
    s = object_storage.ObjectStorageAPI(url["ns"], "")
    # Add the metadata to the object
    s.object_update(url["account"], url["user"], url["path"], {"uploaded": "true"})
    # Delete the event
    b.delete(event_id)
Finally, launch it in background:

# python process-events.py &

Please note that the script is written in Python, but you can write it in any other language.
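Whatever language you pick, the worker follows the same loop: reserve an event, decode it, process the object, then delete the event. Here is a minimal sketch of that control flow with a stubbed-out queue (StubQueue and its contents are hypothetical, purely to illustrate the pattern without a running cluster):

```python
import json

class StubQueue:
    """Stand-in for a beanstalk tube (hypothetical, for illustration only)."""
    def __init__(self, events):
        self.events = list(events)
        self.deleted = []

    def reserve(self):
        # Hand out the next pending event with a fake event id
        return len(self.deleted), self.events.pop(0)

    def delete(self, event_id):
        self.deleted.append(event_id)

q = StubQueue(['{"url": {"account": "myaccount", "user": "mycontainer", "path": "fstab"}}'])
processed = []
while q.events:
    event_id, data = q.reserve()      # 1. reserve the event
    url = json.loads(data)["url"]     # 2. decode it
    processed.append(url["path"])     # 3. process the object (stubbed out here)
    q.delete(event_id)                # 4. acknowledge by deleting the event

print(processed)
# → ['fstab']
```

If the worker crashes before step 4, the event is never deleted, so beanstalk can hand it to another worker; this is what makes the reserve/delete pattern safe.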

# How does it work?

It’s time to add a new object to see if it works. Using the OpenIO CLI, let’s upload the new object /etc/fstab to the container mycontainer in the account myaccount:

# openio --oio-ns OPENIO --oio-account myaccount object create mycontainer /etc/fstab

And check that the new metadata was properly set:

# openio --oio-ns OPENIO --oio-account myaccount object show mycontainer fstab

With the following result:

| Field         | Value                            |
|---------------|----------------------------------|
| account       | myaccount                        |
| container     | mycontainer                      |
| ctime         | 1493721260                       |
| hash          | FB2B5EC6E6BC56CF7D02BE2B3D4AA5BA |
| id            | 64A81915884E0500529252884202F1CA |
| meta.uploaded | true                             |
| mime-type     | application/octet-stream         |
| object        | fstab                            |
| policy        | SINGLE                           |
| size          | 313                              |
| version       | 1493721260075114                 |
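As a side note, ctime in the listing is a Unix timestamp in seconds, and the version field looks like the same instant expressed in microseconds (the microsecond interpretation is our assumption, but it matches the values above). A quick sketch decoding them:

```python
from datetime import datetime, timezone

ctime = 1493721260            # creation time, seconds since the epoch
version = 1493721260075114    # object version (microseconds, assumed)

print(datetime.fromtimestamp(ctime, tz=timezone.utc).isoformat())
# → 2017-05-02T10:34:20+00:00

# The version appears to be the creation time with microsecond precision
assert version // 1_000_000 == ctime
```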

You can see that the new metadata was added to the object: meta.uploaded is now set to true.

Let’s start the discussion!