Cygnus NGSI

Welcome to Cygnus NGSI

Cygnus NGSI is a connector in charge of persisting Orion context data in configured third-party storages, creating a historical view of that data. Orion only stores the latest value of each entity attribute; if older values are required, they must be persisted in another storage, value by value, using Cygnus NGSI.

Cygnus NGSI uses the subscription/notification feature of Orion. A subscription is made in Orion on behalf of Cygnus NGSI, detailing which entities Cygnus NGSI must be notified about when an update occurs on any of those entities' attributes.
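
As an illustration of the subscription described above, the following sketch registers Cygnus NGSI as a notification endpoint through Orion's NGSIv2 `POST /v2/subscriptions` operation. The Orion address (`localhost:1026`), the Cygnus endpoint (`localhost:5050/notify`), the `Room` entity type and the service and service path values are all assumptions chosen for the example; none of them is mandated by Cygnus NGSI.

```shell
#!/bin/sh
# Assumed endpoints; adjust them to your deployment.
ORION_URL="${ORION_URL:-http://localhost:1026}"
CYGNUS_NOTIFY_URL="${CYGNUS_NOTIFY_URL:-http://localhost:5050/notify}"

# Print an NGSIv2 subscription asking Orion to notify Cygnus NGSI whenever
# any attribute of any entity of the (hypothetical) type "Room" changes.
subscription_payload() {
  cat <<EOF
{
  "description": "Notify Cygnus NGSI of changes in Room entities",
  "subject": {
    "entities": [ { "idPattern": ".*", "type": "Room" } ]
  },
  "notification": {
    "http": { "url": "${CYGNUS_NOTIFY_URL}" },
    "attrsFormat": "normalized"
  }
}
EOF
}

# Send the subscription to Orion (requires a reachable Orion instance).
create_subscription() {
  subscription_payload | curl -s -X POST "${ORION_URL}/v2/subscriptions" \
    -H 'Content-Type: application/json' \
    -H 'Fiware-Service: myservice' \
    -H 'Fiware-ServicePath: /myservicepath' \
    --data-binary @-
}
```

Running `create_subscription` once should be enough; from then on Orion notifies the configured URL on every matching update.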

Internally, Cygnus NGSI is based on Apache Flume, which it uses through its cygnus-common dependency. In fact, Cygnus NGSI is a Flume agent, which is basically composed of a source in charge of receiving the data, a channel where the source puts the data once it has been transformed into a Flume event, and a sink, which takes Flume events from the channel and persists the data in their bodies into a third-party storage.

The current stable release is able to persist Orion context data in:

  • HDFS, the Hadoop distributed file system.
  • MySQL, the well-known relational database manager.
  • CKAN, an Open Data platform.
  • MongoDB, the NoSQL document-oriented database.
  • STH Comet, a Short-Term Historic database built on top of MongoDB.
  • Kafka, the publish-subscribe messaging broker.
  • DynamoDB, a cloud-based NoSQL database by Amazon Web Services.
  • PostgreSQL, the well-known relational database manager.
  • PostGIS, a spatial database extender for the PostgreSQL object-relational database.
  • Carto, the database specialized in geolocated data.
  • Orion, the FIWARE Context Broker.
  • Elasticsearch, the distributed full-text search engine with JSON documents.
  • ArcGIS, a geographic information system (GIS).

Consider visiting the Cygnus NGSI Quick Start Guide before going deep into the details.

Top

Basic operation

Hardware requirements

  • RAM: 1 GB, especially if the batching mechanism is used heavily.
  • HDD: A few GB may be enough, unless the channels are configured as FileChannel type.

Top

Installation (CentOS/RedHat)

Simply configure the FIWARE release repository if not yet configured:

sudo wget -P /etc/yum.repos.d/ https://nexus.lab.fiware.org/repository/raw/public/repositories/el/7/x86_64/fiware-release.repo

Then use your package manager to install the latest version of Cygnus NGSI:

sudo yum install cygnus-ngsi

The above will install cygnus-ngsi in /usr/cygnus/.

Note that cygnus-common is installed too as part of the installation process.

Top

Configuration

Cygnus NGSI requires a fair amount of configuration to run properly, because the configuration describes the Flume-based agent to be run.

So, the starting point is choosing the internal architecture of the Cygnus NGSI agent. Let's assume the simplest one:

      +-------+
      |   NGSI|
      |   Rest|
      |Handler|
+-------------+    +----------------+    +---------------+
| http source |----| memory channel |----| NGSITestSink  |
+-------------+    +----------------+    +---------------+

Given the above architecture, the content of /usr/cygnus/conf/agent_1.conf will be:

cygnusagent.sources = http-source
cygnusagent.sinks = test-sink
cygnusagent.channels = test-channel

cygnusagent.sources.http-source.channels = test-channel
cygnusagent.sources.http-source.type = http
cygnusagent.sources.http-source.port = 5050
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
cygnusagent.sources.http-source.handler.notification_target = /notify
cygnusagent.sources.http-source.handler.default_service = def_serv
cygnusagent.sources.http-source.handler.default_service_path = /def_servpath
cygnusagent.sources.http-source.handler.events_ttl = 10
cygnusagent.sources.http-source.interceptors = ts gi
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf

cygnusagent.channels.test-channel.type = memory
cygnusagent.channels.test-channel.capacity = 1000
cygnusagent.channels.test-channel.transactionCapacity = 100
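
The listing above declares `test-sink` in `cygnusagent.sinks` but shows no properties for it. A minimal sketch of the missing sink section follows, written as a shell helper that appends the lines to an agent configuration file. `channel` is the standard Flume property binding a sink to a channel. The fully qualified class name is an assumption derived from the `com.telefonica.iot.cygnus` package used by the handler above, so verify it against the Installation and Administration Guide.

```shell
#!/bin/sh
# Append a minimal test-sink section to the agent configuration file given
# as first argument (hypothetical helper, not part of Cygnus NGSI).
append_test_sink() {
  cat >> "$1" <<'EOF'

# Assumed class name; check it against the Cygnus NGSI documentation.
cygnusagent.sinks.test-sink.type = com.telefonica.iot.cygnus.sinks.NGSITestSink
cygnusagent.sinks.test-sink.channel = test-channel
EOF
}

# Usage (as a user allowed to write the file):
#   append_test_sink /usr/cygnus/conf/agent_1.conf
```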

Check the Installation and Administration Guide for configurations involving real storage backends such as HDFS, MySQL, etc.

In addition, a /usr/cygnus/conf/cygnus_instance_1.conf file must be created if we want to run Cygnus NGSI as a service (see next section):

CYGNUS_USER=cygnus
CONFIG_FOLDER=/usr/cygnus/conf
CONFIG_FILE=/usr/cygnus/conf/agent_1.conf
AGENT_NAME=cygnusagent
LOGFILE_NAME=cygnus.log
ADMIN_PORT=8081
POLLING_INTERVAL=30

Top

Running

Cygnus NGSI can be run as a service by simply typing:

$ (sudo) service cygnus start

Logs are written to /var/log/cygnus/cygnus.log, and the PID of the process is stored in /var/run/cygnus/cygnus_1.pid.
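
A quick way to confirm that the service actually started is to check the PID file mentioned above. The helper below is a hypothetical sketch (its name is not part of Cygnus NGSI); it relies only on the standard `kill -0` check, which tests for the existence of a process without sending it a signal.

```shell
#!/bin/sh
# Return 0 if the PID stored in the given PID file belongs to a live process.
# Defaults to the PID file location stated above.
cygnus_running() {
  pid_file="${1:-/var/run/cygnus/cygnus_1.pid}"
  [ -r "$pid_file" ] || return 1
  pid="$(cat "$pid_file")"
  # kill -0 sends no signal; it only checks that the process exists.
  # It also fails if the caller is not allowed to signal that process.
  kill -0 "$pid" 2>/dev/null
}

# Usage:
#   cygnus_running && echo "running" || tail -n 20 /var/log/cygnus/cygnus.log
```

Because of the permission check, run it as root or as the user that owns the Cygnus process.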

Top

Unit testing

Running the tests requires Apache Maven to be installed and the Cygnus NGSI sources to be downloaded.

$ git clone https://github.com/telefonicaid/fiware-cygnus.git
$ cd fiware-cygnus/cygnus-ngsi
$ mvn test

Top

e2e testing

Cygnus NGSI works by receiving NGSI-like notifications, which are finally persisted. In order to test this, you can run any of the notification scripts located in the resources folder of this repo, which emulate certain notification types.

$ ./notification-json-simple.sh http://localhost:5050/notify myservice myservicepath
*   Trying ::1...
* Connected to localhost (::1) port 5050 (#0)
> POST /notify HTTP/1.1
> Host: localhost:5050
> Content-Type: application/json
> Accept: application/json
> User-Agent: orion/0.10.0
> Fiware-Service: myservice
> Fiware-ServicePath: myservicepath
> ngsiv2-attrsformat: normalized
> Content-Length: 460
>
* upload completely sent off: 460 out of 460 bytes
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26)
<
* Connection #0 to host localhost left intact

Or you can connect a real NGSI source such as Orion Context Broker. Please check the User and Programmer Guide for further details.
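
For readers who want to see what such an emulated notification looks like, here is a hand-written sketch. It is not the content of the scripts in the resources folder. It assumes an NGSIv2 notification in normalized format, matching the `ngsiv2-attrsformat: normalized` header in the transcript above, and the entity, attribute and subscription ID are made-up sample values.

```shell
#!/bin/sh
# Print a small NGSIv2 notification body in "normalized" format.
# All values are made-up samples.
notification_payload() {
  cat <<'EOF'
{
  "subscriptionId": "51c0ac9ed714fb3b37d7d5a8",
  "data": [
    {
      "id": "Room1",
      "type": "Room",
      "temperature": { "type": "Float", "value": 26.5, "metadata": {} }
    }
  ]
}
EOF
}

# POST the notification to a Cygnus NGSI endpoint.
# $1 = notification URL, $2 = FIWARE service, $3 = FIWARE service path
send_notification() {
  # The User-Agent value mirrors the one shown in the transcript above.
  notification_payload | curl -v -X POST "$1" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -H 'User-Agent: orion/0.10.0' \
    -H "Fiware-Service: $2" \
    -H "Fiware-ServicePath: $3" \
    -H 'ngsiv2-attrsformat: normalized' \
    --data-binary @-
}

# Usage:
#   send_notification http://localhost:5050/notify myservice /myservicepath
```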

Top

Management API overview

Run the following curl in order to get the version (assuming Cygnus NGSI runs on localhost):

$ curl -X GET "http://localhost:8081/v1/version"
{
    "success": "true",
    "version": "0.12.0_SNAPSHOT.52399574ea8503aa8038ad14850380d77529b550"
}

Run the following curl in order to get statistics for certain Flume components (assuming cygnus-ngsi runs on localhost):

$ curl -X GET "http://localhost:8081/v1/stats" | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   489  100   489    0     0  81500      0 --:--:-- --:--:-- --:--:-- 97800
{
    "stats": {
        "channels": [
            {
                "name": "mysql-channel",
                "num_events": 0,
                "num_puts_failed": 0,
                "num_puts_ok": 11858,
                "num_takes_failed": 1,
                "num_takes_ok": 11858,
                "setup_time": "2016-02-05T10:34:25.80Z",
                "status": "START"
            }
        ],
        "sinks": [
            {
                "name": "mysql-sink",
                "num_persisted_events": 11800,
                "num_processed_events": 11858,
                "setup_time": "2016-02-05T10:34:24.978Z",
                "status": "START"
            }
        ],
        "sources": [
            {
                "name": "http-source",
                "num_processed_events": 11858,
                "num_received_events": 11858,
                "setup_time": "2016-02-05T10:34:24.921Z",
                "status": "START"
            }
        ]
    },
    "success": "true"
}
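
The counters in this response can be post-processed with a few lines of scripting. The sketch below prints, for each sink, the difference between `num_processed_events` and `num_persisted_events`. The helper name is hypothetical, `python3` is assumed to be installed, and reading this difference as "events not persisted" is an interpretation of the field names rather than a documented guarantee.

```shell
#!/bin/sh
# Read a /v1/stats JSON document on stdin and print one line per sink:
#   <sink name> <processed events minus persisted events>
pending_per_sink() {
  python3 -c '
import json, sys

stats = json.load(sys.stdin)["stats"]
for sink in stats.get("sinks", []):
    gap = sink["num_processed_events"] - sink["num_persisted_events"]
    print(sink["name"], gap)
'
}

# Usage:
#   curl -s "http://localhost:8081/v1/stats" | pending_per_sink
```

For the sample response above this prints `mysql-sink 58`, since 11858 events were processed and 11800 persisted.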

Many other operations, such as getting, putting, updating or deleting the grouping rules, are described in the Management Interface documentation.

Top

Advanced topics and further reading

Detailed information regarding cygnus-ngsi can be found in the Installation and Administration Guide, the User and Programmer Guide and the Flume extensions catalogue. The following is just a list of shortcuts to the most popular topics:

  • Installation with docker. An alternative to RPM installation, docker is one of the main options when installing FIWARE components.
  • Installation from sources. Sometimes you will need to install from sources, particularly when some of the dependencies must be modified, e.g. the hadoop-core libraries.
  • Running as a process. Running cygnus-ngsi as a process is very useful for testing and debugging purposes.
  • Management Interface. REST-based management interface for administration purposes.
  • Name Mappings. Designed as a Flume interceptor, this feature allows overwriting any notified service, service path, entity ID, entity type, attribute name or attribute type, when used for naming.
  • Multi-instance. Several instances of cygnus-ngsi can be run as a service.
  • Reliability. Learn about the mechanisms making Cygnus a very reliable tool.
  • Performance tips. If you are experiencing performance issues or want to improve your statistics, take a look at how to get the best out of cygnus-ngsi.
  • New sink development. Addressed to those developers aiming to contribute to cygnus-ngsi with new sinks.
  • Integration examples. Step-by-step how-tos regarding the integration of Cygnus NGSI with Spark and Kafka.

Top

Features summary

| Component | Feature | From version |
|---|---|---|
| NGSIHDFSSink | First implementation | 0.1.0 |
| | Multiple HDFS endpoint setup | 0.4.1 |
| | Kerberos support | 0.7.0 |
| | OAuth2 support | 0.8.2 |
| | CSV support | 0.9.0 |
| | HiveServer2 support | 0.9.0 |
| | Table type select | 0.9.0 |
| | enable/disable Hive | 0.10.0 |
| | HDFSBackendImplBinary | 0.10.0 |
| | Batching mechanism | 0.10.0 |
| | Per-user Hive databases | 0.12.0 |
| NGSICKANSink | First implementation | 0.2.0 |
| | Enable SSL | 0.4.2 |
| | Batching mechanism | 0.11.0 |
| | Capping and expiration | 1.7.0 |
| | Possibility to select datamodel | 2.2.0 |
| NGSIDynamoDBSink | First implementation | 0.11.0 |
| NGSIKafkaSink | First implementation | 0.9.0 |
| | Batching mechanism | 0.11.0 |
| NGSIMongoSink | First implementation | 0.8.0 |
| | Hash based collections | 0.8.1 |
| | Batching support | 0.12.0 |
| | Time and size-based data management policies | 0.13.0 |
| | Ignore white space-based attribute values | 1.0.0 |
| NGSIMySQLSink | First implementation | 0.2.0 |
| | Batching mechanism | 0.10.0 |
| | Capping and expiration | 1.7.0 |
| NGSISTHSink | First implementation | 0.8.0 |
| | Hash based collections | 0.8.1 |
| | TimeInstant metadata as reception time | 0.12.0 |
| | Batching mechanism | 0.13.0 |
| | Time and size-based data management policies | 0.13.0 |
| | String-based aggregation (occurrences) | 1.0.0 |
| | Ignore white space-based attribute values | 1.0.0 |
| NGSIPostgreSQLSink | First implementation | 0.12.0 |
| NGSIPostgisLSink | First implementation | 1.12.0 |
| NGSICartoDBSink | First implementation (raw-historic analysis) | 1.0.0 |
| | Distance-historic analysis | 1.1.0 |
| | Multi tenancy support | 1.1.0 |
| | Orion's geo:json support | 1.6.0 |
| | Raw-snapshot analysis | 1.6.0 |
| NGSIOrionSink | First implementation | 1.10.0 |
| NGSIElasticsearchSink | First implementation | 1.15.0 |
| NGSIArcgisFeatureTableSink | First implementation (as NGSIArcGisSink) | 1.16.0 |
| NGSITestSink | First implementation | 0.7.0 |
| | Batching mechanism | 0.12.0 |
| All sinks | Events TTL | 0.4.1 |
| | Pattern-based grouping | 0.5.0 |
| | Infinite events TTL | 0.7.0 |
| | enable/disable Grouping Rules | 0.9.0 |
| | Data model configuration | 0.12.0 |
| | enable/disable forced lower case | 0.13.0 |
| | Per batch TTL | 0.13.0 |
| | New encoding | 1.3.0 |
| | Name mappings | 1.4.0 |
| API | Grouping Rules | 0.13.0 |
| | Subscriptions | 1.0.0 |
| | Agents and instances | 1.2.0 |
| | Logs | 1.4.0 |
| | Metrics | 1.7.0 |

Top

Reporting issues and contact information

If you have any questions, please contact the Cygnus Core Team.

Top