【Question title】: How to store data in MySQL using Cygnus?
【Posted】: 2015-04-21 15:39:34
【Question description】:

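I have read all the documentation on how Cygnus works, and I specifically tested this one successfully. I have also gone through this tutorial, but I am sure I have misconfigured something.

In the cygnus_instance_1.conf I created: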

CYGNUS_USER=root
CONFIG_FOLDER=/usr/cygnus/conf
CONFIG_FILE=/usr/cygnus/conf/agent_1.conf
AGENT_NAME=cygnusagent
LOGFILE_NAME=cygnus.log
ADMIN_PORT=8081

In the agent_1.conf I created:

#=============================================
# To be put in APACHE_FLUME_HOME/conf/cygnus.conf
#
# General configuration template explaining how to set up a sink of each of the available types (HDFS, CKAN, MySQL).

#=============================================
# The next three fields set the sources, sinks and channels used by Cygnus. You could use different names than the
# ones suggested below, but in that case make sure you keep the property names consistent throughout the configuration file.
# Regarding sinks, you can use multiple types at the same time; the only requirement is to provide a channel for each
# one of them (this example shows how to configure 3 sink types at the same time). You can even define more than one
# sink of the same type sharing the same channel in order to improve performance (this is like having
# multi-threading).
cygnusagent.sources = http-source
cygnusagent.sinks = hdfs-sink mysql-sink ckan-sink
cygnusagent.channels = hdfs-channel mysql-channel ckan-channel

#=============================================
# source configuration
# channel name where to write the notification events
cygnusagent.sources.http-source.channels = hdfs-channel mysql-channel ckan-channel
# source class, must not be changed
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnusagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = es.tid.fiware.fiwareconnectors.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default service (service semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service = def_serv
# Default service path (service path semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service_path = def_servpath
# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries)
cygnusagent.sources.http-source.handler.events_ttl = 10
# Source interceptors, do not change
cygnusagent.sources.http-source.interceptors = ts de
# Interceptor type, do not change
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
# Destination extractor interceptor, do not change
cygnusagent.sources.http-source.interceptors.de.type = es.tid.fiware.fiwareconnectors.cygnus.interceptors.DestinationExtractor$Builder
# Matching table for the destination extractor interceptor, put the right absolute path to the file if necessary
# See the doc/design/interceptors document for more details
cygnusagent.sources.http-source.interceptors.de.matching_table = /usr/cygnus/conf/matching_table.conf

# ============================================
# OrionHDFSSink configuration
# channel name from where to read notification events
cygnusagent.sinks.hdfs-sink.channel = hdfs-channel
# sink class, must not be changed
cygnusagent.sinks.hdfs-sink.type = es.tid.fiware.fiwareconnectors.cygnus.sinks.OrionHDFSSink
# Comma-separated list of FQDN/IP address regarding the Cosmos Namenode endpoints
# If you are using Kerberos authentication, then the usage of FQDNs instead of IP addresses is mandatory
cygnusagent.sinks.hdfs-sink.cosmos_host = x1.y1.z1.w1,x2.y2.z2.w2
# port of the Cosmos service listening for persistence operations; 14000 for httpfs, 50070 for webhdfs and free choice for infinity
cygnusagent.sinks.hdfs-sink.cosmos_port = 14000
# default username allowed to write in HDFS
cygnusagent.sinks.hdfs-sink.cosmos_default_username = cosmos_username
# default password for the default username
cygnusagent.sinks.hdfs-sink.cosmos_default_password = xxxxxxxxxxxxx
# HDFS backend type (webhdfs, httpfs or infinity)
cygnusagent.sinks.hdfs-sink.hdfs_api = httpfs
# how the attributes are stored, either per row or per column (row, column)
cygnusagent.sinks.hdfs-sink.attr_persistence = column
# Hive FQDN/IP address of the Hive server
cygnusagent.sinks.hdfs-sink.hive_host = x.y.z.w
# Hive port for Hive external table provisioning
cygnusagent.sinks.hdfs-sink.hive_port = 10000
# Kerberos-based authentication enabling
cygnusagent.sinks.hdfs-sink.krb5_auth = false
# Kerberos username
cygnusagent.sinks.hdfs-sink.krb5_auth.krb5_user = krb5_username
# Kerberos password
cygnusagent.sinks.hdfs-sink.krb5_auth.krb5_password = xxxxxxxxxxxxx
# Kerberos login file
cygnusagent.sinks.hdfs-sink.krb5_auth.krb5_login_conf_file = /usr/cygnus/conf/krb5_login.conf
# Kerberos configuration file
cygnusagent.sinks.hdfs-sink.krb5_auth.krb5_conf_file = /usr/cygnus/conf/krb5.conf

# ============================================
# OrionCKANSink configuration
# channel name from where to read notification events
cygnusagent.sinks.ckan-sink.channel = ckan-channel
# sink class, must not be changed
cygnusagent.sinks.ckan-sink.type = es.tid.fiware.fiwareconnectors.cygnus.sinks.OrionCKANSink
# the CKAN API key to use
cygnusagent.sinks.ckan-sink.api_key = ckanapikey
# the FQDN/IP address for the CKAN API endpoint
cygnusagent.sinks.ckan-sink.ckan_host = x.y.z.w
# the port for the CKAN API endpoint
cygnusagent.sinks.ckan-sink.ckan_port = 80
# Orion URL used to compose the resource URL with the convenience operation URL to query it
cygnusagent.sinks.ckan-sink.orion_url = http://localhost:1026
# how the attributes are stored, either per row or per column (row, column)
cygnusagent.sinks.ckan-sink.attr_persistence = row
# enable SSL for secure Http transportation; 'true' or 'false'
cygnusagent.sinks.ckan-sink.ssl = false

# ============================================
# OrionMySQLSink configuration
# channel name from where to read notification events
cygnusagent.sinks.mysql-sink.channel = mysql-channel
# sink class, must not be changed
cygnusagent.sinks.mysql-sink.type = es.tid.fiware.fiwareconnectors.cygnus.sinks.OrionMySQLSink
# the FQDN/IP address where the MySQL server runs 
cygnusagent.sinks.mysql-sink.mysql_host = localhost
# the port where the MySQL server listens for incoming connections
cygnusagent.sinks.mysql-sink.mysql_port = 3306
# a valid user in the MySQL server
cygnusagent.sinks.mysql-sink.mysql_username = root
# password for the user above
cygnusagent.sinks.mysql-sink.mysql_password = klasika
# how the attributes are stored, either per row or per column (row, column)
cygnusagent.sinks.mysql-sink.attr_persistence = column

#=============================================
# hdfs-channel configuration
# channel type (must not be changed)
cygnusagent.channels.hdfs-channel.type = memory
# capacity of the channel
cygnusagent.channels.hdfs-channel.capacity = 1000
# maximum number of events that can be sent per transaction
cygnusagent.channels.hdfs-channel.transactionCapacity = 100

#=============================================
# ckan-channel configuration
# channel type (must not be changed)
cygnusagent.channels.ckan-channel.type = memory
# capacity of the channel
cygnusagent.channels.ckan-channel.capacity = 1000
# maximum number of events that can be sent per transaction
cygnusagent.channels.ckan-channel.transactionCapacity = 100

#=============================================
# mysql-channel configuration
# channel type (must not be changed)
cygnusagent.channels.mysql-channel.type = memory
# capacity of the channel
cygnusagent.channels.mysql-channel.capacity = 1000
# maximum number of events that can be sent per transaction
cygnusagent.channels.mysql-channel.transactionCapacity = 100

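Although I am not using OrionHDFSSink or OrionCKANSink, I left their configuration untouched because I am really not sure what I should put there.

When I finally issue a subscribeContext targeting Cygnus on the default port 5050, I get a normal response, but nothing is created in my database.

What am I doing wrong here?

【Question comments】:

    Tags: mysql fiware fiware-orion fiware-cygnus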


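    【Answer 1】:

    First of all, feel free to remove the HDFS and CKAN configuration sections. That way you avoid unnecessary logs related to those components when running Cygnus. Of course, remember to also remove every reference to those sinks and channels; specifically: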

    cygnusagent.sources = http-source
    cygnusagent.sinks = mysql-sink
    cygnusagent.channels = mysql-channel
    ...
    cygnusagent.sources.http-source.channels = mysql-channel
    

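    Second, the answer to your question can be found in the documentation:

    Within tables, we can find two options:

    • Fixed 8-field rows, as usual: recvTimeTs, recvTime, entityId, entityType, attrName, attrType, attrValue and attrMd. These tables (and the databases) are created at execution time if the table does not exist before the row is inserted. Regarding attrValue, in its simplest form this value is just a string, but since Orion 0.11.0 it can be a Json object or a Json array. Regarding attrMd, it contains a string serialization of the attribute's metadata array in Json (if the attribute has no metadata, an empty array [] is inserted).
    • Two columns per entity attribute (one for the value and one for the metadata), plus an additional column for the reception time of the data (recv_time). This kind of table (and the database) must be provisioned before running Cygnus, because each entity may have a different number of attributes, and the notifications must ensure that a value is notified for each attribute.

    The behaviour of the connector regarding the internal representation of the data is governed by a configuration parameter, attr_persistence, whose values can be row or column.

    The wording may be the problem; I think the last paragraph should end with "... whose values can be row or column, the behaviour corresponding to the options above, respectively".

    That is, if you are using column mode, you must provision the database and tables beforehand.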
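
    A minimal sketch of the alternative, assuming the MySQL-only agent shown at the start of this answer and that you would rather let Cygnus create the database and tables itself. Only attr_persistence differs from the configuration in the question; the other lines are repeated for context:

    ```properties
    # Keep only the MySQL sink and its channel
    cygnusagent.sources = http-source
    cygnusagent.sinks = mysql-sink
    cygnusagent.channels = mysql-channel
    cygnusagent.sources.http-source.channels = mysql-channel
    # Row mode: per the documentation quoted above, the database and tables
    # are created at execution time if they do not exist yet
    cygnusagent.sinks.mysql-sink.attr_persistence = row
    ```

    If attr_persistence stays at column, the database and tables have to exist before Cygnus runs.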

    There is a similar question where I explain this behaviour in more detail.

    HTH!

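    【Comments】:

    • So you mean that if I change to row mode, it will automatically create a new database and the appropriate tables? And if I stay in column mode, where do I specify which database to target?
    • Yes. In row mode Cygnus creates the database and tables automatically, because this mode stores the data attribute by attribute and Cygnus can assume a predefined structure. If you stay in column mode, you must create the databases yourself, naming them after the fiware-service header in the Orion notifications. If no fiware-service header is notified, the default cygnusagent.sources.http-source.handler.default_service = <DEF_SERV> is used as the database name.
    • The same goes for tables in column mode: they are named as the concatenation of the fiware-servicePath header in the notification (or cygnusagent.sources.http-source.handler.default_service_path = <DEF_SERVPATH> if it is not notified) with the entityId and entityType. You can find the details in the documentation.
    • OK, so I switched to row mode, restarted the Cygnus server, issued a subscribeContext and got a response, and even tried an updateContext. Nothing shows up in my database. Maybe I am getting the order wrong here?
    • In that case, could you open a new question? This is a different topic, namely that persisting data into MySQL in row mode is not working. Please include the log traces generated by Cygnus in the new question.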