Contents:

1. Requirements

2. References

3. Data and Configuration

4. Results

5. Notes

--------------------------------------------------------------------------------

1. Requirements

Following the official documentation, send data to Druid from Linux and query it back.

2. References

Data source: "Formatting the Data"
http://druid.io/docs/0.9.2/ingestion/data-formats.html

Configuration source: Druid's bundled example spec
/home/druid/druid-0.9.2/quickstart/wikiticker-index.json

3. Data and Configuration

1. Change the dates in the official sample data to the current date (I changed only the YYYY-MM-DD part).
2. Put basicdata.json into HDFS at /user/druid/basicdata.json.
3. Replace the column names in the default spec's dimensions with the column names from basicdata.json.

Test data: basicdata.json

{"timestamp": "2017-03-17T01:02:33Z", "page": "Gypsy Danger", "language" : "en", "user" : "nuclear", "unpatrolled" : "true", "newPage" : "true", "robot": "false", "anonymous": "false", "namespace":"article", "continent":"North America", "country":"United States", "region":"Bay Area", "city":"San Francisco", "added": 57, "deleted": 200, "delta": -143}
{"timestamp": "2017-03-17T03:32:45Z", "page": "Striker Eureka", "language" : "en", "user" : "speed", "unpatrolled" : "false", "newPage" : "true", "robot": "true", "anonymous": "false", "namespace":"wikipedia", "continent":"Australia", "country":"Australia", "region":"Cantebury", "city":"Syndey", "added": 459, "deleted": 129, "delta": 330}
{"timestamp": "2017-03-17T07:11:21Z", "page": "Cherno Alpha", "language" : "ru", "user" : "masterYi", "unpatrolled" : "false", "newPage" : "true", "robot": "true", "anonymous": "false", "namespace":"article", "continent":"Asia", "country":"Russia", "region":"Oblast", "city":"Moscow", "added": 123, "deleted": 12, "delta": 111}
{"timestamp": "2017-03-17T11:58:39Z", "page": "Crimson Typhoon", "language" : "zh", "user" : "triplets", "unpatrolled" : "true", "newPage" : "false", "robot": "true", "anonymous": "false", "namespace":"wikipedia", "continent":"Asia", "country":"China", "region":"Shanxi", "city":"Taiyuan", "added": 905, "deleted": 5, "delta": 900}
{"timestamp": "2017-03-17T12:41:27Z", "page": "Coyote Tango", "language" : "ja", "user" : "cancer", "unpatrolled" : "true", "newPage" : "false", "robot": "true", "anonymous": "false", "namespace":"wikipedia", "continent":"Asia", "country":"Japan", "region":"Kanto", "city":"Tokyo", "added": 1, "deleted": 10, "delta": -9}
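Steps 1 and 2 above can be scripted. A minimal sketch (GNU sed and date assumed; the HDFS commands are shown commented out because they need a live cluster, and the file and path names match those used above):

```shell
# A single sample row (same shape as basicdata.json above) so the sketch
# is self-contained; when using the real file, skip this line.
printf '%s\n' '{"timestamp": "2013-08-31T01:02:33Z", "page": "Gypsy Danger", "added": 57}' > basicdata.json

# Step 1: rewrite the YYYY-MM-DD part of every timestamp to today's date (UTC)
TODAY=$(date -u +%F)
sed -E "s/\"timestamp\": \"[0-9]{4}-[0-9]{2}-[0-9]{2}T/\"timestamp\": \"${TODAY}T/" \
    basicdata.json > basicdata_today.json

# Step 2: copy the result into HDFS (requires a running cluster)
# hdfs dfs -mkdir -p /user/druid
# hdfs dfs -put -f basicdata_today.json /user/druid/basicdata.json
```

If you rewrite the dates this way, remember to adjust the intervals in data_schema.json and queryall.json to match the new date.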

Test ingestion spec: data_schema.json

{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/user/druid/basicdata.json"
      }
    },
    "dataSchema" : {
      "dataSource" : "silentwolf",
      "granularitySpec" : {
        "type" : "arbitrary",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2017-03-17/2017-03-18"]
      },
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : { "dimensions" : [ "page", "language", "user", "unpatrolled", "newPage", "robot", "anonymous", "namespace", "continent", "country", "region", "city" ] },
          "timestampSpec" : { "format" : "auto", "column" : "timestamp" } }
      },
      "metricsSpec" : [
        {
          "name" : "count",
          "type" : "count"
        },
        {
          "name" : "added",
          "type" : "longSum",
          "fieldName" : "added"
        },
        {
          "name" : "deleted",
          "type" : "longSum",
          "fieldName" : "deleted"
        },
        {
          "name" : "delta",
          "type" : "longSum",
          "fieldName" : "delta"
        }
      ]
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {}
    }
  }
}
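Before submitting the spec, a couple of properties are worth checking locally; note 3 in the last section (no name overlap between dimensions and metricsSpec) is easy to get wrong. A Python sketch (the check_spec helper is hypothetical, not a Druid API, and the demo spec below is a trimmed copy of data_schema.json):

```python
def check_spec(spec, rows):
    """Local sanity check of an index_hadoop spec against sample rows:
    dimension/metric name clashes and interval coverage."""
    ds = spec["spec"]["dataSchema"]
    dims = set(ds["parser"]["parseSpec"]["dimensionsSpec"]["dimensions"])
    metric_names = {m["name"] for m in ds["metricsSpec"]}
    clash = dims & metric_names
    if clash:
        return "dimension/metric name clash: %s" % sorted(clash)
    # Every row's date must fall inside the ingestion interval
    start, end = ds["granularitySpec"]["intervals"][0].split("/")
    for row in rows:
        day = row["timestamp"][:10]
        if not (start[:10] <= day < end[:10]):
            return "timestamp %s outside interval" % row["timestamp"]
    return "ok"

# Trimmed-down version of the spec and data above
spec = {"spec": {"dataSchema": {
    "parser": {"parseSpec": {"dimensionsSpec": {"dimensions": ["page", "language"]}}},
    "metricsSpec": [{"name": "count", "type": "count"},
                    {"name": "added", "type": "longSum", "fieldName": "added"}],
    "granularitySpec": {"intervals": ["2017-03-17/2017-03-18"]},
}}}
rows = [{"timestamp": "2017-03-17T01:02:33Z"}]
print(check_spec(spec, rows))  # prints "ok"
```

Loading the real files with json.load/json.loads and passing them through the same function works the same way.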

Test query: queryall.json

{
  "queryType": "timeseries",
  "dataSource": "silentwolf",
  "intervals": [ "2017-03-17/2017-03-18" ],
  "granularity": "day",
  "aggregations": [
    { "type": "count", "name": "count" },
    { "name": "deleted", "type": "longSum", "fieldName": "deleted" },
    { "name": "delta", "type": "longSum", "fieldName": "delta" }
  ]
}
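Because queryGranularity is "none" and the five rows have distinct timestamps, rollup does not merge anything, so the query's totals can be predicted straight from the raw rows; this makes a convenient cross-check after ingestion. A small sketch (the numbers are copied from basicdata.json above):

```python
# (added, deleted, delta) for each of the five rows in basicdata.json
rows = [(57, 200, -143), (459, 129, 330), (123, 12, 111), (905, 5, 900), (1, 10, -9)]

expected = {
    "count":   len(rows),                # query-time count of stored rows
    "deleted": sum(r[1] for r in rows),
    "delta":   sum(r[2] for r in rows),
}
print(expected)  # {'count': 5, 'deleted': 356, 'delta': 1189}
```

If the timeseries response for 2017-03-17 does not match these totals, the ingestion likely dropped rows or the interval does not cover the data.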

4. Results

Submit the ingestion task:

[root@tagtic-master boke]# curl -X 'POST' -H 'Content-Type: application/json' -d @data_schema.json tagtic-master:18090/druid/indexer/v1/task

Run the query:

[root@tagtic-master boke]# curl -X POST 'tagtic-slave01:18082/druid/v2/?pretty' -H 'Content-Type:application/json' -d @queryall.json 

Submission, query, and result display: (screenshot omitted)

Ingestion task status: (screenshots omitted)

5. Notes

1. Find the broker host and port in your Druid cluster; my broker listens on port 18082.

[root@tagtic-slave01 yuhui]# ps -ef | grep broker
druid     52680  52675  1 220 ?       06:31:04 java -server -Xms16g -Xmx16g -XX:MaxDirectMemorySize=4096m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=var/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker
root      89216  67823  0 17:03 pts/0    00:00:00 grep --color=auto broker

2. The test data must be placed on HDFS.

3. Column names listed in dimensions must not duplicate any name in metricsSpec.

