hive中转化 json 格式的函数

  • string 类型 → json
    • named_struct:生成key和value的struct结构,传递参数为named_struct(key1,value1,key2,value2,…,keyN,valueN))
    • to_json:将各种复杂结构转换成json格式
      • 若无,可自定义 udf 函数
to_json(collect_list(named_struct("eventType",predicate_label,"eventTime",create_time))) AS json
  • map类型 → json
    • str_to_map 可以先将string切割成map,再转json
    • to_json:将各种复杂结构转换成json格式
      • 若无,可自定义 udf 函数
  to_json(map类型字段) as field_json


select to_json(str_to_map("aaaa_-100#bbbb_领券29减8#cccc_29分钟#cccc_50分钟",'#','_')) as json_test
-- {"aaaa":"-100","bbbb":"领券29减8","cccc":"50分钟"}
           

Hive中处理json数据的两种方式

第一种:将json数据作为字符串进行处理

  • get_json_object

    • 功能:取单个json字段
    • tips:如果字段名称错误,不会报错,但结果会是 NULL
  • json_tuple:

    • 功能:同时取多个字段
    • UDTF函数
    • tips如果加$.的话会出现null哦~~~
  • 案例一:

--创建数据:vim /export/datas/hivedata.json
{"id": 1701439105,"ids": [2154137571,3889177061],"total_number": 493}
{"id": 1701439106,"ids": [2154137571,3889177061],"total_number": 494}
--创建表:
create table tb_json_test1 (
json string
);
--加载数据
load data local inpath '/export/datas/hivedata.json' into table tb_json_test1;
--处理读取
select 
  get_json_object(t.json,'$.id'), 
  get_json_object(t.json,'$.total_number') 
from 
  tb_json_test1 t ;
  
select 
  t2.* 
from 
  tb_json_test1 t1 
lateral view 
  json_tuple(t1.json, 'id', 'total_number') t2 as c1,c2;
  
select json_tuple(json, 'id', 'total_number') as (c1,c2) from tb_json_test1;
[{"key":"upload-screenshot","value":false},{"key":"choice","value":"已解决"}]


select reply
    , key
    , vaule
from (select reply
			, reply_2
      from (select reply
                   ,regexp_replace(reply_1, '\\}\\,\\{', '\\}\\;\\{') as reply_1
              from (select reply
                           ,regexp_replace(reply, '\\[|\\]', '') as reply_1
                      from (select '[{"key":"upload-screenshot","value":false},{"key":"choice","value":"已解决"}]' as reply)
                   ) c
            ) a 
    LATERAL VIEW explode(split(reply_1, '\\;')) b as reply_2
   ) a 
lateral view json_tuple(reply_2, 'key', 'value') b as key,vaule 
;


 
-- reply	                                                                                key	                    vaule
-- [{"key":"upload-screenshot","value":false},{"key":"choice","value":"已解决"}]	        upload-screenshot	    false	
-- [{"key":"upload-screenshot","value":false},{"key":"choice","value":"已解决"}]	        choice	                已解决	

第二种:通过专门的解析类直接加载一个json格式的数据到Hive中

  • 案例:
--创建文件:vim /export/datas/hivedata.json
{"id": 1701439105,"ids": [2154137571,3889177061],"total_number": 493}
{"id": 1701439106,"ids": [2154137571,3889177061],"total_number": 494}
--添加jar包
add jar /export/datas/json-serde-1.3.7-jar-with-dependencies.jar;
--创建表:
create table tb_json_test2 (
id string,
ids array<string>,
total_number int)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;
--加载数据
load data local inpath '/export/datas/hivedata.json' into table tb_json_test2;
GitHub 加速计划 / js / json
18
5
下载
适用于现代 C++ 的 JSON。
最近提交(Master分支:3 个月前 )
f06604fc * :page_facing_up: bump the copyright years Signed-off-by: Niels Lohmann <mail@nlohmann.me> * :page_facing_up: bump the copyright years Signed-off-by: Niels Lohmann <mail@nlohmann.me> * :page_facing_up: bump the copyright years Signed-off-by: Niels Lohmann <niels.lohmann@gmail.com> --------- Signed-off-by: Niels Lohmann <mail@nlohmann.me> Signed-off-by: Niels Lohmann <niels.lohmann@gmail.com> 8 天前
d23291ba * add a ci step for Json_Diagnostic_Positions Signed-off-by: Harinath Nampally <harinath922@gmail.com> * Update ci.cmake to address review comments Signed-off-by: Harinath Nampally <harinath922@gmail.com> * address review comment Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix typo in the comment Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix typos in ci.cmake Signed-off-by: Harinath Nampally <harinath922@gmail.com> * invoke the new ci step from ubuntu.yml Signed-off-by: Harinath Nampally <harinath922@gmail.com> * issue4561 - use diagnostic positions for exceptions Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix ci_test_documentation check Signed-off-by: Harinath Nampally <harinath922@gmail.com> * address review comments Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix ci check failures for unit-diagnostic-postions.cpp Signed-off-by: Harinath Nampally <harinath922@gmail.com> * improvements based on review comments Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix const correctness string Signed-off-by: Harinath Nampally <harinath922@gmail.com> * further refinements based on reviews Signed-off-by: Harinath Nampally <harinath922@gmail.com> * add one more test case for full coverage Signed-off-by: Harinath Nampally <harinath922@gmail.com> * ci check fix - add const Signed-off-by: Harinath Nampally <harinath922@gmail.com> * add unit tests for json_diagnostic_postions only Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix ci_test_diagnostics Signed-off-by: Harinath Nampally <harinath922@gmail.com> * fix ci_test_build_documentation check Signed-off-by: Harinath Nampally <harinath922@gmail.com> --------- Signed-off-by: Harinath Nampally <harinath922@gmail.com> 9 天前
Logo

旨在为数千万中国开发者提供一个无缝且高效的云端环境,以支持学习、使用和贡献开源项目。

更多推荐