Pandas methods for working with JSON:

  1. Convert a JSON string to a pandas object.
read_json([path_or_buf, orient, typ, dtype, ...])
  2. Normalize semi-structured JSON data into a flat table.
json_normalize(data[, record_path, meta, ...])
  3. Convert the object to a JSON string.
DataFrame.to_json([path_or_buf, orient, ...])
  4. Create a Table Schema from data.
build_table_schema(data[, index, ...])

pandas.read_json

pandas.read_json(path_or_buf=None, 
				orient=None, 
				typ='frame', 
				dtype=None, 
				convert_axes=None, 
				convert_dates=True, 
				keep_default_dates=True, 
				numpy=False, 
				precise_float=False, 
				date_unit=None, 
				encoding=None, 
				encoding_errors='strict', 
				lines=False, 
				chunksize=None, 
				compression='infer', 
				nrows=None, 
				storage_options=None)

Parameters:

  • path_or_buf: a valid JSON str, path object or file-like object
  • orient: str
  • typ: {‘frame’, ‘series’}, default ‘frame’
  • dtype: bool or dict, default None
  • convert_axes: bool, default None
  • convert_dates: bool or list of str, default True
  • keep_default_dates: bool, default True
  • numpy: bool, default False
  • precise_float: bool, default False
  • date_unit: str, default None
  • encoding: str, default is ‘utf-8’
  • encoding_errors: str, optional, default “strict”
  • lines: bool, default False. Read the file as one JSON object per line.
  • chunksize: int, optional
  • compression: str or dict, default ‘infer’
  • nrows: int, optional
  • storage_options: dict, optional
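
As a small illustration of the dtype parameter listed above, the sketch below forces one column to stay text. By default pandas tries to infer column types, which may turn digit-only strings into numbers. The payload and the column names (code, qty) are made up for this example, and the JSON is wrapped in StringIO so that it is read as a file-like object.

```python
from io import StringIO

import pandas as pd

# Hypothetical payload: "code" holds digit strings with leading zeros.
raw = '[{"code": "007", "qty": 3}, {"code": "012", "qty": 5}]'

# Force "code" to stay text; "qty" is left to pandas' own type inference.
df = pd.read_json(StringIO(raw), orient="records", dtype={"code": str})

print(df)
print(df.dtypes)
```

Passing dtype=False instead would switch off type inference for every column.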

Returns: Series or DataFrame
Example:
Contents of the JSON file (ceshi.json):

[{"ttery":"[123]","issue":"20130801-3391"},{"ttery":"[123]","issue":"20130801-3390"},{"ttery":"[123]","issue":"20130801-3389"}]
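
# -*- coding: utf-8 -*-

import pandas as pd

# Open the file in a with-block so the handle is closed after reading.
with open('ceshi.json', 'r', encoding='utf-8') as file:
    df = pd.read_json(file, orient='records')

# Writing .xlsx requires an Excel engine such as openpyxl to be installed.
df.to_excel('pandas处理ceshi-json.xlsx', index=False, columns=["ttery", "issue"])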
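
The lines and chunksize parameters can be sketched as follows. This is a minimal example, assuming the same records are stored in JSON Lines form (one object per line). The data is held in an in-memory StringIO rather than a real file, so the snippet runs on its own.

```python
from io import StringIO

import pandas as pd

# JSON Lines input: one object per line (same fields as the sample file above).
raw = (
    '{"ttery": "[123]", "issue": "20130801-3391"}\n'
    '{"ttery": "[123]", "issue": "20130801-3390"}\n'
    '{"ttery": "[123]", "issue": "20130801-3389"}\n'
)

# Read every line into a single DataFrame.
df = pd.read_json(StringIO(raw), lines=True)

# Read two rows at a time; chunksize is only valid together with lines=True.
with pd.read_json(StringIO(raw), lines=True, chunksize=2) as reader:
    chunk_sizes = [len(chunk) for chunk in reader]

print(df.shape)
print(chunk_sizes)
```

Chunked reading returns an iterator of DataFrames instead of one DataFrame, which helps when the file is too large to load at once.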

pandas.json_normalize

pandas.json_normalize(data, 
				record_path=None, 
				meta=None, 
				meta_prefix=None, 
				record_prefix=None, 
				errors='raise', 
				sep='.', 
				max_level=None)

Parameters:

  • data: dict or list of dicts
  • record_path: str or list of str, default None
  • meta: list of paths (str or list of str), default None
  • meta_prefix: str, default None
  • record_prefix: str, default None
  • errors: {‘raise’, ‘ignore’}, default ‘raise’
  • sep: str, default ‘.’
  • max_level: int, default None

Returns: frame: DataFrame
Example:

data = [
    {"id": 1, "name": {"first": "Coleen", "last": "Volk"}},
    {"name": {"given": "Mark", "family": "Regner"}},
    {"id": 2, "name": "Faye Raker"},
]
pd.json_normalize(data)

    id name.first name.last name.given name.family        name
0  1.0     Coleen      Volk        NaN         NaN         NaN
1  NaN        NaN       NaN       Mark      Regner         NaN
2  2.0        NaN       NaN        NaN         NaN  Faye Raker

data = [
    {
        "id": 1,
        "name": "Cole Volk",
        "fitness": {"height": 130, "weight": 60},
    },
    {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
    {
        "id": 2,
        "name": "Faye Raker",
        "fitness": {"height": 130, "weight": 60},
    },
]
pd.json_normalize(data, max_level=0)

    id        name                        fitness
0  1.0   Cole Volk  {'height': 130, 'weight': 60}
1  NaN    Mark Reg  {'height': 130, 'weight': 60}
2  2.0  Faye Raker  {'height': 130, 'weight': 60}

data = [
    {
        "id": 1,
        "name": "Cole Volk",
        "fitness": {"height": 130, "weight": 60},
    },
    {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
    {
        "id": 2,
        "name": "Faye Raker",
        "fitness": {"height": 130, "weight": 60},
    },
]
pd.json_normalize(data, max_level=1)

    id        name  fitness.height  fitness.weight
0  1.0   Cole Volk             130              60
1  NaN    Mark Reg             130              60
2  2.0  Faye Raker             130              60

data = [
    {
        "state": "Florida",
        "shortname": "FL",
        "info": {"governor": "Rick Scott"},
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
            {"name": "Palm Beach", "population": 60000},
        ],
    },
    {
        "state": "Ohio",
        "shortname": "OH",
        "info": {"governor": "John Kasich"},
        "counties": [
            {"name": "Summit", "population": 1234},
            {"name": "Cuyahoga", "population": 1337},
        ],
    },
]
result = pd.json_normalize(
    data, "counties", ["state", "shortname", ["info", "governor"]]
)

         name  population    state  shortname  info.governor
0        Dade       12345  Florida         FL     Rick Scott
1     Broward       40000  Florida         FL     Rick Scott
2  Palm Beach       60000  Florida         FL     Rick Scott
3      Summit        1234     Ohio         OH    John Kasich
4    Cuyahoga        1337     Ohio         OH    John Kasich
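
The record_prefix, meta_prefix and sep parameters are not covered by the examples above. The sketch below shows how they affect column names, using a shortened version of the same data. The prefixes "county_" and "meta_" are arbitrary choices for illustration.

```python
import pandas as pd

data = [
    {
        "state": "Florida",
        "info": {"governor": "Rick Scott"},
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
        ],
    },
]

flat = pd.json_normalize(
    data,
    record_path="counties",                # one output row per county
    meta=["state", ["info", "governor"]],  # values repeated on every row
    record_prefix="county_",               # prefix for columns from the records
    meta_prefix="meta_",                   # prefix for columns from meta
    sep="_",                               # joins nested meta keys: info_governor
)

print(flat.columns.tolist())
```

Prefixes are useful when a record field and a meta field share the same name, since json_normalize would otherwise report a conflict.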

DataFrame.to_json

DataFrame.to_json(path_or_buf=None, 
				orient=None, 
				date_format=None, 
				double_precision=10, 
				force_ascii=True, 
				date_unit='ms', 
				default_handler=None, 
				lines=False, 
				compression='infer', 
				index=True, 
				indent=None, 
				storage_options=None)

Parameters:

  • path_or_buf: str, path object, file-like object, or None, default None
  • orient: str
  • date_format: {None, ‘epoch’, ‘iso’}
  • double_precision: int, default 10
  • force_ascii: bool, default True
  • date_unit: str, default ‘ms’ (milliseconds)
  • default_handler: callable, default None
  • lines: bool, default False
  • compression: str or dict, default ‘infer’
  • index: bool, default True
  • indent: int, optional
  • storage_options: dict, optional

Returns: None or str
Example:

import json
df = pd.DataFrame(
    [["a", "b"], ["c", "d"]],
    index=["row 1", "row 2"],
    columns=["col 1", "col 2"],
)
result = df.to_json(orient="split")
parsed = json.loads(result)
print(json.dumps(parsed, indent=4))

{
    "columns": [
        "col 1",
        "col 2"
    ],
    "index": [
        "row 1",
        "row 2"
    ],
    "data": [
        [
            "a",
            "b"
        ],
        [
            "c",
            "d"
        ]
    ]
}
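
For comparison with orient="split", the sketch below writes the same DataFrame with orient="records", with and without lines=True, and then reads the "split" output back with read_json. It only illustrates the parameters listed above and is not a recommendation of one layout over another.

```python
import json
from io import StringIO

import pandas as pd

df = pd.DataFrame(
    [["a", "b"], ["c", "d"]],
    index=["row 1", "row 2"],
    columns=["col 1", "col 2"],
)

# orient="records": a list of row objects; the index labels are not written.
records = json.loads(df.to_json(orient="records"))

# lines=True (requires orient="records"): one JSON object per line.
json_lines = df.to_json(orient="records", lines=True).splitlines()

# Round trip: orient="split" keeps both index and column labels.
restored = pd.read_json(StringIO(df.to_json(orient="split")), orient="split")

print(records)
print(json_lines)
print(restored)
```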

pandas.io.json.build_table_schema

pandas.io.json.build_table_schema(data, index=True, primary_key=None, version=True)

Parameters:

  • data: Series, DataFrame
  • index: bool, default True
  • primary_key: bool or None, default True
  • version: bool, default True

Returns: schema: dict
Example:

import pandas as pd
from pandas.io.json import build_table_schema

df = pd.DataFrame(
    {'A': [1, 2, 3],
     'B': ['a', 'b', 'c'],
     'C': pd.date_range('2016-01-01', freq='D', periods=3),
    }, index=pd.Index(range(3), name='idx'))
build_table_schema(df)

{'fields': [{'name': 'idx', 'type': 'integer'}, {'name': 'A', 'type': 'integer'}, {'name': 'B', 'type': 'string'}, {'name': 'C', 'type': 'datetime'}], 'primaryKey': ['idx'], 'pandas_version': '1.4.0'}
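
The index and version parameters can be tried out as in the sketch below, which uses a small numeric DataFrame made up for this example. It also shows DataFrame.to_json(orient="table"), which writes the same kind of schema alongside the rows.

```python
import json

import pandas as pd
from pandas.io.json import build_table_schema

df = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5]})

# index=False leaves the index out of the fields (and so sets no primaryKey);
# version=False leaves out the "pandas_version" entry.
schema = build_table_schema(df, index=False, version=False)

# orient="table" produces a document with "schema" and "data" keys.
payload = json.loads(df.to_json(orient="table"))

print(schema)
print(sorted(payload))
```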