Saving UTF-8 text in json.dumps as UTF-8, not as \u escape sequences

Posted by p15097962069 on 2020-05-19 17:07:58

This post is translated from: Saving utf-8 texts in json.dumps as UTF8, not as \u escape sequence

Sample code:

>>> import json
>>> json_string = json.dumps("ברי צקלה")
>>> print json_string
"\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4"

The problem: it's not human readable. My (smart) users want to verify or even edit text files with JSON dumps (and I'd rather not use XML).

Is there a way to serialize objects into UTF-8 JSON strings (instead of \uXXXX)?

Reply #1

Reference: https://stackoom.com/question/1EwOd/将utf-文本保存在json-dumps中为UTF-而不是-u转义序列

Reply #2

Pass ensure_ascii=False to json.dumps(), then encode the value to UTF-8 manually:

>>> json_string = json.dumps("ברי צקלה", ensure_ascii=False).encode('utf8')
>>> json_string
b'"\xd7\x91\xd7\xa8\xd7\x99 \xd7\xa6\xd7\xa7\xd7\x9c\xd7\x94"'
>>> print(json_string.decode())
"ברי צקלה"
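
To see the effect of the switch side by side, here is a minimal Python 3 sketch (the variable names are illustrative and not part of the original answer). Both forms are valid JSON and decode to the same string; only the textual representation differs.

```python
import json

text = "ברי צקלה"

escaped = json.dumps(text)                       # default: ensure_ascii=True
readable = json.dumps(text, ensure_ascii=False)  # keeps non-ASCII characters as-is

print(escaped)   # ASCII-only output with \uXXXX escapes
print(readable)  # human-readable output

# Both representations decode back to the same Python string.
assert json.loads(escaped) == json.loads(readable) == text

# Encoding to UTF-8 is a separate, explicit step that yields bytes.
utf8_bytes = readable.encode("utf-8")
print(len(escaped), len(readable), len(utf8_bytes))
```

The escaped form is longer but survives any ASCII-only channel, while the readable form is what a person editing the file by hand would want.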

If you are writing to a file, just use json.dump() and leave it to the file object to encode:

with open('filename', 'w', encoding='utf8') as json_file:
    json.dump("ברי צקלה", json_file, ensure_ascii=False)
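
As a quick check that the file really contains UTF-8 rather than escapes, the following Python 3 sketch writes a small object, inspects the raw bytes, and reads it back. The file name example.json, the temporary directory, and the sample dictionary are assumptions made for illustration.

```python
import json
import os
import tempfile

data = {"greeting": "ברי צקלה"}

# Hypothetical file location inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.json")

    # The file object does the UTF-8 encoding; json.dump only writes str.
    with open(path, "w", encoding="utf8") as json_file:
        json.dump(data, json_file, ensure_ascii=False)

    # Raw bytes on disk: UTF-8 encoded text, with no \u escape sequences.
    with open(path, "rb") as raw_file:
        raw = raw_file.read()

    # Reading back with the same encoding restores the original object.
    with open(path, encoding="utf8") as json_file:
        loaded = json.load(json_file)

print(raw)
print(loaded == data)
```

Passing encoding='utf8' explicitly on both the write and the read avoids depending on the platform's default text encoding.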

Caveats for Python 2

For Python 2, there are some more caveats to take into account. If you are writing this to a file, you can use io.open() instead of open() to produce a file object that encodes Unicode values for you as you write, then use json.dump() to write to that file:

with io.open('filename', 'w', encoding='utf8') as json_file:
    json.dump(u"ברי צקלה", json_file, ensure_ascii=False)

Do note that there is a bug in the json module where the ensure_ascii=False flag can produce a mix of unicode and str objects. The workaround for Python 2 then is:

with io.open('filename', 'w', encoding='utf8') as json_file:
    data = json.dumps(u"ברי צקלה", ensure_ascii=False)
    # unicode(data) auto-decodes data to unicode if str
    json_file.write(unicode(data))

In Python 2, when using byte strings (type str) encoded to UTF-8, make sure to also set the encoding keyword:

>>> d={ 1: "ברי צקלה", 2: u"ברי צקלה" }
>>> d
{1: '\xd7\x91\xd7\xa8\xd7\x99 \xd7\xa6\xd7\xa7\xd7\x9c\xd7\x94', 2: u'\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4'}

>>> s=json.dumps(d, ensure_ascii=False, encoding='utf8')
>>> s
u'{"1": "\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4", "2": "\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4"}'
>>> json.loads(s)['1']
u'\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4'
>>> json.loads(s)['2']
u'\u05d1\u05e8\u05d9 \u05e6\u05e7\u05dc\u05d4'
>>> print json.loads(s)['1']
ברי צקלה
>>> print json.loads(s)['2']
ברי צקלה
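
The encoding keyword exists only in Python 2. In Python 3, json.dumps() has no such parameter and refuses bytes objects, so UTF-8 byte strings have to be decoded before serializing. A minimal sketch, assuming the bytes are known to be UTF-8 (the helper name decode_bytes is hypothetical, not part of the json module):

```python
import json


def decode_bytes(obj, encoding="utf8"):
    """Recursively turn bytes into str so that json.dumps accepts the value."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [decode_bytes(item, encoding) for item in obj]
    return obj


d = {1: "ברי צקלה".encode("utf8"), 2: "ברי צקלה"}

try:
    json.dumps(d, ensure_ascii=False)
    bytes_rejected = False
except TypeError:
    # Python 3 raises TypeError because bytes is not JSON serializable.
    bytes_rejected = True

s = json.dumps(decode_bytes(d), ensure_ascii=False)
print(s)
```

Decoding up front keeps the decision about the byte encoding in one place instead of relying on json to guess it.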

Reply #3

Using ensure_ascii=False in json.dumps is the right direction to solve this problem, as pointed out by Martijn. However, this may raise an exception:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 1: ordinal not in range(128)

You need extra settings in either site.py or sitecustomize.py so that sys.getdefaultencoding() returns the correct encoding. site.py is under lib/python2.7/ and sitecustomize.py is under lib/python2.7/site-packages.

If you want to use site.py, under def setencoding(): change the first if 0: to if 1: so that Python will use your operating system's locale.

If you prefer to use sitecustomize.py (which may not exist if you haven't created it), simply put these lines in it:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

Then you can produce Chinese JSON output in UTF-8 format, such as:

name = {"last_name": u"王"}
json.dumps(name, ensure_ascii=False)

You will get a UTF-8 encoded string, rather than a \u escaped JSON string.

To verify your default encoding:

print sys.getdefaultencoding()

You should get "utf-8" or "UTF-8", which confirms your site.py or sitecustomize.py settings.

Please note that you cannot call sys.setdefaultencoding("utf-8") at the interactive Python console.
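
The site.py and sitecustomize.py changes above apply to Python 2 only. In Python 3, sys.setdefaultencoding() no longer exists and sys.getdefaultencoding() reports 'utf-8', so the same output can be obtained without touching the interpreter configuration. A minimal Python 3 sketch of the example above:

```python
import json
import sys

# In Python 3 the default string encoding is fixed.
default_encoding = sys.getdefaultencoding()
print(default_encoding)

name = {"last_name": "王"}
result = json.dumps(name, ensure_ascii=False)
print(result)

# The result is a str; encode explicitly when bytes are needed.
result_bytes = result.encode("utf-8")
```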

Reply #4

UPDATE: This is a wrong answer, but it's still useful to understand why it's wrong. See comments.

How about unicode-escape?

>>> d = {1: "ברי צקלה", 2: u"ברי צקלה"}
>>> json_str = json.dumps(d).decode('unicode-escape').encode('utf8')
>>> print json_str
{"1": "ברי צקלה", "2": "ברי צקלה"}
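
The comments referred to are not reproduced here, but one way this approach can go wrong is easy to demonstrate: the unicode_escape codec undoes every backslash escape, including the \" and \\ escapes that JSON needs to stay well-formed. A minimal Python 3 sketch with made-up sample data:

```python
import json

d = {"quote": 'say "hi"', "path": "C:\\temp", "text": "ברי צקלה"}

# The approach from the answer, ported to Python 3: escape everything to
# ASCII, then undo *all* backslash escapes with the unicode_escape codec.
broken = json.dumps(d).encode("ascii").decode("unicode_escape")

try:
    json.loads(broken)
    still_valid = True
except json.JSONDecodeError:
    # \" and \\ were unescaped too, so the string delimiters no longer match.
    still_valid = False

# ensure_ascii=False leaves the structural escapes intact.
correct = json.dumps(d, ensure_ascii=False)
round_trips = json.loads(correct) == d

print(still_valid, round_trips)
```

The simple dictionary in the answer happens to work only because it contains no quotes, backslashes, or control characters.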

Reply #5

Pieters' Python 2 workaround fails on an edge case:

d = {u'keyword': u'bad credit  \xe7redit cards'}
with io.open('filename', 'w', encoding='utf8') as json_file:
    data = json.dumps(d, ensure_ascii=False).decode('utf8')
    try:
        json_file.write(data)
    except TypeError:
        # Decode data to Unicode first
        json_file.write(data.decode('utf8'))

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 25: ordinal not in range(128)

It was crashing on the .decode('utf8') part of line 3. I fixed the problem by making the program much simpler, avoiding that step as well as the special casing of ASCII:

with io.open('filename', 'w', encoding='utf8') as json_file:
    data = json.dumps(d, ensure_ascii=False, encoding='utf8')
    json_file.write(unicode(data))

cat filename
{"keyword": "bad credit  çredit cards"}

Reply #6

Here's my solution using json.dump():

import codecs
import json

def jsonWrite(p, pyobj, ensure_ascii=False, encoding=SYSTEM_ENCODING, **kwargs):
    with codecs.open(p, 'wb', 'utf_8') as fileobj:
        json.dump(pyobj, fileobj, ensure_ascii=ensure_ascii, encoding=encoding, **kwargs)

where SYSTEM_ENCODING is set to:

import locale
locale.setlocale(locale.LC_ALL, '')
SYSTEM_ENCODING = locale.getlocale()[1]
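
For comparison, a rough Python 3 counterpart of jsonWrite() might look like the sketch below. It assumes the file should always be UTF-8, so the SYSTEM_ENCODING lookup and the encoding keyword (which json.dump() does not accept in Python 3) are dropped. The name json_write and the example path are hypothetical.

```python
import json
import os
import tempfile


def json_write(path, pyobj, ensure_ascii=False, **kwargs):
    """Write pyobj to path as UTF-8 encoded JSON."""
    with open(path, "w", encoding="utf-8") as fileobj:
        json.dump(pyobj, fileobj, ensure_ascii=ensure_ascii, **kwargs)


# Usage: extra keyword arguments such as indent are passed through to json.dump.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "names.json")
    json_write(target, {"last_name": "王"}, indent=2)
    with open(target, encoding="utf-8") as fileobj:
        written = fileobj.read()

print(written)
```

Fixing the file encoding to UTF-8 instead of following the system locale keeps the output identical across machines, which matters when users exchange or hand-edit the files.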
