PaddleOCR: converting the official ICDAR labels to the data format PaddleOCR supports
gen_label.py fails during conversion:
Generate rec label
Traceback (most recent call last):
  File "gen_label.py", line 79, in <module>
    gen_rec_label(args.input_path, args.output_label)
  File "gen_label.py", line 24, in gen_rec_label
    img_path, label = tmp[0], tmp[1]
IndexError: list index out of range
The label file was downloaded by following the GitHub documentation:
train/word_1.png Genaxis Theatre
train/word_2.png [06]
train/word_3.png 62-03
train/word_4.png Carpark
train/word_5.png EXIT
train/word_6.png I2R
train/word_7.png fusionopolis
train/word_8.png fusionopolis
train/word_9.png Reserve
train/word_10.png CAUTION
train/word_11.png citi
train/word_12.png smrt
train/word_13.png WHY
train/word_14.png PAY
train/word_15.png FOR
train/word_16.png NOTHING?
train/word_17.png EXIT
train/word_18.png STAGE
train/word_19.png HarbourFront
train/word_20.png CC22
train/word_21.png bua
train/word_22.png Shops
train/word_23.png Gen is
train/word_24.png Theatre
train/word_25.png Place
train/word_26.png Nursing
train/word_27.png Room
train/word_28.png Fraser
train/word_29.png LIFESTYLE
train/word_30.png CHOICES
train/word_31.png THAT
train/word_32.png FOR
train/word_33.png COME
train/word_34.png SHOP
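The IndexError is consistent with the script splitting each line on a comma while the downloaded file separates the path and the text differently. A minimal sketch of that mismatch, assuming the separator in the downloaded file is a tab (the sample line is taken from the listing above; the tab itself is an assumption):

```python
# Assumption: path and text in the downloaded label file are tab-separated.
line = "train/word_1.png\tGenaxis Theatre\n"

# The comma split used by the unmodified script.
by_comma = line.strip('\n').replace(" ", "").split(',')
# The same line split on a tab instead.
by_tab = line.strip('\n').replace(" ", "").split('\t')

print(len(by_comma))  # a single element, so tmp[1] raises IndexError
print(len(by_tab))    # two elements: image path and label
```

If the file were tab-separated as assumed, only the second split would yield both the image path and the label.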
Solution
def gen_rec_label(input_path, out_label):
    '''Adjust this to match your own label format.'''
    with open(out_label, 'w') as out_file:
        with open(input_path, 'r') as f:
            for line in f.readlines():
                # Original line: splits on ',' and fails on this label file
                # tmp = line.strip('\n').replace(" ", "").split(',')
                # Changed line: split on a tab instead
                # Note: replace(" ", "") also removes spaces inside the label text
                tmp = line.strip('\n').replace(" ", "").split('\t')
                # My change, written step by step:
                # tmp1 = line.strip('\n')
                # tmp2 = tmp1.replace(" ", "")
                # tmp = tmp2.split('\t')
                img_path, label = tmp[0], tmp[1]
                label = label.replace("\"", "")
                out_file.write(img_path + '\t' + label + '\n')
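The key change is the separator passed to split(): ',' in the original line (commented out) becomes '\t':
# tmp = line.strip('\n').replace(" ", "").split(',')
tmp = line.strip('\n').replace(" ", "").split('\t')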
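For label files that might come in either layout, a more tolerant parser could avoid editing the script each time. The sketch below is only an illustration: `parse_label_line` is a hypothetical helper, not part of gen_label.py, and it assumes ICDAR-style lines look like `word_1.png, "Genaxis Theatre"`. Unlike the script above, it keeps spaces inside the label and strips only an enclosing pair of double quotes.

```python
def parse_label_line(line):
    """Hypothetical helper: return (img_path, label), or None if the line cannot be split."""
    line = line.rstrip('\r\n')
    if not line.strip():
        return None  # skip blank lines instead of raising IndexError
    # Prefer a tab (PaddleOCR-style); otherwise fall back to the first comma.
    sep = '\t' if '\t' in line else ','
    # maxsplit=1 keeps any later separators inside the label text.
    parts = line.split(sep, 1)
    if len(parts) != 2:
        return None
    img_path, label = parts[0].strip(), parts[1].strip()
    # Assumed ICDAR style: the text is wrapped in double quotes; drop that pair only.
    if len(label) >= 2 and label[0] == '"' and label[-1] == '"':
        label = label[1:-1]
    return img_path, label


print(parse_label_line('train/word_1.png\tGenaxis Theatre\n'))
print(parse_label_line('word_1.png, "Genaxis Theatre"\n'))
```

Returning None for malformed lines lets the caller decide whether to skip or report them, rather than stopping at the first bad line.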