Conversion error when running gen_label.py:

Generate rec label
Traceback (most recent call last):
  File "gen_label.py", line 79, in <module>
    gen_rec_label(args.input_path, args.output_label)
  File "gen_label.py", line 24, in gen_rec_label
    img_path, label = tmp[0], tmp[1]
IndexError: list index out of range

The label file was downloaded by following the GitHub documentation:

train/word_1.png    Genaxis Theatre
train/word_2.png    [06]
train/word_3.png    62-03
train/word_4.png    Carpark
train/word_5.png    EXIT
train/word_6.png    I2R
train/word_7.png    fusionopolis
train/word_8.png    fusionopolis
train/word_9.png    Reserve
train/word_10.png    CAUTION
train/word_11.png    citi
train/word_12.png    smrt
train/word_13.png    WHY
train/word_14.png    PAY
train/word_15.png    FOR
train/word_16.png    NOTHING?
train/word_17.png    EXIT
train/word_18.png    STAGE
train/word_19.png    HarbourFront
train/word_20.png    CC22
train/word_21.png    bua
train/word_22.png    Shops
train/word_23.png    Gen is
train/word_24.png    Theatre
train/word_25.png    Place
train/word_26.png    Nursing
train/word_27.png    Room
train/word_28.png    Fraser
train/word_29.png    LIFESTYLE
train/word_30.png    CHOICES
train/word_31.png    THAT
train/word_32.png    FOR
train/word_33.png    COME
train/word_34.png    SHOP
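
To see why the script fails on lines like these, here is a minimal sketch. It assumes the image path and the label are separated by a tab in the downloaded file (which is what the fix in this post relies on); the sample line is taken from the listing above.

```python
# Assumption: path and label are separated by a single tab character.
line = "train/word_1.png\tGenaxis Theatre\n"

# Logic from the traceback: strip the newline, drop spaces, split on a comma.
by_comma = line.strip("\n").replace(" ", "").split(",")

# Same steps, but splitting on a tab instead.
by_tab = line.strip("\n").replace(" ", "").split("\t")

print(len(by_comma))  # 1, so by_comma[1] raises IndexError
print(by_tab)         # ['train/word_1.png', 'GenaxisTheatre']
```

Because the line contains no comma, the comma split returns a single element and indexing position 1 fails. Note also that removing spaces turns "Genaxis Theatre" into "GenaxisTheatre".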

Solution

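def gen_rec_label(input_path, out_label):
    '''Adjust this to match the format of your own label file.'''
    with open(out_label, 'w') as out_file:
        with open(input_path, 'r') as f:
            for line in f.readlines():
                # Original: tmp = line.strip('\n').replace(" ", "").split(',')
                # Changed to split on a tab instead of a comma.
                # Note: replace(" ", "") also removes spaces inside the label.
                tmp = line.strip('\n').replace(" ", "").split('\t')
                # My own modification (the same steps, written out):
                #tmp1 = line.strip('\n')
                #tmp2 = tmp1.replace(" ", "")
                #tmp = tmp2.split('\t')
                img_path, label = tmp[0], tmp[1]
                # Remove double quotes from the label.
                label = label.replace("\"", "")
                out_file.write(img_path + '\t' + label + '\n')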

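The key change is the delimiter. The label lines contain no comma, so split(',') returns a one-element list and tmp[1] raises the IndexError shown in the traceback. Splitting on a tab ('\t'), the separator this label file uses between the image path and the label, yields both fields:

# tmp = line.strip('\n').replace(" ", "").split(',')
tmp = line.strip('\n').replace(" ", "").split('\t')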
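
If the label file may contain blank or malformed lines, a more defensive variant could look like the sketch below. This is not part of PaddleOCR: `parse_rec_line` and `gen_rec_label_safe` are hypothetical names, UTF-8 encoding is an assumption, and unlike the fix above it keeps spaces inside labels (so "Genaxis Theatre" stays two words), which may or may not be what your training setup expects.

```python
def parse_rec_line(line):
    """Hypothetical helper: split one label line into (img_path, label).

    Returns None when the line has no tab or the path is empty.
    """
    text = line.rstrip("\r\n")
    # partition splits on the first tab only, so tabs in the label survive.
    img_path, sep, label = text.partition("\t")
    if not sep or not img_path.strip():
        return None
    # Strip surrounding whitespace and drop double quotes from the label.
    return img_path.strip(), label.strip().replace('"', "")


def gen_rec_label_safe(input_path, out_label):
    """Hypothetical variant of gen_rec_label that skips malformed lines.

    Returns the number of skipped lines.
    """
    skipped = 0
    # Assumption: both files are UTF-8 encoded.
    with open(input_path, "r", encoding="utf-8") as f, \
            open(out_label, "w", encoding="utf-8") as out_file:
        for line in f:
            parsed = parse_rec_line(line)
            if parsed is None:
                skipped += 1
                continue
            img_path, label = parsed
            out_file.write(img_path + "\t" + label + "\n")
    return skipped
```

Returning the count of skipped lines makes it easy to notice when the delimiter is wrong: if every line is skipped, the file is probably not tab-separated.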
