Change file encoding to utf8

Description:

Sometimes you need to convert files from another encoding to UTF-8, especially when a large number of files must be converted.


Modules Needed

1 chardet (third-party; install with pip install chardet)
2 os (standard library)

Code

The chardet module is used here to detect a file's current encoding, though there are several other ways to do the same thing.
The function is as follows:

import chardet


def convert(filename, in_enc="GBK", out_enc="UTF8"):
    print("convert " + filename)
    with open(filename, 'rb') as f:
        content = f.read()
    result = chardet.detect(content)  # guess the current encoding with chardet
    coding = result['encoding']  # None when chardet cannot make a guess
    if coding is None or coding.lower() != 'utf-8':
        if coding:
            print(coding + " to utf-8!")
        # the detected encoding is only used for the check; decoding uses in_enc
        new_content = content.decode(in_enc).encode(out_enc)
        with open(filename, 'wb') as f:
            f.write(new_content)
        print("done")
    else:
        print(coding)
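
For the many-files case mentioned in the description, a directory walk with os can apply a converter to each file. The following is only a minimal sketch, not part of the original page: the names reencode_file and convert_tree are hypothetical, the default extension filter (".txt") is an assumption, and the stand-in converter uses only the standard library. Instead of chardet, it treats any file that already decodes as the target encoding as done, which is a rough heuristic.

```python
import os


def reencode_file(path, in_enc="gbk", out_enc="utf-8"):
    """Rewrite one file from in_enc to out_enc; return True if it was changed."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        # Heuristic (assumption): bytes that already decode as the target
        # encoding are left alone, so converted files are not converted twice.
        raw.decode(out_enc)
        return False
    except UnicodeDecodeError:
        pass
    try:
        text = raw.decode(in_enc)
    except UnicodeDecodeError:
        return False  # not valid in_enc either; leave the file untouched
    with open(path, "wb") as f:
        f.write(text.encode(out_enc))
    return True


def convert_tree(root, extensions=(".txt",), converter=reencode_file):
    """Apply converter to every matching file under root; return changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(tuple(extensions)):
                path = os.path.join(dirpath, name)
                if converter(path):
                    changed.append(path)
    return changed
```

Any function that takes a path and returns a true value when it rewrote the file can be passed as converter. Because files are overwritten in place, it is safer to try this on a copy of the directory first.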

  • Last modified: 2019/10/21 11:51