Reproducing the problem

open(file, encoding='utf-8')

The first character read from the file is \ufeff.

Cause

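The file is saved as UTF-8 with BOM (the utf-8-sig codec in Python), which is not the same as plain UTF-8. When the file is opened with encoding='utf-8', the BOM is not stripped and shows up as the first character: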

As UTF-8 is an 8-bit encoding no BOM is required and any U+FEFF character in the decoded Unicode string (even if it’s the first character) is treated as a ZERO WIDTH NO-BREAK SPACE.
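The difference between the two codecs can be seen without any file on disk. The following is a minimal sketch: the byte string `raw` is a made-up stand-in for the contents of a file saved as UTF-8 with BOM, and the variable names are illustrative only.

```python
import codecs

# Stand-in for the bytes of a file saved as "UTF-8 with BOM":
# the 3-byte BOM (EF BB BF) followed by the UTF-8 encoded text.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain 'utf-8' codec keeps the BOM as a leading U+FEFF character.
as_utf8 = raw.decode("utf-8")

# The 'utf-8-sig' codec removes a leading BOM if one is present.
as_utf8_sig = raw.decode("utf-8-sig")

print(repr(as_utf8))      # '\ufeffhello'
print(repr(as_utf8_sig))  # 'hello'
```

The same behavior applies to open(), since its encoding argument selects the codec used to decode the file's bytes.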

Solutions

  • Convert the file to plain UTF-8 (without a BOM); this requires re-encoding the file itself.

  • Open the file with the utf-8-sig encoding, which strips a leading BOM when reading:

    open(file, encoding='utf-8-sig')
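For the first option (converting the file itself), one possible approach is to drop the three BOM bytes in place. This is only a sketch: `strip_bom` is a hypothetical helper name, not a standard library function, and it assumes the file is small enough to read into memory.

```python
import codecs
from pathlib import Path


def strip_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, if there is one.

    Hypothetical helper for illustration. Returns True if the file was
    rewritten, False if it had no BOM and was left untouched.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        # Write back everything after the 3-byte BOM; the rest is unchanged.
        p.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Working on raw bytes leaves line endings and all other content exactly as they were. After the conversion, open(file, encoding='utf-8') no longer returns a leading \ufeff.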