The first character read from a file is \ufeff
Reproducing the problem
open(file, encoding='utf-8')
The first character of the text that comes back is \ufeff.
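The behaviour can be reproduced without an existing file. The following is a minimal sketch, assuming a temporary file whose sample content ("hello") is made up for illustration; it writes the three BOM bytes by hand and then reads the file back as plain utf-8.

```python
import os
import tempfile

# Create a small file that starts with the UTF-8 BOM bytes (EF BB BF).
# The content "hello" is only sample data for this sketch.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"\xef\xbb\xbfhello")
    path = tmp.name

# Plain 'utf-8' does not strip the BOM, so it shows up as the first character.
with open(path, encoding="utf-8") as f:
    text = f.read()

print(repr(text[0]))  # '\ufeff'

os.remove(path)
```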
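Cause
The file is saved as UTF-8 with BOM (utf-8-sig in Python), which is not the same as plain UTF-8. The file begins with the BOM bytes, and the utf-8 codec decodes them as an ordinary character instead of discarding them.
As UTF-8 is an 8-bit encoding no BOM is required and any U+FEFF character in the decoded Unicode string (even if it’s the first character) is treated as a ZERO WIDTH NO-BREAK SPACE.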
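The difference between the two codecs is visible at the byte level. This is a small sketch using the standard codecs module; the payload "abc" is arbitrary sample data.

```python
import codecs

bom = codecs.BOM_UTF8               # the three bytes b'\xef\xbb\xbf'
raw = bom + "abc".encode("utf-8")   # what a "UTF-8 with BOM" file contains

as_utf8 = raw.decode("utf-8")       # BOM is kept and decoded as U+FEFF
as_sig = raw.decode("utf-8-sig")    # a leading BOM is dropped

print(repr(as_utf8))  # '\ufeffabc'
print(repr(as_sig))   # 'abc'
```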
Solution
Convert the file to plain UTF-8 (without BOM). This works, but it means re-encoding the file itself.
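If re-encoding the file is acceptable, the conversion can be done in Python rather than in an editor. A minimal sketch, assuming the file fits in memory; strip_bom and its parameter names are hypothetical and not part of any library.

```python
def strip_bom(src_path, dst_path):
    """Rewrite src_path as plain UTF-8 (no BOM) at dst_path.

    Hypothetical helper for illustration; reads the whole file into memory.
    """
    # 'utf-8-sig' drops a leading BOM if present; newline="" keeps line endings as-is.
    with open(src_path, encoding="utf-8-sig", newline="") as src:
        content = src.read()
    # Plain 'utf-8' never writes a BOM.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(content)
```

Writing to a separate destination path keeps the original file untouched until the result has been checked.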
Alternatively, open the file with the utf-8-sig encoding:
open(file, encoding='utf-8-sig')
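When reading, utf-8-sig also accepts files that have no BOM, so it is a reasonable default when the origin of a file is unknown. The sketch below checks both cases; read_text and write_temp are hypothetical helpers and "hello" is sample data.

```python
import os
import tempfile

def read_text(path):
    """Read a UTF-8 text file, ignoring a leading BOM if there is one."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()

def write_temp(data):
    """Hypothetical helper: write raw bytes to a temp file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        return tmp.name

with_bom = write_temp(b"\xef\xbb\xbfhello")
without_bom = write_temp(b"hello")

result_with_bom = read_text(with_bom)        # BOM stripped
result_without_bom = read_text(without_bom)  # nothing to strip, read as-is

os.remove(with_bom)
os.remove(without_bom)
```

Note that writing with utf-8-sig adds a BOM, so plain utf-8 is the better choice for output when no BOM is wanted.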
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit AlMirai when reposting!