How do I fix the "Excel file format cannot be determined" error when importing Excel files with pandas and glob?

An uncommon engine-specification problem when importing Excel files with pandas and glob

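When reading Excel files with the pandas library and the glob module, pandas may fail early with the error "Excel file format cannot be determined, you must specify an engine manually." The cause is a hidden temporary file named ~$filename.xlsx, which MS Excel creates in the same directory whenever an .xlsx file is open. A glob pattern such as *.xlsx matches this file too, and because it is not a real workbook, pandas cannot detect its format.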

There are two workable solutions:

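1. Close every open Excel file and delete any leftover hidden temporary files (~$filename.xlsx). The code then runs normally, with no engine-specification error.
2. Specify the engine argument explicitly. For most .xlsx files, the openpyxl engine is a good choice. If you then get the error "BadZipFile: File is not a zip file", the file is not a genuine .xlsx (typically it is a legacy .xls workbook with the wrong extension), and the xlrd engine is needed to read it.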
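As an alternative to guessing the engine, the first bytes of a file can hint at which engine applies. The sketch below is a rough heuristic, not something pandas documents for this purpose: `guess_engine` is a hypothetical helper that relies only on the well-known ZIP and OLE2 file signatures.

```python
# File signatures ("magic numbers") used to tell the two Excel formats apart.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP archive, used by .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 container, used by .xls


def guess_engine(path):
    """Return a pandas engine name based on the file's first bytes.

    Hypothetical helper: returns "openpyxl" for ZIP-based files, "xlrd" for
    OLE2-based files, and None when neither signature matches (for example,
    an Excel ~$ lock file).
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "openpyxl"
    if head.startswith(OLE2_MAGIC):
        return "xlrd"
    return None
```

A caller could pass the result on as `pd.read_excel(path, engine=guess_engine(path))` and skip any file for which the helper returns `None`.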

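To avoid the problem in the first place, close all open Excel files and delete any hidden ~$filename.xlsx temporary files before reading Excel files from a folder. With those files gone, pandas and glob can import the folder without this error.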
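When closing Excel is not practical, a script can also skip the temporary files by name. The following is a minimal sketch under that assumption: `is_excel_lock_file` and `list_workbooks` are hypothetical helper names, and the `*.xlsx` pattern is only an example.

```python
import glob
import os


def is_excel_lock_file(path):
    """Return True if the file name looks like an Excel lock file (~$name.xlsx)."""
    return os.path.basename(path).startswith("~$")


def list_workbooks(folder, pattern="*.xlsx"):
    """Return the paths matching the pattern, leaving out Excel lock files."""
    matches = glob.glob(os.path.join(folder, pattern))
    return sorted(p for p in matches if not is_excel_lock_file(p))
```

The paths returned by `list_workbooks` can then be looped over and passed to `pd.read_excel` one at a time, so a workbook that happens to be open in Excel no longer breaks the import.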

Updated code:

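import pandas as pd
from zipfile import BadZipFile

# f is assumed to be one path returned by glob; customer_id is the list of columns to keep
# Specify the engine as a string
df = pd.read_excel(f, engine="openpyxl").reindex(columns=customer_id).dropna(how='all', axis=1)

# If a BadZipFile error occurs, fall back to xlrd
try:
    df = pd.read_excel(f, engine="openpyxl").reindex(columns=customer_id).dropna(how='all', axis=1)
except BadZipFile:
    df = pd.read_excel(f, engine="xlrd").reindex(columns=customer_id).dropna(how='all', axis=1)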
