Reading docx file contents with FastAPI

Published: 2024-04-21

Packages to install

python-docx==1.1.0
fastapi==0.110.0
uvicorn==0.22.0
python-multipart==0.0.9

Reading a docx file

from docx import Document

# Open the document
document = Document("dark-and-stormy.docx")
# Read the text of the first paragraph (index 0)
print(document.paragraphs[0].text)

Starting FastAPI

from fastapi import FastAPI
import uvicorn

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Hands-on example

Goal: upload a file and read its contents.
Requirement: do not store the uploaded file on the server; return the docx content directly.

from tempfile import NamedTemporaryFile

import uvicorn
from docx import Document
from fastapi import FastAPI, File, UploadFile

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}


@app.post("/docx")
async def read_docx(docx: UploadFile = File()):
    # Create a temporary file; it is deleted automatically when closed
    with NamedTemporaryFile(delete=True) as temp_file:
        temp_file.write(await docx.read())
        temp_file.flush()
        # Rewind and open the document from the open file object
        temp_file.seek(0)
        document = Document(temp_file)
    # Concatenate the paragraph texts (adapt this to your own needs)
    return "".join(para.text for para in document.paragraphs)


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)