AI Development - Python - LangChain Framework (1-12: Returning JSON with the Format Parser)

窟聿湎 · 2026-2-6 13:55:01

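Here is the key point. JSON is the most common data format in development today, and it is used especially widely for front-end/back-end communication. So how do we get a large language model to return data as standard JSON?
Consider the following code: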

import os

from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field


# Define the data structure you want back.
class Book(BaseModel):
    title: str = Field(description="书名")
    author: str = Field(description="作者")
    description: str = Field(description="书的简介")


# Set up a parser and get the format instructions it generates from the schema.
output_parser = JsonOutputParser(pydantic_object=Book)
format_instructions = output_parser.get_format_instructions()
print('原版提示词')  # "Original format instructions"
print(format_instructions)
print('#############')

# Replace the generated instructions with a hand-written Chinese version.
# The sample uses double quotes because JSON does not allow single-quoted strings.
format_instructions = '''输出应格式化为符合以下 JSON 结构的 JSON 实例。
JSON结构
```
{
"title": "书的标题",
"author": "作者",
"description": "书的简介"
}
```
'''

prompt = PromptTemplate(
    template="{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

# Initialize the chat model (DeepSeek through an OpenAI-compatible API).
llm = ChatOpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),  # read the API key from an environment variable
    base_url=os.getenv("BASE_URL"),  # read the API base URL from an environment variable (e.g. https://api.deepseek.com)
    model="deepseek-v3:671b",  # model version to use
    temperature=0.7,  # sampling randomness; 0.7 is moderately creative
    max_tokens=1024,  # maximum number of tokens per response
)

chain = prompt | llm | output_parser
print('--------------')

# A query intended to prompt the language model to populate the data structure.
query = "请给我介绍2本学习中国历史的经典书籍"
result = chain.invoke({"query": query})
print(result)

# Streaming output
# for s in chain.stream({"query": query}):
#     print(s)
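To see the exact text the model receives, here is a minimal sketch of what the prompt template does at invoke time. It uses only Python's built-in str.format rather than LangChain, the helper name build_prompt is made up for illustration, and the format instructions are shortened to a single line.

```python
# Stand-in for PromptTemplate: fill the two placeholders by hand.
# build_prompt is a hypothetical helper, not a LangChain function.
TEMPLATE = "{format_instructions}\n{query}\n"

# Shortened version of the Chinese format instructions from the post.
FORMAT_INSTRUCTIONS = (
    "输出应格式化为符合以下 JSON 结构的 JSON 实例。\n"
    '{"title": "书的标题", "author": "作者", "description": "书的简介"}'
)


def build_prompt(query: str) -> str:
    """Return the final prompt text that would be sent to the model."""
    # Braces inside FORMAT_INSTRUCTIONS are safe here: str.format only
    # interprets braces in the template, not in the substituted values.
    return TEMPLATE.format(format_instructions=FORMAT_INSTRUCTIONS, query=query)


prompt_text = build_prompt("请给我介绍2本学习中国历史的经典书籍")
print(prompt_text)
```

The format instructions come first and the user's question last, which is the same order the template in the post uses.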


Output:

原版提示词
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
```
{"properties": {"title": {"title": "Title", "description": "\u4e66\u540d", "type": "string"}, "author": {"title": "Author", "description": "\u4f5c\u8005", "type": "string"}, "description": {"title": "Description", "description": "\u4e66\u7684\u7b80\u4ecb", "type": "string"}}, "required": ["title", "author", "description"]}
```
#############
--------------
[{'title': '中国通史', 'author': '吕思勉', 'description': '《中国通史》是吕思勉先生的代表作之一，系统全面地介绍了中国从远古时代到近代的历史发展脉络。该书内容详实，分析深入，是学习中国历史的经典入门书籍。'}, {'title': '万历十五年', 'author': '黄仁宇', 'description': '《万历十五年》是黄仁宇先生的经典著作，以明朝万历十五年为切入点，通过细致入微的历史分析，展现了当时社会的政治、经济和文化状况。该书视角独特，文笔流畅，深受读者喜爱。'}]


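Isn't this exactly the structured data we wanted? The model's JSON reply has been parsed into Python objects carrying the three requested fields.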
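One detail worth checking: what print shows above is a Python list of dicts, not JSON text, which is why the quotes are single. To hand the result to a front end it has to be serialized again. The sketch below uses only the standard json module; the result value is a shortened copy of the output above, and the variable names are illustrative.

```python
import json

# Shortened copy of the parsed result shown above (descriptions trimmed).
result = [
    {"title": "中国通史", "author": "吕思勉", "description": "..."},
    {"title": "万历十五年", "author": "黄仁宇", "description": "..."},
]

# ensure_ascii=False keeps Chinese characters readable in the JSON text.
json_text = json.dumps(result, ensure_ascii=False, indent=2)
print(json_text)

# The default (ensure_ascii=True) emits \uXXXX escapes, which is why the
# schema printed earlier shows "\u4e66\u540d" instead of 书名.
escaped = json.dumps("书名")
print(escaped)

# Both forms decode back to the same Python values.
assert json.loads(json_text) == result
assert json.loads(escaped) == "书名"
```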
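The core of the code above is a complete pipeline of defining a data structure, building a prompt, calling the model, and parsing the output, which together steer the model toward returning JSON in a specified format.
First, a Book class is defined with Pydantic's BaseModel, declaring the three fields title, author, and description along with their meanings. This serves as the blueprint for the JSON output.
Next, JsonOutputParser is bound to that data structure (the key point). This lets it generate format instructions automatically and later parse the model's reply into Python objects. Note that JsonOutputParser only parses JSON and does not validate the result against Book; PydanticOutputParser is the variant that validates. The example then replaces the generated instructions with a hand-written Chinese version, so the text the model actually sees comes from that string rather than from the Book schema, and the two have to be kept in sync by hand.
PromptTemplate then joins the format requirements and the user query into a single prompt that tells the model to return a JSON instance matching the structure.
ChatOpenAI is initialized against a third-party, OpenAI-compatible endpoint with temperature set to 0.7, a moderate level of randomness. A lower temperature usually makes the output format more stable, so it is worth reducing if the model drifts away from the JSON structure.
Finally, the LangChain chain (prompt | llm | output_parser) automates the whole flow of prompt assembly → model generation → JSON parsing.
The end result is a Python object that can be used directly. Here it is a list of two dictionaries, because the query asks for two books even though the schema describes a single one. The structure definition, the prompt guidance, and the output parser work together to make the model's JSON output reliable, though not guaranteed: if the model returns malformed JSON, the parser raises an error.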
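To make the parsing step concrete, here is a rough standard-library sketch of the kind of work an output parser does: strip an optional Markdown code fence from the model's reply, decode the JSON, and then check the fields. It is not LangChain's implementation. The helpers extract_json and check_books are hypothetical, the reply string is made up, and the field check is an extra step added here for illustration, since decoding alone does not confirm that the expected keys are present.

```python
import json
import re

REQUIRED_FIELDS = ("title", "author", "description")

# Three backticks, built indirectly so this listing nests cleanly.
FENCE = "`" * 3
# Matches an optional fenced block (with or without a "json" tag).
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json(reply: str):
    """Decode JSON from a model reply, with or without a Markdown fence."""
    match = FENCE_RE.search(reply)
    payload = match.group(1) if match else reply.strip()
    return json.loads(payload)  # raises json.JSONDecodeError if malformed


def check_books(data) -> list:
    """Return a list of book dicts, raising ValueError on an unexpected shape."""
    books = data if isinstance(data, list) else [data]
    for book in books:
        if not isinstance(book, dict):
            raise ValueError(f"expected an object, got {type(book).__name__}")
        missing = [f for f in REQUIRED_FIELDS if not isinstance(book.get(f), str)]
        if missing:
            raise ValueError(f"missing or non-string fields: {missing}")
    return books


# A made-up reply in the style a chat model might produce.
reply = (
    FENCE + "json\n"
    '[{"title": "万历十五年", "author": "黄仁宇", "description": "..."}]\n'
    + FENCE
)
books = check_books(extract_json(reply))
print(books[0]["title"])
```

Accepting either a single object or a list mirrors what happened in the post, where a one-book schema still produced a two-item list.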

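Source: submitted and published by a 程序园 user. If it infringes your rights, please contact the site administrator to have it removed.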

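凤患更 · 2026-2-8 16:27:50

Leaving my name up front, haha.

铝缉惹 · 2026-2-9 23:11:24

Nice. It would be even better if the software in it were updated more often.

丝甲坞 · 2026-2-10 05:37:02

I like tinkering with this kind of software, though I don't use it much these days. Thanks for sharing!

贼瘁 · 2026-2-10 13:48:23

Bookmarking this; not sure when I'll get to use it.

栓州 · 2026-2-10 17:32:54

I like tinkering with this kind of software, though I don't use it much these days. Thanks for sharing!

别萧玉 · 2026-2-11 13:13:00

Nice. It would be even better if the software in it were updated more often.

俞秋荣 · 2026-2-12 05:09:01

The GOAT. Thanks a lot for sharing.

睿哝 · 25 minutes ago

Thanks for sharing, I'll study this.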
