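In agent development, a tool schema is the "operating manual" you write for the model. A mediocre schema merely defines an API interface; a good one can markedly lower the model's hallucination rate, improve the accuracy of parameter filling, and implicitly carry chain-of-thought (CoT) guidance.
This appendix collects more than 20 production-tested JSON Schema templates in 5 categories. They are designed for strong models such as GPT-4o and Claude 3.5 Sonnet, with particular attention to multimodal perception and the details of engineering deployment.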
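Golden Rules of Schema Design:
Constrain choices with enum; this physically blocks the model from generating options that do not exist.
Search tools go beyond a simple query: the aim is a complete "search, browse, extract" pipeline.
domain_filter and search_intent guide the model to think about the purpose of a search before running it.
{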
"name": "web_search_advanced",
"description": "Performs a web search using a search engine. Optimized for retrieving factual information, technical documentation, or news.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query. Include specific keywords to narrow down results."
},
"search_intent": {
"type": "string",
"enum": ["informational", "navigational", "troubleshooting", "academic"],
"description": "The intent behind the search to optimize ranking algorithms."
},
"time_range": {
"type": "string",
"enum": ["any", "past_24h", "past_week", "past_month", "past_year"],
"description": "Filter results by publication date."
},
"domain_filter": {
"type": "array",
"items": { "type": "string" },
"description": "Optional list of domains to include (e.g., ['github.com', 'stackoverflow.com']) or exclude (prefix with -)."
}
},
"required": ["query"]
}
}
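The include/exclude convention in `domain_filter` has to be unpacked somewhere on the backend. Below is a minimal sketch, assuming the search backend accepts `site:` and `-site:` operators in the query string. That is an assumption, since the schema above does not name an engine, and both helper names are hypothetical.

```python
from typing import List, Tuple


def split_domain_filter(domain_filter: List[str]) -> Tuple[List[str], List[str]]:
    """Split entries into (include, exclude); a leading '-' marks an exclusion."""
    include, exclude = [], []
    for raw in domain_filter or []:
        entry = raw.strip().lower()
        if entry in ("", "-"):
            continue  # ignore empty entries instead of failing the whole call
        if entry.startswith("-"):
            exclude.append(entry[1:])
        else:
            include.append(entry)
    return include, exclude


def to_query_suffix(domain_filter: List[str]) -> str:
    """Render the filter as 'site:' operators (assumed to be supported by the backend)."""
    include, exclude = split_domain_filter(domain_filter)
    parts = []
    if include:
        parts.append("(" + " OR ".join(f"site:{d}" for d in include) + ")")
    parts.extend(f"-site:{d}" for d in exclude)
    return " ".join(parts)
```

Since `domain_filter` is optional, the helper treats `None` and an empty list the same way and returns an empty suffix.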
{
"name": "read_webpage_content",
"description": "Scrapes and parses the content of a specific URL. Can return text structure or visual snapshots.",
"parameters": {
"type": "object",
"properties": {
"url": { "type": "string" },
"mode": {
"type": "string",
"enum": ["markdown_text", "screenshot_full", "screenshot_viewport"],
"description": "Use 'markdown_text' for articles. Use 'screenshot_full' or 'screenshot_viewport' for dashboards, charts, or complex layouts."
},
"wait_for_selector": {
"type": "string",
"description": "Optional CSS selector to wait for (dynamic loading) before scraping."
}
},
"required": ["url", "mode"]
}
}
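Because `mode` is both required and an `enum`, a cheap server-side check can reject a bad call before any browser is launched. The sketch below is a hand-rolled check over a trimmed copy of the schema above, not a full JSON Schema validator; `check_required_and_enums` is a hypothetical helper name.

```python
# Trimmed copy of the read_webpage_content schema, keeping only what the check needs.
READ_WEBPAGE_SCHEMA = {
    "name": "read_webpage_content",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {"type": "string"},
            "mode": {
                "type": "string",
                "enum": ["markdown_text", "screenshot_full", "screenshot_viewport"],
            },
            "wait_for_selector": {"type": "string"},
        },
        "required": ["url", "mode"],
    },
}


def check_required_and_enums(schema: dict, args: dict) -> list:
    """Return a list of human-readable problems; an empty list means the call passed."""
    params = schema["parameters"]
    problems = []
    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unknown parameter: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}={value!r} is not one of {spec['enum']}")
    return problems
```

Returning the problem list to the model as the tool result gives it a concrete message to correct against, which tends to be more useful than a bare failure.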
These tools are the "eyes" of a multimodal agent. The focus is on saving tokens and directing attention to what matters.
{
"name": "inspect_pdf_page",
"description": "Renders a specific page of a PDF document into an image for visual analysis (VLM).",
"parameters": {
"type": "object",
"properties": {
"file_path": { "type": "string" },
"page_number": {
"type": "integer",
"description": "1-based page index."
},
"zoom_level": {
"type": "number",
"default": 1.0,
"description": "Zoom factor (1.0 to 3.0). Use >1.5 to read small footnotes or dense tables."
},
"region_of_interest": {
"type": "array",
"items": { "type": "integer" },
"minItems": 4,
"maxItems": 4,
"description": "Optional [ymin, xmin, ymax, xmax] (0-1000) to crop specific area instead of full page."
}
},
"required": ["file_path", "page_number"]
}
}
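The `region_of_interest` box is expressed on a 0-1000 scale, so the backend has to map it to pixels before cropping. Here is a minimal sketch, assuming the scale is normalized to the rendered page's width and height (the schema does not say so explicitly); `roi_to_pixel_box` is a hypothetical helper.

```python
import math


def roi_to_pixel_box(roi, width_px, height_px):
    """Convert [ymin, xmin, ymax, xmax] on a 0-1000 scale to (left, top, right, bottom) pixels."""
    if len(roi) != 4:
        raise ValueError("region_of_interest needs exactly 4 integers")
    ymin, xmin, ymax, xmax = roi
    if not all(0 <= v <= 1000 for v in roi):
        raise ValueError("coordinates must lie in the range 0-1000")
    if ymin >= ymax or xmin >= xmax:
        raise ValueError("min must be smaller than max on both axes")
    # Round outward so the crop never cuts off part of the requested region.
    left = math.floor(xmin * width_px / 1000)
    top = math.floor(ymin * height_px / 1000)
    right = math.ceil(xmax * width_px / 1000)
    bottom = math.ceil(ymax * height_px / 1000)
    return (left, top, right, bottom)
```

The returned tuple uses the (left, top, right, bottom) order, which is the same order Pillow's `Image.crop` expects. Note that the schema lists y before x, which is easy to swap by accident.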
{
"name": "sample_video_frames",
"description": "Extracts frames from a video file based on a sampling strategy.",
"parameters": {
"type": "object",
"properties": {
"video_path": { "type": "string" },
"strategy": {
"type": "string",
"enum": ["uniform_interval", "scene_change_detection", "timestamp_list"],
"description": "'uniform_interval': every N seconds. 'scene_change_detection': smart detection. 'timestamp_list': specific moments."
},
"interval_seconds": {
"type": "number",
"description": "Required if strategy is 'uniform_interval'."
},
"timestamps": {
"type": "array",
"items": { "type": "number" },
"description": "Required if strategy is 'timestamp_list'."
}
},
"required": ["video_path", "strategy"]
}
}
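The "Required if strategy is ..." rules above live only in the descriptions, so the `required` list cannot enforce them. One option is to check the dependency in the handler. The sketch below is an assumption about how a backend might do that; `validate_sampling_args` is a hypothetical name.

```python
def validate_sampling_args(args: dict) -> list:
    """Check the strategy-dependent parameters; an empty list means the call is usable."""
    problems = []
    strategy = args.get("strategy")
    if strategy == "uniform_interval":
        interval = args.get("interval_seconds")
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(interval, (int, float)) and not isinstance(interval, bool)
        if not is_number or interval <= 0:
            problems.append(
                "interval_seconds must be a positive number when strategy is 'uniform_interval'"
            )
    elif strategy == "timestamp_list":
        timestamps = args.get("timestamps")
        if not timestamps:
            problems.append(
                "timestamps must be a non-empty list when strategy is 'timestamp_list'"
            )
        elif any(t < 0 for t in timestamps):
            problems.append("timestamps must not be negative")
    elif strategy != "scene_change_detection":
        problems.append(f"unknown strategy: {strategy!r}")
    return problems
```

Keeping the schema flat and putting the dependency in code is a design choice: the model sees a simple parameter list, and the error message tells it which field to add on retry.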
Tools in this category carry the highest risk; the schema must include safety constraints and precise file targeting.
Use the search_block + replace_block pattern, which is friendlier to LLMs than a unified diff.
{
"name": "edit_source_code",
"description": "Edits a file by replacing a specific block of text. Locate the unique block first.",
"parameters": {
"type": "object",
"properties": {
"file_path": { "type": "string" },
"search_block": {
"type": "string",
"description": "The exact unique text block to be replaced. Must match the file content character-by-character."
},
"replace_block": {
"type": "string",
"description": "The new code block to insert."
}
},
"required": ["file_path", "search_block", "replace_block"]
}
}
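The description asks for a unique, character-exact `search_block`, and the handler is where that rule can be enforced. Below is a minimal sketch that works on an in-memory string; reading and writing the file at `file_path` is left out, and `apply_edit` is a hypothetical helper name.

```python
def apply_edit(source: str, search_block: str, replace_block: str) -> str:
    """Replace search_block with replace_block, but only if it occurs exactly once."""
    if not search_block:
        raise ValueError("search_block must not be empty")
    count = source.count(search_block)
    if count == 0:
        raise ValueError(
            "search_block not found; re-read the file and copy the text exactly"
        )
    if count > 1:
        raise ValueError(
            f"search_block matches {count} locations; include more surrounding lines"
        )
    return source.replace(search_block, replace_block, 1)
```

Refusing ambiguous matches, rather than editing the first one, keeps a wrong edit from landing silently; the error text tells the model how to repair its next call.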
{
"name": "generate_repo_map",
"description": "Generates a tree structure or dependency graph of the codebase.",
"parameters": {
"type": "object",
"properties": {
"root_dir": {
"type": "string",
"default": "."
},
"depth": {
"type": "integer",
"default": 2,
"description": "Depth of the directory tree traversal."
},
"include_definitions": {
"type": "boolean",
"description": "If true, lists class and function names (signatures) alongside filenames."
}
},
"required": ["root_dir"]
}
}
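To make the `root_dir` and `depth` parameters concrete, here is one possible backend for the tree output. It is a sketch only: it skips dot-prefixed entries (an assumption, not something the schema states) and does not implement `include_definitions`; `build_tree` is a hypothetical name.

```python
import os


def build_tree(root_dir: str = ".", depth: int = 2) -> list:
    """Return indented lines for entries at most `depth` levels below root_dir."""
    lines = []

    def walk(path: str, level: int) -> None:
        if level > depth:
            return
        for name in sorted(os.listdir(path)):
            if name.startswith("."):
                continue  # assumption: hidden files and directories are noise
            full = os.path.join(path, name)
            is_dir = os.path.isdir(full)
            lines.append("  " * (level - 1) + name + ("/" if is_dir else ""))
            if is_dir:
                walk(full, level + 1)

    walk(root_dir, 1)
    return lines
```

A small default `depth` keeps the result short enough to fit in the context window; the agent can call the tool again on a subdirectory when it needs more detail.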
The key here is how the action primitives are defined.
{
"name": "move_robot_arm",
"description": "Moves the robot's end-effector to a target pose in 3D space.",
"parameters": {
"type": "object",
"properties": {
"target_position": {
"type": "array",
"items": { "type": "number" },
"minItems": 3,
"maxItems": 3,
"description": "[x, y, z] coordinates in meters relative to robot base."
},
"target_orientation_rpy": {
"type": "array",
"items": { "type": "number" },
"minItems": 3,
"maxItems": 3,
"description": "[roll, pitch, yaw] Euler angles in degrees."
},
"motion_type": {
"type": "string",
"enum": ["linear", "joint"],
"description": "'linear' for straight line (cartesian), 'joint' for fastest path."
}
},
"required": ["target_position"]
}
}
{
"name": "navigate_to_landmark",
"description": "Navigates the mobile base to a semantic landmark visible in the map or memory.",
"parameters": {
"type": "object",
"properties": {
"landmark_name": {
"type": "string",
"description": "e.g., 'kitchen_counter', 'charging_station', 'red_sofa'."
},
"proximity_threshold": {
"type": "number",
"default": 1.0,
"description": "Distance in meters to consider 'arrived'."
}
},
"required": ["landmark_name"]
}
}
These tools give the agent long-term memory and the capacity for self-reflection.
{
"name": "save_to_long_term_memory",
"description": "Saves critical information/facts to long-term vector storage.",
"parameters": {
"type": "object",
"properties": {
"content": {
"type": "string",
"description": "The fact, code snippet, or user preference to save."
},
"tags": {
"type": "array",
"items": { "type": "string" },
"description": "Keywords for retrieval (e.g., 'project_A', 'user_preference')."
},
"importance": {
"type": "integer",
"minimum": 1,
"maximum": 5,
"description": "1=Trivial, 5=Critical/Permanent."
}
},
"required": ["content", "tags"]
}
}
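Before deploying any new tool, run a self-check against the following checklist:
1. Design a "weather query" tool. Requirements: support lookup by city name, require a date (today / tomorrow / the coming week), and allow a choice of unit (Celsius / Fahrenheit).
2. Design a "schedule management" tool. Requirements: create a meeting entry with these fields: subject, start time, duration, and a list of participants (email addresses).
3. Multimodal OCR post-processing tool. Background: tables recognized by OCR often come out badly formatted. Task: design a tool that takes a chunk of messy OCR text plus the bounding box from the original image, and lets the agent reconstruct a Markdown table.
4. Safe database querying (SQL generator vs. executor)
Task: to keep the agent from executing dangerous SQL directly, design a two-step pair of tools.
Tool A: generate_sql_plan (generates only, does not execute)
Tool B: execute_approved_sql (requires a plan_id)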
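One possible shape for the two-step pair, sketched against SQLite. Everything beyond the two tool names and `plan_id` is an assumption made for illustration: the in-memory plan store, the `approve_plan` review step, the `sql` parameter, and the naive `read_only` hint.

```python
import sqlite3
import uuid

# In-memory plan store; a real service would persist this (assumption).
_PLANS = {}


def generate_sql_plan(sql: str) -> dict:
    """Tool A: record the statement and return a plan_id. Nothing is executed."""
    plan_id = uuid.uuid4().hex
    # Naive hint for the reviewer only; it is not a security check.
    read_only = sql.lstrip().lower().startswith("select")
    _PLANS[plan_id] = {"sql": sql, "approved": False}
    return {"plan_id": plan_id, "sql": sql, "read_only": read_only}


def approve_plan(plan_id: str) -> None:
    """Hypothetical review step, called by a human or policy check, never by the agent."""
    _PLANS[plan_id]["approved"] = True


def execute_approved_sql(plan_id: str, connection: sqlite3.Connection) -> list:
    """Tool B: run a stored plan, but only after it has been approved."""
    plan = _PLANS.get(plan_id)
    if plan is None:
        raise KeyError(f"unknown plan_id: {plan_id}")
    if not plan["approved"]:
        raise PermissionError(f"plan {plan_id} has not been approved")
    return connection.execute(plan["sql"]).fetchall()
```

The point of the design is that Tool B never accepts SQL text: the agent can only pass back an identifier for a statement that has already been stored and reviewed. In a deployed tool the connection would be supplied by the backend rather than by the model.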
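Array item types: items: { "type": "string" } declares an array of strings, whereas items: { "type": "object", ... } declares an array of objects. Watch for invalid JSON such as ["name": "A"].
Default values: the schema declares default: 10, but the backend code does not handle None or null. The model sometimes passes null and sometimes omits the parameter entirely. Program defensively, for example value = args.get('key') or DEFAULT_VALUE, and do not rely solely on the schema's default annotation, because some LLMs may ignore it.
The crop tool: if the agent complains that it "can't see clearly", prompt it to use the crop tool to cut out and enlarge the relevant region.
oneOf, anyOf, and conditional dependency logic.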
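The defensive-default advice can be sketched as a small helper. One caveat with the `or` idiom is that it also replaces legitimate falsy values such as `0`, `False`, or an empty string, so the version below tests for `None` explicitly. `DEFAULT_LIMIT` and the `limit` key are hypothetical names used for illustration.

```python
DEFAULT_LIMIT = 10  # mirrors a hypothetical "default": 10 in the schema


def arg_or_default(args: dict, key: str, default):
    """Treat a missing key and an explicit null (None) the same way."""
    value = args.get(key)
    return default if value is None else value


# Missing key and explicit null both fall back to the default.
limit_when_missing = arg_or_default({}, "limit", DEFAULT_LIMIT)
limit_when_null = arg_or_default({"limit": None}, "limit", DEFAULT_LIMIT)

# A deliberate 0 is preserved here, whereas `args.get(key) or default` would discard it.
limit_when_zero = arg_or_default({"limit": 0}, "limit", DEFAULT_LIMIT)
```

If `0` is never a meaningful value for a given parameter, the shorter `or` form from the text behaves the same way and is fine.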