Chapter 11: Goal Setting and Monitoring

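For AI agents to be truly effective and purposeful, they need more than the ability to process information or use tools; they need a clear sense of direction and a way to know whether they are actually succeeding. This is where the Goal Setting and Monitoring pattern comes into play. It is about giving agents specific objectives to work towards and equipping them with the means to track their progress and determine whether those objectives have been met.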

Goal Setting and Monitoring Pattern Overview

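Think about planning a trip. You don't just spontaneously appear at your destination. You decide where you want to go (the goal state), figure out where you are starting from (the initial state), consider the available options (transportation, routes, budget), and then map out a sequence of steps: book tickets, pack bags, travel to the airport or station, board the transport, arrive, find accommodation, and so on. This step-by-step process, which often takes dependencies and constraints into account, is fundamentally what we mean by planning in agentic systems.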

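In the context of AI agents, planning typically involves an agent taking a high-level objective and autonomously, or semi-autonomously, generating a series of intermediate steps or sub-goals. These steps can then be executed sequentially or in a more complex flow, potentially involving other patterns such as Tool Use, Routing, or Multi-Agent Collaboration. The planning mechanism might involve sophisticated search algorithms, logical reasoning, or, increasingly, the capabilities of large language models (LLMs), which generate plausible and effective plans based on their training data and their understanding of the task.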

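A good planning capability allows agents to tackle problems that are not simple, single-step queries. It enables them to handle multi-faceted requests, adapt to changing circumstances by replanning, and orchestrate complex workflows. It is a foundational pattern that underpins many advanced agentic behaviors, turning a simple reactive system into one that can proactively work towards a defined goal.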
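The trip example can be sketched as a small dependency graph. This is a minimal illustration, assuming the steps and their dependencies are already known (in an agent they would typically be proposed by an LLM or a search procedure). The step names and the `plan_order` helper are hypothetical; the ordering itself uses the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter

# Hypothetical trip plan: each step maps to the set of steps it depends on.
TRIP_STEPS = {
    "book tickets": set(),
    "pack bags": set(),
    "travel to airport": {"book tickets", "pack bags"},
    "board transport": {"travel to airport"},
    "arrive": {"board transport"},
    "find accommodation": {"arrive"},
}


def plan_order(steps: dict[str, set[str]]) -> list[str]:
    """Return one execution order in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for number, step in enumerate(plan_order(TRIP_STEPS), start=1):
        print(f"{number}. {step}")
```

Steps without mutual dependencies (here, booking tickets and packing bags) may come out in either order; only the dependency constraints are guaranteed.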

Practical Applications & Use Cases

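The Goal Setting and Monitoring pattern is essential for building agents that can operate autonomously and reliably in complex, real-world scenarios. Here are some practical applications: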

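Customer Support Automation: An agent's goal might be to "resolve the customer's billing inquiry." It monitors the conversation, checks database entries, and uses tools to adjust billing. Success is monitored by confirming the billing change and receiving positive customer feedback. If the issue is not resolved, it escalates.
Personalized Learning Systems: A learning agent might have the goal to "improve the student's understanding of algebra." It monitors the student's progress on exercises, adapts teaching materials, and tracks performance metrics such as accuracy and completion time, adjusting its approach if the student struggles.
Project Management Assistants: An agent could be tasked with "ensuring project milestone X is completed by date Y." It monitors task statuses, team communications, and resource availability, flagging delays and suggesting corrective actions if the goal is at risk.
Automated Trading Bots: A trading agent's goal might be to "maximize portfolio gains while staying within risk tolerance." It continuously monitors market data, its current portfolio value, and risk indicators, executing trades when conditions align with its goals and adjusting its strategy if risk thresholds are breached.
Robotics and Autonomous Vehicles: An autonomous vehicle's primary goal is to "safely transport passengers from A to B." It constantly monitors its environment (other vehicles, pedestrians, traffic signals), its own state (speed, fuel), and its progress along the planned route, adapting its driving behavior to reach the goal safely and efficiently.
Content Moderation: An agent's goal could be to "identify and remove harmful content from platform X." It monitors incoming content, applies classification models, and tracks metrics such as false positives and false negatives, adjusting its filtering criteria or escalating ambiguous cases to human reviewers.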

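This pattern is fundamental for agents that need to operate reliably, achieve specific outcomes, and adapt to dynamic conditions, and it provides the necessary framework for intelligent self-management.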
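The use cases above share one loop: observe a metric, compare it with the goal, then continue, adjust, or escalate. The following is a minimal sketch of that decision, not a production monitor. `Goal`, `Status`, and `check_progress` are hypothetical names, the linear progress expectation is an assumption chosen for simplicity, and the milestone numbers are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACHIEVED = "achieved"
    ON_TRACK = "on_track"
    AT_RISK = "at_risk"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Goal:
    description: str
    target: float        # value the monitored metric must reach
    deadline_step: int   # last step at which the goal may still be met


def check_progress(goal: Goal, value: float, step: int) -> Status:
    """Compare the observed metric with the goal and decide what to do next."""
    if value >= goal.target:
        return Status.ACHIEVED
    if step >= goal.deadline_step:
        return Status.ESCALATE  # out of time: hand over to a human
    # Assumed linear expectation: share of the target due by this step.
    expected = goal.target * step / goal.deadline_step
    return Status.ON_TRACK if value >= expected else Status.AT_RISK


if __name__ == "__main__":
    goal = Goal("Complete milestone X (10 tasks)", target=10, deadline_step=5)
    completed_per_day = [2, 4, 5, 6, 7]  # made-up observations
    for day, done in enumerate(completed_per_day, start=1):
        print(day, done, check_progress(goal, done, day).value)
```

In a real agent the `AT_RISK` result would trigger a corrective action (replanning, reallocating resources), while `ESCALATE` would hand the decision to a person.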

Hands-On Code Example

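To illustrate the Goal Setting and Monitoring pattern, we have an example using LangChain and the OpenAI APIs. This Python script outlines an autonomous AI agent designed to generate and refine Python code. Its core function is to produce a solution to a specified problem while ensuring adherence to user-defined quality benchmarks.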

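It employs a "goal setting and monitoring" pattern: rather than generating code once, it enters an iterative cycle of creation, self-evaluation, and improvement.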

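The agent's success is measured by its own AI-driven judgment of whether the generated code meets the initial objectives. The final output is a polished, commented, ready-to-use Python file that represents the culmination of this refinement process.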

Dependencies:

bash

pip install langchain_openai openai python-dotenv

The .env file contains the OPENAI_API_KEY key.

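You can best understand this script by imagining it as an autonomous AI programmer assigned to a project (see Fig. 1). The process begins when you hand the AI a detailed project brief, which is the specific coding problem it needs to solve.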

python

# MIT License
# Copyright (c) 2025 Mahtab Syed
# https://www.linkedin.com/in/mahtabsyed/

"""
Hands-On Code Example - Iteration 2
- To illustrate the Goal Setting and Monitoring pattern, we have an
example using LangChain and OpenAI APIs:
Objective: Build an AI Agent which can write code for a specified
use case based on specified goals:
- Accepts a coding problem (use case) in code or can be as input.
- Accepts a list of goals (e.g., "simple", "tested", "handles edge
cases") in code or can be input.
- Uses an LLM (like GPT-4o) to generate and refine Python code
until the goals are met. (I am using max 5 iterations, this could
be based on a set goal as well)
- To check if we have met our goals I am asking the LLM to judge
this and answer just True or False which makes it easier to stop
the iterations.
- Saves the final code in a .py file with a clean filename and a
header comment.
"""

import os
import random
import re
from pathlib import Path
from langchain_openai import ChatOpenAI

from dotenv import load_dotenv, find_dotenv
# 🔐 Load environment variables
_ = load_dotenv(find_dotenv())
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    # ❌
    raise EnvironmentError(" Please set the OPENAI_API_KEY environment variable.")
# ✅
# Initialize OpenAI model
# 📡
print(" Initializing OpenAI LLM (gpt-4o)...")
llm = ChatOpenAI(
    model="gpt-4o",  # If you don't have access to gpt-4o, use another OpenAI model
    temperature=0.3,
    openai_api_key=OPENAI_API_KEY,
)

# --- Utility Functions ---

def generate_prompt(
    use_case: str, goals: list[str], previous_code: str = "", feedback: str = ""
) -> str:
    # 📝
    print(" Constructing prompt for code generation...")
    base_prompt = f"""
You are an AI coding agent. Your job is to write Python code based on the following use case:
Use Case: {use_case}

Your goals are:
{chr(10).join(f"- {g.strip()}" for g in goals)}
"""
    if previous_code:
        # 🔄
        print(" Adding previous code to the prompt for refinement.")
        base_prompt += f"\nPreviously generated code:\n{previous_code}"
    if feedback:
        # 📋
        print(" Including feedback for revision.")
        base_prompt += f"\nFeedback on previous version:\n{feedback}\n"
    base_prompt += "\nPlease return only the revised Python code. Do not include comments or explanations outside the code."
    return base_prompt

def get_code_feedback(code: str, goals: list[str]):  # returns the LLM message object; callers read .content
    # 🔍
    print(" Evaluating code against the goals...")
    feedback_prompt = f"""
You are a Python code reviewer. A code snippet is shown below.
Based on the following goals:
{chr(10).join(f"- {g.strip()}" for g in goals)}

Please critique this code and identify if the goals are met.
Mention if improvements are needed for clarity, simplicity, correctness, edge case handling, or test coverage.

Code:
{code}
"""
    return llm.invoke(feedback_prompt)

def goals_met(feedback_text: str, goals: list[str]) -> bool:
    """
    Uses the LLM to evaluate whether the goals have been met based
    on the feedback text.
    Returns True or False (parsed from LLM output).
    """
    review_prompt = f"""
You are an AI reviewer.
Here are the goals:
{chr(10).join(f"- {g.strip()}" for g in goals)}

Here is the feedback on the code:
\"\"\"
{feedback_text}
\"\"\"

Based on the feedback above, have the goals been met?
Respond with only one word: True or False.
"""
    response = llm.invoke(review_prompt).content.strip().lower()
    # Tolerate trailing punctuation or extra words, e.g. "True."
    return response.startswith("true")

def clean_code_block(code: str) -> str:
    lines = code.strip().splitlines()
    if lines and lines[0].strip().startswith("```"):
        lines = lines[1:]
    if lines and lines[-1].strip() == "```":
        lines = lines[:-1]
    return "\n".join(lines).strip()

def add_comment_header(code: str, use_case: str) -> str:
    comment = f"# This Python program implements the following use case:\n# {use_case.strip()}\n"
    return comment + "\n" + code

def to_snake_case(text: str) -> str:
    text = re.sub(r"[^a-zA-Z0-9 ]", "", text)
    return re.sub(r"\s+", "_", text.strip().lower())

def save_code_to_file(code: str, use_case: str) -> str:
    # 💾
    print(" Saving final code to file...")
    summary_prompt = (
        f"Summarize the following use case into a single lowercase word or phrase, "
        f"no more than 10 characters, suitable for a Python filename:\n\n{use_case}"
    )
    raw_summary = llm.invoke(summary_prompt).content.strip()
    short_name = re.sub(r"[^a-zA-Z0-9_]", "", raw_summary.replace(" ", "_").lower())[:10]
    random_suffix = str(random.randint(1000, 9999))
    filename = f"{short_name}_{random_suffix}.py"
    filepath = Path.cwd() / filename
    with open(filepath, "w") as f:
        f.write(code)
    # ✅
    print(f" Code saved to: {filepath}")
    return str(filepath)

# --- Main Agent Function ---

def run_code_agent(use_case: str, goals_input: str, max_iterations: int = 5) -> str:
    goals = [g.strip() for g in goals_input.split(",")]
    # 🎯
    print(f"\n Use Case: {use_case}")
    # 🎯
    print(" Goals:")
    for g in goals:
        print(f" - {g}")
    previous_code = ""
    feedback = ""
    for i in range(max_iterations):
        # 🔁
        print(f"\n=== Iteration {i + 1} of {max_iterations} ===")
        prompt = generate_prompt(use_case, goals, previous_code, feedback if isinstance(feedback, str) else feedback.content)
        # 🚧
        print(" Generating code...")
        code_response = llm.invoke(prompt)
        raw_code = code_response.content.strip()
        code = clean_code_block(raw_code)
        # 🧾
        print("\n Generated Code:\n" + "-" * 50 + f"\n{code}\n" + "-" * 50)
        # 📤
        print("\n Submitting code for feedback review...")
        feedback = get_code_feedback(code, goals)
        feedback_text = feedback.content.strip()
        # 📥
        print("\n Feedback Received:\n" + "-" * 50 + f"\n{feedback_text}\n" + "-" * 50)
        if goals_met(feedback_text, goals):
            # ✅
            print(" LLM confirms goals are met. Stopping iteration.")
            break
        # 🛠
        print(" Goals not fully met. Preparing for next iteration...")
        previous_code = code

    final_code = add_comment_header(code, use_case)
    return save_code_to_file(final_code, use_case)

# --- CLI Test Run ---

if __name__ == "__main__":
    # 🧠
    print("\n Welcome to the AI Code Generation Agent")
    # Example 1
    use_case_input = "Write code to find BinaryGap of a given positive integer"
    goals_input = "Code simple to understand, Functionally correct, Handles comprehensive edge cases, Takes positive integer input only, prints the results with few examples"
    run_code_agent(use_case_input, goals_input)

    # Example 2
    # use_case_input = "Write code to count the number of files in current directory and all its nested sub directories, and print the total count"
    # goals_input = (
    # "Code simple to understand, Functionally correct, Handles comprehensive edge cases, Ignore recommendations for performance, Ignore recommendations for test suite use like unittest or pytest"
    # )
    # run_code_agent(use_case_input, goals_input)

    # Example 3
    # use_case_input = "Write code which takes a command line input of a word doc or docx file and opens it and counts the number of words, and characters in it and prints all"
    # goals_input = "Code simple to understand, Functionally correct, Handles edge cases"
    # run_code_agent(use_case_input, goals_input)

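Along with this brief, you provide a strict quality checklist that represents the objectives the final code must meet: criteria such as "the solution must be simple," "it must be functionally correct," or "it needs to handle unexpected edge cases."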

Fig. 1: Goal Setting and Monitoring example

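With this assignment in hand, the AI programmer gets to work and produces its first draft of the code. However, instead of immediately submitting this initial version, it pauses to perform a crucial step: a rigorous self-review. It meticulously compares its own creation against every item on the quality checklist you provided, acting as its own quality assurance inspector. After this inspection, it renders a simple, unbiased verdict on its own progress: "True" if the work meets all standards, or "False" if it falls short.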

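If the verdict is "False," the AI doesn't give up. It enters a deliberate revision phase, using the insights from its self-critique to pinpoint weaknesses and rewrite the code. This cycle of drafting, self-reviewing, and refining continues, with each iteration aiming to get closer to the goals. The process repeats until the AI reaches a "True" status by satisfying every requirement, or until it hits a predefined limit of attempts, much like a developer working against a deadline. When the loop ends, whether the code passed the final inspection or the attempts ran out, the script packages the latest version, adds a header comment, and saves it to a new Python file, ready for use.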

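Caveats and Considerations: It is important to note that this is an illustrative example and not production-ready code. For real-world applications, several factors must be taken into account. An LLM may not fully grasp the intended meaning of a goal and might incorrectly assess its own performance as successful. Even if the goal is well understood, the model may hallucinate. When the same LLM is responsible for both writing the code and judging its quality, it may have a harder time discovering that it is heading in the wrong direction.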

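Ultimately, LLMs do not produce flawless code by magic; you still need to run and test the code they generate. Furthermore, the "monitoring" in this simple example is basic: only the fixed iteration limit keeps the process from potentially running forever.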

text

Act as an expert code reviewer with a deep commitment to producing
clean, correct, and simple code. Your core mission is to eliminate
code "hallucinations" by ensuring every suggestion is grounded in
reality and best practices.

When I provide you with a code snippet, I want you to:
-- Identify and Correct Errors: Point out any logical flaws, bugs, or potential runtime errors.
-- Simplify and Refactor: Suggest changes that make the code more readable, efficient, and maintainable without sacrificing correctness.
-- Provide Clear Explanations: For every suggested change, explain why it is an improvement, referencing principles of clean code, performance, or security.
-- Offer Corrected Code: Show the "before" and "after" of your suggested changes so the improvement is clear.

Your feedback should be direct, constructive, and always aimed at
improving the quality of the code.

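A more robust approach is to separate these concerns by assigning specific roles to a crew of agents. For instance, I have built a personal crew of AI agents using Gemini in which each agent has a specific role: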

The Peer Programmer: Helps write and brainstorm code.
The Code Reviewer: Catches errors and suggests improvements.
The Documenter: Generates clear and concise documentation.
The Test Writer: Creates comprehensive unit tests.
The Prompt Refiner: Optimizes interactions with the AI.

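In this multi-agent system, the Code Reviewer, acting as an entity separate from the programmer agent, has a prompt similar to the judge in the example, which significantly improves objective evaluation. This structure naturally leads to better practices, because the Test Writer agent can cover the need for unit tests on the code produced by the Peer Programmer.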
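Separating the author from the reviewer can be sketched without committing to any particular model or framework. In the sketch below both roles are plain Python callables, so in practice each could wrap a different model or prompt. `Verdict`, `refine`, and the two stub functions are hypothetical and merely stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    feedback: str


# An author maps (task, feedback) to code; a reviewer maps code to a verdict.
Author = Callable[[str, str], str]
Reviewer = Callable[[str], Verdict]


def refine(
    task: str, author: Author, reviewer: Reviewer, max_rounds: int = 3
) -> tuple[str, bool]:
    """Alternate between author and reviewer until approval or the round limit."""
    feedback = ""
    code = ""
    for _ in range(max_rounds):
        code = author(task, feedback)
        verdict = reviewer(code)
        if verdict.approved:
            return code, True
        feedback = verdict.feedback
    return code, False  # budget exhausted: the caller decides whether to escalate


# --- Stubs standing in for two differently prompted models (assumptions) ---
def stub_author(task: str, feedback: str) -> str:
    doc = '    """Add two numbers."""\n' if "docstring" in feedback else ""
    return f"def add(a, b):\n{doc}    return a + b\n"


def stub_reviewer(code: str) -> Verdict:
    if '"""' not in code:
        return Verdict(False, "Add a docstring.")
    return Verdict(True, "Looks good.")


if __name__ == "__main__":
    final_code, ok = refine("Write add(a, b)", stub_author, stub_reviewer)
    print(ok)
    print(final_code)
```

Because the reviewer is a separate callable, it can be swapped for a different model, a stricter prompt, or a non-LLM check without touching the author.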

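I leave to the interested reader the task of adding these more sophisticated controls and bringing the code closer to production readiness.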
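As one example of such a control, the verdict can be grounded in an actual run rather than in the model's opinion. The sketch below is an assumption about how this could be done and is not part of the original script: it writes the candidate code plus a few assertions to a temporary file and executes it in a fresh interpreter with a timeout, using only the standard library. `passes_tests` is a hypothetical helper, and generated code should still be executed only inside a sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(code: str, test_code: str, timeout_s: float = 10.0) -> tuple[bool, str]:
    """Run `code` followed by `test_code` in a fresh interpreter.

    Returns (True, "") when the process exits cleanly, otherwise
    (False, details), where details can be fed back to the generator.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(code + "\n\n" + test_code + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,  # guards against code that never terminates
            )
        except subprocess.TimeoutExpired:
            return False, f"Timed out after {timeout_s} seconds"
    if result.returncode == 0:
        return True, ""
    return False, result.stderr.strip()


if __name__ == "__main__":
    candidate = "def double(x):\n    return x * 2\n"
    tests = "assert double(2) == 4\nassert double(0) == 0\n"
    print(passes_tests(candidate, tests))
```

A check like this could replace or complement the LLM-based verdict in the loop: the iteration stops when the tests pass, and the captured error text becomes the feedback for the next attempt.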

At a Glance

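What: AI agents often lack a clear direction, which prevents them from acting with purpose beyond simple, reactive tasks. Without defined objectives, they cannot independently tackle complex, multi-step problems or orchestrate sophisticated workflows. Furthermore, they have no inherent mechanism for determining whether their actions are leading to a successful outcome. This limits their autonomy and keeps them from being truly effective in dynamic, real-world scenarios where merely executing tasks is not enough.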

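Why: The Goal Setting and Monitoring pattern provides a standardized solution by embedding a sense of purpose and a capacity for self-assessment into agentic systems. It involves explicitly defining clear, measurable objectives for the agent to achieve. At the same time, it establishes a monitoring mechanism that continuously tracks the agent's progress and the state of its environment against these goals. This creates a crucial feedback loop that enables the agent to assess its performance, correct its course, and adapt its plan when it deviates from the path to success. By implementing this pattern, developers can transform simple reactive agents into proactive, goal-oriented systems capable of autonomous and reliable operation.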

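Rule of thumb: Use this pattern when an AI agent must autonomously execute a multi-step task, adapt to dynamic conditions, and reliably achieve a specific, high-level objective without constant human intervention.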

Visual Summary

Fig. 2: Goal design patterns

Key Takeaways

Key takeaways include:

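Goal Setting and Monitoring equips agents with purpose and with mechanisms to track progress.
Goals should be specific, measurable, achievable, relevant, and time-bound (SMART).
Clearly defined metrics and success criteria are essential for effective monitoring.
Monitoring involves observing agent actions, environmental states, and tool outputs.
Feedback loops from monitoring allow agents to adapt, revise plans, or escalate issues.
In Google's ADK, goals are often conveyed through agent instructions, and monitoring is accomplished through state management and tool interactions.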

Conclusion

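This chapter focused on the crucial paradigm of Goal Setting and Monitoring. I highlighted how this concept transforms AI agents from merely reactive systems into proactive, goal-driven entities. The text emphasized the importance of defining clear, measurable objectives and establishing rigorous monitoring procedures to track progress. Practical applications demonstrated how this paradigm supports reliable autonomous operation across various domains, including customer service and robotics. A conceptual coding example illustrated the implementation of these principles within a structured framework, using agent directives and state management to guide and evaluate an agent's progress towards its specified goals. Ultimately, equipping agents with the ability to formulate and oversee goals is a fundamental step toward building truly intelligent and accountable AI systems.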

References

SMART Goals Framework. https://en.wikipedia.org/wiki/SMART_criteria