Spring AI 教程

Spring AI 详细使用教程

bear1682

820人浏览 · 2025-05-26 17:21:24

bear1682 · 2025-05-26 17:21:24 发布

Spring AI 优化教程

Spring AI 简介
核心概念
快速入门
聊天客户端 API (ChatClient)
AI 模型 (AI Models)
多模态 (Multimodality)
向量数据库 (Vector Databases)
检索增强生成 (RAG)
结构化输出 (Structured Output)
聊天内存 (Chat Memory)
工具调用 (Tool Calling)
提示工程 (Prompt Engineering)
ETL 框架 (ETL Framework)
顾问 API (Advisors API)
可观测性 (Observability)
AI 模型评估 (AI Model Evaluation)
服务连接 (Service Connections)
贡献指南
升级说明
常见问题与解决方案 (FAQ)
参考资料与链接

1. Spring AI 简介

什么是 Spring AI？

Spring AI 项目旨在以简化的方式开发集成人工智能（AI）功能的应用程序，避免不必要的复杂性。该项目深受 Python 生态中 LangChain 和 LlamaIndex 等著名项目的影响，但并非它们的直接移植。Spring AI 的创立基于一个信念：下一波生成式 AI 应用将不仅仅局限于 Python 开发者，而是会在多种编程语言中普及开来。

Spring AI 致力于解决 AI 集成中的一个根本性挑战：将您的企业数据和 API 与 AI 模型连接起来。它并非用于从头创建 AI 模型，而是提供了一套抽象和工具，使开发者能够轻松地与各种 AI 提供商（如 OpenAI、Azure OpenAI、Anthropic、Amazon Bedrock、Google Vertex AI、Ollama、HuggingFace 等）进行交互，并利用它们的能力来构建智能应用。

Spring AI 的核心理念是提供一个统一的、可移植的编程模型，屏蔽底层不同 AI 服务提供商 API 的复杂性和差异性。这意味着开发者可以使用一套标准的、Spring 风格的接口和注解来调用不同的 AI 模型，执行聊天对话、文本生成、图像生成、音频转录、文本转语音、文本嵌入、向量搜索等常见 AI 任务，而无需深入了解每个特定提供商的 SDK 或 API 细节。这种抽象大大降低了学习成本和集成复杂度，使得 Java 开发者可以更快地将 AI 功能融入到现有的 Spring Boot 应用中。

该项目致力于将 Spring 生态系统的设计原则（如模块化、自动配置、依赖注入、面向接口编程等）应用于快速发展的 AI 工程领域。通过提供熟悉的 Spring 抽象，例如 ChatClient（其流式 API 设计类似于 WebClient 和 RestClient）、EmbeddingClient 以及对模板化提示（Prompt Templating）的支持，Spring AI 使得构建复杂的 AI 驱动应用变得更加高效和可维护。

Spring AI 的核心目标和优势

Spring AI 的核心目标可以概括为以下几点：

简化 AI 应用开发：为 Java 和 Spring 开发者提供一套熟悉且一致的工具集，降低集成 AI 功能的门槛。开发者无需成为 AI 领域的专家，也能快速上手并构建出强大的 AI 应用。
提供统一的抽象层：针对不同 AI 提供商的 API 提供统一的、可移植的接口。这意味着开发者可以编写一次代码，然后通过简单的配置切换不同的底层 AI 模型或服务，例如从 OpenAI GPT-4 切换到本地运行的 Ollama 模型，或者从一个向量数据库切换到另一个。
与 Spring 生态无缝集成：充分利用 Spring Boot 的自动配置、依赖管理和模块化特性，使得 AI 功能可以像其他 Spring 组件一样轻松集成到应用程序中。这包括对 Spring Data、Spring WebFlux 等项目的良好兼容性。
支持便携式 AI 功能：通过抽象化核心 AI 概念（如聊天、嵌入、向量存储等），Spring AI 使得应用程序在不同 AI 实现之间具有更好的可移植性。这有助于避免厂商锁定，并允许开发者根据需求灵活选择最合适的 AI 技术。
促进 AI 工程最佳实践：引入诸如提示工程、检索增强生成（RAG）、结构化输出、工具调用、模型评估等 AI 工程领域的最佳实践，并提供相应的支持，帮助开发者构建更健壮、更可靠、更强大的 AI 应用。

Spring AI 的主要优势包括：

降低学习曲线：对于熟悉 Spring 框架的 Java 开发者而言，学习 Spring AI 的成本相对较低，因为其 API 设计和编程模型与 Spring 的整体风格保持一致。
提高开发效率：通过自动配置、标准化的接口（如 ChatClient 流式 API）、丰富的工具支持和顾问 API (Advisors API)，Spring AI 可以显著提高 AI 应用的开发效率。
增强可移植性：统一的抽象使得应用在不同 AI 模型和向量数据库之间切换变得更加容易。
强大的社区支持：作为 Spring 生态系统的一部分，Spring AI 可以受益于庞大的 Spring 社区和丰富的生态资源。
企业级特性：Spring 框架本身具备的成熟的企业级特性，如安全性、可观测性、事务管理等，可以间接或直接地应用于 Spring AI 构建的应用中。

Spring AI 的主要功能模块概览

Spring AI 提供了一系列功能模块来支持不同类型的 AI 任务。以下是一些核心的功能模块：

可移植的 API 支持：跨 AI 提供商支持聊天（Chat）、文本到图像（Text-to-Image）和嵌入（Embedding）模型。支持同步和流式 API 选项，并可访问特定模型的功能。
广泛的 AI 模型支持：支持所有主要的 AI 模型提供商，如 Anthropic、OpenAI、Microsoft、Amazon、Google 和 Ollama。支持的模型类型包括：
- 聊天补全 (Chat Completion)
- 嵌入 (Embedding)
- 文本到图像 (Text to Image)
- 音频转录 (Audio Transcription)
- 文本到语音 (Text to Speech)
结构化输出 (Structured Outputs)：将 AI 模型的输出映射到 POJO（Plain Old Java Objects）。
广泛的向量数据库支持：支持所有主要的向量数据库提供商，如 Apache Cassandra、Azure Cosmos DB、Azure Vector Search、Chroma、Elasticsearch、GemFire、MariaDB、Milvus、MongoDB Atlas、Neo4j、OpenSearch、Oracle、PostgreSQL/PGVector、PineCone、Qdrant、Redis、SAP Hana、Typesense 和 Weaviate。
可移植的向量存储 API：跨向量存储提供商提供可移植的 API，包括新颖的类 SQL 元数据过滤 API。
工具/函数调用 (Tools/Function Calling)：允许模型请求执行客户端工具和函数，从而根据需要访问必要的实时信息并采取行动。
可观测性 (Observability)：提供对 AI 相关操作的洞察力，集成了 Micrometer。
文档摄取 ETL 框架：用于数据工程的 ETL（Extract, Transform, Load）框架。
AI 模型评估 (AI Model Evaluation)：提供工具帮助评估生成内容的质量，并防范幻觉响应。
Spring Boot 自动配置与 Starters：为 AI 模型和向量存储提供自动配置和启动器。
ChatClient API：用于与 AI 聊天模型通信的流式 API，在习惯用法上类似于 WebClient 和 RestClient API。
顾问 API (Advisors API)：封装了重复性的生成式 AI 模式，转换进出语言模型（LLM）的数据，并提供跨各种模型和用例的可移植性。
聊天对话内存 (Chat Conversation Memory) 和 检索增强生成 (RAG) 支持。

这些功能集使您能够实现常见的用例，例如“基于您的文档进行问答”或“与您的文档聊天”。

2. 核心概念

在深入学习 Spring AI 的具体功能之前，理解一些与人工智能（AI）、机器学习（ML）和大型语言模型（LLM）相关的核心概念至关重要。这些概念是构建和使用 AI 应用的基础，Spring AI 正是围绕这些概念提供了相应的抽象和实现。

模型 (Models)

AI 模型是经过训练的算法，旨在处理和生成信息，通常模仿人类的认知功能。通过从大型数据集中学习模式和见解，这些模型可以进行预测、生成文本、图像或其他输出，从而增强各行各业的应用。

存在多种类型的 AI 模型，每种都适用于特定的用例。虽然像 ChatGPT 这样的生成式 AI 通过文本输入和输出吸引了用户，但许多模型和公司提供了多样化的输入和输出能力。例如，文本到图像生成模型（如 Midjourney、Stable Diffusion）在 ChatGPT 之前就已广受欢迎。

Spring AI 目前支持处理语言、图像和音频作为输入和输出的模型。此外，它还支持将文本转换为数字表示（嵌入），这是许多高级 AI 用例的基础。

像 GPT 这样的模型的特点在于其“预训练”性质（GPT 中的“P”代表 Pre-trained）。这种预训练特性将 AI 转变为一种通用的开发者工具，无需开发者具备广泛的机器学习或模型训练背景。

提示 (Prompts)

提示是提供给基于语言的 AI 模型的输入，用于引导模型产生特定的输出。对于熟悉 ChatGPT 的用户来说，提示可能看起来只是输入到对话框中的文本。然而，在许多 AI 模型（包括 ChatGPT API）中，提示不仅仅是一个简单的字符串。它通常包含多个文本输入，每个输入被赋予一个角色。

例如，有系统角色 (System Role)，它告诉模型如何行为并设定交互的上下文；还有用户角色 (User Role)，通常是来自用户的输入；以及助手角色 (Assistant Role)，代表模型之前的响应。

构建有效的提示既是一门艺术也是一门科学，被称为“提示工程 (Prompt Engineering)”。与使用 SQL 等结构化查询语言不同，与 AI 模型沟通更像是与人交谈。投入时间精心设计提示可以显著改善输出结果。Spring AI 提供了 Prompt 对象来封装这些带有角色的消息。

提示模板 (Prompt Templates)

创建有效的提示通常涉及建立请求的上下文，并将请求的某些部分替换为特定于用户输入的值。这个过程可以使用传统的基于文本的模板引擎来创建和管理提示。

Spring AI 使用开源库 StringTemplate 来实现提示模板功能。

例如，考虑一个简单的提示模板：

Tell me a {adjective} joke about {content}.

在 Spring AI 中，提示模板类似于 Spring MVC 架构中的“视图 (View)”。提供一个模型对象（通常是 java.util.Map）来填充模板中的占位符。渲染后的字符串将成为提供给 AI 模型的提示内容的一部分。

Spring AI 的 PromptTemplate 类简化了使用模板创建 Prompt 对象的过程。

嵌入 (Embeddings)

嵌入是将文本、图像或视频等输入转换为数值表示（通常是浮点数数组，即向量）的技术，旨在捕捉输入之间的关系和语义含义。

嵌入通过将输入转换为向量来工作。这些向量被设计用来捕捉文本、图像和视频的意义。嵌入数组的长度称为向量的维度 (Dimensionality)。

通过计算两个文本片段的向量表示之间的数值距离（例如，余弦相似度），应用程序可以确定用于生成嵌入向量的对象之间的相似性。

对于 Java 开发者来说，无需深入理解嵌入背后的复杂数学理论或具体实现。理解它们在 AI 系统中的作用和功能就足够了，特别是在将 AI 功能集成到应用程序中时。

嵌入在诸如检索增强生成 (RAG) 等实际应用中尤为重要。它们使得数据可以在一个语义空间中表示为点。在这个多维空间中，点的邻近度反映了意义的相似性。关于相似主题的句子在这个空间中位置更近，这有助于文本分类、语义搜索甚至产品推荐等任务。

Spring AI 提供了 EmbeddingClient 接口，用于将文本（或未来可能的其他模态）转换为嵌入向量。

Tokens

Token 是 AI 模型处理文本的基本单位。在输入时，模型将单词（或子词）转换为 Token；在输出时，它们将 Token 转换回单词。

在英语中，一个 Token 大致对应一个单词的 75%。了解 Token 很重要，因为：

成本 (Tokens = Money)：对于托管的 AI 模型，费用通常根据使用的 Token 数量确定。输入和输出 Token 都会计入总数。
限制 (Token Limits / Context Window)：模型有 Token 限制，这限制了单次 API 调用中可以处理的文本量。这个阈值通常被称为“上下文窗口 (Context Window)”。模型不会处理超出此限制的文本。不同的模型具有不同的上下文窗口大小（例如，GPT-3.5 有 4K，GPT-4 有 8K、32K 等，Anthropic 的 Claude 有 100K 甚至更高）。

当处理大量文本（如莎士比亚全集）时，需要制定策略将数据分块并在模型的上下文窗口限制内呈现给模型。Spring AI 项目提供了工具来帮助完成这项任务。

结构化输出 (Structured Output)

传统上，AI 模型的输出是 java.lang.String，即使你要求以 JSON 格式回复。它可能是一个格式正确的 JSON 字符串，但它不是一个可以直接使用的 JSON 数据结构。此外，仅仅在提示中要求“返回 JSON”并不能保证 100% 的准确性和格式一致性。

为了解决这个问题，Spring AI 提供了结构化输出功能。它允许开发者定义期望的输出格式（通常是一个 Java Bean 或 POJO），并让 Spring AI 负责将模型的自然语言响应转换为该结构化对象。

这通过使用输出转换器 (OutputConverter) 实现，例如 @BeanOutputConverter，它可以将模型的文本输出自动映射到指定的 Java Bean 实例。这极大地简化了将 AI 模型响应集成到应用程序逻辑中的过程。

检索增强生成 (RAG - Retrieval Augmented Generation)

检索增强生成 (RAG) 是一种结合了信息检索和大型语言模型生成能力的技术范式。其核心思想是在 LLM 生成响应之前，先从一个或多个外部知识源（如向量数据库、文档集合、API 等）中检索相关信息，然后将这些检索到的信息作为上下文提供给 LLM，以生成更准确、更具体、更基于事实的回答。

RAG 的主要优势包括：

减少幻觉：通过提供相关的外部知识，帮助 LLM 基于真实数据回答问题。
访问最新信息：允许 LLM 利用训练数据之外的最新信息。
利用领域特定知识：使 LLM 能够回答关于私有或特定领域数据的问题，而无需重新训练模型。
提高透明度和可信度：可以追溯信息来源。

Spring AI 提供了构建 RAG 应用的组件和模式，包括文档加载器 (DocumentReader)、文档转换器/分割器 (DocumentTransformer/TextSplitter)、与向量数据库 (VectorStore) 的集成以及将检索结果整合到提示中的机制。

工具调用 (Tool Calling)

工具调用（在 OpenAI 中也称为 Function Calling）是一种允许 LLM 与外部工具、API 或函数进行交互的机制。当 LLM 需要执行其自身无法完成的操作时（例如，获取实时信息、查询数据库、执行计算），它可以请求调用一个预定义的工具。

基本流程：

定义工具：向 LLM 描述可用的工具及其功能和参数。
模型决策：LLM 判断是否需要调用工具，并生成调用请求（工具名和参数）。
执行工具：应用程序接收请求，执行相应的外部工具/函数。
返回结果：将工具执行结果返回给 LLM。
最终响应：LLM 结合工具结果生成最终回复。

工具调用极大地扩展了 LLM 的能力。Spring AI 提供了对工具调用的支持，允许开发者将 Java 方法注册为可供 LLM 调用的工具，并通过 ToolCallback 接口处理调用请求。

多模态 (Multimodality)

多模态 指 AI 系统能够处理和理解包含多种信息类型（模态）的输入，和/或生成包含多种模态的输出。常见模态包括文本、图像、音频、视频等。

多模态 LLM（如 OpenAI 的 GPT-4V、Google 的 Gemini）可以同时理解文本和图像等信息。

Spring AI 正在逐步增强对多模态的支持，允许开发者向支持多模态的模型发送包含图像等非文本内容的提示（例如，通过 Media 对象封装图像数据），并处理其多模态响应。

AI 响应评估 (Evaluating AI responses)

评估 AI 模型生成的响应质量对于构建可靠的应用程序至关重要。由于生成式 AI 的输出可能存在不准确、不相关甚至“幻觉”的情况，因此需要机制来评估其响应。

Spring AI 提供了一些基础工具和概念来帮助进行 AI 模型评估，例如 EvaluationRequest 和 EvaluationResponse，以及与评估相关的接口和实现。这有助于开发者衡量模型输出是否满足特定标准或预期。

理解这些核心概念将帮助你更好地掌握 Spring AI 的各项功能，并有效地利用它们来构建强大的 AI 应用程序。

3. 快速入门

本章节将指导你快速搭建 Spring AI 开发环境，并创建你的第一个简单的 Spring AI 应用。我们将通过一个基础的问答机器人示例，演示如何配置和使用 Spring AI 与 AI 模型进行交互。

环境准备

在开始之前，请确保你的开发环境中已安装并配置好以下软件：

Java Development Kit (JDK)：Spring AI 需要 Java 17 或更高版本。建议使用最新的 LTS (Long-Term Support) 版本的 JDK，例如 JDK 17 或 JDK 21。你可以从 Oracle, AdoptOpenJDK (Temurin), Amazon Corretto, Azul Zulu 等官方渠道下载并安装适合你操作系统的 JDK。
- 验证 JDK 安装：打开终端或命令提示符，输入 java -version，应能看到已安装的 JDK 版本信息。
构建工具 (Maven 或 Gradle)：Spring Boot 项目通常使用 Maven 或 Gradle 进行构建和依赖管理。Spring AI 也与这两种构建工具兼容。
- Maven：如果选择 Maven，请确保已安装 Maven 3.6.3 或更高版本。你可以从 Apache Maven 官网下载并安装。验证安装：mvn -version。
- Gradle：如果选择 Gradle，请确保已安装 Gradle 7.x 或更高版本。你可以从 Gradle 官网下载并安装。验证安装：gradle -version。
Spring Boot：Spring AI 通常作为 Spring Boot 应用的一部分来使用。Spring AI 支持 Spring Boot 3.4.x 版本，未来将支持 Spring Boot 3.5.x。如果你对 Spring Boot 不熟悉，建议先查阅 Spring Boot 的官方文档了解其基本概念和用法。
- 你可以通过 Spring Initializr (start.spring.io) 快速创建一个 Spring Boot 项目骨架。
IDE (集成开发环境)：推荐使用支持 Java 和 Spring Boot 开发的 IDE，如 IntelliJ IDEA (Ultimate 或 Community Edition), Eclipse (with Spring Tools Suite), 或 VS Code (with Java Extension Pack and Spring Boot Extension Pack)。这些 IDE 提供了代码补全、调试、项目管理等强大功能，能显著提高开发效率。
AI 模型访问权限：虽然 Spring AI 支持一些本地运行的 AI 模型 (如通过 Ollama)，但许多强大的模型需要 API 密钥才能访问。为了完整体验本教程中的示例，建议你：
- 注册一个 OpenAI 账户并获取 API 密钥 (platform.openai.com)。
- 或者，如果你希望使用本地模型，可以安装 Ollama (ollama.ai) 并下载一个模型，例如 Llama2 或 Mistral。
- 其他 AI 服务提供商（如 Azure OpenAI, Google Vertex AI, Amazon Bedrock 等）也需要相应的账户和凭证。

确保以上环境和工具都已正确安装和配置，以便顺利进行后续的 Spring AI 应用开发。

使用 Spring Initializr 创建项目

Spring AI 提供了与 Spring Initializr 的集成，使你可以轻松创建包含 Spring AI 依赖的项目。

访问 start.spring.io
选择你偏好的构建工具（Maven 或 Gradle）
选择 Java 17 或更高版本
选择最新的 Spring Boot 版本（3.4.x 或更高）
在"Dependencies"部分，搜索并选择你需要的 AI 模型和向量存储组件，例如：
- Spring AI OpenAI
- Spring AI Chroma
- Spring Web（如果你要构建 Web 应用）
点击"Generate"下载项目骨架
解压下载的文件，并在你的 IDE 中导入项目

添加 Spring AI 依赖

如果你没有使用 Spring Initializr 或需要手动添加依赖，可以按照以下步骤在你的 Spring Boot 项目中添加 Spring AI 依赖。

Spring AI 的依赖通常以 spring-ai-*-spring-boot-starter 命名，后面跟着你希望集成的特定 AI 模型或向量数据库的模块。

对于 Maven 项目 (pom.xml)：

首先，确保你的项目继承自 spring-boot-starter-parent，并指定了合适的 Spring Boot 版本 (例如 3.4.x 或更高)。

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.4.0</version> <!-- 请使用最新的稳定版 Spring Boot -->
    <relativePath/> <!-- lookup parent from repository -->
</parent>

<properties>
    <java.version>17</java.version>
</properties>

<dependencies>
    <!-- Spring Boot Web Starter (如果需要构建 Web 应用) -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Spring AI OpenAI Starter (示例) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>

    <!-- 如果使用 Ollama (示例) -->
    <!--
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
    -->

    <!-- 其他 Spring AI 模块，例如向量数据库 -->
    <!--
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-chroma-spring-boot-starter</artifactId>
    </dependency>
    -->

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

注意：

从 1.0.0-M6 版本开始，Spring AI 的发布版本已经可以在 Maven Central 中获取，无需添加额外的仓库配置。
如果你使用的是快照版本 (SNAPSHOT)，则需要在 <repositories> 部分添加 Spring 的快照仓库地址：

<!-- Spring AI 快照版本需要添加仓库 -->
<repositories>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
    <repository>
        <name>Central Portal Snapshots</name>
        <id>central-portal-snapshots</id>
        <url>https://central.sonatype.com/repository/maven-snapshots/</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

对于 Gradle 项目 (build.gradle)：

plugins {
    id 'java'
    id 'org.springframework.boot' version '3.4.0' // 请使用最新的稳定版 Spring Boot
    id 'io.spring.dependency-management' version '1.1.4'
}

group = 'com.example'
version = '0.0.1-SNAPSHOT'

java {
    sourceCompatibility = '17'
}

repositories {
    mavenCentral()
    // Spring AI 快照版本需要添加仓库
    // maven {
    //     url 'https://repo.spring.io/snapshot'
    //     mavenContent {
    //         snapshotsOnly()
    //     }
    // }
    // maven {
    //     url 'https://central.sonatype.com/repository/maven-snapshots/'
    //     mavenContent {
    //         snapshotsOnly()
    //     }
    // }
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'

    // Spring AI OpenAI Starter (示例)
    implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter'

    // 如果使用 Ollama (示例)
    // implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'

    // 其他 Spring AI 模块，例如向量数据库
    // implementation 'org.springframework.ai:spring-ai-chroma-spring-boot-starter'

    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}

tasks.named('test') {
    useJUnitPlatform()
}

添加完依赖后，请刷新你的项目构建配置（例如，在 IntelliJ IDEA 中，右键点击 pom.xml 或 build.gradle 选择 “Reload project” 或 “Load Gradle Changes”），以便构建工具下载并引入这些依赖。

配置 AI 模型

在使用 Spring AI 之前，你需要配置你选择的 AI 模型。以下是一些常见模型的配置示例：

OpenAI 配置 (application.properties 或 application.yml)：

# application.properties
spring.ai.openai.api-key=YOUR_OPENAI_API_KEY
# 可选：指定模型，默认为 gpt-3.5-turbo
# spring.ai.openai.chat.options.model=gpt-4

或者使用 YAML 格式：

# application.yml
spring:
  ai:
    openai:
      api-key: YOUR_OPENAI_API_KEY
      chat:
        options:
          model: gpt-3.5-turbo  # 可选，默认值

Ollama 配置：

# application.properties
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama2

Azure OpenAI 配置：

# application.properties
spring.ai.azure.openai.api-key=YOUR_AZURE_OPENAI_API_KEY
spring.ai.azure.openai.endpoint=YOUR_AZURE_OPENAI_ENDPOINT
spring.ai.azure.openai.chat.options.model=YOUR_DEPLOYED_MODEL_NAME

请将占位符（如 YOUR_OPENAI_API_KEY）替换为你的实际 API 密钥或配置值。切勿将真实的 API 密钥硬编码到版本控制系统中。 在生产环境中，应使用更安全的方式管理密钥，例如环境变量、Vault 或 Spring Cloud Config。

第一个 Spring AI 应用：简单问答机器人

现在，让我们来创建一个非常简单的 Spring Boot 应用，它使用 Spring AI 与 AI 模型进行交互，实现一个基础的问答功能。

首先，创建一个简单的 REST Controller 来接收用户的问题并返回 AI 的回答：

src/main/java/com/example/springaitutorial/controller/ChatController.java:

package com.example.springaitutorial.controller;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.chat.prompt.UserMessage;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import java.util.Map;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    @Autowired
    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/ai/simple-chat")
    public Map<String, String> simpleChat(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        // 创建一个用户消息
        UserMessage userMessage = new UserMessage(message);

        // 创建一个提示对象，包含用户消息
        Prompt prompt = new Prompt(userMessage);

        // 发送提示给 AI 模型并获取响应
        ChatResponse response = chatClient.call(prompt);

        // 从响应中提取文本内容
        String aiResponse = response.getResult().getOutput().getContent();

        // 返回问题和回答
        return Map.of(
            "question", message,
            "answer", aiResponse
        );
    }

    @GetMapping("/ai/guided-chat")
    public Map<String, String> guidedChat(
            @RequestParam(value = "message") String message,
            @RequestParam(value = "role", defaultValue = "assistant") String role) {

        // 创建一个系统提示模板，定义 AI 的行为
        String systemPromptTemplate = "You are a {role}. " +
                "Provide helpful and informative responses in a friendly tone. " +
                "Keep your answers concise and to the point.";

        // 使用模板创建系统提示，并传入角色参数
        SystemPromptTemplate systemPrompt = new SystemPromptTemplate(systemPromptTemplate);

        // 创建一个提示对象，包含系统提示和用户消息
        Prompt prompt = new Prompt(
                systemPrompt.create(Map.of("role", role)),
                new UserMessage(message)
        );

        // 发送提示给 AI 模型并获取响应
        ChatResponse response = chatClient.call(prompt);

        // 从响应中提取文本内容
        String aiResponse = response.getResult().getOutput().getContent();

        // 返回问题、角色和回答
        return Map.of(
            "question", message,
            "role", role,
            "answer", aiResponse
        );
    }
}

这个控制器提供了两个端点：

/ai/simple-chat：一个基本的聊天端点，接收用户消息并返回 AI 的回答。
/ai/guided-chat：一个更高级的端点，允许指定 AI 的角色，并使用系统提示来引导 AI 的行为。

接下来，创建 Spring Boot 应用的主类：

src/main/java/com/example/springaitutorial/SpringAiTutorialApplication.java:

package com.example.springaitutorial;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringAiTutorialApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringAiTutorialApplication.class, args);
    }
}

现在，你可以运行这个应用程序：

./mvnw spring-boot:run  # 对于 Maven 项目

或者

./gradlew bootRun  # 对于 Gradle 项目

应用启动后，你可以通过浏览器或 API 客户端（如 Postman、curl）访问以下 URL 来测试你的 AI 聊天机器人：

http://localhost:8080/ai/simple-chat?message=What is Spring AI?

或者

http://localhost:8080/ai/guided-chat?message=Tell me about Java&role=experienced programmer

恭喜！你已经成功创建了你的第一个 Spring AI 应用程序。这个简单的例子展示了如何使用 Spring AI 与 AI 模型进行基本交互。在接下来的章节中，我们将探索更多高级功能，如流式响应、聊天内存、结构化输出、RAG 等。

Spring AI 示例项目

Spring AI 提供了一系列示例项目，展示了各种功能和用例。这些示例是学习和理解 Spring AI 的宝贵资源。

你可以在 GitHub 上的 Spring AI 示例仓库中找到这些示例。每个示例都包含了详细的说明和代码，涵盖了从基本用法到高级功能的各个方面。

一些值得关注的示例包括：

基本的聊天和嵌入示例
各种 AI 模型提供商的集成示例
RAG 应用示例
结构化输出示例
工具调用示例
向量数据库集成示例

浏览这些示例是深入了解 Spring AI 功能和最佳实践的好方法。

4. 聊天客户端 API (ChatClient)

ChatClient API 是 Spring AI 的核心组件之一，它提供了一个流畅的接口，用于与各种 AI 聊天模型进行交互。这个 API 的设计风格类似于 Spring 的 WebClient 和 RestClient，为 Java 开发者提供了一种熟悉且一致的方式来构建 AI 驱动的对话应用。

ChatClient 接口概述

ChatClient 是一个接口，定义了与聊天模型交互的核心方法。Spring AI 为各种 AI 提供商（如 OpenAI、Azure OpenAI、Anthropic、Amazon Bedrock、Google Vertex AI、Ollama 等）提供了这个接口的实现。

基本用法示例：

@Autowired
private ChatClient chatClient;

public String chat(String userMessage) {
    return chatClient.call(userMessage);
}

这个简单的例子展示了如何使用 ChatClient 发送一个用户消息并获取模型的响应。但 ChatClient 的功能远不止于此，它支持更复杂的交互模式。

构建提示 (Building Prompts)

在与 AI 模型交互时，提示（Prompt）的构建至关重要。Spring AI 提供了多种方式来构建提示：

使用字符串

最简单的方式是直接使用字符串作为用户消息：

String response = chatClient.call("Tell me a joke about programming");

使用 Message 对象

对于更复杂的场景，你可以使用 Message 对象来构建提示。Spring AI 提供了几种类型的消息：

UserMessage：用户输入的消息
SystemMessage：设置 AI 行为的系统指令
AssistantMessage：AI 之前的响应，用于构建对话历史

// 创建一个系统消息来设置 AI 的行为
SystemMessage systemMessage = new SystemMessage("You are a helpful programming assistant. Keep your answers concise and include code examples when appropriate.");

// 创建一个用户消息
UserMessage userMessage = new UserMessage("How do I read a file in Java?");

// 创建提示对象，包含系统消息和用户消息
Prompt prompt = new Prompt(List.of(systemMessage, userMessage));

// 发送提示并获取响应
ChatResponse response = chatClient.call(prompt);

使用 Prompt Builder

Spring AI 还提供了一个流式的 prompt() builder API，使提示构建更加直观：

ChatResponse response = chatClient.prompt()
    .system("You are a helpful programming assistant.")
    .user("How do I read a file in Java?")
    .call();

这种方式特别适合构建多轮对话：

ChatResponse response = chatClient.prompt()
    .system("You are a helpful assistant.")
    .user("Hello, who are you?")
    .assistant("I'm an AI assistant created to help answer your questions.")
    .user("Can you help me with a math problem?")
    .call();

处理响应 (Handling Responses)

ChatClient 的 call() 方法返回一个 ChatResponse 对象，它包含了模型的响应以及其他元数据。

ChatResponse response = chatClient.call("What is Spring AI?");

// 获取响应内容
String content = response.getResult().getOutput().getContent();

// 获取响应元数据（如果模型提供）
Map<String, Object> metadata = response.getMetadata();

对于结构化输出，Spring AI 提供了转换器来将文本响应映射到 Java 对象：

public record WeatherInfo(String location, double temperature, String condition) {}

WeatherInfo weatherInfo = chatClient.prompt()
    .system("You are a weather information service.")
    .user("What's the weather like in New York?")
    .callAndConvertTo(WeatherInfo.class);

流式响应 (Streaming Responses)

对于需要实时显示 AI 响应的应用（如聊天界面），Spring AI 支持流式响应：

// 同步流式处理
chatClient.stream("Generate a story about a space adventure")
    .forEach(chunk -> {
        String content = chunk.getOutput().getContent();
        System.out.print(content); // 逐步打印响应内容
    });

// 响应式流式处理（使用 Project Reactor）
Flux<ChatResponse> responseFlux = chatClient.streamResponse("Generate a story about a space adventure");
responseFlux.subscribe(chunk -> {
    String content = chunk.getOutput().getContent();
    System.out.print(content);
});

配置请求选项 (Request Options)

Spring AI 允许你为每个请求配置特定的选项，如温度、最大 token 数等：

ChatResponse response = chatClient.prompt()
    .system("You are a creative storyteller.")
    .user("Tell me a short story about a dragon")
    .withOptions(options -> options
        .withTemperature(0.8f)
        .withMaxTokens(500)
        .withTopP(0.95f))
    .call();

这些选项可以帮助你控制模型的创造性、响应长度和其他生成参数。

多模型支持 (Multi-Model Support)

如果你的应用需要使用多个不同的 AI 模型，Spring AI 提供了 ChatClient.Builder 来创建针对特定模型的客户端实例：

@Configuration
public class AiConfig {

    @Bean
    public ChatClient gpt35ChatClient(OpenAiChatOptions defaultOptions) {
        OpenAiChatOptions gpt35Options = OpenAiChatOptions.builder()
            .withModel("gpt-3.5-turbo")
            .withTemperature(0.7f)
            .build();

        return new OpenAiChatClient(gpt35Options);
    }

    @Bean
    public ChatClient gpt4ChatClient(OpenAiChatOptions defaultOptions) {
        OpenAiChatOptions gpt4Options = OpenAiChatOptions.builder()
            .withModel("gpt-4")
            .withTemperature(0.5f)
            .build();

        return new OpenAiChatClient(gpt4Options);
    }
}

然后，你可以根据需要注入特定的客户端：

@Autowired
@Qualifier("gpt4ChatClient")
private ChatClient premiumChatClient;

@Autowired
@Qualifier("gpt35ChatClient")
private ChatClient standardChatClient;

错误处理 (Error Handling)

与 AI 模型交互时可能会遇到各种错误，如 API 限制、网络问题或模型错误。Spring AI 提供了异常处理机制来管理这些情况：

try {
    ChatResponse response = chatClient.call("Generate a very long story");
} catch (AiException e) {
    if (e.getStatusCode() == HttpStatus.TOO_MANY_REQUESTS) {
        // 处理速率限制错误
        System.out.println("Rate limit exceeded. Please try again later.");
    } else {
        // 处理其他 AI 相关错误
        System.out.println("AI error: " + e.getMessage());
    }
} catch (Exception e) {
    // 处理其他异常
    System.out.println("Unexpected error: " + e.getMessage());
}

实际应用示例

以下是一个更完整的示例，展示了如何在 Spring Boot 应用中使用 ChatClient 创建一个聊天机器人 API：

@RestController
@RequestMapping("/api/chat")
public class ChatBotController {

    private final ChatClient chatClient;

    @Autowired
    public ChatBotController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @PostMapping
    public ResponseEntity<Map<String, String>> chat(@RequestBody ChatRequest request) {
        try {
            ChatResponse response = chatClient.prompt()
                .system("You are a helpful assistant named ChatBot. " +
                        "You provide concise and accurate information.")
                .user(request.getMessage())
                .withOptions(options -> options
                    .withTemperature(0.7f)
                    .withMaxTokens(300))
                .call();

            String content = response.getResult().getOutput().getContent();

            return ResponseEntity.ok(Map.of(
                "message", content,
                "timestamp", LocalDateTime.now().toString()
            ));
        } catch (Exception e) {
            return ResponseEntity
                .status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求对象
    public static class ChatRequest {
        private String message;

        // Getters and setters
        public String getMessage() {
            return message;
        }

        public void setMessage(String message) {
            this.message = message;
        }
    }
}

这个控制器提供了一个 POST 端点，接收用户消息并返回 AI 的响应。它使用系统消息来设置 AI 的行为，并配置了温度和最大 token 数等选项。

通过 ChatClient API，Spring AI 为开发者提供了一种强大而灵活的方式来与 AI 聊天模型交互，使得构建智能对话应用变得更加简单和直观。

5. AI 模型 (AI Models)

Spring AI 支持多种 AI 模型提供商，使开发者能够根据需求选择最合适的模型。本章节将介绍 Spring AI 支持的主要 AI 模型提供商，以及如何配置和使用它们。

支持的模型提供商

Spring AI 目前支持以下主要的 AI 模型提供商：

OpenAI：提供 GPT-3.5-turbo、GPT-4 等模型
Azure OpenAI：微软 Azure 平台上的 OpenAI 服务
Anthropic：提供 Claude 系列模型
Amazon Bedrock：亚马逊的生成式 AI 服务，支持多种基础模型
Google Vertex AI：谷歌的 AI 平台，提供 PaLM、Gemini 等模型
Ollama：本地运行的开源 LLM 解决方案
HuggingFace：开源 AI 社区，提供数千种模型
Mistral AI：提供 Mistral 系列模型
Cohere：提供 Command 系列模型
AI21 Labs：提供 Jurassic 系列模型

每个提供商都有其独特的特点、优势和适用场景。Spring AI 通过统一的抽象接口，使得在不同提供商之间切换变得简单。

模型类型

Spring AI 支持多种类型的 AI 模型：

聊天模型 (Chat Models)：用于生成对话式响应，如 OpenAI 的 GPT 系列、Anthropic 的 Claude 系列等。
嵌入模型 (Embedding Models)：用于将文本转换为向量表示，如 OpenAI 的 text-embedding-ada-002、Cohere 的 embed-multilingual-v3.0 等。
图像生成模型 (Image Generation Models)：用于从文本描述生成图像，如 OpenAI 的 DALL-E 系列、Stability AI 的 Stable Diffusion 等。
音频转录模型 (Audio Transcription Models)：用于将语音转换为文本，如 OpenAI 的 Whisper 模型。
文本到语音模型 (Text-to-Speech Models)：用于将文本转换为语音，如 OpenAI 的 TTS 模型。

配置模型

每个模型提供商都需要特定的配置。以下是一些常见提供商的配置示例：

OpenAI

# application.properties
spring.ai.openai.api-key=your-api-key
spring.ai.openai.chat.options.model=gpt-4
spring.ai.openai.chat.options.temperature=0.7
spring.ai.openai.embedding.options.model=text-embedding-ada-002

Azure OpenAI

# application.properties
spring.ai.azure.openai.api-key=your-api-key
spring.ai.azure.openai.endpoint=https://your-resource-name.openai.azure.com
spring.ai.azure.openai.chat.options.deployment-name=your-gpt-deployment
spring.ai.azure.openai.embedding.options.deployment-name=your-embedding-deployment

Anthropic

# application.properties
spring.ai.anthropic.api-key=your-api-key
spring.ai.anthropic.chat.options.model=claude-3-opus-20240229
spring.ai.anthropic.chat.options.temperature=0.7

Amazon Bedrock

# application.properties
spring.ai.bedrock.region=us-east-1
# 使用 AWS 凭证提供者链，或者显式配置
# spring.ai.bedrock.access-key=your-access-key
# spring.ai.bedrock.secret-key=your-secret-key
spring.ai.bedrock.chat.options.model=anthropic.claude-3-sonnet-20240229-v1:0
spring.ai.bedrock.embedding.options.model=amazon.titan-embed-text-v1

Ollama

# application.properties
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama2
spring.ai.ollama.embedding.options.model=nomic-embed-text

使用不同的模型

Spring AI 的抽象使得使用不同的模型变得简单。只需更改配置并添加相应的依赖，就可以切换到不同的模型提供商。

例如，从 OpenAI 切换到 Anthropic：

更新依赖：

<!-- 移除 OpenAI 依赖 -->
<!--
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
-->

<!-- 添加 Anthropic 依赖 -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic-spring-boot-starter</artifactId>
</dependency>

更新配置：

# 注释掉 OpenAI 配置
# spring.ai.openai.api-key=your-openai-api-key
# spring.ai.openai.chat.options.model=gpt-4

# 添加 Anthropic 配置
spring.ai.anthropic.api-key=your-anthropic-api-key
spring.ai.anthropic.chat.options.model=claude-3-opus-20240229

由于 Spring AI 提供了统一的 ChatClient 和 EmbeddingClient 接口，你的应用代码无需更改，就可以无缝切换到新的模型提供商。

模型选择指南

选择合适的 AI 模型取决于多种因素，包括：

功能需求：不同模型在能力上有差异，如语言理解、代码生成、多语言支持等。
性能要求：模型的响应速度、吞吐量和延迟。
成本考虑：不同提供商和模型的价格差异很大。
数据隐私：某些应用可能需要本地部署模型或使用特定区域的服务。
上下文窗口大小：处理长文本时，模型的上下文窗口（可处理的最大 token 数）很重要。

以下是一些常见场景的模型推荐：

一般对话和内容生成：OpenAI GPT-3.5-turbo、Anthropic Claude Instant、Mistral 7B
复杂推理和高质量输出：OpenAI GPT-4、Anthropic Claude 3 Opus、Google Gemini Pro
代码生成和理解：OpenAI GPT-4、Anthropic Claude 3 Opus、Google Gemini Pro
本地部署（无需互联网）：Ollama 上的 Llama 2、Mistral 或其他开源模型
多语言支持：OpenAI GPT-4、Google Gemini Pro、Cohere Command
成本敏感应用：OpenAI GPT-3.5-turbo、Mistral 7B、Ollama 上的开源模型

模型参数调优

AI 模型的行为可以通过各种参数进行调整。以下是一些常见参数及其影响：

温度 (Temperature)：控制输出的随机性。较高的值（如 0.8）产生更多样化、创造性的响应，较低的值（如 0.2）产生更确定性、一致性的响应。
Top-P (Nucleus Sampling)：另一种控制随机性的方法。模型只考虑累积概率达到 top_p 的 token（例如，0.9 表示只考虑概率最高的占总概率 90% 的 token）。
最大 Token 数 (Max Tokens)：限制模型生成的 token 数量，有助于控制响应长度和成本。
存在惩罚 (Presence Penalty)：降低模型重复已出现内容的倾向。
频率惩罚 (Frequency Penalty)：降低模型重复使用高频词的倾向。

在 Spring AI 中，你可以通过 ChatOptions 设置这些参数：

ChatResponse response = chatClient.prompt()
    .user("Write a creative story about a magical forest")
    .withOptions(options -> options
        .withTemperature(0.8f)  // 较高温度，增加创造性
        .withMaxTokens(1000)    // 限制响应长度
        .withTopP(0.95f)        // 稍微限制词汇选择范围
        .withPresencePenalty(0.1f)  // 轻微减少重复内容
        .withFrequencyPenalty(0.1f) // 轻微减少重复词汇
    )
    .call();

多模型策略

在某些应用中，使用多个模型可能是有益的。例如：

分层策略：使用成本较低的模型处理简单查询，只在必要时升级到更强大但成本更高的模型。
专业化策略：为不同任务使用专门的模型（如代码生成、创意写作、事实查询等）。
回退策略：当首选模型不可用或失败时，自动切换到备用模型。

Spring AI 支持在同一应用中使用多个模型：

@Configuration
public class MultiModelConfig {

    @Bean
    @Qualifier("economyModel")
    public ChatClient gpt35ChatClient(OpenAiApi openAiApi) {
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-3.5-turbo")
            .withTemperature(0.7f)
            .build();

        return new OpenAiChatClient(openAiApi, options);
    }

    @Bean
    @Qualifier("premiumModel")
    public ChatClient gpt4ChatClient(OpenAiApi openAiApi) {
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-4")
            .withTemperature(0.5f)
            .build();

        return new OpenAiChatClient(openAiApi, options);
    }

    @Bean
    @Qualifier("codeModel")
    public ChatClient claudeCodeClient(AnthropicApi anthropicApi) {
        AnthropicChatOptions options = AnthropicChatOptions.builder()
            .withModel("claude-3-opus-20240229")
            .withTemperature(0.2f)  // 低温度，更确定性的代码生成
            .build();

        return new AnthropicChatClient(anthropicApi, options);
    }
}

然后在服务中使用这些模型：

@Service
public class IntelligentAssistantService {

    private final ChatClient economyModel;
    private final ChatClient premiumModel;
    private final ChatClient codeModel;

    @Autowired
    public IntelligentAssistantService(
            @Qualifier("economyModel") ChatClient economyModel,
            @Qualifier("premiumModel") ChatClient premiumModel,
            @Qualifier("codeModel") ChatClient codeModel) {
        this.economyModel = economyModel;
        this.premiumModel = premiumModel;
        this.codeModel = codeModel;
    }

    public String processQuery(String query, boolean isPremiumUser) {
        // 根据查询内容和用户类型选择合适的模型
        if (query.contains("code") || query.contains("programming")) {
            return codeModel.call(query);
        } else if (isPremiumUser || isComplexQuery(query)) {
            return premiumModel.call(query);
        } else {
            return economyModel.call(query);
        }
    }

    private boolean isComplexQuery(String query) {
        // 实现逻辑来判断查询的复杂性
        return query.length() > 100 || query.contains("explain") || query.contains("analyze");
    }
}

通过这种方式，你可以根据不同的需求和约束条件灵活地使用多个 AI 模型，优化性能、成本和用户体验。

Spring AI 的模型抽象使得在应用中集成和切换不同的 AI 模型变得简单而灵活，让开发者能够充分利用各种 AI 模型的优势，同时避免厂商锁定。

6. 多模态 (Multimodality)

多模态是指 AI 系统能够处理和理解包含多种类型信息（模态）的输入，并/或生成包含多种模态的输出。Spring AI 提供了对多模态交互的支持，使开发者能够构建更丰富、更强大的 AI 应用。

多模态概述

传统的 LLM 主要处理文本数据，而多模态 LLM 则可以同时处理文本和其他类型的数据，如图像、音频等。这极大地扩展了 AI 应用的可能性，使其能够：

分析和描述图像内容
回答关于图像的问题
基于图像和文本提示生成内容
处理包含图表、图形、截图等的复杂查询

目前，Spring AI 主要支持文本-图像多模态，即允许开发者向支持多模态的模型（如 OpenAI 的 GPT-4V、Google 的 Gemini、Anthropic 的 Claude 3 等）发送包含图像的提示，并获取基于这些图像的文本响应。

支持多模态的模型

Spring AI 支持多种提供多模态能力的模型，包括：

OpenAI GPT-4 Vision (GPT-4V)：通过 gpt-4-vision-preview 模型提供
Google Gemini：通过 Vertex AI 集成提供
Anthropic Claude 3：所有 Claude 3 变体（Haiku、Sonnet、Opus）都支持视觉输入
Amazon Bedrock 上的多模态模型：如 Claude 3 和 Titan Multimodal

使用多模态功能

在 Spring AI 中使用多模态功能的基本步骤如下：

1. 添加依赖

首先，确保添加了支持多模态的模型提供商的依赖：

<!-- 例如，使用 OpenAI 的多模态功能 -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

2. 配置模型

确保配置了支持多模态的模型：

# 对于 OpenAI GPT-4V
spring.ai.openai.api-key=your-api-key
spring.ai.openai.chat.options.model=gpt-4-vision-preview

3. 创建包含图像的消息

Spring AI 提供了 Media 类来表示非文本内容，如图像。你可以使用 UserMessage 构建包含文本和图像的消息：

import org.springframework.ai.chat.messages.Media;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;

import java.io.IOException;
import java.util.List;

// 从文件加载图像
Resource imageResource = new ClassPathResource("images/chart.png");
byte[] imageData = imageResource.getInputStream().readAllBytes();

// 创建媒体对象
Media imageMedia = Media.of(Media.Type.IMAGE_PNG, imageData);

// 创建包含文本和图像的用户消息
UserMessage userMessage = new UserMessage(
    "Describe what you see in this image and analyze the trend shown in the chart.",
    List.of(imageMedia)
);

// 创建提示并发送给模型
Prompt prompt = new Prompt(userMessage);
ChatResponse response = chatClient.call(prompt);

也可以使用 prompt() builder API：

ChatResponse response = chatClient.prompt()
    .user("Describe what you see in this image and analyze the trend shown in the chart.")
    .media(Media.of(Media.Type.IMAGE_PNG, imageData))
    .call();

4. 处理响应

多模态模型的响应通常是文本形式，你可以像处理普通聊天响应一样处理它：

String description = response.getResult().getOutput().getContent();
System.out.println("Image description: " + description);

多模态应用场景

多模态 AI 有许多实用的应用场景，包括：

图像分析与描述：自动生成图像的详细描述，识别图像中的对象、场景、人物等。

ChatResponse response = chatClient.prompt()
    .user("What's in this image? Provide a detailed description.")
    .media(imageMedia)
    .call();

图表与数据可视化分析：分析图表、图形中的数据趋势和模式。

ChatResponse response = chatClient.prompt()
    .user("Analyze this chart. What trends do you observe? What conclusions can we draw?")
    .media(chartImageMedia)
    .call();

文档理解：分析包含文本和图形的文档，如报告、论文、幻灯片等。

ChatResponse response = chatClient.prompt()
    .user("Summarize the key points from this slide and explain the diagram.")
    .media(slideDeckImageMedia)
    .call();

代码与截图分析：分析代码截图，提供解释、改进建议或错误诊断。

ChatResponse response = chatClient.prompt()
    .user("What does this code do? Are there any bugs or improvements you can suggest?")
    .media(codeScreenshotMedia)
    .call();

视觉问答：回答关于图像内容的具体问题。

ChatResponse response = chatClient.prompt()
    .user("How many people are in this image? What are they doing?")
    .media(sceneImageMedia)
    .call();

多模态最佳实践

使用多模态功能时，请考虑以下最佳实践：

提供清晰的指令：明确告诉模型你希望它关注图像的哪些方面。
控制图像质量和大小：较大的图像可能会消耗更多的 token 和处理时间。根据需要调整图像分辨率和文件大小。
组合多个图像：某些用例可能需要发送多个图像。Spring AI 支持在单个消息中包含多个媒体对象。

UserMessage userMessage = new UserMessage(
    "Compare these two charts and explain the differences in trends.",
    List.of(chart1Media, chart2Media)
);

考虑模型限制：不同的多模态模型有不同的能力和限制。例如，某些模型可能更擅长识别文本，而其他模型可能更擅长分析图表。
处理敏感内容：确保你发送的图像符合模型提供商的使用政策和隐私要求。

实际应用示例：图像分析服务

以下是一个完整的 Spring Boot 控制器示例，展示了如何创建一个图像分析 API 端点：

@RestController
@RequestMapping("/api/image-analysis")
public class ImageAnalysisController {

    private final ChatClient chatClient;

    @Autowired
    public ImageAnalysisController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @PostMapping
    public ResponseEntity<Map<String, String>> analyzeImage(
            @RequestParam("image") MultipartFile imageFile,
            @RequestParam("query") String query) {

        try {
            // 读取上传的图像文件
            byte[] imageData = imageFile.getBytes();

            // 确定媒体类型
            Media.Type mediaType;
            String contentType = imageFile.getContentType();
            if (contentType != null) {
                if (contentType.equals("image/jpeg") || contentType.equals("image/jpg")) {
                    mediaType = Media.Type.IMAGE_JPEG;
                } else if (contentType.equals("image/png")) {
                    mediaType = Media.Type.IMAGE_PNG;
                } else {
                    return ResponseEntity.badRequest().body(Map.of(
                        "error", "Unsupported image format. Please upload JPEG or PNG."
                    ));
                }
            } else {
                // 默认为 PNG
                mediaType = Media.Type.IMAGE_PNG;
            }

            // 创建媒体对象
            Media imageMedia = Media.of(mediaType, imageData);

            // 发送带有图像的提示
            ChatResponse response = chatClient.prompt()
                .user(query)
                .media(imageMedia)
                .withOptions(options -> options
                    .withTemperature(0.5f)
                    .withMaxTokens(500))
                .call();

            String analysis = response.getResult().getOutput().getContent();

            return ResponseEntity.ok(Map.of(
                "query", query,
                "analysis", analysis,
                "timestamp", LocalDateTime.now().toString()
            ));

        } catch (IOException e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", "Failed to process image: " + e.getMessage()));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }
}

这个控制器允许用户上传图像文件并提供查询文本，然后使用多模态 AI 模型分析图像并返回结果。

随着多模态 AI 技术的不断发展，Spring AI 将继续增强对多模态功能的支持，使开发者能够构建更加智能和直观的应用程序。

7. 向量数据库 (Vector Databases)

向量数据库是专门设计用于存储、管理和高效检索嵌入向量的数据库系统。在 AI 应用中，特别是在实现检索增强生成（RAG）时，向量数据库扮演着至关重要的角色。Spring AI 提供了对多种向量数据库的支持，使开发者能够轻松地将向量存储集成到他们的应用中。

向量数据库概述

传统的关系型数据库主要处理结构化数据，而向量数据库则专注于高维向量的存储和相似性搜索。向量数据库的核心功能包括：

向量存储：高效存储大量高维向量（通常是浮点数数组）。
相似性搜索：快速查找与给定查询向量最相似的向量（通常使用余弦相似度、欧几里得距离等度量）。
元数据过滤：基于向量相关的元数据进行过滤和查询。
索引优化：使用特殊的索引结构（如 HNSW、IVF 等）加速相似性搜索。

在 AI 应用中，向量数据库通常用于存储文本、图像或其他数据的嵌入表示，以便进行语义搜索、推荐、分类等任务。

Spring AI 支持的向量数据库

Spring AI 支持多种向量数据库，包括：

Chroma：轻量级、开源的向量数据库
Milvus：高性能、分布式的向量数据库
Neo4j：图数据库，支持向量搜索
PostgreSQL/PGVector：在 PostgreSQL 中使用 pgvector 扩展进行向量操作
Redis：使用 Redis Stack 进行向量搜索
Weaviate：开源的向量搜索引擎
Pinecone：专为向量搜索设计的托管服务
Qdrant：开源的向量相似性搜索引擎
Azure Cosmos DB：微软的多模型数据库，支持向量搜索
Azure Vector Search：微软的向量搜索服务
Elasticsearch：全文搜索引擎，支持向量搜索
OpenSearch：Elasticsearch 的开源分支，支持向量搜索
MongoDB Atlas：MongoDB 的托管服务，支持向量搜索
Apache Cassandra：分布式 NoSQL 数据库，支持向量操作
GemFire：内存数据网格，支持向量搜索
MariaDB：关系型数据库，支持向量操作
Oracle：企业级关系型数据库，支持向量操作
SAP Hana：内存数据库，支持向量操作
Typesense：开源搜索引擎，支持向量搜索

向量存储抽象

Spring AI 提供了 VectorStore 接口作为向量数据库操作的抽象。这个接口定义了以下核心方法：

add(List<Document> documents)：将文档添加到向量存储中
search(String query, int k)：搜索与查询最相似的 k 个文档
search(String query, int k, double threshold)：搜索相似度超过阈值的文档
search(String query, int k, String filter)：使用元数据过滤条件进行搜索
delete(List<String> ids)：删除指定 ID 的文档

这种抽象使得应用代码可以独立于具体的向量数据库实现，提高了代码的可移植性和可维护性。

配置向量数据库

以下是一些常见向量数据库的配置示例：

Chroma

# application.properties
spring.ai.vectorstore.chroma.host=localhost
spring.ai.vectorstore.chroma.port=8000
spring.ai.vectorstore.chroma.collection-name=my_collection

PostgreSQL/PGVector

# application.properties
spring.ai.vectorstore.pgvector.host=localhost
spring.ai.vectorstore.pgvector.port=5432
spring.ai.vectorstore.pgvector.database=postgres
spring.ai.vectorstore.pgvector.username=postgres
spring.ai.vectorstore.pgvector.password=password
spring.ai.vectorstore.pgvector.table=embeddings

Redis

# application.properties
spring.ai.vectorstore.redis.host=localhost
spring.ai.vectorstore.redis.port=6379
spring.ai.vectorstore.redis.index-name=my_index

使用向量存储

以下是使用 Spring AI 的 VectorStore 接口的基本示例：

@Service
public class DocumentSearchService {

    private final VectorStore vectorStore;
    private final EmbeddingClient embeddingClient;

    @Autowired
    public DocumentSearchService(VectorStore vectorStore, EmbeddingClient embeddingClient) {
        this.vectorStore = vectorStore;
        this.embeddingClient = embeddingClient;
    }

    // 添加文档到向量存储
    public void addDocuments(List<String> contents, List<Map<String, Object>> metadata) {
        List<Document> documents = new ArrayList<>();

        for (int i = 0; i < contents.size(); i++) {
            Document document = new Document(
                contents.get(i),
                metadata.get(i)
            );
            documents.add(document);
        }

        vectorStore.add(documents);
    }

    // 搜索相似文档
    public List<Document> searchSimilarDocuments(String query, int k) {
        return vectorStore.search(query, k);
    }

    // 使用元数据过滤进行搜索
    public List<Document> searchWithFilter(String query, int k, String category) {
        String filter = "metadata.category == '" + category + "'";
        return vectorStore.search(query, k, filter);
    }

    // 删除文档
    public void deleteDocuments(List<String> documentIds) {
        vectorStore.delete(documentIds);
    }
}

元数据过滤

Spring AI 提供了一种类 SQL 的元数据过滤 API，使你能够基于文档的元数据属性进行过滤。这在需要结合语义相似性和结构化条件进行搜索时非常有用。

过滤表达式的基本语法如下：

metadata.{field} {operator} {value}

支持的操作符包括：

比较操作符：==, !=, >, >=, <, <=
逻辑操作符：AND, OR, NOT
包含操作符：IN

例如：

// 搜索类别为"技术"且发布日期在2023年之后的文档
String filter = "metadata.category == 'technology' AND metadata.publishDate > '2023-01-01'";
List<Document> results = vectorStore.search("artificial intelligence", 5, filter);

// 搜索作者在指定列表中的文档
String filter = "metadata.author IN ('John Doe', 'Jane Smith')";
List<Document> results = vectorStore.search("climate change", 10, filter);

自定义向量存储实现

如果 Spring AI 尚未支持你需要的向量数据库，你可以通过实现 VectorStore 接口创建自定义实现：

@Component
public class CustomVectorStore implements VectorStore {

    private final EmbeddingClient embeddingClient;
    private final CustomVectorDatabase database;

    public CustomVectorStore(EmbeddingClient embeddingClient, CustomVectorDatabase database) {
        this.embeddingClient = embeddingClient;
        this.database = database;
    }

    @Override
    public void add(List<Document> documents) {
        // 为每个文档生成嵌入
        List<List<Double>> embeddings = embeddingClient.embed(
            documents.stream().map(Document::getContent).collect(Collectors.toList())
        ).getValues();

        // 将文档和嵌入存储到自定义数据库
        for (int i = 0; i < documents.size(); i++) {
            Document doc = documents.get(i);
            List<Double> embedding = embeddings.get(i);
            database.storeDocument(doc.getId(), doc.getContent(), embedding, doc.getMetadata());
        }
    }

    @Override
    public List<Document> search(String query, int k) {
        // 为查询生成嵌入
        List<Double> queryEmbedding = embeddingClient.embed(query).getValues().get(0);

        // 在自定义数据库中搜索最相似的文档
        List<CustomVectorDatabase.SearchResult> results = database.searchSimilar(queryEmbedding, k);

        // 将结果转换为 Document 对象
        return results.stream()
            .map(result -> new Document(
                result.getId(),
                result.getContent(),
                result.getMetadata(),
                Map.of("score", result.getScore())
            ))
            .collect(Collectors.toList());
    }

    // 实现其他必要的方法...
}

向量存储最佳实践

使用向量数据库时，请考虑以下最佳实践：

选择合适的向量数据库：根据你的需求（如数据量、查询性能、部署环境等）选择合适的向量数据库。
优化嵌入维度：较高的维度可能提供更好的语义表示，但也会增加存储和计算成本。根据你的用例选择合适的嵌入模型和维度。
批量操作：尽可能批量添加文档，而不是一次添加一个，以提高性能。
索引调优：了解并调整向量数据库的索引参数（如 HNSW 的 M 和 ef_construction 参数），以平衡搜索速度和准确性。
元数据设计：精心设计文档的元数据结构，以支持有效的过滤和分类。
监控和扩展：监控向量数据库的性能和资源使用情况，并根据需要进行扩展。

实际应用示例：知识库搜索

以下是一个使用 Spring AI 和向量数据库实现知识库语义搜索的完整示例：

@RestController
@RequestMapping("/api/knowledge-base")
public class KnowledgeBaseController {

    private final VectorStore vectorStore;
    private final EmbeddingClient embeddingClient;

    @Autowired
    public KnowledgeBaseController(VectorStore vectorStore, EmbeddingClient embeddingClient) {
        this.vectorStore = vectorStore;
        this.embeddingClient = embeddingClient;
    }

    @PostMapping("/documents")
    public ResponseEntity<Map<String, Object>> addDocument(@RequestBody DocumentRequest request) {
        try {
            // 创建文档对象
            Document document = new Document(
                request.getContent(),
                Map.of(
                    "title", request.getTitle(),
                    "category", request.getCategory(),
                    "author", request.getAuthor(),
                    "date", request.getDate()
                )
            );

            // 添加到向量存储
            vectorStore.add(List.of(document));

            return ResponseEntity.ok(Map.of(
                "message", "Document added successfully",
                "documentId", document.getId()
            ));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    @GetMapping("/search")
    public ResponseEntity<List<Map<String, Object>>> searchDocuments(
            @RequestParam String query,
            @RequestParam(defaultValue = "5") int limit,
            @RequestParam(required = false) String category) {

        try {
            List<Document> results;

            // 如果提供了类别，使用元数据过滤
            if (category != null && !category.isEmpty()) {
                String filter = "metadata.category == '" + category + "'";
                results = vectorStore.search(query, limit, filter);
            } else {
                results = vectorStore.search(query, limit);
            }

            // 转换结果为响应格式
            List<Map<String, Object>> response = results.stream()
                .map(doc -> {
                    Map<String, Object> item = new HashMap<>();
                    item.put("id", doc.getId());
                    item.put("content", doc.getContent());
                    item.put("metadata", doc.getMetadata());
                    item.put("score", doc.getScores().get("score"));
                    return item;
                })
                .collect(Collectors.toList());

            return ResponseEntity.ok(response);
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(List.of(Map.of("error", e.getMessage())));
        }
    }

    // 请求对象
    public static class DocumentRequest {
        private String title;
        private String content;
        private String category;
        private String author;
        private String date;

        // Getters and setters
        // ...
    }
}

这个控制器提供了两个端点：一个用于添加文档到知识库，另一个用于基于语义相似性搜索文档，并支持按类别过滤。

向量数据库是实现高效语义搜索和 RAG 应用的关键组件。Spring AI 的向量存储抽象使得在 Java 应用中集成和使用向量数据库变得简单而灵活，让开发者能够专注于构建智能应用，而不必担心底层实现的细节。

8. 检索增强生成 (RAG)

检索增强生成（Retrieval Augmented Generation，简称 RAG）是一种结合信息检索与生成式 AI 的技术范式，旨在提高大型语言模型（LLM）生成内容的准确性、相关性和可靠性。Spring AI 提供了全面的 RAG 支持，使开发者能够轻松构建强大的 RAG 应用。

RAG 概述

RAG 的核心思想是在 LLM 生成响应之前，先从外部知识源（如文档、数据库、API 等）检索相关信息，然后将这些检索到的信息作为上下文提供给 LLM，引导其生成更准确、更具体、更基于事实的回答。

RAG 的主要优势包括：

减少幻觉：通过提供相关的外部知识，帮助 LLM 基于真实数据回答问题，减少编造或不准确信息的可能性。
访问最新信息：允许 LLM 利用训练数据截止日期之后的最新信息。
利用领域特定知识：使 LLM 能够回答关于特定领域或组织的私有数据的问题，而无需对模型进行微调。
提高透明度和可解释性：可以追溯信息来源，增强答案的可信度。
成本效益：比完全微调模型更经济高效。

RAG 工作流程

典型的 RAG 工作流程包括以下步骤：

数据摄取：收集和准备知识源（如文档、网页、数据库等）。
文档处理：将知识源转换为可处理的格式，并进行清洗和规范化。
文档分块：将长文档分割成较小的、语义连贯的块。
嵌入生成：为每个文档块生成向量嵌入表示。
向量存储：将文档块及其嵌入存储在向量数据库中。
查询处理：当用户提出问题时，为查询生成嵌入。
相似性搜索：在向量数据库中查找与查询最相似的文档块。
上下文增强：将检索到的相关文档块作为上下文添加到提示中。
生成回答：LLM 基于增强的上下文生成最终回答。

Spring AI 中的 RAG 组件

Spring AI 提供了一套完整的组件来支持 RAG 工作流的每个步骤：

1. 文档读取器 (DocumentReader)

DocumentReader 接口用于从各种来源加载文档。Spring AI 提供了多种实现，包括：

TextDocumentReader：从纯文本文件读取
PdfDocumentReader：从 PDF 文件读取
WordDocumentReader：从 Word 文档读取
UrlDocumentReader：从网页读取
JsonDocumentReader：从 JSON 文件读取
CsvDocumentReader：从 CSV 文件读取

示例：

// 从 PDF 文件读取文档
DocumentReader pdfReader = new PdfDocumentReader();
List<Document> documents = pdfReader.read("path/to/document.pdf");

// 从网页读取文档
DocumentReader urlReader = new UrlDocumentReader();
List<Document> webDocuments = urlReader.read("https://example.com/article");

2. 文档转换器 (DocumentTransformer)

DocumentTransformer 接口用于处理和转换文档。主要实现包括：

TextSplitter：将文档分割成较小的块
- TokenTextSplitter：基于 token 数量分割文本
- RecursiveCharacterTextSplitter：递归地按字符分割文本
- SentenceTextSplitter：按句子分割文本
- ParagraphTextSplitter：按段落分割文本

示例：

// 创建一个文本分割器，设置块大小和重叠
TextSplitter splitter = new RecursiveCharacterTextSplitter()
    .withChunkSize(1000)
    .withChunkOverlap(200);

// 分割文档
List<Document> chunks = splitter.apply(documents);

3. 嵌入客户端 (EmbeddingClient)

EmbeddingClient 接口用于生成文本的向量嵌入。Spring AI 支持多种嵌入模型，如 OpenAI、Azure OpenAI、Ollama 等。

示例：

@Autowired
private EmbeddingClient embeddingClient;

// 为文本生成嵌入
EmbeddingResponse response = embeddingClient.embed("这是一段示例文本");
List<Double> embedding = response.getValues().get(0);

4. 向量存储 (VectorStore)

VectorStore 接口用于存储和检索文档嵌入。Spring AI 支持多种向量数据库，如 Chroma、Milvus、Neo4j、PGVector 等。

示例：

@Autowired
private VectorStore vectorStore;
@Autowired
private EmbeddingClient embeddingClient;

// 将文档添加到向量存储
vectorStore.add(chunks);

// 搜索相似文档
List<Document> relevantDocs = vectorStore.search("如何使用 Spring AI?", 5);

5. 检索增强提示 (Retrieval Augmented Prompts)

Spring AI 提供了 PromptTemplate 和相关工具，用于将检索到的文档整合到提示中。

示例：

// 定义系统提示模板
String systemPromptTemplate = """
    你是一个有用的助手。使用以下上下文来回答用户的问题。
    如果你不知道答案，就说你不知道，不要试图编造答案。

    上下文:
    {context}
    """;

// 搜索相关文档
List<Document> relevantDocs = vectorStore.search(userQuestion, 3);

// 提取文档内容并合并
String context = relevantDocs.stream()
    .map(Document::getContent)
    .collect(Collectors.joining("\n\n"));

// 创建系统提示
SystemPromptTemplate systemPrompt = new SystemPromptTemplate(systemPromptTemplate);
Message systemMessage = systemPrompt.create(Map.of("context", context));

// 创建用户消息
UserMessage userMessage = new UserMessage(userQuestion);

// 创建提示并发送给模型
Prompt prompt = new Prompt(List.of(systemMessage, userMessage));
ChatResponse response = chatClient.call(prompt);

RAG 助手 (RAG Assistant)

为了简化 RAG 应用的构建，Spring AI 提供了 RagAssistant 类，它封装了整个 RAG 工作流程：

@Service
public class DocumentAssistantService {

    private final RagAssistant ragAssistant;

    @Autowired
    public DocumentAssistantService(
            ChatClient chatClient,
            VectorStore vectorStore) {

        // 创建 RAG 助手
        this.ragAssistant = new RagAssistant(chatClient, vectorStore)
            .withSystemPrompt("""
                你是一个专业的文档助手。使用提供的上下文信息回答用户的问题。
                如果上下文中没有足够的信息，请明确说明你无法回答，不要编造信息。
                """)
            .withSimilarityThreshold(0.7)  // 设置相似度阈值
            .withTopK(3);  // 检索前 3 个最相关的文档
    }

    // 回答关于文档的问题
    public String answerQuestion(String question) {
        return ragAssistant.generate(question);
    }

    // 添加文档到知识库
    public void addDocumentsToKnowledgeBase(List<Document> documents) {
        ragAssistant.addDocuments(documents);
    }
}

使用 RagAssistant，你可以轻松地：

添加文档到知识库
基于知识库回答问题
自定义系统提示和检索参数
处理流式响应

高级 RAG 技术

Spring AI 支持多种高级 RAG 技术，以进一步提高检索和生成的质量：

1. 查询转换 (Query Transformation)

有时用户的原始查询可能不是检索相关文档的最佳形式。查询转换使用 LLM 将原始查询转换为更适合检索的形式：

@Service
public class QueryTransformationService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    @Autowired
    public QueryTransformationService(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }

    public String answerWithQueryTransformation(String originalQuery) {
        // 转换查询
        String transformedQuery = transformQuery(originalQuery);

        // 使用转换后的查询检索文档
        List<Document> relevantDocs = vectorStore.search(transformedQuery, 3);

        // 使用检索到的文档和原始查询生成回答
        return generateAnswer(originalQuery, relevantDocs);
    }

    private String transformQuery(String originalQuery) {
        String promptTemplate = """
            你是一个查询优化专家。你的任务是将用户的原始查询转换为更适合文档检索的形式。
            保留所有重要的关键词，但可以添加同义词或相关术语以提高检索效果。

            原始查询: {query}

            优化后的查询:
            """;

        ChatResponse response = chatClient.prompt()
            .system(promptTemplate.replace("{query}", originalQuery))
            .call();

        return response.getResult().getOutput().getContent();
    }

    private String generateAnswer(String originalQuery, List<Document> documents) {
        // 实现使用原始查询和检索文档生成回答的逻辑
        // ...
    }
}

2. 多查询检索 (Multi-Query Retrieval)

为了提高检索覆盖面，可以从原始查询生成多个不同的查询变体，然后合并检索结果：

@Service
public class MultiQueryRetrievalService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    @Autowired
    public MultiQueryRetrievalService(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }

    public String answerWithMultiQueryRetrieval(String originalQuery) {
        // 生成多个查询变体
        List<String> queryVariants = generateQueryVariants(originalQuery, 3);

        // 使用所有查询变体检索文档
        Set<Document> allRelevantDocs = new HashSet<>();
        for (String query : queryVariants) {
            List<Document> docs = vectorStore.search(query, 2);
            allRelevantDocs.addAll(docs);
        }

        // 使用检索到的文档和原始查询生成回答
        return generateAnswer(originalQuery, new ArrayList<>(allRelevantDocs));
    }

    private List<String> generateQueryVariants(String originalQuery, int numVariants) {
        String promptTemplate = """
            你是一个查询变体生成专家。你的任务是从用户的原始查询生成 {numVariants} 个不同的查询变体。
            这些变体应该表达相同的信息需求，但使用不同的词汇、句式或角度。

            原始查询: {query}

            生成 {numVariants} 个查询变体，每行一个:
            """;

        String prompt = promptTemplate
            .replace("{query}", originalQuery)
            .replace("{numVariants}", String.valueOf(numVariants));

        ChatResponse response = chatClient.prompt()
            .system(prompt)
            .call();

        String content = response.getResult().getOutput().getContent();
        return Arrays.stream(content.split("\n"))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toList());
    }

    private String generateAnswer(String originalQuery, List<Document> documents) {
        // 实现使用原始查询和检索文档生成回答的逻辑
        // ...
    }
}

3. 上下文压缩 (Context Compression)

当检索到的文档过多或过长时，可能超出 LLM 的上下文窗口限制。上下文压缩使用 LLM 提取和压缩最相关的信息：

@Service
public class ContextCompressionService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    @Autowired
    public ContextCompressionService(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }

    public String answerWithContextCompression(String query) {
        // 检索相关文档
        List<Document> relevantDocs = vectorStore.search(query, 5);

        // 压缩上下文
        String compressedContext = compressContext(query, relevantDocs);

        // 使用压缩后的上下文生成回答
        return generateAnswer(query, compressedContext);
    }

    private String compressContext(String query, List<Document> documents) {
        String allContent = documents.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n\n"));

        String promptTemplate = """
            你是一个信息提取专家。你的任务是从以下文档中提取与用户查询最相关的信息。
            保留所有重要的事实和细节，但删除不相关或冗余的内容。最终压缩后的内容应该不超过 1000 个单词。

            用户查询: {query}

            文档内容:
            {content}

            提取的相关信息:
            """;

        String prompt = promptTemplate
            .replace("{query}", query)
            .replace("{content}", allContent);

        ChatResponse response = chatClient.prompt()
            .system(prompt)
            .call();

        return response.getResult().getOutput().getContent();
    }

    private String generateAnswer(String query, String compressedContext) {
        String promptTemplate = """
            你是一个有用的助手。使用以下压缩后的上下文信息回答用户的问题。
            如果上下文中没有足够的信息，请明确说明你无法回答。

            上下文:
            {context}

            用户问题: {query}
            """;

        String prompt = promptTemplate
            .replace("{context}", compressedContext)
            .replace("{query}", query);

        ChatResponse response = chatClient.prompt()
            .system(prompt)
            .call();

        return response.getResult().getOutput().getContent();
    }
}

RAG 最佳实践

构建高效的 RAG 应用时，请考虑以下最佳实践：

文档分块策略：
- 选择合适的分块大小（通常在 500-1500 个 token 之间）
- 使用适当的重叠（通常为块大小的 10-20%）以保持上下文连贯性
- 考虑语义边界（如段落、章节）而不仅仅是固定大小
嵌入选择：
- 选择高质量的嵌入模型（如 OpenAI 的 text-embedding-ada-002 或更新的模型）
- 对于多语言应用，使用支持多语言的嵌入模型
检索策略：
- 调整检索的文档数量（通常 3-5 个文档块是一个好的起点）
- 考虑设置相似度阈值，只检索真正相关的文档
- 实验不同的相似度度量（如余弦相似度、欧几里得距离）
提示工程：
- 设计清晰的系统提示，指导 LLM 如何使用检索到的上下文
- 明确指示 LLM 在上下文中没有足够信息时应该如何回应
- 考虑在提示中包含文档的元数据（如标题、来源、日期）
评估与改进：
- 定期评估 RAG 系统的性能（准确性、相关性、完整性）
- 收集用户反馈并持续改进
- 考虑实现 A/B 测试来比较不同的 RAG 策略

实际应用示例：文档问答系统

以下是一个完整的 Spring Boot 应用示例，展示了如何使用 Spring AI 构建文档问答系统：

@SpringBootApplication
public class DocumentQaApplication {

    public static void main(String[] args) {
        SpringApplication.run(DocumentQaApplication.class, args);
    }

    @Bean
    public CommandLineRunner loadDocuments(DocumentService documentService) {
        return args -> {
            // 在应用启动时加载文档
            documentService.loadDocumentsFromDirectory("data/documents");
            System.out.println("Documents loaded successfully!");
        };
    }
}

@Service
public class DocumentService {

    private final VectorStore vectorStore;
    private final EmbeddingClient embeddingClient;

    @Autowired
    public DocumentService(VectorStore vectorStore, EmbeddingClient embeddingClient) {
        this.vectorStore = vectorStore;
        this.embeddingClient = embeddingClient;
    }

    public void loadDocumentsFromDirectory(String directoryPath) throws IOException {
        Path directory = Paths.get(directoryPath);
        List<Document> allDocuments = new ArrayList<>();

        // 创建文档读取器
        DocumentReader pdfReader = new PdfDocumentReader();
        DocumentReader textReader = new TextDocumentReader();
        DocumentReader wordReader = new WordDocumentReader();

        // 创建文本分割器
        TextSplitter splitter = new RecursiveCharacterTextSplitter()
            .withChunkSize(1000)
            .withChunkOverlap(200);

        // 遍历目录中的所有文件
        Files.walk(directory)
            .filter(Files::isRegularFile)
            .forEach(file -> {
                try {
                    String fileName = file.toString();
                    List<Document> documents = null;

                    // 根据文件类型选择合适的读取器
                    if (fileName.endsWith(".pdf")) {
                        documents = pdfReader.read(fileName);
                    } else if (fileName.endsWith(".txt")) {
                        documents = textReader.read(fileName);
                    } else if (fileName.endsWith(".docx") || fileName.endsWith(".doc")) {
                        documents = wordReader.read(fileName);
                    }

                    if (documents != null) {
                        // 为每个文档添加元数据
                        documents.forEach(doc -> {
                            doc.getMetadata().put("source", fileName);
                            doc.getMetadata().put("filename", file.getFileName().toString());
                        });

                        allDocuments.addAll(documents);
                    }
                } catch (Exception e) {
                    System.err.println("Error processing file " + file + ": " + e.getMessage());
                }
            });

        // 分割文档
        List<Document> chunks = splitter.apply(allDocuments);

        // 将分割后的文档添加到向量存储
        vectorStore.add(chunks);
    }
}

@Service
public class QaService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    @Autowired
    public QaService(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }

    public String answerQuestion(String question) {
        // 检索相关文档
        List<Document> relevantDocs = vectorStore.search(question, 3);

        // 如果没有找到相关文档
        if (relevantDocs.isEmpty()) {
            return "抱歉，我没有找到与您问题相关的信息。";
        }

        // 提取文档内容和来源
        StringBuilder contextBuilder = new StringBuilder();
        for (Document doc : relevantDocs) {
            contextBuilder.append("内容: ").append(doc.getContent()).append("\n");
            contextBuilder.append("来源: ").append(doc.getMetadata().get("filename")).append("\n\n");
        }

        String context = contextBuilder.toString();

        // 创建系统提示
        String systemPromptTemplate = """
            你是一个专业的文档助手。使用以下上下文信息回答用户的问题。
            如果上下文中没有足够的信息，请明确说明你无法回答，不要编造信息。
            在回答的最后，如果使用了上下文中的信息，请注明信息来源。

            上下文信息:
            {context}
            """;

        // 生成回答
        ChatResponse response = chatClient.prompt()
            .system(systemPromptTemplate.replace("{context}", context))
            .user(question)
            .call();

        return response.getResult().getOutput().getContent();
    }
}

@RestController
@RequestMapping("/api/qa")
public class QaController {

    private final QaService qaService;

    @Autowired
    public QaController(QaService qaService) {
        this.qaService = qaService;
    }

    @PostMapping
    public ResponseEntity<Map<String, String>> answerQuestion(@RequestBody QuestionRequest request) {
        try {
            String answer = qaService.answerQuestion(request.getQuestion());

            return ResponseEntity.ok(Map.of(
                "question", request.getQuestion(),
                "answer", answer
            ));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求对象
    public static class QuestionRequest {
        private String question;

        // Getters and setters
        public String getQuestion() {
            return question;
        }

        public void setQuestion(String question) {
            this.question = question;
        }
    }
}

这个示例展示了一个完整的文档问答系统，它可以：

从目录中加载不同类型的文档（PDF、文本、Word）
将文档分割成较小的块
将文档块存储在向量数据库中
检索与用户问题最相关的文档
使用检索到的文档作为上下文，生成准确的回答
通过 REST API 提供问答服务

RAG 是 Spring AI 中一个强大的功能，它使开发者能够构建智能应用，将 LLM 的生成能力与外部知识源结合起来，提供更准确、更可靠的 AI 体验。通过 Spring AI 提供的抽象和工具，实现 RAG 应用变得简单而灵活。

9. 结构化输出 (Structured Output)

结构化输出是 Spring AI 的一个强大功能，它允许开发者从 AI 模型获取结构化的数据（如 Java 对象），而不仅仅是自由格式的文本。这对于需要将 AI 响应集成到应用程序逻辑中的场景尤为重要，因为结构化数据更易于程序处理和操作。

结构化输出的挑战

传统上，AI 模型的输出是纯文本字符串。即使你要求模型以 JSON 格式返回数据，它返回的仍然只是一个格式化的 JSON 字符串，而不是实际的数据结构。这带来了几个挑战：

解析复杂性：需要手动解析文本输出并转换为应用程序可用的数据结构。
格式不一致：模型可能不总是严格遵循请求的输出格式。
错误处理：解析错误需要额外的错误处理逻辑。
类型安全：缺乏类型安全，可能导致运行时错误。

Spring AI 的结构化输出功能解决了这些挑战，提供了一种优雅的方式来将 AI 模型的文本输出直接映射到 Java 对象。

使用 OutputParser

Spring AI 提供了 OutputParser 接口，用于将模型的文本输出转换为特定类型的对象。框架内置了几种实现，包括：

BeanOutputParser：将输出解析为 Java Bean
JsonOutputParser：将输出解析为 JSON 对象
ListOutputParser：将输出解析为列表

BeanOutputParser

BeanOutputParser 是最常用的解析器，它可以将模型输出解析为 Java Bean（POJO）。使用方法如下：

// 定义一个 POJO 类
public record Person(String name, int age, List<String> hobbies) {}

// 创建解析器
BeanOutputParser<Person> parser = new BeanOutputParser<>(Person.class);

// 获取格式说明
String format = parser.getFormat();

// 创建提示
Prompt prompt = new Prompt("""
    提取以下文本中的人物信息，并以指定格式返回。

    文本：小明今年25岁，喜欢游泳、阅读和旅行。

    %s
    """.formatted(format));

// 调用模型并解析输出
Person person = chatClient.call(prompt, parser);

System.out.println(person.name());  // 输出：小明
System.out.println(person.age());   // 输出：25
System.out.println(person.hobbies());  // 输出：[游泳, 阅读, 旅行]

BeanOutputParser 会自动生成一个格式说明，告诉 AI 模型应该如何构造输出。例如，上面的 format 变量可能包含类似以下的内容：

以 JSON 格式返回结果，包含以下字段：
name (String): 人物的名字
age (int): 人物的年龄
hobbies (List<String>): 人物的爱好列表

使用 @BeanOutputConverter 注解

Spring AI 还提供了更简洁的方式来使用结构化输出，通过 @BeanOutputConverter 注解：

// 使用注解定义输出格式
@BeanOutputConverter
public record WeatherInfo(
    @Description("城市名称") String city,
    @Description("当前温度（摄氏度）") double temperature,
    @Description("天气状况，如晴、多云、雨等") String condition,
    @Description("湿度百分比") int humidity
) {}

// 在服务中使用
@Service
public class WeatherService {

    private final ChatClient chatClient;

    @Autowired
    public WeatherService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public WeatherInfo getWeatherInfo(String location) {
        return chatClient.prompt()
            .system("你是一个天气信息助手。请提供指定城市的当前天气信息。")
            .user("请提供" + location + "的天气信息")
            .callAndConvertTo(WeatherInfo.class);
    }
}

@Description 注解用于为每个字段提供描述，帮助 AI 模型理解应该提取什么信息。

处理复杂结构

Spring AI 的结构化输出功能可以处理各种复杂的数据结构，包括嵌套对象、集合和枚举：

// 嵌套对象
@BeanOutputConverter
public record Product(
    String name,
    String description,
    double price,
    Category category,
    List<Review> reviews
) {}

public record Category(String name, String description) {}

public record Review(String author, int rating, String comment) {}

// 使用枚举
@BeanOutputConverter
public record TaskStatus(
    String title,
    Priority priority,
    Status status,
    LocalDate dueDate
) {}

public enum Priority { LOW, MEDIUM, HIGH, URGENT }

public enum Status { TODO, IN_PROGRESS, REVIEW, DONE }

自定义输出解析器

如果内置的解析器不满足需求，你可以实现自己的 OutputParser：

public class CustomOutputParser implements OutputParser<MyCustomType> {

    @Override
    public MyCustomType parse(String output) {
        // 实现自定义解析逻辑
        return parseOutput(output);
    }

    @Override
    public String getFormat() {
        // 返回格式说明
        return "请以以下格式返回结果：...";
    }

    private MyCustomType parseOutput(String output) {
        // 解析逻辑
        // ...
    }
}

结构化输出最佳实践

使用结构化输出时，请考虑以下最佳实践：

保持简单：尽量使用简单、明确的数据结构。复杂的嵌套结构可能增加解析错误的风险。
提供清晰的描述：使用 @Description 注解为每个字段提供详细的描述，帮助 AI 模型理解应该提取什么信息。
设置合理的默认值：对于可能缺失的字段，考虑在 POJO 中设置默认值。
处理解析错误：实现适当的错误处理逻辑，以应对解析失败的情况。
考虑字段类型：使用适当的数据类型。例如，对于日期，可以使用 LocalDate 或 LocalDateTime。
测试边缘情况：测试各种输入场景，包括边缘情况和异常情况。

实际应用示例：产品信息提取

以下是一个完整的示例，展示了如何使用结构化输出从文本中提取产品信息：

@RestController
@RequestMapping("/api/products")
public class ProductExtractionController {

    private final ChatClient chatClient;

    @Autowired
    public ProductExtractionController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @PostMapping("/extract")
    public ResponseEntity<ProductInfo> extractProductInfo(@RequestBody TextRequest request) {
        try {
            ProductInfo productInfo = chatClient.prompt()
                .system("""
                    你是一个产品信息提取专家。你的任务是从提供的文本中提取产品信息，
                    包括产品名称、描述、价格、特点和规格。如果某些信息在文本中不存在，
                    请将相应字段设置为null或空列表。
                    """)
                .user(request.getText())
                .callAndConvertTo(ProductInfo.class);

            return ResponseEntity.ok(productInfo);
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(null);
        }
    }

    // 请求对象
    public static class TextRequest {
        private String text;

        // Getters and setters
        public String getText() {
            return text;
        }

        public void setText(String text) {
            this.text = text;
        }
    }

    // 产品信息结构
    @BeanOutputConverter
    public record ProductInfo(
        @Description("产品的完整名称") String name,
        @Description("产品的详细描述") String description,
        @Description("产品价格（数字）") Double price,
        @Description("产品的主要特点列表") List<String> features,
        @Description("产品的技术规格，键值对形式") Map<String, String> specifications
    ) {}
}

使用这个 API，你可以发送包含产品信息的文本，并获取结构化的产品数据：

// 请求
POST /api/products/extract
{
  "text": "新款 MacBook Pro 13 英寸，搭载 Apple M2 芯片，提供卓越性能和长达 20 小时的电池续航。8GB 统一内存，256GB SSD 存储，Retina 显示屏，售价 9,999 元。特点包括：Touch Bar、背光键盘、Force Touch 触控板。"
}

// 响应
{
  "name": "MacBook Pro 13 英寸",
  "description": "新款 MacBook Pro 13 英寸，搭载 Apple M2 芯片，提供卓越性能和长达 20 小时的电池续航。",
  "price": 9999.0,
  "features": [
    "Touch Bar",
    "背光键盘",
    "Force Touch 触控板",
    "20 小时电池续航"
  ],
  "specifications": {
    "处理器": "Apple M2 芯片",
    "内存": "8GB 统一内存",
    "存储": "256GB SSD",
    "显示屏": "Retina 显示屏"
  }
}

结构化输出与其他功能的结合

结构化输出可以与 Spring AI 的其他功能结合使用，创建更强大的应用：

与 RAG 结合

@Service
public class StructuredRagService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    @Autowired
    public StructuredRagService(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }

    public ProductComparison compareProducts(String productA, String productB) {
        // 检索产品信息
        List<Document> docsA = vectorStore.search("product " + productA, 3);
        List<Document> docsB = vectorStore.search("product " + productB, 3);

        String contextA = docsA.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n\n"));

        String contextB = docsB.stream()
            .map(Document::getContent)
            .collect(Collectors.joining("\n\n"));

        // 使用检索到的信息生成结构化比较
        return chatClient.prompt()
            .system("""
                你是一个产品比较专家。使用提供的产品信息，创建一个详细的产品比较分析。

                产品A信息:
                {contextA}

                产品B信息:
                {contextB}
                """.replace("{contextA}", contextA).replace("{contextB}", contextB))
            .user("比较" + productA + "和" + productB)
            .callAndConvertTo(ProductComparison.class);
    }

    @BeanOutputConverter
    public record ProductComparison(
        @Description("产品A的名称") String productA,
        @Description("产品B的名称") String productB,
        @Description("价格比较") PriceComparison priceComparison,
        @Description("性能比较") PerformanceComparison performanceComparison,
        @Description("特点比较") List<FeatureComparison> featureComparisons,
        @Description("总体推荐，哪个产品更好或适合哪种用户") String recommendation
    ) {}

    public record PriceComparison(double priceA, double priceB, String analysis) {}

    public record PerformanceComparison(String performanceA, String performanceB, String analysis) {}

    public record FeatureComparison(String feature, String productAValue, String productBValue, String comparison) {}
}

与工具调用结合

@Service
public class StructuredToolCallingService {

    private final ChatClient chatClient;

    @Autowired
    public StructuredToolCallingService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public TravelPlan planTravel(String destination, int days, List<String> interests) {
        return chatClient.prompt()
            .system("你是一个旅行规划专家。为用户创建详细的旅行计划。")
            .user("我想去" + destination + "旅行" + days + "天。我对" + String.join("、", interests) + "感兴趣。")
            .withTools(new WeatherTool(), new AttractionsTool(), new HotelsTool())
            .callAndConvertTo(TravelPlan.class);
    }

    @BeanOutputConverter
    public record TravelPlan(
        String destination,
        int days,
        WeatherInfo weatherInfo,
        List<DayPlan> dayPlans,
        List<HotelOption> recommendedHotels,
        List<String> packingTips
    ) {}

    public record WeatherInfo(String forecast, double averageTemperature, String clothingRecommendation) {}

    public record DayPlan(int day, List<Activity> activities, String morningPlan, String afternoonPlan, String eveningPlan) {}

    public record Activity(String name, String description, double cost, int durationHours, String location) {}

    public record HotelOption(String name, String location, double pricePerNight, List<String> amenities, double rating) {}

    // 工具定义
    // ...
}

结构化输出是 Spring AI 的一个强大功能，它弥合了 AI 模型的自然语言输出与应用程序所需的结构化数据之间的鸿沟。通过这个功能，开发者可以更轻松地将 AI 集成到现有的应用程序逻辑中，创建更智能、更实用的应用。

10. 聊天内存 (Chat Memory)

聊天内存是 Spring AI 的一个重要功能，它使 AI 应用能够在多轮对话中保持上下文连贯性。通过记住之前的交互，AI 可以提供更相关、更个性化的响应，创造更自然的对话体验。

聊天内存的重要性

在没有聊天内存的情况下，每次与 AI 模型的交互都是独立的，模型无法访问之前的对话历史。这会导致以下问题：

上下文丢失：模型无法理解引用之前提到的信息的问题。
重复信息：用户可能需要在每个问题中重复提供相同的背景信息。
不连贯的对话：对话感觉断断续续，不自然。
个性化缺失：模型无法根据用户之前的偏好调整响应。

聊天内存解决了这些问题，使 AI 能够参考之前的对话，提供连贯、上下文相关的响应。

Spring AI 中的聊天内存

Spring AI 提供了灵活的聊天内存抽象，支持多种内存实现，从简单的内存中存储到持久化数据库存储。

核心接口和类

ChatMemory：聊天内存的核心接口，定义了存储和检索消息的方法。
MessageWindowChatMemory：基于滑动窗口的内存实现，保留最近的 N 条消息。
ConversationBufferChatMemory：存储完整对话历史的内存实现。
VolatileMemory：内存中的临时存储实现。
PersistentMemory：持久化存储实现，可以将对话历史保存到数据库。

基本用法

以下是使用聊天内存的基本示例：

@Service
public class ChatService {

    private final ChatClient chatClient;

    @Autowired
    public ChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String chat(String userId, String message) {
        // 创建或获取用户的聊天内存
        ChatMemory chatMemory = getChatMemoryForUser(userId);

        // 添加用户消息到内存
        chatMemory.add(new UserMessage(message));

        // 创建包含聊天历史的提示
        Prompt prompt = new Prompt(chatMemory.getMessages());

        // 调用模型
        ChatResponse response = chatClient.call(prompt);

        // 获取助手响应
        String responseContent = response.getResult().getOutput().getContent();

        // 将助手响应添加到内存
        chatMemory.add(new AssistantMessage(responseContent));

        return responseContent;
    }

    private ChatMemory getChatMemoryForUser(String userId) {
        // 实现获取或创建用户聊天内存的逻辑
        // 这可能涉及从缓存或数据库中检索现有内存，或创建新的内存
        // ...
    }
}

内存实现

Spring AI 提供了多种聊天内存实现，适用于不同的场景：

1. MessageWindowChatMemory

MessageWindowChatMemory 使用滑动窗口方法，只保留最近的 N 条消息。这有助于控制上下文窗口大小，避免超出模型的 token 限制：

// 创建一个窗口大小为 10 的聊天内存
ChatMemory windowMemory = new MessageWindowChatMemory(10);

// 添加系统消息（可选）
windowMemory.add(new SystemMessage("你是一个有用的助手。"));

// 添加用户和助手消息
windowMemory.add(new UserMessage("你好！"));
windowMemory.add(new AssistantMessage("你好！有什么我可以帮助你的吗？"));
windowMemory.add(new UserMessage("我想了解 Spring AI。"));

// 获取所有消息
List<Message> messages = windowMemory.getMessages();

当添加的消息超过窗口大小时，最早的消息会被移除，确保内存中只保留最近的 N 条消息。

2. ConversationBufferChatMemory

ConversationBufferChatMemory 存储完整的对话历史，适用于需要访问整个对话的场景：

// 创建一个对话缓冲内存
ChatMemory bufferMemory = new ConversationBufferChatMemory();

// 添加消息
bufferMemory.add(new SystemMessage("你是一个有用的助手。"));
bufferMemory.add(new UserMessage("你好！"));
bufferMemory.add(new AssistantMessage("你好！有什么我可以帮助你的吗？"));

// 获取所有消息
List<Message> messages = bufferMemory.getMessages();

对于长时间的对话，需要注意可能超出模型的上下文窗口限制。

3. 持久化内存

对于需要在会话之间保持对话历史的应用，Spring AI 支持持久化聊天内存：

@Configuration
public class ChatMemoryConfig {

    @Bean
    public JdbcChatMemoryStore jdbcChatMemoryStore(JdbcTemplate jdbcTemplate) {
        return new JdbcChatMemoryStore(jdbcTemplate);
    }

    @Bean
    public ChatMemoryManager chatMemoryManager(JdbcChatMemoryStore chatMemoryStore) {
        return new ChatMemoryManager(chatMemoryStore);
    }
}

@Service
public class PersistentChatService {

    private final ChatClient chatClient;
    private final ChatMemoryManager chatMemoryManager;

    @Autowired
    public PersistentChatService(ChatClient chatClient, ChatMemoryManager chatMemoryManager) {
        this.chatClient = chatClient;
        this.chatMemoryManager = chatMemoryManager;
    }

    public String chat(String userId, String message) {
        // 获取或创建用户的聊天内存
        ChatMemory chatMemory = chatMemoryManager.getChatMemory(userId);

        // 添加用户消息
        chatMemory.add(new UserMessage(message));

        // 创建提示并调用模型
        Prompt prompt = new Prompt(chatMemory.getMessages());
        ChatResponse response = chatClient.call(prompt);

        // 获取助手响应
        String responseContent = response.getResult().getOutput().getContent();

        // 将助手响应添加到内存
        chatMemory.add(new AssistantMessage(responseContent));

        // 保存更新后的聊天内存
        chatMemoryManager.updateChatMemory(userId, chatMemory);

        return responseContent;
    }

    public void clearChatMemory(String userId) {
        chatMemoryManager.deleteChatMemory(userId);
    }
}

这个示例使用 JDBC 存储来持久化聊天内存，但 Spring AI 也支持其他存储实现，如 Redis、MongoDB 等。

高级用法

1. 自定义内存实现

你可以创建自定义的聊天内存实现，以满足特定需求：

public class CustomChatMemory implements ChatMemory {

    private final List<Message> messages = new ArrayList<>();
    private final int maxSystemMessages;
    private final int maxUserMessages;
    private final int maxAssistantMessages;

    public CustomChatMemory(int maxSystemMessages, int maxUserMessages, int maxAssistantMessages) {
        this.maxSystemMessages = maxSystemMessages;
        this.maxUserMessages = maxUserMessages;
        this.maxAssistantMessages = maxAssistantMessages;
    }

    @Override
    public void add(Message message) {
        messages.add(message);

        // 根据消息类型应用不同的限制
        int systemCount = 0;
        int userCount = 0;
        int assistantCount = 0;

        for (Message m : messages) {
            if (m instanceof SystemMessage) systemCount++;
            else if (m instanceof UserMessage) userCount++;
            else if (m instanceof AssistantMessage) assistantCount++;
        }

        // 移除超出限制的最早消息
        while (systemCount > maxSystemMessages) {
            removeEarliestMessageOfType(SystemMessage.class);
            systemCount--;
        }

        while (userCount > maxUserMessages) {
            removeEarliestMessageOfType(UserMessage.class);
            userCount--;
        }

        while (assistantCount > maxAssistantMessages) {
            removeEarliestMessageOfType(AssistantMessage.class);
            assistantCount--;
        }
    }

    @Override
    public List<Message> getMessages() {
        return new ArrayList<>(messages);
    }

    private void removeEarliestMessageOfType(Class<? extends Message> type) {
        for (int i = 0; i < messages.size(); i++) {
            if (type.isInstance(messages.get(i))) {
                messages.remove(i);
                break;
            }
        }
    }
}

这个自定义实现为不同类型的消息应用不同的限制，可以根据特定需求进行调整。

2. 内存修改和摘要

对于长时间的对话，可能需要对聊天内存进行修改或摘要，以避免超出模型的上下文窗口限制：

@Service
public class SummarizingChatService {

    private final ChatClient chatClient;

    @Autowired
    public SummarizingChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String chat(String userId, String message, ChatMemory chatMemory) {
        // 检查内存大小
        if (estimateTokenCount(chatMemory) > 3000) {
            // 对对话历史进行摘要
            summarizeConversation(chatMemory);
        }

        // 添加用户消息
        chatMemory.add(new UserMessage(message));

        // 创建提示并调用模型
        Prompt prompt = new Prompt(chatMemory.getMessages());
        ChatResponse response = chatClient.call(prompt);

        // 获取助手响应
        String responseContent = response.getResult().getOutput().getContent();

        // 将助手响应添加到内存
        chatMemory.add(new AssistantMessage(responseContent));

        return responseContent;
    }

    private void summarizeConversation(ChatMemory chatMemory) {
        List<Message> messages = chatMemory.getMessages();

        // 提取用户和助手消息
        List<Message> conversationMessages = messages.stream()
            .filter(m -> m instanceof UserMessage || m instanceof AssistantMessage)
            .collect(Collectors.toList());

        // 保留系统消息
        List<Message> systemMessages = messages.stream()
            .filter(m -> m instanceof SystemMessage)
            .collect(Collectors.toList());

        // 创建摘要提示
        String conversationText = conversationMessages.stream()
            .map(m -> (m instanceof UserMessage ? "用户: " : "助手: ") + m.getContent())
            .collect(Collectors.joining("\n"));

        Prompt summaryPrompt = new Prompt("""
            请总结以下对话的要点，保留重要的信息和上下文。摘要应该简洁但全面，以便后续对话可以继续。

            对话历史:
            %s

            摘要:
            """.formatted(conversationText));

        // 生成摘要
        ChatResponse summaryResponse = chatClient.call(summaryPrompt);
        String summary = summaryResponse.getResult().getOutput().getContent();

        // 清除现有内存并添加系统消息和摘要
        chatMemory.clear();

        // 添加原始系统消息
        systemMessages.forEach(chatMemory::add);

        // 添加摘要作为系统消息
        chatMemory.add(new SystemMessage("以下是之前对话的摘要: " + summary));
    }

    private int estimateTokenCount(ChatMemory chatMemory) {
        // 实现估算 token 数量的逻辑
        // 一个简单的启发式方法是计算字符数并除以一个因子（如 4）
        return chatMemory.getMessages().stream()
            .mapToInt(m -> m.getContent().length() / 4)
            .sum();
    }
}

这个示例展示了如何在对话历史变得过长时对其进行摘要，以保持上下文窗口在合理范围内。

聊天内存最佳实践

使用聊天内存时，请考虑以下最佳实践：

控制内存大小：监控聊天内存的大小，避免超出模型的上下文窗口限制。对于长对话，考虑使用窗口内存或定期摘要。
保留关键信息：确保重要的上下文信息（如用户偏好、关键事实）在内存修剪或摘要过程中得到保留。
使用系统消息：利用系统消息来设置对话的基调和提供持久的指导，这些消息通常应该保留在内存中。
考虑隐私和安全：在持久化聊天内存时，确保适当的数据保护措施，如加密敏感信息。
提供清除选项：允许用户清除其对话历史，这既是一个隐私考虑，也是解决对话可能陷入不良状态的方法。
优化存储：对于大规模应用，考虑聊天内存的存储和检索性能，可能需要缓存常用对话或实现更高效的存储策略。

实际应用示例：客户支持聊天机器人

以下是一个使用聊天内存的客户支持聊天机器人示例：

@RestController
@RequestMapping("/api/chat")
public class CustomerSupportChatController {

    private final ChatClient chatClient;
    private final ChatMemoryManager chatMemoryManager;

    @Autowired
    public CustomerSupportChatController(ChatClient chatClient, ChatMemoryManager chatMemoryManager) {
        this.chatClient = chatClient;
        this.chatMemoryManager = chatMemoryManager;
    }

    @PostMapping
    public ResponseEntity<ChatResponse> chat(@RequestBody ChatRequest request) {
        String userId = request.getUserId();
        String message = request.getMessage();

        try {
            // 获取或创建用户的聊天内存
            ChatMemory chatMemory = chatMemoryManager.getChatMemory(userId);

            // 如果是新对话，添加系统消息
            if (chatMemory.getMessages().isEmpty()) {
                chatMemory.add(new SystemMessage("""
                    你是一个专业的客户支持助手。你的任务是帮助用户解决产品相关的问题。
                    遵循以下准则:
                    1. 保持礼貌和专业
                    2. 提供简洁明了的解答
                    3. 如果不确定答案，建议用户联系人工客服
                    4. 收集必要的信息以便更好地理解问题
                    5. 在适当的情况下提供相关产品文档的链接
                    """));
            }

            // 添加用户消息
            chatMemory.add(new UserMessage(message));

            // 创建提示并调用模型
            Prompt prompt = new Prompt(chatMemory.getMessages());
            org.springframework.ai.chat.ChatResponse aiResponse = chatClient.call(prompt);

            // 获取助手响应
            String responseContent = aiResponse.getResult().getOutput().getContent();

            // 将助手响应添加到内存
            chatMemory.add(new AssistantMessage(responseContent));

            // 保存更新后的聊天内存
            chatMemoryManager.updateChatMemory(userId, chatMemory);

            // 创建响应对象
            ChatResponse response = new ChatResponse(
                responseContent,
                LocalDateTime.now().toString(),
                chatMemory.getMessages().size() / 2  // 对话轮次（不计算系统消息）
            );

            return ResponseEntity.ok(response);
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(new ChatResponse(
                    "抱歉，处理您的请求时出现了问题。请稍后再试。",
                    LocalDateTime.now().toString(),
                    0
                ));
        }
    }

    @DeleteMapping("/{userId}")
    public ResponseEntity<Map<String, String>> clearChat(@PathVariable String userId) {
        try {
            chatMemoryManager.deleteChatMemory(userId);
            return ResponseEntity.ok(Map.of("message", "对话历史已清除"));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求和响应对象
    public static class ChatRequest {
        private String userId;
        private String message;

        // Getters and setters
        // ...
    }

    public static class ChatResponse {
        private String message;
        private String timestamp;
        private int conversationTurn;

        public ChatResponse(String message, String timestamp, int conversationTurn) {
            this.message = message;
            this.timestamp = timestamp;
            this.conversationTurn = conversationTurn;
        }

        // Getters
        // ...
    }
}

这个控制器提供了两个端点：

/api/chat：处理聊天消息，使用持久化的聊天内存保持对话上下文
/api/chat/{userId}：允许清除用户的对话历史

聊天内存是构建自然、连贯对话体验的关键组件。Spring AI 提供了灵活的聊天内存抽象和多种实现，使开发者能够轻松地在其 AI 应用中实现上下文感知的对话。无论是简单的问答机器人还是复杂的对话系统，聊天内存都能显著提升用户体验。

11. 工具调用 (Tool Calling)

工具调用是 Spring AI 的一个强大功能，它允许 AI 模型调用预定义的工具或函数来执行特定任务，如获取实时数据、执行计算或与外部系统交互。这极大地扩展了 AI 应用的能力范围，使其能够执行超出纯文本生成的操作。

工具调用概述

传统上，LLM 的输出仅限于生成文本。而通过工具调用，模型可以：

识别何时需要使用工具：模型可以判断何时需要外部工具来完成任务。
选择合适的工具：从可用工具列表中选择最合适的工具。
构造正确的参数：为所选工具生成有效的参数。
解释工具的输出：将工具执行的结果整合到最终响应中。

这使得 AI 应用能够执行各种实际任务，如查询数据库、调用 API、执行计算、检索最新信息等。

Spring AI 中的工具调用

Spring AI 提供了一个灵活的工具调用框架，使开发者能够轻松定义和使用工具。核心组件包括：

Tool：表示可以被 AI 模型调用的工具或函数。
ToolExecutor：负责执行工具调用。
ToolParameterDefinition：定义工具参数的名称、类型和描述。

定义工具

在 Spring AI 中，工具可以通过多种方式定义：

1. 使用 @ToolAction 注解

最简单的方式是使用 @ToolAction 注解标记方法：

@Component
public class WeatherTools {

    private final WeatherService weatherService;

    @Autowired
    public WeatherTools(WeatherService weatherService) {
        this.weatherService = weatherService;
    }

    @ToolAction(name = "get_current_weather",
                description = "获取指定城市的当前天气信息")
    public WeatherInfo getCurrentWeather(
            @ToolParameter(name = "city",
                          description = "城市名称，如北京、上海、广州") String city) {
        return weatherService.getWeatherForCity(city);
    }

    @ToolAction(name = "get_weather_forecast",
                description = "获取指定城市的天气预报")
    public List<WeatherInfo> getWeatherForecast(
            @ToolParameter(name = "city",
                          description = "城市名称，如北京、上海、广州") String city,
            @ToolParameter(name = "days",
                          description = "预报天数，1-7") int days) {
        return weatherService.getWeatherForecast(city, days);
    }

    public record WeatherInfo(String city, double temperature, String condition, int humidity) {}
}

2. 使用 Tool 接口

对于更复杂的场景，可以实现 Tool 接口：

@Component
public class DatabaseSearchTool implements Tool {

    private final DatabaseService databaseService;

    @Autowired
    public DatabaseSearchTool(DatabaseService databaseService) {
        this.databaseService = databaseService;
    }

    @Override
    public String getName() {
        return "database_search";
    }

    @Override
    public String getDescription() {
        return "搜索数据库中的记录";
    }

    @Override
    public List<ToolParameterDefinition> getParameterDefinitions() {
        return List.of(
            new ToolParameterDefinition("query", String.class, "搜索查询", true),
            new ToolParameterDefinition("limit", Integer.class, "结果数量限制", false)
        );
    }

    @Override
    public Object execute(Map<String, Object> parameters) {
        String query = (String) parameters.get("query");
        Integer limit = parameters.containsKey("limit") ? (Integer) parameters.get("limit") : 10;

        return databaseService.search(query, limit);
    }
}

3. 使用 ToolDefinition 和 ToolExecutor

对于需要更细粒度控制的场景，可以使用 ToolDefinition 和 ToolExecutor：

@Configuration
public class ToolConfig {

    @Bean
    public ToolDefinition calculatorTool() {
        return ToolDefinition.builder()
            .name("calculator")
            .description("执行数学计算")
            .parameter(ToolParameterDefinition.builder()
                .name("expression")
                .description("要计算的数学表达式，如 '2 + 2 * 3'")
                .type(String.class)
                .required(true)
                .build())
            .build();
    }

    @Bean
    public ToolExecutor calculatorToolExecutor() {
        return new ToolExecutor() {
            @Override
            public boolean canExecute(ToolDefinition toolDefinition) {
                return "calculator".equals(toolDefinition.getName());
            }

            @Override
            public Object execute(ToolDefinition toolDefinition, Map<String, Object> parameters) {
                String expression = (String) parameters.get("expression");
                // 使用表达式解析库计算结果
                return evaluateExpression(expression);
            }

            private double evaluateExpression(String expression) {
                // 实现表达式计算逻辑
                // 这里可以使用现有的表达式解析库，如 exp4j
                // ...
                return 0.0; // 占位符
            }
        };
    }
}

使用工具

定义好工具后，可以在 ChatClient 中使用它们：

@Service
public class AssistantService {

    private final ChatClient chatClient;
    private final List<Tool> tools;

    @Autowired
    public AssistantService(ChatClient chatClient, List<Tool> tools) {
        this.chatClient = chatClient;
        this.tools = tools;
    }

    public String processQuery(String query) {
        ChatResponse response = chatClient.prompt()
            .system("你是一个有用的助手，可以使用提供的工具来回答用户的问题。")
            .user(query)
            .withTools(tools)
            .call();

        return response.getResult().getOutput().getContent();
    }
}

也可以使用 prompt() builder API 的 withTools() 方法：

ChatResponse response = chatClient.prompt()
    .system("你是一个有用的助手，可以使用提供的工具来回答用户的问题。")
    .user("北京今天的天气怎么样？")
    .withTools(weatherTools.getCurrentWeather())
    .call();

工具调用流程

当使用工具调用时，Spring AI 会执行以下流程：

将用户查询和可用工具的描述发送给 AI 模型。
模型决定是否需要调用工具，如果需要，它会指定工具名称和参数。
Spring AI 接收模型的工具调用请求，验证工具名称和参数。
Spring AI 执行指定的工具，获取结果。
将工具执行结果发送回模型，让模型生成最终响应。
模型整合工具执行结果，生成对用户查询的完整回答。

这个流程可能涉及多次工具调用，直到模型认为已经收集了足够的信息来回答用户的查询。

高级工具调用功能

Spring AI 提供了多种高级工具调用功能，以满足复杂应用的需求：

1. 多工具调用

模型可以在单个响应中调用多个工具：

@Service
public class TravelPlanningService {

    private final ChatClient chatClient;
    private final WeatherTools weatherTools;
    private final FlightTools flightTools;
    private final HotelTools hotelTools;

    @Autowired
    public TravelPlanningService(
            ChatClient chatClient,
            WeatherTools weatherTools,
            FlightTools flightTools,
            HotelTools hotelTools) {
        this.chatClient = chatClient;
        this.weatherTools = weatherTools;
        this.flightTools = flightTools;
        this.hotelTools = hotelTools;
    }

    public String planTravel(String destination, String departureDate, int stayDuration) {
        ChatResponse response = chatClient.prompt()
            .system("你是一个旅行规划助手，可以使用提供的工具来帮助用户规划旅行。")
            .user("我想在" + departureDate + "去" + destination + "旅行" + stayDuration + "天。")
            .withTools(
                weatherTools.getWeatherForecast(),
                flightTools.searchFlights(),
                hotelTools.searchHotels()
            )
            .call();

        return response.getResult().getOutput().getContent();
    }
}

2. 递归工具调用

模型可以基于之前工具调用的结果进行后续工具调用：

@Service
public class DataAnalysisService {

    private final ChatClient chatClient;
    private final DatabaseTools databaseTools;
    private final AnalysisTools analysisTools;
    private final VisualizationTools visualizationTools;

    @Autowired
    public DataAnalysisService(
            ChatClient chatClient,
            DatabaseTools databaseTools,
            AnalysisTools analysisTools,
            VisualizationTools visualizationTools) {
        this.chatClient = chatClient;
        this.databaseTools = databaseTools;
        this.analysisTools = analysisTools;
        this.visualizationTools = visualizationTools;
    }

    public String analyzeData(String datasetName, String analysisType) {
        ChatResponse response = chatClient.prompt()
            .system("""
                你是一个数据分析助手，可以使用提供的工具来帮助用户分析数据。
                遵循以下步骤：
                1. 从数据库获取数据集
                2. 根据用户需求执行适当的分析
                3. 生成可视化
                4. 提供分析结果和见解
                """)
            .user("请对" + datasetName + "数据集进行" + analysisType + "分析。")
            .withTools(
                databaseTools.queryDataset(),
                analysisTools.performAnalysis(),
                visualizationTools.createVisualization()
            )
            .call();

        return response.getResult().getOutput().getContent();
    }
}

3. 条件工具调用

模型可以根据上下文和用户需求有条件地调用工具：

@Service
public class CustomerSupportService {

    private final ChatClient chatClient;
    private final OrderTools orderTools;
    private final ProductTools productTools;
    private final KnowledgeBaseTools knowledgeBaseTools;

    @Autowired
    public CustomerSupportService(
            ChatClient chatClient,
            OrderTools orderTools,
            ProductTools productTools,
            KnowledgeBaseTools knowledgeBaseTools) {
        this.chatClient = chatClient;
        this.orderTools = orderTools;
        this.productTools = productTools;
        this.knowledgeBaseTools = knowledgeBaseTools;
    }

    public String handleCustomerQuery(String query) {
        ChatResponse response = chatClient.prompt()
            .system("""
                你是一个客户支持助手，可以使用提供的工具来帮助用户解决问题。
                根据用户的查询，选择合适的工具：
                - 如果是关于订单的查询，使用订单工具
                - 如果是关于产品的查询，使用产品工具
                - 如果是一般问题，使用知识库工具
                """)
            .user(query)
            .withTools(
                orderTools.getOrderStatus(),
                orderTools.trackShipment(),
                productTools.getProductInfo(),
                productTools.checkAvailability(),
                knowledgeBaseTools.searchKnowledgeBase()
            )
            .call();

        return response.getResult().getOutput().getContent();
    }
}

工具调用最佳实践

使用工具调用时，请考虑以下最佳实践：

明确工具描述：提供清晰、详细的工具和参数描述，帮助模型理解何时以及如何使用工具。
参数验证：在工具执行前验证参数，确保它们符合预期的格式和范围。
错误处理：实现健壮的错误处理，优雅地处理工具执行过程中可能出现的异常。
限制工具范围：每个工具应该专注于一个特定的功能，避免过于通用的工具。
安全考虑：确保工具不会执行潜在危险的操作，如未经授权的数据访问或系统修改。
性能优化：对于可能耗时的工具，考虑实现异步执行或缓存机制。
版本控制：随着应用的发展，维护工具的版本控制，确保向后兼容性。

实际应用示例：智能助手

以下是一个完整的智能助手示例，它使用多种工具来回答用户的查询：

@SpringBootApplication
public class IntelligentAssistantApplication {

    public static void main(String[] args) {
        SpringApplication.run(IntelligentAssistantApplication.class, args);
    }
}

@Component
public class WeatherTools {

    private final WeatherService weatherService;

    @Autowired
    public WeatherTools(WeatherService weatherService) {
        this.weatherService = weatherService;
    }

    @ToolAction(name = "get_weather",
                description = "获取指定城市的天气信息")
    public WeatherInfo getWeather(
            @ToolParameter(name = "city",
                          description = "城市名称，如北京、上海、广州") String city) {
        return weatherService.getWeatherForCity(city);
    }

    public record WeatherInfo(String city, double temperature, String condition, int humidity) {}
}

@Component
public class CalendarTools {

    private final CalendarService calendarService;

    @Autowired
    public CalendarTools(CalendarService calendarService) {
        this.calendarService = calendarService;
    }

    @ToolAction(name = "get_events",
                description = "获取指定日期的日程安排")
    public List<CalendarEvent> getEvents(
            @ToolParameter(name = "date",
                          description = "日期，格式为 YYYY-MM-DD") String date) {
        return calendarService.getEventsForDate(date);
    }

    @ToolAction(name = "add_event",
                description = "添加新的日程安排")
    public CalendarEvent addEvent(
            @ToolParameter(name = "title",
                          description = "事件标题") String title,
            @ToolParameter(name = "date",
                          description = "日期，格式为 YYYY-MM-DD") String date,
            @ToolParameter(name = "time",
                          description = "时间，格式为 HH:MM") String time,
            @ToolParameter(name = "duration",
                          description = "持续时间（分钟）") int duration) {
        return calendarService.addEvent(title, date, time, duration);
    }

    public record CalendarEvent(String title, String date, String time, int duration) {}
}

@Component
public class SearchTools {

    private final SearchService searchService;

    @Autowired
    public SearchTools(SearchService searchService) {
        this.searchService = searchService;
    }

    @ToolAction(name = "search_web",
                description = "在网络上搜索信息")
    public List<SearchResult> searchWeb(
            @ToolParameter(name = "query",
                          description = "搜索查询") String query,
            @ToolParameter(name = "limit",
                          description = "结果数量限制") int limit) {
        return searchService.search(query, limit);
    }

    public record SearchResult(String title, String url, String snippet) {}
}

@Service
public class AssistantService {

    private final ChatClient chatClient;
    private final WeatherTools weatherTools;
    private final CalendarTools calendarTools;
    private final SearchTools searchTools;

    @Autowired
    public AssistantService(
            ChatClient chatClient,
            WeatherTools weatherTools,
            CalendarTools calendarTools,
            SearchTools searchTools) {
        this.chatClient = chatClient;
        this.weatherTools = weatherTools;
        this.calendarTools = calendarTools;
        this.searchTools = searchTools;
    }

    public String processQuery(String query) {
        ChatResponse response = chatClient.prompt()
            .system("""
                你是一个智能助手，可以使用提供的工具来帮助用户。
                根据用户的查询，选择合适的工具来获取信息或执行操作。
                如果用户的查询不需要使用工具，直接回答。
                如果需要使用工具但没有合适的工具，告知用户你无法执行该操作。
                """)
            .user(query)
            .withTools(
                weatherTools.getWeather(),
                calendarTools.getEvents(),
                calendarTools.addEvent(),
                searchTools.searchWeb()
            )
            .call();

        return response.getResult().getOutput().getContent();
    }
}

@RestController
@RequestMapping("/api/assistant")
public class AssistantController {

    private final AssistantService assistantService;

    @Autowired
    public AssistantController(AssistantService assistantService) {
        this.assistantService = assistantService;
    }

    @PostMapping
    public ResponseEntity<Map<String, String>> processQuery(@RequestBody QueryRequest request) {
        try {
            String response = assistantService.processQuery(request.getQuery());

            return ResponseEntity.ok(Map.of(
                "query", request.getQuery(),
                "response", response
            ));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求对象
    public static class QueryRequest {
        private String query;

        // Getters and setters
        public String getQuery() {
            return query;
        }

        public void setQuery(String query) {
            this.query = query;
        }
    }
}

这个示例展示了一个智能助手，它可以：

获取天气信息
管理日程安排
在网络上搜索信息
根据用户查询自动选择合适的工具

通过工具调用，Spring AI 使开发者能够构建功能强大的 AI 应用，这些应用不仅能够生成文本，还能执行实际操作，与外部系统交互，提供更加实用和动态的用户体验。

12. 可观测性 (Observability)

可观测性是构建可靠 AI 应用的关键方面，它使开发者能够监控、理解和调试 AI 系统的行为。Spring AI 提供了全面的可观测性支持，帮助开发者跟踪 AI 交互、性能指标和潜在问题。

可观测性的重要性

在 AI 应用中，可观测性尤为重要，原因包括：

复杂性：AI 系统涉及多个组件和外部服务，增加了调试难度。
不确定性：AI 模型的输出具有一定的不确定性，需要监控以确保质量。
成本管理：AI API 调用通常基于使用量计费，需要跟踪使用情况以控制成本。
性能优化：识别瓶颈和优化机会，提高系统响应时间。
合规性：满足审计和合规要求，记录 AI 决策过程。

Spring AI 中的可观测性

Spring AI 提供了多层次的可观测性支持，包括：

1. 日志记录 (Logging)

Spring AI 使用 SLF4J 进行日志记录，可以配置不同级别的日志详细程度：

# application.properties

# 设置 Spring AI 的日志级别
logging.level.org.springframework.ai=DEBUG

# 设置特定组件的日志级别
logging.level.org.springframework.ai.openai=TRACE
logging.level.org.springframework.ai.vectorstore=INFO

日志记录了 AI 交互的关键信息，如请求、响应、错误和性能数据。

2. 指标收集 (Metrics)

Spring AI 与 Micrometer 集成，提供了丰富的指标收集功能：

@Configuration
public class MetricsConfig {

    @Bean
    public MeterRegistry meterRegistry() {
        return new SimpleMeterRegistry();
    }

    @Bean
    public AiMetricsRecorder aiMetricsRecorder(MeterRegistry meterRegistry) {
        return new MicrometerAiMetricsRecorder(meterRegistry);
    }
}

主要指标包括：

调用计数：AI 模型调用次数
令牌使用量：输入和输出令牌数量
延迟：请求响应时间
错误率：失败请求的比例
成本：API 调用的估计成本

这些指标可以通过 Prometheus、Grafana 等工具进行可视化和监控。

3. 追踪 (Tracing)

Spring AI 支持分布式追踪，使用 Spring Cloud Sleuth 或 Micrometer Tracing：

@Configuration
public class TracingConfig {

    @Bean
    public AiTracingRecorder aiTracingRecorder(Tracer tracer) {
        return new OpenTelemetryAiTracingRecorder(tracer);
    }
}

追踪提供了 AI 交互的详细时间线，包括：

请求开始和结束时间
子操作（如嵌入生成、向量搜索）的持续时间
组件之间的调用关系
错误和异常信息

4. 审计 (Auditing)

Spring AI 提供了审计功能，记录所有 AI 交互的详细信息：

@Configuration
public class AuditConfig {

    @Bean
    public AiAuditRecorder aiAuditRecorder() {
        return new FileSystemAiAuditRecorder("/path/to/audit/logs");
    }
}

审计记录包括：

完整的请求和响应内容
用户和会话标识
时间戳和持续时间
使用的模型和参数
元数据和上下文信息

5. 自定义观察者 (Custom Observers)

Spring AI 允许开发者实现自定义观察者，以满足特定的可观测性需求：

@Component
public class CustomAiObserver implements AiObserver {

    @Override
    public void beforeRequest(AiRequest request) {
        // 请求前的处理
        System.out.println("Preparing to send request: " + request.getId());
    }

    @Override
    public void afterResponse(AiRequest request, AiResponse response) {
        // 响应后的处理
        System.out.println("Received response for request: " + request.getId());
        System.out.println("Response time: " + response.getDuration() + "ms");
    }

    @Override
    public void onError(AiRequest request, Throwable error) {
        // 错误处理
        System.err.println("Error processing request: " + request.getId());
        error.printStackTrace();
    }
}

实现可观测性的最佳实践

在 Spring AI 应用中实现可观测性时，请考虑以下最佳实践：

1. 结构化日志

使用结构化日志格式（如 JSON），使日志更易于解析和分析：

@Service
public class LoggingService {

    private static final Logger logger = LoggerFactory.getLogger(LoggingService.class);

    public void logAiInteraction(String userId, String query, String response, long duration) {
        Map<String, Object> logData = new HashMap<>();
        logData.put("userId", userId);
        logData.put("query", query);
        logData.put("responseLength", response.length());
        logData.put("duration", duration);
        logData.put("timestamp", System.currentTimeMillis());

        logger.info("AI interaction: {}", JsonUtils.toJson(logData));
    }
}

2. 关联标识符

使用关联标识符（如请求 ID、会话 ID）关联不同组件的日志和指标：

@Service
public class AiService {

    private final ChatClient chatClient;
    private final MDCService mdcService;

    @Autowired
    public AiService(ChatClient chatClient, MDCService mdcService) {
        this.chatClient = chatClient;
        this.mdcService = mdcService;
    }

    public String processQuery(String userId, String query) {
        String requestId = UUID.randomUUID().toString();

        try {
            // 设置 MDC 上下文
            mdcService.setContext(userId, requestId);

            // 处理查询
            ChatResponse response = chatClient.prompt()
                .system("你是一个有用的助手。")
                .user(query)
                .call();

            return response.getResult().getOutput().getContent();
        } finally {
            // 清除 MDC 上下文
            mdcService.clearContext();
        }
    }
}

@Service
public class MDCService {

    public void setContext(String userId, String requestId) {
        MDC.put("userId", userId);
        MDC.put("requestId", requestId);
    }

    public void clearContext() {
        MDC.clear();
    }
}

3. 采样和过滤

对于高流量系统，实现日志和追踪的采样和过滤，以减少存储需求和性能影响：

@Configuration
public class ObservabilityConfig {

    @Bean
    public AiTracingRecorder sampledAiTracingRecorder(Tracer tracer) {
        return new SampledAiTracingRecorder(tracer, 0.1); // 10% 采样率
    }

    @Bean
    public AiAuditRecorder filteredAiAuditRecorder() {
        return new FilteredAiAuditRecorder(
            new FileSystemAiAuditRecorder("/path/to/audit/logs"),
            request -> request.getModelName().contains("gpt-4") // 只审计 GPT-4 请求
        );
    }
}

4. 敏感信息处理

确保日志和审计记录中不包含敏感信息，或者实现适当的加密和访问控制：

@Component
public class PrivacyAwareAiObserver implements AiObserver {

    @Override
    public void beforeRequest(AiRequest request) {
        // 在记录前清除敏感信息
        AiRequest sanitizedRequest = sanitizeRequest(request);
        logRequest(sanitizedRequest);
    }

    @Override
    public void afterResponse(AiRequest request, AiResponse response) {
        // 在记录前清除敏感信息
        AiRequest sanitizedRequest = sanitizeRequest(request);
        AiResponse sanitizedResponse = sanitizeResponse(response);
        logResponse(sanitizedRequest, sanitizedResponse);
    }

    private AiRequest sanitizeRequest(AiRequest request) {
        // 实现敏感信息清除逻辑
        // ...
        return sanitizedRequest;
    }

    private AiResponse sanitizeResponse(AiResponse response) {
        // 实现敏感信息清除逻辑
        // ...
        return sanitizedResponse;
    }
}

5. 警报和通知

基于指标和日志设置警报和通知，及时发现和解决问题：

@Configuration
public class AlertConfig {

    @Bean
    public AlertingService aiAlertingService(MeterRegistry meterRegistry) {
        return new AlertingService(meterRegistry);
    }
}

@Service
public class AlertingService {

    private final MeterRegistry meterRegistry;
    private final NotificationService notificationService;

    @Autowired
    public AlertingService(MeterRegistry meterRegistry, NotificationService notificationService) {
        this.meterRegistry = meterRegistry;
        this.notificationService = notificationService;

        // 设置警报
        setupAlerts();
    }

    private void setupAlerts() {
        // 错误率警报
        meterRegistry.gauge("ai.error.rate", Tags.empty(), 0.0, value -> {
            double errorRate = calculateErrorRate();
            if (errorRate > 0.05) { // 错误率超过 5%
                notificationService.sendAlert("AI Error Rate Alert", 
                    "Error rate has exceeded threshold: " + errorRate);
            }
            return errorRate;
        });

        // 延迟警报
        meterRegistry.gauge("ai.latency.p95", Tags.empty(), 0.0, value -> {
            double p95Latency = calculateP95Latency();
            if (p95Latency > 2000) { // P95 延迟超过 2 秒
                notificationService.sendAlert("AI Latency Alert", 
                    "P95 latency has exceeded threshold: " + p95Latency + "ms");
            }
            return p95Latency;
        });
    }

    private double calculateErrorRate() {
        // 实现错误率计算逻辑
        // ...
        return 0.0; // 占位符
    }

    private double calculateP95Latency() {
        // 实现 P95 延迟计算逻辑
        // ...
        return 0.0; // 占位符
    }
}

实际应用示例：全面可观测的 AI 服务

以下是一个实现全面可观测性的 AI 服务示例：

@SpringBootApplication
@EnableAiObservability
public class ObservableAiApplication {

    public static void main(String[] args) {
        SpringApplication.run(ObservableAiApplication.class, args);
    }

    @Bean
    public MeterRegistry meterRegistry() {
        return new SimpleMeterRegistry();
    }

    @Bean
    public AiMetricsRecorder aiMetricsRecorder(MeterRegistry meterRegistry) {
        return new MicrometerAiMetricsRecorder(meterRegistry);
    }

    @Bean
    public AiTracingRecorder aiTracingRecorder(Tracer tracer) {
        return new OpenTelemetryAiTracingRecorder(tracer);
    }

    @Bean
    public AiAuditRecorder aiAuditRecorder() {
        return new FileSystemAiAuditRecorder("./audit-logs");
    }
}

@Service
public class ObservableAiService {

    private static final Logger logger = LoggerFactory.getLogger(ObservableAiService.class);

    private final ChatClient chatClient;
    private final MeterRegistry meterRegistry;

    @Autowired
    public ObservableAiService(ChatClient chatClient, MeterRegistry meterRegistry) {
        this.chatClient = chatClient;
        this.meterRegistry = meterRegistry;
    }

    public String processQuery(String userId, String query) {
        String requestId = UUID.randomUUID().toString();
        long startTime = System.currentTimeMillis();

        // 设置 MDC 上下文
        MDC.put("userId", userId);
        MDC.put("requestId", requestId);

        logger.info("Processing query: {}", query);

        try {
            // 创建 span
            Span span = Span.current();
            span.setAttribute("userId", userId);
            span.setAttribute("queryLength", query.length());

            // 记录请求计数
            meterRegistry.counter("ai.requests", 
                "userId", userId,
                "modelType", "chat"
            ).increment();

            // 处理查询
            ChatResponse response = chatClient.prompt()
                .system("你是一个有用的助手。")
                .user(query)
                .call();

            String content = response.getResult().getOutput().getContent();

            // 记录令牌使用量
            Map<String, Object> metadata = response.getMetadata();
            if (metadata.containsKey("usage")) {
                Map<String, Object> usage = (Map<String, Object>) metadata.get("usage");
                meterRegistry.counter("ai.tokens.input").increment((int) usage.get("promptTokens"));
                meterRegistry.counter("ai.tokens.output").increment((int) usage.get("completionTokens"));
            }

            // 记录延迟
            long duration = System.currentTimeMillis() - startTime;
            meterRegistry.timer("ai.latency", 
                "userId", userId,
                "modelType", "chat"
            ).record(duration, TimeUnit.MILLISECONDS);

            logger.info("Query processed successfully in {}ms", duration);

            return content;
        } catch (Exception e) {
            // 记录错误
            meterRegistry.counter("ai.errors", 
                "userId", userId,
                "errorType", e.getClass().getSimpleName()
            ).increment();

            logger.error("Error processing query", e);
            throw e;
        } finally {
            // 清除 MDC 上下文
            MDC.clear();
        }
    }
}

@RestController
@RequestMapping("/api/ai")
public class AiController {

    private final ObservableAiService aiService;

    @Autowired
    public AiController(ObservableAiService aiService) {
        this.aiService = aiService;
    }

    @PostMapping("/query")
    public ResponseEntity<Map<String, String>> processQuery(@RequestBody QueryRequest request) {
        try {
            String response = aiService.processQuery(request.getUserId(), request.getQuery());

            return ResponseEntity.ok(Map.of(
                "query", request.getQuery(),
                "response", response
            ));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求对象
    public static class QueryRequest {
        private String userId;
        private String query;

        // Getters and setters
        // ...
    }
}

这个示例展示了一个全面可观测的 AI 服务，它实现了：

结构化日志记录
指标收集（请求计数、令牌使用量、延迟）
分布式追踪
审计记录
关联标识符（用户 ID、请求 ID）

通过这些可观测性功能，开发者可以全面了解 AI 系统的行为和性能，及时发现和解决问题，确保系统的可靠性和效率。

可观测性是构建企业级 AI 应用的关键要素。Spring AI 提供的全面可观测性支持，使开发者能够监控、理解和优化其 AI 系统，确保它们在生产环境中可靠、高效地运行。

13. 模型评估 (Model Evaluation)

模型评估是 AI 应用开发中的关键环节，它帮助开发者了解 AI 模型的性能、质量和适用性。Spring AI 提供了全面的模型评估框架，使开发者能够系统地评估和比较不同的 AI 模型和配置。

模型评估的重要性

在 AI 应用开发中，模型评估至关重要，原因包括：

模型选择：帮助选择最适合特定任务的模型。
参数优化：确定最佳的模型参数和配置。
质量保证：确保模型输出满足质量和准确性要求。
成本效益分析：评估模型性能与成本之间的平衡。
持续监控：检测模型性能随时间的变化。

Spring AI 中的模型评估

Spring AI 提供了灵活的模型评估框架，支持多种评估方法和指标。

核心组件

EvaluationDataset：表示用于评估的数据集。
EvaluationTask：定义要评估的任务类型。
EvaluationMetric：定义用于评估的指标。
ModelEvaluator：执行评估并生成结果。

基本用法

以下是使用 Spring AI 进行模型评估的基本示例：

@Service
public class ModelEvaluationService {

    private final ChatClient chatClientA;
    private final ChatClient chatClientB;

    @Autowired
    public ModelEvaluationService(
            @Qualifier("gpt35ChatClient") ChatClient chatClientA,
            @Qualifier("gpt4ChatClient") ChatClient chatClientB) {
        this.chatClientA = chatClientA;
        this.chatClientB = chatClientB;
    }

    public EvaluationResult evaluateModels() {
        // 创建评估数据集
        EvaluationDataset dataset = new SimpleEvaluationDataset(
            List.of(
                new EvaluationExample("什么是机器学习？", "机器学习是人工智能的一个子领域，它使计算机系统能够通过经验自动改进。"),
                new EvaluationExample("解释神经网络的工作原理。", "神经网络是受人脑启发的计算模型，由多层互连的节点（神经元）组成..."),
                // 更多示例...
            )
        );

        // 创建评估任务
        EvaluationTask task = new QnAEvaluationTask();

        // 创建评估指标
        List<EvaluationMetric> metrics = List.of(
            new AccuracyMetric(),
            new RelevanceMetric(),
            new CompletenessMetric()
        );

        // 创建模型评估器
        ModelEvaluator evaluatorA = new ModelEvaluator(chatClientA, task, metrics);
        ModelEvaluator evaluatorB = new ModelEvaluator(chatClientB, task, metrics);

        // 执行评估
        EvaluationResult resultA = evaluatorA.evaluate(dataset);
        EvaluationResult resultB = evaluatorB.evaluate(dataset);

        // 比较结果
        ComparisonResult comparison = new ModelComparer().compare(
            Map.of("GPT-3.5", resultA, "GPT-4", resultB)
        );

        return comparison;
    }
}

评估数据集

Spring AI 支持多种方式创建评估数据集：

1. 从文件加载

// 从 CSV 文件加载
EvaluationDataset csvDataset = EvaluationDataset.fromCsv("path/to/dataset.csv");

// 从 JSON 文件加载
EvaluationDataset jsonDataset = EvaluationDataset.fromJson("path/to/dataset.json");

// 从 YAML 文件加载
EvaluationDataset yamlDataset = EvaluationDataset.fromYaml("path/to/dataset.yaml");

2. 编程方式创建

// 创建问答数据集
EvaluationDataset qnaDataset = new SimpleEvaluationDataset(
    List.of(
        new EvaluationExample("问题1", "参考答案1"),
        new EvaluationExample("问题2", "参考答案2"),
        // 更多示例...
    )
);

// 创建分类数据集
EvaluationDataset classificationDataset = new SimpleEvaluationDataset(
    List.of(
        new EvaluationExample("这个产品非常好用！", "正面"),
        new EvaluationExample("服务态度很差，不会再购买。", "负面"),
        // 更多示例...
    )
);

3. 自动生成

// 使用 AI 生成评估数据集
EvaluationDataset generatedDataset = new DatasetGenerator(chatClient)
    .generateQnADataset(
        "生成 10 个关于人工智能的问答对",
        10
    );

评估任务

Spring AI 支持多种评估任务：

1. 问答任务

EvaluationTask qnaTask = new QnAEvaluationTask();

2. 分类任务

EvaluationTask classificationTask = new ClassificationEvaluationTask();

3. 摘要任务

EvaluationTask summarizationTask = new SummarizationEvaluationTask();

4. 自定义任务

public class CustomEvaluationTask implements EvaluationTask {

    @Override
    public String getName() {
        return "custom-task";
    }

    @Override
    public String getDescription() {
        return "自定义评估任务";
    }

    @Override
    public String generatePrompt(EvaluationExample example) {
        // 生成适合任务的提示
        return "基于以下信息：" + example.getInput() + "\n\n请执行特定任务...";
    }

    @Override
    public boolean isValidExample(EvaluationExample example) {
        // 验证示例是否适合此任务
        return example.getInput() != null && example.getExpectedOutput() != null;
    }
}

评估指标

Spring AI 提供了多种评估指标：

1. 准确性指标

// 精确匹配准确性
EvaluationMetric exactMatchMetric = new ExactMatchAccuracyMetric();

// 语义相似度准确性
EvaluationMetric semanticSimilarityMetric = new SemanticSimilarityMetric(embeddingClient);

// F1 分数
EvaluationMetric f1Metric = new F1ScoreMetric();

2. 质量指标

// 相关性
EvaluationMetric relevanceMetric = new RelevanceMetric();

// 完整性
EvaluationMetric completenessMetric = new CompletenessMetric();

// 一致性
EvaluationMetric consistencyMetric = new ConsistencyMetric();

3. 性能指标

// 延迟
EvaluationMetric latencyMetric = new LatencyMetric();

// 令牌使用量
EvaluationMetric tokenUsageMetric = new TokenUsageMetric();

// 成本
EvaluationMetric costMetric = new CostMetric();

4. AI 评估指标

// 使用 AI 评估输出质量
EvaluationMetric aiJudgeMetric = new AiJudgeMetric(
    evaluationChatClient,
    "评估模型输出的质量，考虑准确性、相关性和完整性。给出 1-10 的分数。"
);

5. 自定义指标

public class CustomMetric implements EvaluationMetric {

    @Override
    public String getName() {
        return "custom-metric";
    }

    @Override
    public String getDescription() {
        return "自定义评估指标";
    }

    @Override
    public MetricResult evaluate(String expectedOutput, String actualOutput) {
        // 实现自定义评估逻辑
        double score = calculateCustomScore(expectedOutput, actualOutput);

        return new MetricResult(
            getName(),
            score,
            Map.of("details", "自定义评估详情...")
        );
    }

    private double calculateCustomScore(String expectedOutput, String actualOutput) {
        // 实现自定义评分逻辑
        // ...
        return 0.0; // 占位符
    }
}

高级评估功能

Spring AI 提供了多种高级评估功能，以满足复杂评估需求：

1. 交叉验证

@Service
public class CrossValidationService {

    private final ChatClient chatClient;

    @Autowired
    public CrossValidationService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public List<EvaluationResult> performCrossValidation(EvaluationDataset dataset, int folds) {
        // 将数据集分成 k 个折
        List<EvaluationDataset> foldDatasets = splitDataset(dataset, folds);

        List<EvaluationResult> results = new ArrayList<>();

        // 创建评估任务和指标
        EvaluationTask task = new QnAEvaluationTask();
        List<EvaluationMetric> metrics = List.of(
            new AccuracyMetric(),
            new RelevanceMetric()
        );

        // 对每个折执行评估
        for (int i = 0; i < folds; i++) {
            // 创建训练集和测试集
            EvaluationDataset testDataset = foldDatasets.get(i);
            EvaluationDataset trainDataset = combineDatasets(foldDatasets, i);

            // 使用训练集微调模型（如果支持）
            // ...

            // 评估模型
            ModelEvaluator evaluator = new ModelEvaluator(chatClient, task, metrics);
            EvaluationResult result = evaluator.evaluate(testDataset);

            results.add(result);
        }

        return results;
    }

    private List<EvaluationDataset> splitDataset(EvaluationDataset dataset, int folds) {
        // 实现数据集分割逻辑
        // ...
        return new ArrayList<>(); // 占位符
    }

    private EvaluationDataset combineDatasets(List<EvaluationDataset> datasets, int excludeIndex) {
        // 实现数据集合并逻辑
        // ...
        return new SimpleEvaluationDataset(List.of()); // 占位符
    }
}

2. 参数优化

@Service
public class ParameterOptimizationService {

    private final OpenAiChatClient chatClient;

    @Autowired
    public ParameterOptimizationService(OpenAiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public Map<OpenAiChatOptions, EvaluationResult> optimizeParameters(EvaluationDataset dataset) {
        // 定义要测试的参数组合
        List<OpenAiChatOptions> optionsList = List.of(
            OpenAiChatOptions.builder().withTemperature(0.0f).build(),
            OpenAiChatOptions.builder().withTemperature(0.3f).build(),
            OpenAiChatOptions.builder().withTemperature(0.7f).build(),
            OpenAiChatOptions.builder().withTemperature(1.0f).build(),

            OpenAiChatOptions.builder().withTopP(0.5f).build(),
            OpenAiChatOptions.builder().withTopP(0.9f).build(),

            OpenAiChatOptions.builder().withPresencePenalty(0.5f).build(),
            OpenAiChatOptions.builder().withFrequencyPenalty(0.5f).build()
        );

        Map<OpenAiChatOptions, EvaluationResult> results = new HashMap<>();

        // 创建评估任务和指标
        EvaluationTask task = new QnAEvaluationTask();
        List<EvaluationMetric> metrics = List.of(
            new AccuracyMetric(),
            new RelevanceMetric()
        );

        // 对每组参数执行评估
        for (OpenAiChatOptions options : optionsList) {
            // 创建使用特定参数的客户端
            OpenAiChatClient parameterizedClient = new OpenAiChatClient(
                chatClient.getApi(),
                options
            );

            // 评估模型
            ModelEvaluator evaluator = new ModelEvaluator(parameterizedClient, task, metrics);
            EvaluationResult result = evaluator.evaluate(dataset);

            results.put(options, result);
        }

        return results;
    }
}

3. 多模型比较

@Service
public class ModelComparisonService {

    private final Map<String, ChatClient> chatClients;

    @Autowired
    public ModelComparisonService(
            @Qualifier("gpt35ChatClient") ChatClient gpt35Client,
            @Qualifier("gpt4ChatClient") ChatClient gpt4Client,
            @Qualifier("claudeClient") ChatClient claudeClient,
            @Qualifier("mistralClient") ChatClient mistralClient) {

        this.chatClients = Map.of(
            "GPT-3.5", gpt35Client,
            "GPT-4", gpt4Client,
            "Claude", claudeClient,
            "Mistral", mistralClient
        );
    }

    public ComparisonResult compareModels(EvaluationDataset dataset) {
        // 创建评估任务和指标
        EvaluationTask task = new QnAEvaluationTask();
        List<EvaluationMetric> metrics = List.of(
            new AccuracyMetric(),
            new RelevanceMetric(),
            new CompletenessMetric(),
            new LatencyMetric(),
            new TokenUsageMetric(),
            new CostMetric()
        );

        Map<String, EvaluationResult> results = new HashMap<>();

        // 对每个模型执行评估
        for (Map.Entry<String, ChatClient> entry : chatClients.entrySet()) {
            String modelName = entry.getKey();
            ChatClient chatClient = entry.getValue();

            // 评估模型
            ModelEvaluator evaluator = new ModelEvaluator(chatClient, task, metrics);
            EvaluationResult result = evaluator.evaluate(dataset);

            results.put(modelName, result);
        }

        // 比较结果
        return new ModelComparer().compare(results);
    }
}

评估结果可视化

Spring AI 支持多种方式可视化评估结果：

1. 表格报告

@Service
public class EvaluationReportService {

    public String generateTableReport(EvaluationResult result) {
        StringBuilder report = new StringBuilder();

        // 添加标题
        report.append("# 模型评估报告\n\n");

        // 添加总体指标
        report.append("## 总体指标\n\n");
        report.append("| 指标 | 分数 |\n");
        report.append("|------|------|\n");

        for (MetricResult metricResult : result.getMetricResults()) {
            report.append("| ")
                .append(metricResult.getMetricName())
                .append(" | ")
                .append(String.format("%.2f", metricResult.getScore()))
                .append(" |\n");
        }

        // 添加示例级别结果
        report.append("\n## 示例级别结果\n\n");
        report.append("| 输入 | 预期输出 | 实际输出 | 分数 |\n");
        report.append("|------|----------|----------|------|\n");

        for (ExampleResult exampleResult : result.getExampleResults()) {
            report.append("| ")
                .append(truncate(exampleResult.getExample().getInput(), 30))
                .append(" | ")
                .append(truncate(exampleResult.getExample().getExpectedOutput(), 30))
                .append(" | ")
                .append(truncate(exampleResult.getActualOutput(), 30))
                .append(" | ")
                .append(String.format("%.2f", exampleResult.getAverageScore()))
                .append(" |\n");
        }

        return report.toString();
    }

    private String truncate(String text, int maxLength) {
        if (text.length() <= maxLength) {
            return text;
        }
        return text.substring(0, maxLength - 3) + "...";
    }
}

2. 图表可视化

@Service
public class EvaluationChartService {

    public void generateBarChart(Map<String, EvaluationResult> results, String outputPath) {
        // 提取数据
        List<String> modelNames = new ArrayList<>(results.keySet());
        Map<String, List<Double>> metricScores = new HashMap<>();

        // 获取所有指标名称
        Set<String> metricNames = new HashSet<>();
        for (EvaluationResult result : results.values()) {
            for (MetricResult metricResult : result.getMetricResults()) {
                metricNames.add(metricResult.getMetricName());
            }
        }

        // 组织数据
        for (String metricName : metricNames) {
            List<Double> scores = new ArrayList<>();

            for (String modelName : modelNames) {
                EvaluationResult result = results.get(modelName);
                double score = result.getMetricResults().stream()
                    .filter(mr -> mr.getMetricName().equals(metricName))
                    .findFirst()
                    .map(MetricResult::getScore)
                    .orElse(0.0);

                scores.add(score);
            }

            metricScores.put(metricName, scores);
        }

        // 使用 JFreeChart 或其他库生成图表
        // ...
    }

    public void generateRadarChart(Map<String, EvaluationResult> results, String outputPath) {
        // 实现雷达图生成逻辑
        // ...
    }

    public void generateHeatmap(EvaluationResult result, String outputPath) {
        // 实现热图生成逻辑
        // ...
    }
}

模型评估最佳实践

在使用 Spring AI 进行模型评估时，请考虑以下最佳实践：

多样化数据集：确保评估数据集涵盖各种场景、难度级别和边缘情况。
多指标评估：使用多种指标评估模型，不要仅依赖单一指标。
人机结合评估：结合自动指标和人工评估，获得更全面的评估结果。
定期重新评估：随着模型和数据的变化，定期重新评估模型性能。
比较基准：与基准模型或之前的版本进行比较，了解性能变化。
考虑成本效益：评估模型性能与成本之间的平衡，选择最具成本效益的模型。
特定任务评估：针对特定应用场景设计评估任务和指标，而不是使用通用评估。

实际应用示例：模型评估框架

以下是一个完整的模型评估框架示例：

@SpringBootApplication
public class ModelEvaluationApplication {

    public static void main(String[] args) {
        SpringApplication.run(ModelEvaluationApplication.class, args);
    }
}

@Configuration
public class ModelConfig {

    @Bean
    @Qualifier("gpt35ChatClient")
    public ChatClient gpt35ChatClient(OpenAiApi openAiApi) {
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-3.5-turbo")
            .withTemperature(0.7f)
            .build();

        return new OpenAiChatClient(openAiApi, options);
    }

    @Bean
    @Qualifier("gpt4ChatClient")
    public ChatClient gpt4ChatClient(OpenAiApi openAiApi) {
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-4")
            .withTemperature(0.7f)
            .build();

        return new OpenAiChatClient(openAiApi, options);
    }

    @Bean
    @Qualifier("claudeClient")
    public ChatClient claudeClient(AnthropicApi anthropicApi) {
        AnthropicChatOptions options = AnthropicChatOptions.builder()
            .withModel("claude-3-opus-20240229")
            .withTemperature(0.7f)
            .build();

        return new AnthropicChatClient(anthropicApi, options);
    }
}

@Service
public class EvaluationService {

    private final Map<String, ChatClient> chatClients;
    private final EmbeddingClient embeddingClient;

    @Autowired
    public EvaluationService(
            @Qualifier("gpt35ChatClient") ChatClient gpt35Client,
            @Qualifier("gpt4ChatClient") ChatClient gpt4Client,
            @Qualifier("claudeClient") ChatClient claudeClient,
            EmbeddingClient embeddingClient) {

        this.chatClients = Map.of(
            "GPT-3.5", gpt35Client,
            "GPT-4", gpt4Client,
            "Claude", claudeClient
        );
        this.embeddingClient = embeddingClient;
    }

    public ComparisonResult evaluateModels(String datasetPath, String taskType) {
        // 加载数据集
        EvaluationDataset dataset = loadDataset(datasetPath);

        // 创建评估任务
        EvaluationTask task = createTask(taskType);

        // 创建评估指标
        List<EvaluationMetric> metrics = createMetrics();

        Map<String, EvaluationResult> results = new HashMap<>();

        // 对每个模型执行评估
        for (Map.Entry<String, ChatClient> entry : chatClients.entrySet()) {
            String modelName = entry.getKey();
            ChatClient chatClient = entry.getValue();

            // 评估模型
            ModelEvaluator evaluator = new ModelEvaluator(chatClient, task, metrics);
            EvaluationResult result = evaluator.evaluate(dataset);

            results.put(modelName, result);
        }

        // 比较结果
        return new ModelComparer().compare(results);
    }

    private EvaluationDataset loadDataset(String datasetPath) {
        if (datasetPath.endsWith(".csv")) {
            return EvaluationDataset.fromCsv(datasetPath);
        } else if (datasetPath.endsWith(".json")) {
            return EvaluationDataset.fromJson(datasetPath);
        } else if (datasetPath.endsWith(".yaml") || datasetPath.endsWith(".yml")) {
            return EvaluationDataset.fromYaml(datasetPath);
        } else {
            throw new IllegalArgumentException("Unsupported dataset format: " + datasetPath);
        }
    }

    private EvaluationTask createTask(String taskType) {
        switch (taskType) {
            case "qna":
                return new QnAEvaluationTask();
            case "classification":
                return new ClassificationEvaluationTask();
            case "summarization":
                return new SummarizationEvaluationTask();
            default:
                throw new IllegalArgumentException("Unsupported task type: " + taskType);
        }
    }

    private List<EvaluationMetric> createMetrics() {
        return List.of(
            new ExactMatchAccuracyMetric(),
            new SemanticSimilarityMetric(embeddingClient),
            new RelevanceMetric(),
            new CompletenessMetric(),
            new LatencyMetric(),
            new TokenUsageMetric(),
            new CostMetric()
        );
    }
}

@RestController
@RequestMapping("/api/evaluation")
public class EvaluationController {

    private final EvaluationService evaluationService;
    private final EvaluationReportService reportService;
    private final EvaluationChartService chartService;

    @Autowired
    public EvaluationController(
            EvaluationService evaluationService,
            EvaluationReportService reportService,
            EvaluationChartService chartService) {
        this.evaluationService = evaluationService;
        this.reportService = reportService;
        this.chartService = chartService;
    }

    @PostMapping
    public ResponseEntity<Map<String, Object>> evaluateModels(@RequestBody EvaluationRequest request) {
        try {
            // 执行评估
            ComparisonResult result = evaluationService.evaluateModels(
                request.getDatasetPath(),
                request.getTaskType()
            );

            // 生成报告
            String report = reportService.generateTableReport(result);

            // 生成图表
            String chartPath = "/tmp/evaluation_chart.png";
            chartService.generateBarChart(result.getResults(), chartPath);

            return ResponseEntity.ok(Map.of(
                "result", result,
                "report", report,
                "chartPath", chartPath
            ));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(Map.of("error", e.getMessage()));
        }
    }

    // 请求对象
    public static class EvaluationRequest {
        private String datasetPath;
        private String taskType;

        // Getters and setters
        // ...
    }
}

这个示例展示了一个完整的模型评估框架，它可以：

加载不同格式的评估数据集
支持多种评估任务
使用多种评估指标
评估和比较多个模型
生成表格报告和图表可视化
通过 REST API 提供评估服务

模型评估是 AI 应用开发中的关键环节，它帮助开发者了解和优化 AI 模型的性能。Spring AI 提供的全面评估框架，使开发者能够系统地评估和比较不同的 AI 模型和配置，为选择最适合特定应用的模型提供科学依据。

14. 总结与最佳实践

Spring AI 是一个强大而灵活的框架，为 Java 开发者提供了构建 AI 驱动应用的全套工具。本教程涵盖了 Spring AI 的核心功能和高级特性，从基本概念到实际应用。在本章中，我们将总结关键知识点，并提供一系列最佳实践，帮助你成功构建高质量的 AI 应用。

关键知识点总结

核心概念：Spring AI 提供了统一的抽象层，使开发者能够与各种 AI 模型和服务交互，而不必担心底层实现细节。核心接口如 ChatClient 和 EmbeddingClient 提供了一致的 API，简化了开发过程。
模型支持：Spring AI 支持多种 AI 模型提供商，包括 OpenAI、Azure OpenAI、Anthropic、Amazon Bedrock、Google Vertex AI 等。这种多样性使开发者能够选择最适合其需求的模型，并避免厂商锁定。
聊天客户端：ChatClient API 提供了与聊天模型交互的简洁接口，支持提示构建、流式响应、结构化输出等功能。
向量数据库：Spring AI 支持多种向量数据库，如 Chroma、Milvus、Neo4j、PGVector 等，为实现高效的语义搜索和 RAG 应用提供基础。
RAG 功能：检索增强生成（RAG）是 Spring AI 的核心功能之一，它结合了信息检索和生成式 AI，提高了模型输出的准确性和相关性。
多模态支持：Spring AI 支持文本-图像多模态交互，使应用能够处理和理解包含图像的输入。
工具调用：Spring AI 的工具调用功能使 AI 模型能够执行特定任务，如获取实时数据、执行计算或与外部系统交互。
聊天内存：Spring AI 提供了聊天内存抽象，使应用能够在多轮对话中保持上下文连贯性。
结构化输出：Spring AI 支持将模型的文本输出直接映射到 Java 对象，简化了数据处理流程。
可观测性：Spring AI 提供了全面的可观测性支持，包括日志记录、指标收集、追踪和审计，帮助开发者监控和优化 AI 系统。
模型评估：Spring AI 的评估框架使开发者能够系统地评估和比较不同的 AI 模型和配置。

最佳实践

架构设计

分层架构：采用分层架构设计，将 AI 功能与业务逻辑分离。

- 表示层（控制器、视图）
- 业务逻辑层（服务、领域模型）
- AI 交互层（AI 客户端、提示模板）
- 数据访问层（向量存储、数据库）

接口抽象：使用接口抽象 AI 功能，便于切换不同的实现和模拟测试。

public interface AiService {
    String generateResponse(String prompt);
    List<String> generateSuggestions(String context);
}

@Service
public class OpenAiService implements AiService {
    // 实现...
}

配置外部化：将 AI 模型配置外部化，便于在不同环境中切换。

# application.properties
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4
spring.ai.openai.chat.options.temperature=0.7

提示工程

提示模板化：使用模板化的提示，便于维护和一致性。

@Component
public class PromptTemplates {
    public static final String CUSTOMER_SUPPORT = """
        你是一个专业的客户支持助手。你的任务是帮助用户解决产品相关的问题。
        遵循以下准则:
        1. 保持礼貌和专业
        2. 提供简洁明了的解答
        3. 如果不确定答案，建议用户联系人工客服

        用户问题: {question}
        """;

    // 更多模板...
}

系统消息优化：精心设计系统消息，明确定义 AI 的角色和行为。

SystemMessage systemMessage = new SystemMessage("""
    你是一个专业的技术文档编写者。你的任务是将复杂的技术概念转化为清晰、简洁的文档。
    遵循以下风格指南:
    - 使用简单、直接的语言
    - 提供具体的代码示例
    - 解释技术术语
    - 使用列表和标题组织信息
    - 保持客观、准确的技术描述
    """);

少样本学习：在提示中包含示例，帮助模型理解预期输出格式。

String fewShotPrompt = """
    将以下文本翻译成中文:

    示例1:
    英文: The weather is nice today.
    中文: 今天天气很好。

    示例2:
    英文: I love programming with Java.
    中文: 我喜欢用Java编程。

    现在翻译:
    英文: %s
    中文:
    """.formatted(englishText);

错误处理

优雅降级：实现优雅降级策略，在 AI 服务不可用时提供备选方案。

@Service
public class ResilientAiService {

    private final ChatClient primaryClient;
    private final ChatClient fallbackClient;

    public String generateResponse(String prompt) {
        try {
            return primaryClient.call(prompt);
        } catch (Exception e) {
            log.warn("Primary AI service failed, falling back to secondary service", e);
            try {
                return fallbackClient.call(prompt);
            } catch (Exception fallbackError) {
                log.error("Fallback AI service also failed", fallbackError);
                return "抱歉，AI 服务暂时不可用，请稍后再试。";
            }
        }
    }
}

重试机制：对于临时性错误，实现智能重试机制。

@Service
public class RetryingAiService {

    private final ChatClient chatClient;

    public String generateResponseWithRetry(String prompt) {
        int maxRetries = 3;
        int retryCount = 0;

        while (retryCount < maxRetries) {
            try {
                return chatClient.call(prompt);
            } catch (Exception e) {
                if (isRetryableError(e) && retryCount < maxRetries - 1) {
                    retryCount++;
                    long backoffMs = (long) Math.pow(2, retryCount) * 1000; // 指数退避
                    log.warn("AI request failed, retrying in {}ms (attempt {}/{})", 
                             backoffMs, retryCount, maxRetries);
                    try {
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new RuntimeException("Retry interrupted", ie);
                    }
                } else {
                    throw new RuntimeException("AI service failed after retries", e);
                }
            }
        }

        throw new RuntimeException("Maximum retries exceeded");
    }

    private boolean isRetryableError(Exception e) {
        // 判断错误是否可重试
        return e instanceof RateLimitExceededException || 
               e instanceof ServiceUnavailableException;
    }
}

输入验证：在发送到 AI 模型前验证用户输入。

@Service
public class ValidatingAiService {

    private final ChatClient chatClient;

    public String generateResponse(String userInput) {
        // 验证输入不为空
        if (userInput == null || userInput.trim().isEmpty()) {
            throw new IllegalArgumentException("User input cannot be empty");
        }

        // 验证输入长度
        if (userInput.length() > 4000) {
            throw new IllegalArgumentException("User input exceeds maximum length");
        }

        // 验证输入内容
        if (containsProhibitedContent(userInput)) {
            throw new IllegalArgumentException("User input contains prohibited content");
        }

        return chatClient.call(userInput);
    }

    private boolean containsProhibitedContent(String text) {
        // 实现内容检查逻辑
        // ...
        return false;
    }
}

性能优化

缓存常见查询：对于常见或重复的查询，实现缓存机制。

@Service
public class CachingAiService {

    private final ChatClient chatClient;
    private final Cache<String, String> responseCache;

    public CachingAiService(ChatClient chatClient) {
        this.chatClient = chatClient;
        this.responseCache = Caffeine.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(1, TimeUnit.HOURS)
            .build();
    }

    public String generateResponse(String prompt) {
        // 尝试从缓存获取
        String cachedResponse = responseCache.getIfPresent(prompt);
        if (cachedResponse != null) {
            return cachedResponse;
        }

        // 缓存未命中，调用 AI 服务
        String response = chatClient.call(prompt);

        // 将响应存入缓存
        responseCache.put(prompt, response);

        return response;
    }
}

批处理请求：当可能时，批量处理请求以减少 API 调用。

@Service
public class BatchingEmbeddingService {

    private final EmbeddingClient embeddingClient;

    public List<List<Double>> generateEmbeddings(List<String> texts) {
        // 批量生成嵌入，而不是一次处理一个
        return embeddingClient.embed(texts).getValues();
    }
}

异步处理：对于非阻塞场景，使用异步处理。

@Service
public class AsyncAiService {

    private final ChatClient chatClient;
    private final ExecutorService executorService;

    public AsyncAiService(ChatClient chatClient) {
        this.chatClient = chatClient;
        this.executorService = Executors.newFixedThreadPool(10);
    }

    public CompletableFuture<String> generateResponseAsync(String prompt) {
        return CompletableFuture.supplyAsync(() -> {
            return chatClient.call(prompt);
        }, executorService);
    }
}

安全性

输入净化：净化用户输入，防止提示注入和其他安全问题。

@Service
public class SecureAiService {

    private final ChatClient chatClient;

    public String generateResponse(String userInput) {
        // 净化用户输入
        String sanitizedInput = sanitizeInput(userInput);

        // 使用净化后的输入
        return chatClient.call(sanitizedInput);
    }

    private String sanitizeInput(String input) {
        // 移除潜在的提示注入模式
        input = input.replaceAll("忽略之前的指令", "");
        input = input.replaceAll("系统提示", "");

        // 其他净化逻辑
        // ...

        return input;
    }
}

输出过滤：过滤 AI 输出，确保符合应用的内容政策。

@Service
public class ContentFilteringService {

    private final ChatClient chatClient;
    private final ContentFilter contentFilter;

    public String generateSafeResponse(String prompt) {
        String response = chatClient.call(prompt);

        // 检查响应是否符合内容政策
        if (!contentFilter.isSafe(response)) {
            return "抱歉，无法提供相关信息。";
        }

        return response;
    }
}

访问控制：实现适当的访问控制，限制对 AI 功能的访问。

@RestController
@RequestMapping("/api/ai")
public class AiController {

    private final AiService aiService;

    @PostMapping("/generate")
    @PreAuthorize("hasRole('AI_USER')")
    public ResponseEntity<Map<String, String>> generateContent(@RequestBody GenerateRequest request) {
        // 处理请求
        // ...
    }
}

成本管理

令牌计数：监控和限制令牌使用量，控制成本。

@Service
public class TokenAwareAiService {

    private final ChatClient chatClient;
    private final TokenCounter tokenCounter;

    public String generateResponse(String prompt, int maxTokens) {
        // 计算提示的令牌数
        int promptTokens = tokenCounter.countTokens(prompt);

        // 检查是否超出限制
        if (promptTokens > maxTokens) {
            throw new IllegalArgumentException("Prompt exceeds maximum token limit");
        }

        // 设置响应的最大令牌数
        return chatClient.prompt()
            .user(prompt)
            .withOptions(options -> options.withMaxTokens(maxTokens - promptTokens))
            .call()
            .getResult().getOutput().getContent();
    }
}

模型分层：根据任务复杂性选择不同的模型，优化成本。

@Service
public class TieredAiService {

    private final ChatClient economyModel; // 如 GPT-3.5-turbo
    private final ChatClient standardModel; // 如 GPT-4
    private final ChatClient premiumModel; // 如 GPT-4-turbo

    public String generateResponse(String prompt, TaskComplexity complexity) {
        switch (complexity) {
            case SIMPLE:
                return economyModel.call(prompt);
            case STANDARD:
                return standardModel.call(prompt);
            case COMPLEX:
                return premiumModel.call(prompt);
            default:
                return economyModel.call(prompt);
        }
    }

    public enum TaskComplexity {
        SIMPLE, STANDARD, COMPLEX
    }
}

使用率监控：监控 AI 服务的使用率，识别优化机会。

@Service
public class UsageMonitoringService {

    private final MeterRegistry meterRegistry;

    public void recordUsage(String modelName, int inputTokens, int outputTokens, double cost) {
        // 记录使用量指标
        meterRegistry.counter("ai.tokens.input", "model", modelName).increment(inputTokens);
        meterRegistry.counter("ai.tokens.output", "model", modelName).increment(outputTokens);
        meterRegistry.counter("ai.cost", "model", modelName).increment(cost);
    }

    public Map<String, Double> getMonthlyUsage() {
        // 实现获取月度使用量的逻辑
        // ...
        return new HashMap<>();
    }
}

测试策略

单元测试：使用模拟对象测试 AI 服务的业务逻辑。

@ExtendWith(MockitoExtension.class)
public class AiServiceTest {

    @Mock
    private ChatClient chatClient;

    @InjectMocks
    private AiService aiService;

    @Test
    public void testGenerateResponse() {
        // 设置模拟行为
        when(chatClient.call(anyString())).thenReturn(
            new ChatResponse(new Generation("模拟的 AI 响应"), Map.of())
        );

        // 调用被测方法
        String response = aiService.generateResponse("测试提示");

        // 验证结果
        assertEquals("模拟的 AI 响应", response);
        verify(chatClient).call(anyString());
    }
}

集成测试：测试与实际 AI 服务的集成。

@SpringBootTest
public class AiIntegrationTest {

    @Autowired
    private AiService aiService;

    @Test
    public void testIntegrationWithAiService() {
        // 使用简单的提示，减少成本
        String response = aiService.generateResponse("Hello, world!");

        // 验证响应不为空
        assertNotNull(response);
        assertFalse(response.isEmpty());
    }
}

回归测试：使用固定的提示集和预期响应进行回归测试。

@SpringBootTest
public class AiRegressionTest {

    @Autowired
    private AiService aiService;

    @ParameterizedTest
    @CsvFileSource(resources = "/ai-regression-tests.csv", numLinesToSkip = 1)
    public void testRegressionCases(String prompt, String expectedResponsePattern) {
        String response = aiService.generateResponse(prompt);

        // 使用正则表达式匹配预期响应模式
        assertTrue(response.matches(expectedResponsePattern),
                  "Response does not match expected pattern");
    }
}