A Complete Guide to Calling Gemini 3.1 Pro from Java

大树_zhao · published 2026-05-07 15:59:14

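Overview

Gemini 3.1 Pro is the flagship model Google DeepMind released in February 2026. It scores 77.1% on ARC-AGI-2, uses a MoE (mixture-of-experts) architecture, and extends the context window to 1 million tokens. Pricing is unchanged from the previous generation: 2 USD per million input tokens and 12 USD per million output tokens. It accepts text, image, PDF, and video input, and its three thinking levels (Low/Medium/High) adjust reasoning depth to the complexity of the task.

Java is one of the most widely used languages in enterprise projects. Google provides an official Java SDK for the Gemini API, and community frameworks such as Spring AI and LangChain4j offer integrations as well. This article documents a hands-on Java integration with Gemini 3.1 Pro, covering SDK configuration, basic calls, multimodal input, streaming output, function calling, and Spring Boot integration.

KULAAI (c.877ai.cn) is an AI model aggregation platform. It offers direct access from mainland China and a unified interface to Gemini 3.1 Pro, GPT-5.5, Claude, DeepSeek, and other mainstream models. A single key switches between models, which suits quick validation and side-by-side model comparison.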

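Overall architecture

There are two routes for calling Gemini 3.1 Pro from Java:

Route 1: Google's official Java SDK

```text
Java code
  ↓
google-genai Java SDK
  ↓
HTTPS request (API key authentication)
  ↓
Google API Gateway
  ↓
Gemini 3.1 Pro inference cluster
  ↓
Response parsing → business logic
```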

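Route 2: OpenAI-compatible interface (through an aggregation platform)

```text
Java code
  ↓
OpenAI Java SDK / OkHttp / HttpClient
  ↓
HTTPS request
  ↓
Aggregation platform gateway (protocol translation)
  ↓
Gemini 3.1 Pro inference cluster
  ↓
Response parsing → business logic
```

The two routes differ mainly in the SDK and the authentication method. Route 1 has to deal with Google's regional restrictions and risk controls. Route 2 relays through the aggregation platform, so it connects directly from mainland China with no extra setup. The business logic code is essentially the same on both routes.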

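Glossary

google-genai Java SDK: Google's official Java SDK for calling Gemini models. It supports text generation, multimodal input, function calling, streaming output, and the rest of the API's features. It is added as a Maven or Gradle dependency.

API Key: The credential that authenticates calls to the Gemini API, obtained from Google AI Studio. It is shown only once after creation, so store it carefully. Three hard rules apply: never hard-code it in source, never commit it to a repository, and always manage it through a configuration file or an environment variable.

generateContent: The core method of the Gemini API. It takes text or multimodal input and returns the content the model generates. The Java SDK exposes it as generateContent() and supports both synchronous and streaming calls.

MoE (mixture-of-experts architecture): The underlying architecture of Gemini 3.1 Pro. The model contains multiple expert sub-networks, and at inference time a gating network routes each token to the most suitable experts based on the input's semantics. Because only some of the experts are activated, the model reaches the same quality with less compute, which is the technical basis for keeping the price at 2 USD per million input tokens.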
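The routing step described above can be illustrated with a toy top-k selection. This is only a sketch of the general idea, not Gemini's actual router: the class name `TopKRouter`, the four experts, and the gate scores are all hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class TopKRouter {
    /** Returns the indices of the k largest gate scores, highest score first. */
    static int[] route(double[] gateScores, int k) {
        Integer[] order = new Integer[gateScores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort expert indices by descending score; ties keep the lower index first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -gateScores[i]));
        int[] chosen = new int[Math.min(k, order.length)];
        for (int i = 0; i < chosen.length; i++) {
            chosen[i] = order[i];
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Hypothetical gate scores for one token across four experts.
        double[] scores = {0.10, 0.55, 0.05, 0.30};
        System.out.println(Arrays.toString(route(scores, 2))); // prints [1, 3]
    }
}
```

Only the two selected experts would run for this token, which is where the compute saving comes from.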

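Thinking Levels: Gemini 3.1 Pro's tiered reasoning mechanism. Low responds in about 1 second and suits simple tasks, Medium takes about 3 seconds and suits routine tasks, and High takes about 5 seconds and suits complex reasoning. On the same task, the accuracy gap between Low and High can reach 21 percentage points.

SSE (Server-Sent Events): A server push protocol used to implement streaming output. Gemini API streaming responses are based on this protocol, and the Java side receives the data chunk by chunk through Flowable or Flux.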
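To make the wire format concrete, here is a minimal sketch of reading an SSE stream line by line. It assumes the OpenAI-compatible convention in which each chunk arrives on a `data:` line and the stream ends with a `[DONE]` sentinel (that sentinel is a convention of those APIs, not part of SSE itself). The class name `SseLines` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SseLines {
    /** Collects the payload of every "data:" line, stopping at the "[DONE]" sentinel. */
    static List<String> payloads(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("data:")) {
                continue; // blank separators, ":" comments, and other fields are skipped
            }
            String data = line.substring(5).trim();
            if (data.equals("[DONE]")) {
                break;
            }
            out.add(data);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> stream = List.of(
                "data: {\"delta\":\"Hel\"}",
                "",
                ": keep-alive",
                "data: {\"delta\":\"lo\"}",
                "data: [DONE]");
        System.out.println(payloads(stream));
    }
}
```

Each collected payload is a JSON document that still has to be parsed to extract the text delta.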

Technical details

1. Maven dependency configuration

Option 1: Google's official SDK

```xml
<dependency>
  <groupId>com.google.genai</groupId>
  <artifactId>google-genai</artifactId>
  <version>1.0.0</version>
</dependency>
```

Option 2: Direct calls with OkHttp (suited to the OpenAI-compatible interface)

```xml
<dependency>
  <groupId>com.squareup.okhttp3</groupId>
  <artifactId>okhttp</artifactId>
  <version>4.12.0</version>
</dependency>
<dependency>
  <groupId>com.google.code.gson</groupId>
  <artifactId>gson</artifactId>
  <version>2.11.0</version>
</dependency>
```

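Option 3: Spring AI integration

```xml
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
```

No version is given here, so it has to come from the Spring AI BOM. Spring AI 1.0 and later publish this starter under the artifact ID spring-ai-starter-model-openai.

Spring AI talks to Gemini through the OpenAI-compatible interface. Pointing it at the aggregation platform only requires changing the base-url setting.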

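2. Basic text calls

With Google's official SDK:

```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GeminiBasicDemo {
    public static void main(String[] args) {
        // Read the key from the environment instead of hard-coding it.
        Client client = Client.builder()
                .apiKey(System.getenv("GEMINI_API_KEY"))
                .build();

        // Arguments: model name, prompt text, optional config (null uses the defaults).
        GenerateContentResponse response = client.models.generateContent(
                "gemini-3.1-pro-preview",
                "Explain the difference between microservice and monolithic architectures",
                null
        );

        System.out.println(response.text());
    }
}
```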

With OkHttp (OpenAI-compatible interface):

```java
import okhttp3.*;
import com.google.gson.*;
import java.io.IOException;

public class GeminiOkHttpDemo {
    private static final String BASE_URL = "https://c.877ai.cn/v1";
    // Read the platform key from the environment instead of hard-coding it.
    private static final String API_KEY = System.getenv("GEMINI_API_KEY");
    private static final OkHttpClient client = new OkHttpClient();
    private static final Gson gson = new GsonBuilder().create();

    public static String chat(String userMessage) throws IOException {
        JsonObject request = new JsonObject();
        request.addProperty("model", "gemini-3.1-pro-preview");
        request.addProperty("temperature", 0.3);

        // Build the message list: one system message, one user message.
        JsonArray messages = new JsonArray();

        JsonObject systemMsg = new JsonObject();
        systemMsg.addProperty("role", "system");
        systemMsg.addProperty("content", "You are a professional technical assistant");
        messages.add(systemMsg);

        JsonObject userMsg = new JsonObject();
        userMsg.addProperty("role", "user");
        userMsg.addProperty("content", userMessage);
        messages.add(userMsg);

        request.add("messages", messages);

        RequestBody body = RequestBody.create(
                gson.toJson(request),
                MediaType.parse("application/json")
        );

        Request httpRequest = new Request.Builder()
                .url(BASE_URL + "/chat/completions")
                .header("Authorization", "Bearer " + API_KEY)
                .post(body)
                .build();

        try (Response response = client.newCall(httpRequest).execute()) {
            String responseBody = response.body() != null ? response.body().string() : "";
            // Fail with the status code instead of trying to parse an error payload.
            if (!response.isSuccessful()) {
                throw new IOException("HTTP " + response.code() + ": " + responseBody);
            }
            JsonObject resp = gson.fromJson(responseBody, JsonObject.class);
            return resp.getAsJsonArray("choices")
                    .get(0).getAsJsonObject()
                    .getAsJsonObject("message")
                    .get("content").getAsString();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(chat("What is the MoE architecture?"));
    }
}
```
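The glossary's rule that keys stay out of source code can be sketched as a small loader that reads the key from an environment variable and fails fast when it is missing. The class name `ApiKeyLoader` is hypothetical, and `GEMINI_API_KEY` is simply the variable name the Spring configuration later in this article uses; any name works.

```java
import java.util.Map;

final class ApiKeyLoader {
    /** Returns the trimmed key stored under varName, or throws if it is missing or blank. */
    static String load(Map<String, String> env, String varName) {
        String key = env.get(varName);
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("Environment variable " + varName + " is not set");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        // System.getenv() supplies the real environment; taking a Map keeps the helper testable.
        String key = load(System.getenv(), "GEMINI_API_KEY");
        System.out.println("Loaded a key of length " + key.length());
    }
}
```

Failing at startup is a deliberate choice here: a missing key then surfaces as a clear configuration error rather than as a 401 on the first request.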

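3. Key parameter configuration

temperature: Accepted range 0.0–2.0, default 0.75. Use 0.3 for code generation, 0.3–0.5 for factual Q&A, and 0.7–0.85 for creative writing. Avoid values above 1.5, which tend to trigger non-convergent sampling paths.

system_instruction: The system-level instruction, limited to 2048 Unicode characters. Longer content is silently truncated with no error. This pitfall is hard to spot: no error message appears, but output quality drops for no obvious reason.

max_output_tokens: Controlled by both a soft cap and a hard cap. When the input contains image data, every 100 KB of image data automatically lowers the hard cap by 128 tokens.

response_mime_type: When set to application/json, the model automatically completes the JSON structure. Setting it to application/octet-stream triggers an error immediately.

safety_settings: Supports per-category threshold overrides. Categories that are not declared inherit the global default policy, BLOCK_ONLY_HIGH.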
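Because several of these limits fail silently, it can help to check a request on the client before sending it. The sketch below encodes the three numeric rules quoted above. The class name `RequestPreflight` and the example hard cap of 8192 are hypothetical. It also assumes that "Unicode characters" means code points and that only whole 100 KB blocks of image data count, neither of which the article states.

```java
final class RequestPreflight {
    static final int MAX_SYSTEM_INSTRUCTION_CHARS = 2048; // limit quoted in the article

    /** Clamps a requested temperature into the accepted 0.0 to 2.0 range. */
    static double clampTemperature(double requested) {
        return Math.max(0.0, Math.min(2.0, requested));
    }

    /** Counts code points, so a character outside the BMP (an emoji, say) counts once. */
    static boolean systemInstructionFits(String instruction) {
        return instruction.codePointCount(0, instruction.length()) <= MAX_SYSTEM_INSTRUCTION_CHARS;
    }

    /** Applies the article's rule: the hard cap drops 128 tokens per 100 KB of image data. */
    static int adjustedHardCap(int hardCap, long imageBytes) {
        long blocks = imageBytes / (100L * 1024); // assumption: partial blocks are ignored
        return (int) Math.max(0, hardCap - blocks * 128);
    }

    public static void main(String[] args) {
        System.out.println(clampTemperature(2.5));
        System.out.println(systemInstructionFits("You are a professional technical assistant"));
        System.out.println(adjustedHardCap(8192, 250L * 1024)); // hypothetical cap, 250 KB image
    }
}
```

Rejecting an over-long system instruction locally turns the silent truncation into a visible error.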

4. Streaming output

In enterprise scenarios, streaming output is close to mandatory. Making users wait 3 to 5 seconds after asking a question before the first character appears is a poor experience.

Streaming output with OkHttp:

```java
// Method for GeminiOkHttpDemo. Add these imports to the class:
//   import okio.BufferedSource;
//   import java.io.IOException;
public static void streamChat(String userMessage) throws Exception {
    JsonObject request = new JsonObject();
    request.addProperty("model", "gemini-3.1-pro-preview");
    request.addProperty("stream", true);
    request.addProperty("temperature", 0.3);

    JsonArray messages = new JsonArray();
    JsonObject userMsg = new JsonObject();
    userMsg.addProperty("role", "user");
    userMsg.addProperty("content", userMessage);
    messages.add(userMsg);
    request.add("messages", messages);

    RequestBody body = RequestBody.create(
            gson.toJson(request), MediaType.parse("application/json")
    );

    Request httpRequest = new Request.Builder()
            .url(BASE_URL + "/chat/completions")
            .header("Authorization", "Bearer " + API_KEY)
            .post(body)
            .build();

    try (Response response = client.newCall(httpRequest).execute()) {
        if (!response.isSuccessful()) {
            throw new IOException("HTTP " + response.code());
        }
        BufferedSource source = response.body().source();
        while (!source.exhausted()) {
            String line = source.readUtf8Line();
            // Only "data: " lines carry chunks; skip blank separators and other fields.
            if (line == null || !line.startsWith("data: ")) continue;
            String data = line.substring(6).trim();
            if ("[DONE]".equals(data)) break;

            JsonObject chunk = gson.fromJson(data, JsonObject.class);
            JsonArray choices = chunk.getAsJsonArray("choices");
            if (choices == null || choices.size() == 0) continue;
            // Look the delta up once; it may have no content (for example, a role-only chunk).
            JsonObject delta = choices.get(0).getAsJsonObject().getAsJsonObject("delta");
            if (delta != null && delta.has("content") && !delta.get("content").isJsonNull()) {
                System.out.print(delta.get("content").getAsString());
            }
        }
    }
}
```

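5. Spring Boot integration

Spring AI offers the most concise way to integrate Gemini 3.1 Pro:

```yaml
# application.yml
spring:
  ai:
    openai:
      api-key: ${GEMINI_API_KEY}
      # Spring AI appends /v1/chat/completions by default, so /v1 is left off here.
      base-url: https://c.877ai.cn
      chat:
        options:
          model: gemini-3.1-pro-preview
          temperature: 0.3
```

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are an internal knowledge-base assistant; answer concisely and accurately")
                .build();
    }

    // Blocking call: returns the full answer once generation finishes.
    @GetMapping
    public String chat(@RequestParam String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }

    // Streaming call: text/event-stream lets each chunk be flushed as it arrives.
    @GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> streamChat(@RequestParam String question) {
        return chatClient.prompt()
                .user(question)
                .stream()
                .content();
    }
}
```

Switching to another model only requires changing the base-url and model values in application.yml. The code stays exactly the same, and that is the value of an aggregation platform's unified interface.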

6. Function calling

Gemini 3.1 Pro supports function calling. The Java side declares a tool schema and lets the model decide what to call:

```java
// Declare the tool: its name, a description, and a JSON Schema for its parameters
JsonObject functionDef = new JsonObject();
functionDef.addProperty("name", "query_database");
functionDef.addProperty("description", "Query the internal company database for product information");

JsonObject params = new JsonObject();
params.addProperty("type", "object");

JsonObject properties = new JsonObject();
JsonObject queryProp = new JsonObject();
queryProp.addProperty("type", "string");
queryProp.addProperty("description", "SQL query condition");
properties.add("query", queryProp);

params.add("properties", properties);

JsonArray required = new JsonArray();
required.add("query");
params.add("required", required);

functionDef.add("parameters", params);

// Attach the tool to the request body (request is the JsonObject built as in chat())
JsonObject tool = new JsonObject();
tool.addProperty("type", "function");
tool.add("function", functionDef);

JsonArray tools = new JsonArray();
tools.add(tool);
request.add("tools", tools);
```
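Declaring the tool is only half of the loop: when the model decides to call it, the application has to run the function and send the result back. In the OpenAI-compatible format the call arrives under `tool_calls`, with `function.name` and a JSON string in `function.arguments`. The sketch below shows only the dispatch step, assuming the arguments have already been parsed into a map. The class name `ToolDispatcher` and the handler body are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ToolDispatcher {
    private final Map<String, Function<Map<String, String>, String>> handlers = new HashMap<>();

    void register(String name, Function<Map<String, String>, String> handler) {
        handlers.put(name, handler);
    }

    /** Runs the handler registered for the tool name; unknown names yield an error payload. */
    String dispatch(String name, Map<String, String> arguments) {
        Function<Map<String, String>, String> handler = handlers.get(name);
        if (handler == null) {
            return "{\"error\":\"unknown tool: " + name + "\"}";
        }
        return handler.apply(arguments);
    }

    public static void main(String[] args) {
        ToolDispatcher dispatcher = new ToolDispatcher();
        // Hypothetical stand-in for a real database lookup.
        dispatcher.register("query_database",
                a -> "{\"rows\":0,\"query\":\"" + a.get("query") + "\"}");
        System.out.println(dispatcher.dispatch("query_database", Map.of("query", "category = 'laptop'")));
    }
}
```

The returned string would go back to the model as a `tool` role message. Since the model writes the `query` argument, a real handler should treat it as untrusted input and never splice it directly into SQL.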

GPT-5.5 function calling works the same way in the Java ecosystem. Switching to GPT-5.5 only requires changing the model parameter.

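7. Error handling

```java
// Methods for GeminiOkHttpDemo. HttpStatusException is a small helper defined here;
// chat() has to throw it with the response status code for non-2xx responses.
public static class HttpStatusException extends RuntimeException {
    private final int code;

    public HttpStatusException(int code, String message) {
        super(message);
        this.code = code;
    }

    public int code() {
        return code;
    }
}

public static String chatWithRetry(String message, int maxRetries) throws Exception {
    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return chat(message);
        } catch (HttpStatusException e) {
            if (e.code() != 429 && e.code() < 500) {
                throw e; // other 4xx errors are not retried
            }
            // Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter
            long wait = (1L << attempt) * 1000 + (long) (Math.random() * 1000);
            System.out.printf("Error %d, retrying in %d ms%n", e.code(), wait);
            try {
                Thread.sleep(wait);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new RuntimeException(ie);
            }
        }
    }
    throw new RuntimeException("Still failing after " + maxRetries + " attempts");
}
```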

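429 and 5xx responses are worth retrying. Other 4xx errors are not, because resending the same request fails the same way. The wait time follows exponential backoff plus a random offset.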
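The retry policy can be isolated into two pure functions, which makes the schedule easy to inspect and test. This is a sketch under the same rule as above (base delay of 1 second, doubling per attempt); the class name `Backoff` is hypothetical, and the jitter is passed in as a parameter so that the result is deterministic.

```java
final class Backoff {
    /** Delay in milliseconds for a zero-based attempt: 2^attempt seconds plus the given jitter. */
    static long delayMillis(int attempt, long jitterMillis) {
        return (1L << attempt) * 1000 + jitterMillis;
    }

    /** 429 (rate limited) and 5xx (server error) are retryable; other statuses are not. */
    static boolean isRetryable(int status) {
        return status == 429 || status >= 500;
    }

    public static void main(String[] args) {
        // With zero jitter the schedule is 1000, 2000, 4000, 8000 ms.
        for (int attempt = 0; attempt < 4; attempt++) {
            System.out.println(delayMillis(attempt, 0));
        }
    }
}
```

In production code the jitter would come from a random source, so that many clients hitting a rate limit at once do not all retry at the same instant.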

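8. API pricing

Model	Input (per 1M tokens)	Output (per 1M tokens)
Gemini 3.1 Pro	$2.00	$12.00
GPT-5.5	$5.00	$30.00
Claude Opus 4.6	$15.00	$75.00

Above 200K tokens of context, Gemini 3.1 Pro input rises to $4.00 per million tokens and output to $18.00 per million tokens. On a context-cache hit, input costs only $0.50 per million tokens.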
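As a worked example of the Gemini 3.1 Pro rates above, the sketch below estimates the cost of a single request. It assumes the higher tier applies to the whole request once the input exceeds 200K tokens, which the article does not spell out, and it ignores cache discounts. The class name `CostEstimator` is hypothetical.

```java
final class CostEstimator {
    /** Estimated USD cost of one request under the Gemini 3.1 Pro rates quoted in the article. */
    static double estimateUsd(long inputTokens, long outputTokens) {
        // Assumption: exceeding 200K input tokens moves the whole request to the higher tier.
        boolean longContext = inputTokens > 200_000;
        double inputRate = longContext ? 4.00 : 2.00;    // USD per million input tokens
        double outputRate = longContext ? 18.00 : 12.00; // USD per million output tokens
        return (inputTokens * inputRate + outputTokens * outputRate) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 50K input + 2K output: (50,000 x 2 + 2,000 x 12) / 1,000,000 = 0.124 USD
        System.out.println(estimateUsd(50_000, 2_000));
        // 300K input + 2K output: (300,000 x 4 + 2,000 x 18) / 1,000,000 = 1.236 USD
        System.out.println(estimateUsd(300_000, 2_000));
    }
}
```

The second case shows how crossing the 200K threshold roughly doubles the input cost per token, which is worth keeping in mind before filling the 1M-token window.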

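Summary

There are three ways to call Gemini 3.1 Pro from Java: connecting directly with Google's official SDK, calling the OpenAI-compatible interface with OkHttp, and integrating through the Spring AI framework. For enterprise projects, Spring AI is the recommended choice: its configuration is concise, its ecosystem is mature, and switching models only requires a configuration change.

Watch for a few implicit rules during integration: a system_instruction longer than 2048 characters is silently truncated, max_output_tokens is lowered automatically when images are in the input, and a temperature above 1.5 tends to trigger non-convergent sampling. None of these pitfalls raises an error, but each directly affects output quality.

Developers in mainland China can first validate the core call flow and the effect of each parameter through an aggregation platform, then choose between a direct connection and a relay based on business needs. A unified interface manages multiple models, and switching takes a single configuration change, which suits quick validation and flexible model selection.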
