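DeepSeek-R1-Distill-Qwen-1.5B in Practice: Building a Plugin to Integrate with an Existing CRM System

This article shows how to deploy the DeepSeek-R1-Distill-Qwen-1.5B image on the 星图 GPU platform and build a plugin that integrates it into an existing CRM system. The lightweight language model can automate tasks such as replying to customer inquiries and drafting sales emails, which improves the efficiency of customer relationship management.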


CeLaMbDa · Published 2026-03-23 00:47:15


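If you are looking for an AI assistant for your CRM system that is both lightweight and capable, DeepSeek-R1-Distill-Qwen-1.5B may be the right fit. With 1.5B parameters it delivers reasonable quality while keeping hardware requirements modest, which makes it well suited to integration into existing business systems.

Picture this: your sales team handles a large volume of customer inquiries every day, and support staff need to reply to emails quickly, organize customer information, and analyze sales data. Doing all of this by hand is slow and error-prone. An assistant built into the CRM that takes over the repetitive work lets the sales team spend more time on core business.

In this article I walk through the whole path step by step, from model deployment to plugin development to actual integration, so you can see how to put a lightweight model to real use.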

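1. Why choose DeepSeek-R1-Distill-Qwen-1.5B

Before the implementation, let's be clear about why this model. There are plenty of models to choose from, ranging from hundreds of billions of parameters down to a few hundred million, so which one fits best?

1.1 Model characteristics

DeepSeek-R1-Distill-Qwen-1.5B has several characteristics that make it a good fit for enterprise applications:

Lightweight design: 1.5B parameters may sound like a lot, but it is small by current LLM standards. The model can run on an ordinary server, or even an edge device, without an expensive GPU cluster.

Balanced performance: despite the small parameter count, knowledge distillation lets it reportedly retain more than 85% of the original model's capability, though the exact figure depends on the benchmark. For most enterprise scenarios that is sufficient.

Hardware friendly: the model can be quantized to INT8, which cuts weight memory by about 75% relative to FP32 (about 50% relative to FP16/BF16). That makes real-time inference feasible on an entry-level GPU such as the NVIDIA T4 and lowers deployment cost considerably.

Domain tuning: the training data reportedly includes domain-specific material such as legal documents and medical consultations, which helps in specialized fields.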
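The memory claim is easy to sanity-check with arithmetic. A 75% saving corresponds to going from 4-byte FP32 weights to 1-byte INT8 weights; against 2-byte FP16/BF16 weights the saving is 50%. The sketch below assumes a nominal 1.5 billion parameters and counts only the weights (no KV cache, activations, or framework overhead), so the numbers are lower bounds rather than measurements.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


def saving(from_bytes: float, to_bytes: float) -> float:
    """Fraction of weight memory saved when moving between two precisions."""
    return 1 - to_bytes / from_bytes


NUM_PARAMS = 1.5e9  # nominal parameter count (assumption for illustration)

if __name__ == "__main__":
    for name, nbytes in [("FP32", 4), ("FP16/BF16", 2), ("INT8", 1)]:
        print(f"{name:10s} {weight_memory_gib(NUM_PARAMS, nbytes):.2f} GiB")
    print(f"INT8 vs FP32: {saving(4, 1):.0%} smaller")
    print(f"INT8 vs FP16: {saving(2, 1):.0%} smaller")
```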

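1.2 Fit for CRM scenarios

In a CRM system, the AI assistant mainly handles these kinds of tasks:

Customer inquiry replies: answer common questions automatically
Email drafting: help sales staff write follow-up emails
Data extraction: pull key information out of conversations
Report generation: produce analysis reports from sales data

These tasks do not demand especially strong reasoning, but they do demand fast and stable responses. DeepSeek-R1-Distill-Qwen-1.5B meets those requirements.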

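2. Deploying the model service quickly

For the model to work inside the CRM system, it first has to be deployed as a callable service. I use vLLM here, an inference framework optimized for serving large language models.

2.1 Environment preparation

Before you start, make sure the server meets these requirements:

Ubuntu 20.04 or later
Python 3.8+
At least 8 GB of RAM (plus an NVIDIA GPU if you want GPU inference)
20 GB of free disk space

On a cloud server, choose a GPU instance; inference is much faster. Running without a GPU is possible but slower, and because the vLLM wheels on PyPI are built for CUDA, a CPU-only setup generally means building vLLM's CPU backend from source.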
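The requirements above can be checked before installing anything. The following is a minimal preflight sketch using only the standard library; the function name `check_environment` and its thresholds are illustrative, and finding `nvidia-smi` on the PATH is only a hint that an NVIDIA driver is present, not proof that the GPU is usable.

```python
import shutil
import sys


def check_environment(path: str = ".", min_python=(3, 8), min_free_gb: float = 20.0) -> dict:
    """Return a dict mapping each check name to True/False."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "python_version": sys.version_info[:2] >= tuple(min_python),
        "disk_space": free_gb >= min_free_gb,
        # nvidia-smi on PATH only suggests that an NVIDIA driver is installed
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name:18s} {'OK' if ok else 'MISSING'}")
```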

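2.2 Installing dependencies

First create a dedicated working directory, then install the required packages:

# Create the working directory
mkdir -p /root/workspace
cd /root/workspace

# Create a virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate

# Install vLLM and related dependencies
pip install vllm==0.4.2
pip install openai==1.12.0
pip install fastapi==0.104.1
pip install uvicorn==0.24.0

pip pulls in a matching PyTorch build, with its bundled CUDA runtime libraries, as a vLLM dependency, so you do not need to install those separately. The NVIDIA driver itself must already be installed on the host.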
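After installation it is worth confirming that the pinned versions are the ones actually present in the virtual environment. This sketch uses the standard-library `importlib.metadata`; the helper name `installed_versions` is illustrative.

```python
from importlib import metadata  # part of the standard library since Python 3.8


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    pinned = {"vllm": "0.4.2", "openai": "1.12.0", "fastapi": "0.104.1", "uvicorn": "0.24.0"}
    found = installed_versions(pinned)
    for name, wanted in pinned.items():
        status = "ok" if found[name] == wanted else f"expected {wanted}"
        print(f"{name:8s} {found[name]}  ({status})")
```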

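2.3 Starting the model service

Now start the model service. A startup script makes later management easier:

# Create the startup script
cat > start_model.sh << 'EOF'
#!/bin/bash

# Model path: a local directory containing the weights, or the Hugging Face
# repo ID deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B to download them on first run
MODEL_PATH="DeepSeek-R1-Distill-Qwen-1.5B"

# Start the vLLM OpenAI-compatible server
python -m vllm.entrypoints.openai.api_server \
    --model "$MODEL_PATH" \
    --served-model-name DeepSeek-R1-Distill-Qwen-1.5B \
    --host 0.0.0.0 \
    --port 8000 \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.9 \
    --enforce-eager \
    --disable-log-requests
EOF

# Make the script executable
chmod +x start_model.sh

# Start the service in the background
nohup ./start_model.sh > deepseek_qwen.log 2>&1 &

The script and the launch command do several things:

Specify the model to load
Set the address and port the service listens on
Set the maximum context length (prompt plus generated tokens)
Cap vLLM's GPU memory usage at 90% of the device
Run the service in the background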

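2.4 Checking service status

Startup takes a while, especially on the first run if the model has to be downloaded. Follow the log to watch progress:

# Follow the startup log
cd /root/workspace
tail -f deepseek_qwen.log

Output similar to the following means the service is up (exact log lines vary by vLLM version; the dependable signal is the final line reporting the address the server is listening on):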

INFO 11-28 14:30:15 llm_engine.py:72] Initializing an LLM engine with config: ...
INFO 11-28 14:30:20 model_runner.py:84] Loading model weights...
INFO 11-28 14:30:45 model_runner.py:121] Model loaded successfully.
INFO 11-28 14:30:46 llm_engine.py:201] Engine created successfully.
INFO 11-28 14:30:47 api_server.py:127] Server started at http://0.0.0.0:8000

If everything went well, you now have a model service running on port 8000.
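Instead of watching the log by hand, a script can poll the server until it answers. The sketch below assumes the service exposes the OpenAI-compatible `/v1/models` route on port 8000, as configured above; the helper names `server_is_up` and `wait_until_ready` and the default timings are illustrative.

```python
import time
import urllib.request


def server_is_up(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/v1/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, connection refused, and timeouts all derive from OSError
        return False


def wait_until_ready(probe=server_is_up, max_wait: float = 300.0,
                     interval: float = 5.0, sleep=time.sleep) -> bool:
    """Call probe() every `interval` seconds until it succeeds or max_wait elapses."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    print("ready" if wait_until_ready(max_wait=10.0, interval=2.0) else "timed out")
```

Passing `probe` and `sleep` as parameters keeps the polling loop testable without a running server.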

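3. Testing the model service

Before writing the plugin, confirm that the model service works. The test script below exercises basic chat, a CRM scenario, and streaming output.

3.1 Basic functionality test

Create a test file to verify the model's basic chat ability: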

# test_basic.py
from openai import OpenAI
import time

class ModelTester:
    def __init__(self, base_url="http://localhost:8000/v1"):
        self.client = OpenAI(
            base_url=base_url,
            api_key="none"  # vLLM通常不需要API密钥
        )
        self.model = "DeepSeek-R1-Distill-Qwen-1.5B"
    
    def test_simple_chat(self):
        """测试简单对话"""
        print("测试1: 简单问答")
        messages = [
            {"role": "user", "content": "你好，请介绍一下你自己"}
        ]
        
        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0.6,
                max_tokens=200
            )
            print(f"回复: {response.choices[0].message.content}")
            return True
        except Exception as e:
            print(f"测试失败: {e}")
            return False
    
    def test_crm_scenario(self):
        """测试CRM相关场景"""
        print("\n测试2: CRM场景测试")
        
        # 模拟客户咨询场景
        messages = [
            {
                "role": "system", 
                "content": "你是一个专业的CRM助手，帮助销售团队处理客户咨询"
            },
            {
                "role": "user",
                "content": "客户问我们的产品是否支持批量导入客户数据，请用专业友好的语气回答"
            }
        ]
        
        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0.6,
                max_tokens=300
            )
            print(f"回复: {response.choices[0].message.content}")
            return True
        except Exception as e:
            print(f"测试失败: {e}")
            return False
    
    def test_streaming(self):
        """测试流式输出"""
        print("\n测试3: 流式输出测试")
        
        messages = [
            {"role": "user", "content": "用100字介绍我们的CRM系统的主要功能"}
        ]
        
        try:
            stream = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0.6,
                max_tokens=200,
                stream=True
            )
            
            print("AI回复: ", end="", flush=True)
            full_response = ""
            
            for chunk in stream:
                if chunk.choices[0].delta.content is not None:
                    content = chunk.choices[0].delta.content
                    print(content, end="", flush=True)
                    full_response += content
            
            print()  # 换行
            return True
            
        except Exception as e:
            print(f"流式测试失败: {e}")
            return False

# 运行测试
if __name__ == "__main__":
    tester = ModelTester()
    
    print("开始测试DeepSeek-R1-Distill-Qwen-1.5B模型服务...")
    print("=" * 50)
    
    # 等待服务完全启动
    time.sleep(2)
    
    tests = [
        tester.test_simple_chat,
        tester.test_crm_scenario,
        tester.test_streaming
    ]
    
    passed = 0
    for i, test in enumerate(tests, 1):
        print(f"\n[测试 {i}/3]")
        if test():
            passed += 1
    
    print(f"\n测试完成: {passed}/3 通过")
    if passed == 3:
        print("✅ 模型服务运行正常，可以开始插件开发")
    else:
        print("⚠️  部分测试失败，请检查服务状态")

Run the test script and you should see the model respond. If all tests pass, the model service is ready.
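One thing to expect in the replies: R1-distilled models write their reasoning inside `<think>...</think>` before the final answer, and that text should not reach a customer. The helper below is a minimal sketch for removing it on the client side; the name `strip_reasoning` is illustrative, and it assumes the tags appear literally in the returned content. It also covers two edge cases: a reply where only the closing tag is present, and a reply truncated by `max_tokens` before the reasoning finished.

```python
import re

_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> reasoning and return only the final answer."""
    cleaned = _THINK_BLOCK.sub("", text)
    if "</think>" in cleaned:
        # Only the closing tag is present: keep what follows it
        cleaned = cleaned.split("</think>", 1)[1]
    if "<think>" in cleaned:
        # Reasoning was cut off before it closed: nothing after it is an answer
        cleaned = cleaned.split("<think>", 1)[0]
    return cleaned.strip()


if __name__ == "__main__":
    raw = "<think>The customer asks about bulk import.</think>\n\nYes, bulk import is supported."
    print(strip_reasoning(raw))
```

Because the reasoning counts against `max_tokens`, small limits such as 200 can be used up before the answer starts; an empty result from this helper is a sign that the limit should be raised.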

3.2 Performance benchmark

Response time matters for a CRM system. Let's measure the model's inference speed:

# test_performance.py
import time
from openai import OpenAI

class PerformanceTester:
    def __init__(self):
        self.client = OpenAI(
            base_url="http://localhost:8000/v1",
            api_key="none"
        )
        self.model = "DeepSeek-R1-Distill-Qwen-1.5B"
    
    def test_response_time(self, prompt, num_tests=5):
        """测试响应时间"""
        print(f"测试提示: {prompt[:50]}...")
        
        messages = [{"role": "user", "content": prompt}]
        times = []
        
        for i in range(num_tests):
            start_time = time.time()
            
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0.6,
                max_tokens=100
            )
            
            end_time = time.time()
            elapsed = end_time - start_time
            times.append(elapsed)
            
            print(f"  测试 {i+1}: {elapsed:.2f}秒")
        
        avg_time = sum(times) / len(times)
        print(f"平均响应时间: {avg_time:.2f}秒")
        return avg_time
    
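    def test_concurrent_requests(self):
        """Measure response time for typical CRM prompts (sent one at a time)"""
        # Simplified: the prompts run sequentially, so this does not measure true concurrency.
        # A production CRM system will need to handle multiple requests in parallel.
        print("\nTesting typical CRM prompts (sequential)...")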
        
        test_prompts = [
            "写一封跟进邮件给客户张先生",
            "总结上周的销售数据",
            "客户投诉产品问题，如何回复",
            "生成下个月的销售计划大纲"
        ]
        
        for prompt in test_prompts:
            self.test_response_time(prompt, num_tests=3)

if __name__ == "__main__":
    tester = PerformanceTester()
    
    print("性能测试开始")
    print("=" * 50)
    
    # 测试不同长度提示的响应时间
    test_cases = [
        ("短提示", "你好"),
        ("中等提示", "请帮我写一封给客户的感谢邮件，客户名叫李明"),
        ("长提示", "基于以下销售数据生成报告：1月销售额100万，2月120万，3月150万。主要客户行业分布：科技30%，制造40%，零售30%。需要分析增长趋势和客户分布。")
    ]
    
    for name, prompt in test_cases:
        print(f"\n{name}测试:")
        tester.test_response_time(prompt)
    
    # 测试并发
    tester.test_concurrent_requests()

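This test gives you a feel for real-world behavior. As a rough guide, a 1.5B-parameter model on a mainstream GPU should answer within about 1-3 seconds for short outputs like these, which is acceptable for CRM use. Measure on your own hardware, since latency grows with output length.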
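The `test_concurrent_requests` method in the script sends its prompts one after another, so it says nothing about behavior under parallel load. A minimal way to measure that is a thread pool, sketched below. The request function is passed in as `send` so the harness can be tried without a server; in real use it would wrap `client.chat.completions.create(...)`. The names `timed_call` and `run_concurrently` are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(send, prompt):
    """Run send(prompt) and return (prompt, seconds elapsed)."""
    start = time.perf_counter()
    send(prompt)
    return prompt, time.perf_counter() - start


def run_concurrently(send, prompts, max_workers: int = 4):
    """Send all prompts in parallel; return (per-request timings, total wall time)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        timings = list(pool.map(lambda p: timed_call(send, p), prompts))
    return timings, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in for a real request; replace with a function that calls the model service
    def fake_send(prompt):
        time.sleep(0.2)

    timings, wall = run_concurrently(fake_send, ["a", "b", "c", "d"])
    print(f"{len(timings)} requests in {wall:.2f}s wall time")
```

If the wall time is close to the slowest single request rather than the sum of all requests, the server is batching the requests effectively.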

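4. Developing the CRM plugin

With the model service running, the next step is an actual CRM plugin that plugs into the existing CRM system and provides the assistant features.

4.1 Plugin architecture

A good plugin is easy to integrate, clearly scoped, and reliable. The architecture I use looks like this:

CRM system
    │
    ├── CRM plugin layer (Python/JavaScript)
    │   ├── Request wrapping
    │   ├── Error handling
    │   ├── Cache management
    │   └── Logging
    │
    └── AI service layer (vLLM)
        ├── Model inference
        ├── Request queue
        └── Resource management

The plugin's main features are:

Automatic replies to customer inquiries
Assisted drafting of sales emails
Customer data analysis
Automatic sales report generation

4.2 Core plugin implementation

Here is a complete CRM plugin implementation: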

# crm_ai_plugin.py
import json
import logging
import time
from typing import Dict, List, Optional, Any
from dataclasses import dataclass
from openai import OpenAI
from openai.types.chat import ChatCompletion

# 配置日志
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

@dataclass
class CRMConfig:
    """CRM插件配置"""
    api_base: str = "http://localhost:8000/v1"
    api_key: str = "none"
    model_name: str = "DeepSeek-R1-Distill-Qwen-1.5B"
    timeout: int = 30
    max_retries: int = 3
    temperature: float = 0.6
    max_tokens: int = 1024

class CRMAIPlugin:
    """CRM AI插件核心类"""
    
    def __init__(self, config: Optional[CRMConfig] = None):
        self.config = config or CRMConfig()
        self.client = OpenAI(
            base_url=self.config.api_base,
            api_key=self.config.api_key,
            timeout=self.config.timeout
        )
        
        # 缓存最近对话，避免重复计算
        self.conversation_cache = {}
        # 缓存模板响应
        self.template_cache = {}
        
        logger.info(f"CRM AI插件初始化完成，使用模型: {self.config.model_name}")
    
    def _make_request(self, messages: List[Dict], **kwargs) -> Optional[ChatCompletion]:
        """发送请求到模型服务，包含重试机制"""
        for attempt in range(self.config.max_retries):
            try:
                response = self.client.chat.completions.create(
                    model=self.config.model_name,
                    messages=messages,
                    temperature=kwargs.get('temperature', self.config.temperature),
                    max_tokens=kwargs.get('max_tokens', self.config.max_tokens),
                    stream=kwargs.get('stream', False)
                )
                return response
                
            except Exception as e:
                logger.warning(f"请求失败 (尝试 {attempt + 1}/{self.config.max_retries}): {e}")
                if attempt < self.config.max_retries - 1:
                    time.sleep(1 * (attempt + 1))  # linear backoff: wait 1s, then 2s, ...
                else:
                    logger.error(f"所有重试均失败: {e}")
                    return None
    
    def auto_reply_customer(self, customer_query: str, customer_info: Dict = None) -> str:
        """自动回复客户咨询"""
        logger.info(f"处理客户咨询: {customer_query[:50]}...")
        
        # 构建系统提示
        system_prompt = """你是一个专业的CRM客服助手，负责回复客户咨询。
        请根据客户问题提供准确、专业、友好的回复。
        如果问题涉及具体业务细节，请基于常识给出合理建议。
        回复要简洁明了，避免技术术语。"""
        
        # 如果有客户信息，添加到上下文中
        context = ""
        if customer_info:
            context = f"\n客户信息: {json.dumps(customer_info, ensure_ascii=False)}"
        
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"客户咨询:{customer_query}{context}"}
        ]
        
        response = self._make_request(messages, max_tokens=500)
        if response and response.choices:
            reply = response.choices[0].message.content
            logger.info(f"生成回复: {reply[:50]}...")
            return reply
        
        return "抱歉，暂时无法处理您的咨询，请稍后再试或联系人工客服。"
    
    def write_sales_email(self, purpose: str, recipient: str, 
                         key_points: List[str] = None) -> str:
        """撰写销售邮件"""
        logger.info(f"撰写{purpose}邮件给{recipient}")
        
        system_prompt = """你是一个专业的销售邮件撰写助手。
        请根据邮件目的和收件人信息，撰写专业、得体、有效的销售邮件。
        邮件要结构清晰，重点突出，语气恰当。"""
        
        points_text = ""
        if key_points:
            points_text = "\n关键要点:\n" + "\n".join(f"- {point}" for point in key_points)
        
        user_prompt = f"""请撰写一封{purpose}邮件。
        收件人: {recipient}
        {points_text}
        
        要求:
        1. 邮件主题明确
        2. 正文结构清晰
        3. 语气专业友好
        4. 包含合适的称呼和落款"""
        
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
        
        response = self._make_request(messages, max_tokens=800)
        if response and response.choices:
            email = response.choices[0].message.content
            logger.info(f"邮件撰写完成，长度: {len(email)}字符")
            return email
        
        return f"""主题: {purpose} - 跟进邮件

尊敬的{recipient}，

[这里是邮件正文，由于系统暂时无法生成，请手动填写]

祝好，
销售团队"""
    
    def analyze_sales_data(self, data: Dict[str, Any], analysis_type: str = "trend") -> str:
        """分析销售数据"""
        logger.info(f"分析销售数据，类型: {analysis_type}")
        
        system_prompt = """你是一个销售数据分析专家。
        请根据提供的销售数据，进行专业的数据分析，并给出有价值的见解和建议。
        分析要基于数据，结论要具体可行。"""
        
        # 将数据转换为文本描述
        data_text = json.dumps(data, ensure_ascii=False, indent=2)
        
        analysis_tasks = {
            "trend": "分析销售趋势，识别增长点和下降点",
            "customer": "分析客户分布和特征",
            "product": "分析产品表现和市场需求",
            "comprehensive": "综合分析和建议"
        }
        
        task = analysis_tasks.get(analysis_type, "综合分析")
        
        user_prompt = f"""请对以下销售数据进行{task}:

销售数据:
{data_text}

请提供:
1. 关键发现和数据洞察
2. 趋势分析和原因推测
3. 具体建议和改进措施
4. 风险提示和机会点

请用清晰的结构呈现分析结果。"""
        
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
        
        response = self._make_request(messages, max_tokens=1200)
        if response and response.choices:
            analysis = response.choices[0].message.content
            logger.info(f"数据分析完成，长度: {len(analysis)}字符")
            return analysis
        
        return "数据分析服务暂时不可用，请稍后重试。"
    
    def generate_sales_report(self, period: str, data: Dict[str, Any]) -> Dict[str, Any]:
        """生成销售报告"""
        logger.info(f"生成{period}销售报告")
        
        # 先分析数据
        analysis = self.analyze_sales_data(data, "comprehensive")
        
        # 生成报告摘要
        system_prompt = """你是一个销售报告撰写专家。
        请基于数据分析结果，生成专业的销售报告。
        报告要结构完整，重点突出，数据支撑充分。"""
        
        user_prompt = f"""基于以下数据分析结果，生成{period}的销售报告:

数据分析:
{analysis}

报告要求:
1. 执行摘要（关键成果和发现）
2. 销售业绩分析（数据支撑）
3. 市场表现评估
4. 客户分析
5. 问题和挑战
6. 建议和行动计划
7. 下一阶段目标

请用专业的报告格式呈现。"""
        
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
        
        response = self._make_request(messages, max_tokens=1500)
        
        report = {
            "period": period,
            "generated_at": time.strftime("%Y-%m-%d %H:%M:%S"),
            "data_summary": self._summarize_data(data),
            "analysis": analysis,
            "full_report": ""
        }
        
        if response and response.choices:
            report["full_report"] = response.choices[0].message.content
            logger.info(f"销售报告生成完成")
        else:
            report["full_report"] = "报告生成失败，请参考数据分析部分。"
        
        return report
    
    def _summarize_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
        """简化数据用于报告摘要"""
        summary = {}
        for key, value in data.items():
            if isinstance(value, (int, float)):
                summary[key] = value
            elif isinstance(value, dict):
                # 只取第一层的关键信息
                summary[key] = {k: v for k, v in value.items() if isinstance(v, (int, float, str))}
            elif isinstance(value, list) and len(value) > 0:
                # Summarize a list by its length instead of copying its items
                summary[key] = f"列表，共{len(value)}项"
        
        return summary
    
    def stream_customer_reply(self, customer_query: str, 
                            callback=None) -> str:
        """流式回复客户咨询"""
        logger.info(f"流式处理客户咨询: {customer_query[:50]}...")
        
        system_prompt = "你是一个专业的客服助手，请用流式方式回复客户咨询。"
        
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": customer_query}
        ]
        
        try:
            stream = self.client.chat.completions.create(
                model=self.config.model_name,
                messages=messages,
                temperature=self.config.temperature,
                max_tokens=500,
                stream=True
            )
            
            full_response = ""
            for chunk in stream:
                if chunk.choices[0].delta.content is not None:
                    content = chunk.choices[0].delta.content
                    full_response += content
                    
                    # 如果有回调函数，调用它
                    if callback and callable(callback):
                        callback(content)
            
            return full_response
            
        except Exception as e:
            logger.error(f"流式回复失败: {e}")
            return "抱歉，暂时无法回复您的咨询。"

# 使用示例
if __name__ == "__main__":
    # 初始化插件
    plugin = CRMAIPlugin()
    
    # 示例1: 自动回复客户咨询
    print("示例1: 自动回复客户咨询")
    customer_query = "我想了解你们CRM系统的定价方案，能介绍一下吗？"
    reply = plugin.auto_reply_customer(customer_query)
    print(f"客户咨询: {customer_query}")
    print(f"AI回复: {reply}")
    print("-" * 50)
    
    # 示例2: 撰写销售邮件
    print("\n示例2: 撰写销售邮件")
    email = plugin.write_sales_email(
        purpose="产品介绍",
        recipient="张经理",
        key_points=[
            "我们的CRM系统支持客户数据智能分析",
            "提供自动化营销功能",
            "移动端随时访问",
            "30天免费试用"
        ]
    )
    print(email)
    print("-" * 50)
    
    # 示例3: 分析销售数据
    print("\n示例3: 分析销售数据")
    sales_data = {
        "monthly_sales": {
            "一月": 1000000,
            "二月": 1200000,
            "三月": 1500000
        },
        "customer_distribution": {
            "科技行业": 30,
            "制造业": 40,
            "零售业": 30
        },
        "top_products": ["产品A", "产品B", "产品C"]
    }
    
    analysis = plugin.analyze_sales_data(sales_data, "trend")
    print(f"数据分析结果:\n{analysis[:500]}...")  # 只显示前500字符

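The plugin covers the AI features a CRM system uses most. Every method logs what it does and falls back to a safe default message when the model call fails, which is a reasonable baseline for production use. Note that conversation_cache and template_cache are initialized in the constructor but not yet used by any method.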
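The plugin's constructor creates `conversation_cache` and `template_cache`, but the listing never reads or writes them. One possible way to fill that gap is a small cache with expiry, keyed by a hash of the message list. This is a sketch under stated assumptions, not part of the original plugin: the class name `ResponseCache`, the 300-second default, and the injectable `clock` are all illustrative.

```python
import hashlib
import json
import time


class ResponseCache:
    """Small in-memory cache with per-entry expiry (not thread-safe)."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry time, cached reply)

    @staticmethod
    def make_key(messages) -> str:
        """Hash the message list so identical prompts map to the same key."""
        payload = json.dumps(messages, ensure_ascii=False, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, messages):
        key = self.make_key(messages)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, reply = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return reply

    def put(self, messages, reply: str) -> None:
        self._store[self.make_key(messages)] = (self.clock() + self.ttl, reply)
```

In `_make_request`, one would call `get` before contacting the model and `put` after a successful reply. Caching suits repeated, non-personalized prompts such as FAQ answers; with a temperature of 0.6 the cached text is one sample among many possible replies.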

4.3 Wrapping the plugin in a Web API

To make the plugin easy for the CRM system to call, we also expose it through a Web API:

# crm_api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional, Dict, Any
import uvicorn
from crm_ai_plugin import CRMAIPlugin, CRMConfig

app = FastAPI(title="CRM AI助手API", version="1.0.0")

# 初始化插件
plugin = CRMAIPlugin()

# 数据模型定义
class CustomerQuery(BaseModel):
    query: str
    customer_info: Optional[Dict[str, Any]] = None

class EmailRequest(BaseModel):
    purpose: str
    recipient: str
    key_points: Optional[List[str]] = None

class AnalysisRequest(BaseModel):
    data: Dict[str, Any]
    analysis_type: str = "trend"

class ReportRequest(BaseModel):
    period: str
    data: Dict[str, Any]

# API端点
@app.get("/")
async def root():
    return {"message": "CRM AI助手API服务运行中", "model": "DeepSeek-R1-Distill-Qwen-1.5B"}

@app.post("/api/customer/reply")
def reply_customer(request: CustomerQuery):  # plain def: FastAPI runs it in a thread pool, so the blocking model call does not stall the event loop
    """自动回复客户咨询"""
    try:
        reply = plugin.auto_reply_customer(
            request.query, 
            request.customer_info
        )
        return {
            "success": True,
            "reply": reply,
            "query": request.query
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/api/email/write")
def write_email(request: EmailRequest):  # plain def: FastAPI runs it in a thread pool, so the blocking model call does not stall the event loop
    """撰写销售邮件"""
    try:
        email = plugin.write_sales_email(
            request.purpose,
            request.recipient,
            request.key_points
        )
        return {
            "success": True,
            "email": email,
            "purpose": request.purpose,
            "recipient": request.recipient
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/api/analysis/sales")
def analyze_sales(request: AnalysisRequest):  # plain def: FastAPI runs it in a thread pool, so the blocking model call does not stall the event loop
    """分析销售数据"""
    try:
        analysis = plugin.analyze_sales_data(
            request.data,
            request.analysis_type
        )
        return {
            "success": True,
            "analysis": analysis,
            "type": request.analysis_type
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/api/report/generate")
def generate_report(request: ReportRequest):  # plain def: FastAPI runs it in a thread pool, so the blocking model call does not stall the event loop
    """生成销售报告"""
    try:
        report = plugin.generate_sales_report(
            request.period,
            request.data
        )
        return {
            "success": True,
            "report": report,
            "period": request.period
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/api/health")
def health_check():
    """Health check"""
    try:
        # Query the model server directly. auto_reply_customer swallows errors and
        # returns a fallback string, so calling it would always look healthy.
        plugin.client.models.list()
        return {
            "status": "healthy",
            "model": "DeepSeek-R1-Distill-Qwen-1.5B",
            "service": "running"
        }
    except Exception as e:
        raise HTTPException(status_code=503, detail=f"服务异常: {str(e)}")

if __name__ == "__main__":
    # 启动API服务
    uvicorn.run(
        app,
        host="0.0.0.0",
        port=8080,
        log_level="info"
    )

The API service listens on port 8080, and the CRM system calls the AI features over HTTP.
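To show what such a call looks like from the CRM side, here is a minimal client sketch for the `/api/customer/reply` endpoint using only the standard library. The helper names `build_reply_request` and `get_ai_reply` are illustrative, and the base URL assumes the API runs locally on port 8080 as in the listing.

```python
import json
import urllib.request


def build_reply_request(query, customer_info=None, base_url="http://localhost:8080"):
    """Build the POST request for /api/customer/reply without sending it."""
    body = json.dumps({"query": query, "customer_info": customer_info}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/customer/reply",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_ai_reply(query, customer_info=None, timeout=60.0):
    """Send the request and return the 'reply' field of the JSON response."""
    req = build_reply_request(query, customer_info)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["reply"]


if __name__ == "__main__":
    # Only builds the request here; call get_ai_reply(...) once the API service is running
    req = build_reply_request("Do you support bulk import?", {"customer_level": "VIP"})
    print(req.get_method(), req.full_url)
```

The 60-second timeout is deliberately generous, because the plugin may retry the model call up to three times before it returns its fallback message.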

4.4 Front-end integration example

For a web-based CRM, we can also provide a simple front-end widget:

<!-- crm_ai_widget.html -->
<!DOCTYPE html>
<html lang="zh-CN">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>CRM AI助手</title>
    <style>
        .ai-widget {
            position: fixed;
            bottom: 20px;
            right: 20px;
            width: 400px;
            background: white;
            border-radius: 10px;
            box-shadow: 0 4px 20px rgba(0,0,0,0.15);
            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
            z-index: 1000;
        }
        
        .widget-header {
            background: #2563eb;
            color: white;
            padding: 15px;
            border-radius: 10px 10px 0 0;
            display: flex;
            justify-content: space-between;
            align-items: center;
        }
        
        .widget-title {
            font-size: 16px;
            font-weight: 600;
        }
        
        .close-btn {
            background: none;
            border: none;
            color: white;
            font-size: 20px;
            cursor: pointer;
            padding: 0;
            width: 24px;
            height: 24px;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        
        .widget-body {
            padding: 20px;
            max-height: 500px;
            overflow-y: auto;
        }
        
        .chat-container {
            margin-bottom: 20px;
        }
        
        .message {
            margin-bottom: 15px;
            padding: 10px 15px;
            border-radius: 8px;
            line-height: 1.5;
        }
        
        .user-message {
            background: #e0f2fe;
            margin-left: 40px;
        }
        
        .ai-message {
            background: #f1f5f9;
            margin-right: 40px;
        }
        
        .input-area {
            display: flex;
            gap: 10px;
            margin-top: 20px;
        }
        
        .input-area textarea {
            flex: 1;
            padding: 10px;
            border: 1px solid #d1d5db;
            border-radius: 6px;
            resize: vertical;
            min-height: 60px;
            font-family: inherit;
        }
        
        .input-area button {
            background: #2563eb;
            color: white;
            border: none;
            padding: 10px 20px;
            border-radius: 6px;
            cursor: pointer;
            font-weight: 500;
        }
        
        .input-area button:hover {
            background: #1d4ed8;
        }
        
        .input-area button:disabled {
            background: #9ca3af;
            cursor: not-allowed;
        }
        
        .quick-actions {
            display: flex;
            flex-wrap: wrap;
            gap: 8px;
            margin-top: 15px;
        }
        
        .quick-btn {
            background: #f3f4f6;
            border: 1px solid #d1d5db;
            padding: 6px 12px;
            border-radius: 16px;
            font-size: 14px;
            cursor: pointer;
            transition: all 0.2s;
        }
        
        .quick-btn:hover {
            background: #e5e7eb;
        }
        
        .typing-indicator {
            display: none;
            padding: 10px 15px;
            background: #f1f5f9;
            border-radius: 8px;
            margin-bottom: 15px;
            color: #6b7280;
            font-style: italic;
        }
        
        .typing-dots {
            display: inline-block;
            animation: typing 1.4s infinite;
        }
        
        @keyframes typing {
            0%, 60%, 100% { opacity: 0.3; }
            30% { opacity: 1; }
        }
    </style>
</head>
<body>
    <div class="ai-widget" id="aiWidget">
        <div class="widget-header">
            <div class="widget-title">CRM AI助手</div>
            <button class="close-btn" onclick="toggleWidget()">×</button>
        </div>
        
        <div class="widget-body">
            <div class="chat-container" id="chatContainer">
                <div class="message ai-message">
                    你好！我是CRM AI助手，可以帮助你：
                    <ul>
                        <li>自动回复客户咨询</li>
                        <li>撰写销售邮件</li>
                        <li>分析销售数据</li>
                        <li>生成销售报告</li>
                    </ul>
                    有什么可以帮你的吗？
                </div>
            </div>
            
            <div class="quick-actions">
                <button class="quick-btn" onclick="quickAction('写跟进邮件')">写跟进邮件</button>
                <button class="quick-btn" onclick="quickAction('分析销售数据')">分析销售数据</button>
                <button class="quick-btn" onclick="quickAction('客户咨询模板')">客户咨询模板</button>
                <button class="quick-btn" onclick="quickAction('生成周报')">生成周报</button>
            </div>
            
            <div class="typing-indicator" id="typingIndicator">
                思考中<span class="typing-dots">...</span>
            </div>
            
            <div class="input-area">
                <textarea 
                    id="messageInput" 
                    placeholder="输入你的问题或指令..." 
                    rows="3"
                    onkeydown="handleKeyDown(event)"
                ></textarea>
                <button id="sendBtn" onclick="sendMessage()">发送</button>
            </div>
        </div>
    </div>

    <script>
        const API_BASE = 'http://localhost:8080';
        let isWidgetVisible = true;
        
        function toggleWidget() {
            const widget = document.getElementById('aiWidget');
            isWidgetVisible = !isWidgetVisible;
            widget.style.display = isWidgetVisible ? 'block' : 'none';
        }
        
        function addMessage(content, isUser = false) {
            const container = document.getElementById('chatContainer');
            const messageDiv = document.createElement('div');
            messageDiv.className = `message ${isUser ? 'user-message' : 'ai-message'}`;
            messageDiv.textContent = content;
            container.appendChild(messageDiv);
            container.parentElement.scrollTop = container.parentElement.scrollHeight;  // .widget-body is the scrollable element
        }
        
        function showTyping() {
            document.getElementById('typingIndicator').style.display = 'block';
        }
        
        function hideTyping() {
            document.getElementById('typingIndicator').style.display = 'none';
        }
        
        async function sendMessage() {
            const input = document.getElementById('messageInput');
            const message = input.value.trim();
            const sendBtn = document.getElementById('sendBtn');
            
            if (!message) return;
            
            // 添加用户消息
            addMessage(message, true);
            input.value = '';
            sendBtn.disabled = true;
            
            // 显示思考中
            showTyping();
            
            try {
                // 调用API
                const response = await fetch(`${API_BASE}/api/customer/reply`, {
                    method: 'POST',
                    headers: {
                        'Content-Type': 'application/json',
                    },
                    body: JSON.stringify({
                        query: message,
                        customer_info: getCustomerContext()
                    })
                });
                
                const data = await response.json();
                
                if (data.success) {
                    // 添加AI回复
                    addMessage(data.reply);
                } else {
                    addMessage('抱歉，暂时无法处理您的请求。');
                }
            } catch (error) {
                console.error('API调用失败:', error);
                addMessage('网络错误，请检查API服务是否运行。');
            } finally {
                hideTyping();
                sendBtn.disabled = false;
                input.focus();
            }
        }
        
        function quickAction(action) {
            const actions = {
                '写跟进邮件': '请帮我写一封给客户的跟进邮件，客户上周咨询了我们的产品。',
                '分析销售数据': '请分析最近一个季度的销售数据，找出增长点和改进机会。',
                '客户咨询模板': '请提供几个常见的客户咨询回复模板。',
                '生成周报': '请帮我生成上周的销售工作周报。'
            };
            
            const input = document.getElementById('messageInput');
            input.value = actions[action] || action;
            input.focus();
        }
        
        function handleKeyDown(event) {
            if (event.key === 'Enter' && !event.shiftKey) {
                event.preventDefault();
                sendMessage();
            }
        }
        
        function getCustomerContext() {
            // 这里可以从CRM系统获取当前客户上下文
            // 示例返回模拟数据
            return {
                customer_name: '示例客户',
                customer_level: 'VIP',
                last_contact: '2024-01-15'
            };
        }
        
        // 初始化
        document.getElementById('messageInput').focus();
    </script>
</body>
</html>

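The widget can be embedded directly in the CRM system to give users a convenient assistant interface. Because the page calls the API from a different origin, the FastAPI service must allow that origin (for example with FastAPI's CORSMiddleware), or the browser will block the requests.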

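5. Integration and optimization

Once the plugin is built, the next step is integrating it into the existing CRM system. The details differ between CRM systems, but the basic approach is the same.

5.1 Integrating with common CRM systems

5.1.1 Salesforce integration example

If you use Salesforce, you can integrate through an Apex class: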

// CRM_AI_Helper.cls
public class CRM_AI_Helper {
    
    @AuraEnabled
    public static String getAIReply(String customerQuery, String customerId) {
        // Fetch customer info. Querying into a list avoids the QueryException
        // that a single-record assignment throws when no row matches.
        List<Account> customers = [SELECT Name, Industry, AnnualRevenue 
                                   FROM Account WHERE Id = :customerId LIMIT 1];
        
        // Build the request payload
        Map<String, Object> requestData = new Map<String, Object>();
        requestData.put('query', customerQuery);
        
        if (!customers.isEmpty()) {
            Account customer = customers[0];
            Map<String, Object> customerInfo = new Map<String, Object>();
            customerInfo.put('name', customer.Name);
            customerInfo.put('industry', customer.Industry);
            customerInfo.put('annual_revenue', customer.AnnualRevenue);
            requestData.put('customer_info', customerInfo);
        }
        
        // 调用AI服务
        HttpRequest req = new HttpRequest();
        req.setEndpoint('http://your-server:8080/api/customer/reply');
        req.setMethod('POST');
        req.setHeader('Content-Type', 'application/json');
        req.setBody(JSON.serialize(requestData));
        req.setTimeout(60000);
        
        Http http = new Http();
        HttpResponse res = http.send(req);
        
        if (res.getStatusCode() == 200) {
            Map<String, Object> response = (Map<String, Object>)JSON.deserializeUntyped(res.getBody());
            return (String)response.get('reply');
        }
        
        return '抱歉，AI助手暂时无法回复。';
    }
    
    @AuraEnabled
    public static String generateEmailTemplate(String purpose, String contactId) {
        // 类似实现...
        return '邮件模板生成中...';
    }
}
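
Before wiring up Salesforce, it can help to check the JSON contract the Apex class relies on from outside the platform. The following is a minimal Python sketch, assuming the endpoint accepts a `query` field plus an optional `customer_info` object and answers with a `reply` field, as the Apex code implies. `build_payload`, `extract_reply`, and `FALLBACK_REPLY` are hypothetical names introduced only for illustration.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical fallback text, mirroring the Apex method's last return statement
FALLBACK_REPLY = "Sorry, the AI assistant is unable to reply right now."


def build_payload(query: str, account: Optional[Dict[str, Any]] = None) -> str:
    """Serialize the request body with the same field names the Apex class uses."""
    payload: Dict[str, Any] = {"query": query}
    if account is not None:
        payload["customer_info"] = {
            "name": account.get("Name"),
            "industry": account.get("Industry"),
            "annual_revenue": account.get("AnnualRevenue"),
        }
    return json.dumps(payload, ensure_ascii=False)


def extract_reply(status_code: int, body: str) -> str:
    """Return the `reply` field on HTTP 200, otherwise the fallback text."""
    if status_code != 200:
        return FALLBACK_REPLY
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return FALLBACK_REPLY
    reply = data.get("reply") if isinstance(data, dict) else None
    return reply if isinstance(reply, str) else FALLBACK_REPLY
```

Feeding the output of `build_payload` to the service with any HTTP client and passing the response through `extract_reply` exercises the same path the Apex callout takes, which makes contract mismatches visible before they surface inside Salesforce.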

5.1.2 Integrating with a custom CRM system

For an in-house CRM system, integration is more flexible. Here is an example using Python and Django:

# views.py
import json

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

from .crm_ai_plugin import CRMAIPlugin
from .models import Conversation, Customer

plugin = CRMAIPlugin()

# csrf_exempt turns off Django's CSRF check, so protect this endpoint with
# another mechanism (for example token authentication) before exposing it.
@csrf_exempt
def ai_customer_reply(request):
    """Generate an AI reply to a customer inquiry."""
    if request.method != 'POST':
        return JsonResponse({'error': 'Method not allowed'}, status=405)
    
    try:
        data = json.loads(request.body)
    except json.JSONDecodeError:
        # A malformed body is a client error, not a server error
        return JsonResponse({'success': False, 'error': 'Invalid JSON body'}, status=400)
    
    try:
        customer_query = data.get('query', '')
        customer_id = data.get('customer_id')
        
        # Load customer details from the database
        customer_info = None
        if customer_id:
            try:
                customer = Customer.objects.get(id=customer_id)
                customer_info = {
                    'name': customer.name,
                    'level': customer.level,
                    'industry': customer.industry,
                    # last_contact may be empty, e.g. for a new customer
                    'last_contact': (
                        customer.last_contact.strftime('%Y-%m-%d')
                        if customer.last_contact else None
                    )
                }
            except Customer.DoesNotExist:
                # Unknown customer: reply without customer context
                pass
        
        # Call the AI plugin
        reply = plugin.auto_reply_customer(customer_query, customer_info)
        
        # Persist the exchange to the database
        Conversation.objects.create(
            customer_id=customer_id,
            query=customer_query,
            ai_reply=reply,
            source='ai_assistant'
        )
        
        return JsonResponse({
            'success': True,
            'reply': reply,
            'customer_id': customer_id
        })
        
    except Exception as e:
        return JsonResponse({
            'success': False,
            'error': str(e)
        }, status=500)

@csrf_exempt  
def ai_analyze_sales(request):
    """Run an AI analysis of sales data."""
    if request.method == 'POST':
        try:
            data = json.loads(request.body)
            period = data.get('period', 'monthly')
            
            # Load sales data from the database
            from .models import SaleRecord
            from django.db.models import Sum, Count
            from django.utils import timezone
            from datetime import timedelta
            
            # Example mapping from period name to a look-back window;
            # unrecognized values fall back to the last 30 days
            days = {'weekly': 7, 'monthly': 30, 'quarterly': 90}.get(period, 30)
            end_date = timezone.now()
            start_date = end_date - timedelta(days=days)
            
            # list() evaluates the queryset once; it is iterated twice below
            sales_data = list(SaleRecord.objects.filter(
                sale_date__range=[start_date, end_date]
            ).values('product_category').annotate(
                total_amount=Sum('amount'),
                transaction_count=Count('id')
            ))
            
            # Convert to the format the plugin expects. Sum() can return
            # Decimal or None, so cast to float to keep the data JSON-friendly.
            analysis_data = {
                'period': f'{start_date.date()} to {end_date.date()}',
                'total_sales': sum(float(item['total_amount'] or 0) for item in sales_data),
                'by_category': {
                    item['product_category']: {
                        'amount': float(item['total_amount'] or 0),
                        'count': item['transaction_count']
                    }
                    for item in sales_data
                }
            }
            
            # Run the AI analysis
            analysis = plugin.analyze_sales_data(analysis_data, 'trend')
            
            return JsonResponse({
                'success': True,
                'analysis': analysis,
                'period': analysis_data['period'],
                'total_sales': analysis_data['total_sales']
            })
            
        except Exception as e:
            return JsonResponse({
                'success': False,
                'error': str(e)
            }, status=500)
    
    return JsonResponse({'error': 'Method not allowed'}, status=405)
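
The step that reshapes aggregated rows into the plugin's input can be pulled out of the view so it is testable without a database. The following is a sketch under the assumption that each row carries the `product_category`, `total_amount`, and `transaction_count` keys produced by the query above. `build_analysis_data` is a hypothetical helper name, and the `"uncategorized"` bucket for missing categories is a choice made here, not something the original view does.

```python
from decimal import Decimal
from typing import Any, Dict, Iterable, Mapping, Union

# Aggregates may arrive as Decimal, int, float, or None (no matching rows)
Number = Union[int, float, Decimal, None]


def build_analysis_data(rows: Iterable[Mapping[str, Any]], period_label: str) -> Dict[str, Any]:
    """Shape per-category aggregate rows into the dict passed to the plugin."""
    by_category: Dict[str, Dict[str, Any]] = {}
    total = 0.0
    for row in rows:
        amount_raw: Number = row.get("total_amount")
        amount = float(amount_raw or 0)  # None and Decimal both become float
        count = int(row.get("transaction_count") or 0)
        category = row.get("product_category") or "uncategorized"
        # Accumulate so that two rows mapping to the same bucket are merged
        entry = by_category.setdefault(category, {"amount": 0.0, "count": 0})
        entry["amount"] += amount
        entry["count"] += count
        total += amount
    return {"period": period_label, "total_sales": total, "by_category": by_category}
```

Because the result contains only floats, ints, and strings, it can be serialized with the standard `json` module regardless of how the plugin builds its prompt.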

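5.2 Performance optimization tips

In a real production environment, you will probably need to optimize the plugin in a few ways:

5.2.1 Caching strategy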

# Add caching support
import hashlib
import json
import logging
from typing import Dict, Optional

import redis

logger = logging.getLogger(__name__)

class CachedCRMAIPlugin(CRMAIPlugin):
    def __init__(self, config=None, redis_host='localhost', redis_port=6379):
        super().__init__(config)
        # Connect to the Redis cache. redis.Redis() connects lazily, so
        # ping() is what actually checks that the server is reachable.
        try:
            self.redis_client = redis.Redis(
                host=redis_host,
                port=redis_port,
                decode_responses=True
            )
            self.redis_client.ping()
            self.use_cache = True
        except redis.RedisError:
            self.use_cache = False
            logger.warning("Redis connection failed; caching disabled")
    
    def _get_cache_key(self, func_name, *args, **kwargs):
        """Build a cache key"""
        key_data = {
            'func': func_name,
            'args': args,
            'kwargs': kwargs
        }
        # default=str keeps values json cannot encode (dates, Decimal) from raising
        key_str = json.dumps(key_data, sort_keys=True, default=str)
        return f"crm_ai:{hashlib.md5(key_str.encode()).hexdigest()}"
    
    def auto_reply_customer(self, customer_query: str, customer_info: Optional[Dict] = None) -> str:
        """Auto-reply with caching"""
        if not self.use_cache:
            return super().auto_reply_customer(customer_query, customer_info)
        
        # Build the cache key
        cache_key = self._get_cache_key(
            'auto_reply_customer',
            customer_query,
            customer_info
        )
        
        # Try the cache first
        cached_reply = self.redis_client.get(cache_key)
        if cached_reply is not None:
            logger.info(f"Cache hit: {cache_key[:20]}...")
            return cached_reply
        
        # Cache miss: call the AI
        reply = super().auto_reply_customer(customer_query, customer_info)
        
        # Store in the cache (expires after 1 hour); skip empty replies
        if reply:
            self.redis_client.setex(cache_key, 3600, reply)
        
        return reply
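
When Redis is unreachable, the class above turns caching off entirely. One possible alternative is to fall back to a small in-process cache. The following is a minimal sketch of such a fallback, not part of the original plugin: `InMemoryTTLCache` is a hypothetical class that mimics only the `get` and `setex` calls used above, and the injectable `clock` parameter exists purely to make expiry testable.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Process-local TTL cache exposing the get/setex subset used by the plugin."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry lazily on read
            del self._store[key]
            return None
        return value

    def setex(self, key: str, ttl_seconds: float, value: str) -> None:
        # Same argument order as the Redis client: key, TTL, value
        self._store[key] = (self._clock() + ttl_seconds, value)
```

Assigning an instance to `self.redis_client` in the failure branch would keep the rest of the class unchanged. Note the trade-offs: the cache is per process, is not shared between workers, is not thread-safe as written, and never evicts entries that are not read again.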

5.2.2 Request batching

For bulk jobs, you can add batch-processing support:

import time
from typing import Dict, List

class BatchCRMAIPlugin(CRMAIPlugin):
    def batch_reply_customers(self, queries: List[Dict]) -> List[Dict]:
        """Reply to customer inquiries in bulk"""
        results = []
        
        # Work in small batches so the service is not flooded at once
        batch_size = 5
        for i in range(0, len(queries), batch_size):
            batch = queries[i:i+batch_size]
            batch_results = self._process_batch(batch)
            results.extend(batch_results)
            
            # Pause between batches (not after the last one) to pace requests
            if i + batch_size < len(queries):
                time.sleep(0.5)
        
        return results
    
    def _process_batch(self, batch: List[Dict]) -> List[Dict]:
        """Process a single batch; one failure does not abort the rest"""
        batch_results = []
        
        for query_data in batch:
            try:
                reply = self.auto_reply_customer(
                    query_data['query'],
                    query_data.get('customer_info')
                )
                batch_results.append({
                    'success': True,
                    'query_id': query_data.get('id'),
                    'reply': reply
                })
            except Exception as e:
                batch_results.append({
                    'success': False,
                    'query_id': query_data.get('id'),
                    'error': str(e)
                })
        
        return batch_results
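
The loop above still sends requests one at a time within each batch. Since vLLM schedules concurrent requests together on the server side, sending a few requests in parallel may raise throughput. The following is a sketch of that idea using a thread pool, not a measured recommendation: `reply_concurrently` is a hypothetical helper, `reply_fn` stands in for `plugin.auto_reply_customer`, and the default of four workers is an arbitrary starting point.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

ReplyFn = Callable[[str, Optional[Dict[str, Any]]], str]


def reply_concurrently(
    queries: List[Dict[str, Any]],
    reply_fn: ReplyFn,
    max_workers: int = 4,
) -> List[Dict[str, Any]]:
    """Answer queries in parallel, returning results in input order."""

    def handle(query_data: Dict[str, Any]) -> Dict[str, Any]:
        # Catch per-query errors so one failure does not cancel the others
        try:
            reply = reply_fn(query_data["query"], query_data.get("customer_info"))
            return {"success": True, "query_id": query_data.get("id"), "reply": reply}
        except Exception as exc:
            return {"success": False, "query_id": query_data.get("id"), "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs
        return list(pool.map(handle, queries))
```

Threads suit this workload because each call spends its time waiting on HTTP, not on Python computation. Keep `max_workers` modest and watch GPU memory and latency on the model server when tuning it.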

5.2.3 Monitoring and logging

Add detailed monitoring and logging to make troubleshooting easier:

import logging
import time
from typing import Dict, List, Optional

from openai.types.chat import ChatCompletion

logger = logging.getLogger(__name__)

class MonitoredCRMAIPlugin(CRMAIPlugin):
    def __init__(self, config=None, metrics_collector=None):
        super().__init__(config)
        self.metrics_collector = metrics_collector
        self.request_count = 0
        self.error_count = 0
        self.total_response_time = 0
    
    def _make_request(self, messages: List[Dict], **kwargs) -> Optional[ChatCompletion]:
        """Request method with monitoring"""
        start_time = time.time()
        self.request_count += 1
        response = None
        error = None
        
        try:
            response = super()._make_request(messages, **kwargs)
            if response is None:
                # The base method signals failure by returning None
                error = "empty response"
        except Exception as e:
            error = str(e)
            logger.error(f"Request failed: {e}")
        
        # Record the response time for failures too, so the average
        # in get_metrics() is computed over the same requests it divides by
        response_time = time.time() - start_time
        self.total_response_time += response_time
        if error is not None:
            self.error_count += 1
        
        # Report metrics
        if self.metrics_collector:
            if error is None:
                # Report tokens actually used, not the requested max_tokens
                usage = getattr(response, 'usage', None)
                self.metrics_collector.record_request(
                    success=True,
                    response_time=response_time,
                    token_count=usage.total_tokens if usage else 0
                )
            else:
                self.metrics_collector.record_request(
                    success=False,
                    response_time=response_time,
                    error=error
                )
        
        return response
    
    def get_metrics(self) -> Dict:
        """Return performance metrics"""
        avg_response_time = 0
        error_rate = 0
        if self.request_count > 0:
            avg_response_time = self.total_response_time / self.request_count
            error_rate = self.error_count / self.request_count
        
        return {
            'total_requests': self.request_count,
            'error_count': self.error_count,
            'error_rate': f"{error_rate:.2%}",
            'avg_response_time': f"{avg_response_time:.2f}s",
            'current_model': self.config.model_name
        }
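
The `metrics_collector` argument is left undefined above. The following is a minimal in-memory sketch of an object that satisfies the `record_request` calls made by `MonitoredCRMAIPlugin`. `InMemoryMetricsCollector` and its `summary` method are hypothetical names; a production setup would more likely forward these values to an existing monitoring system.

```python
import threading
from typing import Any, Dict, List, Optional


class InMemoryMetricsCollector:
    """Collects per-request metrics in memory; safe to call from several threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._response_times: List[float] = []
        self._failed = 0
        self._total_tokens = 0
        self._last_error: Optional[str] = None

    def record_request(
        self,
        success: bool,
        response_time: float,
        token_count: int = 0,
        error: Optional[str] = None,
    ) -> None:
        # Keyword names match the calls made by the monitored plugin
        with self._lock:
            self._response_times.append(response_time)
            self._total_tokens += token_count
            if not success:
                self._failed += 1
                self._last_error = error

    def summary(self) -> Dict[str, Any]:
        with self._lock:
            total = len(self._response_times)
            avg = sum(self._response_times) / total if total else 0.0
            return {
                "total_requests": total,
                "failed_requests": self._failed,
                "total_tokens": self._total_tokens,
                "avg_response_time": avg,
                "max_response_time": max(self._response_times, default=0.0),
                "last_error": self._last_error,
            }
```

Passing an instance as `metrics_collector` and exposing `summary()` through an internal endpoint is one way to watch latency and error trends without extra infrastructure. The list of response times grows without bound, so a long-running service would need to cap or window it.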

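5.3 Security considerations

Security matters a great deal in an enterprise environment. Here are some recommendations:

API authentication: add an authentication mechanism to the API service
Input validation: strictly validate and sanitize all input
Rate limiting: cap request frequency to prevent abuse
Sensitive-data filtering: keep sensitive data from reaching the model
Audit logging: record every AI interaction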
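
As one illustration of the sensitive-data filtering item, text can be masked before it is sent to the model. The following is a minimal sketch with deliberately narrow, illustrative patterns: email addresses, 11-digit mainland China mobile numbers, and 18-character resident ID numbers. The `redact` function and the placeholder tokens are hypothetical, and regex masking of this kind is not a complete privacy solution.

```python
import re

# Illustrative patterns only; real deployments need rules for their own data
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 18-character resident ID: 17 digits followed by a digit or X
ID_RE = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")
# Mainland China mobile number: 11 digits starting with 13 to 19
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def redact(text: str) -> str:
    """Replace emails, ID numbers, and mobile numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = ID_RE.sub("[ID]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

The digit lookarounds keep the phone pattern from firing inside longer digit runs such as order numbers. Calling `redact` on the customer query before it reaches the plugin, and logging only the redacted form, also keeps the audit log free of these values.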

# Security-hardened API
from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

security = HTTPBearer()

async def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
    """Validate the access token"""
    token = credentials.credentials
    # Implement your own token check here; is_valid_token is a placeholder
    # that this snippet does not define
    if not is_valid_token(token):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid access token",
            headers={"WWW-Authenticate": "Bearer"}
        )
    return token

@app.post("/api/secure/customer/reply")
async def secure_reply_customer(
    request: CustomerQuery,
    token: str = Depends(verify_token)
):
    """Customer reply endpoint that requires authentication"""
    # Existing business logic goes here...
    pass
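
The `is_valid_token` check and the rate limiting mentioned in the list are left open above. The following is a minimal sketch of both, assuming a small set of static tokens held in configuration. `make_token_validator` and `SlidingWindowRateLimiter` are hypothetical names, and the limiter keeps its state in process memory, so it only covers a single worker.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable


def make_token_validator(valid_tokens: Iterable[str]) -> Callable[[str], bool]:
    """Build an is_valid_token function backed by constant-time comparison."""
    known = [t.encode("utf-8") for t in valid_tokens]

    def is_valid_token(token: str) -> bool:
        candidate = token.encode("utf-8")
        matched = False
        # Check every known token so timing does not reveal which one matched
        for expected in known:
            if hmac.compare_digest(candidate, expected):
                matched = True
        return matched

    return is_valid_token


class SlidingWindowRateLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have left the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Inside `verify_token`, a call such as `limiter.allow(token)` could follow the validity check, raising an `HTTPException` with status 429 when it returns `False`. With several workers or servers, the counters would need to move to shared storage such as Redis.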