实战教程：用通义千问1.8B和Gradio搭建AI信息聚合平台

本文介绍了如何在星图GPU平台上自动化部署通义千问1.5-1.8B-Chat-GPTQ-Int4 WebUI镜像，快速搭建AI信息聚合平台。该平台结合Gradio框架，可实现垂直领域内容的智能检索与对话式问答，适用于金融、医疗等专业信息的自动化处理与精准解答。

徐晓波

205人浏览 · 2026-03-19 00:23:56

徐晓波 · 2026-03-19 00:23:56 发布

实战教程：用通义千问1.8B和Gradio搭建AI信息聚合平台

1. 项目概述与技术选型

在信息爆炸的时代，如何快速获取特定领域的优质内容成为刚需。本文将手把手教你用通义千问1.8B轻量级模型和Gradio框架，搭建一个垂直领域信息聚合平台。这个平台能自动收集、整理专业内容，并以对话形式提供精准解答。

1.1 核心组件介绍

通义千问1.8B-Chat-GPTQ-Int4：阿里云推出的高效对话模型，经4-bit量化后仅需4GB显存
Gradio：快速构建机器学习Web界面的Python框架
ChromaDB：轻量级向量数据库，用于存储和检索文本片段
Sentence-Transformers：文本嵌入模型，将内容转换为可检索的向量

1.2 系统架构设计

整个系统采用"检索-增强生成"（RAG）架构：

爬虫定期抓取目标网站内容
文本清洗后存入向量数据库
用户提问时先检索相关片段
模型基于检索结果生成有依据的回答

2. 环境准备与模型部署

2.1 基础环境配置

推荐使用Linux系统，准备Python 3.10+环境：

# 创建conda环境
conda create -n qwen_rag python=3.10
conda activate qwen_rag

# 安装核心依赖
pip install torch transformers gradio chromadb sentence-transformers beautifulsoup4

2.2 模型快速部署

通义千问1.8B-Chat-GPTQ-Int4模型部署非常简单：

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)

首次运行会自动下载约4GB的模型文件。部署成功后可以测试基础对话功能：

input_text = "你好，请介绍一下你自己"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

3. 构建领域知识库

3.1 目标网站内容抓取

以AI新闻领域为例，抓取专业媒体最新文章：

import requests
from bs4 import BeautifulSoup

def fetch_articles(base_url, max_pages=3):
    articles = []
    headers = {'User-Agent': 'Mozilla/5.0'}
    
    for page in range(1, max_pages+1):
        url = f"{base_url}/page/{page}" if page>1 else base_url
        try:
            resp = requests.get(url, headers=headers, timeout=10)
            soup = BeautifulSoup(resp.text, 'html.parser')
            
            # 根据实际网站结构调整选择器
            for item in soup.select('.article-list a.title'):
                articles.append({
                    'title': item.text.strip(),
                    'url': item['href']
                })
        except Exception as e:
            print(f"抓取失败: {e}")
    
    return articles

3.2 文本清洗与切片

抓取到的文章需要清洗和分块处理：

def clean_and_chunk(text, chunk_size=500):
    # 移除HTML标签和多余空白
    soup = BeautifulSoup(text, 'html.parser')
    clean_text = ' '.join(soup.stripped_strings)
    
    # 按固定大小分块
    return [clean_text[i:i+chunk_size] 
            for i in range(0, len(clean_text), chunk_size)]

3.3 向量化存储

使用ChromaDB存储处理后的内容：

import chromadb
from sentence_transformers import SentenceTransformer

# 初始化向量数据库
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("ai_news")

# 加载嵌入模型
embed_model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')

def index_content(texts, metadatas):
    # 生成嵌入向量
    embeddings = embed_model.encode(texts)
    
    # 存入数据库
    collection.add(
        documents=texts,
        embeddings=embeddings.tolist(),
        metadatas=metadatas,
        ids=[str(i) for i in range(len(texts))]
    )

4. 实现问答系统

4.1 检索增强生成流程

def retrieve_and_answer(question, top_k=3):
    # 检索相关片段
    query_embedding = embed_model.encode([question]).tolist()
    results = collection.query(
        query_embeddings=query_embedding,
        n_results=top_k
    )
    
    # 构建提示词
    context = "\n".join(results['documents'][0])
    prompt = f"""基于以下信息回答问题：
{context}

问题：{question}
答案："""
    
    # 生成回答
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

4.2 Gradio界面开发

创建用户友好的Web界面：

import gradio as gr

def gradio_interface(question):
    answer = retrieve_and_answer(question)
    return answer

demo = gr.Interface(
    fn=gradio_interface,
    inputs=gr.Textbox(label="输入问题", placeholder="最近AI领域有什么重要突破？"),
    outputs=gr.Textbox(label="模型回答"),
    title="AI信息聚合平台",
    description="基于通义千问1.8B的垂直领域问答系统"
)

demo.launch(server_name="0.0.0.0", server_port=7860)

5. 部署与优化建议

5.1 生产环境部署

推荐使用Supervisor管理服务：

[program:qwen_rag]
command=/path/to/conda/env/bin/python app.py
directory=/path/to/project
autostart=true
autorestart=true
stderr_logfile=/var/log/qwen_rag.err.log
stdout_logfile=/var/log/qwen_rag.out.log