Big Data Graduation Project: A Spark-Based Food Data Analysis and Visualization System (Hadoop, TensorFlow LSTM Prediction Model, Web Crawling, Django, deepseek)

q_3375686806 · Published 2025-03-24 22:43:01


1. Project Introduction

Tech stack:
Python, Django, MySQL, an LSTM prediction model built with TensorFlow's Keras API, ECharts visualization, Selenium web crawling, Dianping (大众点评) data
Big data technologies: Hadoop, Spark, Hive

2. Project Interface

(1) Home: data overview
(2) Food type analysis
(3) Food price analysis
(4) Food review analysis
(5) Food region analysis
(6) Food word cloud analysis
(7) Food data center
(8) Review prediction: LSTM prediction model
(9) Registration and login
(10) Data collection

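3. Project Description

Functional modules

1. Home: data overview

Shows an overview of the whole dataset, such as the total number of records and the time of the last update.
Charts may be used for key indicators, such as the number of restaurants per food type and the most popular regions.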

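2. Food type analysis

Analyzes how restaurants are distributed across food types.
It may show each type's share of the total and how type preferences differ between regions.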
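
As a rough illustration of the share calculation described above, the sketch below turns per-type counts into percentages. The function name `type_shares` and the sample counts are hypothetical and are not taken from the project.

```python
def type_shares(counts):
    """Convert {type: count} into {type: percentage share}, rounded to 1 decimal."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when there is no data at all
        return {name: 0.0 for name in counts}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Hypothetical counts per food type
sample_counts = {"hotpot": 50, "barbecue": 30, "dessert": 20}
shares = type_shares(sample_counts)
print(shares)
```

The resulting dictionary has the shape a pie chart needs: one label and one percentage per food type.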

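3. Food price analysis

Analyzes the price distribution, for example the number of restaurants in each price range.
It may also show how price correlates with rating, popularity, and similar factors.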
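
The post does not say how the correlation is measured. One common choice is the Pearson coefficient, sketched below in plain Python. The function name `pearson` and the sample values are assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: average price per person and star rating
prices = [20.0, 45.0, 80.0, 150.0, 300.0]
ratings = [3.5, 4.0, 4.0, 4.5, 5.0]
r = pearson(prices, ratings)
print(round(r, 3))
```

A value near 1 means higher-priced restaurants tend to have higher ratings, a value near -1 means the opposite, and a value near 0 means there is no linear relationship.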

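4. Food review analysis

Analyzes user reviews, including the positive-review rate and the negative-review rate.
It may show the distribution of ratings and how they differ by food type or region.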
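
The post does not define what counts as a positive or negative review. The sketch below assumes star ratings on a 1 to 5 scale and uses hypothetical thresholds (4.0 and above is positive, 2.0 and below is negative), so it only illustrates the calculation.

```python
def review_rates(stars, good_min=4.0, bad_max=2.0):
    """Return (positive_rate, negative_rate) as fractions of all ratings."""
    if not stars:
        return 0.0, 0.0
    good = sum(1 for s in stars if s >= good_min)
    bad = sum(1 for s in stars if s <= bad_max)
    return good / len(stars), bad / len(stars)


# Hypothetical star ratings
sample_stars = [5.0, 4.5, 4.0, 3.0, 2.0, 1.0, 4.0, 3.5]
positive_rate, negative_rate = review_rates(sample_stars)
print(positive_rate, negative_rate)
```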

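5. Food region analysis

Analyzes how restaurants are distributed across regions.
It may show the most popular food regions and the food preferences of each region.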

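6. Food word cloud analysis

Shows the high-frequency words in user reviews as a word cloud.
The cloud makes the most common review terms visible at a glance, for example “好吃” (tasty), “服务好” (good service), and “环境差” (poor environment).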
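
A word cloud is driven by word frequencies. The sketch below counts them with the standard library, assuming the reviews have already been split into words (Chinese text would first need a word segmenter, which the post does not name). The function name `top_words` and the sample reviews are hypothetical.

```python
from collections import Counter


def top_words(token_lists, stopwords=frozenset(), n=3):
    """Count tokens across all reviews and return the n most frequent (word, count) pairs."""
    counts = Counter(
        token
        for tokens in token_lists
        for token in tokens
        if token not in stopwords
    )
    return counts.most_common(n)


# Hypothetical reviews that are already segmented into words
reviews = [
    ["好吃", "服务好", "的"],
    ["好吃", "环境差"],
    ["服务好", "好吃", "的"],
]
top = top_words(reviews, stopwords={"的"}, n=2)
print(top)
```

The (word, count) pairs can be passed to a word cloud chart as name and value entries.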

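7. Food data center

Provides a data management page where users can view, filter, and export data.
It may show detailed fields for each record, such as restaurant name, address, rating, and price.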
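
The filter and export steps can be sketched with the standard library as follows. The field names `title`, `city`, and `avgPrice` appear in the project's Spark code, while the helper names `filter_rows` and `to_csv` and the sample rows are assumptions made for this example.

```python
import csv
import io


def filter_rows(rows, city=None, max_price=None):
    """Keep rows that match the optional city and price ceiling."""
    result = []
    for row in rows:
        if city is not None and row["city"] != city:
            continue
        if max_price is not None and row["avgPrice"] > max_price:
            continue
        result.append(row)
    return result


def to_csv(rows, fields):
    """Serialize the selected fields of each row to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical records
rows = [
    {"title": "A", "city": "Beijing", "avgPrice": 80.0, "start": 4.5},
    {"title": "B", "city": "Shanghai", "avgPrice": 30.0, "start": 4.0},
    {"title": "C", "city": "Beijing", "avgPrice": 200.0, "start": 5.0},
]
selected = filter_rows(rows, city="Beijing", max_price=100)
exported = to_csv(selected, ["title", "city", "avgPrice"])
print(exported)
```

In a Django view, the returned text could be sent back as a file download. That wiring is not shown here.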

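8. Review prediction: LSTM prediction model

Uses an LSTM model to predict food reviews.
Users enter parameters such as restaurant information and historical reviews, and the model predicts the future review trend.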
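
The post does not describe the model's input format. A common way to prepare a history for an LSTM is to cut it into sliding windows, where each window of past values is paired with the value that follows it. The sketch below assumes a single series of historical ratings. The function name `make_windows` and the sample data are hypothetical.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive values with the value that follows it."""
    if window < 1 or len(series) <= window:
        raise ValueError("series must be longer than the window")
    inputs = [series[i:i + window] for i in range(len(series) - window)]
    targets = [series[i + window] for i in range(len(series) - window)]
    return inputs, targets


# Hypothetical rating history for one restaurant, oldest first
history = [4.0, 4.2, 4.1, 4.3, 4.5]
X, y = make_windows(history, window=3)
print(X)
print(y)
```

Before training, the inputs would be reshaped to (samples, timesteps, features), which is the layout a Keras LSTM layer expects. Here that would be (2, 3, 1).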

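9. Registration and login

Provides user registration and login.
After registering and logging in, users can access more features, such as saving data to favorites and review prediction.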

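10. Data collection

Describes the data collection process and methods.
A Selenium crawler may be used to collect data from platforms such as Dianping (大众点评) and store it in a MySQL database.
After collection, the data may be cleaned and preprocessed before it is used for analysis or model training.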
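
The cleaning step is not detailed in the post. A minimal sketch, assuming each crawled record is a dictionary whose `avgPrice` arrives as text, could drop unusable rows and duplicates like this. The function name `clean_records` and the sample records are hypothetical.

```python
def clean_records(raw_records):
    """Drop rows without a title or a numeric price, and drop duplicates."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        title = (rec.get("title") or "").strip()
        if not title:
            continue
        try:
            price = float(rec.get("avgPrice"))
        except (TypeError, ValueError):
            # Missing or non-numeric price: skip the record
            continue
        key = (title, rec.get("city"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**rec, "title": title, "avgPrice": price})
    return cleaned


# Hypothetical crawled records
raw = [
    {"title": " Shop A ", "city": "Beijing", "avgPrice": "85"},
    {"title": "Shop A", "city": "Beijing", "avgPrice": "85"},
    {"title": "Shop B", "city": "Beijing", "avgPrice": None},
    {"title": "Shop C", "city": "Shanghai", "avgPrice": "n/a"},
    {"title": "Shop D", "city": "Shanghai", "avgPrice": "32.5"},
]
cleaned = clean_records(raw)
print(cleaned)
```

Treating title plus city as the duplicate key is an assumption. A real crawl might have a shop ID that is a better key.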

4. Core Code

# coding: utf8
# Imports
from pyspark.sql import SparkSession
# Note: sum, max and min below are Spark column functions; they shadow the Python built-ins in this script.
from pyspark.sql.functions import col, sum, when, max, min, avg
from pyspark.sql import functions as F

if __name__ == '__main__':
    # Build a SparkSession with Hive support
    spark = SparkSession.builder.appName("sparkSQL").master("local[*]").\
        config("spark.sql.shuffle.partitions", 2). \
        config("spark.sql.warehouse.dir", "hdfs://node1:8020/user/hive/warehouse"). \
        config("hive.metastore.uris", "thrift://node1:9083"). \
        enableHiveSupport().\
        getOrCreate()

    sc = spark.sparkContext

    # Read the source table from Hive
    fooddata = spark.read.table("fooddata")

    # Requirement 1: top 10 restaurants by average price, with their star rating
    top_ten_price = fooddata.orderBy(fooddata.avgPrice.desc()).limit(10)
    result1 = top_ten_price.select("title","start","avgPrice")

    df = result1.toPandas()
    # print(df)

    # sql
    result1.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "maxPriceTop"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result1.write.mode("overwrite").saveAsTable("maxPriceTop", "parquet")
    spark.sql("select * from maxPriceTop").show()

    # Requirement 2: number of restaurants per totalType
    result2 = fooddata.groupby("totalType").count()

    # sql
    result2.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "typeCount"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result2.write.mode("overwrite").saveAsTable("typeCount", "parquet")
    spark.sql("select * from typeCount").show()

    # Requirement 3: average price per city
    reuslt3 = fooddata.groupby("city").agg(F.avg("avgPrice").alias("averagePrice"))

    # sql
    reuslt3.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "cityAvg"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt3.write.mode("overwrite").saveAsTable("cityAvg", "parquet")
    spark.sql("select * from cityAvg").show()

    # Requirement 4: average comment count per totalType
    result4 = fooddata.groupby("totalType").agg(avg("totalComment").alias("commentAvg"))

    # sql
    result4.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "typeComment"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result4.write.mode("overwrite").saveAsTable("typeComment", "parquet")
    spark.sql("select * from typeComment").show()

    # Requirement 5: average taste, environment and service ratings per totalType
    result5 = fooddata.groupby("totalType").agg(
        avg("tasterate").alias("avgTasterate"),
        avg("envsrate").alias("avgEnvsrate"),
        avg("serverate").alias("avgServerate"),
    )

    # sql
    result5.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "typeRate"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result5.write.mode("overwrite").saveAsTable("typeRate", "parquet")
    spark.sql("select * from typeRate").show()

    # Requirement 6: number of restaurants per specific type
    result6 = fooddata.groupby("type").count()

    # sql
    result6.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "specificType"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result6.write.mode("overwrite").saveAsTable("specificType", "parquet")
    spark.sql("select * from specificType").show()


    # Requirement 7: highest average price per city
    reuslt7 = fooddata.groupby("city").agg(max("avgPrice").alias("maxAvgPrice"))

    # sql
    reuslt7.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "maxPriceCity"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt7.write.mode("overwrite").saveAsTable("maxPriceCity", "parquet")
    spark.sql("select * from maxPriceCity").show()

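    # Requirement 8: number of restaurants per price bucket
    # between() is inclusive at both ends and the branches are checked in order,
    # so a boundary value (15, 50, 100, 200, 500) falls into the lower bucket.
    # Rows with a null or negative avgPrice match no branch and are counted under otherwise().
    # Labels: 元 = yuan, 500以上 = above 500.
    fooddata_with_category = fooddata.withColumn(
        "prcieCategory",  # spelling kept as-is: the output table uses this column name
        when(col("avgPrice").between(0,15),'0-15元')
        .when(col("avgPrice").between(15, 50), '15-50元')
        .when(col("avgPrice").between(50, 100), '50-100元')
        .when(col("avgPrice").between(100, 200), '100-200元')
        .when(col("avgPrice").between(200, 500), '200-500元')
        .otherwise('500以上')
    )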

    reuslt8 = fooddata_with_category.groupby("prcieCategory").count()

    # sql
    reuslt8.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "categoryPrice"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt8.write.mode("overwrite").saveAsTable("categoryPrice", "parquet")
    spark.sql("select * from categoryPrice").show()

    # Requirement 9: average price per totalType
    reuslt9 = fooddata.groupby("totalType").agg(avg("avgPrice").alias("allAvgPrice"))

    # sql
    reuslt9.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "typePrice"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt9.write.mode("overwrite").saveAsTable("typePrice", "parquet")
    spark.sql("select * from typePrice").show()

    # Requirement 10: distribution of star ratings
    result10 = fooddata.groupby("start").count()

    # sql
    result10.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "startCount"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result10.write.mode("overwrite").saveAsTable("startCount", "parquet")
    spark.sql("select * from startCount").show()

    # Requirement 11: average combined rating (taste + environment + service) per city

    fooddata_with_mixrate = fooddata.withColumn("mixrate",
                                                col("tasterate")+col("envsrate")+col("serverate"))

    reuslt11 = fooddata_with_mixrate.groupby("city").agg(avg("mixrate").alias("avgMixrate"))

    # sql
    reuslt11.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "mixrateAvg"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt11.write.mode("overwrite").saveAsTable("mixrateAvg", "parquet")
    spark.sql("select * from mixrateAvg").show()

    # Requirement 12: maximum, average and minimum price per city
    result12 = fooddata.groupby("city").agg(
        max("avgPrice").alias("maxAvfPrice"),
        avg("avgPrice").alias("avgAvfPrice"),
        min("avgPrice").alias("minAvfPrice"),
    )

    # sql
    result12.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "mamCity"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    result12.write.mode("overwrite").saveAsTable("mamCity", "parquet")
    spark.sql("select * from mamCity").show()

    # Requirement 13: top 10 addresses by total comment count
    total_comments_df = fooddata.groupby("address").agg(
        sum("totalComment").alias("sumTotalComment")
    )

    reuslt13 = total_comments_df.orderBy(col("sumTotalComment").desc()).limit(10)

    # sql
    reuslt13.write.mode("overwrite"). \
        format("jdbc"). \
        option("url", "jdbc:mysql://node1:3306/bigdata?useSSL=false&useUnicode=true&charset=utf8"). \
        option("dbtable", "hotAddress"). \
        option("user", "root"). \
        option("password", "root"). \
        option("encoding", "utf-8"). \
        save()

    reuslt13.write.mode("overwrite").saveAsTable("hotAddress", "parquet")
    spark.sql("select * from hotAddress").show()
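
Every requirement above repeats the same two writes, one to MySQL over JDBC and one to a Hive table. The sketch below shows one way the repetition could be folded into a helper. The names `jdbc_options` and `save_result` are hypothetical and are not part of the original script. The URL uses `characterEncoding`, the MySQL Connector/J property for the character set, on the assumption that this is what `charset=utf8` in the original URL was meant to express.

```python
# Same host, port and database as in the script above
JDBC_URL = (
    "jdbc:mysql://node1:3306/bigdata"
    "?useSSL=false&useUnicode=true&characterEncoding=utf8"
)


def jdbc_options(table, user="root", password="root", url=JDBC_URL):
    """Build the JDBC options for one target table."""
    return {"url": url, "dbtable": table, "user": user, "password": password}


def save_result(df, table):
    """Write a Spark DataFrame to MySQL and to a Hive table of the same name."""
    # MySQL copy, replacing any existing table
    df.write.mode("overwrite").format("jdbc").options(**jdbc_options(table)).save()
    # Hive copy stored as Parquet
    df.write.mode("overwrite").saveAsTable(table, "parquet")
```

With such a helper, each requirement would reduce to computing the DataFrame and calling, for example, `save_result(result2, "typeCount")`. In a real deployment the credentials would come from configuration rather than being hard-coded.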