A Methodology of Massive Bus Passenger Origin-Destination Mining Based on Hive

CLC number: TP391.9

Fund projects: Tianjin Science and Technology Project (No.14ZCDGSF00124); Natural Science Foundation of Tianjin (No.16JCYBJC15600)

    Abstract:

    Existing origin-destination (OD) mining methods generally cannot analyze multiple bus lines in parallel, run inefficiently, and achieve an insufficient prediction rate. Given Hive's query performance on massive data, this work implements OD mining on Hive to overcome these problems. Boarding stations are matched using a time threshold; records that fail to match are matched again based on the number of boarding passengers at each station. Alighting stations are predicted with a trip-chaining algorithm implemented through table joins; records for which this prediction fails go through two further rounds of probability-based prediction. OD mining was performed on IC card transaction data and scheduling data from Shijiazhuang covering January 1, 2018 to March 27, 2018: 11,270,037 OD records were mined from 11,312,505 cleaned trip records, a prediction rate of 99.6%. The travel and attraction check indicates high-quality results, and the run took 17829.04 s with Hive parallel tuning enabled. The results show that the method meets the business requirements of offline OD mining in a production environment.
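
    The abstract does not spell out the trip-chaining rule, so the following is only a minimal sketch under a common simplifying assumption: within one day, a rider's alighting station is taken to be the boarding station of their next trip, and the last trip of the day chains back to the first boarding station. The function names, the tuple layout, and the station labels are hypothetical, and plain Python stands in for the Hive table joins the paper uses. The second helper simply reproduces the prediction-rate arithmetic from the reported record counts.

```python
from collections import defaultdict


def chain_alighting(trips):
    """Infer alighting stations for one day of boarding records.

    trips: iterable of (card_id, board_time, board_station) tuples
           (hypothetical layout, not the paper's schema).
    Returns a list of (card_id, board_time, board_station, alight_station),
    where alight_station is None when the chain gives no prediction.
    """
    by_card = defaultdict(list)
    for card_id, board_time, station in trips:
        by_card[card_id].append((board_time, station))

    result = []
    for card_id, rides in by_card.items():
        rides.sort()  # chronological order within the day
        n = len(rides)
        for i, (board_time, station) in enumerate(rides):
            alight = None
            if n >= 2:
                # Assumption: the next boarding station (wrapping around
                # for the last trip) is where this trip ended.
                candidate = rides[(i + 1) % n][1]
                if candidate != station:
                    alight = candidate
            result.append((card_id, board_time, station, alight))
    return result


def prediction_rate(od_records, trip_records):
    """Share of cleaned trip records that received an OD pair."""
    return od_records / trip_records


if __name__ == "__main__":
    sample = [("A", 800, "S1"), ("A", 1730, "S5"), ("B", 900, "S2")]
    for row in chain_alighting(sample):
        print(row)
    print(round(prediction_rate(11270037, 11312505) * 100, 1))
```

    In this sketch, card "B" has a single trip and therefore no chained prediction; in the paper such failed records are the ones handed to the probability-based prediction rounds, which are not modeled here.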

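Cite this article

XU Zhi-hong, WANG Yi-zheng, WANG Li-qin, et al. A Methodology of Massive Bus Passenger Origin-Destination Mining Based on Hive[J]. Science Technology and Engineering, 2020, 20(20): 8300-8309.
XU Zhi-hong, WANG Yi-zheng, DONG Yong-feng. A Methodology of Massive Bus Passenger Origin-Destination Mining Based on Hive[J]. Science Technology and Engineering, 2020, 20(20): 8300-8309.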

History
  • Received: 2019-08-29
  • Last revised: 2020-04-17
  • Accepted: 2020-01-18
  • Published online: 2020-07-29