当前位置:首页>> >>


基于图算法与最佳图表示的单台PC图计算平台研究.rar

收藏

资源目录
    文档预览:
    编号:20181030002154371    类型:共享资源    大小:3.54MB    格式:RAR    上传时间:2018-10-30
    尺寸:148x200像素    分辨率:72dpi   颜色:RGB    工具:   
    25
    金币
    关 键 词:
    基于 算法 最佳 图表 PC 计算 平台 研究
    资源描述:
    上海交通大学硕士学位论文 基于图算法与最佳图表示的单台 PC 图计算平台研究 硕 士研究生 : 张雕 学号 : 1120379088 导 师 : 梁阿磊 申请学位 : 工程硕士 学科 : 计算机科学与技术 所 在 单 位 : 软件学院 答 辩 日 期 : 2014 年 1 月 15 日 授予学位单位 : 上海交通大学 Dissertation Submitted to Shanghai Jiao Tong University for the Degree of Master A Research on Graph Computation Platform in a Single PC by Application of Graph AlgorithmTuned Graph Representation Format Candidate: Diao Zhang Student ID: 1120379088 Supervisor: ALei Liang Academic Degree Applied for: Master of Engineering Speciality: Computer Science and Technology Affiliation: School of Software Date of Defence: Jan15, 2014 Degree-Conferring-Institution: Shanghai Jiao Tong University 上海交通大学 学位论文原创性声明 本人郑重声明:所呈交的学位论文《基于网络操作系统的负载均衡应用框架设计 》 ,是本人在导师的指导下,独立进行 研究工作所取得的成果。除文中已经注明引用的内容外,本论文不包含任何其他个人或集体已经发表或撰写过的作品成果。对本文的研究做出重要贡献的个人和集体,均已在文中以明确方式标明。本人完全意识到本声明的法律结果由本人承担。 学位论文作者签名: 日期:年月日 上海交通大学 学位论文版权使用授权书 本学位论文作者完全了解学校有关保留、使用学位论文的规定,同意学校保留并向国家有关部门或机构送交论文的复印件和电子版,允许论文被查阅和借阅。本人授权上海交通大学可以将本学位论文的全部或部分内容编入有关数据库进行 检索,可以采用影印、缩印或扫描等复制手段保存和汇编本学位论文。 保密 □,在年解密后适用本授权书。 本学位论文属于 不保密 □。 (请在以上方框内打“ √ ”) 学位论文作者签名: 指导教师签名: 日期: 年 月 日 日期: 年 月 日 上海交通大学硕士学位论文 I 基于图算法与最佳图表示的单台 PC 图计算平台研究 摘要 图可以描述实体与实体之间的联系,以顶点和边的抽象的方式分析现实中的问题,如好友推荐、网页排名 PageRank。传统的图算法假设整个图 数据可以加载进单台 PC 内存,所以对于大规模图,如社交网络、互联网等无法处理。云计算以及分布式图算法的研究用于大规模图的处理,如 Hadoop,从扩展性、容错以及开源可用性等方面发挥优势,但仍存在控制与可靠性、数据安全、成本花费以及分布式图算法的调试与优化较难的问题。 针对这种情况 ,学术界开始研究如何使用单台 PC 进行大规模图数据的处理,并达到较之于分布式算法的合理时间消耗。现有的基于单台 PC 的图计算平台,如 GraphChi、 TurboGraph,已经可以进行大规模图数据的处理。但是从平台计算性能和易用性 (即基于 平台进行图问题的抽象 )两方面都存在可以优化的方面。 本文基于对图算法与最佳图表示、 CPU 与 I/O 并行性以及内存利用三方面的研究,在 GraphChi 基础上设计和实现了一个基于单台 PC的图计算平台 HybridGraph,使计算性能和算法抽象两部分得到改进。 本文的主要工作 包括:( I) 研究和分析了图算法表示,以及总结出与图算法最佳匹配的图表示方法 ;( II)探索并证明了通过图算法与图表示格式的匹配可以提高图处理效率;( III)基于开源图计算平台GraphChi,实现了 HybridGraph 图计算平台。 关键词: 大 规模图、图计算平台 、 云计算、图算法、图表示上海交通大学硕士学位论文 II A Research on Graph Computation Platform in a Single PC by Application of Graph AlgorithmTuned Graph Representation Format ABSTRACT Graph can be used to describe the connenction between entities, and analyze real-world problem, such as friend recommendation and web page ordering, by abstracting entity as vertex and connencion as edge. Tranditional graph algorithms assume the input graph fits in the memory of a single machine. However, the large-scale graphs, such as Web Graph and Social Network, break the assumption. Then, we naturally turn to cloud computing and distributed graph algorithms, such as Hadoop, for their scalability, ease of accessibility and fault tolerance. While the problems of cloud computing remain to be solved, such as control and reliability, data security, unpredicted cost and the difficulty of distributed algorithms for debugging and optimizing. Based on this situation, analytics over large-scale graph processing on a single PC reasonably is beginning to attract significant attention in the research community. Presented graph-computing platform on a single PC, such as GraphChi, TurboGraph, can analyze large-scale graph efficiently. But they still have performance and ease abstraction of graph algorithm problem and can be optimized someway. This paper design and realize a graph computing platform on a single PC called HybridGraph based on GraphChi by the research on (1) Graph Algorithm tuned on Graph Representation Format, (2) Parallelism of CPU processing and I/O processing, and (3) Usage of Memory. And the experiment shows improvement on performance and abstraction of graph algorithm. The primary work of this paper includes: (I) Explored and classified the abstraction of graph algorithm, and summarized the graph algorithm 上海交通大学硕士学位论文 III and best-fit graph format; (II) Prove that the performance of graph processing can be improved by combination graph algorithm and graph format; (III) implementation of HybridGraph based on the open source GraphChi. KEY WORDS:large-scale graph, graph-computing platform, Cloud computing, graph algorithm, graph representation 上海交通大学硕士学位论文 IV 目录 第一章绪论 ························································································· 1 1.1 课题研究背景 ·············································································· 1 1.1.1 大规模图 ··············································································· 1 1.1.2 图计算 ·················································································· 2 1.1.3 云计算与分布式计算系统 ·························································· 2 1.1.4 单机计算系统 ········································································· 5 1.2 相关研究现状 ·············································································· 6 1.2.1 Pregel ··················································································· 6 1.2.2 PowerGraph ············································································ 7 1.2.3 GraphChi ··············································································· 9 1.2.4 TurboGraph ··········································································· 12 1.2.5 X-Stream ·············································································· 13 1.2.6 MMap ·················································································· 14 1.3 本文内容安排 ············································································· 15 第二章图与图算法 ··············································································· 16 2.1 引言 ························································································· 16 2.2 图的基本概念 ············································································· 16 2.2.1 顶点和边 ·············································································· 16 2.2.2 图的存储表示 ········································································ 16 2.2.3 子图 ···················································································· 18 2.3 图算法 ······················································································ 19 2.3.1 概述 ···················································································· 19 2.3.2 Out-edge Only 算法 ································································· 22 2.3.3 In-edge Sufficient 算法 ····························································· 22 2.3.4 In-out-edge 算法 ····································································· 22 2.4 本章小结 ··················································································· 23 第三章单台 PC 图计算平台 HYBRIDGRAPH 的设计 ··································· 24 3.1 引言 ························································································· 24 3.2 GraphChi 分析 ············································································· 25 3.2.1 Compressed row Storage ···························································· 26 上海交通大学硕士学位论文 V 3.2.2 Compressed column Storage ······················································· 26 3.2.3 GraphChi 顶点和边抽象 ··························································· 27 3.2.4 串行性 ················································································· 30 3.3 单机图计算平台设计思路 ······························································ 30 3.4 图计算配置管理 ·········································································· 32 3.4.1 计算引擎选择 ········································································ 32 3.5 图数据预处理 ············································································· 33 3.6 图计算引擎 ················································································ 34 3.7 本章小结 ··················································································· 34 第四章单 台 PC 图计算平台 HYBRIDGRAPH 的实现 ··································· 35 4.1 引言 ························································································· 35 4.2 HybridGraph API 实现 ··································································· 35 4.3 图计算配置管理实现 ···································································· 37 4.3.1 图文件命名管理 ····································································· 38 4.3.2 内存分配 ·············································································· 38 4.4 图数据预处理实现 ······································································· 40 4.5 图计算引擎实现 ·········································································· 41 4.6 本章小结 ··················································································· 43 第五章实验与分析 ··············································································· 45 5.1 引言 ························································································· 45 5.2 实验环境介绍 ············································································· 45 5.3 实验算法 ··················································································· 45 5.4 实验数据集介绍 ·········································································· 47 5.5 实验结果与分析 ·········································································· 50 5.5.1 实验目标 ·············································································· 50 5.5.2 实验结果与分析 ····································································· 50 5.6 本章小结 ··················································································· 53 第六章结束语 ····················································································· 54 6.1 主要工作与创新点 ······································································· 54 6.2 后续研究工作 ············································································· 55 参考文献 ··························································································· 56 致谢 ································································································· 59 上海交通大学硕士学位论文 VI 攻读硕士学位期间已发表或录用的论文 ··················································· 61
    展开阅读全文
    1
      金牌文库所有资源均是用户自行上传分享,仅供网友学习交流,未经上传用户书面授权,请勿作他用。
    0条评论

    还可以输入200字符

    暂无评论,赶快抢占沙发吧。

    关于本文
    本文标题:基于图算法与最佳图表示的单台PC图计算平台研究.rar
    链接地址:http://www.gold-doc.com/p-254807.html
    关于我们 - 网站声明 - 网站地图 - 资源地图 - 友情链接 - 网站客服客服 - 联系我们
    [email protected] 2014-2018 金牌文库网站版权所有
    经营许可证编号:浙ICP备15046084号-3
    收起
    展开