Abstract
Graph computing is increasingly used in fields such as spatial information networks and social networks. However, existing graph computing systems suffer from problems such as complex programming models and steep learning curves. This paper introduces GRAPE, a distributed large-scale GRAPh Engine, whose distinguishing features are solid theoretical guarantees, ease of use, auto-parallelization, and high performance. The paper also introduces several typical graph computing scenarios, including entity resolution, link prediction, community detection, and graph mining of spatial information networks. In these scenarios, existing systems run into various problems: failure to compute over large-scale data because of high computational complexity, loss of accuracy because the original data is cropped, and excessively long execution times. GRAPE addresses these challenges and readily supports these scenarios through a series of technical improvements. With the deployment of GRAPE at Alibaba, both the effectiveness and the efficiency of graph computing have been greatly improved.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, J., Li, Z., Zeng, W., Huang, J. (2021). Graph Computing System and Application Based on Large-Scale Information Network. In: Yu, Q. (eds) Space Information Network. SINC 2020. Communications in Computer and Information Science, vol 1353. Springer, Singapore. https://doi.org/10.1007/978-981-16-1967-0_12
DOI: https://doi.org/10.1007/978-981-16-1967-0_12
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1966-3
Online ISBN: 978-981-16-1967-0
eBook Packages: Computer Science, Computer Science (R0)