[Share] Stanford Dataset Collection: Citation networks
2013-08-22 14:55
cit-HepPh
The arXiv HEP-PH (high energy physics phenomenology) citation graph comes from the e-print arXiv and covers all citations within a dataset of 34,546 papers, giving 421,578 edges. If paper i cites paper j, the graph contains a directed edge from i to j.
If a paper cites, or is cited by, a paper outside the dataset, the graph contains no information about that citation.
The data covers papers from January 1993 to April 2003 (124 months). It begins within a few months of the inception of the arXiv, and thus represents essentially the complete history of its HEP-PH section.
The data was originally released as part of the 2003 KDD Cup.
Dataset statistics | |
---|---|
Nodes | 34546 |
Edges | 421578 |
Nodes in largest WCC | 34401 (0.996) |
Edges in largest WCC | 421485 (1.000) |
Nodes in largest SCC | 12711 (0.368) |
Edges in largest SCC | 139981 (0.332) |
Average clustering coefficient | 0.2962 |
Number of triangles | 1276868 |
Fraction of closed triangles | 0.1457 |
Diameter (longest shortest path) | 12 |
90-percentile effective diameter | 5 |
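The node, edge, and largest-WCC rows above can be recomputed from an edge list. The following is a minimal sketch, not the method used to produce the table: it assumes the file is plain text with one `source target` pair per line and `#` comment lines (the file layout is an assumption, it is not described in this post), and the names `parse_edges` and `largest_wcc` are hypothetical. Weak connectivity ignores edge direction, so a union-find structure is enough.

```python
from collections import defaultdict

def parse_edges(text):
    """Parse 'source target' lines, skipping blank lines and '#' comments."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))  # directed edge: src cites dst
    return edges

def largest_wcc(edges):
    """Return (node_count, edge_count) of the largest weakly connected component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Merge the endpoints of every edge, ignoring direction.
    for src, dst in edges:
        root_s, root_d = find(src), find(dst)
        if root_s != root_d:
            parent[root_s] = root_d

    node_counts = defaultdict(int)
    for node in list(parent):
        node_counts[find(node)] += 1
    edge_counts = defaultdict(int)
    for src, _ in edges:
        edge_counts[find(src)] += 1

    best = max(node_counts, key=node_counts.get)
    return node_counts[best], edge_counts[best]

# Tiny made-up sample, not taken from the dataset.
sample = """# FromNodeId ToNodeId
1 2
2 3
3 1
4 5
"""
edges = parse_edges(sample)
nodes = {n for edge in edges for n in edge}
wcc_nodes, wcc_edges = largest_wcc(edges)
print(len(nodes), len(edges), wcc_nodes, wcc_edges)
```

On the real file, dividing the two returned counts by the total node and edge counts should give the fractions shown in parentheses in the table.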
cit-HepTh
The arXiv HEP-TH (high energy physics theory) citation graph comes from the e-print arXiv and covers all citations within a dataset of 27,770 papers, giving 352,807 edges. If paper i cites paper j, the graph contains a directed edge from i to j.
If a paper cites, or is cited by, a paper outside the dataset, the graph contains no information about that citation.
The data covers papers from January 1993 to April 2003 (124 months). It begins within a few months of the inception of the arXiv, and thus represents essentially the complete history of its HEP-TH section.
The data was originally released as part of the 2003 KDD Cup.
Dataset statistics | |
---|---|
Nodes | 27770 |
Edges | 352807 |
Nodes in largest WCC | 27400 (0.987) |
Edges in largest WCC | 352542 (0.999) |
Nodes in largest SCC | 7464 (0.269) |
Edges in largest SCC | 116268 (0.330) |
Average clustering coefficient | 0.3295 |
Number of triangles | 1478735 |
Fraction of closed triangles | 0.1196 |
Diameter (longest shortest path) | 14 |
90-percentile effective diameter | 5.4 |
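The "largest SCC" rows count nodes that can all reach one another along directed citation edges. As an illustration only, the sketch below computes the largest strongly connected component with an iterative version of Kosaraju's algorithm; the function name `largest_scc_size` and the toy edge list are hypothetical, and nothing here is claimed to be how the table was produced.

```python
from collections import defaultdict

def largest_scc_size(edges):
    """Size of the largest strongly connected component (Kosaraju, iterative)."""
    forward, backward = defaultdict(list), defaultdict(list)
    nodes = set()
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order;
    # each sweep collects exactly one strongly connected component.
    assigned, best = set(), 0
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        size, stack = 0, [start]
        while stack:
            node = stack.pop()
            size += 1
            for prev in backward[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        best = max(best, size)
    return best

# Toy graph: 1 -> 2 -> 3 -> 1 forms a cycle; 4 and 5 hang off it.
toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
print(largest_scc_size(toy_edges))

# The parenthesised fraction in the table is the SCC size over all nodes.
print(round(7464 / 27770, 3))
```

The iterative form avoids Python's recursion limit, which a recursive depth-first search could hit on graphs of this size.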
cit-Patents
The U.S. patent dataset is maintained by the National Bureau of Economic Research. The data set spans 37 years (January 1, 1963 to December 30, 1999) and includes all the utility patents granted during that period, totaling 3,923,922 patents. The citation graph includes all citations made by patents granted between 1975 and 1999, totaling 16,522,438 citations. For the patents dataset there are 1,803,511 nodes for which we have no information about their citations (we only have the in-links).
Dataset statistics | |
---|---|
Nodes | 3774768 |
Edges | 16518948 |
Nodes in largest WCC | 3764117 (0.997) |
Edges in largest WCC | 16511741 (1.000) |
Nodes in largest SCC | 1 (0.000) |
Edges in largest SCC | 0 (0.000) |
Average clustering coefficient | 0.0919 |
Number of triangles | 7515023 |
Fraction of closed triangles | 0.06714 |
Diameter (longest shortest path) | 22 |
90-percentile effective diameter | 9.4 |
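A largest SCC of one node with zero edges is what one would expect from a graph without directed cycles, which fits patents citing only earlier patents. That reading is an interpretation, not something stated in the dataset description. The sketch below shows how it could be checked with Kahn's algorithm, and how nodes that have only in-links could be listed; `is_acyclic` and `nodes_without_out_links` are hypothetical names and the edges are made up.

```python
from collections import defaultdict, deque

def is_acyclic(edges):
    """Kahn's algorithm: True when every node can be removed in topological order."""
    successors = defaultdict(list)
    in_degree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1
        nodes.update((src, dst))

    # Start from nodes that nothing points to.
    ready = deque(n for n in nodes if in_degree.get(n, 0) == 0)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for nxt in successors.get(node, ()):
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
    # Nodes on a cycle (including self-loops) never reach in-degree zero.
    return removed == len(nodes)

def nodes_without_out_links(edges):
    """Nodes that are cited but never appear as the source of an edge."""
    sources = {src for src, _ in edges}
    targets = {dst for _, dst in edges}
    return targets - sources

# Made-up example: patent 3 cites 2 and 1, patent 2 cites 1.
toy_edges = [(3, 2), (2, 1), (3, 1)]
print(is_acyclic(toy_edges))
print(nodes_without_out_links(toy_edges))
```

In this toy graph, node 1 plays the role of a patent whose own citations are not recorded: it is cited by others but has no outgoing edges.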
Datatang offers this data-mining dataset as a free download: http://www.datatang.com/data/44126