Some useful links covering various aspects of big data applications
2014-02-26 10:14
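The links below cover various aspects of big data applications. They are a bit disorganized but all fairly useful, and I will update the list from time to time:
1. AMPLab has published the results of a benchmark that runs SQL query workloads on Hive, Impala, Tez, Shark, and Redshift across Scan, Aggregation, Join, and External Script scenarios, and compares their performance.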
https://amplab.cs.berkeley.edu/benchmark/
2. The blog above led me to an Intel Hadoop benchmark tool: This benchmark suite contains 9 typical Hadoop workloads (including micro benchmarks, HDFS benchmarks, web search benchmarks, machine learning benchmarks, and data analytics benchmarks).
https://github.com/intel-hadoop/HiBench
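3. (From hashjoin's Weibo) Tresata today released a real-time big data mining solution for the financial and insurance industries. The company, backed by a Rackspace founder, has built relationship-mining and risk-analysis applications on Spark and is a new favorite on Wall Street. Hadoop brought the industry cheap big data storage; the next generation of big data companies should focus on how to extract value from that stored data: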
http://tresata.com/news/tresata-delivers-big-data-industrys-first-real-time-network-discovery-application-powered-by-spark/
4. Facebook made some performance improvements to HDFS for its HBase usage, apparently by adding a flash cache. This needs a closer read:
http://research.cs.wisc.edu/wind/Publications/fbmessages-fast14.pdf
5. Apache Hadoop 2.3.0 released:
With this release, there are two significant enhancements to HDFS:
• Support for Heterogeneous Storage Hierarchy in HDFS (HDFS-2832)
• In-memory Cache for data resident in HDFS via Datanodes (HDFS-4949)
In YARN, we are very excited to see that ResourceManager Automatic Failover (YARN-149) is nearly complete, even if it isn’t ready for primetime yet. We expect it to land by the next release, i.e. hadoop-2.4. Furthermore, a number of key operational enhancements have been driven into YARN such as better logging, error-handling, diagnostics etc.
On the MapReduce side of the house, a key enhancement is MAPREDUCE-4421; with this we now no longer need to install MapReduce binaries on every machine and can just use a MapReduce tarball via the YARN DistributedCache by copying it into HDFS.
http://hortonworks.com/blog/apache-hadoop-2-3-0-released/
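As a rough illustration of the MAPREDUCE-4421 deployment described above, the fragment below sketches how a cluster might point jobs at a MapReduce tarball stored in HDFS. The two property names are the ones Hadoop uses for this feature; the HDFS path, the tarball name, the `mrframework` alias, and the directory layout inside the tarball are all assumptions made for illustration, and the classpath shown is deliberately incomplete.

```xml
<!-- mapred-site.xml (sketch only; path, file name, and alias are hypothetical) -->
<configuration>
  <property>
    <!-- Location of the MapReduce tarball previously copied into HDFS.
         The part after '#' is the link name under which YARN localizes it. -->
    <name>mapreduce.application.framework.path</name>
    <value>hdfs:///mapred/framework/hadoop-mapreduce-2.3.0.tar.gz#mrframework</value>
  </property>
  <property>
    <!-- The classpath must reference jars inside the localized tarball.
         The layout below is assumed; a real deployment needs the full list
         of directories that the tarball actually contains. -->
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_CONF_DIR,$PWD/mrframework/hadoop/share/hadoop/mapreduce/*,$PWD/mrframework/hadoop/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
```

With settings along these lines, the tarball is distributed to the nodes through the YARN DistributedCache when a job runs, so the MapReduce binaries do not have to be preinstalled on every machine.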
6. Apache Hadoop 2.4.0 released:
Hadoop 2.4.0 continues that momentum, with additional enhancements to both HDFS & YARN:
• Support for Access Control Lists in HDFS (HDFS-4685)
• Native support for Rolling Upgrades in HDFS (HDFS-5535)
• Smooth operational upgrades with protocol buffers for HDFS FSImage (HDFS-5698)
• Full HTTPS support for HDFS (HDFS-5305)
• Support for Automatic Failover of the YARN ResourceManager (YARN-149) (a.k.a. Phase 1 of YARN ResourceManager High Availability)
• Enhanced support for new applications on YARN with Application History Server (YARN-321) and Application Timeline Server (YARN-1530)
• Support for strong SLAs in YARN CapacityScheduler via Preemption (YARN-185)
http://hortonworks.com/blog/apache-hadoop-2-4-0-released/