GIS Tools For Hadoop
2014-12-29 03:52
Location is becoming increasingly important in Big Data.
Data that includes location and is enhanced with geographic information in a structured form is often referred to as spatial data. Analyzing spatial data requires an understanding of geometry and of the operations that can be performed on it. Enabling Hadoop to handle spatial data and spatial analysis is the goal of this Esri open source effort.
GIS Tools for Hadoop is an open source toolkit intended for Big Spatial Data Analytics. The toolkit provides the following libraries:
Esri Geometry API for Java: A generic geometry library that can be used to extend Hadoop core with vector geometry types and operations, and that enables developers to build MapReduce applications for spatial data.
Spatial Framework for Hadoop: Extends Hive and is based on the Esri Geometry API. It enables Hive Query Language users to leverage a set of analytical functions and geometry types, and it also includes some utilities for the JSON used in ArcGIS.
Geoprocessing Tools for Hadoop: Contains a set of ready-to-use ArcGIS Geoprocessing tools, based on the Esri Geometry API and Spatial Framework for Hadoop. Developers can download the source code of the tools and customize it; they can also create new tools and contribute them to the open source project. Through these tools, ArcGIS users can move their spatial data and execute a predefined workflow inside Hadoop.
The GIS Tools for Hadoop toolkit allows users who want to leverage the Hadoop framework to do spatial analysis on spatial data; for example:
Run filter and aggregate operations on billions of spatial data records inside Hadoop, based on spatial criteria.
Define new areas represented as polygons, and run point-in-polygon analysis on billions of spatial data records inside Hadoop.
Visualize analysis results on a map with rich styling capabilities and a rich set of base maps.
Integrate your maps into reports, or publish them as map applications online.
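To make the point-in-polygon analysis mentioned above concrete, here is a minimal sketch of the underlying test in plain Java. It uses the standard even-odd (ray casting) rule and is only an illustration: it is not the Esri Geometry API, and the class name PointInPolygonSketch, the method contains, and the sample square are hypothetical.

```java
/** Even-odd (ray casting) point-in-polygon test; an illustration, not the Esri Geometry API. */
final class PointInPolygonSketch {

    /**
     * Returns true when (px, py) lies inside the polygon whose vertices are
     * given in order by xs[i], ys[i]. The ring is closed implicitly.
     * Points exactly on the boundary are not handled specially.
     */
    static boolean contains(double[] xs, double[] ys, double px, double py) {
        boolean inside = false;
        int n = xs.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            // Does the edge from vertex j to vertex i straddle the line y = py?
            boolean straddles = (ys[i] > py) != (ys[j] > py);
            if (straddles) {
                // x coordinate where the edge crosses that horizontal line.
                double xCross = xs[j] + (py - ys[j]) * (xs[i] - xs[j]) / (ys[i] - ys[j]);
                if (px < xCross) {
                    inside = !inside; // each crossing to the right of the point flips the state
                }
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // Hypothetical square area with corners (0,0) and (10,10).
        double[] xs = {0, 10, 10, 0};
        double[] ys = {0, 0, 10, 10};
        System.out.println(contains(xs, ys, 5, 5));  // a point inside the square
        System.out.println(contains(xs, ys, 15, 5)); // a point outside the square
    }
}
```

In a real deployment the geometry library would supply this test along with spatial references and indexing; the sketch only shows the kind of predicate that is evaluated for each record.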
Getting started
Developers can get started with Spatial Framework for Hadoop.
ArcGIS users can get started with Geoprocessing Tools for Hadoop.
How it all works
Overall, four GitHub projects make up the toolkit.
Firstly, the Esri Geometry API for Java project. This is a generic library that includes geometry objects, spatial operations, and spatial indexing, and it can be used to spatially enable Hadoop. By deploying the Esri Geometry API library (as a jar) within Hadoop, developers can build spatially enabled MapReduce applications that use the Esri Geometry API alongside the other Hadoop APIs.
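The shape of such a spatially enabled MapReduce job can be sketched without a cluster. The example below is an assumption-laden stand-in: it uses java.awt.geom.Path2D from the Java standard library in place of the Esri Geometry API, runs the map and reduce steps in a single process rather than through the Hadoop APIs, and the names PolygonCountSketch, polygon, map, and countPerArea, as well as the "west" and "east" areas, are hypothetical.

```java
import java.awt.geom.Path2D;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Single-process sketch of a map/reduce style point-in-polygon count. */
final class PolygonCountSketch {

    /** Builds a closed polygon from alternating x, y coordinates. */
    static Path2D polygon(double... coords) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(coords[0], coords[1]);
        for (int i = 2; i + 1 < coords.length; i += 2) {
            path.lineTo(coords[i], coords[i + 1]);
        }
        path.closePath();
        return path;
    }

    /** "Map" step: returns the name of the first area containing the point, or null. */
    static String map(Map<String, Path2D> areas, double x, double y) {
        for (Map.Entry<String, Path2D> area : areas.entrySet()) {
            if (area.getValue().contains(x, y)) {
                return area.getKey();
            }
        }
        return null;
    }

    /** Applies map to every point, then "reduces" by summing the counts per area name. */
    static Map<String, Integer> countPerArea(Map<String, Path2D> areas, double[][] points) {
        Map<String, Integer> counts = new TreeMap<>();
        for (double[] p : points) {
            String key = map(areas, p[0], p[1]);
            if (key != null) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two hypothetical adjacent square areas.
        Map<String, Path2D> areas = new LinkedHashMap<>();
        areas.put("west", polygon(0, 0, 10, 0, 10, 10, 0, 10));
        areas.put("east", polygon(10, 0, 20, 0, 20, 10, 10, 10));
        // Hypothetical point records; the last one falls outside both areas.
        double[][] points = {{2, 3}, {5, 5}, {15, 5}, {30, 30}};
        System.out.println(countPerArea(areas, points));
    }
}
```

In an actual Hadoop job, the map step would run in a Mapper that emits one (area name, 1) pair per record and the summing would run in a Reducer, with the geometry library's containment test replacing Path2D.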
Secondly, the Spatial Framework for Hadoop project. This library includes the user-defined objects that extend Hive with the capabilities of the Esri Geometry API. By enabling this library in Hive, users can construct very SQL-like queries using HQL. In this case, users don't have to write a MapReduce application; they can interact with Hive, write their SQL-like queries, and get answers directly from Hadoop. These queries can include spatial operations and values.
Thirdly, the Geoprocessing Tools for Hadoop project. These tools are used specifically in ArcGIS. Through the tools, users can connect to Hadoop from ArcGIS. This connection is useful to toolkit users because they can import their analysis results into ArcGIS for visualization. They can also do more complex and sophisticated analysis once they have narrowed their data down to a specific subset. Additionally, users can leverage the ArcGIS platform capabilities to publish their maps to web and mobile apps and to integrate them with BI reports.
Finally, the GIS Tools for Hadoop project. This project is intended as a place for samples that leverage the toolkit. The samples can use the low-level libraries or the Geoprocessing tools. A couple of samples are available to help you test the deployment of the spatial libraries with Hadoop and Hive, and to make sure everything runs without issues before you start using the setup from your HQL queries or from the GP tools. To check your deployment for Hive and GP tools usage, use the sample point-in-polygon-aggregation-hive. The sample uses the data and lib directories on the same path.