An Introduction to Big Data

Two years ago the Big Data team released GIS Tools for Hadoop on GitHub. GIS Tools for Hadoop is an open source project that lets users work with big spatial data in Hadoop (a distributed big data platform), run distributed spatial analysis, and move data between the Hadoop Distributed File System (HDFS) and ArcGIS Desktop.

Until now, it has been difficult for many GIS users to take full advantage of these tools, or even just try them out (and see what all this big data talk is about). We know that not everyone has a cluster sitting around (although they are cheaper than you’d think) so we have put together a tutorial for beginners – no cluster or development experience needed!

This tutorial walks you through downloading and starting up a virtual machine (a self-contained, portable Hadoop environment) and accessing GIS Tools for Hadoop through GitHub, then points you towards tutorials and samples that teach you how to run analyses on your big spatial data.

Check out the tutorial on GitHub, and let us know on our GeoNet page if you have any questions or if there are other tutorials you want to see.

(Post submitted by Sarah Ambrose, Big Data Team)
