ArcGIS Hub

Classifying civic data into binary categories

Welcome to Week 5 of ArcGIS Hub’s Civic Analytics Notebook series. If this is your first time coming across a post from this series, you can start with our introductory post to join us and explore our notebooks from the previous weeks. Last week we worked with survey data to perform text analysis on the comments submitted by respondents. The idea was to summarize the common themes and sentiments of these responses without having to read through them all. 

This week we explore a supervised machine learning technique. Supervised learning has two main branches: classification, where you assign your data to pre-defined categories, and regression, where you predict or forecast unknown numeric values. We will look at a simple binary classification technique, which assigns each record to one of two categories such as yes/no, true/false, or this/that. Specifically, we evaluate whether zoning plays a role in building permits being revoked. We start by reading in the Building Permits data for Miami, FL since 2014 and filter it to include only permits that are still approved and permits that have been revoked.
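
To make the filtering step concrete, here is a minimal sketch in plain Python. The records, field names (`zoning`, `status`), and status values are hypothetical stand-ins, not the actual schema of the Miami permits data; the notebook itself may do this differently.

```python
# Hypothetical permit records; field names and status values are assumptions,
# not the real Miami Building Permits schema.
permits = [
    {"permit_id": 1, "zoning": "residential", "status": "Approved"},
    {"permit_id": 2, "zoning": "commercial", "status": "Revoked"},
    {"permit_id": 3, "zoning": "residential", "status": "Pending"},
    {"permit_id": 4, "zoning": "industrial", "status": "Approved"},
]

# Binary target: 0 = still approved, 1 = revoked.
STATUS_TO_LABEL = {"Approved": 0, "Revoked": 1}

# Keep only the two statuses of interest and attach the binary label.
labeled = [
    {**permit, "revoked": STATUS_TO_LABEL[permit["status"]]}
    for permit in permits
    if permit["status"] in STATUS_TO_LABEL
]

for row in labeled:
    print(row["permit_id"], row["zoning"], row["revoked"])
```

Permits in any other state (such as the hypothetical "Pending" record) are dropped, so every remaining row falls into exactly one of the two categories.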

Normally, an issued building permit is revoked or cancelled later if the scope of work changes from what was initially estimated and applied for. We test to see if changing scope is a factor in cancellation and, if so, how strong a factor it is in successfully classifying our permits as revoked or not. We assign 75% of our data to a training set, which is used to build our classification model. The remaining 25% is used to test how well our classifier performs. Depending on the application at hand, training and test data may not be derived from the same source, but for these notebooks we work with the same data. We use Python’s sklearn library to build this classifier and to evaluate its performance. Don’t forget to check out our notebook to see what quantitative impact zoning has on permit cancellation.
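
The 75/25 workflow described above can be sketched with sklearn roughly as follows. This is an illustration under stated assumptions, not the notebook's code: the data is synthetic, the zoning categories and revocation rates are made up, and logistic regression is our choice of classifier since the post does not name one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (NOT the Miami permits): one zoning category per permit.
rng = np.random.default_rng(0)
zones = rng.choice(["residential", "commercial", "industrial"], size=400)

# Made-up revocation rates per zone, used only to generate labels.
rates = {"residential": 0.1, "commercial": 0.3, "industrial": 0.6}
revoked = np.array([rng.random() < rates[z] for z in zones]).astype(int)

# One-hot encode the categorical zoning feature.
X = OneHotEncoder().fit_transform(zones.reshape(-1, 1))

# 75% of the rows train the model; the remaining 25% are held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, revoked, test_size=0.25, random_state=0, stratify=revoked
)

# Assumed classifier: logistic regression (the post does not specify one).
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Evaluate on the held-out 25%.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print("Test accuracy:", accuracy)
print("Confusion matrix (rows = actual, columns = predicted):")
print(cm)
```

Accuracy alone can be misleading when revoked permits are rare, which is why the confusion matrix is printed alongside it: it shows how many revoked permits the classifier actually caught.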

A binary classifier can be built to answer several questions using your civic data. To name a couple, you can evaluate 

I invite you to explore this notebook using data from your local Hub. Download the notebook and add it to your ArcGIS Online organization to adapt it to your civic data. Also, feel free to share your thoughts and the results of your data classification experiments with us on our GeoNet discussion thread. I look forward to hearing your findings and feedback!

Link to notebook – Does zoning play a role in building permits being revoked? 

About the author

(Data, Maps, Python, Cities, Books) Nerd with ArcGIS Hub
