Optimizing Geographic Processing and Analysis for Big Data; SciPy 2013 Presentation

Summary

Authors: Brittain, Carissa, GeoDecisions; Gleason, Jason, GeoDecisions

Track: GIS - Geospatial Data Analysis

Performance becomes increasingly important as datasets grow. Many factors, some outside a developer's control, can seriously degrade performance, sometimes to the point that a processing script or database becomes unusable. The example discussed here is an arcpy geoprocessing script that required more than 26 hours to process and load 24 hours of wind velocity data from across the United States. Applying basic optimization strategies to the script reduced that processing time to under an hour. Benchmark tests and database inspection as each strategy was applied showed the effect of each change and made it possible to calculate its contribution to the final performance. Understanding and applying even basic optimization methods can yield a large return on effort and a significant reduction in processing time when working with large datasets.
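
As a rough illustration of the benchmarking approach described above, the following Python sketch times a run and then attributes the overall saving to each successive change. It is not the presenters' code: the helper names `time_call` and `impact_of_changes` are hypothetical, and the timings in the usage lines are placeholder values, not measurements from the presentation.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def impact_of_changes(runs):
    """Attribute the total time saved to each successive change.

    runs is an ordered list of (label, seconds). The first entry is the
    unoptimized baseline; each later entry is the run time measured after
    one more change was applied on top of the previous ones.
    Returns a list of (label, seconds_saved, share_of_total_saving).
    """
    baseline = runs[0][1]
    total_saved = baseline - runs[-1][1]
    report = []
    previous = baseline
    for label, seconds in runs[1:]:
        saved = previous - seconds
        share = saved / total_saved if total_saved else 0.0
        report.append((label, saved, share))
        previous = seconds
    return report


# Placeholder timings in seconds (hypothetical, for illustration only).
runs = [("baseline", 100.0), ("change A", 60.0),
        ("change B", 30.0), ("change C", 10.0)]
for label, saved, share in impact_of_changes(runs):
    print(f"{label}: saved {saved:.1f} s ({share:.0%} of total saving)")
```

In practice each timing would come from wrapping a full processing run in `time_call`. Because the changes are applied cumulatively, the share credited to each one depends on the order in which they were introduced.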