Use of Scripting in Geoprocessing - Presentation at VJTI Geospatial Workshop

December 1, 2018   

Use of Scripting in Geoprocessing

Tayyabali Sayyad,

Asst. Professor

Don Bosco Institute of Technology, Mumbai


Outline

  • GIS project workflow
  • Role of documentation in your project
  • Where Geoprocessing comes into the picture
  • Why learn scripting for GIS?
  • Popular scripting languages used in GIS
  • Integration of R and Python
  • Why you need to create custom tools
  • Python and R popular GIS packages
  • Examples of R scripting
  • References

GIS project workflow

  1. Determine the objectives of the project
  2. Build the database and prepare the data for analysis
  3. Perform the analysis
    • Determine methodology and sequence of operations
    • Process the data
    • Evaluate and interpret the results
    • Refine the analysis as needed and generate alternatives
  4. Present the results

Role of documentation in your project

  • Documentation is important at every step to record your methodology

  • It lets you easily duplicate your workflow and share your work with others


What is Geoprocessing?

  • Geoprocessing is any GIS operation used to manipulate data.
  • It takes an input dataset, performs an operation on that dataset, and returns the result of the operation as an output dataset
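
The input, operation, output pattern described above can be illustrated with a minimal sketch in plain Python. The dataset, the approximate coordinates, and the `clip_to_extent` function are hypothetical and exist only for illustration; a real project would normally use a GIS library instead.

```python
# Hypothetical input dataset: point features stored as plain dictionaries.
# Coordinates are approximate and only for illustration.
cities = [
    {"name": "Mumbai", "lon": 72.88, "lat": 19.08},
    {"name": "Pune", "lon": 73.86, "lat": 18.52},
    {"name": "Delhi", "lon": 77.21, "lat": 28.61},
]


def clip_to_extent(features, min_lon, min_lat, max_lon, max_lat):
    """Return a new dataset holding only the features inside the extent."""
    return [
        f for f in features
        if min_lon <= f["lon"] <= max_lon and min_lat <= f["lat"] <= max_lat
    ]


# Operation: clip to an illustrative extent. The input list is not modified;
# the result is returned as a separate output dataset.
clipped = clip_to_extent(cities, 72.0, 18.0, 75.0, 21.0)
print([f["name"] for f in clipped])
```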

Need for Geoprocessing

  • In a model, processes are connected to represent and execute a geoprocessing workflow.

  • Models can be saved, easily modified, and run as many times as needed to perform different analyses and test “what if” scenarios.

  • Like all GIS data, models should be documented so they can be shared with others.
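
As a rough sketch of the idea, a model can be written as a function that connects two processes and is then rerun with different parameters to test "what if" scenarios. The household and clinic coordinates, the function names, and the distance thresholds below are all assumptions made up for the example.

```python
import math

# Hypothetical inputs in projected coordinates (metres).
households = [(100, 200), (450, 380), (900, 950), (1600, 400)]
clinic = (500, 400)


def distances_to(points, target):
    """Process 1: straight-line distance from each point to the target."""
    return [math.hypot(x - target[0], y - target[1]) for x, y in points]


def select_within(distances, max_distance):
    """Process 2: keep only the distances within the threshold."""
    return [d for d in distances if d <= max_distance]


def run_model(points, target, max_distance):
    """Connect the processes: the output of one is the input of the next."""
    return len(select_within(distances_to(points, target), max_distance))


# "What if" scenarios: rerun the same model with different service radii.
scenarios = {radius: run_model(households, clinic, radius)
             for radius in (100, 500, 1000)}
print(scenarios)
```

Because the whole model is a single function, changing a scenario means changing one argument rather than repeating each step by hand.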


Where Geoprocessing comes into the picture

  • Preparation of data
    • Data conversion, error detection, extracting and creating new attributes
  • Analysis
    • Feature selection, overlay operations, regression analysis and image classification, interpolation, finding patterns
  • Documentation
    • Recording your method of data preparation and analysis
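
The data preparation step can be sketched with the Python standard library alone. The sample records, the column names, and the `prepare` function are hypothetical; the sketch converts text to typed values, detects two kinds of errors, and derives a new attribute (elevation in metres).

```python
import csv
import io

# Hypothetical raw input, as it might arrive from a field survey.
raw = """id,lat,lon,elev_ft
1,19.07,72.87,46
2,95.00,72.90,30
3,18.52,abc,1837
4,28.61,77.20,709
"""


def prepare(text):
    """Convert text to typed records, collect bad rows, derive elev_m."""
    clean, errors = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            # Data conversion: strings to numbers.
            lat, lon = float(row["lat"]), float(row["lon"])
            elev_ft = float(row["elev_ft"])
        except ValueError:
            errors.append((row["id"], "non-numeric value"))
            continue
        # Error detection: coordinates must be valid latitude/longitude.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            errors.append((row["id"], "coordinate out of range"))
            continue
        # New attribute: elevation converted from feet to metres.
        clean.append({"id": int(row["id"]), "lat": lat, "lon": lon,
                      "elev_m": round(elev_ft * 0.3048, 1)})
    return clean, errors


clean, errors = prepare(raw)
print(len(clean), "valid rows;", len(errors), "rejected rows")
```

Keeping the rejected rows together with a reason also serves the documentation step, since it records what was dropped and why.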

Why learn a scripting language rather than just using GIS software?

  • Complete control of the underlying algorithms, data, and execution
  • Automate specific, repetitive analysis tasks with minimal
  • Create a program that’s easy to share, so the analysis can be reproduced
  • Learn geospatial analysis beyond pushing buttons in software

What is a script?

  • Set of instructions in plain text, stored in a file and carried out by a software program

  • Not all scripting languages can be used to write scripts for geoprocessing

  • Popular scripting languages used in GIS: Python and R


Integration of R and Python

  • Extend the core functionality of the software and even create your custom tools
  • PyQGIS for QGIS
  • ArcGIS API for Python
  • R ArcGIS Bridge

You can create your own tools in QGIS

  • System tools are designed to perform one small but essential operation on geographic data

  • Using scripting, you can execute these tools in a sequence, feeding the output of one tool to the input of another

  • You can build your own library of tools that perform small but essential tasks for your organization
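
A minimal sketch of this idea in plain Python follows. It does not use the QGIS API; the three small "tools", the `run_sequence` helper, and the GPS fixes are hypothetical stand-ins that show how small single-purpose tools can be run in sequence.

```python
from functools import reduce


def round_coords(points, ndigits=3):
    """Tool 1: round coordinates to a fixed precision."""
    return [(round(x, ndigits), round(y, ndigits)) for x, y in points]


def drop_duplicates(points):
    """Tool 2: remove repeated points while keeping the original order."""
    return list(dict.fromkeys(points))


def sort_west_to_east(points):
    """Tool 3: sort points by longitude, then latitude."""
    return sorted(points)


def run_sequence(data, tools):
    """Feed the output of each tool into the input of the next one."""
    return reduce(lambda current, tool: tool(current), tools, data)


# Hypothetical (lon, lat) GPS fixes; the first two are nearly identical.
gps_fixes = [(72.87771, 19.07601), (72.87769, 19.07599), (72.8258, 18.9750)]
result = run_sequence(gps_fixes,
                      [round_coords, drop_duplicates, sort_west_to_east])
print(result)
```

Each tool stays small and reusable, and a different workflow is just a different list of tools.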


Let's dive into the world of scripting languages!


  • Python is an important language: easy to learn, with outstanding documentation and easy-to-pick-up syntax
  • Great support for data analysis and processing through NumPy, pandas, and other libraries
  • Python is incorporated into ArcGIS and into open source projects such as QGIS, GRASS GIS, gvSIG, and many others, which makes the language worth knowing.
  • Swiss army knife for GIS.

  • R is free, open source
  • Good support for interactive use
  • 8000+ packages for data modeling, geostatistics, visualization, machine learning, etc.
  • Integrated with QGIS, ArcGIS, GRASS
  • Large Community

Essential Python Geospatial Libraries


| Library | Use |
|---|---|
| geopandas | Extends the pandas data analysis library to spatial data |
| shapely | Manipulation and analysis of geometric objects in the Cartesian plane; built on GEOS (Geometry Engine, Open Source), the engine used by PostGIS |
| rtree | For efficiently querying spatial data |
| geographiclib | For solving geodesic problems and converting between geographic, UTM, UPS, MGRS, geocentric, and local Cartesian coordinates |
| pyshp | For reading and writing shapefiles (in pure Python) |
| pyproj | For conversions between projections |

| Library | Use |
|---|---|
| rasterio | A Pythonic way to work with geospatial rasters; based on GDAL, which handles both raster and vector data |
| fiona | Reads and writes geospatial data formats using standard Python types and protocols such as files, dictionaries, mappings, and iterators |
| ogr/gdal | For reading, writing, and transforming geospatial data formats, both vector and raster |
| pyqgis | QGIS Python API |
| geopy | Client for several popular geocoding web services |

| Library | Use |
|---|---|
| geojsonio.py | For shooting data to the web |
| h5py | Your Pythonic gateway to HDF5 files; access data like a NumPy array |
| pyModis | Download and preprocess MODIS data |
| pyspatial | Projection-aware querying of vector/raster data |

Data Analysis Libraries

| Library | Use |
|---|---|
| scipy | General scientific computing library. Has a spatial module |
| scikit-image | Algorithms for (satellite) image processing |
| scikit-learn | Machine learning for Python |
| statsmodels | For models and stats in Python |
| pysal | Spatial econometrics, exploratory spatial and spatio-temporal data analysis, spatial clustering (and more) |
| networkx | For working with networks |
| rasterstats | For analyzing rasters based on vector geometries (zonal statistics) |
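
To make the term "zonal statistics" concrete, here is a small sketch in plain Python. It does not use rasterstats: the zones are assumed to be already rasterized into a grid of labels with the same shape as the value raster, whereas rasterstats works from vector geometries. The elevation values and the `zonal_mean` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical 3x4 elevation raster and a zone grid of the same shape.
elevation = [
    [10, 12, 30, 32],
    [11, 13, 31, 33],
    [50, 52, 54, 56],
]
zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["C", "C", "C", "C"],
]


def zonal_mean(values, zone_ids):
    """Group raster cells by zone label and return the mean per zone."""
    cells = defaultdict(list)
    for value_row, zone_row in zip(values, zone_ids):
        for value, zone in zip(value_row, zone_row):
            cells[zone].append(value)
    return {zone: sum(v) / len(v) for zone, v in cells.items()}


stats = zonal_mean(elevation, zones)
print(stats)
```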

Plotting/Mapping

| Library | Use |
|---|---|
| matplotlib | For all my plotting needs |
| cartopy | Geospatial data processing to produce maps; built on PROJ.4, NumPy, Shapely, and matplotlib |
| folium | Visualizes Python data on Leaflet.js maps |

Essential R Geospatial packages


Classes for spatial data

| Library | Use |
|---|---|
| sp | Classes and methods for spatial data, plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. |
| sf | Support for simple features, a standardized way to encode spatial vector data. Binds to ‘GDAL’ for reading and writing data, to ‘GEOS’ for geometrical operations, and to ‘PROJ’ for projection conversions and datum transformations. |
| raster | Extension of spatial data classes to virtualise access to large rasters, permitting large objects to be analysed, and extending the analytical tools available for both raster and vector data |

Handling spatial data

| Library | Use |
|---|---|
| gdistance | Calculate distances and routes on geographic grids |
| geosphere | Spherical trigonometry for geographic applications. That is, compute distances and related measures for angular (longitude/latitude) locations |
| trip | Functions for accessing and manipulating spatial data for animal tracking, with straightforward coercion from and to other formats |
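
As an illustration of what "distances for angular (longitude/latitude) locations" means, the haversine formula for great-circle distance can be sketched as follows. The sketch is written in plain Python rather than R and does not call geosphere; the function name and the mean Earth radius of 6371 km are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical model)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the way around the equator on this spherical model.
quarter = haversine_km(0.0, 0.0, 90.0, 0.0)
print(round(quarter, 1), "km")
```

A sphere is only an approximation of the Earth; ellipsoidal methods give more accurate results over long distances.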

Reading and writing spatial data

| Library | Use |
|---|---|
| rgdal | Provides bindings to GDAL-supported raster formats and OGR-supported vector formats. It contains functions to write raster files in supported formats. |

Reading and writing spatial data

| Library | Use |
|---|---|
| maps | Display of maps. Projection code and larger maps are in separate packages (‘mapproj’ and ‘mapdata’) |

Visualisation

| Library | Use |
|---|---|
| RColorBrewer | Provides color schemes for maps http://colorbrewer2.org/ |
| leafletR | Interactive web maps using the open-source JavaScript library Leaflet |
| plotGoogleMaps | Provides an interactive plot device for handling geographic data in web browsers, designed for the automatic creation of web maps |
| quickmapr | Allows for basic zooming, panning, identifying, labeling, selecting, and measuring of spatial objects |

Geostatistics

| Library | Use |
|---|---|
| gstat | Spatial and spatio-temporal geostatistical modelling, prediction and simulation; variogram modelling; simple, ordinary and universal point or block kriging; spatio-temporal kriging; sequential Gaussian or indicator simulation; variogram and variogram map plotting utility functions. |

Summary of the important packages in R

  • sp : Provides classes and methods for spatial data
  • sf : A newer, more advanced set of classes for geoprocessing
  • rgdal : Importing and exporting geospatial data formats
  • rgeos : For topology operations
  • raster : Working with raster data
  • tmap : Working with thematic maps
  • ggplot2 : Data visualization
  • ggmap : For adding base maps from Google Maps and OpenStreetMap
  • leaflet : Interactive maps in R
  • spatstat : Spatial point pattern analysis
  • gstat : Geostatistical modeling
GIS Software vs R and Python Scripting

| Attribute | Desktop GIS (GUI) | R / Python |
|---|---|---|
| Home disciplines | Geography | Computing, Statistics |
| Software focus | Graphical User Interface | Command line |
| Reproducibility | Minimal | Maximal |

Reproducibility: a process in which the same results can be generated by others using publicly accessible code
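
One small ingredient of reproducibility in a script is making any random step repeatable. The sketch below assumes a hypothetical list of survey site IDs and a hypothetical `sample_sites` function; fixing the seed means anyone who runs the same code gets the same sample.

```python
import random


def sample_sites(site_ids, k, seed):
    """Draw k sites at random; the same seed always gives the same draw."""
    rng = random.Random(seed)  # local generator, independent of global state
    return rng.sample(site_ids, k)


sites = list(range(1, 101))  # hypothetical site IDs 1..100

first_run = sample_sites(sites, 5, seed=2018)
second_run = sample_sites(sites, 5, seed=2018)
print(first_run == second_run)  # True
```

Sharing the script together with the seed and the input data lets others regenerate the result rather than take it on trust.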


References



