SCINet for geospatial research


By: Pat Clark and Rowan Gaffney

Use Cases


Machine Learning

Modeling

  • Process Based or Statistical Models

Time Series Analysis

  • Estimating Productivity
  • Land Use / Land Change

Geostatistics

  • Spatial Variance or Autocorrelation
  • Kriging / Interpolation

Processing Data

  • UAS DN/Radiance to Reflectance

When to Use SCINet?


Setting up analyses to run on SCINet involves non-trivial overhead, so you should first evaluate whether SCINet is an appropriate avenue for your research. Analyses that are well suited to SCINet are typically:

  • CPU-intensive workloads
  • High-memory workloads
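
One rough way to judge whether a workload counts as "high memory" is to estimate how much space the data would occupy if loaded uncompressed. The sketch below is only a back-of-the-envelope illustration; the function names and the example scene dimensions are hypothetical and not taken from SCINet documentation.

```python
def raster_bytes(rows, cols, bands, bytes_per_sample):
    """Bytes needed to hold an uncompressed raster fully in memory."""
    return rows * cols * bands * bytes_per_sample


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


# Hypothetical scene: 10,000 x 10,000 pixels, 4 bands, 32-bit floats.
scene = raster_bytes(10_000, 10_000, 4, 4)
print(f"{to_gib(scene):.2f} GiB")  # prints "1.49 GiB"
```

Intermediate arrays created during processing can multiply this figure several times, so comparing the estimate against the memory of a local workstation gives only a lower bound on what the analysis needs.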

Additional considerations are:

  • Are my analyses already optimized?
  • Will I need to parallelize my analyses (typical for CPU-intensive workloads)?
  • Will I require more than a single node of compute power (i.e., distributed computing)?
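
To make the parallelization question concrete, here is a minimal sketch of single-node parallelism using only the Python standard library: a grid is split into tiles and each tile is handed to a pool of worker processes. The tile layout, the placeholder computation in `process_tile`, and all function names are assumptions for illustration, not a SCINet-specific recipe.

```python
import concurrent.futures as cf
from itertools import product


def make_tiles(width, height, tile_size):
    """Split a width x height grid into (col_off, row_off, cols, rows) windows."""
    tiles = []
    for row_off, col_off in product(range(0, height, tile_size),
                                    range(0, width, tile_size)):
        cols = min(tile_size, width - col_off)   # clip the last column of tiles
        rows = min(tile_size, height - row_off)  # clip the last row of tiles
        tiles.append((col_off, row_off, cols, rows))
    return tiles


def process_tile(tile):
    """Stand-in for a CPU-heavy per-tile computation (hypothetical)."""
    col_off, row_off, cols, rows = tile
    # Placeholder work: sum a synthetic value over every cell in the tile.
    return sum((col_off + c) * (row_off + r)
               for r in range(rows) for c in range(cols))


def run_tiles(tiles, workers=4, executor_cls=cf.ProcessPoolExecutor):
    """Map process_tile over tiles with a worker pool; results keep tile order."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))
```

When using the default process pool from a script, call `run_tiles(make_tiles(...))` under an `if __name__ == "__main__":` guard, and set `workers` to no more than the number of cores available to the job. This pattern stays within one node; if a single node is not enough, a distributed framework such as Dask (available in the container listed under Other) is the more likely fit.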

Tools and Software


The following tools/software are currently available on SCINet. (See the Preinstalled Software List for a full list of currently available software.)

Geospatial Specific Software

Applicable General Software

  • H2O (3.2.0.3): Distributed in-memory machine learning platform with APIs in R and Python
  • Python (3.6.6): Interpreted, high-level, general-purpose programming language
  • R (3.5.2): Software environment for statistical computing and graphics
  • RStudio and RStudio Server: An integrated development environment (IDE) for R
  • JupyterLab: Web-based user interface for Project Jupyter

Other

  • SCINet Remote Sensing Container Image: Python+R geospatial libraries and JupyterLab IDE (R, IDL, and Python kernels).
    • User tutorial for JupyterLab + Dask Distributed, using:
      • container: /project/geospatial_tutorials/data_science_im_rs_latest.sif
      • sbatch script: /project/geospatial_tutorials/data_science_nb_dask.sbatch
    • Optionally, pull the container from Docker Hub to a local folder with:
      singularity pull docker://rowangaffney/data_science_im_rs