Scaling and changing the data type is an effective way to reduce the overall size of your data. A common data type is 64-bit floating point, which has a level of precision that is often far greater than the precision of the data itself. Consider reflectance data ranging from 0.0 to 1.0 in floating-point representation. Scaling by 10,000 and converting to int16 (16 bits) preserves four decimal places of precision and reduces the size by a factor of 4.
Consider the example below:
import numpy as np
array1 = np.random.random((1000,10000)).astype(np.float64)
print('This array, in '+str(array1.dtype)+', is '+str(array1.nbytes*1e-6)+' MB')
array2 = np.round(array1*10000.,0).astype(np.int16)
print('Scaling and converting the array to '+str(array2.dtype)+' results in a size of '+str(array2.nbytes*1e-6)+' MB')
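To check how much precision survives the conversion, the round trip can be measured directly. The sketch below is only an illustration: it uses the standard library's array module (typecode 'd' for 64-bit floats, 'h' for 16-bit signed integers) in place of NumPy, and the reflectance values are randomly generated stand-ins rather than real data.

```python
import random
from array import array

SCALE = 10_000  # same scale factor as in the example above

# Stand-in reflectance values in [0.0, 1.0); not real data.
random.seed(0)
values = array("d", (random.random() for _ in range(100_000)))  # 64-bit floats

# Scale, round to the nearest integer, and store as 16-bit signed integers.
scaled = array("h", (round(v * SCALE) for v in values))

# Undo the scaling and measure the worst-case round-trip error.
restored = [s / SCALE for s in scaled]
max_error = max(abs(a - b) for a, b in zip(values, restored))

# Ratio of the in-memory sizes of the two representations.
size_ratio = (len(values) * values.itemsize) / (len(scaled) * scaled.itemsize)

print("size reduction factor:", size_ratio)
print("largest round-trip error:", max_error)
print("largest stored integer:", max(scaled), "(int16 limit is 32767)")
```

The round-trip error cannot exceed half a scaled unit, 0.5 / 10,000 = 0.00005, and values up to 1.0 stay well inside the int16 range. A larger scale factor should be checked against that limit first: values near 1.0 scaled by 100,000 would no longer fit in int16.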
Most raster formats have internal lossless compression options. Depending on the nature of your data, this can reduce the size substantially. For detailed specifications of raster formats, see the GDAL specifications. Other common file formats, such as Zarr (see Zarr Compression) and NetCDF (see NetCDF Compression), have internal compression options as well.
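How much lossless compression helps depends strongly on the data. The sketch below illustrates this with zlib from the standard library, which implements DEFLATE (one of the lossless codecs commonly offered by raster formats). The two arrays are made-up stand-ins, and the ratios it prints are for these toy inputs only, not a prediction for any real raster.

```python
import random
import zlib
from array import array


def compression_ratio(values):
    """Return raw size / compressed size for an array of int16 values."""
    raw = values.tobytes()
    packed = zlib.compress(raw, 6)
    # Lossless: decompressing must give back exactly the original bytes.
    assert zlib.decompress(packed) == raw
    return len(raw) / len(packed)


random.seed(0)
n = 200_000

# Stand-in for a raster with large homogeneous areas (e.g. a nodata region).
uniform = array("h", [1234] * n)
# Stand-in for a noisy raster where neighbouring values are unrelated.
noisy = array("h", (random.randint(0, 10_000) for _ in range(n)))

uniform_ratio = compression_ratio(uniform)
noisy_ratio = compression_ratio(noisy)
print("uniform data ratio:", round(uniform_ratio, 1))
print("noisy data ratio:", round(noisy_ratio, 2))
```

Homogeneous data shrinks dramatically, while noisy data barely compresses at all, so it is worth testing compression on a representative sample of your own data before committing to a format or codec.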
A common issue in geospatial analysis is working with data in different projections. A typical workflow is to reproject the data into a common projection, which duplicates the data on disk. An alternative is to use Virtual Raster Files (.VRT), which are simple XML files that describe how the data should be transformed when opened.
In addition to reprojecting, Virtual Raster Files can be used to mosaic, alter resolutions, resample, etc. An efficient tool for building VRT files is gdalbuildvrt.
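As a rough illustration of a mosaic built with gdalbuildvrt, the sketch below assembles the command line in Python. The helper name and the file names are hypothetical placeholders; -resolution and -r are existing gdalbuildvrt options, but the values chosen here are only examples. The sketch prints the command rather than running it, since running it requires GDAL and real input files.

```python
import shlex


def gdalbuildvrt_command(output_vrt, input_files, resolution=None, resampling=None):
    """Build a gdalbuildvrt argument list (hypothetical helper, does not run GDAL)."""
    cmd = ["gdalbuildvrt"]
    if resolution is not None:
        # How to pick the output resolution when inputs differ,
        # e.g. "highest", "lowest" or "average".
        cmd += ["-resolution", resolution]
    if resampling is not None:
        # Resampling algorithm, e.g. "nearest" or "bilinear".
        cmd += ["-r", resampling]
    cmd.append(output_vrt)      # the VRT file to create
    cmd.extend(input_files)     # the rasters to mosaic
    return cmd


# Placeholder file names; replace them with real rasters.
cmd = gdalbuildvrt_command(
    "mosaic.vrt",
    ["tile_1.tif", "tile_2.tif"],
    resolution="highest",
    resampling="bilinear",
)
print(shlex.join(cmd))
```

On a system with GDAL installed, the list could be passed to subprocess.run(cmd, check=True). The resulting .vrt file only references the input tiles, so no pixel data is duplicated.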
Interface
Packages / Resources
Cluster
Container
Below are the steps/commands to set up the Python/JupyterLab environment, followed by a soundless video of the process on Ceres. Note that you will need to already have a SCINet account. Please visit the SCINet website for detailed instructions on setting up an account.
Access JupyterHub
Currently, to access JupyterHub, you need to port forward the application to your local system. In the future there will be a public URL, and you will no longer need to do this step. To port forward JupyterHub, run the following command in PowerShell (Windows) or a terminal (Linux). Note that you will need to replace USER.NAME with your SCINet user name.
ssh -N -L 8000:jupyterhub.scinet.local:80 USER.NAME@login.scinet.science
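The pieces of this command can be made explicit with a small sketch. In ssh, -N means no remote command is run (the connection is used only for forwarding), and -L takes local_port:remote_host:remote_port. The helper name and the user name jane.doe below are hypothetical; the host names are the ones from the command above.

```python
import shlex


def port_forward_command(user_name, local_port=8000):
    """Build the ssh port-forward command for JupyterHub (hypothetical helper)."""
    remote = "jupyterhub.scinet.local:80"  # remote host and port from the command above
    return [
        "ssh",
        "-N",                               # do not run a remote command
        "-L", f"{local_port}:{remote}",     # forward local_port to the remote host
        f"{user_name}@login.scinet.science",
    ]


# "jane.doe" is a made-up user name; use your own SCINet user name.
cmd = port_forward_command("jane.doe", local_port=8001)
print(shlex.join(cmd))
```

If port 8000 is already in use on your machine, another local port can be chosen as shown; the address to open in the browser then changes to match (localhost:8001 in this example).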
Open in Browser
Open a web browser (Firefox, Chrome, Edge, etc.) and go to localhost:8000
Spawn JupyterLab
Once logged into JupyterHub, you are given a set of options when launching JupyterLab. Below are brief descriptions of each option, followed by the value to use for this tutorial. If this is the first time spawning a notebook from a container on Docker or Singularity Hub (as in this example), it will take 4-10 minutes to download and build the container. The container is then cached in your home directory, so on subsequent tries, JupyterLab should spawn in 10-30 seconds.
Node Type: Ceres partition to use when running JupyterLab
short OR brief-low (either works)
Number of Cores: Number of Cores to allocate
4
Job Duration: Length of Job (HH:MM:SS)
00:30:00
Additional Slurm Options: Sbatch Options
(leave blank)
Notebook/Lab Options: Additional JupyterLab or Jupyter Notebook options
--notebook-dir=/project/geospatial_tutorials/
Enter the full path to the container image: Location of the container to use
docker://rowangaffney/data_science_im_rs:latest
Container Exec Options: Additional options for the singularity exec command
--bind /etc/munge --bind /var/log/munge --bind /var/run/munge --bind /usr/bin/squeue --bind /usr/bin/sinfo --bind /usr/bin/scancel --bind /usr/bin/sbatch --bind /usr/bin/scontrol --bind /scinet01/gov/usda/ars/scinet/system/slurm:/etc/slurm --bind /run/munge --bind /usr/lib64 --bind /scinet01 --bind $HOME --bind /software/7/apps/envi -H $HOME:/home/jovyan
Please be cognizant of the compute resources you are requesting (see best practices below).
Best Practices
Note that this data_science_im_rs container has two main environments, which include the following packages (as of March 18, 2020):
Name | Version | Build | Channel |
---|---|---|---|
_libgcc_mutex | 0.1 | conda_forge | conda-forge |
_openmp_mutex | 4.5 | 1_llvm | conda-forge |
_py-xgboost-mutex | 2.0 | cpu_0 | conda-forge |
_r-mutex | 1.0.1 | anacondar_1 | conda-forge |
affine | 2.3.0 | py_0 | conda-forge |
aiohttp | 3.6.2 | py36h516909a_0 | conda-forge |
appdirs | 1.4.3 | py_1 | conda-forge |
arrow-cpp | 0.15.1 | py36had5782a_4 | conda-forge |
asciitree | 0.3.3 | py_2 | conda-forge |
async-timeout | 3.0.1 | py_1000 | conda-forge |
attrs | 19.3.0 | py_0 | conda-forge |
backcall | 0.1.0 | py_0 | conda-forge |
beautifulsoup4 | 4.8.2 | py36_0 | conda-forge |
binutils_impl_linux-64 | 2.33.1 | h53a641e_8 | conda-forge |
binutils_linux-64 | 2.33.1 | h9595d00_16 | conda-forge |
bleach | 3.1.1 | py_0 | conda-forge |
blinker | 1.4 | py_1 | conda-forge |
bokeh | 1.4.0 | py36_0 | conda-forge |
boost-cpp | 1.70.0 | h8e57a91_2 | conda-forge |
boto3 | 1.12.12 | py_0 | conda-forge |
botocore | 1.15.12 | py_0 | conda-forge |
bottleneck | 1.3.2 | py36hc1659b7_0 | conda-forge |
brotli | 1.0.7 | he1b5a44_1000 | conda-forge |
bwidget | 1.9.14 | 0 | conda-forge |
bzip2 | 1.0.8 | h516909a_2 | conda-forge |
c-ares | 1.15.0 | h516909a_1001 | conda-forge |
ca-certificates | 2019.11.28 | hecc5488_0 | conda-forge |
cachetools | 3.1.1 | py_0 | conda-forge |
cairo | 1.16.0 | hfb77d84_1002 | conda-forge |
cartopy | 0.17.0 | py36h39d8c00_1011 | conda-forge |
certifi | 2019.11.28 | py36_0 | conda-forge |
cffi | 1.13.2 | py36h8022711_0 | conda-forge |
cfitsio | 3.470 | hb60a0a2_2 | conda-forge |
cftime | 1.0.4.2 | py36hc1659b7_0 | conda-forge |
chardet | 3.0.4 | py36_1003 | conda-forge |
click | 7.0 | py_0 | conda-forge |
click-plugins | 1.1.1 | py_0 | conda-forge |
cligj | 0.5.0 | py_0 | conda-forge |
cloudpickle | 1.3.0 | py_0 | conda-forge |
colorcet | 2.0.1 | py_0 | conda-forge |
cryptography | 2.8 | py36h72c5cf5_1 | conda-forge |
curl | 7.68.0 | hf8cf82a_0 | conda-forge |
cycler | 0.10.0 | py_2 | conda-forge |
cytoolz | 0.10.1 | py36h516909a_0 | conda-forge |
dask | 2.11.0 | py_0 | conda-forge |
dask-core | 2.11.0 | py_0 | conda-forge |
dask-glm | 0.2.0 | py_1 | conda-forge |
dask-jobqueue | 0.7.0 | py_0 | conda-forge |
dask-labextension | 1.1.0 | py_0 | conda-forge |
dask-ml | 1.2.0 | py_0 | conda-forge |
dask-xgboost | 0.1.10 | py_0 | conda-forge |
datashader | 0.10.0 | py_0 | conda-forge |
datashape | 0.5.4 | py_1 | conda-forge |
dbus | 1.13.6 | he372182_0 | conda-forge |
decorator | 4.4.2 | py_0 | conda-forge |
defusedxml | 0.6.0 | py_0 | conda-forge |
distributed | 2.11.0 | py36_0 | conda-forge |
docopt | 0.6.2 | py_1 | conda-forge |
docutils | 0.15.2 | py36_0 | conda-forge |
double-conversion | 3.1.5 | he1b5a44_2 | conda-forge |
earthengine-api | 0.1.213 | py_0 | conda-forge |
entrypoints | 0.3 | py36_1000 | conda-forge |
et_xmlfile | 1.0.1 | py_1001 | conda-forge |
expat | 2.2.9 | he1b5a44_2 | conda-forge |
fasteners | 0.14.1 | py_3 | conda-forge |
fastparquet | 0.3.3 | py36hc1659b7_0 | conda-forge |
fiona | 1.8.13 | py36h900e953_0 | conda-forge |
fontconfig | 2.13.1 | h86ecdb6_1001 | conda-forge |
freetype | 2.10.0 | he983fc9_1 | conda-forge |
freexl | 1.0.5 | h14c3975_1002 | conda-forge |
fribidi | 1.0.5 | h516909a_1002 | conda-forge |
fsspec | 0.6.2 | py_0 | conda-forge |
future | 0.18.2 | py36_0 | conda-forge |
gcc_impl_linux-64 | 7.3.0 | hd420e75_5 | conda-forge |
gcc_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
gcsfs | 0.6.0 | py_0 | conda-forge |
gdal | 3.0.4 | py36hbb6b9fb_1 | conda-forge |
geopandas | 0.7.0 | py_1 | conda-forge |
geos | 3.8.0 | he1b5a44_0 | conda-forge |
geotiff | 1.5.1 | hcbe54f9_9 | conda-forge |
geoviews | 1.6.6 | py_1 | conda-forge |
geoviews-core | 1.6.6 | py_1 | conda-forge |
gettext | 0.19.8.1 | hc5be6a0_1002 | conda-forge |
gflags | 2.2.2 | he1b5a44_1002 | conda-forge |
gfortran_impl_linux-64 | 7.3.0 | hdf63c60_5 | conda-forge |
gfortran_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
giflib | 5.2.1 | h516909a_2 | conda-forge |
git | 2.25.0 | pl526hce37bd2_0 | conda-forge |
glib | 2.58.3 | py36h6f030ca_1002 | conda-forge |
glog | 0.4.0 | he1b5a44_1 | conda-forge |
google-api-core | 1.16.0 | py36_1 | conda-forge |
google-api-python-client | 1.7.11 | py_0 | conda-forge |
google-auth | 1.11.2 | py_0 | conda-forge |
google-auth-httplib2 | 0.0.3 | py_3 | conda-forge |
google-auth-oauthlib | 0.4.1 | py_2 | conda-forge |
google-cloud-core | 1.3.0 | py_0 | conda-forge |
google-cloud-storage | 1.26.0 | py_0 | conda-forge |
google-resumable-media | 0.5.0 | py_1 | conda-forge |
googleapis-common-protos | 1.51.0 | py36_1 | conda-forge |
graphite2 | 1.3.13 | hf484d3e_1000 | conda-forge |
graphviz | 2.42.3 | h0511662_0 | conda-forge |
grpc-cpp | 1.25.0 | h5321d42_1 | conda-forge |
gsl | 2.6 | h294904e_0 | conda-forge |
gst-plugins-base | 1.14.5 | h0935bb2_2 | conda-forge |
gstreamer | 1.14.5 | h36ae1b5_2 | conda-forge |
gxx_impl_linux-64 | 7.3.0 | hdf63c60_5 | conda-forge |
gxx_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
h5netcdf | 0.8.0 | py_0 | conda-forge |
h5py | 2.10.0 | nompi_py36h513d04c_102 | conda-forge |
harfbuzz | 2.4.0 | h9f30f68_3 | conda-forge |
hdf4 | 4.2.13 | hf30be14_1003 | conda-forge |
hdf5 | 1.10.5 | nompi_h3c11f04_1104 | conda-forge |
heapdict | 1.0.1 | py_0 | conda-forge |
holoviews | 1.12.7 | py_0 | conda-forge |
httplib2 | 0.17.0 | py36_0 | conda-forge |
hvplot | 0.5.2 | py_0 | conda-forge |
icu | 64.2 | he1b5a44_1 | conda-forge |
idna | 2.9 | py_1 | conda-forge |
idna_ssl | 1.1.0 | py36_1000 | conda-forge |
imageio | 2.8.0 | py_0 | conda-forge |
importlib_metadata | 1.5.0 | py36_0 | conda-forge |
intake | 0.5.4 | py_0 | conda-forge |
intake-esm | 2019.12.13 | py_0 | conda-forge |
intake-parquet | 0.2.3 | py_0 | conda-forge |
intake-sql | 0.2.0 | py_0 | conda-forge |
intake-stac | 0.2.3 | py_0 | conda-forge |
intake-xarray | 0.3.1 | py_0 | conda-forge |
ipykernel | 5.1.4 | py36h5ca1d4c_0 | conda-forge |
ipython | 7.13.0 | py36h5ca1d4c_0 | conda-forge |
ipython_genutils | 0.2.0 | py_1 | conda-forge |
ipywidgets | 7.5.1 | py_0 | conda-forge |
jdcal | 1.4.1 | py_0 | conda-forge |
jedi | 0.16.0 | py36_0 | conda-forge |
jinja2 | 2.11.1 | py_0 | conda-forge |
jmespath | 0.9.4 | py_0 | conda-forge |
joblib | 0.14.1 | py_0 | conda-forge |
jpeg | 9c | h14c3975_1001 | conda-forge |
json-c | 0.13.1 | h14c3975_1001 | conda-forge |
jsonschema | 3.2.0 | py36_0 | conda-forge |
jupyter | 1.0.0 | py_2 | conda-forge |
jupyter-server-proxy | 1.2.0 | py_0 | conda-forge |
jupyter_client | 6.0.0 | py_0 | conda-forge |
jupyter_console | 6.0.0 | py_0 | conda-forge |
jupyter_core | 4.6.3 | py36_0 | conda-forge |
kealib | 1.4.10 | h58c409b_1005 | conda-forge |
kiwisolver | 1.1.0 | py36hc9558a2_0 | conda-forge |
krb5 | 1.16.4 | h2fd8d38_0 | conda-forge |
ld_impl_linux-64 | 2.33.1 | h53a641e_8 | conda-forge |
libblas | 3.8.0 | 15_openblas | conda-forge |
libcblas | 3.8.0 | 15_openblas | conda-forge |
libclang | 9.0.1 | default_hde54327_0 | conda-forge |
libcurl | 7.68.0 | hda55be3_0 | conda-forge |
libdap4 | 3.20.4 | hd3bb157_0 | conda-forge |
libedit | 3.1.20170329 | hf8c457e_1001 | conda-forge |
libevent | 2.1.10 | h72c5cf5_0 | conda-forge |
libffi | 3.2.1 | he1b5a44_1006 | conda-forge |
libgcc-ng | 9.2.0 | h24d8f2e_2 | conda-forge |
libgdal | 3.0.4 | h022d3c0_1 | conda-forge |
libgfortran-ng | 7.3.0 | hdf63c60_5 | conda-forge |
libgomp | 9.2.0 | h24d8f2e_2 | conda-forge |
libiconv | 1.15 | h516909a_1005 | conda-forge |
libkml | 1.3.0 | h4fcabce_1010 | conda-forge |
liblapack | 3.8.0 | 15_openblas | conda-forge |
libllvm8 | 8.0.1 | hc9558a2_0 | conda-forge |
libllvm9 | 9.0.1 | hc9558a2_0 | conda-forge |
libnetcdf | 4.7.3 | nompi_h9f9fd6a_101 | conda-forge |
libopenblas | 0.3.8 | h5ec1e0e_0 | conda-forge |
libpng | 1.6.37 | hed695b0_0 | conda-forge |
libpq | 12.2 | hae5116b_0 | conda-forge |
libprotobuf | 3.10.1 | h8b12597_0 | conda-forge |
libsodium | 1.0.17 | h516909a_0 | conda-forge |
libspatialindex | 1.9.3 | he1b5a44_3 | conda-forge |
libspatialite | 4.3.0a | hd318ce7_1035 | conda-forge |
libssh2 | 1.8.2 | h22169c7_2 | conda-forge |
libstdcxx-ng | 9.2.0 | hdf63c60_2 | conda-forge |
libtiff | 4.1.0 | hc3755c2_3 | conda-forge |
libtool | 2.4.6 | h14c3975_1002 | conda-forge |
libuuid | 2.32.1 | h14c3975_1000 | conda-forge |
libwebp | 1.0.2 | h56121f0_5 | conda-forge |
libxcb | 1.13 | h14c3975_1002 | conda-forge |
libxgboost | 0.90 | he1b5a44_4 | conda-forge |
libxkbcommon | 0.10.0 | he1b5a44_0 | conda-forge |
libxml2 | 2.9.10 | hee79883_0 | conda-forge |
libxslt | 1.1.33 | h31b3aaa_0 | conda-forge |
llvm-openmp | 9.0.1 | hc9558a2_2 | conda-forge |
llvmlite | 0.31.0 | py36h8b12597_0 | conda-forge |
locket | 0.2.0 | py_2 | conda-forge |
lxml | 4.5.0 | py36h7ec2d77_0 | conda-forge |
lz4-c | 1.8.3 | he1b5a44_1001 | conda-forge |
make | 4.3 | h516909a_0 | conda-forge |
markdown | 3.2.1 | py_0 | conda-forge |
markupsafe | 1.1.1 | py36h516909a_0 | conda-forge |
matplotlib-base | 3.1.3 | py36h250f245_0 | conda-forge |
mechanicalsoup | 0.12.0 | py_0 | conda-forge |
metpy | 0.12.0 | py_0 | conda-forge |
mistune | 0.8.4 | py36h516909a_1000 | conda-forge |
monotonic | 1.5 | py_0 | conda-forge |
mpi | 1.0 | openmpi | conda-forge |
mpi4py | 3.0.3 | py36h0299808_0 | conda-forge |
msgpack-numpy | 0.4.4.3 | py_0 | conda-forge |
msgpack-python | 1.0.0 | py36hc9558a2_0 | conda-forge |
multidict | 4.7.5 | py36h516909a_0 | conda-forge |
multipledispatch | 0.6.0 | py_0 | conda-forge |
munch | 2.5.0 | py_0 | conda-forge |
nbconvert | 5.6.1 | py36_0 | conda-forge |
nbformat | 5.0.4 | py_0 | conda-forge |
ncurses | 6.1 | hf484d3e_1002 | conda-forge |
netcdf4 | 1.5.3 | nompi_py36hd35fb8e_102 | conda-forge |
networkx | 2.4 | py_0 | conda-forge |
notebook | 6.0.3 | py36_0 | conda-forge |
nspr | 4.25 | he1b5a44_0 | conda-forge |
nss | 3.47 | he751ad9_0 | conda-forge |
numba | 0.48.0 | py36hb3f55d8_0 | conda-forge |
numcodecs | 0.6.4 | py36he1b5a44_0 | conda-forge |
numpy | 1.18.1 | py36h95a1406_0 | conda-forge |
oauth2client | 4.1.3 | py_0 | conda-forge |
oauthlib | 3.0.1 | py_0 | conda-forge |
olefile | 0.46 | py_0 | conda-forge |
openjpeg | 2.3.1 | h981e76c_3 | conda-forge |
openmpi | 4.0.2 | hdf1f1ad_3 | conda-forge |
openpyxl | 3.0.3 | py_0 | conda-forge |
openssl | 1.1.1d | h516909a_0 | conda-forge |
owslib | 0.19.1 | py_0 | conda-forge |
packaging | 20.1 | py_0 | conda-forge |
pandas | 1.0.1 | py36hb3f55d8_0 | conda-forge |
pandoc | 2.9.2 | 0 | conda-forge |
pandocfilters | 1.4.2 | py_1 | conda-forge |
panel | 0.8.0 | 0 | conda-forge |
pango | 1.42.4 | ha030887_1 | conda-forge |
param | 1.9.3 | py_0 | conda-forge |
parquet-cpp | 1.5.1 | 2 | conda-forge |
parso | 0.6.2 | py_0 | conda-forge |
partd | 1.1.0 | py_0 | conda-forge |
pcre | 8.44 | he1b5a44_0 | conda-forge |
perl | 5.26.2 | h516909a_1006 | conda-forge |
pexpect | 4.8.0 | py36_0 | conda-forge |
phantomjs | 2.1.1 | 1 | conda-forge |
pickleshare | 0.7.5 | py36_1000 | conda-forge |
pillow | 7.0.0 | py36hefe7db6_0 | conda-forge |
pint | 0.11 | py_1 | conda-forge |
pip | 20.0.2 | py_2 | conda-forge |
pixman | 0.38.0 | h516909a_1003 | conda-forge |
pooch | 1.0.0 | py_0 | conda-forge |
poppler | 0.67.0 | h14e79db_8 | conda-forge |
poppler-data | 0.4.9 | 1 | conda-forge |
postgresql | 12.2 | hf1211e9_0 | conda-forge |
proj | 6.3.1 | hc80f0dc_1 | conda-forge |
prometheus_client | 0.7.1 | py_0 | conda-forge |
prompt_toolkit | 2.0.10 | py_0 | conda-forge |
protobuf | 3.4.1 | py36_0 | conda-forge |
psutil | 5.7.0 | py36h516909a_0 | conda-forge |
psycopg2 | 2.8.4 | py36h72c5cf5_1 | conda-forge |
pthread-stubs | 0.4 | h14c3975_1001 | conda-forge |
ptyprocess | 0.6.0 | py_1001 | conda-forge |
py-xgboost | 0.90 | py36_4 | conda-forge |
pyarrow | 0.15.1 | py36h8b68381_1 | conda-forge |
pyasn1 | 0.4.8 | py_0 | conda-forge |
pyasn1-modules | 0.2.7 | py_0 | conda-forge |
pycparser | 2.19 | py_2 | conda-forge |
pyct | 0.4.6 | py_0 | conda-forge |
pyct-core | 0.4.6 | py_0 | conda-forge |
pydap | 3.2.2 | py36_1000 | conda-forge |
pydrive | 1.3.1 | py_1 | conda-forge |
pyepsg | 0.4.0 | py_0 | conda-forge |
pygments | 2.5.2 | py_0 | conda-forge |
pyhdf | 0.10.2 | py36h3a4e923_0 | conda-forge |
pyjwt | 1.7.1 | py_0 | conda-forge |
pykdtree | 1.3.1 | py36hc1659b7_1002 | conda-forge |
pyopenssl | 19.1.0 | py_1 | conda-forge |
pyparsing | 2.4.6 | py_0 | conda-forge |
pyproj | 2.5.0 | py36he3cd046_1 | conda-forge |
pyqt | 5.12.3 | py36hcca6a23_1 | conda-forge |
pyqt5-sip | 4.19.18 | pypi_0 | pypi |
pyqtwebengine | 5.12.1 | pypi_0 | pypi |
pyrsistent | 0.15.7 | py36h516909a_0 | conda-forge |
pysal | 1.14.4 | py36_0 | conda-forge |
pyshp | 2.1.0 | py_0 | conda-forge |
pysocks | 1.7.1 | py36_0 | conda-forge |
python | 3.6.7 | h357f687_1006 | conda-forge |
python-dateutil | 2.7.5 | py_0 | conda-forge |
python-graphviz | 0.13.2 | py_0 | conda-forge |
python-snappy | 0.5.4 | py36hee44bf9_1 | conda-forge |
pytz | 2019.3 | py_0 | conda-forge |
pyviz_comms | 0.7.3 | py_0 | conda-forge |
pywavelets | 1.1.1 | py36hc1659b7_0 | conda-forge |
pyyaml | 5.3 | py36h516909a_0 | conda-forge |
pyzmq | 19.0.0 | py36h1768529_0 | conda-forge |
qt | 5.12.5 | hd8c4c69_1 | conda-forge |
qtconsole | 4.7.1 | py_0 | conda-forge |
qtpy | 1.9.0 | py_0 | conda-forge |
r | 3.6 | r36_1003 | conda-forge |
r-base | 3.6.2 | h7ed4ef7_1 | conda-forge |
r-boot | 1.3_24 | r36h6115d3f_0 | conda-forge |
r-class | 7.3_15 | r36hcdcec82_1001 | conda-forge |
r-cluster | 2.1.0 | r36h9bbef5b_2 | conda-forge |
r-codetools | 0.2_16 | r36h6115d3f_1001 | conda-forge |
r-foreign | 0.8_76 | r36hcdcec82_0 | conda-forge |
r-kernsmooth | 2.23_16 | r36hfa343cc_1 | conda-forge |
r-lattice | 0.20_40 | r36hcdcec82_0 | conda-forge |
r-mass | 7.3_51.5 | r36hcdcec82_0 | conda-forge |
r-matrix | 1.2_18 | r36h7fa42b6_2 | conda-forge |
r-mgcv | 1.8_31 | r36h7fa42b6_0 | conda-forge |
r-nlme | 3.1_144 | r36h9bbef5b_0 | conda-forge |
r-nnet | 7.3_13 | r36hcdcec82_0 | conda-forge |
r-recommended | 3.6 | r36_1003 | conda-forge |
r-rpart | 4.1_15 | r36hcdcec82_1 | conda-forge |
r-spatial | 7.3_11 | r36hcdcec82_1003 | conda-forge |
r-survival | 3.1_8 | r36hcdcec82_0 | conda-forge |
rasterio | 1.1.3 | py36h900e953_0 | conda-forge |
rasterstats | 0.14.0 | py_0 | conda-forge |
re2 | 2020.03.03 | he1b5a44_0 | conda-forge |
readline | 8.0 | hf8c457e_0 | conda-forge |
requests | 2.23.0 | py36_0 | conda-forge |
requests-oauthlib | 1.2.0 | py_0 | conda-forge |
rioxarray | 0.0.21 | py_0 | conda-forge |
rpy2 | 3.1.0 | py36r36hc1659b7_3 | conda-forge |
rsa | 4.0 | py_0 | conda-forge |
rtree | 0.9.4 | py36h7b0cdae_0 | conda-forge |
ruamel.yaml | 0.16.6 | py36h516909a_0 | conda-forge |
ruamel.yaml.clib | 0.2.0 | py36h516909a_0 | conda-forge |
s3fs | 0.4.0 | py_0 | conda-forge |
s3transfer | 0.3.3 | py36_0 | conda-forge |
sat-stac | 0.3.3 | py_0 | conda-forge |
scikit-image | 0.16.2 | py36hb3f55d8_0 | conda-forge |
scikit-learn | 0.22.1 | py36hcdab131_1 | conda-forge |
scipy | 1.4.1 | py36h921218d_0 | conda-forge |
sed | 4.7 | h1bed415_1000 | conda-forge |
selenium | 3.141.0 | py36h516909a_1000 | conda-forge |
send2trash | 1.5.0 | py_0 | conda-forge |
setuptools | 45.2.0 | py36_0 | conda-forge |
shapely | 1.7.0 | py36h5d51c17_0 | conda-forge |
simpervisor | 0.3 | py_1 | conda-forge |
simplegeneric | 0.8.1 | py_1 | conda-forge |
simplejson | 3.17.0 | py36h516909a_0 | conda-forge |
six | 1.14.0 | py36_0 | conda-forge |
snappy | 1.1.8 | he1b5a44_1 | conda-forge |
snuggs | 1.4.7 | py_0 | conda-forge |
sortedcontainers | 2.1.0 | py_0 | conda-forge |
soupsieve | 1.9.4 | py36_0 | conda-forge |
spectral | 0.20 | py_0 | conda-forge |
sqlalchemy | 1.3.13 | py36h516909a_0 | conda-forge |
sqlite | 3.30.1 | hcee41ef_0 | conda-forge |
streamz | 0.5.2 | py_0 | conda-forge |
tbb | 2018.0.5 | h2d50403_0 | conda-forge |
tblib | 1.6.0 | py_0 | conda-forge |
terminado | 0.8.3 | py36_0 | conda-forge |
testpath | 0.4.4 | py_0 | conda-forge |
threddsclient | 0.4.2 | py_0 | conda-forge |
thrift | 0.11.0 | py36he1b5a44_1001 | conda-forge |
thrift-cpp | 0.12.0 | hf3afdfd_1004 | conda-forge |
tiledb | 1.7.0 | hcde45ca_2 | conda-forge |
tk | 8.6.10 | hed695b0_0 | conda-forge |
tktable | 2.10 | h555a92e_3 | conda-forge |
toolz | 0.10.0 | py_0 | conda-forge |
tornado | 6.0.3 | py36h516909a_4 | conda-forge |
tqdm | 4.43.0 | py_0 | conda-forge |
traitlets | 4.3.3 | py36_0 | conda-forge |
typing_extensions | 3.7.4.1 | py36_0 | conda-forge |
tzcode | 2019a | h516909a_1002 | conda-forge |
tzlocal | 2.0.0 | py_0 | conda-forge |
uriparser | 0.9.3 | he1b5a44_1 | conda-forge |
uritemplate | 3.0.1 | py_0 | conda-forge |
urllib3 | 1.25.7 | py36_0 | conda-forge |
wcwidth | 0.1.8 | py_0 | conda-forge |
webencodings | 0.5.1 | py_1 | conda-forge |
webob | 1.8.6 | py_0 | conda-forge |
wheel | 0.34.2 | py_1 | conda-forge |
widgetsnbextension | 3.5.1 | py36_0 | conda-forge |
xarray | 0.15.0 | py_0 | conda-forge |
xerces-c | 3.2.2 | h8412b87_1004 | conda-forge |
xgboost | 0.90 | py36he1b5a44_4 | conda-forge |
xgeo | 1.0 | py_0 | conda-forge |
xorg-kbproto | 1.0.7 | h14c3975_1002 | conda-forge |
xorg-libice | 1.0.10 | h516909a_0 | conda-forge |
xorg-libsm | 1.2.3 | h84519dc_1000 | conda-forge |
xorg-libx11 | 1.6.9 | h516909a_0 | conda-forge |
xorg-libxau | 1.0.9 | h14c3975_0 | conda-forge |
xorg-libxdmcp | 1.1.3 | h516909a_0 | conda-forge |
xorg-libxext | 1.3.4 | h516909a_0 | conda-forge |
xorg-libxpm | 3.5.13 | h516909a_0 | conda-forge |
xorg-libxrender | 0.9.10 | h516909a_1002 | conda-forge |
xorg-libxt | 1.1.5 | h516909a_1003 | conda-forge |
xorg-renderproto | 0.11.1 | h14c3975_1002 | conda-forge |
xorg-xextproto | 7.3.0 | h14c3975_1002 | conda-forge |
xorg-xproto | 7.0.31 | h14c3975_1007 | conda-forge |
xrviz | 0.1.4 | py_1 | conda-forge |
xz | 5.2.4 | h14c3975_1001 | conda-forge |
yaml | 0.2.2 | h516909a_1 | conda-forge |
yarl | 1.3.0 | py36h516909a_1000 | conda-forge |
zarr | 2.4.0 | py_0 | conda-forge |
zeromq | 4.3.2 | he1b5a44_2 | conda-forge |
zict | 2.0.0 | py_0 | conda-forge |
zipp | 3.1.0 | py_0 | conda-forge |
zlib | 1.2.11 | h516909a_1006 | conda-forge |
zstd | 1.4.4 | h3b9ef0a_1 | conda-forge |
Name | Version | Build | Channel |
---|---|---|---|
_libgcc_mutex | 0.1 | conda_forge | conda-forge |
_openmp_mutex | 4.5 | 1_llvm | conda-forge |
_r-mutex | 1.0.1 | anacondar_1 | conda-forge |
binutils_impl_linux-64 | 2.33.1 | h53a641e_8 | conda-forge |
binutils_linux-64 | 2.33.1 | h9595d00_16 | conda-forge |
boost-cpp | 1.70.0 | h8e57a91_2 | conda-forge |
bwidget | 1.9.14 | 0 | conda-forge |
bzip2 | 1.0.8 | h516909a_2 | conda-forge |
ca-certificates | 2019.11.28 | hecc5488_0 | conda-forge |
cairo | 1.16.0 | hfb77d84_1002 | conda-forge |
cfitsio | 3.470 | hb60a0a2_2 | conda-forge |
curl | 7.68.0 | hf8cf82a_0 | conda-forge |
expat | 2.2.9 | he1b5a44_2 | conda-forge |
fontconfig | 2.13.1 | h86ecdb6_1001 | conda-forge |
freetype | 2.10.0 | he983fc9_1 | conda-forge |
freexl | 1.0.5 | h14c3975_1002 | conda-forge |
fribidi | 1.0.5 | h516909a_1002 | conda-forge |
gcc_impl_linux-64 | 7.3.0 | hd420e75_5 | conda-forge |
gcc_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
geos | 3.7.2 | he1b5a44_2 | conda-forge |
geotiff | 1.5.1 | hcd53e25_3 | conda-forge |
gettext | 0.19.8.1 | hc5be6a0_1002 | conda-forge |
gfortran_impl_linux-64 | 7.3.0 | hdf63c60_5 | conda-forge |
gfortran_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
giflib | 5.1.7 | h516909a_1 | conda-forge |
glib | 2.58.3 | h6f030ca_1002 | conda-forge |
graphite2 | 1.3.13 | hf484d3e_1000 | conda-forge |
gsl | 2.6 | h294904e_0 | conda-forge |
gxx_impl_linux-64 | 7.3.0 | hdf63c60_5 | conda-forge |
gxx_linux-64 | 7.3.0 | h553295d_16 | conda-forge |
harfbuzz | 2.4.0 | h9f30f68_3 | conda-forge |
hdf4 | 4.2.13 | hf30be14_1003 | conda-forge |
hdf5 | 1.10.5 | nompi_h3c11f04_1104 | conda-forge |
icu | 64.2 | he1b5a44_1 | conda-forge |
jpeg | 9c | h14c3975_1001 | conda-forge |
json-c | 0.13.1 | h14c3975_1001 | conda-forge |
kealib | 1.4.10 | h58c409b_1005 | conda-forge |
krb5 | 1.16.4 | h2fd8d38_0 | conda-forge |
ld_impl_linux-64 | 2.33.1 | h53a641e_8 | conda-forge |
libblas | 3.8.0 | 15_openblas | conda-forge |
libcblas | 3.8.0 | 15_openblas | conda-forge |
libcurl | 7.68.0 | hda55be3_0 | conda-forge |
libdap4 | 3.20.4 | hd3bb157_0 | conda-forge |
libedit | 3.1.20170329 | hf8c457e_1001 | conda-forge |
libffi | 3.2.1 | he1b5a44_1006 | conda-forge |
libgcc-ng | 9.2.0 | h24d8f2e_2 | conda-forge |
libgdal | 3.0.1 | hf47eb90_8 | conda-forge |
libgfortran-ng | 7.3.0 | hdf63c60_5 | conda-forge |
libgomp | 9.2.0 | h24d8f2e_2 | conda-forge |
libiconv | 1.15 | h516909a_1005 | conda-forge |
libkml | 1.3.0 | h4fcabce_1010 | conda-forge |
liblapack | 3.8.0 | 15_openblas | conda-forge |
libnetcdf | 4.6.2 | h303dfb8_1003 | conda-forge |
libopenblas | 0.3.8 | h5ec1e0e_0 | conda-forge |
libpng | 1.6.37 | hed695b0_0 | conda-forge |
libpq | 11.5 | hd9ab2ff_2 | conda-forge |
libsodium | 1.0.17 | h516909a_0 | conda-forge |
libspatialite | 4.3.0a | h57ae47a_1030 | conda-forge |
libssh2 | 1.8.2 | h22169c7_2 | conda-forge |
libstdcxx-ng | 9.2.0 | hdf63c60_2 | conda-forge |
libtiff | 4.1.0 | hc3755c2_3 | conda-forge |
libuuid | 2.32.1 | h14c3975_1000 | conda-forge |
libxcb | 1.13 | h14c3975_1002 | conda-forge |
libxml2 | 2.9.10 | hee79883_0 | conda-forge |
llvm-openmp | 9.0.1 | hc9558a2_2 | conda-forge |
lz4-c | 1.8.3 | he1b5a44_1001 | conda-forge |
make | 4.3 | h516909a_0 | conda-forge |
ncurses | 6.1 | hf484d3e_1002 | conda-forge |
openjpeg | 2.3.1 | h981e76c_3 | conda-forge |
openssl | 1.1.1d | h516909a_0 | conda-forge |
pango | 1.42.4 | ha030887_1 | conda-forge |
pcre | 8.44 | he1b5a44_0 | conda-forge |
pixman | 0.38.0 | h516909a_1003 | conda-forge |
poppler | 0.67.0 | h14e79db_8 | conda-forge |
poppler-data | 0.4.9 | 1 | conda-forge |
postgresql | 11.5 | hc63931a_2 | conda-forge |
proj4 | 6.1.1 | hc80f0dc_1 | conda-forge |
pthread-stubs | 0.4 | h14c3975_1001 | conda-forge |
r-assertthat | 0.2.1 | r36h6115d3f_1 | conda-forge |
r-backports | 1.1.5 | r36hcdcec82_0 | conda-forge |
r-base | 3.6.2 | h7ed4ef7_1 | conda-forge |
r-base64enc | 0.1_3 | r36hcdcec82_1003 | conda-forge |
r-class | 7.3_15 | r36hcdcec82_1001 | conda-forge |
r-classint | 0.4_2 | r36h9bbef5b_0 | conda-forge |
r-cli | 2.0.2 | r36h6115d3f_0 | conda-forge |
r-codetools | 0.2_16 | r36h6115d3f_1001 | conda-forge |
r-crayon | 1.3.4 | r36h6115d3f_1002 | conda-forge |
r-dbi | 1.1.0 | r36h6115d3f_0 | conda-forge |
r-digest | 0.6.25 | r36h0357c0b_0 | conda-forge |
r-e1071 | 1.7_3 | r36h0357c0b_0 | conda-forge |
r-ellipsis | 0.3.0 | r36hcdcec82_0 | conda-forge |
r-evaluate | 0.14 | r36h6115d3f_1 | conda-forge |
r-fansi | 0.4.1 | r36hcdcec82_0 | conda-forge |
r-fastmap | 1.0.1 | r36h0357c0b_0 | conda-forge |
r-fnn | 1.1.3 | r36h0357c0b_1 | conda-forge |
r-foreach | 1.4.8 | r36h6115d3f_0 | conda-forge |
r-foreign | 0.8_76 | r36hcdcec82_0 | conda-forge |
r-gdalutils | 2.0.3.2 | r36h6115d3f_0 | conda-forge |
r-glue | 1.3.1 | r36hcdcec82_1 | conda-forge |
r-gstat | 2.0_4 | r36hcdcec82_0 | conda-forge |
r-htmltools | 0.4.0 | r36h0357c0b_0 | conda-forge |
r-httpuv | 1.5.2 | r36h0357c0b_1 | conda-forge |
r-intervals | 0.15.1 | r36h0357c0b_1003 | conda-forge |
r-irdisplay | 0.7 | r36_1001 | conda-forge |
r-irkernel | 1.1 | r36h6115d3f_0 | conda-forge |
r-iterators | 1.0.12 | r36h6115d3f_0 | conda-forge |
r-jsonlite | 1.6.1 | r36hcdcec82_0 | conda-forge |
r-kernsmooth | 2.23_16 | r36hfa343cc_1 | conda-forge |
r-later | 1.0.0 | r36h0357c0b_0 | conda-forge |
r-lattice | 0.20_40 | r36hcdcec82_0 | conda-forge |
r-magrittr | 1.5 | r36h6115d3f_1002 | conda-forge |
r-maptools | 0.9_9 | r36hcdcec82_0 | conda-forge |
r-mass | 7.3_51.5 | r36hcdcec82_0 | conda-forge |
r-matrix | 1.2_18 | r36h7fa42b6_2 | conda-forge |
r-mime | 0.9 | r36hcdcec82_0 | conda-forge |
r-pbdzmq | 0.3_3 | r36h559a7a4_1002 | conda-forge |
r-pillar | 1.4.3 | r36h6115d3f_0 | conda-forge |
r-promises | 1.1.0 | r36h0357c0b_0 | conda-forge |
r-r.methodss3 | 1.8.0 | r36h6115d3f_0 | conda-forge |
r-r.oo | 1.23.0 | r36h6115d3f_0 | conda-forge |
r-r.utils | 2.9.2 | r36h6115d3f_0 | conda-forge |
r-r6 | 2.4.1 | r36h6115d3f_0 | conda-forge |
r-rappdirs | 0.3.1 | r36hcdcec82_1003 | conda-forge |
r-raster | 3.0_12 | r36h0357c0b_0 | conda-forge |
r-rcpp | 1.0.3 | r36h0357c0b_0 | conda-forge |
r-repr | 1.1.0 | r36h6115d3f_0 | conda-forge |
r-reticulate | 1.14 | r36h0357c0b_0 | conda-forge |
r-rgdal | 1.4_7 | r36h33584d0_0 | conda-forge |
r-rgeos | 0.5_2 | r36h05224b2_0 | conda-forge |
r-rlang | 0.4.5 | r36hcdcec82_0 | conda-forge |
r-sf | 0.8_0 | r36h33584d0_0 | conda-forge |
r-shiny | 1.4.0 | r36h6115d3f_0 | conda-forge |
r-snow | 0.4_3 | r36h6115d3f_1001 | conda-forge |
r-sourcetools | 0.1.7 | r36he1b5a44_1001 | conda-forge |
r-sp | 1.4_1 | r36hcdcec82_0 | conda-forge |
r-spacetime | 1.2_3 | r36h6115d3f_0 | conda-forge |
r-units | 0.6_5 | r36h0357c0b_0 | conda-forge |
r-utf8 | 1.1.4 | r36hcdcec82_1001 | conda-forge |
r-uuid | 0.1_4 | r36hcdcec82_0 | conda-forge |
r-vctrs | 0.2.3 | r36hcdcec82_0 | conda-forge |
r-xtable | 1.8_4 | r36h6115d3f_2 | conda-forge |
r-xts | 0.12_0 | r36hcdcec82_0 | conda-forge |
r-zeallot | 0.1.0 | r36h6115d3f_1001 | conda-forge |
r-zoo | 1.8_7 | r36hcdcec82_0 | conda-forge |
readline | 8.0 | hf8c457e_0 | conda-forge |
sed | 4.7 | h1bed415_1000 | conda-forge |
sqlite | 3.30.1 | hcee41ef_0 | conda-forge |
tbb | 2018.0.5 | h2d50403_0 | conda-forge |
tiledb | 1.6.2 | hcde45ca_3 | conda-forge |
tk | 8.6.10 | hed695b0_0 | conda-forge |
tktable | 2.10 | h555a92e_3 | conda-forge |
tzcode | 2019a | h516909a_1002 | conda-forge |
udunits2 | 2.2.27.6 | h4e0c4b3_1001 | conda-forge |
xerces-c | 3.2.2 | h8412b87_1004 | conda-forge |
xorg-kbproto | 1.0.7 | h14c3975_1002 | conda-forge |
xorg-libice | 1.0.10 | h516909a_0 | conda-forge |
xorg-libsm | 1.2.3 | h84519dc_1000 | conda-forge |
xorg-libx11 | 1.6.9 | h516909a_0 | conda-forge |
xorg-libxau | 1.0.9 | h14c3975_0 | conda-forge |
xorg-libxdmcp | 1.1.3 | h516909a_0 | conda-forge |
xorg-libxext | 1.3.4 | h516909a_0 | conda-forge |
xorg-libxrender | 0.9.10 | h516909a_1002 | conda-forge |
xorg-renderproto | 0.11.1 | h14c3975_1002 | conda-forge |
xorg-xextproto | 7.3.0 | h14c3975_1002 | conda-forge |
xorg-xproto | 7.0.31 | h14c3975_1007 | conda-forge |
xz | 5.2.4 | h14c3975_1001 | conda-forge |
zeromq | 4.3.2 | he1b5a44_2 | conda-forge |
zlib | 1.2.11 | h516909a_1006 | conda-forge |
zstd | 1.4.4 | h3b9ef0a_1 | conda-forge |
Uses the Dask Jobqueue Library to submit jobs to SLURM. Each "Slurm job" has X number of "Python workers".
Scales across nodes and partitions.
Number of workers can be scaled up or down dynamically.
Subject to SLURM resource allocation.
JupyterLab has a Dask add-on to monitor the cluster.
Dask includes DataFrame (similar to pandas) and Array (similar to NumPy) features.
Below are the steps/commands to set up the cluster. Below these steps is a GIF of the process on Ceres.
import os
import time
import dask_jobqueue as jq
from dask.distributed import Client,wait
import dask.array as da
The following parameters need to be specified:
partition='short,brief-low'
container_url = 'docker://rowangaffney/data_science_im_rs:latest'
conda_env = 'geo'
num_processes = 2
num_threads_per_processes = 6
mem = 3.2*num_processes*num_threads_per_processes
n_cores_per_job = num_processes*num_threads_per_processes
clust = jq.SLURMCluster(queue=partition,
processes=num_processes,
cores=n_cores_per_job,
memory=str(mem)+'GB',
interface='ib0',
local_directory='$TMPDIR',
tmpdir_ssh='/project/cper_neon_aop/neon_2017/analysis/prepocessing/',
death_timeout=30,
python="singularity -vv exec --bind /usr/lib64 --bind /scinet01 --bind /software/7/apps/envi/bin/ {} /opt/conda/envs/{}/bin/python".format(container_url,conda_env),
walltime='02:00:00',
job_extra=["--output=/dev/null","--error=/dev/null"])
cl=Client(clust)
dash_addr = '''/user/{}/proxy/{}/status'''.format(os.environ['USER'],cl.scheduler_info()['services']['dashboard'])
print('Dask Lab Extension Address (paste into the dask search box): '+dash_addr)
cl
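In dask-jobqueue, the cores and memory arguments describe one Slurm job, and processes splits that job into workers. The sketch below works through the arithmetic for the settings above. The helper and the gb_per_core name are hypothetical; 3.2 is simply the factor used in the mem calculation above.

```python
def job_resources(num_processes, num_threads_per_process, gb_per_core=3.2):
    """Summarise what one Slurm job provides (hypothetical helper)."""
    cores = num_processes * num_threads_per_process
    mem_gb = gb_per_core * cores
    return {
        "cores_per_job": cores,
        # Format with one decimal so floating-point noise does not end up
        # in the memory string.
        "memory_per_job": "{:.1f}GB".format(mem_gb),
        "threads_per_worker": num_threads_per_process,
        "memory_per_worker_gb": round(mem_gb / num_processes, 1),
    }


res = job_resources(num_processes=2, num_threads_per_process=6)
for key, value in res.items():
    print(key, "=", value)
```

With these settings each Slurm job requests 12 cores and 38.4 GB, shared by two workers of 6 threads and 19.2 GB each.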
num_jobs=12
clust.scale(n=num_jobs*num_processes)
while (cl.status == "running") and (len(cl.scheduler_info()["workers"]) < num_jobs*num_processes):
    time.sleep(.1)
cl
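Because the workers are subject to Slurm resource allocation, the loop above can wait indefinitely if the requested jobs are never granted. One possible variation, sketched below under that assumption, is to poll with a timeout. The wait_until helper is hypothetical, and the worker list is a local stand-in for the cluster so the sketch runs without Dask.

```python
import time


def wait_until(condition, timeout=300.0, poll_interval=0.1):
    """Poll condition() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_interval)
    return condition()  # one last check at the deadline


# Stand-in for the cluster: one more "worker" registers on every poll.
workers = []


def enough_workers(target=3):
    workers.append(len(workers))  # simulates a worker joining
    return len(workers) >= target


ready = wait_until(enough_workers, timeout=5.0, poll_interval=0.01)
print("all workers ready:", ready)
```

With the real client, the condition could be lambda: len(cl.scheduler_info()["workers"]) >= num_jobs*num_processes, and a False result signals that the cluster did not reach full size in time, so the computation can proceed with fewer workers or the request can be reduced.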
A few quick examples.
t = da.random.random((10000,7500,100),chunks=(400,400,-1))
t
t2 = t.mean()
t2
Now we will dynamically load the data, compute the results, and drop the data.
t2.compute()
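To get a feel for what the chunks argument means for this array, the sizes can be worked out by hand. The sketch below is plain arithmetic, not a Dask call; the helper name is hypothetical, and it assumes 8-byte float64 values and decimal units (1 MB = 1e6 bytes).

```python
import math


def chunk_summary(shape, chunks, itemsize=8):
    """Return (number of chunks, MB per full chunk, total GB) for a chunked array."""
    # A chunk size of -1 means "do not split this axis".
    chunk_shape = [dim if c == -1 else c for dim, c in zip(shape, chunks)]
    # Chunks per axis, rounding up so a partial chunk at the edge is counted.
    n_chunks = math.prod(math.ceil(dim / c) for dim, c in zip(shape, chunk_shape))
    chunk_mb = math.prod(chunk_shape) * itemsize / 1e6
    total_gb = math.prod(shape) * itemsize / 1e9
    return n_chunks, chunk_mb, total_gb


n_chunks, chunk_mb, total_gb = chunk_summary((10000, 7500, 100), (400, 400, -1))
print("chunks:", n_chunks)
print("MB per full chunk:", chunk_mb)
print("total GB:", total_gb)
```

The 60 GB array is split into 475 chunks of at most 128 MB each, so each worker only needs to hold a few chunks at a time while the mean is computed.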
Let's try working with data larger than memory:
t = da.random.random((100000,7500,100),chunks=(400,400,-1))
t
t2 = t.mean()
t2
t2.compute()
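A mean over data larger than memory is possible because only a running sum and a count need to be kept while chunks are created, reduced, and dropped. The single-process sketch below illustrates the idea with small, made-up chunk sizes; it is not Dask's actual implementation, which performs the per-chunk work in parallel on the workers and then combines the partial results.

```python
import random


def random_chunks(n_chunks, chunk_len, seed=0):
    """Yield chunks of random values one at a time (never all at once)."""
    rng = random.Random(seed)
    for _ in range(n_chunks):
        yield [rng.random() for _ in range(chunk_len)]


def chunked_mean(chunk_iter):
    """Compute a mean while keeping only a running sum and a count."""
    total = 0.0
    count = 0
    for chunk in chunk_iter:
        total += sum(chunk)
        count += len(chunk)
        # The chunk is no longer referenced after this iteration,
        # so its memory can be reclaimed.
    return total / count


streamed = chunked_mean(random_chunks(n_chunks=50, chunk_len=10_000))

# For comparison only: the same values held in memory all at once.
everything = [v for chunk in random_chunks(50, 10_000) for v in chunk]
full = sum(everything) / len(everything)

print("streamed mean:", streamed)
print("in-memory mean:", full)
```

Both approaches give the same mean up to floating-point rounding, but the streamed version never holds more than one chunk in memory.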
Alternatively, we can load the data to the cluster with the "persist" option:
t = da.random.random((10000,7500,100),chunks=(400,400,-1)).persist()
wait(t)
t
t.mean().compute()
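Persisting keeps every chunk in worker memory, so it only makes sense when the whole array fits on the cluster. The sketch below compares the two array sizes used in this section with the nominal memory requested above (12 jobs at 3.2 GB x 2 processes x 6 threads each). It is plain arithmetic with a hypothetical helper name, assumes 8-byte float64 values, and ignores the headroom that workers need in practice, so the usable memory is lower than the nominal figure.

```python
import math


def array_size_gb(shape, itemsize=8):
    """Size of a dense array in GB (1 GB = 1e9 bytes)."""
    return math.prod(shape) * itemsize / 1e9


# Nominal cluster memory: num_jobs * (3.2 GB * num_processes * num_threads).
cluster_memory_gb = 12 * 3.2 * 2 * 6

small = array_size_gb((10000, 7500, 100))    # the persisted array
large = array_size_gb((100000, 7500, 100))   # the larger-than-memory array

for name, size in [("small", small), ("large", large)]:
    fits = size <= cluster_memory_gb
    print(name, "array:", size, "GB, fits in cluster memory:", fits)
```

The 60 GB array fits comfortably within the roughly 460 GB requested, while the 600 GB array does not, which is why the larger example was computed chunk by chunk rather than persisted.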