Getting and Comparing KSSL Data

2015-12-18 Dylan Beaudette

Introduction

This document demonstrates how to use the soilDB package to download KSSL data from SoilWeb. These data are from the June 2015 snapshot, and will be updated as future snapshots are released. Comparisons are made via graphical summaries of key soil properties with depth, using data structures and functions from the aqp package.

Installation

With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:

# run these commands in the R console
install.packages('maps', dependencies=TRUE) # stable version from CRAN + dependencies
install.packages('soilDB', dependencies=TRUE) # stable version from CRAN + dependencies
install.packages('soilDB', repos="http://R-Forge.R-project.org", type='source') # most recent copy from r-forge

Getting KSSL Data

Data can be queried by "taxonname" (typically a soil series), by a geographic bounding box in WGS84 coordinates, or by MLRA. Taxonname matching is case-insensitive and based on the most current taxonname in NASIS. See ?fetchKSSL or this GitHub repository for details.

In this example, we download KSSL data for the musick, chaix, and holland soils, which are commonly found on granitic rocks of the Sierra Nevada Mountain region of California.

# load libraries
library(soilDB)
library(plyr)
library(lattice)
library(maps)

# define plotting style
tps <- list(superpose.line=list(col=c('RoyalBlue', 'DarkRed', 'DarkGreen'), lwd=2))

# fetch KSSL data by fuzzy-matching of series name
musick <- fetchKSSL('musick')
holland <- fetchKSSL('holland')
chaix <- fetchKSSL('chaix')

# generate a basemap of northern California, with county boundaries
map('county', 'California', xlim=c(-123.25, -118.75), ylim=c(36.5, 41))
# add long/lat axes
map.axes()
# add locations of musick
points(y ~ x, data=site(musick), pch=21, bg='RoyalBlue')
# add locations of holland
points(y ~ x, data=site(holland), pch=21, bg='DarkRed')
# add locations of chaix
points(y ~ x, data=site(chaix), pch=21, bg='DarkGreen')
# add a simple legend
legend('topright', pch=21, pt.bg=c('RoyalBlue', 'DarkRed', 'DarkGreen'), legend=c('musick', 'holland', 'chaix'), bty='n')

Profile Plots

Generate "thematic" profile sketches by coloring horizons according to clay content.

par(mar=c(0,1,5,1))
plot(musick, name='hzn_desgn', color='clay')

plot(holland, name='hzn_desgn', color='clay')

plot(chaix, name='hzn_desgn', color='clay')

Combining Data

Since fetchKSSL() always returns data in the same format, it is possible to stack the results of several KSSL queries using rbind().

# check "correlated as" names from each object
table(musick$taxonname)
## Musick MUSICK
##      4      7
table(holland$taxonname)
## Holland HOLLAND
##       9       8
table(chaix$taxonname)
## Chaix CHAIX
##     7     5
# since each name appears with more than one capitalization, normalize soil series name by replacement
musick$taxonname <- 'musick'
holland$taxonname <- 'holland'
chaix$taxonname <- 'chaix'

# stack datasets into a new SoilProfileCollection
g <- rbind(chaix, musick, holland)

# compute weighted-mean particle diameter
g$wmpd <- with(horizons(g), ((vcs * 1.5) + (cs * 0.75) + (ms * 0.375) + (fs * 0.175) + (vfs *0.075) + (silt * 0.026) + (clay * 0.0015)) / (vcs + cs + ms + fs + vfs + silt + clay))

# tabulate number of pedons by normalized series name
table(g$taxonname)
##   chaix holland  musick
##      12      17      11
# plot the grouped object, with profiles arranged by taxonname
par(mar=c(0,1,5,1))
groupedProfilePlot(g, groups = 'taxonname', color='clay', group.name.offset = -10)

# estimate soil depth based on horizon designations
sdc <- getSoilDepthClass(g, name='hzn_desgn', top='hzn_top', bottom='hzn_bot')

# splice into site data
site(g) <- sdc

# summarize soil depth by taxonname
tapply(g$depth, g$taxonname, summary)
## $chaix
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    64.0    84.0    97.0   113.1   125.5   196.0 
## 
## $holland
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   137.0   162.0   170.0   188.1   193.0   321.0 
## 
## $musick
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   152.0   177.5   183.0   209.1   250.0   295.0
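
The weighted-mean particle diameter (wmpd) computed earlier in this section multiplies each particle-size fraction by a fixed diameter in mm and divides by the sum of the fractions. The sketch below repeats that arithmetic on two made-up horizons so that the result can be checked by hand. The data frame `h`, its values, and the helper `wmpd_from_fractions()` are hypothetical; only the seven coefficients are copied from the expression above. This is an illustration of the calculation, not a description of anything aqp or soilDB do internally.

```r
# diameters (mm) used as weights, copied from the wmpd expression above
d <- c(vcs=1.5, cs=0.75, ms=0.375, fs=0.175, vfs=0.075, silt=0.026, clay=0.0015)

# hypothetical horizon data: percent by weight for each fraction (not KSSL values)
h <- data.frame(
  vcs=c(5, 2), cs=c(10, 8), ms=c(15, 10), fs=c(20, 15), vfs=c(10, 10),
  silt=c(25, 30), clay=c(15, 25)
)

# hypothetical helper: weighted mean of the diameters, weights = fractions
wmpd_from_fractions <- function(h, d) {
  m <- as.matrix(h[, names(d)])
  # numerator: sum of (fraction * diameter); denominator: sum of fractions
  as.vector(m %*% d) / rowSums(m)
}

h$wmpd <- wmpd_from_fractions(h, d)
h$wmpd
```

Dividing by the sum of the fractions, rather than by a fixed 100, keeps the result consistent when the fractions of a horizon do not total exactly 100%. A missing value in any fraction yields a missing wmpd for that horizon, in the helper as in the original expression.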

Aggregation

The slab() function aggregates selected variables within collections of soil profiles along depth slices. In this case, we aggregate clay, pH (1:1 H2O, estimated from saturated paste pH when missing), base saturation at pH 8.2, and weighted-mean particle diameter along 1-cm thick slices and within groups defined by the variable 'taxonname'. See ?slab for details on how this function can be used.

g.slab <- slab(g, taxonname ~ clay + estimated_ph_h2o + bs82 + wmpd)

# inspect stacked data structure
str(g.slab)
## 'data.frame':    3852 obs. of  10 variables:
##  $ variable             : Factor w/ 4 levels "clay","estimated_ph_h2o",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ taxonname            : chr  "chaix" "chaix" "chaix" "chaix" ...
##  $ p.q5                 : num  3.63 3.63 3.63 3.71 3.71 ...
##  $ p.q25                : num  5.82 5.82 5.82 6.34 6.34 ...
##  $ p.q50                : num  7.95 7.95 7.95 8.14 8.14 ...
##  $ p.q75                : num  9.25 9.25 9.25 9.64 9.64 ...
##  $ p.q95                : num  10.2 10.2 10.2 11.7 11.7 ...
##  $ top                  : int  0 1 2 3 4 5 6 7 8 9 ...
##  $ bottom               : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ contributing_fraction: num  0.75 0.75 0.75 0.917 0.917 ...
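
To make the idea of slice-wise aggregation concrete, here is a minimal base-R sketch of the concept: horizons are expanded into 1-cm slices, and each slice is summarized across pedons by a median and a "contributing fraction" (the share of pedons with data at that depth). The horizon data and all object names (`hz`, `slices`, `agg`) are hypothetical, and the sketch uses a plain median(); it is not how slab() is implemented and may not reproduce its quantile estimates.

```r
# hypothetical horizons for three pedons (depths in cm, clay in %)
hz <- data.frame(
  id     = c('a', 'a', 'b', 'b', 'c'),
  top    = c(0, 10, 0, 5, 0),
  bottom = c(10, 30, 5, 20, 15),
  clay   = c(10, 20, 12, 18, 14)
)

# expand each horizon into 1-cm slices, identified by the slice top depth
slices <- do.call(rbind, lapply(seq_len(nrow(hz)), function(i) {
  data.frame(id = hz$id[i], top = seq(hz$top[i], hz$bottom[i] - 1), clay = hz$clay[i])
}))

# total number of pedons in the collection
n.pedons <- length(unique(hz$id))

# summarize each slice across pedons
agg <- do.call(rbind, lapply(split(slices, slices$top), function(s) {
  data.frame(
    top = s$top[1],
    bottom = s$top[1] + 1,
    p.q50 = median(s$clay),
    # fraction of pedons that have data in this slice
    contributing_fraction = nrow(s) / n.pedons
  )
}))

head(agg)
```

In this toy collection the contributing fraction is 1 down to 15 cm, drops to 2/3 once the shallowest pedon ends, and to 1/3 below 20 cm, which is the same pattern to look for in the contributing_fraction column of g.slab.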

Refine Labels

It is convenient to know how many pedons there are within each collection, so we append this value to the series name. Using the same approach, we can rename the soil properties with more useful descriptions and units of measure.

# re-name soils with series name + number of pedons; order is critical!
new.levels <- c('musick', 'holland', 'chaix')
new.labels <- paste(new.levels, ' [', c(length(musick), length(holland), length(chaix)), ' pedons]', sep='')
g.slab$taxonname <- factor(g.slab$taxonname, levels=new.levels, labels=new.labels)

# new names should match the order in:
levels(g.slab$variable)
## [1] "clay"             "estimated_ph_h2o" "bs82"             "wmpd"
# re-name soil property labels; order is critical!
levels(g.slab$variable) <- c('Clay (%)', 'pH 1:1 H2O', 'Base Saturation at pH 8.2 (%)', 'WMPD (mm)')
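
The relabeling above depends on the labels being listed in the same order as the levels. One way to make that pairing harder to get wrong is to keep the counts in a named vector, so that each label is built from its own level. The sketch below uses a small stand-in character vector `x` rather than g.slab; the pedon counts are the ones tabulated earlier (chaix 12, holland 17, musick 11), and the object names are hypothetical.

```r
# stand-in for a taxonname column (hypothetical values)
x <- c('chaix', 'musick', 'holland', 'chaix')

# pedon counts keyed by series name; the order here sets the factor level order
n <- c(musick = 11, holland = 17, chaix = 12)

new.levels <- names(n)
new.labels <- paste0(new.levels, ' [', n, ' pedons]')

# levels and labels are matched by position, so both must use the same order
f <- factor(x, levels = new.levels, labels = new.labels)

levels(f)
as.character(f)
```

Because new.labels is derived from new.levels, reordering the entries of `n` changes the order of the legend entries but cannot attach a count to the wrong series.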

Graphically Compare

The slice-wise median and 25th/75th percentiles are reasonable estimates of central tendency and spread. "Contributing fraction" values (% of pedons with data at a given depth) are printed along the right-hand side of each panel. These values indicate both how deep the soils are and how much confidence can be placed in the aggregate data at any given depth.

# plot grouped, aggregate data
xyplot(top ~ p.q50 | variable, groups=taxonname, data=g.slab, ylab='Depth',
       xlab='median bounded by 25th and 75th percentiles',
       lower=g.slab$p.q25, upper=g.slab$p.q75, ylim=c(155,-5),
       panel=panel.depth_function, alpha=0.25, sync.colors=TRUE,
       prepanel=prepanel.depth_function,
       cf=g.slab$contributing_fraction,
       par.strip.text=list(cex=0.8),
       strip=strip.custom(bg=grey(0.85)),
       layout=c(4,1), scales=list(x=list(alternating=1, relation='free'), y=list(alternating=3)),
       par.settings=tps,
       auto.key=list(columns=3, lines=TRUE, points=FALSE)
)


This document is based on aqp version 1.9.3 and soilDB version 1.6.7.