Some ideas on summarizing soil color

D.E. Beaudette
2015-07-31
This document is based on aqp version 1.9 and sharpshootR version 0.8-3.

Introduction

This tutorial outlines a simple method for summarizing soil color from a collection of soil profiles using the aqp and sharpshootR packages for R. Suppose you have a collection of soil profiles and would like to compare the relative frequencies of soil colors among different horizons, depths, hillslope positions, or bedrock kinds. How would you determine the most frequent color, or a reasonable range of colors, for a given group? One solution is presented in the figure below, where relative proportions of soil color are arranged by genetic horizon. Larger “color bars” represent the most frequently described colors, in terms of both the number of observed horizons and the thickness of those horizons. The small numbers in parentheses denote the number of horizons associated with a given color. See the examples below for more ideas. The manual pages for aggregateColor and aggregateColorPlot contain additional details.

plot of chunk agg-color-data-0

Setup R Environment

If you have never used the aqp or sharpshootR packages before, you will likely need to install them. This only needs to be done once.

# stable version from CRAN and all dependencies
install.packages("aqp", dependencies = TRUE)
install.packages("soilDB", dependencies = TRUE)
install.packages("sharpshootR", dependencies = TRUE)
# latest versions from R-Forge:
install.packages("aqp", repos = "http://R-Forge.R-project.org", type = "source")
install.packages("soilDB", repos = "http://R-Forge.R-project.org", type = "source")
install.packages("sharpshootR", repos = "http://R-Forge.R-project.org", type = "source")

Now that you have all of the R packages that this document depends on, load them. R packages must be reinstalled whenever you change versions of R (e.g. after an upgrade), and loaded in each new R session before you can access their functions.

library(soilDB)
library(aqp)
library(sharpshootR)
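
If you are unsure whether the packages are already installed, a check along the following lines can help. This is a minimal sketch using only base R; the helper name missing_packages is hypothetical and not part of aqp, soilDB, or sharpshootR.

```r
# hypothetical helper: return the names of packages that are not installed
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("aqp", "soilDB", "sharpshootR")
to.install <- missing_packages(needed)

if (length(to.install) > 0) {
  message("Missing packages: ", paste(to.install, collapse = ", "))
  # uncomment to install from CRAN:
  # install.packages(to.install, dependencies = TRUE)
}
```

The install step is left commented out so that running the block never changes your package library without you asking for it.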

Sample Data

While the methods outlined in this document can be applied to any collection of pedons, it is convenient to work with a standardized set of data. You can follow along with the analysis by copying code from the following blocks and running it in your R session. The sample data used in this document is based on 30 soil profiles that have been correlated to the Loafercreek soil series from the Sierra Nevada Foothill Region of California. Note that the internal structure of the loafercreek data is identical to the structure returned by fetchNASIS() from the soilDB package. All horizon-level values are pulled from the pedon horizon table of the pedons being analyzed.

# load sample data from the soilDB package
data(loafercreek, package = "soilDB")
# graphical check
par(mar = c(0, 0, 0, 0))
plot(loafercreek, name = "", print.id = FALSE, cex.names = 0.8, axis.line.offset = -4, max.depth = 100)

plot of chunk load-data

Sample data, pedons correlated to the Loafercreek soil series. Original data were loaded from NASIS.


Examples

These examples can be readily adapted to your own data. Note that you may need to adjust object (change loafercreek to the name of your SoilProfileCollection object) and column (genhz, etc.) names. Data loaded from NASIS via fetchNASIS can be used without modification (except for object name) in these examples.

Typically, color summaries are grouped by some kind of “horizon-level” attribute: depth slice, horizon designation, generalized horizon label, or membership in a diagnostic feature. However, it is also possible to group colors by a “site-level” attribute such as bedrock kind, taxonname, or hillslope position. Note that grouping colors by a site-level attribute will pool colors from all depths.
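
The difference between the two kinds of grouping can be seen with a small base R illustration. The profiles, colors, and hillslope positions below are made up for this sketch and are not taken from the loafercreek data.

```r
# made-up horizon-level data: three profiles, one row per horizon
horizons <- data.frame(
  id = c("p1", "p1", "p2", "p2", "p3"),
  hzname = c("A", "Bt", "A", "Bt", "A"),
  soil_color = c("10YR 3/2", "7.5YR 4/4", "10YR 3/2", "5YR 4/4", "10YR 4/3"),
  stringsAsFactors = FALSE
)

# made-up site-level data: one row per profile
site <- data.frame(
  id = c("p1", "p2", "p3"),
  hillslope_pos = c("Summit", "Backslope", "Backslope"),
  stringsAsFactors = FALSE
)

# attach the site-level attribute to every horizon
h <- merge(horizons, site, by = "id")

# horizon-level grouping: colors are tabulated separately for each horizon label
by.hz <- table(h$hzname, h$soil_color)

# site-level grouping: A and Bt colors from a profile are pooled together
by.site <- table(h$hillslope_pos, h$soil_color)

print(by.hz)
print(by.site)
```

In the site-level table, each hillslope position collects the colors of every horizon in the profiles at that position, regardless of depth.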

Depth Slices

# slice color data at select depths
s <- slice(loafercreek, c(5, 10, 15, 25, 50, 75) ~ soil_color, strict = FALSE)

# make horizon labels based on slice depth
s$slice <- paste0(s$hzdept, " cm")
s$slice <- factor(s$slice, levels = guessGenHzLevels(s, "slice")$levels)

par(mar = c(4.5, 2.5, 4.5, 0))
aggregateColorPlot(aggregateColor(s, "slice"), label.cex = 0.65, main = "Loafercreek Dry Colors\nDepth Slices", 
    print.n.hz = TRUE)

plot of chunk agg-color-data-1
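
To make the idea of a depth slice concrete, the following base R sketch looks up the color at each slice depth in a single made-up profile. It assumes that a slice at depth z is taken from the horizon whose top is at or above z and whose bottom is below z; it is a conceptual illustration, not the aqp implementation of slice().

```r
# made-up profile: three horizons with depths in cm
hz <- data.frame(
  top = c(0, 8, 30),
  bottom = c(8, 30, 60),
  soil_color = c("#3D2F21FF", "#584537FF", "#755D41FF"),
  stringsAsFactors = FALSE
)

# slice depths used in the example above
z <- c(5, 10, 15, 25, 50, 75)

# index of the horizon containing each depth, NA when the profile is too shallow
idx <- vapply(z, function(d) {
  i <- which(hz$top <= d & hz$bottom > d)
  if (length(i) == 0) NA_integer_ else i[1]
}, integer(1))

# one row per slice, labeled in the same style as the "slice" column above
slices <- data.frame(
  slice = paste0(z, " cm"),
  soil_color = hz$soil_color[idx],
  stringsAsFactors = FALSE
)
print(slices)
```

Depths below the bottom of the profile have no color, so they contribute nothing to the summary for that slice.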


Generalized Horizon Labels

# generalize horizon names using REGEX rules
n <- c("Oi", "A", "BA", "Bt1", "Bt2", "Bt3", "Cr", "R")
p <- c("O", "^A$|Ad|Ap|AB", "BA$|Bw", "Bt1$|^B$", "^Bt$|^Bt2$", "^Bt3|^Bt4|CBt$|BCt$|2Bt|2CB$|^C$", "Cr", 
    "R")
loafercreek$genhz <- generalize.hz(loafercreek$hzname, n, p)

# remove non-matching generalized horizon names
loafercreek$genhz[loafercreek$genhz == "not-used"] <- NA
loafercreek$genhz <- factor(loafercreek$genhz)

aggregateColorPlot(aggregateColor(loafercreek, "genhz"), main = "Loafercreek Series Dry Colors\nGeneralized Horizon Labels", 
    print.n.hz = TRUE, label.cex = 0.8)

plot of chunk agg-color-data-2
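
It is worth checking how the REGEX rules assign labels before summarizing colors. The sketch below applies the same names and patterns to a handful of made-up horizon designations using base R only. The function generalize_sketch is a hypothetical stand-in, written under the assumption that rules are applied in order and that a later match overwrites an earlier one; consult the generalize.hz manual page for the actual behavior.

```r
# hypothetical stand-in for generalize.hz(); NOT the aqp implementation
generalize_sketch <- function(x, new, pat, non.matching = "not-used") {
  g <- rep(non.matching, times = length(x))
  # assumption: rules are applied in order, later matches overwrite earlier ones
  for (i in seq_along(new)) {
    g[grepl(pat[i], x)] <- new[i]
  }
  factor(g, levels = c(new, non.matching))
}

# same labels and patterns as in the example above
n <- c("Oi", "A", "BA", "Bt1", "Bt2", "Bt3", "Cr", "R")
p <- c("O", "^A$|Ad|Ap|AB", "BA$|Bw", "Bt1$|^B$", "^Bt$|^Bt2$",
       "^Bt3|^Bt4|CBt$|BCt$|2Bt|2CB$|^C$", "Cr", "R")

# made-up horizon designations for illustration
hz <- c("Oi", "A", "Ap", "BA", "Bw", "Bt1", "Bt2", "Bt3", "2Bt4", "BCt", "Cr", "R", "2C")
g <- generalize_sketch(hz, n, p)

# side-by-side comparison of original and generalized labels
print(data.frame(hzname = hz, genhz = g))
```

Designations that match none of the patterns, such as "2C" here, fall into the "not-used" class, which the example above converts to NA before plotting.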


Site-Level Data

par(mar = c(4.5, 4, 4.5, 0))
aggregateColorPlot(aggregateColor(loafercreek, "hillslope_pos"), main = "Loafercreek Series Dry Colors\nHillslope Position")

plot of chunk agg-color-data-3

par(mar = c(4.5, 5, 4.5, 0))
aggregateColorPlot(aggregateColor(loafercreek, "bedrock_kind"), main = "Loafercreek Series Dry Colors\nBedrock Kind")

plot of chunk agg-color-data-3


Additional Examples

Note that when using data from NASIS, you must first establish a selected set of site and pedon data.

Soil Color Signatures

# get data from NASIS or similar source
f <- fetchNASIS(rmHzErrors = TRUE)

# an ordered set of series names
soils <- c("ahwahnee", "auberry", "musick", "holland", "shaver", "chaix", "canisrocks")

# extract these soils and normalize taxonname
f.sub <- f[grep(paste0(soils, collapse = "|"), f$taxonname, ignore.case = TRUE), ]
for (x in soils) {
    f.sub$taxonname[grep(x, f.sub$taxonname, ignore.case = TRUE)] <- x
}
# reset levels to order specified above
f.sub$taxonname <- factor(f.sub$taxonname, levels = soils)

par(mar = c(4.5, 5, 4.5, 0))
aggregateColorPlot(aggregateColor(f.sub, "taxonname"), label.cex = 0.65, main = "Soil Color Signatures", 
    print.n.hz = TRUE, rect.border = NA, print.label = FALSE, horizontal.borders = TRUE)

Many Soils Associated with Common Bedrock

# geology + depth slices
f.sub <- f[grep("diorite", f$bedrock_kind, ignore.case = TRUE), ]
f.sub <- slice(f.sub, c(5, 10, 15, 25, 50, 100, 150) ~ soil_color)

# make horizon labels based on slice depth
f.sub$slice <- paste0(f.sub$hzdept, " cm")
f.sub$slice <- factor(f.sub$slice, levels = guessGenHzLevels(f.sub, "slice")$levels)

aggregateColorPlot(aggregateColor(f.sub, "slice"), label.cex = 0.65, main = "Soils Formed on \"Diorite\"")

How Does it Work?

Color “bar” proportions are calculated from a combination of horizon thickness and number of observations. Within each group, a weight is computed for each color and the weights are then normalized to sum to 1:

\[ w_i = \sqrt{\sum{d_i}} \cdot n_i \]

where \(w_i\) is the weight (width) of color bar \(i\), \(\sum{d_i}\) is the total thickness of all horizons associated with color \(i\), and \(n_i\) is the number of horizons in which color \(i\) has been described. Wider bars therefore indicate colors that were described more often, over a greater total thickness of soil material, or both.
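
The weighting can be worked through by hand on a small example. The sketch below uses base R and made-up horizon data for a single group; it follows the formula as stated in this document and is not the aggregateColor source code.

```r
# made-up horizons from one group: color and depths in cm
hz <- data.frame(
  color = c("10YR 3/2", "10YR 3/2", "7.5YR 3/2", "10YR 4/3"),
  top = c(0, 0, 10, 15),
  bottom = c(10, 15, 26, 51),
  stringsAsFactors = FALSE
)
hz$thickness <- hz$bottom - hz$top

# total thickness and number of horizons for each color
total <- tapply(hz$thickness, hz$color, sum)
n <- tapply(hz$thickness, hz$color, length)

# weight: square root of total thickness, multiplied by number of horizons
w <- sqrt(total) * n

# normalize so that weights sum to 1 within the group
w.scaled <- w / sum(w)
print(sort(w.scaled, decreasing = TRUE))
```

Here 10YR 3/2 has a total thickness of 25 cm over 2 horizons (weight 10), 10YR 4/3 has 36 cm in 1 horizon (weight 6), and 7.5YR 3/2 has 16 cm in 1 horizon (weight 4), giving scaled weights of 0.5, 0.3, and 0.2. Note that the color described twice receives the largest weight even though a single, thicker horizon has a different color.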

The following block of code demonstrates how aggregateColor works and the type of data that it returns.

# load some example data
data(sp1, package = "aqp")

# upgrade to SoilProfileCollection and convert Munsell colors
sp1$soil_color <- with(sp1, munsell2rgb(hue, value, chroma))
depths(sp1) <- id ~ top + bottom
site(sp1) <- ~group

# generalize horizon names
n <- c("O", "A", "B", "C")
p <- c("O", "A", "B", "C")
sp1$genhz <- generalize.hz(sp1$name, n, p)

# inspect the results
a <- aggregateColor(sp1, "genhz")
print(a)
## $scaled.data
## $scaled.data$O
##   soil_color    weight n.hz   munsell
## 2  #3F2E23FF 0.6782801    2 7.5YR 2/2
## 1  #3D2F21FF 0.1771101    1  10YR 2/2
## 3  #584537FF 0.1446098    1 7.5YR 3/2
## 
## $scaled.data$A
##   soil_color     weight n.hz   munsell
## 1  #3D2F21FF 0.35145069    4  10YR 2/2
## 4  #584537FF 0.23420410    3 7.5YR 3/2
## 8  #755D41FF 0.10666124    1  10YR 4/3
## 2  #3F2E23FF 0.07542088    2 7.5YR 2/2
## 7  #725D4EFF 0.06212829    1 7.5YR 4/2
## 9  #785C44FF 0.05333062    1 7.5YR 4/3
## 3  #554636FF 0.04937456    1  10YR 3/2
## 5  #5D432FFF 0.04727254    1 7.5YR 3/3
## 6  #624126FF 0.02015708    1 7.5YR 3/4
## 
## $scaled.data$B
##    soil_color     weight n.hz   munsell
## 4   #584537FF 0.28423143    5 7.5YR 3/2
## 12  #755D41FF 0.20332270    3  10YR 4/3
## 3   #554636FF 0.12647500    3  10YR 3/2
## 6   #5A452CFF 0.08209742    2  10YR 3/3
## 2   #432C1BFF 0.05519822    2 7.5YR 2/3
## 7   #5D432FFF 0.05349940    2 7.5YR 3/3
## 8   #5F4232FF 0.05174484    2   5YR 3/3
## 10  #725D4EFF 0.02717772    1 7.5YR 4/2
## 13  #805840FF 0.02304104    1   5YR 4/4
## 9   #64402BFF 0.02094184    1   5YR 3/4
## 5   #594439FF 0.02038329    1   5YR 3/2
## 1   #3D2F21FF 0.01797637    1  10YR 2/2
## 14  #8F775BFF 0.01797637    1  10YR 5/3
## 11  #735D50FF 0.01593435    1   5YR 4/2
## 
## $scaled.data$C
##   soil_color     weight n.hz   munsell
## 1  #584537FF 0.27888687    3 7.5YR 3/2
## 5  #735D50FF 0.15180678    2   5YR 4/2
## 7  #7A5C37FF 0.13146853    2  10YR 4/4
## 8  #8F775BFF 0.13146853    2  10YR 5/3
## 6  #755D41FF 0.08900462    1  10YR 4/3
## 4  #725D4EFF 0.06573427    1 7.5YR 4/2
## 2  #5A452CFF 0.06000690    1  10YR 3/3
## 3  #5D432FFF 0.05367180    1 7.5YR 3/3
## 9  #A99174FF 0.03795170    1  10YR 6/3
## 
## 
## $aggregate.data
##   genhz munsell.hue munsell.value munsell.chroma munsell.sigma     col       red     green
## 1     O         5YR             2              2    0.01874053 #423226 0.2598472 0.1941299
## 2     A        2.5Y             3              2    0.02999432 #534130 0.3267851 0.2536673
## 3     B         5YR             3              2    0.03751066 #604A37 0.3764125 0.2910382
## 4     C        10YR             4              3    0.02847749 #705943 0.4374595 0.3479658
##        blue  n
## 1 0.1472077  3
## 2 0.1889345  9
## 3 0.2158563 14
## 4 0.2633552  9
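
The printed result is a list, so individual groups can be inspected like any other data.frame. As a self-contained sketch, the block below re-creates the O group from the values printed above and answers two common questions: do the weights sum to 1, and which color is most frequent? With the real object, the same table would be available as a$scaled.data$O.

```r
# values copied from the printed $scaled.data$O above
O <- data.frame(
  soil_color = c("#3F2E23FF", "#3D2F21FF", "#584537FF"),
  weight = c(0.6782801, 0.1771101, 0.1446098),
  n.hz = c(2, 1, 1),
  munsell = c("7.5YR 2/2", "10YR 2/2", "7.5YR 3/2"),
  stringsAsFactors = FALSE
)

# weights are normalized within each group
print(sum(O$weight))

# most frequent color in this group, in Munsell notation
print(O$munsell[which.max(O$weight)])
```

Rows are already sorted by decreasing weight, so the first row of each group is the color with the widest bar in the corresponding plot.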