Kean Publications

Evaluation of conditioned Latin hypercube sampling for soil mapping based on a machine learning method

Lin Yang, Nanjing University
Xinming Li, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Jingjing Shi, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Feixue Shen, Nanjing University
Feng Qi, Kean University
Binbo Gao, China Agricultural University
Ziyue Chen, Beijing Normal University
A. Xing Zhu, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Chenghu Zhou, Nanjing University

Document Type

Article

Publication Date

6-15-2020

Abstract

Sampling design plays an important role in soil survey and soil mapping. Conditioned Latin hypercube sampling (cLHS) has proven to be an efficient sampling strategy and is widely used in digital soil mapping. Because cLHS samples are randomly selected within each stratum of the environmental variables, the resulting sample sets can vary significantly between runs at the same sample size. Although the variation in mapping accuracy caused by the randomness of cLHS has been recognized and mentioned qualitatively in past studies, how this randomness quantitatively influences mapping accuracy has rarely been examined. In this study, we conducted experiments to examine how sample randomness quantitatively influences soil mapping accuracy at different sample sizes, and analyzed the possible reasons from a pedogenesis perspective. The results showed that the largest range of mapping accuracies across 500 repeats was 39.5% at a sample density of 2.59 points/km², while the smallest range was 7.3% at the maximum sample size, with a sample density of 32.47 points/km². The sample density needed for satisfactory prediction accuracy in our study area was at least 10.06 points/km². The results also showed that both the allocation of sample points to each soil series and the typicality of the sample points played important roles in mapping accuracy. The underlying causes of the unstable performance of cLHS at small sample sizes, however, were the imbalanced class distribution of soil series and the overlap between soil series in the distribution of environmental covariates. Researchers need to be cautious about the output when applying cLHS at low sampling densities. Effective approaches to address this issue include increasing the sample size, checking the sample allocations of a cLHS design against legacy soil maps, or adding the legacy soil map as a variable during sampling design. When sampling resources and legacy soil maps are limited for an area, fuzzy k-means clustering sampling could be a potential alternative. This study provides a useful reference for better understanding the uncertainty of cLHS when the sample density is small and for selecting alternative sampling methods accordingly.
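
The run-to-run variability described in the abstract can be illustrated with a small sketch. It is not the authors' code or the reference cLHS implementation: the data are synthetic covariates, the function names (quantile_edges, objective, simplified_clhs) are hypothetical, and a plain random-swap hill climb stands in for the simulated annealing search that cLHS normally uses. It only shows that runs with different random seeds, at the same sample size, can satisfy the stratification criterion about equally well while selecting largely different points.

```python
import random
from bisect import bisect_right

def quantile_edges(values, n):
    """Interior cut points splitting `values` into n equal-count strata."""
    s = sorted(values)
    return [s[(k * len(s)) // n] for k in range(1, n)]

def objective(X, idx, edges):
    """cLHS-style misfit: sum over covariates and strata of |count - 1|."""
    n = len(idx)
    total = 0
    for j, cuts in enumerate(edges):
        counts = [0] * n
        for i in idx:
            counts[bisect_right(cuts, X[i][j])] += 1
        total += sum(abs(c - 1) for c in counts)
    return total

def simplified_clhs(X, n, seed, iters=3000):
    """Random-swap hill climb (an assumption; real cLHS uses annealing)."""
    rng = random.Random(seed)
    edges = [quantile_edges([row[j] for row in X], n)
             for j in range(len(X[0]))]
    idx = rng.sample(range(len(X)), n)      # random starting design
    best = objective(X, idx, edges)
    for _ in range(iters):
        new = rng.randrange(len(X))
        if new in idx:
            continue
        cand = idx[:]
        cand[rng.randrange(n)] = new        # swap one point for a new one
        score = objective(X, cand, edges)
        if score <= best:                   # keep swaps that do not worsen fit
            idx, best = cand, score
    return sorted(idx), best

# Synthetic stand-in for environmental covariates: 1000 locations, 3 variables.
data_rng = random.Random(0)
X = [[data_rng.random() for _ in range(3)] for _ in range(1000)]

# Same sample size, different seeds: compare the selected point sets.
runs = [simplified_clhs(X, n=20, seed=s) for s in range(5)]
for s, (idx, score) in enumerate(runs):
    shared = len(set(idx) & set(runs[0][0]))
    print(f"seed={s} misfit={score} points shared with seed 0: {shared}")
```

In a study like the one summarized here, each such sample set would then be used to train the mapping model, and the spread of the resulting accuracies across repeats would quantify the effect of sampling randomness.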

Publication Title

Geoderma

DOI

10.1016/j.geoderma.2020.114337

Recommended Citation

Yang, Lin; Li, Xinming; Shi, Jingjing; Shen, Feixue; Qi, Feng; Gao, Binbo; Chen, Ziyue; Zhu, A. Xing; and Zhou, Chenghu, "Evaluation of conditioned Latin hypercube sampling for soil mapping based on a machine learning method" (2020). Kean Publications. 1214.
https://digitalcommons.kean.edu/keanpublications/1214

