
Spatial bootstrapping using deep clustering methods: Spatial machine learning applied to Lombardy high-tech businesses

Bumbea Alessio; Mazzitelli Andrea; Giuffrida Annamaria
2025-01-01

Abstract

Bootstrap and clustering techniques are foundational tools across scientific disciplines and play a particularly important role in spatial analysis. However, traditional bootstrap methods often fail to preserve spatial dependencies and complex attribute relationships during resampling. In this work, we introduce a novel framework in the Spatial Machine Learning domain that leverages deep learning techniques to enhance stratified bootstrap procedures for spatial data. Deep learning has already revolutionized prediction and classification tasks on data with temporal and spatial dependencies; here we extend its scope of application to bootstrap analysis using tools such as entity embeddings and autoencoders. By encoding high-cardinality categorical variables into continuous representations, entity embeddings facilitate the discovery of meaningful spatial and attribute-based clusters. These embeddings are then passed to a Deep Embedded Clustering (DEC) algorithm, which uses them to form clusters. DEC handles high-dimensional big data through an autoencoder-based architecture that performs dimensionality reduction and clustering simultaneously to avoid loss of information. Finally, the resulting clusters serve as strata that guide a stratified bootstrap approach that preserves spatial autocorrelation and heterogeneity. We demonstrate the utility of our framework with a bootstrap analysis of high-tech firm productivity in the Lombardy region. Our approach efficiently analyzes large amounts of high-dimensional data with complex attributes.
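
The final step described above, using cluster labels as strata for a stratified bootstrap, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stratified_bootstrap`, the toy productivity values, and the synthetic cluster labels are all assumptions made for illustration, and the labels stand in for the output of a clustering step such as DEC.

```python
import numpy as np

def stratified_bootstrap(values, strata, n_boot=1000, stat=np.mean, seed=0):
    """Resample with replacement within each stratum, keeping stratum sizes fixed.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    strata = np.asarray(strata)
    # Index positions of the observations belonging to each stratum.
    groups = [np.flatnonzero(strata == s) for s in np.unique(strata)]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Draw as many indices as the stratum holds, then pool all strata.
        idx = np.concatenate(
            [rng.choice(g, size=g.size, replace=True) for g in groups]
        )
        estimates[b] = stat(values[idx])
    return estimates

# Toy data (assumed): three clusters of firms with different mean productivity.
rng = np.random.default_rng(42)
labels = np.repeat([0, 1, 2], [50, 30, 20])
productivity = rng.normal(loc=np.array([1.0, 2.0, 3.0])[labels], scale=0.5)

boot = stratified_bootstrap(productivity, labels, n_boot=500)
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"sample mean = {productivity.mean():.3f}")
print(f"95% percentile interval = ({ci_low:.3f}, {ci_high:.3f})")
```

Because each stratum is resampled separately and at its original size, every bootstrap replicate keeps the cluster composition of the data, which is the property a stratified scheme relies on to retain the heterogeneity between clusters.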
2025
Deep clustering
Entity embedding
Spatial bootstrap
Spatial correlation
Spatial machine learning

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12606/31325