---
title: "Addressing a Decade-Old 'Continental Fallacy' in Spatial Econometrics"
author: "NJ Talingting"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Addressing a Decade-Old 'Continental Fallacy' in Spatial Econometrics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
> ### **TL;DR: The "Geographic Exclusion" Bug**
> * **The Problem:** For decades, standard contiguity-based spatial models have systematically excluded provinces in island nations. In the Philippines, they "orphan" island provinces (leaving roughly 20% of the nation isolated), producing a fragmented network with only 80.2% connectivity that biases economic predictions.
> * **The Methodology:** Using localized 'palay' (unmilled rice) prices in the Philippines from 2016 to 2025, we show that traditional Queen contiguity fails to capture maritime spillovers: the model residuals retain statistically significant spatial autocorrelation ($p < 0.05$).
> * **The Solution:** `ArchipelagoEngine` recalibrates the network to **100% connectivity** using $k$-nearest neighbours ($k=5$), neutralizing this spatial bias and ensuring that national policy interventions account for the entire archipelago. The same approach can be applied to other archipelagic nations.
---
# Introduction: The Problem of Matrix Rank
The spatial weight matrix ($W$) is the fundamental operator for capturing spillovers. However, in archipelagic topographies like the Philippines, the standard Queen-contiguity operator fails to produce a connected graph because island provinces do not necessarily share land borders.
When the number of disjoint components $nc$ exceeds 1, the spatial weights matrix is, up to a reordering of provinces, block-diagonal: provinces in different blocks place zero weight on one another, and isolated provinces like Batanes have no neighbours at all. This violates the assumption of a unified spatial process.
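The block structure can be illustrated with a minimal base R sketch. The six provinces `A` to `F` and their borders are hypothetical and are not taken from the Philippine data; `F` plays the role of an isolated island province.

```{r}
# Hypothetical provinces: A-B-C on one island, D-E on another, F isolated
provinces <- c("A", "B", "C", "D", "E", "F")
adj <- matrix(0, 6, 6, dimnames = list(provinces, provinces))

# Shared land borders (illustrative only), entered in both directions
edges <- rbind(c("A", "B"), c("B", "C"), c("D", "E"))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Row-standardize, leaving the all-zero row of the isolate untouched
rs <- rowSums(adj)
W <- adj / ifelse(rs > 0, rs, 1)

# No weight links the first island to the other provinces
print(W[c("A", "B", "C"), c("D", "E", "F")])

# The isolate's row sums to zero, so it receives no spatial lag
print(rowSums(W))
```

With the provinces ordered island by island, the nonzero weights sit in diagonal blocks, and the row for `F` is entirely zero.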
# Mathematical Diagnostic of the Queen Model
We tested the 'palay' price residuals using a Spatial Error Model (SEM):
$$y = X\beta + u, \quad u = \lambda Wu + \epsilon$$
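Solving the error equation for $u$ gives the reduced form $u = (I - \lambda W)^{-1}\epsilon$, which is what lets a shock in one province propagate to its neighbours. The sketch below simulates this on a hypothetical six-unit ring; the weights, $\lambda = 0.4$, and $\beta$ are illustrative assumptions, not estimates from the palay data.

```{r}
set.seed(1)
n <- 6
idx <- seq_len(n)

# Hypothetical row-standardized W: each unit has two ring neighbours
W <- matrix(0, n, n)
W[cbind(idx, idx %% n + 1)] <- 0.5        # neighbour on one side
W[cbind(idx, (idx - 2) %% n + 1)] <- 0.5  # neighbour on the other side

lambda <- 0.4            # assumed spatial error parameter
beta <- c(2, 0.5)        # assumed coefficients
X <- cbind(1, rnorm(n))  # intercept plus one regressor
eps <- rnorm(n)

# Reduced form of the error process: u = (I - lambda W)^{-1} eps
u <- solve(diag(n) - lambda * W, eps)
y <- X %*% beta + u

# The simulated u satisfies the structural equation u = lambda W u + eps
print(all.equal(as.vector(u), as.vector(lambda * W %*% u + eps)))
```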
## Connectivity Failure
Figure 1: Standard Queen Logic (Left) vs. ArchipelagoEngine k=5 (Right)
Under Queen contiguity, we observed:
* **Network Connectivity:** 80.2%.
* **Graph Structure:** 23 disjoint components.
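Counting disjoint components amounts to a graph traversal. The following is a minimal base R sketch using breadth-first search on a hypothetical six-node adjacency matrix. `count_components` is an illustrative helper, not part of `ArchipelagoEngine`, and it assumes a symmetric adjacency matrix; its return fields are named after those of `spdep::n.comp.nb`.

```{r}
# Breadth-first search over a symmetric 0/1 adjacency matrix
count_components <- function(adj) {
  n <- nrow(adj)
  comp <- integer(n)  # 0 means "not yet visited"
  nc <- 0L
  for (start in seq_len(n)) {
    if (comp[start] != 0L) next
    nc <- nc + 1L
    comp[start] <- nc
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      # Unvisited neighbours of v join the current component
      nb <- which(adj[v, ] != 0 & comp == 0L)
      comp[nb] <- nc
      queue <- c(queue, nb)
    }
  }
  list(nc = nc, comp.id = comp)
}

# Hypothetical graph: 1-2-3 connected, 4-5 connected, 6 isolated
adj <- matrix(0, 6, 6)
adj[rbind(c(1, 2), c(2, 3), c(4, 5))] <- 1
adj <- adj + t(adj)

res <- count_components(adj)
print(res$nc)       # 3 components
print(res$comp.id)  # 1 1 1 2 2 3
```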
## The Moran's I Paradox
We calculated the Global Moran's $I$ on the residuals. A well-specified model should leave residuals with no remaining spatial autocorrelation (spatial white noise).
| Metric | Queen Model | KNN ($k=5$) |
|:--- |:--- |:--- |
| **Moran's I** | 0.024 | -0.001 |
| **p-value** | 0.041 (Significant) | 0.485 (Randomized) |
The Queen model's $p < 0.05$ rejects the null hypothesis of spatially random residuals, indicating **systematic predictive bias**. This is consistent with the model failing to account for maritime trade routes, which forces the error term to absorb the remaining spatial structure.
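For reference, the statistic itself is $I = \frac{n}{S_0}\,\frac{e^\top W e}{e^\top e}$, where $e$ is the vector of centred residuals and $S_0$ is the sum of all weights. A minimal base R sketch on a hypothetical four-unit chain follows; `morans_i` is an illustrative helper and does not reproduce the inference behind the table above.

```{r}
# Global Moran's I for a residual vector e and a weights matrix W
morans_i <- function(e, W) {
  e <- e - mean(e)  # centre the residuals
  n <- length(e)
  S0 <- sum(W)      # sum of all weights
  as.numeric((n / S0) * (t(e) %*% W %*% e) / sum(e^2))
}

# Hypothetical chain of four units: 1-2-3-4, binary weights
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)

# Similar residuals sit next to each other: positive autocorrelation
print(morans_i(c(1, 1, -1, -1), W))  # 1/3

# Residuals alternate in sign between neighbours: negative autocorrelation
print(morans_i(c(1, -1, 1, -1), W))  # -1
```

A value near zero, as in the KNN column of the table, is what spatially random residuals would produce.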
# Recalibration via KNN ($k=5$)
`ArchipelagoEngine` solves this by tuning the neighbour count $k$. At $k=5$, we achieve a single connected component ($nc=1$), ensuring that the spatial multiplier effect can propagate through the entire archipelago.
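```{r, message = FALSE, warning = FALSE}
library(ArchipelagoEngine)
library(spdep)

# Verify that the KNN weights form a single connected network.
# `raw_data` is assumed to be the province-level data set loaded beforehand.
w_knn <- build_archipelago_weight(raw_data, k = 5)

# Count the disjoint components of the neighbour graph
S_knn <- spdep::n.comp.nb(w_knn$neighbours)
print(S_knn$nc) # national connectivity requires nc == 1
```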
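The idea of searching for the smallest $k$ that yields $nc = 1$ can be sketched in base R. This is an illustration under stated assumptions, not the package's internal routine: the seven centroids are hypothetical, the helper names are invented for this example, and a link is treated as mutual whenever either point lists the other among its $k$ nearest neighbours.

```{r}
# Hypothetical centroids: two island clusters plus one remote point
coords <- rbind(c(0, 0), c(1, 0), c(0, 1),
                c(10, 0), c(11, 0), c(10, 1),
                c(4, 20))

# Symmetrized k-nearest-neighbour adjacency from Euclidean distances
knn_adjacency <- function(coords, k) {
  d <- as.matrix(dist(coords))
  diag(d) <- Inf  # a point is never its own neighbour
  n <- nrow(d)
  adj <- matrix(0, n, n)
  for (i in seq_len(n)) adj[i, order(d[i, ])[seq_len(k)]] <- 1
  pmax(adj, t(adj))  # assumption: a one-sided link counts as mutual
}

# Number of components via repeated squaring of the reachability matrix
n_components <- function(adj) {
  n <- nrow(adj)
  reach <- (adj + diag(n) > 0) * 1
  for (i in seq_len(ceiling(log2(n)))) reach <- (reach %*% reach > 0) * 1
  nrow(unique(reach))  # nodes in one component share a reachability row
}

# Smallest k for which the graph is a single component
smallest_connecting_k <- function(coords, k_max = nrow(coords) - 1) {
  for (k in seq_len(k_max)) {
    if (n_components(knn_adjacency(coords, k)) == 1) return(k)
  }
  NA_integer_
}

print(n_components(knn_adjacency(coords, 1)))  # 2: the clusters stay apart
print(smallest_connecting_k(coords))           # 2: the remote point bridges them
```

In this toy layout the remote point links the two clusters once it is allowed a second neighbour. On real data the connecting $k$ depends on the geometry of the islands.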
## Implications
We conclude that Queen-contiguity is misspecified for fragmented topographies like the Philippine archipelago. Using a fragmented $W$ matrix results in "geographic exclusion." By enforcing 100% connectivity, we ensure that spatial spillovers are mathematically representative of the entire nation. This recalibration ensures that national-scale econometric models maintain global connectivity, thereby neutralizing the structural bias inherent in disjoint spatial weights and providing a more robust framework for analyzing cross-regional economic trends.
Furthermore, the transition from a disjoint Queen matrix to a connected KNN matrix ($nc = 1$) gives every province at least one neighbour. Assuming $W$ is row-standardized, it then has no zero rows and its spectral radius is well-behaved ($\rho(W) = 1$), allowing for stable maximum likelihood estimation of the spatial autoregressive parameter $\lambda$. Specifically, $I - \lambda W$ is nonsingular for every $|\lambda| < 1$, so the Jacobian term $\ln|I - \lambda W|$ in the log-likelihood function is well-defined over the feasible range of $\lambda$.
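These two properties can be checked numerically. The sketch below uses a hypothetical five-unit chain in which every unit has at least one neighbour, and it assumes row-standardized weights; it does not use the Philippine weights matrix.

```{r}
# Hypothetical chain of five units, every unit with at least one neighbour
n <- 5
adj <- matrix(0, n, n)
adj[cbind(1:4, 2:5)] <- 1
adj <- adj + t(adj)

# Row-standardize so that every row sums to one
W <- adj / rowSums(adj)

# Spectral radius: largest eigenvalue modulus (1 up to rounding)
rho <- max(Mod(eigen(W, only.values = TRUE)$values))
print(rho)

# Log-determinant ln|I - lambda W| on a grid inside (-1, 1)
lambdas <- seq(-0.9, 0.9, by = 0.3)
log_det <- sapply(lambdas, function(l) {
  as.numeric(determinant(diag(n) - l * W, logarithm = TRUE)$modulus)
})
print(data.frame(lambda = lambdas, log_det = log_det))
```

Every value on the grid is finite, which is the property the likelihood evaluation relies on.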
## Limitations and Recommendations
A key limitation of `ArchipelagoEngine` is its reliance on spatial proximity, which may force arbitrary topological connections between islands that lack real-world functional interaction. While this geometric abstraction is an inherent trade-off of the model, integrating empirical transport data, such as the Roll-on/Roll-off (RoRo) networks in the `roroph` package, offers a more realistic representation of maritime connectivity.
However, researchers must note that RoRo networks can be endogenous to the system, shaped by the very economic or geographic factors the model seeks to analyze. Supplementary methods such as instrumental-variable (IV) estimation should therefore be used to control for this potential endogeneity.