Correlation matrices building for covariance localization

Description

This notebook demonstrates how to build the correlation matrices (\(\rho_{MD}\)) and (\(\rho_{DD}\)) that are used to localize the covariance matrices by element-wise multiplication (Schur product or Hadamard product) of the original covariance matrices:

\(\tilde{C}^{l}_{MD} = \rho_{MD} \odot C^{l}_{MD}\)

\(\tilde{C}^{l}_{DD} = \rho_{DD} \odot C^{l}_{DD}\)

with \(\odot\) the element wise multiplication.

Once built these matrices can be passed to the constructor of an ESMDA instance using the keywords md_correlation_matrix and dd_correlation_matrix respectively.

Motivation

The computational cost of ESMDA and other ensemble smoothers mainly depends on the ensemble size since the computation of the prediction vectors d_pred requires \(N_{e}\) simulations at each assimilation. Consequently, the number of ensemble realizations should be kept as small as possible to reduce computational burden. However, when the size of the ensemble chosen for inversion is too small, it might not be statistically representative of the variability of the unknowns: The low rank approximation of the error covariance matrix introduces sampling errors which decrease proportionally to \(1/\sqrt(N_{e})\) [1]. For small \(N_{e}\), long-range spurious correlations might appear causing the filter divergence. Covariance localization (CL) is a technique developed to overcome these issues.

The method relies on element-wise multiplication of the original covariance matrices by space and (optionally) time dependent correlation function that reduces the correlations between space and time distant observations and parameters.

Note that the space and time correlation matrices are computed independently and are gathered afterwards by element-wise multiplication.

References

[1] G. Evensen. The Ensemble Kalman Filter: Theoretical formulation and practical implementation. Ocean Dynamics, 53(4):343–367, 2003. ISSN 16167341. doi: 10.1007/s10236-003-0036-9.

[2] Todaro, V. (2021). Advanced techniques for solving groundwater and surface water problems in the context of inverse methods and climate change [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/166439

  • Import the required modules

[2]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import distance_matrix
import numpy as np
import numpy.typing as npt
  • Type hints for numpy

[3]:
NDArrayFloat = npt.NDArray[np.float64]
  • Define some configurations for the plots

[4]:
# Some configurations for the plots
new_rc_params = {
    "font.family": "sans-serif",
    "font.sans-serif": ["Helvetica", "DejaVu Sans"],
    "font.size": 16,
    "text.usetex": False,
    "savefig.format": "svg",
    "svg.fonttype": "none",  # to store text as text, not as path
    "savefig.facecolor": "w",
    "savefig.edgecolor": "k",
    "savefig.dpi": 300,
    "figure.constrained_layout.use": True,
    "figure.facecolor": "w",
}
csfont = {"fontname": "Comic Sans MS"}
hfont = {"fontname": "Helvetica"}
plt.rcParams.update(new_rc_params)

Correlation functions

For both time and space correlation matrices, we need to functions that transform distances into weights in the interval [0-1]. Intuitively, the weight should decrease when the distance increases. Probability density functions are quite interesting to achieve that. - An example is a reversed beta cumulative distribution function.

[5]:
def reversed_beta_cumulative(x: NDArrayFloat, beta: float = 3) -> NDArrayFloat:
    """
    Transform the input with 1/ (1 + (1 / (1 - x)) ^ beta)

    Parameters
    ----------
    x : np.ndarray
        Input array. Idealy, values should be between 0. and 1.0.
    beta: float, optional
        Shape factor. Must be strictly positive. The default is 3.

    Returns
    -------
    np.ndarray.
        Array of same dimension as input array.

    """
    if beta < 0:
        raise ValueError(f"Beta ({beta}) should be strictly positive !")

    x2 = x.copy()
    x2[x <= 0] = np.nan
    x2[x >= 1] = np.nan
    fact = np.where(np.isnan(x2), 0.0, 1.0 / (1.0 + np.power((x2 / (1 - x2)), -beta)))
    fact[x >= 1] = 1
    return 1 - fact
[6]:
x = np.linspace(0, 1.1, 20)
plt.plot(x, reversed_beta_cumulative(x))
[6]:
[<matplotlib.lines.Line2D at 0x7fc239b32280>]
../_images/tutorials_correlation_matrices_building_8_1.png
  • When plotting the function, it can be seen that after 1.0, the function is always 0.0. To overcome this issue, we apply a scaling factor by wrapping it in a distance_to_weight function

[7]:
def distance_to_weight_beta_cumulative(
    x: NDArrayFloat, beta: float = 3, scaling_factor: float = 20
) -> NDArrayFloat:
    """
    Transform the input with 1/ (1 + (1 / (1 - x / scaling_factor)) ^ beta)

    Parameters
    ----------
    x : NDArrayFloat
        Input array. Idealy, values should be between 0. and 1.0.
    beta: float, optional
        Shape factor. The smalest beta, the slower the variation, the higher beta
        the sharpest the transition (tends to a dirac function). Must be strictly
        positive. The default is 3.
    scaling_factor: float, optional
        The scaling factor. At 0, the function equals 1.0, at half the scaling factor,
        it equals 0.5, and at the scaling factor, is equals zero.
        The default is 20.

    Returns
    -------
    NDArrayFloat.
        Array of same dimension as input array.

    """
    if scaling_factor <= 0.0:
        raise ValueError(f"The scaling factor ({scaling_factor}) should be strictly positive !")
    return reversed_beta_cumulative(x / scaling_factor, beta=beta)


x = np.linspace(0, 40, 41)
scaling_factor = 30  # meters
for beta in [0.0, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0, 20.0]:
    plt.plot(x, reversed_beta_cumulative(x / scaling_factor, beta=beta), label=f"beta = {beta}")
plt.legend()
[7]:
<matplotlib.legend.Legend at 0x7fc237a91130>
../_images/tutorials_correlation_matrices_building_10_1.png

Note: many other function could be used. In her PdD manuscript, Todaro (2021) implement a fifth order correlation function proposed by Gaspari & Cohn (1999) which acts very similarly but without beta parameter.

See: Gaspari, G. & Cohn, S. E. (1999), ‘Construction of correlation functions in two and three dimensions’, Quarterly Journal of the Royal Meteorological Society 125(554), 723–757

[8]:
def _part1(d: float) -> float:
    return (
        -1 / 4 * d ** 5.0
        + 1 / 2 * d ** 4.0
        + 5 / 8 * d ** 3.0
        - 5 / 3 * d ** 2.0
        + 1.0
    )

def _part2(d: float) -> float:
    return (
        1 / 12 * d ** 5.0
        - 1 / 2 * d ** 4.0
        + 5 / 8 * d ** 3
        + 5 / 3 * d ** 2.0
        - 5.0 * d
        + 4.0
        - 2 / 3 * (d ** (-1.0))
    )

def distance_to_weight_fifth_order(
    x: NDArrayFloat, threshold: float = 20
) -> NDArrayFloat:
    """
    Transform a distance into a weight. See Gaspari & Cohn (1999).

    Parameters
    ----------
    x : NDArrayFloat
        Input distance values.
    threshold: float, optional
        Threshold distance at which the weight becomes zero.

    Returns
    -------
    NDArrayFloat.
        Array of same dimension as input array.

    Reference
    ---------
    Gaspari, G. & Cohn, S. E. (1999), ‘Construction of correlation functions in two
    and three dimensions’, Quarterly Journal of the Royal
    Meteorological Society 125(554), 723–757
    """
    if threshold <= 0:
        raise ValueError(f"The threshold ({threshold}) should be strictly positive !")

    x2 = x.copy() / threshold

    x2[x2 < 0] = np.nan
    x2[x2 >= 2.0] = np.nan

    return np.where(np.isnan(x2), 0.0, np.where(x2 >= 1.0, _part2(np.where(x2 <= 0.0, np.nan, x2)), _part1(x2)))


# plt.plot(x, distance_to_weight_fifth_order(x, threshold=20))

for threshold in [0.5, 1.0, 2.0, 3.0, 5.0, 10.0, 20.0]:
    plt.plot(x, distance_to_weight_fifth_order(x, threshold), label=f"threshold = {threshold}")
plt.legend()
[8]:
<matplotlib.legend.Legend at 0x7fc237949400>
../_images/tutorials_correlation_matrices_building_12_1.png

General procedure

No matter the matrix being built, a straightforward approach is: 1) Create vectors of coordinates (time or space) 2) Compute the distance matrix between two vectors of coordinates obtained in 1) (param-obs or obs-obs) 3) Apply a distance_to_weight function to the matrix from 2)

Parameters - observations correlation matrix

Spatial correlation matrix

The goal is to build the covariance matrix between the adjusted parameters and the observations.

For simplicity, let’s consider: - A regular (4, 5) grid with non square meshes = different dx & dy. - A single parameter adjusted over the all grid, - 3 observation locations with 10, 9 and 15 observation values for the three locations respectively.

The correlation matrix is expected to have the dimensions (\(N_{m}\), \(N_{obs}\)) with \(N_{m}\) the number of adjusted parameter values = the number of gridcells, and \(N_{obs}\) the number of observations (10 + 9 + 15 = 34).

  • Define and plot the grid

[ ]:
from hytecio.utils import node
[53]:
# grid dimensions
nx = 4
dx = 50
ny = 5
dy = 30

# Vectors of x and y coordinates (case of a regular grid)
x_vect_coords = np.linspace(0, (nx - 1) * dx, nx)
y_vect_coords = np.linspace(0, (ny - 1) * dy, ny)

# Build grid coordinates from it -> 2D array with dimensions (nx * ny, 2)
coordinates = np.array(np.meshgrid(x_vect_coords, y_vect_coords), order="F").reshape(2, -1).T

# 20 cells expected
print(coordinates.shape)

# mesh indices
mesh_indices = np.arange(nx * ny).reshape(ny, nx)

# Show the mesh indices
plt.imshow(mesh_indices, origin="lower", extent=(0.0, nx * dx, 0.0, ny * dy))
plt.colorbar(label="mesh index")
plt.xlabel("X (m)")
plt.ylabel("Y (m)")
(20, 2)
[53]:
Text(0, 0.5, 'Y (m)')
../_images/tutorials_correlation_matrices_building_16_2.png
[54]:
from hytecio.utils import hytec_node_number_to_indices
[55]:
hytec_node_number_to_indices(np.arange(nx * ny), nx=nx, ny=ny)
[55]:
(array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]),
 array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]),
 array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
[56]:
coordinates
[56]:
array([[  0.,   0.],
       [ 50.,   0.],
       [100.,   0.],
       [150.,   0.],
       [  0.,  30.],
       [ 50.,  30.],
       [100.,  30.],
       [150.,  30.],
       [  0.,  60.],
       [ 50.,  60.],
       [100.,  60.],
       [150.,  60.],
       [  0.,  90.],
       [ 50.,  90.],
       [100.,  90.],
       [150.,  90.],
       [  0., 120.],
       [ 50., 120.],
       [100., 120.],
       [150., 120.]])
  • Define and plot the observation locations

[57]:
# number of values per observartion locations
n_obs_list = [10, 9, 15]

obs_grid_indices = [(1, 1), (3, 3), (1, 3)]
obs_grid_positions = [(i * dx, j * dy) for i, j in obs_grid_indices]

grid_obs = np.zeros((ny, nx))

for x_index, y_index in obs_grid_indices:
    grid_obs[y_index, x_index] = 1.0

# plot the observation locations in the grid
plt.imshow(grid_obs, origin="lower", extent=(0.0, nx * dx, 0.0, ny * dy))
plt.colorbar(label="1 = observation location")
plt.xlabel("X (m)")
plt.ylabel("Y (m)")
[57]:
Text(0, 0.5, 'Y (m)')
../_images/tutorials_correlation_matrices_building_21_1.png
  • The vector of parameter coordinates is already built. Let’s do the same for the observations

[58]:
# one coordinates set per obs value
obs_val_coords = np.concatenate(
    [[obs_grid_positions[i]] * n_obs_list[i] for i in range(len(obs_grid_positions))],
    axis=0,
)
  • The distance and weight matrices can be directly computed from these two vectors

[59]:
# distance matrix computation
md_space_distances_mat = distance_matrix(coordinates, obs_val_coords)
# weight distance computation
md_space_correlation_matrix = distance_to_weight_beta_cumulative(
    md_space_distances_mat, beta=3, scaling_factor=150
)

print(f"md_space_correlation_matrix.shape = {md_space_correlation_matrix.shape}")

# Plot the correlation matrix
plt.imshow(md_space_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Params")
plt.colorbar(label="weights")
md_space_correlation_matrix.shape = (20, 34)
[59]:
<matplotlib.colorbar.Colorbar at 0x7fc2159b4c40>
../_images/tutorials_correlation_matrices_building_25_2.png

Time correlation matrix

If both adjusted parameters and observations depend on time, it might be interesting to add time correlations (not alwayd relevant). The procedure is the same than for the spatial correlation but with time coordinates. For the simplicity of the example, it is assumed that the time of acquisition of the observation is equal to two indices in the value table, and that all parameters are at time 0.0.

[60]:
# one time per obs value. Note that we need to insert a column filled with zeros (2d array expected).
obs_val_times = np.zeros((sum(n_obs_list), 2))
obs_val_times[:, 0] = np.concatenate(
    [range(n_obs_list[i]) for i in range(len(n_obs_list))]
)
[61]:
param_vals_times = np.zeros((nx * ny, 2))
[62]:
# distance matrix computation
md_time_distances_mat = distance_matrix(param_vals_times, obs_val_times)
# weight distance computation
md_time_correlation_matrix = distance_to_weight_beta_cumulative(
    md_time_distances_mat, beta=3, scaling_factor=15
)

print(f"md_time_correlation_matrix.shape = {md_time_correlation_matrix.shape}")

# Plot the correlation matrix
plt.imshow(md_time_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Params")
plt.colorbar(label="weights")
md_time_correlation_matrix.shape = (20, 34)
[62]:
<matplotlib.colorbar.Colorbar at 0x7fc2158e7910>
../_images/tutorials_correlation_matrices_building_29_2.png

Merging space and time correlation matrices

Simply apply and element-wise multiplication

[63]:
md_correlation_matrix = md_space_correlation_matrix * md_time_correlation_matrix

# Plot the correlation matrix
plt.imshow(md_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Params")
plt.colorbar(label="weights")
[63]:
<matplotlib.colorbar.Colorbar at 0x7fc21592cb20>
../_images/tutorials_correlation_matrices_building_31_1.png

Observations - observations correlation matrix

Spatial correlation matrix

As for the parameter-observable covariance matrix, we use the distance between the observation locations to build the matrix.

[64]:
# distance matrix computation
obs_distances_mat = distance_matrix(obs_val_coords, obs_val_coords)
# weight distance computation
dd_space_correlation_matrix = distance_to_weight_beta_cumulative(
    obs_distances_mat, beta=3, scaling_factor=150
)

print(f"dd_space_correlation_matrix.shape = {dd_space_correlation_matrix.shape}")

# Plot the correlation matrix
plt.imshow(dd_space_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Obs")
plt.colorbar(label="weights")
dd_space_correlation_matrix.shape = (34, 34)
[64]:
<matplotlib.colorbar.Colorbar at 0x7fc2157512e0>
../_images/tutorials_correlation_matrices_building_34_2.png

Time correlation matrix

One possibility is also to decorrelate the observations having been made at distant times. The procedure to build the matrix is very similar to the one used to build the spatial covariance matrix. The only difference is that temporal coordinates are used and not spatial coordinates.

[66]:
# one time per obs value. Note that we need to insert a zero column.
obs_val_times = np.zeros((sum(n_obs_list), 2))
obs_val_times[:, 0] = np.concatenate(
    [range(n_obs_list[i]) for i in range(len(n_obs_list))]
)
# time-distance matrix computation
obs_time_distances_mat = distance_matrix(obs_val_times, obs_val_times)
# weight distance computation
dd_time_correlation_matrix = distance_to_weight_beta_cumulative(
    obs_time_distances_mat, beta=3, scaling_factor=10
)

# Plot the correlation matrix
plt.imshow(dd_time_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Obs")
plt.colorbar(label="weights")
[66]:
<matplotlib.colorbar.Colorbar at 0x7fc2157042e0>
../_images/tutorials_correlation_matrices_building_36_1.png

Merging space and time localization matrices

[68]:
dd_correlation_matrix = dd_space_correlation_matrix * dd_time_correlation_matrix

# Plot the correlation matrix
plt.imshow(dd_correlation_matrix)
plt.xlabel("Obs")
plt.ylabel("Obs")
plt.colorbar(label="weights")
[68]:
<matplotlib.colorbar.Colorbar at 0x7fc21555c8b0>
../_images/tutorials_correlation_matrices_building_38_1.png