pyesmda.ESMDA

class pyesmda.ESMDA(obs: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], m_init: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], cov_obs: covmats._covariances.CovarianceMatrix, forward_model: typing.Callable[[...], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]], forward_model_args: typing.Sequence[typing.Any] = (), forward_model_kwargs: typing.Dict[str, typing.Any] | None = None, n_assimilations: int = 4, inversion_type: pyesmda._inversion.ESMDAInversionType | str = ESMDAInversionType.SUBSPACE_RESCALED, cov_obs_inflation_factors: typing.Sequence[float] | None = None, cov_mm_inflation_factor: float = 1.0, C_DD_localization: pyesmda._localization.LocalizationStrategy = <pyesmda._localization.NoLocalization object>, C_MD_localization: pyesmda._localization.LocalizationStrategy = <pyesmda._localization.NoLocalization object>, m_bounds: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] | None = None, save_ensembles_history: bool = False, seed: int | None = None, is_forecast_for_last_assimilation: bool = True, random_state: int | numpy.random._generator.Generator | numpy.random.mtrand.RandomState | None = 198873, batch_size: int = 5000, is_parallel_analyse_step: bool = True, truncation: float = 0.99, logger: logging.Logger | None = None)

Ensemble Smoother with Multiple Data Assimilation.

Implement the ES-MDA as proposed by Emerick, A. A. and A. C. Reynolds [Emerick and Reynolds, 2013, Emerick and Reynolds, 2013].
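
For orientation, the analysis step of ES-MDA is usually written as below. This is the textbook form from the cited papers, not a statement about how this class evaluates it internally. The symbols \(C_{md}\), \(C_{dd}\) and \(C_{d}\) follow the notation used elsewhere on this page (parameters-predictions cross-covariance, predictions auto-covariance and observation error covariance); \(m_{j}\), \(d_{j}\), \(z_{j}\) and the step index \(i\) are notation assumed here for illustration.

\[
m_{j}^{i+1} = m_{j}^{i} + C_{md}^{i} \left( C_{dd}^{i} + \alpha_{i} C_{d} \right)^{-1} \left( d_{\mathrm{obs}} + \sqrt{\alpha_{i}} \, C_{d}^{1/2} z_{j} - d_{j}^{i} \right), \qquad z_{j} \sim \mathcal{N}(0, I), \quad j = 1, \dots, N_{e},
\]

\[
\sum_{i=1}^{N_{a}} \frac{1}{\alpha_{i}} = 1 .
\]

Here \(\alpha_{i}\) is the inflation factor applied to \(C_{d}\) at assimilation step \(i\), and \(d_{j}^{i}\) is the prediction of the forward model for ensemble member \(j\).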

__init__(obs: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], m_init: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]], cov_obs: covmats._covariances.CovarianceMatrix, forward_model: typing.Callable[[...], numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]]], forward_model_args: typing.Sequence[typing.Any] = (), forward_model_kwargs: typing.Dict[str, typing.Any] | None = None, n_assimilations: int = 4, inversion_type: pyesmda._inversion.ESMDAInversionType | str = ESMDAInversionType.SUBSPACE_RESCALED, cov_obs_inflation_factors: typing.Sequence[float] | None = None, cov_mm_inflation_factor: float = 1.0, C_DD_localization: pyesmda._localization.LocalizationStrategy = <pyesmda._localization.NoLocalization object>, C_MD_localization: pyesmda._localization.LocalizationStrategy = <pyesmda._localization.NoLocalization object>, m_bounds: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] | None = None, save_ensembles_history: bool = False, seed: int | None = None, is_forecast_for_last_assimilation: bool = True, random_state: int | numpy.random._generator.Generator | numpy.random.mtrand.RandomState | None = 198873, batch_size: int = 5000, is_parallel_analyse_step: bool = True, truncation: float = 0.99, logger: logging.Logger | None = None) -> None

Construct the instance.

Parameters:
  • obs (NDArrayFloat) – Observations vector with dimension \(N_{\mathrm{obs}}\).

  • m_init (NDArrayFloat) – Initial ensemble of parameter vectors with dimensions (\(N_{m}\), \(N_{e}\)).

  • cov_obs (covmats.CovarianceMatrix) – Covariance matrix of the measurement errors of the observed data, with dimensions (\(N_{\mathrm{obs}}\), \(N_{\mathrm{obs}}\)). Also denoted \(R\). It can be a numpy array or a sparse matrix (scipy.sparse).

  • forward_model (callable) – Function calling the non-linear observation model (forward model) for all ensemble members and returning the predicted data for each ensemble member.

  • forward_model_args (Sequence[Any]) – Additional positional arguments for the callable forward_model. The default is an empty tuple.

  • forward_model_kwargs (Optional[Dict[str, Any]]) – Additional kwargs for the callable forward_model. The default is None.

  • n_assimilations (int, optional) – Number of data assimilations (\(N_{a}\)). The default is 4.

  • inversion_type (Union[ESMDAInversionType, str]) – See ESMDAInversionType for more details. The default is ESMDAInversionType.SUBSPACE_RESCALED.

  • cov_obs_inflation_factors (Optional[Sequence[float]]) – Multiplication factors used to inflate the covariance matrix of the measurement errors, one per assimilation. The length of the sequence must match the number of data assimilations (\(N_{a}\)). The default is None.

  • cov_mm_inflation_factor (float) – Factor used to inflate the initial ensemble around its mean. See [Anderson, 2007]. The default is 1.0, i.e., no inflation.

  • C_DD_localization (LocalizationStrategy) – Localization operator \(\rho_{DD}\) applied to the predictions empirical auto-covariance matrices. Expected dimensions of the operator are (\(N_{\mathrm{obs}}\), \(N_{\mathrm{obs}}\)). It can be fixed (defined correlation matrix used for all iterations) or adaptive and even user defined. See implementations of LocalizationStrategy.

  • C_MD_localization (LocalizationStrategy) – Localization operator \(\rho_{MD}\) applied to the parameters-predictions empirical cross-covariance matrices. Expected dimensions of the operator are (\(N_{m}\), \(N_{\mathrm{obs}}\)). It can be fixed (defined correlation matrix used for all iterations) or adaptive and even user defined. See implementations of LocalizationStrategy.

  • m_bounds (Optional[NDArrayFloat], optional) – Lower and upper bounds for the \(N_{m}\) parameter values. Expected dimensions are (\(N_{m}\), 2) with lower bounds on the first column and upper on the second one. The default is None.

  • save_ensembles_history (bool, optional) – Whether to save the history of predictions and parameters over the assimilations. The default is False.

  • seed (Optional[int]) –

    Deprecated since version 0.4.2: Use the parameter random_state instead.

  • is_forecast_for_last_assimilation (bool, optional) – Whether to compute the predictions for the ensemble obtained at the last assimilation step. The default is True.

  • random_state (Optional[Union[int, np.random.Generator, np.random.RandomState]]) – Pseudorandom number generator state used to generate resamples. If random_state is None (or np.random), the numpy.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a Generator or RandomState instance then that instance is used.

  • batch_size (int) – Number of parameters that are assimilated at once. This option is available to overcome memory limitations when the number of parameters is large. In that case, the size of the covariance matrices tends to explode and the update step must be performed by chunks of parameters. The default is 5000.

  • is_parallel_analyse_step (bool, optional) – Whether to use parallel computing for the analysis step when the number of batches is greater than one. It relies on concurrent.futures multiprocessing. The default is True.

  • truncation (float) – A value in the range (0, 1], used to determine the number of significant singular values kept when using SVD for the inversion of \((C_{dd} + \alpha C_{d})\): only the largest singular values are kept, corresponding to this fraction of the sum of the nonzero singular values. The goal of truncation is to work with smaller matrices (dimensionality reduction), which are easier to invert. The default is 0.99.

  • logger (Optional[logging.Logger]) – Optional logging.Logger instance used for event logging. The default is None.
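
The shape conventions in the parameter list above can be checked with a small NumPy-only sketch. Everything in it is illustrative: the sizes, the linear `forward_model` and its `scale` keyword are hypothetical, and the call convention `forward_model(m_ensemble, *forward_model_args, **forward_model_kwargs)` is an assumption inferred from the documented shapes of m_init and d_pred, not something this page states. The choice \(\alpha_{i} = N_{a}\) for every step is the common one in the ES-MDA literature, where the inverses of the factors sum to one; this page only requires that the sequence has length \(N_{a}\).

```python
import numpy as np

# Hypothetical problem sizes, chosen only for illustration.
n_m, n_e, n_obs, n_assimilations = 3, 50, 2, 4

rng = np.random.default_rng(0)

# Initial ensemble: one column per ensemble member, shape (N_m, N_e).
m_init = rng.normal(size=(n_m, n_e))

# Observations vector, shape (N_obs,).
obs = np.array([1.0, -0.5])

# Lower bounds in the first column, upper bounds in the second: shape (N_m, 2).
m_bounds = np.column_stack([np.full(n_m, -5.0), np.full(n_m, 5.0)])

# One inflation factor per assimilation. With alpha_i = N_a the inverses
# sum to one (an assumption taken from the ES-MDA literature).
cov_obs_inflation_factors = [float(n_assimilations)] * n_assimilations


def forward_model(m_ensemble, scale=1.0):
    """Hypothetical linear forward model mapping (N_m, N_e) to (N_obs, N_e)."""
    g = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])  # shape (N_obs, N_m)
    return scale * (g @ m_ensemble)


# Predictions for all members at once, as the forward_model entry describes.
d_pred = forward_model(m_init, scale=1.0)
```

Note that cov_obs is documented as a covmats.CovarianceMatrix, so the sketch deliberately stops short of constructing an ESMDA instance.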

Properties

C_DD_localization

Localization operator \(\rho_{DD}\) applied to the predictions empirical auto-covariance matrices; Expected dimensions of the operator are (\(N_{obs}\), \(N_{obs}\)); It can be fixed (defined correlation matrix used for all iterations) or adaptive and even user defined; See implementations of pyesmda.LocalizationStrategy.

C_MD_localization

Localization operator \(\rho_{MD}\) applied to the parameters-predictions empirical cross-covariance matrices; Expected dimensions of the operator are (\(N_{m}\), \(N_{obs}\)); It can be fixed (defined correlation matrix used for all iterations) or adaptive and even user defined; See implementations of pyesmda.LocalizationStrategy.

anomalies

Return the matrix of anomalies.

batch_size

Number of parameters that are assimilated at once; This option is available to overcome memory limitations when the number of parameters is large; In that case, the size of the covariance matrices tends to explode and the update step must be performed by chunks of parameters.

cov_dd

Autocovariance matrix of the predicted data; Dimensions are (\(N_{obs}, N_{obs}\)).

cov_md

Cross-covariance matrix between the forecast state vector and predicted data; Dimensions are (\(N_{m}, N_{obs}\)).

cov_mm

Get the estimated parameters autocovariance matrix.

cov_obs

Get the observation errors covariance matrix.

cov_obs_inflation_factors

Get the inflation factors for the covariance matrix of the measurement errors.

d_dim

Return the number of forecast data.

d_history

List of vectors of predicted values obtained at each assimilation step.

d_obs_uc

d_pred

Vectors of predicted values (one for each ensemble member) with dimensions (\(N_{obs}\), \(N_{e}\)).

forward_model

Function calling the non-linear observation model (forward model) for all ensemble members and returning the predicted data for each ensemble member.

forward_model_args

Additional args for the callable forward_model.

forward_model_kwargs

Additional kwargs for the callable forward_model.

inversion_type

Inversion type.

is_forecast_for_last_assimilation

Whether to compute the predictions for the ensemble obtained at the last assimilation step.

is_parallel_analyse_step

Whether to use parallel computing for the analysis step when the number of batches is greater than one.

logger

logging.Logger instance used for event logging.

m_bounds

Lower and upper bounds for the \(N_{m}\) parameter values; Dimensions are (\(N_{m}\), 2).

m_dim

Return the length of the parameters vector.

m_history

List of successive m_prior.

m_prior

Vectors of parameter values (one vector for each ensemble member) used in the last assimilation step; Dimensions are (\(N_{m}\), \(N_{e}\)).

n_assimilations

Return the number of assimilations to perform.

n_batches

Number of batches used in the optimization.

n_ensemble

Return the number of ensemble members.

obs

Observations vector with dimensions (\(N_{\mathrm{obs}}\)).

rng

The random number generator used in the predictions perturbation step.

save_ensembles_history

Whether to save the history of predictions and parameters over the assimilations.

truncation

Return the truncation number for the svd in inversion.
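
The truncation rule described for the truncation parameter (keep the largest singular values whose sum reaches the given fraction of the total) can be sketched with plain NumPy. This is a minimal illustration of that rule, not the code path used by any ESMDAInversionType; the helper name `truncated_pinv` and the zero-value tolerance are hypothetical.

```python
import numpy as np


def truncated_pinv(a, truncation=0.99):
    """Pseudo-inverse of `a` built from its leading singular values only.

    Returns the pseudo-inverse and the number k of singular values kept.
    """
    u, s, vt = np.linalg.svd(a, full_matrices=False)  # s is sorted descending
    # Cumulative fraction of the sum of singular values.
    cum_fraction = np.cumsum(s) / np.sum(s)
    # Smallest k whose cumulative fraction reaches the requested level.
    k = int(np.searchsorted(cum_fraction, truncation)) + 1
    # Never keep (numerically) zero singular values; the tolerance is an
    # illustrative choice.
    tol = s[0] * max(a.shape) * np.finfo(s.dtype).eps
    k = min(k, int(np.sum(s > tol)))
    # V_k diag(1 / s_k) U_k^T
    return (vt[:k].T / s[:k]) @ u[:, :k].T, k


# Example: the smallest singular value carries under 1 % of the total sum.
a = np.diag([10.0, 5.0, 0.1])
a_pinv, k = truncated_pinv(a, truncation=0.99)
a_inv_full, k_full = truncated_pinv(a, truncation=1.0)
```

With truncation set to 1.0 and a full-rank matrix, the helper reduces to the ordinary inverse.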

Methods

loginfo

Log the message.

set_cov_obs_inflation_factors

Set the inflation factors for the covariance matrix of the measurement errors.

solve

Solve the optimization problem with ES-MDA algorithm.
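
To make the overall procedure concrete, here is a compact NumPy sketch of a textbook ES-MDA loop. It is not pyesmda's implementation and does not call the library: localization, batching, bounds, covariance inflation of the ensemble and truncated SVD are all left out, the inverse is applied with a dense linear solve, and every name in it (`esmda_sketch`, `linear_model`, the test problem) is hypothetical. The inflation factors are assumed to be \(\alpha_{i} = N_{a}\) for every step.

```python
import numpy as np


def esmda_sketch(obs, m_init, cov_obs, forward_model, n_assimilations=4, seed=0):
    """Textbook ES-MDA on a dense problem; returns the final ensemble."""
    rng = np.random.default_rng(seed)
    alphas = np.full(n_assimilations, float(n_assimilations))
    m = m_init.copy()                      # shape (N_m, N_e)
    n_e = m.shape[1]
    chol = np.linalg.cholesky(cov_obs)     # square root of the error covariance
    for alpha in alphas:
        d_pred = forward_model(m)          # shape (N_obs, N_e)
        # Perturb the observations with inflated measurement noise.
        z = rng.standard_normal((obs.size, n_e))
        d_obs_perturbed = obs[:, None] + np.sqrt(alpha) * (chol @ z)
        # Ensemble anomalies (deviations from the ensemble mean).
        dm = m - m.mean(axis=1, keepdims=True)
        dd = d_pred - d_pred.mean(axis=1, keepdims=True)
        # Empirical cross- and auto-covariances.
        cov_md = dm @ dd.T / (n_e - 1)     # shape (N_m, N_obs)
        cov_dd = dd @ dd.T / (n_e - 1)     # shape (N_obs, N_obs)
        # Update: m + C_md (C_dd + alpha C_d)^{-1} (d_obs_perturbed - d_pred)
        rhs = np.linalg.solve(cov_dd + alpha * cov_obs, d_obs_perturbed - d_pred)
        m = m + cov_md @ rhs
    return m


def linear_model(m_ensemble):
    """Hypothetical linear forward model mapping (3, N_e) to (2, N_e)."""
    g = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])
    return g @ m_ensemble


obs = np.array([1.0, -2.0])
cov_obs = 1e-4 * np.eye(2)
m_init = np.random.default_rng(1).normal(size=(3, 200))
m_post = esmda_sketch(obs, m_init, cov_obs, linear_model)
d_post_mean = linear_model(m_post).mean(axis=1)
```

On this small linear problem the mean prediction of the final ensemble should land close to the observations, which is a quick sanity check rather than a validation of the method.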