ForecastGRRateGrid.calculate_statistics

ForecastGRRateGrid.calculate_statistics(agg_func: str | Callable = 'mean', extra_stats: dict = {})

Get statistics for a, b, alpha, mc and number_events, if present, per timestep and grid cell, aggregated over all realizations of the grid (i.e. over all grid_id values).

Works with a normal range index (aggregates over grid cells) or with a MultiIndex with starttime, (endtime) and cell_id.
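As a rough illustration of what "aggregated over all realizations" means, the sketch below reproduces the idea with plain pandas rather than with seismostats itself. It assumes that rows sharing the same cell bounds belong to the same grid cell. The reduced column set and the groupby keys are illustrative choices, and the library's internal implementation may differ.

```python
import pandas as pd

# Two grid cells, each with two realizations (grid_id 0 and 1).
df = pd.DataFrame({
    'longitude_min': [9, 9, 10, 10],
    'latitude_min': [45, 45, 46, 46],
    'b': [0.95, 1.0, 1.05, 1.1],
    'mc': [1.2, 1.2, 1.3, 1.3],
    'grid_id': [0, 1, 0, 1],
})

# Assumption: a cell is identified by its bounds; grid_id only labels
# the realization and is therefore dropped before aggregating.
cell_keys = ['longitude_min', 'latitude_min']

# Average each parameter over the realizations of every cell.
cell_means = (
    df.drop(columns='grid_id')
      .groupby(cell_keys, as_index=False)
      .mean()
)
print(cell_means)
```

The result has one row per grid cell, with each parameter replaced by its mean over the realizations.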

Parameters:
  • agg_func – Aggregation function to use for the main value, e.g. ‘mean’, …

  • extra_stats – Additional statistics to calculate, provided as {‘suffix’: statistic}. Each statistic can be any function or string that can be passed to pandas.DataFrame.agg.

Returns:

statistics – An aggregated DataFrame with the calculated statistics. Each extra statistic is added as a new column named <metric>_<suffix>, for example b_std.
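The column naming can be sketched with plain pandas. This is a minimal illustration of the <metric>_<suffix> convention, not the library's actual code: the cell_id grouping column, the 'max' and 'range' suffixes, and the join-based assembly are all assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'cell_id': [0, 0, 1, 1],
    'b': [0.95, 1.0, 1.05, 1.1],
    'mc': [1.2, 1.2, 1.3, 1.3],
})

agg_func = 'mean'
# Hypothetical extra statistics: one given as a string, one as a callable.
extra_stats = {'max': 'max', 'range': lambda x: x.max() - x.min()}

grouped = df.groupby('cell_id')

# The main aggregation keeps the original column names.
result = grouped.agg(agg_func)

# Each extra statistic is appended as <metric>_<suffix>.
for suffix, statistic in extra_stats.items():
    extra = grouped.agg(statistic).add_suffix(f'_{suffix}')
    result = result.join(extra)

print(result.columns.tolist())
```

Both strings and callables go through the same agg call, which is why either form can be used as a dictionary value.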

Examples

Create a ForecastGRRateGrid from a dictionary with two grid cells, each having two realizations (as indicated by grid_id).

>>> import pandas as pd
>>> from seismostats import ForecastGRRateGrid
>>> data = { 'longitude_min': [9, 9, 10, 10],
...          'longitude_max': [10, 10, 11, 11],
...          'latitude_min': [45, 45, 46, 46],
...          'latitude_max': [46, 46, 47, 47],
...          'depth_min': [10, 10, 20, 20],
...          'depth_max': [20, 20, 30, 30],
...          'number_events': [5, 6, 10, 12],
...          'a': [0.8, 0.9, 1.0, 1.1],
...          'b': [0.95, 1.0, 1.05, 1.1],
...          'mc': [1.2, 1.2, 1.3, 1.3],
...          'grid_id': [0, 1, 0, 1]}
>>> forecast = ForecastGRRateGrid(
...     data,
...     starttime=pd.Timestamp('2023-01-01'),
...     endtime=pd.Timestamp('2023-01-02'))
>>> forecast

   longitude_min  longitude_max  ...    a     b   mc  grid_id
0            9.0           10.0  ...  0.8  0.95  1.2        0
1            9.0           10.0  ...  0.9  1.00  1.2        1
2           10.0           11.0  ...  1.0  1.05  1.3        0
3           10.0           11.0  ...  1.1  1.10  1.3        1

Compute the mean and standard deviation of the GR parameters for each grid cell:

>>> stats = forecast.calculate_statistics(
...     agg_func='mean',
...     extra_stats={'std': lambda x: x.std(ddof=0)}
... )
>>> stats

   longitude_min  longitude_max  ...      b   mc  b_std  mc_std
0            9.0           10.0  ...  0.975  1.2  0.025     0.0
1           10.0           11.0  ...  1.075  1.3  0.025     0.0