This appendix describes a generic methodology for forecasting storage volume and cost, and for evaluating rolling retention options for a large dataset without double-counting growth.
In production, retention execution should include throttling and safety controls to avoid overwhelming the storage layer and downstream systems.
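As a minimal sketch of what throttling and safety controls could look like, the example below deletes expired items in small batches with a pause between batches, refuses unexpectedly large runs, and defaults to a dry run. Everything here is hypothetical: `run_retention`, `delete_batch`, and all parameter names and defaults are illustrative assumptions, not part of any real storage API.

```python
import time


def run_retention(expired_ids, delete_batch, batch_size=500, pause_seconds=1.0,
                  max_deletes=100_000, dry_run=True, sleep=time.sleep):
    """Process expired items in batches; return how many items were processed.

    delete_batch is a hypothetical callable that deletes one list of ids.
    """
    if len(expired_ids) > max_deletes:
        # Safety control: stop before touching anything if the run is suspiciously large.
        raise ValueError(
            f"{len(expired_ids)} deletions requested, limit is {max_deletes}"
        )

    processed = 0
    for start in range(0, len(expired_ids), batch_size):
        batch = expired_ids[start:start + batch_size]
        if not dry_run:
            delete_batch(batch)
            # Throttle: give the storage layer and downstream systems time to recover.
            sleep(pause_seconds)
        processed += len(batch)
    return processed


# Dry run (the default): nothing is deleted, the count shows what would be processed.
would_delete = run_retention(list(range(1200)), delete_batch=lambda batch: None)
print(f"Dry run would process {would_delete} items")
```

Defaulting to `dry_run=True` means a real deletion has to be requested explicitly, which is one common way to reduce the risk of an accidental purge.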
Inputs: TotalVol[m] for each month m in the calibration window
ΔTotal[m] = TotalVol[m] - TotalVol[m-1]
AvgΔTotal = average( ΔTotal[m] ) over the calibration months, excluding the baseline month
Baseline month: b (e.g., the first month in the window)
MoMGrowthTotal = AvgΔTotal / TotalVol[b]
For forecast month t (t = b+1, b+2, ...), starting from TotalVolForecast[b] = TotalVol[b]:
TotalVolForecast[t] = TotalVolForecast[t-1] * (1 + MoMGrowthTotal)
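The growth and forecast steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the volumes are the Total Volume values from the sample table, the baseline month b is taken to be the first month, the forecast is assumed to be seeded with TotalVol[b], and the function names and the 12-month horizon are hypothetical.

```python
# Monthly total volumes (GB) from the sample table; index 0 is the baseline month b.
total_vol = [831_640_814.1, 847_841_295.0, 867_477_568.0, 885_610_182.0]


def mom_growth(total_vol):
    """MoMGrowthTotal = AvgΔTotal / TotalVol[b], with b = first month in the window."""
    # ΔTotal[m] exists only for months after the baseline, so the baseline is excluded.
    deltas = [curr - prev for prev, curr in zip(total_vol, total_vol[1:])]
    avg_delta = sum(deltas) / len(deltas)
    return avg_delta / total_vol[0]


def forecast_total(baseline_vol, growth, months):
    """Compound forward: TotalVolForecast[t] = TotalVolForecast[t-1] * (1 + growth)."""
    forecast = []
    current = baseline_vol  # assumption: TotalVolForecast[b] = TotalVol[b]
    for _ in range(months):
        current *= 1 + growth
        forecast.append(current)
    return forecast


growth = mom_growth(total_vol)
forecast = forecast_total(total_vol[0], growth, months=12)  # months b+1 .. b+12
print(f"MoM growth: {growth:.4%}")
print(f"Forecast for month b+12: {forecast[-1]:,.0f} GB")
```

Note that the rate is an average absolute increase divided by the baseline volume, which is then applied as a compounding percentage, so forecast deltas grow slightly each month.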
UnitCost[m] = Cost[m] / TotalVol[m]
AvgUnitCost = average( UnitCost[m] ) over the calibration months
CostBaseline[t] = TotalVolForecast[t] * AvgUnitCost
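A short sketch of the unit-cost step, using the Total Volume and Cost columns from the sample table. The function names are hypothetical, and `sample_forecast` holds made-up forecast volumes used only to show the multiplication.

```python
# Calibration data from the sample table: total volume (GB) and cost (USD) per month.
total_vol = [831_640_814.1, 847_841_295.0, 867_477_568.0, 885_610_182.0]
cost = [1_514_802.0, 1_560_000.0, 1_590_000.0, 1_620_000.0]


def avg_unit_cost(cost, total_vol):
    """AvgUnitCost = average of Cost[m] / TotalVol[m] over the calibration months."""
    unit_costs = [c / v for c, v in zip(cost, total_vol)]
    return sum(unit_costs) / len(unit_costs)


def cost_baseline(total_vol_forecast, unit_cost):
    """CostBaseline[t] = TotalVolForecast[t] * AvgUnitCost."""
    return [v * unit_cost for v in total_vol_forecast]


unit_cost = avg_unit_cost(cost, total_vol)

# Hypothetical forecast volumes (GB), for illustration only.
sample_forecast = [900_000_000.0, 920_000_000.0]
baseline_cost = cost_baseline(sample_forecast, unit_cost)

print(f"Average unit cost: {unit_cost:.6f} USD/GB")
for vol, usd in zip(sample_forecast, baseline_cost):
    print(f"{vol:,.0f} GB -> {usd:,.0f} USD")
```

Averaging the per-month ratios weights each calibration month equally; dividing total cost by total volume would instead weight larger months more heavily.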
Retention window: N months
TargetInc[t] = forecasted monthly incremental volume for the target dataset
TargetWithRetention[t] = sum( TargetInc[t - k] ) for k = 0..(N-1)
TargetTotalBaseline[t] = forecasted cumulative target dataset volume (no retention)
AdjustedTotal[t] = TotalVolForecast[t] - TargetTotalBaseline[t] + TargetWithRetention[t]
Subtracting TargetTotalBaseline[t] before adding TargetWithRetention[t] replaces the target dataset's unbounded volume with its retained portion, so its growth is counted only once.
CostScenario[t] = AdjustedTotal[t] * AvgUnitCost
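The retention scenario can be sketched as below. All numbers are small, made-up values chosen so the arithmetic is easy to follow, and the function names are hypothetical. The sketch also assumes the increment series starts at index 0, so the earliest months sum fewer than N increments; in practice the series would normally include enough history to fill the window.

```python
def rolling_retention(target_inc, n_months):
    """TargetWithRetention[t] = sum of TargetInc[t - k] for k = 0..N-1."""
    retained = []
    for t in range(len(target_inc)):
        start = max(0, t - n_months + 1)  # early months have fewer than N increments
        retained.append(sum(target_inc[start:t + 1]))
    return retained


def adjusted_total(total_forecast, target_total_baseline, target_with_retention):
    """AdjustedTotal[t] = TotalVolForecast[t] - TargetTotalBaseline[t] + TargetWithRetention[t]."""
    return [
        total - baseline + kept
        for total, baseline, kept in zip(
            total_forecast, target_total_baseline, target_with_retention
        )
    ]


# Hypothetical inputs, for illustration only (units: GB).
target_inc = [10, 10, 10, 10]              # monthly incremental target volume
target_total_baseline = [10, 20, 30, 40]   # cumulative target volume without retention
total_forecast = [100, 110, 120, 130]      # total forecast, which includes the target dataset
unit_cost = 0.002                          # hypothetical AvgUnitCost in USD/GB

retained = rolling_retention(target_inc, n_months=2)
adjusted = adjusted_total(total_forecast, target_total_baseline, retained)
cost_scenario = [vol * unit_cost for vol in adjusted]

print("Retained target volume:", retained)
print("Adjusted total volume:", adjusted)
```

With a 2-month window the retained target volume levels off at two increments, so the adjusted total stops growing with the target dataset while the rest of the forecast is left unchanged.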
| Month | Total Volume (GB) | Target Dataset (GB) | Cost (USD) |
|---|---|---|---|
| 2025-01 | 831,640,814.1 | 7,472,975 | 1,514,802 |
| 2025-02 | 847,841,295 | 6,950,684 | 1,560,000 |
| 2025-03 | 867,477,568 | 8,101,004 | 1,590,000 |
| 2025-04 | 885,610,182 | 8,181,687 | 1,620,000 |
Tip: keep this appendix generic by labeling the target dataset as “Target Dataset” if you don’t want to expose internal dataset names.