Of course. Here is a comprehensive, step-by-step guide to building a Liquidity Risk Model for Retail Deposits in Python. This solution includes data simulation, preparation, metric calculation, classification, behavioral analysis, and statistical modeling as requested.

### Solution Overview

The solution is structured as a complete Python script, simulating a typical workflow in a Jupyter Notebook. It's broken down into the following parts:

1.  **Setup & Data Simulation**: We'll start by importing necessary libraries and generating realistic synthetic data for `transaction_data.csv` and `account_rates.csv` since none was provided. This makes the solution fully executable.
2.  **Part 1: Data Preparation & Feature Engineering**: Loading, cleaning, merging data, and reconstructing daily account balances.
3.  **Part 2: Key Metrics & Core/Non-Core Classification**: Calculating stability metrics and applying a rules-based engine to classify accounts.
4.  **Part 3: Behavioral Maturity Analysis**: Analyzing deposit behavior over time using vintage curves and decay analysis.
5.  **Part 4: Statistical Models**: Implementing Survival Analysis and Time Series Forecasting.
6.  **Part 5: Segmentation Analysis**: Aggregating results by customer segments to derive actionable insights.

---

### Python Implementation

```python
# =============================================================================
# PART 0: SETUP & DATA SIMULATION
# =============================================================================
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import date, timedelta
import warnings

# --- Statistical Modeling Libraries ---
from lifelines import KaplanMeierFitter
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

# --- Configuration ---
warnings.filterwarnings('ignore')
sns.set(style="whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)
np.random.seed(42)  # Fixed seed so the synthetic data is reproducible across runs

# --- Data Simulation Function ---
def generate_synthetic_data(num_accounts=500, start_date_str='2020-01-01', end_date_str='2023-12-31'):
    """Generates realistic synthetic transaction and rate data."""
    print("Generating synthetic data...")
    
    start_date = pd.to_datetime(start_date_str)
    end_date = pd.to_datetime(end_date_str)
    date_range = pd.date_range(start_date, end_date)
    
    # Create Accounts
    accounts = []
    for i in range(num_accounts):
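        # Open each account somewhere in the first half of the simulation window
        # (integer division: randint expects an integer upper bound)
        account_opening_date = start_date + timedelta(days=int(np.random.randint(0, (end_date - start_date).days // 2)))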
        accounts.append({
            'deposit_account_key': 1000 + i,
            'customer_id': 5000 + i,
            'customer_birth_year': np.random.randint(1950, 2005),
            'customer_zip_code': np.random.randint(10000, 98000),
            'opening_date': account_opening_date,
            # Latent behaviour profile; it drives how often the account transacts below
            'stability_profile': np.random.choice(['stable', 'normal', 'volatile'], p=[0.4, 0.4, 0.2])
        })
    accounts_df = pd.DataFrame(accounts)

    # Generate Transactions
    transactions = []
    for _, acc in accounts_df.iterrows():
        acc_date_range = pd.date_range(acc['opening_date'], end_date)
        current_balance = 0
        
        for trans_date in acc_date_range:
            # Simulate transaction frequency based on stability:
            # daily transaction probability is 5% (stable), 15% (normal) or 30% (volatile)
            daily_prob = {'stable': 0.05, 'normal': 0.15, 'volatile': 0.30}[acc['stability_profile']]
            if np.random.rand() >= daily_prob:
                continue

            transaction_sign = np.random.choice(['credit', 'debit'], p=[0.55, 0.45])
            
            if transaction_sign == 'credit':
                amount = np.random.lognormal(mean=7, sigma=1.5)
            else:  # Debit
                # Cap each debit at 50% of the running balance so the balance never goes negative
                max_debit = current_balance * 0.5
                if max_debit <= 0:
                    continue
                amount = np.random.uniform(0.05, 1.0) * max_debit

            transactions.append({
                'deposit_account_key': acc['deposit_account_key'],
                'transaction_date_key': trans_date,
                'transaction_sign': transaction_sign,
                'transaction_type': np.random.choice(['salary', 'transfer', 'payment', 'other']),
                'transaction_amt_sek': round(amount, 2),
                'transaction_id': len(transactions) + 1,
            })
            current_balance += amount if transaction_sign == 'credit' else -amount

    transactions_df = pd.DataFrame(transactions)
    
    # Merge customer info into transactions
    transactions_df = transactions_df.merge(accounts_df[['deposit_account_key', 'customer_id', 'customer_birth_year', 'customer_zip_code']], 
                                            on='deposit_account_key')

    # Generate Account Rates
    rates = []
    for _, acc in accounts_df.iterrows():
        # Opening rate
        rates.append({
            'deposit_account_key': acc['deposit_account_key'],
            'effective_date': acc['opening_date'],
            'interest_rate': round(np.random.uniform(0.001, 0.01), 4)
        })
        # Simulate rate changes
        for _ in range(np.random.randint(0, 3)):
            change_date = acc['opening_date'] + timedelta(days=np.random.randint(100, (end_date - acc['opening_date']).days))
            rates.append({
                'deposit_account_key': acc['deposit_account_key'],
                'effective_date': change_date,
                'interest_rate': round(np.random.uniform(0.005, 0.035), 4)
            })
    rates_df = pd.DataFrame(rates)

    # Save to CSV
    transactions_df.to_csv('transaction_data.csv', index=False)
    rates_df.to_csv('account_rates.csv', index=False)
    print("Synthetic data generated and saved to CSV files.")
    return transactions_df, rates_df

# --- Execute Data Generation ---
# Set to False if you already have the files
GENERATE_NEW_DATA = True
if GENERATE_NEW_DATA:
    transactions_df, rates_df = generate_synthetic_data()
else:
    print("Loading existing data from CSV files...")
    transactions_df = pd.read_csv('transaction_data.csv')
    rates_df = pd.read_csv('account_rates.csv')

# Convert date columns to datetime
transactions_df['transaction_date_key'] = pd.to_datetime(transactions_df['transaction_date_key'])
rates_df['effective_date'] = pd.to_datetime(rates_df['effective_date'])

print("\nData Simulation and Loading Complete.")
print("Transaction Data Shape:", transactions_df.shape)
print("Account Rates Data Shape:", rates_df.shape)

# =============================================================================
# PART 1: DATA PREPARATION & FEATURE ENGINEERING
# =============================================================================
print("\n--- Part 1: Data Preparation & Feature Engineering ---")

# --- 1.1 Reconstruct Daily Balance ---
def reconstruct_daily_balances(transactions):
    """Reconstructs daily balance for each account from transactions."""
    print("Reconstructing daily balances...")
    
    # Sign the amounts: credits positive, debits negative (vectorized instead of row-wise apply)
    transactions['signed_amt'] = np.where(
        transactions['transaction_sign'] == 'credit',
        transactions['transaction_amt_sek'],
        -transactions['transaction_amt_sek']
    )

    # Calculate net change per day
    daily_net_flow = transactions.groupby(['deposit_account_key', 'transaction_date_key'])['signed_amt'].sum().reset_index()

    # Calculate cumulative balance
    daily_net_flow = daily_net_flow.sort_values(by=['deposit_account_key', 'transaction_date_key'])
    daily_net_flow['balance'] = daily_net_flow.groupby('deposit_account_key')['signed_amt'].cumsum()

    # Create a full date range for each account
    full_date_range = daily_net_flow.groupby('deposit_account_key').agg(
        start_date=('transaction_date_key', 'min'),
        end_date=('transaction_date_key', 'max')
    ).reset_index()

    all_dates = []
    for _, row in full_date_range.iterrows():
        dates = pd.date_range(start=row['start_date'], end=date.today(), freq='D')
        all_dates.append(pd.DataFrame({'deposit_account_key': row['deposit_account_key'], 'date': dates}))
    
    all_dates_df = pd.concat(all_dates)

    # Merge and forward-fill balances
    daily_balances = pd.merge(all_dates_df, daily_net_flow[['deposit_account_key', 'transaction_date_key', 'balance']], 
                              left_on=['deposit_account_key', 'date'], 
                              right_on=['deposit_account_key', 'transaction_date_key'], 
                              how='left')
    
    daily_balances = daily_balances.sort_values(by=['deposit_account_key', 'date'])
    daily_balances['balance'] = daily_balances.groupby('deposit_account_key')['balance'].ffill().fillna(0)
    # Ensure no negative balances
    daily_balances['balance'] = daily_balances['balance'].clip(lower=0)
    
    return daily_balances[['deposit_account_key', 'date', 'balance']]

daily_balances_df = reconstruct_daily_balances(transactions_df)
print("Daily balances reconstructed. Shape:", daily_balances_df.shape)

# --- 1.2 Create Account-Level Summary Table ---
print("Creating account-level summary table...")
# Get unique account/customer info from transactions
account_info = transactions_df[['deposit_account_key', 'customer_id', 'customer_birth_year', 'customer_zip_code']].drop_duplicates()

# Calculate Account Age (Tenure)
account_opening_dates = rates_df.groupby('deposit_account_key')['effective_date'].min().reset_index()
account_opening_dates.rename(columns={'effective_date': 'opening_date'}, inplace=True)

# Merge opening dates
account_summary = pd.merge(account_info, account_opening_dates, on='deposit_account_key', how='left')

# Calculate Account Tenure in Months
current_date = date.today()
account_summary['account_tenure_months'] = ((pd.to_datetime(current_date) - account_summary['opening_date']).dt.days / 30.44).astype(int)

# Calculate Customer Age
account_summary['customer_age'] = current_date.year - account_summary['customer_birth_year']

# Get current balance
current_balances = daily_balances_df.loc[daily_balances_df.groupby('deposit_account_key')['date'].idxmax()]
current_balances = current_balances[['deposit_account_key', 'balance']].rename(columns={'balance': 'current_balance'})

# Merge current balance into summary
account_summary = pd.merge(account_summary, current_balances, on='deposit_account_key', how='left')
account_summary.fillna({'current_balance': 0}, inplace=True)

print("Account summary created. Shape:", account_summary.shape)
print(account_summary.head())

# =============================================================================
# PART 2: KEY METRICS & CORE/NON-CORE CLASSIFICATION
# =============================================================================
print("\n--- Part 2: Key Metrics & Core/Non-Core Classification ---")

# --- 2.1 Calculate Key Metrics ---
print("Calculating key stability metrics for each account...")

metrics = []
today = pd.to_datetime(date.today())
one_year_ago = today - pd.DateOffset(years=1)

for key in account_summary['deposit_account_key'].unique():
    acc_balances = daily_balances_df[daily_balances_df['deposit_account_key'] == key]
    acc_trans = transactions_df[transactions_df['deposit_account_key'] == key]
    
    if acc_balances.empty:
        continue
    
    # Filter for last 12 months
    balances_12m = acc_balances[acc_balances['date'] >= one_year_ago]
    
    # --- Metrics Calculation ---
    current_balance = account_summary.loc[account_summary['deposit_account_key'] == key, 'current_balance'].iloc[0]
    
    # Stability Ratio
    min_balance_12m = balances_12m['balance'].min() if not balances_12m.empty else 0
    stability_ratio = min_balance_12m / current_balance if current_balance > 0 else 0
    
    # Balance Volatility (Coefficient of Variation)
    avg_balance_12m = balances_12m['balance'].mean() if not balances_12m.empty else 0
    std_dev_12m = balances_12m['balance'].std() if not balances_12m.empty else 0
    balance_volatility = std_dev_12m / avg_balance_12m if avg_balance_12m > 0 else 0
    
    # Transaction Frequency
    num_months_active = (acc_trans['transaction_date_key'].max() - acc_trans['transaction_date_key'].min()).days / 30.44
    total_transactions = len(acc_trans)
    trans_freq_monthly = total_transactions / num_months_active if num_months_active > 0 else 0
    
    # Deposit Decay Rate (simplified monthly average)
    monthly_balances = acc_balances.set_index('date').resample('M')['balance'].last()
    monthly_decay = (monthly_balances.shift(1) - monthly_balances) / monthly_balances.shift(1)
    avg_decay_rate = monthly_decay[monthly_decay > 0].mean() # Only consider outflows

    metrics.append({
        'deposit_account_key': key,
        'stability_ratio': stability_ratio,
        'balance_volatility': balance_volatility,
        'trans_freq_monthly': trans_freq_monthly,
        'avg_decay_rate': avg_decay_rate if pd.notna(avg_decay_rate) else 0
    })

metrics_df = pd.DataFrame(metrics)
account_summary = pd.merge(account_summary, metrics_df, on='deposit_account_key')

# --- 2.2 Core vs Non-Core Classification ---
print("Classifying deposits into Core/Non-Core categories...")

def classify_deposit(row):
    tenure = row['account_tenure_months']
    volatility = row['balance_volatility']
    stability = row['stability_ratio']
    
    if tenure < 6 or volatility > 0.75:
        return 'Non-Core'
    if tenure >= 24 and volatility < 0.2 and stability > 0.8:
        return 'Core - Highly Stable'
    if tenure >= 12 and volatility < 0.4 and stability > 0.6:
        return 'Core - Stable'
    else:
        return 'Semi-Core'

account_summary['classification'] = account_summary.apply(classify_deposit, axis=1)

print("Classification complete. Distribution of deposit types:")
print(account_summary['classification'].value_counts(normalize=True).to_string())

# Visualize Classification
plt.figure(figsize=(10, 6))
sns.countplot(x='classification', data=account_summary, order=['Core - Highly Stable', 'Core - Stable', 'Semi-Core', 'Non-Core'])
plt.title('Distribution of Deposit Classifications')
plt.ylabel('Number of Accounts')
plt.show()


# =============================================================================
# PART 3: BEHAVIORAL MATURITY ANALYSIS
# =============================================================================
print("\n--- Part 3: Behavioral Maturity Analysis ---")

# --- 3.1 Vintage Analysis ---
print("Performing Vintage Analysis...")
account_summary['opening_vintage'] = account_summary['opening_date'].dt.to_period('M')

# Calculate months on book
# Carry 'opening_vintage' along as well: the vintage pivot below groups on it
daily_balances_merged = pd.merge(daily_balances_df, account_summary[['deposit_account_key', 'opening_date', 'opening_vintage']], on='deposit_account_key')
daily_balances_merged['months_on_book'] = ((daily_balances_merged['date'] - daily_balances_merged['opening_date']).dt.days / 30.44).astype(int)

# Normalize balance by the average balance in month 1 on book (roughly days 30-60 after opening)
first_month_balance = daily_balances_merged[daily_balances_merged['months_on_book'] == 1].groupby('deposit_account_key')['balance'].mean().reset_index()
first_month_balance.rename(columns={'balance': 'first_month_avg_balance'}, inplace=True)
# Drop accounts with a zero reference balance to avoid dividing by zero
first_month_balance = first_month_balance[first_month_balance['first_month_avg_balance'] > 0]
daily_balances_merged = pd.merge(daily_balances_merged, first_month_balance, on='deposit_account_key')
daily_balances_merged['normalized_balance'] = daily_balances_merged['balance'] / daily_balances_merged['first_month_avg_balance']

# Create Vintage Pivot Table
vintage_pivot = daily_balances_merged.groupby(['opening_vintage', 'months_on_book'])['normalized_balance'].mean().unstack(level=0)

# Plot Vintage Curves
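# Plot onto an explicit axes; calling DataFrame.plot without `ax` would open a second, empty figure
fig, ax = plt.subplots(figsize=(14, 8))
vintage_pivot.iloc[:24, :].plot(ax=ax, legend=True, grid=True)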
plt.title('Vintage Analysis: Normalized Deposit Balance Retention by Opening Cohort')
plt.xlabel('Months on Book')
plt.ylabel('Normalized Balance (vs. First Month Avg)')
plt.ylim(0, 2) # Cap y-axis for better visualization
plt.legend(title='Opening Vintage', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()

# --- 3.2 Deposit Decay Curves by Classification ---
print("Generating Deposit Decay Curves by Segment...")
decay_data = pd.merge(daily_balances_merged, account_summary[['deposit_account_key', 'classification']], on='deposit_account_key')
decay_pivot = decay_data.groupby(['classification', 'months_on_book'])['normalized_balance'].mean().unstack(level=0)

plt.figure(figsize=(12, 7))
decay_pivot.iloc[:36].plot(ax=plt.gca())
plt.title('Deposit Decay/Growth Curve by Classification')
plt.xlabel('Months on Book')
plt.ylabel('Average Normalized Balance')
plt.axhline(1, color='black', linestyle='--', label='Initial Balance')
plt.legend(title='Classification')
plt.show()

# =============================================================================
# PART 4: STATISTICAL MODELS
# =============================================================================
print("\n--- Part 4: Statistical Models ---")

# --- 4.1 Survival Analysis: Time to Significant Withdrawal ---
print("Building Survival Analysis model...")

# Define "event": a withdrawal of >25% of the balance in 30 days
event_threshold = 0.25
survival_data = []

for key in account_summary['deposit_account_key'].unique():
    acc_balances = daily_balances_df[daily_balances_df['deposit_account_key'] == key].set_index('date')['balance']
    
    # Calculate 30-day rolling minimum balance
    rolling_min = acc_balances.rolling(window='30D').min()
    event_occurred = (rolling_min < (1 - event_threshold) * acc_balances.shift(30))
    
    first_event_date = event_occurred[event_occurred].first_valid_index()
    
    opening_date = account_summary.loc[account_summary['deposit_account_key'] == key, 'opening_date'].iloc[0]
    
    # first_valid_index() returns None when no event occurred (right-censored account)
    if first_event_date is not None:
        duration = (first_event_date - opening_date).days
        observed = 1
    else:
        duration = (today - opening_date).days
        observed = 0
        
    survival_data.append({
        'deposit_account_key': key,
        'duration': duration,
        'event_observed': observed
    })

survival_df = pd.DataFrame(survival_data)
survival_df = pd.merge(survival_df, account_summary[['deposit_account_key', 'classification']], on='deposit_account_key')

# Fit Kaplan-Meier estimator for each segment
plt.figure(figsize=(12, 7))
ax = plt.subplot(111)

for name, grouped_df in survival_df.groupby('classification'):
    kmf = KaplanMeierFitter()
    kmf.fit(grouped_df['duration'], event_observed=grouped_df['event_observed'], label=name)
    kmf.plot_survival_function(ax=ax)

plt.title('Survival Function: Time to Significant Withdrawal (>25%) by Deposit Class')
plt.xlabel('Days Since Account Opening')
plt.ylabel('Probability of "Survival" (No large withdrawal)')
plt.show()


# --- 4.2 Time Series Forecasting (ARIMA) ---
print("Building Time Series Forecast model (ARIMA)...")

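# Select a sample account for forecasting: prefer a 'Core - Highly Stable' account,
# falling back to the least volatile account if that class happens to be empty
stable_accounts = account_summary[account_summary['classification'] == 'Core - Highly Stable']
if stable_accounts.empty:
    stable_accounts = account_summary.sort_values('balance_volatility')
sample_account_key = stable_accounts['deposit_account_key'].iloc[0]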
ts_data = daily_balances_df[daily_balances_df['deposit_account_key'] == sample_account_key]
ts = ts_data.set_index('date')['balance'].resample('W').mean() # Resample to weekly to smooth and speed up

# Check for stationarity
result = adfuller(ts.dropna())
print(f'ADF Statistic for sample account {sample_account_key}: {result[0]}')
print(f'p-value: {result[1]}')
# If p > 0.05, we difference the series
d = 1 if result[1] > 0.05 else 0

# Fit ARIMA model (using simple p,d,q for demonstration)
# For production, use auto_arima to find optimal parameters
try:
    model = ARIMA(ts, order=(5, d, 1))
    fitted_model = model.fit()

    # Forecast the next 13 weeks (approx. 90 days)
    forecast = fitted_model.get_forecast(steps=13)
    # predicted_mean and conf_int() already carry the weekly dates that follow the history;
    # re-wrapping them in a new index would misalign the forecast by one week
    forecast_series = forecast.predicted_mean
    conf_int = forecast.conf_int()
    forecast_index = forecast_series.index

    # Plot
    plt.figure(figsize=(14, 7))
    plt.plot(ts, label='Historical Weekly Balance')
    plt.plot(forecast_series, label='Forecast', color='red')
    plt.fill_between(forecast_index, conf_int.iloc[:, 0], conf_int.iloc[:, 1], color='pink', alpha=0.5, label='95% Confidence Interval')
    plt.title(f'ARIMA Forecast for Deposit Account {sample_account_key}')
    plt.xlabel('Date')
    plt.ylabel('Balance (SEK)')
    plt.legend()
    plt.show()
except Exception as e:
    print(f"Could not generate ARIMA forecast for account {sample_account_key}: {e}")
    
# --- Note on Monte Carlo Simulation ---
print("\n--- Note on Monte Carlo Simulation for Stress Testing ---")
print("""
Monte Carlo simulation would be the next step for stress testing. The process involves:
1.  **Model Individual Account Dynamics**: Use the historical volatility and decay rates to model daily balance changes as a stochastic process (e.g., Geometric Brownian Motion).
2.  **Define Scenarios**: Create scenarios like 'Interest Rate Shock' (+200bps), 'Recession' (higher withdrawal rates), or 'Market Panic'.
3.  **Simulate**: For each account, run thousands of simulation paths for its future balance under each scenario.
4.  **Aggregate**: Sum up the simulated balances across the portfolio at each future time step.
5.  **Analyze Distribution**: Analyze the distribution of total deposit outflows to determine expected and unexpected liquidity needs (e.g., calculate VaR - Value at Risk).
This provides a forward-looking view of liquidity risk under stress.
""")

# =============================================================================
# PART 5: SEGMENTATION ANALYSIS
# =============================================================================
print("\n--- Part 5: Segmentation Analysis ---")

# --- 5.1 Analysis by Core/Non-Core Classification ---
segment_summary = account_summary.groupby('classification').agg(
    num_accounts=('deposit_account_key', 'count'),
    total_balance_sek=('current_balance', 'sum'),
    avg_balance_sek=('current_balance', 'mean'),
    avg_tenure_months=('account_tenure_months', 'mean'),
    avg_volatility=('balance_volatility', 'mean'),
    avg_stability_ratio=('stability_ratio', 'mean')
).sort_values(by='total_balance_sek', ascending=False)

print("--- Summary by Deposit Classification ---")
print(segment_summary.to_string(formatters={
    'total_balance_sek': '{:,.0f}'.format,
    'avg_balance_sek': '{:,.0f}'.format,
    'avg_tenure_months': '{:.1f}'.format,
    'avg_volatility': '{:.2%}'.format,
    'avg_stability_ratio': '{:.2%}'.format
}))

# --- 5.2 Analysis by Customer Age Group ---
bins = [0, 30, 45, 60, 100]
labels = ['<30', '30-45', '45-60', '60+']
account_summary['customer_age_group'] = pd.cut(account_summary['customer_age'], bins=bins, labels=labels, right=False)

age_group_summary = account_summary.groupby('customer_age_group').agg(
    num_accounts=('deposit_account_key', 'count'),
    total_balance_sek=('current_balance', 'sum'),
    avg_balance_sek=('current_balance', 'mean'),
    avg_volatility=('balance_volatility', 'mean')
)

print("\n--- Summary by Customer Age Group ---")
print(age_group_summary.to_string(formatters={
    'total_balance_sek': '{:,.0f}'.format,
    'avg_balance_sek': '{:,.0f}'.format,
    'avg_volatility': '{:.2%}'.format
}))

plt.figure(figsize=(14, 6))
plt.subplot(1, 2, 1)
sns.barplot(x=age_group_summary.index, y=age_group_summary['total_balance_sek'])
plt.title('Total Deposit Balance by Customer Age Group')
plt.ylabel('Total Balance (SEK)')

plt.subplot(1, 2, 2)
sns.boxplot(x='customer_age_group', y='balance_volatility', data=account_summary, showfliers=False)
plt.title('Balance Volatility by Customer Age Group')
plt.ylabel('Volatility (Std Dev / Mean)')
plt.tight_layout()
plt.show()

# --- 5.3 Analysis by Geographic Region (using Zip Code prefix) ---
account_summary['region'] = account_summary['customer_zip_code'].astype(str).str[0:2] # Using first 2 digits as region proxy
region_summary = account_summary.groupby('region')['current_balance'].sum().nlargest(10)

print("\n--- Top 10 Regions by Total Deposit Balance ---")
print(region_summary.to_string(name=False, float_format='{:,.0f} SEK'.format))

plt.figure(figsize=(12,6))
region_summary.plot(kind='bar')
plt.title('Total Deposit Balance by Geographic Region (Top 10)')
plt.xlabel('Region (Zip Code Prefix)')
plt.ylabel('Total Balance (SEK)')
plt.xticks(rotation=45)
plt.show()

print("\n\n--- LIQUIDITY RISK MODELING COMPLETE ---")

```
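
The Monte Carlo steps printed at the end of Part 4 can be sketched roughly as follows. This is a minimal illustration, not part of the script above: the function names `simulate_portfolio_balances` and `outflow_at_risk`, the three-account portfolio, and every drift and volatility number are hypothetical. It assumes each account follows an independent Geometric Brownian Motion with monthly steps. Independence across accounts is a strong simplification and likely understates outflows in a real stress event.

```python
import numpy as np


def simulate_portfolio_balances(balances, monthly_drift, monthly_vol,
                                n_months=12, n_paths=2000, seed=42):
    """Simulate total portfolio balance paths, one GBM per account.

    balances, monthly_drift, monthly_vol: 1-D sequences, one entry per account.
    Returns an array of shape (n_paths, n_months + 1); column 0 is today's total.
    """
    rng = np.random.default_rng(seed)
    balances = np.asarray(balances, dtype=float)
    mu = np.asarray(monthly_drift, dtype=float)
    sigma = np.asarray(monthly_vol, dtype=float)

    # Independent standard normal shocks with shape (paths, months, accounts)
    z = rng.standard_normal((n_paths, n_months, balances.size))
    # Exact GBM log-increment for a one-month step
    log_steps = (mu - 0.5 * sigma ** 2) + sigma * z
    log_paths = np.cumsum(log_steps, axis=1)
    account_paths = balances * np.exp(log_paths)

    # Aggregate across accounts at each future month
    totals = account_paths.sum(axis=2)
    start = np.full((n_paths, 1), balances.sum())
    return np.hstack([start, totals])


def outflow_at_risk(total_paths, horizon, confidence=0.99):
    """Outflow over `horizon` months that is not exceeded with the given confidence."""
    outflows = total_paths[:, 0] - total_paths[:, horizon]
    return float(np.quantile(outflows, confidence))


# Hypothetical portfolio: balances in SEK, monthly drift (negative = decay), monthly volatility
balances = [100_000.0, 50_000.0, 20_000.0]
vols = [0.02, 0.05, 0.15]
base = simulate_portfolio_balances(balances, [-0.005, -0.01, -0.03], vols)
# Assumed stress scenario: decay rates three times the baseline
stress = simulate_portfolio_balances(balances, [-0.015, -0.03, -0.09], vols)

print("3-month 99% outflow, base:  ", round(outflow_at_risk(base, 3)))
print("3-month 99% outflow, stress:", round(outflow_at_risk(stress, 3)))
```

In practice the drift and volatility inputs could be calibrated from `avg_decay_rate` and `balance_volatility` in `account_summary`, with each stress scenario expressed as a shift in those inputs. Correlated shocks across accounts would be a natural refinement.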

### How to Interpret the Outputs and Use the Model

1.  **Core vs. Non-Core Classification**: This is the cornerstone of your liquidity risk management.
    *   **Core - Highly Stable** deposits are your most reliable funding source. They are less likely to run off in a crisis and can be assigned a longer behavioral maturity for regulatory reporting (like LCR and NSFR).
    *   **Non-Core** deposits are flighty. Plan to hold a larger High-Quality Liquid Asset (HQLA) buffer against them, because less stable deposits carry higher assumed outflow rates in stress tests and in the LCR.
    *   **Action**: Adjust your HQLA portfolio based on the mix of these deposits. If the Non-Core portion grows, your liquidity risk is increasing.

2.  **Behavioral Maturity Analysis**:
    *   **Vintage Curves**: These show how deposit balances for new cohorts evolve. If recent vintages are decaying faster than older ones, it could signal a change in customer behavior or product appeal.
    *   **Decay Curves**: These directly estimate the "stickiness" of funds. The flatter the curve, the more stable the deposit segment. The "permanent balance floor" (where the curve flattens out) is a crucial input for estimating the stable portion of your funding.
    *   **Action**: Use the average life and decay rates to justify the behavioral assumptions in your liquidity models. For example, you can argue for a lower outflow rate on "Core - Highly Stable" deposits by showing their flat decay curve.

3.  **Statistical Models**:
    *   **Survival Analysis**: The survival curve gives you the probability that a deposit will remain "stable" over time. A steep drop indicates a high risk of early, significant withdrawals. This helps quantify the stability of different segments.
    *   **Time Series Forecasting (ARIMA)**: This model helps predict near-term deposit levels. It's useful for operational cash management and projecting funding needs over the next 1-3 months.
    *   **Monte Carlo Simulation (Next Step)**: This is essential for stress testing. By simulating thousands of possible futures under adverse conditions, you can estimate your potential liquidity shortfall and ensure your buffer is sufficient to withstand a severe crisis.

4.  **Segmentation Analysis**:
    *   This analysis reveals where your risks and strengths lie. For example, you might find that older customers (`60+`) have higher, more stable balances, making them a key segment to retain. You might also find that deposits from a specific geographic region are more volatile.
    *   **Action**: Tailor your product strategy, marketing, and pricing based on these insights. You could offer loyalty benefits to stable customer segments or re-evaluate high-rate "hot money" products that attract non-core funds.
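
To make the "average life" mentioned under Behavioral Maturity Analysis concrete, here is a small sketch of how it could be read off a decay curve. It assumes a retention curve normalized to 1.0 at month 0 with monthly spacing, which a column of `decay_pivot` would only approximate after rescaling. The helper `average_life_months` and both example curves are hypothetical. The area under the curve is approximated with the trapezoid rule, so the result is truncated at the last observed month and understates the life of balances that never leave.

```python
import numpy as np


def average_life_months(retention):
    """Approximate behavioral average life (in months) from a retention curve.

    retention[k] is the fraction of the starting balance still on book after
    k months, with retention[0] == 1.0. The average life is the area under
    this curve, computed here with the trapezoid rule on monthly spacing.
    """
    r = np.asarray(retention, dtype=float)
    return float(0.5 * (r[:-1] + r[1:]).sum())


months = np.arange(0, 37)  # 36-month observation window

# Hypothetical curves: a permanent floor plus a geometrically decaying remainder
sticky = 0.6 + 0.4 * 0.97 ** months    # 60% floor, slow decay
flighty = 0.1 + 0.9 * 0.85 ** months   # 10% floor, fast decay

print("Sticky segment: ", round(average_life_months(sticky), 1), "months")
print("Flighty segment:", round(average_life_months(flighty), 1), "months")
```

A flat curve over the 36-month window would return 36 months, the maximum this truncated measure can report, which is why the observed floor should be considered alongside the number.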

### Conclusion

This Python model provides a robust framework for assessing and managing liquidity risk from retail deposits. By combining rules-based classification, behavioral analysis, and statistical modeling, a bank can gain deep insights into its funding stability, optimize its balance sheet, and meet regulatory requirements more effectively. The key is to regularly refresh the model with new data to capture evolving customer behaviors and market conditions.