Components and indicators

Monitoring coverage

```mermaid
flowchart LR
    EXP["fast-epdg<br/>/metrics :9817"]
    EXP --> CFG["Config<br/>2 metrics"]
    EXP --> NET["Network<br/>1 metric"]
    EXP --> PROTO["Protocols L5-L7<br/>15 metrics"]
    EXP --> SVC["Service KPI<br/>4 metrics"]
    EXP --> SESS["Session State<br/>4 metrics"]
    EXP --> APP["Application<br/>3 metrics"]
    EXP --> SYS["System<br/>4 metrics"]
    PROTO --> IKEV2["IKEv2<br/>SWu — 3"]
    PROTO --> GTPC["GTPv2-C<br/>S2b — 4"]
    PROTO --> GTPU["GTP-U<br/>S2b data — 3"]
    PROTO --> DIA["Diameter<br/>SWm/SWx/S6b — 5"]
```
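The exporter endpoint shown above (`/metrics` on port 9817) can be scraped with a standard Prometheus job. This is a minimal sketch: the job name and target host are assumptions, while the port, path, and the 10-second interval come from this document.

```yaml
# prometheus.yml fragment (sketch) for the fast-epdg exporter.
# job_name and target host are illustrative assumptions.
scrape_configs:
  - job_name: "fast-epdg"
    scrape_interval: 10s
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:9817"]
```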

Quantitative review by category

| Category | Metrics | Scrape interval | Key indicators |
|---|---|---|---|
| Config | 2 | 10 s | Configuration status, reload counter |
| Network | 1 | 10 s | Peer connection status (PGW/AAA/HSS) |
| IKEv2 (SWu) | 3 | 10 s | Messages by type (IKE_SA_INIT, IKE_AUTH, CREATE_CHILD_SA), latency histogram, errors |
| GTPv2-C (S2b) | 4 | 10 s | Messages (Create/Modify/Delete Session), latencies, errors, retransmissions |
| GTP-U data plane | 3 | 10 s | Packets/bytes, tunneling errors |
| Diameter (SWm/SWx/S6b) | 5 | 10 s | Messages by command code (DER/DEA, MAR/MAA, AAR/AAA), latencies, errors, watchdog, connection status |
| Service KPI | 4 | 10 s | Attach success rate, attach duration histogram, service availability, uptime |
| Session State | 4 | 10 s | IKE SAs, Child SAs, GTP sessions, total subscribers |
| Application | 3 | 10 s | Thread count, memory, log messages by level |
| System | 4 | 10 s | CPU utilization, memory, memory utilization, open file descriptors |
| **Total** | **33** | | |
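As an illustration of how the Service KPI category might be used, the attach success rate can be computed in PromQL. The metric names below are assumptions for the sketch, not confirmed exporter series names:

```promql
# Hypothetical metric names; substitute the exporter's actual series.
sum(rate(epdg_service_attach_success_total[5m]))
  /
sum(rate(epdg_service_attach_attempts_total[5m]))
```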

Naming principles

All metrics have the prefix epdg_ and are organized in a hierarchy:

epdg_
├── config_*           # Configuration
├── network_*          # Network layer
├── ikev2_*            # SWu (IKEv2/IPSec)
├── gtp_*              # S2b control-plane GTPv2-C
├── gtpu_*             # S2b data-plane GTP-U
├── diameter_*         # SWm/SWx/S6b
├── service_*          # Service KPIs (attach, availability, uptime)
├── session_*          # Session Status (IKE SA, Child SA, GTP, subscribers)
├── app_*              # App Metrics (memory, threads, logs)
└── system_*           # System metrics (CPU, disk, network)
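Because every metric name starts with `epdg_` followed by a category segment, the category can be recovered mechanically from the exposition text. The sketch below groups metric names by that segment; the sample metric names are illustrative assumptions, not confirmed exporter output.

```python
import re
from collections import defaultdict

# Hypothetical sample of the exporter's /metrics output; the metric
# names below are illustrative, not taken from the real exporter.
SAMPLE = """\
epdg_config_reloads_total 3
epdg_ikev2_messages_total{type="IKE_AUTH"} 120
epdg_gtpu_packets_total{direction="uplink"} 9000
epdg_session_ike_sa_active 42
"""

def group_by_category(metrics_text: str) -> dict[str, list[str]]:
    """Group epdg_* metric names by their category segment (the part
    between the epdg_ prefix and the first following underscore)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in metrics_text.splitlines():
        m = re.match(r"epdg_([a-z0-9]+)_\w+", line)
        if m:
            # Strip labels and the sample value to keep only the name.
            name = line.split("{")[0].split(" ")[0]
            groups[m.group(1)].append(name)
    return dict(groups)

print(group_by_category(SAMPLE))
```

This mirrors the hierarchy above: `ikev2`, `gtpu`, `session`, and so on fall out as the grouping keys.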