Question 1

Should I use peak or average EPS?

Accepted Answer

Use **average EPS** for storage sizing and **peak EPS** for ingest pipeline capacity planning.

Storage depends on total volume over time, so average EPS × average event size × seconds is the right metric. Your ingest pipeline (log forwarders, collectors, message queues), however, must absorb peak load without dropping events. Peaks during incident response, scans, or business hours can reach 5–10× the average EPS. Size ingest capacity for peak; size storage for average.
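
As a rough illustration of that split, the sketch below sizes storage from average EPS and ingest capacity from a peak multiplier. The 2,500 EPS and 650-byte figures come from the worked-example table on this page; the 90-day retention window and the 10× multiplier (the top of the 5–10× range) are assumptions chosen for the example, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def raw_storage_bytes(avg_eps: float, avg_event_bytes: float, retention_days: int) -> float:
    """Raw volume over the retention window, before indexing or compression."""
    return avg_eps * avg_event_bytes * SECONDS_PER_DAY * retention_days


def ingest_capacity_eps(avg_eps: float, peak_multiplier: float) -> float:
    """EPS that forwarders, collectors, and queues should sustain without loss."""
    return avg_eps * peak_multiplier


# Assumed inputs: 90 days of retention and a 10x peak-to-average ratio.
storage = raw_storage_bytes(avg_eps=2_500, avg_event_bytes=650, retention_days=90)
capacity = ingest_capacity_eps(avg_eps=2_500, peak_multiplier=10)

print(f"raw storage for 90 days: {storage / 1e12:.2f} TB")
print(f"ingest capacity target: {capacity:,.0f} EPS")
```

Both numbers start from the same average EPS, but only the ingest target is scaled by the peak multiplier.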

Question 2

How do I find my actual EPS?

Accepted Answer

If you have an existing SIEM or log management system, query the events-per-second metric from its monitoring dashboard. Most SIEMs expose EPS as a built-in operational metric.

For a new deployment, run your log sources through a sample pipeline for one week, so the sample covers business days, a weekend, and ideally a period of elevated activity. Divide total events by total seconds to get average EPS. A measured sample is far more reliable than a theoretical estimate, and it also yields average event sizes from real log data.
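
The division described above might look like the following minimal sketch. The `SampleWindow` type and the event and byte totals are hypothetical, made up so the arithmetic is easy to follow; substitute the counters from your own sample pipeline.

```python
from dataclasses import dataclass


@dataclass
class SampleWindow:
    total_events: int      # events counted during the sample
    total_bytes: int       # raw bytes received during the sample
    duration_seconds: int  # length of the sample window


def average_eps(sample: SampleWindow) -> float:
    """Total events divided by total seconds."""
    return sample.total_events / sample.duration_seconds


def average_event_bytes(sample: SampleWindow) -> float:
    """Mean raw event size observed in the sample."""
    return sample.total_bytes / sample.total_events


# Hypothetical one-week sample (7 days x 86,400 seconds).
week = SampleWindow(
    total_events=1_512_000_000,
    total_bytes=982_800_000_000,
    duration_seconds=7 * 86_400,
)

print(f"average EPS: {average_eps(week):,.0f}")
print(f"average event size: {average_event_bytes(week):,.0f} bytes")
```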

Question 3

What retention period should I plan for?

Accepted Answer

Start with your legal, regulatory, customer, and internal policy requirements. Do not assume one retention period fits every organization or every log type.

Useful planning anchors:
- **PCI DSS** commonly requires audit log history for at least **12 months**, with at least the most recent **3 months** immediately available for analysis.
- **GDPR-style storage limitation** does not set one fixed security-log period; personal data should be kept only as long as necessary for the purpose and legal basis.
- **SOC 2, ISO 27001, and internal security policies** usually depend on the controls, contracts, risk assessment, and auditor expectations.

From a threat-hunting perspective, many teams keep **30–90 days** searchable and move older logs to cheaper retention. Confirm the final retention design with compliance, legal, and security leadership.

Question 4

How does SIEM compression affect the storage estimate?

Accepted Answer

Log data compresses very well — text-based formats (syslog, JSON, CEF) typically achieve 70–85% compression. However, SIEM vendors store much more than raw events: parsed fields, inverted indexes for fast search, correlation state, and metadata.

The net result varies significantly:
- **Splunk SmartStore**: expect 1.5–2× raw event volume after indexing
- **Elastic (ECS + ILM)**: with compression enabled, 1.0–1.4× raw
- **Microsoft Sentinel (Log Analytics)**: ~1.2× raw for most log types

Run a pilot with 1–2 representative log sources before committing to a storage architecture. Vendor sizing tools give useful starting estimates but tend to be conservative.
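
One way to evaluate such a pilot is to compare the measured stored-to-raw ratio with the ranges listed above. The sketch below assumes you can read two numbers from the pilot: raw bytes ingested and bytes on disk afterward. The function names and the pilot figures are hypothetical; the ranges are copied from the list in this answer.

```python
# Stored-to-raw ratios quoted in the list above, as (low, high).
EXPECTED_RATIO = {
    "Splunk SmartStore": (1.5, 2.0),
    "Elastic (ECS + ILM)": (1.0, 1.4),
    "Microsoft Sentinel (Log Analytics)": (1.2, 1.2),
}


def stored_to_raw_ratio(raw_bytes_ingested: int, bytes_on_disk: int) -> float:
    """Measured storage multiplier from a pilot run."""
    return bytes_on_disk / raw_bytes_ingested


def compare_with_expected(platform: str, ratio: float) -> str:
    """Say where a measured ratio falls relative to the quoted range."""
    low, high = EXPECTED_RATIO[platform]
    if ratio < low:
        return "below expected range"
    if ratio > high:
        return "above expected range"
    return "within expected range"


# Hypothetical pilot: 50 GB of raw logs occupied 65 GB after indexing.
pilot_ratio = stored_to_raw_ratio(
    raw_bytes_ingested=50_000_000_000,
    bytes_on_disk=65_000_000_000,
)
print(f"measured ratio: {pilot_ratio:.2f}x raw")
print(compare_with_expected("Elastic (ECS + ILM)", pilot_ratio))
```

A ratio outside the quoted range is not necessarily a problem, but it is a signal to check field mappings, index settings, or the mix of log types before scaling the estimate up.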

| Log source | Typical EPS | Avg event size |
|---|---|---|
| Windows endpoints (500) | 1,000 | 800 bytes |
| Linux servers (20) | 300 | 600 bytes |
| Firewall / IDS (5) | 800 | 500 bytes |
| Cloud (AWS/Azure) | 400 | 700 bytes |
| Total | 2,500 | ~650 bytes |
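
The totals in this table can be reproduced with a few lines of arithmetic. The sketch below assumes raw daily volume is simply the sum of EPS × event size across sources, times 86,400 seconds; it uses only the rows above. Note that the EPS-weighted average event size comes out at 664 bytes, slightly above the rounded ~650 bytes shown in the table, which matches the plain mean of the four sizes.

```python
# (log source, typical EPS, average event size in bytes), from the table above.
SOURCES = [
    ("Windows endpoints (500)", 1_000, 800),
    ("Linux servers (20)", 300, 600),
    ("Firewall / IDS (5)", 800, 500),
    ("Cloud (AWS/Azure)", 400, 700),
]

total_eps = sum(eps for _, eps, _ in SOURCES)
bytes_per_second = sum(eps * size for _, eps, size in SOURCES)

# Weight each source's event size by how many events it contributes.
weighted_avg_bytes = bytes_per_second / total_eps

# Raw daily volume in decimal gigabytes (1 GB = 10^9 bytes).
daily_gb = bytes_per_second * 86_400 / 1e9

print(f"total EPS: {total_eps:,}")
print(f"weighted average event size: {weighted_avg_bytes:.0f} bytes")
print(f"raw volume per day: {daily_gb:.1f} GB")
```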

| Source type | Low EPS | Typical EPS | High EPS |
|---|---|---|---|
| Windows DC (per server) | 50 | 200 | 800 |
| Windows workstation (per host) | 1 | 3 | 10 |
| Linux server (per host) | 5 | 15 | 100 |
| Palo Alto / Fortinet firewall | 100 | 500 | 5,000 |
| IDS/IPS sensor | 200 | 1,000 | 10,000 |
| Web application (per node) | 10 | 100 | 2,000 |
| AWS CloudTrail (per account) | 5 | 50 | 500 |
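
When no measured EPS is available, these benchmarks can be multiplied by an asset inventory to get a first estimate. The following is a minimal sketch: the dictionary keys, the `estimate_eps` helper, and the inventory counts are hypothetical, and it assumes the firewall and IDS/IPS rows are per device, which the table does not state explicitly.

```python
# (low, typical, high) EPS per unit, copied from the benchmark table above.
BENCHMARK_EPS = {
    "windows_dc": (50, 200, 800),
    "windows_workstation": (1, 3, 10),
    "linux_server": (5, 15, 100),
    "firewall": (100, 500, 5_000),         # assumed to be per device
    "ids_ips_sensor": (200, 1_000, 10_000),
    "web_app_node": (10, 100, 2_000),
    "cloudtrail_account": (5, 50, 500),
}
LEVELS = {"low": 0, "typical": 1, "high": 2}


def estimate_eps(inventory: dict[str, int], level: str = "typical") -> int:
    """Sum count x per-unit EPS over every source type in the inventory."""
    column = LEVELS[level]
    return sum(count * BENCHMARK_EPS[source][column] for source, count in inventory.items())


# Hypothetical environment used only to exercise the function.
inventory = {
    "windows_dc": 4,
    "windows_workstation": 500,
    "linux_server": 20,
    "firewall": 2,
    "ids_ips_sensor": 1,
}

for level in LEVELS:
    print(f"{level:>8}: {estimate_eps(inventory, level):,} EPS")
```

The spread between the low and high results is wide, which is why a measured sample is preferable once log sources can be connected.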

| Factor | Effect on stored volume |
|---|---|
| Field extraction (parsing) | +20–40% |
| Index structures | +15–25% |
| Compression | −40–70% (depends on log type) |
| Net typical overhead | +10–50% |
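
To turn the net overhead row into a planning range, apply both ends of it to the raw volume. This sketch uses only the "Net typical overhead" figure rather than compounding the individual factors, since the table does not say how they combine. The 140.4 GB/day input is 2,500 EPS × 650 bytes × 86,400 seconds from the worked-example totals, and the function name is hypothetical.

```python
# Net typical overhead from the table above: +10% to +50% over raw volume.
NET_OVERHEAD_RANGE = (0.10, 0.50)


def stored_range(raw_volume: float) -> tuple[float, float]:
    """Return (low, high) stored volume in the same unit as raw_volume."""
    low, high = NET_OVERHEAD_RANGE
    return raw_volume * (1 + low), raw_volume * (1 + high)


# 2,500 EPS x 650 bytes x 86,400 s = 140.4 GB of raw events per day.
low_gb, high_gb = stored_range(140.4)
print(f"stored per day: {low_gb:.1f} to {high_gb:.1f} GB")
```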

| Tier | Access time | Relative cost | Typical use |
|---|---|---|---|
| Hot (SSD/SAN) | <1 second | High | Last 7–30 days, active investigation |
| Warm (spinning disk) | Seconds | Medium | 30–90 days, routine queries |
| Cold (object store, S3/GCS) | Minutes | Very low | 90–365+ days, compliance archive |
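
A retention window can be split across these tiers by day ranges. The sketch below assumes hot covers days 0–30, warm days 30–90, and cold everything older, which is one reading of the "Typical use" column; the boundaries, the function name, and the 150 GB/day stored volume are assumptions for illustration.

```python
# (tier, first day, last day) boundaries; cold is open-ended to cover "365+".
TIER_BOUNDARIES = [
    ("hot", 0, 30),
    ("warm", 30, 90),
    ("cold", 90, float("inf")),
]


def tier_volumes_gb(daily_stored_gb: float, retention_days: int) -> dict[str, float]:
    """Stored volume held in each tier for a given total retention period."""
    volumes = {}
    for name, start, end in TIER_BOUNDARIES:
        # Days of retention that fall inside this tier's age range.
        days = max(0, min(retention_days, end) - start)
        volumes[name] = days * daily_stored_gb
    return volumes


# Assumed inputs: 150 GB stored per day, 365 days of total retention.
for tier, gb in tier_volumes_gb(daily_stored_gb=150, retention_days=365).items():
    print(f"{tier:>4}: {gb / 1000:.2f} TB")
```

With these inputs most of the volume lands in the cold tier, so the hot and warm boundaries drive cost far more than the total retention period does.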

SIEM Log Volume Estimator

Calculator tool

How this calculator works

The Core Formula

Worked Example — Mid-Size Enterprise

EPS Benchmarks by Log Source

Indexing Overhead

Storage Tiers

Frequently asked questions