Itops
LogZilla App Store application: Itops
Operations Monitoring
The Operations Monitoring app provides operational event detection and categorization across multiple infrastructure domains. It analyzes log messages from system services, containers, orchestration platforms, databases, and application servers to identify and tag operationally significant events.
App Function
The Operations app uses a high-performance two-gate filtering pipeline to:
- Detect operational events across multiple categories
- Tag events with relevant operational classifications
- Enable rapid incident identification and response
- Provide unified operations monitoring across diverse log sources
Operations Categories
The app categorizes operational events into four primary areas:
- High Severity Events - Critical system events requiring immediate attention (kernel panics, OOM, disk full, hardware errors)
- Service Disruptions - Service outages, failures, restarts, and connectivity issues
- Application Errors - Application crashes, exceptions, and runtime errors
- Infrastructure Changes - Configuration changes, deployments, scaling events, and maintenance activities
How It Works
Two-Gate Pipeline
The app uses an optimized two-stage filtering process for maximum performance:
Gate 1: Seed Terms
Scans the message for category-specific seed terms using literal substring matching. Only messages containing relevant keywords proceed to pattern matching.
Gate 2: Pattern Matching
Uses an optimized first-character lookup table to efficiently match against category-specific operational patterns.
This pipeline achieves high throughput while detecting operational events from ANY log source, regardless of the program name.
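The two gates can be illustrated with a minimal Python sketch. It is not the app's implementation: the seed terms, patterns, category keys, and function names below are hypothetical placeholders, and lowercasing the message before matching is an assumption made for the example. It only shows how a cheap substring check can screen messages before a first-character lookup table narrows the patterns tried at each position.

```python
from collections import defaultdict

# Hypothetical seed terms and patterns, chosen only for illustration.
SEED_TERMS = {
    "high_severity": ("panic", "out of memory", "no space left"),
    "service_disruption": ("connection refused", "failed", "unreachable"),
}
PATTERNS = {
    "high_severity": ("kernel panic", "out of memory: kill", "no space left on device"),
    "service_disruption": ("connection refused", "failed to start", "host unreachable"),
}


def build_first_char_table(patterns):
    """Index patterns by first character so a scan only tries
    patterns that could begin at the current position."""
    table = defaultdict(list)
    for pattern in patterns:
        table[pattern[0]].append(pattern)
    return dict(table)


TABLES = {cat: build_first_char_table(pats) for cat, pats in PATTERNS.items()}


def gate1_seed_terms(message):
    """Gate 1: categories whose seed terms occur as literal substrings."""
    return [cat for cat, seeds in SEED_TERMS.items()
            if any(seed in message for seed in seeds)]


def gate2_patterns(message, category):
    """Gate 2: match the category's patterns via its first-character table."""
    table = TABLES[category]
    for i, ch in enumerate(message):
        for pattern in table.get(ch, ()):
            if message.startswith(pattern, i):
                return True
    return False


def classify(message):
    """Return categories that pass both gates; the program name is never consulted."""
    lowered = message.lower()  # assumption: case-insensitive matching
    return [cat for cat in gate1_seed_terms(lowered) if gate2_patterns(lowered, cat)]
```

In this sketch, a message such as `job failed with exit code 1` passes Gate 1 for the hypothetical `service_disruption` category because it contains the seed term `failed`, but Gate 2 rejects it because no full pattern matches.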
Supported Log Sources
The app detects operational events from ANY log source based on message content. This includes but is not limited to:
System Infrastructure
- Linux/Unix: kernel messages, systemd, init systems, cron
- Windows: Event Log, Service Control Manager, Windows Update
- Containers: Docker, containerd, Kubernetes, OpenShift
Network Devices
Note: For optimal detection of network device events, install the associated vendor app from the LogZilla App Store (e.g., cisco, fortigate, palo_alto).
- Firewalls: Cisco ASA, Fortigate, Palo Alto, WatchGuard, pfSense
- Routers/Switches: Cisco IOS, Juniper, Aruba, Meraki
- Load Balancers: F5, Citrix, HAProxy
Applications
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch
- Web Servers: nginx, Apache, IIS
- Application Servers: Tomcat, JBoss, WebLogic
- Message Queues: RabbitMQ, Kafka
Cloud & Orchestration
- Cloud Platforms: AWS, Azure, GCP
- Orchestration: Kubernetes, Docker Swarm, Nomad
- CI/CD: Jenkins, GitLab, GitHub Actions, ArgoCD
User Tags Generated
The app generates the following user tags based on detected operational events:
| Tag Name | Description | Category |
|---|---|---|
| Ops High Severity | Critical system event detected | High Severity |
| Ops Service Disruption | Service outage or failure | Service Disruption |
| Ops App Error | Application error or exception | Application Error |
| Ops Infra Change | Infrastructure change detected | Infrastructure Change |
| Ops Event | Any operational event (rollup) | All Categories |
| Ops Severity Level | Critical/High/Medium/Low | Severity Classification |
Events can receive multiple tags if they match criteria for multiple categories.
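As a rough illustration of multi-tagging, the sketch below maps matched categories to the tag names from the table and adds the rollup tag whenever at least one category matched. The function name `tags_for` and the representation of tags as a plain list of names are assumptions for the example; the values the app actually stores in each tag are not specified here.

```python
# Tag names are taken from the table above; the data structure is illustrative.
CATEGORY_TAGS = {
    "High Severity": "Ops High Severity",
    "Service Disruption": "Ops Service Disruption",
    "Application Error": "Ops App Error",
    "Infrastructure Change": "Ops Infra Change",
}
ROLLUP_TAG = "Ops Event"


def tags_for(categories):
    """Return one tag per matched category, plus the rollup tag if any matched."""
    matched = set(categories)
    # Iterate over CATEGORY_TAGS so the output order is stable.
    tags = [tag for category, tag in CATEGORY_TAGS.items() if category in matched]
    if tags:
        tags.append(ROLLUP_TAG)
    return tags
```

For example, an event matching both Service Disruption and Application Error would carry `Ops Service Disruption`, `Ops App Error`, and `Ops Event` in this sketch.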
Severity Level Hierarchy
When an event matches multiple categories, the highest severity takes precedence:
- Critical - High Severity Events (kernel panic, OOM, hardware failure)
- High - Service Disruptions (service down, connection refused)
- Medium - Application Errors (exceptions, crashes)
- Low - Infrastructure Changes (deployments, config changes)
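The precedence rule above can be expressed as a short Python sketch. The category-to-level mapping follows the list above; the function name `severity_level` and the choice to return `None` when nothing matched are assumptions made for illustration.

```python
# Mapping and ordering follow the hierarchy listed above.
SEVERITY_BY_CATEGORY = {
    "High Severity": "Critical",
    "Service Disruption": "High",
    "Application Error": "Medium",
    "Infrastructure Change": "Low",
}
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]  # lowest to highest


def severity_level(categories):
    """Return the highest severity among the matched categories, or None."""
    levels = [SEVERITY_BY_CATEGORY[category] for category in categories]
    if not levels:
        return None
    # Rank each level by its position in SEVERITY_ORDER and keep the highest.
    return max(levels, key=SEVERITY_ORDER.index)
```

An event that matches both Application Errors and Service Disruptions would therefore be assigned `High` under this sketch.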
Integration with Other Apps
The Operations app runs at rule priority 900, which means it executes after most vendor-specific apps. This allows it to benefit from message parsing and normalization performed by vendor apps:
- cisco app parses Cisco ASA, IOS, and other Cisco device logs
- fortigate app parses Fortigate firewall logs
- palo_alto app parses Palo Alto firewall logs
- ms_windows app parses Windows Event Log messages
- linux app normalizes various Linux service messages
Install relevant vendor apps alongside Operations for optimal detection of operational events from those sources.
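The effect of rule ordering can be sketched in Python as follows. This is a toy model, not LogZilla's rule engine: the vendor rule, its priority of 100, the event dictionary layout, and the function names are all hypothetical. Only the priority of 900 and the idea that lower-priority-number rules run first come from the text above.

```python
def vendor_normalize(event):
    """Hypothetical vendor rule: collapse whitespace and lowercase the message."""
    event["message"] = " ".join(event["message"].split()).lower()
    return event


def ops_detect(event):
    """Toy stand-in for the Operations rule: tag one known phrase."""
    if "connection refused" in event["message"]:
        event["tags"].append("Ops Service Disruption")
    return event


# (priority, rule) pairs; 100 is an invented value, 900 is from the text.
RULES = [(900, ops_detect), (100, vendor_normalize)]


def apply_rules(event, rules):
    """Run rules in ascending priority order, so 900 runs after 100."""
    for _priority, rule in sorted(rules, key=lambda pair: pair[0]):
        event = rule(event)
    return event
```

In this toy model, a raw message such as `Connection   REFUSED by peer` is only tagged because the vendor rule normalized it before the Operations rule ran.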
Dashboard
The app includes a detailed operations overview dashboard featuring:
- Real-time operations event rates in events per second (EPS) and events per day (EPD)
- Severity level distribution and scoring
- High severity event tracking by host and program
- Service disruption monitoring
- Application error analysis
- Infrastructure change tracking
- Event rate charts for all categories
- Operations events by syslog severity