Itops

LogZilla App Store application: Itops

Operations Monitoring

The Operations Monitoring app provides operational event detection and categorization across multiple infrastructure domains. It analyzes log messages from system services, containers, orchestration platforms, databases, and application servers to identify and tag operationally significant events.

App Function

The Operations app uses a high-performance two-gate filtering pipeline to:

  • Detect operational events across multiple categories
  • Tag events with relevant operational classifications
  • Enable rapid incident identification and response
  • Provide unified operations monitoring across diverse log sources

Operations Categories

The app categorizes operational events into four primary areas:

  1. High Severity Events - Critical system events requiring immediate attention (kernel panics, OOM, disk full, hardware errors)
  2. Service Disruptions - Service outages, failures, restarts, and connectivity issues
  3. Application Errors - Application crashes, exceptions, and runtime errors
  4. Infrastructure Changes - Configuration changes, deployments, scaling events, and maintenance activities

How It Works

Two-Gate Pipeline

The app uses an optimized two-stage filtering process for maximum performance:

Gate 1: Seed Terms - Scans the message for category-specific seed terms using literal substring matching. Only messages containing relevant keywords proceed to pattern matching.

Gate 2: Pattern Matching - Uses an optimized first-character lookup table to efficiently match against category-specific operational patterns.

This pipeline achieves high throughput while detecting operational events from ANY log source, regardless of the program name.
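The two gates can be sketched in Python as follows. This is only an illustration of the idea, not the app's actual implementation: the seed terms, patterns, category keys, and function names are all hypothetical, and the case-insensitive matching is an assumption. Gate 2 is read here as "index the patterns by their first character, then only try patterns whose first character matches the current position in the message".

```python
from collections import defaultdict

# Hypothetical seed terms and patterns; the app's real lists are not documented here.
SEED_TERMS = {
    "high_severity": ["panic", "out of memory", "no space left"],
    "service_disruption": ["connection refused", "failed", "unavailable"],
}
PATTERNS = {
    "high_severity": ["kernel panic", "out of memory", "no space left on device"],
    "service_disruption": ["connection refused", "service failed", "service unavailable"],
}


def build_first_char_table(patterns):
    """Index patterns by their first character for fast candidate lookup."""
    table = defaultdict(list)
    for pattern in patterns:
        table[pattern[0]].append(pattern)
    return dict(table)


TABLES = {cat: build_first_char_table(pats) for cat, pats in PATTERNS.items()}


def gate1(message_lower):
    """Gate 1: keep only categories whose seed terms appear as literal substrings."""
    return [
        cat for cat, seeds in SEED_TERMS.items()
        if any(seed in message_lower for seed in seeds)
    ]


def gate2(message_lower, table):
    """Gate 2: at each position, test only patterns starting with that character."""
    for i, ch in enumerate(message_lower):
        for pattern in table.get(ch, ()):
            if message_lower.startswith(pattern, i):
                return True
    return False


def classify(message):
    """Return the categories that pass both gates (assumes case-insensitive matching)."""
    lowered = message.lower()
    return [cat for cat in gate1(lowered) if gate2(lowered, TABLES[cat])]
```

In this sketch, a message such as "job failed validation" passes Gate 1 for the service disruption category (it contains the seed term "failed") but is rejected by Gate 2 because no full pattern matches. The cheap substring check filters out most messages before the more specific pattern check runs.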

Supported Log Sources

The app detects operational events from ANY log source based on message content. This includes but is not limited to:

System Infrastructure

  • Linux/Unix: kernel messages, systemd, init systems, cron
  • Windows: Event Log, Service Control Manager, Windows Update
  • Containers: Docker, containerd, Kubernetes, OpenShift

Network Devices

Note: For optimal detection of network device events, install the associated vendor app from the LogZilla App Store (e.g., cisco, fortigate, palo_alto).

  • Firewalls: Cisco ASA, Fortigate, Palo Alto, WatchGuard, pfSense
  • Routers/Switches: Cisco IOS, Juniper, Aruba, Meraki
  • Load Balancers: F5, Citrix, HAProxy

Applications

  • Databases: PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch
  • Web Servers: nginx, Apache, IIS
  • Application Servers: Tomcat, JBoss, WebLogic
  • Message Queues: RabbitMQ, Kafka

Cloud & Orchestration

  • Cloud Platforms: AWS, Azure, GCP
  • Orchestration: Kubernetes, Docker Swarm, Nomad
  • CI/CD: Jenkins, GitLab, GitHub Actions, ArgoCD

User Tags Generated

The app generates the following user tags based on detected operational events:

Tag Name                Description                      Category
Ops High Severity       Critical system event detected   High Severity
Ops Service Disruption  Service outage or failure        Service Disruption
Ops App Error           Application error or exception   Application Error
Ops Infra Change        Infrastructure change detected   Infrastructure Change
Ops Event               Any operational event (rollup)   All Categories
Ops Severity Level      Critical/High/Medium/Low         Severity Classification

Events can receive multiple tags if they match criteria for multiple categories.

Severity Level Hierarchy

When an event matches multiple categories, the highest severity takes precedence:

  1. Critical - High Severity Events (kernel panic, OOM, hardware failure)
  2. High - Service Disruptions (service down, connection refused)
  3. Medium - Application Errors (exceptions, crashes)
  4. Low - Infrastructure Changes (deployments, config changes)
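The tagging and precedence rules above can be sketched as a small Python function. The tag names and severity levels come from this page; the category keys, the function name, and the use of True as a placeholder tag value are assumptions made for illustration and may not match what the app actually stores.

```python
# Category key (hypothetical) -> (user tag, severity level) as listed on this page.
CATEGORY_INFO = {
    "high_severity": ("Ops High Severity", "Critical"),
    "service_disruption": ("Ops Service Disruption", "High"),
    "application_error": ("Ops App Error", "Medium"),
    "infrastructure_change": ("Ops Infra Change", "Low"),
}

# Highest severity first; a lower index means higher precedence.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def build_tags(matched_categories):
    """Build user tags for an event that matched the given categories."""
    if not matched_categories:
        return {}  # not an operational event: no tags at all
    # One tag per matched category (placeholder value True is an assumption).
    tags = {CATEGORY_INFO[cat][0]: True for cat in matched_categories}
    # Rollup tag applied to every operational event.
    tags["Ops Event"] = True
    # The highest severity among the matched categories takes precedence.
    levels = [CATEGORY_INFO[cat][1] for cat in matched_categories]
    tags["Ops Severity Level"] = min(levels, key=SEVERITY_ORDER.index)
    return tags
```

For example, an event matching both Application Errors and Service Disruptions would receive both category tags, the Ops Event rollup tag, and a severity level of High.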

Integration with Other Apps

The Operations app runs at rule priority 900, which means it executes after most vendor-specific apps. This allows it to benefit from message parsing and normalization performed by vendor apps:

  • cisco app parses Cisco ASA, IOS, and other Cisco device logs
  • fortigate app parses Fortigate firewall logs
  • palo_alto app parses Palo Alto firewall logs
  • ms_windows app parses Windows Event Log messages
  • linux app normalizes various Linux service messages

Install relevant vendor apps alongside Operations for optimal detection of operational events from those sources.
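The effect of rule priority on execution order can be illustrated with a short Python sketch. Only the value 900 for the Operations app comes from this page; the vendor priorities, the Rule class, and the assumption that lower numbers run first are hypothetical and used purely to show the ordering described above.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    app: str
    priority: int


def execution_order(rules):
    """Return app names in run order, assuming lower priority numbers run first."""
    return [rule.app for rule in sorted(rules, key=lambda rule: rule.priority)]


# Vendor priorities below are made-up placeholders; 900 is the documented value.
rules = [Rule("itops", 900), Rule("cisco", 100), Rule("linux", 200)]
order = execution_order(rules)
```

Under these assumptions the vendor rules run before the Operations rule, so the Operations app sees messages that vendor apps have already parsed and normalized.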

Dashboard

The app includes a detailed operations overview dashboard featuring:

  • Real-time operations event rates in events per second and events per day (EPS/EPD)
  • Severity level distribution and scoring
  • High severity event tracking by host and program
  • Service disruption monitoring
  • Application error analysis
  • Infrastructure change tracking
  • Event rate charts for all categories
  • Operations events by syslog severity