Monitoring Thin Pools

Monitoring thin pools with SYMCLI

Monitoring thin pools with SMC

Monitoring thin pools with the event daemon

 

symcfg monitor command starts monitoring

One way to monitor thin pool space is to run the symcfg monitor command with the -i option. The command polls at the interval, in seconds, given with -i and lets the user specify an action script to run once a pool threshold has been reached, for example when the pool is 60% full. The action script can execute each time the condition is encountered or only once; the -norepeat option selects the latter.

 

The action script can perform any task the storage administrator chooses, from sending an e-mail or writing to a log file to adding enabled data devices to the pool in question.
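As an illustration of the "writing to a log file" case, here is a minimal Python sketch of an action script. How symcfg monitor invokes the script, and which arguments it passes, is not covered in these notes, so the sketch simply records whatever arguments it receives. The log file name and the function name record_invocation are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical action script: append one line per invocation to a log file."""
import os
import sys
import tempfile
from datetime import datetime

# Assumed location; pick any path the monitoring user can write to.
LOG_PATH = os.path.join(tempfile.gettempdir(), "thin_pool_actions.log")


def record_invocation(args, log_path=LOG_PATH, now=None):
    """Append a timestamped line holding whatever arguments were received."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H:%M:%S")
    line = f"{stamp} thin pool threshold action invoked: {' '.join(args)}\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line)
    return line


if __name__ == "__main__":
    # The arguments (if any) are whatever the caller supplied; none are assumed.
    record_invocation(sys.argv[1:])
```

The same skeleton could send an e-mail or start a pool expansion task instead of writing a log line; only the body of record_invocation would change.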

 

Monitor Thin Provisioning with SMC

Alerts

Five alerts can be selected: three apply specifically to device pools and two to thin devices.

·         Device Pool Config Change – Change of status as a result of adding or removing data or thin devices.

·         Device Pool Free Space – Free space remaining in a device pool meets or exceeds a specific percentage.

·         Device Pool Status – Change of pool status, e.g. if the pool state changes from Enabled to Disabled.

·         Thin Device Allocation – Allocated space on a thin device meets or exceeds a specific percentage.

·         Thin Device Usage – Amount of used space on a thin device meets or exceeds a specific percentage.

 

Alert Severities

Severity of alerts related to:

•  Pool free space

•  Thin device allocation

•  Thin device usage

SMC can trigger alerts at three severity levels. The severity of an alert will be warning, critical, or fatal, depending on the type of alert and the threshold that is met or exceeded.

Alerts for pool or thin device usage or allocation (Device Pool Free Space, Thin Device Allocation, and Thin Device Usage) will trigger when the following thresholds are met or exceeded:

•  60% - issue a warning alert

•  65% - issue a warning alert

•  70% - issue a warning alert

•  80% - issue a critical alert

•  100% - issue a fatal alert

These are the default values coded in SMC. Alerts that have not been cleared by the user can be viewed by clicking the Alerts button. Alternatively, they can be viewed by clicking the New Alerts button at the upper right of the browser window. This button also shows the number of new alerts (alerts that have not been acknowledged by the user).

Pool Usage Thresholds Set by user

In version 7.0 of SMC, the Device Pool Free Space alert can be customized by the user so that specific pools are monitored against user-defined thresholds.

The dialog box for configuring pool thresholds can be found in the Tasks screen by clicking Config Pool Utilization Threshold... under Administration.

Sample Alerts

[Figure: sample alerts as displayed by SMC]

In the sample shown above, the alert severities, as displayed by SMC, are:

(1) Fatal

(2) Critical

(3) Warning

(4) Information

(5) Normal  

Event Daemon Overview

The event daemon (storevntd) enables the monitoring of Symmetrix operations by detecting and reporting events as they happen. The event daemon continually collects Symmetrix event information in real-time, filters the events by severity and type, and responds by alerting client event applications and/or logging events to specified targets.

When using the daemon with a client event application (for example, the Symmetrix Management Console), the application registers with the event daemon, specifying the events in which it is interested. When used in this manner, the daemon will automatically start when the client application requests its services.

When configuring the daemon to log events, it is possible to set it up to log the events to a remote Syslog server, the UNIX Syslog, the Windows Event log, SNMP, and/or a file on disk. The event daemon is also supported on z/OS. The daemon should typically be configured to automatically start at system boot.

•  Started, stopped and managed via stordaemon command

•  Can be set to start on system boot

•  Event daemon restarted by watchdog daemon (storwatchd) in the event it fails or is stopped

 

By default, the event daemon will automatically start the first time a Solutions Enabler application requires its services.

The daemon can be manually started using the stordaemon command.

stordaemon start storevntd

Alternatively it can be automatically started every time the local host is booted using the following command:

stordaemon install storevntd -autostart

The event daemon is documented in the Solutions Enabler Installation Guide, although the amount of information there is somewhat limited. The most informative source is the daemon_options file, which has examples of how to set up the options. Some of the options shown here are undocumented and were taken from Engineering documentation.

•  Installation Guide

•  /var/symapi/config/daemon_options

•  stordaemon action storevntd -cmd help

•  Some options shown here are undocumented

–        thresh_critical, thresh_major, thresh_warn, etc.

–        sev=<info | warning | minor | major | critical | fatal>

 

Logging Options

•  Reports events in two ways

–        Alerts client event applications

–        Logs events to one or more of four specified targets

            ·  UNIX syslog

            ·  Windows Event Log

            ·  SNMP

            ·  Flat file

•  Client event applications register with the daemon

–        Specifies events of interest

–        Daemon automatically started when client application requests services

–        e.g. SMC, ControlCenter

 

To manually perform monitoring with the event daemon, a user needs to specify a target for the daemon’s messages. The targets can be one or a combination of:

file        Events are written to a file on disk.

snmp        Events are mapped into SNMP traps.

system   Events are written to the local host’s syslog services on Unix. The syslog’s configuration settings determine which log file the message is written to. On Windows the messages are sent to the Event Log.

syslog   Events are sent directly to a remote syslog server, bypassing any local syslog service.

Special applications such as SMC or ControlCenter have been written to take advantage of the event daemon. These applications request the information they need and the event daemon supplies it.

 

 

Reporting of Symmetrix Events

The list of events to be logged is specified in the daemon options file. The event daemon can report on a large number of event categories, which are documented in the Solutions Enabler Installation Guide. The list of individual event codes is documented in a table that covers 12 pages in the Appendix of the Guide.

Certain events correspond to numerical quantities of some sort. A threshold is associated with each severity level, and an event is generated at that severity when the event's value reaches or exceeds the associated threshold. The thresh_* fields (thresh_critical, thresh_major, thresh_warn, thresh_info) can be used to override the default threshold values controlling when an event is delivered.

One example of this is the event that indicates the percentage of space used for a device pool. The fields can be set to control when events are to be generated, e.g.: thresh_critical=96, thresh_major=80, thresh_warn=60, thresh_info=40.
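The mapping from a measured value to a severity can be sketched in a few lines of Python. This is only an illustration of the threshold idea, not the daemon's own logic: it assumes a threshold counts as reached at equality (which is how the sample log entries later in these notes read), and the function name severity_for is hypothetical.

```python
# Threshold fields from most to least severe, paired with the severity
# label that appears in the event log (sev=...).
THRESHOLD_FIELDS = [
    ("thresh_critical", "critical"),
    ("thresh_major", "major"),
    ("thresh_warn", "warning"),
    ("thresh_info", "info"),
]


def severity_for(value, thresholds):
    """Return the most severe level whose threshold `value` has reached, else None.

    Assumption: a threshold is considered reached when value >= threshold.
    """
    for field, severity in THRESHOLD_FIELDS:
        limit = thresholds.get(field)
        if limit is not None and value >= limit:
            return severity
    return None


# The example values quoted in the text.
example = {
    "thresh_critical": 96,
    "thresh_major": 80,
    "thresh_warn": 60,
    "thresh_info": 40,
}

for pct in (35, 40, 75, 96):
    print(pct, severity_for(pct, example))
```

With the example values, 35% yields no event, 40% maps to info, 75% to warning and 96% to critical.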

An easy way to find the set of supported events and categories is by querying a running event daemon. 

# Load the Symmetrix detector:

stordaemon action storevntd -cmd load_plugin Symmetrix

# List the event categories that it supports:

stordaemon action storevntd -cmd list -categories

SNMP Traps

•  SNMP events logged via SNMP traps to a remote SNMP management server

–        Encoded according to the Fibre Alliance MIB (version 3.0)

•  snmp_trap_client_registration

–        Host name or IP address.

–        Port at this host that the traps should be sent to.

–        The trap sending filter levels as defined in the FC-management (Fibre Channel) MIB

The event daemon provides the necessary SNMP MIB support and trap generation services required to monitor the status of Symmetrix storage environments from third-party enterprise management frameworks. The event daemon includes a loadable SNMP library which, once enabled and configured in the daemon_options file, acts as a self-contained SNMP agent. It is responsible for maintaining internal Fibre Alliance MIB (V3.0) tables, responding to SNMP browse requests, and generating traps in response to events.

For an application to receive SNMP trap information from the event daemon, it must be specified as a trap target. The trap target entry gives the application’s IP address, the port on which the application will be listening for the trap, and the filter that determines the highest severity level for which traps will be sent.

The possible filter values range from 1 through 10, where:

•  1 = Unknown

•  2 = Emergency

•  3 = Alert

•  4 = Critical

•  5 = Error

•  6 = Warning

•  7 = Notify

•  8 = Info

•  9 = Debug

•  10 = Mark (all messages logged)
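The effect of the filter value can be sketched as follows. The sketch assumes the usual FC-management MIB reading of the filter, namely that a trap is sent when its level number is less than or equal to the filter value registered for the target; this matches "10 = Mark (all messages logged)" but should be verified against the MIB. The function name trap_is_sent is hypothetical.

```python
# Filter levels as listed above.
SNMP_FILTER_LEVELS = {
    1: "Unknown",
    2: "Emergency",
    3: "Alert",
    4: "Critical",
    5: "Error",
    6: "Warning",
    7: "Notify",
    8: "Info",
    9: "Debug",
    10: "Mark",
}


def trap_is_sent(event_level, filter_level):
    """Return True if an event at `event_level` passes a target's filter.

    Assumption: events whose level number is <= the filter value are sent.
    """
    for level in (event_level, filter_level):
        if level not in SNMP_FILTER_LEVELS:
            raise ValueError(f"level must be 1 through 10, got {level!r}")
    return event_level <= filter_level


# A target registered with filter 6 (Warning) receives Critical traps
# but not Info traps.
print(trap_is_sent(4, 6), trap_is_sent(8, 6))
```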

Starting the Event Daemon

•          To manually start daemon, issue command:

# stordaemon start storevntd

•          To start the daemon automatically at system boot

# stordaemon install storevntd -autostart

•          When daemon starts, it reads options from daemon_options and prints what it read in the event daemon log

/var/symapi/log/storevntd.log0 or storevntd.log1

•          Events are recorded in the event log whose default name is:

/var/symapi/log/events.log0 or events.log1

•          To have the event daemon re-read the daemon options file

# stordaemon action storevntd -cmd reload

The event daemon can be started explicitly by the user or set to start automatically every time the system boots. When the daemon starts it prints an entry in the event daemon log storevntd.log0 or storevntd.log1 file. These are circular logs that grow to 1 MB size before they get overwritten. The event daemon log notes the options that it read from the daemon options file at the time it started.

A record of events is noted in the events log, which is also a circular file called events.log0 or events.log1. The file can be given a different name if the user chooses to specify it in the daemon options file.

 

Event Codes for VP Monitoring

 

Code   Category   Event Description
1206   Status     Pool state has changed to [Not Present | Unknown | Online | Offline | Write Disabled | Failed].
1207   Status     Pool configuration has changed.
1208   Status     Pool utilization is now nn%
1212   Status     Virtual device is now N percent allocated.
1213   Status     Virtual device is now N percent used

 

Sample Entries for Monitoring Device Pools

storevntd:symm_poll_interval = 20

storevntd:log_event_targets = file

storevntd:log_symmetrix_events = sid=000190301194,1206,1207;\
          sid=000190301194,1208,thresh_critical=80,thresh_major=65,thresh_warn=50,thresh_info=40;\
          sid=000190301194,1212,thresh_critical=80,thresh_major=65,thresh_warn=50,thresh_info=40;\
          sid=000190301194,1213,thresh_critical=80,thresh_major=65,thresh_warn=50,thresh_info=40;
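To check what a log_symmetrix_events value like the one above asks for, it can help to break it into its parts. The Python sketch below does that for the simple shape used in this sample only (semicolon-separated entries, each holding sid=, event codes, and key=value fields). It is not the daemon's parser and does not claim to cover the full option syntax; the function name parse_symmetrix_events is hypothetical.

```python
def parse_symmetrix_events(value):
    """Split a log_symmetrix_events value into one dict per entry.

    Handles only the shape shown in the sample: entries separated by ';',
    fields separated by ',', optional backslash line continuations.
    """
    entries = []
    # Drop continuation backslashes, then split into entries.
    for chunk in value.replace("\\", "").split(";"):
        fields = [f.strip() for f in chunk.split(",") if f.strip()]
        if not fields:
            continue
        entry = {"sid": None, "event_ids": [], "options": {}}
        for field in fields:
            if "=" in field:
                key, val = (part.strip() for part in field.split("=", 1))
                if key == "sid":
                    entry["sid"] = val
                else:
                    entry["options"][key] = val
            else:
                # Event codes are numeric; keep anything else as text.
                entry["event_ids"].append(int(field) if field.isdigit() else field)
        entries.append(entry)
    return entries


sample = (
    "sid=000190301194,1206,1207;\\\n"
    "    sid=000190301194,1208,thresh_critical=80,thresh_major=65,"
    "thresh_warn=50,thresh_info=40;"
)
for entry in parse_symmetrix_events(sample):
    print(entry)
```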

This example uses the undocumented thresh_* fields to set separate thresholds for the info, warning, major, and critical severity levels.

The options in the daemon options file are read by the event daemon at startup and stay in force while the daemon is running. The simplest way to alter the options dynamically is to edit the options file and issue the command:

# stordaemon action storevntd -cmd reload

Another way is to use the command:

# stordaemon setvar storevntd -name var=Value

Entries in the Event Log – 1

•  Disable Data Pool

[fmt=evt] [evtid=1206] [date=2009-02-26T14:22:50] [symid=000190301194] [TPDataPool=WB94] [sev=major]  = Data Pool state has changed to Offline.

[fmt=evt] [evtid=1207] [date=2009-02-26T14:22:50] [symid=000190301194] [TPDataPool=WB94] [sev=info]  = Data Pool configuration has changed.

•  Enable Data Pool

[fmt=evt] [evtid=1206] [date=2009-02-26T14:24:10] [symid=000190301194] [TPDataPool=WB94] [sev=normal]  = Data Pool state has changed to Online.

[fmt=evt] [evtid=1207] [date=2009-02-26T14:24:10] [symid=000190301194] [TPDataPool=WB94] [sev=info]  = Data Pool configuration has changed.

Entries in the Event Log – 2

[fmt=evt] [evtid=1212] [date=2009-02-26T11:17:50] [symid=000190301194] [Device=0103] [sev=info]  = Thin Device is now 40 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:17:50] [symid=000190301194] [Device=0104] [sev=info]  = Thin Device is now 40 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-02-26T11:17:50] [symid=000190301194] [TPDataPool=WB94] [sev=info]  = Data Pool utilization is now 40 percent.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:19:31] [symid=000190301194] [Device=0103] [sev=warning]  = Thin Device is now 50 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:19:31] [symid=000190301194] [Device=0104] [sev=warning]  = Thin Device is now 50 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-02-26T11:19:31] [symid=000190301194] [TPDataPool=WB94] [sev=warning]  = Data Pool utilization is now 50 percent.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:21:53] [symid=000190301194] [Device=0103] [sev=major]  = Thin Device is now 65 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:21:53] [symid=000190301194] [Device=0104] [sev=major]  = Thin Device is now 65 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-02-26T11:21:53] [symid=000190301194] [TPDataPool=WB94] [sev=major]  = Data Pool utilization is now 65 percent.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:24:35] [symid=000190301194] [Device=0103] [sev=critical]  = Thin Device is now 80 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-02-26T11:24:35] [symid=000190301194] [Device=0104] [sev=critical]  = Thin Device is now 80 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-02-26T11:24:35] [symid=000190301194] [TPDataPool=WB94] [sev=critical]  = Data Pool utilization is now 80 percent.

 

 

Second Options Sample and Daemon Log Entry

storevntd:symm_poll_interval = 10

storevntd:symm_sync_frequency = 1

storevntd:log_event_targets = file

storevntd:log_symmetrix_events = \
          sid=000194900180,1208, sev=warning;\
          sid=000194900180,1212, sev=warning;

----------------------------------------------------------------------

  [4810              Listener] Jun-23 12:55:46.989 : =============================================

        [4810              Listener] Jun-23 12:55:46.989 : storevntd Starting, Version: V7.0-915 (0.0)

        [4810              Listener] Jun-23 12:55:47.028 : Event Channel from remote hosts will not be secure

        [4810              Listener] Jun-23 12:55:47.028 : Option setting: log_event_targets    = file

        [4810              Listener] Jun-23 12:55:47.029 : Option setting: log_symmetrix_events = sid=000194900180,1208, sev=warning;          sid=000194900180,1212, sev=warning;

        [4810              Listener] Jun-23 12:55:47.068 : Option setting: symm_poll_interval   = 10

        [4810              Listener] Jun-23 12:55:47.068 : Option setting: symm_sync_frequency  = 1

        [4810              Listener] Jun-23 12:55:47.069 : Loaded Event PlugIn SymmetrixPlugin

        [4810              Listener] Jun-23 12:55:48.710 : storevntd Daemon : up and running

 

Entries in Event Log – 3

[fmt=evt] [evtid=1212] [date=2009-06-23T13:05:29] [symid=000194900180] [Device=0191] [sev=warning]  = Thin Device is now 60 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:06:00] [symid=000194900180] [Device=0191] [sev=major]  = Thin Device is now 70 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:06:30] [symid=000194900180] [Device=0191] [sev=critical]  = Thin Device is now 80 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:07:31] [symid=000194900180] [Device=0191] [sev=fatal]  = Thin Device is now 100 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-06-23T13:09:03] [symid=000194900180] [TPDataPool=sun180] [sev=warning]  = Data Pool utilization is now 60 percent.

[fmt=evt] [evtid=1208] [date=2009-06-23T13:09:34] [symid=000194900180] [TPDataPool=sun180] [sev=minor]  = Data Pool utilization is now 65 percent.

[fmt=evt] [evtid=1208] [date=2009-06-23T13:10:04] [symid=000194900180] [TPDataPool=sun180] [sev=major]  = Data Pool utilization is now 70 percent.

[fmt=evt] [evtid=1208] [date=2009-06-23T13:11:06] [symid=000194900180] [TPDataPool=sun180] [sev=critical]  = Data Pool utilization is now 80 percent.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:11:36] [symid=000194900180] [Device=0192] [sev=minor]  = Thin Device is now 65 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:12:07] [symid=000194900180] [Device=0192] [sev=major]  = Thin Device is now 75 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:12:37] [symid=000194900180] [Device=0192] [sev=critical]  = Thin Device is now 85 percent allocated.

[fmt=evt] [evtid=1212] [date=2009-06-23T13:13:38] [symid=000194900180] [Device=0192] [sev=fatal]  = Thin Device is now 100 percent allocated.

[fmt=evt] [evtid=1208] [date=2009-06-23T13:13:38] [symid=000194900180] [TPDataPool=sun180] [sev=fatal]  = Data Pool utilization is now 100 percent.
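Event log entries in this format are easy to process with a script. The following Python sketch parses a line into its bracketed key=value fields and the trailing message, based only on the entries shown in these notes; other event types may carry different fields. The function name parse_event_line is hypothetical.

```python
import re

# Matches one bracketed field such as [evtid=1208] or [TPDataPool=sun180].
FIELD_RE = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_event_line(line):
    """Return the bracketed fields of an event log line plus its message text."""
    fields = dict(FIELD_RE.findall(line))
    # The message follows the last bracketed field, after an '=' sign.
    tail = line[line.rfind("]") + 1:]
    fields["message"] = tail.strip().lstrip("=").strip()
    return fields


sample = (
    "[fmt=evt] [evtid=1208] [date=2009-06-23T13:13:38] [symid=000194900180] "
    "[TPDataPool=sun180] [sev=fatal]  = Data Pool utilization is now 100 percent."
)
event = parse_event_line(sample)
print(event["evtid"], event["sev"], event["message"])
```

A script built on this could, for example, select only the lines whose sev field is critical or fatal and forward them to an administrator.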

Default Thresholds

Threshold (%)   Severity
60              Warning
65              Minor Warning
70              Major Warning
80              Critical
100             Fatal

 

Skew