Article created by Rainer Gerhards
Article last updated by Florian Riedl
Monitoring Windows is important even for small environments. With automatic monitoring, critical failures can often be avoided. But how do you monitor a system without too much effort? The basic idea behind a successful monitoring and alerting system is to centralize all system events at a single monitoring station. Once the information is centralized, it can be used to build an alerting system or even to carry out corrective actions.
What is a Monitoring System made up of?
Successful monitoring systems usually have many components. These are typically loosely coupled so that new requirements can be added easily. Keep in mind how often systems and environments change – flexibility in a monitoring system is nowadays a “must have”. Typically, a system consists of
- Data collector processes
- Storage engine
- Analysis console
- Background processes
In this scenario, the data collectors run on the monitored systems. These should be lightweight processes because they shouldn’t put too much burden on the host system. This is especially important if high-performance systems like web servers are to be monitored. The data collector picks up “interesting” events and forwards them to the central storage engine.
The storage engine then writes the received event notifications to persistent storage. That way, they are safe from any manipulation or technical problems at the monitored systems. The storage engine typically runs on a limited number of machines. Often, there is only a single storage engine inside a whole network. That’s really not a bad idea, as the whole concept of monitoring is to have all information in one central place. Multiple storage engines, on the other hand, are typically used in complex scenarios, mostly with WAN links in between. There, a local storage engine serves as a central hub for one location and forwards the information to the central system.
Finally, the analysis console is used by the system administrators. It is the interface that lets them look at consolidated reports and drill down into more specific topics. Ideally, the console supports multiple concurrent users and provides some hints for fixing detected problems. Integrated links to vendor knowledge bases or public search & discussion services are a valuable help here.
Of course, data collectors and the storage engine are background processes. But there can also be background processes that consolidate and monitor the storage engine’s data on a schedule – e.g. daily. That way, administrators receive either an activity overview report or an exception report (for important and urgent matters).
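To make the two report types a bit more concrete, here is a minimal Python sketch of what such a scheduled background job might compute. It is purely illustrative: the event tuple layout (host, severity, message), the function names and the severity threshold are assumptions of mine, not part of any of the products discussed here.

```python
from collections import Counter

def activity_overview(events):
    """Count how many events each host produced (the 'overview' report)."""
    return Counter(host for host, _severity, _message in events)

def exception_report(events, max_severity=3):
    """Keep only urgent events (the 'exception' report).

    Assumes syslog-style severities, where a lower number is more urgent
    (0 = emergency, 3 = error, 6 = informational).
    """
    return [event for event in events if event[1] <= max_severity]

# Hypothetical sample data, as a daily job might read it from storage.
sample = [
    ("web01", 6, "service started"),
    ("web01", 2, "disk failure"),
    ("db01", 5, "backup done"),
]
overview = activity_overview(sample)
urgent = exception_report(sample)
```

In practice such a job would be started by a scheduler once a day and would mail its result to the administrators.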
What about Windows?
Windows does not come with a built-in monitoring solution. So you need some tools to get it going.
Windows logs the most important state data into the event log. Third party vendors are also encouraged to log any events to the event log. For example, most anti-virus products will log caught viruses here. The event log is definitely the place to look if you’d like to monitor a Windows system’s health. The only built-in tool available is the Windows event viewer. That tool allows interactive display of current events but was never meant to be part of an automated monitoring solution.
What we need is a data collector that can run in the background. For this, we use EventReporter. This product monitors the event log in near real time and forwards all newly logged messages to the storage engine via the SETP or syslog protocol.
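For readers who have never looked at syslog on the wire, the following Python sketch shows roughly what a forwarded event could look like in the classic BSD syslog format (RFC 3164). It is not EventReporter’s code and says nothing about SETP; the host name, tag and message text are made up for illustration.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def format_syslog(facility, severity, host, tag, text, when):
    """Build a BSD-style syslog packet: <PRI>TIMESTAMP HOST TAG: MSG."""
    pri = facility * 8 + severity          # priority value per RFC 3164
    # Timestamp is "Mmm dd hh:mm:ss" with the day padded by a space.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {text}".encode("ascii", "replace")

def send_syslog(packet, server, port=514):
    """Send one packet via UDP, the traditional syslog transport."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (server, port))

# Facility 16 is local0, severity 3 is "error"; all other values are hypothetical.
packet = format_syslog(16, 3, "web01", "EventLog", "disk error",
                       datetime(2024, 1, 5, 9, 7, 3))
```

Calling send_syslog(packet, "name-of-your-syslog-server") would then deliver the message to the storage engine.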
See difference between SETP and syslog protocols.
Why did I say “near real time”? Well, EventReporter by design does not rely on Windows event notifications, which have proven not to be fully reliable under extreme conditions. Instead, it polls the event logs on a preset schedule. Resource usage is very moderate, so the poll can be set to run every 30 seconds, or even more often in very security-sensitive environments. EventReporter not only forwards the logs but also checks whether someone truncates them (via Windows Event Viewer or an API call). If that happens, a notification is sent to the storage engine. This functionality is important, as log truncation can be a good indication of an intruder. EventReporter is installed on each system that is to be monitored. It runs on all flavors of NT (even ALPHA), so really all systems can be monitored.
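The polling idea can be sketched in a few lines of Python. Please note that this is only a model of the concept: the log is represented as a plain list of (record number, message) pairs, and the truncation check (the newest record number is lower than the last one seen) is a heuristic I assume for illustration, not necessarily the check EventReporter performs.

```python
class LogPoller:
    """Remembers the last record seen and reports only newer records."""

    def __init__(self):
        self.last_seen = 0   # highest record number forwarded so far

    def poll(self, records):
        """Return (new_records, truncated) for the current log content.

        `records` is a list of (record_number, message) in ascending order.
        """
        newest = records[-1][0] if records else 0
        truncated = newest < self.last_seen
        if truncated:
            # The log was cleared: start over from the beginning.
            self.last_seen = 0
        fresh = [r for r in records if r[0] > self.last_seen]
        if fresh:
            self.last_seen = fresh[-1][0]
        return fresh, truncated

poller = LogPoller()
first, _ = poller.poll([(1, "boot"), (2, "logon")])                 # both records are new
second, _ = poller.poll([(1, "boot"), (2, "logon"), (3, "error")])  # only record 3 is new
_, cleared = poller.poll([])                                        # log emptied: truncated is True
```

A real collector would call poll() from a timer, for example every 30 seconds, and forward both the new records and any truncation notice.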
Storing the Events…
Now we need something to store the events collected/reported by EventReporter. We use WinSyslog for this. This enhanced syslog daemon works much like its Unix counterpart. But besides writing to flat files, it can also log to a database and carry out flexible actions.
In our monitoring system, we use it for two functions: first of all, it stores all events. In our case, events are written both to a flat file and to the database. We use this approach because bulk analysis is done fastest with the help of flat files. However, viewing event details is done best by using a database. So we’ve taken the route to simply write to both stores and have the best of both worlds. A large hard disk is of course helpful here…
Besides storing events, WinSyslog also acts as an alerting engine. It can be configured to detect important message fragments or high priority messages and to forward these to an email account. If your paging provider supports an email-to-pager interface, this is also the way to call a pager in case of an emergency.
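The “write to both stores” approach is easy to picture with a small Python sketch. WinSyslog does this through its own configuration; the code below merely imitates the idea with SQLite and a text stream, and the table layout, column names and line format are my own assumptions.

```python
import io
import sqlite3

def open_store(db_path=":memory:"):
    """Open (or create) the event database; in-memory by default."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(received TEXT, host TEXT, severity INTEGER, message TEXT)"
    )
    return conn

def store_event(conn, flat_file: io.TextIOBase, received, host, severity, message):
    """Write one event to the flat file and to the database."""
    # One line per event keeps the flat file easy to scan in bulk.
    flat_file.write(f"{received} {host} <{severity}> {message}\n")
    # Parameterized insert, so message text cannot break the SQL statement.
    conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
                 (received, host, severity, message))
    conn.commit()

conn = open_store()
log = io.StringIO()   # stands in for a real file opened in append mode
store_event(conn, log, "2024-01-05 09:07:03", "web01", 3, "disk error")
```

The flat file can then be scanned in bulk, while the database answers detail questions such as “all events of host web01”.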
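As an illustration of such an alerting rule, consider the following Python sketch. The message fragments, the severity threshold and the mail addresses are invented examples; WinSyslog itself is configured through its rule engine, not through code like this.

```python
from email.message import EmailMessage

ALERT_FRAGMENTS = ("virus", "logon failure")   # hypothetical trigger texts
ALERT_MAX_SEVERITY = 2                         # syslog 0..2 = emergency..critical

def needs_alert(severity, message):
    """True if the event is high priority or contains a trigger fragment."""
    text = message.lower()
    return (severity <= ALERT_MAX_SEVERITY
            or any(fragment in text for fragment in ALERT_FRAGMENTS))

def build_alert_mail(host, severity, message, recipient):
    """Build (but do not send) the notification mail for one event."""
    mail = EmailMessage()
    mail["From"] = "monitor@example.com"       # placeholder sender address
    mail["To"] = recipient
    mail["Subject"] = f"[ALERT] {host}: severity {severity}"
    mail.set_content(message)
    return mail

if needs_alert(6, "Virus found in C:\\temp\\file.exe"):
    alert = build_alert_mail("web01", 6, "Virus found in C:\\temp\\file.exe",
                             "admin@example.com")
    # Sending could be done with smtplib.SMTP(...).send_message(alert).
```

If the recipient is an email-to-pager address, the same mail also triggers the pager.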
Typically, only a single instance of WinSyslog is needed. However, it has support for syslog cascading. Cascading is used if a reporting hierarchy is built. This is most often done in corporate networks involving WAN links, where only higher importance messages should be sent to a central data store while less important messages are stored locally at the individual sites. That way, complete data is available for drill-down, but it is not necessarily transmitted over the WAN. WinSyslog fully supports cascading. It is also able to forward only selected messages based on rules.
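A cascade can be modeled very simply: each site keeps everything and passes only the important part upstream. The Python sketch below shows this idea under my own assumptions (events as (severity, message) pairs, a severity threshold as the only forwarding rule); it is not a description of WinSyslog internals.

```python
class SiteRelay:
    """Local hub: stores every event, forwards only important ones."""

    def __init__(self, upstream, forward_max_severity=3):
        self.upstream = upstream                  # callable taking one event
        self.forward_max_severity = forward_max_severity
        self.local_store = []                     # complete data stays on site

    def receive(self, event):
        severity, _message = event
        self.local_store.append(event)
        # Lower syslog severity numbers mean higher importance.
        if severity <= self.forward_max_severity:
            self.upstream(event)

central = []                                      # stands in for the central server
relay = SiteRelay(upstream=central.append)
relay.receive((6, "user logged on"))              # stays local only
relay.receive((2, "disk failure"))                # also sent over the WAN
```

After these two calls the site holds both events, while the central store has received only the disk failure.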
The integrated Solution
As you see, the system is made up of three main components. Each of these has specific duties to perform. The modular approach provides the flexibility needed in today’s environments. For example, if Cisco information is to be integrated into the system, you simply need to point the Cisco boxes to the WinSyslog server. The storage engine then saves the new events. Even though MonitorWare Console does not (yet) pick up and analyze the Cisco events, they can be viewed with the WinSyslog web interface, which might be very helpful during analysis.
Also, an administrator has the option to add his or her own custom scripts to be executed on the stored event data. The open system architecture provides unlimited flexibility to do so.
It is also easy to integrate Unix and Linux machines into the scenario. They support syslog natively and as such can both send and consume syslog messages. In fact, the EventReporter product alone is often used as a tool to integrate Windows events into Unix based management systems.
Conclusion
An effective monitoring solution can save the administrator a lot (and I mean a lot) of work. It can also help prevent major system breakdowns, as critical situations can be detected early and – hopefully – solved before any damage occurs. This is especially true if you think about security monitoring.
As I have outlined, a monitoring system need not be very complex or hard to set up. Just use some ready-to-run tools, integrate them and enjoy the benefits of the system.
Tools used
The following tools were used to build the monitoring system:
- EventReporter – data collector
- WinSyslog – storage engine and alert notification
If you’d like to build your own system, you can download free evaluation copies from the respective web sites. Detailed installation instructions are available in the additional article “Forward NT Event Logs to a Syslog Server”.
I hope this article is helpful. If you have any questions or remarks, please do not hesitate to contact me at rgerhards@adiscon.com.