Metrics are a necessary evil in the IT world. A good set of metrics helps us to diagnose problems in our environments and plan for future growth.
Think of it this way: metrics tell us where we have been, where we are, where we are headed, and what is currently wrong. That can be very helpful in planning budgets and, ultimately, in getting us new toys. Err, equipment.
There are a lot of packages out there that will gather a boatload of metrics and present them in nice little graphs. A few that come to mind are:
- System Center Operations Manager (SCOM)
- SolarWinds
- Nagios
- Spiceworks
- … etc …
Each package has its pros and cons. Some do things the others don’t. Some you pay big dollars for, and some are free (ad-supported). Some require agents, and some don’t. But they all have one thing in common: they collect data from all monitored systems into a central location and let you build reports and graphs for human consumption.
Recently, I ran into a puzzle: I had to figure out a way to collect a variety of metrics from Windows computers (servers and/or workstations) without purchasing software. Rather than pick up a package like Spiceworks (free, ad-supported), I opted to build my own monitoring solution.
To build this package, which I will call PSMetrics, I’ll be using PowerShell as the collector on the monitored systems, classic ASP on IIS as the ingestor/processor, and MySQL for data storage.
Since my environment is all Windows servers and clients, I can leverage PowerShell to pull key performance counters from each monitored host. After local processing, the metrics are posted up to the ingestor on the IIS server.
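To make that concrete, here is a minimal sketch of what the collector might look like. The counter list, payload shape, and ingestor URL are placeholders I made up for illustration; the real collector will do more local processing before posting.

```powershell
# Minimal collector sketch. The counter set, payload layout, and ingestor
# URL below are hypothetical placeholders.
$ingestorUrl = 'http://webserver/psmetrics/ingest.asp'

$counters = @(
    '\Processor(_Total)\% Processor Time',
    '\Memory\Available MBytes',
    '\LogicalDisk(C:)\% Free Space'
)

# Get-Counter returns one sample per counter path
$samples = (Get-Counter -Counter $counters).CounterSamples

# Flatten the samples into simple name/value pairs
$payload = @{
    host      = $env:COMPUTERNAME
    timestamp = (Get-Date).ToUniversalTime().ToString('o')
    metrics   = $samples | ForEach-Object {
        @{ counter = $_.Path; value = [math]::Round($_.CookedValue, 2) }
    }
}

# Post the readings up to the ingestor as JSON
Invoke-RestMethod -Uri $ingestorUrl -Method Post -ContentType 'application/json' `
    -Body ($payload | ConvertTo-Json -Depth 4)
```

Drop something like this into a scheduled task on each monitored host and the collection side is basically done.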
I chose classic ASP for the ingestor piece because, simply, it is what I know. I haven’t taken the time to learn .NET or PHP, and classic ASP works quite nicely for what I have in mind. The ingestor will accept data from the collectors and examine it to ensure all required fields are present and correctly formatted. If all the tests pass, the ingestor writes the data to the database. The IIS server will happily handle multiple simultaneous connections, just like any website. Further, if the connection pool gets exhausted, you can simply stand up additional web servers and situate them behind a load balancer.
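The real ingestor will be classic ASP, but the gist of the validation it will perform looks something like this, sketched in PowerShell for illustration (field names are hypothetical):

```powershell
# Sketch of the ingestor's validation rules, expressed in PowerShell for
# illustration only; the actual ingestor will be classic ASP.
function Test-MetricPayload {
    param([hashtable]$Payload)

    $required = 'host', 'timestamp', 'metrics'

    # Every required field must be present
    foreach ($field in $required) {
        if (-not $Payload.ContainsKey($field)) { return $false }
    }

    # Timestamp must parse as a date, and each metric needs a numeric value
    if (-not ($Payload.timestamp -as [datetime])) { return $false }
    foreach ($m in $Payload.metrics) {
        if ($null -eq ($m.value -as [double])) { return $false }
    }

    return $true   # safe to hand off to the database
}
```

Anything that fails these checks gets rejected before it ever touches the database.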
Warning!
When configuring the IIS server, be sure to set the session timeout to around 1-2 minutes. By default, the IIS session timeout is 20 minutes, and you can easily run your web server out of memory if you are monitoring a large number of hosts.
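One way to make that change from PowerShell is shown below. This is a sketch assuming the WebAdministration module is available; it sets the classic ASP session timeout server-wide, so adjust the path and value to suit your setup.

```powershell
# Shorten the classic ASP session timeout (default is 00:20:00).
# Assumes the WebAdministration module; applied at the server level here.
Import-Module WebAdministration

Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' `
    -Filter 'system.webServer/asp/session' `
    -Name 'timeout' `
    -Value '00:02:00'   # 2 minutes instead of the 20-minute default
```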
The data storage piece is pretty straightforward, and uses MySQL or MS SQL. The key takeaway here is maintenance. If you are collecting data from a large number of machines, the database will grow, and grow, and grow. A maintenance cycle will have to be decided upon, as well as an automated method of running it.
For example, based on the number of machines to be monitored (plus 10%), how long are metrics to be kept? During the maintenance cycle, will metrics be “lost”, simply removed from the database? Or should we get fancy? We could, say, average the readings over five minutes for each monitor, remove the raw readings from the “operational” table, and move the averages off to a “history” table. I’ll figure this stuff out when I start building out the database.
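As a rough cut, a maintenance job along the fancier lines might look like the sketch below. The table names, column names, and retention period are all invented for illustration, and the SQL is MySQL-flavored (it would need tweaking for MS SQL).

```powershell
# Rough sketch of a maintenance cycle: roll raw readings up into 5-minute
# averages in a history table, then prune the operational table.
# Table/column names and the 7-day retention are hypothetical.
$retentionDays = 7

$maintenanceSql = @"
INSERT INTO metric_history (host, counter, avg_value, bucket_start)
SELECT host,
       counter,
       AVG(value),
       FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(sample_time) / 300) * 300)
FROM   metric_readings
WHERE  sample_time < NOW() - INTERVAL $retentionDays DAY
GROUP  BY host, counter, FLOOR(UNIX_TIMESTAMP(sample_time) / 300);

DELETE FROM metric_readings
WHERE  sample_time < NOW() - INTERVAL $retentionDays DAY;
"@

# A scheduled task would hand $maintenanceSql to whichever client fits the
# backend, e.g. Invoke-Sqlcmd for MS SQL or the MySQL .NET connector.
```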
Finally, reporting. I’ll be building out another web server piece for viewing the metrics. The interface should be clean, accessible, and easy to use. This is probably the most straightforward piece. One caveat is that I will be building the reporting app on the same server as the ingestor, when it should be on a machine of its own. I only have one web server at the moment, so it’ll have to share duties with other apps for a bit.
So that’s the idea. I’ll be posting progress as I go, along with code samples and all the rest of the goodies. Maybe we have some *nix d00dz out there who might want to get in on this? Maybe build a Bash script for monitoring *nix hosts? If we get enough expertise on this, we could have a really good (and free) package!