- Posted by Vithun
- On February 23, 2021
The number of Performance Metrics (PMs) monitored by the EMS depends upon the complexity of the device. To get an idea of the challenges involved, let us analyze a simple scenario involving the monitoring of only 20 PMs.
Assume the following:
- Let the number of PMs to be collected per NE be 20 (depending upon the device that is being managed, it could vary from tens to hundreds of metrics).
- Let the data collection interval be every 5 minutes.
- Let the number of NEs to be monitored be 100.
The number of PMs to be stored in the database:
- per day is 576,000 (20 x 100 x 12 x 24)
- per week is about 4 million (576,000 x 7)
- per month is 17.28 million (576,000 x 30)
- per year is about 207 million (17.28 million x 12)
These numbers will increase tenfold if the number of devices to be managed increases to 1000 NEs. Volumes of this size are common in large networks, although they may not occur in a lab environment.
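The arithmetic above can be reproduced with a short Python sketch. The function and variable names are made up for illustration, and the month is assumed to be 30 days long so that the yearly figure matches the 207 million quoted above.

```python
# Inputs taken from the scenario above; names are illustrative only.
PMS_PER_NE = 20
NE_COUNT = 100
COLLECTION_INTERVAL_MIN = 5


def pm_records_per_day(pms_per_ne, ne_count, interval_min):
    """Return the number of PM records stored per day."""
    polls_per_day = (60 // interval_min) * 24  # 12 polls per hour x 24 hours
    return pms_per_ne * ne_count * polls_per_day


per_day = pm_records_per_day(PMS_PER_NE, NE_COUNT, COLLECTION_INTERVAL_MIN)
per_week = per_day * 7
per_month = per_day * 30   # assumption: a 30-day month
per_year = per_month * 12  # assumption: 12 months of 30 days each

print(f"per day:   {per_day:,}")
print(f"per week:  {per_week:,}")
print(f"per month: {per_month:,}")
print(f"per year:  {per_year:,}")

# The same calculation for 1000 NEs gives ten times the daily volume.
print(f"per day with 1000 NEs: {pm_records_per_day(PMS_PER_NE, 1000, 5):,}")
```

Changing the interval or the number of PMs per NE shows how quickly the volume grows for a given deployment.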
The problem with millions of records
Even though there are millions of records in the database, when a user requests a report, it is expected that the report should be available within a few seconds. However, these reports have to be generated by processing millions of records. If the request is not handled properly, it could take a long time to generate the reports.
For example, to determine the total bandwidth that is being used by a video application, all traffic data has to be processed by summing up all bandwidth utilization entries for each employee’s video traffic. Reports like this might take a few minutes to execute. This might appear too long from a user’s point of view.
So, how can this reporting time be reduced?
To reduce reporting time, EMS designers should have a very good understanding of the database and its tuning methods.
- To improve the performance of a database, it is advisable (as a general rule) to restrict the number of rows in a single database table to a few million records.
- Then, as a basic step, it makes sense to group the PMs and create a single table with the various metrics as individual columns. Even though this is a fundamental rule when designing an EMS from scratch, it is often not followed when frameworks are used to build an EMS. By default, frameworks typically store each metric in a separate row of a generic table. Grouping the metrics into columns therefore requires custom tables instead of the generic ones.
- Another common practice for limiting the database size is the purging of data that is older than a few days. This may not be desirable if users want to get historical reports.
- The issue of large and unwieldy databases can be overcome by splitting data into multiple tables. Identifying the best approach to split data is the key that differentiates successful reporting engines from those that fail.
- Here are two ways in which the data can be split:
  - Creating a table for each day or week depending on the amount of data being collected.
  - Creating a new table once a certain size has been reached.
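To make the difference between a generic table and a custom table concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the three sample metrics are all hypothetical; a real EMS would have its own schema and far more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Generic layout (hypothetical): one row per metric per poll.
conn.execute(
    "CREATE TABLE pm_generic (ne_id INTEGER, ts INTEGER, metric TEXT, value REAL)"
)

# Custom layout (hypothetical): one row per poll, one column per metric.
conn.execute(
    "CREATE TABLE pm_wide ("
    "ne_id INTEGER, ts INTEGER, rx_bytes REAL, tx_bytes REAL, cpu_util REAL)"
)

# One poll of a single NE, with made-up values.
ne_id, ts = 1, 1614038400  # arbitrary epoch timestamp
sample = {"rx_bytes": 1200.0, "tx_bytes": 800.0, "cpu_util": 37.5}

# The generic table needs one INSERT per metric.
conn.executemany(
    "INSERT INTO pm_generic VALUES (?, ?, ?, ?)",
    [(ne_id, ts, name, value) for name, value in sample.items()],
)

# The custom table stores the whole poll in a single row.
conn.execute(
    "INSERT INTO pm_wide VALUES (?, ?, ?, ?, ?)",
    (ne_id, ts, sample["rx_bytes"], sample["tx_bytes"], sample["cpu_util"]),
)

generic_rows = conn.execute("SELECT COUNT(*) FROM pm_generic").fetchone()[0]
wide_rows = conn.execute("SELECT COUNT(*) FROM pm_wide").fetchone()[0]
print(f"generic rows: {generic_rows}, wide rows: {wide_rows}")
```

With 20 PMs per poll instead of three, the custom layout would store one row where the generic layout stores 20. In the scenario above that is 28,800 rows per day instead of 576,000.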
The inefficiency of such approaches
The approaches mentioned above might appear to solve the performance problem. But they do not, for the following reasons:
- Both these approaches require a master table, which maintains the names of tables as well as the start and end times of the data included in those tables. This is clearly an unnecessary overhead.
- Whenever the user queries for a report with a timeframe that spans two or more tables, the report can no longer be produced with a single query, and additional problems arise.
If we need a report about the amount of bandwidth used for various services (like video, audio, and voice over IP (VoIP)), per user over the last month, we need to go through the following steps:
- Identify the individual tables that contain the data (it is quite possible that the data may be spread across multiple tables).
- Frame the query and execute it across all individual tables.
- Merge the results from multiple queries by grouping the usage on a per-user basis.
It is quite possible these steps will end up consuming more time, defeating the very purpose of splitting the data across multiple tables.
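The three steps above can be sketched with Python's sqlite3 module. Everything here is an assumption made for illustration: the `master` table layout, the per-day table names, the column names, and the sample rows. Each per-day table is also assumed to hold exactly one day of data.

```python
import sqlite3
from collections import defaultdict

DAY = 86400  # seconds in a day

conn = sqlite3.connect(":memory:")
# Master table (hypothetical): maps each data table to the time range it covers.
conn.execute("CREATE TABLE master (table_name TEXT, start_ts INTEGER, end_ts INTEGER)")

# Two hypothetical per-day tables with made-up usage rows.
daily_data = {
    "usage_day0": (0, [("alice", "video", 500), ("bob", "voip", 100)]),
    "usage_day1": (DAY, [("alice", "video", 300), ("bob", "video", 700)]),
}
for name, (start, rows) in daily_data.items():
    conn.execute(
        f"CREATE TABLE {name} (user_name TEXT, service TEXT, bytes_used INTEGER)"
    )
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", rows)
    conn.execute("INSERT INTO master VALUES (?, ?, ?)", (name, start, start + DAY))


def usage_per_user(conn, service, start_ts, end_ts):
    """Total usage per user for one service over [start_ts, end_ts)."""
    # Step 1: identify the tables whose time range overlaps the request.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT table_name FROM master WHERE start_ts < ? AND end_ts > ?",
            (end_ts, start_ts),
        )
    ]
    totals = defaultdict(int)
    for table in tables:
        # Step 2: run the same query against every table.
        # Table names come from the master table, never from user input.
        query = (
            f"SELECT user_name, SUM(bytes_used) FROM {table} "
            "WHERE service = ? GROUP BY user_name"
        )
        for user_name, used in conn.execute(query, (service,)):
            # Step 3: merge the partial results on a per-user basis.
            totals[user_name] += used
    return dict(totals)


video_usage = usage_per_user(conn, "video", 0, 2 * DAY)
print(video_usage)
```

Even in this small sketch, a single report needs one lookup in the master table, one query per data table, and a merge step in application code. A one-month report over daily tables would repeat the inner query about 30 times.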
In some cases, it becomes very difficult to frame a query across two tables. For instance, to get the bandwidth usage of the top 10 users, we need the entire data set, because a user's rank depends on usage summed over the whole period and the top 10 of one table need not be the top 10 overall. This is why most common reporting engines don't have substantial support for generating reports from multiple tables.
A far better alternative approach is to use a trending engine, which retrieves historical reports quickly and efficiently.
Trending engine – to the rescue
In order to manage the data and process it into meaningful information, modern EMS solutions have a trending engine that generates aggregated statistics from collected data. The trending engine runs in the background and periodically generates the required statistical information about the collected data.
Consider the example of a trending engine that runs hourly on data collected every 5 minutes. The engine will crunch the 12 data points collected per hour and generate an hourly bin (which contains information such as the minimum, maximum, and average for the 12 data points). Similarly, a daily bin might have statistical measures for the data collected throughout the entire day. Intelligent trending engines also provide the option to generate multiple data points within a single day (every 2 hours, 4 hours, and so on). This ensures that there are more data points available for a meaningful statistical analysis.
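As a rough illustration of what an hourly bin might hold, the sketch below reduces 12 five-minute samples to a minimum, maximum, and average. The `Bin` class, its field names, and the sample values are hypothetical. Storing the sample count in the bin is an extra assumption, not something a particular trending engine is known to do.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """Aggregated statistics for one time window (hypothetical structure)."""
    count: int      # number of raw samples summarized (assumed field)
    minimum: float
    maximum: float
    average: float


def make_bin(samples):
    """Reduce the raw samples of one window to a single Bin."""
    if not samples:
        raise ValueError("cannot build a bin from zero samples")
    return Bin(
        count=len(samples),
        minimum=min(samples),
        maximum=max(samples),
        average=sum(samples) / len(samples),
    )


# Twelve made-up samples: one per 5-minute poll within a single hour.
hour_samples = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0, 9.0, 8.0, 16.0, 12.0, 11.0, 13.0]
hourly_bin = make_bin(hour_samples)
print(hourly_bin)
```

A report that reads hourly bins touches one row per metric per hour instead of 12, and a daily bin reduces that further to one row per day.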
By using the trending engine, the collected data can be purged after a week. Data that is required for the monthly report can be reconstructed from the hourly or daily bins that were generated by the engine. Therefore, with a trending engine, users can generate reports in a very short period of time.
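Reconstructing a longer-period statistic from bins can be sketched as follows. This assumes that each bin records how many raw samples it summarizes, so that averages can be weighted correctly. That field, the `Bin` class, and the sample numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """Aggregated statistics for one time window (hypothetical structure)."""
    count: int      # number of raw samples summarized (assumed field)
    minimum: float
    maximum: float
    average: float


def merge_bins(bins):
    """Combine several bins into one bin covering the whole period."""
    total = sum(b.count for b in bins)
    if total == 0:
        raise ValueError("cannot merge bins that contain no samples")
    # Weight each average by its sample count; a plain mean of the averages
    # would be wrong whenever the bins hold different numbers of samples.
    weighted_sum = sum(b.average * b.count for b in bins)
    return Bin(
        count=total,
        minimum=min(b.minimum for b in bins),
        maximum=max(b.maximum for b in bins),
        average=weighted_sum / total,
    )


# Three made-up hourly bins; the last one is missing half of its polls.
hourly_bins = [
    Bin(count=12, minimum=8.0, maximum=16.0, average=12.0),
    Bin(count=12, minimum=5.0, maximum=20.0, average=10.0),
    Bin(count=6, minimum=7.0, maximum=9.0, average=8.0),
]
merged = merge_bins(hourly_bins)
print(merged)
```

Minimum, maximum, and average survive this kind of roll-up exactly. Statistics such as medians or percentiles generally cannot be rebuilt from bins like these, which is worth keeping in mind before purging the raw data.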
Of course, collecting and managing data from a multitude of devices is an enormous task. Leave it to the experts – Dhyan’s NetMan has the reputation of managing over a million devices. It leverages the trending engine to provide you the maximum benefits. Also, it comes equipped with all the required functionalities to keep your devices up and running without any impediments.