Monitoring gives you visibility into the online service that you’re responsible for, including its cloud infrastructure, application functionality, and SLA compliance. Modern IT monitoring is composed of two layers: an infrastructure layer and an application layer. On the infrastructure layer, VMs, network, and storage are monitored, revealing memory consumption, CPU utilization, and network connection metrics. On the application layer, database performance, browsing latency, and actual application functionality, such as user registration, login, and cart, are monitored. For mega sites like eBay and PayPal, even the slightest latency can lead to a loss of millions of dollars. If your online service isn’t monitored closely, the trust and confidence of your users can be significantly compromised. In this post, I would like to touch on several points that describe the current state of the market, how essential it is to monitor your resources, and what monitoring is built on.
API-Based Monitoring
APIs are one of the most important features of a modern IT environment. Whether a service is provided or consumed, the fact that an API is developed first affirms its centrality to an application’s development lifecycle. When searching for a tool to monitor your infrastructure or application, you wouldn’t expect to install an agent on your servers, because agents are often perceived as intrusive. Instead, API-based tools collect the data that your infrastructure and services already expose through their APIs. This data allows cloud ecosystem vendors who offer monitoring to provide great UIs and analytics tools that give cloud customers a better overall experience. However, monitoring vendors that rely on their customers to install an endpoint agent, the so-called “intrusive” services, are still needed in order to monitor the heart of an application, including specific processes.
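To make the agentless idea concrete, here is a minimal sketch of pulling metrics over HTTP instead of installing anything on the server. The endpoint URL, the flat JSON shape, and the names `METRICS_URL`, `http_fetch`, and `poll_metrics` are all assumptions made for illustration; every real provider defines its own API and schema.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only.
METRICS_URL = "https://metrics.example.com/v1/instances/web-1/metrics"


def http_fetch(url: str) -> str:
    """Fetch the raw response body over HTTP(S)."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def poll_metrics(url: str, fetch: Callable[[str], str] = http_fetch) -> Dict[str, float]:
    """Return the numeric fields of a JSON object such as {"cpu_percent": 42}.

    The flat JSON shape is an assumption; non-numeric fields are ignored.
    """
    payload = json.loads(fetch(url))
    return {
        name: float(value)
        for name, value in payload.items()
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```

Passing the fetch function as a parameter keeps the polling logic separate from the transport, so the same code can be exercised against a canned response before it ever touches a live endpoint.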
The Tools
The cloud ecosystem is bursting with real-time and analytics tools that monitor an infrastructure’s performance, cost, availability, and more. There are open source infrastructure-monitoring tools like Nagios, Zabbix, and Netflix Ice, and commercial tools like Datadog. Additionally, there are very popular and successful APM (Application Performance Monitoring) tools, such as New Relic and AppDynamics. How can you go about choosing the right tools? Let me offer a few important guidelines.
You should choose to work with veteran vendors that offer many integration points. You will eventually end up bundling a few tools together to trace an issue from the ground up: from a specific instance, to a running application workload or database, all the way to log and event managers. You want to be able to act quickly and run a post-mortem on an outage with handy, detailed reports. Your operations team needs to consider tools that integrate with other open source or commercial complementary services. Also, make sure to use tools that offer action initiation. This means that they are capable of not only shutting down specific instances, but also running action scripts, or even integrating deeply with test frameworks. You will be better equipped to handle issues if your tools can be configured to send alerts, since alerts prompt you to respond quickly and appropriately.
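As a rough illustration of action initiation and alerting, the sketch below pairs each threshold with a callable that runs when the threshold is exceeded. `AlertRule` and `evaluate` are hypothetical names, not any vendor's API; in a real tool the action might send a page, run a script, or stop an instance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertRule:
    """Fire `action` when `metric` rises above `threshold` (hypothetical type)."""
    metric: str
    threshold: float
    action: Callable[[str, float], None]


def evaluate(rules: List[AlertRule], metrics: Dict[str, float]) -> List[str]:
    """Run the action of every rule whose metric exceeds its threshold.

    Returns the names of the metrics that fired. Metrics missing from the
    sample are skipped rather than treated as errors.
    """
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            rule.action(rule.metric, value)
            fired.append(rule.metric)
    return fired
```

For example, a rule such as `AlertRule("cpu_percent", 90.0, notify)` would call `notify("cpu_percent", 95.0)` for a sample of `{"cpu_percent": 95.0}`, where `notify` is whatever alerting or remediation function you supply.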
Although the modern environment is integrated, delivered, and monitored in a continuous and automated fashion, the human touch remains essential. Your 24/7 NOC is still relevant and valuable. Monitoring tools enable you to reduce risk by providing trend analyses of how your application behaves during updates and how it handles unexpected growth. You can react before your users experience an actual degradation in service. Monitoring is no longer only about isolated local events, but about a series of related occurrences, and as such, it must be approached in a modern and dynamic way.
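One simple way to picture trend analysis is to fit a straight line through recent samples of a metric and estimate how long it will take to reach a limit. The sketch below uses an ordinary least-squares fit over evenly spaced samples; it is an illustrative simplification, not how any particular monitoring product works, and `linear_trend` and `samples_until` are hypothetical names.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for evenly spaced samples 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def samples_until(values: Sequence[float], limit: float) -> Optional[float]:
    """Estimate how many more sampling intervals until the trend reaches `limit`.

    Returns None when the metric is flat or falling, and 0.0 when the fitted
    line has already reached the limit.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))
```

With disk usage samples of 50, 60, 70, and 80 percent, the fitted line grows by 10 points per interval, so a 100 percent limit is about two intervals away. That estimate is the kind of early signal that lets you act before users notice a problem.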
This post is brought to you by the VMware vCloud Air Network.