Few things are riskier for a company’s reputation and business than being unable to measure the level of service that customers are receiving. Ad-hoc monitoring strategies are no longer an option, however, now that customers demand 24×7 uptime, services are split and deployed as individual units, systems scale themselves, and product engineers are responsible for operating their own code.
Two of the most popular full-stack monitoring tools that have emerged to address the challenges presented by elastic services and full-stack observability are New Relic and Datadog. While both can help companies gain better application and infrastructure observability, they differ significantly in pricing, available integrations, and support ecosystems.
Let’s take a look at why you may want to choose one solution over the other by comparing their features.
Datadog is a tool released in 2010 as a protocol-based monitoring platform. It supports infrastructure observability, custom metrics, and alerting. In 2017, the company launched a modern APM solution. It has also launched serverless, logging, SLO/SRE, and full Docker observability. Datadog offers 350 built-in integrations to get data into and insights out of the system. Their customers include Fox, Samsung, Whole Foods, AARP, and Comcast.
New Relic, founded in 2008, is an application performance management (APM)-focused solution. It offers logging and custom metrics, along with rich cloud native offerings, including Docker and serverless. New Relic provides 262 built-in integrations. Its “Insights” product enables ad hoc queries of all New Relic data using its custom query language, NRQL. Their customers include Alkami, The Climate Corporation, 20 minutes, and iCIMS.
Core Features (Winner: Tie)
Both Datadog and New Relic have full suites of observability products, with core offerings that give organizations visibility across their infrastructure, their applications, and the business level. These core offerings include:
- APM: Application Performance Monitoring focused on providing application insights to engineers, including web requests, language-specific function visibility, and database query performance. Monitoring is usually supported by automatic instrumentation, providing a low barrier to entry/adoption.
- Infrastructure: Monitors cloud system resources, e.g., CPU, memory, disk, and network.
- Logging: A widely-supported technique to record application events.
- Custom Metrics/Events: Important for associating business events (e.g., a customer sign-up) with engineering and infrastructure events.
- Alerting: Used to proactively notify engineers about issues.
- Synthetics: Used to run regular checks that monitor service uptime, i.e., probing as a service.
- Cloud Native (Docker, Serverless): Enables the ability to observe ephemeral cloud-based resources.
- Distributed Tracing: Supports tracking requests as they flow through systems.
Cost/Support (Winner: Datadog)
When choosing an observability solution, it’s critical to perform cost projections and track expenses. Both Datadog’s and New Relic’s prices are based on the number of instances and their usage. Both tools offer per-product pricing, which makes cost estimation complicated.
Datadog calculates its prices using the number of instances running and the volume of data being processed, whereas New Relic calculates its prices based on the number of instances running, the instance size, the length of time each instance is on, and the volume of data. New Relic also has a form of pricing based on upfront compute units. This pricing structure is centered around the amount of time a program is being monitored per month.
Unfortunately, comparing costs between these two providers is an apples-to-oranges exercise: each tool offers significantly different features, and each individual feature has its own costs, constraints, and billing dimensions. Datadog’s costs are easier to calculate and understand since they don’t scale with instance size.
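The difference between the two billing models is easier to see with numbers. The sketch below is illustrative only: every rate is a made-up placeholder rather than a published price from either vendor, and the compute-unit formula (vCPUs plus GB of RAM, multiplied by hours running) is an assumption chosen to match the billing dimensions described above. The `Fleet` class and both function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass
class Fleet:
    hosts: int               # number of monitored instances
    gb_ingested: float       # data volume per month, in GB
    cpus_per_host: int       # instance size: vCPUs
    ram_gb_per_host: float   # instance size: memory in GB
    hours_per_month: float   # how long each instance runs


def host_based_cost(fleet, per_host_rate, per_gb_rate):
    """Estimate a bill that depends only on instance count and data volume."""
    return fleet.hosts * per_host_rate + fleet.gb_ingested * per_gb_rate


def compute_unit_cost(fleet, per_cu_rate, per_gb_rate):
    """Estimate a bill that also scales with instance size and running time."""
    # Assumed definition of a compute unit: (vCPUs + GB RAM) * hours running.
    units_per_host = (fleet.cpus_per_host + fleet.ram_gb_per_host) * fleet.hours_per_month
    return fleet.hosts * units_per_host * per_cu_rate + fleet.gb_ingested * per_gb_rate


# Two fleets that differ only in instance size (placeholder numbers).
small = Fleet(hosts=10, gb_ingested=100.0, cpus_per_host=2,
              ram_gb_per_host=4.0, hours_per_month=750.0)
large = replace(small, cpus_per_host=4, ram_gb_per_host=8.0)
```

With these placeholder inputs, `host_based_cost` returns the same figure for `small` and `large`, while `compute_unit_cost` grows with the instance size. That is the property that makes the first model simpler to project.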
Cost Visibility (Winner: Datadog)
Cost visibility is essential to understanding usage as well as to performing cost projections. Both of these tools provide client dashboards to drill into usage and perform cost analyses. Datadog provides clear, upfront pricing on its products, as well as price scenarios that make calculation easier. Unfortunately, New Relic doesn’t even have a full price sheet available online. Its site only lists an APM price for Amazon’s “t2.micro” instances. I couldn’t find prices for any larger instances.
Learning Curve / Documentation (Winner: Datadog)
Even though Datadog is a closed-source product, it has a huge open-source offering on GitHub, which includes all of its APM agents and its monitoring daemon. The effect of this becomes obvious when searching for answers to questions or looking for tooling support. Datadog also has point-and-click tooling and complex aggregations built into its DNA. Last year, New Relic launched a feature called “One Dashboards” designed to make it easier to create and reuse dashboards. While both providers offer great documentation for their products, Datadog’s clean UX and open-source community support really set it apart.
Integration / Ingestion (Winner: Datadog)
Integrations provide connections with third-party services such as AWS, Pingdom, Mongo, and Redis. These connections pipe data out of the originating platform and into Datadog or New Relic, where data can be aggregated, visualized, and alerted on (using PagerDuty, VictorOps, and other similar tools). Having data inside a single observability tool instead of spreading it across multiple tools is critical for ensuring optimal client experiences. When incidents happen, every second counts, and response times slow down when engineers have to search through multiple tools. Integrations also help to get data out of specific types of technologies, such as databases and web services, allowing engineers to correlate that data with their applications.
Both tools have integrations for the most popular cloud providers. Since platforms generally only expose a few ways to integrate (through their API), the more integrations there are, the more likely it is that they will cover the tools you use. Datadog currently supports 350 integrations, while New Relic supports 262.
Alerting (Winner: Datadog)
Alerting enables businesses to be notified as soon as something out of the ordinary happens. Alerting features gather metrics as input, set conditions for triggering an alert, and specify channels for output—most commonly, emails, chats, and/or the triggering of an incident in an incident management system. Alerts are usually defined through a UI.
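The input, condition, and output structure described above can be sketched in a few lines. This is a minimal illustration and not either vendor’s alerting API: the `AlertRule` class, the `evaluate` function, and the channel names are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Sequence


@dataclass
class AlertRule:
    name: str                                       # metric the rule watches
    aggregate: Callable[[Sequence[float]], float]   # how samples are combined
    threshold: float                                # condition: fire when exceeded
    channels: List[str]                             # outputs, e.g. email or chat


def evaluate(rule, samples):
    """Return one notification per channel if the aggregated value breaches the threshold."""
    if not samples:
        return []
    value = rule.aggregate(samples)
    if value <= rule.threshold:
        return []
    return [
        f"[{channel}] {rule.name}: {value:.1f} > {rule.threshold:.1f}"
        for channel in rule.channels
    ]


# Hypothetical rule: alert when mean latency over the window exceeds 500 ms.
latency_rule = AlertRule("request_latency_ms", mean, 500.0, ["email", "chat"])
notifications = evaluate(latency_rule, [400.0, 700.0, 800.0])
```

Swapping `mean` for `max` or a percentile function changes the aggregation without touching the rule structure. This is the kind of choice a point-and-click alert builder exposes as a dropdown.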
Datadog models all input as metrics, making it easy to perform complex operations and aggregations using a simple point-and-click visual web interface. This enables non-technical users to create advanced alerts. New Relic supports a number of complex queries through NRQL, its custom query language. It has an easy UI with simple thresholds, but it requires NRQL for more advanced operations, creating a steeper learning curve for non-technical team members.
Datadog has a wider range of supported operations as well as aggregations and a cleaner UX for creating alerts. Its product has defined the industry standard for alert UX.
Data Retention (Winner: Datadog)
Data retention defines how long data is available in a system. As data ages, both Datadog and New Relic aggregate it to make storage cheaper. The longer the retention period, the easier it is to see long-term (seasonal, monthly, quarterly) trends. Datadog, by default, stores most data for 15 months, while New Relic stores it for 13 months. While New Relic does keep a subset of metrics forever, its retention is generally shorter than Datadog’s. New Relic does offer a feature called Insights, which can store data for longer time periods.
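To make the idea of aggregating aged data concrete, here is a minimal sketch of a rollup that averages points into wider time buckets. The article does not say how either vendor aggregates, so the averaging strategy, the bucket width, and the `roll_up` function name are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def roll_up(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each point to the start of the bucket it falls into.
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Four 30-second samples collapse into two 60-second averages.
raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
rolled = roll_up(raw, 60)
```

The trade-off is visible even at this scale: half as many points to store, but the individual samples within each minute can no longer be recovered.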
Custom Metrics (Winner: Datadog)
Datadog allows applications to send custom metrics on an infrastructure plan. While it requires custom instrumentation, there are open-source solutions dedicated to making this process easier. In 2019, New Relic announced support for open metrics. Their goal was to enable popular projects to easily get data into their platform, something Datadog has facilitated from its inception.
All of Datadog’s custom metrics are based on the DogStatsD/StatsD protocol. This enables Datadog to be used even without an APM to collect important application-level metrics such as the number of requests being made, how long those requests are taking to process, and what the status codes of those requests are. These protocol-based solutions provide a foundation for a rich third-party ecosystem as well as for community support and tooling, benefits that New Relic currently lacks.
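Part of what makes a protocol-based approach approachable is how small the wire format is. The sketch below builds StatsD-style datagrams (`name:value|type`, with the DogStatsD tag extension `|#tag,tag`) and sends them over UDP. The function names and metric names are hypothetical, and the agent address is an assumption: port 8125 is the conventional StatsD default, but your deployment may differ.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build one StatsD-style line; tags use the DogStatsD '|#' extension."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        line += "|#" + ",".join(tags)
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to a local agent (address is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


# The three application-level metrics mentioned above, with hypothetical names:
request_count = format_metric("web.requests", 1, "c", ["status:200"])  # counter
request_time = format_metric("web.request.duration", 320, "ms")        # timer
```

Because the datagram is plain text over UDP, almost any language can emit metrics without a vendor library, which helps explain the breadth of third-party tooling.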
Rollout (Winner: Tie)
Datadog and New Relic differ in their deployments. Datadog requires a piece of software (a daemon) to be deployed on each infrastructure host. It listens for metrics and then submits them. Each application is then configured to point to this local software. New Relic uses a library-based approach for its APM, which submits metrics directly from an application to New Relic. Like Datadog, New Relic also requires a daemon for infrastructure metrics. Both New Relic and Datadog APM offerings make it easy to collect data by providing per-language libraries that can easily be added to applications.
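The daemon model described above (listen locally, buffer, then submit) can be sketched as a toy class. This is not Datadog’s agent: `ToyAgent`, its methods, and the `submit` callback are hypothetical, it handles only StatsD-style counters, and a real daemon would read lines from a socket and flush on a timer.

```python
from collections import Counter


class ToyAgent:
    """Hypothetical local daemon: buffers counter metrics and submits them in batches."""

    def __init__(self, submit):
        self._submit = submit        # callback standing in for the upload to the vendor
        self._counters = Counter()

    def receive(self, line):
        """Accept one 'name:value|type' line from an application on the same host."""
        name, rest = line.split(":", 1)
        value, metric_type = rest.split("|")[:2]
        if metric_type == "c":       # this sketch only aggregates counters
            self._counters[name] += int(value)

    def flush(self):
        """Submit everything buffered so far, then start a fresh window."""
        batch = dict(self._counters)
        self._counters.clear()
        if batch:
            self._submit(batch)


submitted = []
agent = ToyAgent(submitted.append)
agent.receive("web.requests:1|c")
agent.receive("web.requests:2|c")
agent.receive("web.request.duration:320|ms")  # ignored by this sketch
agent.flush()
```

The library-based alternative removes the middle hop: the application would call the equivalent of `submit` itself, which means one less service to run but no host-level aggregation point.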
Datadog is simpler because it has a single standard approach to deployment (its daemon) across all its products. While New Relic does have two different approaches, its APM doesn’t require any additional services to gather metrics.
Monitoring is an essential business process, and it goes without saying that any observability solution is better than no observability solution at all. That said, based on its cost transparency, focus on UX, documentation, and structured metrics, Datadog is a better tool overall. New Relic’s sales and pricing cycle adds extra barriers to entry when compared to Datadog’s full SaaS offering. New Relic is a good choice for an organization with little or no prior observability experience, as it provides a low-touch, application-focused solution geared towards engineers. Datadog is a good choice as a foundational building block. Its commitment to open source shows in its extensive support and tooling options.
Remember, though, that when using either of these monitoring solutions, it’s critical to perform in-depth cost projections, especially if you’re planning to increase the number of serverless or container-based applications that you’re running.