How we can scrape Prometheus metrics from Amazon ECS

Purushottam Baghel, Content Writer

Jun 23, 2022 | 4 min read

Prometheus is a metrics collection and alerting tool originally developed at SoundCloud and later released as open source. It consists of many components, but the main server is responsible for scraping metrics, storing them in its database, and retrieving them when required. If deployed efficiently, a Prometheus cluster can collect millions of metrics every second.

Over the last few months, I had the opportunity to work on Prometheus metrics scraping. The most exciting challenge for me was scraping metrics from ECS. After many trials, prometheus-ecs-discovery helped solve that problem. In this blog, I walk through the process.

How Prometheus metrics scraping works

Before we get started on Prometheus, let’s understand the concept of metrics itself. In simple terms, metrics are used to measure something. For example, the time it takes for you to read my blog is a metric. The number of words in my blog is a metric. The average number of letters in the words of my blog is also a metric.

Most metrics, like the ones I mentioned above, are fairly static, and you don’t really need a system like Prometheus to measure them. But when it comes to metrics that change over time, Prometheus can be a great help. For instance, if I want to know how many views or hits my blog is getting, or how many build and deploy cycles are happening every hour, all of this information can be fed into the tool.

Metrics collection with Prometheus relies on the pull model, which means that it is responsible for getting or ‘scraping’ metrics from the services it monitors. The first thing Prometheus needs is a ‘target.’ Targets are endpoints that supply the metrics that the tool stores.

Prometheus collects metrics from targets by scraping their HTTP metrics endpoints. Once Prometheus has a list of endpoints, it can start retrieving metrics from them. The configuration points to a specific path on each endpoint, which supplies a stream of text that identifies each metric and its current value.
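To make that stream of text concrete, here is a minimal Python sketch that parses a few lines in the Prometheus text exposition format. The metric names and values in SAMPLE are invented for illustration, and the parser only handles the simple "name{labels} value" shape. It ignores timestamps, escaped characters, and other details of the full format.

```python
import re

# Hypothetical response body from a /metrics endpoint (names and values invented).
SAMPLE = """\
# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
process_open_fds 12
"""

LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Each parsed tuple corresponds to one sample that Prometheus would store for that scrape.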

Let’s take an example to understand this. Consider an application that produces Prometheus metrics at dummy IP 192.0.2.0:7000/metrics.

This diagram illustrates an application that produces Prometheus metrics.

Prometheus defines its scrape configuration in its configuration file, prometheus.yml. In this example, the static_configs section tells Prometheus to scrape the HTTP endpoint 192.0.2.0:7000/metrics.
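As a rough illustration, the sketch below mirrors such a static scrape job as a Python dictionary and derives the URL that would be requested for each target. The job name "example-app" is a made-up placeholder. The defaults used for scheme and metrics_path (http and /metrics) are the ones Prometheus applies when a job does not set them.

```python
# Dict mirror of a static scrape job. In prometheus.yml this would read roughly:
#   scrape_configs:
#     - job_name: "example-app"        # hypothetical job name
#       static_configs:
#         - targets: ["192.0.2.0:7000"]
SCRAPE_JOB = {
    "job_name": "example-app",
    "static_configs": [{"targets": ["192.0.2.0:7000"]}],
}


def scrape_urls(job):
    """Build the URL that would be requested for each static target."""
    scheme = job.get("scheme", "http")          # default when the job sets no scheme
    path = job.get("metrics_path", "/metrics")  # default when the job sets no path
    urls = []
    for group in job.get("static_configs", []):
        for target in group["targets"]:
            urls.append(f"{scheme}://{target}{path}")
    return urls


print(scrape_urls(SCRAPE_JOB))
```

Note that the target itself is only host:port. The path comes from the job's metrics_path setting.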

Since Prometheus can only use HTTP to talk to endpoints for metrics collection, you can use exporters to connect Prometheus with endpoints for services that don’t have a native Prometheus metrics endpoint.

Exporters are small and purpose-built programs designed to stand between Prometheus and anything you want to monitor that doesn’t natively support Prometheus.
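The idea behind an exporter can be sketched with nothing but the Python standard library. Everything here is hypothetical: read_queue_depth stands in for whatever call you would make to the system being monitored, and queue_depth is an invented metric name. Real exporters usually rely on an official Prometheus client library rather than hand-written HTTP handlers.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_queue_depth():
    """Hypothetical stand-in for querying a system with no metrics endpoint."""
    return 42


def render_metrics():
    """Translate the monitored system's state into exposition text."""
    return (
        "# HELP queue_depth Number of items waiting in the queue.\n"
        "# TYPE queue_depth gauge\n"
        f"queue_depth {read_queue_depth()}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=7000):
    """Serve /metrics until interrupted; call this to run the exporter."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() would expose the metric on port 7000, the same port used in the example above.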

About Amazon ECS

Amazon Elastic Container Service (ECS) is a highly scalable and fast container management service. On a single machine, many containers could be running. Dynamic port mapping lets you run multiple tasks on the same host by assigning each of them a random host port.

The following table illustrates our example application listening on container port 7000, which maps to host port 32768.

Network Bindings

Host Port   Container Port   Protocol   External Link
32768       7000             tcp        192.0.2.0:32768

If we run multiple tasks of the same application, each producing metrics on container port 7000, dynamic port mapping assigns a different host port to each task. These host ports are then used to communicate with the application.
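The mapping can be sketched in Python as follows. The task records are simplified and hypothetical: host_ip is an invented field, the second host port (32769) is made up, and only the keys inside networkBindings follow the naming that ECS uses for network bindings.

```python
# Hypothetical tasks of the same application on one host. The keys inside
# networkBindings follow ECS naming; host_ip and the port values are invented.
TASKS = [
    {"host_ip": "192.0.2.0",
     "networkBindings": [{"containerPort": 7000, "hostPort": 32768, "protocol": "tcp"}]},
    {"host_ip": "192.0.2.0",
     "networkBindings": [{"containerPort": 7000, "hostPort": 32769, "protocol": "tcp"}]},
]


def scrape_address(task, container_port):
    """Return 'host_ip:hostPort' for the binding that maps container_port, or None."""
    for binding in task["networkBindings"]:
        if binding["containerPort"] == container_port:
            return f'{task["host_ip"]}:{binding["hostPort"]}'
    return None


print([scrape_address(task, 7000) for task in TASKS])
```

Both tasks listen on container port 7000, yet each ends up with its own scrape address on the host.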

Why we use Prometheus service discovery

Prometheus service discovery is a standard method of finding endpoints to scrape for metrics. We needed it for the following reasons:

  • The number of tasks in each ECS service can grow: When we increase the number of tasks in an ECS service, there are more endpoints to scrape. Even though we can add multiple targets in static configs, maintaining them manually is really painful.

  • Host port numbers change on ECS service restart: When we restart a service in ECS, its tasks get different host port numbers, and we need to know the host and port number to scrape metrics. Looking up the host port by hand every time is not feasible.

How we devised a solution

A growing number of tasks and changing ports are difficult to manage with static configs, so we need a mechanism that picks up these changes automatically. Prometheus also provides an option to read targets from files through file_sd_configs.

The ecs_targets.yml file lists the targets. Each entry contains an IP and a port.
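As an illustration of what such a file might contain, the Python sketch below renders a list of target groups into the YAML layout that file-based service discovery reads: a list of entries, each with targets and optional labels. The addresses and the job label are placeholders, and the renderer is deliberately naive (it does not escape quotes inside values).

```python
def render_file_sd_yaml(groups):
    """Render target groups as YAML text in the file_sd layout."""
    lines = []
    for group in groups:
        lines.append("- targets:")
        for target in group["targets"]:
            lines.append(f'    - "{target}"')
        labels = group.get("labels", {})
        if labels:
            lines.append("  labels:")
            for key in sorted(labels):
                lines.append(f'    {key}: "{labels[key]}"')
    return "\n".join(lines) + "\n"


# Hypothetical target group: two tasks of one service behind dynamic host ports.
GROUPS = [
    {"targets": ["192.0.2.0:32768", "192.0.2.0:32769"],
     "labels": {"job": "example-app"}},
]

print(render_file_sd_yaml(GROUPS), end="")
```

On the Prometheus side, a scrape job would reference this file in the files list of its file_sd_configs section.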

Note: Prometheus needs to be supplied with a list of targets, the host/IP, and the port of each service from which metric data should be scraped.

Why we leveraged Prometheus ECS discovery

Sure, having a file that holds all the latest information might solve our problem. But how will that file be updated every time we restart our service or increase the number of tasks? The file always needs to hold up-to-date information, and that is where prometheus-ecs-discovery comes in.

The best way to get up-to-date information about ECS tasks in your AWS account is through the AWS API. It can return the private IP address and port required for Prometheus to scrape metrics and other metadata.
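Once that information has been fetched, turning it into scrape targets is mostly a grouping exercise. The sketch below assumes the API responses have already been flattened into simple records. The record shape, service names, and addresses are all hypothetical. The actual AWS calls are left out because they depend on your cluster setup and SDK.

```python
from collections import defaultdict

# Hypothetical, already-flattened task records. In practice these fields would be
# assembled from several AWS API responses; the values here are invented.
DISCOVERED = [
    {"service": "example-app", "private_ip": "10.0.1.15", "host_port": 32768},
    {"service": "example-app", "private_ip": "10.0.1.15", "host_port": 32771},
    {"service": "billing", "private_ip": "10.0.2.40", "host_port": 32768},
]


def to_target_groups(records):
    """Group task records by service into file_sd-style target groups."""
    by_service = defaultdict(list)
    for record in records:
        by_service[record["service"]].append(
            f'{record["private_ip"]}:{record["host_port"]}'
        )
    return [
        {"targets": sorted(targets), "labels": {"job": service}}
        for service, targets in sorted(by_service.items())
    ]


for group in to_target_groups(DISCOVERED):
    print(group)
```

Sorting the services and targets keeps the output stable between runs, so the targets file only changes when the set of tasks actually changes.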

Prometheus will read the targets from a file whose location is specified in the configuration file with the file_sd_configs element. Prometheus will also periodically reload this file (every 5 minutes, by default).
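Because Prometheus may re-read the targets file while a discovery process is rewriting it, one cautious approach is to write to a temporary file and rename it into place. This is a general file-handling pattern sketched under that assumption, not a description of how any particular discovery tool works. The function name is hypothetical.

```python
import os
import tempfile


def write_atomically(path, content):
    """Write content to path via a temp file and rename, so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        os.chmod(tmp_path, 0o644)   # mkstemp creates owner-only files; let other users read
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        os.unlink(tmp_path)         # clean up the temp file if anything went wrong
        raise
```

A discovery process could call write_atomically("ecs_targets.yml", rendered_text) after each refresh of the task list, where rendered_text is the newly generated file content.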

With the help of the above flow, you can write your own code or use the already existing open-source service discovery. You can find more details on GitHub.
