Installation overview of node_exporter, prometheus and grafana

Prometheus is an open source systems monitoring and alerting toolkit originally built at SoundCloud. This blogpost shows how to install the needed components to visualise Linux system statistics via Grafana.

Addition June 29, 2018: a really quick and simple install is provided in this blogpost: very quick install of prometheus, node exporter and grafana. That post uses an ansible script that does most of the installation and configuration for you.

The setup consists of 3 components:
node_exporter, an exporter of system and hardware metrics.
prometheus, a metric collection and persistence layer.
grafana, the visualisation layer.

1. Preparation
The needed components are installed in the home directory of the user ‘prometheus’. In order for that user to exist, it must obviously first be created:

# useradd prometheus
# su - prometheus
$

This installation guide uses Oracle Linux 7.3, but should work on RHEL or CentOS too.

2. Node exporter
The next thing to do is install the node exporter. Please mind that new versions do come out, so you might want to check for the latest release on the node_exporter releases page (https://github.com/prometheus/node_exporter/releases):

$ curl -LO "https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz"
$ mkdir -p Prometheus/node_exporter
$ cd $_
$ tar xzf ../../node_exporter-0.14.0.linux-amd64.tar.gz
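If you want, you can run the freshly extracted binary once in the foreground as a quick sanity check; it should report that it is listening on :9100. Stop it with ctrl-C before continuing:

$ ./node_exporter-0.14.0.linux-amd64/node_exporter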

Now become root and create a unit file to automatically startup the node exporter using systemd:

# echo "[Unit]
Description=Node Exporter

[Service]
User=prometheus
ExecStart=/home/prometheus/Prometheus/node_exporter/node_exporter-0.14.0.linux-amd64/node_exporter

[Install]
WantedBy=default.target" > /etc/systemd/system/node_exporter.service

And make systemd start the node exporter:

# systemctl daemon-reload
# systemctl enable node_exporter.service
# systemctl start node_exporter.service

Next you can verify if the node exporter is running by using ‘systemctl status node_exporter.service’:

# systemctl status node_exporter.service
● node_exporter.service - Node Exporter
   Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:20:54 UTC; 7s ago
 Main PID: 3017 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─3017 /home/prometheus/Prometheus/node_exporter/node_exporter-0.14.0.linux-amd64/node_exporter

Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - hwmon" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - infiniband" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - textfile" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - conntrack" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - diskstats" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - entropy" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - loadavg" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - sockstat" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - wifi" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg="Listening on :9100" source="node_exporter.go:186"

Additionally, you can go to hostname:9100 and check that the page says ‘Node Exporter’ and has a link called ‘Metrics’, which lists all the metrics.
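The same can be checked from the command line with curl. This just lists a couple of the exported metrics; the exact metric names differ between node_exporter versions:

$ curl -s http://localhost:9100/metrics | grep '^node_' | head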

3. Prometheus
After we installed node_exporter to provide measurements, we must install the software that fetches that information and stores it. That is what prometheus does. First, become the prometheus user again, and install prometheus. Here too it is important to realise that newer versions will have come out after this article was written:

# su - prometheus
$ curl -LO "https://github.com/prometheus/prometheus/releases/download/v1.7.1/prometheus-1.7.1.linux-amd64.tar.gz"
$ cd Prometheus
$ tar xzf ../prometheus-1.7.1.linux-amd64.tar.gz
$ cd prometheus-1.7.1.linux-amd64
$ echo "scrape_configs:

  - job_name: 'prometheus'
    scrape_interval: 1s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter'
    scrape_interval: 1s
    static_configs:
      - targets: ['localhost:9100']"> prometheus.yml

This downloaded and unpacked prometheus, and created a prometheus scrape configuration that fetches data from prometheus itself and from the node exporter.
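Optionally, you can sanity check the configuration file with promtool, which ships in the same tarball. Please mind the subcommand spelling depends on the version: in prometheus 1.x it is ‘check-config’, in 2.x it changed to ‘check config’:

$ ./promtool check-config prometheus.yml

Now become root, and install the systemd unit file for prometheus: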

# echo "[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=prometheus
Restart=on-failure
ExecStart=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus -config.file=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus.yml -storage.local.path=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/data

[Install]
WantedBy=multi-user.target" > /etc/systemd/system/prometheus.service

And make systemd start prometheus:

# systemctl daemon-reload
# systemctl enable prometheus.service
# systemctl start prometheus.service

And verify prometheus is running:

# systemctl status prometheus.service
● prometheus.service - Prometheus Server
   Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:36:55 UTC; 9s ago
     Docs: https://prometheus.io/docs/introduction/overview/
 Main PID: 22656 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─22656 /home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus -config.file=/home/prometheus/Prometheus/prometheus-1.7.1....

Jul 31 15:36:55 test.local systemd[1]: Started Prometheus Server.
Jul 31 15:36:55 test.local systemd[1]: Starting Prometheus Server...
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Starting prometheus (version=1.7.1, branch=mast...n.go:88"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Build context (go=go1.8.3, user=root@0aa1b7fc43...n.go:89"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Host details (Linux 3.10.0-514.26.2.el7.x86_64 ...n.go:90"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Loading configuration file /home/prometheus/Pro....go:252"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Loading series map and head chunks..." source="....go:428"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="0 series loaded." source="storage.go:439"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Starting target manager..." source="targetmanager.go:63"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Listening on :9090" source="web.go:259"
Hint: Some lines were ellipsized, use -l to show in full.

Additionally you can go to hostname:9090/targets and verify both node_exporter and prometheus report state=UP.
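The same check can be done from the command line via the prometheus HTTP API: the ‘up’ metric is 1 for every target that was scraped successfully, and 0 for targets that are down:

$ curl -s 'http://localhost:9090/api/v1/query?query=up'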

At this point, system metrics are fetched and stored. All we need to do now is visualise them. An excellent tool for doing so is grafana.

4. Grafana
The grafana documentation (http://docs.grafana.org) shows installation instructions and a link to the latest version. At the time of writing of this blogpost, the latest version was 4.4.1. This is how grafana is installed (please mind that installation and systemd configuration require root privileges):

# yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.4.1-1.x86_64.rpm

Next up, make systemd handle grafana and start it:

# systemctl daemon-reload
# systemctl enable grafana-server.service
# systemctl start grafana-server.service

And check if grafana is running:

# systemctl status grafana-server.service
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:43:11 UTC; 1min 58s ago
     Docs: http://docs.grafana.org
 Main PID: 22788 (grafana-server)
   CGroup: /system.slice/grafana-server.service
           └─22788 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile= cfg:default.paths.logs=/var/log/grafana cfg:default.path...

Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Starting plugin search" logger=plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=warn msg="Plugin dir does not exist" logger=plugins dir=/...plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Plugin dir created" logger=plugins dir=/var/lib...plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing Alerting" logger=alerting.engine
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing CleanUpService" logger=cleanup
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing Stream Manager"
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing HTTP Server" logger=http.server ad...socket=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Hint: Some lines were ellipsized, use -l to show in full.
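You can also do a quick check from the command line that grafana answers HTTP requests on its default port 3000; this only checks reachability, it should print 200 for the login page:

$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/login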

5. Grafana configuration
Next, we need to hook up grafana with prometheus. First, go to hostname:3000.
– Log in with admin/admin
– Click ‘Add data source’
– Name: prometheus, Type: Prometheus
– HTTP settings: http://localhost:9090, select Access: ‘proxy’.
– Click ‘save and test’. This should result in ‘success’ and ‘datasource updated.’
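Alternatively, the same datasource can be created from the command line via grafana’s HTTP API. This is just a sketch, assuming the default admin/admin credentials and grafana listening on localhost:3000:

$ curl -s -X POST http://admin:admin@localhost:3000/api/datasources \
    -H 'Content-Type: application/json' \
    -d '{"name":"prometheus","type":"prometheus","url":"http://localhost:9090","access":"proxy"}'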

Now click on the grafana symbol in the upper left corner, then Dashboards, Import. Enter ‘2747’ at ‘grafana.com dashboard’. This will show ‘Linux memory’; select the prometheus datasource you just defined, and click Import.

This should result in a dashboard that shows you the Linux memory areas:
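The panels on this dashboard are built from the node_exporter memory metrics, which come straight from /proc/meminfo. If you want to look at the raw numbers yourself, you can query prometheus directly. A minimal sketch that computes a rough ‘used memory’ figure; the metric names are as exported by node_exporter 0.14, newer versions rename them with a ‘_bytes’ suffix:

$ curl -s -G 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=node_memory_MemTotal - node_memory_MemFree - node_memory_Cached - node_memory_Buffers'

The same expression can also be pasted into the expression browser at hostname:9090/graph.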

10 comments
  1. The above tool does not seem to give any useful info on Linux memory. Slab is made up of 2 parts, of which only one is reclaimable, i.e. reused if needed by the kernel to allocate/give to some other memory usage by userland. Also, the buffers portion is no longer meaningful in newer kernels; it is all about the page cache, i.e. file-backed pages, imho.
    If you want to get more info, check my blog, where I have 2 short posts on the memory breakup from the Linux kernel perspective. Please do provide any feedback later when you get time. Thanks.

    • I assume with ‘above tool’ you mean node_exporter and prometheus? I am not sure if you have looked at how it works; it’s just exporting and storing information from /proc/meminfo (amongst other stats). What is important in my opinion is to know where the majority of memory is going. On database servers that is huge pages (if used), Shared memory (small pages), Anonymous memory and Cached (which is both page cache and file backed mappings, I haven’t found a way to make a distinction between the two), and maybe Pagetables.

      Memory areas like Slab typically take a few percent of memory on the real-life (database) systems I see, which means in my case I don’t care how that is composed. Same for the memory area that is designated as Buffers: it’s a tiny portion, so I am not really interested in it.

      Even Cached is really only important for the part of the executable and libraries mapped in, which is relatively small on a real system. Database IO for Oracle should do DIO, so the page cache is of low interest.

      What do you find useful, and for what purpose?

  2. Thanks for your reply.
    1. Yes. I haven’t looked in detail at the workings of the tool, but from my prior experience with such tools: the overhead, and the inability to provide a fine-grained breakup of data that often at the outset doesn’t look important/significant but matters a lot eventually. But then, if it is free and open source, one normally doesn’t care, to some extent.

    2. Agreed for DB servers, especially Oracle: huge pages by default, lots of direct reads into the PGA, async+DIO etc. But then there are certain OS/Linux issues that can bite you in a DB environment. That is where such detailed understanding of the internals can help. But again, I mostly agree with you.

    3. For many other cases (non-DB servers) I have seen that the SLAB portion can be significant. Even for DB servers these days you do have some agent software running, where Anonymous growth (possible leaks, if any) together with Cached (which includes Mapped for executables and shared libraries) can be important too.

    4. Note: use of the “Buffers” or “Cached” terminology, as well as those data items as reported, is no longer useful.
    Instead it is all about the File portions (fs data blocks) together with the Mapped portion of binaries/executables; that is what matters most.

    Below holds true in general, if you need to know the calculation.
    (Buffers+Cached+SReclaimable) – (ActiveFile+InactiveFile+SReclaimable) = Shmem (shared memory, as used in DB servers etc.). So ActiveFile+InactiveFile is what matters mostly; it includes Mapped, which is held for executables loaded and in use.

    SReclaimable is that portion of SLAB that the kernel sees as reclaimable, belonging to non-root fs dentries etc. Again, insignificant for the most part on DB servers. All other data items are as reported in /proc/meminfo.

    • I am trying to figure out what you want to say.

      1.
      You don’t need the details of the product to be honest, it picks its values from /proc/meminfo. It seems you pick your values from there too.
      I think it’s dangerous to make assumptions on software in general just by the classification ‘open source’. There are very good and very bad pieces of software in open source. I don’t see why you shouldn’t care because it’s open source. It’s free, so on your lab equipment you can choose what to use and what is useful.

      The other way around, in my experience most (so not all…) common commercial monitoring (-suite) tools are extremely high on overhead, and often show a clear lack of understanding of what is measured, nor do they store useful measurements that can be used to prove anything. Where proof is the ability for me to reconstruct what is going on technically. Please mind I am making a statement without any hard facts. I do that on purpose here; keep this in mind when you look at a commercial monitoring suite by any of the big vendors.

      2.
      You say here you agree, but ‘then there are certain OS/linux issues that can bite you in DB env’. I don’t like these statements, because you make a statement without any facts. Be factual: create a testcase that someone can replay and show how to measure it, include any versions that are important, and explain what you see and why it’s bad. _please_ don’t magically come up with numbers without telling where they come from and what they mean, and don’t provide a solution without telling what the problem is.

      3.
      You make two statements here.

      3.1 SLAB
      ‘I have seen SLAB portion can be significant’. Okay. Architecture (x86_64?) Linux kernel version? Total amount of available memory? Total amount of SLAB? If this is > 5% of total available memory: what is causing SLAB to be so huge? Does it need to be that big? Can it be managed or even be tuned?

      3.2 Anonymous/Cached
      Not sure what statement you are making here; I think you are saying that looking at Anonymous and Cached can be important for determining leaks in agent software. Is that what you are trying to say? Can you be more specific about which software you are talking about? Obviously, software should not leak memory. If it does, it’s quite recognisable in most cases by a consistent increase in size, or a consistent increase in size after a certain event.

      4. Buffers and Cached no longer useful
      I haven’t looked extensively into the kernel source to see what Buffers and Cached mean exactly.

      In fact, buffers is sometimes explained as page cache metadata, and sometimes as the IO buffer for IO transfers by the kernel. The latter sounds logical to me, which is currently what I think ‘buffers’ is. If you can provide some proof of in what kernel version this is different or even obsolete, please PLEASE show me!

      Cached. This is a statistic I don’t really like, because it seems to include some other statistics, and it’s documented badly. Here too I relied on blogposts. However, I have done some testing here too. I do not get negative values, but that requires me to ignore some things. Let me explain.
      Shmem is a statistic that is reported separately, and apparently is included in Cached. The Shmem statistic is consistent with the ipcs -mu output. That’s two distinct sources reporting the same values. For that reason, I trust ‘Shmem’.

      I tried playing around with the remainder of Cached (Cached-Shmem) and in some way tried to include Mapped to figure out the page cache and the mmapped portion. I could not get consistent values. This is the reason I just group them and report them both as one statistic. This way, I can get consistent and non-negative values.

      The same applies here: you say Cached is a useless statistic, okay: where does it say so, or how can you prove it’s reporting porkies?

      For the formula: this seems to be using /proc/meminfo values, correct? This is what node_exporter is providing to prometheus?

      Can you explain what you show with the formula? The outcome seems to be the amount of shared memory, which is a value that is directly derivable from /proc/meminfo, which you can validate with ipcs -mu. Are you saying that page cache and mapped files are in Active_File and Inactive_File?

      In fact, when I take Cached and subtract Shmem, the value comes very close to active+inactive (file)?
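      For reference, this is easy to check directly against /proc/meminfo (field names as they appear in current kernels), so everyone can compare the two figures on their own system:

      $ awk '/^Cached:/{c=$2} /^Shmem:/{s=$2} /^Active\(file\):/{af=$2} /^Inactive\(file\):/{inf=$2} END{printf "Cached-Shmem: %d kB, Active(file)+Inactive(file): %d kB\n", c-s, af+inf}' /proc/meminfo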

  3. 1. I am just sharing my experience on this issue/topic of Linux MM/memory management, and especially a few pitfalls of reporting on Linux memory usage in general.
    So I am not disregarding any tool (big or small/open source) or your blog; I initially pointed out the ‘MemAvailable’ computed item reported in /proc/meminfo, its computation and significance. That’s all.

    2, 3: For these I will post some data later, as I don’t have any environment up and running (high Slab values in non-DB environments, leaks in software, etc.).

    4. Yes, ActiveFile+InactiveFile holds the page cache together with Mapped.
    Cached+Buffers = ActiveFile + InactiveFile + Shmem. So you can see that shared memory is included in the Cached+Buffers total value, and hence, when checking how much the kernel can reclaim (especially when computing ‘MemAvailable’) without needing to swap, it could be misleading to consider the Cached and Buffers portions. Because IMHO the kernel does not consider shmem as reclaimable without swapping, but considers mainly the file page cache together with the Slab reclaimable portion.
    So you can check this: the file-backed page cache, i.e. ActiveFile + InactiveFile, includes ‘Mapped’ currently.

  4. 1.
    Do the community a favour and create a concrete case that shows such an issue, show how to measure it, and if possible what to do about it. State clearly at the top what the blogpost is about, explain how such a situation could be encountered, and if possible how you can artificially generate that situation, how you can see/measure this, and then if possible how to solve or work around it. I read the blogs you created, and it sometimes wasn’t clear where the measurements came from, or what the point was of the figures you presented. Make it easy to understand; we cannot look inside your head.

    4.
    What I read here is that you are saying that Cached minus Shmem indeed is page cache and memory mapped files. Why then state that these should not be used?

    You still haven’t provided any reason why Buffers should not be used.

    I created a presentation where I purposely set the memory areas of the Oracle database to values that put the system under memory pressure. The reason for setting up prometheus is to be able to see the memory areas resize live during that presentation. Based on that, I can see that free memory obviously is taken first when memory is requested, but even while free memory is available, Cached minus Shmem is the first memory area that gets its pages pulled, which supposedly are page cache pages; then Cached pages are swapped, which obviously are the true memory mapped files. After that, shared memory gets swapped too, given enough memory pressure.

    It seems like the point you are trying to make is about ‘available memory’, which is something I do not talk about. I talk about what the memory looks like.
