# Sampler. Visualization for any shell command.
[![Build Status](https://travis-ci.com/sqshq/sampler.svg?token=LdyRhxxjDFnAz1bJg8fq&branch=master)](https://travis-ci.com/sqshq/sampler) [![Go Report Card](https://goreportcard.com/badge/github.com/sqshq/sampler)](https://goreportcard.com/report/github.com/sqshq/sampler)

Sampler is a tool for shell command execution, visualization, and alerting. It is configured with a simple YAML file.

![sampler](https://user-images.githubusercontent.com/6069066/56404396-70b14d00-6234-11e9-93cd-54461bf40c96.gif)

## Why do I need it?
You can sample any dynamic process right from the terminal: observe changes in a database, monitor in-flight MQ messages, or trigger a deployment script and get a notification when it's done.

If a metric can be obtained with a shell command, Sampler can visualize it right away.

## Installation
### macOS
```bash
brew cask install sampler
```
or
```bash
sudo curl -Lo /usr/local/bin/sampler https://github.com/sqshq/sampler/releases/download/v1.0.3/sampler-1.0.3-darwin-amd64
sudo chmod +x /usr/local/bin/sampler
```

### Linux
```bash
sudo wget https://github.com/sqshq/sampler/releases/download/v1.0.3/sampler-1.0.3-linux-amd64 -O /usr/local/bin/sampler
sudo chmod +x /usr/local/bin/sampler
```
Note: the `libasound2-dev` system library must be installed for Sampler to play a [trigger](https://github.com/sqshq/sampler#triggers) sound tone. It is usually already present; if not, install it with your package manager, e.g. `apt install libasound2-dev`.

#### Packaging status
- [Fedora](https://apps.fedoraproject.org/packages/golang-github-sqshq-sampler) `sudo dnf install golang-github-sqshq-sampler` (F31+)

### Windows (experimental)
An advanced console emulator such as [Cmder](https://cmder.net/) is recommended.

[Download .exe](https://github.com/sqshq/sampler/releases/download/v1.0.3/sampler-1.0.3-windows-amd64.exe)

## Usage
You specify shell commands, and Sampler executes them at the configured rate. Their output is used for visualization.

Using Sampler is basically a 3-step process:
- Define your shell commands in a YAML configuration file
- Run `sampler -c config.yml`
- Adjust component sizes and locations in the UI
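
The three steps above can be sketched as a short shell script. This is only a sketch: the helper name `write_demo_config` is hypothetical, and the sparkline definition is borrowed from the Sparkline example in the Components section.

```bash
# Step 1: write a minimal configuration file.
# write_demo_config is a hypothetical helper name, not part of Sampler.
write_demo_config() {
  cat > "$1" <<'EOF'
sparklines:
  - title: CPU usage
    rate-ms: 200
    scale: 0
    sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'
EOF
}

write_demo_config config.yml

# Step 2: start Sampler with that file (left commented out so the snippet
# can be pasted safely on a machine without Sampler installed).
# sampler -c config.yml

# Step 3 happens interactively inside the Sampler UI.
```

The heredoc delimiter is quoted (`'EOF'`) so that `$1` inside the awk program is written to the file literally instead of being expanded by the shell.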
## But there are so many monitoring systems already
Sampler is by no means an alternative to full-scale monitoring systems, but rather an easy-to-set-up development tool.

If spinning up and configuring [Prometheus with Grafana](https://prometheus.io) is complete overkill for your task, Sampler might be the right solution. No servers, no databases, no deployment: you specify shell commands, and it just works.

## Then it should be installed on every server I monitor?
No, you can run Sampler locally and still gather telemetry from multiple remote machines. Any visualization can have an `init` command, where you can ssh to a remote server. See the [SSH example](https://github.com/sqshq/sampler#ssh).

## Contents

- [Components](#components)
  - [Runchart](#runchart)
  - [Sparkline](#sparkline)
  - [Barchart](#barchart)
  - [Gauge](#gauge)
  - [Textbox](#textbox)
  - [Asciibox](#asciibox)
- [Bells and whistles](#bells-and-whistles)
  - [Triggers (conditional actions)](#triggers)
  - [Interactive shell (database interaction, remote server access, etc)](#interactive-shell-support)
  - [Variables](#variables)
  - [Color theme](#color-theme)
- [Real-world recipes (contributions welcome!)](#real-world-recipes)
  - [Databases (MySQL, PostgreSQL, MongoDB, Neo4j)](#databases)
  - [Kafka](#kafka)
  - [Docker](#docker)
  - [SSH](#ssh)
  - [JMX](#jmx)

## Components
The following is a list of configuration examples for each component type, with macOS-compatible sampling scripts.

### Runchart
![runchart](https://user-images.githubusercontent.com/6069066/59168666-aff96d00-8b04-11e9-99b6-34d8bae37bd2.png)

```yml
runcharts:
  - title: Search engine response time
    rate-ms: 500        # sampling rate, default = 1000
    scale: 2            # number of digits after sample decimal point, default = 1
    legend:
      enabled: true     # enables item labels, default = true
      details: false    # enables item statistics: cur/min/max/dlt values, default = true
    items:
      - label: GOOGLE
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.google.com
        color: 178      # 8-bit color number, default one is chosen from a pre-defined palette
      - label: YAHOO
        sample: curl -o /dev/null -s -w '%{time_total}' https://search.yahoo.com
      - label: BING
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.bing.com
```
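
To pick a value for `color`, it can help to preview the palette in the terminal. The sketch below assumes that the "8-bit color number" corresponds to the standard 256-color terminal palette; `show_colors` is a hypothetical helper, not a Sampler command.

```bash
# Print one swatch per 8-bit color number so a value for `color:` can be
# picked by eye. Usage: show_colors FIRST LAST (both within 0..255).
show_colors() {
  i=$1
  while [ "$i" -le "$2" ]; do
    # ESC[38;5;<n>m selects foreground color n; ESC[0m resets attributes
    printf '\033[38;5;%dm%3d\033[0m ' "$i" "$i"
    i=$((i + 1))
  done
  printf '\n'
}

show_colors 176 180
```

Calling `show_colors 0 255` prints the whole palette; how a color actually looks depends on the terminal emulator.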

### Sparkline
![sparkline](https://user-images.githubusercontent.com/6069066/59167746-de754900-8b00-11e9-9305-c9a4176634d2.png)

```yml
sparklines:
  - title: CPU usage
    rate-ms: 200
    scale: 0
    sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'
  - title: Free memory pages
    rate-ms: 200
    scale: 0
    sample: memory_pressure | grep 'Pages free' | awk '{print $3}'
```
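
Before adding a `sample` command to a configuration, it can be useful to run it a few times in a plain shell and confirm that it prints a single number. The helper below is a hypothetical sketch for that purpose; it is not part of Sampler and only imitates periodic sampling with `sleep`.

```bash
# Hypothetical helper (not part of Sampler): run a sample command COUNT
# times, pausing INTERVAL seconds between runs, and print each result.
# Usage: preview_sample COMMAND COUNT INTERVAL
preview_sample() {
  cmd=$1
  count=$2
  interval=$3
  i=0
  while [ "$i" -lt "$count" ]; do
    sh -c "$cmd"
    i=$((i + 1))
    # No pause is needed after the last run
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}

# Three readings of the seconds counter, one second apart
preview_sample 'date +%S' 3 1
```

If a command prints several lines or non-numeric text here, it will likely need further filtering (for example with `awk`, as in the examples above) before it suits a numeric component.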

### Barchart
![barchart](https://user-images.githubusercontent.com/6069066/59167751-de754900-8b00-11e9-8d01-efd04ae1eec6.png)

```yml
barcharts:
  - title: Local network activity
    rate-ms: 500        # sampling rate, default = 1000
    scale: 0            # number of digits after sample decimal point, default = 1
    items:
      - label: UDP bytes in
        sample: nettop -J bytes_in -l 1 -m udp | awk '{sum += $4} END {print sum}'
      - label: UDP bytes out
        sample: nettop -J bytes_out -l 1 -m udp | awk '{sum += $4} END {print sum}'
      - label: TCP bytes in
        sample: nettop -J bytes_in -l 1 -m tcp | awk '{sum += $4} END {print sum}'
      - label: TCP bytes out
        sample: nettop -J bytes_out -l 1 -m tcp | awk '{sum += $4} END {print sum}'
```

### Gauge
![gauge](https://user-images.githubusercontent.com/6069066/59318799-4c06ae00-8c96-11e9-868a-7fef803f3739.png)

```yml
gauges:
  - title: Minute progress
    rate-ms: 500        # sampling rate, default = 1000
    scale: 2            # number of digits after sample decimal point, default = 1
    percent-only: false # toggle display of the current value, default = false
    color: 178          # 8-bit color number, default one is chosen from a pre-defined palette
    cur:
      sample: date +%S  # sample script for current value
    max:
      sample: echo 60   # sample script for max value
    min:
      sample: echo 0    # sample script for min value
  - title: Year progress
    cur:
      sample: date +%j
    max:
      sample: echo 365
    min:
      sample: echo 0
```
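
The relationship between the `cur`, `min` and `max` samples can be illustrated with a small calculation. The sketch below assumes a plain linear scale between `min` and `max`; Sampler's own rendering and rounding may differ, and `gauge_percent` is a hypothetical helper name.

```bash
# Assumed mapping (linear scale): percent = (cur - min) / (max - min) * 100.
# Usage: gauge_percent CUR MIN MAX   (MAX must differ from MIN)
gauge_percent() {
  awk -v cur="$1" -v min="$2" -v max="$3" \
    'BEGIN { printf "%.1f\n", (cur - min) / (max - min) * 100 }'
}

# Same inputs as the "Minute progress" gauge above
gauge_percent "$(date +%S)" 0 60
```

With the "Year progress" inputs, note that `date +%j` reaches 366 in a leap year, so a fixed maximum of 365 would yield a value above 100 on the last day.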

### Textbox
![textbox](https://user-images.githubusercontent.com/6069066/59168949-192db000-8b06-11e9-900b-0e92ff494f62.png)

```yml
textboxes:
  - title: Local weather
    rate-ms: 10000      # sampling rate, default = 1000
    sample: curl wttr.in?0ATQF
    border: false       # border around the item, default = true
    color: 178          # 8-bit color number, default is white
  - title: Docker containers stats
    rate-ms: 500
    sample: docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.PIDs}}"
```

### Asciibox
![asciibox](https://user-images.githubusercontent.com/6069066/59169283-aa515680-8b07-11e9-8beb-716a387aed1b.png)

```yml
asciiboxes:
  - title: UTC time
    rate-ms: 500        # sampling rate, default = 1000
    font: 3d            # font type, default = 2d
    border: false       # border around the item, default = true
    color: 43           # 8-bit color number, default is white
    sample: env TZ=UTC date +%r
```

## Bells and whistles

### Triggers
Triggers let you perform conditional actions, such as visual or sound alerts or an arbitrary shell command.

The following examples illustrate the concept.

#### Clock gauge, which shows minute progress and announces current time at the beginning of each minute
```yml
gauges:
  - title: MINUTE PROGRESS
    position: [[0, 18], [80, 0]]
    cur:
      sample: date +%S
    max:
      sample: echo 60
    min:
      sample: echo 0
    triggers:
      - title: CLOCK BELL EVERY MINUTE
        condition: '[ "$label" = "cur" ] && [ "$cur" -eq 0 ] && echo 1 || echo 0'  # expects "1" as TRUE indicator
        actions:
          terminal-bell: true  # standard terminal bell, default = false
          sound: true          # NASA quindar tone, default = false
          visual: false        # notification with current value on top of the component area, default = false
          script: say -v samantha `date +%I:%M%p`  # an arbitrary script, which can use $cur, $prev and $label variables
```
#### Search engine latency chart, which alerts user when latency exceeds a threshold
```yml
runcharts:
  - title: SEARCH ENGINE RESPONSE TIME (sec)
    rate-ms: 200
    items:
      - label: GOOGLE
        sample: curl -o /dev/null -s -w '%{time_total}' https://www.google.com
      - label: YAHOO
        sample: curl -o /dev/null -s -w '%{time_total}' https://search.yahoo.com
    triggers:
      - title: Latency threshold exceeded
        condition: echo "$prev < 0.3 && $cur > 0.3" |bc -l  # expects "1" as TRUE indicator
        actions:
          terminal-bell: true  # standard terminal bell, default = false
          sound: true          # NASA quindar tone, default = false
          visual: true         # visual notification on top of the component area, default = false
          script: 'say alert: ${label} latency exceeded ${cur} second'  # an arbitrary script, which can use $cur, $prev and $label variables
```
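
A trigger `condition` can be tried out in a plain shell before it goes into a configuration. The sketch below is a hypothetical helper, not part of Sampler: it supplies `cur`, `prev` and `label` as environment variables, which is an assumption that only approximates how Sampler provides them, and it treats the output `1` as TRUE, as described in the comments above.

```bash
# Hypothetical helper: evaluate CONDITION with cur, prev and label set as
# environment variables; succeed only if the condition prints "1".
# Usage: trigger_fires CUR PREV LABEL CONDITION
trigger_fires() {
  result=$(cur="$1" prev="$2" label="$3" sh -c "$4")
  [ "$result" = "1" ]
}

# A condition in the style of the clock gauge example
condition='[ "$label" = "cur" ] && [ "$cur" -eq 0 ] && echo 1 || echo 0'

if trigger_fires 00 59 cur "$condition"; then
  echo "trigger would fire"
else
  echo "trigger would not fire"
fi
```

The same helper can be pointed at the `bc`-based latency condition, provided `bc` is installed on the machine.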

### Interactive shell support
In addition to the `sample` command, you can specify an `init` command (executed only once, before sampling starts) and a `transform` command (to post-process the `sample` command output). This covers the interactive shell use case, e.g. establishing a database connection only once and then polling within the interactive shell session.

#### Basic mode
```yml
textboxes:
  - title: MongoDB polling
    rate-ms: 500
    init: mongo --quiet --host=localhost test  # executes only once to start the interactive session
    sample: Date.now();                        # executes with a required rate, in scope of the interactive session
    transform: echo result = $sample           # executes in scope of local session, $sample variable is available for transformation
```
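
The `transform` step can be rehearsed outside Sampler by feeding it a captured output. In the sketch below, `apply_transform` is a hypothetical helper that exposes the raw text as `$sample` to a local shell; passing it as an environment variable is an assumption made for illustration, and the input value is made up.

```bash
# Hypothetical helper: run TRANSFORM in a local shell with the raw output
# available as $sample. Usage: apply_transform RAW_OUTPUT TRANSFORM
apply_transform() {
  sample="$1" sh -c "$2"
}

# The transform from the MongoDB example, applied to a made-up timestamp
apply_transform '1565040000000' 'echo result = $sample'
```

The transform string is single-quoted so that `$sample` is expanded by the inner shell rather than by the calling one.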
#### PTY mode
In some cases an interactive shell won't work because its stdin is not a terminal. PTY mode works around this:

```yml
textboxes:
  - title: Neo4j polling
    pty: true  # enables pseudo-terminal mode, default = false
    init: cypher-shell -u neo4j -p pwd --format plain
    sample: RETURN rand();
    transform: echo "$sample" | tail -n 1
  - title: Top on a remote server
    pty: true  # enables pseudo-terminal mode, default = false
    init: ssh -i ~/user.pem ec2-user@1.2.3.4
    sample: top
```

#### Multistep init
It is also possible to execute multiple init commands one after another, before you start sampling.

```yml
textboxes:
  - title: Java application uptime
    multistep-init:
      - java -jar jmxterm-1.0.0-uber.jar
      - open host:port # or local PID
      - bean java.lang:type=Runtime
    sample: get Uptime
```

### Variables
If the configuration file contains repeated patterns, they can be extracted into the `variables` section.
Variables can also be specified with the `-v`/`--variable` flag on startup, and any system environment variables are also available in the scripts.

```yml
variables:
  mongoconnection: mongo --quiet --host=localhost test
barcharts:
  - title: MongoDB documents by status
    items:
      - label: IN_PROGRESS
        init: $mongoconnection
        sample: db.getCollection('events').find({status:'IN_PROGRESS'}).count()
      - label: SUCCESS
        init: $mongoconnection
        sample: db.getCollection('events').find({status:'SUCCESS'}).count()
      - label: FAIL
        init: $mongoconnection
        sample: db.getCollection('events').find({status:'FAIL'}).count()
```

### Color theme
![light-theme](https://user-images.githubusercontent.com/6069066/59959405-994c0200-9484-11e9-856b-c4d18716e1de.png)

```yml
theme: light  # default = dark
sparklines:
  - title: CPU usage
    sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'
```

## Real-world recipes
### Databases
The following are connection examples for several databases. Using an interactive shell (an `init` script) is recommended, so that the connection is established only once and then reused during sampling.

<details><summary>MySQL</summary>

```yml
# prerequisite: installed mysql shell
variables:
  mysql_connection: mysql -u root -s --database mysql --skip-column-names
sparklines:
  - title: MySQL (random number example)
    pty: true
    init: $mysql_connection
    sample: select rand();
```

</details>
<details><summary>PostgreSQL</summary>

```yml
# prerequisite: installed psql shell
variables:
  PGPASSWORD: pwd
  postgres_connection: psql -h localhost -U postgres --no-align --tuples-only
sparklines:
  - title: PostgreSQL (random number example)
    init: $postgres_connection
    sample: select random();
```

</details>
<details><summary>MongoDB</summary>

```yml
# prerequisite: installed mongo shell
variables:
  mongo_connection: mongo --quiet --host=localhost test
sparklines:
  - title: MongoDB (random number example)
    init: $mongo_connection
    sample: Math.random();
```

</details>
<details><summary>Neo4j</summary>

```yml
# prerequisite: installed cypher shell
variables:
  neo4j_connection: cypher-shell -u neo4j -p pwd --format plain
sparklines:
  - title: Neo4j (random number example)
    pty: true
    init: $neo4j_connection
    sample: RETURN rand();
    transform: echo "$sample" | tail -n 1
```

</details>

### Kafka
<details><summary>Kafka lag per consumer group</summary>

```yml
variables:
  kafka_connection: $KAFKA_HOME/bin/kafka-consumer-groups --bootstrap-server localhost:9092
runcharts:
  - title: Kafka lag per consumer group
    rate-ms: 5000
    scale: 0
    items:
      - label: A->B
        sample: $kafka_connection --group group_a --describe | awk 'NR>1 {sum += $5} END {print sum}'
      - label: B->C
        sample: $kafka_connection --group group_b --describe | awk 'NR>1 {sum += $5} END {print sum}'
      - label: C->D
        sample: $kafka_connection --group group_c --describe | awk 'NR>1 {sum += $5} END {print sum}'
```

</details>

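
The `awk` part of the Kafka samples skips the header row and sums one column. The sketch below runs the same idea on made-up input so the result can be checked by hand. The table is hypothetical, not real `kafka-consumer-groups` output, and `sum_lag` is an illustrative helper; the position of the LAG column may differ between Kafka versions, so compare it with the header your installation prints.

```bash
# Sum one whitespace-separated column of a table read from stdin, skipping
# the header row. The column defaults to 5, as in the samples above.
# Unlike the samples, "sum + 0" prints 0 instead of an empty line when
# there are no data rows.
sum_lag() {
  awk -v col="${1:-5}" 'NR > 1 { sum += $col } END { print sum + 0 }'
}

# Made-up table for illustration only
printf '%s\n' \
  'TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG' \
  'events  0          100             130             30' \
  'events  1          200             215             15' |
  sum_lag
```

If the LAG value sits in a different column, pass its index as the first argument, e.g. `sum_lag 6`.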

### Docker
<details><summary>Docker containers stats (CPU, MEM, I/O)</summary>

```yml
textboxes:
  - title: Docker containers stats
    sample: docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}\t{{.PIDs}}"
```

</details>

### SSH
<details><summary>TOP command on a remote server</summary>

```yml
variables:
  sshconnection: ssh -i ~/my-key-pair.pem ec2-user@1.2.3.4
textboxes:
  - title: SSH
    pty: true
    init: $sshconnection
    sample: top
```

</details>

### JMX
<details><summary>Java application uptime example</summary>

```yml
# prerequisite: download [jmxterm jar file](https://docs.cyclopsgroup.org/jmxterm)
textboxes:
  - title: Java application uptime
    multistep-init:
      - java -jar jmxterm-1.0.0-uber.jar
      - open host:port # or local PID
      - bean java.lang:type=Runtime
    sample: get Uptime
    transform: echo $sample | tr -dc '0-9' | awk '{printf "%.1f min", $1/1000/60}'
```

</details>
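
The `transform` pipeline in the JMX example keeps only the digits of the sampled text and converts milliseconds to minutes. The sketch below applies the same pipeline to a made-up line; the exact text that jmxterm prints is an assumption here, and `uptime_minutes` is a hypothetical wrapper name.

```bash
# Keep only digits from stdin, then convert milliseconds to minutes with
# one decimal place (the same pipeline as the transform above).
uptime_minutes() {
  tr -dc '0-9' | awk '{printf "%.1f min", $1/1000/60}'
}

# Made-up sample line for illustration
echo 'Uptime = 123456;' | uptime_minutes
```

Because `tr -dc '0-9'` keeps every digit it sees, any other number in the sampled text would be concatenated with the uptime value, so the raw output is worth checking first.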