BigBio Notes: Monit: Monitoring your Services

Bioinformatics applications are moving toward microservices architectures, where services are fine-grained and protocols are lightweight. A microservices architecture decomposes the application into smaller services, which improves modularity and makes the application easier to develop, deploy and maintain. It also parallelizes development by enabling small autonomous teams to develop, deploy and scale their respective services independently.

With more services (databases, APIs, web applications, pipelines), more components must be traced and monitored to know the health of your application. Different services may play different roles, run on different physical or logical machines, and even be geographically isolated from each other. Taken together, these services and servers provide a combined service to the end application. An issue on any one server should not affect the behavior of the final application, and it must be found and fixed before an outage happens.

Several applications allow developers, devops and sysadmins to monitor all the services in a microservices application; two of the most popular are Nagios and Monit.

Monit: Monit ensures that critical services and resources are alive and not misbehaving. It does this in a few ways. If a process/service does not exist (not running), Monit will start it. If a process has exceeded its pre-defined resource boundaries, Monit will restart it.

In order to accomplish this, the Monit configuration specifies the location of a process’ PID file, and the commands necessary to start and stop each process. Timeouts can also be specified if certain processes take extra long to startup or shutdown. Alerts can be configured to send e-mails upon executing any monitoring action on a process. In addition to monitoring process existence and resource use, Monit is capable of many other monitoring functions, including file sizes, checksums, permissions, network connectivity, etc.
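As a rough sketch of how these pieces fit together, the fragment below combines a PID file, start/stop commands, a startup timeout, resource boundaries and an e-mail alert. The service name myservice, its paths, the thresholds and the e-mail address are all hypothetical, and the mail server line assumes a local SMTP relay is available:

```
set mailserver localhost                 # assumption: local SMTP relay
set alert sysadmin@example.org           # hypothetical alert recipient

check process myservice with pidfile /var/run/myservice.pid
  start program = "/etc/init.d/myservice start" with timeout 60 seconds
  stop program  = "/etc/init.d/myservice stop"
  if cpu > 80% for 5 cycles then restart         # illustrative CPU boundary
  if totalmem > 500 MB for 5 cycles then restart # illustrative memory boundary
  if 3 restarts within 5 cycles then unmonitor   # stop retrying a flapping service
```

The "for 5 cycles" qualifier is there so that a short spike does not trigger a restart; the exact numbers should be tuned to the service.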

Monit is included in major Linux distributions such as Red Hat and Ubuntu, and it can also be installed manually from source.

Monitoring remote web and web services:

To monitor web components and services, Monit provides a simple configuration based on HTTP requests:

check host web-server with address mymachine.org
  if failed ping with timeout 200 seconds then alert
  if failed port 8080 protocol http with timeout 300 seconds then alert

Monit lets you define timeouts, which helps when network issues interfere with testing your services/servers. Combining a ping alert with an HTTP alert makes it possible to tell whether the problem is with the server (ping) or with the service (HTTP).

It is also possible to monitor specific pages, resources or components of your API. For example, if you have a Solr or Lucene instance running, you can monitor a specific endpoint/core in your Monit configuration:

check host solr-server address solr-server.org
  if failed ping then alert
  if failed
    port 8080
    protocol http
    request "/solr/selected-core/select?q=*%3A*&fl=id&wt=json&indent=true"
    status = 200
    timeout 300 seconds
  then alert

The request parameter provides a powerful way to monitor the HTTP entry points of API-based services such as Solr, RESTful APIs, MongoDB, etc. The status parameter is the HTTP status code expected in the response.

Multiple HTTP entry points should be used to monitor the different application services in a microservices architecture.

Monitoring Remote Databases and Other Services:

Monit can also monitor services on other ports, which lets users monitor databases and other services. For example, users can monitor Oracle instances using this mechanism:

check host oracle-server with address oracle-machine.org
  if failed host oracle-machine.org port 1521 type tcp then alert
  mode passive
  group database

These definitions only let you know what is happening with your services; they do not let you control them. Monit provides a UI that shows the status of each service.

To configure the UI, the user should define a set of parameters:

set httpd port 2812
  #use address localhost   # only accept connection from localhost
  allow admin:monit        # require user 'admin' with password 'monit'
  #with ssl {              # enable SSL/TLS and set path to server certificate
  #  pemfile: /etc/ssl/certs/monit.pem
  #}

It is really important to know the parameters of this definition in detail. The use address parameter defines whether the UI is accessible only from localhost. In the example it is commented out, so the UI is accessible from other machines in the network. The parameter allow admin:monit enables the user admin with password monit to connect to the UI.
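For comparison, a more restrictive variant might look like the sketch below. It assumes the UI only needs to be reached from the machine Monit runs on; the port and credentials are simply carried over from the example above and should be changed in a real setup:

```
set httpd port 2812
  use address localhost   # bind the UI to the loopback interface only
  allow localhost         # accept connections from localhost only
  allow admin:monit       # still require user 'admin' with password 'monit'
```

With this variant, remote access to the UI would have to go through something like an SSH tunnel or a reverse proxy.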

Monitoring Processes:

Monit is great for monitoring remote services, servers and databases, but it is more than that: Monit can also monitor processes. You can use it to monitor daemon processes or similar programs running on your servers, and it is particularly useful for daemons such as those started at system boot time. In contrast to many other monitoring systems, Monit can act when an error situation occurs: not only by sending an alert to the user/devops/sysadmin, but also by stopping, starting or restarting the process automatically.

Monit can also monitor process characteristics, such as how much memory or how many CPU cycles a process is using. You can also use Monit to monitor files, directories and filesystems on localhost. Monit can monitor these items for changes, such as timestamp changes, checksum changes or size changes.
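A minimal sketch of file and filesystem checks is shown below. The paths, service names and thresholds are hypothetical and only illustrate the kinds of tests described above:

```
check file app-config with path /etc/myapp/app.conf   # hypothetical config file
  if changed checksum then alert        # content was modified
  if failed permission 644 then alert   # permissions differ from the expected mode

check file app-log with path /var/log/myapp/app.log   # hypothetical log file
  if size > 100 MB then alert           # log has grown past an illustrative limit

check filesystem rootfs with path /
  if space usage > 80% then alert       # disk is filling up
```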

Monit can be used to test programs or scripts at certain times, much like cron, but in addition, you can test the exit value of a program and perform an action or send an alert if the exit value indicates an error. This means that you can use Monit to perform any type of check you can write a script for.
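As an illustration of an exit-value test, the sketch below assumes a hypothetical script /usr/local/bin/check-pipeline.sh that exits with 0 on success and a non-zero value on failure. The service name, the interval and the timeout are likewise only assumptions:

```
check program pipeline-check with path "/usr/local/bin/check-pipeline.sh" with timeout 60 seconds
  every 10 cycles                    # run on every 10th poll cycle, not every one
  if status != 0 then alert          # a non-zero exit value signals an error
```

Because the test only looks at the exit value, the script itself can be written in any language and check anything it can reach.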

check process bamboo matching "bamboo"
  if does not exist then exec "/home/users/admin/apps/bin/bamboo-manual_start.sh"

This example checks whether the bamboo process is running; if it is not, Monit launches the defined script /home/users/admin/apps/bin/bamboo-manual_start.sh. A process can also be monitored through its PID file, for example:

check process foobar with pidfile /var/run/foobar.pid
  start program = "/etc/init.d/foobar start" with timeout 60 seconds
  stop program = "/etc/init.d/foobar stop"

Why Monit?

- Monit is easy to set up and run for the simplest cases. In contrast, Nagios is complicated to set up and is extreme overkill for monitoring simple tasks.

- It is provided by major operating systems such as Red Hat, Ubuntu, etc.

- It can do a lot more than you think: it can perform maintenance as well as fix errors when an error situation occurs, in the following areas:

  - Proactive
  - Processes
  - Files, Dirs and Filesystems
  - Cloud and Hosts
  - Programs and scripts
  - System

Wednesday, 24 January 2018
