Heartbeat configuration

From Kolmisoft Wiki
Revision as of 05:55, 19 October 2010 by Admin (talk | contribs)

What is High availability?

High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period.

Let's say you have two servers, 'A' and 'B', with MOR installed and MySQL replication running between them. By default server 'A' handles all incoming traffic while server 'B' monitors it, so when server 'A' fails, server 'B' takes its place within a configured time.

So almost no data is lost, and your users stay happy with your services.

Hearbeat example.jpg


What is Heartbeat?

Heartbeat is software which implements these monitoring and availability features for your servers. It must be carefully installed, configured and tested on both servers to ensure that your services are delivered correctly.


Installation

Heartbeat 2.99 has been fully tested on CentOS 5.2 only; there is no guarantee that it will work on older versions or other distributions.

Download the MOR install scripts from SVN on both servers.

Run /usr/src/mor/sh_scripts/install_heartbeat.sh on both servers.

The script will:

* download a special file, after which yum automatically installs the correct Heartbeat packages for your system.
* install Pacemaker (for future use only).
* configure /etc/ha.d/authkeys, so you don't need to change anything there.
* configure /etc/ha.d/ha.cf, but you still need to adjust it by hand (described later on this page).
* configure /etc/ha.d/haresources, but you still need to change a few things there (also described later on this page).
* add 2 lines to /etc/hosts for testing purposes only, so you will have to change the IPs there.


Configuration


Before going further, you need to set the hostname of both servers. Make sure the master is named node01 and the slave node02. Running uname -n on each server must return the correct name.

Edit /etc/sysconfig/network to change your hostname. Then reboot your machine.
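The hostname change can be sketched as a small shell helper. This is only an illustration: the function name set_hostname_in_file is hypothetical, and it assumes the file uses the usual HOSTNAME=... line found in /etc/sysconfig/network on CentOS.

```shell
# Set HOSTNAME=<name> in a sysconfig-style network file.
# Hypothetical helper, not part of the MOR install scripts.
# Usage: set_hostname_in_file FILE NAME
set_hostname_in_file() {
    file=$1
    name=$2
    tmp=$(mktemp) || return 1
    if grep -q '^HOSTNAME=' "$file"; then
        # Replace the existing HOSTNAME line.
        sed "s/^HOSTNAME=.*/HOSTNAME=$name/" "$file" > "$tmp"
    else
        # No HOSTNAME line yet: keep the file and append one.
        cat "$file" > "$tmp"
        echo "HOSTNAME=$name" >> "$tmp"
    fi
    # Write back in place so ownership and permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# On the master, as root (commented out so the sketch is safe to paste):
# set_hostname_in_file /etc/sysconfig/network node01
# reboot
# After the reboot this must print node01 (node02 on the slave):
# uname -n
```

Use node02 instead of node01 when you run it on the slave.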


First configure master (node01):

Open /etc/hosts and you will see something like this:

192.168.0.131 node01 #change to correct IP
192.168.0.132 node02 #change to correct IP here aswell

Change the IP of node01 to the address of the master machine, which "accepts all incoming traffic by default".

Then change the IP of node02 to the address of the slave, which will "accept all data if the master dies".
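To double-check the result, something like the following can print which IP each node name maps to. It is a rough sketch; hosts_ip_for is a hypothetical helper name and it only reads the file.

```shell
# Print the IP address that a hosts-style file maps to NAME (first match only).
# Hypothetical helper for checking the entries described above.
# Usage: hosts_ip_for FILE NAME
hosts_ip_for() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop trailing comments such as "#change to correct IP"
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# On both servers:
# for node in node01 node02; do
#     echo "$node -> $(hosts_ip_for /etc/hosts "$node")"
# done
```

Both servers should print the same two addresses.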


Open /etc/ha.d/ha.cf and change:


Raise or lower the deadtime setting. Deadtime is the number of seconds that must pass without a response before the slave takes over from the master.

Remember, this should be lower on lightly loaded machines and higher on heavily loaded machines.

A deadtime of 10 is more than enough. (Default is 5)

Make sure you add 2 network interfaces for heartbeat broadcasts.


The bcast directive is used to configure which interfaces Heartbeat sends UDP broadcast traffic on.

For example 'bcast eth0 eth1'.
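Put together, the two settings discussed above might look like the fragment below. This is an illustrative sketch written to a scratch file, not the file that install_heartbeat.sh generates; the node lines and the interface names eth0 and eth1 are assumptions for this example.

```shell
# Write an illustrative ha.cf fragment to a scratch file (NOT to /etc/ha.d/ha.cf).
sketch=${TMPDIR:-/tmp}/ha.cf.sketch
cat > "$sketch" <<'EOF'
# Seconds of silence before the peer is declared dead.
deadtime 10
# Send heartbeats over two independent interfaces.
bcast eth0 eth1
# Cluster members; names must match `uname -n` on each server (assumed here).
node node01
node node02
EOF
cat "$sketch"
```

Compare it with your real /etc/ha.d/ha.cf and adjust only the lines that differ.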


Now open /etc/ha.d/haresources:

node01 IPaddr2::192.168.0.14 # just for testing, remember this ip can't be used in your network!!!
node01 192.168.0.14 asterisk # just for testing, remember this ip can't be used in your network!!!

So, first of all:

Assuming node01 is the master and node02 is the slave, all traffic goes to node01 by default.

You need to choose an IP address from your network that no other host uses, and never assign it to anything else; otherwise the results will be unpredictable.

node01 IPaddr2::192.168.0.14 means "the master (node01) will listen on 192.168.0.14 and handle all data".

node01 192.168.0.14 asterisk means "if the master is dead, the slave (node02) will restart asterisk and start accepting traffic coming to 192.168.0.14".


Now copy the whole configuration to the slave (node02).

shell@node01:/$ scp -r /etc/ha.d/ root@node02:/etc/


Start heartbeat on both servers by running /etc/init.d/heartbeat start. If everything is OK, you will see something like this:

Starting High-Availability services:

2008/12/18_18:13:22 INFO: Resource is stopped

2008/12/18_18:13:22 INFO: Resource is stopped

                                                          [  OK  ]

Testing

Now run iptraf on both machines, and from another machine start pinging your bound IP address (192.168.0.14 in this example).

Check the master's iptraf window: you will see incoming ICMP data, while the slave has to stay quiet.

Heartbeat1.png


Now kill the master (for example: ifconfig eth0 down). After a short period of time (5 seconds in this example) you will see incoming ICMP packets on the slave.

To bring the master back to work, first restore the network on the master and then start heartbeat. After about 20 seconds the master should resume its role.
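If you want timestamps for the takeover instead of watching iptraf, a loop like the one below can run on the third machine. It is a minimal sketch: monitor_vip and ping_once are hypothetical names, and the -W timeout option assumes the Linux iputils version of ping.

```shell
# Default probe: one ICMP echo with a 1-second timeout (Linux iputils ping assumed).
ping_once() {
    ping -c 1 -W 1 "$1" > /dev/null 2>&1
}

# Probe ADDRESS for ROUNDS rounds and print a line whenever its state changes.
# INTERVAL (seconds between probes) defaults to 1.
# Usage: monitor_vip ADDRESS ROUNDS [PROBE_COMMAND]
monitor_vip() {
    vip=$1
    rounds=$2
    probe=${3:-ping_once}
    prev=""
    i=0
    while [ "$i" -lt "$rounds" ]; do
        if "$probe" "$vip"; then state=up; else state=down; fi
        if [ "$state" != "$prev" ]; then
            echo "$(date +%H:%M:%S) $vip is $state"
            prev=$state
        fi
        i=$((i + 1))
        sleep "${INTERVAL:-1}"
    done
}

# From a third machine, while you take the master down:
# monitor_vip 192.168.0.14 60
```

The gap between the "is down" and the next "is up" line gives a rough idea of how long the takeover took.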


Now go to the GUI and change your default Asterisk server IP (Settings -> Servers) from 127.0.0.1 to the virtual IP address. If your GUI is unable to connect, make sure both Asterisk servers allow connections from your GUI server (file: /etc/asterisk/manager.conf).

sip.conf

In the file /etc/asterisk/sip.conf, set bindaddr to the correct IP, that is, the IP to which your devices register (e.g. the virtual IP).

Make the same change to bindaddr in iax.conf, h323.conf and manager.conf.

Restart Asterisk.

Do this on both servers.
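The bindaddr edits can be scripted roughly as follows. This is a sketch under assumptions: set_bindaddr is a hypothetical helper, and it only rewrites a bindaddr line that already exists and is not commented out; if a file has no such line, edit it by hand.

```shell
# Set an existing, uncommented bindaddr line in FILE to IP.
# Hypothetical helper; lines starting with ";" (comments) are left alone.
# Usage: set_bindaddr FILE IP
set_bindaddr() {
    file=$1
    ip=$2
    if ! grep -q '^[[:space:]]*bindaddr[[:space:]]*=' "$file"; then
        echo "no active bindaddr line in $file; add one by hand" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    sed "s/^[[:space:]]*bindaddr[[:space:]]*=.*/bindaddr=$ip/" "$file" > "$tmp"
    # Write back in place so ownership and permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# On both servers, as root, with your own virtual IP:
# for f in sip.conf iax.conf h323.conf manager.conf; do
#     set_bindaddr "/etc/asterisk/$f" 192.168.0.14
# done
```

Back up /etc/asterisk before running anything like this.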


Possible problems

Look at the picture:

Heartbeat broadcast.jpg

In this example the bcast directive of Heartbeat has only one interface configured on both servers: eth1.

This can lead to problems if the eth1 <----> eth1 link between the servers fails (broken cable, switch, NIC, etc.).

You must specify at least 2 interfaces in the bcast directive.

The bcast directive is used to configure which interfaces Heartbeat sends UDP broadcast traffic on.

An example of the bcast directive:

bcast eth0 eth1
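A quick way to catch the single-interface mistake is to count the interfaces on the bcast line. The sketch below uses hypothetical helper names (bcast_interface_count, check_bcast) and assumes all interfaces are listed on one bcast line, as in the example above.

```shell
# Print the number of interfaces on the first bcast line of an ha.cf-style file
# (prints 0 if there is no bcast line).
# Usage: bcast_interface_count FILE
bcast_interface_count() {
    awk '$1 == "bcast" { print NF - 1; found = 1; exit }
         END { if (!found) print 0 }' "$1"
}

# Return 0 only if at least two interfaces are configured.
# Usage: check_bcast FILE
check_bcast() {
    count=$(bcast_interface_count "$1")
    if [ "$count" -lt 2 ]; then
        echo "WARNING: bcast lists $count interface(s); configure at least 2" >&2
        return 1
    fi
    echo "OK: bcast lists $count interfaces"
}

# On both servers:
# check_bcast /etc/ha.d/ha.cf
```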



See also

High availability (Heartbeat clustering)