
Monday, September 23, 2013

Cross-Region: Redundant VPN between Regions with Openswan and Nagios NRPE

Introduction:

I am new to scripting so you will most likely be able to improve my scripts :)

I am working on an interesting cross-region project right now. We will have 1 ELB with 4 ejabberd nodes in each region. With Route 53 latency-based records I want to redirect the mobile app to the nearest region.


It's an ejabberd chat app, which of course needs all nodes on the same private network for clustering and replication. So I will set up 2 Openswan VPN servers and 1 vpn-watcher in each region. Additionally, I will set up an ejabberd-watcher that monitors the ejabberd cluster and handles its events, and the watchers will monitor and event-handle each other.
If a tunnel goes down, the vpn-watcher checks which route is active and replaces it if needed.
Because I am lazy, and because I'll run health checks every 5 seconds, I want to keep the configuration as simple as possible and the health checks as few as possible. So I'll handle the VPN tunnels as 2 tunnel groups instead of handling every tunnel individually.
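
The failover rule for one tunnel group can be sketched as a small shell function. This is only an illustration under stated assumptions, not the watcher script itself: the function name decide_route_action and its four arguments are hypothetical, and it only prints a decision instead of touching any route.

```shell
#!/bin/bash
# Hypothetical helper: decide what the vpn-watcher should do for one tunnel group.
#   $1 = state of the checked tunnel group: "up" or "down"
#   $2 = VPN instance the route currently points to (e.g. "vpn1")
#   $3 = VPN instance whose tunnel group was checked (e.g. "vpn1")
#   $4 = standby VPN instance (e.g. "vpn2")
decide_route_action() {
    state=$1 active=$2 checked=$3 standby=$4
    if [ "$state" = "up" ]; then
        # Tunnel group is healthy: leave the route alone
        echo "keep $active"
    elif [ "$active" = "$checked" ]; then
        # The failed group carries the traffic: move the route to the standby
        echo "replace-with $standby"
    else
        # The failed group is not the active one: nothing to replace
        echo "keep $active"
    fi
}

decide_route_action down vpn1 vpn1 vpn2   # prints: replace-with vpn2
decide_route_action down vpn2 vpn1 vpn2   # prints: keep vpn2
```

Treating the tunnels as two groups keeps this decision to a single comparison per check, which fits the goal of few and simple health checks.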


Security Rules:
Group                  Ports                          From
project-vpn            500, 4500                      region{a,b,c}-vpn{1,2}
project-watcher        ECHO REQUEST, 5666             project-watcher
project-replication    ports needed for replication   project-replication

Servers:
Server        Security groups
VPN           project-vpn, project-watcher, project-replication
WATCHER       project-watcher
APP-SERVER    project-replication, project-watcher

This excellent HowTo will explain how to set up the Openswan VPN tunnel:
I also added the following to /etc/sysctl.conf:

# protect routing table from ICMP redirect packets
net.ipv4.conf.all.accept_redirects = 0
# Enable Logging
net.ipv4.conf.all.log_martians = 1
# Openswan
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.eth0.accept_redirects = 0
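
If the settings get added by a script that may run more than once, they can be appended idempotently. A minimal sketch, assuming a hypothetical helper named ensure_sysctl (it is not part of the original setup); it only edits the file it is given, and the values still have to be loaded afterwards with sysctl -p as root.

```shell
#!/bin/bash
# Hypothetical helper: append "key = value" to a sysctl file
# unless a line for that key is already present.
#   $1 = file to edit, $2 = sysctl key, $3 = value
ensure_sysctl() {
    file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        return 0    # key already set in the file, leave it as it is
    fi
    echo "$key = $value" >> "$file"
}

# Usage (as root), followed by "sysctl -p" to load the values:
#   ensure_sysctl /etc/sysctl.conf net.ipv4.conf.all.send_redirects 0
#   ensure_sysctl /etc/sysctl.conf net.ipv4.conf.all.log_martians 1
```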


VPN-Watcher:

  • Create an IAM role with permissions only for describe-routes and replace-route:
- EC2
- Use the policy generator for the permissions and select only describe-routes and replace-route (in the EC2 API these actions are named DescribeRouteTables and ReplaceRoute)
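
For reference, a policy with just these two permissions could look roughly like the document written by the snippet below. This is a sketch, not the policy from the original setup: the file name watcher-route-policy.json is an arbitrary choice, and "Resource": "*" is an assumption that is worth narrowing if your account allows it.

```shell
#!/bin/bash
# Write a minimal policy document for the watcher role.
# Assumption: the file name is arbitrary; the two actions are the EC2 API
# names behind "describe routes" and "replace route".
POLICY_FILE=${POLICY_FILE:-watcher-route-policy.json}

cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeRouteTables",
        "ec2:ReplaceRoute"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```

The resulting JSON can be pasted into the role's policy editor instead of clicking through the generator.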
  • Launch an Amazon Linux instance with the watcher role and attach an EIP to it
  • Install Nagios and NRPE:
$ sudo yum -y install nagios nagios-plugins-all nagios-plugins-nrpe nrpe php httpd
$ sudo sh -c "chkconfig httpd on && chkconfig nagios on && chkconfig nrpe on && chkconfig postfix on"
  • Configure postfix
  • Configure Nagios
Add your mail address in contacts.cfg. You should also delete the line about notifications for linux-servers in templates.cfg.

Change the health check interval. You can change interval_length in nagios.cfg, define the interval on the service, or both. Here is a good explanation:
nagios-check-service-frequency-based-on-service-status
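
The two settings multiply: Nagios treats a service's check_interval as a number of time units, and interval_length is the length of one unit in seconds (60 by default). A tiny sketch of that arithmetic, with a hypothetical function name and whole numbers only:

```shell
#!/bin/bash
# Seconds between checks = interval_length (nagios.cfg) * check_interval (service).
effective_check_seconds() {
    interval_length=$1 check_interval=$2
    echo $(( interval_length * check_interval ))
}

effective_check_seconds 60 5   # default interval_length: one check every 300 seconds
effective_check_seconds 1 5    # interval_length=1: one check every 5 seconds
```

So a 5-second health check means either setting interval_length to 1 and check_interval to 5, or an equivalent combination.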

add some commands in commands.cfg

# nrpe
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

define command{
        command_name    vpn1-handler
        command_line    /usr/lib64/nagios/plugins/eventhandlers/event_handler_ipsec_01 $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
        }

define command{
        command_name    vpn2-handler
        command_line    /usr/lib64/nagios/plugins/eventhandlers/event_handler_ipsec_02 $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
        }
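
The three macros passed to the handlers arrive as positional arguments. The sketch below shows the usual Nagios event handler pattern for them; it is not the handler_nrpe script used in this setup, the function name is hypothetical, and the "act on the third soft attempt" rule is an assumption that matches max_check_attempts 4.

```shell
#!/bin/bash
# Hypothetical handler body.
#   $1 = $SERVICESTATE$      (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$    (number of the current check attempt)
handle_ipsec_event() {
    state=$1 type=$2 attempt=$3
    case "$state" in
        OK|WARNING|UNKNOWN)
            echo "nothing" ;;
        CRITICAL)
            case "$type" in
                SOFT)
                    # Act one attempt before the service turns HARD
                    if [ "$attempt" -eq 3 ]; then
                        echo "failover"
                    else
                        echo "wait"
                    fi ;;
                HARD)
                    echo "failover" ;;
            esac ;;
    esac
}

# In a real handler the "failover" branch would call the change_tunnel script
# instead of printing a word.
handle_ipsec_event CRITICAL SOFT 1   # prints: wait
handle_ipsec_event CRITICAL SOFT 3   # prints: failover
```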

Add your Hosts
/etc/nagios/conf.d/region-vpn-01.cfg

define host {
        use                     linux-server
        host_name               region-vpn-01
        alias                   region-vpn-01
        address                 999.99.99.999
}

define service {
        use                     generic-service
        host_name               region-vpn-01
        service_description     IPSEC
        check_command           check_nrpe!check_ipsec
        max_check_attempts      4
        event_handler           vpn1-handler
}

define service {
        use                     generic-service
        host_name               region-vpn-01
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}

Add the event_handler scripts

$ wget https://raw.github.com/peterromfeldhk/nagios/master/change_vpn-tunnel
$ for i in `seq 1 2`; do sudo cp change_vpn-tunnel /usr/lib64/nagios/plugins/change_tunnel$i; done
$ wget https://raw.github.com/peterromfeldhk/nagios/master/handler_nrpe
$ sudo mkdir -p /usr/lib64/nagios/plugins/eventhandlers
$ for i in `seq 1 2`; do sudo cp handler_nrpe /usr/lib64/nagios/plugins/eventhandlers/event_handler_ipsec_0$i; done
$ sudo sed -i "s/change_tunnel1/change_tunnel2/g" /usr/lib64/nagios/plugins/eventhandlers/event_handler_ipsec_02
$ sudo chmod +x /usr/lib64/nagios/plugins/change_tunnel? /usr/lib64/nagios/plugins/eventhandlers/event_handler_ipsec_0?

Edit the change_tunnel scripts. I use them with hardcoded values instead of variables, so you may need to adjust them a bit more. I always use +x in the first line to troubleshoot scripts (changing it to -x turns on command tracing) :)

Example:

As I already said, I am using 2 tunnel groups. Here is a roadmap, example routing, and example config:


Example for handling “region1-vpn1”:

MYREGION=us-east-1
OTHERREG=eu-west-1
THIRDREG=ap-northeast-1
OTHERIID=iid-12
SECIID=iid-22
THIRDIID=iid-32
MYIP="10.1.0.1"
MYCIDR="10.1.0.0/16"
OTHERCIDR="10.2.0.0/16"
THIRDCIDR="10.3.0.0/16"
MYTABLEID=rtb-1
OTHERTABLE=rtb-2
THIRDTABLE=rtb-3

Hope this explanation is good enough; please email me if not!
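
To show how such values can end up in a route change, here is a dry-run sketch that only prints an AWS CLI command instead of running it. It is not the change_tunnel script from the repository: the function name replace_route_cmd is hypothetical, and reading OTHERIID as the instance the route should point to after the switch is an assumption about the variable names.

```shell
#!/bin/bash
# Hypothetical helper: print the "aws ec2 replace-route" call for one route.
#   $1 = region, $2 = route table id, $3 = destination CIDR, $4 = target instance id
replace_route_cmd() {
    region=$1 table=$2 cidr=$3 instance=$4
    echo "aws ec2 replace-route --region $region --route-table-id $table --destination-cidr-block $cidr --instance-id $instance"
}

# Values from the example for region1-vpn1
MYREGION=us-east-1
MYTABLEID=rtb-1
OTHERCIDR="10.2.0.0/16"
OTHERIID=iid-12    # assumption: the instance that takes over the route

replace_route_cmd "$MYREGION" "$MYTABLEID" "$OTHERCIDR" "$OTHERIID"
```

Printing the command first makes it easy to review what would change in each region's route table before letting the event handler run it for real.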

  • Configure NRPE on the VPN servers
$ sudo yum install nagios-plugins-all nagios-plugins-nrpe nrpe
$ sudo chkconfig nrpe on
$ wget https://raw.github.com/peterromfeldhk/nagios/master/check_ipsec
$ sudo mv check_ipsec /usr/lib64/nagios/plugins/
$ sudo chmod +x /usr/lib64/nagios/plugins/check_ipsec
$ sudo sh -c "echo 'tunnelname1 rightip1' > /usr/lib64/nagios/plugins/gateways.txt && echo 'tunnelname2 rightip2' >> /usr/lib64/nagios/plugins/gateways.txt"
$ sudo sed -i "s/allowed_hosts=127.0.0.1/allowed_hosts=IP.OF.VPN.WATCHER/g" /etc/nagios/nrpe.cfg
$ sudo sh -c "echo 'command[check_ipsec]=sudo /usr/lib64/nagios/plugins/check_ipsec --tunnels 2' >> /etc/nagios/nrpe.cfg"
$ sudo sh -c "echo 'command[restart_ipsec]=sudo /etc/init.d/ipsec restart' >> /etc/nagios/nrpe.cfg"
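
Whatever a check does internally, Nagios only looks at the first line of output and the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below shows that convention for a tunnel count; it is a stand-in for illustration, not the check_ipsec script downloaded above, and the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical stand-in for the decision part of a tunnel check.
#   $1 = number of tunnels currently up
#   $2 = number of tunnels expected (compare "--tunnels 2" in nrpe.cfg)
report_tunnels() {
    up=$1 expected=$2
    if [ "$up" -ge "$expected" ]; then
        echo "OK - $up of $expected tunnels up"
        return 0    # Nagios state OK
    fi
    echo "CRITICAL - $up of $expected tunnels up"
    return 2        # Nagios state CRITICAL, this is what triggers the event handler
}

# Usage inside a plugin:
#   report_tunnels "$tunnels_up" 2; exit $?
```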

As root:
comment out "Defaults    requiretty" in /etc/sudoers to allow remote commands
and create a passwordless sudo rule for nrpe

# vim /etc/sudoers.d/nrpe
Cmnd_Alias IPSEC = /usr/lib64/nagios/plugins/check_ipsec 
Cmnd_Alias RESEC = /etc/init.d/ipsec restart
nrpe ALL=NOPASSWD:IPSEC, RESEC
# chmod 400 /etc/sudoers.d/nrpe
  • Testing and troubleshooting
Check whether NRPE works without arguments, e.g.
/usr/lib64/nagios/plugins/check_nrpe -H ip.of.target.x
If it works, it should print the NRPE version; if not, the problem is most likely port 5666.
If it works without arguments but not with them, the problem is most likely the sudoers config.
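
The two troubleshooting steps can be wrapped in a small script to run from the watcher. A sketch under assumptions: the function name diagnose_nrpe is hypothetical, and it relies on NRPE answering "Unable to read output" when the remote command printed nothing, which is what a failing sudo call typically looks like from the watcher's side.

```shell
#!/bin/bash
# Path from the install above; can be overridden through the environment.
CHECK_NRPE=${CHECK_NRPE:-/usr/lib64/nagios/plugins/check_nrpe}

# Hypothetical helper: run both troubleshooting steps against one host.
#   $1 = IP or hostname of the VPN server
diagnose_nrpe() {
    host=$1
    # Step 1: no -c argument, NRPE should answer with its version
    if ! "$CHECK_NRPE" -H "$host" >/dev/null 2>&1; then
        echo "no reply from NRPE: most likely port 5666"
        return 1
    fi
    # Step 2: run the real command; a non-zero exit is fine here (tunnel down)
    out=$("$CHECK_NRPE" -H "$host" -c check_ipsec 2>&1) || true
    case "$out" in
        *"Unable to read output"*)
            echo "command returned nothing: most likely requiretty or the sudoers config"
            return 1 ;;
    esac
    echo "$out"
}

# Usage:
#   diagnose_nrpe ip.of.target.x
```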