jabber.spline.de

notes about spline's jabber service

reclustering ejabberd nodes

2012-03-04 by Michael
The Jabber service has been somewhat unstable lately, for two reasons:
Assume we have a 2-node setup (vm-jabber{0,1}) with a broken replication scheme, and we start over by purging vm-jabber1 completely. Since ejabberd 2.1.x there is a nice way to remove a db node from a setup.
On our master server (vm-jabber0), make sure ejabberd.cfg includes the following line:
{modules,
 [
[...]
  {mod_admin_extra, []},
[...]
After this, restart the ejabberd process and run:
ejabberdctl remove_node 'ejabberd@vm-jabber1'
In a debug shell (or the web interface), confirm that the node has been purged:
$ ejabberdctl debug
Attaching Erlang shell to node ejabberd@vm-jabber0.
To detach it, press: Ctrl+G, q, Return

Erlang R14A (erts-5.8) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [kernel-poll:false]

Eshell V5.8  (abort with ^G)
(ejabberd@vm-jabber0)1> mnesia:info().
// SNIP //
running db nodes   = ['ejabberd@vm-jabber0']
stopped db nodes   = [] 
master node tables = []
// SNIP //
// Hit Ctrl-C twice to abort the debug shell
On the purged node, stop ejabberd, remove all database files and get a fresh ejabberd.cfg copy from the master. Also, we will need the master cookie to authenticate the nodes with each other.
/etc/init.d/ejabberd stop
rm -rf /var/lib/ejabberd/*
scp root@vm-jabber0:/etc/ejabberd/ejabberd.cfg /etc/ejabberd/
chown root:ejabberd /etc/ejabberd/ejabberd.cfg
chmod 640 /etc/ejabberd/ejabberd.cfg
scp root@vm-jabber0:/var/lib/ejabberd/.erlang.cookie /var/lib/ejabberd/
chown ejabberd:ejabberd /var/lib/ejabberd/.erlang.cookie
chmod 440 /var/lib/ejabberd/.erlang.cookie
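Before going on, it may be worth checking that the cookie really matches the master's, since a mismatch makes the nodes refuse to talk to each other. The following is a minimal sketch, not part of the original procedure: the function name cookie_ok is made up, and it assumes GNU stat (for the -c '%a' syntax) and that a reference copy of the master's cookie is available as a local file.

```shell
# cookie_ok LOCAL REFERENCE   (hypothetical helper)
# Succeeds when LOCAL has mode 440 and the same contents as REFERENCE.
cookie_ok() {
  local_cookie=$1
  reference=$2
  # GNU stat syntax (assumption); prints the octal mode, e.g. 440
  mode=$(stat -c '%a' "$local_cookie") || return 1
  [ "$mode" = "440" ] && cmp -s "$local_cookie" "$reference"
}
```

For example, after fetching the master's cookie into a temporary file (the path /root/master.cookie is only an illustration), run cookie_ok /var/lib/ejabberd/.erlang.cookie /root/master.cookie && echo "cookie ok", and delete the temporary copy afterwards.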
When we are done, we have to rebuild the mnesia database, i.e. import the schema (to disc) and get copies of all tables from the master. So we start a basic Erlang process rather than ejabberd, since ejabberd would recreate its db for a new local setup.
su - ejabberd -c bash
erl -sname ejabberd@vm-jabber1 -mnesia dir '"/var/lib/ejabberd/"' \
  -mnesia extra_db_nodes "['ejabberd@vm-jabber0']" -s mnesia
[...]
(ejabberd@vm-jabber1)1> mnesia:change_table_copy_type(schema, node(), disc_copies).
// submit and hit ctrl-c twice to exit or check the newly populated db with mnesia:info().
Now you can fire up the second ejabberd node on vm-jabber1. But there is still work to do. Ejabberd makes some weird decisions about how it stores the data. Here is what we want for ejabberd failover: basically, we want to store as much shared data as possible in RAM AND on disc, so that the slave node can start ejabberd on its own because it has a copy of everything on disc. Of course, some tables that are not necessary to start the jabber server, like session or s2s, can be stored in RAM only. The important thing is to eliminate, or at least reduce, the number of "remote copy" entries, since these could block failover. Some memory-eating tables like offline_msg can be ignored if there is not enough RAM to begin with.
I found it very handy to use the web_admin module to go through the replication type of each table. Here is a reminder on how to tunnel it through to your client (we do not forward port 5280 here):
ssh vm-jabber0 -L 8000:localhost:5280 # and fire up a browser
First go through the tables on the master and make sure every table has a sane type: you need a disc copy if the node has to start on its own!
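To double-check for leftover "remote copy" tables without clicking through web_admin, something like the following may help. It is only a sketch under a few assumptions: the text printed by mnesia:info() has been saved to a file (the name mnesia-info.txt below is made up), that text contains an entry of the form "remote = [...]", and the function name remote_tables is hypothetical.

```shell
# remote_tables   (hypothetical helper)
# Reads saved mnesia:info() output on stdin and prints the table names
# listed in the "remote = [...]" entry, comma-separated.
# Prints an empty line if the list is empty, nothing if no entry is found.
remote_tables() {
  awk '
    /^remote *=/ { collecting = 1; buf = "" }
    collecting {
      buf = buf $0                      # the list may wrap over several lines
      if (index($0, "]")) { print buf; exit }
    }
  ' | sed -e 's/^remote *= *\[//' -e 's/].*$//' -e 's/ //g'
}
```

Running remote_tables < mnesia-info.txt on the node in question should ideally print nothing but an empty line; any table names it does print are the ones still worth a look.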

That's it. Good Luck :)

yes, we're open

2011-10-12 by mk

As of now, the Spline Jabber server is open to everyone. This means you can register an account on jabber.spline.de with a Jabber client of your choice. If you do not trust the StartCom SSL CA, here is the fingerprint of the SSL certificate:
0d:16:b6:9a:08:d1:52:13:1b:ff:f7:0a:c8:75:7f:93:58:fb:41:0a
If you run into problems or have questions, you can contact the maintainers by mail or drop by the IRC channel #spline on irc.freenode.net.