Unable to edit on second node in cluster on 9.9

I'm a first-time XWiki user with a strong Linux background, so I apologize up front if I missed something obvious or need a little hand-holding.

I have installed XWiki 9.9 from the WAR package on two servers running CentOS 7, Tomcat, and multi-master MariaDB; all packages are stock from the CentOS repos. I followed the clustering tutorial here: http://www.xwiki.org/xwiki/bin/view/Documentation/AdminGuide/Clustering/DistributedEventClusterSetup/

When I edit a page on the “first” node, everything works great and the change instantly shows up on the second node (no caching issues). However, when I try to edit a page on the second node, the edit screen spins and spins and never loads the edit box.

I have checked the logs and nothing is being reported, and I'm not sure where else to look. Any advice is much appreciated.

Sounds like the save is stuck in a loop for some reason. If you use a TCP-based setup, make sure the second node is configured to target the first node and not itself.
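
As a rough sketch of that check (not an official XWiki tool), the two shell functions below pull a `-D` option out of a Java command line and test whether a node's `jgroups.tcpping.initial_hosts` value lists the node's own address. The function names `jvm_prop` and `targets_peer` and the 192.0.2.x addresses are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of XWiki or JGroups.

# Print the value of the first -D<name>=<value> option in a java command line.
# $1 = property name, $2 = full command line (as shown by `ps aux`)
jvm_prop() {
    printf '%s\n' "$2" | tr ' ' '\n' | grep -F -- "-D$1=" | head -n 1 | cut -d= -f2-
}

# Succeed if the initial_hosts list does NOT contain this node's own address.
# $1 = this node's address, $2 = value of jgroups.tcpping.initial_hosts
targets_peer() {
    case ",$2" in
        *",$1["*) return 1 ;;  # the node lists itself as an initial host
        *)        return 0 ;;
    esac
}

# Example with made-up addresses: node 192.0.2.2 pointing at node 192.0.2.1.
cmdline='java -Djgroups.tcpping.initial_hosts=192.0.2.1[7800] -classpath bootstrap.jar'
hosts=$(jvm_prop jgroups.tcpping.initial_hosts "$cmdline")
if targets_peer 192.0.2.2 "$hosts"; then
    echo "OK: initial_hosts=$hosts"
else
    echo "WARNING: node lists itself in initial_hosts=$hosts"
fi
```

On a real node, the command line would come from the `ps aux` output of the Tomcat process rather than a hard-coded string.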

I believe this is what you are talking about:

[root@pikes ~]# ps aux | grep Djgroups
tomcat 28526 0.3 21.8 3155716 458960 ? Ssl Nov09 4:21 /usr/lib/jvm/jre/bin/java -Djgroup.bind_addr=208.91.90.129 -Djgroups.tcpping.initial_hosts=208.91.90.130[7800] -classpath /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar -Dcatalina.base=/usr/share/tomcat -Dcatalina.home=/usr/share/tomcat -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat/temp -Djava.util.logging.config.file=/usr/share/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
root 31215 0.0 0.0 11044 820 pts/0 S+ 14:10 0:00 grep --color=auto Djgroups
[root@pikes ~]# netstat -anp | grep 7800
tcp6 0 0 208.91.90.129:7800 :::* LISTEN 28526/java
tcp6 0 0 208.91.90.129:54833 208.91.90.130:7800 ESTABLISHED 28526/java

[root@pyramid ~]# ps aux | grep Djgroups
root 4685 0.0 0.0 9000 836 pts/0 S+ 14:10 0:00 grep --color=auto Djgroups
tomcat 7549 0.3 17.7 3153416 372796 ? Ssl Nov09 4:24 /usr/lib/jvm/jre/bin/java -Djgroup.bind_addr=208.91.90.130 -Djgroups.tcpping.initial_hosts=208.91.90.129[7800] -classpath /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/share/java/commons-daemon.jar -Dcatalina.base=/usr/share/tomcat -Dcatalina.home=/usr/share/tomcat -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat/temp -Djava.util.logging.config.file=/usr/share/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
[root@pyramid ~]# netstat -anp | grep 7800
tcp6 0 0 208.91.90.130:7800 :::* LISTEN 7549/java
tcp6 0 0 208.91.90.130:7800 208.91.90.129:54833 ESTABLISHED 7549/java

It appears they are targeting each other and connecting properly. Is there a log somewhere that might show whether there is an issue with that connection?

You can enable DEBUG logging for the package “org.xwiki.observation.remote.internal.jgroups” on the XWiki side.
That will at least tell you whether messages are sent and received through JGroups.

For more details you’ll need the JGroups debug log, but I'm not really sure what’s best here.

Looks like messages are being passed as seen here:

Nov 15 01:00:00 pyramid.ajserver.com server[4392]: 2017-11-15 01:00:00,139 [jgroups-4,event,pyramid-43401] DEBUG o.r.i.j.DefaultJGroupsReceiver - Received JGroups remote event [event: [org.xwiki.bridge.event.DocumentUpdatedEvent@ca7530a7], source: [{docversion=166.1, doclanguage=, origdocversion=165.1, origdoclanguage=, docname=xwiki:XWiki.Notifications.NotificationEmailHourlySender}], data: [{contextwiki=xwiki, contextuser=XWiki.superadmin}]]
Nov 15 01:00:00 pyramid.ajserver.com server[4392]: 2017-11-15 01:00:00,147 [DefaultQuartzScheduler_Worker-1] DEBUG .o.r.i.j.JGroupsNetworkAdapter - Send JGroups remote event [event: [org.xwiki.bridge.event.DocumentUpdatedEvent@ca7530a7], source: [{docversion=166.1, doclanguage=, origdocversion=165.1, origdoclanguage=, docname=xwiki:XWiki.Notifications.NotificationEmailHourlySender}], data: [{contextwiki=xwiki, contextuser=XWiki.superadmin}]]
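
To get a quick tally from a longer log, a small filter like the one below can count these two kinds of DEBUG lines. It is only a sketch: the function name `count_remote_events` is made up, and it matches nothing more than the message texts "Received JGroups remote event" and "Send JGroups remote event" seen in the lines above.

```shell
#!/bin/sh
# Hypothetical helper: count JGroups remote events in log text read from stdin.
count_remote_events() {
    awk '
        /Received JGroups remote event/ { received++ }  # logged by DefaultJGroupsReceiver
        /Send JGroups remote event/     { sent++ }      # logged by JGroupsNetworkAdapter
        END { printf "received=%d sent=%d\n", received + 0, sent + 0 }
    '
}
```

On a systemd host the input could come from something like `journalctl -u tomcat | count_remote_events`; the unit name `tomcat` is an assumption about the setup. If one node only ever shows "Send" lines and the other only "Received" lines, events are flowing in one direction only.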

I’m going to have to bring this back into a lab, as I just need to get the production environment up on a single server. If anyone has successfully clustered 9.9, it would be nice to know it works and to hear about any lessons learned along the way.

There are several integration tests with a clustering setup, so it’s definitely working on 9.9.

I have tracked this back to a Tomcat servlet crash during the initial flavor install, which caused some extensions not to be installed on the second node. After re-running the flavor install with the upgrade to 9.10, things seem to be working.

Thanks much for the help.

Glad you found the issue 🙂