XWiki 11.2, docker distribution. Not very high load (total ~500 users). Maximum dozen of read requests per minute.
The issue is that from time to time the application hangs completely.
All I see is java process eating all available CPU resources. Logs do not indicate any errors, long-running processes, etc…
Not sure how to troubleshoot this. Could someone please point me to the right direction ?
Attaching threads dump, created with ‘kill -3 1’ threads.dump.txt (1.6 MB)
Yesterday night it hung again… 200% CPU - any connection times out.
Nothing in logs… Threads dump attached…
Any advise is very much appreciated xwiki.txt (394.5 KB)
…
Hi @unadequate. I checked quickly the threads and they don’t seem to be blocked by XWiki but by Tomcat. You can see a lot of threads blocked by:
"http-nio-8080-exec-152" #763564 daemon prio=5 os_prio=0 tid=0x00007fc2d44ac800 nid=0x42a7 waiting for monitor entry [0x00007fc22306f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.register(AbstractProtocol.java:1020)
- waiting to lock <0x00000006c0996be8> (a org.apache.coyote.AbstractProtocol$ConnectionHandler)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:797)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
- locked <0x000000078b1ce2e8> (a org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
This could suggest some problem with your Tomcat setup or some bugs with the version of Tomcat you’re using.
Sure but but we don’t code Tomcat (nor do we code Docker FWIW) What I meant to say is that it seems the problem is coming from Tomcat and not from XWiki, at least I don’t see any hint about XWiki having a problem in your logs. There can be thousands of reasons why you’re having problems with Tomcat, including the version of Tomcat you use, the resources you give to it, the OS it runs on, how you configured it, etc. You could ask on the Tomcat forums to help you out too since they should know better than us here who are more XWiki experts than Tomcat experts.
Right now, I don’t have any clue for you about the problem except that a lot of threads are blocked by this lock.
Maybe someone else here will have a clue. Sorry for not helping more.
One idea would be to set up XWiki without the Docker image to rule out Docker too. Another would be to use Jetty to see if the problem is with Tomcat or not.
Yes, I clearly understand that you are not responsible neither for Tomcat, nor for Docker.
But don’t think the issue is with Docker… Will research possible tomcat issues…
So, it hung multiple times during last 2 days. I wrote a script which is checking if xwiki is alive and if it is not - executes ‘kill -3’.
Looking into the most recent threads dump, I see that lots of threads are BLOCKED - waiting to lock <0x00000006c1e88408> (a org.xwiki.classloader.NamespaceURLClassLoader)
And lock 0x00000006c1e88408 is held by the following thread:
"https://xwiki.domain.local/bin/get/XWiki/SuggestSolrService?query=fq%3Dtype%3ADOCUMENT%0Afq%3Dclass%3AXWiki.XWikiUsers%0Aqf%3Dproperty.XWiki.XWikiUsers.last_name%5E10%20property.XWiki.XWikiUsers.first_name%5E5%20name%5E2.5%20object.XWiki.XWikiUsers&nb=3&outputSyntax=plain&input=map%20job" #151066 daemon prio=5 os_prio=0 tid=0x00007f696c19f000 nid=0x5313 waiting for monitor entry [0x00007f6892344000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
- locked <0x00000006c1e88408> (a org.xwiki.classloader.NamespaceURLClassLoader)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.xwiki.classloader.xwiki.internal.ContextNamespaceURLClassLoader.loadClass(ContextNamespaceURLClassLoader.java:176)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.commons.lang3.ClassUtils.getClass(ClassUtils.java:993)
at org.apache.commons.lang3.ClassUtils.getClass(ClassUtils.java:1059)
at org.apache.commons.lang3.ClassUtils.getClass(ClassUtils.java:1042)
at org.apache.commons.configuration2.beanutils.BeanHelper.loadClass(BeanHelper.java:509)
at org.apache.commons.configuration2.beanutils.BeanHelper.fetchBeanClass(BeanHelper.java:547)
at org.apache.commons.configuration2.beanutils.BeanHelper.createBeanCreationContext(BeanHelper.java:612)
at org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:458)
at org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:481)
at org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:494)
at org.apache.commons.configuration2.builder.BasicConfigurationBuilder.createResultInstance(BasicConfigurationBuilder.java:447)
at org.apache.commons.configuration2.builder.BasicConfigurationBuilder.createResult(BasicConfigurationBuilder.java:417)
at org.apache.commons.configuration2.builder.BasicConfigurationBuilder.getConfiguration(BasicConfigurationBuilder.java:285)
- locked <0x00000007a38ee8a0> (a org.apache.commons.configuration2.builder.FileBasedConfigurationBuilder)
at org.apache.commons.configuration2.builder.fluent.Configurations.properties(Configurations.java:300)
at com.xpn.xwiki.internal.skin.EnvironmentSkin.getProperties(EnvironmentSkin.java:98)
at com.xpn.xwiki.internal.skin.EnvironmentSkin.createParent(EnvironmentSkin.java:76)
at com.xpn.xwiki.internal.skin.AbstractSkin.getParent(AbstractSkin.java:101)
at com.xpn.xwiki.internal.skin.AbstractSkin.getParent(AbstractSkin.java:37)
at com.xpn.xwiki.internal.skin.AbstractSkin.getResource(AbstractSkin.java:121)
at com.xpn.xwiki.internal.template.InternalTemplateManager.getTemplate(InternalTemplateManager.java:939)
at com.xpn.xwiki.internal.template.InternalTemplateManager.getTemplate(InternalTemplateManager.java:954)
at com.xpn.xwiki.internal.template.InternalTemplateManager.renderFromSkin(InternalTemplateManager.java:720)
at com.xpn.xwiki.internal.template.InternalTemplateManager.renderFromSkin(InternalTemplateManager.java:703)
at com.xpn.xwiki.internal.template.InternalTemplateManager.render(InternalTemplateManager.java:689)
at com.xpn.xwiki.internal.template.DefaultTemplateManager.render(DefaultTemplateManager.java:78)
at com.xpn.xwiki.XWiki.evaluateTemplate(XWiki.java:2354)
at com.xpn.xwiki.web.Utils.parseTemplate(Utils.java:179)
at com.xpn.xwiki.web.XWikiAction.execute(XWikiAction.java:513)
at com.xpn.xwiki.web.XWikiAction.execute(XWikiAction.java:215)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:425)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:228)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1913)
at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:449)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at com.xpn.xwiki.web.ActionFilter.doFilter(ActionFilter.java:112)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.wysiwyg.filter.ConversionFilter.doFilter(ConversionFilter.java:109)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.container.servlet.filters.internal.SetHTTPHeaderFilter.doFilter(SetHTTPHeaderFilter.java:63)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.container.servlet.filters.internal.SavedRequestRestorerFilter.doFilter(SavedRequestRestorerFilter.java:208)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.container.servlet.filters.internal.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:111)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.xwiki.resource.servlet.RoutingFilter.doFilter(RoutingFilter.java:132)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:660)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:679)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:798)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:808)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
- locked <0x00000007a356ff20> (a org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Could you please advice, how to properly interpret that ? does that mean that ClassLoader stucks for some reason ?
Sorry I’m not that good in java, my specialization is PHP =)
Usually a thread dump like this is the consequence of the whole JVM being super slow for some reason but hard to tell why. Could be the server itself being slow because of something else, just not powerful enough for the load, too many temporary memory created and the garbage collector having a hard time keeping track and blocking everything. You might get a bit more information using a tool like glowroot (https://glowroot.org/).
So I’m still struggling with XWiki hanging over and over again…
Situation became even worse - it usually hangs 3-4 times a day.
One time it was Java Out Of Memory Error (I had max heap size set to 2G).
It’s hard to imaging why so small installation like mine could require so many RAM.
But okay I’ve increased max heap size to 3Gigs… Last thread dumps show that there were hundreds if not thousands of threads like this:
"Thread-52233" #52297 daemon prio=5 os_prio=0 tid=0x00007f97f8188800 nid=0x4d9a runnable [0x00007f97543b4000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.net.SocketInputStream.read(SocketInputStream.java:224)
at com.novell.ldap.asn1.ASN1Identifier.<init>(ASN1Identifier.java:129)
at com.novell.ldap.Connection$ReaderThread.run(Connection.java:1269)
at java.lang.Thread.run(Thread.java:748)
Seems to be LDAP authentication module?
Any clue why did it spawn to many threads. I guess the app hangs due to JVM eat all available RAM.
You should enable automatic memory dump in case of OOM, see Java HotSpot VM Command-Line Options ( HeapDumpOnOutOfMemoryError). It’s generally easy (at least for us ) to see what cause this from the memory dump.