CPU at 100% for a while, XWiki can't seem to recover afterwards, not sure how to debug

It would be interesting to enable an automatic memory dump on OutOfMemoryError to get a better idea of where all this memory went. See https://dev.xwiki.org/xwiki/bin/view/Community/Debugging#HAnalyzeOutOfMemoryissues for more details on how to enable it.

I’ve set HeapDumpOnOutOfMemoryError and HeapDumpPath like this:

      - JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError"
      - JAVA_OPTS="-XX:HeapDumpPath=/usr/local/xwiki/dump"
      - JAVA_OPTS="-Xmx2048m"

But I’m not getting any dumps, even though I’m still seeing OOM errors like before. I’ve tried a few different paths as well.
I’ve also tried setting them all in a single variable:

      - JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/xwiki/dump -Xmx2048m"

But that gives me errors, as it sees it as 1 really long option instead of 3 separate ones.

I’ll keep trying, but I’m not sure what’s wrong.

That’s definitely not going to work, as each line replaces the previous one from what I understand.

Where are you setting this?

In a Docker Compose YAML file; see the YAML above (2nd comment on the original post).

If I set them as 1 JAVA_OPTS I’ll get the following:

Configuring XWiki...
 Setting environment variables
   Deploying XWiki in the 'ROOT' context
 Replacing environment variables in files
   Generating authentication validation and encryption keys...
   Setting permanent directory...
   Configure libreoffice...
   Reusing existing config file hibernate.cfg.xml...
   Reusing existing config file xwiki.cfg...
   Reusing existing config file xwiki.properties...
 NOTE: Picked up JDK_JAVA_OPTIONS:  --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED
 Unrecognized VM option 'HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/xwiki/dump -Xmx2048m'
 Did you mean '(+/-)HeapDumpOnOutOfMemoryError'? Error: Could not create the Java Virtual Machine.
 Error: A fatal exception has occurred. Program will exit.

So it looks like it sees it as 1 parameter…

Setting them without quotation marks helped!
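
For anyone landing here later, this is a minimal sketch of what the working entry presumably looks like, assuming the list form of `environment:` used in the snippets above. The service name `web` is hypothetical and not taken from this thread. In the list form the whole item is most likely read as one plain string, so any quotes written after the `=` end up inside the value itself. That would explain why the JVM saw one long option. As noted above, repeating `JAVA_OPTS` on several lines leaves only one of them in effect, so all three options go on a single line.

```yaml
services:
  web:                      # hypothetical service name (assumption)
    environment:
      # One JAVA_OPTS entry, no quotes around the value, options separated by spaces.
      - JAVA_OPTS=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/xwiki/dump -Xmx2048m
```

The dump directory also has to exist and be writable inside the container, otherwise the dump may not be written.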

I’m going to wait until it gets the OOM errors a few times.

Ok,

I’ve had 1 dump (and that’s all I get in one run). I’ve opened it in the Eclipse Memory Analyzer and I’m seeing the following stuff (I’m not well versed on this matter, if I need to show something else I’d be more than happy to comply):
[Eight Eclipse Memory Analyzer screenshots were attached here.]

Looking at the result, it does look like Solr is a big suspect. I’ve also seen a lot of errors about Solr and not serializing stuff…

Is this something you’ve seen before (and hopefully fixable)?

Would it be possible for me to get that memory dump to dig a bit more? Or maybe there is sensitive stuff in it?

Sending a DM via Matrix :wink:

So, moving on from our DMs (thanks @tmortagne), prod has now restarted twice (because of our healthcheck on it).

It hasn’t dumped anything, I think because prod had already slowed down so much that it triggered the healthcheck before hitting any OOM errors.

But the 2nd time it failed the healthcheck I had the opportunity to get some logs out:

 2023-01-04 08:33:01,065 [recoveryExecutor-23-thread-1-processing-x:events] WARN  o.a.s.u.UpdateLog              - REPLAY_ERR: Exception replaying log
 java.util.concurrent.RejectedExecutionException: null
 	at org.apache.solr.util.OrderedExecutor.execute(OrderedExecutor.java:65)
 	at org.apache.solr.update.UpdateLog$LogReplayer.execute(UpdateLog.java:2058)
 	at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1922)
 	at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1784)
 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
 	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 	at java.base/java.lang.Thread.run(Unknown Source)
 [... the identical REPLAY_ERR warning and stack trace repeats 25 more times, timestamps 08:33:01,066 to 08:33:01,068 ...]
 2023-01-04 08:33:01,068 [recoveryExecutor-23-thread-1-processing-x:events] WARN  o.a.s.u.UpdateLog              - REPLAY_ERR: Exception replaying log
 java.util.concurrent.RejectedExecutionException: null
 	at org.apache.solr.util.OrderedExecutor.execute(OrderedExecutor.java:65)
 	at org.apache.solr.update.UpdateLog$LogReplayer.execute(UpdateLog.java:2058)
 	at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1922)
 	at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1784)
 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
 	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 	at java.base/java.lang.Thread.run(Unknown Source)

though it’s not much (the same error every few milliseconds).

I’ll make the healthcheck a bit more lenient and hope for something more next time.

It’s the first time I see this type of log. It seems to suggest that Solr is not in great shape.

It might be interesting to try a standalone Solr approach, as embedded mode tends to start causing performance problems when the database grows, and I remember you mentioned having quite a lot of users. Note that if you go for this, you should also move the xwiki_ratings and xwiki_events cores (in embedded mode they are located in <permdir>/store/solr, without the xwiki_ prefix) to the standalone Solr instance, as these two contain unique data which cannot be reconstructed (the two others will be reindexed automatically, but that could take a bit of time).

You’re scaring me…

Isn’t there a way to just hard reset the whole SOLR instance, hopefully fixing the problem?
Also, could I deduce from this that importing our pages into a new xwiki instance made SOLR crap itself?

I’ll try and add the standalone SOLR to our test instance first then…

It’s easy, but as I said, you might lose data by doing that for the ratings core (meaning you lose likes) and the events core (meaning you lose past notifications), so depending on how much you care about that you could:

  • full wipe: stop XWiki and delete <permdir>/store/solr and <permdir>/cache/solr
  • keep some cores: stop XWiki and delete <permdir>/store/solr/<cores you don't want to keep> and <permdir>/cache/solr
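For reference, the two options above could be scripted roughly as follows. This is only a sketch: the function name `wipe_solr` and its arguments are made up for illustration, the permanent directory path depends on your setup, and XWiki must be stopped before running it.

```shell
#!/bin/sh
# Sketch only: wipe embedded Solr data from an XWiki permanent directory.
# "wipe_solr" is a hypothetical helper, not something shipped with XWiki.
# Stop XWiki before calling it.
#
# Usage: wipe_solr PERMDIR [CORE_TO_DELETE ...]
#   no core names  -> full wipe of store/solr
#   with core names -> only those cores are deleted, the others are kept
wipe_solr() {
    if [ "$#" -lt 1 ] || [ ! -d "$1/store/solr" ]; then
        echo "usage: wipe_solr PERMDIR [CORE ...] (PERMDIR must contain store/solr)" >&2
        return 1
    fi
    permdir="$1"
    shift

    if [ "$#" -eq 0 ]; then
        # Full wipe: every core goes, including ratings and events.
        rm -rf "$permdir/store/solr"
    else
        # Partial wipe: delete only the cores named on the command line.
        for core in "$@"; do
            rm -rf "$permdir/store/solr/$core"
        done
    fi

    # The Solr cache is deleted in both cases.
    rm -rf "$permdir/cache/solr"
}
```

Calling `wipe_solr /path/to/permdir` does the full wipe. Passing core directory names after the path deletes only those, so leaving out `ratings` and `events` keeps the likes and past notifications.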

We don’t really use the rating system, and losing past notifications is a hit I’m willing to take.

We’ll do a full wipe and see how xwiki reacts to that… Doing it now!
I’ll update the post :wink:

Note that by “causing problem” I only meant performance problems; it’s the first time I see something that looks like corruption (if that’s really it).

It has been running for a bit more than 12 hours now and I’m cautiously enthusiastic about how ‘healthy’ the logs look!

The only warnings I’m seeing right now are the ExtensionIndexJob/UnknownHostException ones, but that’s because it can’t reach the internet! (I understand there is a ticket for that, so it’s all good!).

I’ll keep an eye on it for now and will post any updates (or set this to solved when it has been online for 14+ days without failure).

Not sure which ticket you are referring to. Generally the best approach here is to explicitly tell XWiki there is no extension repository available, using extension.repositories= in xwiki.properties, so that it does not even try.

I’ll add it.

We had another crash (same M.O.) but the dump was ‘only’ 0 bytes, I guess docker restarted the container a bit too fast… we changed the healthcheck again to be more lenient. It did say in the logs it was an OOM error.

It just failed again, I have a dump.

The interesting thing is that the test xwiki we have running (which stays untouched and is the same as production) hasn’t failed since starting it up, so it looks like it’s triggered by user interaction?

If you’re still interested in helping and looking at the dump, @tmortagne, we can do it the same way as we did last time.

After looking at the dump together with @tmortagne, and after @tmortagne made a snapshot and that snapshot was deployed, I’m pretty happy with the result: we have been running the snapshot for 2+ weeks now, with usage every day, without a single hiccup!

If I remember correctly, this specific problem shouldn’t happen anymore from 14.10.4 onwards.


Yes, the fix will be part of 14.10.4 (planned for release this week).
