Huge search index left to build on myxwiki

Hi,

The index queue is very big (again) on MyXwiki causing search not to be accurate for recent documents:

> 195831(Estimated time remaining: 25h 26m)

Can someone have a look? Many thanks,

Hi,

Today search returns no result on my Wiki hosted on myxwiki.

I had a look in the admin: the index queue is very big and will take more than 1 month to finish :roll_eyes::

513238(Estimated time remaining: 867h 13m)

Can someone have a look at what’s going on? Many thanks.

I think we just discovered that the upgrade has caused myxwiki.org to be slow, see https://jira.xwiki.org/browse/XWIKI-17334

I don’t know if that’s the problem you’re seeing, but it could be I guess.

@tmortagne?

The myxwiki.org index started from scratch yesterday because it was upgraded to 12.3, but then https://jira.xwiki.org/browse/XWIKI-17334 came up, which made myxwiki.org super slow. We are currently working on fixing this.

Thanks for taking care of MyXwiki.

I hope I’ll be able to perform searches again soon ;-).

MyXWiki.org is back in much better shape now. The Solr indexing is still going to take a little while, but definitely not 867h.

Thanks!

The number of items is decreasing slowly: currently 508400 versus 513238 yesterday (*). Wait and see.

(*) Sometimes the number of items also increases, for instance from xx316 to xx348; might that be due to the wiki farm activity?

Yes, there is the queue of missing pages produced during the last XWiki startup, which increased quickly and then started to decrease, plus some small increases from time to time caused by new saves (many people are upgrading their wikis, which both increases the queue and slightly impacts the general speed; I also created a new wiki today, and an “empty” wiki contains more than 700 wiki pages). It’s sent to the remote Solr instance in batches of 50 elements (to avoid impacting memory too much). I see the current number is down to 505399 (note that this is not the number of pages but the number of elements, and each object property produces a separate element in the Solr index).

Of course your own wiki will be fully indexed (at least stuff you did before the last startup) long before this number is down to 0 since this is the total number for the whole farm and any new modification is added at the end of the queue.
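To make the batching described above more concrete, here is a minimal sketch. It is not XWiki's actual indexer code: it only assumes a FIFO queue of elements drained 50 at a time, and `send_batch` is a hypothetical stand-in for whatever call pushes a batch to the remote Solr instance.

```python
from collections import deque
from typing import Callable, Deque, List

BATCH_SIZE = 50  # batch size mentioned in the post above


def drain_queue(queue: Deque[str],
                send_batch: Callable[[List[str]], None],
                batch_size: int = BATCH_SIZE) -> int:
    """Send queued elements in FIFO order, at most batch_size at a time.

    send_batch is a hypothetical callback standing in for the remote Solr call.
    Returns the number of batches sent.
    """
    batches = 0
    while queue:
        # Take at most batch_size elements from the front of the queue,
        # so only one batch is held in memory at a time.
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        send_batch(batch)
        batches += 1
    return batches
```

Because new saves are appended at the end of the queue, elements that were queued earlier (for example at startup) are always sent before more recent modifications.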

Thanks for your explanations.

We are now at around 499 000 elements to be indexed by Solr.

I hope the night will help this queue decrease drastically, because I guess most myxwiki.org users are located in Europe and won’t perform any action / new saves during the next 8 hours :wink:.

Question: why did the index start from scratch yesterday? You stated it’s because of the 12.3 upgrade, but I don’t understand why old documents that users haven’t saved for several years need to be indexed again.

It’s not so much related to the XWiki upgrade as to the fact that it implied a big jump in Solr version (from 8.1.1 to 8.5), and Solr is not great at migrating data. So far our policy was simply to get rid of the old index and recreate it from scratch to avoid any problem, since technically it’s all a cache, but yes, it’s quite a pain for huge instances like myxwiki.org. We are currently experimenting with Solr cores that can be migrated (on the XWiki side), but they have very simple schemas compared to the search index (the reason we are experimenting with that is that those new cores are used as the only storage and are not indexing stuff stored somewhere else).

Today I had a look, with hope :star_struck:, at the index queue:

302895 (Estimated time remaining: 445h 38m)

The “estimated time remaining” is not accurate: it makes big “jumps”, for example from 445h to 11h (a factor of more than 40).

We are less than halfway there compared to 5 days ago :disappointed_relieved: (it was 513238, see my post above).

No document of my wiki is indexed and ready for search yet :disappointed_relieved:.

Will myxwiki.org users have to wait around 1 more week to have search available again?
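As a rough sanity check on that “1 more week”, here is a small sketch that extrapolates from the two queue sizes quoted in this thread. It assumes the queue keeps draining at its average rate over the last 5 days and is never reset or refilled, which may well not hold; the function name is just for illustration.

```python
def days_remaining(start_count: int, current_count: int, elapsed_days: float) -> float:
    """Naive ETA: assume the queue keeps draining at its average rate so far."""
    processed = start_count - current_count
    if processed <= 0 or elapsed_days <= 0:
        raise ValueError("queue is not shrinking; no estimate possible")
    rate_per_day = processed / elapsed_days
    return current_count / rate_per_day


# Queue sizes quoted in this thread: 513238 five days ago, 302895 today.
eta = days_remaining(513238, 302895, 5)
print(f"about {eta:.1f} days")  # prints "about 7.2 days"
```

Averaging over several days gives a much steadier figure than an estimate based on the momentary indexing speed, which would explain jumps like 445h to 11h.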

I agree it’s not normal. Even having to wait 20 minutes at startup is not normal and is a bug to me. So several hours or even days is a critical issue. It’s all the more important as several XWiki features depend on the Solr index being good.

Now myxwiki.org is serving its purpose here which is as a real-life test bed for xwiki, on a large instance. Now we need to find how to fix this. Some ideas at random:

  • Debug perf indexing issues
  • Index first active wikis and push inactive wikis to last
  • Scale indexing with sharding and, more generally, implement well-known strategies to scale Solr (if the issue is not an XWiki issue).
  • Implement index optimizations. For example a lot of docs are exactly the same across all wiki instances (the default documents, 1000 per wiki or so, times 300). It could be enough to index one and then somehow to just copy the index or have aliases.
  • Take inactive wikis offline (export their XAR content, store it for 1 year for ex and then delete). An inactive wiki could be a wiki that has not had user-made changes over a year. We would need to indicate this on the home page of myxwiki.org
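To illustrate the “active wikis first” idea, here is a minimal sketch. It is only an assumption of how such an ordering could look, not something XWiki implements: it takes a hypothetical mapping from wiki id to the date of the last user-made change and uses the one-year inactivity threshold suggested above.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

INACTIVE_AFTER = timedelta(days=365)  # "no user-made changes over a year"


def indexing_order(last_change: Dict[str, datetime], now: datetime) -> List[str]:
    """Return wiki ids in indexing order: active wikis first, inactive ones last.

    Within each group, the most recently changed wiki comes first.
    """
    def key(wiki_id: str) -> Tuple[bool, timedelta]:
        age = now - last_change[wiki_id]
        # False sorts before True, so active wikis come first;
        # a smaller age means a more recent change.
        return (age > INACTIVE_AFTER, age)

    return sorted(last_change, key=key)
```

For example, with hypothetical dates where `tutos` changed two days ago, `demo` a few months ago and `old` more than two years ago, the order would be `tutos`, `demo`, `old`.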

@tmortagne WDYT?

Thanks Vincent for confirming that this index issue needs to be handled seriously.

You’re right, and now it’s really clear to me!

Most of my posts here in the help forum are linked either to XWiki upgrade issues or to myxwiki instance maintenance / performance issues. I guess the proportion is roughly 80%/20%: 80 for upgrades and 20 for instance maintenance … this Solr issue seems to be at the intersection of these 2 categories :smiley:.

Hi,

I wanted to check the size of the index queue today, and when entering the admin application I received the following error:

Failed to execute the [include] macro. Cause: [Current user [xwiki:XWiki.xrichard] doesn't have view rights on document [Document tutos:XWiki.ConfigurableClassMacros]]. Click on this message for details.

Then, when accessing the admin application a second time, the error is not raised again but some menus are missing …

Many thanks in advance for your help (I’m quite sure the same issues are occurring on other wikis hosted on myxwiki and are not specific to my wiki).

Admin application is back without error and all its menus are available! I don’t understand what’s going on…

But the indexer still has a lot of work to do (I’m quite sure the estimated time remaining won’t be enough because the queue is continuously growing).

[Screenshot: Time remaining 2]

It’s actually not the same queue. There seems to be a bug with the Solr sync at startup, which restarts mostly from scratch at every restart, leading to the feeling that it never ends; it has actually finished several times already.

I disabled the auto sync and we are debugging it.

@xrichard could you remind me what your wiki’s id is? I can’t remember it and I can’t find it in this thread.

The infinite Solr reindex seems to be caused by some wikis having the wrong encoding, and I would like to make sure it’s the same reason for yours.

Is it tutos?

Hi,

Thanks for your help, and yes, the wiki concerned is “tutos”.