Hi.
Nearly all parameters to search for are case insensitive. But not tags. A tag “Example” isn’t found if searched for “example”.
Any idea how to change that behaviour?
Regards, Simpel
Same problem! Is there any solution?
Hi,
You can’t force case-insensitive search from the search query. The way the search query is analyzed and matched against a field from the Solr index is determined by the field type. In other words, the search results depend very much on the way the data is indexed. The page tags are stored using a Static List property (of the XWiki.TagClass). Static List property values are currently indexed as strings, which means no transformation is performed. Each tag ends up as a token in the Solr index.
The field name is property.XWiki.TagClass.tags_string. The _string suffix triggers the string indexing; see xwiki-platform-core/xwiki-platform-search/xwiki-platform-search-solr/xwiki-platform-search-solr-server/xwiki-platform-search-solr-server-core/src/main/resources/conf/managed-schema in the xwiki/xwiki-platform repository on GitHub (tag xwiki-platform-15.8).
What options do you have? The simplest is probably to create another Solr field that indexes the same information (page tags) differently. You can do this from the Solr schema:
<copyField source="property.XWiki.TagClass.tags_string" dest="property.XWiki.TagClass.tags_sortString" />
This creates a new property.XWiki.TagClass.tags_sortString field that indexes the page tags as “sortString”, a field type defined in the same managed-schema file (xwiki-platform-15.8). It seems to do what you want: it lowercases the entire field value, keeping it as a single token.
Of course, you have to re-index your wiki after customizing the Solr schema.
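For readers less familiar with Solr analyzers: a field type that lowercases the whole value while keeping it as a single token usually pairs a keyword tokenizer with a lowercase filter. The fragment below is only a generic sketch of that idea, not the actual “sortString” definition from XWiki’s managed-schema, and the type name tag_lowercase_example is hypothetical.

```xml
<!-- Generic illustration, NOT copied from XWiki's managed-schema. -->
<!-- "tag_lowercase_example" is a hypothetical name used only for this sketch. -->
<fieldType name="tag_lowercase_example" class="solr.TextField">
  <analyzer>
    <!-- Emits the entire field value as one token (no splitting on whitespace). -->
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <!-- Lowercases that token, so "Example" is indexed as "example". -->
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

Because the same analyzer is applied to the query, a search for “example”, “Example” or “EXAMPLE” is reduced to the same token before matching.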
Hope this helps,
Marius
@mflorea: I think we have a product issue here, as other tags-related APIs are case-insensitive, to my knowledge.
So the concept of tag in XWiki is case-insensitive, thus search should probably also be case insensitive by default for tags.
However, I don’t think all of the tag APIs are case-insensitive, so there is a bit of a mess here…
Anca
I have tried this approach to get case-insensitive tag-based search suggestions and it didn’t work.
Error from the log:
"Multiple values encountered for non multiValued copy field
property.XWiki.TagClass.tags_sortString"
I have then tried jmiba’s approach from this thread and found that to work.
He proposes adding a new field “tag”, declaring it multiValued="true" and then copying to that.
One can then add search suggestions based on this new field and voila, it works!
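A minimal sketch of what such a schema change could look like is shown below. The field name “tag” and the multi-valued declaration come from the description above; the field type is an assumption, since the thread does not say which one jmiba used. “sortString” is picked here only because it is the lowercasing type mentioned earlier in this thread.

```xml
<!-- Sketch under assumptions: the type "sortString" is a guess, not confirmed by the thread. -->
<!-- multiValued="true" avoids the "Multiple values encountered for non multiValued copy field" error. -->
<field name="tag" type="sortString" indexed="true" stored="true" multiValued="true" />

<!-- Copy every page tag from the case-sensitive string field into the new field. -->
<copyField source="property.XWiki.TagClass.tags_string" dest="tag" />
```

As with any schema customization, the wiki has to be re-indexed before the new field is populated.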
Hi,
I understand the issue that case sensitivity is dependent on the field, but:
Why can’t I at least store the same tag in different spellings? Currently (16.10.15) XWiki prevents me from adding a tag in an additional spelling (e.g., “S/mime” vs. “s/mime”).
It doesn’t seem consistent that, on the one hand, searching is not case-insensitive, while on the other hand I’m prevented from storing the tag in both uppercase and lowercase.
An internet search led me to this:
https://jira.xwiki.org/browse/XWIKI-23428
Apparently, it’s already implemented in the current LTS version. I’ll give it a try…
The tags are stored in a generic way, using objects attached to wiki pages. The search is then indexing these objects, using whatever metadata they provide to decide how to index. The search doesn’t care if it’s a tag object or a comment object. It only cares that there is an object with some properties that have a type and some metadata. The property type and metadata control how the indexing is performed. There is no case-insensitive string (list) property. The tag object uses a string (list) property, so the fact that tags are case-insensitive is only known by the tag UI. On the storage side they are saved as case-sensitive strings and indexed accordingly.
I’m not mentioning this as a justification, but to provide some context. The problem is real. The best fix is probably to introduce some metadata for the string (list) property to indicate that it’s case-sensitive or not and then use this to index the values differently. The quick fix is to add some special handling of tags to the Solr schema, which is not nice because the search shouldn’t be aware of the tags feature. Alternatively, we could imagine introducing a way for XWiki extensions to extend the Solr schema with domain specific indexing. This is probably already possible using the Solr API, but it’s not easy / straightforward.
Thanks,
Marius
I understand, so the Jira issue I linked is irrelevant for tag searches because it deals with string fields, while the tags are stored as objects?
Too bad: I was hoping that a simple update would solve the problem. So now I’m back to square one: It’s not very helpful if I always enter the tags in lowercase, but users aren’t searching for them in lowercase.
But if I enter all the tags in lowercase… Isn’t there a way to convert the search to lowercase BEFORE the Solr search starts? It wouldn’t be ideal, but it would help for now…
I missed that issue, which is indeed relevant to the tag search. The only remaining problem is a related issue, which is needed to expand the field property.XWiki.TagClass.tags to property.XWiki.TagClass.tags_string_lowercase so that you get the proper match. The quick fix is to edit <permanentDirectory>/store/solr/search_9/conf/solrconfig.xml:
- <str name="xwiki.dynamicFieldTypes">boolean, int, long, float, double, string, date</str>
+ <str name="xwiki.dynamicFieldTypes">boolean, int, long, float, double, string, date, string_lowercase, string_ns, lowercase</str>
Hope this helps,
Marius
Yes, that worked! After a reboot and a reindex, the case insensitive tags are now being found.
Thank you so much!