Hi everyone,
the wiki and space-level search REST API currently has a very detailed specification of how it performs the search:

> Returns the list of pages and objects that contain the {keywords} in the specified {scope}s. Multiple scopes can be specified. Search results are relative to the whole {wikiName} and are obtained via a HQL query. The specified keywords are converted to uppercase and used in a HQL LIKE clause (e.g if the scope is CONTENT then the document’s content is matched to the specified keywords).

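
To make the specified behavior concrete, here is a minimal sketch of how the keywords might end up in the LIKE clause. The function name, the `%` wrapping, and the HQL fragment are assumptions for illustration only; this is not the actual XWiki implementation.

```python
# Assumption: illustrative HQL fragment for the CONTENT scope, not the real query.
HQL_CONTENT_FILTER = "upper(doc.content) like :keywords"


def like_pattern(keywords: str) -> str:
    """Build the value bound to :keywords (hypothetical helper).

    The keywords are uppercased, as the specification says. Wrapping them
    in % to get a substring match is an assumption.
    """
    return "%" + keywords.upper() + "%"


print(like_pattern("Sandbox home"))
```
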
The only use of this API that I’m aware of is the page picker. In order to fix the slowness of this page picker (XWIKI-22958), I see two options:
- Break this description and modify the search behavior. In particular, the search API would have the following new behavior:
  - It uses Solr for searching, so no exact substring match is performed. Instead, the query is matched with the standard Solr query parser. Further, a wildcard is appended to the last token of the query to support a partially typed word at the end of the query. This wildcard processing is disabled for the page content to improve performance.
  - Results are sorted by match score by default. Sorting by some properties is still supported, in particular fullName, name, title, language, date, creationDate, author, creator, space, version, and hidden. For empty queries, as a special case, results are sorted by date in descending order (to return the most recently modified pages).
  - Searching in the name restores the previous behavior of matching in all spaces. The title has a higher score, though, so matches in the title are preferred over matches that occur only in the space. Additionally, when searching in the name, the search now also matches the entered text against the full document reference, specifically to support pasting a full document reference (this is an exact match only, no substring).
- Introduce a new API for the page picker that exposes the behavior described above, or use an existing Solr API that is flexible enough to support the use case (not sure, as it is very nice to be able to use the tokenizer for getting the last token for the wildcard search).
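
The query and sorting rules of option 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: `build_query` and `choose_sort` are hypothetical names, a whitespace split stands in for the Solr tokenizer, and returning `*:*` for an empty query is my own choice for the sketch, not something decided in the proposal.

```python
from typing import Optional

# Sortable properties as listed in the proposal.
SORTABLE = {
    "fullName", "name", "title", "language", "date", "creationDate",
    "author", "creator", "space", "version", "hidden",
}


def build_query(text: str, scope: str) -> str:
    """Append a wildcard to the last token, except for content searches."""
    # Assumption: whitespace splitting replaces the real Solr tokenizer,
    # and special query characters are not escaped in this sketch.
    tokens = text.split()
    if not tokens:
        return "*:*"  # assumption: Solr match-all query for empty input
    if scope != "content":
        tokens[-1] += "*"  # supports a partially typed last word
    return " ".join(tokens)


def choose_sort(text: str, order_field: Optional[str] = None,
                order: str = "asc") -> str:
    """Pick the sort clause: explicit property, else date or score."""
    if order_field in SORTABLE:
        return f"{order_field} {order}"
    if not text.strip():
        return "date desc"  # empty query: most recently modified first
    return "score desc"  # default: best match first


print(build_query("xwiki sand", "title"), "|", choose_sort("xwiki sand"))
```

With these helpers, an empty query yields `*:*` sorted by `date desc`, while a typed query such as `xwiki sand` in the title scope yields `xwiki sand*` sorted by `score desc`.
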
I’m in favor of breaking the existing REST API: it exposes slow database searches, and I think we should remove such REST APIs. Further, I think it would be better to have a replacement that mimics the old REST API than to remove it completely.
Therefore, I’m opening this vote to perform the breaking change of option 1. If the vote should fail, I’ll proceed with option 2.
The vote will stay open for a bit more than a week, until June 2, 10:00.