When to clean up notification filters, and which filters exactly?

Hi everyone,

I’m opening this brainstorming after starting to investigate a fix for a bug in notification filters. A brief recap: right now, notifications rely on notification filters defined by users, which specify which pages are watched or not. Those filters are then used to process events and send notifications to the concerned users.

The bug is that right now filters can be created automatically by the autowatch feature, but they’re never deleted when a document is deleted.

My first intuition was that it would be easy to fix: we add a listener on DocumentDeletedEvent and remove the filter when the event is triggered. The problem is that if we do that, we’ll lose the delete notifications for users watching specifically those pages: the events for notifications are processed in a low-priority thread, so by the time that thread processes the events, the filters will already have been removed by the listener; the document won’t be considered watched by the user anymore, and so the notification won’t be sent.

Now if we don’t do that with a listener, the only remaining options I can think of are a TaskConsumer or a Scheduler.
With the TaskConsumer, the listener would create a task to clean up the filters, which would also be processed in a low-priority thread. To my knowledge we cannot really tell this task to run only after the notifications have been processed, so we don’t know which will be faster: the notification handling or this task.

For the Scheduler, the idea is to not tie the cleanup of the filters to any event, but only to a date: we’d perform the cleanup regularly, e.g. every week. On the pro side, we shouldn’t have a synchronization problem between the moment the filter is removed and the moment the notification is handled, most of the time. On the cons side, I’m not yet 100% sure about the scalability of this solution: we’d need to check all filters of all users, since we wouldn’t know which pages (if any) have been deleted since the last run. It could probably be done with an HQL query, but that has to be tested. Also, there is still a very slight chance of a synchronization bug if a notification related to a deleted document is processed at the same time the scheduler runs.
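To make the scheduler option concrete, here is a minimal, self-contained Java sketch of what one cleanup pass could look like. Everything here (FilterPref, cleanup, string page references) is hypothetical and not XWiki API; a real implementation would work on the stored filter preferences, probably via an HQL query rather than an in-memory loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the Scheduler option: one periodic pass drops the
// "page alone" filters whose target document no longer exists. All names
// here are made up for illustration, not the real XWiki classes.
public class ScheduledFilterCleanup {

    public static class FilterPref {
        public final String user;
        public final String page;
        public final boolean pageOnly;

        public FilterPref(String user, String page, boolean pageOnly) {
            this.user = user;
            this.page = page;
            this.pageOnly = pageOnly;
        }
    }

    // Keep a filter if its document still exists, or if it is a
    // page-and-children filter (those are deliberately left untouched).
    public static List<FilterPref> cleanup(List<FilterPref> filters, Set<String> existingDocs) {
        List<FilterPref> kept = new ArrayList<>();
        for (FilterPref f : filters) {
            if (!f.pageOnly || existingDocs.contains(f.page)) {
                kept.add(f);
            }
        }
        return kept;
    }
}
```

Note that this pass has to visit every filter of every user, which is exactly the scalability concern mentioned above.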

So to sum up, on “when to clean up” we have 4 possibilities:

  1. use a listener but then accept that we’ll get a regression on deletion events: pretty sure everyone will be -1 on that one
  2. use a TaskConsumer and accept that result will be random
  3. use a Scheduler and accept possible scalability issue
  4. do nothing and accept to pile up filters (current situation)

Now on the “what filters” side: right now a notification filter may concern either the page alone, or the page and its children. By default, the autowatch feature creates a filter for the page alone. But a user choosing to watch something manually might use the button to watch a space (page and children).

So if we do something to clean up filters automatically, I was thinking of cleaning up only the filters concerning the “page alone”, and not the filters about the page and its children. This means that, even with the fix, filters can still pile up if people watch spaces, since those would never be removed automatically. But the pile-up would clearly be much slower than with the autowatch feature.

IMO that’s acceptable, but would be great if I can have other opinions too. WDYT?

I’m not really qualified to give a very insightful opinion but here it goes:

Wouldn’t it be possible to give a lifetime to every filter, and check after this lifetime whether the filter should still exist? E.g. the lifetime is 1 month, and I start watching a doc. On day 30, the filter is checked and the document still exists, so it’s kept for a second month. On day 45, the doc is deleted and the filter sends the notification properly. On day 60, the filter is checked and removed since the doc is not there anymore.

From what I understand this would be a variation of 3., but spreading the computation over multiple time spans so that it doesn’t become such a problem when scaling up.
I suppose the cost of scheduling the checks is low enough, and that we can’t save computation time by batch-processing the filters, but I might be wrong.
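For what it’s worth, the lifetime idea can be pictured as each filter carrying its own next-check timestamp, so a periodic job only touches the filters that are due and the cost is spread over time. All names below are made up for illustration; this is not XWiki API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch of the "lifetime per filter" idea: only the filters whose check is
// due are inspected, renewing the ones whose document still exists and
// dropping the rest. Hypothetical names, not the real XWiki classes.
public class TtlFilterCheck {

    public static class Filter {
        public final String page;
        public long nextCheck; // epoch millis of the next validity check

        public Filter(String page, long nextCheck) {
            this.page = page;
            this.nextCheck = nextCheck;
        }
    }

    public static final long LIFETIME = 30L * 24 * 3600 * 1000; // e.g. 30 days

    // Returns the filters still alive after running the checks due at `now`.
    public static List<Filter> runDueChecks(List<Filter> filters, Set<String> existingDocs, long now) {
        List<Filter> alive = new ArrayList<>();
        for (Filter f : filters) {
            if (f.nextCheck > now) {
                alive.add(f);                 // not due yet: keep untouched
            } else if (existingDocs.contains(f.page)) {
                f.nextCheck = now + LIFETIME; // doc still there: renew
                alive.add(f);
            }
            // else: doc is gone, drop the filter
        }
        return alive;
    }
}
```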

IMO in the case where the filter ‘parent’ page is deleted, the hierarchy for children is broken, and the filter was based on this hierarchy, so it’d make sense to delete it too.

Thanks for taking care of this issue on notifications!
Lucas C.

The idea is interesting if we do have a scalability issue. Now it raises some questions:

  1. would this concept be purely technical, only there to solve that issue? In general I’m not a big fan of adding such a design concept for a very specific technical problem
  2. how do we handle migration? All existing filters would get the same value, so we’d possibly still have the scalability issue, no?

WDYM the hierarchy is broken? To my knowledge there’s no problem with the hierarchy when a page is removed: notifications are still properly working, and the navigation even still shows the space.

I meant that in a semantic way, the relation ‘child of X’ does not mean anything anymore when X is removed. Everything still works, but the initial choice of the user of ‘I want to watch what happens on Y because it’s the child of X’ is not really meaningful anymore IMO.


ok thanks for clarifying, and for the feedback. I think what you say might make sense given the actual behaviour of the listener we have when moving a page, which also updates those filters, whether it’s a page-only or a page-and-children filter.
Now I’m honestly not sure it would always be what users expect.

Actually, I think we can handle that reasonably cleanly (for the case of non-folded DocumentDeletedEvent).

There are two use cases:

  • the DocumentDeletedEvent is folded (it was deleted as part of some higher-level action): strangely, this one is a bit more complex than the other. At first, it feels like we can immediately clean up the filters since it’s not going to produce a user notification. The problem is that another event not yet fully dispatched might involve this document too.

  • the DocumentDeletedEvent is not folded (it was directly deleted): we could do the cleanup in UserEventDispatcher: if the stream event we just dispatched is a document delete, then we clean up the filters of the user to whom the notification has just been dispatched. If we want something a bit more generic, we can move that logic to a listener listening to a UserEventDispatchedEvent produced by UserEventDispatcher for each stream event/user couple it just finished handling.
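A rough way to picture the non-folded case: the dispatcher delivers the notification for the delete event first, and only then drops the now-stale filter, so the race described earlier disappears. The classes below are simplified stand-ins for illustration, not the real UserEventDispatcher API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of "clean up in the dispatcher": the filter removal happens only
// after the user's notification for the delete event has been dispatched,
// so the delete notification is not lost. Hypothetical names throughout.
public class DispatchThenCleanup {

    public static class Event {
        public final String type; // e.g. "delete", "update"
        public final String page;

        public Event(String type, String page) {
            this.type = type;
            this.page = page;
        }
    }

    // userFilters: user -> set of pages covered by that user's page-only filters.
    // Returns the users who were notified.
    public static List<String> dispatch(Event e, Map<String, Set<String>> userFilters) {
        List<String> notified = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : userFilters.entrySet()) {
            if (entry.getValue().contains(e.page)) {
                notified.add(entry.getKey());        // deliver the notification first
                if (e.type.equals("delete")) {
                    entry.getValue().remove(e.page); // then drop the now-stale filter
                }
            }
        }
        return notified;
    }
}
```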

Sounds good indeed

We could maybe handle that with a dedicated cancelable DeletingFilterEvent: we’d trigger the event before performing the cleanup of the filter, and if something cancels it we don’t perform the cleanup. It would allow extensions to prevent the cleanup in specific known cases with a dedicated listener.
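A cancelable event of that kind could look roughly like this; the DeletingFilterEvent and listener shapes below are simplified stand-ins for XWiki’s observation API, just to show the veto flow.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Sketch of a cancelable DeletingFilterEvent: every listener gets a chance
// to veto the cleanup before it happens. Simplified stand-ins only, not the
// real XWiki observation classes.
public class CancelableCleanup {

    public static class DeletingFilterEvent {
        public final String page;
        private boolean canceled;

        public DeletingFilterEvent(String page) {
            this.page = page;
        }

        public void cancel() {
            this.canceled = true;
        }

        public boolean isCanceled() {
            return canceled;
        }
    }

    // Fires the event to all listeners; performs the cleanup only if none canceled.
    public static boolean tryCleanup(String page, Set<String> filters,
                                     List<Consumer<DeletingFilterEvent>> listeners) {
        DeletingFilterEvent event = new DeletingFilterEvent(page);
        listeners.forEach(l -> l.accept(event));
        if (event.isCanceled()) {
            return false; // some extension vetoed the cleanup
        }
        filters.remove(page);
        return true;
    }
}
```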

What I’m wondering is the following scenario: a document is deleted by accident and later restored. Now all notification filters are gone. Is that correct and expected? I wouldn’t expect it, to be honest. Could we maybe delay deleting the filter to the point where the document is removed from the recycle bin?

That’s indeed a good point.

I guess it’s an option if the recycle bin is enabled. It also avoids the complexity mentioned before: we’d only perform the cleanup when deleting from the recycle bin.

For me, either we clean or we don’t (is the filter attached to the actual document or just the location?), but I don’t see how removing the document from the recycle bin (which in most cases never really happens, even if the document is never going to be restored) would be a criterion.

Hello all,

Since the recycle bin may or may not be emptied in time, delaying the cleanup until then would leave deleted pages lingering in users’ filters for too long, and it would be just as incomprehensible as it is today.

It would indeed be “surprising” that restoring a page from the recycle bin doesn’t restore its filters, but it is similar to many other deletions in XWiki that trigger additional cleanup: deleting a user removes them from their groups and restoring never puts them back; now that deletion allows re-linking the incoming links, I doubt that restoring a page from the recycle bin would put the links back; etc.
We may need to explain better what restoring from the recycle bin does and doesn’t do: it’s not a rollback in time, it’s just a way to recover content.

As for what gets cleaned up, I think we should also clean up spaces, if spaces are deleted - or if the homepage of the space is deleted.

Thanks for handling this,
Anca

Hi, I’m coming back on this as I’m planning to perform the implementation.
So after reading the thread again, I’ll implement what I wrote above based on Thomas’ idea:

  1. Immediately clean the user filters in UserEventDispatcher in the case of a non-folded DocumentDeletedEvent
  2. Handle folded DocumentDeletedEvent with a cancelable DeletingFilterEvent

I will also document as a limitation in Notifications that the filters are immediately cleaned up when a document is deleted, and that restoring the doc won’t bring them back. I’m hesitating about providing a configuration option to keep supporting the old behaviour of never cleaning them up, if needed; do you think it’s necessary?

Also, I’m hesitating between adding a migration to clean up existing instances and just providing a snippet. What do you think?

I feel like a migration is in order (but it always feels more needed when you are not the one writing it).