Simplify the algorithm to group events for notifications

surli · October 22, 2021, 2:31pm

Hi everyone,

I’m creating this new proposal related to Notifications, again with the goal of simplifying them. The topic of this proposal is the algorithm we use to group events: basically, when events need to be displayed, there’s an algorithm that checks whether they should be grouped together in a Composite Event.

For example, in this screenshot, you can see 2 composite events, holding respectively 3 and 4 events:

Note that the way we group events has several impacts:

the obvious impact is that you only see the composite event until you open the details (available when clicking on the 3 dots button)
right now you cannot mark a single event as read: you always mark a whole composite event as read, so it might impact several events at once
when you develop an extension, you can provide a specific displayer for your events, but the displayer API actually takes a CompositeEvent as input, since that’s what is displayed

Right now the way events are grouped relies on a SimilarityCalculator. This component is internal, and it computes the closeness of two events like this:

two events are very similar if they concern the same document and happened during the same request, but are not of the same type,
two events are a bit less similar if they are of the same type and concern the same document,
two events are again less similar if they are of the same type but do not concern any document
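
To make the three rules concrete, here is a minimal sketch in Python. It is not the actual SimilarityCalculator code: the `Event` class, its field names and the numeric score values are all assumptions made for illustration, and only the ordering of the scores matters.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical score values: only their relative order is meaningful here.
NO_SIMILARITY = 0
SAME_TYPE_NO_DOCUMENT = 1
SAME_DOCUMENT_AND_TYPE = 2
SAME_DOCUMENT_AND_REQUEST_DIFFERENT_TYPE = 3


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event: not the real notification event class."""
    type: str
    document: Optional[str]  # None when the event does not concern a document
    request_id: str


def similarity(a: Event, b: Event) -> int:
    """Return a closeness score following the three rules listed above."""
    same_type = a.type == b.type
    same_document = a.document is not None and a.document == b.document
    same_request = a.request_id == b.request_id

    # Rule 1: same document, same request, different types.
    if same_document and same_request and not same_type:
        return SAME_DOCUMENT_AND_REQUEST_DIFFERENT_TYPE
    # Rule 2: same type, same document.
    if same_document and same_type:
        return SAME_DOCUMENT_AND_TYPE
    # Rule 3: same type, and neither event concerns a document.
    if same_type and a.document is None and b.document is None:
        return SAME_TYPE_NO_DOCUMENT
    return NO_SIMILARITY
```

In this sketch, two events of different types on different documents, or of the same type on different documents, get no similarity at all.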

The algorithm then consists in getting all the events, iterating over them, and building a list of Composite Events: each new event is compared to the existing composite events and added to one of them if it’s similar to the events it contains. If there’s no similarity, a new Composite Event is created with the event.
This global algorithm looks good. However, internally, the way it computes similarity is quite complex: it does not only check the score given by the SimilarityCalculator, it also infers a transitive relationship between events. If an event A is similar to B and B is similar to C, they will all end up grouped together even if there’s no relationship between A and C.
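
The transitivity effect can be illustrated with a small Python sketch. This is only an approximation of the behaviour described above, not the real implementation: `group_transitively` and the `similar` predicate are hypothetical names, and the real code works with similarity scores rather than a boolean.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def group_transitively(events: List[T], similar: Callable[[T, T], bool]) -> List[List[T]]:
    """Group events so that one similar member is enough to join a group."""
    groups: List[List[T]] = []
    for event in events:
        members: List[T] = []
        kept: List[List[T]] = []
        for group in groups:
            # A single similar member pulls the whole group in.
            if any(similar(event, other) for other in group):
                members.extend(group)
            else:
                kept.append(group)
        # Groups matched by the same event are merged; with no match,
        # the event starts a new group on its own.
        groups = kept + [members + [event]]
    return groups


# A is similar to B and B is similar to C, but A and C are unrelated.
SIMILAR_PAIRS = {frozenset(("A", "B")), frozenset(("B", "C"))}


def similar(x: str, y: str) -> bool:
    return frozenset((x, y)) in SIMILAR_PAIRS


print(group_transitively(["A", "B", "C", "D"], similar))
# prints [['A', 'B', 'C'], ['D']]
```

A and C end up in the same group only because B links them, which is exactly the kind of result that is hard to predict when reading the notifications.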

To give you an example, here’s a use case I have:

a user reviews a Change Request, which leads to changing the status of the Change Request → three events are created: one for the review, one for the status change, and another one for the update of the Change Request page, since the status is saved in it. They happen in the same request and concern the same page, but are not of the same type.
afterwards, the original page is updated, which modifies the status of the Change Request → three events are created again: one for the update of the original page, one for the status change, and another one for the update of the Change Request page

Right now the algorithm will group 5 events together: all of them except the update of the original page, since it has no similarity at all with the other events. The first 3 events will be grouped because of the first rule of the SimilarityCalculator, and the other 2 will then be added even though they didn’t happen in the same request, because they are of the same type and concern the same page as events already in the group.

So my opinion here is that this algorithm is too complex: I definitely wasn’t expecting those events to be grouped together, and I spent quite some time understanding the situation. Hence I propose that we simplify it by keeping only two rules:

events that concern the same page and are of the same type are grouped together
events that are of the same type and don’t concern any page are grouped together

and that’s all.
Going back to my example, this would mean that all the status changes are grouped together, the page updates are grouped in 2 different groups, one for each page, and the review event is in its own group.
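
Here is a minimal Python sketch of what the two proposed rules could look like, applied to the example. The `Event` class, the event type names and the page names are hypothetical, and I’m assuming both status change events are attached to the Change Request page. Both rules collapse into a single grouping key, (type, page), where the page is `None` for events that don’t concern any page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event: not the real notification event class."""
    type: str
    page: Optional[str]  # None when the event does not concern any page


def group_events(events: List[Event]) -> List[List[Event]]:
    """Group events that share the same type and the same page (or no page)."""
    groups: Dict[Tuple[str, Optional[str]], List[Event]] = {}
    for event in events:
        groups.setdefault((event.type, event.page), []).append(event)
    # Dicts keep insertion order, so groups come out in order of first event.
    return list(groups.values())


events = [
    # First request: review of the Change Request.
    Event("review", "ChangeRequest"),
    Event("status_change", "ChangeRequest"),
    Event("update", "ChangeRequest"),
    # Second request: update of the original page.
    Event("update", "OriginalPage"),
    Event("status_change", "ChangeRequest"),
    Event("update", "ChangeRequest"),
]

groups = group_events(events)
print([(g[0].type, g[0].page, len(g)) for g in groups])
# prints [('review', 'ChangeRequest', 1), ('status_change', 'ChangeRequest', 2),
#         ('update', 'ChangeRequest', 2), ('update', 'OriginalPage', 1)]
```

With a plain key like this there is no score and no transitivity: whether two events are grouped depends only on those two events.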

IMO doing that will greatly simplify the code, and make the results easier to predict for both developers and users.
WDYT?