Introduce a XWikiDocument#displayedAuthor

Hi everyone,

the way we handle Author of the document in XWiki is through two fields in XWikiDocument, author and contentAuthor. I won’t talk about contentAuthor since it’s been introduced only for security reason and AFAIU it should never be different from the author in the storage (it might be different at execution time, but it’s not the topic).

Now, the purpose of the author field in XWikiDocument is twofold: its first usage is to be able to display who is the author of the document in the header content of the page, its second usage is to be able to compute rights on the document.

The fact that it has two usage can be problematic in some conditions: right now, a script without PR rights cannot set the current user as the author of the document, so a document created in this condition will always have the script author as its own author, and not the actual user of the script.

For me my usecase is a bit different: I want a way in Change request, to be able to save a document in specifying that the user who performs the merge is the author to be checked for the rights, and the author who actually proposed the changes, the author to be displayed.

For those reasons, I propose that we add a new field and associated methods in XWikiDocument that I would call displayedAuthor: its purpose is solely to be displayed in the UI, and the author would remain the one to be checked for rights.

The change for introducing such field would be the following:

  • introduce the field and API in XWikiDocument (getter, setter)
  • introduce new save API in Document to allow setting a custom displayedAuthor when saving
  • introduce an hibernate migration that would create a new column in Document table, and fill it with the author for existing documents
  • change the implementation of the versioning store to use the displayedAuthor as the author saved in an RCSNodeInfo
  • change the serialization of a document to XML to introduce this new metadata

This is my main proposal, on top of this I propose that we rename getAuthor to getMetadataAuthor or something a bit more explicit to be able to make the distinction between the 3 authors we’d have.

WDYT?

And the parsing too (fromXML()).

General +1 except for one small comment: displayedAuthor feels a bit weird and I would expect displayAuthor.

This looks complex to understand with all the different authors. Maybe you could list them all, propose some names that make sense and propose a description so that we can see them next to each other and see how they would be documented.

So I’ll try to sum up, I hope I won’t make mistake since it’s even for me it’s still complicated :slight_smile: for details there was a discussion about them with @tmortagne available in matrix

  • author: this is the author of all metadata of a document, including the xclass, xobjects, etc. When using the save APIs we currently always use the current user in the context as this author.
  • content author: this is the author of the content of a document, this author information are used for checking the rights to execute scripts. In case of saving a document from a script, the author of the document will be the content author of the script document.
  • display author: as specified its goal is to not check rights at all on them, but only to display them in any UI, to know who triggered the changes of a document, whatever would be the mean used for saving it (script, change request etc)

Note that I’m still not comfortable between content author and author, since I don’t see in which condition there might be different.

As we discussed on the chat, the content author is the author of the content field and only that, while the author is about any kind of change (including the content). If you don’t modify the content field then only the author is updated and this happen often (just think about comments for an easy example).

AFAIU now the terminology of “display author” is too restrictive. It’s not just about display, it’s about knowing the author that triggered the save action (it could be to display it but it could be used for lots of other reasons too).

From Simon on https://github.com/xwiki/xwiki-platform/pull/1713#discussion_r746566050 :

So to clarify the idea here is that a document can be created or saved by a script that has the rights to do so when the user itself does not necessarily have the rights. So the final user who uses the script is the user “who impulses” the creation or save. Easy example is a the usage of change request: the user who performs the changes is not the one who merges, the first one will be this displayed user.

Some ideas of terminology:

    • Original Author
    • Effective Author

or

    • Author
    • Saving Author

or

    • Trigger Author
    • Saving Author

or …

Note that in your proposals you only include 2 possibilities while we already have two different authors: the content author and the metadata author.

So Effective Author would be Effective Metadata Author and Effective Content Author vs Original author for example?
Actually since those content / metadata authors should then be used only for Rights we could maybe put explicitely “Rights” in the terminology?
Something such as:

  • Metadata Rights Author
  • Content Rights Author
  • Original Author

I guess we can imagine a use case where you want to know who modified a document content last for other reason than testing rights, so I’m not sure including “rights” in the name really is accurate. I much prefer having a name that describe what it’s about.

Ok, so coming back on @vmassol idea on this, I propose those names:

  • effectiveMetadataAuthor (in replacement to actual metadataAuthor)
  • originalMetadataAuthor (in replacement to proposed displayedAuthor)
  • contentAuthor (keeping the existing name)

WDYT?

So if I understand correctly, the effectiveMetadataAuthor is used to check permissions (and so is contentAuthor), while originalMetadataAuthor is optional (contrary to the other ones, and when not set it returns effectiveMetadataAuthor) and is not used in XWikiDocument but it can be used by code (like the vm that displays the “last modified by” line in the UI) to access the author that triggered the document change.

This is fine to me but I find it a bit hard to understand what they’re used for just by looking at their names. This is why I had also proposed author and savingAuthor, where savingAuthor is the author used to save the changes (and author is the author who performed the change).

Problem is that “author” already exist in the API and in the database, and it’s what we plan to use as “saving author” since that’s its current job.

Note that I’m not a huge fan of the proposed names either, but I don’t have much better proposal yet.

Is it a problem to have a different name in the DB than the API? The DB is supposed to be “internal”. We could do a migration (changing a column name should be super fast).

It wouldn’t be a backward compat breakage since setting the author and not setting the “saving author” would behave as now(author would be used as the saving author if not set).

Not really, queries makes the database an API.

ah yes… indeed, not possible