Hello,
I have run into quite a challenge that arises from the DSGVO / GDPR in combination with the version history, which contains usernames, both in general and specifically with the LDAP Authenticator. This is probably going to be a complicated topic. I appreciate any input and help on this topic!
With normal XWiki user management, the username most often contains, or is built from, the user's given name and surname.
Likewise, the LDAP Authenticator usually uses either cn or sAMAccountName for the XWiki UID attribute, and in probably every organisation these also contain the user's name.
Example: UID_attr=sAMAccountName
Now here comes the problem: under the GDPR / DSGVO, an organisation is only permitted to process a user's personal information as long as there is either a valid reason or the user has given consent (which can be withdrawn at any time).
The username is the attribute that is written to the history on every change to a page. When a person leaves an organisation / company, the account can be deleted, but the username, and consequently the personal data, stays stored in that organisation's XWiki forever, even when the person is long gone and the organisation no longer has a valid right to process that personal information.
I'm aware of the snippet/extension Change Document User (XWiki.org), which can change the author, content author and creator attributes of any page, as well as the automated extension for LDAP: LDAP user cleanup (XWiki.org).
Both of these do a very good job, but neither clears the history, and I think I'm fully aware why: that would probably end up in a ton of database changes / IOPS / performance costs.
Because of that, I looked for a way (especially in the context of LDAP authentication) to mitigate the issue:
My idea was to not use sAMAccountName, but instead to map either objectSid or objectGUID to the UID attribute for the XWiki username. Such a unique identifier does not contain any personal information, so when a user leaves the organisation and the account is deleted, the history would simply show that meaningless UID.
But here comes the issue: over LDAP (at least against Active Directory), both objectSid and objectGUID are binary values, not text. When mapping, for example (just for debugging), first_name=objectGUID, the result is funny-looking symbols, probably because the LDAP query result is being interpreted as encoded text. At least it proves that this attribute can be read.
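The garbled symbols are what one would expect when 16 raw bytes are decoded as text. The following minimal Python sketch illustrates this outside of XWiki; the byte value is a made-up stand-in for an objectGUID, and nothing here uses the LDAP Authenticator itself. It also shows two text-safe representations: plain hexadecimal, and the usual GUID string (Active Directory stores the first three GUID fields in little-endian order, which `bytes_le` accounts for).

```python
import uuid

# Hypothetical 16-byte value standing in for an objectGUID read over LDAP.
raw_guid = bytes.fromhex("2f8a6c1e9b3d4a47a1c05e7f90b2d4e6")

# Interpreting the raw bytes as characters produces unreadable symbols.
as_text = raw_guid.decode("latin-1")

# Plain hexadecimal contains only the characters 0-9 and a-f.
as_hex = raw_guid.hex()

# Canonical GUID string; bytes_le handles the little-endian leading fields.
as_guid = str(uuid.UUID(bytes_le=raw_guid))

print(repr(as_text))
print(as_hex)
print(as_guid)
```

Either of the last two forms consists only of ASCII characters, so it could in principle serve as a username if the authenticator converted the value before using it.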
Mapping description=objectGUID also fails: the log says the value can't be parsed into valid XML (probably an escaping issue, since description is a field that can contain XWiki syntax), and updating the user's information fails.
Mapping UID_attr=objectGUID also fails, and the user can no longer be logged in or created. That makes sense, because objectGUID is not valid text, which an XWiki username requires.
Using objectSid instead of objectGUID always results in an error, probably because it is even longer than objectGUID, and it has the same issue of not being text anyway.
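For reference, a binary SID has a fixed layout: one revision byte, one byte giving the number of sub-authorities, a 6-byte big-endian identifier authority, and then one 32-bit little-endian integer per sub-authority. A typical domain account SID has five sub-authorities and is therefore 28 bytes long, compared with 16 bytes for a GUID. The sketch below decodes that layout into the familiar S-1-5-21-... text form; the function name `sid_to_string` and the sample value are made up for illustration and are not part of XWiki or the LDAP Authenticator.

```python
import struct


def sid_to_string(raw: bytes) -> str:
    """Convert a binary Windows SID into its S-R-A-S1-S2-... text form."""
    revision = raw[0]
    sub_count = raw[1]
    # The identifier authority is a 48-bit big-endian integer.
    authority = int.from_bytes(raw[2:8], "big")
    # Each sub-authority is a 32-bit little-endian unsigned integer.
    subs = struct.unpack("<" + "I" * sub_count, raw[8:8 + 4 * sub_count])
    return "-".join(["S", str(revision), str(authority), *map(str, subs)])


# Hypothetical SID with five sub-authorities, built only for this example.
raw_sid = (
    bytes([1, 5])
    + (5).to_bytes(6, "big")
    + struct.pack("<5I", 21, 1111111111, 2222222222, 3333333333, 1001)
)

print(len(raw_sid))
print(sid_to_string(raw_sid))
```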
So what I'm here to ask is:
- How are others dealing with the possible GDPR / DSGVO issue of former users' usernames being stored forever, in a non-anonymous way, in the history of their XWiki?
- Is it possible to clean or anonymize the history entries of every user that no longer exists?
- Would it be an easy fix/change to make the LDAP objectGUID usable as the unique user identifier, which would mitigate the whole problem? Maybe encoding the LDAP query results for the attributes objectGUID and objectSid as simple hexadecimal or plain decimal values could be an easy(?) fix?
I really, really appreciate any input and help on this topic!
Best regards,
Tom