The easiest way to do that, by far, is to move extensions to blob storage (and so to S3 in the case of a cluster), so I propose to do that as a first version. I like the database solution because it makes more sense to store such an index in a database that you can query, but it’s a lot more complex to implement. Going to blob storage now does not prevent us from moving to something else later anyway (also, in any case, we will always need a storage solution for the extension binary files, and a database is generally not great for that).
I believe we need something quickly, and blob storage has the big advantage that it does not change anything in the design (at least the local design; sharing the installed extensions index will change the logic of the clustering support a bit, whatever storage solution is chosen).
But it does come with a constraint: it’s impossible to store extensions in S3 storage if S3 support is itself an installed extension. So I would like to also propose one of the following:
1. include S3 support (xwiki-commons-store-blob-s3) in the XWiki WAR
2. or introduce a new clustering-oriented WAR (S3 included and the default, no embedded Solr, maybe some other stuff)
WDYT?
Here is my +1 for moving extensions to blob storage.
I believe we will do 2. eventually, but I feel 1. makes sense on its own already, as it can be interesting even for the single-instance use case (not so much for extensions, but S3 storage is simply less expensive than disk storage, which can be interesting if you have a lot of attachments).
Disclaimer: I’m replying to a topic I haven’t followed, so bear with me. Just trying to understand.
I don’t understand this part. Imagine XWiki is installed; it has the blob store API + the filesystem implementation available. So that’s what is used to store extension metadata. Then, an admin installs the S3 implementation of the blob store API (thus on the filesystem), and that adds an option (in the Admin UI, I assume) to switch to the S3 implementation for storing extension metadata, with a migration to copy from FS to S3.
What is wrong with that?
I don’t understand this either. For me, S3 has to be more expensive since you have the network calls + the storage of the remote server, vs. only the FS storage locally. Could you explain?
Sounds good to me, +1. I agree that the blob store is the best place for the extension files themselves, and moving the XED files there, too, is probably okay; as you wrote, we can still move them later.
I think where we need to be careful is how to handle the upgrade across the cluster, like first add the new extension versions, then get all cluster members to load the JARs, and only then remove the old version - and maybe even consider how to roll back if anything fails.
Another point to consider, I think, are offline extension installations that should use S3. I haven’t tried this myself yet, but I understand from guides like the one from XWiki SAS that the recommendation is to extract files into the extension repository, which would now move to S3. This breaks when moving to S3 for everything but the initial installation (where an automated migration would be triggered if there aren’t any extensions on S3 yet). While we could instruct users to upload the files to S3 directly, this seems a lot more complicated. Maybe we could always check the local extension repository for new extension files and copy/move them to S3 when found, or offer some other easy mechanism for admins to add new extensions to the extension repository in S3?
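The "always check the local extension repository" idea could look roughly like this. This is a toy in-memory sketch, not XWiki code: the class name, the string "repositories", and the `sync` method are all invented for illustration; the real implementation would compare actual files on disk against the S3-backed repository.

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch: at startup (or on demand), copy any extension file that is
// present in the local extension repository but missing from the S3-backed
// repository. Both repositories are plain string sets here for simplicity.
public class OfflineExtensionSync {
    private final Set<String> s3Files = new HashSet<>();

    /** Simulates a file already present in the S3 repository. */
    public void addToS3(String fileName) {
        s3Files.add(fileName);
    }

    /**
     * Copies every locally found extension file that is not yet on S3.
     * Returns the number of newly "uploaded" files.
     */
    public int sync(String... localFiles) {
        int uploaded = 0;
        for (String file : localFiles) {
            // Set.add returns true only when the element was not there yet,
            // i.e. when the file actually needed to be uploaded.
            if (s3Files.add(file)) {
                uploaded++;
            }
        }
        return uploaded;
    }

    public int s3Size() {
        return s3Files.size();
    }
}
```

The point of the sketch is that the operation is idempotent: running it on every startup only uploads files an admin dropped into the local repository since the last run.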
An open question to me is whether, with this change, we will require the use of S3 for clustering or still support the old way, too. I think that, to avoid having to maintain two systems of which at least one will be poorly tested, it would be best to require S3 for clustering, but it’s of course not so nice in terms of backwards compatibility.
+1 to include S3 support in the WAR, even though it’s quite a heavy dependency I think.
You need the configured blob store implementation component to be ready before you start scanning installed extensions. If the S3 blob store implementation is an installed extension, you cannot use it to discover installed extensions stored in S3 (you don’t even know it exists yet).
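To illustrate the bootstrap ordering problem with a toy model (all names here are invented, this is not the XWiki component manager API): at startup, only components bundled in the WAR are available; components shipped as installed extensions only become usable *after* the installed extensions have been scanned, and that scan itself needs the configured blob store.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the chicken-and-egg problem: a store implementation that is
// only available as an installed extension cannot be used to discover the
// installed extensions in the first place.
public class BootstrapOrder {
    // Store implementations available before any extension scanning,
    // i.e. the ones bundled in the WAR.
    private final Set<String> bundledStores = new HashSet<>();

    public BootstrapOrder(String... bundled) {
        for (String store : bundled) {
            bundledStores.add(store);
        }
    }

    /**
     * Returns true if the installed extensions index can be read with the
     * configured store: only possible when that store was bundled.
     */
    public boolean canScanInstalledExtensions(String configuredStore) {
        return bundledStores.contains(configuredStore);
    }
}
```

With only the filesystem implementation bundled, configuring "s3" as the extension store makes the scan impossible, which is exactly why the proposal is to ship S3 support in the WAR.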
I’m talking about actual money here.
I would not worry too much about that: currently, the old version is not removed during an upgrade.
Yes, what I had in mind is to require S3. I also believe it’s kind of the promise we made the minute we introduced the blob API: contrib extensions will expect that using the blob storage gives them a shared storage for files in the case of a cluster.
Yes, I hope the fact that it’s introduced at the beginning of XWiki 18.x (hopefully 18.1.0) makes this requirement OK. In practice, it puts extension clustering support in a similar situation to attachment clustering support: we recommend you use S3, but the old, not super reliable, trick of using a shared folder should mostly work too.
I trust you on that one, but would you be able to show us a simulation of the cost for X amount of data and Y amount of network usage on disk storage vs. S3 storage?
+1 as well (assuming there are indeed no solutions that would allow seamless work with EM).
With the assumption that there is a good migration path toward S3. I tend to be +1 for breaking backward compatibility, because the current solution never allowed a great clustering implementation, while the solution with S3 is expected to be the long-term solution for clustering support.
Regarding requiring S3 for clustering, have we polled the community, asking our users if they’d be ok with this or if someone would find it a big issue for them? I have the feeling it makes setting up an XWiki cluster harder but I’m probably wrong.
Do we have a tutorial to explain how to set up an XWiki cluster using an open source S3 server? I think that would help and would be good, even outside the scope of this proposal.
On the first start with S3 enabled, all data is moved automatically from local file system to S3 (that feature is part of the blob store API). This happens the moment the blob store is accessed, so for the extension repository the moment the extension repository is accessed for the first time.
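Roughly, the behavior described above can be sketched like this. This is a toy in-memory illustration, not the actual blob store API: the class name, the use of string keys, and the `migrateIfNeeded` method are invented; the real migration moves blobs from the permanent directory to S3.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of "migrate on first access": everything in the local
// filesystem store is moved to the remote store the first time any blob
// in that store is accessed.
public class LazyMigratingStore {
    private final Map<String, byte[]> localFs = new HashMap<>(); // stands in for the FS blob store
    private final Map<String, byte[]> s3 = new HashMap<>();      // stands in for S3
    private boolean migrated = false;

    public LazyMigratingStore(String... initialLocalKeys) {
        // Pre-existing blobs on the local filesystem before S3 was enabled.
        for (String key : initialLocalKeys) {
            localFs.put(key, new byte[0]);
        }
    }

    private void migrateIfNeeded() {
        if (!migrated) {
            // One-time move of all local blobs to S3, on first access.
            s3.putAll(localFs);
            localFs.clear();
            migrated = true;
        }
    }

    public boolean contains(String key) {
        migrateIfNeeded();
        return s3.containsKey(key);
    }

    public int localSize() {
        return localFs.size();
    }

    public int s3Size() {
        return s3.size();
    }
}
```

So for the extension repository, nothing happens until the repository is touched; the first access drains the local store into S3 and everything is served from there afterwards.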
From what I understand, at many cloud providers, internal network transfer is free. The way we use object storage in XWiki, all transfer goes through XWiki, so the network transfer cost should be the same for both types of storage. At OVH, classic block storage is 0.04307 EUR/GB/month, while standard object storage is 0.0070956 EUR/GB/month, so more than a factor of four cheaper. Even high-performance object storage is just 0.01825 EUR/GB/month, still more than a factor of two cheaper than classic block storage. High-speed block storage is in turn twice as expensive as classic block storage, so if you compare the two high-performance storage types, you get a factor of four again.
In a cluster setting, storage always needs to be provided over the network due to the NFS requirement.
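To put rough numbers on the per-GB prices quoted above (a toy calculation using those OVH figures, not an official cost estimate; only storage cost is modeled, since internal network transfer is assumed free):

```java
// Toy monthly storage cost comparison, using the OVH per-GB prices
// quoted above (EUR/GB/month). "High-speed block" is modeled as twice
// the classic block price, as stated in the discussion.
public class StorageCost {
    static final double CLASSIC_BLOCK = 0.04307;
    static final double STANDARD_OBJECT = 0.0070956;
    static final double HIGH_PERF_OBJECT = 0.01825;
    static final double HIGH_SPEED_BLOCK = 2 * CLASSIC_BLOCK;

    static double monthlyCost(double pricePerGb, double gigabytes) {
        return pricePerGb * gigabytes;
    }

    public static void main(String[] args) {
        double gb = 500; // e.g. 500 GB of attachments
        System.out.printf("classic block:   %.2f EUR/month%n", monthlyCost(CLASSIC_BLOCK, gb));
        System.out.printf("standard object: %.2f EUR/month%n", monthlyCost(STANDARD_OBJECT, gb));
        // Ratio between the two high-performance options:
        System.out.printf("high-speed block / high-perf object: %.1fx%n",
                HIGH_SPEED_BLOCK / HIGH_PERF_OBJECT);
    }
}
```

For 500 GB that is roughly 21.5 EUR/month on classic block storage versus about 3.5 EUR/month on standard object storage, which is where the "more than a factor of four" claim comes from.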
Ah OK, I think most users who were using XWiki in a cluster so far were using NFS on their own server (thus no additional cost). I believe (and I hope) they can continue to do that by hosting some open source S3 server. It still feels more complex to set up than before, and also less performant (but with the potential advantage of being less clunky/working better? I’m not even sure we had problems with NFS).
About that, I know of https://garagehq.deuxfleurs.fr/, which seems popular these days.
It seems we are using MinIO for our tests, but from what I understand, the project is now in maintenance mode.
Do we have a plan for an alternative?
My plan was to try integrating Garage into our integration tests, but I haven’t found the time for it yet. There is no dedicated Testcontainers support for it, but I think it should work as a generic container. I’ve checked some other options like OpenStack Swift which should be the software that is used by OVH according to their documentation, but it seemed much more complicated.
That would be the goal when integrating it in our test setup.
Actually, clicking around a bit more, I found a documentation page that seems to imply that Swift is only used in their legacy storage system. Confirmed also in their migration documentation.
It’s a bit late to discuss the validity of proposing only S3 as the recommendation for file sharing. We already have the problem for attachments. NFS has never been a recommendation, only a workaround while waiting for a proper fix (which, right now, is S3).
Anyway, as I mentioned above, the same crappy hack used for the attachments will theoretically be usable for extensions too, as soon as Extension Manager is refactored to rely on the fact that the local extension repository is shared (which is the only new requirement from EM’s point of view). Extension Manager itself does not really require a specific implementation; the minimum is that when a file is saved, it’s available to all nodes. I’m not too worried about file collisions for a feature like Extension Manager (much less than in the case of attachments anyway), especially since each extension version is located in a different file. I just really don’t think it’s something we want to recommend.
If you compare to the current clustering support, it’s definitely more complex. Compared to NFS, I don’t know; there are quite a few S3 implementations out there, and I assume some are not that complex to configure.
It will probably have an impact at init. That said, I guess it’s possible to have a setup in which the speed is not that different, and we are also not talking about writing/reading gigabytes of data in the case of Extension Manager, so it’s more about the ping.
Right now our doc says we can still use NFS when S3 is not available:
For older XWiki versions, or if you cannot use S3 as blob store for another reason, you need to set up a shared file system (e.g., NFS), as attachments and deleted documents located on the file system (inside the Permanent Directory) are not clustered. In order for all XWiki instances to access and display them properly in the UI, you need to share the store/file directory using the shared file system.
Are we going to keep support for NFS for users who don’t want/need S3? Do we have a thread where this was discussed (I searched quickly and didn’t find one)?
I don’t think we discussed that, but also I’m not sure what “keep support for NFS” means. As long as we have a filesystem implementation of the blob store (which will probably be always the case I assume), anyone can put NFS on it.
It’s a bit late to discuss the validity of proposing only S3 as recommendation for file sharing.
What I understand is that we’re not proposing to have only S3 as recommendation for file sharing. Users also have the option of using the file system + NFS if they can’t use S3 for some reason.
What is important to me is to continue to offer a solution for admins who don’t need/want to set up an S3 server. From what I see, this is the case, and we also have backward compatibility since we can always tell them to use the FS impl of the blob store (the default) + NFS.
So all seems good to me, except that I would very much prefer not to add weight to XS, which should remain minimal in terms of size and features. I really don’t like adding S3 support to it since that’s a very narrow use case that 99% of users won’t need. A separate distribution would work, but it would also be nice to avoid it if we can (it’s heavy to propose more distributions and maintain them). I don’t fully understand your answer re the S3 impl that cannot be installed as an extension, but it would be nice to find a solution that doesn’t require bundling it by default in XS nor having a separate distribution.
BTW, I would be OK with adding some manual instruction steps to add S3 support to an XS installation (like having to stop XWiki and manually set some configuration, or run some scripts or a migration), because the clustering use case is rare, and the need for S3 (e.g., if you’re using a k8s setup of XWiki) is even rarer.
Again, I’d like to point out that this is not what is documented, see above.
When we use the word “workaround”, it means that there’ll be some downsides and limitations but at https://www.xwiki.org/xwiki/bin/view/Documentation/AdminGuide/Clustering/ I don’t see any such limitations documented. Could we add them there and thus be able to explain why it’s not recommended?
I don’t understand this, and I don’t understand it conceptually either. For example, our JARs in WEB-INF/lib are core extensions, and we could have a process to install some more, after stopping XWiki and manually installing a XIP, for example. Then restart XWiki. So what’s the difference with bundling it by default?
Also, I don’t understand what you’re saying. If S3 is not installed, why would you need S3 to scan existing extensions? And if you install S3 and enable it, then a migration could move the extension metadata stored on the FS to S3.