Hello all,
Yjs is a widely adopted library for the real-time synchronization of collaboratively edited data structures (a CRDT implementation).
It is commonly used to support the real-time features of most maintained editor libraries.
As we are moving away from CKEditor 4, support for Yjs will soon be an important feature for XWiki.
Note: we currently use Netflux, but as far as I know we are the only project using this protocol. And I believe it’s easier to add support for another protocol than it is to adapt an editor to a new real-time protocol.
The easiest option would be to run a third-party server. Many Yjs server implementations exist (see Connection Provider | Yjs Docs), e.g. based on WebSocket or WebRTC.
But while this is the easiest option in terms of development, it makes operating an XWiki instance much more difficult, especially because the new Yjs server needs to be accessible from the internet.
We could make this easier by automatically starting a server from a Docker container. But we know many XWiki users have policies that prevent using this option, so it cannot be the only one provided.
The other option is to use a Java implementation of a server compatible with the Yjs connectors. Sadly, to the best of my knowledge, no such Java implementation is available.
Looking at the existing servers, the best alternative seems to be a re-implementation of y-websocket-server.
While the implementation seems relatively easy to port on the surface, I’m afraid of the hidden complexity pulled in from its other JavaScript dependencies.
Thanks
I started using the Yjs WebSocket end-point you implemented to have real-time synchronization in the BlockNote editor, and I discovered it has an important limitation compared to yjs/y-websocket-server (which can be seen as the reference implementation): it doesn’t hold an authoritative (source of truth) Yjs document.
Yjs is network agnostic, but its wire protocol includes a synchronization phase in which a client connecting for the first time, or re-connecting, is supposed to “merge” its local changes before pushing new ones. This is not strictly mandatory (a client can connect and push new changes immediately), but without this synchronization step clients can overwrite each other’s changes or get duplicate operations. See How Yjs Sync Actually Works: The y-protocols Wire Format | M. Palanikannan for a nice presentation. In a decentralized (peer-to-peer) network, the exchange goes like this:
Alice: this is my vector (clock) state, what changes am I missing?
Bob: you’re missing these changes […]
Bob: this is my vector (clock) state, what changes am I missing?
Alice: you’re missing these changes […]
Basically, when a client (re)connects it has to merge its (offline) changes with all the other connected clients. In a centralized network, the (re)connecting client has to sync only with the server.
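For concreteness, the handshake messages above use lib0’s unsigned-varint framing: a sync step 1 message is the sync message type, the sync-step-1 subtype, and the length-prefixed state vector. Here is a minimal sketch of that framing in Java — the class and method names are my own, and the message-type constants are what y-websocket and y-protocols use to the best of my knowledge:

```java
import java.io.ByteArrayOutputStream;

// Sketch of the y-protocols wire framing for sync step 1.
// Assumed constants: messageSync = 0 (y-websocket), messageYjsSyncStep1 = 0 (y-protocols).
public class YjsFraming {
    // lib0 unsigned varint: 7 bits per byte, high bit set on all but the last byte.
    public static byte[] encodeVarUint(long n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (n > 0x7F) {
            out.write((int) (0x80 | (n & 0x7F)));
            n >>>= 7;
        }
        out.write((int) n);
        return out.toByteArray();
    }

    // Frame "this is my state vector, what am I missing?":
    // [messageSync][messageYjsSyncStep1][varUint length][state vector bytes]
    public static byte[] frameSyncStep1(byte[] stateVector) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(encodeVarUint(0)); // messageSync
        out.writeBytes(encodeVarUint(0)); // messageYjsSyncStep1
        out.writeBytes(encodeVarUint(stateVector.length));
        out.writeBytes(stateVector);
        return out.toByteArray();
    }
}
```

The peer answers with a sync step 2 message (subtype 1) carrying exactly the update that the state-vector diff implies — which is the part the server can only compute if it holds a document.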
For the BlockNote integration we’re using the centralized approach, where clients (BlockNote editor instances) use yjs/y-websocket to connect to the Yjs WebSocket end-point. But without holding an authoritative (source of truth) Yjs document, the WebSocket end-point can’t perform the initial synchronization step. I considered the following options:
- Forward the sync messages to the rest of the clients. This poses some problems:
  - The connecting (y-websocket) client sends a single sync step 1 message, and it marks itself as synchronized as soon as it receives the first sync step 2 message, which can come from any of the already connected clients. There’s no guarantee that the first client to reply with sync step 2 is the most up to date.
  - There’s an edge case where Alice connects while Bob is the only one present in the room, but as soon as Alice sends sync step 1, Bob disconnects, leaving Alice waiting forever for someone to respond with sync step 2.
  - So even if you can connect to the Yjs WebSocket end-point, there is no guarantee that you will be able to perform the initial sync step, and even if you do, there’s no guarantee that you will be up to date afterwards.
- Hold a Y.Doc server-side using the Yjs API. The problem is that there is no Java binding for the Yjs API. The suggestions I got from Copilot in this direction were:

Option 1: y4j — Yrs (Rust) JVM bindings
y4j provides JVM bindings to Yrs, the Rust port of Yjs. This gives you a real Doc object with proper sync protocol support.
Pros: Full Y.Doc semantics, correct state vector diffing, awareness ownership
Cons: Requires native .so/.dll JARs per platform (linux-x86_64, linux-aarch64, etc.), adds a native dependency to the XWiki platform, which is unusual for an XWiki module
If you go this route, Doc, StateVector, Update, and Awareness are exposed. You'd hold one Doc per DefaultRoom and use it to:
* Answer sync step 1 with the missing diff
* Apply incoming sync step 2 updates
* Broadcast the merged update to other sessions
Option 2: Node.js sidecar via xwiki-platform-node
Since XWiki already has xwiki-platform-node infrastructure, you could run a minimal y-websocket-server-compatible Node.js process per JVM instance and have the Java WebSocket endpoint delegate Y.Doc state management to it via a local HTTP/REST or IPC call.
Architecture:
Browser ←WebSocket→ YjsEndpoint.java (relay) ←HTTP/localhost→ Node.js y-doc-service (Y.Doc per room, sync protocol)
Pros: Reuses the mature y-websocket-server ecosystem, no native JARs
Cons: Operational complexity, two processes, latency on every sync operation
Option 3: GraalVM polyglot
Run Yjs JavaScript directly via GraalVM's polyglot API (Context.create("js")). Only viable if XWiki is deployed on GraalVM, which is not the standard deployment.
- Record all update messages going through the WebSocket end-point and replay them when a sync step 1 message is received. This simulates the synchronization phase. The problems are:
  - We’re always sending all the updates, not just the ones the client is missing. This means that connecting to an existing session with lots of changes could be slow (due to the network traffic needed to fetch all the updates, and to the time needed to apply them one by one to the local Y.Doc).
  - The WebSocket end-point might need a lot of memory, in extreme cases, to hold all the updates for all the active sessions. The binary format used by Yjs is pretty compact though, so you’d have to have very long sessions with tons of changes to hit this problem.
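To make the record-and-replay idea concrete, here is a minimal sketch — the class name and callback shape are illustrative, not XWiki’s actual code; it relies on Yjs applying updates idempotently on the client side, so replaying updates a client already has is safe:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of option 3: the room records every update message it
// relays and replays the whole history to a client that sends sync step 1.
public class ReplayRoom {
    private final List<byte[]> updateHistory = new ArrayList<>();

    // A client sent a Yjs update: record it, then broadcast it to the others.
    public synchronized void onUpdate(byte[] update, List<Consumer<byte[]>> otherClients) {
        updateHistory.add(update);
        for (Consumer<byte[]> client : otherClients) {
            client.accept(update);
        }
    }

    // A (re)connecting client sent sync step 1: instead of computing a diff
    // (which would require a server-side Y.Doc), replay the full history.
    public synchronized void onSyncStep1(Consumer<byte[]> connectingClient) {
        for (byte[] update : updateHistory) {
            connectingClient.accept(update);
        }
    }

    public synchronized int historySize() {
        return updateHistory.size();
    }
}
```

The replay path is where the two listed problems show up: the loop sends the entire history regardless of the client’s state vector, and `updateHistory` grows for the lifetime of the room.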
For now, I implemented option 3. Let me know if you think that is not a good idea or if you think there are better alternatives.
Thanks,
Marius
I agree option 2 would be the best by far, if only a Java implementation were available.
I gave it a try a year ago and it does not seem trivial.
Until option 2 is available (we could create an improvement issue for it and reference it in the code), I agree option 3 is the best.
Is there a point when we can consider the session to be complete and free the server from the updates?
At the moment the room (collaboration session) is cleared / destroyed when the last client disconnects. I also added a config to limit the amount of memory used by a room to keep the update history, to avoid misuse / abuse. When the limit is reached all clients are disconnected, and the room is destroyed. The clients will re-connect automatically to a new room. Depending on which client re-connects first, some changes may get overwritten or duplicated. It should be fine most of the time, if the connection is good and most of the clients are up-to-date before the limit is reached.
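The memory cap could look roughly like this — the accounting, limit, and destroy callback are my own assumptions, not the actual XWiki configuration:

```java
// Sketch of the per-room memory limit described above: track the byte size of
// recorded updates and destroy the room once a configured limit is exceeded,
// which disconnects all clients so they re-connect to a fresh room.
public class BoundedHistory {
    private final long maxBytes;
    private long usedBytes = 0;
    private final Runnable destroyRoom; // disconnects all clients, frees the history

    public BoundedHistory(long maxBytes, Runnable destroyRoom) {
        this.maxBytes = maxBytes;
        this.destroyRoom = destroyRoom;
    }

    // Returns false if recording this update pushed the room over the limit.
    public synchronized boolean record(byte[] update) {
        usedBytes += update.length;
        if (usedBytes > maxBytes) {
            destroyRoom.run();
            return false;
        }
        return true;
    }
}
```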
There is of course also the case when, due to connection issues, all clients disconnect temporarily, so the room gets destroyed. The first client that re-connects to the WebSocket end-point re-creates the room. Again, there could be some sync issues (like recent changes being overwritten) when the rest of the clients re-connect, but I think it’s acceptable.
Thanks,
Marius