Developing XWiki using LLMs

Hi devs,

Several of us have been exploring using LLMs to help develop the XWiki code base (and extensions). This thread is about sharing the experiments done and centralizing all the knowledge gathered around this.

I’ve also created https://design.xwiki.org/xwiki/bin/view/Proposal/DevelopingXWikiusingLLMs that we should use to add non-temporary learnings, this forum post being a discussion, not a knowledge base.

Let’s try to answer these topics:

  • List all the running experiments happening.
  • Sharing best practices using LLMs to code for XWiki and its extensions.
  • List problems and limitations found.

Experiment 1

I’ve been experimenting with Claude Code (on Opus 4.6) for about 1 month. It’s working great, usually finding solutions to my questions within 1 to 4 prompts.

I’ve asked to:

  • Directly fix Jira issues, providing a link from jira.xwiki.org. This has worked very well so far (I’ve tried it on about 5-6 issues).
    • It writes tests and they’re pretty good. It requires a bit of attention to fix some small mistakes, but overall the results are very good.
    • What’s not yet proven to work: I asked it to fix some flicker. I got a credible answer, but I don’t know how to validate that it’s correct yet :wink:
  • Perform local code refactoring (e.g. fixing checkstyle violations). This also works very well.

I’m currently slowly tuning some skill files.

Here’s for example my tests skill (located in ~/.claude/skills/tests/SKILL.md):

---
name: standards-for-tests
description: Best practices, rules and XWiki-specific testing framework documentation for writing tests for the XWiki code base.
---

When writing a test:
* Follow the test strategy at https://dev.xwiki.org/xwiki/bin/view/Community/Testing/#HTestingStrategy
* When writing unit tests for Java code, follow https://dev.xwiki.org/xwiki/bin/view/Community/Testing/#HJavaUnitTesting
* When writing unit tests for code using XWiki Rendering (like rendering macros), follow https://rendering.xwiki.org/xwiki/bin/view/Main/Extending#HAddingTests
* When writing unit tests for XWiki templates (.vm files) or XWiki pages (.xml files representing a wiki page), follow https://dev.xwiki.org/xwiki/bin/view/Community/Testing/ViewUnitTesting/
* When writing a functional test, follow https://dev.xwiki.org/xwiki/bin/view/Community/Testing/DockerTesting/
* For functional tests, follow the best practices defined at https://dev.xwiki.org/xwiki/bin/view/Community/Testing/#HBestPractices but also at https://dev.xwiki.org/xwiki/bin/view/Community/Testing/DockerTesting/
* For other types of tests, see https://dev.xwiki.org/xwiki/bin/view/Community/Testing/ which has sections for other types
* After writing a test, use Maven to verify that any test written works fine. However, if the test is a functional test, ask before executing Maven since there could be an already running XWiki instance locally on the developer's machine, and the test is supposed to start one too.
* Apply XWiki's general code best practices and code style when writing tests.
* Don't use @OldcoreTest when @ComponentTest is enough.
* After tests have been added, check whether the jacoco coverage threshold can be increased by running Maven with `-Pquality -Dxwiki.jacoco.instructionRatio=1.00`. This build should fail, but the error message reports the current coverage ratio, which can then be used to replace the current threshold value.

Location of XWiki test frameworks:
* Simple and component-based test framework: ~/dev/xwiki/xwiki-commons/xwiki-commons-tools/xwiki-commons-tool-test
* Rendering test framework: ~/dev/xwiki/xwiki-rendering/xwiki-rendering-test
* Oldcore test framework + docker test framework + page test framework and more: ~/dev/xwiki/xwiki-platform/xwiki-platform-core/xwiki-platform-test
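For example, the last rule of the skill works by parsing the failure message of the coverage check. A minimal sketch of that extraction, assuming the standard "Rule violated" wording of jacoco-maven-plugin (the log line below is a hypothetical sample, not output captured from an actual XWiki build):

```bash
# Hypothetical failure line from a `mvn ... -Pquality -Dxwiki.jacoco.instructionRatio=1.00` run;
# the exact wording is an assumption based on jacoco-maven-plugin's usual message.
log_line="Rule violated for bundle xwiki-platform-example: instructions covered ratio is 0.83, but expected minimum is 1.00"

# Pull out the actual coverage ratio, which can then replace the module's
# current xwiki.jacoco.instructionRatio threshold in its pom.xml.
current=$(echo "$log_line" | sed -n 's/.*covered ratio is \([0-9.]*\),.*/\1/p')
echo "$current"
```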

WDYT? I’ve tried to keep it small and make it an index, instead of packing lots of tokens into it.

What I’d like to do next is find some way for Claude to deploy changes to XWiki and test that they work. Right now I do the deployment myself and then use the Claude Chrome extension to ask Claude to navigate to XWiki and verify it works fine. I still need to tune and improve this.

Next step: I’d like to try opencode and/or openrouter using a less expensive model to see if I could use that for simple questions, and basically how well it fares vs CC.

Right now I’m on Claude Pro (20 euros/month), and I find that I hit the daily limits every day (or almost), though it’s just about OK for the weekly limits. With my growing usage, I think I won’t have enough tokens soon. I also need to learn to use fewer tokens.

FTR I’ve recently started using the plan mode more as I’ve learnt that it helps reduce token usage (maybe by 20%).

Ok that’s just a quick introduction to get the topic going!

What are you doing? :slight_smile:

Thx

Note: I’ve not yet tried to fine-tune my claude.md file; I generated it with /init. This is what my xwiki-platform one contains ATM:

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

XWiki Platform is a generic wiki platform offering runtime services for applications built on top of it. It is part of the XWiki.org software forge alongside XWiki Commons and XWiki Rendering, all sharing the same version (`18.3.0-SNAPSHOT`).

- **Issue Tracker:** https://jira.xwiki.org/browse/XWIKI (not GitHub Issues)
- **CI:** https://ci.xwiki.org with Develocity at https://ge.xwiki.org/scans
- **Dev Guide:** https://dev.xwiki.org/xwiki/bin/view/Community/

## Build Commands

```bash
# Standard build (no integration tests, fast)
mvn clean install -Plegacy,integration-tests,snapshot -Dxwiki.checkstyle.skip=true -Dxwiki.surefire.captureconsole.skip=true -Dxwiki.revapi.skip=true -DskipITs

# Build with integration tests
mvn clean install -Plegacy,integration-tests,snapshot -Dxwiki.checkstyle.skip=true -Dxwiki.surefire.captureconsole.skip=true -Dxwiki.revapi.skip=true

# Build a specific module
mvn clean install -pl xwiki-platform-core/xwiki-platform-<module> -Plegacy,snapshot

# Build distribution
mvn clean install -f xwiki-platform-distribution/pom.xml -Plegacy,integration-tests,snapshot

# Skip all tests
mvn clean install -DskipTests -Plegacy,snapshot
```

## Running Tests

```bash
# Run all unit tests for a module
mvn test -pl xwiki-platform-core/xwiki-platform-<module>

# Run a specific test class
mvn test -pl xwiki-platform-core/xwiki-platform-<module> -Dtest=MyTestClass

# Run a specific test method
mvn test -pl xwiki-platform-core/xwiki-platform-<module> -Dtest=MyTestClass#myMethod

# Run integration tests
mvn verify -pl xwiki-platform-core/xwiki-platform-<module> -Pintegration-tests
```

Functional (Docker-based) tests live in `xwiki-platform-distribution/xwiki-platform-distribution-flavor/xwiki-platform-distribution-flavor-test/`.

## Module Structure

```
xwiki-platform/
├── xwiki-platform-tools/        # Maven build plugins and tools
├── xwiki-platform-core/         # 100+ feature modules (main development area)
│   ├── xwiki-platform-oldcore   # Legacy core (com.xpn.xwiki.*) — central XWiki engine
│   └── xwiki-platform-<feature> # Feature modules (rendering, security, livedata, etc.)
└── xwiki-platform-distribution/ # WAR packaging, flavor, functional tests
```

Key core modules:
- **xwiki-platform-oldcore** — The central XWiki engine (`com.xpn.xwiki`). Avoid adding new code here; prefer new feature modules.
- **xwiki-platform-rendering-\*** — Content rendering pipeline
- **xwiki-platform-security-\*** — Authentication, authorization, crypto
- **xwiki-platform-extension-\*** — Extension/plugin management system
- **xwiki-platform-flamingo** — Default skin/theme
- **xwiki-platform-livedata** — Dynamic data table UI component
- **xwiki-platform-ckeditor / xwiki-platform-blocknote** — WYSIWYG editors

## Architecture Concepts

**Component System:** XWiki uses its own IoC container (from xwiki-commons). Components are declared with `@Component` and injected with `@Inject`. Role interfaces use `@Role`; `@Singleton` and `@InstantiationStrategy` control lifecycle.

**OSGi Manifests:** Each JAR has an OSGi manifest generated by the Maven Bundle Plugin (see `xwiki-platform-core/pom.xml`). Modules declare their extension features via `xwiki.extension.features` properties.

**Legacy Modules:** The `legacy` profile activates backward-compatibility shim modules (suffix `-legacy`). These re-export deprecated APIs and should not receive new logic.

**XWiki Context (`XWikiContext`):** A request-scoped object threaded through much of oldcore. New code should avoid it in favor of component-based APIs.

**Event System:** Components communicate via `ObservationManager`. Events extend `AbstractEvent`; listeners implement `EventListener`.

## Frontend

The project uses an Nx monorepo (pnpm workspace) for JavaScript packages:

```bash
# Install Node dependencies (run from repo root)
pnpm install

# Build all JS packages
pnpm run build

# Build a specific package (e.g., blocknote)
pnpm --filter @xwiki/blocknote run build
```

JS sources live inside their respective core modules (e.g., `xwiki-platform-livedata`, `xwiki-platform-ckeditor`, `xwiki-platform-blocknote`).

## Code Quality

- **Checkstyle:** Enforced by default; skip with `-Dxwiki.checkstyle.skip=true`
- **RevAPI:** Checks backward API compatibility; skip with `-Dxwiki.revapi.skip=true`
- **Console capture:** Surefire checks tests don't write to stdout; skip with `-Dxwiki.surefire.captureconsole.skip=true`

## Key Maven Profiles

| Profile | Purpose |
|---------|---------|
| `legacy` | Includes backward-compatibility modules (almost always needed) |
| `integration-tests` | Activates IT execution via Failsafe |
| `snapshot` | Enables XWiki snapshot repositories |
| `distribution` | Includes distribution packaging module |
| `clover` | Code coverage analysis |

One question I have is how to share claude.md or the skills between developers? Do we even want to share them?

Should we create a repo in the xwiki github org for that, and then somehow find ways to configure each LLMs to use the content of that repo?


The setup I’m using is that I have an xwiki directory that contains the xwiki-commons, xwiki-rendering and xwiki-platform directories. In this directory, I’m keeping AGENTS.md but also various other Markdown files with analysis results. I have also added a pom.xml there that I can import in IntelliJ to have a nicer view of “everything” and I’ve added a Maven configuration to enable all profiles I normally use, so the agent also always uses them.

I have a file for tests, with both some general XWiki-specific rules and some rules that are more personal taste than something we’ve discussed as a general rule. I found, however, that where I put the file, the agent doesn’t read it unless explicitly told to; nevertheless, it very quickly picks up all the necessary context, including things I haven’t specified, by reading related tests. So I think many rules are unnecessary and just bloat the context. I think we should avoid adding too much to such files, and instead only add rules as we discover that the agents really need them.

There are very few rules I’ve added and tried tweaking in AGENTS.md where I’m confident they are good. One example is teaching the agent to use modern Java (to counter the fact that it sees a lot of old Java code, which pushes it toward writing old-style Java). Another is how to rebuild modules after changes touching several modules, as the agent keeps forgetting to rebuild the production code before running UI tests. I have the feeling more tooling would be good for that. I would really like to have a single command that rebuilds everything that has changed and executes a sensible subset of UI tests (not the whole flamingo skin test module when a single test in it has changed, for example). Similarly, we should have a single command that updates a demo instance with changes, without a restart if possible and with an automated restart otherwise, so the agent can live-test changes. Of course, such tools would also be valuable for human developers.
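The "rebuild what changed" idea could be sketched as a small pipeline over `git diff --name-only`, feeding the result into Maven's `-pl`/`-amd` flags. Everything below is a hypothetical sketch; the file paths are simulated stand-ins rather than real diff output:

```bash
# Simulated output of `git diff --name-only HEAD` (hypothetical file paths):
changed_files="xwiki-platform-core/xwiki-platform-livedata/src/main/js/logic.js
xwiki-platform-core/xwiki-platform-oldcore/src/main/java/Bar.java
xwiki-platform-core/xwiki-platform-livedata/pom.xml"

# Reduce file paths to top-level module directories, deduplicated,
# joined with commas as Maven's -pl option expects.
modules=$(echo "$changed_files" | cut -d/ -f1-2 | sort -u | paste -sd, -)
echo "$modules"

# The real rebuild would then be something like (not executed here):
#   mvn -q install -pl "$modules" -amd
```

`-amd` ("also make dependents") would pull in modules that depend on the changed ones, which is what the agent keeps forgetting to do by hand.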

Yes, I’ve read this too, and it’s what the lead dev of Claude Code also recommends: regularly start from scratch, i.e. wipe out the claude.md file and only add things back when necessary, since the LLMs keep improving.

One thing I’m currently trying (and so far so good) is to use the superpowers plugin for claude code.

See also a video about it at https://youtu.be/4XqVR6xI6Kw?si=0zq7tV4Tfept4swg

Having a repository of skills would also be helpful for XWiki users imho, plus administrators and developers of custom extensions. For example:

  • Issue solver skill:
    • Search existing solutions to an XWiki issue across xwiki.org, the forum (possibly using a Discourse MCP?), Jira, the code base, the Web, etc.
    • Propose solution
    • Publish a detailed bug report if the issue qualifies as a new, not-yet-reported bug.
  • Upgrader skill
  • AWM creator skill: turn a list of fields into an AWM XAR
  • Spec creator: turn functional requirements into various architecture options and a detailed technical specification
  • Custom extension code reviewer: inspect code practices, security, code quality, propose code simplifications
  • Extension creator
  • Documenter skill

For the record, here are some repositories and projects that could be inspirational: I came across some projects creating a chatbot for specific code bases, such as Google Codewiki.

I also have a skill to deploy extensions in my running XWiki instance that works well.

In my general flow, Claude Code fixes issues, rebuilds with Maven, deploys the changes to my running XWiki instance, asks me to restart XWiki, and then uses the Chrome plugin to verify that the changes work fine by navigating.

FWIW here’s my ~/.claude/skills/extension/SKILL.md (which works when developing an extension or working on the core):

To deploy an XWiki extension (XAR, JAR), to a running XWiki install, do the following:
0) If the extension is a core extension (i.e. located in webapps/xwiki/WEB-INF/lib), simply replace the JAR with the new one, skip the next steps, and ask the dev to restart XWiki.
1) Find the id of the extension to deploy: it's the maven groupId followed by ":", followed by the maven artifactId of the extension, which you can find in the pom.xml
2) Also find the version in the version property in the pom.xml
3) Generate an XML file named installjobrequest.xml in the target dir of the extension to deploy, exactly like the following one (replace the strings "id here" and "version here" with the extension id and the version value):

```
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobRequest xmlns="http://www.xwiki.org">
  <id>
    <element>extension</element>
    <element>provision</element>
    <element>796fb04f-b095-4db8-a3ec-fa03f22051f8</element>
  </id>
  <interactive>false</interactive>
  <remote>false</remote>
  <verbose>true</verbose>
  <property>
    <key>extensions</key>
    <value>
      <list xmlns="" xmlns:ns2="http://www.xwiki.org">
        <org.xwiki.extension.ExtensionId>
          <id>id here</id>
          <version class="org.xwiki.extension.version.internal.DefaultVersion" serialization="custom">
            <org.xwiki.extension.version.internal.DefaultVersion>
              <string>version here</string>
            </org.xwiki.extension.version.internal.DefaultVersion>
          </version>
        </org.xwiki.extension.ExtensionId>
      </list>
    </value>
  </property>
  <property>
    <key>extensions.excluded</key>
    <value>
      <set xmlns="" xmlns:ns2="http://www.xwiki.org"/>
    </value>
  </property>
  <property>
    <key>interactive</key>
    <value>
      <boolean xmlns="" xmlns:ns2="http://www.xwiki.org">false</boolean>
    </value>
  </property>
  <property>
    <key>namespaces</key>
    <value>
      <list xmlns="" xmlns:ns2="http://www.xwiki.org">
        <string>wiki:xwiki</string>
      </list>
    </value>
  </property>
</jobRequest>
```
4) If the install fails because the extension is already installed, first uninstall it by generating an XML file named uninstalljobrequest.xml in the target dir, exactly like installjobrequest.xml but without the `extensions.excluded` and `interactive` properties, then run:
`curl -i --user "Admin:admin" -X PUT -H "Content-Type: text/xml" "http://localhost:8080/xwiki/rest/jobs?jobType=uninstall&async=false" --upload-file uninstalljobrequest.xml`
Then re-run the install curl command.
5) Run `curl -i --user "Admin:admin" -X PUT -H "Content-Type: text/xml" "http://localhost:8080/xwiki/rest/jobs?jobType=install&async=false" --upload-file installjobrequest.xml`
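Steps 1 and 2 of the skill (finding the extension id and version) can be sketched with plain text extraction. The pom below is a hypothetical minimal sample; a real pom often inherits groupId or version from its parent, in which case `mvn help:evaluate` is more reliable than grepping:

```bash
# Hypothetical minimal pom.xml for illustration only; real poms usually
# inherit groupId/version from a parent, where `mvn help:evaluate` works better.
cat > /tmp/sample-pom.xml <<'EOF'
<project>
  <groupId>org.xwiki.contrib</groupId>
  <artifactId>example-extension</artifactId>
  <version>1.0-SNAPSHOT</version>
</project>
EOF

# Extension id = groupId:artifactId (step 1); version comes from step 2.
group=$(sed -n 's/.*<groupId>\(.*\)<\/groupId>.*/\1/p' /tmp/sample-pom.xml | head -1)
artifact=$(sed -n 's/.*<artifactId>\(.*\)<\/artifactId>.*/\1/p' /tmp/sample-pom.xml | head -1)
version=$(sed -n 's/.*<version>\(.*\)<\/version>.*/\1/p' /tmp/sample-pom.xml | head -1)
echo "${group}:${artifact} ${version}"
```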

I would recommend telling the agent to use -q and to pipe the output through tail, to avoid the agent pulling the whole (largely useless) build output into its context. The LLMs know where to find and how to read the test reports in case the output is truncated, and they can then use tools like grep to read only the relevant parts.
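A sketch of that suggestion, with a stand-in command simulating the noisy build so the effect is visible (the Maven module path in the comment is hypothetical):

```bash
# The real invocation would look something like (module path hypothetical):
#   mvn -q clean install -pl xwiki-platform-core/xwiki-platform-example 2>&1 | tail -n 50
# Stand-in: of 200 lines of "build output", only the tail reaches the agent's context.
seq 1 200 | tail -n 3
```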

Hello! I’ve been researching quite a bit of AI-related stuff lately. I don’t contribute code to XWiki, so take what I recommend with a grain of salt, but here are some of the best practices I’ve seen so far:

  • Try to minimize token usage as much as possible. I’ve seen some benchmarks that show that after ~50% context usage the models tend to become really dumb.
  • You can use https://skills.sh/ to browse agent skills. You can install them locally with the Node.js CLI, and they’re versioned so you can also update them easily.
  • There are multiple Agent Harnesses you can use (e.g.: Claude Code, OpenAI Codex, OpenCode, Pi Agent, and many others). Some are better than others. If you can, you should try multiple and see which is better for your usecase.
  • Try to keep CLAUDE.md & AGENTS.md files minimal, ideally handwritten rather than AI generated. Update them when agents misbehave.
  • Be careful with .env files!! Agents can and may read them. Once they do, those secrets are on somebody’s server. There are tools that do secret management, but I couldn’t identify clear consensus around one particular such tool.
  • Some agents are better at some tasks than others (e.g.: from what I’ve seen online, the latest Anthropic model is excellent at problem-solving but quite bad at UI). You should try out multiple models to find their strengths & weaknesses. For slightly less complex tasks, some open-source models are surprisingly good (OpenCode has a $10 subscription that offers access to multiple open-source models, which may be a good way of trying them out).
  • Some tools offer LSP integration (e.g. OpenCode); this can pollute the context and should be disabled. Instead, mention in the AGENTS file that the agent should run a type-checking command at the end of its task.

Repos with skills seem to be the standard. Skills can be installed and updated through `npx skills add <owner/repo>` and `npx skills update`. For AGENTS.md files, you could offer a starting point (with common commands to run, directory structure, and other things the agent may find useful), but it should be minimal, and devs should update them with whatever edge cases they run into during their daily activities. For some devs it may make sense to have the agent check certain things during development, while for others those checks may be redundant.

I think these are all my recommendations so far. If I run into anything else, I’ll make sure to mention it here.

Note: The industry is moving at a rapid pace. The resources & recommendations mentioned here may be outdated in a couple of months. Keeping AI development practices useful requires constant updates.