We have a production instance of XWiki v7.0.1 running on MariaDB 10.1. Unfortunately, the instance has been in service for many years, so the situation is “too big to fail”.
I am trying to upgrade XWiki to the recent v13.10.9 release, but our database encoding remains “latin1”, while the newer versions use “utf8”. In addition, the Release Notes indicate test results only with HyperSQL, MySQL, Oracle, and PostgreSQL. So MariaDB is a non-standard configuration, although it has been running here for years.
Before testing further, we would rather move the existing data from MariaDB to one of the supported databases, so that any issue we hit is not a false positive caused by our non-standard database.
Do we have a way to do this? Please point out if I have missed something on the official documentation website.
Thank you for pointing out the minor typo: “utf8” should be “utf8mb4”.
However, the issue still exists in a test scenario using MariaDB v10.9.2, starting from XWiki v7.0.1.
First, if we create the database in utf8mb4 with the SQL command create database xwiki default character set utf8mb4 collate utf8mb4_bin;, then XWiki v7.0.1 fails at 12% of initialization with the error message “Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535.”
Second, if we create the database in latin1, XWiki v7.0.1 works OK, and I guess this is why the previous engineer configured the system this way. However, upgrading to XWiki v13.10.9 does not change the column definitions, and if we then try to change the encoding to utf8mb4, the same error is triggered as above (the conversion attempt is sketched below).
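For completeness, a rough sketch of the two statements involved; the exact collations we used may have differed, so treat these as assumptions:

create database xwiki default character set latin1;
-- later, the conversion attempt that triggers the same "Row size too large" error:
alter table activitystream_events convert to character set utf8mb4 collate utf8mb4_bin;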
Partial exception trace of the error at 12% initialization:
...
Caused by: org.hibernate.HibernateException: Failed updating schema while executing query [create table
activitystream_events (ase_eventid varchar(48) not null, ase_requestid varchar(48), ase_stream
varchar(255), ase_date datetime, ase_priority integer, ase_type varchar(255), ase_application
varchar(255), ase_user varchar(255), ase_wiki varchar(255), ase_space varchar(255), ase_page
varchar(255), ase_hidden bit, ase_url varchar(2000), ase_title varchar(2000), ase_body varchar(2000),
ase_version varchar(30), ase_param1 varchar(2000), ase_param2 varchar(2000), ase_param3
varchar(2000), ase_param4 varchar(2000), ase_param5 varchar(2000), primary key (ase_eventid))
ENGINE=InnoDB]
...
... 20 common frames omitted
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Row size too large. The
maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead,
check the manual. You have to change some columns to TEXT or BLOBs
...
That’s because utf8mb4 takes more room than utf8 (utf8mb3), and the schema back then was optimized with utf8 in mind (so converting to utf8 should work fine). You could upgrade with latin1 and change the encoding afterwards (once the schema has been migrated to something compatible with utf8mb4).
That works as long as you don’t use any character which is not supported by latin1 (such as non-Latin characters). The reason a database is in latin1 is generally that it used to be the MySQL/MariaDB default, and whoever created the database did not read the XWiki recommendation properly.
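Rough arithmetic against the activitystream_events table above illustrates the limit (worst-case byte counts only, ignoring length bytes and the fixed-size columns):

8 varchar(2000) columns (ase_url, ase_title, ase_body, ase_param1-5):
  utf8mb4: 8 x 2000 x 4 = 64,000 bytes; utf8mb3: 8 x 2000 x 3 = 48,000 bytes
remaining varchar columns (7 x 255 + 2 x 48 + 30 = 1,911 characters):
  utf8mb4: 64,000 + 1,911 x 4 = 71,644 bytes, over the 65,535 limit
  utf8mb3: 48,000 + 1,911 x 3 = 53,733 bytes, within the limit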
I made the first successful upgrade test toward a smaller goal, v7.1.4. I am in the process of further testing toward v13.10.9.
Charset latin1 is non-standard for sure, so before the upgrade, I converted it to utf8 (not utf8mb4, for the reasons given above). The steps, with a command sketch after the list:
Dump the production data from v7.0.1 in text format, keeping only the INSERT statements; no DDL
On a virtual machine, install a blank new instance of the same version, v7.0.1, on a database created in utf8. Then dump that database, keeping only the DDL; no INSERT statements
On the virtual machine, restore the database from the dumped DDL, so we get a purged, empty database with its structures in the utf8 character set
Import the data with the INSERT statements dumped from production, so now we have the production data in utf8
Start up Tomcat and verify the utf8 instance works OK, then start the upgrade with the WAR file
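A minimal sketch of those steps, assuming a database named xwiki and root credentials; hosts, users, and the utf8_bin collation are assumptions to adjust per setup:

# step 1: data-only dump from production (INSERT statements, no DDL)
mysqldump -u root -p --no-create-info --single-transaction xwiki > xwiki-data.sql
# step 2: DDL-only dump from the blank v7.0.1 instance installed on utf8
mysqldump -u root -p --no-data xwiki > xwiki-ddl.sql
# step 3: recreate an empty utf8 database and restore the structures
mysql -u root -p -e "drop database xwiki; create database xwiki default character set utf8 collate utf8_bin;"
mysql -u root -p xwiki < xwiki-ddl.sql
# step 4: import the production data
mysql -u root -p --default-character-set=utf8 xwiki < xwiki-data.sql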
With the above preprocessing, the upgrade was smooth, and I got the differences below by comparing the DDLs before and after.
In the previous test of upgrading to v13.10.9, we did not see any difference in the DDLs. So I guess the upgraded Java code includes some automatic data conversions, and maybe they failed because of our non-standard configuration. Right?
My Question:
How do we verify these database conversions were correctly applied? And do we call them migrations, as in the Python ecosystem?
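For now, the only check I have is diffing schema-only dumps taken before and after the upgrade, roughly like this (same assumptions as the sketch above):

mysqldump -u root -p --no-data xwiki > ddl-before.sql
# ... run the XWiki upgrade ...
mysqldump -u root -p --no-data xwiki > ddl-after.sql
diff ddl-before.sql ddl-after.sql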
By the way, the earlier versions, e.g. v7.0.1 and v7.1.4, do not officially support MariaDB, and the file hibernate.cfg.xml does not contain any entry for MariaDB. However, it works when a MariaDB URI is added to the MySQL section. So we will keep this setting until we reach a later version with official support.
I would highly recommend using the MariaDB connector and not the MySQL one, especially before applying the upgrade to 13.10.9. One reason is that Hibernate identifies MariaDB as the very old MySQL 5.5 when using the MySQL connector, which can affect the way queries are generated and also affect some migrations (for example, XWiki won’t increase the maximum size of short string fields because the new size is not supported by MySQL 5.5).
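For reference, the switch in hibernate.cfg.xml is roughly the change below; a sketch assuming the standard Hibernate property names and a local database named xwiki, with the mariadb-java-client JAR placed in WEB-INF/lib:

<property name="connection.url">jdbc:mariadb://localhost/xwiki</property>
<property name="connection.driver_class">org.mariadb.jdbc.Driver</property>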
Do we remember from which version XWiki started officially supporting MariaDB, i.e., using the dedicated MariaDB connector instead of borrowing MySQL’s?
The good news is that I have reached v7.4.6 on the testing virtual machine. However, going in small steps feels too slow, so I hope to take bigger steps toward a faster upgrade path.
Also, thank you for your replies to my other posts. I will append more details here as testing progresses.
Good news: with the database in charset utf8, I have successfully reached the upgrade target for the first time.
Comparing the database DDLs of the start and target versions, we can see the data type varchar(2000) becoming longtext, among other changes. As attached below, the diff shows about 300 lines of differences, which means the automatic database changes are functioning.
The above path is only the first happy pass, and it contains extra intermediate hops for proof-of-concept purposes. We are running more tests to finalize a faster and more reliable change plan.
By the way, while at v7.4.6, we attempted to upgrade to v8.4.6 and v8.4, but both failed. If this is of interest, see this post.
We will convert the database again, from charset utf8 to utf8mb4, and this part should be easier; a sketch follows.
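My current plan, untested and therefore only an assumption on my side, is to alter the database default and generate one conversion statement per table from information_schema:

alter database xwiki character set utf8mb4 collate utf8mb4_bin;
-- generate the per-table statements, then run the generated output:
select concat('alter table ', table_name, ' convert to character set utf8mb4 collate utf8mb4_bin;')
from information_schema.tables where table_schema = 'xwiki';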
Future question:
We aim to use the Docker image, so I will submit another question shortly. Thank you again for your help and patience.