REST API: how to append new text to a page without deleting the page's existing content?

shafigh · August 3, 2023, 3:56am

Hello, I have a little question:

I have written a small Python script that writes the content of a .txt file to a wiki page:

import requests

url = "http://localhost:8080/xwiki/rest/wikis/xwiki/spaces/Sandbox/pages/TestPage1"
file_path = r"example.txt"

username = "user"
password = "pass"

headers = {
    "Content-Type": "text/plain",
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36'

}

with open(file_path, "rb") as file:

    response = requests.put(url, headers=headers, data=file, auth=(username, password), allow_redirects=True)

if response.status_code == 201:
    print("The Page was successfully created.")
elif response.status_code == 202:
    print("The page was successfully updated.")
elif response.status_code == 304:
    print("The Page was not modified.")
elif response.status_code == 401:
    print("The user is not authorized.")
else:
    print("Error uploading the data:", response.status_code)

The code works: it writes the text of the specified file to the page. However, I have realized that it does not add to the page. It replaces whatever was on the page before with the new text from the file. Is there a way to keep the old page content and append or merge the new text instead?

I thought I could solve this by first sending a GET request to fetch the old content of the page, writing it to a file, appending the new text to that file, and then writing the result to the page. That sounds plausible in theory, so I modified my code a little:

import requests

url = "http://localhost:8080/xwiki/rest/wikis/xwiki/spaces/Sandbox/pages/TestPage1"
new_file_path = r"new_content.txt"
old_file_path = r"old_content.txt"

username = "Admin"
password = "admin"

header = {
    "Content-Type": "text/plain",
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36'

}


# GET-Request for old content
response = requests.get(url, headers=header, auth=(username, password))

if response.status_code == 200:
    # save old content to file
    with open(old_file_path, "w") as old_file:
        old_file.write(response.text)
else:
    print("Error:", response.text)

# add the new text to the old text
with open(new_file_path, "r") as file:
    content = file.read()
    with open(old_file_path, "a") as old_file:
        old_file.write(content)

# PUT-request to write it all to the page
with open(old_file_path, "rb") as file:
    response = requests.put(url, headers=header, data=file, auth=(username, password), allow_redirects=True)

if response.status_code == 201:
    print("Page created successfully.")
elif response.status_code == 202:
    print("Page updated successfully.")
elif response.status_code == 304:
    print("Page not modified.")
elif response.status_code == 401:
    print("User not authorized.")
else:
    print("Error:", response.text)

This code runs and more or less does what it is supposed to. The problem is that the GET request returns a very long XML document describing the whole page, whereas I only want the written content of the page as plain text, not an XML document full of other elements.
Does anyone know how to resolve this, or know another way to add new text to the page without deleting the old text?

Thank you!

nikpetrenko · August 7, 2023, 4:57pm

Yes, you can do it. The page’s content is the text inside the <content>...</content> element of the response. You can parse the XML response in Python with any XML library (for instance BeautifulSoup, which has to be installed with pip) and read that element.
I also noticed that this block runs almost every time, because in most cases the GET request returns status 200:

if response.status_code == 200:
    # save old content to file
    with open(old_file_path, "w") as old_file:
        old_file.write(response.text)
else:
    print("Error:", response.text)

It would be better to wrap it in a function. Also, there is no need to pass "r" in with open(new_file_path, "r") as file: because open() uses read mode by default when no mode is specified.
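
As an illustration of the parsing step described above, here is a minimal sketch that pulls the text out of the <content> element. It uses the standard library's xml.etree.ElementTree rather than BeautifulSoup, so nothing needs to be installed. The function name extract_page_content and the shortened sample document are made up for this example; a real response contains many more elements.

```python
import xml.etree.ElementTree as ET


def extract_page_content(xml_text):
    """Return the text of the first <content> element, or "" if absent or empty."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Namespaced tags look like "{namespace}content"; compare only the
        # local name so the sketch does not depend on the namespace URI.
        if element.tag.rsplit("}", 1)[-1] == "content":
            return element.text or ""
    return ""


# Hypothetical, heavily shortened response body, for illustration only.
sample = (
    '<page xmlns="http://www.xwiki.org">'
    "<title>TestPage1</title>"
    "<content>Old text on the page.</content>"
    "</page>"
)
old_content = extract_page_content(sample)
```

In the script from the question, the argument would be the body of the GET response instead of the sample string.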

shafigh · August 9, 2023, 6:47am

Hi,
I solved this a few days ago and forgot to close the question, but thank you anyway!
I used GET to fetch the XML of the page and parsed it with “xml.etree.ElementTree” to get the text inside the <content> element. I wrote that text to a text file, appended the new text to the same file, and then used PUT to “upload” that file to the wiki page.
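
For readers who want to follow the same approach, the steps could look roughly like the sketch below. It is not the exact code used here: it builds the combined text in memory instead of in a file, the function names are invented for the example, and the session argument is assumed to behave like a requests.Session with authentication already configured. Treating a 404 on GET as "page does not exist yet" is also an assumption.

```python
import xml.etree.ElementTree as ET


def extract_page_content(xml_bytes):
    """Return the text of the first <content> element, or "" if there is none."""
    for element in ET.fromstring(xml_bytes).iter():
        if element.tag.rsplit("}", 1)[-1] == "content":
            return element.text or ""
    return ""


def append_to_page(session, url, new_text):
    """GET the page, append new_text to its content, and PUT the result back.

    `session` is assumed to behave like requests.Session: get() and put()
    return objects with a status_code and a content attribute.
    """
    response = session.get(url)
    if response.status_code == 200:
        old_text = extract_page_content(response.content)
    elif response.status_code == 404:
        # Assumption: the page does not exist yet, so start from nothing.
        old_text = ""
    else:
        # Stop here instead of overwriting the page with partial data.
        raise RuntimeError(f"GET failed with status {response.status_code}")

    combined = f"{old_text}\n{new_text}" if old_text else new_text
    response = session.put(
        url,
        data=combined.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
    )
    # 201, 202 and 304 are the non-error statuses the original script checks.
    if response.status_code not in (201, 202, 304):
        raise RuntimeError(f"PUT failed with status {response.status_code}")
    return combined
```

With requests installed, this could be called with a requests.Session() whose auth attribute is set to (username, password), passing the page URL from the scripts above and the text read from the new file.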

Also, I knew that “r” stood for read, but I didn’t know that read mode is used by default, so I’ll keep that in mind from now on!

Again, thank you for your efforts.
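
The point about the default mode can be checked with a small sketch. The temporary file exists only to make the example self-contained, and the variable names are made up for the illustration.

```python
import os
import tempfile

# Write a throwaway file so the sketch does not depend on existing files.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("new text\n")
    path = tmp.name

# No mode argument is given, so open() falls back to text read mode.
with open(path, encoding="utf-8") as file:
    default_mode = file.mode
    content = file.read()

os.remove(path)
```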