I want to load an HTML document in PrizmDoc with UTF-8 encoding. Can this be done automatically in the product?
Currently, no. There is a parameter for .txt files that does this (detailed here), but “textFileEncoding” intentionally works only for .txt files, not .html files. There is a feature request for this.
In the meantime, this can be fixed manually by adding charset="utf-8" to the meta tag of the HTML document. One proof-of-concept way to do this programmatically is shown below in Python 3.7 (it needs obvious polishing, such as checking whether the tag already exists, handling multiple “meta” tags, etc.):
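# Read the whole HTML file into memory.
with open(filename, "r") as file:
    content = file.read()
# Insert the charset attribute right after the first occurrence of "meta".
index = content.find("meta") + len("meta")
new_content = content[:index] + " charset=\"utf-8\" " + content[index:]
# Write the modified content back to the same file.
with open(filename, "w") as file:
    file.write(new_content)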
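The polishing mentioned above could look something like the following. This is only a minimal sketch, not part of PrizmDoc: the names ensure_utf8_meta and add_charset_to_file are hypothetical, the regular expressions are a rough heuristic rather than a real HTML parser, and the file is assumed to already be saved as UTF-8. It leaves the document alone if any meta tag already declares a charset, and otherwise adds a standalone meta tag after the opening head tag (falling back to the opening html tag, or the start of the file).

```python
import re

# Heuristic patterns (assumptions for this sketch, not a full HTML parser).
# Matches any <meta ...> tag that already declares a charset.
_CHARSET_RE = re.compile(r"<meta\b[^>]*\bcharset\s*=", re.IGNORECASE)
_HEAD_RE = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
_HTML_RE = re.compile(r"<html\b[^>]*>", re.IGNORECASE)

META_TAG = '<meta charset="utf-8">'


def ensure_utf8_meta(html: str) -> str:
    """Return html with a UTF-8 meta tag added, unless a charset is already declared."""
    if _CHARSET_RE.search(html):
        # A charset is already declared (possibly not UTF-8); leave it untouched.
        return html
    # Prefer inserting right after <head>, then after <html>.
    for pattern in (_HEAD_RE, _HTML_RE):
        match = pattern.search(html)
        if match:
            return html[:match.end()] + META_TAG + html[match.end():]
    # Naive fallback: no <head> or <html> tag found, so prepend the tag.
    return META_TAG + html


def add_charset_to_file(path: str) -> None:
    """Rewrite the file at path in place, only if a meta tag had to be added."""
    # Assumes the file's bytes are already UTF-8 encoded.
    with open(path, "r", encoding="utf-8") as file:
        content = file.read()
    updated = ensure_utf8_meta(content)
    if updated != content:
        with open(path, "w", encoding="utf-8") as file:
            file.write(updated)
```

Calling add_charset_to_file(filename) before handing the document to PrizmDoc would replace the snippet above. Note that a document declaring a different charset is deliberately left unchanged here; whether to overwrite such a declaration is a judgment call that depends on how the file was actually saved.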