Soupault 2.5.0 release
Soupault 2.5.0 is available for download from my own server
and from GitHub releases.
New features in this release include an option to preserve the original whitespace in HTML pages (i.e. disable pretty-printing)
and two new built-in widgets for rewriting internal links: relative_links and absolute_links.
There are also some bug fixes and quality of life improvements.
HTML output whitespace control
There’s now a new pretty_print_html option under
[settings]. When set to
false, it will prevent soupault from inserting any whitespace into HTML for aesthetic reasons.
This option does not enable any kind of HTML minification: whitespace from templates and content files will be preserved. Soupault will only refrain from inserting any more whitespace into the page on its own.
In this release, it’s
true by default for compatibility with older releases.
Deprecation warning: In some cases, trying to ‘prettify’ HTML by inserting more whitespace actually breaks the intended layout (thanks to Thomas Letan for pointing it out). For this reason, this option will be set to
false by default in the next release.
If you want to keep the current behaviour in the future, make sure to adjust your configuration:
[settings]
pretty_print_html = true
Internal link rewriting widgets
The mainstream assumption is that every website has its own domain and is located at its virtual host root.1 This is indeed the simplest setup, since you can link to a shared resource at the site root with just
<img src="/images/header.png"> or similar.
However, that assumption isn’t always true. First, with the resurgence of public access UNIX hosts (sometimes called “tildes”), the “site in a subdirectory” scheme (like
example.com/~user/) also made a comeback. Second, there are pages that are naturally placed in a subdirectory, like autogenerated documentation.
There are now two new built-in widgets that will help users deal with this issue.
The relative_links widget adjusts internal links to account for their depth in the directory tree, which allows hosting the website in any location.
Suppose your pages contain
<img src="/header.png">. Then in
about/index.html that element will be rewritten as
<img src="../header.png">; in
books/magnetic-fields/index.html it will be
<img src="../../header.png"> and so on.
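The depth arithmetic behind this rewriting can be sketched in a few lines of Python. This is only an illustration under assumptions, not soupault's actual implementation (soupault itself is written in OCaml): the function name `relativize` and the treatment of pages at the site root are hypothetical.

```python
from pathlib import PurePosixPath


def relativize(link: str, page_path: str) -> str:
    """Rewrite a root-relative link for a page located at page_path.

    page_path is the page's path relative to the site root,
    e.g. "about/index.html". Hypothetical helper for illustration only.
    """
    # Number of directories between the site root and the page.
    depth = len(PurePosixPath(page_path).parent.parts)
    # Climb up one level per directory, then append the link without its leading slash.
    return "../" * depth + link.lstrip("/")


for page in ["about/index.html", "books/magnetic-fields/index.html"]:
    print(page, "->", relativize("/header.png", page))
```

Because the result depends only on the page's depth, the generated site keeps working wherever the whole directory tree is placed.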
[widgets.relativize]
widget = "relative_links"
check_file = false
exclude_target_regex = '^((([a-zA-Z0-9]+):)|#|\.|//)'
The default regex is meant to exclude links that are either:
- External links with a URI schema.
- Links to anchors within the same page.
- Hand-made relative links.
- Protocol-relative URLs.
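To see which links the default regex excludes, it can be tried against a few sample targets. The sketch below uses Python's `re` module as a stand-in; soupault uses its own regex engine, so treat this as an approximation. The sample links and the `is_excluded` helper are made up for the demonstration.

```python
import re

# The default exclude_target_regex quoted in the configuration above.
EXCLUDE_TARGET = re.compile(r'^((([a-zA-Z0-9]+):)|#|\.|//)')


def is_excluded(link: str) -> bool:
    """Return True if the link matches the exclusion regex (hypothetical helper)."""
    return EXCLUDE_TARGET.match(link) is not None


samples = [
    "https://example.com/page",   # external link with a URI schema
    "#section",                   # anchor within the same page
    "./selfie.jpg",               # hand-made relative link
    "//cdn.example.com/lib.js",   # protocol-relative URL
    "/header.png",                # internal link, not excluded
]
for link in samples:
    print(link, "excluded" if is_excluded(link) else "not excluded")
```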
If you want to narrow the scope down, you can use the
only_target_regex option instead. For example, with
only_target_regex = '^/[a-zA-Z0-9]', it will only rewrite links that start with a slash followed by a letter or a digit, such as /images/header.png.
The check_file option is helpful if you have pages with unmarked relative links, e.g. there’s a page in the about/ directory with
<img src="selfie.jpg"> in it, and also an
about/selfie.jpg file. Arguably, it would be a good idea to use
<img src="./selfie.jpg"> to make it explicit where the file is, but it may be impractical to modify all old pages just to be able to use this widget.
In that case you can set
check_file = true and this widget will rewrite such links only if there is no such file in the directory with the page.
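The decision described above can be sketched as follows. This is a guess at the logic based only on the description here, not soupault's code; `should_rewrite` and its parameters are hypothetical names.

```python
from pathlib import Path


def should_rewrite(link: str, page_dir: Path, check_file: bool) -> bool:
    """Decide whether an unmarked link such as "selfie.jpg" gets rewritten.

    page_dir is the directory that contains the page. The check is only
    meaningful for links without a leading slash. Hypothetical helper.
    """
    if check_file and (page_dir / link).is_file():
        # A file with that name sits next to the page, so the link is
        # treated as a relative link and left alone.
        return False
    return True
```

With `check_file = false` the file system is never consulted, so every link that passes the regex filters is rewritten.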
The absolute_links widget prepends a prefix to every internal link. It is the polar opposite of the relative_links widget.
[widgets.absolutize]
widget = "absolute_links"
prefix = "https://example.com/~jrandomhacker"
The prefix can be just a directory: a URI schema or a host address is not required.
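In effect, the rewriting amounts to joining the prefix and the link. A minimal sketch in Python, assuming that duplicate slashes at the joint are collapsed (how soupault itself handles trailing slashes is not stated here); `absolutize` is a hypothetical name:

```python
def absolutize(link: str, prefix: str) -> str:
    """Prepend a prefix to an internal link (hypothetical helper).

    Assumption: exactly one slash is kept between the prefix and the link.
    """
    return prefix.rstrip("/") + "/" + link.lstrip("/")


# Works the same with a full URL or with a bare directory as the prefix.
print(absolutize("/images/header.png", "https://example.com/~jrandomhacker"))
print(absolutize("/images/header.png", "/~jrandomhacker"))
```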
This widget supports all options of the relative_links widget.
Bug fixes
On Windows, errors of external programs executed by
preprocess_element widgets (as well as plugins) could crash soupault. This happened because Windows doesn’t have a concept of signals, so the standard library communicates conditions like
SIGPIPE (“broken pipe”) by raising exceptions. Soupault handles these exceptions correctly now, so the behaviour is consistent between all OSes.
The ignore_extensions option now checks all extensions rather than just the last one. For example,
ignore_extensions = ["tar"] will now match both
file.tar and file.tar.gz (report by Anton Bachin).
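The new matching behaviour can be illustrated with a short Python sketch. It mirrors the described semantics only; `is_ignored` is a hypothetical helper, not a soupault function.

```python
from pathlib import PurePosixPath


def is_ignored(filename: str, ignore_extensions: list[str]) -> bool:
    """Return True if any of the file's extensions is in the ignore list."""
    # "file.tar.gz" has the suffixes [".tar", ".gz"]; strip the leading dots.
    extensions = [s.lstrip(".") for s in PurePosixPath(filename).suffixes]
    return any(ext in ignore_extensions for ext in extensions)


for name in ["file.tar", "file.tar.gz", "file.gz"]:
    print(name, is_ignored(name, ["tar"]))
```

Checking only the last extension would have skipped `file.tar.gz`, since its last extension is `gz`.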
Quality of life
Exception tracing is now automatically enabled by
settings.debug = true and by the
--debug command line option, so there is no need to set
OCAMLRUNPARAM=b by hand anymore.
Better error messages for attempts to run soupault outside of a project directory, and for missing templates and
Proper alignment of options in the output of
soupault --help (patch by Anton Bachin).
New plugin functions