Soupault 2.1.0 release

Soupault 2.1.0 is available for download. It's a feature expansion release that adds some new options and makes existing features more flexible. The additions include the ability to preserve the original page doctype, support for multiple selectors in inclusion widgets, new ToC anchor slugification options, and more. This release also introduces 64-bit Windows support, and may be the last release to support 32-bit Windows.

Configurable content insertion actions

You may already know that HTML insertion widgets like include and exec support an action option that allows you to choose where the output is inserted. For example, with action = "prepend_child" you can force soupault to put the new content before the first existing child rather than after the last.

Now there’s a similar option for page templates.

[settings]
  default_template_file = "templates/main-template.html"
  default_content_action = "append_child"
  default_content_selector = "main"

[templates.another-template]
  file = "templates/another-template.html"
  content_action = "prepend_child"
  content_selector = "div#content"

A funny consequence of this change is that with content_action = "replace_content" you can instruct soupault to use a non-empty page as a template and replace its existing content with something else.
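
As a rough illustration of that case, a template entry along these lines could reuse a complete page as a template. The template name prefilled and the file path are made up for this sketch; only content_action and content_selector come from the options described above.

```toml
# Hypothetical template entry: the name and the file path are placeholders.
[templates.prefilled]
  file = "templates/prefilled-page.html"

  # Replace whatever is already inside <main> with the page content
  content_action = "replace_content"
  content_selector = "main"
```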

You can find a complete list of valid actions in the reference manual.

Table of contents improvements

New slugification options

The use_heading_slug option provides a compromise between hand-written heading anchors (<h2 id="first-section">) that are informative and permanent but tedious to add, and autogenerated numeric anchors that are neither permanent nor informative. It autogenerates id attributes from the heading text. However, its original “slugification” algorithm was overly conservative. Now it’s more flexible.

The HTML5 standard puts only one restriction on the characters allowed in id attributes: they must not contain any whitespace. Soupault originally used a much more conservative transformation and removed everything that wasn’t an ASCII letter or digit.

Now you can make soupault only replace whitespace characters with hyphens using a new soft_slug option.

[widgets.my-toc]
  use_heading_slug = true

  # Only replace whitespace
  soft_slug = true

  ...

If you want more control over the slugification process, there are more options now. Using slug_regex and slug_replacement_string you can specify what to replace, and what to replace it with.

Additionally, with slug_force_lowercase you can choose whether to force ASCII letters in id attributes to lowercase (the original use_heading_slug option always did it).

These are the default settings for use_heading_slug:

[widgets.my-toc]
  use_heading_slug = true
  slug_regex = '[^a-zA-Z0-9\-]'
  slug_replacement_string = '-'
  slug_force_lowercase = true
  ...
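
If those defaults are too strict for your headings, a less aggressive variant might look like the sketch below. The character class is an illustrative choice rather than a recommended setting, and it assumes the regex is matched against individual characters of the heading text, as the default one is. It keeps dots and underscores and preserves letter case, which suits anchors like Sys.run_program.

```toml
[widgets.my-toc]
  use_heading_slug = true

  # Illustrative values, not defaults: also keep dots and underscores
  slug_regex = '[^a-zA-Z0-9_.\-]'
  slug_replacement_string = '-'

  # Keep the original letter case
  slug_force_lowercase = false

  # Other ToC options omitted
```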

Section links for headings below the max_level

The ToC widget does two closely related but distinct things. First, it generates tables of contents. Second, it puts “link to this section” elements next to headings, if you set heading_links = true.

In older versions, you could only create section links for headings that are also included in the ToC. It always bothered me that I couldn’t easily copy links to individual plugin functions in the reference manual, like /reference-manual/#Sys.run_program, at least not without cluttering the ToC with very low-level headings.

Now this is possible. There’s a new max_heading_link_level option that can be greater than max_level. You can already see its effect in the reference manual.

If you set it to a value less than min_level, it will be automatically forced to min_level.
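
A configuration along these lines would keep the ToC short while still adding links to deeper headings. The level numbers are arbitrary examples, and the sketch assumes max_level is the deepest heading level included in the ToC.

```toml
[widgets.my-toc]
  # The ToC itself stops at <h3> (example value)
  max_level = 3

  # Section links are also added to headings down to <h5> (example value)
  heading_links = true
  max_heading_link_level = 5

  # Other ToC options omitted
```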

Doctype preservation

Before this release, soupault would always strip the original doctype declaration and replace it with the doctype set in the config, even in HTML processor mode. It was a limitation of the HTML parser, and now Anton, the maintainer of Markup.ml and lambdasoup, has fixed it.

Now there’s a new keep_doctype option that makes soupault preserve the original doctype if a page has one. If a page doesn’t have a doctype, it will be forced to the doctype from [settings] as usual.
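
A minimal sketch of how the two settings might be combined. It assumes keep_doctype sits in [settings] next to the doctype option, which the text above implies but does not state outright, so check the reference manual before relying on it.

```toml
[settings]
  # Fallback for pages that have no doctype of their own
  doctype = "<!DOCTYPE html>"

  # Assumed placement: leave an existing doctype declaration untouched
  keep_doctype = true
```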

Multiple selectors for inclusion widgets

Inclusion widgets now support lists of selectors. For example, this way you can make soupault try inserting a footer in <div id="footer">, then <footer>, and finally just append it to the page body if there’s no dedicated place for a footer in the page.

[widgets.insert-footer]
  widget = "include"
  file = "templates/footer.html"
  selector = ["div#footer", "footer", "body"]
  action = "append_child"

Bug fixes

Reintroduction of Microsoft GitHub releases

Since I made a dedicated location for soupault downloads (files.baturin.org/software/soupault), I neglected publishing releases on Microsoft GitHub. Then I realized that some people use GitHub as a release notification mechanism. Some people who use soupault from CI scripts may also prefer CDNed links.

So, from now on I’ll keep mirroring releases to github.com/dmbaturin/soupault/releases.

64-bit Windows support

From the start I made a point of providing official binaries for the three most popular operating systems. I went for a 32-bit Windows build and the Windows Vista/7 ABI because that works for every Windows user.

However, by now all 32-bit Win7 ABI systems have reached end of support, and many projects are (rightfully) phasing out support for the 32-bit x86 architecture as such.

Unless there’s serious demand for Win32 executables, this release will be the last to support legacy 32-bit Windows versions, and I’ll only provide Win64 builds after it.