Chapter 10
For Developers
This chapter deals with TeX4ht development. It starts with a basic tutorial on adding support for a new package, then covers commands useful in the process, the different types of TeX4ht configuration files, and the syntax and structure of the literate source files.
10.1 Tutorial: Basic Support For a New Package
In this tutorial, we will try to show how to provide TeX4ht
support for a simple
LaTeX package.
TeX4ht tries to load a special .4ht file for each package loaded by LaTeX. This special file can contain modifications to commands provided by the package, such as redefinitions of macros that cause clashes between the package and TeX4ht. Most importantly, it inserts special macros, called hooks, that are then used to include the output format tags.
Let’s say that you have a custom package, called mynote.sty:
\newcommand\notetitle{Note:~}
\newcommand\note[1]{\textbf{\notetitle}#1}
\newcommand\highlight[1]{\textbf{#1}}
\endinput
It defines two user commands, \note
and \highlight
. They can be used in the
following way:
\documentclass{article}
\usepackage{mynote}
\begin{document}
\note{This is a note}

Try to highlight \highlight{something}.
\end{document}
TeX4ht produces usable output for both of these commands out of the box, thanks to the support for TeX fonts. But you may want to use custom HTML tags instead. To achieve that, you need to insert special commands, called hooks in TeX4ht, into the package commands. These hooks can then be configured to insert tags in the output format.
To introduce hooks, you need to create a hook seeding configuration file for the package, called <name>.4ht. For example, to seed hooks for the mynote.sty package, create the file mynote.4ht:
\NewConfigure{note}{3}
% Use \HLet when you want to completely redefine a command
\def\:tempa#1{\a:note\notetitle\b:note~#1\c:note}
\HLet\note\:tempa
\NewConfigure{highlight}{2}
\pend:defI\highlight{\a:highlight}
\append:defI\highlight{\b:highlight}
\Hinput{mynote}
\endinput
There are several things to note. The first is that the : character can be part of a command name in .4ht files, similar to the use of the @ character in LaTeX packages. This allows us to create command names that don’t clash with other command names.
The hooks are created using the \NewConfigure command. They can later be filled with the \Configure command. To have an effect, hooks must be inserted into the existing commands. There are two ways to do that. For simpler commands, where we want to insert tags only before and after the contents produced by the patched command, we can use the \pend:def<X> and \append:def<X> commands, where <X> is the number of parameters that the patched command expects, written as a Roman numeral. In this example, \highlight expects one parameter, so we use \pend:defI and \append:defI. For commands without parameters, use \pend:def and \append:def.
Of course, you can also insert hooks using other mechanisms, for example using LaTeX’s hook system:
\AddToHook{cmd/highlight/before}{\a:highlight}
\AddToHook{cmd/highlight/after}{\b:highlight}
The second way of inserting hooks, useful for commands where we also want to insert tags inside their contents, is to use the \HLet command. It is a variant of the \let command. In contrast to \let, it saves the original command as \o:<command name>:. Commands redefined by \HLet also support the \Picture command, where the original version of the command will be used. This way, pictures will produce the same result as they would in PDF mode.
In our example, we redefined the \note command to use a hook between the note title and the note text. This enables us to style the title and the text differently.
The configuration file for our hooks could look like this:
\Preamble{xhtml}
\Configure{note}
  {\ifvmode\IgnorePar\fi\EndP\HCode{<div class="note"><span class="notetitle">}}
  {\HCode{</span><span class="notebody">}}
  {\HCode{</span></div>}}
\Css{.notetitle{font-weight: bold;}}
\Configure{highlight}{\HCode{<span class="highlight">}\NoFonts}{\EndNoFonts\HCode{</span>}}
\Css{.highlight{font-weight:bold;}}
\begin{document}
\EndPreamble
As the \note command should be used in its own paragraph, we need to fix paragraph closing. See the Paragraph Handling section for more information about this issue. More details about configuration files and configurations are in the section Private Configuration Files.
The HTML code produced by our configuration looks like this:
<div class='note'><span class='notetitle'>Note: </span><span class='notebody'> This is a note</span></div>
<!-- l. 6 --><p class='indent'> Try to highlight <span class='highlight'>something</span>. </p>
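Because the tags live in the configuration rather than in mynote.4ht, the same hooks can be filled differently without touching the hook seeding file. The following is a minimal sketch of an alternative private configuration file, assuming you would prefer the HTML mark element for highlighted text; the choice of element is an illustration only, and every command used is taken from the configuration shown above.

```latex
\Preamble{xhtml}
% Fill the two highlight hooks seeded in mynote.4ht with a different element.
% \NoFonts ... \EndNoFonts suppresses the font markup TeX4ht would otherwise add.
\Configure{highlight}
  {\HCode{<mark class="highlight">}\NoFonts}
  {\EndNoFonts\HCode{</mark>}}
\begin{document}
\EndPreamble
```

The note hooks are left unconfigured in this sketch, so \note would fall back to the output TeX4ht produces on its own.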
10.2 Tutorial: How to Redefine Package Commands Used in the Document Preamble
The usual .4ht
files are loaded only after the \begin{document}
command. This
means that they cannot influence macros that are initialized in package options or in
other code executed in the document preamble. For these cases, TeX4ht provides a
special configuration file usepackage.4ht
, which is read at the moment
when packages are being loaded. This allows you to insert hooks that block,
replace, or extend package definitions much earlier than would otherwise be
possible.
10.2.1 Blocking a Package from Loading
Sometimes it is necessary to completely prevent a package from being loaded,
because its behavior is incompatible with TeX4ht. The following example shows how
to block the unicode-math
package:
% block unicode-math package
\:dontusepackage{unicode-math}
% provide dummy definition for \setmathfont command
\DeclareDocumentCommand \setmathfont { O{} m O{} }{}
This directive ensures that the package is skipped during the loading process.
10.2.2 Executing Code After a Package is Loaded
In other cases we want the package to load, but we need to restore or adjust
definitions it has changed. For instance, the titlesec
package redefines all sectioning
commands. If we want to restore the original LaTeX definitions after titlesec
is
fully loaded, we can save the old definitions beforehand and then restore them using
\:AtEndOfPackage
:
\let\ttl:@makechapterhead\@makechapterhead
\let\ttl:@makeschapterhead\@makeschapterhead
\let\ttl:chapter\chapter
\let\ttl:section\section
\let\ttl:subsection\subsection
\let\ttl:subsubsection\subsubsection
\let\ttl:paragraph\paragraph
\let\ttl:subparagraph\subparagraph
\:AtEndOfPackage{
  \let\chapter\ttl:chapter
  \let\section\ttl:section
  \let\subsection\ttl:subsection
  \let\subsubsection\ttl:subsubsection
  \let\paragraph\ttl:paragraph
  \let\subparagraph\ttl:subparagraph
  \let\@makechapterhead\ttl:@makechapterhead
  \let\@makeschapterhead\ttl:@makeschapterhead
}
10.2.3 Available Commands
Two special commands are provided for these early hooks:
- \:dontusepackage{package name} – prevents the named package from loading.
- \:AtEndOfPackage{code} – executes the given code after the package has been fully loaded.
These hooks are inserted at the correct place during package processing, so they can safely modify or restore definitions without requiring manual patching in the document.
10.2.4 Using LaTeX Hooks Directly
It is also possible to use LaTeX’s native hook management system from within
usepackage.4ht
. For example, to disable footnote superscripts only in the doc
package documentation, one can use:
\AddToHook{package/doc/before}{\SUPOff}
\AddToHook{package/doc/after}{\SUPOn}
This approach is especially useful for packages that provide their own well-defined hooks.
10.2.5 Local Modifications with usepackage-user.4ht
Another possibility for loading local definitions before a package is processed is to use
the file usepackage-user.4ht
. This file works in the same way as usepackage.4ht
,
but it is intended specifically for user-level customizations. By placing your
own hooks or redefinitions there, you can adapt the behavior of packages
locally, without modifying the official configuration files distributed with
TeX4ht
.
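As a minimal sketch of such a local file, the following usepackage-user.4ht changes the title text of the mynote package from the first tutorial once that package has been loaded. It relies only on LaTeX's generic package/<name>/after hook, in the same way as the doc example above; the replacement title is an arbitrary illustration, and whether this fits your setup depends on how the package is loaded in your document.

```latex
% usepackage-user.4ht: a hypothetical local file, sketch only.
% Once mynote.sty has been loaded, replace the note title it defines.
\AddToHook{package/mynote/after}{%
  \renewcommand\notetitle{Remark:~}%
}
```

Keeping such changes in usepackage-user.4ht leaves both the document and the distributed usepackage.4ht untouched.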
10.3 Commands Usable in the .4ht files
\NewConfigure{name}{number of defined hooks}
This command defines macros with an alphabetic prefix in the form of \a:name … \i:name, depending on the number of defined hooks. The maximum number is 9.
\NewConfigure{try}{2}
\def\try#1{\a:try#1\b:try}
\Configure{try}{* }{}
\try{ho} % produces "* ho"
\NewConfigure{name}[number of parameters]{code}
A variant of \NewConfigure that doesn’t define hooks with alphabetic prefixes. Instead, it passes the arguments of \Configure to the given code as TeX arguments. See this example:
\NewConfigure{try}[2]{\def\hookI{#1}\def\hookII{#2}}
\def\try#1{\hookI#1\hookII}
\Configure{try}{* }{}
\try{ho} % produces "* ho"
When you use \Configure{try}, it defines the \hookI and \hookII commands. They can then be used in the redefined \try command.
\HLet{Redefined command name}{new command}
A variant of \let that saves the original command under the name \o:<name>:. It can detect use of the redefined command inside a picture. In that case, it uses the original command to produce the correct visual result in the picture.
\NewConfigure{note}{3}
\def\:tempa#1{\a:note note:\b:note~#1\c:note}
\HLet\note\:tempa
\Configure{note}{*}{*}{*}
\note{hello} % produces: "* note:* hello*"
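Because \HLet keeps the original definition under the \o:<name>: name, the replacement does not have to reimplement the command; it can call the saved original and only add hooks around it. The following is a minimal sketch of that pattern for use in a .4ht file, where \mybox is a hypothetical one-argument package command introduced only for illustration.

```latex
% Sketch only: \mybox is a hypothetical one-argument command.
\NewConfigure{mybox}{2}
% Call the original definition, saved by \HLet as \o:mybox:, between two hooks.
\def\:tempa#1{\a:mybox\o:mybox:{#1}\b:mybox}
\HLet\mybox\:tempa
```

The \o:mybox: name is only looked up when \mybox is executed, so it is fine that it appears in \:tempa before \HLet has created it.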
\HRestore{command name}
Restores a command redefined using \HLet to its original content.
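A short sketch of how this might be used, continuing the \note example from the \HLet entry above. It assumes that \HRestore takes the command itself as its argument, in the same form as \HLet does.

```latex
% \note was earlier redefined with \HLet\note\:tempa.
% Undo that redefinition; \note then has its original package definition again.
\HRestore\note
```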
\pend:def<X>{redefined command}{code to be inserted at the beginning}
\append:def<X>{redefined command}{code to be inserted at the end}
These two commands insert code before and after a redefined command. There are several versions of these commands, depending on the number of parameters that the redefined command expects. The number of parameters, written as a Roman numeral, replaces the <X> placeholder. Up to three parameters are supported.
\newcommand\bar{xxx}
\pend:def\bar{*}
\append:def\bar{*}
\bar % produces: "*xxx*"
\newcommand\foo[2]{#1, #2}
\pend:defII\foo{*}
\append:defII\foo{*}
\foo{a}{b} % produces "*a, b*"
\:CheckOption{option name}
\if:Option
Support for custom options. The \:CheckOption command checks whether the given option is active, and the \if:Option conditional then runs the true or false branch.
\:CheckOption{info}\if:Option ... \else ... \fi
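As a sketch of how a hook seeding file might use this, the following fragment for mynote.4ht patches \highlight only when a particular option is absent. The option name mynote-plain is hypothetical and chosen purely for illustration; the remaining commands are the ones described in this section.

```latex
% Sketch for mynote.4ht; the option name "mynote-plain" is hypothetical.
\NewConfigure{highlight}{2}
\:CheckOption{mynote-plain}\if:Option
  % Option given: leave \highlight unpatched.
\else
  % Option not given: seed the hooks as usual.
  \pend:defI\highlight{\a:highlight}
  \append:defI\highlight{\b:highlight}
\fi
```

Declaring the configuration outside the conditional keeps \Configure{highlight} valid in either case, even when the hooks are never inserted.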
10.4 Two types of .4ht files
The compilation starts by opening tex4ht.sty and loading a fraction of its code. The
main purpose of this phase is to request the loading of the system at a later time (for
instance, upon reaching \begin{document}
). The motivation for the late loading is
to allow TeX4ht to collect as much information as possible about the environment
requested by the source file, and help the system reshape that environment with
minimal interference from elsewhere.
The system uses two kinds of (4ht) configuration files. The files of the first kind
mainly seed hooks into the macros loaded by the source file (for instance,
latex.4ht
, fontmath.4ht
, and article.4ht
). The files of the second kind mainly
attach meaning to the hooks (for instance, html4.4ht
, unicode.4ht
, and
mathml.4ht
).
Different source files may request the loading of different style files and in
different orders. The hook seeding files are loaded in response to the loading of the
style files, and in a compatible order. Since the different style files may redefine the
syntax and semantics of macros, TeX4ht
follows a similar route of defining and
redefining the hooks and their meanings.
10.4.1 Custom output formats
The meaning attaching files are normally requested through option names introduced
in the tex4ht.4ht
system file. It defines options for all output formats supported
by TeX4ht
. For instance, html5, ooffice for the ODT output, tei, and so
on.
These options are passed to TeX4ht by make4ht according to the --format command line parameter, but you can also pass them yourself.
The user may add option names, and redefine old ones, within a new file named
tex4ht.usr
.
A new tex4ht.usr file should group references to *.4ht
configuration files under
arbitrarily chosen option names. For that purpose, \Configure
commands similar to
those provided in tex4ht.4ht
should be employed. These are particularly useful if
you use custom packages that are not included in TeX distributions and thus aren’t
supported by TeX4ht
.
You can place your custom .4ht
files or tex4ht.usr
in your local TEXMFHOME
tree, for instance in ~/texmf/tex/latex/my4htfiles
.
The location of the TEXMFHOME directory can be found using the following command:
$ kpsewhich -var-value TEXMFHOME
Example
Let’s say that you have a custom package mypackage.sty
:
\newcommand\mycommand[1]{Hello #1}
\endinput
This can be configured using the following configuration file, mypackage.4ht
:
\NewConfigure{mycommand}{2}
\pend:defI\mycommand{\a:mycommand}
\append:defI\mycommand{\b:mycommand}
\Hinput{mypackage}
\endinput
The important command in this listing is \Hinput{mypackage}. The \Hinput command expects the package name as its argument. It registers the package for later processing in the output format files.
Here is a custom output format file sample.4ht
:
\exit:ifnot{mypackage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\ConfigureHinput{mypackage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Configure{mycommand}{\HCode{<span class="mycommand">}}{\HCode{</span>}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput\empty\empty\empty\empty\empty\empty
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput
The \exit:ifnot command takes a comma-separated list of packages supported by the output format file. It stops the loading of the file if the currently processed package doesn’t have configurations in it.
The configuration for the package is placed between \ConfigureHinput
and
\endinput\empty\empty\empty\empty\empty\empty
.
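Since \exit:ifnot accepts a list, one output format file can presumably serve several packages, each with its own delimited section. The following sketch extends sample.4ht under that assumption; otherpackage and its othercommand configuration are hypothetical, and would need their own hook seeding file otherpackage.4ht with a matching \NewConfigure and \Hinput{otherpackage}.

```latex
\exit:ifnot{mypackage,otherpackage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\ConfigureHinput{mypackage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Configure{mycommand}{\HCode{<span class="mycommand">}}{\HCode{</span>}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput\empty\empty\empty\empty\empty\empty
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\ConfigureHinput{otherpackage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Hypothetical second section, configuring hooks seeded in otherpackage.4ht.
\Configure{othercommand}{\HCode{<em class="othercommand">}}{\HCode{</em>}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput\empty\empty\empty\empty\empty\empty
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput
```

The separator lines are copied unchanged from the single-package example to stay as close as possible to the layout shown there.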
To request the custom output format file, we need to add it to tex4ht.usr
. Here
is an example that adds a new option myhtml5. It is based on the code for the html5
option from tex4ht.4ht
:
\Configure{myhtml5}{%
  \:CheckOption{info}\if:Option \Hinclude[*]{infoht4.4ht}\fi
  \:CheckOption{info}\if:Option \Hinclude[*]{infomml.4ht}\fi
  \Hinclude[*]{html4.4ht}%
  \Hinclude[*]{unicode.4ht}%
  \:CheckOption{mathml}\if:Option%
  \else\:CheckOption{mathml-}\fi%
  \if:Option%
    \Hinclude[*]{mathml.4ht}%
    \Hinclude[*]{html-mml.4ht}%
  \else
    \Hinclude[*]{html4-math.4ht}%
  \fi
  \:CheckOption{svg}%
  \if:Option \else\:CheckOption{svg-}\fi
  \if:Option \else\:CheckOption{svg-obj}\fi
  \if:Option \else\:CheckOption{svg-inline}\fi
  \if:Option
    \Hinclude[*]{svg-option.4ht}%
    \:CheckOption{info}\if:Option \Hinclude[*]{infosvg.4ht}\fi
  \fi
  \Hinclude[*]{html5.4ht}%
  \Hinclude[*]{sample.4ht}
}
It uses the \:CheckOption
commands to detect additional options, which results
in conditional loading of various output format files using the \Hinclude
command.
Our custom output file sample.4ht
is placed at the end.
You can then require the custom output format using this command:
$ make4ht filename.tex "myhtml5"
10.5 TeX4ht literate sources
To add proper support for a new package, it is necessary to edit the TeX4ht literate sources. All distributed TeX4ht files, including tex4ht.sty and all .4ht files, are generated from these literate programming files. This is also why the generated files don’t contain many comments; the comments are in the sources. If you want to understand how TeX4ht works, it is necessary to read them.
The source files are available in the TeX4ht source repository. You can retrieve them using an SVN client.
$ svn checkout https://svn.gnu.org.ua/sources/tex4ht/
$ cd tex4ht/trunk/lit/
The configurable hooks for all packages are contained in the tex4ht-4ht.tex file. Configurations of these hooks are placed in the output format configuration files. The most common output format is HTML, which can be configured in tex4ht-html4.tex, or in tex4ht-html5.tex if HTML5 features are used. You can also update sources for other output formats, for example tex4ht-ooffice.tex for the ODT format, or tex4ht-tei.tex for TEI. The sources of the tex4ht.sty package are available in tex4ht-sty.tex.
To compile all literate sources, run the make command. You will need basic UNIX utilities for this to succeed, as well as m4 and javac. You can also compile particular source files. Most of them can be compiled using LaTeX, but some of them, for example tex4ht-4ht.tex, need to be compiled using etex.
10.5.1 How to add support for a package to the TeX4ht literate sources
Given the following package sample.sty:
\ProvidesPackage{sample}
\newcommand\hello{hello}
\endinput
This simple package defines the command \hello, which prints the word “hello” when used in a document.
Let’s say that we want to insert some HTML tags before and after the text content printed by the command.
Basic template for tex4ht-4ht.tex:
\<sample.4ht\><<<
% sample.4ht (|version), generated from |jobname.tex
% Copyright 2017 TeX Users Group
|<TeX4ht license text|>
\NewConfigure{hello}{2}
\pend:def\hello{\a:hello}
\append:def\hello{\b:hello}
\Hinput{sample}
\endinput
>>>
\AddFile{9}{sample}
The configuration for each package must follow this basic template. ProTeX is used as the literate programming system.
The \<name\><<<code>>> block defines a new macro, which can then be called using |<name|>. The license text is included this way in the example. The instruction to generate the .4ht file is given by the command \AddFile{9}{sample} after the block definition. The first argument to \AddFile is an arbitrary number.
Each package configuration must include \Hinput{packagename} in order to load the configurations for the package.
The command \NewConfigure{hello}{2} declares a new configuration, hello, with two configurable hooks. These hooks are named \a:hello and \b:hello. The hooks must be inserted into the \hello command, which can be easily done using the \pend:def and \append:def commands. These commands insert code at the beginning and at the end of the redefined command, respectively.
The package name must also be included in the mktex4ht-cnf.tex
file. This file
is used in the generation of the
\AddFile{9}{sample}
You can place the configuration for HTML in the tex4ht-html4.tex file:
\<configure html4 sample\><<<
\Configure{hello}{\HCode{<span class="hello">}}{\HCode{</span>}}
\Css{.hello{color:red;}}
>>>
The \<configure html4 packagename\>
block will produce code that
detects use of the package packagename
. It then loads configurations for the
package.
The .4ht
files can be generated simply using the make
command.
The following sample TeX file:
\documentclass{article}
\usepackage{sample}
\begin{document}
\hello\ world.
\end{document}
produces the following HTML code:
<!--l. 4--><p class="noindent" > <span class="hello">hello</span> world. </p>
10.6 ProTeX
The literate programming system used in the previous section is called ProTeX. This section outlines the main ideas behind it.
Literate programming is a discipline that promotes the writing of programs the way one explains them to human beings. ProTeX is a literate programming system fully implemented in terms of TeX, and it is compatible with LaTeX and other TeX-based systems. TeX4ht, and ProTeX itself, are examples of applications written in ProTeX.
\input ProTex.sty
\AlProTex{extension,<<<>>>,list,title,escape-character}
\<title\><<<
code fragment
>>>
|<title|>
\OutputCode\<...\>
Some explanation:
\input ProTex.sty
\AlProTex{extension,<<<>>>,list,title,escape-character}
The escape-character stands for ‘, @, |, or ?. If omitted, it stands for |.
\<title\><<< code fragment >>>
This structure provides names to code fragments (the fragments should not be too large in size).
|<title|>
This command acts as a placeholder for the code segment associated with the title (| stands for the escape character).
\OutputCode\<...\>
This command creates a file for the code whose root node is specified.
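The pieces above can be put together into a small literate source. The following is a minimal sketch rather than an excerpt from the TeX4ht sources: the fragment titles, the c extension, and the C program itself are illustrative choices, the escape character is left at its default |, and the file is assumed to be compiled with plain TeX.

```latex
% hello.tex: a minimal ProTeX sketch; titles and extension are illustrative.
\input ProTex.sty
\AlProTex{c,<<<>>>,list,title}

This fragment prints the greeting.

\<print greeting\><<<
printf("hello\n");
>>>

The root fragment pulls in the greeting by its title.

\<hello program\><<<
#include <stdio.h>
int main(void) {
    |<print greeting|>
    return 0;
}
>>>

\OutputCode\<hello program\>
\bye
```

The explanatory text between the fragments is ordinary TeX, which is the point of the literate style: the prose and the code fragments live in one file, and \OutputCode assembles the code starting from the named root fragment.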