[Proposal] Extensible Modular Markup

vporton · 2019-12-16

Hi,

HTML is an outdated mess from the early Web. HTML does not support a Turing-complete macro language (indeed, it has no macros at all). We need to write a specification for an HTML-like language with macros and other advanced features (with software to convert it to regular HTML). New tags should be easy to add. (No, I don’t insist that browsers support such things as Turing-complete macros. We can convert from my XML format to other formats before browsing.)

https://en.wikiversity.org/wiki/Extensible_modular_markup (the draft specification)

The idea is to make an “extension” of XHTML that is:

based on namespaces
feature rich
extensible by anyone (you only need to know programming)
equipped with macros and Turing-complete
a replacement for both HTML and LaTeX, which become legacy (LaTeX is especially bad and needs to be replaced with a nice, responsive markup language).

As examples, I made namespaces for Table of Contents and for syntax coloring sources.

The software already exists. It follows “Automatic transformation of XML namespaces”, a sophisticated but complex-to-use specification. However, if you want to use only the basic features, you do not need to know that specification (we are writing another specification, Extensible Modular Markup, and the automatic transformations are used here only to implement the software, not as the basis of EMM).

So, this software can be used to convert Extensible Modular Markup into regular HTML. Note that search engines also need more tags, so adding new tags is also useful for search engines and other automatic agents. However, it is also useful for Web designers and document writers, who can add such things as the above-mentioned table of contents, colorized pre tags, etc. The possibilities for both search engines and browsers are unlimited, because new tags can be added by users.

This should be standardized by the W3C because my software is slow and we need alternative compatible implementations, too.

See https://vporton.github.io/extensible-markup/ - how can I add it to the WICG list of projects?

Please participate in the development of both the specs and the software. We need a really good meta-language for Web development. Plain HTML is a bad choice for writing Web pages, as it does not even support macros or, for example, syntax coloring.

dauwhe · 2019-12-19

It would be very helpful if you could describe some concrete use cases which can’t be done with existing web technologies, and then describe how you would solve them with your proposal. Remember that HTML is not the only part of the open web platform—many of the things you mention can be easily done with JavaScript, or simple transformations of XML to HTML on the server using XSLT.

How would this be different than any other server-side technology?

vporton · 2019-12-19

dauwhe:

It would be very helpful if you could describe some concrete use cases which can’t be done with existing web technologies, and then describe how you would solve them with your proposal. Remember that HTML is not the only part of the open web platform—many of the things you mention can be easily done with JavaScript, or simple transformations of XML to HTML on the server using XSLT.

One example is the documentation of XML Boiler (https://mathematics21.org/xml-boiler-software-automatic-transformation-of-xml-namespaces/) itself: it contains colorized source code listings and a table of contents, things not easily doable with other technologies.

As more modules are added, more possibilities arise. Consider, for example, that we could provide full support for math formulas in HTML, with the file format to convert to chosen automatically.

With XSLT you need to find suitable XSLT scripts, manually determine the order in which they are run, etc. My software runs XSLT scripts and other (Python, for example) scripts in the order chosen automatically based on document namespaces. This adds a lot of convenience and simplicity.
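To make the idea of namespace-driven ordering concrete, here is a minimal sketch in Python. It is not the algorithm XML Boiler actually uses. The registry `TRANSFORMERS`, the `urn:example:*` namespace URIs, and the script names are all hypothetical, and the planner simply keeps picking a script for each non-XHTML namespace until only XHTML remains.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical registry: each script converts one source namespace into a
# target namespace. URIs and script names are made up for illustration.
TRANSFORMERS = [
    {"script": "toc.xslt", "source": "urn:example:toc", "target": XHTML},
    {"script": "highlight.py", "source": "urn:example:syntax", "target": XHTML},
    {"script": "listing.xslt", "source": "urn:example:listing",
     "target": "urn:example:syntax"},
]


def namespaces_in(xml_text):
    """Return the set of element namespaces used in the document."""
    found = set()
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found


def plan(xml_text, accepted=(XHTML,)):
    """Choose scripts and their order until only accepted namespaces remain."""
    pending = namespaces_in(xml_text) - set(accepted)
    done = set()
    order = []
    while pending:
        ns = sorted(pending)[0]  # deterministic choice among pending namespaces
        step = next((t for t in TRANSFORMERS if t["source"] == ns), None)
        if step is None:
            raise ValueError(f"no transformer known for namespace {ns}")
        order.append(step["script"])
        pending.discard(ns)
        done.add(ns)
        # A script may emit an intermediate namespace that needs another pass.
        if step["target"] not in accepted and step["target"] not in done:
            pending.add(step["target"])
    return order


DOC = """<html xmlns="http://www.w3.org/1999/xhtml" xmlns:l="urn:example:listing">
  <body><l:listing>print(1)</l:listing></body>
</html>"""

print(plan(DOC))  # ['listing.xslt', 'highlight.py']
```

The author of the document only writes the namespaced elements; the chain from the listing namespace through the syntax namespace to XHTML is derived from the registry rather than specified by hand.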

A future version of my software should also:

download these XSLT or other scripts from the Web automatically, based on the document namespaces, on which namespace elements are inside other elements, etc.
check the validity of the resulting document (that there are no errors).

JavaScript is client-side; I am creating a complementary technology: scripts on the designer’s side. Both are useful.

It would be error-prone to write XSLT scripts and choose the order in which to run them manually. Or consider how difficult it would be to implement syntax colorization (a feature of my existing software) if you had to stick to XSLT.

Also, my project aims to standardize XML modules for different things. Standardized modules are quite a different thing from home-made custom XSLT scripts.

In the future, some of these modules are to be supported directly by browsers and search engines, while the others are to be automatically convertible to such portable formats.

“describe some concrete use cases which can’t be done with existing web technologies” - everything can be done with plain HTML or with a custom Python program, but my project offers another level of convenience, and I suspect this level of convenience may result in an exponential simplification of complex document processing in the future.

Yes, these things can be done with plain XSLT. But then why has nobody done them yet? Where are the HTML documents with formulas, graphs, and complex responsive graphics? We need to move into the future.

vporton:

We can convert from my XML format to other formats before browsing.
How would this be different than any other server-side technology?

My Extensible Modular Markup https://en.wikiversity.org/wiki/Extensible_modular_markup is not planned as a replacement for server-side technologies such as PHP.

It will probably be used mainly by client-side webmasters, book writers (including scientists who now use LaTeX), and automated file converters. It is meant to compete not so much with PHP as with, for example, LaTeX and DocBook for creating HTML (or other format) documents. One example use is writing HTML files for github.io or any other software/hardware/… documentation. Currently people use LaTeX and PDF or something else not suited to the Web (often non-responsive, not viewable in browsers, etc.).

However, it could be used server-side too, probably by setting up a reverse proxy that would do the file format conversion and cache the results.
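The "convert once and remember the result" part of that idea could look roughly like the Python sketch below. This is only an illustration under stated assumptions: `CachingConverter` and `fake_convert` are hypothetical names, the real EMM-to-HTML conversion is replaced by a placeholder, and the proxy itself (HTTP handling, cache expiry) is left out.

```python
import hashlib


class CachingConverter:
    """Wrap a conversion function and remember results by content hash."""

    def __init__(self, convert):
        self._convert = convert  # any callable: source text -> HTML text
        self._cache = {}
        self.misses = 0          # how many times the real conversion ran

    def get(self, source_text):
        key = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._convert(source_text)
        return self._cache[key]


def fake_convert(source_text):
    # Placeholder standing in for the real (slow) conversion step.
    return "<html><body>" + source_text + "</body></html>"


converter = CachingConverter(fake_convert)
first = converter.get("<doc/>")
second = converter.get("<doc/>")  # served from the cache, no second conversion
```

Keying the cache on a hash of the source means an edited document is converted again automatically, while unchanged documents are served without rerunning the conversion.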

dauwhe · 2019-12-19

Syntax highlighting is quite common on the web, and there are widely-used JS libraries for that purpose such as https://highlightjs.org/. It’s possible to generate tables of contents for paged media using CSS. JavaScript can easily find nodes in a document and build a navigation structure from them.
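The "find nodes and build a navigation structure" step is small whichever side it runs on. As a rough illustration, here is the same idea written in Python with the standard library's ElementTree rather than in browser JavaScript. The function names are hypothetical, and the sketch assumes well-formed XHTML input whose headings carry `id` attributes.

```python
import html
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
# Map fully qualified tag names such as "{...xhtml}h2" to their heading level.
HEADINGS = {f"{{{XHTML}}}h{level}": level for level in range(1, 7)}


def table_of_contents(xhtml_text):
    """Return (level, id, title) for every h1..h6 that carries an id."""
    entries = []
    for element in ET.fromstring(xhtml_text).iter():
        level = HEADINGS.get(element.tag)
        if level is not None and element.get("id"):
            title = "".join(element.itertext()).strip()
            entries.append((level, element.get("id"), title))
    return entries


def render(entries):
    """Render the entries as a flat list of links, indented by level."""
    lines = []
    for level, anchor, title in entries:
        link = f'<a href="#{html.escape(anchor)}">{html.escape(title)}</a>'
        lines.append("  " * (level - 1) + link)
    return "\n".join(lines)


PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h1 id="intro">Intro</h1>
<h2 id="use">Use <em>cases</em></h2>
<h2>No anchor here</h2>
</body></html>"""

print(render(table_of_contents(PAGE)))
```

Headings without an `id` are skipped here because there is nothing to link to; a fuller version would generate anchors for them.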

All browsers will soon support MathML natively, thanks to Igalia. Full support is already available with MathJax.

As for where such documents are: all over the web.

When you start by saying “HTML is an outdated mess,” people are less likely to take your proposal seriously. When your concrete use cases appear to be satisfied by existing web features, people are likely to think that you’re more interested in a new architecture than actual features. Convincing people of that after the astounding success, evolution, and growth of the web is a tall order indeed.

vporton · 2019-12-19

That isn’t full support for math formulas. What about an analog of LaTeX’s multline environment? We need all the features of LaTeX.

isiahmeadows · 2020-01-28

I’ll point out that HTML 4 was based on SGML, and XHTML supported namespaces and the like. But browsers never came anywhere close to fully implementing SGML.

vporton · 2020-01-28

I don’t understand at all how my proposal is related to SGML.

Also note that most of my proposal can be implemented without any changes in browsers. Yes, I do propose making changes in browsers in the future, but my proposal facilitates such changes rather than making them directly.