A lighter HTML-like standard

Dec 6 2021

HTML is really a bit over the top. For example, every non-empty element has both a start tag and an end tag, and both repeat the tag name. The pages are then sent to the user in that form, meaning these end tags take up bandwidth whilst contributing no content. More bandwidth usage means longer load times and more energy usage. Therefore, on this page I suggest ways we could make a new, lighter standard of HTML.

My ideas

I have a few ideas for this. My goal is to significantly reduce the weight of pages whilst staying close to HTML, so that conversion remains easy for both developers and browser engines.

Reduce the footprint of end tags

First of all, end tags aren't really necessary, especially since they can't contain attributes[1]. Therefore end tags could be replaced with something shorter.

The code:

<body>
	<h1>Home</h1>
	<p>Welcome to my website!</p>
</body>

Could be replaced with:

<body:
	<h1:Home>
	<p:Welcome to my website!>
>

The colon could be replaced with a different character if that makes it easier to parse (e.g. <p>Hello> may be better).

In this small example the number of characters is reduced by 13: each end tag shrinks from the tag name plus three characters to a single >.

Without writing a converter from standard HTML to this format, it is difficult to estimate how much this would affect page sizes. However, one thing is for sure: any page that contains end tags would get smaller, making the internet lighter.
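
As a rough way to get a figure without a full converter, the sketch below uses Python's standard html.parser module to add up how many characters the end tags in a page would lose. The names EndTagCounter and estimate_saving are made up for this illustration. It assumes that every explicit end tag becomes a single > while start tags, self-closing tags and text keep their length (text containing a literal > would need escaping, which is ignored here).

```python
from html.parser import HTMLParser


class EndTagCounter(HTMLParser):
    """Adds up the characters saved by shrinking each end tag to a single '>'."""

    def __init__(self):
        super().__init__()
        self.saved = 0

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag.
        # The default implementation would call handle_endtag, so skip it.
        pass

    def handle_endtag(self, tag):
        # "</tag>" is len(tag) + 3 characters long; the proposed ">" is 1.
        self.saved += len(tag) + 2


def estimate_saving(html: str) -> int:
    """Return the number of characters saved for the given HTML source."""
    counter = EndTagCounter()
    counter.feed(html)
    counter.close()
    return counter.saved


page = "<body>\n\t<h1>Home</h1>\n\t<p>Welcome to my website!</p>\n</body>"
print(estimate_saving(page))  # prints 13
```

Running this over a folder of real pages and comparing the result with each file's size would give a first impression of the percentage saved.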

Using smaller encodings

My second idea has a much smaller use case, probably just for people who would like to greatly decrease the size of their website/page (perhaps it could be used CAREFULLY for LBPs). It also doesn't affect HTML syntax so much as promote the use, and upgrade, of a lesser-used feature.

The standard encoding for HTML pages is UTF-8. Technically page developers are supposed to declare this[2]; however, browsers will try to guess the encoding if it is not declared. Many sites, especially small ones, are likely to use only a small number of the available characters. Therefore, if someone made a tool which could detect the most compact encoding a website can be stored in, people could encode their websites in that instead and shrink the size of their pages.

This could even be done on web servers: someone could upload their UTF-8 encoded webpage and the web server could automatically detect the best encoding and re-encode the page. This way it wouldn't affect the development process of webpages.
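
A minimal sketch of the detection step, in Python: it tries a list of candidate encodings and keeps whichever produces the fewest bytes. The function name smallest_encoding and the candidate list are assumptions picked for illustration, not an existing tool. Note that UTF-8 already stores ASCII characters in one byte each, so only pages containing non-ASCII characters stand to gain anything.

```python
def smallest_encoding(text, candidates=("ascii", "latin-1", "cp1252", "iso8859-15", "utf-8")):
    """Return (encoding_name, size_in_bytes) for the most compact candidate.

    Candidates that cannot represent every character in the text are skipped.
    On a tie the earlier candidate wins. Returns None if no candidate fits.
    """
    best = None
    for name in candidates:
        try:
            size = len(text.encode(name))
        except UnicodeEncodeError:
            continue  # this encoding is missing at least one character
        if best is None or size < best[1]:
            best = (name, size)
    return best


print(smallest_encoding("<p>hello</p>"))  # ('ascii', 12)
print(smallest_encoding("café"))          # ('latin-1', 4), versus 5 bytes in UTF-8
print(smallest_encoding("€5"))            # ('cp1252', 2), versus 4 bytes in UTF-8
```

A server doing this automatically would also have to declare the chosen encoding when it sends the page, since the browser has no other reliable way to know which one was picked.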

Also, building upon this, large webpages which use at most 256 distinct characters could have custom encodings which are declared early in the page, a bit like custom fonts (alternatively they could be placed in the main directory of the site, a bit like .ico files).

This way, even if their set of characters doesn't fit any existing one-byte encoding, they can make their own (ironically they'd likely have to use UTF-8 to declare the custom encoding) in order to decrease their page's weight.
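
To make the custom encoding idea concrete, here is a hypothetical sketch in Python. Nothing like this exists in browsers; the functions encode_custom and decode_custom and the table layout (the page's distinct characters, written out once in UTF-8) are invented purely for illustration.

```python
def encode_custom(text):
    """Return (table, body): a UTF-8 character table and one byte per character."""
    alphabet = sorted(set(text))
    if len(alphabet) > 256:
        raise ValueError("more than 256 distinct characters; one byte is not enough")
    index = {ch: i for i, ch in enumerate(alphabet)}
    table = "".join(alphabet).encode("utf-8")   # the declaration itself is UTF-8
    body = bytes(index[ch] for ch in text)      # each character becomes its table position
    return table, body


def decode_custom(table, body):
    """Rebuild the original text from the table and the one-byte-per-character body."""
    alphabet = table.decode("utf-8")
    return "".join(alphabet[b] for b in body)


text = "naïve → café ☕\n" * 50
table, body = encode_custom(text)
assert decode_custom(table, body) == text
print(len(text.encode("utf-8")), len(table) + len(body))
```

The table is a fixed cost paid once per page, so the approach only pays off when the page is large and contains enough multi-byte characters to outweigh it.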

I think this idea (it was kind of two ideas), especially the second part, is much less realistic. It could make web development more complicated in some ways, and convincing browser makers that it's worth supporting would likely prove difficult. Also, declaring the wrong encoding can break sites, so it would carry risks too.

References

(All external links)

  1. "Attributes for an element are expressed inside the element's start tag." HTML5 - W3C. Accessed 6 December 2021.
  2. "You should always specify the encoding used for an HTML or XML page." Declaring character encodings in HTML - W3C. Accessed 6 December 2021.