Explain the Emergence of Markup Languages.


This report explains the emergence of markup languages. Its aim is to describe and evaluate the different types of markup languages that exist at present. In particular, it focuses on the following areas:

  • Background to Markup Languages
  • SGML
  • HTML
  • XML
  • WML
  • SMIL
  • Other markup languages (e.g. MathML, CML).


To produce this report, a secondary research technique was used. This enabled us to gather information from the Internet, which was the most effective medium for our research because resources and up-to-date information were available there.


Historically, the term markup was primarily used to describe annotations or other marks within a text intended to instruct a compositor or typist how a particular passage should be printed or laid out (Lou Burnard, 1995).

During the 1970s, computer scientists at IBM conceived of breaking away from display and printing instructions towards structural markup. This was possible because the structure of a document could be described in terms of a (large) set of general structural units (paragraph, abstract, example, etc.) (George Dillon, 1999).

After the production of text was automated, the term markup was extended to include different types of special “markup codes” inserted into electronic texts to govern formatting, printing, or other processing (Lou Burnard, 1995).

The remaining sections of this report highlight the different markup languages that exist in cyberspace.


SGML (Standard Generalized Markup Language) is defined in ISO Standard 8879:1986. It is the international standard for creating descriptive markup languages. According to Martin Bryan of The SGML Centre, “SGML takes the concept of descriptive markup beyond the level of other markup languages”. This is true to some extent. He goes on to say, “By defining the role of each piece of text in a formal model, users of programs based on the SGML can check that each element of text is used in the correct place”. This means SGML allows a computer to check, for instance, that users do not mistakenly enter a third-level heading without first having entered a second-level heading.
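
The heading check described above can be sketched in a few lines of Python. This is only an illustration of the kind of structural rule that a formal model makes checkable, not an SGML parser. The function name `find_skipped_headings` and the sample heading levels are assumptions made for this example.

```python
def find_skipped_headings(levels):
    """Return (position, level) pairs where a heading skips a level.

    `levels` is the sequence of heading levels in document order,
    e.g. [1, 2, 3] for a first-, second- and third-level heading.
    """
    problems = []
    previous = 0  # no heading seen yet
    for position, level in enumerate(levels):
        # A heading may go at most one level deeper than the previous one.
        if level > previous + 1:
            problems.append((position, level))
        previous = level
    return problems


print(find_skipped_headings([1, 2, 3, 2, 3]))  # well-ordered headings
print(find_skipped_headings([1, 3]))           # third level directly after first
```

A real SGML system would derive such rules from the document model rather than hard-coding them, but the principle is the same: because the role of each element is declared, a program can detect misplaced elements automatically.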

According to Martin Bryan, the SGML language allows users to do the following:

  • Link files together to form composite documents.
  • Identify where illustrations are to be incorporated into text files.
  • Create different versions of a document in a single file.
  • Add editorial comments to a file.
  • Provide information to supporting programs.

Critics may argue that SGML is a predefined set of tags that can be used to mark up documents, or a standardized template for producing particular types of document. However, this argument can be refuted because SGML was not designed to be a single standardized way of coding text: it is impossible to devise a single coding scheme that will work with all languages and applications. Instead, SGML is a formal language that can be used to pass information about the component parts of a document to another computer system. This is one of the advantages of the language because “SGML is flexible enough to be able to describe any logical text structure, whether it be a form, memo, letter, report, book, encyclopaedia, dictionary or database.” [Martin Bryan of The SGML Centre].

The Components of SGML

Syd Bauman (1997) outlines the major features of SGML as follows:

  • Text divided into elements, which can nest.
  • Element boundaries marked by tags. 
  • Elements carry generic type and other attributes. 
  • Entity references allow string substitution for character set problems, standard boilerplate text, and document management.
  • Consistent use of delimiters, few special characters.
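
Several of the features listed above can be seen in a small sample document. The sketch below uses XML syntax (a restricted form of SGML) so that Python's standard library can parse it. The element names, attribute values, and the `org` entity are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document in XML syntax (a restricted form of SGML).
SAMPLE = """<!DOCTYPE report [
  <!ENTITY org "The SGML Centre">
]>
<report status="draft">
  <chapter id="c1">
    <title>Background</title>
    <para>Published by &org;.</para>
  </chapter>
</report>"""

root = ET.fromstring(SAMPLE)

# Elements nest: report > chapter > title, para.
chapter = root.find("chapter")
child_tags = [child.tag for child in chapter]

# Elements carry attributes in addition to their generic type (the tag name).
status = root.get("status")
chapter_id = chapter.get("id")

# The entity reference &org; is replaced by its declared text during parsing.
para_text = chapter.find("para").text

print(child_tags, status, chapter_id, para_text)
```

Here the tags mark element boundaries, `status` and `id` are attributes, and the entity declaration acts as a piece of boilerplate text that can be changed in one place.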

Many would argue that SGML is the same as other markup languages. According to Martin Bryan of The SGML Centre, this is not true, and we agree. He says, “SGML differs from other markup languages in that it does not simply indicate where a change of appearance occurs, or where a new element starts. SGML sets out to clearly identify the boundaries of every part of a document, whether it be a new chapter, a piece of boilerplate text, or a reference to another publication. But SGML does not presume that it will be told where everything starts and ends. Instead it provides rules that allow the computer to recognize where the various elements of a text entity start and end. By careful use of these rules the amount of coding that needs to be entered by a human operator can be reduced to a bare minimum.

To allow the computer to do as much of the work as possible, SGML requires users to provide a model of the document being produced. This model, called a Document Type Definition (DTD), describes each element of the document in a form that the computer can understand. The DTD shows how the various elements that make up a document relate to one another.

To allow the computer to correctly identify where each part of a document starts and ends SGML requires that the user declares, in an SGML Declaration, how the computer is to identify markup, and what codes have been used to identify and delimit markup sequences.”
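
The role of the document model can be illustrated with a deliberately simplified sketch. A real DTD is written in SGML declaration syntax and also expresses ordering and repetition rules. Here the model is reduced to a hypothetical Python dictionary, `MODEL`, that only lists which child elements each element may contain, and the sample documents use XML syntax so that Python's standard library can parse them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, much-simplified stand-in for a DTD:
# each element name maps to the set of child elements it may contain.
MODEL = {
    "report": {"title", "chapter"},
    "chapter": {"title", "para"},
    "title": set(),
    "para": set(),
}


def check(element, errors=None):
    """Collect messages for elements that the model does not allow."""
    if errors is None:
        errors = []
    allowed = MODEL.get(element.tag)
    if allowed is None:
        errors.append(f"unknown element <{element.tag}>")
        return errors
    for child in element:
        if child.tag not in allowed:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            check(child, errors)
    return errors


good = ET.fromstring(
    "<report><title>T</title><chapter><para>p</para></chapter></report>"
)
bad = ET.fromstring("<report><para>stray text</para></report>")

print(check(good))
print(check(bad))
```

The first document conforms to the model and produces no messages; the second places a paragraph directly inside the report, which the model does not permit, so the check reports it.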

Advantages and Disadvantages of using SGML

So far we have described what SGML is and its major features. We will now briefly highlight its main advantages and disadvantages.

The main advantages include:

  • It is a descriptive markup, which allows separation of data from processing specifications and re-use of text by multiple processes.
  • It possesses a hierarchical structure that allows a person to manipulate and manage units of text of different sizes (e.g., not only chapter titles, but entire chapters).
  • It is flexible: it does not dictate what the parts of your document are, just how you express what the parts are.
  • It has a well-defined structure and interface for non-text notations (graphics, sound, etc.).
  • It uses a plain-text representation, which means it is human-readable, system-independent, always printable, and always editable.

However, one of the main disadvantages of SGML is that it is possible to abuse the language in non-descriptive ways, for example through a rainbow DTD, processing instructions (PIs), or almost any other non-descriptive use (Syd Bauman, 1997).

In summary, SGML is a markup language that uses conventions for encoding text. Its potential is significant, and other markup languages such as XML were introduced because of it. What we have said here covers only part of SGML, but we believe it is enough to understand the foundations of the language. For more information on SGML markup codes, refer to the references section.


HTML (HyperText Markup Language) is the best-known markup language and the language of the Web: Web pages are written in HTML and Web browsers understand it. It is a simple language that is "well suited for hypertext, multimedia, and the display of small and reasonably simple documents." (Bosak, 1997, Paragraph 3). HTML includes a finite set of tags that comprise the elements allowed in an HTML document. These include tags for titles, paragraphs, tables, linking, etc.
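
As a rough illustration of this fixed tag set, the sketch below feeds a small, hypothetical HTML page to Python's standard `html.parser` module and collects the page title and the targets of its links. The page content and the class name `PageSummary` are assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical sample page using a few of the predefined HTML tags.
PAGE = """<html>
<head><title>Markup Languages</title></head>
<body>
<h1>Markup Languages</h1>
<p>See the <a href="sgml.html">SGML</a> and <a href="xml.html">XML</a> pages.</p>
</body>
</html>"""


class PageSummary(HTMLParser):
    """Collect the page title and the targets of its hypertext links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs.
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept; other text is ignored.
        if self._in_title:
            self.title += data


summary = PageSummary()
summary.feed(PAGE)
print(summary.title, summary.links)
```

Because the meaning of tags such as `title` and `a` is fixed by HTML itself, any program (a browser, a search engine, or this short script) can interpret them without a separate document model.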

HTML gives authors the means to:

  • Publish online documents with headings, text, tables, lists, photos, etc.
  • Retrieve online information via hypertext links, at the click of a button.
  • Design forms for conducting transactions with remote services, for use in searching for information, making reservations, ordering products, etc.
  • Include spreadsheets, video clips, sound clips, and other applications directly in their documents.

Tim Berners-Lee invented HTML while at CERN. During the 1990s, HTML flourished with the explosive growth of the Web, and during this period it was extended in a number of ways. The Web depends on Web page authors and vendors sharing the same conventions for HTML. This has motivated joint work on specifications for HTML.

Most people agree that HTML documents should work well across different browsers and platforms. Achieving interoperability lowers costs to content providers since they must develop only one version of a document. If the effort is not made, there is much greater risk that the Web will devolve into a proprietary world of incompatible formats, ultimately reducing the Web's commercial potential for all participants.

Each version of HTML has attempted to reflect greater consensus among industry players so that the investment made by content providers will not be wasted and that their documents will not become unreadable in a short period of time.

As the Web community grows and its members diversify in their abilities and skills, it is crucial that the underlying technologies be appropriate to their specific needs. HTML has been designed to make Web pages more accessible to those with physical limitations. HTML 4.0 developments inspired by concerns for accessibility include:

  • Better tables, including captions, column groups, and mechanisms to facilitate non-visual rendering.
  • A new client-side image map mechanism (the MAP element) that allows authors to integrate image and text links.
  • Long descriptions of tables, images, frames, etc.
  • Better distinction between document structure and ...
