

Speaking of Mathematics

Brian Hayes

Writing as Programming

AsTeR is based on what the computing professions would recognize as a "client-server" model of the publication process. The author or publisher, taking the role of server, encodes the content of a document; the reader, as client, decides how the content will be presented. For this model to work well, the server must choose a style of encoding that captures the underlying meaning of a document, not just its graphic layout.

A markup language like TeX can be used either way, to record meaning or merely to describe layout. It has low-level primitives that simply define the appearance of objects. The superscript operator is one of these: It indicates that a character is to be raised above the baseline, without giving a clue to whether the character is an exponent, a limit, or something else. But other commands specify structure more than layout; they include \title and \footnote as well as \frac and \sqrt.
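A small illustration of the contrast may help. The fragment below is only a sketch; the particular formulas are chosen here for illustration and do not come from AsTeR or its documentation. The superscript operator appears three times with three different meanings, while \frac and \sqrt say what kind of object is being built.

```latex
\documentclass{article}
\begin{document}

% The low-level operator ^ only says "raise this"; the meaning differs each time.
$x^{2}$              % an exponent
$\sum_{i=1}^{n} i$   % the upper limit of a summation
$A^{T}$              % something else: a matrix transpose

% Structural commands name the object rather than its position on the page.
$\frac{a}{b}$        % a fraction with numerator a and denominator b
$\sqrt{x}$           % a square root

\end{document}
```

A program reading this source has to guess what each raised character means, but it can speak \frac{a}{b} or \sqrt{x} directly, because the command itself names the structure.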

Unfortunately, authors do not always distinguish between content and layout. A particularly troublesome practice is the overloading of operators to give them multiple meanings. On the printed page, notations for a fraction, a Legendre symbol and a logical inference might all look alike, but if they are all built with the \frac control sequence, AsTeR and other rendering programs will be unable to distinguish them. The overloading is needless, because TeX makes it easy to define a new control sequence for each concept.
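Defining one control sequence per concept takes only a line each. The sketch below is an assumption about how an author might do it: the names \legendre and \inference are hypothetical, invented for this example, and are not part of standard LaTeX or of AsTeR.

```latex
\documentclass{article}

% Hypothetical semantic macros: one control sequence per concept.
% Each is built on \frac for the printed page, but the source keeps them distinct.
\newcommand{\legendre}[2]{\left(\frac{#1}{#2}\right)} % Legendre symbol
\newcommand{\inference}[2]{\frac{#1}{#2}}             % premises over conclusion

\begin{document}

\[
  \frac{a}{p}                        % an ordinary fraction
  \qquad
  \legendre{a}{p}                    % a Legendre symbol
  \qquad
  \inference{P \quad P \to Q}{Q}     % a logical inference
\]

\end{document}
```

The printed output still relies on \frac, but the source now records which concept the author meant, so a rendering program can treat each one differently simply by supplying its own definition of the macro.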

Authors and publishers would be well advised to avoid overloading and ambiguity—and not just for the convenience of certain readers with special needs. Authors cannot know in advance how their works will eventually be put to use, and there may be occasion later to give thanks for extra care taken now. Witness the current scramble to convert documents in dozens of haphazard formats to HTML, the markup language for the World Wide Web. The job would be easier if authors had not written with the thought that the printed page is the final product.

Knuth has argued that the writing of computer programs is first of all a kind of writing, and ought to be judged by literary standards. Today the opposite assertion also holds: Writing is a kind of computer programming, in which you prepare not just a printed document but the source code that will generate many renderings of that document. And thus the writer may need to learn some lessons from the software engineer.

Even if AsTeR could fluently read all LaTeX documents, most of the world's literature would still remain out of reach. A few documents are encoded in formats that lend themselves even better than TeX does to flexible renderings, such as SGML, the Standard Generalized Markup Language. But most documents exist only on paper or in electronic formats that preserve only layout information. Among these formats the most important are PostScript and PDF (Portable Document Format), both of which were invented by Adobe Systems. Adobe has recently announced a plan to address this issue, and it has someone on the staff who is certainly well qualified to do so.

© Brian Hayes
