Monthly Archives: April 2012

On Wishing Death to Word

So I keep seeing links to this denunciation of the dysfunction of Microsoft Word in Slate: Tom Scocca, “Death to Word”.

One heartily endorses the sentiment. Scocca’s example of what pasted Microsoft Word XML looks like is comedy gold (and all too familiar from student blog posts). The heart sinks, however, when it turns out that the only alternative Scocca knows to mention is TextEdit, even though his explicit concern is with the crippling defects of Word when it comes to moving documents between print and HTML. In other words, the entire universe of text editing software (as opposed to word processors) is invisible to the writer of the article. No doubt he can’t imagine any way to break Word’s near-monopoly, let alone that there are both open-source and commercial systems of long standing that are much more versatile.

As I keep learning when I try to explain my use of LaTeX to humanists, the first obstacle is that the very concept of text processing is alien to most word-processing, WYSIWYG-expecting users. The response to a screen full of my TeX source is, “How do you print that?” Such users have long accepted the endless frustrations of Word in exchange for the relative simplicity with which it allows them to produce printable documents and share them. Or they have accepted the frustrations because the alternatives are unknown, maybe inconceivable without a different kind of conceptual framework.

But it is baffling, in a way, that though people who write are willing to spend many, many hours learning to persuade Word to do its job and fighting with its problems, the same people are unlikely to spend the hours (probably fewer, in the end) needed to become adept at text processing. Somehow the digital facts of life about text (markup, text encoding, processing) are quarantined in Code Land, the forbidden zone where only the Techies dare to venture. And everyone knows it’s okay for humanists and literary people not to be Techies. In spite of that they become, by default, technicians of Word but not technicians of text.

The latter would be better. Why is this not part of everyone’s basic digital literacy curriculum? Oh, wait, we don’t have a widespread basic digital literacy curriculum. But we should, as part of the goal of distributing the cultural capital of genuinely useful literacy as widely as possible. And it should include some lessons on two distinct tasks: composing text in a digital medium and processing digitally-composed texts into other formats (including print). Everyone should have a chance to learn what it’s like to write text in a text editor and then do something with that text in a processor.
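
For anyone who has never seen the two tasks side by side, here is a minimal sketch of the first one, composing. I use LaTeX only because it is the system I know; the file name essay.tex and the sentences in it are made up for illustration.

```latex
% essay.tex: plain text, written in any text editor
\documentclass{article}
\begin{document}

\section{A first section}

This sentence has one \emph{emphasized} word. The markup says what the
text is (a section heading, an emphasis), not what it looks like.

\end{document}
```

The second task, processing, is a separate step: running pdflatex essay.tex at the command line turns this source into essay.pdf. That is the whole answer to “How do you print that?”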

Extra thoughts

I really wish the popular blogging platforms made the ability to swap between HTML and WYSIWYG editing more prominent, and encouraged everyone to do it. It seems to me that if more of the people who are writing on the web could be encouraged to play with this ready-made demonstration of how they are really first composing marked-up text and then rendering it in a browser, many more people could become technicians of text. And the day when we dance on Word’s grave would come a little closer.

I also think very highly of Markdown. Tumblr, bless its heart, allows you to compose posts in Markdown. Markdown is easy to write, and its relation to HTML is easy to understand. Thus you can actually see how your composed Markdown text leads to HTML, and then you can render it yourself in-browser.

E-mail clients too. Who thought RTF would be a good “rich” e-mail format?

To come, maybe

A guide for the perplexed on how to gain “reading knowledge” of LaTeX, if you are ever working with a TeXhead like me who shares their source and gets crabby if you ask for it “in Word.”

Filed under Conversion, General Reflections, Word

Percent-Ampersand Is Not Shebang

%& is not a shebang.

Somewhere on the internets I picked up the idea that the first line of a tex file should specify what flavor of TeX it’s written in, where flavor was one of TeX, LaTeX, XeLaTeX, XeTeX, etc. I think I was imagining that it was some kind of TeX-internal version of the “shebang” #! that tells a Unix shell what program to use to execute a script. Indeed my standard template file had such a line (but not any more).

Wrong.

After a lot of googling and digging and frustration, I have learned the following. (This sort of thing happens a lot when you want to know about anything that has to do with the running of TeX, rather than just typesetting with it.) The %& comment, which I propose to call the “peramp,” is a mechanism for feeding a “format” to the tex engine. A format, as one learns from Victor Eijkhout’s book TeX By Topic, is a kind of precompiled bundle of macros that the basic TeX engine can use. (Even “plain” TeX, it turns out, is the TeX engine using a “plain” format.) It is possible to make your own formats and feed them to the TeX engine at the command line (cf. the tex manpage) with a command like tex '&myformat' (the ampersand is quoted because a bare & means something else to the shell). As one might guess, LaTeX is implemented as a format. The pdftex engine, furthermore, supports switching among plain TeX, LaTeX (both dvi output), pdfTeX, and pdfLaTeX using formats. So (this is kind of cool) if your source file has a peramp line of the form %&engine, you can process it with any of latex, pdftex, and pdflatex and get consistent results equivalent to processing it with engine at the command line. In fact it seems that for a while all of these engines have secretly been the single pdftex program invoked with the appropriate format.
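
A minimal sketch of a source file with a peramp line, assuming a TeX installation in which a format named pdflatex exists and first-line parsing is switched on (both are assumptions about your setup, not guarantees; the file contents are invented):

```latex
%&pdflatex
% The peramp must be the very first line of the file. To TeX's
% typesetting machinery it is an ordinary comment; only the
% program's start-up code looks at it.
\documentclass{article}
\begin{document}
Hello from whichever format got loaded.
\end{document}
```

If things work as described above, running latex, pdftex, or pdflatex on this file should give the same result as running pdflatex on it directly.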

Apparently some other TeX varieties I don’t use can also be handled this way. Unfortunately, XeTeX, which I do use, is not secretly a TeX format. As the author of XeTeX testily explains in this mailing list post I found, xetex is an independent engine and must be invoked as such on the command line. pdftex will not switch over to xelatex if it finds %&xelatex at the start of the file, and xelatex will not switch over to pdftex in the converse situation. The engine xetex does support TeX formats compiled specifically for it—that is what XeLaTeX is. So if you run xetex on a file that begins %&xelatex it will indeed be processed as XeLaTeX and not as plain XeTeX.
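
The XeTeX case can be sketched the same way. The fontspec package is my choice for the example because it refuses to load under pdfTeX, so the file only makes sense as XeLaTeX; the text is invented, and I am assuming the file is saved as UTF-8.

```latex
%&xelatex
% Read by xetex as a request for the xelatex format.
% pdftex will not hand this file over to xetex.
\documentclass{article}
\usepackage{fontspec} % needs XeTeX or LuaTeX
\begin{document}
Accented letters typed directly: naïve café.
\end{document}
```

Run xetex on this file and, going by the explanation above, it should be processed as XeLaTeX; run xelatex on it and the peramp line is simply redundant.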

The water is further muddied because I seem not to be the only person to have picked up this idea about peramp-as-shebang, and there are programs on the internet that use this method or a variant. In particular the GUI front-end TeXShop does support this kind of engine detection, with its own distinctive first-line syntax: %! TS-program = XeLaTeX (note the caps). But this is specific to TeXShop, not a feature of TeX.

Working with TeX one often feels one has stepped into a kind of Bizarro *nix Land: a lot of things look very similar to, but not quite the same as, things in a Unix-style programming environment. This is, I think, mostly evidence of TeX’s age (older than the GNU project) and devotion to backward compatibility. It’s also evidence of the fact that TeX users have not, by and large, been programmers but scientists, who (in my small experience) seem to specialize in Bizarro Programming.

[Edit, same day: If you use tex at the command line, it will not process a peramp-line for one of the other formats.]

Filed under General Reflections, running tex