Creating Professional Quality Documents with Free Software
Over the last ten years I’ve done a lot of technical writing for my college courses. Every week I write quizzes, lab assignments, and documentation, and now and then I write blog posts like this one. Taken all together it amounts to thousands of pages.
Since I’m lazy, I’m always looking for tools that will save time on document generation (and give me more time to watch Ninja Warrior). I also want to create good-looking documents that can be converted to HTML, PDF, and the occasional ePub when I need it. So far I haven’t found one magic piece of software that can do everything I need, but there is a very sweet cocktail of cool free software that will do the job. My toolkit keeps evolving and I’m always on the lookout for something better. In the meantime, here’s my current list of document-generation tools.
1 LyX
LyX is my number one writing tool. It’s a LaTeX-based document processor that helps you write documents in a structured way; it bills itself as the editor that does what you mean to do. As you’re writing, you mark your text according to what it means, and LyX will format it appropriately. This is very different from a word processor, which requires you to format your text yourself. If you write scientific documents, LyX can format any mathematical formula that can be printed by LaTeX.
LyX is a little different from Word or Pages, so expect to go through some rough patches as you go along.
I usually write my documents in LyX, then export them as HTML and PDFs. I’m writing this post in LyX, and I’ll be exporting it as a LyX document. Some other export formats are:
- Postscript
- DVI
- LaTeX
- RTF
- Plain Text
- EPS
LyX runs on Windows, Linux, and OS X, and it’s free. Fortunately, you don’t need to know anything about LaTeX. LaTeX is one of the most complex pieces of software in existence, but LyX shields you from the pain. All you have to do is write.
You can get LyX at http://lyx.org.
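Exporting can also be scripted from a terminal. Here is a minimal sketch, assuming your LyX build accepts the --export command-line option and uses the format names pdf2 (PDF via pdflatex) and xhtml (LyXHTML); these names can differ between versions, and export_lyx is a hypothetical helper name, not something that ships with LyX:

```shell
# Hypothetical helper: export one LyX file to each requested format.
# Assumes `lyx` is on the PATH and accepts: lyx --export FORMAT FILE
export_lyx() {
  if [ $# -lt 1 ]; then
    echo 'Usage: export_lyx input.lyx [format ...]' >&2
    return 1
  fi
  src=$1
  shift
  # Default to PDF (pdflatex) and LyXHTML when no formats are given.
  [ $# -gt 0 ] || set -- pdf2 xhtml
  for fmt in "$@"; do
    echo "Exporting $src as $fmt"
    lyx --export "$fmt" "$src" || return 1
  done
}
```

Calling export_lyx notes.lyx would try both default formats; export_lyx notes.lyx pdf2 would produce only the PDF.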
2 elyxer.py
1 lyx2html5(){
2   html5_head=~/Dropbox/includes/html5_head.html
3   html5_footer=~/Dropbox/includes/html5_footer.html
4   if [ $# -lt 2 ]
5   then
6     echo 'Usage: lyx2html5 input.lyx "title"'
7   else
8     echo "Converting $1 to ${1%.lyx}.html"
9     echo '---'
10    /usr/bin/elyxer.py --iso885915 --nofooter --notoclabels \
        --raw --title "$2" "$1" "${1%.lyx}.txt"
11    cat "$html5_head" "${1%.lyx}.txt" "$html5_footer" > "${1%.lyx}.html"
12    sed "s/THE_TITLE/$2/" < "${1%.lyx}.html" > tmpfile
13    mv tmpfile "${1%.lyx}.html"
14    rm "${1%.lyx}.txt"
15  fi
16 }
- Lines 2 and 3: The files that contain the HTML 5 templates that will enclose the eLyXer-generated HTML.
- Line 10: elyxer.py is called with several options, the most important being “--raw”, which tells elyxer.py to omit the XHTML/HTML 4 doctype.
The HTML 5 template includes the placeholder text “THE_TITLE”, which is replaced with $2, the second argument to the function lyx2html5().
lyx2html5 somefile.lyx "This is the Title"
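One caveat with the sed call on line 12: sed treats /, & and \ specially in the replacement text, so a title like "Input/Output & You" would break or garble the substitution. Here is a minimal sketch of one way around that, assuming single-line titles. Both escape_sed_replacement and fill_title are hypothetical helpers that are not part of the original function:

```shell
# Hypothetical helper: backslash-escape the characters that are special
# in the replacement half of a sed s/// command (\, / and &).
escape_sed_replacement() {
  printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# Hypothetical helper: read a template on stdin and replace the first
# THE_TITLE placeholder on each line with the title passed as $1.
fill_title() {
  sed "s/THE_TITLE/$(escape_sed_replacement "$1")/"
}
```

With these in place, line 12 could call fill_title "$2" instead of invoking sed directly, keeping the same input and output redirections.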
3 Markdown
Markdown:
A First Level Header
====================

A Second Level Header
---------------------

Now is the time for all good men to come to
the aid of their country. This is just a
regular paragraph.

The quick brown fox jumped over the lazy
dog’s back.

### Header 3

> This is a blockquote.
>
> This is the second paragraph in the blockquote.
>
> ## This is an H2 in a blockquote
Output:
<h1>A First Level Header</h1>
<h2>A Second Level Header</h2>
<p>Now is the time for all good men to come to
the aid of their country. This is just a
regular paragraph.</p>
<p>The quick brown fox jumped over the lazy
dog’s back.</p>
<h3>Header 3</h3>
<blockquote>
<p>This is a blockquote.</p>
<p>This is the second paragraph in the blockquote.</p>
<h2>This is an H2 in a blockquote</h2>
</blockquote>
4 exitwp.py
- Clone the exitwp Git repository: git clone https://github.com/thomasf/exitwp.git
- Export your WordPress blog using the WordPress exporter.
- Put your WordPress XML file into the wordpress-xml directory inside the exitwp folder.
- Run xmllint on the XML file to find errors, and correct them.
- Inside the exitwp folder, run the converter: python exitwp.py
- Find your converted Markdown pages in the build directory. You can move them to your Octopress blog source directory.
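The steps above can be strung together in a small shell function. This is only a sketch: wp2markdown is a hypothetical name, and it assumes git, xmllint, and python are installed, that exitwp’s own Python dependencies are already set up, and that you have already downloaded the export file from WordPress.

```shell
# Hypothetical wrapper around the migration steps; run it from the
# directory where the exitwp checkout should live.
wp2markdown() {
  if [ $# -ne 1 ]; then
    echo 'Usage: wp2markdown wordpress-export.xml' >&2
    return 1
  fi
  export_file=$1
  # Clone the converter only if it is not already present.
  [ -d exitwp ] || git clone https://github.com/thomasf/exitwp.git || return 1
  # Stop early if the export is not well-formed XML.
  xmllint --noout "$export_file" || return 1
  # Put the export where exitwp looks for it.
  mkdir -p exitwp/wordpress-xml
  cp "$export_file" exitwp/wordpress-xml/ || return 1
  # Run the converter in a subshell so the caller's directory is unchanged.
  (cd exitwp && python exitwp.py) || return 1
  echo 'Converted posts are in exitwp/build'
}
```

If xmllint reports errors, the function stops so you can fix the XML by hand and run it again.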
5 Pandoc
About Pandoc
If you need to convert files from one markup format into another, pandoc is your Swiss Army knife. Pandoc can convert documents in markdown, reStructuredText, textile, HTML, DocBook, or LaTeX to:
- HTML formats: XHTML, HTML5, and HTML slide shows using Slidy, Slideous, S5, or DZSlides.
- Word processor formats: Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML
- Ebooks: EPUB
- Documentation formats: DocBook, GNU Texinfo, Groff man pages
- TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
- PDF via LaTeX
- Lightweight markup formats: Markdown, reStructuredText, AsciiDoc, MediaWiki markup, Emacs Org-Mode, Textile
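As a rough illustration of how that looks on the command line, here is a sketch that converts one Markdown file into several of the formats listed above. It assumes pandoc is installed and relies on pandoc guessing the output format from the file extension given to -o; md2all is a hypothetical helper name.

```shell
# Hypothetical helper: convert one Markdown file to several formats.
# Assumes pandoc is on the PATH; pandoc infers the output format
# from the extension passed to -o.
md2all() {
  if [ $# -lt 1 ]; then
    echo 'Usage: md2all input.md [extension ...]' >&2
    return 1
  fi
  src=$1
  shift
  # Default targets when none are given.
  [ $# -gt 0 ] || set -- html docx epub
  for ext in "$@"; do
    # -s (standalone) asks for a complete document, not a fragment.
    pandoc -s -f markdown -o "${src%.md}.$ext" "$src" || return 1
  done
}
```

Running md2all post.md would write post.html, post.docx, and post.epub next to the source file. Passing pdf as an extension should also work, but only if a LaTeX installation is available, since pandoc produces PDF via LaTeX.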
6 Octopress
Octopress is my choice for blogging platform: it’s a static site generator that uses Sass and Markdown, as well as Ruby and RVM. Its main advantage is that there’s no database involved, and no server-side scripting language that can be hacked. It has all the robustness and speed of static HTML pages. You write your posts in Markdown, and Octopress will convert them to static HTML files.
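The day-to-day workflow runs through rake. Here is a minimal sketch, assuming an Octopress 2.x checkout whose Rakefile provides the new_post and generate tasks; octo_new and octo_build are hypothetical wrapper names, and titles containing commas may need extra care because rake splits task arguments on commas.

```shell
# Hypothetical wrappers around Octopress 2.x rake tasks; run them from
# the root of an Octopress checkout.

# Create a new Markdown post with the given title.
octo_new() {
  if [ $# -ne 1 ]; then
    echo 'Usage: octo_new "Post Title"' >&2
    return 1
  fi
  # Quoting keeps the shell from treating the brackets as a glob.
  rake "new_post[$1]"
}

# Regenerate the static HTML site from the Markdown sources.
octo_build() {
  rake generate
}
```

A typical session would be octo_new "My First Post", then editing the generated Markdown file, then octo_build.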
Octopress is for hackers only. It’s not for beginners. If you feel comfortable with PHP, and you’re fine with WordPress having thousands of plugins, and you’ve gotten accustomed to dealing with the constant stream of security upgrades, you should stick with WordPress. Don’t even look at Octopress.
But, if you like to hack some code, and you like Ruby, and you want to learn about modern tools like Sass, and if you enjoy tinkering with the guts of your software, you owe it to yourself to take a look at Octopress. It’s a breath of fresh air.
Visit octopress.org to learn more.