Workflow for documenting open source software
Abstract: A method for offering valid documentation both for the last release and for the current trunk version.
© Copyright Daniel Krajzewicz, 13.11.2016 00:08, cc by-nc-nd
Good documentation is a key asset of any software. But in contrast to commercial applications, open source software has two types of users. The first works with the last release only (at least as soon as it is stable enough). The second wants to use the current (trunk) version, mostly because it contains additional, not yet released features or bug fixes. Both need up-to-date documentation.
In the following, a short motivation for this article is given first. Then, the initial procedure for generating the documentation for SUMO is described. Afterwards, the currently used workflow, which takes the needs of both user types into account, is presented. The article ends with a brief summary.
A Motivation Outline
Now, what is software documentation? Besides helping people use the software, it should also be treated as a kind of advertisement. It sells the product by explaining its use and by showing its capabilities.
As software develops over time, the documentation has to be updated accordingly. Mostly, "accordingly" means that every version of the software has its own documentation that describes what this version does.
In principle, this would allow closing the work on the documentation at the same time as the work on the software that makes up the release. More realistically, the work on the software ends first, and the documentation is then completed and/or revalidated once more.
Besides these changes and their timing, documentation, like every human artefact, may contain errors, ambiguities, or mistakes. In the following, it is assumed that no attempt is made to correct such documentation mistakes before the next release. The reason is the wish to keep the documentation of a release consistent with that release.
But what if the software is accessible between two subsequent releases? This is often the case with open source software, whose sources can be obtained from freely accessible revision control systems. Between two releases, major things such as command line options or parameters may have changed, and without proper documentation a user would have to look into the code.
First Attempt: DocBook
The first approach to documenting SUMO was to use DocBook. DocBook is "a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation" (http://docbook.org/whatis). The documents being worked on were kept in the revision control system and were converted into HTML documents at every new release.
The documentation was frozen in the SVN repository together with the software source code of the corresponding release. The .pdf and .html documents built from the DocBook sources were included in the released software package. One should note that the results were satisfying enough, even satisfying enough to be plagiarised: SiMTraM user documentation (local copy).
Now, few people want to read DocBook source documents, and the tool chain for obtaining .html or .pdf files from DocBook is not a one-click thing. Also, the documentation was not always kept up to date with the implementation during daily work; rather, it was prepared for the next release. This made life complicated for some users, as there was no easily readable documentation describing the changes made to the software between two releases.
Second Attempt: wiki-parsing
Due to the aforementioned issues, I tried something new: the documentation was written in a wiki (a copy is still available at http://sumo.dlr.de/wiki). At each release, a script was run that converted the wiki pages to HTML pages.
Now, we could change the documentation within the wiki during development. It should be pointed out that using a wiki is much more comfortable than writing DocBook, even though some great tools, such as XMLmind XML Editor, are available. Consequently, the wiki contained all corrections and changes that reflected the work done between the releases. A user who wanted to know the current state of the software could simply look into the wiki - and the wiki's history even makes it possible to follow relevant changes to the software! When releasing a new version, the corresponding (literally up-to-date) .html documentation was built by converting the wiki pages into plain HTML with some surrounding navigation snippets.
You may be interested in the source code of the script that converts the wiki pages into HTML. You can find it in the SUMO SVN as buildHTMLDocs.py. Besides the conversion itself, it does some additional things, e.g., patching the links to mirrored images, pages, etc. Surely, one could attempt to build a more flexible tool for this purpose. I did not - yet. BTW, the tool works with a MediaWiki instance as used for the SUMO documentation. It will probably not work with other wikis, as it directly extracts parts of the HTML pages generated by the wiki system by scanning for certain tags.
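To give an idea of what such a conversion involves, here is a minimal sketch in Python. It is not the code of buildHTMLDocs.py; it only illustrates the two steps described above: cutting the article body out of a rendered wiki page by scanning for markers, and patching internal links so that they point to mirrored local files. The marker comments, the "/wiki/" link prefix, the file naming scheme, and all function names are assumptions made for this illustration and would have to be adapted to the wiki version and skin actually used.

```python
import re

# Assumed markers: some MediaWiki skins wrap the article body in comments
# like these. The real markers depend on the wiki version and skin.
START_MARKER = "<!-- start content -->"
END_MARKER = "<!-- end content -->"


def extract_content(page_html):
    """Return the part of a rendered wiki page between the two markers."""
    start = page_html.find(START_MARKER)
    end = page_html.find(END_MARKER)
    if start < 0 or end < 0 or end < start:
        raise ValueError("content markers not found")
    return page_html[start + len(START_MARKER):end].strip()


def patch_links(fragment, wiki_prefix="/wiki/"):
    """Rewrite internal wiki links to point to mirrored local .html files.

    The naming scheme (replacing "/" and ":" by "_") is hypothetical.
    """
    pattern = re.compile(
        r'href="' + re.escape(wiki_prefix) + r'([^"#]+)(#[^"]*)?"')

    def to_local(match):
        page = match.group(1).replace("/", "_").replace(":", "_")
        anchor = match.group(2) or ""
        return 'href="%s.html%s"' % (page, anchor)

    return pattern.sub(to_local, fragment)


def wrap_page(title, body):
    """Surround the extracted body with a simple navigation snippet."""
    return ("<html><head><title>%s</title></head><body>\n"
            '<div class="nav"><a href="index.html">Index</a></div>\n'
            "%s\n</body></html>") % (title, body)
```

A release script along these lines would fetch every documentation page, call extract_content and patch_links on it, and write the result of wrap_page to a local file. Scanning for fixed markers is what ties such a script to one wiki system: a different wiki or skin produces different HTML around the article body.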
OK, what about defining some requirements post mortem? We have two kinds of open source users: those who use the release and its stable documentation (now .html, earlier .html and .pdf) and those who want to use the SVN version and accordingly need up-to-date documentation. Both benefit from wiki-based documentation that is used to generate fixed HTML pages at release time.
Was that the major reason for moving to the wiki? No, in fact, the ease of editing is a major benefit as well. DocBook is still terra incognita to many people. But finally, one should state that the major issue is still not solved: the documentation has to be written.