LaTeX for documentation in large projects

Hi everyone,

Today I spent nearly 8 hours chasing the very same content through more than 30 documents in order to bring the entire documentation set up to date. There must be a more efficient way to structure documents and their content so that information is not duplicated so wildly.

Single source of truth is a very basic principle that IMO is necessary when dealing with hundreds of documents throughout the lifecycle of a relatively large project.

Our systems are not big, but a project can last several years from the breadboard to the flight unit, and it goes through a long list of reviews with a large set of documents, each with its own individual evolution.

Needless to say, many people may contribute to the same document, and often, due to scheduling and priorities, we are obliged to minimize the effort of updating a portion of the docs for a specific milestone, with the chance that part of the information is not correctly propagated throughout the whole set; sooner than one can imagine, the whole documentation set is an intertwined list of broken links!

On top of this, our nice friend 'Word' is making our life even more miserable, forcing us to spend more time than acceptable on stupid formatting issues.

In my previous life I started to build an online documentation system for our software users, using texinfo. It was pretty neat and simple, with hierarchical nodes and leaves, and it was accessible throughout our computer network in several formats (info, ps, pdf, html). A simple makefile took care of bringing the latest changes up to date and everything was under version control. That was pretty OK for a limited number (some tens) of people more comfortable with a terminal than with some flashy GUI.

Using texinfo in this case is certainly not viable given the formatting prerequisites of our documents, but maybe LaTeX could prove a viable solution. The main issue is training 98% of the department to use a markup language instead of MS Word (ouch).

After all this bla-bla I finally come to a more specific question: is there anyone out there who has considered, or is using, LaTeX in a collaborative environment for their product documentation?

Any ideas/suggestions/remarks? I do realize that even if I establish the rational basis for a documentation system based on LaTeX, or anything similar, I may face irrational obstacles from people who would rather keep things as they are instead of 'losing' time with a markup language for nerds only.

Thanks in advance,






I can sympathize with your dilemma. A few (disorganized) comments.

MSWord isn't good for anything -- even a one page memorandum! Wait until you discover that you can't access documents created with version N-2 or whatever!

You will probably discover that you need a "Documentation Czar" to "enforce" policy/consistency in your documents. In reality, this individual will become the chief grunt -- responsible for FIXING everyone else's screwups!

GUI's are dangerously seductive. They allow users to focus on the *appearance* of a document instead of its semantic content. "OK, that's in italics like it is supposed to be!" (No, italics are used for several different types of tags in our organization... concentrate on getting the TAG right, not the appearance of the text rendered for that tag!)

Long ago, I used Ventura Publisher for my technical documents. But, abandoned it when Corel bought the product and switched to a "compiled"/proprietary format for the documents (prior to that, you could always open the "original document" in a text processor and apply fancy transformations that the GUI tool would never have envisioned).

After that, I started using FrameMaker. In some ways, more capable. In others, more limited. I have recently been contemplating moving to a full SGML implementation (newer versions of FM support SGML to some degree so my "conversions" shouldn't be too painful). But, that's still a fair bit of effort, on my part, and a fair bit of *risk*, so I have been procrastinating on that decision (and, meanwhile, accumulating more documents that will eventually have to undergo any such conversion!)

Beyond the tool(s), by far, the biggest problem will be training folks for the proper mindset. To treat words, terms, phrases, etc. as more than just collections of letters/glyphs. IME, this is a much tougher nut to crack! :<

Good Luck,


Don Y

Read what Don has to say.

If you do adopt LaTeX, I suggest you use LyX as the authoring tool. It's not WYSIWYG, which is good from a technical standpoint, but it does let the author type in an environment that's sort of what-you-see-is-what-you-meant, and it makes it easy to spin off PDFs. It's what I use for authoring most of my documents here, unless a customer specifically insists on Word-compatible documents.

But LaTeX isn't going to automatically solve your duplicated-info problem, any more than Word does: what you need is a process, that everyone signs up to, that makes ONE source for information golden, and insists that other documents point to that information source.

It means that someone reading a document needs to have the whole pile and constantly cross-reference, but that's just kind of life.
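One concrete LaTeX mechanism for pointing at the golden source is the xr package, which lets one document resolve labels defined in another. A minimal sketch; the document name "icd" and the label are invented for illustration:

```latex
% sketch only: "icd" and "sec:power-limits" are invented names.
% xr reads icd.aux, so the golden document must have been compiled
% first in the same directory.
\documentclass{article}
\usepackage{xr}
\externaldocument{icd}   % the one golden source for these numbers
\begin{document}
The supply limits are defined in Section~\ref{sec:power-limits} of the
ICD; this manual deliberately does not restate them.
\end{document}
```

A broken reference then shows up as "??" in the output at build time, instead of surviving silently as stale duplicated text.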

Tim Wescott 
Wescott Design Services 

We use Word for most of our documentation, at least the stuff that's not plain ASCII text. It's not bad. You can include logos, pictures, tables, and colors; during doc development, I use blue to highlight things that are new this edit. I use CutePDF to turn Word docs into PDFs for things like user manuals.

Going crazy with styles and tricks and macros can make Word unusable. Change tracking can be a mess, too.

I use Word 2003 and people with later versions don't seem to have problems with my docs.

Markup languages seem barbaric these days. I say that having written a few of them myself.

John Larkin         Highland Technology, Inc 

jlarkin att highlandtechnology dott com 

As I said, a "Documentation Czar". Getting folks to adopt common writing conventions for *prose* makes adopting consistent CODING STANDARDS seem like a piece of cake!

In my case, the documentation is tightly integrated with the code. This ensures that the documentation and code remain in "sync" (because the same "source" is used in each place).

Using semantic tags also facilitates "live" links between the code and the documentation -- so, interactive help (on demand or as a result of a thrown error) can directly access the actual documentation instead of hinting at some "error code", etc.

Depending on the markup language in use, the cross references can be somewhat seamlessly provided -- i.e., "point at" what you want to reference (vs. having to synthesize a unique name for a reference, insert that at the target, then refer to it as needed).

With something BEYOND a simple "text interface" (to your documents), you can provide features that would be virtually impossible to (completely) implement, maintain and verify.

E.g., my gesture interface uses a single representation for a specific "gesture" in the documentation (which can be "print", interactive or "otherwise") as well as in the code. So what the user is *told* about a specific gesture is what the *code* actually considers that gesture to be! (think about how you would document a "check mark" gesture in print, graphically, to a blind user, etc. -- and, ensure that the documentation agrees with what the *code* actually checks! How do you advise the user who is having problems getting the system to correctly recognize a '0' vs. a 'O', etc.)

Don Y

A very large percentage of the open source world is using Markdown for this kind of thing. It's much more approachable than LaTeX for beginners, and is intended to be easily readable as plain text.

There are a number of dialects. Check the GitHub variant in particular. I'd also recommend using a central git repository (I deployed GitLabHQ, which is like a private GitHub, and trivial to deploy).

Clifford Heath.

Clifford Heath

I use LaTeX for all my nontrivial word processing nowadays. I used to be a devotee of WordPerfect 5.1+ for DOS, which I'd still gladly use except that publishers don't accept WP anymore.

There are various documentation packages for special purposes, e.g. doc++.


Phil Hobbs



Dr Philip C D Hobbs 
Principal Consultant 

IME, many (most?) large corporations and consortiums go for market-majority stuff, typically M$, and there may not be much that can be done to change the situation. In that case, the challenge becomes to get the doc control people to minimise the imposition of complex templates, which can cause endless hours of grief for documenters who aren't WP gurus.

While on the topic, another observation... where did the tendency come from to abandon descriptive titles in favour of totally cryptic document names? I've had enough of trawling through hundreds of drawings where you have to open up every one to see what it's about.

Bruce Varley

Open Office of course.


One of my customers assigns a 12 digit number to any entity: a drawing, a piece of code, a physical entity, a document, an assembly, an employee. The schematic, PCB layout, BOM, bare board, assembled board, are all unrelated numbers.

They tend to lose things.

I hate web sites that make you open a zillion PDFs to see what parts they have. Then they make the file names meaningless.

John Larkin                  Highland Technology Inc   jlarkin at highlandtechnology dot com    

Because most firms have other systems for tracking the names of documents (whereas *you* are just looking for an individual doc).

What conventions should you adopt for naming documents?

What if the name contains upper and lowercase alphabetics -- and *your* filesystem doesn't distinguish between cases? (i.e., ReadMe and README can't coexist in the same namespace).

What about the fondness for colons in names (Google: The Missing Manual)?

Or, forward/back slashes (User I/O Capabilities)?

Or, questionmarks, etc.?

Or, special characters?

Or, embedded whitespace?

Or, filename length limitations? (try copying "An incredibly long name for the title of a particularly interesting document, 3rd Edition -- by John Doe, 1988.djvu" to "\An equally\long\folder\designation \downloads\August\Electronics\..." *on* your "Desktop" -- which is itself a rather lengthy absolute path)

Computers (and computer systems) are very good at remembering things (like the name, content, keywords, author, publisher, date, etc.) and associating them with, e.g., files. It's dealing with the *informal* systems that individuals have/impose that leads to these sorts of problems. [Do you tag the file name with a detailed description of its content? Or, do you rely on a search tool to index your documents? Or, do you just read through documents hoping to stumble on what you need?]
Don Y

A dumb idea: have a number of R/O "block" phrases / references / pictorials, and the author splices (hyperlinks?) them into their document. This pool of R/O blocks can grow as needed. Hell, I have seen a few documents (unreadable by anyone who does not know the "code") using up to 30% "shortcut" multi-letter contractions. Just think of a document written with "shortcuts" similar to that, but where the "shortcuts" in this case are the "block" phrases (that is, to increase clarity).

Robert Baer

Dot all of the "eyes" and cross all of the "tees"....

Robert Baer

This doesn't work in the real world.

How do you reference them? How/where are they stored? Who creates/enforces the "system" by which they are used? How do you track down which "documents" reference which objects?

And, you STILL "need to have the whole pile and constantly cross-reference" (else the writer can never render the document he is authoring)

As annoying as GUI/WYSIWYG i/f's can be, there is real value in being able to call up some *other* document and POINT to the thing you want to reference. And, let the *tool* sort out how to encode this reference for you.

For example, I may want to refer to a "section" as:

  (See "Troubleshooting tips" on page X-x)

Note that the *title* of the section will be filled in by the DTP tool -- so, if I change it to "When all else fails..." the text above will be replaced with:

  (See "When all else fails..." on page X-x)

everywhere in *my* document that references this -- along with every other document that similarly references it!

*And*, I may even want to reference my *reference*. I.e., take all of the above and paste it into documents as an entity. So, if I change this to reference a *different* section -- perhaps the section on ADVANCED Diagnostics -- then my *reference* to that section gets pasted into those places where I reference the reference!

(I find myself using this often with footnotes, etc. So, the one "Gold Standard" is automatically repeated instead of me having to remember to consistently express it in the same way each time I need to use/insert it.)
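In LaTeX terms, this self-updating, fill-in-the-title reference maps onto the nameref facility loaded by hyperref. A minimal sketch; the section title and label name are invented:

```latex
% sketch only: section title and label name are invented
\documentclass{article}
\usepackage{hyperref}   % loads nameref; also makes the reference a live link
\begin{document}
\section{Troubleshooting tips}\label{sec:trouble}
Some troubleshooting prose.

% title and page number are filled in at build time, so renaming the
% section updates every such reference on the next run
(See ``\nameref{sec:trouble}'' on page~\pageref{sec:trouble})
\end{document}
```

The author still has to invent the label name (sec:trouble), which is exactly the extra step a point-and-click DTP tool spares you.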

What's really needed is a programming language for structured documents... that isn't so intimidating that it precludes large numbers of people from adopting it comfortably!

Don Y

I don't use much LaTeX at work - I haven't yet converted others here. But I use it at home.

Someone suggested LyX as a front-end to LaTeX. I'm not sure that's the best plan - it gives you better quality typesetting than a word processor, but it is hard to use the more advanced features of LaTeX. Personally, I usually use TeXlipse with Eclipse, which suits me as I use Eclipse for programming. Another front-end I have heard nice things about is TeXstudio. Getting a good front-end that users are happy with will make all the difference here.

LaTeX has several advantages over a word processor, but also some disadvantages. If you are going to convert to it, you will need someone (possibly you) who is the "expert" to help with more advanced uses.

  1. LaTeX generates much nicer and clearer output. In particular, documents are consistent - you will never again get inconsistent spacing or fonts.
  2. LaTeX documents are far more amenable to version control. And LaTeX will never corrupt your source files, unlike certain word processors.
  3. Your output is in pdf. Anyone can read the output, on any device, without worrying about formatting and layout differences on different computers. The output is more advanced pdf (indexes, cross-links, etc.) than you can get from MS Word and expensive Adobe products, and is generated automatically and simply.
  4. You can automate it, and use macros. The possibilities here are endless.

  5. You can easily include different files and document sections. This could be an answer to having a "single source of truth" while still having the same information available in different documents. Have parts of the documentation in separate files, and include those parts when generating the full set of output documents.

  6. If you are working with source code, you can include formatted source code in your documents, knowing that a "make" will get the latest code.

As for the disadvantages:

  1. Your customers will have to come back to you when they want a change to the documentation, because they can't fix it themselves.
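The file-inclusion and macro mechanisms mentioned above can be sketched in a few lines; all command and file names here are invented for illustration:

```latex
% sketch only: \productname, \maxsupply and the commented file names
% are invented. In a real tree the \newcommand lines would live in a
% shared macros.tex that every document pulls in with \input{macros}.
\documentclass{article}
\usepackage{listings}
\newcommand{\productname}{Widget Mk.~II}  % defined once, used everywhere
\newcommand{\maxsupply}{5.5\,V}
\begin{document}
\section{Overview}
The \productname{} accepts a supply of at most \maxsupply.

% shared chapters are single-sourced the same way:
%   \input{common/overview}
% and code is typeset straight from the tree, so a rebuild always
% reflects the current file:
%   \lstinputlisting[language=C]{../src/main.c}
\end{document}
```

Changing \productname in the one shared file then propagates to every document on the next build.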

If you do stick to a word processor, I recommend LibreOffice over MS Office. I think it does a better job of encouraging the use of styles rather than ad-hoc formatting, and it can generate nice pdf documents (not as nice as with LaTeX, but better than anything you can get with Word and external pdf converters). And it's cross-platform.

David Brown

Can you separate the authoritative sources of information and link them into the main document? It used to be that Word would let you link an external source that would be referenced when printing the document.

OTOH if formatting is actually important then LaTeX is a win, but you need a way to reference the authoritative source rather than duplicating it; maybe LyX can help there, I'm not sure.

LyX/LaTeX, being text based, are more amenable to merging multiple simultaneous edits automatically using some sort of revision control.

umop apisdn 

Jasen Betts

Hi Don,

Don Y wrote: []

This is something I do not understand. As a company we are trying to improve our processes and workflow in order to get a better product in less time, and we pinpoint how important it is to check, verify and validate each technical step through the project's lifetime. Alongside that we have to deliver documentation; indeed a document is a component no less important than any other component in our system, a part of it without which you cannot:

A. prove you are doing it right
B. prove you are doing it on time

Then why is there such a large gap in process development applied to documentation? How can the stakeholders be at the mercy of tools like MS Word?

I personally found that in our customer set of applicable documents there was some degree of inconsistency (24 requirements out of 2019) which may have been potentially costly if found at a later stage in the development.

We have 'Czars' [1] around, we call them PAEs (product assurance engineers) and they tell us what we have screwed up with respect to norms, standards, versions, ... But it's not enough to guarantee links are not broken.

Relying on multiple eyes is not bad per se, but the quirk of it is that we tend to silently accept that if a mistake passed a review then it is not a mistake anymore, until two different tables report different numbers and the developer does not know which to pick (nor does his/her manager).

On top of that, tools like Reqtify, used to trace requirements, may analyze text based on some formatting styles which *may* look the same even if they *are* different, so the resulting document may have a set of requirements which are not 'captured' by the tool and go silently untraced until God knows when (typically at CDR, where you are ready to launch your flight production!). []

I totally agree with you. There's a tendency to deny the current problems we are facing (daily) and an innate inertia to refuse change, too often considered destabilizing. Some three years ago a revision control system was introduced (svn) and currently we are still facing 'acceptance' issues.

Another issue I currently see in front of my long revolutionary journey is the transition phase. Even imagining a ready-to-deploy documentation system (which I do not have yet), we can only deploy it on newly starting projects, since old ones are already infected by the MS virus. Engineers who are working on several projects will then need to continue in two different ways... kinda confusing if not unsustainable.

The system I have in mind is a hierarchy of units, where information is *clearly* defined in one single place. Tables (spreadsheets) and diagrams can live in a common directory, while ad-hoc ones can be in the specific document folder:

main                    # handles the data set with Makefiles
├── common              # dedicated to all common parts
├── lists               # list of documents, components, units
├── notes               # technical notes (potentially linked to change requests)
├── plans               # development plans
└── specs               # subsystem specs
    └── ABC-DEF-0120-R  # with textual sources as well as tables and
                        # diagrams unique to this subsystem
...

I forgot to add an 'output' folder which will contain the whole set of documents produced by the latest build. In an ideal world LaTeX2HTML [2] could be used to export the whole set of docs in HTML and allow browsing document content with a browser instead, where hyperlinks are a much faster way to reach the information.

Each directory shall have a makefile in order to handle word processing and bring the current directory up to date. Documents shall have a template (or class), or potentially a set of templates, in order to provide a uniform and coherent typeset throughout the project and also between projects.
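A per-directory makefile along these lines might be enough; this is a sketch assuming latexmk is available, and the document names and shared-macros path are placeholders:

```makefile
# sketch only: document names and the shared-macros path are placeholders
DOCS = spec.pdf plan.pdf

all: $(DOCS)

# rebuild a document when its own source or the shared macros change
%.pdf: %.tex ../common/macros.tex
	latexmk -pdf $<

clean:
	latexmk -C
```

latexmk reruns LaTeX as many times as needed to settle cross-references, which is exactly the drudgery you don't want engineers thinking about.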

The structure (just a draft) shall not evolve too much, otherwise it becomes unclear where a piece of information belongs, but it shall take into account all needs (I just sketched some off the top of my head).

Scripts and macros may facilitate the generation of chapters, especially tables starting from spreadsheets which are then included into a document. A Makefile would then be a perfect fit for handling dependencies.
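For the spreadsheet-to-table step, one option worth evaluating is the pgfplotstable package, which typesets a CSV export directly, so the spreadsheet stays the single source. A sketch; data.csv is a placeholder for the spreadsheet's CSV export:

```latex
% sketch only: data.csv stands in for a spreadsheet's CSV export
\documentclass{article}
\usepackage{pgfplotstable}
\usepackage{booktabs}
\begin{document}
\pgfplotstabletypeset[
  col sep=comma,
  string type,  % treat cells as text rather than numbers
  every head row/.style={before row=\toprule, after row=\midrule},
  every last row/.style={after row=\bottomrule},
]{data.csv}
\end{document}
```

With the CSV listed as a Makefile dependency, editing the spreadsheet and running make regenerates the typeset table automatically.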

The main idea here is to remove two issues from the current situation:

  1. information duplication
  2. formatting

While 1. can be addressed with a hierarchical structure, it is certain that people have to know where information is, and what is considered sharable versus unique to a document. As in programming, it is not always clear at the beginning which subset of your main program will end up in a library function (especially if you start coding without knowing where you need to end).

Secondly, 2. shall not be a problem for the engineer who's inputting technical information. He should provide correct data, while data representation shall be done by the typesetter. Luckily there are a couple of guys in the room who have the skills to take care of 2.


Hi Tim,

Tim Wescott wrote: []

LaTeX -> Open Document Format -> doc

I've successfully converted some files, but some others completely missed the images from the original... I'm not sure why.

W.r.t. LyX, it may be a good idea to provide a GUI to people who are accustomed to using one (Word), and it may potentially hide the nuances of the set of Makefiles behind it.

Agreed, that's the most difficult part, as also noticed by Don. It is not only a problem of acceptance, it is also a problem of custom. While software guys are kind of keen to switch to something like LaTeX, because the sole mention of Word gives them the creeps, other engineering disciplines are less open to it and stick with their Windows/Word environment.

There must be a list of common 'functions', a library definition which everyone needs to look up first. The issue is not for whoever is reading, but for whoever is writing. Reading unique information is guaranteed by using the same 'library', pretty much as with function reuse in software/firmware practice.

[1] Is 'Czars' the plural of 'Czar'?
[2] I've had a great deal of trouble with LaTeX2HTML and switched to texinfo for that kind of need.

At a non-negligible license cost, as well as installation effort, compatibility issues and many others...

The deadliest part of Word indeed is not related to its (lack of) logic, but rather to the very small benefit it provides to casual writers, instilling a rather vicious belief that formatting a document is a simple task.

As soon as your document starts falling apart, you realize how difficult it would be to manage formatting with a tool like Word.

An appropriate documents db will allow you to set the title, as well as its document number (usually automatically selected for a specific type of document), according to agreement with the customer.

We separate the document number from the file(s) itself, since we reserve the number well ahead of writing the file. The 'object' document is indeed rather more complex than a simple 'document title' and can be classified as a report, note, spec...

If you get only the file, without the rest of the information, you'll likely find yourself lost in the docs. Anyway, whenever we store the file we name it like this: crypticname_issue#_descriptivename.extension. In this way we have the best of both worlds.

BTW, cryptic names usually *do* have a logic, since they are automatically assigned and classified according to some *rules*. Once you understand them you will likely find more information than with a more 'human readable' title.


ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.