LaTeX for documentation in large projects

Because folks treat documentation as a "check-off" item: Have documentation? YES/NO

Many people/organizations don't produce formal specifications, test plans, test results, etc. They just have some "random" collection of paper that they can point at when asked about "documentation"... and, no one has much interest in checking to see if any of it is correct or USEFUL!

I have long adopted the approach of writing a User's Manual before beginning a design. It forms an informal specification -- in terms that a user can understand (instead of "stuffy tech-speak"). And, it helps me think about a *unified* approach to a design -- rather than letting the design "evolve" (which seems to be the design methodology du jour!) in unconstrained directions.

When that is done, a customer/client *knows* what the end product will be like and can critique each design decision that the document presents "as fact". It lets hypothetical users ask, "How do I..." and "What if..." questions -- all of which SHOULD be covered in the document.

At that point, the actual *implementation* is a piece of cake! All of the decisions have already been made -- no fear of coming up with two DIFFERENT approaches to two *similar* problems in the user interface because "you got a better idea when working on the second".

Yup. First, they need to be able to stop "delivery" of the product (in order for their efforts to have any "weight"). Second, they need to be "nit-pickers" to ensure they have the requisite skills to *catch* ALL discrepancies.

I'm "too small" to take on many of the projects that I do undertake. And, too "lazy" (unwilling to put in extra effort that shouldn't be necessary). So, I try to design mechanisms that amplify my efforts; do more by doing less.

Tying the documents to the actual implementation is one such example. The documents and the implementation are always in sync (if you let the makefiles do their thing!). It's also a carrot for future developers (maintainers) as the documents provide a friendlier way of viewing and entering changes to the codebase.

For example, a recent document explains and tabulates the rules by which I convert letters to sounds (part of a TTS). The document organizes the rules in a nice, easy to read format. A piece of code that I wrote extracts the rules from the document, rearranges them to satisfy the optimizations that the run-time implements, then encodes them for inclusion in the actual run-time.
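A minimal sketch of that kind of flow as a makefile fragment -- tool and file names invented for illustration, and recipe lines must be tab-indented:

    # The document is the single source of truth; the runtime tables
    # are derived artifacts, rebuilt whenever the document changes.
    rules.h: manual.mif
    	./extract-rules manual.mif | ./encode-rules > rules.h

    tts: tts.c rules.h
    	cc -o tts tts.c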

A developer *could* insert himself into the middle of that process if he chose to. I.e., take the encoded output from version X of the ruleset and manually introduce changes to advance it to version Y -- without updating the documentation. But, it's almost certain that he will introduce a bug/typo in the process. And, will have to manually revise the regression suite to cover his changes (another opportunity for errors).

Expecting developers to be "lazy", the *intended* way of modifying the rules -- by altering the documentation -- is so much easier (and robust) that it is unlikely anyone will *try* to circumvent it!

Don't discount the fact that many people are not invested in the process. And, others may not be familiar enough with the technology to be *competent* to recognize a subtle mistake! Or, leery of expressing their uncertainty ("Surely Bob would have commented on this *if* it was a genuine mistake...").

One legacy letter-to-sound algorithm used "wildcards" to represent letter patterns of interest. E.g., "any number of voiced consonants". But, the original implementation language was SNOBOL. Folks recoding the algorithm (into C, most often) would carelessly interpret that as, literally, "any number of voiced consonants". And, naively implement a greedy matching algorithm (a common idiom in C):

while (voiced(*pointer))   /* voiced() tests membership in {L, J, V, D, ...} */
    pointer++;             /* greedy: consumes EVERY voiced consonant in the run */

This *looks* correct. Until it is applied in particular contexts: %D (where % is the aforementioned wildcard). Obviously, "%D" should match "coLD", "aDDed", etc. I.e., the % matches the first (and ONLY the first) of these voiced consonants and the explicit 'D' matches the immediately following 'D' -- even though it, too, is a voiced consonant.

But, the above implementation will fail -- due to its greed!

This is a common latent bug in implementations of this particular algorithm. Because the folks re-implementing it (in C) failed to understand how the original SNOBOL implementation operated.
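For contrast, a sketch of the semantics the SNOBOL original actually had: the wildcard must be willing to stop short so that the rest of the pattern can still match. This is illustrative C (with an abbreviated voiced-consonant set), not any particular historical implementation:

    #include <string.h>

    /* Abbreviated voiced-consonant set, for illustration only. */
    static int voiced(char c)
    {
        return c != '\0' && strchr("BDGJLMNRVWZ", c) != NULL;
    }

    /* Match pat against txt.  '%' = one or more voiced consonants,
       matched NON-greedily: take one, try the rest of the pattern,
       and only take more if that fails.  Other pattern characters
       are literals.  Returns 1 on a match at the start of txt. */
    static int match(const char *pat, const char *txt)
    {
        if (*pat == '\0')
            return 1;                     /* pattern exhausted: success */
        if (*pat == '%') {
            while (voiced(*txt)) {
                txt++;                    /* consume one more consonant...   */
                if (match(pat + 1, txt))  /* ...then try the rest of pattern */
                    return 1;
            }
            return 0;                     /* no run length let the rest match */
        }
        return *pat == *txt && match(pat + 1, txt + 1);
    }

With this, match("%D", "LD") succeeds -- the '%' takes only the 'L' -- whereas the greedy loop above swallows both letters and then fails.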

Tools like MSWord appeal to folks who think "pretty printing" is a goal. Or, who are tickled with the prospect of embedding a picture or a scope trace in a document.

Lately, my documents have been interactive. E.g., the document I am working on presently allows the user (i.e., reader) to explore how various glottal waveform parameters affect the *sound* of the spoken voice -- by adjusting them and *listening* to the resulting pronunciation of (canned) words.

[You could spend paragraphs trying to explain these sound qualities and never be certain the reader understands; but, give him an actual sound sample to evaluate -- and contrast -- and your confidence in his understanding goes up markedly!]

In my case, I opted for Perforce -- much to the chagrin of all who advised me on the subject! A big part of that resistance, IMO, was a desire to operate in their own little isolated fiefdoms, detached from The Organization -- and a failure to perceive the needs of others in that organization!

Sorry, change is always painful. :( That's often why folks cling desperately to old ways of doing things and "wetware systems" (in which the "system" has been designed to fit in someone's braincase early on -- and never revised when the constraints of that braincase were exceeded!).

An exec at a Fortune 500 company ($10B/sales) once quizzed me on the design of part numbering systems. I gave the typical reply: "Numbers beginning with 1 for vegetables; 2 for fruits; 3 for meats; 4 for cereals; etc. The next digit could refine this further: 11 for leafy vegetables; 12 for legumes; etc."

He took a tomato out of his desk drawer: "Vegetable! 1XXXX" "No, it's a *fruit*!" (how many folks trying to rely on this "wetware" system would make a similar mistake?)

"Hmmm... What about berries? Strawberries, blueberries...?" "... tomatoes, avocados, grapes..." "Huh? Aren't those fruit?"

"And, where do we put *candy*? And vitamins? And..." "All those oddball things can go in the 9's!"

I.e., systems that appear simplistic usually are... too simplistic! The conversation ended with him arranging the items on his desk in a haphazard order and "identifying" them in exaggerated fashion: "1, 2, 3, 4, 5, 6... get the picture?"

"Then, how do you know what a 62347 is?" "I type the part number into this computer and it tells me everything I want to know about it! What it is, what it costs in materials, labor, how many we have on hand, how many we have active orders for, how many we sold last year, what time of year has the greatest demand, where (geographically) that demand is located, etc. To *someone* in this organization, each of those items are THE MOST significant aspect of this product. If *that* person was designing a part numbering system, he would choose to encode *that* data in the part number and care little about the criteria *you* chose!!"

[I can't resist quoting Earl Sinclair: "As you can see, I have separated all known dinosaur wisdom into three categories: animal, vegetable, rocks."

"Water is the opposite of fire, which we have previously established as a vegetable. What's the opposite of a vegetable? Fruit. So, water is a fruit! Fruit is not a vegetable, so it has to be either an animal or a rock. We know it's not an animal. Therefore, fruit is a rock."]

Why the need for specific folders/directories for each item type? Sooner or later, you will end up with huge directories and shrinking namespaces. Why not let things live where they "should" live -- just ensure they are accessible from everywhere that they should be accessed?

E.g., if I embed a particular object in a particular document... then, at a later date, decide that the object can also be used in some other document, I don't refactor the original document to extract the object and move it to some "shared" location. I just reference it where it was.

[I am becoming a huge fan of relational databases! Letting objects reside in the DBMS instead of as files in a filesystem. It makes it easier to see dependencies]

Your goals are far more ambitious than mine -- I just want to keep my docs synchronized with the objects they describe. I count on the VCS to handle much of that "make" overhead, currently.

You can also opt to not be concerned with "presentation" (depends on where your docs will be consumed). E.g., web pages tend to be content driven; PDFs are layout driven.

Also, consider what you will want *in* those documents. I think (as evidenced from my current efforts) that documents will become much more "active" than "dead tree products". So, you may find that "text's" role decreases over time to other media forms.

Good luck!

Reply to
Don Y

You're at the mercy of all sorts of tools, from oscilloscopes to paper shredders. Word is reasonably efficient and reliable if you use it sensibly.

The requirements documents that we get from customers range from simple emails ("can you build us an LVDT simulator?") to voluminous and mainly wrong. I have one that says, literally, "blah blah" and "not sure - ask Olav" where Olav quit and went to work for Qualcomm. TBD is the most common TLA in lots of these.

We just design what we figure they actually need. Sometimes I lie to my customers so that I can include things that I know they will eventually need.

I recently received a requirements doc that had 16 reviewers, and is full of obvious mistakes, TBDs, and blank sections. One requirement, supported by drawings, shows grounded BNC connector inputs; not far away it is forbidden to use single-ended coaxial inputs. What's a boy to do?

Content, correct and complete content, matters a lot more than form or editing tools. We document some things with photos of whiteboard sketches.

--

John Larkin                  Highland Technology Inc 
www.highlandtechnology.com   jlarkin at highlandtechnology dot com    

Precision electronic instrumentation
Reply to
John Larkin

Hi David,

David Brown wrote: []

sigh...

As I said in an earlier post, people used to a GUI will likely expect another GUI to do the same job. On top of that, whenever people see my 'green on black' terminals, the first thing they think of is The Matrix (just to explain the kind of environment I work in)!

A friendly editor for LaTeX sources may be a viable option. I believe there's a plethora of open-source packages doing that, and typically they are highly configurable, especially when it comes to 'building'.

We have a couple of guys who have quite some experience with it (one amused himself by describing all of his master's thesis state machines in LaTeX), so there shouldn't be a problem having an expert available. OTOH, a documenting system should be less permissive in terms of formatting and reduce the amount of 'clutter' a user may introduce.

Inconsistent spacing or fonts are not the main issue, while having the document name not properly propagated to the header of page 56 is rather annoying.

I used to co-author scientific papers with LaTeX and SVN, and even right before submission, merging was extremely simple.

So far I envy Adobe products for only one thing: PDF annotations. There's no free equivalent, and the lack of a standard makes it very difficult to stay portable across programs. I find PDF annotations quite useful when reviewing docs.

We have defined a class to format documents as standard company documents; it worked like a charm and has a bunch of nice commands already built in. The difficult part is getting it widely and coherently used. Maybe we should come up with a plan and an example in a tutorial project.
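Usage then boils down to something like this (class and command names here are invented for the example):

    % All layout and house style live in the class file; authors
    % supply only metadata and content.
    \documentclass{companydoc}      % hypothetical in-house class

    \docnumber{XYZ-2014-0042}       % hypothetical commands the class provides
    \docrevision{B}
    \title{Widget Firmware User's Manual}
    \author{A. N. Author}

    \begin{document}
    \maketitle
    \section{Scope}
    ...
    \end{document}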

Including files is a piece of cake; a less simple problem is to define the structure that holds the common parts, and what kinds of parts shall be common.

Sometimes we are also asked to include reports coming out of CAD tools (FPGA timing reports, ...). Luckily those reports are often spit out as text files, and are easy to embed in a LaTeX document.
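One low-effort way to embed such a report (file name invented; the listings package is an alternative if you want smaller type or line numbers):

    % in the preamble:
    \usepackage{verbatim}

    % in the body -- the next build automatically picks up a re-run
    % of the CAD tool:
    \section{FPGA Timing Report}
    \verbatiminput{reports/top_timing.rpt}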

This is not common practice in the space industry. A configured item (document) cannot be edited, and only PDF files are contractually binding. I recently had an issue with one of our client's specification requirements, where there was no clear label for each requirement (marked with a bold *R*). In order to be able to trace the requirements, I asked them to change the document; they did not do it, and I found myself in the uncomfortable situation of having to modify my client's document in order to be sure I covered all the requirements.

Due to licensing issues, we are slowly moving towards LibreOffice. I find it much better than Word, but I have yet to figure out how to put my hierarchical structure in place.

Reply to
alb

We do that, too. It doubles as our requirements document, and we've got to write a manual eventually anyhow. Writing the manual first uncovers all sorts of issues that the usual requirements doc wouldn't. We sometimes put stuff into the preliminary manual that is really design related, and edit it out for the user versions.

Changes happen during design, so we keep the draft manual current as we go.

--

John Larkin                  Highland Technology Inc 
www.highlandtechnology.com   jlarkin at highlandtechnology dot com    

Precision electronic instrumentation
Reply to
John Larkin

Both TeXlipse and TeXstudio are LaTeX-friendly text editors, while LyX is a sort of half-WYSIWYG editor. TeXlipse is fine if you are used to Eclipse, but would probably be very strange to people not used to that sort of program-development tool. TeXstudio is a text editor dedicated to LaTeX, and thus has lots of toolbars full of symbols, etc., while not being as big and intimidating as Eclipse + TeXlipse. LyX tries to be more WYSIWYG, and makes some effort to make the document look like the final output even while editing.

But as you say, there are lots of such packages and editors - you'll have to try out a few to see what will suit best.

It can be fun doing diagrams in "pure" LaTeX, but I would probably do state machines in graphviz (dot) these days. Long ago I did some using MetaPost, but while it is a fascinating tool, it is not easy to learn.

As for getting consistent formatting, that's just a matter of making sure everyone uses the same document class, which one of your experts should set up including a set of standard packages. It is vastly easier than trying to enforce consistency in word processor files.

I would find it hard to get such inconsistency using LibreOffice, but I'm sure people with Word can manage it! Yes, LaTeX will help get that sort of thing right.

Foxit Reader seems to do annotations fine (I use it all the time as a PDF reader on Windows, but haven't made much use of annotations). Evince and Okular do annotations in Linux, but with some limitations (again, I haven't used annotations much with these programs).

Oh, yes - LaTeX is a useful tool, but it won't solve /all/ your problems!

Reply to
David Brown

TikZ (and related) seems to be the popular tool these days. Unless graphviz is newer; I haven't heard of it.

There's also a schematic drawing package out there, which I haven't tried. Hmm, y'know, it should be possible to parse simple formats like LTSpice .ASC into TeX'able code. Maybe there's a Package For That already, too?

Tim

--
Seven Transistor Labs 
Electrical Engineering Consultation 
Website: http://seventransistorlabs.com
Reply to
Tim Williams
[big snip]

The IEEE Standards Association, the source of things like the Ethernet (IEEE 802.3) and WiFi (IEEE 802.11) standards, standardized on FrameMaker with SGML at least ten years ago. The reason for the SGML requirement was to guard against the disappearance of data held only in a proprietary format. And FrameMaker is suited to the generation of huge, complex documents -- documents that will choke MS Word.

Joe Gwinn

Reply to
Joe Gwinn

I've tried a couple of these things, and always got stymied by conflicts between the macro sets they require to get the GUI part, and the ones I need to get the document done. (Publishers' macro sets, for instance.)

So I just went back to writing markup with a programmer's editor that does syntax highlighting and bracket matching.

I do use makefiles for LaTeX documents, but then I used to use one to automate WordPerfect for DOS as well. ;)

Cheers from the Galerie de la Reine in Brussels. Ypres tomorrow.

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
ElectroOptical Innovations LLC 
Optics, Electro-optics, Photonics, Analog Electronics 

160 North State Road #203 
Briarcliff Manor NY 10510 

hobbs at electrooptical dot net 
http://electrooptical.net
Reply to
Phil Hobbs

MSWord chokes on *any* document of substance! I was COMFORTABLY editing 500 page documents with VP3 more than 20 years ago... on a 25MHz machine! Those same documents would bring Word to its knees, today, on a 3GHz machine!

FM's .mif format, AFAICT, is still supported in current releases *and* well documented. But, the SGML approach would *guarantee* that the content could be extracted even if some proprietary format was used internally.

In my case, the appeal of SGML is that it lets me tag "objects" in the document and access them with other tools. Currently, I am doing that by adopting conventions for appropriate tags and their hierarchy within a (regular) FM document -- in much the same way that a formally structured SGML document might.

I've just got too many (thousands of) pages to be eager to try to convert/restructure them under SGML. Though the non-SGML "mode" of FM still handles them -- many versions later than the FM version of their original composition! (Can't say that of anything with MS in the title!)

While there are many things in VP that I miss (mostly, clever hacks), the move to FM has been a pleasant and rewarding one! (I think I moved at about FM4 or so.)

Reply to
Don Y

When I was at uni, we all used AUCTeX for Emacs.

-Lasse

Reply to
Lasse Langwadt Christensen

Hi Lasse,

Lasse Langwadt Christensen wrote: []

Unfortunately, most of the courses I've followed (some 15 years ago) were MS-oriented, and no introduction whatsoever was dedicated to *nix systems and everything that comes with them.

IMHO, *nix systems encourage the user to explore and learn; he/she can go as deep as he/she wants, without any limitation. This approach changes the way you see a machine, and suddenly you realize how much freedom you have.

LaTeX is just another piece of a large infrastructure that has put a lot of effort into separating mechanism from policy (E. S. Raymond), but when you grew up believing that there's little alternative to 'MS Word', you start to adapt to it, developing habits and beliefs that are hard to crack.

Al

Reply to
alb

When I was at IBM, all of our documents were written in Frame. I liked using Frame, a lot (I once had a Windows version but I don't know what I did with the disks). OTOH, trying to force MS Word to do what I want it to do is a PITA.

Before Frame, going back at least 40 years, and probably ten before that, they used an internal markup language ("Script"). At one time IBM was the largest publisher in the world. That's a lot of bits.

Reply to
krw

And if I remember correctly, they mainly used a printer in Poughkeepsie that also printed a lot of comic books. (Hard to tell the difference.) lol

Reply to
John G

VS-Script - Old memories...

We got it just after I had done my master's thesis using RUNOFF. We even had a laser printer driven by it -- price range $70k. Some years later I wrote my own version of Script running on a Z80 CP/M system.

--
Reinhardt
Reply to
Reinhardt Behm

80s laser printer: [link]

Tim

--
Seven Transistor Labs 
Electrical Engineering Consultation 
Website: http://seventransistorlabs.com
Reply to
Tim Williams

Excellently stated.

?-)

Reply to
josephkk

On a project some years ago, we were having big problems with MS Word flailing with a 300-page requirements document that contained many equations and tables and figures. One symptom was that the live links in the text to those equations, figures, and tables became hopelessly scrambled. Another symptom was that the numbering of the figures et al. became random. I worked for a very large company, so the IT department queried Microsoft, only to be told that MS Word was intended only for documents of up to thirty pages. Gee, it didn't say *that* on the box.

I did suggest Frame, but nobody knew how to use it and didn't want to learn in the middle of the project, and the customer had specified Word anyway.

Frame was also considerably more expensive than MS Word. That didn't come up, but it would have. The obvious counter-argument is that Notepad is cheaper than Word, but neither of them can do the job, so the comparison is flawed.

So we just struggled on, and became expert in the coddling of MS Word. For one thing, all those live links were eliminated.

Yes, and the mere existence of this escape route helps maintain discipline and competition in the vendor community.

I don't think that IEEE does this, precisely to protect their escape route. And because IEEE Editors are technical writers, not programmers.

VP?

Joe Gwinn

Reply to
Joe Gwinn

"Script" came before "Script/VS" ("VS" == Virtual Systems). ;-) It was called something else, too, but It's been too long.

Yeah, there was a version for the PC, too, but WYSIWYG took over shortly after and we moved to Frame.

Reply to
krw
[attrs elided]

And, it's virtually impossible to MANUALLY verify that all of these cross references, index entries, table/page/figure numbering, etc. are correct! It's a matter of blind faith -- you really want to trust your tools.

Yup. But, you *learn*, quickly! ;-)

Moving to it from VP, I found it relatively easy to learn. Equation editor is a bit tricky -- especially if you want to exploit its abilities to factor and reduce! I missed the ease with which I could quickly pull up the "raw" text files and manipulate them by scripting a "text editor" (emacs, Brief, etc.). I don't think any of these tools have the versatility that you might need for odd/unusual replacements/transformations (though if you were to tag aggressively under SGML, you could probably make any of the changes that "sound reasonable" just by changing the presentation of those tags).

What I missed, most, was the ability to change a "format"/tag so that it displayed differently on recto/verso pages! In VP you could do this easily and come up with some interesting layouts that "magically" adapted the content to its *actual* placement in the document.

For example, one format that I used had wide "outside" margins (with annotations that I could place in this margin as "side heads" to draw attention to descriptions of key subjects and where those occurred in the body text). Within the SINGLE "body column" (which hugged the inner margin), I would place figures towards the outside margin with short descriptions ALONGSIDE on the inner margin. Note "inner" and "outer" not "left" and "right".

So, on a recto page:

read this!   XXXXXXXXXXXX   this is a description
             XXXXXXXXXXXX   of the picture that
             XXXXXXXXXXXX   it appears alongside!
             XXXXXXXXXXXX
             XXXXXXXXXXXX

And, if this ended up falling on a verso page:

this is a description   XXXXXXXXXXXX   read this!
of the picture that     XXXXXXXXXXXX
it appears alongside!   XXXXXXXXXXXX
                        XXXXXXXXXXXX
                        XXXXXXXXXXXX

I.e., so the layouts are mirror images. In FM, the only distinction for recto/verso lies in the page (left/right) templates -- I would have to treat this as multiple columns and specifically place each bit in the corresponding column!
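(For what it's worth, LaTeX's two-sided mode automates at least the side-head half of this: \marginpar notes land in the outer margin, flipping sides with the page. A two-line sketch:

    \documentclass[twoside]{article}
    ...
    Some body text.\marginpar{read this!}  % outer margin: right on a
                                           % recto page, left on a verso

Mirroring the figure/description columns themselves would take more work, but the page-aware placement is built in.)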

And pen and ink are even cheaper yet! :<

Yes. You end up AVOIDING features -- because they are unreliable (or inefficient).

"Tell me again: why are we using this tool?"

In my case, it has allowed me to gain access to the document in a form that I can massage algorithmically. The old VP (pre Corel) was like this -- plain text (with markup commands embedded).

Adobe products are like this, for the most part. E.g., I can create drawings in Illustrator and then excise the pertinent parts (eliding all the preamble, etc.) to move them into "objects" of my own choosing. E.g., this was how I originally "designed" all of the gestures in my gesture recognizer... draw them in Illustrator, extract the terse representation of each spline/bezier that forms the gesture's path, then import it into my codebase.

I would think they would, at least, do things like tag any document references *within* the document with IEEE_Standard or similar. Then, present the objects tagged thusly in "Bold", etc. In that way, they could mechanically determine which documents are referenced in each document, etc.

I think a publishing tool should probably disallow any commands that alter the appearance (in any way!) of text. FORCE the user to apply a specific tag in order to alter the presentation. In that way, be able to assign meaning to the alteration.

E.g., using "italics" for words in foreign languages (etc.), emphasis, titles of articles, etc. allows each to *appear* as it should (or, as we were taught in school) -- but loses the distinction between their intents! Force the user to tag "etc." with Foreign_Words, "My Scholarly Publication" with Article_Title, and stressed words with Emphasis.

Ventura Publisher. I used it in the early 90's. I am not sure if it still survives. Last time I looked at it, Corel had largely emasculated it! :<

Reply to
Don Y

And, of course, I got that exactly BACKWARDS!

Reply to
Don Y
