What's in a name?

- D
- Don Y
  
  Contact options for registered users
posted
1 year ago

Thu, Aug 25, 2022 11:34 AM

I'm scanning paper documentation (to rid myself of the dead tree collection) -- 50,000pp so far.

I notice lots of manuals with inconsistent titles:

- User Manual

- User's Manual

- Users' Manual

- Jan Panteltje

Thu, Aug 25, 2022 1:11 PM

On a sunny day (Thu, 25 Aug 2022 04:34:49 -0700) it happened Don Y snipped-for-privacy@foo.invalid wrote in <te7mp1$3ksa6$ snipped-for-privacy@dont-email.me:

Some 'manuals' are microscopic pieces of paper, sometimes multiple pages. I take pictures of those, resize the JPGs with xv to a normal format and combine them into PDFs with ffmpeg. 'Useless manual' (the originals) comes to mind.

- rbowman

Thu, Aug 25, 2022 1:51 PM

Nobody reads the damn things so does it really matter what they're called? At least the pressure is taken off the forests when the manual is at

formatting link

which is referenced in the 'quick start guide', a one page document in 14 languages.

- Don Y

Thu, Aug 25, 2022 8:14 PM

I think manuals for *applications* tend to be too much of a "tutorial"/HowTo nature; you're walked through a procedure to achieve a given goal (which may not be the goal you have in mind!).

This is likely the reason most "computer users" just know "click this to do X" instead of thinking about features and mechanisms in a more abstract sense.

E.g., when I first started using Ventura some decades ago (i.e., under DOS/GEM!), everything was presented in that HowTo manner. But, they didn't have a "HowTo" for "creating a list of tables/figures": "Can't be done!"

Of *course* it can be done -- but, only by exploiting implementation quirks, "See?"

I write "Reference Manuals" for the devices/appliances that I design in lieu of Specifications. They're just enumerations of controls and constraints. It's up to the user to figure out how he wants to use them to solve the problem at hand.

[If you need to be *taught* how to use the various triggering mechanisms on a DSO, then go elsewhere and find *a* tutorial for those needs; no need to build that into every DSO's manual!]

Every place in my code that can throw an error generates a unique code -- that is used to access a document that explains the condition as well as likely causes and remedies. This lets me create different messaging for "value too low" vs. "value too high":

    if (value < MINVAL) {
        error(VAL_MIN_ERROR);
    } else if (value > MAXVAL) {
        error(VAL_MAX_ERROR);
    }

instead of just a domain error:

    if ((value < MINVAL) || (value > MAXVAL)) {
        error(VAL_DOMAIN_ERROR);
    }

So, the code isn't littered with terse descriptions of errors (that tell the user very little -- in one language that only APPROXIMATES English!)
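A minimal sketch of how unique codes might key into such a document, assuming a simple in-memory table. Only error(), MINVAL, MAXVAL, VAL_MIN_ERROR and VAL_MAX_ERROR come from the post; the numeric values, the ErrorDoc struct, the entry text, and the lookup_error_doc() and check_value() helpers are hypothetical.

```c
#include <stddef.h>

/* Hypothetical limits and code values; the post does not give actual numbers. */
#define MINVAL 0
#define MAXVAL 100

enum { NO_ERROR = 0, VAL_MIN_ERROR = 1001, VAL_MAX_ERROR = 1002 };

/* One documentation entry per unique error code (illustrative text only). */
typedef struct {
    int code;
    const char *condition;
    const char *remedy;
} ErrorDoc;

static const ErrorDoc error_docs[] = {
    { VAL_MIN_ERROR, "value below minimum", "raise the value to at least MINVAL" },
    { VAL_MAX_ERROR, "value above maximum", "lower the value to at most MAXVAL" },
};

/* Return the entry for a code, or NULL if the code is undocumented. */
const ErrorDoc *lookup_error_doc(int code)
{
    size_t n = sizeof error_docs / sizeof error_docs[0];
    for (size_t i = 0; i < n; i++) {
        if (error_docs[i].code == code) {
            return &error_docs[i];
        }
    }
    return NULL;
}

/* Classify a value: a distinct code for "too low" and for "too high". */
int check_value(int value)
{
    if (value < MINVAL) {
        return VAL_MIN_ERROR;
    } else if (value > MAXVAL) {
        return VAL_MAX_ERROR;
    }
    return NO_ERROR;
}
```

Because the code carries no message text, the table (or an external document keyed the same way) could be translated or expanded without touching the checks themselves.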

I really dislike "online" documentation. If you want to deliver it in HTML, then make available a download that I can unpack in a local (shared?) folder and a means of hooking it.

Yeah, I realize my version will only be current as of the day that I download it. <shrug> If I really think there is a new and improved (fixed!) version out there, I'll download THAT in place of the previous version.

FWIW, I think Adobe has a means of installing local copies of their documentation and "wiring" them into context-sensitive "F1". But, I've never bothered to sort out the details (easier to just open the PDF and search for whatever I need).

- whit3rd

Fri, Aug 26, 2022 2:07 AM

So do you have a database with other-than-title metadata support for those scan files? Like, 'user manual' tag, also you could organize by manufacturer (HP equipment and Compaq, Palm, Agilent and Keysight can be separate or together...) and by publication date and by model numbers or functional names... and create crosslinks as appropriate when an article gives application info. I'd certainly want some metadata that indicates where to find the scans now, and maybe even where the sources were. That many pages, it could be inconvenient to search otherwise.

- Don Y

Fri, Aug 26, 2022 3:53 AM

Absolutely not. I have exactly the accessibility that I had when they were dead trees lined up on shelves -- except now they take up less space and can be duplicated with very little effort.

Sources are gone -- that was the whole point of the exercise.

How do YOU search your dead tree collection?

They're 600/1200 dpi TIFFs in PDF containers. Someone with gobs of free time could run OCR on them, sort out the illustrations from the text, insert hyperlinks to all cross references, index entries, etc. But, that won't be *me*! :>

I'll just be happy knowing I don't have all that paper taking up space while still preserving the *content* without wasting lots of time on the effort.

- whit3rd

Fri, Aug 26, 2022 9:52 AM

Good question: a few books are in my reference shelves and get used often, but I can't always find what I'm looking for. Vinyl records, CDs, videotape, etc. likewise.

For modern e-books, I've installed Calibre (which allows good extensible metadata); my personal notes in e-text are under my name as author, mostly I keep dates of publication (it matters sometimes when things change) and I've added tags, and identified some series (MIT Radiation Lab, for instance) so they can be handled as an ensemble.

Another database (older) has just author-title-size info, for SF and such, of the dead-trees, and I know approximately which shelves have the paperbacks with author 'B...'. For this, I keep a twenty-year-old computer active. Text dump of a late version of the data lives in my cellphone.

Yet older, another organization (clippings of articles) is in three-ring binders with alphabetical-by-topic organization.

The spooky thing is that a terabyte-size transflash chip can fit the whole of my e-books into my cellphone; some text, some epub, some PDF... that probably wouldn't work with the organizing software and metadata. That's the problem with organizing using software: it doesn't migrate through the decades well.

- Jan Panteltje

Fri, Aug 26, 2022 11:05 AM

On a sunny day (Fri, 26 Aug 2022 02:52:26 -0700 (PDT)) it happened whit3rd snipped-for-privacy@gmail.com wrote in snipped-for-privacy@googlegroups.com:

I stopped using databeasts ? bases many years ago. Much is on optical media, and much on TB-size hard disks; all SD cards are backed up too. For the optical media I have a text file that gives the date, how it was burned (scripts), format, disc type and content for each of the more than 1000 discs I have, going back to when the first writable CDs came out. As everything else is stored under real names, all I need is Linux 'locate' to find things in seconds on the laptop, and now on Raspberries with 3 TB disks. Linux / Unix is cool.

~ # locate -i TDA7440D | grep -i pdf
/root/download/html/TDA7440D_6438.pdf

~ # locate -i radar | grep -i doppler
/usr/local/httpd/htdocs/pub/44kHz_Doppler_radar_mixer_test_board_IMG_4098.JPG
/usr/local/httpd/htdocs/pub/44kHz_Doppler_radar_schematic_IMG_4096.GIF
/usr/local/httpd/htdocs/pub/44kHz_Doppler_radar_Rx_transducer_and_coil_IMG_4097.JPG
/root/download/html/doppler_RADAR_sensor_0242.pdf
/root/download/html/doppler_RADAR_sensor_34685MPData.pdf
/root/download/html/doppler_RADAR_sensor_34685MPSche.pdf

when and size info:

~ # l /usr/local/httpd/htdocs/pub/44kHz_Doppler_radar_mixer_test_board_IMG_4098.JPG
-rw-r--r-- 1 7480975 100450 141594 Nov 27 2013 /usr/local/httpd/htdocs/pub/44kHz_Doppler_radar_mixer_test_board_IMG_4098.JPG

So that is on the local website, so also on my real website.

'l' is short for 'ls -rtl' in my zsh config
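The two-stage filter in those commands (locate -i, then grep -i) amounts to keeping the path names that contain both terms, ignoring case. A minimal C sketch of that idea follows; contains_nocase() and count_matches() are hypothetical helper names, and this is not how locate itself is implemented (it searches a prebuilt database of names).

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if `needle` occurs in `haystack`, ignoring ASCII case; else 0. */
int contains_nocase(const char *haystack, const char *needle)
{
    if (*needle == '\0') {
        return 1;   /* the empty string matches everywhere */
    }
    for (const char *h = haystack; *h != '\0'; h++) {
        const char *a = h;
        const char *b = needle;
        while (*a != '\0' && *b != '\0' &&
               tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
            a++;
            b++;
        }
        if (*b == '\0') {
            return 1;   /* consumed the whole needle: match found */
        }
    }
    return 0;
}

/* Count the paths that contain both terms, ignoring case. */
size_t count_matches(const char *const paths[], size_t n,
                     const char *term1, const char *term2)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (contains_nocase(paths[i], term1) &&
            contains_nocase(paths[i], term2)) {
            count++;
        }
    }
    return count;
}
```

This only works as a retrieval scheme because the files carry descriptive names, which is the point being made above.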

No need for databeasts

My Usenet newsreader NewsFleX has its own databeast search function that goes back to 1998. The email system goes back to 1998 too; searching with 'grep' for a word in all those emails takes seconds.

From days before that I have some boxes with floppies...., important parts are on harddisks.

Password always asked for, has always been 'supercalibris ? well, forgot the spelling. Entering a wrong password causes alligators to be released that eat little bits, so very secure system.

- Don Y

Fri, Aug 26, 2022 11:42 AM

I've about 50 ft of shelf space -- the "overflow" in cardboard boxes.

The items that I am likely to reference are on the shelves, sorted by subject matter. E.g., if I'm looking for Organick's book, it will be in with OSs; Knuth's with Algorithms; Wolfram with Math; Mick & Brick in Digital; Gries in Compiler Design; Cookbooks; Ice Cream; etc.

Manuals for bits of kit are in the lateral files in the garage. MULTICS manuals were in another file cabinet out there (they were among the first pages scanned as they were looseleaf, *many* and easy to be rid of!).

Paperbacks, save a few dozen, are now epubs. Technical papers are in a PDF collection and searchable, *if* true PDFs.

I've only ~200 vinyl albums left -- anything mainstream was replaced when CDs came out. Remainder are boots and will take a fair bit of effort to digitize. But, I've a shitload of other material that minimizes the need to access those.

I think I have two video tapes left -- but the material on them is also available on DVD.

CDs (originals) are hiding in boxes under my bed; the music long ago ripped to MP3s and a modified FLAC format (for my music server). Master copy is searchable (fully tagged). Other copies (phone, tablets, etc.) aren't, as their contents change periodically (so updating a database would be lots of "maintenance"). How do I find the kickass rendering of SongX? (*Which* kickass rendering??)

My solution is to acknowledge that:

- I don't have any "bad" music (why would I keep it?)

- looking for "The Best" means much will be ignored

- "The Best" reduces your range of choice so, just listen to whatever comes up next. And, if not in the mood for that, at that time, hit "NEXT" and hope that when it next percolates to the top of the playlist that you're in a mood to hear it!

If I need to search for something, it is because I want to bring it to someone else's attention. E.g., "Watusi Wedding" mentioned here, recently. I could rapidly find the title (and artist) because I knew it was a "singleton" (no other titles by that artist) *and* that I had a copy of it. Armed with the title/artist, I could find an on-line copy.

I only use Calibre to read epubs on a PC. I prefer Nooks as the medium to view/read them as I'd rather read AWAY from a computer, in the dentist's waiting room, while stopped at a traffic signal, etc.

But, the stock firmware is not intended to handle large collections (so I have to impose my own file hierarchy under "My Files" and search through that)

My epubs are sorted by (primary) author. This works -- if I can remember the author and/or title. I can implement FTS but that won't typically buy me anything -- what do I search for to find the story with the alien invader? And, I've titles that are reworks of early works -- should I manually tag them as such? Or, just "rediscover" that the next time I reread the title?

[When I moved here, I had 80 "photocopy paper cartons" (10 ream ea) of "paperback novels". I've pruned that down to ~1 -- titles of which I still want to have paper copies (sentimental reasons). Everything else is epub.]

I used to visit book sales (library discards, used book stores, etc.) but realized this was just making the problem LARGER! :-/ I read A LOT!

You need to tackle YOUR paper! :> If I could find a doc on-line already in electronic format, I DL-ed it and tossed the dead tree copy. If not, I scanned the paper (a lot of mine come through the public library's "interlibrary loan" system as paper copies) and discarded the original.

[Yesterday, I was searching for "The Influence of Glottal Waveform on the Naturalness of Speech from a Parallel Formant Synthesizer" -- so I could dispose of the paper copy. But, all seemed to hide behind paywalls. So, I'll scan it and put the "original" in the recycle bin]

I group these by subject matter -- creating a new subject whenever I have "enough" titles that seem related. E.g., I have a Letter-to-Sound folder with Elovitz, McIlroy and Hunnicutt's papers. Another that addresses digital signal processing quirks. etc.

Yup. The other problem is that it is a highly manual effort; your criteria may not be applicable to me; etc. so you have to tag each item yourself.

For a real challenge, try organizing digital photos! :>

OTOH, if you think of the original media, the organization problem was just as bad: "Where are the photos of that trip to Vancouver? And, where are the photos of the dogs 'discovering' snow??"

I rely on having the media accessible and a fast way of "rendering" it to a form that I can "evaluate". E.g., a 1000 page PDF materializes in a few seconds and I can look at thumbnails of a hundred pages at a time to see if something looks familiar. An MP3 starts playing in a fraction of a second. Everything is direct access (not sequential, regardless of the nature of the original medium) so I can look for a particular concert encore without listening to 3 hours of music, etc.

This is effectively the only way to review many types of content. E.g., I have thumbnails of all of my clipart "on-line" to browse (hundreds of CDs worth). Ditto typefaces ("fonts"). And, how do I know which string library I prefer *now* for my DAW without being able to listen to each? Special effects for video presentations? Electronic component symbols? Software libraries/components?

[I have ~6T on each workstation -- and each workstation addresses a single class of applications -- just so I can keep the "stuff" that a particular class of applications will need "on-line"]

All of my "originals" are in a "cold archive" that is indexed by a set of relational databases. One encodes the hierarchy of volumes/containers and maps each object to a unique ID. Others add appropriate metadata based on the type of file object referenced.

E.g., DISK107/Business/ClientX/ProjectY.vmdk/support/datasheets.iso/86C010.pdf tells me the 86C010 datasheet is located in an ISO container located in the "support" folder of the ProjectY VMDK image stored in the ClientX folder under the "Business" folder on volume DISK107. If I want to know how big the PDF is, when it was last accessed or when it was last checked for data errors I look in a "metadata" table. If I want to look for other copies of it, I extract the hash and size and search for other files having the same hash, size. If I want to know how many pages are in the document, then I use the unique ID and look in a different table for "number_of_pages", etc.

[Clearly, number_of_pages wouldn't apply to MP3s so none of the IDs associated with MP3s would be listed there]
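To make those lookups concrete, here is a rough sketch. The post describes relational databases; this swaps in plain in-memory C arrays purely for illustration. The struct names, the sample rows (including the DISK042 paths, sizes and page counts) and the placeholder hash strings are all invented; only the DISK107 path and the number_of_pages name come from the post.

```c
#include <stddef.h>
#include <string.h>

/* Object table: one row per stored object, keyed by a unique ID. */
typedef struct {
    int id;
    const char *path;   /* volume/container hierarchy */
    const char *hash;   /* content hash (placeholder strings here) */
    long size;          /* size in bytes (invented values) */
} Object;

/* Type-specific table: only document-like objects get a row. */
typedef struct {
    int id;
    int number_of_pages;
} PageInfo;

static const Object objects[] = {
    { 1, "DISK107/Business/ClientX/ProjectY.vmdk/support/datasheets.iso/86C010.pdf",
      "hashA", 52000 },
    { 2, "DISK042/Datasheets/86C010.pdf", "hashA", 52000 },
    { 3, "DISK042/Music/track01.mp3", "hashB", 52000 },
};

static const PageInfo page_info[] = {
    { 1, 12 },
    { 2, 12 },
};

/* Find an object row by ID, or NULL if the ID is unknown. */
const Object *find_object(int id)
{
    for (size_t i = 0; i < sizeof objects / sizeof objects[0]; i++) {
        if (objects[i].id == id) {
            return &objects[i];
        }
    }
    return NULL;
}

/* Count OTHER objects with the same hash and size; -1 if the ID is unknown. */
int count_duplicates(int id)
{
    const Object *ref = find_object(id);
    if (ref == NULL) {
        return -1;
    }
    int count = 0;
    for (size_t i = 0; i < sizeof objects / sizeof objects[0]; i++) {
        if (objects[i].id != id &&
            objects[i].size == ref->size &&
            strcmp(objects[i].hash, ref->hash) == 0) {
            count++;
        }
    }
    return count;
}

/* Page count for an ID, or -1 when no row exists (e.g. for an MP3). */
int number_of_pages(int id)
{
    for (size_t i = 0; i < sizeof page_info / sizeof page_info[0]; i++) {
        if (page_info[i].id == id) {
            return page_info[i].number_of_pages;
        }
    }
    return -1;
}
```

Keeping type-specific attributes in their own table means an MP3 simply has no row there, rather than carrying an empty number_of_pages column.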

But, NOT having to look through physical media -- and the space to store it! -- is the real win. (If I have to mount a cold volume to actually access something, so be it)

- whit3rd

Mon, Aug 29, 2022 5:36 AM

True; that particular data was from circa 20 years ago, when I didn't have a scanner; a few years later, I was giving away scanners to siblings for almost exactly that purpose, but never got around to my own pile... part of which I stumbled across recently.

For artwork-critical stuff, I'm not sure I like the scan options. For text, if OCR and epub were easy, I'd be OK on that. One antique machine tool catalog has beautiful etchings, and I've been meaning to work out how to effectively scan THAT. It's not the sort of thing one could get by interlibrary loan.

- Don Y

Mon, Aug 29, 2022 6:39 AM

I've had scanners for decades. Desktop, large format, EXTRA large format (40" wide), film, etc.

But, scanning is a dreadfully tedious process. I recall scanning a bunch of old 35mm slides and taking a whole weekend to do it! <frown>

Trying to scan a 1000pp book is a frigging *career*. Imagine hundreds of such titles...

It was only recently that I opted to buy one with a fast feeder. Now, 1000pp is just half an hour of your time!

So, all the loose-leaf manuals (MULTICS) were done in less than a few hours and ceremoniously dumped in the trash (recycle bin). I'll repeat the exercise with bank statements, etc. (after I finish getting rid of books).

I also own a "stack cutter" (a high-end "paper cutter" that can cut large stacks of paper *square*; a typical paper cutter cuts each stacked sheet in ever increasing sizes). So, I use it to cut the "perfect bindings" off leaving me with a stack of individual, double-sided sheets.

[The scanner does both sides simultaneously]

I will probably NOT sacrifice the hardcover titles but damn near everything else is going under the knife! :>

I have IT8 targets for the scanners (and other calibration masters for the larger ones). So, I can build a color profile for the scanner (let lamp warm up before scanning; repeat calibration each time you want to be "color correct").

[I also calibrate my monitors when I need to be able to view "camera ready" artwork -- and the printers (when I still had them)]

One axis of scanning is always dead-to-nuts (defined by the physical dimensions of the sensor array). The other can exhibit small variations from media slip -- but nothing I've ever noticed as significant.

Depending on the source (and scanner chosen), I can scan at 48b color and up to 9600 dpi optical (higher if interpolated).

I.e., I can scan an image and enlarge the hell out of it and you'd be hard pressed to see any digitization artifacts.

The problem with OCR is it's not flawless. You have to proof the results -- if you truly care about accuracy. I don't (for these efforts). I just want "digital microfiche" to conserve space (relying on my eyes and brain to sort out what the images contain)

The biggest problem will be getting the sheets (pages) flat to lay on the scanner glass -- without destroying the document in the process (I have no desire to preserve the original items that I've scanned)

I nominally scan manuals/papers at 600dpi, monochrome. This is enough to preserve most greys without incurring the added cost of scanning in grayscale.

If the document has lots of fine detail (like a D size schematic reproduced as a B-size foldout), I'll selectively scan those images at 1200 dpi. This has been sufficient for everything I've encountered thus far.

I scan covers in 24b color -- again at 600 dpi (overkill). For documents that use a highlight color inside, I don't bother trying to preserve that color but let it get rendered as grey.

Documents that *are* color -- or contain a selection of color plates in the inner signatures -- I scan in B&W... until I get to the color plates. I just finished scanning a book illustrating various origami folds that was printed in "multicolor". The document is larger than it could be but the book was reasonably small so I felt I could bear the cost of an "oversized" PDF.

I have some 35mm negatives that I'll scan, soon, so I can pull more image out of the shadows (poorly lit scenes). There, the scanning activity will be small compared to the "image adjustments". I'll scan those at 12800dpi in 48b color -- just to give me more signal to play with.

- Jan Panteltje

Mon, Aug 29, 2022 7:34 AM

On a sunny day (Sun, 28 Aug 2022 22:36:25 -0700 (PDT)) it happened whit3rd snipped-for-privacy@gmail.com wrote in snipped-for-privacy@googlegroups.com:

I have a very good scanner, with a Win 3.1 driver!!! (old PC), not used since the year 2000; a good camera and pictures work much better for me. Using a small Canon camera now, but my Xiaomi smartphone can do 48M pixels too.

- whit3rd

Wed, Aug 31, 2022 4:02 AM

That's a good start; Vernor Vinge has another scheme that speeds things up

formatting link

- Don Y

Wed, Aug 31, 2022 6:34 AM

Folks are already working on "augmented memory" devices. It's an amusing concept -- simple, in theory (amazingly complex, in practice).

[Imagine a device that whispers in your ear -- or displays data on a wearable set of "glasses" -- information pertinent to your *current* "interaction"]

Many folks MANUALLY deal with this (e.g., having details about each of your contacts *in* your address book and having it open to their data while you are speaking to them: "How's little Timmy? And has Mary Beth had her braces removed, yet?")

Personally, I try to commit such "trivia" to memory. When you bump into someone and can recall these details face-to-face, they seem to be more appreciative of your (apparent) interest in them, their life, problems, etc. SWMBO is always amazed at how much I can say to someone I may not have seen in months/years. "Simple: if you are interested in them as people, then you make a point of remembering things that are likely important/significant to *them*!"

[The flip side of that is: "If folks can't remember details about YOU or your life/problems/undertakings/etc. then they likely aren't very interested in *you*; keep that in mind when you update your opinion of them!"]

OTOH, I can't commit birthdays and anniversaries for "average joes" to memory and rely on written notes for those details.

- rbowman

Wed, Aug 31, 2022 2:18 PM

I remember my ex's birthday but have to check with her for when we were married and divorced. Turns out we would have been married 50 years in May...

I'm lucky if I can hit the right decade for most things.

- Don Y

Wed, Aug 31, 2022 2:32 PM

I don't acknowledge folks' birthdays. I don't like mine acknowledged and don't want to encourage "do-gooders" to do so! (my way of objecting to the folks who make a big deal out of their birthday is to take the exact opposite approach)

Instead, I will "coincidentally" bake cookies, brownies, cheesecake, bread, etc. some time *around* the victim's birthday so I can't be accused of acknowledging it. (given that I am equally likely to do this at other random times during the year, it never looks suspicious!)