eBook formats

Hi,

I have a boatload of books, papers, etc. in a variety (too many!) of different formats (TXT, PDF, PS, CBR/CBZ, MOBI, EPUB, DOC, RTF, etc.). I'd like to pick *one* format and convert them *all* and forget about having to make sure I have the *right* reader on the right *device*, etc.

Any comments (pointers to references) to help in deciding which would be the "least bad" choice? And, what I risk losing in the process?

Thx,

--don

Don Y

txt is portable, compact, and easily parsed - but you lose all the formatting, structure, images, etc., and it is not necessarily easy to convert other formats to txt.

pdf is the most portable (while still retaining structure and formatting) - there are readers for it on every device and every OS. You have the advantage and disadvantage that the page layout is fixed and the same on every screen or printed page. When done properly, you can copy-and-paste from it. While there are utilities for manipulating pdf files a bit, it is basically a non-editable format.

ps is a little like the pdf of two decades ago. You need a PC (with Ghostscript) to view it, but that is not a common program on non-*nix systems (Windows, pads, etc.). It also has less structure (indexes, table-of-contents, etc.) than pdf.

I don't know what CBR/CBZ and MOBI are - and therefore cannot recommend them. If they are sufficiently obscure that I don't recognize them, they are not well enough supported for your needs.

epub is a possibility if the majority of your reading will be on small-screen devices (small pads, telephones). epub readers reflow the text to suit the screen. But that also means that the appearance of the document changes depending on the device used to view it. epub readers are available for most systems, but are not nearly as common or mature as pdf readers.

doc and rtf are word processor formats, with all the disadvantages that brings - being editable means the data can be changed, but that's a disadvantage for this sort of use. You need a PC to display them, and the appearance changes depending on the version of office program used and the fonts on the system. You view them using a word processor, which is not an optimal program for reading - you don't get convenient links, contents, cross-references, etc., and your screen is filled with editing controls (or a hideous "ribbon" if you use MS Word).

In my view, this is an easy decision - pdf is the only practical choice. It also has the advantage that the majority of the stuff you have will probably already be in pdf format. doc and rtf (and even txt) can be easily converted using LibreOffice (which can be automated from the command line if you have lots of files). ps2pdf will handle any ps files you have.
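A minimal sketch of that kind of batch run, in Python. It assumes LibreOffice's `soffice` and Ghostscript's `ps2pdf` are on the PATH; `plan_conversion` and `convert_tree` are made-up names for illustration, and the extension list is just the formats mentioned above. Nothing is executed unless `dry_run` is turned off.

```python
import subprocess
from pathlib import Path

# Formats handed to LibreOffice in this sketch (an assumption, taken from the post).
OFFICE_EXTS = {".doc", ".rtf", ".txt"}


def plan_conversion(src, out_dir):
    """Return the argv list that would convert src to PDF, or None if unhandled."""
    src, out_dir = Path(src), Path(out_dir)
    ext = src.suffix.lower()
    if ext in OFFICE_EXTS:
        # LibreOffice headless conversion; the PDF is written into out_dir.
        return ["soffice", "--headless", "--convert-to", "pdf",
                "--outdir", str(out_dir), str(src)]
    if ext == ".ps":
        # ps2pdf takes the input file and the output file.
        return ["ps2pdf", str(src), str(out_dir / (src.stem + ".pdf"))]
    return None


def convert_tree(root, out_dir, dry_run=True):
    """Walk root, plan a command per convertible file, and optionally run it."""
    out_dir = Path(out_dir)
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        cmd = plan_conversion(path, out_dir)
        if cmd is None:
            continue  # already PDF, or a format this sketch does not handle
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_tree("library", "library_pdf")` just returns the list of commands for inspection. Everything lands in one output directory here, so two files with the same name would collide; a real run would probably want to mirror the source tree instead.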

David Brown

There are PDF readers (like qpdf on Android) that can reflow document text, which is immensely helpful on smartphone displays.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org
Nils M Holm

I use Calibre for my library of books, papers, etc. It takes care of opening the files with the right tool on the PC, and converts them to the appropriate format when I move a file to another device (ebook reader, etc.).

Bye Jack

--
Yoda of Borg am I! Assimilated shall you be! Futile resistance is, hmm?
Jack

I went through this some time ago. I picked MOBI - for a reason I don't remember, and I am not sure I could have explained it even back then; maybe I had read a few books in it. Someone mentioned Calibre; I have used it to convert.

On Android devices I use a reader written by a Russian guy, let me check.... CoolReader, the icon is a brownish/yellowish open book. Works really well; maybe it has a preference for mobi so I went there (but I don't remember if this is the case).

Dimiter

Dimiter_Popoff

Too often, formatting contains information, itself. E.g., most documents contain multiple flows; how do you distinguish between these in a pure text document format?

You seldom want to *modify* a document (that you haven't authored). OTOH, it is frequently desirable to be able to *annotate* a document. In this regard, Adobe smartly foresaw this need. I miss being able to annotate (Windows) "Help" files...

CBR is commonly used for comic books; it's the equivalent of storing TIFF's of scanned pages in a PDF (except the scans are JPEG's -- because comics can easily tolerate the losses that JPEG's introduce!)

CBZ is the same thing packed in a ZIP archive (CBR uses RAR).
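Since a CBZ is an ordinary ZIP archive of page images, the standard library is enough to look inside one. A rough sketch, with the caveats that `make_cbz` and `list_pages` are names invented here, the extension list is a guess at what is typical, and "sorted by file name" is only the usual convention for page order, not something the format guarantees:

```python
import io
import zipfile
from pathlib import PurePosixPath

# Image types assumed to count as pages in this sketch.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def make_cbz(pages):
    """Build an in-memory CBZ from a {member_name: bytes} mapping (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in pages.items():
            archive.writestr(name, data)
    buf.seek(0)
    return buf


def list_pages(cbz):
    """Return the image members of a CBZ (path or file-like), sorted by name."""
    with zipfile.ZipFile(cbz) as archive:
        names = [name for name in archive.namelist()
                 if PurePosixPath(name).suffix.lower() in IMAGE_EXTS]
    return sorted(names)
```

The same `list_pages` call works on a file name such as `"issue01.cbz"`. CBR would need a RAR library, which is not in the standard library.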

MOBI is a format designed for small screen devices (e.g., Palm PDA's).

I also forgot to mention DJVU -- which is similar to CBR/TIFF PDF's.

Yes. Put all of these in the same category as HTML documents.

This ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ is the biggest win. The biggest down-side is viewing documents on "smaller screens" (where "small" is defined as "less than the size of the original medium"). E.g., most "papers" would require at least a ~15" diagonal screen to avoid the "pan and scan" interface.
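The ~15" figure can be sanity-checked with a little arithmetic. The sketch below assumes a US Letter page (8.5" x 11"), a screen held in portrait, and the page shown at full size with no margins trimmed; `min_diagonal` is a name made up for this example.

```python
import math


def min_diagonal(page_w, page_h, aspect_long, aspect_short):
    """Smallest screen diagonal (in page units) that shows the whole page at 1:1.

    The screen is assumed to be held in portrait, so its long side runs
    along the long side of the page.
    """
    short_side, long_side = sorted((page_w, page_h))
    # Scale the aspect ratio up until both page dimensions fit.
    scale = max(short_side / aspect_short, long_side / aspect_long)
    return scale * math.hypot(aspect_long, aspect_short)


print(round(min_diagonal(8.5, 11, 4, 3), 1))    # 4:3 screen
print(round(min_diagonal(8.5, 11, 16, 10), 1))  # 16:10 screen
```

Under those assumptions a 4:3 screen needs a diagonal of about 14.2" and a 16:10 screen about 16.0", which brackets the ~15" estimate.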

Other advantages are support for multimedia, interactive documents and as a "packaging" mechanism (e.g., I "include" attachments in the PDF's that I create instead of having to add other "files" to a repository and somehow tag them as "belonging" to a particular "document").

Don Y

Hi Dimiter,

[We're now above the 100F mark... expect> >

I'd like to move my book/paper archives onto "dedicated devices". Presently, these things are stored on NAS devices just because of their sheer quantity.

So, to access one, I have to fire up a PC, fire up the appropriate NAS, then, invoke the right "reader" (software).

I'd like, instead, to just copy everything onto the internal drive of a "tablet PC". This makes access easier: fire up the tablet PC and browse for the file/document of interest.

The pen interface would also make it easy to make annotations to those documents (esp drawings).

And, by treating the documents as part of a *collection* (instead of as individual documents), I can better organize and cross reference them (instead of just lumping them into dubious "groups"/folders based on very limited descriptions: medical, mathematics, metal-working, etc.). I.e., adding/maintaining metadata would be facilitated because it's all part of the same "collection".
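One way to picture the difference between folders and collection-level metadata: a folder gives each document exactly one label, while a tag index lets a document sit under several and be found by any combination. A small sketch, where the catalog layout, the file names and the functions `build_tag_index` and `find` are all hypothetical:

```python
from collections import defaultdict


def build_tag_index(catalog):
    """Map each tag to the sorted list of documents that carry it.

    catalog is a {document_path: iterable_of_tags} mapping.
    """
    index = defaultdict(list)
    for path, tags in catalog.items():
        for tag in tags:
            index[tag].append(path)
    return {tag: sorted(paths) for tag, paths in index.items()}


def find(index, *tags):
    """Return the documents that carry every one of the given tags."""
    sets = [set(index.get(tag, ())) for tag in tags]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical catalog: one document can belong to several subjects at once.
catalog = {
    "lathe_geometry.pdf": ["metal-working", "mathematics"],
    "bone_screws.pdf": ["medical", "metal-working"],
    "fourier_notes.pdf": ["mathematics"],
}
index = build_tag_index(catalog)
print(find(index, "metal-working", "mathematics"))
```

With a folder scheme, `lathe_geometry.pdf` would have to live under either "metal-working" or "mathematics"; here it turns up under both.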

Don Y

Definitely .CHM LOL.

What device will you mostly be reading them on? And do you read a letter size page at a time or peek through a small window?

--
Best regards,  
Spehro Pefhany 
Spehro Pefhany

I want to move them all onto a couple of tablet PC's (~12" dia screen) which will be the "normal" (portable) means of accessing them. Anything that I need to reference for longer periods of time (e.g., while writing code, designing hardware, etc.) I will network mount and access from my regular workstations. (I don't use/carry "mobile devices" so I'm not bound by their tiny screen sizes)

Some of the novels *might* be nice to read on a paper-back sized device but most that I've seen would be too tiring on my eyes.

I'd prefer something even larger (e.g., TRULY page-sized) but that's not essential; viewing half a page, magnified, with the tablet in landscape orientation will probably be sufficient.

Don Y

epub is good if it's all text (if you would like to be able to read them on a phone or tablet). Epub is OK with footnotes, but if there are photos, drawings, diagrams, tables, or example code snippets, then epub isn't great, and pdf is probably best.

--
Grant Edwards               grant.b.edwards        Yow! I had a lease on an 
                                  at               OEDIPUS COMPLEX back in 
Grant Edwards

Hello Don,

Do you have any current or potential future requirements to examine these documents in a character cell/non-GUI environment ?

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world
Simon Clubley

Good plan, but it's easier to just have a decent format converter such as Calibre in case you run into problems. I carry a flash drive with a mix of documents. If I need to read something, I convert it to MOBI mostly because that's what my various Kindle readers seem to like and because I like to email documents directly to my Kindle readers which requires a MOBI file.

See the comments from the author of Calibre on the topic: Note that he considers PDF the worst choice. Reading between the lines, I think he prefers: LIT, MOBI, AZW, EPUB in that order.

I've found that converting web pages into various formats is possible. Calibre has templates for handling various news web pages:

--
Jeff Liebermann     jeffl@cruzio.com 
150 Felker St #D    http://www.LearnByDestroying.com 
Jeff Liebermann

PDF seems to work quite well on iPads, in the iBooks app. Not so great on phones. In particular the navigation isn't as annoying as it could be.

--
Best regards,  
Spehro Pefhany 
Spehro Pefhany

To do so would be very limiting, IMO. Many of the documents (esp those that are the most interesting) are assemblages of scanned TIFFs, or printed FICHE that was then scanned (often at a poor resolution) etc.

So, I've pretty much resigned myself to being able to access them as if they were dead trees -- just dead *silicon* trees! :-/

Don Y

I think all of them suck; there's something about the user experience of a "real" book that is hard to replicate electronically. So, any electronic version thereof has to add capabilities that simply aren't possible with print (e.g., multimedia, interaction, etc.) to compensate for the munged interface.

Don Y

I plan on doing the conversion *once* -- instead of each time I need to view a document (I'm only targeting *one* sort of device; ebook devices just don't cut it, for me).

Note that his bias against PDF seems to be rooted in the fact that PDF isn't intended to *be* converted -- except to another identically formatted *page*!

You can apply similar logic to any of the other formats that he describes -- none have "ideal" presentations so you're always letting the targeted device impose its constraints on the content.

Imagine viewing schematics ("oh, my! the screen is too small!"), sheet music, complex illustrations, etc.

I'm not interested in consuming and archiving "news" pages. The few HTML documents that I keep are just technical documents that their authors opted to create in that format.

As a result, they don't fit onto "physical" pages very well -- because they weren't conceived with page sizes or boundaries in mind. Converting them to other formats is too "involved" -- you have to re-layout the document in a way that (hopefully) is visually appealing, efficient and doesn't change the content appreciably ("Hmmm... this illustration won't fit, here -- I'll cram it on the next page. Oh, crap! That's a verso page so the text will now be separated from it!")

Don Y

?? just because of the "serial" nature of the flow? E.g., I know that when I create documents, I go to great lengths to keep "non-text" objects and the associated descriptive text "nearby" -- deliberately anchoring the former to the latter and shepherding its placement (instead of letting the tools automatically "fit them where they may").

As I target PDF for my container of choice, I am keenly aware of what the final presentation will be like -- whether viewed "online" or rendered to paper (e.g., I will avoid downsampling certain photos and illustrations if I think an online viewer might want to zoom for greater detail -- detail that would not be reproducible in print).

Several formats appear to just be tweaks to others with DRM additions (I don't purchase "best sellers" so there's no appeal to supporting any of those).

Don Y

MHTML does the packing thing fairly well for HTML based documents, and support is fairly common. Depending on the system, even something as simple as a ZIP file can be used as a package the browser can access.

Personally I don't like HTML as a documentation format, but in some cases it's the least worst option. Usually I prefer PDF.

Robert Wessel

With HTML, you're always worrying about browser compatibility, plugin versions, etc. There will always be things that you CAN do in a browser environment that you *can't* in a document format like a PDF. But, taken to extreme, why not build a VM for each "document" and just archive VM's?

With such, not only can your atlas contain 2D (and 3D!) maps, but it can also contain photos of wildlife in each locale, sound recordings of their mating calls, phonebooks of their inhabitants (with LIVE updates!), *up-to-date* exchange rates for their currencies, heck, even an automatic renaming and resurveying capability to reflect ongoing political changes, conflicts, etc.! :>

And, you'll spend all your time updating plugins/modules so you could access this information -- when all you really want to know is if it was landlocked or not in 1783!

Don Y
