OT: html Clean-up ??

I have some documentation written in html notation, and it's full of such stuff as...

&ndash; &plusmn; &nbsp; &minus;

etc.

It's a _huge_ document, so I'm wondering if there's some simple way to replace these html codes with the Word symbol equivalents?

Manual find-and-replace is the sort of thing that drives me nuts :-(

Any cute ways to handle this? ...Jim Thompson

--
| James E.Thompson                                | mens      |
| Analog Innovations                              | et        |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus     |
| San Tan Valley, AZ 85142 Skype: skypeanalog     |           |
| Voice:(480)460-2350 Fax: Available upon request | Brass Rat |
| E-mail Icon at                                  | 1962      |

I love to cook with wine. Sometimes I even put it in the food.


Record a macro script in Word or your favourite editor.

--
Regards, 
Martin Brown

I think you can open it up in Word and copy. Then Paste as text into a new document. Word html files are just as messy.

Cheers

Martin Riddle

Open it up in a web browser and print to a pdf file.

--
Mark
qrkpublic

The MadCap Flare program I am presently evaluating (for the upcoming User's Manual project) might be an answer. Unfortunately, their fully-functioning demo intentionally forces random character substitutions in the output. You'd need the legit version - it's $995. I suspect that blows the budget?

One thing I do like about MadCap: If you watch their online demo, they have a line about "we all know what happens to big Word documents..." and then they cut to a house on fire, or something along those lines.

Most "easy" ways I can think to do what you want will probably wreck Word's formatting, depending on the version and how it was put together. I'm envisioning a large complex document, with headers/footers, text boxes, equations, images, tables, captions, legends, page numbers, section headings, indexed table of contents, hyperlinks, etc...

For something like that, you're probably best to just hack at it natively, in Word. It will probably take less time in the long run, and then (as a bonus!) you'd still be joined at the hip with Word. I often think that was Microsoft's intention all along.

mpm

"tidy" aka "htmltidy" is the canonical open-source tool for cleaning up HTML, including Microsoft Office crap (it has a special option for that, "word-2000").

Clifford Heath

Can you do a search and replace? It should be fairly easy in Word or any other word processor.
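If the list of codes is long, the search and replace can be scripted instead of done one code at a time. The following is only a rough sketch in Python, not something anyone in this thread posted: `decode_entities` and the sample string are made-up names for illustration, and it assumes the document can be saved as plain text or HTML first. It relies on the standard library's `html.unescape` to look up each code.

```python
import html
import re

# Matches named (&plusmn;), decimal (&#177;) and hex (&#xB1;) character
# references, but only when they are terminated by a semicolon.
ENTITY = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text: str) -> str:
    """Replace each HTML character reference with its Unicode character.

    Hypothetical helper for illustration. Unknown names such as
    "&bogus;" are left untouched because html.unescape does not
    recognise them.
    """
    return ENTITY.sub(lambda m: html.unescape(m.group(0)), text)


if __name__ == "__main__":
    # Made-up sample line, not taken from the actual document.
    sample = "5&nbsp;V &plusmn;10% &ndash; gain &minus;3&nbsp;dB"
    print(decode_entities(sample))
```

Requiring the trailing semicolon is a deliberate choice here: plain `html.unescape` also decodes some names without one, which could mangle ordinary text that happens to contain an ampersand.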

Bill

Bill Gill

Open it in a browser and then copy-paste into a word processor.
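For a document too big to copy-paste comfortably, roughly the same effect (tags dropped, codes turned into real characters) can be approximated with a script. This is a hedged sketch using Python's standard `html.parser` module; `TextExtractor`, `html_to_text` and the choice of which tags start a new line are assumptions for illustration, and a real browser does considerably more layout work than this.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, decoding character references on the way."""

    SKIP = {"script", "style"}  # content that a browser never displays
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3"}  # assumed line breaks

    def __init__(self):
        # convert_charrefs=True makes the parser decode &plusmn; etc. itself.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(markup: str) -> str:
    """Return the text content of markup with tags removed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(html_to_text("<p>Gain &plusmn;1&nbsp;dB</p><p>Offset &minus;3</p>"))
```

Like the copy-paste route, this throws away all formatting, so it only helps if plain text is an acceptable starting point.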

--
umop apisdn
Jasen Betts

Try changing the file extension to .htm or .html, open it again in Word, and 'save as' a .docx or .doc file??

?-)

josephkk
