LXTerm to accept ANSI characters

Nope, that's CSS's job. HTML's job is to add semantic markup - OK they dropped the ball with , , , as well as some pre-css font and colour properties and ..., but mostly it's about semantic markup honest.

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

... and are still there in HTML 5

--
--   
Martin    | martin at 
Reply to
Martin Gregorie

... plus all the (X)HTML Symbols and characters - á & &pound ... which AFAIK can't be rendered with CSS.

-- Martin | martin at Gregorie | gregorie dot org

Reply to
Martin Gregorie

IMHO HTML 5 was introduced 1) because things evolve and it had been 20+ years, and 2) the industry wanted a standard way to do more multi-media things. Things which historically required plugins.

--
Grant. . . . 
unix || die
Reply to
Grant Taylor

That's the /current/ interpretation. 20 years ago, there was a different interpretation.

--
Grant. . . . 
unix || die
Reply to
Grant Taylor

"TimS" wrote

| > UTF-8 allows ANSI character sets to still be used. But it also
| > provides a way to fully support multi-byte characters only
| > where necessary. It's the one solution to support all languages
| > without changing the default of 1 character to 1 byte.
|
| It's only a default for ASCII, and the characters that ASCII supports. And
| when you say it allows ANSI character sets to be used, I take it you mean the
| characters that different ANSI pages supported, which under UTF-8 will most
| likely be 2-byte chars, rather than 1-byte but 8-bit values.
|

Most ANSI character sets are also 1 byte to 1 character. It's only the DBCS languages that can't fit that model. So first we had ASCII. Then we had ANSI with codepages, and most languages could be fully represented in HTML using META content type. **All of that is 1 byte to 1 character.** Only the DBCS languages were an exception. And they used a system similar to UTF-8.

So it didn't require any fundamental change in character encoding, editors, or file formats. So-called wide character encoding, with 2 or more bytes per character, existed, but was not really used. 1 byte/ 1 character was nearly universal.

So the only reason UTF-8 was needed was to fully accommodate DBCS languages, pile-of-shit emojis, etc. Most English pages are essentially ASCII, which is UTF-8 conforming. And charset can be specified for ANSI interpretation.
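A minimal Python sketch of the byte counts under discussion. The sample characters and the code pages paired with them (cp1252, cp1251, shift_jis) are illustrative assumptions, not anything named in the thread, and `byte_lengths` is a hypothetical helper.

```python
# Compare how many bytes one character needs in a legacy code page
# versus UTF-8. The character/code page pairs are illustrative only.
samples = {
    "A": "cp1252",      # plain ASCII letter
    "é": "cp1252",      # accented Western European letter
    "Ж": "cp1251",      # Cyrillic letter in the Windows Cyrillic code page
    "漢": "shift_jis",  # Kanji in a double-byte (DBCS) encoding
}


def byte_lengths(ch, codepage):
    """Return (bytes in the legacy code page, bytes in UTF-8) for one character."""
    return len(ch.encode(codepage)), len(ch.encode("utf-8"))


for ch, codepage in samples.items():
    legacy, utf8 = byte_lengths(ch, codepage)
    print(f"{ch!r}: {codepage} = {legacy} byte(s), utf-8 = {utf8} byte(s)")
```

Under these assumptions the ASCII letter stays at one byte in both encodings, the accented and Cyrillic letters go from one byte to two, and the Kanji goes from two bytes to three.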

So all I was saying was that UTF-8 was far easier than any other approach, using "wide characters", when it came time to fully support all languages under one system. Even now I'm not sure how much it's really used. Browsers properly display curly quotes, but I actually only have one Unicode font on my system, which is Arial Unicode MS, weighing in at 24 MB. Nothing else will render most UTF-8 characters. For example, the RichEdit window in Windows has supported UTF-8 for some time. And I can use the ability in my own software. But it will only render if I use that Arial Unicode font. With any other font it renders as ANSI using the English codepage. Just as a browser will do if charset isn't specced to be UTF-8. (Though UTF-8 may be default these days. I don't know. I still use:

Not that it really matters. It's pretty much all ASCII.
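A small Python sketch of what "rendering as ANSI using the English codepage" does to UTF-8 bytes. The sample string and the helper name `misread_as_cp1252` are assumptions made for illustration; cp1252 stands in for "the English codepage".

```python
def misread_as_cp1252(text):
    """Encode text as UTF-8, then decode the bytes as if they were cp1252."""
    return text.encode("utf-8").decode("cp1252")


original = "café €5"
garbled = misread_as_cp1252(original)
print(garbled)

# Pure ASCII survives unchanged, which is why mostly-ASCII pages look fine
# whichever charset is assumed.
print(misread_as_cp1252("plain ASCII"))

# For this sample the damage is reversible by undoing the wrong decode.
print(garbled.encode("cp1252").decode("utf-8") == original)
```

Each non-ASCII character turns into two or three code page characters, while the ASCII parts are untouched.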

Reply to
Mayayana

Anyone who has to support multiple languages tends to use unicode internally for the sake of sanity (I was for a while internationalisation specialist (among other hats) on the Yahoo! front page team). We had loads of fun with external feeds claiming to be ISO8859-1 and sending Win-1252 - they're almost but not quite the same.
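To make "almost but not quite the same" concrete, here is a hedged Python sketch that lists the byte values where the two encodings disagree. It relies on Python's own `latin-1` and `cp1252` codecs; `differing_bytes` is a hypothetical helper, not anything from the feeds described above.

```python
def differing_bytes():
    """Return the byte values that decode differently as ISO 8859-1 and cp1252."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # byte is unassigned in Python's cp1252 codec
        if latin1 != cp1252:
            diffs.append(value)
    return diffs


diffs = differing_bytes()
print(f"{len(diffs)} bytes differ: 0x{diffs[0]:02X} to 0x{diffs[-1]:02X}")

# A feed labelled ISO 8859-1 that really carries cp1252 curly quotes:
feed = b"\x93quoted\x94"
print(repr(feed.decode("latin-1")))  # C1 control characters
print(repr(feed.decode("cp1252")))   # curly quotes
```

The disagreement is confined to 0x80 through 0x9F, which is where Windows-1252 puts curly quotes, dashes and the euro sign.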

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

Of course. A number of things are "deprecated" but nothing of any note has been removed, or will be. No browser maker is going to remove and friends. Why would they? They had their fingers burnt before with XHTML. Just use HTML5 and forget everything else.

--
Tim
Reply to
TimS

You're thinking of Western languages with the extra chars used in French, German, Scandinavian languages etc. You seem to be overlooking languages that use a different alphabet altogether. Try Russian, Arabic, and Asian languages, all of which are comfortably catered for in UTF-8, as are the extra Western chars.

Most web pages are UTF-8.

Well I know nothing of Windows. And the question of which font doesn't enter into character representation.

--
Tim
Reply to
TimS

I convert everything to UTF-8. Windows tends to lie about which code-page it's using, anyway.

--
Tim
Reply to
TimS

Originally its job was to link research papers together.

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

Yes and no. HTML is markup and characters are content. So by definition characters are not HTML. But still HTML defined how to encode those not part of 7-bit ASCII.

--




/ \  Mail | -- No unannounced, large, binary attachments, please! --
Reply to
Axel Berger

Characters can't be part of CSS. CSS is look and rendering, showing the same thing in different ways. An a and an ä are two different letters; they are different content. Content as such is part of neither HTML nor CSS, but the rules of how to encode characters in 7-bit source code have to be defined somewhere and so are part of the HTML package.

--




/ \  Mail | -- No unannounced, large, binary attachments, please! --
Reply to
Axel Berger

I was a fan of ISO 8859-1 for a long time; it was the Amiga's native encoding right from its introduction in 1985. I've now switched to UTF-8.

Windows still has remnants of UTF-16 in various places.

--
/~\  Charlie Gibbs                  |  "Some of you may die, 
\ /        |  but it's a sacrifice 
Reply to
Charlie Gibbs

I know that. It was a pre-UTF8 workaround back in the day.

But my point still stands.

Once you have specified UTF8 in the HTML headers you don't need &euro;; you can use "€" directly. And its use or not is nothing to do with HTML. The server passes a text stream to the browser, the browser notes that it's UTF8 and, if it has a suitable font available, displays the characters correctly.
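A short Python sketch of the equivalence between a named character reference and the literal character, using the standard library's `html` module. The helper name `same_character` and the sample strings are assumptions for illustration.

```python
import html


def same_character(entity, literal):
    """True if an HTML character reference decodes to the given literal text."""
    return html.unescape(entity) == literal


print(same_character("&euro;", "€"))
print(same_character("&pound;", "£"))

# With a UTF-8 charset declared, the literal euro sign simply travels
# as these three bytes in the text stream.
print("€".encode("utf-8"))

# Characters that are significant to the markup itself still need escaping.
print(html.escape("AT&T <b>"))
```

Whether the decoded character then shows up on screen is a separate question of font coverage, not of the encoding.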

Precisely. They are now essentially obsolescent. The only one you still *need*, because the others still work, is &amp; :-) And possibly

--
“The ultimate result of shielding men from the effects of folly is to
fill the world with fools.”
Reply to
The Natural Philosopher

CSS is part of HTML

DIV is not CSS. What is used to modify DIV is.

--
The theory of Communism may be summed up in one sentence: Abolish all  
private property. 
Reply to
The Natural Philosopher

By saying 'content-type: UTF8' or whatever the exact magic spell is
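For reference, the usual forms of that spell are the HTTP header `Content-Type: text/html; charset=utf-8` and, in HTML5, `<meta charset="utf-8">`. A hedged Python sketch of how a client might read the charset out of the header value and decode the body follows; the helper names and the UTF-8 fallback are assumptions, not a description of any particular browser.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Extract the charset parameter from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def decode_body(body, header_value, default="utf-8"):
    """Decode raw bytes using the declared charset (the fallback is an assumption)."""
    return body.decode(charset_from_content_type(header_value) or default)


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
print(decode_body(b"\xe2\x82\xac", "text/html; charset=utf-8"))
```

The same three bytes decoded with a different declared charset would come out as three unrelated characters, which is why the declaration matters.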

--
The theory of Communism may be summed up in one sentence: Abolish all  
private property. 
Reply to
The Natural Philosopher

--
“There are two ways to be fooled. One is to believe what isn’t true; the
other is to refuse to believe what is true.”
Reply to
The Natural Philosopher

So does Mac OS X. UTF8 makes it all come right.

--
“There are two ways to be fooled. One is to believe what isn’t true; the
other is to refuse to believe what is true.”
Reply to
The Natural Philosopher

It does if the font in use has no representation of the glyph you are trying to display.

You won't get far trying to display Gujarati in Arial Narrow...

--
“when things get difficult you just have to lie”
Reply to
The Natural Philosopher
