LXTerm to accept ANSI characters

Hello folks,

I've recently set up a Pi 2B and intend to play around with some stuff on it.

I was trying to run Mystic, but it seems that LXTerm is not very friendly to ANSI character codes. Is there a way to tweak it?

Reply to
Flavio Bessa

On Fri, 12 Feb 2021 17:44:28 +1300, Flavio Bessa wrote:

uxterm?

Reply to
Nikolaj Lazic

"Flavio Bessa" wrote

| I was trying to run Mystic, but it seems that LXTerm is not very
| friendly to ANSI character codes. Is there a way to tweak it?

Is there a reason to think the Pi has codepages? You'd need that, and you'd need to set the local codepage, in order to use ANSI. I thought that was only on Windows.

Reply to
Mayayana

On Fri, 12 Feb 2021 19:21:16 -0500, "Mayayana" declaimed the following:

When I see someone mention "ANSI codes", I presume they mean something from the old MS-DOS ANSI.SYS days.

M$ Windows did not support ANSI codes until sometime in Win10 (primarily to allow the "Windows Subsystem for Linux" to handle common terminal controls).

--
	Wulfraed                 Dennis Lee Bieber         AF6VN 
	wlfraed@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
Reply to
Dennis Lee Bieber

"Dennis Lee Bieber" wrote

| When I see someone mention "ANSI codes", I presume they mean something
| from the old MS-DOS ANSI.SYS days.
|
| M$ Windows did not support ANSI codes until sometime in Win10
| (primarily to allow the "Windows Subsystem for Linux" to handle common
| terminal controls).

I may have misunderstood, but he did say character codes. Windows has always been ANSI-based, with codepages to define bytes 128-255. It uses UTF-16 internally in theory, but that's mostly under the surface. And UTF-8 support is relatively recent.
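
To illustrate, here's a rough Python 3 sketch (cp1252 and cp1251 are just example codepages) of how the same byte above 127 means different things under different codepages:

    # A codepage maps each single byte in the 128-255 range to a character,
    # so the byte 0xE9 decodes differently depending on the codepage assumed.
    b = bytes([0xE9])
    print(b.decode("cp1252"))  # 'é' under Windows-1252, the Western "ANSI" codepage
    print(b.decode("cp1251"))  # 'й' under the Cyrillic codepage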

The man page has this: "lxterm - locale-sensitive wrapper for xterm"

But I have no idea what kind of support there is for locale. It would have to be something like codepages, apparently in /usr/lib/locale. So I was guessing that Flavio Bessa wants support for a language that doesn't have a codepage installed.

Reply to
Mayayana

Um....

I used ANSI color codes a LONG time ago in MS-DOS and Windows 3.x, via ANSI.SYS being loaded in CONFIG.SYS. I naively assume that Windows 9x and subsequent had comparable functionality. Perhaps it wasn't enabled by default. But I would be shocked if it wasn't there.

--
Grant. . . . 
unix || die
Reply to
Grant Taylor

Hi,

I know for a fact that XTerm, which it seems is the root of LXTerm, has supported ANSI control codes for at least 20 years as I've been using them in it for at least that long.

Please provide more details about the problems that you're seeing.

Also, can you reproduce the problems in standard XTerm?

--
Grant. . . . 
unix || die
Reply to
Grant Taylor

The ancestor of ANSI codes is surely the DEC VT100 from way back before PCs even existed, so pre-Windows certainly and probably pre-DOS.

I agree though, just about every modern terminal emulator for Linux supports ANSI codes.

Support for 'extended characters', those in the 128 to 255 range (i.e. not ASCII), isn't really to do with ANSI codes. The ANSI codes are mostly ways to change character colours, bold, underline, etc.
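
To make that concrete, here's a minimal Python 3 sketch that just prints a few of those codes; in any ANSI-capable terminal (xterm, lxterm, and so on) the words should come out coloured, bold and underlined:

    # ANSI escape sequences are ordinary characters in the output stream:
    # ESC [ ... m selects colour, bold, underline etc.; ESC [ 0 m resets.
    ESC = "\x1b"
    print(f"{ESC}[31mred{ESC}[0m {ESC}[1mbold{ESC}[0m {ESC}[4munderline{ESC}[0m")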

The 128 (well, strictly 160, since 128 to 159 are reserved as control codes) to 255 'characters' are defined by what codepage/character encoding you are using. Codepages are beginning to disappear now, but you can still use them by setting your locale to something such as en_GB.ISO-8859-1 (there's a whole series of ISO-8859 encodings, from -1 to at least -15). ISO-8859-1 is the 'Latin' set with standard West European language characters like accented e, a, c with a cedilla, etc. Other codepages have graphical characters etc. Read the manual pages about locale to configure these.

The 'modern' way to handle extended/extra characters is UTF-8. I have all my systems set to the en_GB.UTF-8 locale now and everything 'just works', to the extent of displaying Arabic, Chinese and all sorts of other character sets in my terminal windows.
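
A quick Python 3 illustration of the difference (nothing beyond the standard library is assumed): the same character is a single byte under an ISO-8859 codepage but two bytes under UTF-8, and the locale tells programs which encoding to expect.

    s = "ç"                        # c with a cedilla, as mentioned above
    print(s.encode("iso-8859-1"))  # b'\xe7'     -- one byte under Latin-1
    print(s.encode("utf-8"))       # b'\xc3\xa7' -- two bytes under UTF-8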

--
Chris Green
Reply to
Chris Green

It is, perhaps, worth adding that the reason that this is the modern way is that it is the *only* encoding capable of representing everything unambiguously.

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

Yes, it is ambiguous when the codepage is not explicitly declared, but the bigger advantage is using more than one codepage in a single text, like quoting Hebrew and Greek in German. Personally I stick to TeX syntax even for those. One character, one byte has its advantages if you like the command line and editor macros.

The downside is the malicious possibilities it opens. In my eyes it was a big mistake to open domain names to more than ASCII. There are many (near) lookalikes, and that fools even the careful user who makes a point of checking the true destination before clicking.
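
A small Python 3 sketch of the problem (the label 'paypal' is just a made-up example): two strings that look the same but are made of different code points.

    import unicodedata

    latin = "paypal"
    spoof = "p\u0430ypal"  # the second letter is U+0430, a Cyrillic lookalike
    print(latin == spoof)  # False -- they only *look* identical
    for ch in spoof[:2]:
        print(ch, unicodedata.name(ch))
    # p LATIN SMALL LETTER P
    # а CYRILLIC SMALL LETTER A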

--
/ \  Mail | -- No unannounced, large, binary attachments, please! --
Reply to
Axel Berger

With UTF there are no codepages. OK, the million-point address space is broken up into blocks for different purposes, but there's none of this nonsense of one value having multiple interpretations, as in ISO-8859 et al.

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

I still use Unicode in LaTeX.

Reply to
Nikolaj Lazic

"Ahem A Rivet's Shot" wrote

| > The 'modern' way to handle extended/extra characters is UTF-8.
|
| It is, perhaps, worth adding that the reason that this is the
| modern way is that it is the *only* encoding capable of representing
| everything unambiguously.

More to the point, it's backward compatible with HTML, where the vast majority of webpages are still effectively ASCII, aside from the odd curly quote or space character inserted by editor software. Anything else would have required multi-byte characters for the ASCII range and thus would have broken editors and webpages.

This way we can espouse the value of multiculturalism without changing very much. :)

Reply to
Mayayana

Does anyone else suspect that this post is utter bunk? UTF-8 uses multibyte character sequences, and it's not necessarily compatible with HTML, which uses straight, not curly, brackets.

UTF-8 is a layer above HTML. It's down to the browser and its access to fonts to render it correctly, and to the server to specify that it's in use.

--
"In our post-modern world, climate science is not powerful because it is  
true: it is true because it is powerful." 
Reply to
The Natural Philosopher

More accurately, UTF-8 (not UTF-16 or UTF-32) is backward compatible with anything based on ASCII, provided you stay in the ASCII range; this was an important design feature of UTF-8.
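
That's easy enough to check, for instance in Python 3:

    # Within the ASCII range the UTF-8 bytes *are* the ASCII bytes,
    # so any pure-ASCII file is already valid UTF-8 as it stands.
    text = "plain old ASCII"
    print(text.encode("ascii") == text.encode("utf-8"))  # True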

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

"The Natural Philosopher" wrote

| Does anyone else suspect that this post is utter bunk? UTF-8 uses
| multibyte character sequences, and it's not necessarily compatible
| with HTML, which uses straight, not curly, brackets.

I didn't say curly brackets. I said curly quotes. If you look at a typical English-language webpage online you'll find it's usually pure ASCII. When it's not, the only non-ASCII is typically a few things like curly quotes or space characters rendered in UTF-8. I'm guessing some editors do that, since it's not an easy job to write an article with curly quotes when straight quotes work just as well. What I'm saying is that most of the Internet still doesn't need more than ASCII, and ANSI in Europe.

HTML is just text. It started out as ASCII and ended up with META tags to specify the charset, so browsers could accommodate non-English languages. But most of it was English, and much of it still is. So it was 1 byte per character. That worked fine for most situations: everything but DBCS languages.

As the Internet expanded and computing became mainstream around the world, we needed to adapt. ANSI was working for most languages, but not for Chinese, Japanese, etc. So, what to do? It could have gone to 2-byte Unicode (UCS-2), but that still wouldn't cover all characters, and it would require a radical shift to 2-byte characters, breaking the Internet and breaking computing. (Text files on Windows still default to ANSI.) It could have gone to 4-byte characters. That would work, but it would still break everything. Editors and browsers would all need to be rewritten.

UTF-8 provided a smooth, easy solution. It accommodates the millions of pages and files that are still essentially ASCII. Unlike with UTF-16 or UTF-32, we don't have to add null bytes to every character in order to encode it. UTF-8 allows ANSI character sets to still be used. But it also provides a way to fully support multi-byte characters where necessary. It's the one solution that supports all languages without changing the default of 1 character to 1 byte.
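
For instance, a quick Python 3 check of how UTF-8 spends extra bytes only on the characters that need them:

    # One byte for ASCII, two to four bytes only outside it.
    for ch in ["a", "é", "中", "𝄞"]:
        print(ch, len(ch.encode("utf-8")), "byte(s)")
    # prints: a 1, é 2, 中 3, 𝄞 4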

| UTF-8 is a layer above HTML. It's down to the browser and its access
| to fonts to render it correctly, and to the server to specify that
| it's in use.

UTF-8 is not a layer. It's a character encoding. The HTML is plain text. The META content-type tag specifies how that text is encoded. However it's done, it's still plain text. Fonts are a whole other kettle of fish.

Reply to
Mayayana

It's only a default for ASCII, and the characters that ASCII supports. And when you say it allows ANSI character sets to be used, I take it you mean the characters that the different ANSI codepages supported, which under UTF-8 will most likely be 2-byte characters, rather than the single 8-bit bytes they were under a codepage.

--
Tim
Reply to
TimS

The HTML /markup/ is basic ASCII.

The HTML /page/, in its entirety, may contain UTF-* directly, or the ASCII HTML entity codes therefor.

Eh ... If you're talking about the HTML /file/ and not the HTML /markup/, then it's entirely possible to have raw UTF-* content in the text copy outside of the markup.

--
Grant. . . . 
unix || die
Reply to
Grant Taylor

That is what I meant.

HTML is not there to specify odd characters; it can, but its job is to format text. What that text *is* is not relevant to HTML. It neither knows nor cares.

--
Truth welcomes investigation because truth knows investigation will lead  
to converts. It is deception that uses all the other techniques.
Reply to
The Natural Philosopher

All of the HTML codes for special characters tend to disagree with you.

  &copy; &euro; &trade; ...

Do a web search for "html special characters" and you will find long lists.

I don't know in which version of HTML these were introduced. But I do know that many of the basic ones have been there for at least 20 years (HTML 4?).
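
For what it's worth, Python 3's standard library will expand the named entities:

    import html

    # Named entities are an ASCII-only spelling of non-ASCII characters.
    print(html.unescape("&copy; &euro; &trade;"))  # © € ™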

--
Grant. . . . 
unix || die
Reply to
Grant Taylor
