Decimal Point vs. Decimal Comma

I know that roughly half the world uses a period to separate the fractional part of a number from the integer part. Roughly half the world uses a comma for the same purpose. But what about in computer languages? A little research showed that Algol was specified to work with either. Other computer languages seem to work primarily or exclusively with a period (point).

Are there any languages that support both formats without special programming?

--

Rick
rickman

Not really. Algol 60 (not sure about 68) did not specify the concrete representation of the lexemes. So one compiler could accept 123.45 and a different compiler could accept 123,45 as concrete syntax of the same number. But the compilers would not necessarily accept the other syntax. A typical standardization cockup.

Some Forth implementations do. E.g., SwiftForth and bigForth:

ANS bigFORTH 386-Linux rev. 2.3.1

123.45 d. 12345  ok
123,45 d. 12345  ok

A very early version of Gforth also worked that way, but I removed this feature, because it's more important to be able to get a useful error message if you use "2,", but the Forth system does not define it. I have not missed this feature, and I come from the part of the world that uses decimal comma.

- anton

--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html 
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html 
     New standard: http://www.forth200x.org/forth200x.html 
   EuroForth 2015: http://www.rigwit.co.uk/EuroForth2015/
Anton Ertl

In COBOL: DECIMAL-POINT IS COMMA

upsidedown

But neither of these are "correct". I was talking about the radix point that separates the fraction from the integer portion of a number.

--

Rick
rickman

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    DECIMAL-POINT IS COMMA.

COBOL :)

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org
Nils M Holm

If I understand correctly that you're asking about numeric literals in program source, then I think the answer is COBOL and very possibly nothing else.

It's trivial to make a lexer accept either number format, but many languages use commas as separators for, e.g., function arguments, array and list elements, etc. ... so also using commas as decimal marks would cause problems.

Consider: x = f( 1,9 );

Is that one argument or two? Depends on number lexing. Given a particular lexing, one of the two possible parses is an error - but which depends on the declaration of f(). This could be extremely confusing to a programmer. The only way around it is to also change the argument separator, or to enforce that truly separate values be separated by whitespace in addition to the separator.
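The ambiguity can be sketched in a few lines of Python. This is only an illustration: the two number patterns, `tokenize`, and `count_arguments` are made-up names for a toy lexer, not taken from any real compiler, and the argument counter only handles a single non-nested call.

```python
import re

# Two hypothetical lexer configurations; neither comes from a real compiler.
NUMBER_DOT = r"\d+(?:\.\d+)?"        # decimal point only: 1.9
NUMBER_EITHER = r"\d+(?:[.,]\d+)?"   # decimal point or decimal comma: 1.9 or 1,9


def tokenize(source: str, number_pattern: str) -> list[str]:
    """Split source into numbers, identifiers, and single punctuation characters."""
    token_re = re.compile(rf"\s*({number_pattern}|[A-Za-z_]\w*|\S)")
    return token_re.findall(source)


def count_arguments(tokens: list[str]) -> int:
    """Count the arguments of one non-nested call such as f(a, b)."""
    inside = tokens[tokens.index("(") + 1 : tokens.index(")")]
    if not inside:
        return 0
    return 1 + inside.count(",")


call = "x = f( 1,9 );"
dot_tokens = tokenize(call, NUMBER_DOT)
either_tokens = tokenize(call, NUMBER_EITHER)

print(count_arguments(dot_tokens))      # 2: the comma is a separator
print(count_arguments(either_tokens))   # 1: "1,9" was lexed as one number

# With whitespace after the comma, even the permissive lexer sees two arguments.
print(count_arguments(tokenize("x = f( 1, 9 );", NUMBER_EITHER)))  # 2
```

The same source text yields one or two arguments depending only on the number pattern, which is why the whitespace rule (or a different argument separator) becomes necessary.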

Pretty much only the whitespace-separated sexpr syntax of Lisp (and Scheme) is immune to the parsing issue ... languages like ML and Haskell, etc. don't necessarily need commas for function calls, but they still do use them for other things.

I'm not aware of any language that specifically addresses this issue regarding its source code - all that I am familiar with simply use the dot syntax for decimals. Number formats often are addressed as locale issues for I/O libraries, but not for program source.

Many compilers now allow Unicode in their source, but programmers still are actively discouraged from using non-ASCII characters. And non-English speaking programmers are almost forced to program in English for portability. English is the lingua franca of programming. There have been compilers designed specifically for certain locales, but sans government mandate of their use, none has ever been very successful.

Incidentally, I'm not sure what you saw re: Algol, but neither Algol 60 nor 68 permitted commas as decimal marks - both used the dot syntax. I don't have a reference for Algol 58, but all the Algols used commas as argument and array separators, so they all would have been susceptible to the parsing issue described above.

YMMV, George

George Neuner

Maybe by modifying the read-table of Common Lisp. But this might already be "special programming", and I don't know for certain whether this is possible (though I very much suppose so).

Norbert_Paul

On 02.04.2016 at 23:49, rickman wrote:

In computer languages we have barely enough punctuation characters available as it is to express all the necessary things without being overly verbose. Wasting one on a luxury item like that would IMHO be unjustifiable.

Programs need to be able to handle localized formats on input and output, but not in the source itself.
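As a minimal sketch of that division of labour, here is a Python example in which source literals keep the dot while input and output use whatever marks the caller asks for. The helpers `parse_localized` and `format_localized` are hypothetical names written for this illustration; they take the marks as explicit parameters rather than consulting a system locale.

```python
from decimal import Decimal


def parse_localized(text: str, decimal_mark: str = ".", group_mark: str = ",") -> Decimal:
    """Parse a number written with the given decimal and grouping marks."""
    if decimal_mark == group_mark:
        raise ValueError("decimal and grouping marks must differ")
    # Drop grouping marks first, then normalise the decimal mark to a dot.
    cleaned = text.strip().replace(group_mark, "").replace(decimal_mark, ".")
    return Decimal(cleaned)


def format_localized(value: Decimal, decimal_mark: str = ".") -> str:
    """Format a Decimal in fixed-point notation with the requested decimal mark."""
    return format(value, "f").replace(".", decimal_mark)


# The literal in the source uses a dot; only the I/O strings are localized.
price = Decimal("1234.56")
print(parse_localized("1.234,56", decimal_mark=",", group_mark=".") == price)  # True
print(format_localized(price, decimal_mark=","))                               # 1234,56
```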

Hans-Bernhard Bröker

Some computer languages like COBOL and FORTRAN could be written with 6-bit codes (like FIELDATA). Unfortunately languages like C and Pascal used characters from ISO 646 character sets, including characters reserved for national variants.

What is the problem with using Latin-1 (ISO 8859-1) in a programming language?

In fact it would be a great thing if you could use UTF-8 in programming, e.g. the Greek letters alpha and beta as program variables. No need to do any transliteration of textbook variables.

upsidedown

Allowed in Ada.

You can do that in Ada. Whether it makes sense is another question.

I usually keep to alphanumerics, but sometimes I use accented letters when I write Ada in a Finnish context, with Finnish names for variables.

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
       .      @       .
Niklas Holsti

You can. In 8th you certainly can, and I don't see why a standard Forth would disallow a UTF-8 name.

Ron Aaron

Cobol has the same issue, so if you use DECIMAL-POINT IS COMMA, you need to be more careful with whitespace in some places. But subscripting is not one of them, since only integers are allowed there (I think this might have changed in modern Cobol so that you can have full expressions in subscripts, but I'm not sure).

Cobol is full of weird parsing rules.

Robert Wessel

On Sun, 03 Apr 2016 06:23:56 +0000, Anton Ertl wrote:

VFX has dp-char and fp-char to store the currently used decimal separator for doubles and floats (actually, a zero-terminated string of a few characters), by default, dp-char is both '.' and ',', so you can enter both forms:

VFX Forth for Linux IA32

123,456 d. 123456  ok
123.456 d. 123456  ok

fp-char is by default just '.'.

Gforth's current development versions also have dp-char and fp-char, but with only one possible character each; the default is ".". This lets you change the character in case you want to read localized numbers.
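To make the idea concrete for readers who don't use Forth, here is a rough Python model of a double-number recognizer with a configurable set of separator characters. It is only a sketch of the behaviour shown in the sessions in this thread (the separator marks the token as a double and is dropped from the value); `recognize_double` and its `dp_chars` parameter are hypothetical names, and signs and number bases are ignored.

```python
from typing import Optional


def recognize_double(token: str, dp_chars: str = ".,") -> Optional[int]:
    """Return the integer value of a double-number token, or None if it is not one.

    A token counts as a double only if it contains at least one character
    from dp_chars and is otherwise made up of ASCII digits.
    """
    has_mark = any(ch in dp_chars for ch in token)
    digits = "".join(ch for ch in token if ch not in dp_chars)
    if not has_mark or not (digits.isascii() and digits.isdigit()):
        return None
    return int(digits)


# Both separators enabled: either spelling gives the same value.
print(recognize_double("123.456"))   # 123456
print(recognize_double("123,456"))   # 123456

# Only "." enabled: "2," is not a number, so the system can report it
# as an undefined word instead of silently accepting it.
print(recognize_double("2,", dp_chars="."))   # None
```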

--
Bernd Paysan 
"If you want it done right, you have to do it yourself" 
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ* 
http://bernd-paysan.de/
Bernd Paysan

On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:

Forth-2012 with the Xchar wordset allows all XChar printable characters as part of a Forth word (usually UTF-8). People use that; we recently had an article about a Chinese student team using Forth to win a competition, and one of the two programmers used Chinese for his own definitions.

As far as I can test here, all Forth systems I tested have no problems with this approach; just case insensitivity is limited to ASCII (which the standard recommends, we have case insensitivity to write words in lower case which have been standardized in upper case).

Bernd Paysan

8th doesn't distinguish doubles and floats (it just has 'numbers'). But it does let you set the (output) values of the decimal and thousands separators using .# and ,# respectively. You cannot query the current values; they default to . and ,

One weakness is that for example, outputting numbers for Indian users is more difficult, since 8th has no internal concept of 'locale' and the built-in ,# will only cause 'thousands separators' every three characters, unlike the normal Indian formatting. That would have to be done manually using e.g. "s:strfmt" or other words.
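For comparison, the Indian grouping rule (the last three digits form one group, everything before that is grouped in twos) is short to write by hand. The sketch below is in Python rather than 8th, and `group_indian` is a hypothetical helper, not an 8th word or part of any library.

```python
def group_indian(n: int, sep: str = ",") -> str:
    """Group digits Indian-style: the last three digits, then groups of two."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel two digits at a time off the right end of the remaining head.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + sep.join(groups + [tail])


print(group_indian(12345678))   # 1,23,45,678
print(f"{12345678:,}")          # 12,345,678 (groups of three, for comparison)
```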

Ron Aaron

Here's one way to do it:

formatting link

Ron Aaron

You've always been able to do that in Java, although it's UCS-2 (old versions) or UTF-16 (current), not UTF-8.

Robert Wessel

Actually Forth-2012 falls slightly short of actually guaranteeing that this works, because it does not require that the Forth system supports UTF-8.

Fortunately, UTF-8 is designed to be compatible with existing code, so if the system is 8-bit clean and does not do funny things for recognizing white space, it will pretty much work with UTF-8. An example is old versions of Gforth (long before anybody thought about the Xchar wordset), where everything but command-line editing works with UTF-8.
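The reason this works can be shown with a small Python sketch of a byte-oriented parser. It assumes the parser treats only a handful of ASCII bytes as delimiters and never decodes the input; `ASCII_BLANKS` and `parse_names` are hypothetical names for this illustration, not part of any Forth system. Since every byte of a multi-byte UTF-8 sequence is 0x80 or above, none of them can be mistaken for an ASCII blank.

```python
# Assumption: the parser treats only these ASCII bytes as delimiters.
ASCII_BLANKS = b" \t\r\n"


def parse_names(source: bytes) -> list[bytes]:
    """Split a byte buffer into names at ASCII blanks, without decoding it."""
    names: list[bytes] = []
    current = bytearray()
    for byte in source:
        if byte in ASCII_BLANKS:
            if current:
                names.append(bytes(current))
                current.clear()
        else:
            current.append(byte)
    if current:
        names.append(bytes(current))
    return names


line = ": αβ  42 ; 变量".encode("utf-8")
names = parse_names(line)

# The names survive byte-for-byte, so they still decode as valid UTF-8.
print([name.decode("utf-8") for name in names])
```

A system that strips the high bit or treats bytes above 0x7F as blanks would break this, which is the "8-bit clean" condition mentioned above.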

An example of UTF-8 variables in Forth (and other languages on the same page) is shown on .

- anton

Anton Ertl

What Algol68 does makes much more sense. It defines symbols that make sense for the language. Like "less or equal" (a kind of underscored --

Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS 
Economic growth -- being exponential -- ultimately falters. 
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Albert van der Horst

You have a certain leeway as an implementor, because non-ASCII characters are "implementation defined". This makes a program that uses the IBM box-printing characters a portable program with dependencies.

I've a different strategy in ciforth. A name is a string of bytes, not characters. That is why I detest case insensitivity. The interpretation mechanism just looks up a byte string. So a name can contain sequences that turn the current foreground color green then red, so as to have names that look like colorforth if printed on an appropriate device (a VT100 would not do).

Instead of manipulating >IN, ciforth has lifted this to an ever so slightly higher abstraction, PP@@. PP@@ returns an incremented parse pointer and the next character (which must fit in 64 bits). By revectoring PP@@ one can redefine how word names are interpreted into characters or escape sequences.

A practical application is com-4e5 . Any function keys are just stored in the dictionary. Press a function key while in communication mode and it will just be executed. Press an as yet undefined key and you get the chance to define it.

If you use this, your program is no longer ASCII and loses the epithet "Forth with environmental dependencies".

Groetjes Albert

Albert van der Horst
