Decimal Point vs. Decimal Comma

rickman 10 years ago

I know that roughly half the world uses a period to separate the fractional part of a number from the integer part. Roughly half the world uses a comma for the same purpose. But what about in computer languages? A little research showed that Algol was specified to work with either. Other computer languages seem to work primarily or exclusively with a period (point).

Are there any languages that support both formats without special programming?

Rick

Anton Ertl 10 years ago

Not really. Algol 60 (not sure about 68) did not specify the concrete representation of the lexemes. So one compiler could accept 123.45 and a different compiler could accept 123,45 as concrete syntax of the same number. But the compilers would not necessarily accept the other syntax. A typical standardization cockup.

Some Forth implementations do. E.g., SwiftForth and bigForth:

ANS bigFORTH 386-Linux rev. 2.3.1

123.45 d. 12345 ok
123,45 d. 12345 ok

A very early version of Gforth also worked that way, but I removed this feature, because it's more important to be able to get a useful error message if you use "2,", but the Forth system does not define it. I have not missed this feature, and I come from the part of the world that uses decimal comma.

- anton

M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html New standard: http://www.forth200x.org/forth200x.html EuroForth 2015: http://www.rigwit.co.uk/EuroForth2015/

upsidedown 10 years ago

In COBOL: DECIMAL-POINT IS COMMA

rickman 10 years ago

But neither of these are "correct". I was talking about the radix point that separates the fraction from the integer portion of a number.

Rick

Nils M Holm 10 years ago

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    DECIMAL-POINT IS COMMA.

COBOL :)

Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org

George Neuner 10 years ago

If I understand correctly that you're asking about numeric literals in program source, then I think the answer is COBOL and very possibly nothing else.

It's trivial to make a lexer accept either number format, but many languages use commas as separators for, e.g., function arguments, array and list elements, etc. ... so also using commas as decimal marks would cause problems.

Consider: x = f( 1,9 );

Is that one argument or two? Depends on number lexing. Given a particular lexing, one of the two possible parses is an error - but which depends on the declaration of f(). This could be extremely confusing to a programmer. The only way around it is to also change the argument separator, or to enforce that truly separate values be separated by whitespace in addition to the separator.
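The ambiguity can be made concrete with a toy tokenizer. This is only a sketch in Python, not taken from any real compiler; the names `tokenize`, `count_arguments` and the `decimal_mark` parameter are made up for illustration, and the "language" is reduced to unsigned numbers and commas.

```python
import re

def tokenize(source, decimal_mark="."):
    """Split argument text into number tokens and ',' separator tokens.

    decimal_mark is the character accepted as radix point inside a number.
    """
    mark = re.escape(decimal_mark)
    # A number is digits, optionally followed by the mark and more digits.
    pattern = re.compile(r"\s*(\d+(?:" + mark + r"\d+)?|,)")
    source = source.strip()
    tokens, pos = [], 0
    while pos < len(source):
        match = pattern.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def count_arguments(source, decimal_mark="."):
    """Count the number tokens, i.e. the arguments the parser would see."""
    return sum(1 for tok in tokenize(source, decimal_mark) if tok != ",")

print(count_arguments(" 1,9 "))                     # decimal point: two arguments
print(count_arguments(" 1,9 ", decimal_mark=","))   # decimal comma: one argument
print(count_arguments(" 1, 9 ", decimal_mark=","))  # whitespace restores two
```

The last call shows the whitespace workaround: once a space follows the comma, the lexer can no longer read it as part of the number.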

Pretty much only the whitespace-separated sexpr syntax of Lisp (and Scheme) is immune to the parsing issue ... languages like ML and Haskell, etc. don't necessarily need commas for function calls, but they still do use them for other things.

I'm not aware of any language that specifically addresses this issue regarding its source code - all that I am familiar with simply use the dot syntax for decimals. Number formats often are addressed as locale issues for I/O libraries, but not for program source.

Many compilers now allow Unicode in their source, but programmers are still actively discouraged from using non-ASCII characters. Non-English-speaking programmers are all but forced to program in English for portability: English is the lingua franca of programming. There have been compilers designed specifically for certain locales, but absent a government mandate for their use, none has ever been very successful.

Incidentally, I'm not sure what you saw re: Algol, but neither Algol 60 nor 68 permitted commas as decimal marks - both used the dot syntax. I don't have a reference for Algol 58, but all the Algols used commas as argument and array separators, so they all would have been susceptible to the parsing issue described above.

YMMV, George

Norbert_Paul 10 years ago

Maybe by modifying the read-table of Common Lisp. But this might already be "special programming", and I don't know for certain that it is possible (though I very much suppose so).

Hans-Bernhard Bröker 10 years ago

On 02.04.2016 at 23:49, rickman wrote:

In computer languages we have barely enough punctuation characters available as it is to express all the necessary things without being overly verbose. Wasting one on a luxury item like that would IMHO be unjustifiable.

Programs need to be able to handle localized formats on input and output, but not in the source itself.
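A minimal sketch of what that looks like at the I/O boundary, in Python. The function names and the `decimal_mark` / `group_mark` parameters are invented for this example; a real program would more likely take them from its locale settings. The source code itself keeps using the dot.

```python
from decimal import Decimal

def parse_localized(text, decimal_mark=",", group_mark="."):
    """Convert a localized numeric string such as '1.234,56' to a Decimal."""
    cleaned = text.strip().replace(group_mark, "").replace(decimal_mark, ".")
    return Decimal(cleaned)

def format_localized(value, places=2, decimal_mark=",", group_mark="."):
    """Render a Decimal with three-digit grouping and the chosen marks."""
    plain = f"{value:,.{places}f}"  # ',' as group mark, '.' as radix point
    # Swap the marks through a placeholder so they do not clobber each other.
    return (plain.replace(",", "\0")
                 .replace(".", decimal_mark)
                 .replace("\0", group_mark))

amount = parse_localized("1.234,56")   # input in decimal-comma notation
print(amount + Decimal("0.44"))        # arithmetic is notation-free
print(format_localized(amount))        # output back in decimal-comma notation
```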

upsidedown 10 years ago

Some computer languages like COBOL and FORTRAN could be written with 6-bit codes (like FIELDATA). Unfortunately, languages like C and Pascal used characters from the ISO 646 character sets, including characters reserved for national variants.

What is the problem with using Latin-1 (ISO 8859-1) in a programming language?

In fact, it would be a great thing if you could use UTF-8 in programming, e.g. the Greek letters alpha and beta as program variables. No need to transliterate variables from a textbook.

Niklas Holsti 10 years ago

Allowed in Ada.

You can do that in Ada. Whether it makes sense is another question.

I usually keep to alphanumerics, but sometimes I use accented letters when I write Ada in a Finnish context, with Finnish names for variables.

Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .

Ron Aaron 10 years ago

You can. In 8th you certainly can, and I don't see why a standard Forth would disallow a UTF-8 name.

Robert Wessel 10 years ago

Cobol has the same issue, so if you use DECIMAL-POINT IS COMMA, you need to be more careful with whitespace in some places. But subscripting is not one of them, since only integers are allowed there (I think this might have changed in modern Cobol so that you can have full expressions in subscripts, but I'm not sure).

Cobol is full of weird parsing rules.

Bernd Paysan 10 years ago

On Sun, 03 Apr 2016 06:23:56 +0000, Anton Ertl wrote:

VFX has dp-char and fp-char to store the currently used decimal separator for doubles and floats (actually, a zero-terminated string of a few characters), by default, dp-char is both '.' and ',', so you can enter both forms:

VFX Forth for Linux IA32

123,456 d. 123456 ok
123.456 d. 123456 ok

fp-char is by default just '.'.

Gforth's current development versions also have dp-char and fp-char, but with only one possible character each; the default is ".". This lets you change the character in case you want to read localized numbers.
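The idea of a configurable set of accepted marks can be modelled in a few lines. This is a rough Python model only, not how VFX or Gforth implement it: `parse_double` and its `dp_chars` parameter are hypothetical names, and a real Forth would also record where the mark was found.

```python
DIGITS = "0123456789"

def parse_double(token, dp_chars=".,"):
    """Return the integer value of a punctuated double-number token.

    Returns None if the token holds none of the accepted marks or has
    characters that are neither digits nor accepted marks.
    """
    sign = -1 if token.startswith("-") else 1
    body = token[1:] if sign < 0 else token
    digits = "".join(c for c in body if c not in dp_chars)
    has_mark = len(digits) != len(body)
    if not has_mark or not digits or any(c not in DIGITS for c in digits):
        return None
    return sign * int(digits)

print(parse_double("123,456"))                # both marks accepted
print(parse_double("123.456"))
print(parse_double("123,456", dp_chars="."))  # only '.' accepted: not a number
print(parse_double("2,"))                     # accepted as a double when ',' is a mark
```

The last line is the flip side mentioned earlier in the thread: once the comma is an accepted mark, a token like "2," is silently taken as a number instead of being reported as an undefined word.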

Bernd Paysan "If you want it done right, you have to do it yourself" net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ* http://bernd-paysan.de/

Bernd Paysan 10 years ago

On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:

Forth-2012 with the Xchar wordset allows all XChar printable characters as part of a Forth word (usually UTF-8). People use that; we recently had an article about a Chinese student team using Forth to win a competition, and one of the two programmers used Chinese for his own definitions.

As far as I can test here, all Forth systems I tested have no problems with this approach; just case insensitivity is limited to ASCII (which the standard recommends, we have case insensitivity to write words in lower case which have been standardized in upper case).
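ASCII-only case insensitivity is easy to picture with a small lookup table. The following Python sketch is an assumption about the general shape of such a scheme, not code from any Forth system; `ascii_fold` and `WordList` are names made up here.

```python
def ascii_fold(name):
    """Upper-case only the letters a-z; leave every other character as is."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in name)

class WordList:
    """A toy dictionary whose lookups ignore case for ASCII letters only."""

    def __init__(self):
        self._words = {}

    def define(self, name, value):
        self._words[ascii_fold(name)] = value

    def find(self, name):
        return self._words.get(ascii_fold(name))

words = WordList()
words.define("DUP", "standard word")
words.define("größe", "user word")

print(words.find("dup"))    # found: ASCII letters are folded
print(words.find("größe"))  # found: exact non-ASCII spelling
print(words.find("GRÖSSE")) # None: non-ASCII letters are not folded
```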

Ron Aaron 10 years ago

8th doesn't distinguish doubles and floats (it just has 'numbers'). But it does let you set the (output) values of the decimal and thousands separators using .# and ,# respectively. You cannot query the current values; they default to . and ,

One weakness is that for example, outputting numbers for Indian users is more difficult, since 8th has no internal concept of 'locale' and the built-in ,# will only cause 'thousands separators' every three characters, unlike the normal Indian formatting. That would have to be done manually using e.g. "s:strfmt" or other words.
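For reference, the Indian convention groups the last three digits and then every two digits before that. A short Python sketch of the manual approach (this is not 8th code, and `group_indian` is a name invented for the example):

```python
def group_indian(n, separator=","):
    """Format an integer with Indian digit grouping: 3 digits, then 2s."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel two-digit groups off the right end of what precedes the last three.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + separator.join(groups + [tail])

print(group_indian(12345678))  # 1,23,45,678
print(group_indian(100000))    # 1,00,000
print(f"{12345678:,}")         # 12,345,678 (plain three-digit grouping)
```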

Ron Aaron 10 years ago

Here's one way to do it:

formatting link

Robert Wessel 10 years ago

You've always been able to do that in Java, although it's UCS-2 (old versions) or UTF-16 (current), not UTF-8.

Anton Ertl 10 years ago

Actually, Forth-2012 falls slightly short of guaranteeing that this works, because it does not require the Forth system to support UTF-8.

Fortunately, UTF-8 is designed to be compatible with existing code, so if the system is 8-bit clean and does not do funny things for recognizing white space, it will pretty much work with UTF-8. An example is old versions of Gforth (long before anybody thought about the Xchar wordset), where everything but command-line editing works with UTF-8.
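Why this works, and what "funny things" would break it, can be shown on raw bytes. The Python sketch below is an illustration only; `split_words` and its `whitespace` parameter are hypothetical, standing in for a Forth system's word parser.

```python
def split_words(line, whitespace=b" \t"):
    """Split a line of bytes into words on the given whitespace bytes."""
    words, current = [], bytearray()
    for byte in line:
        if byte in whitespace:
            if current:
                words.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    if current:
        words.append(bytes(current))
    return words

line = "variable à".encode("utf-8")

# Every byte of a multi-byte UTF-8 sequence is >= 0x80, so an ASCII-only
# whitespace test never cuts a character in half.
print([w.decode("utf-8") for w in split_words(line)])

# A parser that also treats 0xA0 (the Latin-1 no-break space) as whitespace
# splits 'à', which is encoded as the bytes C3 A0.
print(split_words(line, whitespace=b" \t\xa0"))
```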

An example of UTF-8 variables in Forth (and other languages on the same page) is shown on .

- anton

Albert van der Horst 10 years ago

What Algol68 does makes much more sense. It defines symbols that make sense for the language. Like "less or equal" (a kind of underscored --

Groetjes Albert

Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Albert van der Horst 10 years ago

You have a certain leeway as an implementor, because non-ASCII characters are "implementation defined". This makes a program that uses the IBM box-drawing characters a portable program with dependencies.

I've a different strategy in ciforth. A name is a string of bytes, not characters. That is why I detest case insensitivity. The interpretation mechanism just looks up a byte string. So a name can contain sequences that turn the current foreground color green then red, so as to have names that look like colorforth if printed on an appropriate device (a VT100 would not do).

Instead of manipulating >IN, ciforth has lifted this to an ever so slightly higher abstraction, PP@@. PP@@ returns an incremented parse pointer and the next character (which must fit in 64 bits). By revectoring PP@@ one can redefine how word names are interpreted into characters or escape sequences.

A practical application is com-4e5 . Any function keys are just stored in the dictionary. Press a function key while in communication mode and it will just be executed. Press an as yet undefined key and you get the chance to define it.

If you use this, your program is no longer ASCII and loses the epithet "Forth with environmental dependencies".

Groetjes Albert
