Language feature selection

Don Y · 2017-03-05T23:03:25+00:00

A quick/informal/UNSCIENTIFIC poll: What *single* (non-traditional) language feature do you find most valuable in developing code? (and, applicable language if unique to *a* language or class of languages)


Don Y 9 years ago

I use "new" only in the sense of something that was *new* (for you) when you were exposed to it. My interest is more on *value* than newness and/or novelty.

E.g., replacing the i4004's JMS/BBL with the i8008's CALL/RET was a yawn... stack depth was still severely constrained (though doubled!).

OTOH, the 8080's addition of PUSH/POP was *hugely* (YUGELY?) valuable as it made a pronounced difference in what was suddenly "realistic" in terms of managing application implementation complexity. "You mean I can push *any* register (pair)? And, as many as I want ALONG WITH subroutine invocations??"

But, does that really buy you much -- unless you are sorely resource (time/space) constrained? (Granted, this can be seen as a different case of the PUSH/POP issue, above -- e.g., building a synchronization primitive to wrap a set of *discrete* accesses, etc.)

[One of my gripes about Ada is that it's too "big" of a language]

One can always come up with new "features" that can have some merit in some cases. What I'm looking for are the "wow" moments when some feature made a radical difference in how you approach(ed) problems.

Like moving from the 8's JMS to the 11's JSR opening the door for recursive routines.

Or, having direct access to the "status word" allowing a multitasking executive to restore the *entire* machine state, etc.

This looks like you are trying to tackle atomic updates on a larger (wider) scale than just bitfields (?).

If not, then you seem to be creating a huge bias (in the mind of the developer) to opt for a "set of bitfields" representation of certain structures that might inadvertently lead to a sort of premature optimization ("Damn! If I let the individual fields grow 'just a tad', then I lose the value of this feature, here...")

[Disclaimer: I've not looked at Ada for more than 15 years... :< ]


Les Cargill 9 years ago

Hi Walter! No offense, but...

This is utterly 1) inappropriate and 2) unnecessary. It's a *terrible* extension to (I presume) C, where I have seen it.

Even back on Borland, Microsoft, Tektronix and ... another VAX C compiler 30ish years ago, you'd use a linker/locater to put specific structures at specific locations.

The source code itself doesn't need to know about where variables go[1].

It's part of the responsibility of tools that are invoked later in the build cycle.

[1] Think of a shared module that's a "driver" for an FPGA and is common to three projects, but all three targets have different base addresses for said FPGA.

Now, all that being said - for some kinds of debugging, it's darn handy to have the same variables at the same locations. It's just better done in the linker or with a locater.

It very nearly destroys the portability of code that uses it...

Les Cargill


Les Cargill 9 years ago

That's kind of not what const is for in 'C', although it's a way.

So what you really want is for the toolchain to assign these values at address fixup time - in the linker or later.

You use #pragmas to assign a block in a separate segment, then the linker fixes it up for you.

That way if the FPGA decodes at a different address for different projects, you're covered and don't have to maintain separate copies of that source code.
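A minimal sketch of the link-time idea, for illustration only. It uses a plain extern symbol rather than the compiler-specific #pragma segment mechanism mentioned above, and every name in it (fpga_regs, the register offsets, the example address in the comment) is hypothetical. The driver code never mentions an address; a host-side definition stands in for what the linker or locator would provide on a real target.

```c
#include <stdint.h>

/* Driver side: only a name, no address. On a real target the symbol would
 * be supplied at link time, for example by a symbol assignment in a linker
 * script (the name and any address chosen there are project-specific). */
extern volatile uint8_t fpga_regs[];

enum { FPGA_REG_CTRL = 0, FPGA_REG_STATUS = 1 };  /* hypothetical offsets */

/* Set the (hypothetical) enable bit in the control register. */
void fpga_enable(void)
{
    fpga_regs[FPGA_REG_CTRL] |= 0x01u;
}

/* Return the current contents of the status register. */
uint8_t fpga_status(void)
{
    return fpga_regs[FPGA_REG_STATUS];
}

/* Host-side stand-in so the sketch links and runs on a PC. On a target this
 * definition would be dropped and the linker would place the symbol. */
volatile uint8_t fpga_regs[2] = { 0x00u, 0xA5u };
```

With this split, the three projects in the footnote could share the driver object file unchanged and differ only in where the linker places fpga_regs.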

- Les Cargill


Walter Banks 9 years ago

Maybe "terrible" but a lot cleaner than a pointer to constant that is expected to be optimized by a compiler to a simple load store.

And I assume that constant pointer doesn't destroy portability. C specifically rarely has portable applications.

Better that applications be completely self-contained, with include files to define application-specific information. The original role of linkers was to make it possible to compile large programs on small computers.

The current state of compilers, especially open source ones, involves an awful lot of duplication between the compiler and the linker in the tool sets. Better that a lot of the linker information be processed much earlier to develop compile strategies.

This is even more true in some of the more interesting ISAs that are currently being developed. I am working on a project that can have hundreds or potentially thousands of heterogeneous processors being compiled to from a single application.

Address setting from application/processor specific header files allows compilers to be written with far more flexibility and that is far less computationally intensive.

w..


Grant Edwards 9 years ago

Huh? The common method for accessing memory-mapped registers in C that's been discussed doesn't involve a "pointer to a constant".

Grant


Tom Gardner 9 years ago

This is a philosophical difference.

If something is important to the correct operation of the program, then I like it to be visible in the source code. A useful benefit is that the information is easily found and analysed by the IDEs and/or other source code manipulation tools around.

In the same vein, in C I dislike having correct code operation being dependent on combinations of command line compiler arguments.

That seems unimportant to me. I cannot think of a reason why you would need to nail down addresses in portable code. Of course "portable" is not a black and white concept!

Any examples?


David Brown 9 years ago

First, no one is using a "pointer to a constant" - they are using a constant pointer to a volatile:

#define REG (*((volatile uint8_t*) 0x1234))

Secondly, the syntax here is not particularly bad, and it should be clear to anyone who understands C. And since in almost all cases, this sort of thing comes in pre-written headers, the developer is rarely going to have to write it themselves.
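As a rough illustration of how the macro above is typically used, here is a sketch that runs on a host by pointing the address at an ordinary variable. The names sim_reg, REG_ADDR and the helper functions are assumptions made for this example; on a real target REG_ADDR would be an integer literal such as the 0x1234 shown above, usually taken from a vendor header.

```c
#include <stdint.h>

/* Host-side stand-in for the device register so the sketch can run on a PC. */
static uint8_t sim_reg;

/* On a real target this would be an integer literal, e.g. 0x1234. */
#ifndef REG_ADDR
#define REG_ADDR ((uintptr_t)&sim_reg)
#endif

/* Constant address, volatile access: the compiler may not cache or drop
 * reads and writes made through REG. */
#define REG (*((volatile uint8_t *)REG_ADDR))

/* Read-modify-write helpers built on the macro. */
void reg_set_bits(uint8_t mask)   { REG |= mask; }
void reg_clear_bits(uint8_t mask) { REG &= (uint8_t)~mask; }
uint8_t reg_read(void)            { return REG; }
```

Because the address is visible at compile time, the compiler sees the same information it would get from an @ declaration.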

Yes, I expect a compiler to optimise the use of such a REG. This is not 1980 - even the simplest compilers do a reasonable job on such common constructs. And the compiler has /exactly/ the same information here as it would have with an @ definition - there is no reason why it should be able to optimise one better than the other.

(It is different for registers whose address is defined by linker scripts - there, the compiler has less information and cannot generate as optimal code.)

Some C code is massively portable, other C code is tied tightly to a particular compiler and particular target - C covers a wide span here.

The use of the cast constant pointer is portable to any compiler that supports the appropriate types, and has the same interpretation of casting an integer literal to a pointer, and has the same concept of "volatile access". That covers pretty much every C compiler ever written - certainly anything for embedded systems (barring possible limitations on type sizes).

The use of @ is limited to a few specific compilers, most of which are for outdated and limited cpu cores.

Linkers do a lot more than that - but that is a different argument.

I agree that it is nicest to have as much as possible defined as part of the C code and header files, rather than in the linker file (though I accept that some people like to put their fixed addresses in a linker file). The standard method of defining hardware registers using cast pointers puts the addresses in a header file (or occasionally a C file).

So again, the @ method gives absolutely no benefit over the standard method.

Can you give specifics here, or is it just your prejudice against open source tools?

If the compiler knows the addresses of the fixed registers at compile time, it can use that information to generate better code. If it knows that REG1 is at 0x1001, and REG2 is at 0x1002, then it can load a cpu pointer register X with 0x1001, and access these two hardware registers at X+0 and X+1 rather than working with their absolute addresses.
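One common way to express "one base address plus small offsets" in source is a struct overlay. The sketch below is only an illustration: the type periph_t, the field names and the host-side stand-in are all hypothetical, and on a target the base would be a cast integer literal such as 0x1001 rather than the address of a variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical peripheral with two adjacent byte-wide registers. */
typedef struct {
    volatile uint8_t reg1;   /* would sit at 0x1001 in the example above */
    volatile uint8_t reg2;   /* would sit at 0x1002 */
} periph_t;

/* Check at compile time that the two registers are adjacent. */
_Static_assert(offsetof(periph_t, reg2) == offsetof(periph_t, reg1) + 1,
               "registers must be adjacent");

/* Host-side stand-in; on a target this would be ((periph_t *)0x1001). */
static periph_t sim_periph;
#define PERIPH (&sim_periph)

/* Both stores go through the same base pointer with offsets 0 and 1. */
void periph_init(void)
{
    PERIPH->reg1 = 0x80u;
    PERIPH->reg2 = 0x01u;
}

/* Read both registers back and combine them. */
uint8_t periph_sum(void)
{
    return (uint8_t)(PERIPH->reg1 + PERIPH->reg2);
}
```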

This is absolutely true, and particularly relevant on more modern cpus with longer addresses and more registers where this sort of optimisation makes a bigger difference.

That is why it makes sense to use the standard cast pointer syntax, the "kludge", rather than putting all your hardware addresses in a linker file. And that is one of the reasons why almost everyone /does/ define their hardware addresses that way.

There is absolutely /nothing/ here to give any reason to suspect a difference between open source compilers and closed source compilers.

And again, the @ method gives you absolutely /nothing/ here that you don't also get with standard C.

Again, the @ method gives you absolutely /nothing/ here that you don't also get with standard C.

The @ syntax is a relic from the days of brain-dead compilers with poor optimisation and limited support for normal standard C, usually targeting brain-dead cpus that are also a relic. It is a little neater than the standard C method - defining macros is always ugly, as are pointer casts. But there are /no/ benefits from using the @ syntax like this for a decent compiler. The @ syntax does not give the compiler any more information, and it does not allow any more optimisation. The way it is implemented is often less flexible, since it is often limited to variables of a few specific types, and may require specific fixed formats for the address.


Walter Banks 9 years ago

Absolutely true it is a constant pointer.

w..


Simon Clubley 9 years ago

Some things you get with being able to update multiple fields within one assignment statement without having to use bitmasks:

It allows you to update multiple fields in a device register as part of a single R-M-W sequence without having to use bitmasks or temporary variables.

It allows the compiler to do additional error checks on what you are trying to do, which it can't do if you are changing what appears to the compiler as an opaque integer.

It makes the code more readable and more robust.
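For comparison, the nearest C approximation uses bit-fields, sketched below under several assumptions: the register layout and all names are hypothetical, the register is simulated by an ordinary variable, and bit-field order and packing are implementation-defined in C, so a real driver would have to verify the layout for its compiler. Unlike the single-assignment form described above, C still needs an explicit temporary to keep the update to one read and one write.

```c
#include <stdint.h>

/* Hypothetical 8-bit control register viewed either raw or as fields.
 * Bit-field layout is implementation-defined; check your compiler. */
typedef union {
    uint8_t raw;
    struct {
        unsigned enable   : 1;
        unsigned mode     : 3;
        unsigned prescale : 4;
    } f;
} ctrl_t;

static volatile uint8_t sim_ctrl_reg;   /* host-side stand-in for the device */

/* Update three fields with exactly one volatile read and one volatile write. */
void ctrl_configure(unsigned mode, unsigned prescale)
{
    ctrl_t tmp = { 0 };
    tmp.raw = sim_ctrl_reg;             /* single read  */
    tmp.f.enable   = 1u;
    tmp.f.mode     = mode & 0x7u;
    tmp.f.prescale = prescale & 0xFu;
    sim_ctrl_reg = tmp.raw;             /* single write */
}

/* Field readers, each decoding a fresh copy of the register. */
unsigned ctrl_enabled(void)  { ctrl_t t = { 0 }; t.raw = sim_ctrl_reg; return t.f.enable; }
unsigned ctrl_mode(void)     { ctrl_t t = { 0 }; t.raw = sim_ctrl_reg; return t.f.mode; }
unsigned ctrl_prescale(void) { ctrl_t t = { 0 }; t.raw = sim_ctrl_reg; return t.f.prescale; }
```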

Understood. Sometimes Don, the problem with these discussions is that you don't provide sufficient guidelines on what you are looking for so people have to guess and as such quite often don't give you the kinds of answers you are looking for.

Simon.

Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP

Microsoft: Bringing you 1980s technology to a 21st century world


Don Y 9 years ago

What *I'm* looking for isn't important. I'm interested in hearing what *others* "looked for" (by way of "language features") and how they've rationalized their "value".

E.g., some folks look for "features" that make debugging easier; some that make the application "terser"; some that make it more robust; etc.

If I was interested in which language features...

... facilitate debugging...

... make for terser applications...

... enhance the durability of the application...

etc., then I'd have asked those *specific* questions, right?

By asking in more abstract terms ("valuable"), I let others decide what is important (in terms of criteria) AND, then, what features they deem "best suiting" THAT particular goal.

My followup comments are merely to draw out more detail and "backup" for how they came to that conclusion, not as an assessment *of* their conclusion ("what's your favorite color, and why?")


Don Y 9 years ago

I guess it depends on what you consider "source code". How do you treat makefiles, linker scripts, etc.? Clearly they are all important to the *intended* operation of the program -- as are the actual tools, themselves.

How much of this cruft do you clutter the "sources" with in the attempt to ensure they accompany the sources? What about applications wherein multiple languages are combined; how do you nail down the "implementation defined" behavior of their interfaces? What order will your Java/C/Python/foo function build its stack frame? How will your FORTRAN/Pascal/ASM/bar module interface to it??

There's usually a difference between "correct" and "desired". It's unfortunate when "correct" relies on command line arguments to resolve some "implementation defined behavior" with which the compiler could, otherwise, take liberties. Likewise, if the order and locations at which objects can be bound can arbitrarily be altered and affect operation

[These should be eschewed, IMO]

(This is an issue on many processors, without concern for the actual I/O's)

Of course, there's no way for the tool to know/enforce these constraints other than a suitable note to the future developer!

Therein lies the rub. Code can be "portable" yet still tied to a particular processor (but a different implementation). E.g., reset_processor()...


Don Y 9 years ago

E.g., one reply that I received (elsewhere) resulted in a lengthy discussion as to deficiencies in "list" implementations; why lists were valuable to some types of use (application) but problematic for others (in the discussion) because of implementation shortcomings.

This sort of back-and-forth refines the "value" that is being placed (or not!) on the "feature" in the minds of the different folks discussing it.


Jacob Sparre Andersen 9 years ago

In my daily work I would say strong typing (as done in Ada and SPARK).

When I'm doing embedded systems, being able to declare representations for a type (especially for enumerations) is very valuable.
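C has no direct counterpart to an Ada representation clause, but a loose analogue is an enum with explicit encodings plus a checked decoder. The sketch below is an illustration only; the device, the 2-bit field and every name are hypothetical, and unlike Ada the C compiler will not by itself reject an out-of-range value.

```c
#include <stdint.h>

/* Hypothetical 2-bit mode field of a device register. */
typedef enum {
    MODE_OFF     = 0x0,
    MODE_STANDBY = 0x1,
    MODE_RUN     = 0x3    /* the encoding deliberately skips 0x2 */
} dev_mode_t;

_Static_assert(MODE_RUN <= 0x3, "mode must fit in a 2-bit field");

/* Decode a raw field value. Returns 1 and stores the mode on success,
 * or 0 if the bits are an encoding the hardware does not define. */
int mode_from_bits(uint8_t bits, dev_mode_t *out)
{
    switch (bits & 0x3u) {
    case MODE_OFF:
    case MODE_STANDBY:
    case MODE_RUN:
        *out = (dev_mode_t)(bits & 0x3u);
        return 1;
    default:
        return 0;
    }
}
```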

What really made me say "wow" to a programming language feature was when I first read about having tasking built into a language (and not as an add-on library) in a book about Ada 83.

Greetings,

Jacob

"We will be restoring normality as soon as we are sure what is normal anyway."


Tom Gardner 9 years ago

Keep it simple...

Source code => as defined in the language standard.

If something is tool-specific, then it is not part of the source code. Hence compiler arguments are not part of the source code.

Code inspection tools such as browsers, analysers, and compliance checkers work on the source code. That's important.

Not really.

If there is a distinction between "desired" and "correct" then I can instantly rewrite the program to be much faster and much smaller.

That is always the case with C/C++, unless the program uses no separately compiled libraries and turns off all optimisations.

Other languages are much better in that regard.

That is just one small consideration in this context.

And that's undesirable.

More than just a processor, consider different boards with the same processor.


Hans-Bernhard Bröker 9 years ago

On 10.03.2017 at 09:17, Tom Gardner wrote:

That's just it: it is hardly essential for the operation of such code _where_ exactly that register is; what really matters is that the variable describing it is a) properly structured and named, and b) ultimately correctly located.

So how is it better to have that address hard-coded into the driver source code, as opposed to getting to decide it at link-time? Among other things, this allows using the same, pre-compiled driver code to be used for different micros, even if they mapped the same peripheral IP module to different addresses.

Practically speaking, the @ extension has caused my employer a good deal more grief than equivalent methods in other toolchains we use. That happened primarily because it _is_ an extension that quite a number of our other tools that read C source don't fully recognize. The worst aspect of it is that no amount of preprocessor work gets the @ syntax reliably hidden from those tools' view.

An IDE that cannot find text in linker/locator scripts or make files isn't worth being in use.

Well, most code whose behaviour is changed by compiler flags is anything but correct --- its authors usually fail to see that, though.

OTOH, using language extensions like this '@' makes your code dependent not just on a compiler option that might toggle it on/off, but even on the compiler _having_ such support in the first place, for that option to enable.

For platforms like ARM that have multiple compiler toolchains available, it's highly preferable to allow a single code base to work on all of them. That's one kind of portability the '@' extension breaks almost immediately, first by not being supported on all compilers, second by hard-coding addresses that aren't necessarily the same on all implementations of the CPU platform.


Grant Edwards 9 years ago

When you say "strong typing" you are also assuming "static typing"?

Grant Edwards, grant.b.edwards at gmail.com

Yow! I love ROCK 'N ROLL! I memorized the all WORDS to "WIPE-OUT" in 1965!!


Walter Banks 9 years ago

There are lots of tools issues that should be re-examined. There are many cases where the tool execution is backwards to what generates the best code. The compile-link sequence, where some key information about the target ISA or processor architecture is not known until link time, sometimes forces the compiler to use a subset of the actual processor's ISA rather than take advantage of a specific member feature. The recent discussion of the mill belt length is a good example of how important this is. This type of compiler approach can encapsulate the specific processor variations and make application wide optimizations with relative ease.

Switching this around so the compiler is focused on creating code for a well defined target rarely is anything more than including a device specific header file in the application. (as a side effect eliminating the link step)

What's wrong with a single set of sources that defines an application, no command line options or linker scripts just an application including the definition of the target, files and libraries it needs. Compilation is both faster by many factors and there is a simple self contained project that can be easily re-created after a decade or more.

(The oldest project we have helped customers re-create in the last year was archived by the customer in 1988, we have copies of every released tool set. Start to recreating an identical HEX file < 2 hours from receiving the customer support request email)

The brouhaha about "@" and C is really more about having supporting syntax to be able to explain what is desired without needing an indirect definition. That is my real argument not that it can't be done. Most languages have some way to access the underlying machine, fewer of these languages do so in a simple clean way.

Don, I am not arguing to create a more complex world, but in the area of language design why are many tool sets burdened with solutions to the computer limitations of 1980?

It is a real eyeopener to spend some time with some of the current crop of programmers who are using what many of us would consider a toy language to actually achieve some pretty remarkable results. It took me a long time to respect what they are doing.

w..


Wouter van Ooijen 9 years ago

Because it might enable the code generator to generate better (smaller and faster) code.

Wouter "Objects? No Thanks!" van Ooijen


Paul Rubin 9 years ago

It's probably clearest to just call functions to bang on those registers. The compiler can inline them so there shouldn't be overhead.
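A small sketch of that function-wrapper style, with the caveat that the device, register offsets, bit names and the simulated register array are all invented for the example. The accessors are static inline, so a reasonable compiler would normally fold them into the caller.

```c
#include <stdint.h>

/* Host-side stand-in for a hypothetical UART's register block. */
uint32_t sim_uart_regs[2];
#define UART_BASE ((volatile uint32_t *)sim_uart_regs)

enum { UART_DATA = 0, UART_STATUS = 1 };   /* hypothetical word offsets */
#define UART_TX_READY 0x1u                 /* hypothetical status bit   */

/* Small accessors hide the register details from the rest of the code. */
static inline int uart_tx_ready(void)
{
    return (UART_BASE[UART_STATUS] & UART_TX_READY) != 0;
}

static inline void uart_put(uint8_t c)
{
    UART_BASE[UART_DATA] = c;
}

/* Send one byte if the transmitter is ready; returns 1 if sent, 0 if not. */
int uart_try_send(uint8_t c)
{
    if (!uart_tx_ready()) {
        return 0;
    }
    uart_put(c);
    return 1;
}
```

Callers only see uart_try_send(), so the register addresses can change per board without touching them.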

Do you mean C?


Don Y 9 years ago

Yes, but you also have to figure the "developer" in that calculus. People need to be able to wrap their brains around their solutions (esp if those solutions import "components" from other crania!)

A machine can have (near) infinite storage; humans tend to need to reduce things to simpler mental models, eliding many of the details (that a machine *could* examine and exploit).

Additionally, there is value in "hiding detail" when dealing with human agents; you can (undoubtedly) "trust" a machine with those details -- cuz, you'd know the constraints placed on what it could *do* with them! (something that you wouldn't trust to a human!)

That would depend on the size and complexity of the project, right? I have 192 processors (each with multiple cores) in my current design. It would be *delightful* if a tool could sort out how best to allocate resources at run-time instead of my crude metrics.

But, those tools don't exist and aren't likely to any time soon.

As a result, I have to develop solutions that meatware can manage... and, with existing tools as I don't relish becoming a tool-designer any more than absolutely necessary for *my* needs -- with little or no interest in "yours" (out of no malice to "you")

I can build any project almost instantly -- as long as I don't ALSO need to bring up target hardware and/or hardware debugging tools. (I've been diligent about preserving tools AND development environments)

Agreed. Software (language designers) seem to think in terms of some EXISTING environment, not in the CREATION of that environment. This may be a necessary practical concession: how much effort should go into supporting bare metal when that metal changes at a rate that approaches a typical development cycle length?

Walter, you're preaching to the choir! But, in my case, I am interested in the *applications* and suffer the inadequacies of the tools. Why are telephones (to this day!) so "uncomfortable" to hold? Hasn't anyone ever looked at the characteristics of a human hand and how it might wrap around a "handset"? How much time should folks ("users") spend thinking about this vs. just making their phone calls and tolerating the klunky physical implementation?

My current project involves 4 or 5 different languages. And, concepts that don't inherently map to traditional language primitives. How much time do I spend trying to bend my implementation to the languages? Or, the languages to the implementation?

Do I add keywords to the language to cause the automatic generation of client- and server-side stubs for procedures and functions that are intended to be RPC's? Or, do I invent an IDL to use alongside the targeted languages? And, "manually" create the necessary stubs?

I think there are a few different issues, involved (ignoring egos, NIH, etc.).

First, it seems difficult for those of us with "long histories" to wrap our heads around how much technology has changed over the course of our *individual* careers. Recall, for many of us, "software" didn't exist before we were born -- so, it's not like trying to understand the advances in PLUMBING since the Roman Era! :>

I have to make a very *conscious* effort not to pack 8 "flags" into a byte. Or, "reuse" a byte for different purposes by exploiting knowledge of which parts of the application are running at any given time. I can recall counting subroutine invocations in an early 8085 project so that I could map the 7 most common invocations to 7 of the "restart" vectors (thus allowing a 3 byte "CALL" to be replaced by a one byte "RST" -- just to save TWO BYTES several times) in order to shrink the size of the binary to avoid adding another $50 EPROM to the product's cost!

Or, carefully selecting the condition code for a JUMP/CALL (and the surrounding code's structure) to leverage my knowledge of how often that condition will be met -- to trim a few clock cycles off that execution path through the code.

It's also hard to comprehend how much *faster* (and cheaper) the hardware has become. When I started my current project, I was unduly biased by my past conceptions of costs and spent a fair bit of effort "skimping" on the system architecture to save a few dollars here and there. And, rely on COTS hardware (e.g., PC's) for the "heavy lifting". This has a profound influence over what you can do and where you can "do" it!

Over time, it became !painfully! obvious that this was me clinging to an obsolete idea of hardware costs. Why not "PC's (figuratively) everywhere"? Sure, no need for all those displays, keyboards, UI's, etc. But, the compute power is cheap enough (if you stay way back from the state of the art) that it's silly to cripple the implementation by "prematurely optimizing (hardware) cost"!

Likewise, why burden an "average user" with having to understand the issues involved in "programming"? Why should they have to care about data types, overflow, roundoff error, etc.? With all that horsepower available, why not do some of the heavy lifting FOR the user?

    length = 12 feet 8.5 inches
    width = 8 feet 6 inches
    height = 4/3 meters
    volume = length * width * height
    print "Volume is " (cubic yards) volume " cubic yards."
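To show what such a unit-aware script has to do on the user's behalf, here is the same calculation written out by hand in C. This is only a sketch: the helper names are made up, and the conversion to a common unit (meters) is done explicitly, which is exactly the bookkeeping the scripting language above would hide.

```c
#include <stdio.h>

/* Conversion factors to meters (1 inch is defined as exactly 0.0254 m). */
#define M_PER_INCH 0.0254
#define M_PER_FOOT (12.0 * M_PER_INCH)
#define M_PER_YARD (3.0 * M_PER_FOOT)

/* Convert a feet-and-inches measurement to meters. */
static double feet_inches(double ft, double in)
{
    return ft * M_PER_FOOT + in * M_PER_INCH;
}

/* Volume of the box from the example above, in cubic yards. */
double volume_cubic_yards(void)
{
    double length = feet_inches(12.0, 8.5);   /* 12 feet 8.5 inches */
    double width  = feet_inches(8.0, 6.0);    /* 8 feet 6 inches    */
    double height = 4.0 / 3.0;                /* 4/3 meters         */
    double volume_m3 = length * width * height;
    return volume_m3 / (M_PER_YARD * M_PER_YARD * M_PER_YARD);
}

/* Print the result in the same form as the script's output line. */
void print_volume(void)
{
    printf("Volume is %.2f cubic yards.\n", volume_cubic_yards());
}
```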

[I cringe each time I see something done in a runtime that "shouldn't have to be done", in some other approach to the problem -- e.g., GC]

These same issues have analogs on the development side. Build cycles for the aforementioned 8085 project took 4 hours (edit, compile, link, burn EPROMs). I'd get *two* passes at the code -- using a 'scope probe as my debugger -- in a normal work day. And, had to share THE development system with 2 other folks who were operating under similar constraints (with scarce target prototypes!).

Now, it's almost easier to "make world" than it is to ensure you've got ALL the dependencies properly defined in a set of makefiles. And why worry about incremental backups when it's far more convenient to just image the entire system, daily?

Yet, each of these were skills that were very important when building products that had to outperform their implementation hardware and development budgets. How do you rearrange your meatware to address the new "realities"?

Second, how do you square your KNOWLEDGE, gained from years of experience, of the sorts of errors that are encountered in developing code (regardless as to how well you've honed your skillset to avoid these) and the LIKELIHOOD of those errors (in folks with "less capable" skillsets?) with mechanisms in languages/tools that purport to minimize those? It's like complaining that you're being FORCED to wear a seatbelt while driving! (is there some reason you WANT to be injured in a crash?)

Third, how can you ignore the inevitable FUTURE evolution of the product, tools, *CODEBASE*, etc. given historical evidence? How many products are flash-in-the-pans and exist as little historical islands without contributing to their successors and peers?

In my current project, I have to balance many competing design criteria at different levels in the design. E.g., I wouldn't want to code the OS in the same sort of language that "users" use for scripting. Nor would I want future subsystem developers to have to deal with the intricacies of the OS at anything beyond a certain level of abstraction.

So, I trade off complexity, capability, "safety", reliability, etc. as best fitting the capabilities of the (types of) folks who will *probably* be involved at the different levels in the project's implementation (a "crack coder" would grimace at the constraints placed on him by the scripting language; and an "average user" would be glassy-eyed trying to understand how the OS works!)

I think the same sort of calculus is involved when developing or embracing any toolset (isn't a product just a different type of tool?).
