Languages, is popularity dominating engineering?

As I began my career in software and systems, choosing a programming language was at times a serious engineering decision. Over the years it seems that choosing a programming language has come down to: what is popular (or perceived as popular by management).

So, a couple of questions/topics to spark the discussion.

Does this match your experience? Are you using a language because it is popular? Was it your choice or was it forced by management?

How many languages do you know? Which language would you choose for a large pattern matching project? Which would you choose for a hard real time project?

I'll sit back a bit before throwing in my thoughts.

ed

Reply to
Ed Prochak

I'm not much of a crowd follower. I use Forth because of the simplicity of the development environment. C is fine, but the tools are large and clumsy. Lots of stuff to get right without much guidance.

--

Rick
Reply to
rickman

  1. Part of the choice is a good ecosystem and user community, which requires a level of popularity.
  2. For ongoing projects, there is generally no choice to be made by management or anyone else, since the language was chosen when the project started, and can't be changed without scrapping the existing code base.

Sometimes management chooses, sometimes I get to choose, often I pick projects because the choices they have already made match up with my skills and preferences.

Quite a few, I'm something of a language geek.

Not sure what you mean by "pattern matching" or "large pattern matching project". Generally, successful large projects start out as small projects and then they get bigger. The language choice is generally made when the project is still small. So the question is something of a non-sequitur.

Anyway the choice would depend somewhat on available resources, number of programmers involved and their skill sets, project scope, budget, amount of acceptable risk, performance requirements, etc. I'm fluent with Python but it tends to run slowly, which is fine for some things but not others. I like Haskell but it has a very steep learning curve, so it's only suitable if the surrounding organization is willing to deal with that when getting other programmers involved. Ocaml (or even SML) might be a reasonable middle ground between Python and Haskell, but I haven't used it yet. For critical applications I'm interested in proof assistants like Coq, but at present I don't have enough knowledge about them to take responsibility for such a project.

For hard realtime I'd give some consideration to Ada. Again, though, this is an area where I've never done anything serious. I'm a pretty good C programmer by industry standards (I wrote a small part of GCC), but I've come to think of C as a dangerous language because of all the undefined-behavior hazards, etc. I do think of C++ as an improvement, but it's scary in its own way (search terms: C++ FQA). There are also Haskell EDSLs like ImProve and Atom (formatting link) that could be of interest, though they are a bit specialized (I've played with Atom some).

I sometimes daydream about a Haskell subset for embedded control, that I call Control-H (joke: the file extension is a backspace character). My current picture of this is basically Purescript (purescript.org) targeting a small Lispish virtual machine, or maybe the Lua VM (both of these are garbage collected), with Erlang-like multitasking. The HRT fragment could be called Point Blank (file extension is a space char) and might end up looking like the typed Forth in Peter Knaggs' PhD dissertation.

Reply to
Paul Rubin

I use C++ almost exclusively because it is approved of by everyone who works here. Moreover, the top guy insists on it unless a customer insists on something else.

I do have a very few projects that get done on processors that are small enough that C++ just doesn't fit, or that simply don't have tool support for C++. In that case I use C. I have, in the past, done one project on such a teeny processor (PIC 12-something) that C didn't fit -- then I used assembly.

Assembly, C, C++, Java (a bit), Scilab, Perl (a bit). I also can function in Verilog and HTML, although real digital designers see my code and sort of moan and shake their heads. I have in the past been useful in Pascal and Modula-2, and I have debugged (but never written a line of code in) Ada.

Oh, and BASIC. Can't forget BASIC. My first professional programming job (in 8th grade) was programming in a mix of BASIC and assembly.

I don't know what requirements there are on pattern matching, so I can't answer that part.

As for the hard real time part -- any language that does not get in the way would be suitable. For me, that means assembly, C, or C++.

I'm not sure what's motivating this, but if you're going to hire people to work with you, there is value in choosing a popular language. I was on a team that was a very early adopter for C++ in embedded systems; it was a good move to my mind, but it meant that all new hires had to be brought up to speed either in C++ or in embedded (note: it's way easier to teach an embedded C guy C++ than it is to teach a C++ desktop guy embedded).

Had we worked on smaller applications that would have derived less advantage from C++'s ability to help organize large applications, the extra load of training people in C++ would have outweighed the pluses of C++.

Were I, personally, to change the language I use (to Forth, say, as an extreme example), then not only would I have to pay to retrain myself, but I would have to sell customers on the notion of getting code in an unfamiliar language, and if I ever became a TWO person shop, I'd have to pay for all the training all over again.

Which is all a really long way of saying that you can't make language decisions based on technical merits alone.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

You're just supposed to know. And the guidance is there, but, well -- you're just supposed to know.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Reply to
Tim Wescott

Or, what the diploma mills are churning out! If you can't find folks (at cheap enough rates -- a concern for many short-sighted managers who think "a body is a body") proficient in the language/toolset you select, who cares how good the tool is!

Presumably, as this is c.a.e., you would like comments offered in the context of embedded systems (and not desktops, etc.). And, that these aren't "design once" projects but, rather, require ongoing "support". And, that you intend this to focus on languages for *software* development.

"In Industry" (i.e., as an employee), I found that there usually was no "choice" as to language used. The "Shop" implicitly defined what the selection would be. Early in my career (70's), that selection was, in turn dictated by the processor you selected for the design. X ASM vs. Y ASM vs. Z ASM (see the pattern, here? :> )

Not "popular" in the sense of "en vogue" but, typically, "reasonably well known" (even if not EXPERTLY known by the audience I have to address). A *great* language that is obscure is typically of very little value (what happens when your "investment" -- programmer -- moves on to another employer?)

Often, there are other concerns that factor into a language choice besides "code efficiency", "intuitiveness", "expressiveness", "popularity", etc.

E.g., I've been surveying languages with an eye towards creating a "scripting language" INTENDED FOR USE BY NON-PROGRAMMERS using my devices (see below). This is a lot harder than you might think. All the syntactic sugar that we (as programmers) expect in languages represents stumbling blocks for neophytes (who may only write *one* script in the life of a product!).

This is a valid code fragment:

s := tokenize(s, "\t;, \n");
case hd str {
    "foo"          => spawn do_foo(tl str);
    "baz" or "bar" => do_bar(str);
    "move"         => x = int hd tl str;
                      y = int hd tl tl str;
                      rest = tl tl tl str;
                      move(x,y);
                      eval(rest);
    *              => die();
}

Imagine trying to tell someone (short of cut-and-paste) what it says accurately enough that they could reproduce it! (i.e., the "read it over the phone" metric)

Talking to a programmer *experienced* in the language, it can be described colloquially with some high degree of certainty that the other person could reproduce it correctly. A neophyte? You'd be "spelling" everything for them!

You also have to consider which features of your *environment* need to be supported *in* the language (else you end up relying on "libraries" to augment the language's capabilities -- often in sub-optimal ways).

E.g., do you need to support concurrency? What sorts of communications (a *huge* aspect of a reliable system design) do you support? Does your communication mechanism impose type checking on the data that it passes? Or, is it an "untyped byte stream" and you rely on the targeted process/device/etc. to MANUALLY perform all of that testing on the incoming data?

For example, the scripting language (above) can't expect a neophyte to be diligent and check for zero denominators. Or, to order operators to preserve maximum precision in the result. As a result, the *language* has to do this (at run-time or compile-time). I.e.,

12334235234534635645674754675675675675678567/2354234029348293492384

should yield a *different* result from:

(1+12334235234534635645674754675675675675678567)/2354234029348293492384

because the neophyte would *expect* it to produce a different result! (imagine this is a subexpression in a larger expression) The neophyte should be tasked with indicating the level of "detail" (precision) he seeks in the output -- not the language (which would require educating the neophyte as to where calculations could "blow up", etc.)
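
(To make the point concrete, here's a minimal C sketch -- the constants are just the ones above, everything else is illustrative. Push the values through hardware doubles and the "+1" is rounded away, so both quotients print *identically* -- precisely the surprise the neophyte shouldn't have to anticipate:)

  #include <stdio.h>

  int main(void)
  {
      /* A double carries ~16 significant decimal digits, so adding 1 to a
         43-digit numerator vanishes in rounding and both quotients come out
         the same. */
      double num = 12334235234534635645674754675675675675678567.0;
      double den = 2354234029348293492384.0;

      printf("%.20g\n", num / den);
      printf("%.20g\n", (num + 1.0) / den);
      return 0;
  }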

As above, some choices were imposed -- even if *we* selected the devices that brought in their consequential development environments. Existing tools (hardware, software and CODEBASES) also add a lot of inertia to that decision.

When I started consulting, Management had the *second* to last word on the subject. I, of course, could "no bid" any job for which Management's decisions weren't consistent with what I would consider the best use of *my* time -- as I am concerned with that, paramount (I fix "bugs" for free so I don't want to have to revisit a project because of a poor choice of tools, specifications, etc.)

That's sort of a silly question. Define "know". I.e., one such definition might be "able to sit down and write EFFECTIVE/bug-free code NOW".

I think a better question acknowledges the fact that you are exposed to various languages over the course of a career, to differing degrees. So: for how many of them would you factor *zero* (not "some small amount that I can probably find some slack time to ABSORB in the schedule") re-learning cost into an estimate?

Or, how many are you actively using "in recent experience".

For me, that's currently:

- C

- Limbo

- SQL

- (a few) ASM

- an unnamed, proprietary pseudo-scripting language in development

plus smatterings of perl, awk, sh, etc. to make it all work (don't count things like "make" as languages, per se)

That would depend on the project, as most of mine are real-time (ignore the SRT/HRT distinction, as most folks are misinformed there) and severely resource constrained. "Extra/unused resources" represent excess cost. (You can't grow functionality to consume them after the fact, as that alters the Specification -- potentially compromising the entire design. So, they can only be applied ex post facto to improve performance... ABOVE that which was Specified.)

My speech synthesizer is little more than a pattern matching project. But it's RT, and resource constraints dictate a small, tight implementation ("Gee, wouldn't this be *so* much easier to code in LISP??")

Similarly, the UI for my automation system relies largely on context ("what makes the most sense, *now*?"). Ditto the inference engine.

C hands down (again, see "resource constraints"). It is important that the code I write can be (relatively) easily mapped onto "op codes". When a language does (hides/obfuscates) too much implementation detail, it becomes difficult to evaluate what is *really* happening. You spend a lot more time profiling your code than you should have to!

Mechanisms that the language provides to "help" you can force a more inefficient implementation (at source and binary levels). E.g., Limbo eschews pointers (they don't exist at all!). So, while you can functionally implement equivalent constructs, the actual implementation is more costly (in many cases) and less transparent.

Some things are *unnecessarily* hard (clumsy) to implement simply because of the language's definition. (e.g., design an operation that has a timeout that kicks in ONLY if the operation hasn't been completed in the allotted time; apply that to *low level* -- e.g., FREQUENT -- operations!)
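
(The timeout case usually ends up hand-rolled into something like the following sketch -- device_ready() is a hypothetical stand-in for polling a status register, and CLOCK_MONOTONIC stands in for whatever timebase the target actually provides. Note that ALL of the bookkeeping lands on the caller; the language gives you nothing:)

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  /* Hypothetical stand-in for polling a device status register:
     here it simply reports "ready" after a few polls. */
  static bool device_ready(void)
  {
      static int polls;
      return ++polls > 1000;
  }

  /* Wait for completion, but only pay the timeout cost if the operation
     does NOT finish in time. */
  static bool wait_ready(uint32_t timeout_ms)
  {
      struct timespec start, now;

      clock_gettime(CLOCK_MONOTONIC, &start);
      for (;;) {
          if (device_ready())
              return true;                 /* done -- no timeout taken */

          clock_gettime(CLOCK_MONOTONIC, &now);
          int64_t elapsed_ms =
              (int64_t)(now.tv_sec - start.tv_sec) * 1000 +
              (now.tv_nsec - start.tv_nsec) / 1000000;
          if (elapsed_ms >= (int64_t)timeout_ms)
              return false;                /* timed out */
      }
  }

  int main(void)
  {
      printf("%s\n", wait_ready(10) ? "ready" : "timeout");
      return 0;
  }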

IME, the language isn't as important as a clear definition of the problem. I've met lots of "experts" (in particular technologies, languages, etc.) whose lack of knowledge of the *application* domain rendered most of their knowledge *useless* (in-APPLICABLE).

Some languages force you to clearly define THE IMPLEMENTATION. But, if this still has a poor match to the actual *problem*, then the language's features/facilities/CONSTRAINTS don't do anything to improve the quality of the product ("Our code is 100% bug-free -- machine validated!" "Yeah, but it doesn't *do* what we set out to do!")

Reply to
Don Y

Yes and no. I'd use *that* language for *that* purpose anyway. I also use a personally-chosen language for stuff that's not necessarily a full deliverable -- test jigs, whatnot.

Both, and neither.

Surprisingly, probably shell scripts, if it'll "grep". Tcl is good for pattern matching and string munging, too.

How hard? 1 microsecond? That's an FPGA - so VHDL. 10 msec? Just about anything will support 10 msec latency these days.

--
Les Cargill
Reply to
Les Cargill

Like Rick, I am also someone who uses Forth by preference. However, I also programme in other languages like the IEC61131 set, S80, D3, Fortran, Basic, Mumps and probably a few others I have forgotten the names of (mostly all specialised to the application domain).

My favourite is Forth, especially for the embedded systems work I deal with. Please note that programming is actually about 10% of my day job efforts (I am more likely wading through getting requirements specifications tied down enough to be useful). That includes performing Task Analysis, HAZOP and Risk Assessments, Electronic System Specification and Design, as well as the Functional and Non-Functional Spec for a System and its software (if any).

Forth Philosophy (and the works by George Polya) have been a major guiding principle in the way I approach Systems Analysis. I quite prefer a Component Oriented Approach. I am also able to determine how each and every component of my systems shall be called upon to prove its integrity with sufficient hard evidence to support the eventual Safety Case submissions.

Note that my domain is usually in the embedded/PLC Process control worlds where it is important to know that the system will always do the right thing no matter what adverse disturbances it might be subject to.

--
******************************************************************** 
Paul E. Bennett IEng MIET..... 
Forth based HIDECS Consultancy............. 
Mob: +44 (0)7811-639972 
Tel: +44 (0)1235-510979 
Going Forth Safely ..... EBA. www.electric-boat-association.org.uk.. 
********************************************************************
Reply to
Paul E Bennett

The subset of 'C' you really need is rather small:

- Resource Acquisition Is Initialization. Holds for 'C', too. Use ternary operators or "constructors" to achieve this.

- Only use "for" loops for integer-index loops.

- Use while (...) { ... if (...) break; } for everything else.

- Don't use switch () too much.

- Don't use else too much.

- Early Return Is The Right Way; enumerate and prioritize constraint testing in this way. Happy path at the bottom...

- Separate constraint testing from processing.

- Threads make everything worse.

- The Big Loop is honorable.

- Serialization of internal state is the Path to True Enlightenment (properly factored code cannot be understood statically).

- Names matter.

- Callbacks rule when you need variant behavior. Serialization of callback state is part of the Path of True Enlightenment.

- With one timer and callbacks, you can have very low jitter. Don't be afraid of writing a scheduler. The Motorola 68xxx TPU was great beyond measure; you can do something like that in software. This can even be a thread and a usleep() or equivalent.

- Be explicit and tolerate no ambiguity.

- Everything begins life as static const.

- Tables are the One True Way.

- memset() all buffers before use. Always.

- Do not be afraid to declare static buffers for a single purpose.

- memmove(), not memcpy().

- Only allocate utility counters (i, j, k) on the stack. Use static for everything else you can.

- It is honorable to declare arrays of control block struct and multiplex using the array index. (A few of these points are sketched below.)
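
The sketch -- the UART-ish names and sizes are made up, not from any real part: a static const table, an array of control-block structs multiplexed by index, a callback for the variant behavior, constraint tests separated from processing with early returns, and memset before use.

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  #define NUM_CHANNELS 4

  /* Callback gives the variant behavior. */
  typedef void (*rx_handler_t)(uint8_t byte);

  /* Everything begins life as static const: per-channel configuration
     lives in a table, not in code. (Baud rates here are made up.) */
  struct chan_cfg {
      uint32_t     baud;
      rx_handler_t on_rx;
  };

  /* Control blocks: one struct, an array of them, multiplexed by index. */
  struct chan_cb {
      uint32_t rx_count;
      uint8_t  rx_buf[64];
  };

  static struct chan_cb cb[NUM_CHANNELS];

  static void log_rx(uint8_t byte)  { printf("rx %02x\n", (unsigned)byte); }
  static void drop_rx(uint8_t byte) { (void)byte; }

  static const struct chan_cfg cfg[NUM_CHANNELS] = {
      { 115200, log_rx  },
      {   9600, log_rx  },
      {  19200, drop_rx },
      {   9600, drop_rx },
  };

  /* Constraint testing separated from processing; early return,
     happy path at the bottom. */
  static int chan_rx(unsigned chan, uint8_t byte)
  {
      if (chan >= NUM_CHANNELS)
          return -1;                               /* bad channel  */
      if (cb[chan].rx_count >= sizeof cb[chan].rx_buf)
          return -2;                               /* buffer full  */

      cb[chan].rx_buf[cb[chan].rx_count++] = byte;
      cfg[chan].on_rx(byte);                       /* variant behavior */
      return 0;
  }

  int main(void)
  {
      memset(cb, 0, sizeof cb);                    /* memset all buffers. Always. */
      chan_rx(1, 0x55);
      return 0;
  }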

Now you know.

--
Les Cargill
Reply to
Les Cargill

Your list surely shows your hard earned battle scars. I've been on the losing side of all of them at one point in my career.

I just don't understand the big loop is honorable item. Are you saying the big loop is good or bad? I've never seen a good big loop. I've seen some scatter-brained main loops that are hundreds of lines long with absolutely no structure (bad). They were written mostly from lack of experience and design. All the good loops I've seen or written are a dozen or so lines long with good structure and good system considerations designed in.

JJS

Reply to
John Speth

It's funny, but a lot of what you two posted reads like the guide to programming in Forth.

--

Rick
Reply to
rickman

[interesting list, some comments]

I don't understand this: RAII is a C++ idiom that relies on C++'s exceptions calling object destructors in case of abnormal return from some lower level of the program. How do you do something comparable in C?

Hmm, ok a lot of the time, though idioms like for (p = list_head; p != NULL; p = p->next) { ... } seem perfectly fine.

OK, but what do you do with the return code in the error case?

This is interesting and I haven't seen it put like that before. I'll give it some thought. A currently trendy practice (functional programming) is to minimize internal state and localize it to the extent possible, segregating stateful from stateless functions using the type system in the case of languages like Haskell.

What do you mean by that? It sounds like the way OOP obscures control flow.

You mean instead of an old fashioned switch statement?

Not sure what you mean by that.

Why do you say this? Just to avoid having to analyze stack depth?

Reply to
Paul Rubin

Neither.

I have. And I have seen it bad.

They weren't finished. People generally don't finish things because of ... reasons. We don't want to know those reasons.

Well, usually.

Sure.

Yep.

Okay, so what I mean by The Big Loop works out to "I don't have a Real O/S (tm) so I will repetitively call function after function in a loop that should evoke nausea in a code review because it's so long because I don't have a real O/S but the functions all figure out how to manage what would be threads with a real O/S".

You get the advantages (and perils) of run-to-completion this way.
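
Stripped to the bone it's something like this (the task names are made up):

  #include <stdint.h>

  /* Hypothetical "tasks": each is called over and over, keeps its own
     state in statics, does a little work, and returns. Run-to-completion:
     nothing here ever blocks. */
  static void poll_uart(void)   { /* drain a FIFO, feed a state machine */ }
  static void run_control(void) { /* one step of the control law        */ }
  static void blink_led(void)
  {
      static uint32_t count;
      if (++count == 50000) {
          /* toggle the LED here */
          count = 0;
      }
  }

  int main(void)
  {
      for (;;) {                /* The Big Loop */
          poll_uart();
          run_control();
          blink_led();
      }
      /* not reached */
  }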

Reply to
Les Cargill

Start at slide #12:

formatting link

Reply to
Paul Rubin

I read this a couple of times and still am not sure how it was intended! :>

- Don't use anything other than "for" loops for integer-index loops

- Don't use "for" loops for anything other than integer-index loops

:-/

In concert with the above...

I use for, while and do as hints to the reader as to what the following code is likely to do and the criteria that govern its execution.

E.g., "do" is best read as "do THIS until (something is no longer true)". It implicitly tells the reader "this WILL execute at least once. Read through it and you'll see (at the end) whether it will execute again."

By contrast, "while" signals that the user should consider the driving condition *before* the code is executed -- "this *may* execute one or more times OR NOT AT ALL!"

I use switch a lot! But, the trick is not to clutter up the N-way case's with lots of code -- which causes the switch to "disappear" in all the detail.

Else finds frequent use replacing small switches: if ... else if ... else if ...

Braces really help sort out nesting on if-else.

Early return regardless of success *or* err-out! The idea of forcing the exit from a subr/function to always be in one "return" statement is too arbitrary. And, often leads to extra nesting levels *just* to create this artificial structure (e.g., "do { ... break; ... } while (1)").
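
E.g., compare the two shapes below (widget_t and configure() are purely illustrative):

  #include <stdbool.h>
  #include <stddef.h>

  typedef struct { int mode; } widget_t;               /* illustrative */
  static bool configure(widget_t *w) { w->mode = 1; return true; }

  /* Single-exit contortion: an artificial loop just to get "one return". */
  int setup_widget(widget_t *w)
  {
      int status = -1;
      do {
          if (w == NULL)
              break;
          if (!configure(w))
              break;
          status = 0;
      } while (0);
      return status;
  }

  /* Early return: constraints up top, happy path at the bottom. */
  int setup_widget2(widget_t *w)
  {
      if (w == NULL)
          return -1;
      if (!configure(w))
          return -1;
      return 0;
  }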

Add invariants everywhere they logically exist.

Ah, I beg to differ. Doing two different things simultaneously WITHOUT threads quickly becomes brittle. ("Oh, gee! I forgot to blink the LED while I was in this bit of code...")

But, good partitioning requires forethought. Just cutting a problem into multiple active entities isn't a panacea -- comms grows as the square (potentially)

But, artificial naming "standards" are ludicrous. Right maleAdultLes?

+42

Even though you "know stuff" (about the language, application, etc.) it can't hurt to put that in writing to make sure others also know it.

I've taken this to an extreme in my current designs! I move the tables *out* of the executable and load them from a DBMS at run-time. This allows me to update the behavior of the code after deployment by tweaking DBMS contents. It also allows the code to modify aspects of itself in a more disciplined and structured manner. Finally, constraint checking on the table(s) -- in the DBMS -- makes the invariants that apply to those tables IN THE CODE more explicit (you can't update a DBMS table "wrongly")
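
(A minimal sketch of the load side, with SQLite standing in for the DBMS and a made-up params(id, value) schema -- this is just the shape of it, not my actual setup:)

  #include <sqlite3.h>

  struct param { int id; double value; };

  /* Pull a tuning table out of the database at start-up instead of
     baking it into the image. */
  static int load_params(const char *dbfile, struct param *out, int max)
  {
      sqlite3      *db;
      sqlite3_stmt *stmt;
      int           n = 0;

      if (sqlite3_open(dbfile, &db) != SQLITE_OK)
          return -1;
      if (sqlite3_prepare_v2(db, "SELECT id, value FROM params",
                             -1, &stmt, NULL) != SQLITE_OK) {
          sqlite3_close(db);
          return -1;
      }
      while (n < max && sqlite3_step(stmt) == SQLITE_ROW) {
          out[n].id    = sqlite3_column_int(stmt, 0);
          out[n].value = sqlite3_column_double(stmt, 1);
          n++;
      }
      sqlite3_finalize(stmt);
      sqlite3_close(db);
      return n;                 /* number of rows loaded */
  }

The CHECK constraints and foreign keys then live in the schema, where a sloppy update can't bypass them.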

I actually avoid this sort of thing. I want buffers (memory) to go away when not explicitly referenced. It also tends to complicate reentrancy and code sharing.

See above.

Eschew variables at file scope.

Create (equivalent) types to clarify the nature of the object (even if the compiler won't be able to enforce strong type checking).

Avoid "clever" code constructs as they don't inherently imply more efficient code.

Let the compiler do most of the optimization.

Browse the objects periodically to be sure WYSIWYG.

Check all inputs -- especially from "outside the system" (e.g., user).

Tie code to specification via appropriate commentary.

[There are probably countless others on my "short list" but I've got cookies to attend to... :> ]
Reply to
Don Y

I've done this myself in tightly constrained environments (8-bit MCUs) even though it goes against all my natural instincts for nicely modular coding with variables only defined in the scope(s) they are needed.

In my case, pulling them out into .bss means it's easy to look at the linker map and see, at compile time, exactly how much memory is required for the variables, making it far easier and more reliable to analyze memory usage.
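
The difference is just this (workbuf is an illustrative name):

  #include <stdint.h>

  /* File scope: lands in .bss, so its 512 bytes show up in the linker
     map and in the output of "size" at build time. */
  static uint8_t workbuf[512];

  void process_static(void)
  {
      /* ... works on workbuf ... */
      (void)workbuf;
  }

  void process_stack(void)
  {
      /* Local: lives on the stack, invisible to the map -- you only
         find out about it via stack analysis (or the hard way). */
      uint8_t buf[512];
      (void)buf;
  }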

Now having said that, I want to make it clear that in environments in which memory resources are not so tightly constrained (32-bit MCUs with several MB of memory available) my natural instincts are dominant and much more gets created (and passed) on the stack.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world
Reply to
Simon Clubley

In my personal coding standards, _everything_ (i.e., even single statements) gets placed between braces in brace-oriented languages.

:-)

My former employer for my day job (embedded work is a hobby for me) has just written in a reference that I like to document things. He's right. :-)

By this, do you mean using objdump and friends to give the generated code a once-over just to make sure you haven't done something that's hopelessly inefficient or invalid ?

If yes, it's nice to see I'm not the only one who does this. :-)

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world
Reply to
Simon Clubley

I'm afraid your reasoning is backwards in this case. It's precisely in memory-starved environments that you cannot afford to do this. Making variables static when they don't need to be blows up memory consumption considerably.

The stack isn't your enemy. It's the cheapest memory usage conservation technology there is, so use it.

The problem is that it doesn't just make memory consumption easier to see ... it also makes it larger than it needs to be. So there's a good chance you'll run out of memory _because_ you wanted to figure out if/when you run out of memory.

Reply to
Hans-Bernhard Bröker

OTOH, it's a lot better than having to deal with subtle memory trashing errors because your now larger stack has descended into the space allocated to .bss (or even .data) and you find out the hard way that your code is now too big for the resources on the MCU you are currently using.

I prefer to try and find out at compile time if the available resources are insufficient rather than have to find out the hard way at run-time. I accept what you say about the memory size increasing may be true in some cases, but if you are close enough to a resource boundary for this to make a difference, then maybe it's time for a larger MCU anyway.

After all, code doesn't always remain static and quite often has functionality added to it, so you may hit the resource limit even with your stack based approach anyway.

I suppose the major thing driving me here is to use development techniques which allow me a better chance to find out about these potential issues in a controlled deterministic manner as early on in the development process as possible.

OTOH, as a hobbyist, I'm not churning these devices out by the thousands, so there may be a cost tradeoff for you (in terms of using a less resource-rich MCU that's a few pence cheaper) that simply doesn't exist for me.

If that's true however, I would ask if the additional cost of your development time outweighs the savings from the cheaper MCU when you have to start debugging run-time resource availability issues.

BTW, in my makefiles (especially the 8-bit target ones) there's a size command executed as part of the makefile target so I can see how the resources needed are increasing as I add functionality. It's a nice way to keep an eye on resource use without any additional manual effort.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world
Reply to
Simon Clubley

Making everything static creates all sorts of restrictions: you have to write separate versions of functions for foreground and interrupt use, you can't use threads, you can't use coroutines, you can't use recursion, etc.
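
For example (format_reading is just an illustrative name):

  #include <stdio.h>

  /* Static buffer: one copy for everybody, so foreground and interrupt
     context can't both use it, and a second call clobbers the first
     caller's result. */
  char *format_reading(int raw)
  {
      static char buf[16];
      snprintf(buf, sizeof buf, "%d mV", raw);
      return buf;
  }

  /* Caller supplies the storage: reentrant, ISR/thread-safe, recursion-safe. */
  char *format_reading_r(int raw, char *buf, size_t len)
  {
      snprintf(buf, len, "%d mV", raw);
      return buf;
  }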

--
Grant
Reply to
Grant Edwards
