Code metrics

A lot of old compilers weren't very "clever". I can recall projects in which I was called on to "recover lost sources" where I could actually generate a reasonable facsimile of the C sources from the binary executable (barring object names, of course) -- because the code generator was so "predictable".

Hmmm... I can think of at least three such denizens (though haven't seen posts from any of them in quite a while)

Would you prefer the rampant automatic type conversion of, e.g., Pascal? :>

Agreed. But, I suspect the compiler was just looking at *types* and not *values*.

On the one hand, we complain that the compilers are too smart -- offering us "unwanted" advice. On the other, not smart *enough*! :-/

Reply to
Don Y

(snip)

The compilers I remember from the early 8086 days had an additional pass for errors. Each pass was on one floppy, and you didn't need the final one if there weren't any errors.

-- glen

Reply to
glen herrmannsfeldt

There's a difference between warning about a missing cast vs., e.g., a SESE "violation". The former could well indicate that you've done something you didn't intend to (e.g., int from ptr) while the latter is a stylistic issue.

Some tools now feel obliged to "improve your style" (e.g., MISRA) to adhere to a particular set of "guidelines".

As I said upthread (wrt my *original* post), it's all about knowing how to *use* metrics, guidelines, warnings, etc.

How many pieces of code (esp UI's) *force* the user to do things in a fixed order -- UNNECESSARILY? E.g., enter name; then address; then phone number; etc. (poor example but meant to illustrate the total LACK of need for any such ordering).

Years ago, I wrote a little DB "address book" application. Fill in "City" and (valid) choices for "State" are automatically presented. Fill in State and City choices appear. Fill in Area Code and State choices are constrained. etc. Coding it wasn't significantly harder than REQUIRING the user to proceed through the form in a *fixed* order (OK, now that I know what city he's in, I can present him with state choices...).
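That cascading-constraint idea can be sketched in a few lines of C. This is only an illustration, not the original application: the table, field names, and `states_matching` helper are all hypothetical, standing in for whatever the DB actually provided.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup table relating city, state, and area code. */
struct place { const char *city; const char *state; const char *area; };

static const struct place places[] = {
    { "Tucson",  "AZ", "520" },
    { "Phoenix", "AZ", "602" },
    { "Boston",  "MA", "617" },
};
enum { NPLACES = sizeof places / sizeof places[0] };

/* Given whichever fields the user has filled in so far (NULL = not yet
 * entered, in any order), collect the states still consistent with the
 * partial entry.  The UI would present exactly those as valid choices. */
static int states_matching(const char *city, const char *area,
                           const char *out[], int max)
{
    int n = 0;
    for (int i = 0; i < NPLACES; i++) {
        if (city && strcmp(places[i].city, city) != 0) continue;
        if (area && strcmp(places[i].area, area) != 0) continue;
        int dup = 0;                     /* suppress duplicate states */
        for (int j = 0; j < n; j++)
            if (strcmp(out[j], places[i].state) == 0) dup = 1;
        if (!dup && n < max) out[n++] = places[i].state;
    }
    return n;
}
```

The point is that each field filter is independent, so the user can fill the form in any order and the remaining choices simply narrow.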

[Developers, IME, have *a* vision of how a program will be used and often IMPOSE that on their users. Some of us know how to "get the hell out of the way" and *assist* instead of *enforce*]

So, a *second* initialization is "OK", in your playbook -- when you KNOW what the variable should ACTUALLY be initialized to?

This only works when a variable's scope can be constrained to exactly that block. E.g., when you later realize you need "sinX" on the line

*after* (or, three stanzas later) the closing curly brace.
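That scoping complaint, as a minimal compilable sketch (names hypothetical): the value is computed inside a block but needed after the closing brace, so its declaration has to be hoisted up a level.

```c
#include <assert.h>

/* A value computed inside a block is needed *after* the closing curly
 * brace, so its declaration must be hoisted out of the block --
 * widening its scope beyond where it is actually "born". */
static double demo(double x)
{
    double sq;                /* hoisted to function scope */
    {
        double t = x + 1.0;   /* t dies at the closing brace below */
        sq = t * t;
    }
    /* t is out of scope here; sq survives only because its declaration
     * was moved up a level. */
    return sq;
}
```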

I am consistently bothered by attempts to make local declarations that end up ADDING to my effort:

limit: int; ... limit = len foo; for (i := 0; i < limit; ...) ...

Gratuitously adding extra nested blocks just so you can declare/define

All blocks do is give you a way of inlining a *procedure* (not a *function* because you can't pass a value *out* of a block). I much prefer writing procedures/functions with appropriate descriptive names and (possibly later) opting to inline them. This lets me see more "on a page".

Do you really need to see the temporary variables used in a memcpy(3c) operation "inline" instead of a memcpy(3c) invocation?

And *hours* to chase down bugs like the overlapping scope issue, above.

C works well if you are pedantic in your discipline. I.e., no such thing as an int, char, etc. Everything explicitly qualified: signedness, const, static, etc. And, no "cleverness" exploiting ASSUMED endianness or specific encoding(s)!
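A small sketch of that pedantic discipline (the names and mask value here are invented for illustration): every object fully qualified for width, signedness, const-ness, and linkage, with conversions spelled out rather than left to the implicit promotions.

```c
#include <assert.h>
#include <stdint.h>

/* "No such thing as an int": nothing hinges on what a bare 'int' or
 * 'char' happens to mean on this particular target. */
static const uint8_t kFlagMask = UINT8_C(0x0F);

static uint32_t apply_mask(uint32_t raw)
{
    /* the widening of kFlagMask is written out explicitly rather than
     * left to the default integer promotions */
    return raw & (uint32_t)kFlagMask;
}
```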

Yet, you can still get screwed easily and frequently. E.g., try doing anything with struct packing, enums, bitfields, etc. in a way that is even marginally portable. sizeof() on these won't even give you consistent results! Or, exploiting the full range of values tolerated by particular data types (try writing a bignum implementation or decimal arithmetic package)
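The sizeof() point is easy to demonstrate. The declarations below are a generic sketch: the enum may occupy 1, 2, or 4 bytes; the bitfields may or may not share a storage unit; the padding before the long is ABI-dependent. The only things you can portably assert are the Standard's minimum guarantees.

```c
#include <assert.h>

/* How these pack is implementation-defined; sizeof() on them is not
 * portable across compilers, or even across flags on one compiler. */
enum color { RED, GREEN, BLUE };

struct flags {
    unsigned int a : 3;     /* allocation order and padding: unspecified */
    unsigned int b : 5;
};

struct mixed {
    char c;
    long l;                 /* ABI-dependent padding precedes this */
};
```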

E.g., I have to *ensure* that the same compiler (or, one with EXACTLY the same "implementation defined behaviors") is used to compile both the client- and server-side stubs for my RPC's else what starts out as a "foo" on one side can look like something else, entirely, on the *other* side! And, that assumes a *homogeneous* environment (obvious problems when you move to a heterogeneous environment)

[solution: allow user to add cast operators that the IDL compiler automatically invokes to provide more freedom of choice wrt tools]

Shopping day.

Reply to
Don Y

(snip, I wrote)

Yes, but eventually they run out of the true subtle ones and get to the not-so-subtle ones.

Or, some may be subtle in one context and not in another, but the compiler doesn't know the difference.

(snip, I wrote)

The compilers are getting better. I think most now figure out:

if(x==0) y=1; else y=2;

(not that you would do that, but something assigned in both parts of an if/then/else.) But they didn't always get that one. Maybe they won't yet figure out nested if/then/else, though.
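That pattern, fleshed out as a compilable sketch: both branches assign y, so y is never read uninitialized, but a compiler that only tracks "declared vs. assigned at declaration" (rather than doing real flow analysis) may still warn here.

```c
#include <assert.h>

static int classify(int x)
{
    int y;              /* deliberately not initialized at declaration */
    if (x == 0)
        y = 1;
    else
        y = 2;
    return y;           /* provably initialized on every path */
}
```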

It doesn't happen all that often, though.

(snip)

-- glen

Reply to
glen herrmannsfeldt

Replace the increment operator with an explicit add.

Well, where do the obligations of the language/toolchain lie? E.g., I can't imagine anything in a language or tool that will compensate for a shitty boss! :-)

Of course! And, this works well *if* you have control over those people: their "training" AND "selection"!

The problem arises when you have to deal with the INEVITABILITY that some dweeb will come along after (or before) you and muck things up. Falling back on "well, it *works*, so..." as the justification for it being in the state that it is.

"Sure, there are LOTS of warnings! But, none of them (SEEM TO!) matter to the operation of the code!" And, by "operation" he implies "correctness"!

int multiply(int a, int b) { return a + b; }

works fine for a=b=2! :-/
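That deceptive happy path is worth spelling out: a single test at a == b == 2 passes (2+2 equals 2*2), so a "it works!" build hides the bug, while almost any other vector exposes it.

```c
#include <assert.h>

/* The buggy multiply from above, kept as-is: */
static int multiply_buggy(int a, int b)
{
    return a + b;       /* bug: should be a * b */
}
```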

Various stake-holders are each trying to "improve" the quality of our "product" (i.e., software).

Employers would like to be able to hire from a larger pool of "adequately skilled" (as "adequately" approaches "infinity") -- and at ever lower *payrates*.

Users would like to be able to interact with "accurate" (correct?) products that behave in expected ways.

We would like to be "less hassled" by the *process* -- as well as enabled to undertake ever more complex challenges. Yet, without losing much "artistic license".

But, we're all stuck with tools that "come up short" -- in one way or another.

E.g., the newer languages try to do much for you at the expense of efficiency (time/space). "Stricter" languages tend to force you to do a certain thing in a certain way (which can be confining as well as inefficient). "Informal" languages (C) give us plenty of rope to hang ourselves -- or, at least find our legs entangled in its loose coils!

It's a matter of trying to squeeze a balloon on *both* ends, simultaneously.

[BTW, I had a chance to play with the A5 dev board for a few minutes. Seems like it will be a win for *my* application -- very "snappy". Still have to round up the "optional components" and fully populate it, though. Cripes, what did they SAVE by omitting a few inexpensive parts?? Also have to see if I can mate a display to it, just for yucks.]
Reply to
Don Y

I see the goal as being the same: "help" the developer catch things before he has to embed that executable in something even more complex.

E.g., (effectively) complaining about forgetting a decimal point in a float/double seems "petty": "Why can't the compiler KNOW that there should have been a decimal point, there? It's obvious that the value *must* be a float so interpret that integer constant *as* a float, for crying-out-loud!"

I design my software (and hardware) to "strongly influence" my successors to continue in the same style that I've adopted. If only for the sake of consistency! I.e., I make it fairly easy for you to *extend* my code along the same lines that I've adopted instead of having to "build from scratch". E.g., add an entry to an existing const table instead of writing some ad hoc code, etc.

#define FOO (...) ... if (x == FOO) ...

What I need to explore is how clever tools are at optimizing:

const foo ...; ... if (x == foo) ...

esp if foo is extern! (I have been consistently moving away from manifest constants in my code in favor of const variables)
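A sketch of the two styles side by side (names invented): the macro is folded at compile time unconditionally; the const object is a real object in C, which the optimizer can usually fold when it has internal linkage and a visible initializer, but declared extern its value may be unknown at compile time and the comparison must load it from memory.

```c
#include <assert.h>

/* Manifest constant: folded at compile time unconditionally. */
#define LIMIT_MACRO 42

/* const object: foldable here (static, initializer visible), but an
 * 'extern const int' defined in another translation unit generally
 * defeats folding absent link-time optimization. */
static const int limit_const = 42;

static int at_limit(int x)
{
    return (x == LIMIT_MACRO) && (x == limit_const);
}
```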

Reply to
Don Y

[mucho snips]

I think Ada has the right solution here: a literal never has a type itself; it takes on the type expected in the context. So the literal 1 means the mathematical, universal, integer "one", and is compatible with any other integer type (as long as the value 1 is included in the range of that type, of course).

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
       .      @       .
Reply to
Niklas Holsti

I'm not sure even *that* "do the right thing" approach is 100% safe.

Consider: x = sqrt(6)

Then, consider: if (x == 6)

(in both cases, x being a floating type!)

In the latter case, I would want the compiler to force me to look at what I wrote, AGAIN -- if only to add the DP (or a cast). Hopefully, it gets me thinking: An *exact* compare of a floating type?? Are you *really* sure you want to do that??
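The usual fix, sketched below, is a relative-tolerance comparison (the helper name and eps choice are mine, not from the thread). Even the textbook 0.1 + 0.2 fails an exact compare in binary floating point.

```c
#include <assert.h>

static double d_abs(double v) { return v < 0.0 ? -v : v; }

/* Compare floating values to within a relative tolerance, rather than
 * with an exact (and almost always suspect) ==. */
static int nearly_equal(double a, double b, double eps)
{
    double m = d_abs(a) > d_abs(b) ? d_abs(a) : d_abs(b);
    return d_abs(a - b) <= eps * m;
}
```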

E.g., I can write: value := "1234" + 1 Do I intend that to yield a *string* having the value "12341"? Or, an *int* having the value '1235'? (Or, a string having the value "1235"??)

Forcing me to look at it again would lead me to make my intentions more clear: value := (int) "1234" + 1

It's also too easy for people to "see what they want to see" when writing code. I suspect that accounts for most of those "hours spent finding an OBVIOUS bug!"

What would you *expect* the values of each variable to be? x, y: int = 5; vs. int x, y = 5;
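The C half of that question has a concrete answer worth pinning down: an initializer binds to a single declarator, not to the whole declaration list.

```c
#include <assert.h>

static int demo_decls(void)
{
    int x = 5, y = 5;   /* both initialized: each declarator has its own '=' */
    int a, b = 5;       /* only b is initialized; a is indeterminate */
    a = 0;              /* must assign before use -- reading it first is UB */
    return x + y + b + a;
}
```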

Reply to
Don Y

On 11.03.2015 at 11:20, Les Cargill wrote:

It doesn't, because there are none.

So let's go look into the C Standard, and try to find any limitations on what the translator is allowed to emit a diagnostic message about. And guess what we find? None! There are only required messages, but _no_ forbidden ones. Instead there's even a (non-normative, but enlightening) footnote explicitly allowing any and all additional warnings (C99 5.1.1.3, footnote 8).

Nice plan, but again, as soon as you try to do that for more than one compiler with the same source code, this generally becomes impossible.

That's assuming all warnings are _about_ promotions and constraint violations. That assumption is generally far from correct.

Which is why any warning level so low that it doesn't include missing-prototype warnings is, IMO, unacceptable.
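A sketch of why that warning earns its keep (function names invented): without a prototype in scope, pre-C99 rules assume an implicit `int f()` and apply only the default argument promotions, so an argument-type mismatch at the call site goes undiagnosed; with the prototype, every argument is checked and converted.

```c
#include <assert.h>

int scale(int x);                   /* prototype: argument types checked */

int scale(int x)
{
    return 2 * x;
}

static int call_site(void)
{
    /* With the prototype visible, passing e.g. a double here would be
     * converted (or diagnosed), not silently mis-passed. */
    return scale(3);
}
```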

No, because that's not a question of probing. No amount of probing, by any tool, can _deduce_ whether I made those objects 'static' by mistake or by conscious decision. The only way any tool can ever know that would be by me _telling_ it that I did it on purpose, right there in the source code, e.g. by adding a lint comment to that object's definition.

Reply to
Hans-Bernhard Bröker

On 11.03.2015 at 11:46, Reinhardt Behm wrote:

The compiler could deduce that, and most likely it does. But a fully blown C source analysis tool (and that includes a compiler in -W all mode) may worry about more than what _it_ can deduce. It may want to make sure that _you_ deduced correctly, too.

If you didn't spell out your wish down to the last detail, down to and including conversions which the compiler would have automatically performed anyway, the compiler can't know what you really wanted; it only sees what you wrote, not what you thought.

So in a really picky setting, the compiler has to get back to the coder to make sure that writing really does match intent. It does that by emitting a diagnostic for every aspect of the code that's not 100% perfectly self-evident.

Reply to
Hans-Bernhard Bröker

There is a reason I try very hard not to use any compiler-specific features in my source code.

I should probably make it a habit to occasionally attempt to build my projects with alternative compilers, to ensure that I actually *have* avoided compiler-specific features.

A friend of mine routinely builds his software with three different compilers. It certainly avoids compiler-specific features, but it also explores the differences in the bugs in the different compilers. :-/

Greetings,

Jacob

--

                                           -- Mirabel Tanner
Reply to
Jacob Sparre Andersen

Ok, I did not explain fully. Integers, floats and strings are separate "type categories" in Ada, so although the literal 6 is compatible with any integer type, it is not compatible with a floating point type or a fixed point type. An Ada compiler would complain about the Ada versions of the above.

And also about that (assuming that you have not written your own operator "+" with a suitable profile (string + integer, returning whatever type "value" has)).

Character literals like 'c' and string literals like "1234" are also type-flexible, like integer literals, in that they take on an 8-bit,

16-bit, or 32-bit-per-character encoding depending on the context.

I'm not quite sure what you are asking, there. The former declaration resembles Ada, in which both x and y are initialised to the value 5. The latter is C, where only y is initialised to 5.

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
       .      @       .
Reply to
Niklas Holsti

Sorry, I wasn't commenting on Ada, specifically. Merely trying to indicate how "do the right thing" can still be misleading.

Again, I'm trying to point out that people coming from different backgrounds could "see" such an operation in very different ways -- based on past experiences.

And, again... :>

Limbo is closely related to C (same pedigree). I.e., you could "get the gist of" a Limbo program without fully understanding its syntax. You might likely consider the two statements (above) to be roughly equivalent when, in fact, they are subtly different.

Our expectations belie our past experiences. Hence, you read what you *want* instead of what is actually *written*.

Reply to
Don Y

IME, this is usually a win -- unless you live in a narrowly defined application domain (same tools "forever", etc.)

Somewhat "of necessity" (convenience?), I routinely (i.e., during the course of any given week) expose my code to 4 or 5 different compilers in different hosted environments. I keep my "live" code on a box that is up 24/7/365. I can access it there and compile/debug under gcc/gdb.

Or, mount that filesystem on a Windows host and use a Windows-hosted "native" compiler/debugger (e.g., Borland, MS, etc.).

Or, mount it on a Solaris (SPARC) host and generate objects for that environment.

Or, use a cross-compiler/IDE and compile it for the desired target.

As a result, I see how different compilers react (complain!) to my code; see different data encodings (endian-ness, floating point formats, etc.); etc.

By routinely rotating through these different toolchains (host and target environments), I see what I am possibly doing "non-portably" soon enough to correct my style (I am largely a creature of habit and use the same stanzas repeatedly in my code, etc.) before I've got too much invested to make corrections "painful" (which, if you believe people to be lazy, suggests I will opt for some OTHER way to avoid making the "necessary" corrections).

Reply to
Don Y
