arm-gcc: pointer to constant string

Fair enough - there is no need for the string to be in flash or read-only memory. But it is likely that it would, at least logically, be at a fixed address in the memory map somewhere.

Reply to
David Brown

That's hairsplitting, which obscures rather than illuminates.

Nonsense.

I know of three languages where subsetting is a practical necessity: C, C++, and the SPARK subset of Ada. That's significant.

However, SPARK's objective and *ability* is something that C/C++ cannot even dream of: provably correct implementation for use in high integrity systems. To get that you would have to subset /any/ language (except possibly NewSpeak!).

No. Very, /very/ few - if any - developers understand all the undefined and implementation defined behaviour of any standard of C or C++, and how to reconcile different parts of the same standard (or differing implementations of the same standard)!

Statements from those who were involved in some of the standardisation efforts lead me to believe that's also the case for commercial compiler writers, several of whom have requested opinions about how standard clause X can be reconciled with standard clause Y - and for the committee itself, e.g. infamously with its refusal to believe C++ templates were Turing complete until somebody rubbed their faces in a proof!

No, I can't quote chapter and verse. While watching the shenanigans from the sidelines it became apparent to me that C/C++ was becoming (early/mid 90s) part of the problem as well as part of the solution.

Reply to
Tom Gardner

The library makes no mention of storage classes, meaning: it does not care about storage classes.

namely: construction of an array of static storage duration, with content that - lo and behold - fulfils the library's requirements of a string, by being a zero-terminated array of characters.

The question was: is a string literal's address and content valid after the function containing it exits? The answer is: yes.
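
To make that concrete (my own minimal example, not the OP's code):

    const char *greeting(void)
    {
        /* The array behind "hello" has static storage duration, so the
         * pointer - and the characters it points to - remain valid after
         * this function returns. */
        return "hello";
    }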

See above.

Stefan

Reply to
Stefan Reuther

You must be unfamiliar with the *formally proven* C compiler that was part of the sel4 (formally-proven microkernel) project. I've met some of the people, who work here in Sydney.

They proved the operating system guarantees *down to the silicon* for a specific ARM 9 chip (with MMU).

Clifford Heath.

Reply to
Clifford Heath

Reference please, and has the claim withstood knowledgeable scrutiny?

How did they formally prove the compiler? That in itself is "entertaining" for an incompletely defined language such as C.

Does "down to the silicon" include the silicon itself?

The A9s are complex devices with caches, interrupts etc. How was the effect of those taken into account in the proof?

Reply to
Tom Gardner

Just noticed you claim the /compiler/ was formally proven; too little coffee :(

What are the caveats/limits to that proof?

Was it a full standard C specification (which standard?) or just a subset of C?

Basically I regard the claim of a formally proven C compiler as "extraordinary", and hence it requires "extraordinary proof".

Reply to
Tom Gardner

Is it too hard for you to type "sel4" into Google?

The first hit is , but there are many resources for this, including the backgrounder on L4 family microkernels on Wikipedia.

You can download all the code, all the specifications (which are executable) and the compilers and provers, and run them yourself.

This is the culmination of a massive multi-decade effort by some of the world's smartest people. It's not some maverick's hobby.

Reply to
Clifford Heath

Google orders results based on what it thinks you are interested in :( It would be easy for you to provide a definitive link.

The sel4 FAQ includes "The reduced proof assumptions mean that we do not need to trust the compiler or linker, but there may still be faults remaining in specific low-level parts of the kernel (TLB, cache handling, handwritten assembly, boot code). These parts are thoroughly manually reviewed." and "From C code to binary code, we employ a set of automated widely-used verification tools. These are: SONOLAR, Z3, Isabelle/HOL, and HOL4."

That looks like the *compiler* isn't formally verified. It also makes me wonder whether C is "merely" an intermediate language autogenerated from a HOL spec.

Having said that, the commercial deployment is certainly interesting and significant.

Sorry, life's too short!

I've seen many such activities since the early 80s, some of them also based on LCF. They went nowhere.

Reply to
Tom Gardner

The C code is annotated with special comments that contain guidance for the automated provers. The provers analyse the C code, and the final binary, and (using knowledge of both C and the ARM instruction set) can formally prove the equivalence.

So no, the compiler is not proven - but all of its specific outputs are proven to be an accurate translation of the respective inputs. That's good enough.

It might not work for all possible C code, but it doesn't need to; it just needs to work for the actual code being compiled.
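
For a feel of what "annotated C" looks like in general (an illustrative sketch in the style of ACSL/Frama-C contracts, not the actual seL4 annotation syntax):

    /*@ requires lo <= hi;
        ensures lo <= \result <= hi;
        assigns \nothing;
    */
    int clamp(int v, int lo, int hi)
    {
        /* The special comments state what the prover must establish
         * about this function; the code itself is ordinary C. */
        if (v < lo) return lo;
        if (v > hi) return hi;
        return v;
    }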

No. However, the specification (the invariants that must be proved) is written in literate Haskell.

There has been no result like this anywhere before, period.

Besides that, sel4 and its precursors are already in significant deployments, but guess what? They're all secure systems, and the people running them don't tend to talk about it.

Clifford Heath.

Reply to
Clifford Heath

It is no different than programming in Forth, Ada, Python, Fortran, or any other language. What makes you think C is special?

You haven't read the MISRA rules, have you? I didn't mention that the remaining 10% of the rules go against the C standards and compiler norms by inventing their own non-standard type system. I suppose that /does/ mean it counts as another type of C, but it is certainly not a standard one.

I think you'll find it applies to /all/ programming languages. For some languages, there is enough consensus in these "subsets" and enough popularity of them that they are known popularly - like SPARK and MISRA. For others, they will be smaller. If company guidelines say that when programming Python you must use capital letters for class names and never write "from module import *", then you have a subset of the language just as surely as if they say "use MISRA for C".

I know that is SPARK's objective. Most sets of guidelines or coding standards aim towards more reliable and less error-prone coding (though few aspire to claiming they can be provably correct).

I think I see what you are getting at now - you are saying there are things in C that are undefined (or implementation-defined) that many C programmers don't realise are a problem. Yes, I can agree with that - though I think the problem is often overrated. The great majority of undefined behaviour in C is obviously undefined because there is no sensible definition for it - and there is no way anyone could reasonably expect the code to "work". That applies to things like buffer overruns, out-of-bounds pointers, etc.

The most obvious categories of undefined behaviour where people get things wrong are, I would say, accessing data through incompatible pointers and signed integer overflow. In the case of signed integer overflow, the code is almost certainly wrong regardless of any definition of the behaviour.
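
To illustrate those two (my own minimal examples, not code from this thread):

    #include <stdint.h>

    /* Signed integer overflow is undefined: the compiler may assume it
     * cannot happen, so this "check" can legally be optimised to 0. */
    int will_wrap(int x)
    {
        return x + 1 < x;     /* undefined when x == INT_MAX */
    }

    /* Accessing an object through an incompatible pointer type breaks
     * the aliasing rules - undefined, even though it often "works". */
    float bits_to_float(uint32_t bits)
    {
        return *(float *)&bits;   /* undefined behaviour; the well-defined
                                     way is memcpy() into a float */
    }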

The rest is just a matter of not knowing the language as well as they should, and thus getting things wrong. That happens in all languages.

I am not in any way saying that C or C++ are perfect languages, or that their standards are ideal, or anything like that. I am merely saying they are not as bad as many people make out, and not different in principle from other languages.

Reply to
David Brown

Ah - 6.4.5p6, second sentence: "The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence."

Key here is that the standards specifically say that this array has "static storage duration". So yes, the OP's code is fine according to the C standard.

Thank you for leading me to this point.

Agreed (now that I have seen the light).

Thank you.

Reply to
David Brown

If the type system is a valid use of C, then it is no more non-standard than libraries that aren't defined in the standard.

I think most people would look askance at that being an informative definition of "subset" - and would instantly think to look at what's being concealed.

Agreed.

Overall I'm not a fan of formal proofs except in a few very niche applications. The proofs always have hard (in multiple senses) boundaries which exclude important real-world problems.

There are no perfect languages, and probably cannot be.

However, some languages are significantly more/less imperfect in practical use than others.

Overall C has expanded until its rough edges have become part of the problem rather than part of the solution. Once it really was the best language for many purposes - but we have learned Better Techniques in the past 40 years.

C's position is firm for the same reason that COBOL's position is firm: history, familiarity and the weight of existing code.

Reply to
Tom Gardner

Unfortunately, MISRA's idea of "essential types" is not /quite/ compatible with actual C. I have forgotten the details (and they may vary between MISRA versions), but there are points in which the MISRA rules say code should do something that is different from what the C rules say. I have seen at least one case where a (massively expensive) compiler generated code regarding integer promotions that was "MISRA correct" but not "C standards correct".
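
The place such disagreements show up is integer promotion (my own illustration of the standard C behaviour, not a quote from any MISRA document):

    #include <stdint.h>

    int check(uint8_t a, uint8_t b)
    {
        /* Standard C: a and b are promoted to int, so with a == 200 and
         * b == 100 the sum is 300 and the comparison is true.  A compiler
         * that (contrary to the C standard) kept the arithmetic in the
         * narrow "essential" type would wrap the sum to 44 and get the
         * comparison wrong. */
        return (a + b) > 250;
    }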

There are few cases where formal proofs are worth the time and costs involved. But I like - when possible - to write my code with a view that a formal proof /could/ be made if someone were to go to the effort. It is rare that you could prove a whole program - but not uncommon that you could (if you wanted to) prove many of the functions involved. Thinking in terms of pre-conditions, invariants, post-conditions, variants, specifications - that is all good stuff for writing solid code, even if you don't do so formally.
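
As a sketch of what I mean (my own toy example, with assert() standing in for a real specification language):

    #include <assert.h>
    #include <stddef.h>

    /* Pre-condition:  buf points to len elements, len > 0.
     * Post-condition: the return value is an index of a maximal element.
     * Loop invariant: best indexes the largest element of buf[0..i-1].
     * Variant: len - i decreases on every iteration, so the loop ends. */
    size_t index_of_max(const int *buf, size_t len)
    {
        assert(buf != NULL && len > 0);       /* pre-conditions */
        size_t best = 0;
        for (size_t i = 1; i < len; i++) {
            if (buf[i] > buf[best])
                best = i;
        }
        assert(best < len);                   /* (partial) post-condition */
        return best;
    }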

Certainly I do not think C is the best choice of language for many of the types of use it is put to. I think most people who program in C would be more productive and write more correct code in other languages - and most programs that are written in C would be better written in other languages.

Those are undoubtedly reasons for choosing C in many cases. But C is still the best choice for many purposes - and sometimes the only "choice".

Reply to
David Brown

Oh dear. I can't find /anything/ to disagree with in those statements :)

Reply to
Tom Gardner

If you stick to something akin to the Haskell Actor model and everything is an event, it's not too onerous to use combinators to generate a significant subset of all messages. With the proper level of introspection, you can "prove" the program works to a limit enumerated by the test vectors and that it continues to do so as changes are made to the code base.

The test vectors and captured output can be controlled and compared against. And if you want to go as far as model checking or partial model checking, you have the stuff in place to enforce that now.

Obviously, there are domains where you just can't get close enough to this ideal. So you gamble.

Possibly. But what C offers is the ability to build nonstandard, hybrid "furniture" to support the development effort that cannot be foreseen by language and library designers. This is especially pronounced in serialization and the ability to dynamically configure a program (which are two sides of the same coin).

C also supports event-driven-ness in a way that few other languages do.

--
Les Cargill
Reply to
Les Cargill

Testing can /never/ prove a program is correct - unless you can enumerate /all/ inputs and states of your code and go through them all. Testing can prove a program is wrong (and it is vital for that), but it cannot prove a program is correct.

A functional programming language like Haskell gives you a much better chance of actually proving the correctness of your code than an imperative language like C - that is one of the reasons for the existence of functional programming languages (it is the lack of state and side-effects that help). But if you want to prove the correctness of your Haskell program, you do it by an analysis of the code, or by deriving correct code in the first place - you don't do it by testing.

Indeed - and almost all practical situations fall into this category.

I don't follow that at all.

And I follow that even less.

(I am not saying I disagree with you, I just can't see what you are trying to say. Maybe I'll understand it better after some coffee.)

C certainly has its important features and strengths, and is a very useful language to have.

Reply to
David Brown

That's nothing to do with formal proofs, of course.

It is equivalent to "inspecting quality into a product".

Yes, it is better than nothing/eyeballs - both of which I've witnessed :(

Er, no.

Event-driven programming is used at all levels of the application protocol stack. Frequently events/messages are layered on top of events/messages at other levels without the implementers even realising it.

For examples, start by looking at telecoms and fintech applications - which are often written in Java :) (Including high frequency trading where they spend billions (literally) to shave ms off comms latency, and encode business rules in FPGA hardware!)

Reply to
Tom Gardner

Windows GUI apps are often written as event-driven code, and no one sane writes them in C - they use C++, Python, Delphi, Java, C#, VB, or many other languages. None of these are particularly specialised for events or messages, but they all work at least as well as, or better than, C for it.

Go is perhaps a good choice for event-driven code (I base that on absolutely zero experience with Go...).

And if they want code that works reliably (and can prove it), they use Erlang rather than Java.

Reply to
David Brown

So what you can do is find the meat of what is possible and do that.

I must say - I am rather tired of this old warhorse; an appeal to Gödel in the form of an economics question. All I would say is that you should test as well as you can. You should use a dragon to fight the dragon.

And so if your system is adept at rejecting inputs it cannot expect, then what?

It's all about the ability to enforce constraints. Sort your constraints by the ability to do damage.

That's because we'd like to pretend that "analysis" is some process that's not automatable. Or, at least, not somehow aided by automation.

I only invoked Haskell because it's the source of the actor pattern.

Yeah, I dunno. I think people give up too early.

So write a C++ thing to dump all the state of Any Random C++ object to a text file. You can get close.

A C struct? Gimme thirty minutes and the source code.
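
Something like this, say (my own thirty-second sketch, with a made-up struct, not anything from a real project):

    #include <stdio.h>

    struct sensor_reading {          /* hypothetical example struct */
        int      channel;
        double   value;
        unsigned flags;
    };

    /* "Cheat" by dumping the raw in-memory representation as hex -
     * works for any object you can take the address and size of. */
    static void dump_bytes(const void *obj, size_t size, FILE *out)
    {
        const unsigned char *p = obj;
        for (size_t i = 0; i < size; i++)
            fprintf(out, "%02x%c", p[i], (i + 1 < size) ? ' ' : '\n');
    }

Call it as dump_bytes(&r, sizeof r, stdout); per-field, human-readable output takes a little more hand-written code per struct, which is where the thirty minutes go.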

And that's fine. I accept the following premises:

- Model checking is the current state of the art in correctness.

- Models don't work without events.

- Models work better when they're introspected.

- Because C can "cheat" and use the physical representation of objects in memory, it has a small leg up in the game of introspection.

C has select()/poll()/epoll() and Real(tm) callbacks.
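
A minimal sketch of the shape of that (hypothetical names, POSIX poll() on stdin):

    #include <poll.h>
    #include <stdio.h>
    #include <unistd.h>

    typedef void (*event_handler)(const char *data);

    static void on_input(const char *data)     /* the Real(tm) callback */
    {
        printf("event: %s", data);
    }

    static void event_loop(event_handler handler)
    {
        char buf[256];
        struct pollfd pfd = { .fd = STDIN_FILENO, .events = POLLIN };

        /* Block until stdin has data, then dispatch to the callback. */
        while (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            ssize_t n = read(STDIN_FILENO, buf, sizeof buf - 1);
            if (n <= 0)
                break;
            buf[n] = '\0';
            handler(buf);
        }
    }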

For what it is worth, if I knew what I was trying to say, I'd say it :)

So the tack I've taken is to cobble together things that make C do what I want it to (with a scripting language adept at string processing) rather than wait on someone else to provide things for me.

--
Les Cargill
Reply to
Les Cargill

Yes. Of course. It is, as all good engineering is, about the retirement of risk.

Oh heck no. Try "introspecting quality into a product".

--
Les Cargill
Reply to
Les Cargill
