Evaluation of boot execution time

Hi everyone,

I'm trying to understand how much time it takes for my system to boot. I define 'booting' as the process from CPU reset to the beginning of execution of my 'main' program (if a better definition for this phase exists, I'd be happy if somebody could point it out to me).

I'm simulating my embedded processor on an RTL simulator and I log a note when the instruction memory port is loaded with the 'brlid' instruction that jumps to main's address (in my case, brlid r15, 1784).

So now I have my 'booting' time, but I sense I'm missing something. For instance, I'd expect that if the type of program is /different/, the amount of time it takes to boot would be /different/.

Is there any standard way to measure the booting time? Any idea on how to instrument my program in order to see what affects booting time?

Thanks a lot for any suggestions/ideas/comments.

Al

--
A: Because it messes up the order in which people normally read text. 
Q: Why is top-posting such a bad thing? 
A: Top-posting. 
Q: What is the most annoying thing on usenet and in e-mail?
Reply to
alb

On Mon, 10 Nov 2014 09:58:17 +0100, alb wrote:

Then read the boot code. :)

If the boot code is the same then the time could be the same. If the boot code distinguishes between cold and warm boots, that might cause differences.

If your simulator has a cycle counter, then I would recommend using that.

Instrument the boot code!

--
(Remove the obvious prefix to reply privately.) 
Made with Opera's e-mail client: http://www.opera.com/mail/
Reply to
Boudewijn Dijkstra

Hi Boudewijn,

Boudewijn Dijkstra wrote: []

Sure, even though I do not believe it is the most efficient way to get this type of information.

OK, maybe I should have given more information about the environment. The boot code, as I defined it, is essentially what the C run-time environment does to set up .bss, the heap, the stack and whatever else is needed for the program to run, including the necessary calls to .ctors.

I do not see any cold or warm boots, only the lengths of the data sections, which might differ and take more time to configure. Am I missing anything else?

[]

Why should I instrument the boot code? I would like to see, given the current library, which elements of my program have an impact on the execution time of the booting phase. If I instrument the current library, I will not obtain what I want.

Al

Reply to
alb

It is a good way to learn about the internals of embedded systems.

So, you are specifically interested in the "cstartup" part of the boot process. Your linker should be able to tell you the sizes of bss, heap, stack, etc. Setting up .bss and .data should be linear with size. Setting up stack and heap could be negligible or linear with size depending on whether they are filled with a pattern.
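
A rough C-level sketch of those cstartup loops, just to make the linear-in-size point concrete (the symbol names __bss_start/__bss_end and _stack_low/_stack_high follow common GNU linker-script conventions; your BSP may call them something else):

  /* Zero .bss and (optionally) pattern-fill the stack; both loops do one
     store per word, so their cost is linear in the section size. */
  extern unsigned int __bss_start[], __bss_end[];    /* assumed linker symbols */
  extern unsigned int _stack_low[],  _stack_high[];  /* assumed linker symbols */

  static void crtinit_sketch(void)
  {
      for (unsigned int *p = __bss_start; p < __bss_end; ++p)
          *p = 0u;                          /* clear .bss */

      for (unsigned int *p = _stack_low; p < _stack_high; ++p)
          *p = 0xDEADBEEFu;                 /* only if your crt fills the stack */
  }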

Indeed, cstartup is pretty straightforward. However, it is not inconceivable that your toolchain inserts hardware initialisation routines depending on the needs of the program.

--
(Remove the obvious prefix to reply privately.) 
Made with Opera's e-mail client: http://www.opera.com/mail/
Reply to
Boudewijn Dijkstra

Hi Boudewijn,

Boudewijn Dijkstra wrote: []

OK, I did read the assembly and the basic structure is the following:

start
 |
 +-> crtinit
 |     |
 |     + configure bss
 |     + program_init
 |     + main
 |     + fini
 |     + program_clean
 +-> exit

the 'configure bss' should be something like this:

 128: 20c04738  addi r6, r0, 18232   // 4738
 12c: 20e04738  addi r7, r0, 18232   // 4738
 130: 06463800  rsub r18, r6, r7
 134: bc720014  blei r18, 20         // 148
 138: f8060000  swi  r0, r6, 0
 13c: 20c60004  addi r6, r6, 4
 140: 06463800  rsub r18, r6, r7
 144: bc92fff4  bgti r18, -12        // 138
 148: 20c04738  addi r6, r0, 18232   // 4738
 14c: 20e04790  addi r7, r0, 18320   // 4790
 150: 06463800  rsub r18, r6, r7
 154: bc720014  blei r18, 20         // 168
 158: f8060000  swi  r0, r6, 0
 15c: 20c60004  addi r6, r6, 4
 160: 06463800  rsub r18, r6, r7
 164: bc92fff4  bgti r18, -12        // 158

[]
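
Just as a back-of-the-envelope check on that listing (assuming roughly one instruction per cycle, which is optimistic for the taken branches): the first block spans 0x4738..0x4738, i.e. zero bytes, so only its four setup instructions execute; the second spans 0x4738..0x4790 = 88 bytes = 22 words, at four instructions per word (swi/addi/rsub/bgti), i.e. about 4 + 22*4 = 92 instructions. So this part of the boot cost scales directly with the number of .bss words to clear.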

I didn't know it was called 'cstartup'; I've always heard it called crtinit instead.

AFAIK .data is initialized and is included in the binary file, so it shouldn't really contribute to boot time. .bss, on the contrary, has to be initialized to '0' and does contribute to booting time.

[]

Uhm, I'm using mb-gcc on an MB-Lite architecture and I haven't seen, thus far, any particular nuances related to hardware initialization.

Al

Reply to
alb

I guess it depends which toolchain pedigree you are most familiar with. ;)

That depends on whether your data is in volatile storage on reset. If the initial values are in Flash, then they need to be copied to RAM first.
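
For completeness, the copy loop in that case would look roughly like this in C (again, __data_load/__data_start/__data_end are just the usual GNU-style linker symbols, not necessarily what your script defines), and it is linear in the size of .data:

  extern unsigned int __data_load[];                 /* load address, e.g. in Flash */
  extern unsigned int __data_start[], __data_end[];  /* run address in RAM */

  static void copy_data_sketch(void)
  {
      const unsigned int *src = __data_load;
      for (unsigned int *dst = __data_start; dst < __data_end; ++dst)
          *dst = *src++;                   /* copy initialised data into RAM */
  }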

Indeed.

--
(Remove the obvious prefix to reply privately.) 
Made with Opera's e-mail client: http://www.opera.com/mail/
Reply to
Boudewijn Dijkstra

Hi Boudewijn,

Boudewijn Dijkstra wrote: []

I've always used the GNU toolchain. Does another one even exist??? ;-)

[]

You have a point. That step is done by the FPGA fabric, so when the processor is started everything is already happily residing in RAM.

Al

Reply to
alb

Then, have main() tickle a GPIO and measure the time from the release of RESET to the time you see that GPIO tickled.
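
Something along these lines at the very top of main() would do; the register address below is made up, so substitute whatever GPIO your design actually maps:

  #include <stdint.h>

  #define GPIO_DATA (*(volatile uint32_t *)0x40000000u)   /* hypothetical address */

  int main(void)
  {
      GPIO_DATA |= 1u;      /* "reached main()" marker -- timestamp this edge */

      /* ... the rest of the application ... */
      for (;;)
          ;
  }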

Note that you haven't discussed your environment so it's unclear whether this time will be constant (given no changes to your executable) OR highly variable!

E.g., if your startup code contains initialization routines that have to "make the hardware safe" and the hardware's state can vary from one RESET to the next, then the time required to reinitialize it can vary.

This can also be true in other cases. E.g., imagine you had to count restarts and store the result in FLASH... what happens if *this* FLASH write needs to be retried while the FLASH write for the previous restart didn't? Or, if the write requires an ERASE cycle, etc.?

Define "different"...

Only you (thinking on behalf of your customer/user) can decide what a MEANINGFUL measurement would be. E.g., if the first line of main() gets executed within a dozen microseconds of RESET being released... BUT, main spends the first three minutes doing other housekeeping BEFORE the user can effectively *use* your device, then the 12 usec time is stupid!

E.g., that's the trick desktop apps use to throw up a splash screen QUICKLY so you think they are "already running" when, in fact, they have a bunch of initialization code that still has to execute before they can be USED!

Reply to
Don Y

I do a similar thing but in simulation and I spy on the instruction bus when the 'jump to main' instruction is executed.

That is the biggest issue I have. Unfortunately the requirements are quite misleading here since there's a total lack of 'perimeter' for the code and we need to provide the hardware.

So our nice customer has been encouraged to believe that they can make an algorithm in Matlab, export it as C and compile it for our architecture (MB-Lite) without breaking a sweat!

I believe this lack of coordination will lead to no joy at all! Sob.

Fortunately this is not the case. The hardware is configured by the FPGA in 'safe mode' and the software doesn't need to perform any such operations.

Sure, I must admit my OP wasn't at all clear w.r.t. boundary conditions.

Assume I have program A and program B, both of them accomplishing the same task but written differently. A uses lots of global variables while B doesn't use them.

All statically allocated objects, like global variables, need to be configured at run time and will eat up my booting time. In what other aspects could A and B differ that would have an impact on booting time?
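
To make the first point concrete, a contrived pair of fragments (sizes are arbitrary):

  /* Program A: zero-initialised global -> ends up in .bss, so crtinit has to
     clear 40000 extra bytes before main() is ever reached. */
  static int buffer_a[10000];

  /* Program B: the same storage obtained inside main() (stack or heap), so it
     adds nothing to the .bss clearing loop; the cost moves into main(). */
  int *buffer_b;   /* later, in main(): buffer_b = malloc(10000 * sizeof *buffer_b); */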

[]

I see your point, but we are far from emulating this sick trend. My measurement would be from RESET to operational, where 'operational' really means when main starts.

Al

Reply to
alb

You are essentially saying all hardware related issues are "off the table" and just looking at software issues?

You could write A to be incredibly inefficient at *whatever* it does. While this sounds pedantic, it may actually *not* be!

I know of a piece of production code that stored all of its data as ASCII numeric strings. This made configuration relatively easy. And, you could visually inspect the data to verify the results (instead of having to CORRECTLY decode a multibyte binary value and verify *that*!)

Unfortunately, in real terms, this meant the processor spent gobs of time in sscanf(3c). And, it used a COTS library for all this stuff despite the fact that their use was far from typical! When profiled, the results were so dramatic that they were convinced it was a "measurement error": "Shirley the processor can't be spending ALL its time in sscanf!" ("Yes, it is! And stop calling me Shirley!")

Any other sort of coding practice that doesn't fit nicely with the expectations of your environment (hardware, OS, etc.) can also dramatically increase execution time. E.g., poor locality of reference can dramatically impact the efficacy of any caches. VM systems can be crippled by inconsiderate choices of algorithms: "for all i do {for all j do { something with x[i][j] } }" behaves differently from "for all j do {for all i do { something with x[i][j] } }".
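
To spell that last example out in C (row-major arrays, so the i-outer/j-inner order walks memory sequentially; N is arbitrary):

  #define N 1024
  static double x[N][N];

  void cache_friendly(void)      /* unit stride: touches consecutive doubles */
  {
      for (int i = 0; i < N; i++)
          for (int j = 0; j < N; j++)
              x[i][j] *= 2.0;
  }

  void cache_hostile(void)       /* stride of N*sizeof(double) per access:
                                    potentially one miss (or page fault)
                                    per element */
  {
      for (int j = 0; j < N; j++)
          for (int i = 0; i < N; i++)
              x[i][j] *= 2.0;
  }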

First, assess how competent folks are with the tools/environment in which they are tasked with working. If they are aware of these sorts of issues, then they will adapt before (or after) they encounter a big performance hit.

In any case, you may want to consider policies or mechanisms that remove much of the "bad choices" from your implementation -- *if* the results are viable! (e.g., don't allow VM if it's not likely to be used properly)

People tend to be notoriously ignorant of where time is spent in algorithms (e.g., the sscanf() example). And, too often "prematurely optimize" -- often NEVER having obtained empirical data AND considered how that data would likely change "in the wild".

Assumptions are the root of all errors and inefficiencies. People *think* they know what is happening under the hood but, in reality, are just making GUESSES based on their own personal biases. A better approach is to "do the math" (i.e., a calculator instead of a Ouija board!)

Set some criteria for your "boot time". Something that your client/customer can agree to. And, something that you can *measure* (think like a lawyer -- or a Test Technician!). Make sure folks know the RELATIVE significance of this up front so they can factor it into their design methodology.

(I say *relative* because you can end up with pathological implementations where folks do silly things to make ONE criterion look artificially "good" at the expense of other, potentially more significant, criteria!)

Reply to
Don Y

Hi Don,

Don Y wrote: []

Correct, no hardware differences are involved and/or taken into account, since they are not present in the current scenario.

Unfortunately 'inefficiency' is a quality that is difficult to measure; we all understand the extreme cases well (as in your sscanf example), but there's nothing that qualifies as a requirement for A and B.

For instance, if I knew which programming structures, objects or flows have a direct impact on the booting time, I could put numbers on all those elements for A and B (e.g. use fewer than 10 global variables).

[deleted sscanf example] Keep in mind that I'm interested in what happens at 'booting' time, not when the program executes its main section. But I do see your point here. []

Worst of all, we do not have any idea of what the algorithm to be implemented looks like; we are only providing the hardware with a certain computational capability. Yet, choices on the algorithm and its implementation may have an impact on the functionality (like a too-long booting time).

[]

Premature optimization is to be banned. Optimization is typically an effort that needs to look at the overall picture and concentrate on what makes sense to optimize. I've seen efforts to optimize memory paths to reduce access time to external memory while the processor was spending 98% of its time on local cache operations.

Nice example, the Ouija board; I didn't even know it was called that! One root of all the problems lies in the requirements specification itself. The all-too-common problem is that the customer specifies an incomprehensible list of requirements, we take it 'as is' in order to reassure them, and 6 months after (or even later) we come up with all sorts of inconsistencies!

BTW, we recently used a piece of software ('Lelie for requirements') to evaluate the 'soundness' of a requirements spec on which we had already done a meticulous review. It came up with 70% of our own remarks, spotting issues that are not so easy to spot, for a stupid program. Unfortunately it costs too much for a small company like ours, but they do offer a 'service' at a convenient rate per document [1].

That is exactly what I wanted to propose; unfortunately we are close to the Critical Design Review and nobody has given a s**t about this item so far.

Al

[1] and no, I do not work for them or take any percentage of their sales.
Reply to
alb

I have a standard way of dealing with "hand-waving" clients: everything is on a time-and-materials basis. And, I am overly blunt in explaining it to them as, "That way, *you* pay for the flexibility to change your mind".

When they think about this for a while, (especially after the first time they "change their mind") they decide they don't really WANT this freedom and want to explore other contract options.

"Fixed cost? SURE! That sounds great!" "OK, we just have to decide what you *want* before I can give you a quote." "Sure, no problem"

Then, you start asking questions to draw out the specifics of the design (that they OBVIOUSLY haven't completely "thought through"). It won't be more than a few minutes before they reply to one of your interrogatives with "I don't care". Without batting an eye or lifting my gaze from the notepad where I've been recording their answers, I say, "When XYZ happens, the device can catch fire and burn to a cinder..."

Of course, this gets their attention: "What???! That's not acceptable!" Putting down my notepad, I look at them, straight faced, and say, "But you said that you 'didn't care'?" "Well, I sure as hell don't want it to catch FIRE!" "Ah, so you *do* care!" "So, what would you *like* it to do in that case?" "Well, I don't *know*..." "Ah, but you see I have to know what freedoms I will have in the design before I can give you a FIRM ESTIMATE of what it will cost and how long it will take. I obviously want to make as many things that *you* 'don't care about' take ZERO design/test time. Catching fire, turning itself off, making a loud noise or any other SIMPLE thing makes my life easier and the effort more predictable..."

People usually don't know what they want. They want YOU to come up with something -- and then they'll tell you what they *don't* want! (i.e., what you have *done*!)

Reply to
Don Y
