How to choose a firmware partner

_A history of Univac computers and Operating Systems_ says "Just as the first UNIVAC 1108s were being delivered, Sperry Rand announced the 1108 II ... The memory units were for program storage and data storage, each holding up to 262,000 36-bit words."
--
Guy Macon, Electronics Engineer & Project Manager for hire. 
Remember Doc Brown from the _Back to the Future_ movies? Do you 
have an "impossible" engineering project that only someone like 
Doc Brown can solve?  My resume is at http://www.guymacon.com/
Guy Macon

"Latching permanently" - do you mean a) CMOS latchup (which would be down to poor integration); b) other excluded logic state (fixed via a reset); or c) loss of software control (also fixed by a reset)?

I don't understand how any of these affect the choice of CPU.

Ah, I see where you're coming from. The lesson here is not to avoid watchdogs; it's to make sure you read the datasheets. I'm afraid this was a hardware/software integration error. The watchdog was doing its best to save your ass, but you misprogrammed it.

I'm a bit puzzled by this statement:

temperature).

Steve at fivetrees


It's worse than that, read on.

So what you seem to be saying is, "I don't read datasheets and when things go wrong for me it's the manufacturer's fault". If you think it can only vary by 20% then you will be rudely educated again. For example, on a 16F628 the acceptable limits (according to the datasheet) are 7 to 33 ms, with 18 ms being "typical" (no prescaler assigned). I'm really not trying to be an ass, but you absolutely have to RTFM when working with these things.
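To make the arithmetic concrete, here is a minimal sketch of budgeting against the datasheet minimum rather than the "typical" figure. The 7/18/33 ms numbers are the ones quoted above; the function names, the safety-factor parameter, and the assumption that the prescaler simply multiplies the timeout by a ratio of 1 to 128 are illustrative and should be checked against the datasheet for the actual part.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16F628 WDT limits quoted in the post above (no prescaler), in microseconds. */
#define WDT_MIN_US   7000u
#define WDT_TYP_US  18000u
#define WDT_MAX_US  33000u

/* Shortest timeout the watchdog may produce for a given prescaler ratio.
   Assumption: the prescaler multiplies the period (1, 2, 4, ... 128). */
uint32_t wdt_worst_case_us(uint32_t prescaler)
{
    return WDT_MIN_US * prescaler;
}

/* True if kicking once every loop_period_us still leaves the requested
   safety factor against the worst-case (shortest) timeout. */
bool wdt_kick_is_safe(uint32_t loop_period_us, uint32_t prescaler,
                      uint32_t safety_factor)
{
    return loop_period_us * safety_factor <= wdt_worst_case_us(prescaler);
}
```

A 15 ms loop looks fine against the 18 ms typical value but fails against the 7 ms minimum; with a 1:4 prescaler the worst case becomes 28 ms and the same loop passes.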

Anthony Fremont

I actually have a core memory board that looks very similar to the ones in that link. I can't remember how I acquired it, as I certainly have never seen the machine that it came from. This particular one is made by Litton Memory Products and is a G645E 8K x 19 bit 3W-3D 18mil Planar Memory (so says the board).

I find it interesting to hear from some of the people that worked with these older machines.

Thanks for the trip down memory lane (though, not my memories),

Mike Anton

Michael Anton

I am curious, how many 1 bit or 2 bit errors has the system reported?

BTW, I have watched the pricing on older Intel chips and I don't see the high end parts drop much in price until they are incredibly obsolete. Even then they can start to go back up as they become very scarce. The more mainstream chips seem to keep dropping in price. I guess nobody will pay a lot for an older Celeron or Pentium, but if you have an old server and need to replace a bad Xeon CPU chip, then it is a lot cheaper to do that than to replace the whole unit even at high CPU prices.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX
rickman

None, in two years of 24/7 use and two full weeks of running memory diagnostics. With older Compaq servers, if you can measure the error rate it is much too high.

Guy Macon

I assume that is running under Linux or your own software. AIUI there is no provision in Windoze for recording memory failures and/or corrections.

--
fix (vb.): 1. to paper over, obscure, hide from public view; 2.
to work around, in a way that produces unintended consequences
that are worse than the original problem.  Usage: "Windows ME
fixes many of the shortcomings of Windows 98 SE". - Hutchison
CBFalconer

I quadruple boot to FreeDOS, QNX, Slackware Linux, and Windows 2000.

This is true of desktop systems, but Proliant servers are another story. A desktop PC has a minimal BIOS that just gets the hardware into a state where the OS can boot. A fully equipped Proliant server has another CPU with embedded software that does such wonderful things as detecting that Windows has crashed and resetting the system, switching over to another server if there is a hardware failure, etc. In addition, Compaq provides a ProLiant System Management Driver for Windows that adds the following capabilities to Windows:

Logging of real-time clock battery errors. Logging of processor errors. Fan outage and temperature detect alarms. Logging of power module errors. Logging of corrected memory errors.

Guy Macon

Boy, you guys are really making me feel old! I started in this business working on a Burroughs 205 vacuum tube system that ran at a 200 kHz clock and had drum memory. Program input was punched paper tape and punched cards. The console was a maze of neon lights. That's where I learned to read binary. After that I worked on the Burroughs 220, which had core memory. My memory may be failing me, but it seems like it had a 16K stack that was about 2'x2'x3' or so. Power was a big 400 Hz generator set. Another memory was seeing a movie, some kind of space rocket to the moon, where the crew controlled the rocket with a box that was actually a diode checker for a 205. I almost died laughing at that. All the logic diodes were clip-in and a lot of time was spent checking diodes.

It's fun going down memory lane, isn't it?

Regards,

Art

Art K6KFH

Nice. Would you like to make a quick $10 profit on that box? :-) Am I correct that there is no real reason for this Windoze failing in general; i.e. it simply needs to enable the appropriate HW interrupt and service it accordingly?

--
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
     USE worldnet address!
CBFalconer

No way to tell; Microsoft doesn't release the Windows source code. Maybe doing so would upset some fragile part of Windows. Maybe this is one of the many parts of Windows where Microsoft no longer has the knowledge to make any changes. Perhaps the memory manufacturers are paying Microsoft to not report errors. Unless software is Open Source, these kinds of questions can never be answered.

Guy Macon

No, I do read data sheets but I have a small attention span and a bad memory so I expect to make mistakes.

Therefore the simpler the system, the more likely it will work (for me).

The more sensible the job the more likely it will fascinate my brain.

The sillier the job the more likely my brain will wander, and make mistakes.

Therefore I expect to make more mistakes implementing e.g. types. It may be that I am unique, but I doubt it. I think some others do likewise, and I think the sheer dullness and difficulty (impossibility) of optimising the placement of the wdt_resets (ignoring the future maintainability horror) will likely lead to wholesale sloppiness sooner or later, and aggressive management or timescale pressure will guarantee it.

Cheers Robin

robin.pain

Remind me not to hire you ;).

Errr... see my earlier post. There is no "dullness" or "horror" involved in kicking watchdogs - one just has to be methodical. And since being methodical is the name of this particular game (embedded design), I can't help but wonder whether you've chosen the right career.

Steve

Steve at fivetrees
[...]

Misreading (or not reading, or forgetting) the datasheet was a silly mistake.

But the sillier one was trying to "optimize" your WDT updates. There is really no reason to do so.

For my superloop code, the WDT is updated in exactly two places: immediately after reset, and at the top of the superloop. Combined with special "come from" tests to ensure the program flow is as was expected, our systems have no trouble surviving some really nasty ESD testing required by our customers. (Without a WDT, I _guarantee_ your system will fail these tests.)
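A rough sketch of what such a superloop might look like, written so it runs on a host. Everything here is an assumption for illustration: the post does not show code, so the stage tokens, `step()`, and the stand-ins for the watchdog and reset hardware are hypothetical names, and a real "come from" test may be implemented quite differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware access, so the sketch runs on a host. */
unsigned wdt_kicks;       /* counts watchdog updates */
bool     reset_forced;    /* set when a flow check fails */

void wdt_kick(void)    { wdt_kicks++; }
void force_reset(void) { reset_forced = true; }  /* on a target: halt and let the WDT fire */

/* Tokens recording which stage ran last (values are arbitrary). */
enum stage { STAGE_LOOP_TOP = 0x5A, STAGE_INPUTS = 0xA5, STAGE_OUTPUTS = 0x3C };
uint8_t came_from;

/* "Come from" test: verify where control came from, then record this stage. */
void step(uint8_t expected_from, uint8_t me)
{
    if (came_from != expected_from)
        force_reset();
    came_from = me;
}

void system_init(void)
{
    wdt_kicks = 0;
    reset_forced = false;
    came_from = STAGE_OUTPUTS;  /* as if a previous pass had completed */
    wdt_kick();                 /* kick site 1: immediately after reset */
}

void superloop_pass(void)
{
    step(STAGE_OUTPUTS, STAGE_LOOP_TOP);
    if (!reset_forced)
        wdt_kick();             /* kick site 2: top of the superloop */
    step(STAGE_LOOP_TOP, STAGE_INPUTS);   /* ... read inputs ...   */
    step(STAGE_INPUTS, STAGE_OUTPUTS);    /* ... drive outputs ... */
}
```

If a stray jump lands at the top of the loop from the middle of a pass, the token no longer matches, the kick is withheld, and the watchdog is left to reset the part.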

Multitasking systems also update the WDT in exactly two places: immediately after reset, and in a low-priority periodic task that checks to make sure the system is operating correctly.
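One common way to build such a monitor task, sketched here as an assumption rather than as the scheme described in the post: each task sets a check-in bit, and the low-priority task updates the WDT only when every bit has been seen since its last run. The task names and `monitor_kicks` counter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task identifiers, one bit each. */
#define TASK_COMMS   (1u << 0)
#define TASK_CONTROL (1u << 1)
#define TASK_UI      (1u << 2)
#define ALL_TASKS    (TASK_COMMS | TASK_CONTROL | TASK_UI)

volatile uint32_t alive_flags;   /* bits set by tasks as they run */
unsigned monitor_kicks;          /* stand-in for the real WDT update */

/* Called by each task once per cycle of its own work. */
void task_checkin(uint32_t task_bit)
{
    alive_flags |= task_bit;
}

/* Body of the low-priority periodic task. Returns true if it kicked. */
bool monitor_task_run(void)
{
    if ((alive_flags & ALL_TASKS) != ALL_TASKS)
        return false;            /* some task is missing: withhold the kick */
    alive_flags = 0;             /* demand fresh check-ins for the next period */
    monitor_kicks++;             /* on a target: update the WDT here */
    return true;
}
```

On a real RTOS the read-modify-write of `alive_flags` would need to be made atomic (for example by briefly disabling interrupts); that is omitted here for brevity.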

Under normal operation the WDT is being updated several orders of magnitude more often than is necessary. Who cares? It's abnormal operation I care about, and what the WDT is supposed to remedy.

Regards,

-=Dave

--
Change is inevitable, progress is not.
Dave Hansen


I hope you read what I wrote, and will try to remember that. ;-)

I often find the inverse to be true. The simpler the problem, the more likely I will underthink it and make a silly mistake. At one time I was a COBOL programmer (a long time ago, so don't laugh). I often made the stupidest mistakes when writing a new program. When I wrote assembly language, I often had great success at getting it right the first time due to the larger amount of thought required to reason the problem out in my tiny brain.

I think you will face many boring situations then.

Words like silly and sensible don't often correctly describe real world problems. Maybe the required solutions, but not the problems.

As others have said, kicking the watchdog just in the nick of time is unnecessary and foolhardy. That's why there is a prescaler. I will agree that relying on a watchdog to cover up deficient software design is a poor practice, but lousy software is not the only reason that embedded applications lock up. It's more a matter of the real world environment that necessitates the watchdogs. You can't control nature so don't waste your time trying. ;-)

If I had an application that occasionally hung up in a controlled environment, I would find the software problem causing it and not rely on the watchdog to cover it up.

Anthony Fremont

Univac had a machine that strung the cores on all three axes in the mid-1960s. It was fascinating to watch it work.

My first introduction to digital computers was the Bendix G15 in the mid-1960s. It was a decimal, serial, drum memory machine using a lot of diodes in its logic. It's amazing how long it takes to find all the failed diodes after a lightning strike!

Everett M. Greene

Yes it's a silly mistake that *anyone* can make.

Very well, _guarantee_ that this code will fail without the wdt enabled:-

        ...
        jmp      test
        jmp      test

test    bitset   PORT,test_bit
        bitclear PORT,test_bit
        jmp      test
        jmp      test
        ...

So a high energy particle changes your program counter and you have a random GOTO occur, but your program does not reset because the chances of the GOTO landing near a wdt_reset are now "several orders of magnitude" higher? And you say "Who cares?"

Cheers Robin

robin.pain

I'm not sure I understand your point, but in any case, if your test_bit is the watchdog output, then it's not a good idea to output both states in the same place. I and others have given you examples of watchdog-kicking strategies - have you read and understood them?

Eh? I'm sorry, this makes no sense at all to me. To reiterate: in normal operation, software will periodically kick the watchdog (high and low, in two wholly unrelated places that both need to happen to accurately represent "normal operation") to stop it timing out and hence resetting the CPU. If the program loses control, then the watchdog is not kicked in those two places, it times out, and the CPU gets reset. Whether a random GOTO lands near a watchdog kick (one state only) has nothing to do with anything.
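As a host-side illustration of the high/low scheme, the sketch below assumes an external watchdog that only counts level changes on its input pin. The pin model and the names `kick_site_a` and `kick_site_b` are hypothetical; the point is only that code stuck near one kick site keeps writing the same level and so produces no further edges.

```c
#include <stdbool.h>

bool     wd_pin;     /* simulated level on the watchdog input */
unsigned wd_edges;   /* transitions the external watchdog has seen */

/* Assumption: the watchdog is retriggered by edges, not by levels. */
void wd_write(bool level)
{
    if (level != wd_pin) {
        wd_pin = level;
        wd_edges++;
    }
}

/* The two halves of the kick live in unrelated parts of the program. */
void kick_site_a(void) { wd_write(true);  }   /* e.g. top of the main loop   */
void kick_site_b(void) { wd_write(false); }   /* e.g. a periodic housekeeper */
```

Calling the two sites alternately yields one edge per call; calling either one repeatedly yields none, so the watchdog times out.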

Steve

Steve at fivetrees

Why? Do you have a short attention span and a bad memory too? =:)=

Cheers Robin

robin.pain

That will fail whether or not there's a WDT. You've made another silly mistake.

[...]

And another.

Remember, there are exactly _two_ places in the code where the WDT is reset, not "orders of magnitude." It's a loop. And if we _do_ jump to the top of the loop from somewhere in the middle, the "come from" test will fail, triggering a reset.

Regards,

-=Dave

Dave Hansen
