How to develop a random number generation device

We bought a *lot* of identical drives when we bought the batch of PCs. I don't want to worry about PCs for another 4 or 5 years maybe.

The hot-swap raid thing works great. Pull either of the C: drives, pop in another drive, blank or not, and it begins automatically cloning the live os to the "new" drive, online. It takes about an hour, after which they are identical. We've tested it in all sorts of situations, and it works.

I can also pull one of the C: drives from my work machine, take it home, and run it as D:, or boot and run the whole OS as C:.

John

Reply to
John Larkin

Have you actually tested this? What happens if you don't have the same kind of drive available? I don't think it will work until you replace the drive.

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

Agreed.

Some of it is dictated by the language: contrary to what used to be a commonly-held belief amongst DOS programmers, C does not have any concept of "near" and "far" pointers. If you want to use multiple data segments, *all* data pointers have to be segment:offset (48 bits on 32-bit CPUs). One data segment (data, bss, rodata, stack) and one code segment wouldn't be a problem, though.
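(As a quick illustration of what "far" actually was: a DOS-era compiler extension from Borland and Microsoft, not standard C. The classic text-mode video address below is just a familiar example, not anything from the original post.)

    /* NOT standard C: "far" is a DOS compiler extension; the pointer
       carries an explicit segment:offset pair (here B800h:0000h).    */
    char far *video = (char far *)0xB8000000L;
    video[0] = 'A';    /* writes directly to text-mode video memory */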

Some of it is dictated by portability: x86 has segmented memory; most other CPUs don't. If you want a single code base to run on multiple architectures, you can't assume segmented memory. This doesn't have much impact upon user space, but the Linux kernel could get quite messy if it had to allow for disjoint code and data spaces.

OTOH, segmentation doesn't necessarily get you all that much that you don't get with page-level controls (on x86, the inability to map pages write-only is a problem). On newer CPUs, you can implement W^X (write or execute but not both) at page level. On older CPUs, you can put the code first and make the code segment end immediately after the code (all segments must have the same base address to get a single flat address space), but this can cause problems for dynamically-mapped code (dlopen() etc). A compromise is to make the code segment end before the bottom of the stack, which protects against stack-based injection but not the heap or data segment (an attacker would have to find some other vector to get the code called, as you can't trash the return address with a heap overrun).
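(A minimal sketch of W^X at page level, assuming a POSIX system with mmap()/mprotect(); the 4096-byte page size, the trimmed error handling and the load_code() name are all assumptions for illustration. The page is filled while writable and only then flipped to read+execute, so it is never writable and executable at the same time.)

    /* W^X sketch: write first, then execute; never both at once. */
    #include <string.h>
    #include <sys/mman.h>

    void *load_code(const unsigned char *code, size_t len)
    {
        void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        memcpy(p, code, len);               /* writable, not executable */
        if (mprotect(p, 4096, PROT_READ | PROT_EXEC) != 0)
            return NULL;                    /* executable, no longer writable */
        return p;
    }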

You could prevent heap overruns if malloc() used a separate segment for every block, but there would be a significant performance hit (malloc() would require a context switch), and you're limited to 8192 (IIRC) local descriptors per process (16 bits for the selector minus 1 bit for global/local and 2 bits for the privilege level leaves 13 bits).

Theoretically you could use the same approach for local (stack) arrays, but the performance hit would be even worse.

Reply to
Nobody

Windows vs Linux doesn't come into it:

formatting link

C is C, whichever OS you run the program on.

Beyond that, the fact that the web is based around many "small" transactions means that there is a significant performance gain to be had from putting everything in one process (e.g. mod_php rather than spawning an interpreter for each request), thereby eliminating process boundaries which would otherwise provide some protection.

Reply to
Nobody

One problem with that is that you're limited to 8192 segments per process.

In theory, you could use segments only for "active" objects, and have something like the Local{Lock,Unlock} of 8086-mode Windows. But apart from producing really ugly code (and adding overhead), it only helps to the extent that the code chooses to make use of it.

Some code can use a lot of arrays, e.g. an array of structures, each of which contains an array of characters. Chances are that the programmer will use a segment for the larger array and leave the character arrays as just a range of bytes within the segment.

If you can accept mechanisms which impose significant constraints on coding, you may as well just forbid the use of arrays in favour of an opaque "vector" type whose accessor methods/functions perform bounds checking.
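(A rough sketch of what such an opaque, bounds-checked vector could look like in C; the names vec_t, vec_new, vec_get and vec_set are invented here purely for illustration.)

    /* Hypothetical bounds-checked vector: every access goes through an
       accessor that checks the index against the stored length.       */
    #include <assert.h>
    #include <stdlib.h>

    typedef struct {
        int    *data;
        size_t  len;
    } vec_t;

    vec_t *vec_new(size_t len)
    {
        vec_t *v = malloc(sizeof *v);
        if (v == NULL)
            return NULL;
        v->data = calloc(len, sizeof *v->data);
        v->len  = len;
        return v;
    }

    int vec_get(const vec_t *v, size_t i)
    {
        assert(i < v->len);            /* bounds check on every read  */
        return v->data[i];
    }

    void vec_set(vec_t *v, size_t i, int x)
    {
        assert(i < v->len);            /* bounds check on every write */
        v->data[i] = x;
    }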

Both methods work just as well (i.e. they work if you use them, and don't work if you don't use them), but the OS-level option adds a lot more overhead.

The realistic approach to eliminating buffer overruns is not to write word processors and web browsers in a language which was designed for writing an OS kernel and device drivers. If arrays were a distinct type with both a start and an end (to allow bounds checking), and pointer arithmetic were impossible (or at least not actively encouraged), buffer overruns would be an obscure theoretical issue rather than an everyday occurrence.

Reply to
Nobody

I'm with you up to this point.

But this is a separate issue.

If you have W^X (write or execute but not both), code injection is impossible, but that isn't the only type of buffer overflow exploit (although it's probably the most powerful).

They aren't all that massive. Most programs don't need executable stack/heap, and don't care about exactly where particular memory regions are mapped.

Most of the code which does care tends to be in a handful of programs and libraries. IIRC, implementing W^X on Linux required some changes to the signal-handling code and hardly broke any binaries (except for emulators, and code written in Objective-C, which uses thunks quite extensively).

The problem here is that the OS doesn't know where one buffer ends and another begins. Intra-process buffer overruns are primarily an issue with the language.

In C, an array is represented by its start address; bounds checking is the responsibility of the programmer. That isn't necessarily a bad decision for a language which was meant to be one step above assembler, but it doesn't make sense for writing applications.
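(For instance, a deliberately unsafe fragment, not from the original post: the array decays to a bare pointer, the length is simply gone, and the copy overruns buf whenever strlen(name) >= 16.)

    #include <string.h>

    void greet(const char *name)
    {
        char buf[16];
        strcpy(buf, name);   /* no bounds check: overruns buf if strlen(name) >= 16 */
    }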

Reply to
Nobody

On Sep 17, 7:55 pm, John Larkin wrote: [....]

In every other area of endeavour humans make mistakes, and yet we seem surprised when programmers do too.

I think there really is a fundamental limitation: the programming effort needed to make a large system bug-free becomes effectively infinite. We do seem to be able to make bug-free small systems, however.

This suggests rephrasing your point as "it is better to use multiple simple systems, connected in some way" rather than just calling it multiple cores or CPUs.

Very complex hardware is likely to have the same problems as very complex software. We need to think of ways to use many copies of much simpler hardware.

This is exactly the path of one CPU per process, or perhaps even N CPUs per process, where N > 1.

Reply to
MooseFET

OK, rephrase it. Then start making the kinds of chips and OS's that I suggest.

That's what I proposed: arrays of simple RISC machines, a smattering of more powerful CPUs or floating-point units, all on a chip around a central cache.

But we don't need performance. We need simplicity and reliability.

John

Reply to
John Larkin

In most other areas of endeavour, small tolerance errors do not so often lead to disaster. Boolean logic is less forgiving. And fencepost errors, which even the best of us are inclined to make, are very hard to spot. You see what you intended to write and not what is actually there. Walkthroughs and static analysis tools can find these latent faults if budget permits.

Some practitioners, Donald Knuth for instance, have managed to produce virtually bug-free non-trivial systems (TeX). OTOH the current business paradigm is ship it and be damned. You can always sell upgrades later. Excel 2007 is a pretty good current example of a product shipped way before it was ready. Even Excel MVPs won't defend it.

Software programming hasn't really made the transition to a hard engineering discipline yet. There hasn't been enough standardisation of reliable, off-the-shelf software components, equivalent in complexity to electronic ICs, that really do what they say on the tin and do it well.

By comparison, thanks to Whitworth & Co., mechanical engineering has standardised nuts and bolts out of a huge arbitrary parameter space. Nobody these days would make their own nuts and bolts from scratch with randomly chosen pitch, depth and diameter. Alas, they still do in software :(

How about calling them modules. Re-usable components with a clearly defined and documented external interface that do a particular job extremely well. NAGLIB, the IJG JPEG library or FFTW are good examples although arguably in some cases the external user interface is more than a bit opaque.

And the very complex hardware is invariably designed using software tools. The problem is not that it is impossible to make reliable software. The problem is that no-one wants to pay for it.

In hardware chip design the cost of fabricating a batch of total junk is sufficiently high and painful that the suits will usually allow sufficient time for testing before putting a new design into full production. Not so for software where upgrade CDs and internet downloads are seen as dirt cheap.

For N

Reply to
Martin Brown

It is certainly true that it matters little whether a process is delayed because it is swapped out of the cpu, or because the cpu it is running on has slow access to memory. But unless your new architecture is an improvement in speed (it is unlikely to be more power-efficient or cheaper, and it is not inherently more reliable), there is no point in making it.

There is no reason to suppose your massively multiple core will be faster. Your shared memory will be a huge bottleneck in the system - rather than processes being blocked by limited cpu resources, they will be blocked by memory access.

You also seem to be under the impression that context switches are a major slowdown - in most cases, they are not significant. On server systems with two or four cores, processes typically get to run until they end or block for some other reason - context switches are a tiny fraction of the time spent. If you want a faster system serving more web pages or database queries, you have to boost the whole system - more I/O bandwidth, more memory bandwidth (this is why AMD's devices scale much better than Intel's), more memory, etc. Simply adding extra cpu cores will make little difference beyond the first few. For desktops, the key metric is the performance on a single thread - dual cores are only useful (at the moment) to make sure that processor-intensive threads are not interrupted by background tasks.

For almost every mainstream computing task, it is more efficient to use fewer processors running faster (although it is seldom worth getting the very fastest members of a particular cpu family) - you can get more work out of 2 cores at 2 GHz than 4 cores at 1 GHz. In a chip designed around many simple cores, each core is going to be a lot slower than a few optimised fast cores can be.

If you shift the complexity to hardware, you'd get hardware that is expensive and buggy.

Have you anything to back up this belief in cheap and reliable hardware? Certainly some hardware is cheap and reliable, being relatively simple

- but the same applies to software.

The Cell is a specialised device - only the one "master" cpu can run general tasks. The eight smaller cpu's are only useful for specialised dedicated tasks (such as the graphics processing in games). This is precisely as I have described - massively multi-core devices exist, but they are only suitable for specialised tasks.

Moore's Law is not like the law of gravity, you know. You can't quote it as "proof" that a simple solution to your shared cache problem will be developed!

It's just that the costs of finding and fixing errors in hardware are so much higher than for software that more effort goes into getting it right in the first place. But the result is that a given feature is vastly more expensive to develop in hardware than in software. A hardware version of the Vista kernel may well be more reliable than the software version - but it would take several centuries to design, simulate and test, and cost millions per device.

At the moment, I've got 51 processes and a total of about 476 threads (it's the number of threads that's important here) on my XP-64 desktop, excluding any that task manager is not showing. There is a svchost.exe service with 80 threads on its own, and firefox has 24 threads.

Reply to
David Brown

Compare a software system to an FPGA. Both are complex, full of state machines (implicit or explicit!), and both are usually programmed in a hierarchical language (C++ or VHDL) that has a library of available modules, yet FPGAs rarely have bugs that get to the field, whereas most software is rarely ever fully debugged.

So, computers should use more hardware and less software to manage resources. In fact, the "OS kernel" of my multiple-CPU chip could be entirely hardware. Should be, in fact.

Yes. The bug level is proportional to the ease of making revisions. That's why programmers type rapidly and debug, literally, forever.

Yes. So let's use that horsepower to buy reliability.

John

Reply to
John Larkin

That would be an absurd setup. There is some justification for wanting multiple simple cores in server systems (hence the Sun Niagara chips), but not for a desktop system. The requirements for a disk controller, a browser, and Doom are totally different. With a few fast cores like today's machines, combined with dedicated hardware (on the graphics card), you get a pretty good system that can handle any of these. With your system, you'd get a chip with a couple of cores running flat out (but without a hope of competing with a ten year old PC, as they could not have comparable bandwidth, cache, or computing resources in each core), along with a few hundred cores doing practically nothing. In fact, most of the cores would *never* be used - they are only there in case someone wants to do a few extra things at the same time since you need a core per process.

Until you can come up with some sort of justification, however vague, as to why you think one cpu per process is more reliable than context switches, this whole discussion is useless.

Do you have any hints of a suggestion that anyone else thinks this is the case?

Reply to
David Brown

...

How do you know which is the bad one?

Thanks, Rich

Reply to
Rich Grise

If a drive is bad it complains at bios/boot time. We're running XP, which unfortunately doesn't have the online RAID utilities, but the bios seems to do everything we really need.

John

Reply to
John Larkin

Sounds like you have things under control. Just out of curiosity-- do you have an "IT guy" or are things kept running by real engineers?

Best regards, Spehro Pefhany

--
"it's the network..."                          "The Journey is the reward"
speff@interlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
Reply to
Spehro Pefhany

Late at night, by candle light, MooseFET penned this immortal opus:

Here's a good example:

formatting link

- YD.

--
Remove HAT if replying by mail.
Reply to
YD

No, most bad programmers these days first learn C.

The first non-assembly language I used was Basic-Plus, and I still program in PowerBasic, and assembly!

There's nothing wrong with Basic, especially the modern versions. Given an adequate language that doesn't positively force bad habits, the programmer is what matters.

Interesting, but I don't really design a program first, other than some rough notions; I start coding it bottom-up, and design the structure along the way. What matters is the final product, which I get to by lots of reading and re-writing until it's perfect. Works for me.

I do write the manuals first.

Read "Dreaming in Code" by Scott Rosenberg.

John

Reply to
John Larkin

That may be true, but it is no reason to throw away a drive when it has the lowest probability of failing in favor of one with a higher likelihood.

If they're all identical, you better hope it's a good design/lot. I think I have what must be the last IBM 75GXP (a.k.a. "Death Star") on the planet that still works. I don't trust it though.

--
  Keith
Reply to
krw

What's the matter Dimmie? Did mommy need her underwear back?

--
  Keith
Reply to
krw

Static analysis tools can only find some bugs. Some code has to be stepped through to see if it ever gets stuck or goes into a loop. I'm thinking of things like:

while (X > 1) { if (X % 2 == 0) X = X / 2; else X = 3 * X + 1; }

It is really hard to see whether for some values of X this sticks in a loop or not.

Given the right motivation, I suspect that quite a few programmers could do it. The problem is partly the one you point out below and partly that those guys already have jobs. There is a lot of quite good code being written. It doesn't get noticed because of the mountains of crap it is hiding in.

I have always worked in an environment where bugs are not allowed. I don't have a perfect record, but I'm sure that my rate of making bugs is way lower than that of the average programmer in an environment where bugs are allowed. Practice helps.

... and further: In a lot of ways, we need better languages. Back in the days of DOS I was helping someone fix a program that used a library for working with the serial port, and by the time we got done we no longer used the library. I don't think there was really much wrong with the library; it was just that we couldn't figure out how to make it do what was needed. It had a thick book full of documentation of the dozens of functions it contained. Going cover to cover several times, we simply couldn't find the needed routines or how to call them.

We needed a "get the next character from COM1 or any change in the modem status in the order they happened please" function.

Perhaps not today, but back in 1998 someone in China was making his own bolts etc. They were threaded with an odd pitch: roughly the typical US sizes rounded off to the nearest metric thing his lathe could do. He was threading into plastic; the standard metric threads wouldn't hold, and he couldn't find the right sort of insert.

I like the idea of modules. Maybe we could have a programming system that is more like designing analog hardware. You indicate where the data goes, putting down the modules you need and wiring them up.

Have you ever played with "artsbuilder"? It is sort of what I'm thinking of.

I suspect that we are paying for it many times over. It is just that people can't see the money going out of their pocket.

Also: "software is magic. Hardware has all those transistor thingies in it."

[...]

Did you ever work on an Intel Series 4 development system? It was multiprocessor and multiuser; in both cases "multi" meant 1.5. It had an 8086 and an 8080 in it. If the user on the 8086 started a compile, the 8080 froze up.

You obviously have seen how fast I typed all this :)

I rarely need all of the power of my PC. Sometimes, though, a run on LTspice will take all night. I rarely press the gas all the way to the floor in my car, either.

Reply to
MooseFET
