The value of a CS education

The usual reason for locking threads/processes to a specific processor in a multiprocessor system is to avoid task-switching overhead. Especially with large processor-specific private caches, if the task could freely jump to any free processor, the caches and virtual-memory address-translation tables would have to be reloaded each time.

This can be avoided by locking the task to one CPU, even if it might have to wait slightly for a higher-priority task locked to the same CPU.
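
For what it's worth, on Linux that sort of pinning can be done explicitly; a rough sketch using sched_setaffinity() (the CPU number is arbitrary):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    /* Pin the calling thread to one CPU so its cache/TLB state stays warm. */
    int pin_to_cpu(int cpu)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        if (sched_setaffinity(0, sizeof set, &set) != 0) {  /* 0 = this thread */
            perror("sched_setaffinity");
            return -1;
        }
        return 0;
    }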

Reply to
upsidedown

On a sunny day (Sun, 02 Oct 2011 15:25:06 -0700) it happened John Larkin wrote in :

Yes, many. They worked, and IIRC the quote was 'This is exactly what we wanted'. Some million-dollar projects with spinoffs.

Why not write a DirectX-11 driver for Atoms in BASIC and sell it to Intel?

BTW, did you ever design digital multimeters? I killed 2 within a month. Those had a 1000 V DC range; that should mean you can put 1000 V DC on it, no? But 800 V killed both (from different manufacturers). Ordered 2 new ones on ebay... ebay nr 190452347958. Cheaper than a good screwdriver...

Reply to
Jan Panteltje

Agreed, but sacrificing another cache line for the pointer to pointers is still an unnecessary complication. And pointer-aliasing issues often make advanced compiler optimisation nigh on impossible.

This is gratuitous C-style micromanagement. There are almost no modern compilers that cannot do strength reduction on loops automatically, and several CPUs can do multiplies fast enough for it not to matter.

I very nearly wrote it in the indexed form, value[r*MAXCOL+c] = r*c, instead of the C hacker style, *tvalue++ = r*c.

To make the original point: most compilers will generate *exactly* the same binary code for both of these inner-loop expressions. The advantage of avoiding pointers is that the optimising compiler no longer has to worry about aliasing of partially overlapping arrays.
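
For reference, the two forms side by side, as a sketch (array size and names assumed for illustration):

    #define MAXROW 100
    #define MAXCOL 100

    int value[MAXROW * MAXCOL];

    void fill_indexed(void)                  /* indexed form */
    {
        for (int r = 0; r < MAXROW; r++)
            for (int c = 0; c < MAXCOL; c++)
                value[r * MAXCOL + c] = r * c;
    }

    void fill_pointer(void)                  /* pointer "hacker" form */
    {
        int *tvalue = value;

        for (int r = 0; r < MAXROW; r++)
            for (int c = 0; c < MAXCOL; c++)
                *tvalue++ = r * c;
    }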

Only the 5-20% of code where the program spends all its time is worth optimising, and even then profile-directed compilers can do a better job of keeping the various pipelines from stalling than most humans.

It is pretty obvious when demand paging gets out of hand and thrashing ensues. Most people don't even think about it, as the mathematics of a transpose is pretty simple. The practical implementation of a fast algorithm to do it on a given hardware platform is not.
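
As a rough sketch of what a cache-aware implementation tends to involve, a blocked transpose (block size and names illustrative):

    #define BLOCK 32   /* tile edge, tuned to cache size */

    /* Transpose an n x n matrix in BLOCK x BLOCK tiles so that both the
     * source and destination lines stay cache-resident while in use. */
    void transpose_blocked(double *dst, const double *src, int n)
    {
        for (int ib = 0; ib < n; ib += BLOCK)
            for (int jb = 0; jb < n; jb += BLOCK)
                for (int i = ib; i < ib + BLOCK && i < n; i++)
                    for (int j = jb; j < jb + BLOCK && j < n; j++)
                        dst[j * n + i] = src[i * n + j];
    }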

It depends a lot on what is being done. The choice of the correct algorithm, O(N log N) rather than O(N^2) for instance, is computer science. And the situations where you choose the other deliberately need to be understood.
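
A rough sketch of one everyday case where the O(N^2) algorithm is the deliberate choice (the cutoff value is only illustrative):

    #include <stdlib.h>

    #define CUTOFF 16   /* illustrative crossover point */

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* O(N^2), but with tiny constants -- wins for short arrays. */
    static void insertion_sort(int *a, size_t n)
    {
        for (size_t i = 1; i < n; i++) {
            int key = a[i];
            size_t j = i;
            while (j > 0 && a[j - 1] > key) { a[j] = a[j - 1]; j--; }
            a[j] = key;
        }
    }

    /* Deliberately pick the "worse" algorithm when N is small. */
    void sort_ints(int *a, size_t n)
    {
        if (n < CUTOFF)
            insertion_sort(a, n);
        else
            qsort(a, n, sizeof a[0], cmp_int);
    }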

When things get bigger problems of scaling become a lot more apparent.

Blame marketing for adding zillions of unused and sometimes unusable features to match some tick-list of features in reviews.

Again, which approach is appropriate depends on what you are doing.

That isn't so hard to do at a university.

Computer science is arguably a distinct branch of applied mathematics and logical reasoning: designing algorithms to solve problems.

That sounds like overkill.

Two CPUs is trivial, three is still easy. Four CPUs suddenly gets a lot more difficult. I was severely burned by being on the receiving end of a prototype 4-CPU machine that the manufacturer had not fully debugged.

Software development is always looking for the silver bullet that will allow people to just code what they want without ever understanding what they are doing. There are plenty of slimy salesmen out there selling such products too.

The key point is that a couple of man-months is a typical CS project size, and it is also a size where anyone halfway decent can hack something together and have it work. That approach does not scale at all in the large.

Not just unique, but they seem never to look at the prior art. It is even more annoying when the USPTO grants patents on obvious "new" software inventions that have been known about for donkey's years.

Knuth's books also have an astonishingly low defect rate. ISTR he offers 2x the reward for each newly reported error. I don't know what the score is up to these days, but he did send us a nice postcard (but no money) when we found an already-known one.

I know it is popular to bash Microsoft, and indeed I do it myself, but they do have good technical people working for them and have advanced the state of the art in numerous ways. The annoying thing is that the tools which should be being introduced in computer science courses are entirely absent from their educational software offerings and only included in the vastly overpriced corporate versions.

Should be 1980s solutions. But to be fair they do manage to multitask reasonably well these days. Their most annoying application by far is Word 2007, which manages to combine arbitrary complexity and unreliability with unusability to produce something where what you see is never what you get. One of my printers, used from Word 2007, will only print on A3 paper the second time around if I select 6x4 postcard size (go figure)!

Regards, Martin Brown

Reply to
Martin Brown

Pointers are *always* potentially dangerous and make life a lot more difficult for optimising compilers. They should be discouraged in modern computing. Pointer based buffer overruns are responsible for most of the key vulnerabilities of Windows.
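
The classic shape of the problem, as a minimal sketch (nothing to do with any particular Windows code, obviously):

    #include <stdio.h>
    #include <string.h>

    void risky(const char *user_input)
    {
        char buf[16];
        strcpy(buf, user_input);              /* overruns buf for long input */
    }

    void safer(const char *user_input)
    {
        char buf[16];
        /* never writes past the buffer, always NUL-terminates */
        snprintf(buf, sizeof buf, "%s", user_input);
    }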

That must be some incredibly prehistoric archaic BASIC you have.

There are BASIC dialects with richer syntax than the ancient ancestor.

You don't get 80-bit floats in Microsoft C/C++ any more :( Damn nuisance that is; they assume all users are morons.

PowerBASIC is kind of cute. But it isn't the solution to all the world's problems, as John seems to think it is.

This actually proves you either have poor C programmers or a misconfigured C compiler. There is no reason why any modern compiler will not generate code of roughly equivalent speed for either indexed or pointer-based source. The catch is that C programmers tend to try to micromanage things with pointers, and that fouls up the final stages of the global code optimiser: it has to avoid taking risks that pointers are acting on overlapping parts of arrays.
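
A sketch of the aliasing issue, and the standard way to promise the compiler it isn't there (C99's restrict; names are illustrative):

    /* Without a guarantee that dst and src never overlap, the compiler has
     * to reload src[i] defensively; the restrict qualifiers give it that
     * guarantee, so it is free to vectorise and reorder the loop. */
    void scale(double * restrict dst, const double * restrict src,
               double k, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] = k * src[i];
    }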

The commercial world does not invest in reliable code. They want quick code with lots of drones available to write it, and that gets you C. These days the compilers are C/C++ compilers, but only a feeble subset of the language is being used. At least now most compilers do not need a Lint run against the code to eliminate all the typos and gross programmer errors.

I was once unlucky enough to have to develop on an early C compiler that would crash disastrously if asked to compile code containing certain types of invalid syntax, taking the edited source code with it.

Regards, Martin Brown

Reply to
Martin Brown

FORTRAN and COBOL weren't government projects. F. came out of a team at IBM led by John Backus. The COBOL spec was written by a committee of SHARE, an IBM users' group, and also first implemented by IBM. (Source: Bashe et al., "IBM's Early Computers", MIT Press, 1986, which coincidentally I'm in the middle of re-reading. Great book.)

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC
Optics, Electro-optics, Photonics, Analog Electronics

160 North State Road #203
Briarcliff Manor NY 10510
845-480-2058

hobbs at electrooptical dot net
http://electrooptical.net
Reply to
Phil Hobbs

I have never heard AT&T called communist before. And it was the business world that really took up C as its tool of choice. Market forces do not often give the optimal technical solution (or we would all be using Sony's Betamax video format and IBM's OS/2).

Various niche markets exist for reliable and safety critical compilers and tools but the extensive training needed to use them correctly pretty much ruled them out for most commercial software.

IBM's compiler offerings at the time were risible. A FORTRAN G compiler that was so uncertain of its own capabilities that it would report at the end of a successful compilation:

NO DIAGNOSTICS GENERATED?

The "?" was a trailing NUL, and the procedure for reporting the bug and getting a change made was so arduous that it was never fixed.

Combine that with the "ship it and be damned" attitude of senior management and you have a pretty good recipe for where we are today.

Regards, Martin Brown

Reply to
Martin Brown

Depending on the implementation, there are libraries for that. OORexx can open queues natively, iirc, and there are also libraries like RxSock.

Both of the above, plus connecting to stdin and stdout directly from the script, and driving the external programs that way. That sort of job is one of the things REXX was designed for. It's also designed as an embedded scripting language to use inside other apps. My favourite text editor, X2, uses REXX scripting. Dead useful.

It's a _script_. If you want that sort of stuff, there's always Python--everything is a string in REXX. There are two classes of string comparison operators--ones that ignore whitespace and ones that don't. That's pretty useful, and is one consequence of the "principle of minimal surprise".

Rexx doesn't. Just quotation marks for strings, and commas for function arguments and line continuation. You can put semicolons at the line ends if you like, but they aren't required.

If doing symbolic math with Mathematica is just reformatting, then so's mine. Otherwise not. The script is more like a geometry compiler and optimizing supervisor, with a math program thrown in.

Could be. The manual is written in WP for DOS, and there are some peculiarities when running it under DOSBox. One of these days I'll re-do it in LaTeX, but not today. (I had to make that conversion with my book manuscript for the second edition, and it was *not* fun.)

'Easy' is a relative term. For instance, I have a serial port class that I've used in instrument control code for donkey's years. It has a high priority thread encapsulated in it, blocking on a read of the port, so that the buffer never overruns and the data are always fresh. That makes the app simpler to write and a great deal simpler to debug.
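
A very rough sketch of that pattern (POSIX flavour; this is not the actual class, and the priority value is arbitrary):

    #include <pthread.h>
    #include <sched.h>
    #include <string.h>
    #include <unistd.h>

    typedef struct {
        int             fd;          /* already-opened, configured port */
        pthread_mutex_t lock;
        char            latest[256]; /* most recent chunk read */
        ssize_t         latest_len;
    } SerialReader;

    /* Blocks on the port, so the hardware buffer is drained promptly and
     * the application thread always sees the freshest data. */
    static void *reader_thread(void *arg)
    {
        SerialReader *sr = arg;
        char buf[256];
        ssize_t n;

        while ((n = read(sr->fd, buf, sizeof buf)) > 0) {
            pthread_mutex_lock(&sr->lock);
            memcpy(sr->latest, buf, n);
            sr->latest_len = n;
            pthread_mutex_unlock(&sr->lock);
        }
        return NULL;
    }

    /* SCHED_FIFO usually needs elevated privileges; priority is illustrative. */
    int start_reader(SerialReader *sr, pthread_t *tid)
    {
        pthread_attr_t attr;
        struct sched_param sp = { .sched_priority = 10 };

        pthread_mutex_init(&sr->lock, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        pthread_attr_setschedparam(&attr, &sp);
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        return pthread_create(tid, &attr, reader_thread, sr);
    }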

The FDTD simulator is in C++ because it allows me, for instance, to have completely different updating equations for different materials without the rest of the program needing to know or care. It computes a strategy, by going through every sugar cube in the crate and seeing if the material is the same as the previous one in the X direction. If so, it adds it to the list and presses on. If not, it starts a new object for the next sequence. Each of those objects has pointers to its nearest neighbours and a set of updating equations, so the inner loop just goes through a list of, say, a few hundred thousand rows of 100 cells, telling each row to update itself.
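
In rough outline, the grouping pass looks something like this (a sketch only; the names and layout here are hypothetical, not the simulator's own):

    /* One run of cells sharing a material along X; the real objects also
     * carry pointers to neighbours and their own updating equations. */
    typedef struct {
        int x0, y, z;     /* where the run starts */
        int len;          /* cells in the run     */
        int material;     /* shared material id   */
    } Row;

    /* Walk the grid in X order, starting a new Row whenever the material
     * changes; returns the number of rows, or -1 if rows[] is too small. */
    int build_rows(const int *material, int nx, int ny, int nz,
                   Row *rows, int max_rows)
    {
        int nrows = 0;

        for (int z = 0; z < nz; z++)
            for (int y = 0; y < ny; y++)
                for (int x = 0; x < nx; x++) {
                    int m = material[(z * ny + y) * nx + x];
                    if (x == 0 || m != rows[nrows - 1].material) {
                        if (nrows == max_rows)
                            return -1;
                        rows[nrows++] = (Row){ x, y, z, 1, m };
                    } else {
                        rows[nrows - 1].len++;   /* extend the current run */
                    }
                }
        return nrows;
    }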

That winds up being about three times faster than the usual approach, which is to put a big switch statement inside a triply-nested loop that iterates over every sugar cube. It's also easier to understand and to debug. It also made it much easier to take a program designed for a single SMP (symmetric multiprocessor) box and make it run on a Linux cluster--I had to redo the synchronization scheme a bit, but the communications between processors were already object-oriented, so I just had to put network sockets inside. (That's where the Windows version scales better than the Linux one--the stupid scheduler doesn't let me tell it that the thread embedded in the Surface class needed higher priority than the compute threads.)

The OO could have been a bit cleaner in the beginning--for efficiency, I had each processor peeking into its neighbour's memory to get the nearest-neighbour field values, which isn't practical to do over TCP/IP due to latency, so there was a bit more copying involved.

There were one or two places where I got a bit too clever, e.g. the way the updating equations inherit their properties. I'd do that part differently today, but this was part of the point of writing it myself. (The main point, of course, was that it has capabilities I couldn't buy at the time (2004-06) and some that I still can't, e.g. the ability to optimize on everything.)

It depends on how much time you spend in a given domain. Writing code for phones, I could easily imagine that not paying back effort spent on OO. Simulations and instrument control are wonderfully natural places for it.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC
Optics, Electro-optics, Photonics, Analog Electronics

160 North State Road #203
Briarcliff Manor NY 10510
845-480-2058

hobbs at electrooptical dot net
http://electrooptical.net
Reply to
Phil Hobbs

I have that! Yes, great book. It took IBM a long time to figure out what a computer is.

I wonder sometimes, if one of us time-traveled back to, say, 1940, knowing what we know, what sort of computer would we design with tubes and such? It would look a lot different from the things they did in those days. Something like an MC6800, maybe?

John

Reply to
John Larkin

We'd say, "Holy crap, that's a lot of parts!". I bet you could make a one-tube flipflop using secondary emission in a tetrode, which would be sort of fun (if you could get tetrode receiving tubes). All the level shifting would be a chore.

Using diodes for logic and tubes for amplifiers was a pretty significant advance over pure tube logic. Just splitting open a diode and putting a third wire on it would make a bit of a splash in 1940. ;) IBM made transistors exactly that way during the lab investigation phase, later on of course. Fun.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC
Optics, Electro-optics, Photonics, Analog Electronics

160 North State Road #203
Briarcliff Manor NY 10510
845-480-2058

hobbs at electrooptical dot net
http://electrooptical.net
Reply to
Phil Hobbs

I buy Flukes. They seem pretty tough. The cheap ones aren't.

John

Reply to
John Larkin

I used Fortran G and Fortran H as an undergraduate, in 1980-81. (I had an undergraduate research assistantship with Prof. Bill Shuter, who was a millimetre-wave radio astronomer--I learned FORTRAN by debugging somebody else's radiative transfer code that I didn't understand.)

It seemed to work fine for me, but of course I was far from expert.

Apart from its typographical peculiarities, what didn't you like about it?

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC
Optics, Electro-optics, Photonics, Analog Electronics

160 North State Road #203
Briarcliff Manor NY 10510
845-480-2058

hobbs at electrooptical dot net
http://electrooptical.net
Reply to
Phil Hobbs

Maybe.. they had a lot of constraints that have disappeared. If every transistor cost $50 and had a 3% failure rate per year (or whatever it was), and the computer was intended only for computing ballistics, we might end up doing exactly the same thing except for the pain in "unlearning" what we know now.

Reply to
Spehro Pefhany

There's a circuit called a phantastron, which can make flipflops and one-shots from a single pentode. It's not secondary emission, just some funny electron trick with the grids, I think.

One could make a microcode store by weaving wires through magnetic cores. I did that once, actually.

Somewhere in the RadLab books is the casual statement that "a semiconductor triode should be possible."

John

Reply to
John Larkin

On a sunny day (Mon, 03 Oct 2011 08:22:57 -0700) it happened John Larkin wrote in :

Yes, I have used Fluke. But buying 2 every month seems a bit wasteful. They probably are better. The first meter arced somehow, and then the chip failed and started drawing lots of current from the battery; I blamed myself, 'must have touched something wrong'. The second one I soldered into the circuit, and it was on the whole time, way below 1 kV on the 1 kV range, and all of a sudden it showed over-range. I tested the circuit and the voltage was normal. It now shows over-range on any other range too, and 0 Ohm on any Ohms range. I had the chip (old ICL76??) in a socket, so I changed the chip (still had one), and same thing. Looks like a resistor went high... OK, the thing is more than 10 years old. Hope the new ones from ebay are better; at least those have an AC current range. Somebody here recommended those ebay ones.

Reply to
Jan Panteltje

Hi Martin,

[*much* elided!]

It's obvious on, e.g., machines with rotating media -- you can *hear* the disk beating itself to pieces. OTOH, when the paging is "silent" and slowly creeps up on the application (i.e., while you are developing it and not noticing that it's paging more and more as your code tries to *do* more and more), it can be insidious.

Only when things get truly outrageous do you go looking for a cause.

E.g., I once looked at a piece of code that was performing incredibly badly given the triviality of its function. Long story short: it was sizing files by reading the file and *counting* the bytes! As files inevitably grew larger, more and more time was wasted pulling bytes off platters, counting them -- and DISCARDING them!

Apparently, the bytes were actually *examined* at one point in a past version and the "counting" was just a "free" way to size the file instead of making yet another call on the filesystem/dirent. When that "examination" process was later eliminated, the counting code was left -- since the rest of the application still needed to know the file's size. No one ever rethought *how* that size was determined!

Technically, this wasn't a "bug" -- the code was producing the results that it *should*. It was just insanely ludicrous in its inefficiency!
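
(For the record, the fix amounts to asking the filesystem rather than counting bytes -- e.g., a sketch for a POSIX system:)

    #include <sys/types.h>
    #include <sys/stat.h>

    long file_size(const char *path)
    {
        struct stat st;

        if (stat(path, &st) != 0)
            return -1;               /* missing file, no permission, ... */
        return (long)st.st_size;     /* size in bytes, nothing read */
    }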

Exactly. It's the education that teaches you to go looking at these metrics. Too many "programmers" just look to see if the "answer" is correct -- without considering the path by which they arrived at it.

(still others are only concerned with whether or not the make(1) finishes! :< )

I think software (and hardware, as appropriate) developers are also complicit. The problem is that it is hard to accurately gauge the complexity/cost of a software project (unless it is *incredibly* similar to one you just finished... same features, same staff, etc.). If you compound that estimation process with demands to estimate the costs of individual *features* (as if they were a la carte entrees), there is no practical way that an intelligent decision regarding the merits of each particular feature can be made.

With hardware, it's a lot easier. You have a pretty good feel for how much you have to beef up a power supply to support an extra load; you know how many extra connectors you will have to add; etc. And, these have relatively simple costs associated with them (unless you are truly trying to do something revolutionary).

I was charged with specifying a new product at a company to replace a product that had been in production for many years. Of course, I talked to the marketing and sales folks to see what they were actually selling (taking real orders for!) and what customers were asking for (things that we offered as well as things that we didn't).

I was met with a deer-in-the-headlights look.

So, I dug through all the past purchase orders for the product line (*paper* records stored in "banker's boxes") and tabulated the various features purchased, dates, pricing history, etc.

When I made my proposal, those same marketing/sales staff complained bitterly about features that I had intentionally removed from the product specification (many of which had significant design consequences for the product). "We *must* have that feature!"

They weren't happy when I could tell them, on the spot (in a meeting of senior staff, sales, engineering, etc.) exactly how many of these features they had sold: "None" in most cases, "one" for a particular feature.

[In that last case, the CEO stared at me down the length of the table and said, "I bet I know exactly which sale that was..." Apparently, no one had ever actually *thought* about what they should be selling but had catered to the whining of the sales staff who wanted *everything* just to increase their chances of a sale -- despite the fact that those features inflated the cost of the product, *probably* decreased its reliability and definitely didn't bring in any sales (except that *one*?) by their presence/availability]

On the flip side, you find places where the engineer (or Engineering) designs the product and adds lots of useless bells and whistles "because they can" -- as if a testament to their own cleverness. And, you end up with a product that only an engineer can use -- but even he *wouldn't*, because it's so impractical!

Yes. But you have to understand that there are different approaches with different costs/performances/efforts -- and, be able to see how those aspects fit into the problem you are trying to solve.

Yes. However, picking it up in OJT is a lot harder and more costly (to the employer/client).

70's. Everyone was playing with their own ideas as to where "computers" would go (in the future). There was *a* "central mainframe" on campus. You could run cards or use it interactively (assuming you had an account -- all of which were metered!). But, there were just so many machines "lying around" that you typically had your choice of *where* you wanted to do your work ("OK, I need access to an Algol compiler... which machines will offer me that?")

Again, it depends on whether you try to treat them as a "processor pool" or as a set of co-operating processors. E.g., I'll often use a single chip MCU as a "smart I/O port"... a glorified set of octal latches! Often, this makes board layout a lot easier: just run I2C/SPI across the board to the MCU -- instead of having to run a real bus to several latches, etc.

The signal processing app mentioned above has me torn between putting in a DSP as the second processor *or* a second instance (or core) of the *first* processor. The former gives me far more performance boost for the signal processing parts of the applications; the latter gives room to migrate *other* (nonDSP) parts of the application into that processor as the need arises.

But this is true of many fields. The goal is always to dumb-down the human requirements of a task. Even if it is the "dumb human" who is making that decision! ("Gee, I don't know ANYTHING about medicine. But, this product claims it will let *me* solve my medical problem without requiring a fancy/expensive medical professional!")

Exactly.

I like the definition of complexity as: something that fits (or *won't* fit) in a single brain. Those "CS projects" aren't complex. You can carry the entire conceptual design in your head easily. It's just a matter of time to get it down onto paper.

But, when you can just barely carry the *outline* of a project in your head and need other "heads" to carry the details of each of the subsystems alongside yours, now you have a problem that requires considerably more planning and design.

If you pick up a pencil and start writing code on Day 1, you either have a trivial project *or* are headed for big-time pain!

IIRC, TeX carries a version number that converges to pi: each new revision ("fix") appends the next digit of pi to the current version number :-/

Clever. OTOH, it allows for an *infinite* number of revisions! Perhaps he hasn't as much faith in his product as he leads us to believe? (a more arrogant/confident approach would have been to use a single digit for the revision level!)

[MS]

ROTFLMFAO! A bit long for a quote but amusingly accurate!

I use *no* MS software (other than XP itself) since they seem to be pretty inept at making *useful* tools. Constant changes to the user interface, help when you don't want it, sluggish performance, etc.

My last MS product was the end of the C line (version 7, 8 or 9?). I got tired of the, "Yup! That sure is a bug! But, we are no longer supporting that version of the product. We'll give you a free upgrade to the next set of bugs in its replacement product! Can I get your mailing address...?"

Could you imagine how long it would take to build a house if the carpenter(s) had to upgrade their *hammers* as often as MS would have them do it? :-(

I use FrameMaker for all my "word processing/DTP" needs. Reasonably speedy. Doesn't create huge "empty" documents (like Word). Intuitive interface. *Documented* file formats. Stable. etc.

I can't say enough *good* about it!

Reply to
Don Y

Many processors don't have "huge caches" to worry about flushing (I do embedded work, almost exclusively). As such, IPC is the bigger cost to worry about.

Unfortunately, it's impossible to tell a "tool" how much data will be flowing between two "tasks". Even if you *write* that tool (e.g., the scheduler/dispatcher).

If you *know* that a certain set of tasks have strong communication ties, then you can win (in a NUMA/NoRMA architecture) if you can force that partitioning at design time by arranging for the tasks to always be "near" their data sources/sinks. This keeps communications paths short (and WIDE!) as well as keeping all this cruft out of the way of *other* tasks -- that have no need for it (nor the cost of its presence!).

Reply to
Don Y

Bwahahaha! Which (cartoon) character was known for that? "Muttley" comes to mind but that can't be right. I recall his breathy chuckle, instead...

(sigh) A-Googling I go...?

Reply to
Don Y

We formally release everything to a company server. Manufacturing gets their docs and BOMs only from released files. We have brutal levels of backup strewn all around California.

Programmers can use a VCS during development, but only clean files are released, starting with rev "A".

John

Reply to
John Larkin

One could make a microcode store by weaving lands on flex circuits, too. IBM did that as well.

Should be!

Reply to
krw

I certainly have, though AT&T isn't the subject.

You bet! It pretty much ruled out software that works, too.

Sure. IBM printed a *lot* of "THIS PAGE LEFT INTENTIONALLY BLANK", too.

Look up "IEFBR14 + bug" some time.

House of cards.

Reply to
krw
