OT: Software bloat (Larkin was right)

- D Yuniskis

Thu, May 20, 2010 6:03 PM

Actually, it's *Bob*... ;-)

I worked at one (small-ish) firm *when* MRP was deployed. The most important aspect of the roll-out was getting everyone onboard -- everyone in the company "went to school" to understand the issues involved.

The problem I see with *all* OTS "business solutions" is they want to impose their business logic on *your* business. They want you to process and use data the way *they* envisioned.

In addition, they impose arbitrary constraints on the data, etc. Is there some reason a "description" has to fit in N characters? Or that a P/N has to fit a certain template? etc.

I think this is a result of misplaced efficiency tradeoffs. I.e., they want to be able to index/search data "quickly"... saving a few micro/milliseconds on each operation. Yet, they are oblivious to the seconds, minutes or even *hours* of "real time" delays that these "efficiencies" cost -- when a user can't locate a particular item (because the item's ABBREVIATED DESCRIPTION was abbreviated inconsistently or used one keyword instead of another, etc.).

This, I think, is an outgrowth of the same sort of ridiculous mindset that people initially bring to *organizing* data. E.g., how many part numbering systems have data embedded *in* the part number that tries to describe the item? (Isn't that the role of the *description* tied to the P/N??) People impose structure on things unnecessarily instead of letting the machine do that on their behalf.

E.g., when I started preparing documents, standards, etc. here, I used a much more commonsense approach: I started with *1* :> (instead of something artificial like 1985-SPEC-SFW-001.1 -- the ".1" being a revision level, etc.) Then, moved on to "2".

Data should largely be free-form -- except where it *can't* :>

This applies to part numbers, object (file) names, etc. Once you start imposing artificial structure, you start forcing things to be "done your way" -- which, typically, exposes some *flaw* in "your way", later (once you are *very* pregnant!)

Put smarts in the system to be able to *understand* the data.

- Joel Koltner

Thu, May 20, 2010 6:35 PM

Yes, so it would seem.

20 and perhaps even 10 years ago, I think there were decent reasons for this with limited memory, hard drive space, etc... but unfortunately the software vendors for this kind of program seem to be very slow to change.

Yep, agreed 100%.

And I think Google has shown that you can effectively index orders of magnitude more data than any one company will ever produce and still get searches down well below one second.

I've tried to discourage embedding lots of information into a part number, although I do see value in trying to get the most basic information (e.g., resistance for resistors) in there. I agree that a lot of people seem to want to embed 3 or more fields into a part number and it rapidly gets to the point where you need a secret decoder ring, though.

We'll occasionally have internal discussions on how to number software revisions, and I've generally advocated that we just stick the build date and time in the code and be done with it (that's what I do with my own software). The tools can do that automatically, so why even worry about whether this is version 1.2.4.3 or if it should be bumped to 1.2.5.0? Many times people forget to bump the revision numbers anyway, and this has been a major headache when, e.g., some guy in testing tries to tell a programmer which revision of the code contains a given bug.

---Joel

- Michael A. Terrell

Thu, May 20, 2010 7:28 PM

The idiot engineering manager we had for a few months at Microdyne wanted to change all stock numbers. His plan was to use the resistance of a resistor as the part number. He threw a hissy fit when I pointed out that we had over a dozen different types of 10K resistors in inventory. It didn't take them long to fire him when they found out how useless he was. I think he was hired by Scott Adams to be Dilbert's boss after that. :(

--
Anyone wanting to run for any political office in the US should have to
have a DD214, and a honorable discharge.

- D Yuniskis

Thu, May 20, 2010 8:19 PM

This extends far beyond just MRP systems. I am always amused when I hear some gung-ho accountant excited (as he heads off to "school") about the new computerized accounting system his company is buying.

I politely and gently tell them they're going to have problems making it FIT their business, etc.

I always get the "Oh, no! We looked at a whole bunch of them. We sat down with the sales rep and reviewed our accounting model, verified it would work in their framework, etc. (Don, you just don't understand *accounting*! Trust me, we KNOW it will work)"

Then, a few months *after* the 6 month start-up phase, they inevitably come out grumbling about how it "doesn't do things right". I then (gently -- remember, *I* don't understand ACCOUNTING! :> ) innocently ask them, "You mean, it *can't* do ________________?"

"Oh, no. It'll *do* it. It just doesn't do it the way *we* do!"

(At which point, I wish I could play back an audio recording of the initial conversation I had re: this exact issue!)

But, my point is, those limits were created by the misplaced ideas of the implementors. They looked at query speed, how long it took to create reports, etc. and said "My, that's just too slow! We'll have to cut some corners to speed it up!". Instead, they should have looked at the role of the system in the organization. Not just from the perspective of the guy running the reports, etc.

It's like when bean counters decide how data should be organized... they think about what *they* need from the data and don't consider the data model as an abstraction that others relate to as well (e.g., how the data gets *into* the system, etc.)

Exactly. The place where I watched it "come in" had to deal with stupid, *naive* descriptions. So, sometimes it was a:

1K ohm resistor fixed 1K R 1000 Res. 1K resistor

etc. So, the MRP system itself was "stupid" -- you *couldn't* find things in it unless you forced all descriptions to be consistent, etc. (this is what they eventually did -- had *one* person responsible for entering new data)
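One way to read "put smarts in the system" for the resistor case: let the search pull a value out of whatever free-form description was typed, rather than forcing every description into one template. What follows is only a minimal Python sketch under that assumption. `resistance_ohms` and `find_by_value` are hypothetical names, the sample descriptions are one reading of the variants listed above, and it handles only plain `1K` / `1000` / `4.7M` style values (not `1K5`, and a package code such as 0603 appearing first would fool it).

```python
import re

# Multiplier for each value suffix; "R" (or no suffix) means plain ohms.
_MULTIPLIER = {"": 1.0, "R": 1.0, "K": 1e3, "M": 1e6}

# A number, optional whitespace, then an optional K/M/R suffix.
_VALUE = re.compile(r"(\d+(?:\.\d+)?)\s*([KMR]?)\b", re.IGNORECASE)


def resistance_ohms(description):
    """Return the first resistance-like value in the text, in ohms, or None."""
    match = _VALUE.search(description)
    if match is None:
        return None
    number, suffix = match.groups()
    return float(number) * _MULTIPLIER[suffix.upper()]


def find_by_value(descriptions, ohms):
    """Return every description whose parsed value equals `ohms`."""
    return [d for d in descriptions if resistance_ohms(d) == ohms]


# Hypothetical stock descriptions, entered inconsistently by different people.
stock = ["1K ohm resistor", "fixed 1K", "R 1000", "Res. 1K", "10K 1% resistor"]
matches = find_by_value(stock, 1000.0)
```

With this, a query for 1000 ohms finds the first four entries however they were abbreviated; the cost is a regular-expression scan per record, which is the kind of "inefficiency" being argued for here.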

Exactly. This is an outgrowth of people wanting to be able to manage the data "in their heads". Or, clinging to "that's how we've ALWAYS done it".

I worked for a large (hand) tool manufacturer when I was a kid. I had an interview with one of the big muck-a-mucks (I think this was one of those "create a job for this kid" deals -- the gentleman was from my alma mater). Very informal. I think he just wanted to "chat with the youngster" :>

He asked me how to *design* a part numbering system (keep in mind, the company probably had 100,000 or more part numbers at the time -- SWAG). I came up with the (uninspired) solution "first digit(s) indicate the type of tool..." etc. I.e., encode information about the part in the P/N. Conversely, let the part determine its P/N!

He smiled ("silly boy"...).

"So, screwdrivers would be '1xxxx', hammers '2xxxx', pliers '3xxxx', etc.?"

"Yes."

"OK, so where do we put *electric* screwdrivers?"

"Well, you could lump those things in '9xxxx'!"

Without the smile leaving his face, he opened his desk drawer and pulled out a tool that was a screwdriver, hammer, pliers and hacksaw all-in-one (I don't recall exactly what it was; but, it was *really* weird -- remember, this is where they *invent* the tools) and said, "How about *this*?"

Now the smile is a broad grin.

"Don, what you will quickly realize is EVERYTHING ends up in the '9xxxx' category! The only rational way of assigning numbers to parts is *sequentially*!"

The important lesson there was not that "part numbers should be insignificant" but, rather, that you need to think through what you are doing -- and why. If you want the number to be significant (why? so you can fabricate it in your head without having to look it up??), then *will* it ALWAYS be significant? Or, will there be exceptions that effectively render your "system" useless -- because you'll never be able to keep track of what the "exceptions" are... so, you WILL have to "look things up".

You can *make* these approaches work. But, if you are honest with yourself, you have to admit that your initial goal is not being satisfied. (and, what do you do if someone else creates P/N's using some different system? i.e., you are better served to have a good *lookup* system than to waste effort trying to enforce structure on "numbers")
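The "sequential numbers plus a good lookup system" idea can be sketched in a few lines. This is an illustration only, not anything the tool company or any MRP package actually ran: the `Catalog` class, its method names and the sample descriptions are all hypothetical, and the "lookup system" here is just a word index over free-form descriptions.

```python
import re
from collections import defaultdict


def _words(text):
    """Split free-form text into lowercase alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


class Catalog:
    """Assigns insignificant, sequential part numbers and indexes descriptions."""

    def __init__(self):
        self._next_number = 1
        self._descriptions = {}          # part number -> description
        self._index = defaultdict(set)   # keyword -> part numbers

    def add(self, description):
        """Store a part and return the next sequential part number."""
        number = self._next_number
        self._next_number += 1
        self._descriptions[number] = description
        for word in set(_words(description)):
            self._index[word].add(number)
        return number

    def find(self, query):
        """Return the part numbers whose description has every query word."""
        words = _words(query)
        if not words:
            return []
        hits = set.intersection(*(self._index.get(w, set()) for w in words))
        return sorted(hits)


catalog = Catalog()
screwdriver = catalog.add("screwdriver, electric")
hammer = catalog.add("hammer, claw")
oddball = catalog.add("combination screwdriver hammer pliers hacksaw")
```

The all-in-one tool simply gets the next number, and `catalog.find("hammer screwdriver")` still locates it; no '9xxxx' category is needed because the number carries no meaning.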

I do that with "documents" -- though usually just the date is sufficient (since documents don't get released that often WITHIN A DAY).

I don't do it with software, though. I've been bitten by clock errors from one machine to the next, etc. (before I had an NTP server running). So, I would look at some code weeks later and see "version Y" stamped with a date previous to "version X".

Now, I just keep a "VERSION" file in the repository and include that in each build (makefile does this). That way, it doesn't matter which machine I am using to build the image (since the VERSION file is in the same place as the sources).
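A rough sketch of that kind of build step, written in Python rather than as a makefile rule. The post does not show the actual rule, so everything specific here is an assumption: the generated file name `version.h`, the macro `BUILD_VERSION`, and the helper names are all hypothetical.

```python
from pathlib import Path


def parse_version(text):
    """Return the version string stored in VERSION, rejecting an empty file."""
    version = text.strip()
    if not version:
        raise ValueError("VERSION file is empty")
    return version


def render_header(version):
    """Render one C preprocessor line that the sources could #include."""
    return f'#define BUILD_VERSION "{version}"\n'


def stamp(source_dir):
    """Read VERSION from the source tree and write version.h beside it."""
    root = Path(source_dir)
    version = parse_version((root / "VERSION").read_text(encoding="utf-8"))
    (root / "version.h").write_text(render_header(version), encoding="utf-8")
    return version
```

Because the stamp comes from a file that travels with the sources, two machines with skewed clocks still produce the same version string for the same checkout.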

If I ever get a second seat for my DTP tools, I will probably have to use a similar approach (currently, there is no issue with clock skew as there is only one DTP seat!)

- John Larkin

Thu, May 20, 2010 8:58 PM

We tested some commercial packages and didn't like them. They clearly didn't understand the electronics business, were slow (usually sat on top of a general-purpose database manager, a hazard in itself) and often had silly per-seat-per-year license rules.

I wrote the skeleton of this myself and we hired a contract guy to do the detail coding. The biggest part wasn't the code, it was inventing and documenting a new part numbering system, re-describing all the parts in stock (close to 5000 of them) and moving/relabeling all the bins. It was worth it, and now we own the source code.

The Brat bird-dogged the project to completion, including browbeating the programmer into doing good work and forcing the engineers to go through those 5000 part records one at a time. I was impressed.

John

- John Larkin

Thu, May 20, 2010 9:10 PM

We spent a goodly amount of time defining how part numbers would be generated. A resistor part number *does* include its type and value, and there are rules for what to do if one 10K 0.1% 0603 resistor is green and another is blue. All the resistors are numerically and physically in order by value.

Everything is also available in ASCII files, so you can write your own Basic or Perl programs to do weird stuff. I recently wrote one to search for resistor ratios using in-stock parts.
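For flavor, a resistor-ratio search of that sort might look like the Python sketch below. It is not John's program: the layout of his ASCII files is not given, so the in-stock values are simply passed in as a list, and `closest_ratios` and the sample values are hypothetical.

```python
from itertools import permutations


def closest_ratios(stock_ohms, target, count=3):
    """Return up to `count` (top, bottom) pairs whose ratio is nearest target."""
    # Drop duplicates and zero-ohm jumpers, which cannot form a ratio.
    values = sorted({v for v in stock_ohms if v > 0})
    pairs = permutations(values, 2)
    return sorted(pairs, key=lambda p: abs(p[0] / p[1] - target))[:count]


# Hypothetical in-stock values, in ohms.
in_stock = [1000, 2000, 4990, 10000, 0]
best = closest_ratios(in_stock, 2.0, count=2)
```

This brute-force version checks every ordered pair, which is fine for a few thousand stocked values; the point is only that plain-text data makes such one-off tools easy to write.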

John

- Michael A. Terrell

Fri, May 21, 2010 1:40 AM

Did she use a whip and a chair, or a pistol? ;-)

- Michael A. Terrell

Fri, May 21, 2010 1:44 AM

They used R for the first character, then a three digit code for the type of resistor, followed by more digits for the value. The biggest problem was the 'E' category which was a catchall category. I even found a Chevy Blazer that was part of the Satellite installer group's equipment listed in the 'E' inventory while trying to find an IC they had used in a test fixture. It took half a day to browbeat engineering into admitting that they had built the fixture out of sample parts from several vendors, and there were no spares.

- D Yuniskis

Fri, May 21, 2010 2:58 AM

Understandable.

This ^^^^^^^^^^ is true of damn near all OTS software solutions, IMO. It *might* work for the folks who wrote it. But, probably won't for anyone else (since no two companies do things the same way)

I don't believe in "part numbering systems" beyond: "How many digits in the P/N?" (see other thread for my discussion).

E.g., how would I even *begin* to assign part numbers to each of the software modules I create?

The biggest mistake *I* made was not giving *everything* a part number (I didn't think certain things would ever need to be "controlled").

- John Larkin

Fri, May 21, 2010 3:06 AM

I'm talking about electronic parts, not software modules. We use a "telephone number", like 123-4567. The first three digits are the class (103- is the class for 0603 ceramic caps for example) and the last 4 digits are the specific part. 103-1300 is the 10 pF one.
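A small Python sketch of how a tool might take such a number apart. Only class 103 and part 103-1300 come from the post; the function names, the lookup table and the "unknown class" fallback are hypothetical, and the digits are kept as strings so leading zeros survive.

```python
import re

# Three digits, a dash, four digits: the "telephone number" layout.
_PART_NUMBER = re.compile(r"(\d{3})-(\d{4})")

# Only the class named in the post; a real table would list every class.
CLASS_NAMES = {"103": "0603 ceramic caps"}


def split_part_number(part_number):
    """Return (class_code, item_code) as strings, or raise ValueError."""
    match = _PART_NUMBER.fullmatch(part_number)
    if match is None:
        raise ValueError(f"not a 3-4 part number: {part_number!r}")
    return match.group(1), match.group(2)


def describe_class(part_number):
    """Return the class description for a part number, if the class is known."""
    class_code, _ = split_part_number(part_number)
    return CLASS_NAMES.get(class_code, "unknown class")
```

For example, `describe_class("103-1300")` yields "0603 ceramic caps", while anything not shaped like 123-4567 is rejected outright.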

We have rules for *everything*.

The best way to control software modules is to not have them: use one monolithic source file for a given project. We do assign a part number and a revision letter to a piece of software, just like any other engineering drawing.

John

- D Yuniskis

Fri, May 21, 2010 4:38 AM

I reuse modules often. It allows me to make "inexpensive" solutions (instead of having to reinvent the wheel each time).

It would be quite hard for me to edit 5-50MB source files! :>

And, incredibly inefficient having to compile all of that in one piece. (besides the drawback of polluting the namespace therein)

- John Larkin

Fri, May 21, 2010 5:05 AM

I have an editor with advanced "cut" and "paste" features.

YIKES! Nothing, not even an OS, should be that big!

:>

My PowerBasic programs rarely take a full second to compile.

What's a namespace?

John

- D Yuniskis

Fri, May 21, 2010 7:26 AM

Sure. But why cut and paste stuff when it already works "as is"? It's documented, there is a test suite for it, etc.

Products require an OS + application(s). I work on *big* projects! :>

But, if you think about it, it is not difficult to have 10MB for a small-ish assembly language program. Assume a line of code generates 1.5 bytes (most generate at least *one* byte of code :> ). Most ASM opcodes are 3 or 4 chars. Putting a tab on each side of it gives us 5 or 6. Put some arguments after it (register list, symbolic address of an operand, etc.) probably adds at least 10 chars. One or more tabs out to a "comment column" followed by 30 or 40 characters of commentary (?) I.e., it's easy to have 50 characters on that line for that 1.5 bytes of generated code.

So, 100KB of code is ~3.5MB (note that we haven't included any other overhead -- like EXTERNs, etc. -- nor any paragraphs of pure commentary).

I just looked at an old Z180 project that I did. Almost 9MB of sources. Looks like it compiled to about 200K. So, that would be 4.5MB/100KB code (roughly in line with my shirtsleeve estimate above).
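The shirtsleeve estimate is easy to replay. The Python sketch below just restates the arithmetic from the post (1.5 bytes of code and about 50 characters of text per source line) and the Z180 figures; the function names are hypothetical, and KB/MB are treated as decimal thousands and millions for simplicity.

```python
def estimated_source_bytes(code_bytes, code_bytes_per_line=1.5, chars_per_line=50):
    """Estimate source size from object size using per-line averages."""
    lines = code_bytes / code_bytes_per_line
    return lines * chars_per_line


def source_to_code_ratio(source_bytes, code_bytes):
    """Return how many bytes of source stand behind each byte of code."""
    return source_bytes / code_bytes


# 100KB of code at 1.5 bytes and 50 characters per line.
estimate = estimated_source_bytes(100_000)

# The Z180 project: almost 9MB of sources compiling to about 200K.
z180_ratio = source_to_code_ratio(9_000_000, 200_000)
```

The per-line assumptions give roughly 3.3 million characters for 100KB of code, a little under the round ~3.5MB figure, or about 33 source bytes per code byte; the Z180 project comes out at 45, so the two are in the same ballpark once EXTERNs and block comments are allowed for.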

My TCP/IP stack is over 100 files and a couple of megabytes of source (C with a tiny mix of ASM).

A quick peek at the NetBSD sources shows ~20,000 files and > 100MB for just the kernel (i.e., the equivalent of "Linux" -- without any of the other OS utilities or applications).

My multimedia OS is about a third of that -- but, it doesn't have to run on 30 different architectures, support umpteen gazillion different I/O boards, etc.

No, it's just not practical to deal with any decent size product as "one file"...

Building my entire OS from scratch ("make clean all") takes several minutes on a 3G machine. Some of this would be sped up if everything was in one giant file (less thrashing around in the filesystem).

But, since I rarely work on more than two files at a time, I can do a typical incremental build in just a few seconds: compile the two files that have changes, link and done.

If you glue all of your modules together into one giant file, then you have to ensure that nothing in one module has the same name as something in *another* module.

E.g., suppose I have a (private/static) scrap routine called "copy()" that is used in "window.c", perhaps to make a copy of a "window". Now assume I want similar functionality for *menus*. So, in "menu.c" I create a (private) routine called "copy()".

Nothing outside of menu.c ever references the "copy()" that is inside menu.c. Likewise, nothing outside of window.c ever references the copy that is inside window.c. I.e., each "copy" is private to the module in which it is defined.

Now, if I glue all of window.c and all of menu.c together into one big file, there will be *two* "copy()" routines. Two identical names in the single file-scope namespace. (i.e., each name within the file has to be unique).

(you can actually work around this -- but, why work around something when it just gives you a less desirable result? Namely, an oversized source file?)

Sort of like having two variables each called "X" and expecting them to have different values and be accessed uniquely.

- Tim Williams

Fri, May 21, 2010 2:27 PM

Hmm, I've got a 70k .asm file (x86, for MASM) which compiles to I think 9k code (plus a few large arrays). Only a ratio of 7? Gee, now I look lazy at commenting...

    ;
    ;   Multiplies a fixed-point DWORD (16.16 format) by a fraction (0.16).
    ;   Input: on stack (DWORD, WORD)
    ;   Output: DX:AX = DWORD product.
    ;
    fixedByFrac PROC near USES bx cx si di fixed:dword, frac:word
        mov     di,wpr fixed+2
        mov     ax,wpr fixed
        mov     bx,frac
        xor     si,si
        test    bx,bx           ; get fraction sign
        jns     @F

Ehh, looks easy enough... maybe I just read asm better? :-p

Tim

--
Deep Friar: a very philosophical monk.
Website: http://webpages.charter.net/dawill/tmoranwms

- Didi

Fri, May 21, 2010 3:13 PM

Hi Don, my average is about 34 characters per line over the years [link] :-). I write comments on every line, headers etc., but there are null lines to compensate for that, I guess (a sort of typical largish source file is [link]). Part of a much larger picture, of course (1 of 46 files which make the vpa assembler (compiler?)).

Into how much object code does that compile? Seems comparable to my tcp/ip stack for dps [link]. What is listed there is just for tcp, ip, udp, ppp - dns and above stuff are elsewhere. And some helper objects which are already moved elsewhere (memory_pool etc., turned out handy and I moved them for more common usage). The object code this list produces is about 230k (for power, but it is written only for power, no backwards 68k compatibility, at least not without some work). Would be an interesting comparison between toolchains and languages, I guess.

Dimiter

------------------------------------------------------
Dimiter Popoff
Transgalactic Instruments
------------------------------------------------------

- Phil Hobbs

Fri, May 21, 2010 3:54 PM

Curses is well named--it _almost_ does what you want, but never quite.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal
ElectroOptical Innovations
55 Orchard Rd
Briarcliff Manor NY 10510
845-480-2058
hobbs at electrooptical dot net
http://electrooptical.net

- Phil Hobbs

Fri, May 21, 2010 3:58 PM

It's sort of nice to be able to look at a part number and see whether it's a capacitor or a BNC connector, though. That doesn't have to have descriptions embedded in the part number, but it does need a bit of thought, e.g. numbers starting with '0' are subassemblies, '1', resistors, '2', capacitors, and so forth. Takes an extra couple of digits but makes life a lot easier.

Cheers

Phil Hobbs

- Phil Hobbs

Fri, May 21, 2010 4:08 PM

Revision control software is a big help for this, and you can archive each release (and every change leading up to it). I use git, which I've become very fond of.

My clusterized EM simulator is nearly 5 MB of .cpp and .h files, not to mention a 500 kB front end script!

Cheers

Phil Hobbs

- D Yuniskis

Fri, May 21, 2010 4:40 PM

I don't think it works, in the long run. And, I think the effort spent trying to figure out *how* to do this (and codifying it and ensuring everyone uses the same rules) is better used getting better descriptions, better search capabilities, etc.

I don't understand the false security that comes from putting information like that in a "name" for an object. E.g., I likewise don't understand "Hungarian notation" for naming variables.

I'm sure glad my parents didn't name me: boy_son_eldest_Don!

I see this sort of behavior in many places. People seem afraid to "let go" of data and trust something else to keep it handy (and safe!) for them!

I see people building "databases" in spreadsheets simply because they fear "not seeing ALL the data". (and I mean *huge* spreadsheets -- hundreds of columns, thousands of rows) When shown how to get the same results with a database, they cringe because "it's magic" -- the numbers "just appear" instead of being displayed in front of them (static) all the time.

I ask them, "How does seeing all of these numbers help you understand them? Do you know, for a fact, that *this* number HERE is correct? Are you sure the formula in this cell hasn't been changed in JUST this cell? Do you regularly examine the formulae?"

(glassy-eyed stare)

Computers are good for two things: doing lots of things fast and remembering stuff. They've got me beat on both counts and I'd be foolish to try to compete with them :>

- Joel Koltner

Fri, May 21, 2010 4:45 PM

I think there was a certain time period, when software was getting pretty big and complex but the tools hadn't caught up yet, when the time Hungarian notation saved in not having to look up, "ok, now what type of object is this again?" could -- for some people -- outweigh the minor hassle of using it in the first place... although I never personally felt the benefit was enough to bother.

These days development tools have so much "intellisense" built-in that I don't think there's much point anymore and (happily) you do see less and less of it. And to a certain extent, if you choose your variable and method names descriptively and write the interfaces well, people shouldn't have to worry about the exact type of your object anyway -- if you have a variable that's of type, e.g., "EmployeeName," it should be safe to assume it can act like a string.

I try to convince people that if you have a schematic sitting in, e.g., c:\Schematics\Top Secret Projects\NSA Spy Satellite\RF Transmitter, it's kinda silly to make the file name "Top Secret NSA Spy Satellite RF Transmitter Schematic.SCH," but not everyone buys into this. (Granted, I'm exaggerating, but you get the idea -- I really do see a lot of "Project Name Functional Name Schematic.SCH" files...)

---Joel