exposing resource usage


Don Y 9 years ago

Is there any potential downside to intentionally exposing (to a task/process/job/etc.) its current resource commitments? I.e., "You are currently holding X memory, Y CPU, Z ..."

*If* the job is constrained to a particular set of quotas, then knowing *what* it is using SHOULDN'T give it any "exploitable" information, should it?
[Open system; possibility for hostile actors]


Robert Wessel 9 years ago

It depends on how paranoid you want to be. If some of the resources the process has access to may be influenced by other actors, that could expose a covert channel.

For example, if one of the bits of information was the number of real pages held by a task, then another task could make storage demands that could get some of the first process's memory paged out (or discarded). Similarly, the first process's CPU usage will be affected by other processes in the system. Wiggle those back and forth, and you can do the equivalent of sending Morse code from one process to another.
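
A toy simulation of that kind of channel, for illustration only. It assumes a hypothetical machine that reports to each task how many physical pages it holds; `ToyMemory`, `send_bit`, `receive_bit` and all the page counts are invented here, and no real OS interface is involved.

```python
class ToyMemory:
    """Hypothetical machine: a fixed pool of physical pages shared by two tasks."""

    def __init__(self, total_pages=100):
        self.total_pages = total_pages
        self.sender_pages = 0        # pages currently hogged by the sender
        self.receiver_wanted = 80    # pages the receiver would like resident

    def resident_pages_of_receiver(self):
        # The "exposed" statistic: pages the receiver actually holds right now.
        free_for_receiver = self.total_pages - self.sender_pages
        return min(self.receiver_wanted, free_for_receiver)


def send_bit(mem, bit):
    # The sender signals a 1 by grabbing memory and a 0 by releasing it.
    mem.sender_pages = 60 if bit else 0


def receive_bit(mem):
    # The receiver infers the bit purely from its own reported usage.
    return 1 if mem.resident_pages_of_receiver() < mem.receiver_wanted else 0


def transmit(bits):
    """Push a bit sequence through the toy channel and return what was received."""
    mem = ToyMemory()
    received = []
    for bit in bits:
        send_bit(mem, bit)
        received.append(receive_bit(mem))
    return received
```

The two tasks never talk to each other directly; the only shared state is the page pool, and the only thing the receiver reads is a statistic about itself.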

OTOH, some of those can be exploited without help from the OS: just time how fast you're executing a loop - if another process also starts running a CPU-bound loop, your rate should roughly halve.
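
A rough sketch of the loop-timing trick. The function names and the 0.75 threshold are assumptions made for the example; on a real multi-core or frequency-scaled machine the measured rate is far noisier than this suggests.

```python
import time


def loop_rate(duration_s=0.05):
    """Return iterations per second achieved by a busy loop over duration_s."""
    count = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


def looks_contended(baseline_rate, current_rate, threshold=0.75):
    """Guess that another CPU-bound task is competing if the rate dropped noticeably."""
    return current_rate < threshold * baseline_rate
```

Typical use would be to record `loop_rate()` once while the machine is quiet, then compare later samples against that baseline with `looks_contended()`.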

OTTH, I've always thought the whole covert channel thing was just a bit *too* paranoid.

Unrelated to the above, I could see some of that knowledge being used to implement a DoS attack - tune your usage right up to the limit to maximize the possible impact on the system.


upsidedown 9 years ago

IIRC, some high security systems require that a process should _not_ be able to determine whether it is running on a single-CPU machine, a multiple-CPU system or in a virtual machine, nor, in any of these cases, what the CPU speed is.

Regarding quotas in more normal systems, quotas are used to keep rogue processes from overclaiming resources. In practice the sum of specific quotas for all processes can be much greater than the total resources available. Thus, a process may have to handle a denied request even if its own quota would have allowed it.

Only if the sum of all (specific) quotas in a system is less than the total available resources should you be able to claim resources without checks, as long as your claim is less than the quota allocated to that process.

But how would a process know whether the sum of quotas is less or more than the resources available? Thus, the only safe way is to check for failed resource allocations in any case.
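
To make the overcommit case concrete, here is a minimal sketch of an allocator whose quotas sum to more than the pool holds. `OvercommittedPool` and its numbers are hypothetical, not taken from any particular OS.

```python
class OvercommittedPool:
    """Toy allocator: per-process quotas may sum to more than the pool holds."""

    def __init__(self, total, quotas):
        self.total = total
        self.quotas = dict(quotas)
        self.used = {name: 0 for name in quotas}

    def overcommitted(self):
        """True if the quotas together exceed what the pool can supply."""
        return sum(self.quotas.values()) > self.total

    def allocate(self, name, amount):
        """Return True on success, False if the quota or the pool is exhausted."""
        if self.used[name] + amount > self.quotas[name]:
            return False  # over the process's own quota
        if sum(self.used.values()) + amount > self.total:
            return False  # within quota, but the pool itself cannot supply it
        self.used[name] += amount
        return True
```

With a pool of 100 units and two quotas of 80, a request from the second process can be refused while it is still well inside its own quota, which is why the caller has to check the result every time.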


Don Y 9 years ago

In practice, that's almost impossible to guarantee. Especially if you can access other agencies that aren't thusly constrained. E.g., issue a request to a time service, count a large number of iterations of a loop, access the time service again...

But, that can be to limit the "damage" done by malicious processes as well as processes that have undetected faults. It can also be used to impose limits on tasks that are otherwise unconstrainable (e.g., how would you otherwise limit the resources devoted to solving a particular open-ended problem?)

Yes, but how the "failure" is handled can vary tremendously -- see below.

That assumes the application is bug-free.

How a resource request that can't be *currently* satisfied is handled need not be an outright "failure". The "appropriate" semantics are entirely at the discretion of the developer.

When a process goes to push a character out a serial port while the output queue/buffer is currently full (i.e., "resource unavailable"), it's common for the process to block until the call can progress as expected.

When a process goes to reference a memory location that has been swapped out of physical memory, the request *still* completes -- despite the fact that the reference may take thousands of times longer than "normal" (who knows *when* the page will be restored?!)

When a process goes to fetch the next opcode (in a fully preemptible environment), there are no guarantees that it will retain ownership of the processor for the next epsilon of time.

When a process wants to take a mutex, it can end up blocking in that operation, "indefinitely".

Yet, developers have no problem adapting to these semantics.

Why can't a memory allocation request *block* until it can be satisfied? Or, any other request for a resource that is in scarce supply/overcommitted, currently?

This is especially true in cases where resources can be overcommitted as you may not be able to 'schedule' the use of those resources to ensure that the "in use" amount is always less than the "total available".
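
A minimal sketch of what "block until it can be satisfied" could look like, using a condition variable. `BlockingPool` is a made-up name, the units are abstract, and the optional timeout is an addition of this example rather than something proposed above.

```python
import threading


class BlockingPool:
    """Resource pool whose acquire() blocks until the request can be met."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.available = capacity
        self._cond = threading.Condition()

    def acquire(self, amount, timeout=None):
        """Block until `amount` units are free; return False if the timeout expires."""
        if amount > self.capacity:
            raise ValueError("request can never be satisfied")
        with self._cond:
            ok = self._cond.wait_for(lambda: self.available >= amount, timeout)
            if ok:
                self.available -= amount
            return ok

    def release(self, amount):
        """Return units to the pool and wake any blocked requesters."""
        with self._cond:
            self.available += amount
            self._cond.notify_all()
```

With `timeout=None` the caller sees the same semantics as taking a mutex: the call simply does not return until the resource is there. The timeout gives a caller a way to bound the wait.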


upsidedown 9 years ago

Exactly for that reason, the process is not allowed to ask for the time-of-day.

There can be many reasons why the Tx queue is full. For instance, in a TCP/IP or CANbus connection, the Tx queue can fill up if the physical connection is broken. In such cases, buffering outgoing messages for seconds, minutes or hours can be lethal when the physical connection is restored and all buffered messages are sent at once. In such cases, it is important to kill the buffered Tx queue as soon as the line fault is detected.
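
One way to sketch that policy, assuming some lower layer calls `on_link_down()` / `on_link_up()` when it detects the fault and the recovery. The class and method names are hypothetical and do not come from any real TCP/IP or CAN stack.

```python
from collections import deque


class TxQueue:
    """Bounded transmit queue that discards its backlog when the link drops."""

    def __init__(self, max_len=16):
        self.max_len = max_len
        self.link_up = True
        self._queue = deque()

    def send(self, msg):
        """Queue msg; return False if the link is down or the queue is full."""
        if not self.link_up or len(self._queue) >= self.max_len:
            return False
        self._queue.append(msg)
        return True

    def on_link_down(self):
        """Line fault detected: drop stale messages and report how many were dropped."""
        self.link_up = False
        dropped = len(self._queue)
        self._queue.clear()
        return dropped

    def on_link_up(self):
        """Link restored: accept new messages again, starting from an empty queue."""
        self.link_up = True

    def pending(self):
        """Messages currently waiting to be transmitted, oldest first."""
        return list(self._queue)
```

Because the backlog is cleared at the moment of the fault, nothing stale is left to burst out when the connection comes back.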

This is not acceptable in a hard real-time system unless the worst-case delay can be firmly established. For this reason, hard RT systems seldom use virtual memory, or at least lock the pages used by the high-priority tasks into the process working set.

There is a guarantee for the highest priority process only, but not for other processes. Still, hardware interrupts (such as the page fault interrupt) may change the order even for the highest priority process. For that reason, you should try to avoid page fault interrupts, e.g., by locking critical pages into the working set.

For this reason, I try to avoid mutexes as much as possible by concentrating on the overall architecture.

As it is done in early architectural design. Trying to add last-ditch kludges during the testing phase is an invitation to disaster.

Not OK for any HRT system, unless there is a maximum acceptable value for the delay.

Overcommitment is a no-no for HRT as well as high reliability systems.

These days the hardware is so cheap that for a RT / high reliability system, I recommend 40-60 % usage of CPU channels and communications links. Going much higher than that is going to cause problems sooner or later.

A 90-100 % utilization might be OK for a time sharing system or mobile phone apps or for viewing cat videos :-)
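
A small sketch of checking a design against such a ceiling. It assumes a periodic task model, with each task described by a worst-case execution time and a period; that model and the function names are assumptions of this example, and the default ceiling of 0.60 is just the top of the 40-60 % range mentioned above.

```python
def cpu_utilization(tasks):
    """tasks: iterable of (worst_case_exec_time, period) pairs in the same time unit."""
    return sum(c / t for c, t in tasks)


def within_budget(tasks, ceiling=0.60):
    """True if total utilization stays at or below the chosen ceiling."""
    return cpu_utilization(tasks) <= ceiling
```

The same sum works for a communications link if "execution time" is read as time on the wire per message and "period" as the message interval.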


upsidedown 9 years ago

I just realized how old I am (still one of the youngest in CAE and especially SED newsgroups). During my career in various forms of computing, the price/performance has improved by a ratio of one to a million, depending on how you interpret Moore's law (is the price/performance ratio doubling every 18 or 24 months?). With such huge ratios, it is cost effective to do things one way and, another 2-4 years later, in a completely different way.
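
For what it's worth, the arithmetic behind the million-fold figure is easy to check. The helper below is only an illustration; the function name is made up.

```python
import math


def years_for_improvement(factor, doubling_months):
    """Years needed for price/performance to improve by `factor` at a fixed doubling time."""
    doublings = math.log2(factor)
    return doublings * doubling_months / 12.0
```

A factor of one million is just under 20 doublings, so it takes roughly 30 years at 18 months per doubling and roughly 40 years at 24 months.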

Things that required dedicated designs and optimization in the past do not make sense these days, unless you are making several million copies and want to save a single cent from the production cost.

For low volume products, it doesn't make sense to use too much optimization these days. Thus a person with long experience really needs to think about how many "clever" features are used.


Don Y 9 years ago

I started work on "embedded" products with the i4004 -- with clock rates in the hundreds of kilohertz and instruction execution times measured in tens of microseconds -- for *4* bit quantities! *Simple* operations (e.g., ADD) on "long ints" were on the order of a MILLIsecond. Memory was measured in kiloBITS, etc.

Now, I can run an 8 core 2GHz box with 32GB of RAM and 4T of secondary store for "lunch money" :<

IME, the *hidden* cost of saving pennies (typically, reduced reliability) far outweighs the recurring cost of all but trivial designs. Let the machine (system) carry most of the load so the developer (and user) aren't burdened/inconvenienced by it.

In the (development) time it takes me to *save* a few pennies, the product costs have FALLEN by those same few pennies.

[There are, of course, special circumstances that defy these generalizations. But, most folks waste effort nickel-and-diming designs needlessly.]


Tim Wescott 9 years ago

It's one less barrier to some outside actor getting that information, and therefore using it in an attack (if I know to do something that might make a task overflow its stack, for instance, I'll have something concrete to try to help me break in).

How hard are you going to work to keep that information out of the hands of outside actors?

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!


Dimiter_Popoff 9 years ago

The good thing about aging is that we don't notice it a lot ourselves as long as we are healthy. The outside world takes care of keeping us up to date of course...

Hardware has always been ahead of software and as hardware becomes faster for the same tasks done 30+ years ago the gap is allowed to widen - to scary dimensions I would say. But this is how evolution works I guess, eventually some balance will be reached. Not that we have that moment in sight as far as I can see.

Dimiter


Don Y 9 years ago

Of course. The challenge in the design of an OPEN system is coming to a balance between what you do *for* the developer (to allow him to more efficiently design more robust applications) vs. the "levers" that you can unintentionally expose to a developer.

In *closed* systems, the system design can tend to assume the developers are not malicious; that every "lever" thus provided is exploited to improve cost, performance, etc. Any flaws in the resulting system are consequences of developer "shortcomings".

In an open system, you have all the same possibilities -- PLUS the possibility of a malicious developer (or user!) exploiting one of those levers in a counterproductive manner.

The only way to completely prevent exploits is to completely deny access. But, that's contrary to the goal of an open system.


upsidedown 9 years ago

You need to consider the input/output speeds. Essentially the 4004 was a calculator chip on steroids. The input speed for a manual calculator is about 100 ms/decimal digit and one expects that the result is displayed in a second, so you could do quite complicated computations even with a 1 ms (long) decimal add time.

Just calculated that the 4004 would have been sufficient to handle summation of data from a slow card reader (300 CPM, cards per minute), so with ten 8-digit decimal numbers on each card, you would have to handle 50 long decimal numbers each second. Using a medium speed (1000 CPS, characters per second) paper tape, this would be 125 long decimal integers/s, which would be quite hard for the 4004 to handle.
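
The arithmetic can be reproduced in a few lines. The 1 ms add time is the figure quoted in this thread; the function names are invented, and the load fraction covers the additions alone, not reading digits, conversion or anything else the CPU would have to do.

```python
def numbers_per_second_cards(cards_per_minute, numbers_per_card):
    """Long numbers arriving per second from a card reader."""
    return cards_per_minute * numbers_per_card / 60.0


def numbers_per_second_tape(chars_per_second, digits_per_number):
    """Long numbers arriving per second from paper tape, one digit per character."""
    return chars_per_second / digits_per_number


def add_load(numbers_per_second, add_time_s=1e-3):
    """Fraction of each second spent on the additions alone."""
    return numbers_per_second * add_time_s
```

With these inputs the card reader delivers 50 numbers/s and the tape 125 numbers/s, i.e. 5 % and 12.5 % of the CPU for the adds by themselves, leaving 20 ms and 8 ms per number for everything else.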

Simple decimal computers in the 1960's often used a 4 bit BCD ALU and handled decimal digits serially. This still required a lot of DTL or TTL chips and the CPU cost was still significant.

With the introduction of LSI chips, the cost dropped significantly in a few years.

Any programmable calculator today will outperform any 1960's decimal computer by a great margin at a very small fractional cost.

If things were done in one way in the past with different constraints, implementing it today the same way might not make sense.

The 4004 had a nice 4 KiB program space. Small applications even in the 1980's didn't need more and reprogramming a 4 KiB EPROM took just 5 minutes :-)


Don Y 9 years ago

We used it to plot current position based on real-time receipt of LORAN-C coordinates:

Each "coordinate axis" (i.e., X & Y, latitude & longitude, etc.) in LORAN consists of a family of hyperbolic "lines of constant time difference": between a master transmitter and one of its slaves (A&B in the diagram). With families from *two* such slaves INTERSECTING, you can "uniquely" [1] determine your location on the globe (knowing the latitude and longitude of the master and associated slaves, the shape of the earth, propagation time of radio waves and "conic sections").

[1] This is a lie as a single hyperbolic curve from one family (time-difference coordinate #1) can intersect another hyperbolic curve from another family (TD coordinate #2) at *two* points, unlike a (latitude,longitude) tuple that is unique. To confirm this, print two copies of the above sample and skew them so AB is not parallel to AC (assume C is the renamed B on the second instance)

Coordinates are processed at a rate of 10GRI (10 sets of transmissions; GRI is the time between transmissions from the master). Each GRI is typically about 50-100ms, so 10GRI is 500-1000ms.

It's a fair bit of work to resolve two hyperbolae on an oblate sphere mapped to a scaled Mercator projection and drive two stepper motors to the corresponding point before the next "fix" arrives.
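
For readers who want to see the core of the geometry, here is a heavily simplified sketch: a flat plane instead of an oblate earth, range differences in distance units instead of time differences, no map projection and no motors. It uses a plain Newton iteration, which is a choice made for this example and not a description of the original product's algorithm; all names are hypothetical.

```python
import math


def _unit(px, py, sx, sy):
    """Unit vector from station (sx, sy) towards point (px, py), plus the distance."""
    dx, dy = px - sx, py - sy
    r = math.hypot(dx, dy)
    return dx / r, dy / r, r


def hyperbolic_fix(master, slave_a, slave_b, diff_a, diff_b, guess, iterations=25):
    """Newton iteration for the point whose range differences match diff_a and diff_b.

    diff_x = distance(point, slave_x) - distance(point, master), on a flat plane.
    """
    x, y = guess
    for _ in range(iterations):
        mx, my, rm = _unit(x, y, *master)
        ax, ay, ra = _unit(x, y, *slave_a)
        bx, by, rb = _unit(x, y, *slave_b)
        # Residuals: how far the current point is from each hyperbola.
        f1 = ra - rm - diff_a
        f2 = rb - rm - diff_b
        # Jacobian of (f1, f2) with respect to (x, y).
        j11, j12 = ax - mx, ay - my
        j21, j22 = bx - mx, by - my
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-12:
            raise ValueError("hyperbolae are (nearly) tangent here; try another guess")
        # Solve J * step = -f by Cramer's rule and apply the step.
        x += (-f1 * j22 + f2 * j12) / det
        y += (-f2 * j11 + f1 * j21) / det
    return x, y
```

The starting guess matters: as footnote [1] says, two hyperbolae can cross at two points, and the iteration settles on whichever intersection the guess leads it to.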

This is the second generation (8085-based) version (bottom, center): By then, the code space had soared to a whopping 12KB (at one time, close to $300 of EPROM!) -- with all of 512 bytes of RAM!!

The Z80 was still a 4b ALU (multiple clocks to process 8b data)

Of course! I suspect I could reproduce the software for the plotters in a long weekend, now. No need to write a floating point library, multiplex PGD displays, scan keypads, drive motor coils, count *bits* of storage, etc. Just use and a graphics library to plot line segments on a display "instantaneously". Load a set of maps from FLASH, etc.

You were using 1702's in the mid 70's -- 2Kb (not KB!) parts.
