"short" pointers

Hi,

[Probably should have posted this to c.l.c but those folks get too pedantic...]

I'm looking for a *convenient* way of "SHORTening" pointers (to objects scattered through memory). All of the schemes I've come up with aren't very generic -- or, are "tedious" (making them prone to misuse/mistake).

[And, "yes, I *do* need to conserve resources to this extent"]

Thx,

--don

Reply to
Don Y

So, you're trying to wind up with 16-bit pointers rather than 32-bit? Is your addressable memory space for this application actually that short?

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com 
Email address domain is currently out of order.  See above to fix.
Reply to
Rob Gaddi

C? C++? FORTRAN? COBOL? Perl?

If C, could you make a set of macros?

i.e.:

typedef short int shortPointer;  // maybe make this a struct,
                                 // for type safety?

shortPointer spBlock1, spBlock2;

spBlock1 = SHORT_POINTER(pBlock1);
spBlock2 = SHORT_POINTER(pBlock2);

int bob = EXTRACT_POINTER(spBlock1)[3];

...

If your C compiler can't be coerced into making the type conversions strict enough, this would almost be worthwhile to use C++ just for this, while avoiding all other C++ features.
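For what it's worth, here is a minimal sketch of how those two macros might be fleshed out. The macro names come from the example above, but their bodies are an assumption: it supposes every object reachable through a short pointer lives in one array (the hypothetical `pool`), so that a 16-bit element index identifies it. `POOL_INTS` and `roundtrip_demo` are likewise made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: everything referenced through a short pointer lives in this
   one pool, so a 16-bit element index is enough to identify it. */
#define POOL_INTS 1024u
static int pool[POOL_INTS];

typedef uint16_t shortPointer;   /* index of an int within pool */

/* Encode: full pointer -> 16-bit index (p must point into pool). */
#define SHORT_POINTER(p)    ((shortPointer)((p) - pool))
/* Decode: 16-bit index -> full pointer. */
#define EXTRACT_POINTER(sp) (&pool[(sp)])

/* Write through the full pointer, read back through the short form. */
static int roundtrip_demo(void)
{
    int *pBlock1 = &pool[10];
    shortPointer spBlock1 = SHORT_POINTER(pBlock1);

    assert(EXTRACT_POINTER(spBlock1) == pBlock1);
    pBlock1[3] = 42;
    return EXTRACT_POINTER(spBlock1)[3];
}
```

Nothing here stops someone from adding two `shortPointer` values or passing one where a plain integer is expected, which is the type-safety gap the struct suggestion is aimed at.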

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott

Yes. Though a more correct way of saying it would be "16 bit *encodings* of pointers" (where "encoding" can be some arbitrary function that "makes sense" for the application).

I don't think I will be able to squeeze down into a 16b *total* address space (TEXT+DATA) -- though I would *love* to be able to get things that small!

I am, currently, pursuing options to independently shrink the TEXT and DATA segments (of course, there are repercussions of changes to *one* on the *other*) and trying to quantify what each "optimization" is costing/gaining. E.g., trying things like Q14.9 to shave a byte here and there, etc.

I'm also looking at other functionalities that I might want to select for in processor choice to, perhaps, economize on these resources.

(typical "squeezing a balloon" exercise where there's no clear "right answer" :< )

Reply to
Don Y

Most preferably C -- ASM if I am forced to that extreme. :<

Yeah, each approach I've come up with, so far, requires a fair bit of "decoration" in the source. And, they all seem to side-step many of the checks/protections that the compiler could, otherwise, apply to address operations.

E.g., pBlock1 - pBlock2 has some meaning. What does spBlock1 - spBlock2 mean? Or, spBlock2a[3] - spBlock2b[6]?

I'm always afraid that too much added syntactic complexity distracts from what the code is trying to *do*... you spend more time trying to figure out how to *say* what you want and lose track of what you actually *want*!

It would be *really* nice to exploit operator overloading in C++ for this. The syntax could then "look normal".

But, I am leery of doing *anything* in C++ as I am always surprised at how it manages to "do stuff" (generate code) that I hadn't expected from the source I had written.

[At least C makes it relatively easy to *guess* at the code that is likely to be generated for a given set of source statements]

(sigh) I'll have to see if there is some other place that can give me comparable savings to "trimming pointers" -- in a way that doesn't obfuscate the sources as much. I knew this *looked* too easy to be true! :-/
Reply to
Don Y

In C you're just going to have syntactic difficulty. I think you can do it so that it won't compile if you misuse your "short pointers", but I think that if you stick to C (or assembly) that you'll just have to live with macros.

I think hiding the "short pointer" inside a struct can be done so that even the C type checking will barf if you use it wrong; that will enforce using macros which will hopefully be clear, if verbose.
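A rough sketch of the struct idea, under the assumption that the objects of a given type sit in one array. `Widget`, `widget_pool`, `WidgetSP` and the two helper functions are hypothetical names invented for the example. Small functions stand in for the macros here because they let the compiler check argument types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int value; } Widget;      /* hypothetical object type */

#define WIDGET_POOL_SIZE 256u
static Widget widget_pool[WIDGET_POOL_SIZE];

/* Wrapping the index in a struct makes it a distinct type: arithmetic on
   it, or mixing it up with a plain integer, will not compile. */
typedef struct { uint16_t idx; } WidgetSP;

/* Encode: p must point at an element of widget_pool. */
static WidgetSP widget_sp_from(const Widget *p)
{
    WidgetSP sp;
    ptrdiff_t d = p - widget_pool;

    assert(d >= 0 && d < (ptrdiff_t)WIDGET_POOL_SIZE);
    sp.idx = (uint16_t)d;
    return sp;
}

/* Decode: back to a full-size pointer. */
static Widget *widget_sp_get(WidgetSP sp)
{
    assert(sp.idx < WIDGET_POOL_SIZE);
    return &widget_pool[sp.idx];
}

/* Each of these is rejected at compile time:
     WidgetSP a = 5;        a scalar cannot initialise the struct
     a - b;                 no subtraction defined on structs
     widget_sp_get(7);      an int is not a WidgetSP               */
```

The cost is one wrapper type and one pair of helpers per pointed-to type, which is the verbosity mentioned above.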

I know that this may fire up the whole C vs. C++ debate, but if all you do is take an existing well-written C code base and compile it in C++, you shouldn't get any extra run-time code: just better type checking.

Knowing what C++ features to avoid using in an embedded system is a huge part of making C++ play nice. The "embedded C++" initiative attempts to take care of that for you, but I've never used it so I don't know how solid it is, or if it's still popular.

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott

C++ would give me a "safe" way around this. The sort of thing that someone could (choose to) ignore when reviewing the code (by contrast, all the macro invocations are *impossible* to ignore! "Why didn't he use the macros in *this* expression?")

Yes, but I was looking at C++ to give me the benefit of the cleaner syntax. E.g., I can implement a Q14.9 "class" and write code that *looks* "normal" (infix) -- without having to litter it with lots of function invocations. I can go to some lengths in C to maximize the "help" the compiler will provide. But, it means imposing these same (implementation style) burdens on folks maintaining the code

("Why the heck is he doing things this way? It would be SO much easier/cleaner if he did *this*, instead..." "Yeah, but then *these* things will bite you in the ass..." "Oh.")

Similarly, given a "short pointer" to "abcdefg", consider what the syntax for accessing the fourth char therein would look like contrasted with the equivalent syntax for a "normal" pointer.

[Yes, it can be done. It's just how expensive you want to make the develop/maintain effort]

I take a simpler approach: avoid altogether. :> I'm far more focused on retirement than learning how to exploit yet another "language du jour"! :>

Reply to
Don Y

You're being way too terse with the problem description here. And that's not something I ever thought I would be telling you, of all people ;-) So: what kind of architecture, what kind of address range, and which pointers in what kind of program would be involved? Are you trying to compress all pointers, or just a subset? If the latter, what defines the subset?

Generally, your implied premise that there's excess width in pointers which can safely be omitted is, by itself, _far_ from being generically true, so it's no surprise that there's no generic approach that does what you want: a generic solution would be flat-out impossible.

The classic solution to the non-generic cases that would really profit from this kind of acrobatics is, of course, a compiler-specific language extension, i.e. some variant of __near vs. __far memory space qualifiers. And such platforms almost invariably have such an extension in place already.

E.g. at my place of work we use a 16-bit controller that has a 24-bit address space, and can map up to 32 KiB of Flash into the same 64 KiB page that also hosts RAM and peripheral registers, while the main code/const flash is elsewhere. The C tool chain builds a memory model for this where data pointers are 16-bit unless qualified __far, whereas code pointers are 24-bit unless qualified __near. That makes for rather efficient code, and near-optimal pointer size.

One could implement a roughly similar scheme without a language extension proper, but not without implementation-specific features of the compiler and linker. Those are needed a) to configure which objects go into the __near region of memory space, and which don't, and b) because the behavior when casting pointers to or from non-pointer data types is always implementation specific. But without the compiler's support, such a scheme can never be really reliable. What the compiler doesn't see, it can't warn you about if you get it wrong.

Reply to
Hans-Bernhard Bröker

My simple thought is to make your "pointers" actually indexes into an array which has the pointers to the actual objects. This will save memory for any object that has multiple pointers to it.

Information theory says that it will be basically impossible to generically compress arbitrary 32 bit pointers to 16 bits (as implied by scattered through memory).

I would expect to use these object handles as a way to save space in the interconnections of objects in the data cloud, but not necessarily for short term pointers to things inside of these objects.

Reply to
Richard Damon

If you have less than 256 "objects" (subroutine addresses, data items) why not assign a "handle" (a single byte unsigned integer) to each of them. Then you need to build an up to 256 void pointer table to access the actual object in a 16, 32 or 64 bit address space.

Of course for each object reference, you need an extra memory access to get the actual address.

In C, you may have to do some typecasting, so that the void pointers are usable.
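A minimal sketch of such a handle table, assuming objects are registered once at start-up and never removed. `handle_t`, `object_table`, `handle_register`, `handle_deref` and `Node` are illustrative names only, and the table holds data pointers only, since standard C does not guarantee that a function address survives a round trip through `void *`.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_OBJECTS 64u
typedef uint8_t handle_t;                   /* 1 byte instead of a full pointer */

static void    *object_table[MAX_OBJECTS];  /* handle -> real address */
static handle_t object_count;

/* Register an object once; every later reference to it is one byte. */
static handle_t handle_register(void *obj)
{
    assert(object_count < MAX_OBJECTS);
    object_table[object_count] = obj;
    return object_count++;
}

/* The extra memory access per dereference happens here. */
static void *handle_deref(handle_t h)
{
    assert(h < object_count);
    return object_table[h];
}

/* Objects refer to each other by handle rather than by pointer. */
typedef struct {
    int      payload;
    handle_t next;
} Node;
```

The table itself costs one full pointer per object, so the scheme only pays off for objects that are referenced from more than one place.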

Reply to
upsidedown

You may need to pay attention to compiler flags to get this, such as to disable exceptions and RTTI. Exception handling can take code space, and limit optimisations, even if you don't use it. But any decent embedded compiler will let you disable it entirely and thus avoid the bad effects.

There are a few points in which valid C code has a different meaning when compiled with C++ - but for well-written C, the few that might occur will lead to clear error messages (for example, C++ requires slightly more typecasting in some cases). People who write code that depends on sizeof('a') get what they deserve. One possible issue to watch for is if you depend on the underlying type of enum data - that can change between C and C++.

The "Embedded C++" people got a lot badly wrong, so that you lose too much of C++. It's a good idea to ban exceptions and RTTI, and perhaps a good idea to ban multiple inheritance - but although multiple inheritance is costly in use, it is free if you don't use it (unlike exceptions and some of RTTI), and in cases when you /do/ need it, it is arguably cheaper than any alternative solution. But banning "mutable" is silly, banning templates spoils much of the point of C++ (good template use leads to smaller object code, as well as smaller source code), and banning namespaces shows that the EC++ committee are not qualified for the job.

Fortunately, EC++ is usually considered a dead language.

Reply to
David Brown

There is not nearly enough information here to be able to give a sensible answer. I'm guessing you need something that cannot be handled by an obvious solution, such as putting the pointed-to objects in an array and using a small indexing type. But I could come up with a dozen different ideas - none of which would help, because they would suit a dozen different specific circumstances and requirements (or preferences).

The best idea, however, is to first try to figure out why you need so many pointers in the first place, and look there for improvements.

Reply to
David Brown
[...]

The embedded compiler vendors found writing a C++ compiler was too hard.

So they did the easy bits, and sold that.

--

John Devereux
Reply to
John Devereux

Yes, squeezing the balloon (as you said in a sub-post) is an apt analogy.

My take on this is that if you have to start going to these extremes at this low level, there is a (much?) better way to handle it at a higher level. It's like trying to optimize the heck out of an algorithm at the implementation level instead of having another look at the algorithm itself.

At least I would be more motivated to think about a potential solution if I could see why this is really necessary.

--
Randy Yates 
Digital Signal Labs 
Reply to
Randy Yates

Obviously this is subject to your processor's type and environment, but maybe some way to make a NEAR_PTR where the item is within +/- 127 bytes, a PAGE_PTR where the pointer is within the boundary of the current code segment (dependent on processor type) and then a full blown FAR_PTR that has all the overhead of being long enough to hold any physical address and/or includes any page swapping code.

Then cast 8,16,and 32 bit values to the pointer type? Obviously, I'm just spit ballin' here and haven't tried, or needed to try any of this, yet. I have been in the position of ripping out verbose strings and putting in cryptic numeric error codes because of ROM constraints.
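One way the NEAR_PTR case could look in plain C is a self-relative offset: the stored byte is the distance from the field itself to its target. This is only a sketch under the assumption that the field and its target sit in the same object; `near_ptr`, `near_set`, `near_get` and `Record` are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int8_t near_ptr;   /* signed byte offset, measured from the field itself */

/* Encode: store the distance from the field to its target. */
static void near_set(near_ptr *field, const void *target)
{
    ptrdiff_t d = (const char *)target - (const char *)field;

    assert(d >= -127 && d <= 127);   /* must fit the +/- 127 byte window */
    *field = (near_ptr)d;
}

/* Decode: add the stored offset back onto the field's own address. */
static void *near_get(near_ptr *field)
{
    return (char *)field + *field;
}

/* Example layout: a 1-byte link to a member of the same record. */
typedef struct {
    near_ptr to_name;      /* 1 byte instead of a char * */
    uint8_t  flags;
    char     name[8];
} Record;
```

A side effect of this encoding is that a record can be copied or relocated as a whole without fixing up the link, since the offset is relative.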

Jim

Reply to
WangoTango

Hi Don,

Given contiguous heaps, you could elide the common high bits of addresses and add them back when dereferencing. Unfortunately, there's no way to completely hide this while still retaining compiler type checking for pointer use.
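As a sketch of what I mean, assume one contiguous heap in which every object is 4-byte aligned. The low two bits of each offset are then always zero, so they can be dropped as well, and 16 bits reach across 256 KiB. The names `heap_words`, `HEAP_BASE`, `sp_encode` and `sp_decode` are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: a single contiguous heap, and every object in it is
   4-byte aligned. */
#define HEAP_BYTES  ((uint32_t)256 * 1024)
#define ALIGN_SHIFT 2u

static uint32_t heap_words[HEAP_BYTES / 4u];   /* uint32_t elements give the alignment */
#define HEAP_BASE ((unsigned char *)heap_words)

typedef uint16_t short_ptr;

/* Drop the common high bits (the base) and the always-zero low bits. */
static short_ptr sp_encode(const void *p)
{
    uint32_t off = (uint32_t)((const unsigned char *)p - HEAP_BASE);

    assert(off < HEAP_BYTES && (off & 3u) == 0u);
    return (short_ptr)(off >> ALIGN_SHIFT);
}

/* Put both back when dereferencing. */
static void *sp_decode(short_ptr sp)
{
    return HEAP_BASE + ((uint32_t)sp << ALIGN_SHIFT);
}
```

The result of `sp_decode` is a `void *`, so the caller has to supply the type again at every use.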

Your best bet is to use C++ and create a templated pointer type. That will allow most/all uses of the pointers to be transparent, but it still will show through in pointer declarations.

In C I think you're screwed: I don't see how any kind of portable macro solution can be achieved without either using per-type macros (bad) or requiring that the macros have explicit type parameters (worse, and ugly too!). Even resorting to nonportable language extensions such as GCC's "typeof" operator, I'm not sure it's really possible for macros to conceal the type manipulation ugliness.

George

Reply to
George Neuner

Hey George,

Have fun? :)

> Given contiguous heaps, you could elide the common high bits of [...]

Yeah, that's the *biggest* problem. Plus the syntax obfuscation that it would draw into the mix -- esp if not universally applied. E.g., why is the syntax for invoking this function *through* a pointer unchanged/changed while this other pointer usage is changed/unchanged?

I.e., pointers act as identifiers and delimiters, of a sort. Esp for variable size objects, variable width fields, etc. So, how (in)consistent do you want your "naming" of objects to be?

And, of course, this depends a lot on how you want to *access* those objects. And, how much you want to pay for that access! (e.g., you can develop elaborate encodings that have very small space costs -- for the encodings themselves -- at the expense of more complex algorithms to ENCODE and DECODE those references)

E.g., my backup speech synthesizer accesses objects sequentially. So, it can be efficient with just a simple delimiter between successive objects (and arranging those objects in contiguous sequential memory so the algorithm can move from one to the next "for free" -- instead of having to chase down a new pointer to find the next object!)

But that offsets savings from "pointer abbreviation" with additional code. (I'm not competent to look at that and *guess* as to what the actual costs are likely to be -- by contrast, anything C-ish is pretty "apparent")

I'm currently exploring a kludgey compromise that, I *think*, gives me significant savings (without adding lots of code to deref these "references"). But, I'm not sure if what I am doing will be "strictly portable" (while I can imagine it isn't the sort of thing compiler vendors *anticipate* seeing, I can't really see why they -- in 2013 -- wouldn't be able to accommodate this stuff!).

I'll try to hack together a short example for you. I'm *sure* you'll be able to tell me if I've broken any rules :> I *think* I can actually address type safety, to some extent, with it as well! (though I'm still screwed wrt syntax) Watch your mail...

Reply to
Don Y

So you are saying that EC++ was designed to be easy for the compiler writers, rather than useful for users? That's perhaps more realistic than the usual assumption (which is that the EC++ committee were idiots). On the other hand, I have never heard of a compiler that supported EC++ that did not also support at least most of "real" C++, and EC++ was certainly marketed as a "subset of C++ suitable for embedded systems".

Reply to
David Brown

Not really.

Each different "short pointer" type needs to have different casts made at the point of use ... so there will be as many instances of the code as there are defined short pointer types. But the same would be true of equivalent C code.

AFAICS, you'll only need a few one/two line functions ... a quick (non-template) hack suggests just 3 or 4 functions are sufficient, depending on how you store/reference the common high bits.

WRT code size becoming a problem, the only issue I see is in making multiple use of the same short pointer(s) within a single function. The compiler *might* be able to recognize and optimize this, but if you're going to be reusing a short pointer, you're better off expanding it into a full size local pointer and then using that.
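To illustrate the reuse point in C terms, here is a small sketch with a hypothetical index-based `short_ptr` and a made-up `pool`. The first function decodes on every use, the second expands once into a full-size local pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_INTS 512u
static int pool[POOL_INTS];          /* hypothetical object pool */
typedef uint16_t short_ptr;          /* index into pool */

static int *sp_decode(short_ptr sp)
{
    assert(sp < POOL_INTS);
    return &pool[sp];
}

/* Decodes on every iteration; the compiler may or may not hoist it. */
static long sum_decoding_each_time(short_ptr sp, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += sp_decode(sp)[i];
    return total;
}

/* Decodes once into a full-size local pointer, then uses that. */
static long sum_decoding_once(short_ptr sp, size_t n)
{
    const int *p = sp_decode(sp);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Both return the same result; the second simply does not depend on the optimizer noticing that the decode is loop-invariant.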

Of course, the same is true with C ... but in C++ there will be no obvious visual clues - i.e. no macro/function calls on the pointers - that will indicate (relatively) heavy handed manipulation is going on.

George

Reply to
George Neuner

Additional code compared to what? C? Templates done right shouldn't be any worse than macros, and may even be better if you can give the compiler enough optimization clues.

--
Tim Wescott 
Wescott Design Services 
Reply to
Tim Wescott
