What is this data structure?

I need to store what is essentially an array of N unsigned ints and examine and set them individually as you expect of a simple array. However I also need what I can only describe as a cumulative index on that same data, i.e. "What is the total of all elements from the beginning to element I?" or "At what element does the cumulative total reach threshold X?" The size of the array is determined at compile time and I have a fair amount of discretion over what its actual value is.

I've been musing this for the last couple of days now and I've devised in back-of-an-envelope terms a solution in log N time, although there's still plenty of fine detail to sort through before I have a working solution. However, I have this strong gut feeling that I'm missing something and the same requirements looked at slightly differently suggest a well-known standard structure that I'm simply not seeing.

Does this ring any bells with anyone?

--
Andrew Smallshaw 
andrews@sdf.lonestar.org
Reply to
Andrew Smallshaw
Loading thread data ...

In article , Andrew Smallshaw wrote: }I need to store what is essentially an array of N unsigned ints }and examine and set them individually as you expect of a simple }array. However I also need what I can only describe as a cumulative }index on that same data, i.e. "What is the total of all elements }from the beginning to element I?" or "At what element does the }cumulative total reach threshold X?" The size of the array is }determined at compile time and I have a fair amount of discretion }over what its actual value is. } }I've been musing this for the last couple of days now and I've }devised in back-of-an-envelope terms a solution in log N time, }although there's still plenty of fine detail to sort through before }I have a working solution. However, I have this strong gut feeling }that I'm missing something and the same requirements looked at }slightly differently suggest a well-known standard structure that }I'm simply not seeing. } }Does this ring any bells with anyone?

I don't think this is sufficiently general a problem to have a named data structure, but it occurs to me that some form of balanced binary tree might be suitable.

When you say you have a log N solution, is that for all operations? You mention three: * replacement of an element * determining sum of 0..I * determining index where a (fixed?) threshold is reached

Are all operations equally common, or are some common and some rare?

Reply to
Charles Bryant

It feels like you're looking for the one-dimensional reduction of the MIP map, a classic data structure in computer graphics and computational geometry. In those fields this structure is used to hold pre-computed copies of an original image at various sizes reduced by powers-of-two, which allow for more efficient queries involving averages or coverage checks over given areas.

In your case this would involve taking the original array and extending to a power-of-two size. Let's assume 1-based indexing, i.e. indices [1] .. [2^n]. Then you replace every element [2i] with the sum of elements [2i] and [2i-1], every element [4i] with the sum of new elements [4i] and [4i-2], every element [8i] with the sum of current values [8i] and [8i-4], and so on. Continue this until you've updated the content of element [2^n] to be the sum of current element [2^(n-1)] and itself. The plan is that now every element [i*2^j] holds the sum of original elements [(i-1)*2^j + 1] ... [i*2^j]. Querying the sum from [1] to [k] now involves summing and subtracting up at most 2*n of those partial sums, which are easily located based on the binary representation of 'k'.

If you can afford the extra storage, you can keep all original elements and all the intermediate sums and reduce the number of terms to be summed up to 'n', at the cost of a factor of 2 in storage space.

Reply to
Hans-Bernhard Bröker

I don't know if it has a name but yes, there's a standard approach that involves putting a binary tree over the array, so the leaves are labelled by array elements, and each higher node is labelled by the sum of the labels of its two children. You can then do lookups and updates in a straightforward manner, propagating lookups from the root downward and updates from the leaves upward. Hans-Bernard Broeker described an efficient implementation strategy.

Reply to
Paul Rubin

Thanks for this, and to Charles and Paul too. The solution you outlined was broadly similar to my own, albeit with a slightly different array-as-tree representation. I got it sorted out this morning when I came back to it fresh. I'm still sure I've seen that kind of semantic somewhere before, though, can't remember for the life of me where now.

--
Andrew Smallshaw 
andrews@sdf.lonestar.org
Reply to
Andrew Smallshaw

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.