Re: Intel details future Larrabee graphics chip

And all that changes even with the synthesizable subset of SystemVerilog: it adds enums, structs etc. Synthesis tools are already starting to support those features, because they map easily onto VHDL features that the tools have had to support all along.

I have also seen major errors in Verilog, because it's so easy to connect vectors of different widths together etc. All languages have their pros and cons.

And the synthesis results for the integer and the bit vector are the same. The difference is that the integer version traps in simulation and forces the designer to think about the error. In HW there is no bounds checking.

We also have to differentiate what is meant by an error. Is it something that traps the simulation and might be a bug, or is it something that exists in the chip? I like code that traps as early as possible and near the real problem; for that reason assertions and bounds checking are a real timesaver in verification.

At the very least, Verilog's blocking vs. non-blocking assignments and its general scheduling semantics are not simple. VHDL scheduling is much harder to misuse.

Sometimes people like to code differently, and choice helps with that. SystemVerilog offers so many ways to do things that most people should be happy :)

Coding style definition is a good way to start endless religious wars :)

--Kim

Reply to
Kim Enkovaara

Them are fightin' words...

This is precisely the wrong specification for a portable specification!

Which byte order should be used for the length field?

If we had something like uint32l_t/uint32b_t with explicit little-/big-endian byte ordering, then it would be portable.

The only way to make the struct above portable would be to make all 16/32-bit variables arrays of 8-bit bytes instead, and then explicitly specify how they are to be merged.

Using a shortcut specification as above would only be allowable as a platform-specific optimization, guarded by #ifdef's, for machine/compiler combinations which match the actual specification.

The alternative is to hide the memory ordering behind access functions that take care of any byte swapping that might be needed.
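A minimal sketch of what such access functions could look like, assuming a little-endian on-disk format as in RIFF/WAV (the names get_le32/get_le16 are mine, not from any standard library):

#include <stdint.h>

/* Assemble a 32-bit little-endian value from four bytes.  Works the same
 * regardless of the host's byte order or alignment rules. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

A matching put_le32() on the output side would do any swapping in one place, so the rest of the program never touches raw on-disk bytes.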

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

Indeed.

How many ways can you define such a function?

The only serious alternatives would be in the handling of negative-or-zero inputs or when rounding the actual fp result to integer:

Do you want the Floor(), i.e. truncate, Ceil() or Round_to_nearest_or_even()?

Using the last alternative could make it harder to come up with a perfect implementation, but otherwise it should be trivial.
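As a sketch of how little the rounding choice costs, assuming a simple shift-based floor version much like the ones posted later in this thread (the names are mine):

/* floor(log2(x)): index of the highest set bit, -1 for x == 0. */
static int log2_floor(unsigned x)
{
    int n = -1;

    for ( ; x != 0; x >>= 1)
        n++;
    return n;
}

/* ceil(log2(x)): 0 for x <= 1, equal to log2_floor(x) when x is a power
 * of two, log2_floor(x) + 1 otherwise. */
static int log2_ceil(unsigned x)
{
    return (x <= 1) ? 0 : log2_floor(x - 1) + 1;
}

Round-to-nearest is the awkward one, since it effectively means comparing x against 2^n * sqrt(2); the other two fall out of the floor version almost for free.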

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

That's a big no-no. Synthesis is just another implementation of simulation, so the semantics must be the same. If there is no proof that the trap can't happen, I would say you aren't allowed to synthesize the construct.

Yes, but assertions are an obvious verification tool, and not mixed with the actual operation semantics. If I write in Verilog

if (addr > 10) begin
    $display("Address out of bounds: %d", addr);
    $stop;
end

then this is perfectly synthesizable code, and I know that a failure in simulation is actually a bug in some producer of the address.

Fortunately, the synthesis tools are usually very strict on the rules of blocking vs. non-blocking, so if you misuse them, you get error messages.

SystemVerilog is a lot of VHDL with Verilog syntax (as you described above).

If you design as a team, you don't have time for many different coding styles. But certainly, you are right that people are religious about their coding style - in the current chip we are just examining, we found a bug that came from a particularly risky way of coding something. We had already had a discussion between two people when this was coded - the implementer ignored the common practice, and the code he wrote was actually wrong.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
Reply to
Bernd Paysan

In article , Terje Mathisen writes:
|> Jan Panteltje wrote:
|>
|> > No, int32_t and friends became NECESSARY when the 32 to 64 wave hit,
|> > a simple example, and audio wave header spec:
|> > #ifndef _WAVE_HEADER_H_
|> > #define _WAVE_HEADER_H_
|> >
|> > typedef struct
|> > {                            /* header for WAV-Files */
|> >     uint8_t  main_chunk[4];  /* 'RIFF' */
|> >     uint32_t length;         /* length of file */
|>
|> This is precisely the wrong specification for a portable specification!
|>
|> Which byte order should be used for the length field?

Yes. Plus the fact that many interfaces use fields that have subtly different value ranges or semantics than the ones specified by C. But that wasn't my primary point.

The reason that the fixed-length fanatics are so wrong is that they take a decision that is appropriate for external interfaces and extend it to internal ones, and even the majority of workspace variables. Let's take that mistake as an example.

The length is passed around as uint32_t, but so are rather a lot of other fields. In a year or so, the program is upgraded to support another interface, which allows 48- or 64-bit file lengths. Not merely does the program now have to be hacked, sometimes extensively, there is a high chance of missing some changes or changing something that shouldn't have been. Then the program starts to corrupt data, but typically only when handling very large files!

That is PRECISELY what has happened, not just in the IBM MVT/MVS days, but more than once in the Unix era, yet nobody seems to learn.

10 years ago, most Unix utilities were solid with such bugs, for exactly that reason - even when running in 64-bit mode, they often started corrupting data or crashing after 2/4 GB.

Yet writing word size independent code is no harder than writing that sort of mess - though you do need to regard most of C99 as anathema. Such code typically doesn't even check whether 'words' are 32-bit or 64-bit, and would usually work with 36-, 48-, 60- or 128-bit ones.
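As a small illustration of that style (my own sketch, not code from the thread): rather than assuming a width, measure it, and the same source works unchanged on 32-, 36-, 48-, 64- or 128-bit words.

/* Count the value bits in an unsigned long at run time rather than
 * assuming any particular width. */
static int ulong_bits(void)
{
    unsigned long v = ~0UL;
    int bits = 0;

    for ( ; v != 0; v >>= 1)
        bits++;
    return bits;
}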

|> The alternative is to hide the memory ordering behind access functions
|> that take care of any byte swapping that might be needed.

That is the only approach for genuine portability, of course, such as when some new interface demands that you do bit-swapping, add padding, or do other such unexpected munging.

Plus, of course, it is the way that any competent software engineer does it, because it provides a place to put tracing hooks and hacks for broken interface usages, as well as making it almost trivial to add support for new interfaces.
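A hedged sketch of that separation in C, with invented names: the internal length type is simply 'big enough', and only the interface routine knows or cares that this particular header stores a 32-bit little-endian field.

#include <stdint.h>

/* Internal length type: deliberately not tied to any one header format.
 * If a 48- or 64-bit container turns up later, only the interface
 * routines need touching. */
typedef uint_least64_t filelen_t;

/* Interface layer: the only code that knows this format stores the length
 * as a 32-bit little-endian field right after the 4-byte chunk ID. */
static int wav_encode_length(uint8_t header[8], filelen_t len)
{
    if (len > 0xFFFFFFFFu)          /* this particular format stops at 4 GB */
        return -1;
    header[4] = (uint8_t)(len & 0xFF);
    header[5] = (uint8_t)((len >> 8) & 0xFF);
    header[6] = (uint8_t)((len >> 16) & 0xFF);
    header[7] = (uint8_t)((len >> 24) & 0xFF);
    return 0;
}

A new 48- or 64-bit container then means one more encode/decode routine, not a search-and-replace through every workspace variable.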

Regards, Nick Maclaren.

Reply to
Nick Maclaren


I have seen far too many horrors in C code inspections. I am frankly amazed that some coders get away with so many mistakes.


Lack of proper sequential N-D arrays is one major C weakness. FORTRAN had that exactly right apart from the old 1 based indexing.

Being able to declare procedure parameters call by reference for speed but as a const would also make things a lot less likely to get trashed. That way they could only be read but not modified.


I don't dislike them quite as much as that suggests. It was more for effect. However, it describes what happens quite often :(

The worst pointer-related fault I have ever had to find was as an outsider diagnosing faults in a customer's large software base. The crucial mode of failure was a local copy of a pointer to an object that was subsequently deallocated but stayed around unmolested for long enough for the program to mostly still work, except when it didn't.

Regards, Martin Brown

Reply to
Martin Brown

Synthesis works the same way as for VHDL if range checking is disabled in the simulator via the command line. Range checking is just an extra precaution, like assertions, which are not synthesizable by any general tools, only by some very specialized ones.

Synthesis and simulation semantics are different; many structures that happily simulate might not really be synthesizable, for example many wait statements.

If the user is clueless, no language can save them from disaster.

But when synthesizing for a real target, the tool first says that $display and $stop are not supported constructs, and after that it removes the empty if, so nothing is actually generated. What is the difference from VHDL with the address defined as integer range 0..10?

With assertions I was pointing more in the direction of PSL/SV assertions, not the traditional ASSERT in VHDL or code-based assertions.

And simulators can simulate the same Verilog code in many different ways. Many commercial IP models behave differently depending on optimization flags, simulators, or even simulator versions. Verilog seems to be badly misused especially in behavioral models. I hate debugging problems where you first have to figure out, standard in hand, whether the simulator or the code is wrong, and after that try to convince the original coder that his assumptions about the language semantics are wrong.

--Kim

Reply to
Kim Enkovaara

Libraries do exist in software too and some of them are very good. A lot of code reuse projects do fail because the gatekeeper is pressurised to put things in that are not of the required standard for reuse. I recall one that was called the suppository by the engineers ordered to use it.

Using off the shelf reliable software components and making a living selling them has not taken off the way it should have. Generally you have to buy a whole library and licence.

And remember that your hardware design is done on a software tool...

Can't argue with that at all. Take a look at Modula-2; I am pretty sure you will like it. The M2 definition module is very specific about exactly what the outside world is allowed to see. Though dated now, it is a clean, minimalist language designed to appeal to hardware engineers. It is a bit verbose, and I don't agree with the final ISO std syntax.

I largely agree. Unit testing and test vector approaches are better in hardware development than in most software shops.

Although fence-post errors on booleans do tend to do exactly the opposite of what you intended, whereas getting a bias voltage wrong by a few percent doesn't usually do much damage.

I am a software CEng, although I don't believe software is really mature enough to count as a proper engineering discipline as it is practised in industry at present. A bit like mediaeval cathedral builders, we either overengineer or we say afterwards "oh, that was a good one, it is still standing after 5 years!"

Indeed. I would argue that before software can truly become an engineering discipline we have to be able to better manage the process and design things in a way that customers can see and feel what it will be like to use well before the application is coded.

Most people know roughly what a house should look like and would question obvious faults like having the front door on the second storey opening out into free space. The same is not true of software. There is also a secondary problem in that the guy selling bespoke software to the customer is only concerned about his bonus and not about whether the thing can actually be built. The guy with control of the money is almost never the one who knows what is needed so the scope for being sold a pig in a poke is massive. And turnover is fast enough that the chances are the salesman will be long gone before the shit hits the fan.

I have been on the receiving end of specs that require the laws of physics to be repealed!

Regards, Martin Brown

Reply to
Martin Brown

On a sunny day (Thu, 14 Aug 2008 08:24:19 +0200) it happened Terje Mathisen wrote in :

That is all true, but one thing 'uint32_t length' makes clear here is that the wave file can be at most 4GB long. When moved to a larger size 'int' this is not obvious. When going from 32 to 64 bit one could easily assume otherwise.

There is also an alignment problem with structures like this: one could be tempted to just read a file header into the structure's address, but if the compiler aligns fields at 4-byte intervals, then your uint8_t may not be where you expect it. So there is always more to it, if you dig deeper.
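A hedged sketch of one way around both pitfalls (mine, under the assumptions above): read the header into a plain byte buffer and extract the fields explicitly, instead of overlaying a struct on the file data.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read the first 8 bytes of a RIFF/WAV file and extract the 32-bit
 * little-endian length field without relying on struct layout or padding. */
static int read_wav_length(FILE *f, uint32_t *length)
{
    uint8_t buf[8];

    if (fread(buf, 1, sizeof buf, f) != sizeof buf)
        return -1;
    if (memcmp(buf, "RIFF", 4) != 0)
        return -1;

    *length = (uint32_t)buf[4]
            | ((uint32_t)buf[5] << 8)
            | ((uint32_t)buf[6] << 16)
            | ((uint32_t)buf[7] << 24);
    return 0;
}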

Reply to
Jan Panteltje

On a sunny day (14 Aug 2008 08:57:49 GMT) it happened snipped-for-privacy@cus.cam.ac.uk (Nick Maclaren) wrote in :

That is babble. A new file specification will have another header, or will be in an extension field of the current header. The whole program would be a different one, as all length calculations and checks would change. I have been there, done that.

Using clearly specified variable width is GOOD, unless you want to go back to BASIC. :-)

Reply to
Jan Panteltje

OK. Then the figure of interest is the proportion of errors detected by static testing with the lint-style tools in the Verilog environment, as compared to the defects that are found later on - and the same for VHDL.

It would be unreasonable to compare Verilog without the normal workflow tools being used, although the number of defects they pick up per KLOC would be an interesting figure for both environments.

Regards, Martin Brown

Reply to
Martin Brown


It was a trivial routine, just floor(log2(x)), so just finding the top bit that is set. The mistakes were things like not handling zero, using signed rather than unsigned variables, looping forever for some inputs, and returning the floor result + 1.

Rather than just shifting the value right until it becomes zero, it created a mask and shifted it left until it was *larger* than the input (which is not going to work if you use a signed variable for it or if the input has bit 31 set etc).

My version was something like:

int log2_floor(unsigned x)
{
    int n = -1;

    for ( ; x != 0; x >>= 1)
        n++;
    return n;
}

Wilco

Reply to
Wilco Dijkstra

In article , Jan Panteltje writes:
|> >
|> >The length is passed around as uint32_t, but so are rather a lot of
|> >other fields.  In a year or so, the program is upgraded to support
|> >another interface, which allows 48- or 64-bit file lengths.
|>
|> That is babble.
|> A new file specification will have another header, or will be in
|> an extension field of the current header.
|> The whole program would be a different one, as all length calculations
|> and checks would change.
|> I have been there, done that.

Precisely. And, if you would learn from the experience of the past, all you would have to change is the interface code - the rest of the program would not even need inspecting.

Been there - done that. Many times, in many contexts.

Regards, Nick Maclaren.

Reply to
Nick Maclaren

I'd certainly be interested in the document. My email is above, just make the obvious edit.

More than that. I've implemented it. Have you?

It's only when you implement the standard that you realise many of the issues are irrelevant in practice. Take sequence points, for example. They are not even modelled by most compilers, so whatever ambiguities there are, they simply cannot become an issue. Similarly, various standard pedants are moaning about shifts not being portable, but they can never mention a compiler that fails to implement them as expected...
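For readers following along, a tiny example (mine, not from either poster) of the kind of sequence-point ambiguity being argued about:

#include <stdio.h>

int main(void)
{
    int i = 0, a[2] = {0, 0};

    /* Deliberately broken: i is modified and read without an intervening
     * sequence point, so whether a[0] or a[1] is written is undefined,
     * and an optimiser is free to reorder around it. */
    a[i] = i++;

    printf("%d %d\n", a[0], a[1]);
    return 0;
}

Whether such corner cases ever bite in practice is exactly what the rest of this exchange is about.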

Btw Do you happen to know the reasoning behind signed left shifts being undefined while right shifts are implementation defined?

It will work as long as the compiler supports a 32-bit type - which it will of course. But in the infinitesimal chance it doesn't, why couldn't one emulate a 32-bit type, just like 32-bit systems emulate 64-bit types?

Actually various other languages support sized types and most software used them long before C99. In many cases it is essential for correctness (imagine writing 32 bits to a peripheral when it expects 16 bits etc). So you really have to come up with some extraordinary evidence to explain why you think sized types are fundamentally wrong.
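A short sketch of the peripheral case, with an invented register address: the device latches exactly 16 bits per store, and a sized type is the natural way to say so.

#include <stdint.h>

/* Hypothetical memory-mapped 16-bit data register; the address is
 * invented purely for illustration. */
#define UART_DATA_REG ((volatile uint16_t *)0x40001000u)

static void uart_write(uint16_t value)
{
    *UART_DATA_REG = value;   /* intended to generate a 16-bit store, not a 32-bit one */
}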

Wilco

Reply to
Wilco Dijkstra

On a sunny day (14 Aug 2008 10:25:59 GMT) it happened snipped-for-privacy@cus.cam.ac.uk (Nick Maclaren) wrote in :

I know we can go on, but we probably mean the same thing in the end; still, 'the program would not need inspecting (or changing)' sounds a bit, eh, daring, if not wrong.

Take the example of a program that concatenates some wave files into one larger one. It will first read all the headers, add the sizes, and then, if it finds the output exceeds 4GB, say: 'myprogram: output exceeds 4GB, aborting.' So the size check, and the reporting, would need to change in any case. There is also much more, depending on how pedantic one was in reading the file headers, perhaps as length = byte + (byte * 256) + (byte * 65536) + etc.

I do not claim to be the perfect coder, and you compiler writers know more about the specs than I do, but I do try to learn from these discussions, and from other similar ones in relevant newsgroups. They have caused me on several occasions to rewrite my code. Will it ever be perfect? No, but usable, with no errors, yes. When I get an email (this happened when AMD64 came out) saying 'Your program always worked fine, but now I have an AMD 128, and it reports the wrong output file size - what do I do now?', it will be back to the source code for me, or somebody else.

Reply to
Jan Panteltje

This is getting ridiculously off-topic, and this will be my last posting on this sub-thread.

In article , Jan Panteltje writes:
|>
|> I know we can go on, but probably mean the same thing in the end,
|> but 'the program would not need inspecting (or changing)' sounds a
|> bit, eh, daring, if not wrong.
|> Take the example of a program that concatenates some wave files to
|> one larger one.
|> It will first read all headers, add the sizes, and then, if it finds the
|> output exceeds 4GB say: 'myprogram: output exceeds 4GB, aborting.'
|> So, the size check, and the reporting, would need to change, in any case.

That is not how to approach such a problem. Inter alia, it prevents the program from concatenating files in a format with a 4 GB limit and writing them to one with a larger limit. You should write it like this:

Each header is read and decoded, and the length is put into an internal-format integer.

The concatenation code adds the lengths, checking that they don't overflow, and giving a diagnostic if they do.

It then writes the result to the output, checking that the file will fit, and diagnosing if it won't.
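A minimal sketch of the middle step, assuming the internal lengths are held in a type (here uint64_t) at least as wide as anything the input formats can deliver:

#include <stdint.h>

/* Add two lengths, diagnosing overflow instead of silently wrapping. */
static int add_lengths(uint64_t a, uint64_t b, uint64_t *sum)
{
    if (a > UINT64_MAX - b)
        return -1;              /* would overflow: caller prints the diagnostic */
    *sum = a + b;
    return 0;
}

The check against the output format's own limit (4 GB for a plain WAV header, more for other containers) then belongs in the writing step, not here.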

Regards, Nick Maclaren.

Reply to
Nick Maclaren


Those are very nasty indeed. However, they aren't strictly pointer-related - a language without pointers suffers from the same issue (even garbage collection doesn't solve this kind of problem). Valgrind is good at finding issues like this, and automatic checking tools have improved significantly over the last 10 years.

The worst problem I've seen is a union of a pointer and an integer that was used as a set of booleans, and the code confused the two. So the last few bits of the pointer were sometimes being changed by setting or clearing the booleans, and the values of the booleans differed between systems, or if you changed command-line options, compiled for debug, etc.
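For illustration only, a hedged reconstruction of what such a union might have looked like (all names invented); setting a 'boolean' rewrites the low bits of the pointer, with consequences that depend on alignment, platform and build options:

#include <stdint.h>

/* Anti-pattern: two unrelated interpretations of the same storage. */
union tagged_ptr {
    void     *ptr;     /* sometimes used as a pointer...              */
    uintptr_t flags;   /* ...sometimes as a bag of boolean flag bits. */
};

static void set_flag(union tagged_ptr *t, unsigned bit)
{
    t->flags |= (uintptr_t)1 << bit;   /* silently corrupts t->ptr */
}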

Wilco

Reply to
Wilco Dijkstra

On a sunny day (14 Aug 2008 11:09:51 GMT) it happened snipped-for-privacy@cus.cam.ac.uk (Nick Maclaren) wrote in :

OK.

Yes this is what I do, so?

Yes, and 'overflow' is set by the format it reads, in this case the wave format, and that is fixed at 4GB.

No, it only writes output if it fits; if it does not, it switches to raw mode (actually it proposes a new command line), as raw mode has no file limit (other than OS and filesystem limitations).

See, I *wrote* this program for real, you are only dreaming about one.

Reply to
Jan Panteltje


That is _identical_ to the code I originally wrote as part of my post, but then deleted as it didn't really add to my argument. :-)

There are of course many possible alternative methods, including inline asm to use a hardware bitscan opcode.

Here's a possibly faster version:

int log2_floor(unsigned x)
{
    int n = -1;

    while (x >= 0x10000) { n += 16; x >>= 16; }
    if (x >= 0x100)      { n += 8;  x >>= 8; }
    if (x >= 0x10)       { n += 4;  x >>= 4; }

    /* At this point x has been reduced to the 0-15 range, use a
     * register-internal lookup table: */
    uint32_t lookup_table = 0xffffaa50;
    int lookup = (int) (lookup_table >> (x+x)) & 3;

    return n + lookup;
}

or to make it branchless:

int log2_floor(unsigned x)
{
    int n = -1;
    int gt;

    gt = (x >= 0x10000) << 4;  n += gt;  x >>= gt;
    gt = (x >= 0x100)   << 3;  n += gt;  x >>= gt;
    gt = (x >= 0x10)    << 2;  n += gt;  x >>= gt;

    uint32_t lookup_table = 0xffffaa50; // 0011222233333333
    int lookup = (int) (lookup_table >> (x+x)) & 3;

    return n + lookup;
}

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

In article , "Wilco Dijkstra" writes: |>

|> I'd certainly be interested in the document. My email is above, just make |> the obvious edit.

Sent.

|> > |> I bet that most code will compile and run without too much trouble.
|> > |> C doesn't allow that much variation in targets. And the variation it
|> > |> does allow (eg. one-complement) is not something sane CPU
|> > |> designers would consider nowadays.
|> >
|> > The mind boggles. Have you READ the C standard?
|>
|> More than that. I've implemented it. Have you?

Some of it, in an extremely hostile environment. However, that is a lot LESS than having written programs that get ported to radically different systems - especially ones that you haven't heard of when you wrote the code. And my code has been so ported, often without any changes needed.

|> It's only when you implement the standard you realise many of the issues are
|> irrelevant in practice. Take sequence points for example. They are not even
|> modelled by most compilers, so whatever ambiguities there are, they simply
|> cannot become an issue.

They are relied on, heavily, by ALL compilers that do any serious optimisation. That is why I have seen many problems caused by them, and one reason why HPC people still prefer Fortran.

|> Similarly various standard pedants are moaning
|> about shifts not being portable, but they can never mention a compiler that
|> fails to implement them as expected...

Shifts are portable if you code them according to the rules, and don't rely on unspecified behaviour. I have used compilers that treated signed right shifts as unsigned, as well as ones that used only the bottom 5/6/8 bits of the shift value, and ones that raised a 'signal' on left shift overflow. There are good reasons for all of the constraints.

No, I can't remember which, offhand, but they included the ones for the System/370 and Hitachi S-3600. But there were also some microprocessor ones - PA-RISC? Alpha?
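A small sketch of what coding shifts 'according to the rules' can mean in practice (my example, not from the thread): keep the operand unsigned and the count strictly below the width, and none of those hardware differences can show through.

#include <stdint.h>

/* Set bit n of a 32-bit mask.  An unsigned operand and an in-range count
 * stay within what the standard guarantees, whatever the hardware does
 * with signed operands or out-of-range shift counts. */
static uint32_t set_bit(uint32_t mask, unsigned n)
{
    if (n < 32)
        mask |= (uint32_t)1 << n;
    return mask;
}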

|> Btw Do you happen to know the reasoning behind signed left shifts being
|> undefined while right shifts are implementation defined.

Signed left shifts are undefined only if they overflow; that is undefined because anything can happen (including the CPU stopping). Signed right shifts are only implementation defined for negative values; that is because they might be implemented as unsigned shifts.

|> It will work as long as the compiler supports a 32-bit type - which it will of
|> course. But in the infinitesimal chance it doesn't, why couldn't one
|> emulate a 32-bit type, just like 32-bit systems emulate 64-bit types?

Because then you can't handle the 64-bit objects returned from the library or read in from files! Portable programs will handle whatever size of object the system supports, without change - 32-bit, 64-bit, 48-bit, 128-bit or whatever.

|> Actually various other languages support sized types and most software
|> used them long before C99. In many cases it is essential for correctness
|> (imagine writing 32 bits to a peripheral when it expects 16 bits etc). So
|> you really have to come up with some extraordinary evidence to explain
|> why you think sized types are fundamentally wrong.

Not at all. That applies ONLY to the actual external interface, and Terje and I have explained why C fixed-size types don't help.

Regards, Nick Maclaren.

Reply to
Nick Maclaren
