CAN bus reply problems - Page 2

Re: CAN bus reply problems

Having read a number of articles and threads recently on this subject,
it seems to me that despite CAN's excellent hardware based
acknowledgement & retry system, the above statement is probably true
once you have three or more processors on the bus and certain types of
message being sent.

Consider a system with processors P1, P2 and P3.

P1 wants to send a message to P2. The message is not one of the often
quoted "nice" CAN bus examples whereby P1 is constantly spewing out
repeated readings of a sensor so that P2 or anyone else may "consume"
them; the loss of a message in this scenario isn't so important as the
next reading will usually suffice. Instead, the message is an
instruction for P2 to perform something, such as turn an I/O line on, or
write some data to an LCD, and it is therefore 100% essential that P2
receives this message or the product fails.

So, P1 sends the message, and gets the hardware ACK. But the ACK came
from P3, who isn't interested in consuming the message.

From what I understand, although P2 "should" generate an error to
destroy the ACK if it detects an error, there are a number of
circumstances where it may not and P2 may "lose" a message.

1. A software bug in P2.
2. A receive overflow in P2.
3. Errata in the P2 CAN controller.
4. P2 has gone error-passive or bus-off.
5. Are there any other reasons?

Admittedly, (1) should be fixed and would be a problem even in a two-
node system, but (2) may be unavoidable on certain smaller CAN
controllers with limited FIFOs, (3) is unavoidable unless you change to
another processor/CAN device, and (4) is actually designed to happen.
I'd truly like to know if there is a (5).

So, it would seem in this situation that despite the hardware based ACK
system present in the CAN controllers, you must still produce a high
level protocol which provides a software based mechanism for
acknowledge, timeout and retry.
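
A minimal sketch of what such a high-level layer might look like on the P1 side, assuming every command carries a small sequence number and P2 answers with an application-level ACK frame. The names `ReliableSender`, `send_frame`, `wait_for_ack` and `FlakyReceiver` are hypothetical and not part of any CAN driver API; the "bus" here is just a simulated P2 that loses the first copy of each command.

```python
import itertools

class ReliableSender:
    """Application-level acknowledge/timeout/retry on top of CAN (sketch only)."""

    def __init__(self, send_frame, wait_for_ack, max_retries=3):
        self.send_frame = send_frame      # hypothetical: callable(seq, payload)
        self.wait_for_ack = wait_for_ack  # hypothetical: callable(seq) -> bool, False on timeout
        self.max_retries = max_retries
        self._seq = itertools.count()

    def send(self, payload):
        seq = next(self._seq) % 256       # assumed 8-bit sequence number in the frame
        for attempt in range(1 + self.max_retries):
            self.send_frame(seq, payload)
            if self.wait_for_ack(seq):
                return attempt + 1        # number of transmissions that were needed
        raise TimeoutError(f"no application-level ACK for seq {seq}")


class FlakyReceiver:
    """Simulated P2 that silently loses the first copy of every command
    (standing in for a receive overflow) and acknowledges the retry."""

    def __init__(self):
        self.copies_seen = {}
        self.acked = set()

    def on_frame(self, seq, payload):
        self.copies_seen[seq] = self.copies_seen.get(seq, 0) + 1
        if self.copies_seen[seq] >= 2:
            self.acked.add(seq)

    def has_acked(self, seq):
        return seq in self.acked


p2 = FlakyReceiver()
p1 = ReliableSender(p2.on_frame, p2.has_acked)
attempts = p1.send(b"\x01")   # first copy is lost, the retry gets through
```

In real firmware `wait_for_ack` would block on a timer and the receive queue; the point is only that the hardware ACK plays no part in deciding whether P2 got the command.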

Such lost messages may only be one in a billion, but if my product sends
a billion or more messages per week and it doesn't include a high-level
acknowledge, timeout and retry mechanism, then I'll have a product MTBF
of a week or less which is totally unacceptable.
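
The arithmetic behind that estimate can be written out as follows. Both figures are the illustrative ones from the paragraph above (one loss per billion messages, a billion messages per week), and losses are assumed to be independent, which is a simplification.

```python
import math

p_loss = 1e-9          # assumed probability that any single message is lost
msgs_per_week = 1e9    # assumed traffic

# Expected number of lost messages per week, and the resulting MTBF in weeks
# if every lost message counts as a product failure.
expected_losses_per_week = p_loss * msgs_per_week
mtbf_weeks = 1 / expected_losses_per_week

# Probability of at least one loss within a week: 1 - (1 - p)^N,
# computed with log1p/expm1 to avoid rounding problems with tiny p.
p_at_least_one = -math.expm1(msgs_per_week * math.log1p(-p_loss))
```

With these numbers the expected loss count is one per week, and the chance of seeing at least one loss in any given week comes out at roughly 1 - 1/e, i.e. about 63%.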

I'd be interested in the opinion of others here. I'm in the process of
firmware development on my first CAN based system and only have one of
the nodes up and running in loopback mode for now so I can't assess
reliability on a three-or-more-node system. But based on the fact that
the possibility of a message going missing isn't completely zero, I'm
taking the view that I must implement the additional high-level ACK
mechanism. The general view I sense from reading CAN articles is that
although CAN's error mechanism is extremely robust, it's not 100%, and
stuff does occasionally go missing.

Re: CAN bus reply problems
[Robotics removed from F'up2 list --- should have been done much
earlier...]



You understand that incorrectly.  No CAN node can possibly "destroy"
an ACK being flagged by some other node.  And "generating an error"
(by which I assume you mean "sending an error frame") for reasons not
already diagnosed by the CAN protocol itself would be a layer model
violation.  Application layer errors have no business generating
transport/link layer errors.  That's also the reason why CAN
controllers typically don't support sending error frames on purpose:
if an error frame needs to be sent, the controller will do that all by
itself.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: CAN bus reply problems

No, you misunderstood me.

I was talking about the situation where the CAN module detects an error
at the hardware level and deliberately generates an error frame to stop
the transmitter believing the frame was acknowledged. Nothing to do with
software.


Re: CAN bus reply problems

One thing to watch for that hasn't been pointed out is that a CAN node
may receive multiple valid copies of the same message.  This has two
consequences.  The first is that toggling state based on message
receipt is a bad idea.  The second is that any acknowledge/retry scheme
has to be able to recognize and discard duplicates if necessary.
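
A small sketch of both points on the receiving side, assuming each command carries a sequence number chosen by the sender. `CommandReceiver` and its method names are made up for illustration. Output commands carry the absolute state, so a repeat is harmless; the LCD write is not repeatable, so duplicates are filtered by sequence number but still acknowledged.

```python
class CommandReceiver:
    """Receiver that tolerates duplicate copies of a command (sketch only)."""

    def __init__(self):
        self.output_on = False
        self.lcd_text = ""
        self.last_lcd_seq = None   # sequence number of the last LCD write applied
        self.acks_sent = []        # stands in for ACK frames sent back to the sender

    def on_set_output(self, turn_on):
        # Absolute state rather than "toggle": applying it twice changes nothing.
        self.output_on = turn_on

    def on_lcd_write(self, seq, text):
        if seq != self.last_lcd_seq:
            self.lcd_text += text      # apply only the first copy
            self.last_lcd_seq = seq
        # Acknowledge duplicates too: the sender may have missed the first ACK.
        self.acks_sent.append(seq)


rx = CommandReceiver()
rx.on_set_output(True)
rx.on_set_output(True)        # duplicate, no effect
rx.on_lcd_write(7, "Hi")
rx.on_lcd_write(7, "Hi")      # duplicate, discarded but acknowledged
```

Remembering only the last sequence number is enough when the sender has at most one command outstanding; anything more elaborate would need a window of recent numbers.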

Robert

Re: CAN bus reply problems
On Fri, 8 Apr 2005 15:51:33 -0400, R Adsett


Apart from some strange networks with multiple store-and-forward
repeaters, it is hard to imagine how such situations could
happen.

Basically this would require that the transmitter has recognised an
error (missing ACK or error frame) and thus resends the message.
However, your node did not detect that anything was wrong and
accepted the message the first time.

A properly working receiver should check the CRC, the ACK fields _and_
check that at least six recessive bits are received in the End Of
Frame field.

If your receiver is satisfied once the frame you are interested in has
passed the CRC check, and immediately accepts the message without
checking the ACK and EOF fields, you are going to get duplicates if
the transmitter works according to the standard. Another node may
have generated an error frame, which the transmitter detects and
answers with a retransmission, but your receiver is content with the
first copy.

Paul
 

Re: CAN bus reply problems

That's indeed what happens, if a bit error hits exactly the wrong bit
in the CAN message: the last bit of the end-of-frame field.  This bit
is checked by the transmitter, but not by the receiver(s).  So, if
this bit is struck by an error, the transmitter will detect this as a
"form error", and re-send, but the receiver will not have noticed any
problem.
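
As a rough illustration of that asymmetry, here is a deliberately simplified model in which a single error hits one bit of the 7-bit end-of-frame field and is seen identically by every node. The function name and the dictionary keys are invented for this sketch; a real controller has more states than this.

```python
EOF_BITS = 7  # length of the CAN end-of-frame field, in bits

def eof_error_outcome(bad_bit):
    """bad_bit: 1-based index of the EOF bit hit by an error, or None for a clean frame."""
    # The transmitter checks every EOF bit, including the last one.
    transmitter_sees_error = bad_bit is not None
    # The receiver has already accepted the frame by the time the last bit arrives.
    receiver_sees_error = bad_bit is not None and bad_bit < EOF_BITS
    return {
        "transmitter_resends": transmitter_sees_error,
        "receiver_keeps_first_copy": not receiver_sees_error,
        "duplicate_at_receiver": transmitter_sees_error and not receiver_sees_error,
    }

# Which EOF bit positions lead to the receiver getting the message twice?
duplicate_bits = [b for b in range(1, EOF_BITS + 1)
                  if eof_error_outcome(b)["duplicate_at_receiver"]]
```

In this model only the last position ends up in `duplicate_bits`: an error in any earlier EOF bit is noticed on both sides, so the first copy is thrown away before the retransmission arrives.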


No.  That's not the actual definition of a properly working receiver.
A proper CAN receiver will *not* look at the last bit of the EOF
field.

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Re: CAN bus reply problems

I've seen it happen.  Not frequently but far more than could be ignored
even if you were inclined to do so.

Robert


Re: CAN bus reply problems


At least you'll be forewarned.  Non-systematic errors will supposedly
hit any bit of a CAN message randomly, at equal probability.  So this
particular error will occur at most 1/50 as often as the other types
of error, which both transmitter and receiver notice --- less if you
use longer CAN messages.

Keeping an eye on overall error-induced frame retransmission rates
thus provides a handle on how often to expect this particular error.
Combined with the requirements of the communication at hand, one can
design the amount of countermeasures to match the risk.
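
A back-of-the-envelope version of that estimate, assuming (as above) that errors hit every bit of a frame with equal probability, so that about one retransmission in `frame_bits` is of the duplicate-producing kind. The function name is made up, the default of 50 bits is just the short-frame figure used above, and the retransmission count in the example is an invented number rather than a measurement.

```python
def expected_duplicates(retransmissions, frame_bits=50):
    """Rough upper estimate of duplicate deliveries among observed retransmissions.

    Assumes errors are spread uniformly over the bits of a frame, so the share
    that lands on the last EOF bit is 1/frame_bits. Longer frames give a
    smaller share.
    """
    return retransmissions / frame_bits

# Hypothetical example: 1000 error-induced retransmissions logged per day.
per_day_short_frames = expected_duplicates(1000)
per_day_long_frames = expected_duplicates(1000, frame_bits=100)
```

For the hypothetical 1000 retransmissions a day this gives about 20 duplicates a day with short frames and about 10 with frames twice as long, which is the kind of figure to weigh against how much damage a duplicate can do.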

--
Hans-Bernhard Broeker ( snipped-for-privacy@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Termination
Yes, improper termination seems to cause errant behaviour.
This could also lead to the situation where you're seeing
CAN "messages".  Consider two CAN nodes only.  If the bus is incorrectly
terminated, then a possible ACK would never be received by the
transmitting node.  Someone on the bus has to send an acknowledge
reply or the transmitting node will keep transmitting.  This causes a
lot of bus activity, but no messages are being recognized.  Also, what
transceivers are you using?  Not all transceivers seem to work
together.

