Random "Bad_NoDataAvailable" errors when writing bit tags

Sorry for the verbose description but this is a curly one…

I have a project that includes a number of “Stored Procedure” groups (11 at the moment) which all call the same SP in the database, pass in 6 parameters (all tags) and write back 4 bit tag values. These procedures trigger every 2 sec and on average take 0.050 sec to complete.

The PLC is a ControlLogix 5500 series connected via Ignition OPC-UA using the “Allen-Bradley” driver suite over Ethernet.
The Ignition version is 7.1.5 (b5670), running on a blade system with a dedicated SQL2005 instance on a separate blade from the Ignition gateway system.

The problem is that very randomly (maybe once every couple of days or more) a group will “error out” when trying to write the result bits to the PLC tags, or at least that is what the Events log indicates. When it happens, all 4 bit tags have the same result code (Bad_NoDataAvailable), but ironically the “Count” field displays a count of zero. At the same timestamp there is an info message saying “Group state changed to ‘errored’”, also with a zero count. In each case there is another event 8-9 sec later indicating the group state was set back to ‘running’, but this time with a count of 1.

All of the groups' tags exist in the SAME PLC, yet the “problem” only appears to affect one group at a time, without any discernible pattern.

The four written Bits are crucial for returning important status information back to the PLC and the failure to write causes some grief at the PLC end when it doesn’t see the tags change. (Two of those tags form part of a “Handshake” between the PLC & Ignition)

I have tried changing between the tags being “Subscription” and “Read” to no avail.
I really want to understand what is causing this error & fix it if possible rather than putting a “band aid” fix into the PLC program. (Especially as it isn’t me that deals with the PLC code).

Regards,

Peter

Just bumping this to get a reply…

Some random thoughts for your random problem…

Sounds like a bit is getting dropped in the comms path.

Do you know if the underlying protocol is UDP or TCP? If it’s UDP, it could be that a message is getting corrupted.

Is it fiber or twisted pair? Is there anything else on the network that could be flooding or interfering?

Robert,
Underlying protocol is TCP and it is a combination of Gigabit fibre + twisted pair.
Not sure of your definition of “a lot of activity on the network” but I’d say it was fairly busy yet not too busy for existing standard PLC control traffic. The area has its own VLAN with redundant gigabit fibre between HP Procurve gigabit switches. The Ignition server resides on a separate VLAN & traverses a firewall to get to the PLC.

I did have a close look at the Console event logs & found an instance where the error occurred. There was an error shortly after the initial one that said something like “response found when not requested” or something along those lines (I should have taken notes :blush: ), so you may have a point with the “bit is getting dropped in the comms path”. Any idea how I can “stretch” the wait time for the PLC responses? Or am I barking up the wrong tree here?

Peter

Sorry Peter,

I can't help much more than this. It just really sounds like a networking problem. TCP retransmits lost packets, so the messages should be getting through (though nothing says they won't be delayed). If they don't get through, the sender is notified and should retry.

At this point I would be asking my networks guru to break out Wireshark/whatever is needed to verify that the network is working ok and not running near the edge. Funny thing about Ethernet is a bad cable in one section can cause all sorts of problems somewhere else on the other side of the network.
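
If a packet capture isn't practical straight away, a rough probe run from the gateway side can at least show whether the PLC's Ethernet module is sometimes slow to answer or refuses connections. This is only a sketch, not anything Ignition provides: the function names are made up, the 0.5 sec "slow" cut-off is an arbitrary assumption, and it only times a plain TCP connect to port 44818 (the standard EtherNet/IP TCP port), so it says nothing about the driver's own request timeouts.

```python
import socket
import time


def connect_latency(host, port=44818, timeout=2.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    44818 is the standard EtherNet/IP TCP port; timeout is an assumed value.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections and timeouts
        return None


def summarise(latencies, slow=0.5):
    """Count failed and slow attempts in a list of connect_latency() results.

    'slow' is a hypothetical threshold in seconds; tune it for your network.
    """
    failed = sum(1 for value in latencies if value is None)
    slow_count = sum(1 for value in latencies if value is not None and value > slow)
    return {"attempts": len(latencies), "failed": failed, "slow": slow_count}
```

Calling `connect_latency()` with the PLC module's address every few seconds over a day or two, logging each result with a timestamp, and then lining up the failed or slow attempts against the group's ‘errored’ events would show whether the two coincide.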

Just to update & close this off.

We found that yes, there was an issue with comms, but it wasn’t what you’d expect. The Ethernet module on the PLC rack in question was being “overloaded” with TCP/IP messages (both Ignition & other PLCs + PanelViews in the area).

Fix: Our PLC programmer/electrician installed a 2nd Ethernet module and split the traffic (our Ignition stuff versus process-related messages) and the problem was solved.
Apparently the Ethernet modules don’t like running above a certain percentage of utilisation constantly and will perform their own “reset” in an attempt to rectify what they see as a “problem” (I can’t remember what the actual value is but I believe it’s quite low, I think around 70% or thereabouts, but don’t quote me :scratch: )
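
For anyone wanting to keep an eye on this, here is a minimal sketch of flagging sustained high utilisation from a list of periodic readings. Everything in it is an assumption: the 70% threshold is just the half-remembered figure above, `min_run` is arbitrary, and how you collect the utilisation samples from the module is left open.

```python
def sustained_over(samples, threshold=70.0, min_run=3):
    """Find runs of consecutive samples above threshold.

    Returns a list of (start_index, length) for every run that is at least
    min_run samples long. threshold (percent) and min_run are assumed values.
    """
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that is still open at the end of the data.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs
```

Short spikes are ignored on purpose, since the trouble seemed to come from running high constantly rather than from the odd burst.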