Performance Tips & Tricks for Optimizing Gateway Networks

60 minute video / 57 minute read

Speakers

Travis Cox

Chief Technology Evangelist

Inductive Automation

Getting the most out of your Ignition gateway network is important to your system’s performance, especially for large implementations. In this session, you’ll get expert tips about how to optimize the performance of your gateway network for heavy workloads.

Transcript:
00:06
Travis Cox: Alright. Hello, everybody. Welcome, welcome. Alright, so this session here is gonna be a good one on “Performance Tips and Tricks for Optimizing Gateway Networks.” So I think it's a good session to go into the details of how the Gateway Network works, how we can make it more efficient, how we can get larger systems out there. So you all know me, Travis Cox here. I'm happy to be a part of this session. I'll be doing a couple of others as well at the conference over these couple of days. So for today though, on this session, we're gonna first look at what is the Ignition Gateway Network, go into a little more detail about what that does, some of the architectures that are possible with the Gateway Network, and then we're gonna look at all the different services that ride on the Gateway Network, and then we're gonna get into those... Lots of different tips and tricks. We have quite a few for you here today.

01:06 
Travis Cox: So let's start by looking at what the Ignition Gateway Network is. So first and foremost, the Ignition Gateway Network is an Ignition-to-Ignition communication network, so it allows two Ignition servers, whether they're in the same physical location or different locations, to talk to each other and exchange information, exchange data across those two. It makes that part really, really easy. And it unlocks a lot of different architecture possibilities, it allows us to distribute features of Ignition or distribute the load of a system.

01:43 
Travis Cox: If you look at... Going back a ways, a lot of people just would have one Ignition server that would do everything. We can actually split apart one Ignition server into multiple that can accomplish different features, and we can connect them all together utilizing this Gateway Network to make it look and feel like it is one... Actually one system. So we're gonna look at it in a lot more detail here today, but it is one where on one Ignition server, you make an outbound connection to the other, so we establish a trusted communication between those two servers so that we can actually, again, exchange data. And it's a highly secure system. It is built on TLS, and we have certificates that we create automatically when Ignition starts up, a self-signed certificate, but you could load in your own if you wanted to for the Gateway Network.

02:33 
Travis Cox: And there's a lot of security on what data can be exchanged and who can do what within that network. So we'll look at some of those things here today. But another important piece about the Gateway Network is it also allows for proxying. That is really what allows us to especially work across a Purdue network, a segmented network where we have layer three, the OT layer, layer four, which is the business side, and we can basically connect those two together by having Ignition in the DMZ, so that we can bring that data through in a secure way. So we'll talk more about the proxy nodes, about how it really can create a mesh network that's within our system.

03:16
Travis Cox: So that's what the Ignition Gateway Network is. It allows the Ignition servers to talk to each other and to exchange information and to ultimately make it feel like it's one system. Alright. So as I mentioned, a lot of people start out with Ignition, they'll start with one Ignition server, and our licensing is completely unlimited. You can add as many tags, screens, projects, device connections as you want to that server. Of course, the license doesn't limit you in any way. However, at some point, you're gonna get limitations on hardware, so you might run into performance problems if you, for example, have 1.5 million tags in that system or you're trying to connect to 500, 600 different devices that are out there, or if you have hundreds of people that wanna look at data in the client. And typically, when people start out at the beginning, it's... The system is quite small, but it's over 5-10 years where they start ramping that up, they start adding more to the system and over time, if you're not really paying a lot of attention, it can... Because it's so easy to keep adding on to it, you could start seeing some of those performance hits or at least getting... Seeing CPU ramp up or a lot more memory usage.

04:23 
Travis Cox: So one of the most important parts of the Gateway Network, one of the things that we can do with it, is to scale out the system or separate out functionality with Ignition, so instead of having one Ignition server, being able to have two. And the example we're showing here on this screen is to have a backend and a frontend. And it's a pretty common way of utilizing the Gateway Network, so that we can have two servers that are gonna be able to accomplish what a single Ignition server could have done, but allow us now to do a lot more because we have dedicated resources for both sides of the fence. So on the backend, that can be the drivers to the PLCs, it can be the tag historian, the alarm notification, it could be gathering all that data and storing it into the SQL database. And on the frontend, that would be handling the application so people can launch a Vision client or launch a Perspective client and see that data. So the Gateway Network is what would make that happen. So we can actually have Ignition installed on two different servers. Of course, they'd be licensed on two different servers, but only for the modules they need. We connect them together, and now to the user, when they open the application, they have no idea that the data is coming from a backend server, a different server, but by having dedicated resources, we get more scalability out of it.

05:40
Travis Cox: We can have a larger system, we can start scaling out even further, we can introduce more backend servers, we can introduce more frontend servers, and we'll talk more about that here. But another important piece the Gateway Network gives you is a separation of concern. So if I have one Ignition server, if you will, it is all your eggs in one basket, so as your... If you have multiple areas in your plants, multiple different projects on that, they'd all become pretty critical. Over time, it's easy for one system, one project to affect another if they're doing something, or with different engineers on different sides. So a good use of being able to separate all the functionality is to create a separation of that concern. We can have a server dedicated to the backend, and I can have the frontend where I'll open the application, and the application is not going to affect the backend. It's on a different instance and vice versa. Alright. So this is one of the most common cases of the Gateway Network, is to separate out this functionality. Now another real common use case of it is you may already have Ignition at your sites, at your facility, and you wanna be able to connect that up with enterprise, connect it to corporate. You wanna be able to get data up there and provide more enterprise functionality.

06:56 
Travis Cox: So a great use of the Gateway Network is to connect your site up to a corporate level. You can have the Ignition server up at, it could be on the cloud, it could be at your corporate data center, and you can connect the sites going outbound to your corporate center, and now we can share data with that. We can aggregate data, we can bring data from all over at different sites and have it stored in a database at corporate, so we can have, let's say, maybe a long-term storage of the data. Or maybe we could just query it and bring it up there so we can see it on a screen, so we can see all that data across our entire infrastructure in one place. So that can really... It could be a really great use of the Gateway Network. It allows you to really increase visibility of what's happening, and you also get the opportunity, if you start introducing a corporate system, to do more management. So we have the enterprise administration module where we can check health and diagnostics of all of the servers at our sites, we can collect backups, we can do disaster recovery, we can synchronize projects and resources when you get into sort of a development to testing to production life cycle.

08:02
Travis Cox: I'm not gonna go into that here today. We're gonna focus on the Gateway Network, but this is a great simple use case to be able to take your sites, connect it up with a corporate system and be able to now have a bigger... A much bigger system out there where everything's connected together. And it is possible through that that a site can communicate to another site through the corporate location. That's kind of the proxying idea if you want to allow that so that you don't have to connect sites directly together. They can potentially have data flow by going through corporate if you had some need for that to happen, but that's another good use case of the Gateway Network. Now I mentioned proxy nodes here earlier, and this is an important thing that we have in the Gateway Network because we need to be able to fit it into any kind of architecture that's out there that customers have. So we see very highly segmented networks these days, and especially if we have an operations network, and a business network, there's no way you're gonna get them to talk to each other directly.

09:00
Travis Cox: Well, at least a lot of IT professionals now are simply not gonna allow that, so they have to be able to go through a DMZ layer or potentially multiple layers depending on what they're trying to accomplish. So with the Gateway Network, we can put an Ignition server in that middle layer, and I'll... Here looking at a DMZ, and really it's just the Ignition platform. There's nothing that has to be configured on it. We're just gonna make an outbound connection from our site to our DMZ, so that we can... We don't have to have any inbound ports at our site opened. So make an outbound to keep that secure so we're not having... We're not connecting down to our sites and then we can have a corporate site connect into DMZ as well, and by having those two connections to that Ignition server, the DMZ, we can allow data to transfer between that and in fact, we have discovery. So that's what the mesh network gives you. So the site will see corporate, corporate will see site by being connected through that DMZ. So we only have to make these connections from one place. We don't have to go make multiple connections, and the direction really matters in terms of how we make that connection. I'll be talking more about that here as we go forward.

10:10
Travis Cox: So another important piece about proxy nodes is that it will create... It enhances security, right? It makes it harder. There's more places that data has to flow through and there's more firewalls, there's more security that's in place, so that you're not necessarily opening things up. And segmentation, having that is... There's a reason that it's there, is to make it harder for things... For attacks to actually happen. But we've done a lot of work to ensure that the Gateway Network itself is highly secure and that it allows these kinds of configurations. So another thing I wanna point out real quick is that proxy nodes, they can avoid busy connections, in that a lot of people would do something like this where they would connect A to B and to C and to D, and they might then connect B to C and B to D and all of that. And if you really look at it, it can be kind of a complicated mesh of all these different connections at the end of the day too, because it's a mesh network and we have proxy nodes. We're trying to dictate... We're trying to figure out the path of where we're gonna send that data and the simpler we can make the architecture, it's...

11:23
Travis Cox: The easier it's gonna be. But if we have a picture like this, it can lead to stability issues, there can be a lot of different paths that they can take, and it may not be the original intention in the first place anyways. So we can actually do something like this. If we did have a need to connect these things together, we can have a single Ignition instance in the middle that all of these different servers, so now A, B, C, and D, can connect to, and they all see each other through that proxy node, so it avoids a lot of crazy complexity in the connections. And then there's only one place to manage the firewall. So as I mentioned, the Gateway Network is an outbound connection that uses web sockets for that, so it is fully bidirectional, but it's made from one location. And so here all these servers can make a connection to that proxy server, and that's the only place we have to control the firewall for the Gateway Network. We don't have to do it on any of the other servers that are there. More importantly, when we look at this kind of system here with the DMZ in the middle, they're only gonna allow connections into the DMZ, not into corporate or not into the site. It keeps those...

12:31
Travis Cox: We can really lock those things down quite a bit. Okay, so I mentioned a little bit about how the Gateway Network works. It's a dedicated HTTP channel, data channel, and we're using web sockets and HTTP for that to happen. There's a separate port for it. It's 8060, it is configurable, you could change that, but that's the default for it, and the Gateway Network itself handles multiple streams of data, and it's queue based. It's very much modeled after other queuing systems like Kafka and things like that. We have lots of different subsystems in Ignition that have their own queues. So for example, if I have remote tags, sending tag updates through that would be a separate queue from, for example, tag history or some other EAM services that are on there. So we'll talk more about those, but those queues are prioritized, they could be synchronous or asynchronous, and the queues themselves can also have limits to avoid flooding the Gateway Network, because if we have a lot of messages going out in the network, we want to ensure that we prioritize the most important ones. We also wanna make sure that we're not putting too much on there that we're gonna overwhelm the system, right?

13:41
Travis Cox: So we do that in Ignition quite a bit, and I'm gonna show you a couple of tips here in terms of some settings that you can change to really control how that actually works. So that's how the Gateway Network... That's how it's physically... How it physically works. And if you look at some data flow, it's kind of hard to go through this, every little detail here, but I have Ignition A that is connected to Ignition B with the Gateway Network. So A is creating an outbound connection to B so we establish that connection. So there's a web socket connection that is created. So you do have to allow web sockets in the firewall for that to work. And then ultimately, let's say that B wanted to ask for information or... Sorry, that A wanted to ask for information from B, like tag history, and it's gonna send a message through the web socket saying, “Hey, I need tag history data." And then B is gonna send a message back saying, "Okay, it's ready." And ultimately, through an HTTP channel, we're gonna go get the data. We're not putting that data through the actual web socket itself. So it's a combination of web socket to have all the management of just simple data going back and forth and HTTP channel for actually gathering that information.

14:52
Travis Cox: But you'll see that the connection, it's always A asking B because it's an outbound connection, right? So that's why we don't have to have a firewall open on A so that B doesn't have to go... Is not going down on a separate connection. It's using the web socket to send those messages bidirectionally. If you look down here towards the bottom, B wants to request... Wants to send some stats to A. It'll basically tell A through the web socket that stats are available, A goes and gets the actual data from B and gets the stats, and that's what EAM would use. So it's kinda... I just gave you a sense of how that data flow works, how that connection works. It's always nice to see some diagrams, especially when we're trying to really troubleshoot and understand what our systems are doing. Okay, so one thing that you'll see is when you connect two Ignition servers together and you were to use Wireshark or look at the traffic that's happening between it, there are idle requests that are happening, so it's not like no data's being sent through initially. There is data, and it's a little bit more chatty than MQTT or a lot more chatty, but we're trying to ensure that things are alive, there are pings happening, we're...

16:08
Travis Cox: Because the Ignition servers have service security and different services that are available, we're enumerating those services across it, so you'll see these kinds of messages happening. On our status page, you'll see the stats of the incoming and outgoing bytes. So you can get a sense of what's happening. Plus, there's a lot of other diagnostic information. You can see exactly what each of the queues are doing. I'll show more of that as we look at it here today. What's really important is to establish that connection, understand what that baseline looks like, and then as you add services to it, as you're using that Gateway Network, you can get a sense of the load, what that data is... What data is being sent through, and how is that gonna affect me as my system grows or gets larger, especially if we have a lot of sites bringing data up to a corporate location, we really want to make sure we understand what's happening there.

17:00
Travis Cox: Alright. So before we go into some of my tips, I wanna talk about just quickly the services that bring life to the Gateway Network. The Gateway Network itself is part of the Ignition platform. It just comes with every instance. It's the services that actually make it what it is. You have to be able to send data through it. Our EAM was one of the first modules to do that, and it's the management, so checking health and diagnostics, synchronizing projects, being able to collect backups, remote upgrades. All those things are part of that. And that's a big service that rides on there. Another is remote tags, probably one of the most important ones that's there. It's what makes separating a backend and a frontend possible, is that the frontend can have a remote tag provider. All those tags on the backend are available, but they don't have to be configured on there. It's just simply going to that server and getting those values, making it look like it's on that server, but it's not. And there's a lot that comes with remote tags. There's also remote tag history, both being able to send history data to another server to get stored somewhere, as well as being able to query history from a server and bring it over.

18:04
Travis Cox: So the frontend could query the backend and say, "I need history data. Bring it here." Or from my site, I can send historical data, aggregate it up to a corporate location, store in a corporate database. Both of those things are possible with it. Remote alarm history and notification, so we can also synchronize alarm journals, we can do alarm notification centrally. For example, we can send alarm messages through the Gateway Network up to a centralized server that maybe has a connection to our VOIP server or to our SMS Gateway, or to our email. You can use our Gateway Network to distribute that out so that especially edge nodes with alarms could send them out through a central one. There are remote audit logs, and there's also the ability to send just arbitrary messages through scripting over the Gateway Network.
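
As a rough illustration of that scripted messaging (not taken from the talk, and the project, handler, and gateway names below are placeholders), a message handler defined in a project on the remote gateway can be invoked over the Gateway Network with system.util.sendRequest:

```python
# A minimal sketch, assuming a message handler named "refreshRecipe" exists in a
# project called "SiteProject" on a remote gateway whose system name is "Ignition-B".
payload = {"area": "Line1", "command": "refreshRecipe"}

# Blocks until the remote handler returns a result (system.util.sendMessage is the
# fire-and-forget alternative).
result = system.util.sendRequest(
    project="SiteProject",
    messageHandler="refreshRecipe",
    payload=payload,
    remoteServer="Ignition-B",
    timeoutSec=30
)
print(result)
```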

18:52
Travis Cox: So these are all the services that we have in there and that's part... Because the Gateway Network is part of the platform, other modules, other things could take advantage of this Gateway Network. And, for example, Sepasoft definitely does with their MES modules, taking advantage of that Gateway Network in terms of being able to synchronize their production model with that. Okay, so that brings me to the tips and tricks here. This is the bulk of really what I wanna get through or share with you guys here today. It's all in terms of lessons learned, sort of seeing different systems that are out there and how they put them together, and some of, of course, the issues, the bottlenecks that we run into. So we're gonna look at the direction again of the outgoing connection. I'm gonna share a tip with that in terms of where do you make that connection from. We'll look more into security around the connections along with security on the services, the different things that I mentioned previously like EAM and all that we wanna control, who can do what. We're gonna talk about some optimizations with remote tag providers that can make it more efficient and make things just work a lot better and be more scalable.

20:00
Travis Cox: EAM optimizations. We're gonna talk about ping rate and latency optimizations, understand just the general traffic and the diagnostics of the Gateway Network, and then look at remote tag history optimizations. So we've got quite a few tips here for you today and hopefully, you can take this back, at least arm you with what's necessary to... If you haven't done this yet, to know what to look out for. But if you have a system in place, you can go and really inspect how it's performing today. So let's start with our direction of the outgoing connections. So the direction doesn't matter for Ignition. We can make a connection from either side, so if I've got A and B, A can connect to B, or B can connect to A. You don't do both. You just create one connection. It will be bidirectional. But you gotta figure out where you're gonna make that connection from. But it doesn't matter for Ignition, but it matters for IT. It's where they're gonna have that port open in the firewall. So typically, my recommendation is that you're going to be connecting upwards, so the local systems will connect to a DMZ or they'll connect up to a corporate or they are gonna connect up to a cloud. You're gonna go up typically or in the case of a DMZ, you're gonna connect into the DMZ from both ends so that the firewalls are limited on that side.

21:20 
Travis Cox: So pretty simple tip there, but that direction will matter, especially if you're talking about with... If IT people have questions about how it works, now you can explain to them how that... Where you're gonna make that connection from, how we're gonna lock that down, how we're gonna secure that connection. So that's the first. It's a simple tip, but one that a lot of people get tripped up on, especially the fact that they only need to make the connection from one side. And once you establish that, that connection, you've now... If there's other connections in the network, you got this mesh network. You can communicate through that. Alright, so the next tip here is on connection security. So as I said, the Gateway Network has a lot of security in place, so we have to make sure that the data is encrypted, we have to make sure that there are approvals. We don't want anybody just connecting to this network. Now with that being said, there are settings that control this and I've seen a lot of... For some people, the defaults are what we want them to be, but since you could turn these things off, I highly recommend not doing that. And basically ensuring that you're requiring SSL, so that you have to connect over an encrypted channel and that you have connection approval.

22:43
Travis Cox: So with SSL and connection approval, if I have Ignition A connecting to B, on B, we have to approve a certificate for the TLS connection. We also then have to approve A as being able to connect to B. So approvals are really, really important. You can also consider doing a whitelist. So on top of an approval, you can have a whitelist where if I have a DMZ server, I can truly limit who is gonna be... What systems are gonna connect into that. And that's all on the general settings of the Gateway Configuration. So if you go to the configuration page, all the Gateway Network stuff, you'll see those. So again, another simple tip, but by default, it's SSL and it's also approval only. Just ensure you use those, and you could require approval on both ends if you wanna be more strict on that connection, and the whitelist can really help with that. Okay, so secondly to that, we have service security. And not a lot of people realize that we have this with the Gateway Network.

23:52
Travis Cox: So service security is security policies that we put in place on each Ignition system that defines what data could be accessed, if data is... If that service is available, like if EAM is available or if remote tags is available, and if it is read-only, read/write, if... All these things. It's policies that define that. It's defined at the source, so every Ignition server is gonna be a source of its own data. And so if I'm... Doesn't matter the direction of the connection, but if I have established a mesh network, if I have any other server trying to get data from me, I'm gonna define the security for me so that... Because if I define it at the source, then nobody else can change that. So no matter who's connected in the network, they won't be able... If you're really strict about it, they will not be able to do anything. So I always say consider a zero-trust environment. You kinda deny everything by default, and this is not the default for Ignition when you start it up. So you're gonna go into the Gateway Configuration, you're gonna see a default zone. And for the service security, you're gonna see our defaults in there that simply allow all of the different services. And yes, remote tags are read-only, so it's not gonna allow read/write.

25:13
Travis Cox: A lot of people get tripped up by that by default. Tag history is query only, not storage. So we have everything allowed, but it's gonna be reading data, not writing on there and that's great. But I say, let's be more strict about who's gonna be in the network and what they can actually do. So what I would suggest is basically taking that default zone and basically editing the policy to... Do I have a slide on? Yeah, I do. Editing a policy on the default that simply denies all of those different services. You'll see alarm journal here, alarm notification, alarm status, tags. Deny them all the way down so that the default has a security policy defined. And by default, it's gonna hit that zone first. It's gonna say it's everything else... Everything is denied, so it simply... They will not... They're not gonna be able to do anything. So with that, then you can... We're gonna reject all these unwanted requests, then we can define actual security zones and policies for the things that we care about.

26:22
Travis Cox: So a security zone is something you can set up in Ignition that is an arbitrary zone. You can call it whatever you want. And it consists of... You give it a name. I called mine “Central Servers.” So I have... Basically, I have a bunch of central servers up there that could get data from me and I can specify the zone by IP addresses, host names, and gateway names. So you can use any of those. And in mine, I chose to use gateway names here, so Ignition B. Basically saying Ignition B is a server that's from my central side that I... That's in the zone and that I want to allow certain services for. So then on the right, I have a policy defined for that zone. And if you go back here, it's where you see I've got central servers and there's a policy. I can edit that one. That policy is to allow tag access. I want to allow that. Maybe I deny everything else, but maybe I do remote tags and allow it. And I can make that read/write specifically. So I'm basically saying that when A and B are connected together, B can get tags, because I specifically said on A that B is the only one allowed to connect and that B has the ability to get tags.

27:36
Travis Cox: So I kinda consider... I always think like if you really are strict about it, you then can control the data flow a lot better in your network. You don't have just arbitrary requests that are happening. It's very possible that I can have a lot of... I can have a pretty big mesh network and all of a sudden, you got a lot of requests happening over and you're... You got a lot of data being asked for and you may not want them to get that data. It could be inadvertent there. So we could be very, very strict on those. So that's an important one here, a tip that the more you consider how all the connections are happening, the more that you can optimize your network. Alright, so then moving on, remote tags is a really important service that we've got. It makes the backend and frontend act like one server as I mentioned. So remote tag providers, they allow you to read tags, you can also do writing, you can get alarm status data, you can do alarm acknowledgement within that. And it's a very efficient service in that the frontend server, if I have a client open, it will request... If I have tags on the screen, let's say I have 100 tags, it's going to request from the backend, "Hey, I wanna subscribe to these 100 tags." The backend then will see when those values change and when they do, it'll send it up to that frontend which then goes to your screen, so you can see those values update.
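
For a sense of what that remote tag provider looks like from the scripting side, here is a minimal sketch (not from the talk): the provider name "Backend" and the tag paths are placeholders, and in practice screen bindings do this for you via subscriptions, so the script form is purely illustrative.

```python
# Reads issued on the frontend against a remote tag provider are forwarded to the
# backend gateway over the Gateway Network.
paths = [
    "[Backend]Line1/Tank1/Level",
    "[Backend]Line1/Tank1/Temperature"
]

values = system.tag.readBlocking(paths)
for path, qv in zip(paths, values):
    # Each result is a qualified value with a value, quality, and timestamp.
    print(path, qv.value, qv.quality)
```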

29:00 
Travis Cox: So it's very much subscriptions, it's very much... It's pretty efficient by default. But there's one part of it though, that is something that I recommend always turning on and that is our alarm mode. So if you want to be able to... If I'm on a frontend server, I wanna put an alarm status component on there. I do wanna see my alarms from the backend. I wanna see what the state of them are, I wanna be able to acknowledge it, all of that. You want to allow alarms, but the mode is gonna be important here. There are two modes, and it's queried by default. I suggest changing that to subscribed for all of your remote tag providers because what that's gonna do is... 'Cause the remote tag provider does tag values, but also does alarm information. Tag values are subscribed by default, but alarms are not, and there's a reason for that 'cause there could be a lot of alarms, right? It could be a huge set of them. And by default, querying means if I have a screen open on the frontend and I'm querying the alarm status, it will go to the backend, query all the alarms for whatever filters we have, bring it back, and have that update at some frequency. Not a bad thing, but if you have a lot of alarms, if you've got a pretty big system, that can cause some performance issues. It's much better to just subscribe and say, "Hey, tell me when things change and I'll store it locally."

30:16
Travis Cox: So subscribe mode uses more memory on that frontend server, but has a lot more performance because those alarms will be immediately available on that frontend, 'cause everything is gonna be pushed to that server when it changes. So really one important one. I'm always looking for this with customers, especially if you have only a few alarms, no big deal. It's really not gonna matter, but if you have lots, you would consider this on the remote tag provider, especially when you have lots of these out there. Okay, so I... Let's see. I got... I won't probably talk through all of those. The other thing, okay, yeah, there's one con with doing the remote tag provider. A lot of people run into this, in that there is a delay, so by default on the frontend, if I have no clients open, then I'm not asking the backend for any tag data. Alright, nothing's happening besides our keepalives, the pings that are happening. So when I open a frontend, or I open a client and there's a screen that needs 100 tags, those tags are on the backend, the frontend has to then ask the backend, "Give me these tags." The backend will then send it over to the frontend.

31:23 
Travis Cox: There is a slight delay for that to happen. And so if you have a screen, you'll have a quality overlay on that screen just for maybe a couple seconds while it's asking for that, depending on how many tags you have, and that delay can confuse an operator. So if I have a screen open, I've got all these, it might... They don't like to see overlays. You can remove the overlays. It won't be a big deal, but if you really have tags that are incredibly important and you want them to be instantly available on the frontend, you can consider creating a reference tag.

32:00
Travis Cox: By creating a reference tag on the frontend server, you basically will... It'll always be subscribing to that tag, so that tag will be already available in the frontend, and you can just simply have it there. So I would say don't do it for all your tags, just the important ones that you wanna make sure are always there, so there's no delay in bringing them to a screen. Reference tags can help with that. Now we are hoping to add a feature to the remote tag providers that can handle this behind the scenes so you could just specify what tags you wanna have available. But I just wanna make that point. That's the one con. It works... Again, it works really, really well. The delay is not very long, but if you have a screen with 1,000 tags on it or more, big P&ID, if that bothers you, there's ways that you can make that better.
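
A minimal sketch of creating such a reference tag from a script, assuming a remote tag provider named "Backend" and the frontend's local "default" provider (these names and the tag path are placeholders; the same tag can just as easily be created in the Designer):

```python
# Reference tag definition: the frontend keeps this tag subscribed to the backend
# tag even when no clients are open, so screens see a value immediately.
refTag = {
    "name": "Tank1Level",
    "tagType": "AtomicTag",
    "valueSource": "reference",
    "sourceTagPath": "[Backend]Line1/Tank1/Level"
}

# Create it under the frontend's local provider; "o" overwrites an existing tag
# with the same name.
system.tag.configure("[default]", [refTag], "o")
```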

32:46
Travis Cox: Okay, so next tip here is on EAM. EAM is the management again, and the way that that works is you have one controller. Typically the controller is central, and you have lots of agents. All the agents who are going to... The agents will be configured to that controller, and that controller is going to... We'll be able to run these tasks. I can collect backups, I can synchronize a project. And one of the things that we do is checking health and diagnostics, and with that, the agent is sending statistics to that controller. And there's a frequency by which we do that. There's a setting on the agent for how fast to send those interval stats, and that's things like CPU, memory usage, database throughput, how many clients are open, those kinds of things being sent through to that controller. So I would just say look at your queues in the status page for EAM so you can kinda get a sense of how much is actually going through it. And potentially consider a slower rate if you've got a high number of servers in your network, or a lot of throughput, especially if there's a lot of other requests happening there; you may not wanna send stats as fast. Maybe you wanna do it every minute instead of every five seconds. Again, it depends on how fast you want the controller to be able to notify you when there's something that happened, right?
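
If you want to see that agent traffic from the controller's side, here is a hedged sketch (it requires the EAM module, runs on the controller, and the exact signature and returned columns should be verified against your version's documentation):

```python
# Pull the current EAM agent status dataset so you can see which agents are
# connected and get a feel for how much status data the controller is handling.
status = system.eam.queryAgentStatus()  # no filters: all groups, all agents
print(system.dataset.toCSV(status, showHeaders=True))
```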

34:10
Travis Cox: But by default, it's five and that's... And every server you have, if I have 100 servers, 100 agents, they're all gonna be sending their stats up to that controller every five seconds. So it's not a big deal, but it's something you can look at. All these things stack up in terms of the messaging that's there. So if you understand where all the different settings are and how to control these queues, then the more you are armed to make it more efficient to optimize what's there. Okay, the next down here is ping rate and latency optimizations. So as I said, we are constantly doing ping messages within the Gateway Network, and there are settings in Ignition that control how often we do those pings and when we consider... And when we timeout to consider that that's a faulted connection, that server went down, we're not getting responses back. We wanna be able to do that so we can fault it, but in general though, even if our network is healthy and the connections are there, there can be high latency. We can... We might be getting slow responses, there could be a lot of threads maxing out because we're putting a lot of data on our network, there could be a high number of requests that are happening out there, or even the Ignition server can be performing slowly. So if a server is performing slowly, it's gonna affect if people are... If another system is trying to ask for data, it may take longer for that to come back.

35:38
Travis Cox: My only point being here is that I would... I highly encourage you to go into the diagnostics page or the diagnostics tab, where you can then for each of your connections on your Gateway Network, you can test it. You can see what that... How long that response took to come back. Albeit a very small response, but how fast it happened. If you start seeing the response time going seconds or... I've seen as high as 10 seconds, that means that the server you're trying to talk to is... There's something going on, right? It's doing a lot of work. It could be threads maxed out, which I'll talk about threading in a minute, it could be just the servers performing slowly, high CPU at the moment. Who knows what that is? But I would encourage you to understand what your connections are doing, understand what your response times are, because you may then want to tune things like the ping rate and others accordingly. So we're pinging a server to see if it's alive, and we're doing that at a particular rate. By default, I think it's a one-second rate, so every outgoing connection, we're doing that at a one-second rate. We are then looking for timeouts and we're looking for a certain number to happen before we consider it faulted and when it faults, we close the web socket and we gotta start up a new one, right?

36:57
Travis Cox: So if you have high latency on your connection, and we are doing these pings and we're timing out a lot, we're gonna consistently... Constantly be closing connections, reopening connections, and if you got a big network and you got a lot of mapping, it can look like your controller or your central server is really performing very, very poorly, or things are just not... No messages are being sent around or it's taking a very, very long time. So with that, take a look at those settings. You can increase the ping rate. I kind of recommend going with like a 10-second ping rate, especially when you have a higher latency or a lot of servers in that network, and you can consider... That's on the outgoing... Sorry, the incoming side. On the incoming side, you'll see the settings as well as the outgoing. We have to ping from both sides because we have to make sure that on both sides, we know that the other side is available, 'cause the web socket can be up and we may not see anything through that web socket channel. So we do have to know on both sides if it's there. The incoming side, you can set to like 10 seconds and the outgoing side, you can consider that being two times that, so there are settings on both ends. So just take a look at the latency. You can adjust ping rates. I've seen a lot of customers where we just do this, and their system is a lot more stable when they have...

38:12
Travis Cox: Especially when they're going up to corporate or to a cloud, data is... There's not a lot of lost connections and things are able to actually get there when they need to get there. Okay, so that comes to understanding the traffic and the diagnostics of the overall Gateway Network. And I would highly encourage people to spend a little bit of time to closely review the Gateway Network Status Page. And when you go there, you can understand all the bytes that are coming in and out, and you can also see all the queues that are there. And so you can see a picture of it here. And the queues are... All the different subsystems, those are different queues. You can see how many active requests are on that queue, you can see the total amount that's actually happened on that server.

39:01
Travis Cox: And when you restart it, those would go back to zero. But you can see what your throughput really is on these different queues. And in particular, I will show it in a moment there. On the outgoing side, we have... There's a finite number of threads that are processing those requests, outgoing or incoming requests. And so it's important to kinda see how big your queues are because that's gonna affect the amount of things that a server has to process. So understand the data flow. We really wanna look out for a high number of requests or things that we just weren't expecting to have happen in our network. "Why am I getting so many alarm or tag history queries when I shouldn't be?" Maybe I don't want that to be happening. So by looking at these pages, you'll get more information about that.

39:54
Travis Cox: Now the threads, the send/receive threads, are something that's really important. On the outgoing side, you're gonna see we have... There's settings over there for your send/receive threads. By default, they're five. So that's the thread pool. Basically, the outgoing... So my local server would have these, and if I'm sending data up, tag history data or if messages are coming down to me, I've got to process all these requests. And the send side is all the things we're sending, and the receive side, all the things that are coming in. And because the thread pools have limits and it's five by default, that means that limits how much I can actually process, how much I can actually do on that system. So the big thing here is, for example, tag history, is the way... If I have a central or a frontend server trying to query tag history from a local server, I can overrun it very quickly by asking for a large chunk of data and doing that five times. I'm gonna max out all of my receive threads on that, and that's what you're seeing on this particular picture. When that happens, that local server is not responding to anything else, and so other things in your network all of a sudden are just not working, or it's gonna be very, very slow for that to happen.

41:15
Travis Cox: So the thing here is there are settings for those send/receive threads. You can certainly increase that with higher workloads. Simply, like if you have... If you know there's gonna be a lot of requests for tag history and you wanna guarantee there's threads available for it, you can certainly do that. So I just wanna point that out. Take a look at the status page, look at these different... The queues, look at the threads that are happening so that you can get a sense of what's going on. We're not gonna have infinite threads because we don't wanna run out and basically crash the system. So we have finite ones so that we can at least reject any other requests that are happening, but those rejected requests could be important things that you're trying to send, so you may not want that to happen. So again, looking at the diagnostics and understanding it will help you understand where you wanna go with that.

42:05 
Travis Cox: Okay. So looking at remote tag history. So this is a service that allows you to do two things. As I mentioned before, I can use remote tag history to have a local server send data up to corporate, to store data up there at corporate, or we can query data. And in particular, I wanna focus on the querying side. This is a very common one, very simple. We separate out our backend and frontend. And you may not even connect your frontend to the database. I know the diagram here I have up, it is doing that, but some people don't even do that. They wanna just query the data from the backend on there. And so we could do that with the remote tag provider. We can bring in that data and query it through the Gateway Network, have it displayed.

42:58
Travis Cox: Now what's really important is to understand, if you have a chart on screen, where is that chart getting its data from? Is it getting it from the database directly, or is it getting it through the Gateway Network, through the remote tag provider? First of all, just investigating that to make sure you know where that's happening is really important because I'm gonna say, one of the biggest tips I'll put here is to simply avoid querying tag history through the Gateway Network. And it's not that you can't do it. There are places where it makes a lot of sense, but for the most part, we don't have to do that. It's better to query the history directly from the SQL database on the server where the chart is. So it's just going to basically enhance the user experience, things are gonna be faster, and it doesn't bottleneck the Gateway Network. That's the thing we're trying to avoid.

43:53
Travis Cox: Now, there's ways... There's settings that I'll show you here that we can... If we need to go through the Gateway Network, we can make it where it wouldn't bottleneck the Gateway Network. But in most of the architectures that I've helped customers with, you don't need to do that. So if you look at this one here, frontend and backend, they both can talk to the same SQL database. They both can make connections there. So the backend is gonna store data to it and the frontend's simply gonna query data from it. That is a very... A common one, and it works really, really well. Things are quite fast and quite efficient.

44:27 
Travis Cox: So this scenario... Well, I'll come back to that. We'll look at it when we go up to like a corporate location in particular. But again, try to avoid it through the Gateway Network, and if you can query from the database directly, that will be preferred. Now, I did mention use fully qualified tag paths. So part of the investigation into where the data is being queried from is to understand the tag paths that are being used on the historian system. So there are two different tag paths you can use: a real-time tag path and a historical tag path. The real-time tag path is pointing to a tag within a tag provider. And what happens there is that Ignition will have to ask the tag provider, "Where do I query the history from?" And so that could mean that it would query that remote tag provider, which means I'm now querying over the Gateway Network. So you typically want to avoid using real-time tag paths, because they don't specify which data provider or data source, or tag history provider to query the data from.

45:35 
Travis Cox: Versus the historical tag path, that is a fully qualified tag path that defines where it's gonna come from. So I can see history is the actual tag history provider, which could be a database directly, or it could be a remote history provider. But it's a fully qualified historical tag path that the charts typically use. By default, the power charts and all that use this. And so it's easier to see where it's querying from, but one thing to look out for is on that server, look at the tag history providers, because you can see I've got history, splitter, Ignition B_history. The top one is a data source. That means I'm querying the database directly. So in my tag path up there, that historical one, history, I'm querying the database directly. If I had... Instead of history at the beginning of that tag path, if it had said Ignition B_history, now that data is being queried through the Gateway Network because that's a remote history provider on there, okay?
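
A hedged sketch of that difference from scripting, with placeholder provider, gateway, and tag names; the exact fully qualified syntax can vary by version and is easiest to copy from the Designer (for example from a Power Chart tag browser) rather than to type by hand:

```python
end = system.date.now()
start = system.date.addHours(end, -8)

# Real-time tag path: Ignition has to ask the tag provider where its history
# lives, which on a frontend can mean the query goes over the Gateway Network.
realtimePath = "[default]Line1/Tank1/Level"

# Fully qualified historical path: names the history provider explicitly.
# "history" here is assumed to be a database-backed provider, so this queries
# the database directly.
historicalPath = "histprov:history:/drv:ignition-b:default:/tag:line1/tank1/level"

data = system.tag.queryTagHistory(
    paths=[historicalPath],
    startDate=start,
    endDate=end,
    returnSize=500,
    aggregationMode="Average",
    returnFormat="Wide"
)
```

If the path named a remote history provider instead (Ignition B_history in the example on the slide), the same call would be served over the Gateway Network.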

46:37
Travis Cox: Again, I say avoid querying through the Gateway Network. Go to the database directly if you can. And this is one where when you have corporate, where often there's no database at corporate, so we have to query through the Gateway Network, you can do that. But we can also then just put a database at corporate and we could aggregate or mirror data up to it so that we can make it available to be queried locally. And we have the Ignition tag history splitter to make that happen. So you can send data to both a local database and a central database. You're gonna get... On both sides, no matter where we are, we're gonna query from our appropriate database, and we're gonna get fast response times on both sides, alright? So it does mean we're duplicating data up there, but it's gonna help us query. It doesn't put load on the network and doesn't put load on our local servers, because we can't control who's gonna access the system on corporate potentially. And we could have hundreds of people all of a sudden querying history data that we didn't want through the network. So I think I've harped on this point enough here, but that's an important one to look at.

47:37
Travis Cox: Now, if you had to query through the Gateway Network, if you have to do it, there's a setting on that queue, the tag history queue, where you can set the max active to two rather than being unlimited, and that would mean that there would only ever be two active requests allowed for tag history. So it would not ever bottleneck that system. If you want, you can even put that to one. It would only allow one query at a time to happen. So you might get overlays and errors on the frontend side if you're trying to query too many things at the same time, but it doesn't... It will not bottleneck the Gateway Network and it won't shut down the system, okay?

48:17
Travis Cox: Alright, I think that is it. And I think I did save a little bit of time for questions.

48:26
Audience Member 1: So on the accepting connections, you mentioned that you can whitelist servers. Does this mean that when you accept certificates, if they are signed with a CA, you're accepting everything from that CA?

48:46
Travis Cox: Yeah, that would be... I mean, the PKI, that certificate, we are trusting that... We create it self-signed. There is no root CA for that, right?

48:57
Audience Member 1: Right. By default, you said they're self-signed, which means you're gonna have to accept them individually.

49:01 
Travis Cox: Yeah, you have to accept that full cert. Yeah.

49:03
Audience Member 1: But if you've generated custom certificates per server...

49:07 
Travis Cox: You just need the root CA.

49:09
Audience Member 1: You just need the one CA?

49:09 
Travis Cox: That's right. But that means you have to go and generate the certs for everything yourself against that root CA. And that could be a little bit of a pain for people, but it is something you could do.

49:20
Audience Member 1: But then you don't have to accept that?

49:22 
Travis Cox: That's right. You don't have to accept the certificate, but you still have to accept the actual connection of the server. There's two approvals: certificate approval, which root CA would handle the certificate part. There's still, "Do I allow that server to connect to me?"

49:37 
Audience Member 1: Got it.

49:37 
Travis Cox: Yeah.

49:39 
Audience Member 2: So in an architecture with a proxy gateway where, say, you're using MQTT, is it possible to leverage the proxy gateway for MQTT so that you can have your site-based publisher pushing to potentially a cloud broker, Azure, whatever you got, and push that to the Gateway Network, or would you need to use another proxy service for that?

50:00 
Travis Cox: So we get that question a lot. There's a lot of similarities, Gateway Network and MQTT. They are two completely separate things, right? Very similar in that there's outbound connections on those. The Gateway Network is our, if you will, proprietary Ignition-to-Ignition communication. MQTT is the open standard. Okay, now with that being said, we're not proxying... If you're gonna use MQTT, I'm publishing through Transmission on the local side, you're publishing to a broker, right? You're not using the Gateway Network at all. But you can go Ignition to Ignition to Ignition and then have Transmission there that publishes data, right? So you can go through... You can send data through Ignition by proxying it and then have MQTT at a higher layer, if you wanna do that. Now, if you wanna proxy MQTT, typically you gotta have a broker and then have another Transmission to retransmit it somewhere else, or you have a broker that supports relaying the message to a different broker. That would be pure MQTT. But if you want to use Ignition to go Ignition Gateway Network to a place where then you can go MQTT, you can do that. So for example, we have customers who are corporate. They wanna do MQTT from corporate, so they have a lot of edge at their sites, they're basically going through a DMZ sending data to corporate, and then from there, there's a Transmission to go to the cloud, for example. Does that make sense? Does that answer your question?

51:19 
Audience Member 2: Yeah. Yeah, it makes sense. Thank you.

51:21
Travis Cox: Cool.

51:24 
Audience Member 3: Hi.

51:25 
Travis Cox: Hi.

51:25 
Audience Member 3: On that last point you made about restricting just two queries at a time, if a client is running the query and, say, there's three clients and that third client runs a query, do they get a notification on the client side that the query is unavailable right now on the screen?

51:38
Travis Cox: It will error out. Yeah, it'll error out. And you'll get... On the client side, you'll get a message. You'll know that it errored out. So if I had a frontend and I pull up a chart and I query that data through the Gateway Network on the local server, and say I have that max setting at one, and that query takes a minute for it to run, right? It's taking a while. Maybe I'm grabbing a boatload of data, but it's taking a whole minute. Within that minute, if I have another query that happens, that query will error out 'cause there's only one, and they'll get an indication on the frontend. So they'll... That's the problem, is you'll get... At the frontend, it'll seem like things aren't working, but that's intentional because we didn't want to allow more than one to happen. So you'd have to try to control that in a different way if you wanted to give them a more friendly message, but by default, we're not gonna give a friendly message. It's just gonna error out.
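
One hedged way to provide that friendlier message, assuming the query is issued from a Vision client script (the helper name and return size here are made up for this sketch):

```python
from java.lang import Exception as JavaException

def queryHistorySafely(paths, start, end):
    """Run a tag history query and show a friendly message if it is rejected."""
    try:
        return system.tag.queryTagHistory(
            paths=paths, startDate=start, endDate=end, returnSize=300
        )
    except (Exception, JavaException):
        # The remote gateway refused or timed out, e.g. the tag history queue's
        # max active limit was already reached.
        system.gui.warningBox("History is busy right now, please try again in a moment.")
        return None
```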

52:26
Audience Member 3: Okay.

52:26 
Travis Cox: Yeah.

52:27
Audience Member 3: Thank you.

52:28
Audience Member 4: Yeah. So kind of a loaded question, but... Or a couple questions, but if I have... And you know our architecture, but if I have a site that has a lot of gateways and they're all connecting as agents to a controller, what's your opinion on having that controller, since it already has those active connections being the proxy as well for all that GAN and remote tag provider traffic? And then what considerations would you have with making an EAM controller of a large site being a part of a redundant pair? And same with the proxy, the proxy being a part of a redundant pair in a large distributed architecture?

53:09
Travis Cox: Okay, so a few questions there.

53:12
Audience Member 4: Yeah.

53:12 
Travis Cox: First one being... Keep reminding me, just in case I forget all the other questions that we have on there. But the first one: if I have an EAM controller, making that the proxy for other services, right? The EAM controller doesn't do a whole lot, to be fair, right? Like there's... You can create automated tasks that are running to collect backups and all of that, but it's... That particular server may not be doing a whole lot typically, so having it proxy data for remote tags or other services, I think, is perfectly fine. You could make that controller be in like a DMZ or somewhere that makes sense, right? If that actually happens. Again, you'd wanna look at the queues and kind of see all the stuff that's happening through that server, right? To make sure you're not overrunning it. You might increase the send/receive threads or something on different sides depending on that, but there's no issues with that. That was the first question. What was the...

54:03 
Audience Member 4: The second one was... I'm blanking on it as well.

54:10 
Travis Cox: Redundancy? You mentioned redundancy.

54:12
Audience Member 4: Yeah, redundancy on the EAM controller and redundancy on a proxy gateway, if there's any considerations to have there.

54:17
Travis Cox: Yeah. So the Gateway Network is redundant-aware, and that's... I make a connection from the master... Oh, I make a connection to the master, and through that, we would understand who the backup is. And ultimately, that gateway... That redundant pair acts as a single entity, as a single gateway. They share the same system name on your Gateway Network there. So if I had a service that was, for example, EAM collecting backups or whatever and ultimately... Or if... Either way, right? If a master failed, the backup would take over for those tasks, but then any servers trying to send data to it would know which node to send to, because it's the same entity based on that system name. So every Ignition server has that gateway name, and the redundant pair acts as... So technically, it's all transparent to you with that. We've had some things we fixed with it, but it is... It should work perfectly fine.

55:13
Audience Member 4: Okay. Since I have the mic, one more.

55:15
Travis Cox: Yeah.

55:17 
Audience Member 4: In our largest location, would you... And this is probably a hard thing to answer given you need to know what's going through those GAN connections, but what would be a rule of thumb you would use for how many incoming connections you would wanna have going to a proxy?

55:32
Travis Cox: Well, it's gonna be based on the compute power you have, right? The CPU and memory. There have been cases where... There's different thread pools in Ignition for different things, like transaction groups and scripting and this and that. And so we've had people increase those up to hundreds, if not thousands of threads, if the server can certainly handle them.

55:52 
Travis Cox: The problem with it though, here's the one caveat to threads, if I increase them, I have more things that are gonna happen concurrently 'cause I'm gonna allow more things to go, which means that now if the server couldn't handle it, I'm gonna get a higher CPU 'cause I'll be spinning 'cause there's more threads that I have to run, right? So there. You kinda have to increase them and see how things go, the diagnostics. But there's no real true rule of thumb necessarily. It's just if you request... If you're having a lot of requests at the same time, but they're not taking very long, then you definitely should increase them. But if you have some requests that are taking a very long time, you may not want to have those in there 'cause they could stack up and they could still overrun your max, right? So it's the long-running stuff like long-running queries or long-running tag history or any of that kind of stuff that you definitely want to understand and see if you can optimize or avoid completely. Yeah, good questions there. Got time for one or two more. There's one. Somebody back up there on that second... Yeah, I got... Yeah. So I saw him up there first.

57:05
Audience Member 5: Thanks. A quick comment and a question. So we have an EAM server set up at headquarters. It has about 80 Ignition systems connected to it. We discovered that the proxy setting was set on by default. It flooded the EAM server with... 'Cause it connected every server to every other server. So just make sure that one's turned off. The other question I have: for systems that are using SQL databases as a historian, what's the best way to access that data from external systems, not the Ignition system?

57:35
Travis Cox: Oh yeah, like outside of Ignition, right? You wanna access...

57:37
Audience Member 5: Outside of Ignition, yes.

57:39
Travis Cox: Yeah. So outside Ignition, you could certainly connect to that database and query it. It's just it's in a... The schema that's there...

57:45
Audience Member 5: It's an odd schema. Yes.

57:47
Travis Cox: Yeah. There's different tables that we have for partitioning, right? So you got a couple of options. We don't have a stored procedure that we can give you. I think there's one that just got put on the [Ignition] Exchange as part of the Exchange Challenge that is a stored procedure for, I think, Microsoft SQL Server that handles querying across the different partitions. So somebody did write that. We don't have one. They can just simply... A lot of times people just need to query the latest partition, the latest month, in which case, that's pretty easy, but you do have to understand that. So you could do... Two things you could do. One is you can turn off Ignition's partitioning where we don't create tables for every month. We create one table, and use the actual SQL database's partitioning mechanisms so that... We want the performance to stay good as we go forward. You could do that so you have one table to query. It's a lot easier to query, or the other method is to use WebDev and have a REST API in Ignition where you can go and get that data, and I recommend that. So if a third party wants to get it, go call an API in Ignition to query that history.
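
A hedged sketch of what such a WebDev endpoint could look like (it assumes the WebDev module is installed; the resource name, query parameters, and tag path are placeholders). Third parties would then call the mounted URL instead of touching the historian tables directly:

```python
# WebDev Python resource: GET handler returning tag history as JSON.
def doGet(request, session):
    params = request["params"]
    tagPath = params.get("path", "[default]Line1/Tank1/Level")
    hours = int(params.get("hours", 8))

    end = system.date.now()
    start = system.date.addHours(end, -hours)

    results = system.tag.queryTagHistory(
        paths=[tagPath], startDate=start, endDate=end,
        returnSize=500, returnFormat="Wide"
    )

    # Flatten the dataset into simple records for JSON serialization.
    rows = []
    for row in system.dataset.toPyDataSet(results):
        rows.append({"t": str(row[0]), "v": row[1]})

    return {"json": {"path": tagPath, "values": rows}}
```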

58:51
Audience Member 5: Thank you.

58:52
Travis Cox: Yep. Right back there.

58:57 
Audience Member 6: Hey Travis, so I have a question regarding doing Gateway Network connections over the cloud, like over a cloud network. So we've experienced issues with pulling tag data in a remote Gateway Network from a local instance to a cloud instance, but did not experience those same issues at a local site. And when I say issues, we started getting blocked threads on all of our tag reads and stuff like that, and we haven't been able to really identify or pinpoint how to resolve the issue on the cloud side.

59:30
Travis Cox: Yeah, so that's a good use case to kind of go and look at, again, the statistics on there, on the Gateway Network to see what the queues are doing, see what the... Data is actually being sent back and forth. It could also just be that the web socket channel is being closed a lot, in which case increasing the ping rate. Have you done that, increased the ping rate yet?

59:46
Audience Member 6: Yeah, just to five seconds. It didn't seem to make much of a difference.

59:49
Travis Cox: Or go higher potentially with that. But that's something that... What we could do is you can get in touch with our Sales Engineering team and we can certainly go through that in a little more detail with you and see... Just try to understand where that bottleneck is... Or of course, working with Support here too. It's hard to tell what's happening without looking at the system.

01:00:09 
Audience Member 6: For sure. Thank you.

01:00:10 
Travis Cox: Yeah, unfortunately. Alright. Well, I am out of time. Appreciate it, everybody. Thank you. I'll be around for the conference. Looking forward to it.

Posted on October 18, 2022