What are Interface Errors and Alerts?
The biggest metric for success in network engineering is uptime. If the network is down, it's as useful as a brick. Therefore, a network engineer's primary goal is ensuring a stable and secure network. One of the best ways to ensure uptime is to monitor for common network errors to mitigate and eliminate them.
This aspect of networking is so crucial that CompTIA has included it as a topic in the Network+ Exam. The exam homes in on CRC errors, runts, giants, and encapsulation errors. If you're unfamiliar with any of those concepts, then you've come to the right place. Let's start off by walking through interface errors in general.
What are Interface Errors?
An interface error is a fault detected on a network interface of a device such as a router or switch. These problems are not intuitive unless you know what to look for. Let's start with one of the most common: CRC (Cyclic Redundancy Check) errors.
CRC Errors
CRC interface errors occur when the data received does not match the data sent. Each packet includes a checksum calculated by the sender. The recipient recalculates the checksum upon receiving the packet. If the checksum calculated by the recipient does not match the checksum sent with the packet, a CRC error occurs.
As a simple analogy, let's say a UPS delivery driver picks up a package. Alongside the package is a manifest listing its contents. Before driving away, the driver checks the contents against the manifest. If everything is there, the driver delivers it. The recipient then checks the same package against the manifest. If everything is there, she accepts it. If it does not jibe, she sends it back (akin to a CRC error).
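The checksum comparison described above can be sketched in a few lines of Python. This is only an illustration: it uses zlib.crc32 from the standard library, the function names are made up for this example, and it is not a byte-accurate model of how an Ethernet interface builds its frame check sequence.

```python
import zlib


def attach_checksum(payload: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 of the payload."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def verify_checksum(frame: bytes) -> bool:
    """Receiver side: recompute the CRC-32 and compare it to the one sent."""
    payload = frame[:-4]
    received = int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received


frame = attach_checksum(b"hello, network")
print(verify_checksum(frame))  # True

# Simulate interference by flipping a single bit in transit.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
print(verify_checksum(corrupted))  # False
```

The second check fails because the recomputed value no longer matches the one the sender attached, which is the situation an interface counts as a CRC error.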
CRC interface errors can cause myriad issues, wreaking havoc on your network. For one, every packet that arrives with a CRC error is discarded and must be retransmitted. Retransmission can cause network congestion, resulting in latency. Error detection and retransmission also consume CPU cycles, degrading the performance of the devices involved. Lastly, applications will suffer. VoIP calls in particular degrade and become choppy, frustrating users.
How to Detect CRC Errors
You can detect CRC errors easily if you know what to look for. Check the counters on the interface itself. On a Cisco device, you'd type show interface <interface-name> and look for the CRC counter in the output.
In the example used here, that counter says "678 CRC". That is how many packets were received with the error. Not good!
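If you collect this output with a script, pulling the counter out is straightforward. The sketch below is a rough example: the sample text is an illustrative excerpt assembled from the counters quoted in this article, not a capture from a real device, and read_counter is a hypothetical helper name.

```python
import re
from typing import Optional

# Illustrative excerpt only, not captured from a real device.
SAMPLE_OUTPUT = """\
     0 runts, 0 giants, 0 throttles
     678 input errors, 678 CRC, 0 frame, 0 overrun, 0 ignored
"""


def read_counter(output: str, name: str) -> Optional[int]:
    """Return the number printed just before `name`, or None if it is absent."""
    match = re.search(rf"(\d+)\s+{re.escape(name)}\b", output)
    return int(match.group(1)) if match else None


crc_errors = read_counter(SAMPLE_OUTPUT, "CRC")
print(crc_errors)  # 678

if crc_errors:
    print("CRC errors present, check cabling and hardware")
```

The exact layout of the output varies by platform and software version, so treat the pattern as a starting point rather than something that will match every device.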
One thing to consider is that you can also use log aggregators such as Splunk. Splunk has listeners that allow engineers to integrate interface logs. That way, you don't have to check every interface for the error, just Splunk.
If your organization is using Splunk, a query might look like this (the index and sourcetype names depend on how your deployment is set up):
index=network_logs sourcetype=cisco:ios "CRC error"
This should show approximately the same message as the interface logs themselves.
How to Prevent CRC Errors
CRC errors are the result of faulty hardware, high throughput, or EMI (electromagnetic interference). Verify the cables used are up to spec and can handle the required throughput. Verify there aren't any devices close to the cables that could corrupt packets in transit.
High network loads can cause buffers to overflow. If a buffer overflows, part of a packet can be lost, resulting in a CRC error. Ensure your devices are powerful enough to handle the load of the network. Do this by checking their CPU, RAM, buffer sizes, and bandwidth.
Make sure the network cables are not damaged. Also, Cat cables are only good for a certain length; twisted-pair Ethernet runs are typically limited to 100 meters. If a cable spans farther than it should, the signal degrades, resulting in packet loss and, in turn, CRC errors.
Giants
Giants are packets that contain more data than allotted. A standard Ethernet frame is limited to 1518 bytes, including its header and checksum. If a packet is bigger than that, then it's a "giant" and is flagged as such.
Giants can lead to packet fragmentation, where a packet is broken into several pieces for transmission. This process causes unnecessary overhead resulting in network latency and instability. Oversized packets cause traffic congestion because of their size. It would be like driving a Smart Car on an interstate with nothing but semi-trucks and tanks. Good luck with that.
Giants also indicate compatibility issues between network interfaces. An interface might send a packet it considers to be a normal size, while the recipient finds it oversized.
How to Detect Giants
You can detect giants the same way you detect CRC interface errors. Pull up the interface counters using show interface <interface-name> and search for them.
Pro tip: When searching an interface, use include. Include allows you to search for specific phrases within the output. So, when searching for giants, type show interface | include giant. This will return every line with the word "giant" in it. For *nix folks out there, include works just like grep.
Look at the line directly above the CRC counter in the show interface output. It says, "0 runts, 0 giants, 0 throttles". If there were giants on this network, they would show up there. If the giants counter is greater than 0, you have yourself a problem. Let's find out how to prevent it.
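If you save the command output to a file or pull it into a script, the same include-style filtering can be approximated in Python. This is a minimal sketch: the include function is a hypothetical helper written for this example, and the sample text is an illustrative excerpt rather than a capture from a real device.

```python
import re
from typing import List


def include(output: str, pattern: str) -> List[str]:
    """Keep only the lines that match `pattern`, similar to `| include` or grep."""
    regex = re.compile(pattern)
    return [line.strip() for line in output.splitlines() if regex.search(line)]


# Illustrative excerpt only, not captured from a real device.
sample = """\
     0 runts, 0 giants, 0 throttles
     678 input errors, 678 CRC, 0 frame, 0 overrun, 0 ignored
"""

print(include(sample, "giants"))  # ['0 runts, 0 giants, 0 throttles']
```

Because the pattern is a regular expression, a search such as include(sample, "runts|giants") would cover both counters in one pass.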
How to Prevent Giants
Giants are generally the result of misconfiguration. Verify MTU (Maximum Transmission Unit) configuration on all network interfaces. A mismatch here is the most likely cause of giants. While you're digging through the interfaces, make sure all firmware is up-to-date. Out-of-date firmware can also result in giants. Together, these two actions will resolve most giants.
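One way to approach the MTU check is to gather the configured value for every interface and flag the ones that disagree with the rest. The sketch below assumes you have already collected those values by some other means; the interface names, the MTU numbers, and the find_mtu_outliers helper are all hypothetical.

```python
from collections import Counter
from typing import Dict

# Hypothetical inventory: interface name -> configured MTU in bytes.
configured_mtu = {
    "switch1:Gi0/1": 1500,
    "switch1:Gi0/2": 1500,
    "router1:Gi0/0": 9000,
}


def find_mtu_outliers(mtus: Dict[str, int]) -> Dict[str, int]:
    """Return the interfaces whose MTU differs from the most common value."""
    if not mtus:
        return {}
    expected, _count = Counter(mtus.values()).most_common(1)[0]
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


print(find_mtu_outliers(configured_mtu))  # {'router1:Gi0/0': 9000}
```

Treating the most common value as the expected one is a simplification. On a network that deliberately mixes MTU sizes, you would compare each link's two ends instead.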
Runts
Runts are the giant's counterpart. A runt is a packet that's too small. "Too small" means smaller than sixty-four bytes, since that's the smallest valid size of an Ethernet frame. A network full of runts will be slow and unreliable. Runts inherently indicate data drops and possible packet collisions. Generally, physical issues cause runts, while giants are indicative of misconfiguration.
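The two size limits discussed so far can be captured in a small check. This is a sketch using the 64-byte and 1518-byte figures from this article; interfaces configured for larger frames would use a higher upper limit, and classify_frame is a made-up name for illustration.

```python
MIN_FRAME_BYTES = 64    # smallest valid Ethernet frame, per the text
MAX_FRAME_BYTES = 1518  # standard maximum frame size, per the text


def classify_frame(length: int) -> str:
    """Label a frame as a runt, a giant, or ok based on its length in bytes."""
    if length < MIN_FRAME_BYTES:
        return "runt"
    if length > MAX_FRAME_BYTES:
        return "giant"
    return "ok"


for size in (60, 64, 1518, 1600):
    print(size, classify_frame(size))
# 60 runt
# 64 ok
# 1518 ok
# 1600 giant
```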
How to Detect Runts
Runts are detected in the same way as CRC errors and giants. Crack open the interface output and search for runts instead of giants. The runt counter sits right next to the giant counter. If it's greater than 0, then there's a problem.
How to Prevent Runts
Runts are almost exclusively a result of physical layer issues. Verify your cables are not damaged, tangled, or too long. For good measure, update firmware on all your devices and verify MTU configurations.
Runts can also occur from a lack of bandwidth. So, try distributing traffic better using load balancing techniques.
What are Encapsulation Errors?
Encapsulation errors are a different breed of errors and deserve their own attention. Encapsulation errors occur when the actual packaging of the packet is corrupted. Think of a data frame (i.e., packet) as a letter you're sending to someone.
The letter itself is the data. Then, it is "encapsulated" in an envelope. The envelope contains "metadata" such as address, stamps, and return address. Packets encapsulate in a similar fashion. If there is an error with the "envelope," then it results in an encapsulation error.
The two most common causes of encapsulation errors are misconfiguration and incompatible devices. Each one ultimately results in a protocol mismatch. Let's start with misconfiguration.
Misconfiguration
Misconfiguration is one of the biggest culprits of encapsulation errors. This occurs when one network interface encapsulates packets one way and another interface uses a different encapsulation method. For example, one router could encapsulate using IEEE 802.1Q, the standard used for VLAN tagging, while another interface expects plain IEEE 802.3 frames with no tag. The mismatch leads to an encapsulation error.
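To see why a tagging mismatch confuses the receiver, it helps to look at the bytes. The sketch below builds a simplified Ethernet header with and without an 802.1Q tag, then reads it the way a receiver that expects no tag would. The MAC addresses, VLAN ID, and function names are invented for illustration, and the header is simplified (no payload or checksum).

```python
import struct
from typing import Optional

ETHERTYPE_IPV4 = 0x0800
TPID_8021Q = 0x8100  # value that marks the start of an 802.1Q tag


def build_header(dst: bytes, src: bytes, ethertype: int,
                 vlan_id: Optional[int] = None) -> bytes:
    """Build a simplified Ethernet header, optionally with an 802.1Q tag."""
    header = dst + src
    if vlan_id is not None:
        # 802.1Q inserts 4 bytes after the source address:
        # the tag identifier, then a field that carries the VLAN ID.
        header += struct.pack("!HH", TPID_8021Q, vlan_id & 0x0FFF)
    return header + struct.pack("!H", ethertype)


def read_ethertype_untagged(header: bytes) -> int:
    """Parse as a receiver that expects no tag: bytes 12-13 hold the type."""
    return struct.unpack("!H", header[12:14])[0]


dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("020000000001")

untagged = build_header(dst, src, ETHERTYPE_IPV4)
tagged = build_header(dst, src, ETHERTYPE_IPV4, vlan_id=10)

print(hex(read_ethertype_untagged(untagged)))  # 0x800
print(hex(read_ethertype_untagged(tagged)))    # 0x8100
```

The receiver finds 0x8100 where it expected the payload type, so it cannot interpret the rest of the frame the way the sender intended.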
Incompatible Devices
These errors often occur when one interface is antiquated while others are not. When a router isn't designed to handle some encapsulation protocol, it will error out.
Impact on Network Performance and Reliability
Encapsulation errors make it difficult or impossible for two devices to communicate. Any software relying on the network will be severely degraded. Encapsulation errors can also expose packets to third parties, which could result in a security breach. Suffice it to say, you should always prioritize encapsulation errors.
Best Practices for Resolving Encapsulation Errors
First, verify that all interfaces are using the same encapsulation method. If they are not, then that's a problem. Also, update firmware to the latest and greatest. (As you can tell, updating firmware is a common solution to many problems.)
Do some research on your interface, and make sure there aren't any missing patch updates. Lastly, make sure all your interfaces are up-to-spec, and supported by their manufacturer.
Understanding Common Interface Alerts
We already discussed where to check in the logs for common issues. But you also need to add additional failsafes to ensure a prompt recovery. Let's go through what some of those may look like.
Sound Alerts
Many network management systems make an audible sound when interface errors are detected or a limit is exceeded. For example, SolarWinds, Nagios, and PRTG Network Monitor all include this feature. These alerts are customizable within each tool's user interface. Network engineers should put these tools to use as early as possible. Start by finding out which software you have, and researching configuration recommendations.
Threshold Triggers
Network management systems also allow administrators to set thresholds for errors. These thresholds determine the number of errors that will trigger an alert. For example, if the number of CRC errors exceeds 100 errors within an hour, an alert is generated. Of course, this number can be anything.
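The logic behind a threshold like this is a count over a sliding time window. The sketch below shows the idea using the 100-errors-per-hour example from above. It is not how any particular monitoring product implements it, the ErrorThreshold class is hypothetical, and it assumes error timestamps arrive in order.

```python
from collections import deque


class ErrorThreshold:
    """Signal an alert when more than `limit` errors fall within the window."""

    def __init__(self, limit: int = 100, window_seconds: float = 3600.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps: deque = deque()

    def record(self, timestamp: float) -> bool:
        """Record one error; return True if the threshold is now exceeded."""
        self._timestamps.append(timestamp)
        # Drop errors that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.limit


monitor = ErrorThreshold(limit=100, window_seconds=3600)

# 101 errors, one per second: the 101st pushes the count past the limit.
alerts = [monitor.record(float(second)) for second in range(101)]
print(alerts[99], alerts[100])  # False True
```

Errors that trickle in slowly never trip the alert, because older entries fall out of the window before the count can build up.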
When the threshold is crossed, the trigger sets off a sound alert. The engineer then checks Splunk or the interface itself to see the error. That is the optimal chain of events.
Final Thoughts
Uptime is the most critical metric for success in network engineering. Monitoring and mitigating common network errors is essential to maintaining this uptime, namely CRC errors, giants, runts, and encapsulation errors.
A testament to their importance is their inclusion in the Network+ Exam. Understanding and addressing interface errors is a foundational skill for any network professional. It requires vigilance and proactive measures such as alerts and thresholds.
Remember, the foundation of a successful network is not just in its design. Success begins and ends with vigilance and maintenance.
Want to learn more about network engineering? Consider taking the CBT Nuggets course Networking Fundamentals Online Training with Keith Barker.