triggered or made more obvious as a result of user behavior. For example, staff at a remote site might need to back up data to a central site every afternoon before going home, causing the WAN link to become congested. This activity is likely to affect other users who are still working over the link. This particular problem may be solved by changing the work process (stop backing up across the WAN link), by provisioning more bandwidth, or possibly by implementing link efficiency mechanisms such as Link Fragmentation and Interleaving (LFI) to make better use of the link bandwidth. When the symptom is regular and predictable, it is easier to find the cause and solve the problem.
Some network problems are intermittent. They occur with no obvious pattern and often disappear and reappear seemingly at random. Intermittent network problems are significantly harder to troubleshoot because it is difficult to collect solid information while the symptom is present. Again, intermittent problems may be caused by user behavior. For example, assume a user is transferring a large file across the WAN. The transfer degrades the performance of the WAN link, generating a support call to the IT help desk. By the time the help desk has been made aware of the problem and investigates, the file transfer has finished and WAN link performance has returned to normal.
Intermittent problems can also be caused by the interaction of various technologies. Firewalls running dynamic NAT, NAT configured for load balancing, and parallel links between systems can all cause intermittent problems in network communications. The key to solving these sorts of issues is understanding the technologies involved.
Communication cannot be established between Host A and Host F because of misconfigured ACLs on both Router C and Router D.
Using the traditional problem-solving methodology of reversing each change before trying another possible solution would fail to resolve this problem, because correcting only one of the two ACLs does not restore communication.
In the accompanying figure, communication between Host B and FTP Server D fails intermittently. This time, the problem is caused by dynamic NAT timeouts that have been tuned too tightly. When the Internet is not congested, communications are successful. When the Internet becomes congested, packets returning from FTP Server D occasionally take too long, and the dynamic NAT translation on the router times out, breaking the connection.
In each of these situations, the network engineer needs to recognize the misconfiguration as such, repair it, and, even though the problem is not immediately resolved, be confident that it was a contributing factor.
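The timeout behavior described above can be illustrated with a small simulation. This is a minimal sketch and not a model of any particular router: the class and function names, the addresses, and the single idle timer per translation are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Translation:
    inside_local: str   # private address of the inside host
    last_used: float    # time (seconds) the entry last carried traffic


class DynamicNatTable:
    """Toy dynamic NAT table whose entries expire after an idle timeout."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        # Keyed by inside global (translated) address.
        self.entries: Dict[str, Translation] = {}

    def outbound(self, inside_local: str, inside_global: str, now: float) -> None:
        # An outbound packet creates or refreshes the translation.
        self.entries[inside_global] = Translation(inside_local, now)

    def inbound(self, inside_global: str, now: float) -> Optional[str]:
        # A reply is translated only if the entry has not idled out.
        entry = self.entries.get(inside_global)
        if entry is None:
            return None
        if now - entry.last_used > self.timeout_s:
            del self.entries[inside_global]
            return None
        entry.last_used = now
        return entry.inside_local


def reply_delivered(timeout_s: float, reply_delay_s: float) -> bool:
    """Send one request at t=0 and report whether the reply is still translated."""
    nat = DynamicNatTable(timeout_s)
    # Hypothetical addresses chosen only for the example.
    nat.outbound("10.0.0.2", "192.0.2.10", now=0.0)
    return nat.inbound("192.0.2.10", now=reply_delay_s) == "10.0.0.2"


if __name__ == "__main__":
    print(reply_delivered(timeout_s=5.0, reply_delay_s=1.0))  # uncongested path
    print(reply_delivered(timeout_s=5.0, reply_delay_s=8.0))  # congested path
```

With the same timeout, the reply is delivered when the path is fast and dropped when congestion delays it past the timeout, which is why the fault appears and disappears without any configuration change.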
6.4 Troubleshooting Complex Network Systems
6.4.2 Disassembling the problem
A good way to solve a complex problem is one piece at a time. If transport layer connectivity is failing, the cause may be a misconfiguration on more than one device or technology. Following the traffic flow and correcting each problem as it is encountered can be a valid troubleshooting approach. Examining the symptoms and generating a list of possible causes is less likely to succeed when dealing with more complex issues. Although it is good practice to reverse changes that have no effect when troubleshooting simple problems, this is unlikely to help solve complex problems. With complex problems, network engineers have to rely on their own experience and judgment as to what is a probable cause of a problem and what is not.
Mechanisms for disassembling complex problems include gathering information from midpoints in the network communications chain and disabling parts of the system to exclude them as the cause. This can involve recabling components of the network in order to insert monitoring hosts or other troubleshooting tools, or to bypass suspected equipment. Gathering detailed log information from key points in the communications chain can also help pinpoint specific problem areas.
When resolving complex network problems, the network engineer should always keep a record of the changes being made. A log provides a record in case the configuration changes need to be reversed. It also removes any doubt as to whether a certain activity has been performed, which helps the network engineer avoid repeating troubleshooting activities.
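The record-keeping advice above can be sketched as a small data structure. This is only one possible shape for such a log: the class names, fields, device names, and command strings below are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Change:
    device: str
    applied: str                   # command entered during troubleshooting
    rollback: str                  # command that reverses it
    diagnostic_only: bool = False  # True if added only to gather information


class ChangeLog:
    """Ordered record of changes made while troubleshooting."""

    def __init__(self) -> None:
        self.changes: List[Change] = []

    def record(self, device: str, applied: str, rollback: str,
               diagnostic_only: bool = False) -> None:
        self.changes.append(Change(device, applied, rollback, diagnostic_only))

    def already_done(self, device: str, applied: str) -> bool:
        # Removes doubt about whether an activity has been performed.
        return any(c.device == device and c.applied == applied
                   for c in self.changes)

    def rollback_plan(self) -> List[Tuple[str, str]]:
        # Reverse the most recent change first.
        return [(c.device, c.rollback) for c in reversed(self.changes)]

    def cleanup_candidates(self) -> List[Change]:
        # Changes that only produced diagnostics and can be removed afterwards.
        return [c for c in self.changes if c.diagnostic_only]


if __name__ == "__main__":
    log = ChangeLog()
    # Illustrative entries only; device names and commands are assumptions.
    log.record("RouterC", "deny ip any any log", "no deny ip any any log",
               diagnostic_only=True)
    log.record("RouterC", "permit tcp any any eq telnet",
               "no permit tcp any any eq telnet")
    print(log.already_done("RouterC", "deny ip any any log"))
    for device, command in log.rollback_plan():
        print(device, command)
```

Marking diagnostic-only changes at the time they are made keeps the final cleanup from depending on memory.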
6.4.3 Solving the component problems
Using the example shown in the figure, assume that Host A is a Telnet client attempting to access the Telnet server on Host E. The network engineer can use a protocol analyzer on the Host A network to confirm that the packets are being generated and sent to the router. At this point, the network engineer should notice that the configuration and operation of Host A appear to be correct and that no reply packets are being received. A protocol analyzer running on the remote network reports that no Telnet packets are being received. Based on this information, the network engineer can assume there is a problem with at least one of the routers.
Because the access list on Router C is quite complex, a visual check of the configuration using show ip access-list does not reveal any obvious problem. To be sure, however, the network engineer adds a deny ip any any log statement as the final entry to highlight any packets not being permitted through the ACL. The messages generated by the ACL logging reveal a misconfiguration that would otherwise have gone unnoticed, and the network engineer fixes it. After the ACL is updated, the show ip access-list command is used again to confirm that packets are being matched by the new access list entry for the Telnet traffic.
Because the problem has still not been solved, the network engineer now moves to the configuration of Router D. The ACL filtering inbound traffic on the serial interface is permitting the Telnet traffic, and the packet counter against the appropriate ACE is incrementing. Using a protocol analyzer, the network engineer confirms that the Telnet packets are now reaching the network of Host E and that replies are being sent back to Host A. The protocol analyzer on the network of Host A, however, does not see any of these reply packets.
It appears that there is another problem on one of the routers. The access lists on Router D are not as complex as those on Router C, and the network engineer immediately spots and corrects a configuration error. The next test works and the problem is considered resolved.
The final activity the network engineer should perform is to remove any unnecessary configuration changes from the network. Using the log of activities generated during troubleshooting, the network engineer identifies that the deny ip any any log statement only provided diagnostic information and can be removed from the configuration.
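The reason the explicit deny ip any any log entry exposes a fault that the implicit deny hides can be shown with a toy first-match evaluator. This is a simplified sketch, not router behavior in full: the matching covers only protocol and destination port, and the specific mistake (a permit entry naming port 32 instead of 23) is a hypothetical stand-in, since the text does not say what the actual misconfiguration was.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ace:
    action: str                     # "permit" or "deny"
    protocol: str                   # "tcp", "udp", or "ip" (matches any protocol)
    dst_port: Optional[int] = None  # None matches any destination port
    log: bool = False               # emulates the "log" keyword
    hits: int = 0                   # per-entry match counter

    def matches(self, protocol: str, dst_port: int) -> bool:
        protocol_ok = self.protocol in ("ip", protocol)
        port_ok = self.dst_port is None or self.dst_port == dst_port
        return protocol_ok and port_ok


def evaluate(acl: List[Ace], protocol: str, dst_port: int,
             log_lines: List[str]) -> str:
    """Return the action of the first matching entry (first match wins)."""
    for ace in acl:
        if ace.matches(protocol, dst_port):
            ace.hits += 1
            if ace.log:
                log_lines.append(f"{ace.action} {protocol} dst-port {dst_port}")
            return ace.action
    return "deny"  # implicit deny: nothing counted, nothing logged


TELNET = 23
logs: List[str] = []

# Hypothetical mistake: the permit entry names port 32 instead of 23.
acl = [Ace("permit", "tcp", dst_port=32)]
silent = evaluate(acl, "tcp", TELNET, logs)  # dropped by the implicit deny

# Add an explicit, logged catch-all so the drop becomes visible.
acl.append(Ace("deny", "ip", log=True))
logged = evaluate(acl, "tcp", TELNET, logs)

# Correct the entry and confirm that its counter now increments.
acl[0].dst_port = TELNET
fixed = evaluate(acl, "tcp", TELNET, logs)
```

In this sketch the first drop leaves no trace, the second produces a log line naming the denied traffic, and after the correction the permit entry's hit counter increments, mirroring the check made with show ip access-list.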
6.4.4 Dynamic NAT and extended ACLs
The interaction of dynamic NAT and extended access lists can generate complex network problems, particularly regarding the use of addressing and ports.
Addressing considerations
Recall that a router processes inbound traffic through the inbound access list before applying outside-to-inside NAT. When designing access lists for NAT routers, remember that the destination address of inbound traffic will be the translated address assigned by the outbound NAT translation, not the original address of the inside host.
Dynamic NAT timeouts
When configuring dynamic NAT, different timeout values can be configured for different types of traffic. The figure shows the commands used to change these values for translations built with and without overloading. As discussed earlier, tightly tuned translation timeouts combined with network congestion can cause intermittent problems in network communications. Different