
Differences

This shows you the differences between two versions of the page.

manual:087:4_help.2_debugging [2009/04/17 16:24]
gandalf cactid -> spine
manual:087:4_help.2_debugging [2012/11/30 15:14] (current)
gandalf tackling snmpbulkwalks
Line 7: Line 7:
 Please have a look at your Cacti log file. Usually, you'll find it at //<path_cacti>/log/cacti.log//. Otherwise, see //Settings//, //Paths//. Check for this kind of error:
  
-<code>CACTID: Host[...] DS[....] WARNING: SNMP timeout detected [500 ms], ignoring host '........'</code>
+<code>SPINE: Host[...] DS[....] WARNING: SNMP timeout detected [500 ms], ignoring host '........'</code>
  
 For "reasonable" timeouts, this may be related to a snmpbulkwalk issue. To change this, see //Settings//, //Poller// and lower the value for //The Maximum SNMP OID's Per SNMP Get Request//. Start at a value of 1 and, if the poller starts working, increase it again step by step. Some agents do not have the horsepower to deliver that many OIDs at a time, so the number should be reduced for those older/underpowered devices.
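To see which hosts are affected before changing the poller setting, the log can be scanned for the warning shown above. The following is only a minimal sketch: the regular expression assumes the line shape of the sample warning (the bracket contents are elided there, so they are matched loosely), and the function name `count_snmp_timeouts` is made up for this example.

```python
import re
from collections import Counter

# Assumed line shape, taken from the sample warning in this section.
# The contents of Host[...] and DS[...] are elided there, so they are
# matched loosely. Both the old (CACTID) and new (SPINE) prefix are accepted.
TIMEOUT_RE = re.compile(
    r"(?:SPINE|CACTID): Host\[[^\]]*\].*"
    r"SNMP timeout detected \[(?P<ms>\d+) ms\], ignoring host '(?P<host>[^']*)'"
)


def count_snmp_timeouts(lines):
    """Return a Counter mapping host name -> number of SNMP timeout warnings."""
    hits = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits[match.group("host")] += 1
    return hits
```

Typical use would be to open //<path_cacti>/log/cacti.log//, pass the file object to `count_snmp_timeouts`, and print `most_common()` so that the hosts with the most timeouts come first.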
Line 46: Line 46:
 All output is printed to STDOUT in both cases. This procedure allows for repeated tests without waiting for the next polling interval, and there is no need to search manually for the failing host among hundreds of lines of output.
  
 +==== Check Bulkwalk Behaviour (SNMP Data Queries only) ====
 +
 +The goal of bulkwalks is to reduce SNMP traffic overhead. Instead of one request/response pair per OID, a bulk request asks the agent to return several OIDs in a single response. This feature is not available with SNMP version 1, and some SNMP-enabled devices do not handle snmpbulkwalks well.
 +
 +Cacti uses this feature automatically for SNMP-enabled devices when version 2 or version 3 has been selected. The "max OIDs" field of each host governs how many OIDs are combined into one request. Side note: if the combined data grows too large for a single packet, IP fragmentation takes care of splitting it into chunks manageable by the IP layer.
 +You may be hitting a bulkwalk problem when, e.g., your manual
 +<code>snmpwalk -c <community string> -v 2c <target> <OID></code>
 +produces a result but the Cacti poller output shows NaN.
 +
 +There are two ways to tackle such an issue:
 +  - reduce "max OIDs" to 1: Cacti then disables all Cacti-internal mechanisms that use snmpbulkwalk
 +  - select SNMP version 1: as SNMP v1 does not support snmpbulkwalks, all Cacti-internal and Cacti-external bulkwalk mechanisms are disabled
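Before changing the device settings in Cacti, it can help to confirm from the command line that the device answers a plain walk but struggles with a bulkwalk. The sketch below builds the three Net-SNMP command lines worth comparing and optionally runs them. The helper names `walk_commands` and `run_walks` are made up for this example, and it is an assumption that varying `-Cr` (the max-repetitions value of `snmpbulkwalk`) approximates what Cacti does with "max OIDs".

```python
import subprocess


def walk_commands(target, community, oid, max_repetitions=10):
    """Build Net-SNMP command lines for the three variants worth comparing."""
    return {
        "v1 walk": ["snmpwalk", "-v", "1", "-c", community, target, oid],
        "v2c walk": ["snmpwalk", "-v", "2c", "-c", community, target, oid],
        # -Cr<N> sets max-repetitions for the GETBULK requests.
        "v2c bulkwalk": ["snmpbulkwalk", "-v", "2c", "-c", community,
                         f"-Cr{max_repetitions}", target, oid],
    }


def run_walks(target, community, oid, max_repetitions=10, timeout_s=30):
    """Run each variant; map its name to (status, number of output lines)."""
    results = {}
    commands = walk_commands(target, community, oid, max_repetitions)
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, capture_output=True, text=True,
                                  timeout=timeout_s)
            results[name] = (proc.returncode, len(proc.stdout.splitlines()))
        except subprocess.TimeoutExpired:
            results[name] = ("timeout", 0)
        except FileNotFoundError:
            results[name] = ("not installed", 0)
    return results
```

If the plain walks return rows while the bulkwalk times out or returns fewer lines, that points towards the bulkwalk behaviour described here rather than a community string or firewall problem.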
 +
 +Discussion:\\ 
 +**Cacti-internal bulkwalk mechanisms**: Spine checks the "max OIDs" value. If it is set to a value higher than 1, Spine uses snmpbulkwalk-like code. Otherwise, it uses standard snmpwalks.\\ 
 +
 +**Cacti-external bulkwalk mechanisms**: It has been found that php-snmp automatically uses snmpbulkwalk, even when only snmpwalk has been requested. At the time of writing, php-snmp combines 20 requests/responses. This setting cannot be changed externally, so the ultimate answer is to use SNMP version 1. The drawback of SNMP version 1 is that, e.g., COUNTER64 is **not** available with this setting. As a result, e.g., a **Verbose Query** from within the browser may fail while spine still works. Yes, crazy.
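To judge how much the loss of COUNTER64 matters for a given interface, it helps to estimate how quickly a 32-bit octet counter wraps at full line rate. This is a back-of-the-envelope sketch; the function names are made up for this example, and the 300-second polling interval used in the usage note is an assumption, not something stated on this page.

```python
def wrap_seconds(counter_bits, link_bits_per_second):
    """Seconds until an octet counter of the given width wraps at line rate."""
    octets_per_second = link_bits_per_second / 8
    return (2 ** counter_bits) / octets_per_second


def wraps_between_polls(counter_bits, link_bits_per_second, polling_interval_s):
    """True if the counter can wrap within one polling interval at line rate."""
    return wrap_seconds(counter_bits, link_bits_per_second) < polling_interval_s
```

For example, `wrap_seconds(32, 1e9)` is roughly 34 seconds, so on a saturated gigabit link a 32-bit counter would wrap several times within an assumed 300-second interval, while `wraps_between_polls(32, 100e6, 300)` is false. On fast interfaces, switching to SNMP version 1 therefore trades the bulkwalk problem for unreliable traffic counters.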
 ==== Check MySQL Update ==== ==== Check MySQL Update ====
  
Line 147: Line 164:
 User "criggie" reported an issue with running smartctl. It was complaining "you are not root" so a quick //chmod +s// on the script fixed that problem.
  
-Secondly, the script was taking several seconds to run. So cacti was logging a "U" for unparseable in the debug output, and was recording NAN. So my fix there was to make the script run faster - it has to complete in less than one second, and the age of my box made it difficult to accomplish.
+Secondly, the script was taking several seconds to run. So cacti was logging a "U" for unparseable in the debug output, and was recording NAN. So my fix there was to make the script run faster, and the age of my box made it difficult to accomplish.
 + 
 +The timeout is governed by "Settings -> Poller -> Script and Script Server Timeout Value". In general, it is better to make scripts faster than to raise this value, so that the poller still finishes in time.
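To check whether a script stays within the configured timeout, it can be timed outside of the poller. The helper below is only a sketch: `time_script` is a made-up name, and the timeout passed in should be whatever value is set under "Script and Script Server Timeout Value" on your installation.

```python
import subprocess
import time


def time_script(argv, timeout_s):
    """Run argv once and return (elapsed_seconds, stdout).

    stdout is None if the script did not finish within timeout_s.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
        output = proc.stdout.strip()
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        output = None
    return time.monotonic() - start, output
```

Run it as the same user the poller runs as, since permission problems such as the smartctl case above can change both the output and the run time.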




