Wireless Catalyst 9800 WLC KPIs

This blog will focus on Key Performance Indicators (KPIs) for Access Points (APs) and RF. I will discuss methods and commands that can be used to assess the health of APs as well as RF.

KPIs are divided into different buckets or areas:

  • WLC checks
  • Connection with other devices
  • AP checks
  • RF checks
  • Client checks
  • Packet drops

AP Checks

Let’s now focus on AP health. First, we can verify that the number of APs joined to our WLC is the expected number. Use the command “show ap summary | i Number of APs”. If the AP count is incorrect, we will need to identify the missing APs and the reason they were disconnected. It is helpful to keep a complete list of APs, with Ethernet MAC and IP addresses, taken from a working scenario (“show ap summary”).

Gladius1#show ap sum
Load for five secs: 0%/0%; one minute: 0%; five minutes: 0%
Time source is NTP, 19:18:03.363 CEST Wed May 25 2022
Number of APs: 8

AP Name                 Slots  AP Model          Ethernet MAC    Radio MAC       Location          Country  IP Address       State
----------------------------------------------------------------------------------------------------------------------------------------
AP3800-r2sw1-te1-0-8    2      AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE       192.168.127.108  Registered
9130i-r2sw1-te2016      3      C9130AXI-E        04eb.409e.14c0  04eb.409f.0c60  default location  DE       192.168.25.133   Registered
9130i-r2sw1-te2015      3      C9130AXI-E        04eb.409e.1724  04eb.409f.1f80  default location  DE       192.168.25.122   Registered
9130i-r3-sw2-g1-0-10    3      C9130AXI-B        04eb.409e.1d28  04eb.409f.4fa0  default location  US       192.168.127.113  Registered
AP1562-r3-sw-3-gi1-0-3  2      AIR-AP1562E-E-K9  0062.ec80.8c8c  2c33.1192.3e40  default location  DE       192.168.127.106  Registered
SS-I-1                  2      C9115AXI-B        7069.5a74.7a50  7069.5a78.7780  default location  US       192.168.127.97   Registered
ap3800i-r2-sw1-te1-0-5  2      AIR-AP3802I-E-K9  0042.68c5.bdf0  cc16.7e5f.f000  default location  CH       192.168.127.109  Registered
9120i-r4-sw2-te1-0-39   2      C9120AXI-E        d4e8.8019.60e8  d4e8.801a.3340  default location  DE       192.168.127.114  Registered

Check AP count, and have a list of ethernet mac and IP addresses of all the APs.

To quickly locate and identify missing devices, we can compare the outputs of non-working and working scenarios.
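The comparison between a working and a non-working capture can be sketched in a few lines of Python. This is only an illustration, not a Cisco tool: the function names `parse_ap_summary` and `missing_aps` are hypothetical, and the parser assumes one AP per line with the Ethernet MAC column appearing before the Radio MAC column, as in the “show ap summary” output above.

```python
import re

# Illustrative patterns (assumptions): Cisco dotted MAC and dotted-quad IPv4.
MAC_RE = re.compile(r"^[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}$", re.IGNORECASE)
IP_RE = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")


def parse_ap_summary(text):
    """Map Ethernet MAC -> (AP name, IP address) for each AP row."""
    aps = {}
    for line in text.splitlines():
        fields = line.split()
        macs = [f for f in fields if MAC_RE.match(f)]
        ips = [f for f in fields if IP_RE.match(f)]
        if len(macs) < 2 or not ips:
            continue  # banner, header or separator line
        # Assumption: the Ethernet MAC column comes before the Radio MAC column.
        aps[macs[0].lower()] = (fields[0], ips[0])
    return aps


def missing_aps(baseline_text, current_text):
    """Return the APs present in the baseline capture but absent now."""
    baseline = parse_ap_summary(baseline_text)
    current = parse_ap_summary(current_text)
    return {mac: baseline[mac] for mac in sorted(baseline.keys() - current.keys())}


# Sample rows taken from the output shown in this post.
BASELINE = """\
AP3800-r2sw1-te1-0-8  2  AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE  192.168.127.108  Registered
9130i-r2sw1-te2016    3  C9130AXI-E        04eb.409e.14c0  04eb.409f.0c60  default location  DE  192.168.25.133   Registered
SS-I-1                2  C9115AXI-B        7069.5a74.7a50  7069.5a78.7780  default location  US  192.168.127.97   Registered
"""
CURRENT = """\
AP3800-r2sw1-te1-0-8  2  AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE  192.168.127.108  Registered
9130i-r2sw1-te2016    3  C9130AXI-E        04eb.409e.14c0  04eb.409f.0c60  default location  DE  192.168.25.133   Registered
"""

for mac, (name, ip) in missing_aps(BASELINE, CURRENT).items():
    print(f"missing: {name} {mac} {ip}")
```

With the sample data, the only AP reported as missing is SS-I-1. Keying on the Ethernet MAC rather than the AP name keeps the comparison valid even if an AP has been renamed between captures.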

Even if the WLC has the expected number of APs, it is still important to verify that those APs remain stable. The WLC provides a command that allows us to verify AP uptime (reloads) and CAPWAP tunnel uptime. Use the command “show ap uptime | ex ____([0-9])+ day”. The “exclude” keyword helps us focus on APs that reloaded or disconnected within the last day.

Gladius2#sh ap uptime
Number of APs: 8

AP Name                 Ethernet MAC    Radio MAC       AP Up Time                             Association Up Time
-------------------------------------------------------------------------------------------------------------------------------------
AP3800-r2sw1-te1-0-8    0042.68a0.fc4a  0062.ecf3.8310  26 days 0 hour 57 minutes 41 seconds   15 days 1 hour 50 minutes 4 seconds
9130i-r2sw1-te2015      04eb.409e.1724  04eb.409f.1f80  9 days 3 hours 26 minutes 48 seconds   9 days 3 hours 24 minutes 24 seconds
9130i-r2sw1-te2016      04eb.409e.14c0  04eb.409f.0c60  9 days 1 hour 39 minutes 29 seconds    9 days 1 hour 26 minutes 47 seconds
9120i-r4-sw2-te1-0-39   d4e8.8019.60e8  d4e8.801a.3340  8 days 1 hour 36 minutes 57 seconds    8 days 1 hour 33 minutes 49 seconds
SS-I-1                  7069.5a74.7a50  7069.5a78.7780  26 days 0 hour 54 minutes 57 seconds   22 minutes 15 seconds
ap3800i-r2-sw1-te1-0-5  0042.68c5.bdf0  cc16.7e5f.f000  26 days 0 hour 46 minutes 12 seconds   22 minutes 13 seconds
9130i-r3-sw2-g1-0-10    04eb.409e.1d28  04eb.409f.4fa0  22 minutes 21 seconds                  19 minutes 39 seconds

Check AP uptime and association uptime. In this case SS-I-1 and ap3800i-r2-sw1-te1-0-5 show a disconnection (long AP uptime, short association uptime), while 9130i-r3-sw2-g1-0-10 shows a reload (short AP uptime).

The above command will tell us if there have been any unexpected AP reloads. It is also possible to determine if multiple APs reloaded at the same time. If the reloaded APs are in the same location or connected to the same switch, this could indicate a network or power problem in that area or switch. Similarly, for AP disconnections, we can compare “Association Up Time” across APs to identify patterns and determine if and when any tunnel teardowns occurred. Keep in mind that an AP can flap its CAPWAP tunnel after certain configuration changes (e.g. when a new tag is applied).
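The reload versus disconnection reasoning can be expressed as a small classifier. The sketch below is a hedged illustration: the one-day threshold, the labels and the function names are assumptions chosen for the example, and the parser relies on both uptime columns ending in “seconds”, as they do in the “show ap uptime” output shown in this post.

```python
import re

UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}

# A duration such as "26 days 0 hour 57 minutes 41 seconds" or "22 minutes 15 seconds".
DURATION_RE = re.compile(r"(?:\d+\s+(?:days?|hours?|minutes?)\s+)*\d+\s+seconds?")


def to_seconds(duration):
    """Convert an uptime string from the CLI output into seconds."""
    parts = re.findall(r"(\d+)\s+(day|hour|minute|second)", duration)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def classify_uptime(text, threshold=86400):
    """Label each AP as 'reload', 'capwap-disconnect' or 'stable'.

    threshold is a hypothetical cut-off in seconds (one day by default).
    """
    result = {}
    for line in text.splitlines():
        durations = DURATION_RE.findall(line)
        if len(durations) != 2:
            continue  # header, separator or unexpected line
        name = line.split()[0]
        ap_up, assoc_up = (to_seconds(d) for d in durations)
        if ap_up < threshold:
            result[name] = "reload"  # the AP itself restarted recently
        elif assoc_up < threshold:
            result[name] = "capwap-disconnect"  # AP stayed up, tunnel re-established
        else:
            result[name] = "stable"
    return result


# Sample rows taken from the output shown in this post.
SAMPLE = """\
AP3800-r2sw1-te1-0-8  0042.68a0.fc4a  0062.ecf3.8310  26 days 0 hour 57 minutes 41 seconds  15 days 1 hour 50 minutes 4 seconds
SS-I-1                7069.5a74.7a50  7069.5a78.7780  26 days 0 hour 54 minutes 57 seconds  22 minutes 15 seconds
9130i-r3-sw2-g1-0-10  04eb.409e.1d28  04eb.409f.4fa0  22 minutes 21 seconds                 19 minutes 39 seconds
"""

for ap, status in classify_uptime(SAMPLE).items():
    print(ap, status)
```

A short AP uptime is checked first because a reload always resets the association uptime as well, so only APs with a long AP uptime and a short association uptime are reported as tunnel disconnections.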

If “AP Up Time” is lower than expected and this is not explained by a planned reload, we can check the WLC for AP crash files and look for crash files in the bootflash content. Use the commands “show ap crash-file” and “dir bootflash: | i crash”.

Gladius1#show ap crash-file
File Location: BOOTFLASH

AP Name               Crash File                               Radio Slot 0  Radio Slot 1
------------------------------------------------------------------------------------------
ap3800i-r2-sw1-te0-1  ap3800i-r2-sw1-te0-1_0062ecaade80.crash

Gladius1#dir bootflash: | i crash
54  -rw-   50476   May 9 2022 13:07:34 +02:00  ap3800i-r2-sw1-te0-1_0062ecaade80.crash
66  -rw-  120276  Jan 26 2022 11:46:55 +01:00  AP9120-2-r3-sw2-Gi1-0-39_d4e88019f140.crash
28  -rw-   93952   Nov 2 2021 13:02:21 +01:00  SS-E-2_00eeab18c160.crash
12  -rw-   42975  Oct 27 2021 15:01:44 +02:00  9115i-r4-sw2-te1-0-38_f80f6f154ce0.crash
42  -rw-   42235  May 15 2021 14:24:59 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154960.crash
41  -rw-   26063  Mar 30 2021 13:06:45 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154c80.crash

Check for AP crashes occurring, multiple crashes seen in the same AP, and periodic crashes.

To find new crashes, it is recommended to periodically review bootflash content. Download any new crashes and share them with TAC to conduct root cause analysis. To keep your file system clean, delete any old files.

If we have AP disconnections, the following command allows us to determine the most frequent termination event and the AP state at that time. This gives us a global picture. Use the command “show wireless stats ap session termination”.

Gladius1#show wireless stats ap session termination

Event                    Previous State  Occurance Count
---------------------------------------------------------
DTLS session closed      JOINED          6
Heartbeat timer expiry   JOINED          2
Reset by API             IMAGE_DOWNLOAD  1
Image download status    IMAGE_DOWNLOAD  6
Reset by API             RUN             3
DTLS session closed      RUN             17
Heartbeat timer expiry   RUN             6

Check the events with the highest count. If the AP was in RUN state, disconnections could be due to consistent packet drops.

The AP history command allows us to drill down further on specific APs. Filtering the AP history on disconnections reveals whether multiple APs disconnected at the same moment, and why. Analyzing the output also shows whether the same AP is disconnecting repeatedly and how often. Use the command “show wireless stats ap history | i Disjoined”.

Gladius1#show wireless stats ap history | i Disjoined
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:27:39  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:24:26  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:17:47  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:41:17  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:38:04  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 10:18:04  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 13:02:28  NA  Heart beat timer expiry
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 10:49:34  NA  Heart beat timer expiry
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/05/22 19:53:31  NA  Failure decoding wtp descriptor
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 12:02:38  NA  DTLS close alert from peer
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 11:57:43  NA  Wtp reset config cmd sent
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/10/22 10:54:49  NA  DTLS close alert from peer

Check timestamps and disjoin reason. Find multiple disconnections per AP, disconnections occurring at the same time or periodically.
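Counting disjoin events per AP, per reason and per day makes these patterns easier to see than reading the raw list. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, each event is expected on its own line, and the timestamp is read as month/day/year, which matches the dates in the output above.

```python
import re
from collections import Counter
from datetime import datetime

# Row layout assumed from the output above:
# <AP name> <MAC> Disjoined <mm/dd/yy> <hh:mm:ss> <NA> <reason text>
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>\S+)\s+Disjoined\s+"
    r"(?P<date>\d{2}/\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"\S+\s+(?P<reason>.+)$"
)


def parse_disjoins(text):
    """Return a list of (ap_name, timestamp, reason) tuples."""
    events = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{m['date']} {m['time']}", "%m/%d/%y %H:%M:%S")
        events.append((m["name"], when, m["reason"].strip()))
    return events


def summarize(events):
    """Count disjoins per AP, per reason, and per (AP, calendar day)."""
    per_ap = Counter(name for name, _, _ in events)
    per_reason = Counter(reason for _, _, reason in events)
    per_ap_day = Counter((name, when.date()) for name, when, _ in events)
    return per_ap, per_reason, per_ap_day


# Sample rows taken from the output shown in this post.
SAMPLE = """\
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:27:39  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:24:26  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 13:02:28  NA  Heart beat timer expiry
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 12:02:38  NA  DTLS close alert from peer
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 11:57:43  NA  Wtp reset config cmd sent
"""

per_ap, per_reason, per_ap_day = summarize(parse_disjoins(SAMPLE))
for (ap, day), count in per_ap_day.most_common():
    print(ap, day, count)
```

Several disjoins for one AP on the same day point to a flapping AP, while different APs disjoining on the same day with the same reason are worth checking for a shared switch or uplink.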

A second important thing to check is the AP tag assignment. Tags determine the SSIDs, AP mode, RF profiles and policies for each AP. We can verify that APs have the correct tags and that the tags were assigned by the correct method (tag source). Comparing the tags of APs located in the same area, or of working and non-working APs, can help spot an incorrect tag assignment. Use the command “sh ap tag summary”.

We also need to determine if any AP has misconfigured tags. A tag can be misconfigured because it references a non-existent or removed parameter (such as a policy profile, RF profile, …) or because of an invalid configuration combination. APs that are marked as “misconfigured” will not broadcast any BSSID. Use the command “sh ap tag summary | i Yes”.

Gladius1#sh ap tag summary
Number of APs: 4

AP Name  AP Mac          Site Tag Name     Policy Tag Name     RF Tag Name     Misconfigured  Tag Source
----------------------------------------------------------------------------------------------------------
HG-2     0cd0.f894.0f40  default-site-tag  default-policy-tag  default-rf-tag  No             Default
AP1832I  80e8.6fd8.6330  site2             flex-vlan4          rf-hig          No             Location
ap1700i  f44e.0578.a560  site2             default-policy-tag  default-rf-tag  Yes            Static
AP9120   d4e8.8019.6100  default-site-tag  LOCAL_VLAN169       default-rf-tag  No             Filter

Check for misconfigured tags, the correct tag source, and the same tag assignment for APs in the same branch.

Even when APs are joined and correctly configured, we can still look for signs of misbehavior, such as APs with no clients. It is possible for a perfectly functioning AP to show no clients at a given moment. Based on what we know about the network and the number of clients seen on other APs in the same area, we can identify APs that may be having issues. We can verify that the radios are up and that the AP broadcasts the correct BSSIDs, and then monitor the AP for some time. If the AP still shows no clients after the monitoring period, we can try resetting the AP radio or forcing the AP to rejoin the WLC. Use the command “show ap sum sort descending client-count | i __0__”.

Gladius1#show ap sum sort descending client-count | i __0__
------------------------------------------------------------------------------------------
AP-name  AP-mac          Client count  Data Usage  Through-Put  Admin-State
------------------------------------------------------------------------------------------
9120i    d4e8.801a.3340  0             1407172     515          Enabled
AP1562   2c33.1192.3e40  0             4189901     69           Disabled
AP3800   0062.ecf3.8310  0             48548613    473          Disabled

Check for APs with zero clients and in enabled state.

One example where these AP KPIs helped to identify an issue was a customer facing random AP disconnections. We identified the impacted APs by looking at the “show ap uptime” output. Using the “show ap cdp neighbor” output, we determined that all of these APs were in the same location and connected to the same switch. The disconnection reason was a connection closed by the AP. When we checked the AP logs, we could see multiple CAPWAP packet retransmissions. We then pinged the WLC from the AP and saw packet loss. The same packet loss was observed when pinging the AP’s gateway from the AP. The ping tests showed that there was a connectivity problem between the APs and their gateway.

RF Checks

For each band, we can monitor AP channel assignments, channel widths, transmit power and radio state. We can check whether channels are evenly distributed, to avoid co-channel interference. We can also determine if multiple APs are using maximum Tx power, which could indicate coverage problems, and identify radios that are operationally down. This check should be done for 24ghz and 5ghz and, for APs that support it such as the new 9136, for 6ghz as well. The BSS color assigned to each AP can also be verified. Use the command “show ap dot11 24ghz/5ghz/6ghz summary”.

Gladius1#sh ap dot11 5ghz summary

AP Name  Mac Address     Slot  Admin State  Oper State  Width  Txpwr          Channel  Mode
----------------------------------------------------------------------------------------------
9130E    0c75.bdb5.71e0  1     Enabled      Up          20     *2/8 (21 dBm)  (100)*   Local
9130E    0c75.bdb5.71e0  2     Disabled     Down        20     *1/8 (15 dBm)  (36)*    Local
AP9120A  d4e8.8019.f140  1     Enabled      Up          20     *2/8 (19 dBm)  (40)*    Local
AP9120B  d4e8.801a.3400  1     Enabled      Up          20     7/8 (4 dBm)    (40)     Local

Check for Txpwr 1, uneven channel distribution, radios down, and unexpected static assignment.
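These four checks can be run against a saved capture with a short script. The sketch below is illustrative only: `rf_report` is a hypothetical helper, the row pattern is derived from the sample output above and may need adjusting for other releases, and it assumes that the asterisk next to Txpwr and Channel marks a value assigned automatically, so a missing asterisk is treated as a static assignment.

```python
import re
from collections import Counter

# Row layout assumed from the sample "show ap dot11 5ghz summary" output.
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>[0-9a-f]{4}(?:\.[0-9a-f]{4}){2})\s+(?P<slot>\d+)\s+"
    r"(?P<admin>\S+)\s+(?P<oper>\S+)\s+(?P<width>\d+)\s+"
    r"(?P<auto_pwr>\*?)(?P<level>\d+)/(?P<levels>\d+)\s+\((?P<dbm>-?\d+)\s+dBm\)\s+"
    r"\((?P<channel>[\d,]+)\)(?P<auto_ch>\*?)\s+(?P<mode>\S+)$"
)


def rf_report(text):
    """Summarize channel usage, max-power radios, down radios, static radios."""
    channels = Counter()
    max_power, down, static = [], [], []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue  # header or separator line
        radio = (m["name"], int(m["slot"]))
        if m["oper"] == "Up":
            # Count only operational radios; use the first (primary) channel.
            channels[int(m["channel"].split(",")[0])] += 1
        else:
            down.append(radio)
        if int(m["level"]) == 1:
            max_power.append(radio)  # power level 1 is the highest Tx power
        if not m["auto_pwr"] or not m["auto_ch"]:
            static.append(radio)  # assumption: no asterisk means static value
    return {"channels": dict(channels), "max_power": max_power,
            "down": down, "static": static}


# Sample rows taken from the output shown in this post.
SAMPLE = """\
9130E    0c75.bdb5.71e0  1  Enabled   Up    20  *2/8 (21 dBm)  (100)*  Local
9130E    0c75.bdb5.71e0  2  Disabled  Down  20  *1/8 (15 dBm)  (36)*   Local
AP9120A  d4e8.8019.f140  1  Enabled   Up    20  *2/8 (19 dBm)  (40)*   Local
AP9120B  d4e8.801a.3400  1  Enabled   Up    20  7/8 (4 dBm)    (40)    Local
"""

for key, value in rf_report(SAMPLE).items():
    print(key, value)
```

In the sample, channel 40 is used by two operational radios, slot 2 of 9130E is both down and at power level 1, and AP9120B slot 1 is the only radio without automatic assignment.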

The next statistics allow us to determine how often radios change channel. We can also check whether an AP changed channel because radar was detected on its channel (5ghz). If we see many channel changes and the counters keep rising, client connectivity could be affected: a channel change causes the AP radio to reset, and all clients are disconnected. In 5ghz, when the radio moves to a DFS channel, it must also monitor the channel for 60 seconds before beaconing. Excessive channel changes could indicate RRM or RF issues and should be investigated. Use the command “sh ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count”.

Gladius1#sh ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
Channel Change Count          : 2
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
AP Name                       : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar  : 3
Channel Change Count          : 10
AP Name                       : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar  : 0
Channel Change Count          : 62

Check for a high amount of channel changes and changes due to DFS events.
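When this check is repeated regularly, the filtered output can be parsed and compared against a threshold. The following is a minimal sketch: the function names and the threshold values are hypothetical and should be tuned to the baseline of your own network, and the parser assumes the “key : value” layout of the filtered output above, with each “AP Name” line starting a new radio record.

```python
def parse_channel_changes(text):
    """Return one record per 'AP Name' line with its radar and total counters."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line
        key, value = key.strip(), value.strip()
        if key == "AP Name":
            records.append({"ap": value, "radar": 0, "total": 0})
        elif records and key == "Channel changes due to radar":
            records[-1]["radar"] = int(value)
        elif records and key == "Channel Change Count":
            records[-1]["total"] = int(value)
    return records


def noisy_radios(records, max_total=10, max_radar=0):
    """Names of APs whose counters exceed the (hypothetical) thresholds."""
    return [r["ap"] for r in records
            if r["total"] > max_total or r["radar"] > max_radar]


# Sample lines taken from the output shown in this post.
SAMPLE = """\
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
Channel Change Count          : 2
AP Name                       : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar  : 3
Channel Change Count          : 10
AP Name                       : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar  : 0
Channel Change Count          : 62
"""

print(noisy_radios(parse_channel_changes(SAMPLE)))
```

With these thresholds, AP9120-2-r3-sw2-Gi1-0-39 is flagged for its radar events and AP9120-r3-sw3-Gi1-0-47 for its 62 channel changes. Because the counters are cumulative, comparing two captures taken some time apart shows whether the numbers are still rising.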

Another thing we can check is the radio load, or channel utilization. The Catalyst 9800 WLC displays channel utilization and client count per radio, which helps us identify heavily loaded APs. We can spot APs that have few clients but high load, focus on them, and determine whether the load is due to traffic sent or received by the AP or to co-channel interference. We can also use the load information to identify the busiest areas, which may require more AP density. Use the command “show ap dot11 24ghz/5ghz/6ghz load-info”.

Gladius1#sh ap dot11 5ghz load-info

AP Name  Radio MAC       Slot  Channel Utilization (%)  Clients
-----------------------------------------------------------------
9130E    0c75.bdb5.71e0  1     2                        0
9130E    0c75.bdb5.71e0  2     0                        0
AP9120A  d4e8.8019.f140  1     11                       5
AP9120B  d4e8.801a.3400  1     11                       0

Check for high channel utilization, or channel utilization with no clients (co-channel interference). Here we can suspect co-channel interference because AP9120A and AP9120B are both on channel 40.
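The “utilization with no clients” check is easy to automate on a saved capture. This sketch is only an illustration: `busy_without_clients` is a hypothetical helper, the 10% threshold is an arbitrary value chosen for the example, and the row pattern is derived from the load-info output shown above.

```python
import re

# Row layout assumed from the sample output:
# <AP name> <radio MAC> <slot> <channel utilization %> <clients>
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>[0-9a-f]{4}(?:\.[0-9a-f]{4}){2})\s+"
    r"(?P<slot>\d+)\s+(?P<util>\d+)\s+(?P<clients>\d+)$"
)


def busy_without_clients(text, min_util=10):
    """Return (AP name, slot, utilization) for radios with load but no clients.

    min_util is a hypothetical threshold in percent.
    """
    suspects = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue  # header or separator line
        util, clients = int(m["util"]), int(m["clients"])
        if clients == 0 and util >= min_util:
            suspects.append((m["name"], int(m["slot"]), util))
    return suspects


# Sample rows taken from the output shown in this post.
SAMPLE = """\
9130E    0c75.bdb5.71e0  1  2   0
9130E    0c75.bdb5.71e0  2  0   0
AP9120A  d4e8.8019.f140  1  11  5
AP9120B  d4e8.801a.3400  1  11  0
"""

print(busy_without_clients(SAMPLE))
```

On the sample, only AP9120B slot 1 is reported: it shows 11% utilization with zero clients, which is consistent with it sharing channel 40 with AP9120A.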

One example of an issue identified by looking at RF KPIs was a customer reporting poor client performance. The radio load in 5ghz was high, even though there were very few clients. The load was not caused by the APs transmitting or receiving data, but by co-channel interference. We found that, because of an RF profile misconfiguration, only 4 channels were available to the APs with high load. After adding more channels to the RF profile, utilization decreased and no further performance issues were reported.

For more detailed RF analysis you can use Wireless Config Analyzer Express (WCAE) tool: https://developer.cisco.com/docs/wireless-troubleshooting-tools/#wireless-config-analyzer-express

WCAE will show you the distribution of channels, TXpower, RF metrics per AP, and more details.

With the methodology and commands provided, you can proactively identify issues with the APs and RF of your WLC. In the next blog, we will share 9800 WLC KPIs to check client connectivity and WLC drops/punted packets.

List of commands to use for KPIs and automation scripts

The document linked below also contains a link to a script that automatically gathers all the commands. It selects the commands based on platform and release, saves the outputs to a file, and exports it. The script uses the “Guest Shell” feature, which is currently only available on physical WLCs (9800-40/80 and 9800-L).

The document also includes an example of an EEM script that collects logs regularly. EEM, together with the “Guest Shell” script, allows you to collect 9800 WLC KPIs periodically and build a baseline for your Catalyst 9800 WLC.

For the list of commands used to monitor those KPIs, visit the Monitor Wireless Catalyst 9800 KPIs document.

About JNS

As a Managed Service Provider delivering IT Services in Miami and throughout South Florida, we provide Cisco Wireless deployments for any venue. Call us today to learn more.

 Wireless Catalyst 9800 WLC KPIs

This blog will focus on Key Performance Indicators (AP and RF) for Access Points. I will discuss methods and commands that can be used to assess the health of APs as well as RF.

KPIs different buckets or areas:

  • WLC checks,
  • Connection with other devices
  • AP checks
  • RF checks
  • Client checks
  • Packet Drops.

AP Checks

Let’s now focus on APs health. We can first verify that the number of APs linked to our WLC is the correct number. Use the command “ i Number of APs “. If the AP count is incorrect, we will need to identify missing APs and the reason they were disconnected. It is helpful to have a complete list if APs are needed for a working scenario using ethernet mac and IP addresses. Show a summary “).

Gladius1#show ap sum Load for five secs: 0%/0%; one minute: 0%; five minutes: 0% Time source is NTP, 19:18:03.363 CEST Wed May 25 2022 Number of APs: 8 AP Name               Slots    AP Model              Ethernet MAC    Radio MAC       Location                          Country     IP Address                                 State ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- AP3800-r2sw1-te1-0-8    2      AIR-AP3802I-E-K9      0042.68a0.fc4a  0062.ecf3.8310  default location                  DE          192.168.127.108                            Registered 9130i-r2sw1-te2016      3      C9130AXI-E            04eb.409e.14c0  04eb.409f.0c60  default location                  DE          192.168.25.133                             Registered 9130i-r2sw1-te2015      3      C9130AXI-E            04eb.409e.1724  04eb.409f.1f80  default location                  DE          192.168.25.122                             Registered 9130i-r3-sw2-g1-0-10    3      C9130AXI-B            04eb.409e.1d28  04eb.409f.4fa0  default location                  US          192.168.127.113                            Registered AP1562-r3-sw-3-gi1-0-3  2      AIR-AP1562E-E-K9      0062.ec80.8c8c  2c33.1192.3e40  default location                  DE          192.168.127.106                            Registered SS-I-1                  2      C9115AXI-B            7069.5a74.7a50  7069.5a78.7780  default location                  US          192.168.127.97                             Registered ap3800i-r2-sw1-te1-0-5  2      AIR-AP3802I-E-K9      0042.68c5.bdf0  cc16.7e5f.f000  default location                  CH          192.168.127.109                            Registered 9120i-r4-sw2-te1-0-39   2      C9120AXI-E            d4e8.8019.60e8  d4e8.801a.3340  default location                  DE          192.168.127.114                           
 Registered

Check AP count, and have a list of ethernet mac and IP addresses of all the APs.

To quickly locate and identify missing devices, we can compare the outputs of non-working and working scenarios.

Even if the WLC has an expected number of APs, it is still important to verify that those APs remain stable. WLC provides a command that allows us to verify Capwap tunnel reliability and uptime (reloads). ex ____([0-9])+ day” “exclude” keyword will help us to focus on APs reloaded or disconnected within 1 day.

Gladius2#sh ap uptime Number of APs: 8 AP Name                    Ethernet MAC    Radio MAC       AP Up Time                                          Association Up Time --------------------------------------------------------------------------------------------------------------------------------------------------- AP3800-r2sw1-te1-0-8       0042.68a0.fc4a  0062.ecf3.8310  26 days 0 hour 57 minutes 41 seconds                15 days 1 hour 50 minutes 4 seconds 9130i-r2sw1-te2015         04eb.409e.1724  04eb.409f.1f80  9 days 3 hours 26 minutes 48 seconds                9 days 3 hours 24 minutes 24 seconds 9130i-r2sw1-te2016         04eb.409e.14c0  04eb.409f.0c60  9 days 1 hour 39 minutes 29 seconds                 9 days 1 hour 26 minutes 47 seconds 9120i-r4-sw2-te1-0-39      d4e8.8019.60e8  d4e8.801a.3340  8 days 1 hour 36 minutes 57 seconds                 8 days 1 hour 33 minutes 49 seconds SS-I-1                     7069.5a74.7a50  7069.5a78.7780  26 days 0 hour 54 minutes 57 seconds                22 minutes 15 seconds ap3800i-r2-sw1-te1-0-5     0042.68c5.bdf0  cc16.7e5f.f000  26 days 0 hour 46 minutes 12 seconds                22 minutes 13 seconds 9130i-r3-sw2-g1-0-10       04eb.409e.1d28  04eb.409f.4fa0  22 minutes 21 seconds                               19 minutes 39 seconds

Check uptime and Association uptime. In this case we see SS-I-1 and ap3800i-r2-sw1-te1-0-5 facing disconnection, while 9130i-r3-sw2-g1-0-10 facing reload.

The above command will tell us if there have been any unexpected AP reloads. It is also possible to determine if multiple APs were reloaded at once. It could indicate a problem with the network or power supply in that area/switch if the reloaded APs were at the same place or connected to the same switch. Similar to AP disconnections, we can also compare “Association Uptime” between them to identify patterns, determine if any tunnel teardowns occurred, and when. Keep in mind that APs can flip the CAPWAP tunnel if they make certain configuration changes (e.g. when a new tag has been applied).

If “AP Uptime”, which is not due to general reloading, is lower than anticipated, we can examine the WLC for any AP crashes and bootflash content in any report file. i crash”

Gladius1#show ap crash-file File Location: BOOTFLASH AP Name                         Crash File                Radio Slot 0                       Radio Slot 1 ------------------------------------------------------------------------------------------------------------------------------- ap3800i-r2-sw1-te0-1             ap3800i-r2-sw1-te0-1_0062ecaade80.crash Gladius1#dir bootflash: | i crash 54      -rw-            50476   May 9 2022 13:07:34 +02:00  ap3800i-r2-sw1-te0-1_0062ecaade80.crash 66      -rw-           120276  Jan 26 2022 11:46:55 +01:00  AP9120-2-r3-sw2-Gi1-0-39_d4e88019f140.crash 28      -rw-            93952   Nov 2 2021 13:02:21 +01:00  SS-E-2_00eeab18c160.crash 12      -rw-            42975  Oct 27 2021 15:01:44 +02:00  9115i-r4-sw2-te1-0-38_f80f6f154ce0.crash 42      -rw-            42235  May 15 2021 14:24:59 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154960.crash 41      -rw-            26063  Mar 30 2021 13:06:45 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154c80.crash

Check for AP crashes occurring, multiple crashes seen in the same AP, and periodic crashes.

To find new crashes, it is recommended to periodically review bootflash content. Download any new crashes and share them with TAC to conduct root cause analysis. To keep your file system clean, delete any old files.

If we have AP disconnections, it will allow us to determine what the most frequent termination event is and what the AP state was at that time. This will give us a global picture. Use the command “show wireless stats to terminate ap session”

Gladius1#show wireless stats ap session termination Event                           Previous State                  Occurance Count ------------------------------------------------------------------------------------ DTLS session closed             JOINED                          6 Heartbeat timer expiry          JOINED                          2 Reset by API                    IMAGE_DOWNLOAD                  1 Image download status           IMAGE_DOWNLOAD                  6 Reset by API                    RUN                             3 DTLS session closed             RUN                             17 Heartbeat timer expiry          RUN                             6

Check events with the highest count. If AP was in RUN state disconnections could be due to consistent packet drops.

The AP history command allows us to drill down further on specific APs. Filtering AP history using disconnections will reveal if multiple APs were disconnecting at the same moment and the reason. Analyzing the command output will allow us to determine if multiple disconnections are occurring for the same AP as well as the frequency of these disconnections. i Disjoined”

Gladius1#show wireless stats ap history | i Disjoined ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:27:39  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:24:26  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:17:47  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:41:17  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:38:04  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 10:18:04  NA DTLS close alert from peer ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 13:02:28  NA Heart beat timer expiry ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 10:49:34  NA Heart beat timer expiry ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/05/22 19:53:31  NA Failure decoding wtp descriptor ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 12:02:38  NA DTLS close alert from peer ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 11:57:43  NA Wtp reset config cmd sent ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/10/22 10:54:49  NA DTLS close alert from peer

Check timestamps and disjoin reason. Find multiple disconnections per AP, disconnections occurring at the same time or periodically.

A second important thing to do is to check the AP tag assignment. The SSIDs, AP modes, RF profiles and policies for each AP will be determined by tags. It is possible to verify that APs have the correct tags and use the correct method for assigning them. Comparing tags attached at APs located in the same area, or non-working, could help spot an incorrect tag allocation. Use the command “sh-ap tag summary”

We also need to determine if any AP has misconfigured tags. A non-existent/removed parameter, such as profile policy, RFprofile, …),, or an incorrect configuration combination could cause misconfigured tags. APs that are marked as “misconfigured” will not broadcast any BSSID. i Yes”

Gladius1#sh ap tag summary Number of APs: 4 AP Name   AP Mac      Site Tag Name     Policy Tag Name     RF Tag Name   Misconfigured    Tag Source ---------------------------------------------------------------------------------------------------------- HG-2     0cd0.f894.0f40   default-site-tag   default-policy-tag   default-rf-tag    No      Default AP1832I  80e8.6fd8.6330   site2              flex-vlan4             rf-hig          No      Location ap1700i  f44e.0578.a560   site2              default-policy-tag   default-rf-tag    Yes     Static AP9120   d4e8.8019.6100   default-site-tag   LOCAL_VLAN169        default-rf-tag    No      Filter

Check for misconfigured tags, correct tag source, and same tag assignment for APs in the same branch

We can still check that the APs have been set up correctly and are in good working order to find out if there is any misbehavior. It is possible for a perfectly functioning AP to show no clients at the moment. We can identify APs that may be having issues based on the information we have about the network, and the number clients we see in other APs within the same area. We can verify that radios are working and that the AP broadcasts the correct BSSIDs. Then we monitor the APs for a time. If AP still shows no clients after the monitoring period we can test to reset the AP radio, or to reconnect with WLC. Use command: “show ap sum sort descending client-count | i __0__”

Gladius1#show ap sum sort descending client-count | i __0__ ---------------------------------------------------------------------------------------------------------- AP-name         AP-mac           Client count          Data Usage          Through-Put     Admin-State ---------------------------------------------------------------------------------------------------------- 9120i            d4e8.801a.3340       0                    1407172              515           Enabled AP1562           2c33.1192.3e40       0                    4189901              69            Disabled AP3800           0062.ecf3.8310       0                    48548613             473           Disabled

Check for APs with zero clients and in enabled state.

One example of those AP KPIs that helped to identify an issue was a customer facing AP random AP disconnections. We were able to identify the impacted APs by looking at the show AP availability. We were able to determine that all APs were located in the same place and were connected to one switch using the show ap cdp neighbor output. These APs were disconnected because of a connection being closed by AP. We could see multiple re-transmissions CAPWAP packets when we checked the AP logs. We then tried to ping from AP into WLC, and saw packet loss. When pinging from AP his gateway, the same packet loss was observed. Ping tests showed that there was a connectivity problem between the gateway and the APs.

RF Checks

We can monitor per-band AP channel assignments, channel widths, transmit power and radio state. We can also check whether channels are evenly distributed in order to avoid co-channel interference. Additionally, we can determine whether multiple APs are using maximum TX power, which could indicate coverage problems. It is also possible to identify radios that are marked as down and are not operational. This verification should be done for all APs, including the new 9136 models, on the 2.4 GHz and 5 GHz bands. To check these values, and to verify the BSS Color assigned to each AP, use the command: “show ap dot11 24ghz/5ghz/6ghz summary”.

Gladius1#sh ap dot11 5ghz summary
AP Name  Mac Address     Slot    Admin State    Oper State    Width    Txpwr           Channel    Mode
---------------------------------------------------------------------------------------------------------------------------------------------------------
9130E    0c75.bdb5.71e0  1       Enabled        Up            20       *2/8 (21 dBm)   (100)*     Local
9130E    0c75.bdb5.71e0  2       Disabled       Down          20       *1/8 (15 dBm)   (36)*      Local
AP9120A  d4e8.8019.f140  1       Enabled        Up            20       *2/8 (19 dBm)   (40)*      Local
AP9120B  d4e8.801a.3400  1       Enabled        Up            20       7/8 (4 dBm)     (40)       Local

Check for radios at Txpwr level 1 (maximum power), uneven channel distribution, radios down, and unexpected static assignment.
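The same checks can be scripted against the summary output. The sketch below is a rough illustration in Python and rests on two assumptions that are not stated in the output itself: that the columns are laid out as in the sample above, and that the asterisk next to Txpwr and Channel marks a dynamically assigned value, so a row without asterisks is treated as a static assignment. The function names are hypothetical.

```python
import re
from collections import Counter

# Sample rows copied from the output above (illustrative input only).
SAMPLE = """\
9130E    0c75.bdb5.71e0  1       Enabled        Up            20       *2/8 (21 dBm)   (100)*     Local
9130E    0c75.bdb5.71e0  2       Disabled       Down          20       *1/8 (15 dBm)   (36)*      Local
AP9120A  d4e8.8019.f140  1       Enabled        Up            20       *2/8 (19 dBm)   (40)*      Local
AP9120B  d4e8.801a.3400  1       Enabled        Up            20       7/8 (4 dBm)     (40)       Local
"""

ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>\S+)\s+(?P<slot>\d+)\s+"
    r"(?P<admin>Enabled|Disabled)\s+(?P<oper>Up|Down)\s+(?P<width>\d+)\s+"
    r"(?P<pstar>\*?)(?P<level>\d+)/(?P<levels>\d+)\s+\((?P<dbm>-?\d+) dBm\)\s+"
    r"\((?P<channel>[\d,]+)\)(?P<cstar>\*?)\s+(?P<mode>\S+)\s*$"
)


def parse_radios(text):
    """Turn each radio row into a dict; non-matching lines are ignored."""
    return [m.groupdict() for m in
            (ROW.match(line.strip()) for line in text.splitlines()) if m]


def rf_findings(radios):
    """Group radios by the conditions listed in the check above."""
    def ident(r):
        return (r["name"], int(r["slot"]))

    return {
        # Power level 1 is the maximum transmit power.
        "max_power": [ident(r) for r in radios if r["level"] == "1"],
        "down": [ident(r) for r in radios if r["oper"] == "Down"],
        # Assumption: a missing asterisk means the value was set statically.
        "static": [ident(r) for r in radios
                   if not (r["pstar"] and r["cstar"])],
        # Channel distribution across operational radios only.
        "channel_use": Counter(r["channel"] for r in radios
                               if r["oper"] == "Up"),
    }


if __name__ == "__main__":
    for check, result in rf_findings(parse_radios(SAMPLE)).items():
        print(check, result)
```

For the sample, this reports slot 2 of 9130E as both down and at power level 1, AP9120B as statically assigned, and two operational radios sharing channel 40.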

The next statistics allow us to determine how often radios change channel. On 5 GHz, we can also see whether an AP changed channel because radar was detected on its channel. Client connectivity could be affected if we see many channel changes and the numbers keep rising: a channel change causes the AP radio to reset and disconnects all clients, and on a 5 GHz DFS channel the AP radio must monitor the channel for 60 seconds before beaconing. Excessive channel changes could indicate RRM or RF issues and should be investigated. Use the command: “show ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count”

Gladius1#sh ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count
AP Name                                           : 9130E-r3-sw2-g1014
Channel changes due to radar              : 0
Channel Change Count                          : 2
AP Name                                           : 9130E-r3-sw2-g1014
Channel changes due to radar              : 0
AP Name                                           : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar              : 3
Channel Change Count                          : 10
AP Name                                           : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar              : 0
Channel Change Count                          : 62

Check for a high amount of channel changes and changes due to DFS events.
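As a rough illustration, the filtered output can be turned into per-radio records and screened in Python. The sketch assumes the “key : value” layout shown above, with each “AP Name” line starting a new radio record. The threshold of 20 channel changes is a hypothetical value chosen for the example, not a Cisco recommendation.

```python
# Sample text copied from the output above (illustrative input only).
SAMPLE = """\
AP Name                                           : 9130E-r3-sw2-g1014
Channel changes due to radar              : 0
Channel Change Count                          : 2
AP Name                                           : 9130E-r3-sw2-g1014
Channel changes due to radar              : 0
AP Name                                           : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar              : 3
Channel Change Count                          : 10
AP Name                                           : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar              : 0
Channel Change Count                          : 62
"""


def parse_channel_changes(text):
    """Build one record per radio; each 'AP Name' line starts a new record."""
    records = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'key : value' pair
        key, value = key.strip(), value.strip()
        if key == "AP Name":
            current = {"ap": value, "radar": 0, "total": 0}
            records.append(current)
        elif current is not None and key == "Channel changes due to radar":
            current["radar"] = int(value)
        elif current is not None and key == "Channel Change Count":
            current["total"] = int(value)
    return records


def flag_channel_changes(records, max_changes=20):
    """Return AP names with radar-triggered changes or a high total count."""
    return [r["ap"] for r in records
            if r["radar"] > 0 or r["total"] > max_changes]


if __name__ == "__main__":
    print(flag_channel_changes(parse_channel_changes(SAMPLE)))
```

Because these counters are cumulative, comparing two snapshots taken some time apart gives a better picture of whether the numbers are rising than a single reading does.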

Another thing we can check is the radio load, or channel utilization. The Catalyst 9800 WLC displays channel utilization and client count per radio, which helps us identify heavily loaded APs. APs that have few clients but high load deserve particular attention, so that we can determine whether the load is due to traffic sent or received by the AP or to co-channel interference. We can also use the load information to identify the busiest areas and those that may require more AP density. Use the command: “show ap dot11 24ghz/5ghz/6ghz load-info”

Gladius1#sh ap dot11 5ghz load-info
AP Name              Radio MAC       Slot  Channel Utilization (%)  Clients
----------------------------------------------------------------------------------------
9130E                0c75.bdb5.71e0     1                        2        0
9130E                0c75.bdb5.71e0     2                        0        0
AP9120A              d4e8.8019.f140     1                       11        5
AP9120B              d4e8.801a.3400     1                       11        0

Check for high channel utilization, or channel utilization with no clients (co-channel interference). In this output we can see co-channel interference: AP9120A and AP9120B are both on channel 40, and AP9120B shows 11% utilization with no clients.
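A small Python sketch of this check is shown below, assuming the load-info output keeps the column layout of the sample above. Both thresholds (50% for “high” utilization and 10% for utilization on a radio with no clients) are hypothetical values picked for illustration and should be tuned to your own baseline.

```python
import re

# Sample rows copied from the output above (illustrative input only).
SAMPLE = """\
9130E                0c75.bdb5.71e0     1                        2        0
9130E                0c75.bdb5.71e0     2                        0        0
AP9120A              d4e8.8019.f140     1                       11        5
AP9120B              d4e8.801a.3400     1                       11        0
"""

# Columns: AP name, radio MAC, slot, channel utilization (%), clients.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(?P<slot>\d+)\s+(?P<util>\d+)\s+(?P<clients>\d+)\s*$"
)


def load_findings(text, high_util=50, idle_util=10):
    """Flag radios with high load, or with load but no associated clients."""
    findings = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header and separator lines do not match
        util, clients = int(m["util"]), int(m["clients"])
        radio = (m["name"], int(m["slot"]))
        if util >= high_util:
            findings.append(radio + ("high utilization",))
        elif clients == 0 and util >= idle_util:
            # Load with nobody associated suggests co-channel interference.
            findings.append(radio + ("utilization without clients",))
    return findings


if __name__ == "__main__":
    for finding in load_findings(SAMPLE):
        print(finding)
```

On the sample data this flags only AP9120B, which matches the manual reading above.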

One example of an issue identified by looking at RF KPIs was a customer with poor client performance. The radio load at 5 GHz was high even though there were very few clients. The load was not caused by transmitting or receiving data, but by co-channel interference. We found that an RF profile configuration issue had left only 4 channels available to the APs with high load. Utilization decreased after additional channels were added to the RF profile channel list, and no further performance issues were reported.

For more detailed RF analysis, you can use the Wireless Config Analyzer Express (WCAE) tool: https://developer.cisco.com/docs/wireless-troubleshooting-tools/#wireless-config-analyzer-express

WCAE will show you the distribution of channels, TXpower, RF metrics per AP, and more details.

With the methodology and commands provided, you can proactively identify whether there are any issues with your WLC's APs and RF. In the next blog, we will share 9800 WLC KPIs to check client connectivity and WLC drops/punted packets.

List of commands to use for KPIs and automation scripts

The document below also contains a link to a script that automatically gathers all of these commands. It selects the commands to collect based on platform and release, saves the output to a file, and exports it. The script uses the “Guestshell” feature, which is currently only available on the physical 9800-40, 9800-80 and 9800-L WLCs.

The document also includes an example of an EEM script that collects logs at regular intervals. Used together, EEM and the “Guestshell” script allow you to collect 9800 WLC KPIs and build a baseline for your Catalyst 9800 WLC.

For the list of commands used to monitor those KPIs, visit the Monitor Wireless Catalyst 9800 KPIs document.

About JNS

As a Managed Service Provider delivering IT Services in Miami and throughout South Florida, we provide Cisco Wireless deployments for any venue. Call us today to learn more.