v1.5.16 Bug Reports and Comments

Wed May 29, 2024 2:09 am

Yes, it has only tagged VLANs, and no untagged. About 20 VLANs in total.

Wed Jun 19, 2024 11:12 am

A little gun-shy after 1.5.15, but I upgraded one of my test units to 1.5.16 this morning.

Sun Jul 07, 2024 8:53 pm

We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.

Sun Jul 21, 2024 10:35 pm

The watchdog timer fails to bounce the port and reports the line below. But what is "latitude: x"? It should read "watchdog triggered on port 2, action is Bounce Power", which is what the other switches report. Why does this one mention latitude and fail to bounce the port?

Jul 20 05:35:56 switch[1760]: Watchdog '5XHD-MTR2TDC' failure checking 10.11.0.232, watchdog triggered on port 2 (5XHD-MTR2TDC), action is Bounce Power switch latitude: x

EDIT: I think having a "-" in the port name might be triggering bad things to happen with the watchdog. Removed it for testing.
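
One way to spot affected units is to check each watchdog log line for text after the action name. A minimal Python sketch, assuming a healthy message ends right after the action as in the expected text quoted above; the regex and the function name `check_watchdog_line` are illustrative and not part of the firmware:

```python
import re

# Assumed shape of a healthy line, based on the message quoted above:
#   "... watchdog triggered on port N (NAME), action is ACTION"
WATCHDOG_RE = re.compile(
    r"watchdog triggered on port (?P<port>\d+) \((?P<name>[^)]*)\), "
    r"action is (?P<rest>.*)$"
)

# Action names we expect; only the one seen in this thread is listed.
KNOWN_ACTIONS = ("Bounce Power",)


def check_watchdog_line(line):
    """Return (port, name, action, trailing), or None if not a watchdog trigger line."""
    m = WATCHDOG_RE.search(line.strip())
    if m is None:
        return None
    rest = m.group("rest")
    for action in KNOWN_ACTIONS:
        if rest.startswith(action):
            trailing = rest[len(action):].strip()
            return int(m.group("port")), m.group("name"), action, trailing
    # Unknown action: hand everything back as trailing text for manual review.
    return int(m.group("port")), m.group("name"), "", rest


if __name__ == "__main__":
    sample = ("Jul 20 05:35:56 switch[1760]: Watchdog '5XHD-MTR2TDC' failure checking "
              "10.11.0.232, watchdog triggered on port 2 (5XHD-MTR2TDC), "
              "action is Bounce Power switch latitude: x")
    port, name, action, trailing = check_watchdog_line(sample)
    if trailing:
        print(f"port {port} ({name}): unexpected text after '{action}': {trailing!r}")
    if "-" in name:
        print(f"port {port}: name {name!r} contains '-'")
```

Running it over a saved log from each switch would show whether the extra text only appears on ports with a "-" in the name, which is the theory being tested here.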

Mon Aug 05, 2024 10:34 am

oeyre wrote:We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.

It's been a few weeks now of our limited trial upgrading from 1.5.8 to 1.5.16, what has everyone's experience been like?

While we can't fault the software for day-to-day operations, we are seeing that the memory used keeps going up. Is this normal/expected? Do we have a memory leak, or is this just Linux doing what it does and keeping everything in memory until it has to let go?

Screenshot: https://ibb.co/YPyZy2B

Mon Sep 30, 2024 5:16 pm

As has been noted in the 1.5.17rcx thread, the discovery tab may be causing a memory leak. I am experiencing a similar, if not the same, issue in 1.5.16. I have now set up a network monitor to track memory usage via SNMP so I can catch the switches before they crash and recover.

I had one that I estimate was up for about 18 days before it dropped and came back, while another was at 43 days with memory usage at 86 MB. Same model (WS-12-250-DC), similar configuration: no SFP, no LAG or LACP. The one that dropped at about 18 days does see a lot more traffic, though, and is closer to the core of our network.

I'll report back in time with an update. This leak seems to be slow compared to the SNMP leak.
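
For anyone tracking this the same way, the collected samples can be turned into a rough days-until-exhaustion figure with a least-squares line. A minimal Python sketch; the readings, the 100 MB ceiling, and the function names are made up for illustration, and how the switch is polled over SNMP is left out:

```python
def fit_line(samples):
    """Least-squares fit of used_mb = slope * day + intercept.

    samples: list of (uptime_days, used_mb) pairs from at least two different times.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("need samples from at least two different times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(samples, ceiling_mb):
    """Days from the last sample until the fitted line reaches ceiling_mb.

    Returns None when usage is flat or falling (no growth visible in the data).
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    last_day = max(x for x, _ in samples)
    day_full = (ceiling_mb - intercept) / slope
    return max(0.0, day_full - last_day)


if __name__ == "__main__":
    # Hypothetical readings: (uptime in days, memory used in MB).
    readings = [(0, 25.0), (10, 40.0), (20, 55.0), (30, 70.0)]
    slope, _ = fit_line(readings)
    print(f"growth: {slope:.2f} MB/day")
    print(f"days left before ceiling: {days_until_full(readings, 100.0):.1f}")
```

A straight line is only a rough model; if the leak depends on traffic, as the 18-day versus 43-day units suggest it might, the slope will differ a lot from unit to unit.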

Sat Oct 05, 2024 9:48 am

oeyre wrote:
oeyre wrote:We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.

It's been a few weeks now of our limited trial upgrading from 1.5.8 to 1.5.16, what has everyone's experience been like?

So just an update here... We've just finished updating our network to 1.5.16 (more than 150 units). Most went fine without issue. Some had trouble actually running the upgrade, but that was resolved by a reboot before attempting again.

However, on 2 units in particular we are seeing CPU utilisation constantly at 100%. The first unit (WS-12-250-AC rev F) has seemingly had this problem since forever. The second unit (WS-12-250-DC rev F) was fine before the upgrade, but has had high CPU ever since.

I ran top via console/cmdline and in both cases vtss_appl is using at least 60%, then switch ~10%.

Is there a way to enable any additional debugging/profiling to see what is actually happening here?

Screens
CPU: unit 1 / unit 2
top: unit 1 / unit 2

Tue Oct 22, 2024 2:03 pm

bipbaep wrote:Yes, it has only tagged VLANs, and no untagged. About 20 VLANs in total.

I just experienced a similar issue, but in my case no VLANs were tagged and one VLAN was set as untagged. It also happens to be the VLAN that is set as the management VLAN for this switch.

For example, the management VLAN is 3000. When I enabled 2 ports, the connected devices ended up on VLAN 1, not 3000.

To resolve it, I excluded VLAN 3000 from the 2 ports and saved, then set them back to untagged on VLAN 3000 and saved (through the web UI). After that the connected devices were on VLAN 3000.

Luckily for me these were new devices local to this switch and not used for up or down links.

Mon Nov 11, 2024 11:25 pm

So I'd just like to update everyone that we are noticing a pattern regarding the memory continuously increasing.

Some units are now at 60 days uptime and have been completely fine operationally, and we've seen the memory hold steady with approx 25MB used and 70-80MB free.

On the other hand, we've noticed a trend where units that were recording a memory climb, per one of my previous comments, are now rebooting either unprompted or when somebody tries to log in via web. This seems to reliably happen at 46-47 days of uptime.

I've yet to go through and matrix all of the various config differences between these, however it appears to be happening across multiple combinations of port count and AC/DC.

Edit: all affected seem to have the following line in the logs

switch[888]: Detected cold (watchdog) boot
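
To check which units are affected, saved logs can be scanned for that line. A minimal Python sketch, assuming each unit's log has been exported as plain text and that lines may start with a `Mon DD HH:MM:SS` timestamp like the other log excerpt in this thread; the function name and the sample text are illustrative:

```python
import re

# Matches e.g. "Jul 20 05:35:56 switch[888]: Detected cold (watchdog) boot".
# The timestamp is optional because the line quoted above has none.
BOOT_RE = re.compile(
    r"^(?:(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) )?"
    r"switch\[\d+\]: Detected cold \(watchdog\) boot\s*$"
)


def find_watchdog_boots(log_text):
    """Return one entry per watchdog boot line: its timestamp, or None if absent."""
    hits = []
    for line in log_text.splitlines():
        m = BOOT_RE.match(line.strip())
        if m:
            hits.append(m.group("stamp"))
    return hits


if __name__ == "__main__":
    # Made-up log excerpt for illustration only.
    sample = """\
Nov 10 03:12:01 switch[888]: Detected cold (watchdog) boot
Nov 10 03:12:09 switch[888]: some other message
switch[888]: Detected cold (watchdog) boot
"""
    boots = find_watchdog_boots(sample)
    print(f"{len(boots)} watchdog boot(s): {boots}")
```

Comparing the hit list per unit against the uptime at reboot would help confirm whether the 46-47 day pattern and the watchdog boot line always go together.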
