v1.5.16 Bug Reports and Comments
Re: v1.5.16 Bug Reports and Comments
Yes, it has only tagged VLANs, and no untagged. About 20 VLANs in total.
- mnitech
- Member
- Posts: 21
- Joined: Wed Oct 21, 2015 1:21 pm
- Location: Morgan Hill, CA
- Has thanked: 0 time
- Been thanked: 0 time
Re: v1.5.16 Bug Reports and Comments
A little gun-shy after 1.5.15, but I upgraded one of my test units to 1.5.16 this morning.
- oeyre
- Member
- Posts: 24
- Joined: Mon Feb 05, 2024 1:38 am
- Location: Australia
- Has thanked: 0 time
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.
Re: v1.5.16 Bug Reports and Comments
The watchdog timer fails to bounce the port and reports the line below. But what is "latitude: x"? It should read "watchdog triggered on port 2, action is Bounce Power", which is what the other switches report. Why does this one mention latitude and fail to bounce the port?
Jul 20 05:35:56 switch[1760]: Watchdog '5XHD-MTR2TDC' failure checking 10.11.0.232, watchdog triggered on port 2 (5XHD-MTR2TDC), action is Bounce Power switch latitude: x
EDIT: I think having a - in the port name might be triggering bad things with the watchdog. I removed it for testing.
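For anyone wanting to check their own logs for the same symptom, here is a rough Python sketch that pulls the fields out of a watchdog line like the one above and reports anything left over after the action. The regex is only inferred from the single log line in this post, and "Bounce Power" is the only action name assumed; it is not based on any documented log format.

```python
import re

# Pattern inferred from the one log line quoted in this post (an assumption,
# not a documented format). The port name in parentheses is optional.
WATCHDOG_RE = re.compile(
    r"Watchdog '(?P<name>[^']*)' failure checking (?P<target>[^,\s]+), "
    r"watchdog triggered on port (?P<port>\d+)(?: \((?P<port_name>[^)]*)\))?, "
    r"action is (?P<rest>.*)$"
)

# Only action seen in this thread; extend if your switches log others.
KNOWN_ACTIONS = ("Bounce Power",)


def parse_watchdog_line(line):
    """Return the watchdog fields as a dict, or None if the line does not match."""
    m = WATCHDOG_RE.search(line.rstrip("\n"))
    if m is None:
        return None
    info = m.groupdict()
    rest = info.pop("rest")
    info["port"] = int(info["port"])
    for action in KNOWN_ACTIONS:
        if rest.startswith(action):
            info["action"] = action
            # Anything after the action name is unexpected extra text.
            info["trailing"] = rest[len(action):].strip()
            break
    else:
        info["action"] = rest.strip()
        info["trailing"] = ""
    return info


line = ("Jul 20 05:35:56 switch[1760]: Watchdog '5XHD-MTR2TDC' failure checking "
        "10.11.0.232, watchdog triggered on port 2 (5XHD-MTR2TDC), "
        "action is Bounce Power switch latitude: x")
info = parse_watchdog_line(line)
if info and info["trailing"]:
    print(f"port {info['port']}: unexpected text after action: {info['trailing']!r}")
```

A non-empty "trailing" field would flag the lines that carry the extra "switch latitude" text, which makes it easier to compare affected and unaffected switches.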
- oeyre
- Member
- Posts: 24
- Joined: Mon Feb 05, 2024 1:38 am
- Location: Australia
- Has thanked: 0 time
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
oeyre wrote: We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.
It's been a few weeks now of our limited trial upgrading from 1.5.8 to 1.5.16. What has everyone's experience been like?
While we can't fault the software for day-to-day operations, we are seeing that the memory used keeps going up. Is this normal/expected? Do we have a memory leak, or is this just Linux doing what it does and keeping everything in memory until it has to let go?
Screenshot: https://ibb.co/YPyZy2B
- JeffreyS
- Member
- Posts: 22
- Joined: Thu Nov 11, 2021 12:20 pm
- Has thanked: 5 times
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
As has been noted in the 1.5.17rcx thread, the discovery tab may be causing a memory leak. I am experiencing a similar, if not the same, issue in 1.5.16. I have now set up a network monitor to track memory usage via SNMP so I can catch the switches before they crash and recover.
I had one that I estimate was up for about 18 days before it dropped and came back, while another was at 43 days with memory usage at 86 MB. Same model (WS-12-250-DC), similar configuration: no SFP, no LAG or LACP. The one that dropped at about 18 days does see a lot more traffic, though, and is closer to the core of our network.
I'll report back in time with an update. This leak seems to be slow compared to the SNMP leak.
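If you are collecting memory readings like this, a simple straight-line fit gives a rough leak rate and a guess at how long a unit has left. The sketch below assumes you already have (uptime in days, used memory in MB) pairs from your monitor. The sample values and the 90 MB limit are made up for illustration and are not measurements from these switches.

```python
def leak_rate_mb_per_day(samples):
    """Least-squares slope of used memory (MB) against uptime (days)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx


def days_until(samples, limit_mb):
    """Days from the latest sample until the fitted trend reaches limit_mb.

    Returns None when memory use is flat or falling.
    """
    rate = leak_rate_mb_per_day(samples)
    if rate <= 0:
        return None
    _, latest_used = max(samples)  # sample with the highest uptime
    return max(0.0, (limit_mb - latest_used) / rate)


# Hypothetical readings: (uptime in days, used memory in MB).
samples = [(0, 20.0), (10, 30.0), (20, 40.0)]
rate = leak_rate_mb_per_day(samples)
remaining = days_until(samples, limit_mb=90.0)  # 90 MB is an assumed alert limit
print(f"{rate:.2f} MB/day, about {remaining:.0f} days until the limit")
```

A linear fit is only a first approximation. If the leak depends on traffic, as the 18-day unit suggests it might, the rate will differ from switch to switch.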
- oeyre
- Member
- Posts: 24
- Joined: Mon Feb 05, 2024 1:38 am
- Location: Australia
- Has thanked: 0 time
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
oeyre wrote: oeyre wrote: We upgraded 3 units overnight and will see how it goes for a few weeks before doing the rest of the network.
It's been a few weeks now of our limited trial upgrading from 1.5.8 to 1.5.16. What has everyone's experience been like?
So just an update here: we've just finished updating our network to 1.5.16 (more than 150 units). Most went fine without issue. Some had trouble actually running the upgrade, but this was resolved by a reboot before attempting again.
However, 2 units in particular are showing CPU utilisation constantly at 100%. The first unit (WS-12-250-AC rev F) has seemingly always had this problem. The second unit (WS-12-250-DC rev F) was fine before the upgrade but has had high CPU ever since.
I ran top via console/cmdline and in both cases vtss_appl is using at least 60%, then switch ~10%.
Is there a way to enable any additional debugging/profiling to see what is actually happening here?
Screens
CPU: unit 1 / unit 2
top: unit 1 / unit 2
- JeffreyS
- Member
- Posts: 22
- Joined: Thu Nov 11, 2021 12:20 pm
- Has thanked: 5 times
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
bipbaep wrote: Yes, it has only tagged VLANs, and no untagged. About 20 VLANs in total.
I just experienced a similar issue, but in my case no VLANs were tagged and one VLAN was set as untagged. It also happens to be the VLAN that is set as the management VLAN for this switch.
For example, the management VLAN is 3000. When I enabled 2 ports, the connected devices ended up on VLAN 1, not 3000.
To resolve it, I excluded VLAN 3000 from the 2 ports and saved, then set them back to untagged on VLAN 3000 and saved (through the web UI). After that the connected devices were on VLAN 3000.
Luckily for me these were new devices local to this switch and not used for uplinks or downlinks.
- oeyre
- Member
- Posts: 24
- Joined: Mon Feb 05, 2024 1:38 am
- Location: Australia
- Has thanked: 0 time
- Been thanked: 10 times
Re: v1.5.16 Bug Reports and Comments
So I'd just like to update everyone that we are noticing a pattern regarding the memory continuously increasing.
Some units are now at 60 days uptime and have been completely fine operationally, and we've seen the memory hold steady with approx 25MB used and 70-80MB free.
On the other hand, we've noticed a trend where units that were recording a memory climb, per one of my previous comments, are now rebooting either unprompted or when somebody tries to log in via web. This seems to reliably happen at 46-47 days of uptime.
I've yet to go through and compare all of the various config differences between these, however it appears to be happening across multiple combinations of port count and AC/DC.
Edit: all affected seem to have the following line in the logs:
switch[888]: Detected cold (watchdog) boot
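To find which units have logged this, something like the following Python sketch can scan saved logs for the line above. It assumes you have already exported each unit's log to a list of lines keyed by a unit name of your choosing. The unit names and timestamps in the example are hypothetical, and only the message text comes from this post.

```python
import re

# Matches the message quoted above, with any process ID in the brackets.
BOOT_RE = re.compile(r"switch\[\d+\]: Detected cold \(watchdog\) boot")


def units_with_watchdog_boot(logs_by_unit):
    """Map each unit name to its matching log lines; units without a match are omitted."""
    hits = {}
    for unit, lines in logs_by_unit.items():
        matched = [ln.rstrip("\n") for ln in lines if BOOT_RE.search(ln)]
        if matched:
            hits[unit] = matched
    return hits


# Hypothetical sample data: unit names and timestamps are invented.
logs = {
    "unit-a": [
        "Jan  1 00:00:10 switch[888]: Detected cold (watchdog) boot",
        "Jan  1 00:00:12 switch[888]: some other message",
    ],
    "unit-b": [
        "Jan  1 00:00:12 switch[901]: some other message",
    ],
}
hits = units_with_watchdog_boot(logs)
for unit, lines in hits.items():
    print(unit, len(lines), "watchdog boot(s)")
```

Comparing the timestamps of the matches against each unit's previous boot time would show whether the 46-47 day pattern holds across the network.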