Multiple WS-8-150-AC Units Hanging

bdowne01
Member
 
Posts: 9
Joined: Thu Jul 07, 2016 3:01 pm
Has thanked: 0 time
Been thanked: 1 time

Fri Jul 05, 2019 3:34 pm

Hello folks. Got a bit of a head-scratcher over here. We have a particular tower site where WS-8-150-AC units freeze and require a power cycle to come back up. Run time between hangs ranges from a few days down to a few hours. Another suspect behavior: on-location firmware updates on these units also result in a hang, necessitating a power cycle every time.

Thinking it was a hardware issue, we swapped in a few new WS-8-150-AC units. All in turn exhibit the same behaviors. All units were running various firmware in the 1.5.0-1.5.2 range, but trying different firmware revisions doesn't seem to resolve the hangs. I'm thinking it may be environmental, but I'm not sure how to narrow it down without more logging data. For what it's worth, the other equipment (Ubiquiti EdgeRouter, a cheap Netgear switch, DigitalLoggers WebPower Switch, and various radios) doesn't seem to be bothered at all.

The site is outdoors and runs warm (as the central California valley does), but nothing outlandish. Temps max out at about 75-80C on the CPU/PHY in the afternoons, and the board tops out around 60-65C. It should be noted that hangs have occurred at any time of day, even during cool periods. There doesn't seem to be a timing pattern.

I've enabled remote syslog, but unfortunately the logs don't appear to reveal any clues. All we see is some spanning-tree messages from the end of the prior reboot, followed by the new boot sequence messages, as follows:

Jul 5 07:32:31 192.168.6.6 STP: msti 0 set port 4 to forwarding sw-01
Jul 5 07:32:31 192.168.6.6 STP: msti 0 set port 3 to forwarding sw-01
Jul 5 07:32:31 192.168.6.6 STP: msti 0 set port 2 to forwarding sw-01
Dec 31 16:00:20 192.168.6.6 STP: msti 0 set port 3 to discarding sb-dry-dsw-01
Dec 31 16:00:20 192.168.6.6 Port: link state changed to 'down' on port 2 sw-01
Dec 31 16:00:20 192.168.6.6 Port: link state changed to 'down' on port 6 sw-01
Dec 31 16:00:20 192.168.6.6 Port: link state changed to 'down' on port 3 sw-01

 



Questions:

a) Is there a way to bump up verbosity on the log messaging (specifically syslog)?

b) Forum searches on temperature seem to show ours is within a reasonable range, so we've mostly ruled that out. Could it still be a factor?

c) We're going to put a different UPS in at the site, but I'm not sure it's dirty power if all the other equipment seems fine. Still a possibility.

Anything obvious we're missing? We have dozens of Netonix WISPSwitch units and they're all bulletproof except for this particular site.

bdowne01

Re: Multiple WS-8-150-AC Units Hanging

Sun Jul 07, 2019 11:43 am

We're definitely zeroing in on this being heat related. The attached telemetry is beginning to show a pattern. The gaps in the data are when the switch goes unstable. Yesterday the Netonix wouldn't stay up more than 10 minutes before hanging in the middle of the afternoon.

Yet the temps appear to be well within spec. I checked another tower site (albeit with a different Netonix model) and it's up in the same temp range and having no stability issues whatsoever. Could this be a model-specific issue with the WS-8-150-AC? We haven't tried a different model at this site yet. I think our next step will be to swap in a different model and see if the issues persist.
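Since the gaps in the telemetry are being used as a stand-in for hang events, they can be listed automatically rather than read off the graph. The following is a rough sketch only: the 5-minute poll interval, the tolerance factor, and the sample timestamps are all made up for illustration and are not taken from the NMS used here.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, poll_interval=timedelta(minutes=5), tolerance=2.0):
    """Return (start, end) pairs where consecutive samples are further apart
    than tolerance * poll_interval."""
    limit = poll_interval * tolerance
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Made-up sample: 5-minute polls with one 40-minute hole in the afternoon.
base = datetime(2019, 7, 6, 14, 0)
samples = [base + timedelta(minutes=m) for m in (0, 5, 10, 50, 55, 60)]

for start, end in find_gaps(samples):
    print(f"no data from {start:%H:%M} to {end:%H:%M} ({end - start})")
```

Lining up the start of each gap with the last temperature sample before it is one way to check whether hangs cluster around a particular reading.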

Image

mike99
Associate
 
Posts: 837
Joined: Tue Nov 25, 2014 10:53 am
Location: Quebec, Canada
Has thanked: 95 times
Been thanked: 245 times

Re: Multiple WS-8-150-AC Units Hanging

Sun Jul 07, 2019 7:50 pm

From the specs, the operating range is -25 to 55C, so that's likely.

bdowne01

Re: Multiple WS-8-150-AC Units Hanging

Tue Jul 09, 2019 4:57 pm

An update to this thread so it hopefully helps out others in the future: If you're in a warm environment, you may not want to use the WS-8-150-AC.

This past Sunday we swapped in a WS-12-250-AC. Though running in exactly the same spot as the WS-8, temps reported are much cooler (almost 10C) and the switch has remained stable since then. Below is the telemetry from our NMS showing the recorded metrics and when we made changes. The ambient temperature is generally very consistent (and so is the weather), so it doesn't appear to be an outside factor.
Image

sirhc
Employee
 
Posts: 7415
Joined: Tue Apr 08, 2014 3:48 pm
Location: Lancaster, PA
Has thanked: 1608 times
Been thanked: 1325 times

Re: Multiple WS-8-150-AC Units Hanging

Wed Jul 10, 2019 10:50 am

All the WS models use the same SOC and same basic design and run the same firmware so they all have the same operating temperature ranges.

If your WS-8-150-AC is older, say more than a year old, it is possible it has defective 3V CAPs which are on the edge and are failing with the increased heat.

Read this post on the 3V CAP issue, which affected some units in 2 production batches of the WS-6 and WS-8-150-AC models, and how to bench test for it: https://forum.netonix.com/viewtopic.php?f=6&t=2780#p19221
Support is handled on the Forums not in Emails and PMs.
Before you ask a question use the Search function to see if it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.

bdowne01

Re: Multiple WS-8-150-AC Units Hanging

Wed Jul 10, 2019 11:58 am

Thanks for that link Chris. I'll follow those instructions to bench test our units.

What do you think about the recorded temperature shift? I understand they all have the same operating temperature and basic design, but that doesn't explain the large differential in operating temps. Do you think it could just be as simple as the 12-port having better ventilation or something? 6 of the ports are doing 24V PoE, so perhaps it's the reduced relative load on the 250W vs. the 150W model?

I'm curious about this since our area is definitely on the hot side (summer temps topping out at 110F), and small variances like this end up being a large factor in equipment deployment decisions.

Brian
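For reference, the 110F summer peak can be put on the same scale as the 55C upper operating limit quoted earlier in the thread. This is just the standard unit conversion; the constant and function names are illustrative.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (deg_f - 32) * 5 / 9


SPEC_MAX_AMBIENT_C = 55  # upper operating limit as quoted earlier in the thread

ambient_c = fahrenheit_to_celsius(110)
print(f"110F = {ambient_c:.1f}C, headroom to spec = {SPEC_MAX_AMBIENT_C - ambient_c:.1f}C")
# prints: 110F = 43.3C, headroom to spec = 11.7C
```

Note that this is outdoor air temperature; the air inside an enclosure sitting in direct sun can be considerably hotter, so the real margin may be smaller.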

sirhc

Re: Multiple WS-8-150-AC Units Hanging

Wed Jul 10, 2019 12:28 pm

So there was a change in fan speeds from v1.5.0 to v1.5.2 which could explain some differences.
Explained here and several other places: https://forum.netonix.com/viewtopic.php?f=6&t=4918&p=27922&hilit=+fan+sunon+faster#p27922

Also, the "board" temperature sensor has a +/-3% variance, while the internal SOC sensors vary more and are not that accurate, but the CPU and PHYs are rated to run at 125C. In our thermal testing the board temp was fine at around 70C.
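To put the peak readings from the first post next to these figures, a quick margin calculation can help. This is a sketch only: the dictionary keys and function name are hypothetical, and the limits are the round numbers quoted in this thread, not datasheet values.

```python
# Figures as quoted in this thread; treat them as rough guides, not datasheet values.
LIMITS_C = {"cpu_phy": 125, "board": 70}

# Peak afternoon readings reported in the first post.
reported_c = {"cpu_phy": 80, "board": 65}


def headroom(readings, limits):
    """Return degrees C remaining below each quoted figure (negative means over)."""
    return {name: limits[name] - value for name, value in readings.items()}


for name, margin in headroom(reported_c, LIMITS_C).items():
    print(f"{name}: {margin}C below the quoted figure")
```

On these numbers the CPU/PHY has plenty of margin, while the board reading sits only a few degrees under the temperature seen in thermal testing, which is within the range where sensor variance matters.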

Other variances can come from different boxes with different direct sunlight and air flow. You would have to compare box air temps with switch temps.

I am sure board temps will vary slightly between models, but all models are tested to run in 55C environments.

I am going to lean towards the 3V CAPs until you test them. If they are defective, we extended the warranty on them and will repair them for free, so long as there is no other damage on the board and the unit is RMA'd back to us with the latest firmware on it (this saves us the time of upgrading people's units, which they should be doing anyway).

bdowne01

Re: Multiple WS-8-150-AC Units Hanging

Wed Jul 10, 2019 12:41 pm

Ok, sounds good Chris. We had two of the WS-8-150-AC's misbehaving. Will bench test both and report back.

bdowne01

Re: Multiple WS-8-150-AC Units Hanging

Thu Jul 18, 2019 12:47 pm

Finally got the units in to test. One of the two WS-8-150's that was at the problem site is failing the cap test, the other is not. Should I just follow the standard RMA procedure as described here? viewtopic.php?f=6&t=1259#p9343

We have several other WS-8-150 units out in the field and will be bench testing those to confirm operation as well.

sirhc

Re: Multiple WS-8-150-AC Units Hanging

Thu Jul 18, 2019 5:24 pm

I would RMA both WS-8 units, as our testing is more in depth. Just state that I told you to RMA the unit that passed; that way, if we find nothing wrong, we will not charge you $25 for that one.

Only the WS-8-150-AC unit had issues with 3V CAPs, not the WS-8-150-DC.
