2 x Sangoma A101DE causing server to reboot/hang.

my898

Joined
Jan 6, 2009
Messages
7
Likes
0
Points
0
#1
Hi,

I am having a problem with my server rebooting/hanging randomly ...

Server is a DELL 1950 (1 x Quad-CPU, 2 GB RAM, 2 x 250HDD in software RAID 0) with the latest BIOS and firmware updates. It has 2 x A101DE Sangoma Primary Rate cards so that I can "pass-through" calls to our current Alcatel PBX. Our setup is:

Telco (production) <--> Elastix w1g1 & w2g1 <--Cross-over cable--> Alcatel PBX.

Elastix version is 1.5.2-2.3 with the latest updates from YUM.

Incoming and outgoing calls are all OK.

I have another identical server with 1 x A101DE card as well, which has also rebooted/hung. This server was built from scratch with Elastix 1.5, while the server with 2 cards was upgraded from 1.3 to 1.5 via YUM. I've checked the package versions on both servers and they are pretty much identical.
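A comparison like the one described above could be sketched roughly as follows. This is only an illustration: it assumes each server's package list has been saved to a file with `rpm -qa | sort`, and the helper name `diff_pkg_lists`, the file names, and the package-name filter are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print telephony-related packages that differ between
# two saved package lists. Assumes each list was produced on its own server
# with something like: rpm -qa | sort > pkgs-<host>.txt
diff_pkg_lists() {
    list_a=$1
    list_b=$2
    # Keep only the packages of interest (the filter is an assumption) and
    # sort them so that comm can compare the two files line by line.
    grep -i -E 'wanpipe|dahdi|asterisk|elastix' "$list_a" | sort > "${list_a}.filtered"
    grep -i -E 'wanpipe|dahdi|asterisk|elastix' "$list_b" | sort > "${list_b}.filtered"
    # comm -3 hides lines common to both files, leaving only the differences:
    # column 1 is unique to the first list, column 2 to the second.
    comm -3 "${list_a}.filtered" "${list_b}.filtered"
}
```

Called as `diff_pkg_lists pkgs-server1.txt pkgs-server2.txt`, empty output would mean the filtered package sets match on both machines.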

The issue started when I installed the 2nd card into the server and connected it to the other server with 1 x A101DE during my testing: Telco (test) <--> Elastix w1g1 & w2g1 <--X-Over--> Elastix w1g1. I started to get a DELL PCIE hardware error on the front LCD display. After 3 days of running their hardware diagnostics, etc., DELL replaced the motherboard and PCIE riser cards. This did not fix the problem. The server lasted about 48 hours before hanging.

I've had a look around on the web and found some possible solutions, ranging from re-compiling the wanpipe, dahdi and kernel modules, disabling ACPI on boot-up and changing the interrupt settings, to removing the card. I tried re-compiling the modules from source, which went OK, but afterwards I couldn't start the Primary Rate. Disabling ACPI and changing the interrupts are out of my depth. Removing the card is not an option.

I also thought it might be a faulty A101DE card, so I swapped the cards around, and even the riser cards, before DELL replaced the hardware, but with no success.

I've sent Sangoma some data today, as outlined on their Support site, so hopefully I'll hear from them in the next few days.

The server with 2 cards has been in and out of production and I'm fire fighting to keep the phone service available.

The other server with 1 card has been up for over 3 days so far but it doesn't have any calls going through it.

Another suggestion was to have only 1 card in each server and pass the calls through via SIP, but that will be my last resort.

Cheers
 

marc.sangoma

Joined
Jul 20, 2009
Messages
14
Likes
0
Points
0
#2
Hi,

Please download and install our wanpipe 3.4.2.8 driver from ftp://ftp.sangoma.com/linux/custom/3.4/ ... .4.2.8.tgz. Some extra code was added in recent releases which, on some systems, causes a crash after thousands of reads/writes. This code has been removed in this driver, and every customer who has tried it has had no issues. If any issues occur, send an email to techdesk@sangoma.com and they will look into it with you.

Marc
Sangoma Technologies
 

my898

#3
Hi marc.sangoma,

Thanks for your quick reply. I'll give it a go on our test server then on our production server.

The server hung today after 48 hours uptime ... so it would be nice if it stays up longer than that.

I've got a ticket raised with Sangoma (#3472) and they have requested information from the server when it stops working, but I told them I cannot get a response even from the console, so I can't get the data for them.

I'll keep you posted on my findings.

Cheers
 

my898

#4
Hi marc.sangoma,

I've updated the wanpipe driver to v3.4.4 as suggested by the support person from Sangoma.

I used these instructions to build the wanpipe driver from source:

http://www.elastix.org/index.php?option ... t=10#26396

Will see if it fixes the problem in the next couple of days.

I also used to get this message from dmesg: wanpipe: no version for "dahdi_alarm_notify" found: kernel tainted. Now I don't see this any more.
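For anyone wanting to check for the same message, a minimal sketch follows. The helper name `has_wanpipe_taint_warning` is hypothetical, and the pattern is based only on the message text quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel log text supplied on stdin
# contains the wanpipe "kernel tainted" warning quoted above.
has_wanpipe_taint_warning() {
    grep -q 'wanpipe: no version for .* found: kernel tainted'
}

# Usage on a live system (needs permission to read the kernel log):
#   dmesg | has_wanpipe_taint_warning && echo "taint warning still present"
# The kernel's taint flags can also be read directly; 0 means not tainted:
#   cat /proc/sys/kernel/tainted
```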

Cheers
 

mihpel

Joined
May 8, 2007
Messages
87
Likes
0
Points
0
#5
my898 said:
I also used to get this message from dmesg: wanpipe: no version for "dahdi_alarm_notify" found: kernel tainted. Now I don't see this any more.

Cheers
This seems to be related to http://bugs.elastix.org/view.php?id=126,
which was resolved with the wanpipe-util-3.4.1-4 and dahdi-2.1.0.4-22 RPMs.

Were you using a clean, fully updated system when you got the tainted message?

If you can test and give more feedback, I would really like to know whether this bug has been reintroduced with the latest updates.

Regards,
Mihpel
 

my898

#6
Hi All,

My servers have been up for 6 days now without any issues. I have migrated about 40 people onto the server so far.

Both my servers (one was updated from v1.3 to v1.5 while the other one was built with v1.5) have the latest updates via yum, except for the wanpipe driver, which is v3.4.4 built from source.

I had to modify the /etc/wanpipe/scripts/start script to use "/dev/dahdi" instead of "/dev/zap". The "/dev/zap" reference didn't cause any issues in itself, but when restarting with wanrouter it indicated that it couldn't find the status of the "/dev/zap" channels and gave the error "... device failed to come up".

Hope this helps.

Cheers
 

marc.sangoma

#7
Hi,

If you are getting /dev/zap errors, run the command "cp /etc/wanpipe/wancfg_zaptel/templates/dahdi_cfg_script /etc/wanpipe/scripts/start". This will put the correct start-up script for DAHDI in place. It seems the Zaptel start script was placed there instead of the DAHDI one.
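A slightly more cautious variant of that copy could look like the sketch below, which keeps a backup of the existing script first. The function name `replace_zap_start_script` and the ".bak" suffix are hypothetical; only the two paths in the usage note come from the command above.

```shell
#!/bin/sh
# Hypothetical helper: if the start script still mentions /dev/zap, back it
# up and replace it with the given template. Both paths are arguments.
replace_zap_start_script() {
    start_script=$1
    template=$2
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 2
    fi
    # An existing script with no /dev/zap reference is left untouched.
    if [ -f "$start_script" ] && ! grep -q '/dev/zap' "$start_script"; then
        echo "no /dev/zap reference found; leaving $start_script alone"
        return 0
    fi
    # Keep a copy of the old script (if any) so the change can be reverted.
    if [ -f "$start_script" ]; then
        cp -p "$start_script" "${start_script}.bak" || return 1
    fi
    cp "$template" "$start_script"
}
```

It would be called as `replace_zap_start_script /etc/wanpipe/scripts/start /etc/wanpipe/wancfg_zaptel/templates/dahdi_cfg_script`, followed by a restart with wanrouter.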

Marc
Sangoma Technologies
 
