2 x Sangoma A101DE causing server to reboot/hang.

Discussion in 'Gateways' started by my898, Jul 27, 2009.

  1. my898

    Joined:
    Jan 6, 2009
    Messages:
    7
    Likes Received:
    0
    Hi,

    I am having a problem with my server rebooting/hanging randomly ...

    Server is a DELL 1950 (1 x Quad-CPU, 2 GB RAM, 2 x 250HDD in software RAID 0) with the latest BIOS and firmware updates. It has 2 x A101DE Sangoma Primary Rate cards so that I can "pass-through" calls to our current Alcatel PBX. Our setup is:

    Telco (production) <--> Elastix w1g1 & w2g1 <--Cross-over cable--> Alcatel PBX.

    Elastix version is 1.5.2-2.3 with the latest updates from YUM.

    Incoming and outgoing calls are all ok.

    I have another identical server with 1 x A101DE card as well, which has also rebooted/hung. This server was built from scratch with Elastix 1.5, while the server with 2 cards was upgraded from 1.3 to 1.5 via YUM. I've checked the package versions on both servers and they are pretty much identical.

    The issue started when I installed the 2nd card into the server and connected it to the other server with 1 x A101DE during my testing: Telco (test) <--> Elastix w1g1 & w2g1 <--X-Over--> Elastix w1g1. I started to get a DELL PCIE hardware error on the front LCD display. After 3 days of running their hardware diagnostics, etc., DELL replaced the motherboard and PCIE riser cards. This did not fix the problem. The server lasted about 48 hours before hanging.

    I've had a look around on the web and found some possible solutions: re-compiling the wanpipe, dahdi and kernel modules, disabling ACPI on boot-up, changing the interrupt settings, or removing the card. I tried re-compiling the modules from source, which went ok, but I couldn't start the Primary Rate afterwards. Disabling ACPI and changing the interrupts are out of my depth. Removing the card is not an option.

    I also thought it might be a faulty A101DE card, so I swapped the cards around and even the riser cards before DELL replaced the hardware, but with no success.

    Today I sent Sangoma some data as outlined on their Support site, so hopefully I'll hear from them in the next few days.

    The server with 2 cards has been in and out of production and I'm firefighting to keep the phone service available.

    The other server with 1 card has been up for over 3 days so far but it doesn't have any calls going through it.

    Another suggestion was to have only 1 card in each server and pass the calls through via SIP, but that would be my last resort.

    Cheers
     
  2. marc.sangoma

    Joined:
    Jul 20, 2009
    Messages:
    14
    Likes Received:
    0
    Hi,

    Please download and install our wanpipe 3.4.2.8 driver from ftp://ftp.sangoma.com/linux/custom/3.4/ ... .4.2.8.tgz. Recent releases added some extra code which, on some systems, causes a crash after thousands of reads/writes. This code has been removed in this driver, and every customer who has tried it has had no issues. If any issues occur, send an email to techdesk@sangoma.com and they will look into it with you.

    Marc
    Sangoma Technologies
     
  3. my898

    Hi marc.sangoma,

    Thanks for your quick reply. I'll give it a go on our test server then on our production server.

    The server hung today after 48 hours uptime ... so it would be nice if it stays up longer than that.

    I've raised a ticket with Sangoma (#3472). They have requested information from the moment the server stops working, but I told them I cannot get a response even from the console, so I can't get the data for them.

    I'll keep you posted on my findings.

    Cheers
     
  4. my898

    Hi marc.sangoma,

    I've updated the wanpipe driver to v3.4.4 as suggested by the support person from Sangoma.

    I used these instructions to build the wanpipe driver from source:

    http://www.elastix.org/index.php?option ... t=10#26396

    Will see if it fixes the problem in the next couple of days.

    I also used to get this message from dmesg: wanpipe: no version for "dahdi_alarm_notify" found: kernel tainted. Now I don't get it any more.
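
    For anyone checking for the same symptom, here is a minimal sketch that counts "kernel tainted" lines in kernel log text. The function name count_taint_lines is made up for illustration; it reads log text on standard input, so it can be fed from dmesg or from a saved log file.

```shell
# Hypothetical helper (not part of wanpipe or DAHDI): count kernel log
# lines that contain the phrase "kernel tainted".
count_taint_lines() {
    # grep -c prints the match count but exits non-zero when there are
    # no matches, so "|| true" keeps the function's exit status at 0.
    grep -c 'kernel tainted' || true
}
```

    Typical use would be "dmesg | count_taint_lines". The kernel's overall taint flags can also be read from /proc/sys/kernel/tainted, where 0 means not tainted.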

    Cheers
     
  5. mihpel

    Joined:
    May 8, 2007
    Messages:
    87
    Likes Received:
    0
    This seems to be related to http://bugs.elastix.org/view.php?id=126
    which was resolved with the wanpipe-util-3.4.1-4 and dahdi-2.1.0.4-22 RPMs.

    Were you using a clean, fully updated system when you got the tainted message?

    If you can test and give more feedback, I would really like to know whether this bug has been reintroduced with the latest updates!

    Regards,
    Mihpel
     
  6. my898

    Hi All,

    My servers have been up for 6 days now without any issues. I have migrated about 40 people onto the server so far.

    Both my servers (one was updated from v1.3 to v1.5 while the other one was built with v1.5) have the latest updates via yum, except for the wanpipe driver, which is v3.4.4 built from source.

    I had to modify the /etc/wanpipe/scripts/start script to use "/dev/dahdi" instead of "/dev/zap". The original script didn't cause any issues in operation, but on a restart with wanrouter it indicated that it couldn't find the status of the "/dev/zap" channels and gave the error ... device failed to come up.
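
    A minimal sketch for checking whether a start script still points at the Zaptel device nodes. The path is the one from this post; the helper name references_zap and the START_SCRIPT variable are made up for illustration.

```shell
# Path from the post; set START_SCRIPT to check a different file.
START_SCRIPT="${START_SCRIPT:-/etc/wanpipe/scripts/start}"

# Hypothetical helper: succeeds if the given script mentions /dev/zap.
references_zap() {
    grep -q '/dev/zap' "$1"
}

# Only report when the file exists and still contains /dev/zap.
if [ -f "$START_SCRIPT" ] && references_zap "$START_SCRIPT"; then
    echo "$START_SCRIPT still references /dev/zap:"
    grep -n '/dev/zap' "$START_SCRIPT"
fi
```

    On a machine without wanpipe installed the check simply prints nothing.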

    Hope this helps.

    Cheers
     
  7. marc.sangoma

    Hi,

    If you are getting /dev/zap errors, run the command "cp /etc/wanpipe/wancfg_zaptel/templates/dahdi_cfg_script /etc/wanpipe/scripts/start". This will put the correct start-up script for Dahdi in place. It seems the zaptel start script was placed there instead of the Dahdi one.
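
    If you would rather keep a copy of the existing start script before overwriting it, a minimal sketch follows. The function name replace_start_script and the .bak suffix are assumptions for illustration, not part of wanpipe.

```shell
# Hypothetical helper: copy a template over a target script, keeping a
# backup of the existing target as <target>.bak first.
replace_start_script() {
    template="$1"
    target="$2"
    # Refuse to touch the target if the template is missing.
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    # Preserve the current script (and its permissions) before replacing it.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    cp "$template" "$target"
}
```

    With the paths from the command above, the call would be: replace_start_script /etc/wanpipe/wancfg_zaptel/templates/dahdi_cfg_script /etc/wanpipe/scripts/start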

    Marc
    Sangoma Technologies
     