CHOPPY SOUND / BAD SERVER PERFORMANCE WITH ELASTIX

Discussion in 'General' started by striderec, Jan 27, 2010.

  1. striderec

    Joined:
    Nov 25, 2008
    Messages:
    105
    Likes Received:
    0
    Greetings,

    My friend's server has the following hardware and software:

    - 4 GB of DDR2 RAM
    - Intel Core 2 Duo 2.8 GHz CPU
    - Gigabyte mainboard
    - 1 TB 7200 RPM SATA2 hard disk
    - 1 Gigabit LAN NIC
    - Elastix 1.6-13 32-bit on CentOS 5.4 (after yum update)
    - 2 PCI OpenVox A400P TDM cards with 4 FXO ports each (Digium TDM400P clones)

    The problem is that, despite this computer's powerful specifications, any normal or slightly above-normal hard disk or LAN activity slows the server down terribly. If you try to place a call while the hard disk is reading or writing and the NIC is transferring data over the LAN, the call either fails (it times out) or you hear choppy, unintelligible audio and have to hang up. Even local operations, such as calling another extension or the voicemail, are affected under those conditions: the voicemail menu and the conversations come through choppy and hard to understand. This is quite unacceptable.

    My friend is upset about this because it is his production server and he has remote extensions in other countries. Now his coworkers complain that calls drop, that the audio has terrible echo, or that the sound quality deteriorates terribly if they get more than one call.

    He used to have another server, but it broke and I had to help him install a new one. He and his coworkers told me that everything worked fine with the former server but this one is a disaster.

    Can anyone help me? I have tried to fine-tune the server by increasing the OSLEC software echo cancellation value from 128 to 256 in the chan_dahdi.conf configuration, and by forcing the LAN adapter to 100 Mbps full duplex with auto-negotiation disabled via ethtool.

    They have reported that the echo has ceased but the poor quality of the calls and the dropped calls still persist.

    Can anyone help me? Another friend recommended that I check the /proc/interrupts output. Here is what I got:

                CPU0        CPU1
      1:           2           0  Phys-irq     i8042
      6:           6           0  Phys-irq     floppy
      7:           0           0  Phys-irq     parport0
      8:           1           0  Phys-irq     rtc
      9:           0           0  Phys-irq     acpi
     12:           4           0  Phys-irq     i8042
     14:     1170247           0  Phys-irq     ide0
     15:           0           0  Phys-irq     ata_piix
     16:           0           0  Phys-irq     uhci_hcd:usb3
     17:     1023946           0  Phys-irq     ide2
     18:          34           0  Phys-irq     ehci_hcd:usb1, uhci_hcd:usb5, uhci_hcd:usb8
     19:           0           0  Phys-irq     ehci_hcd:usb2, uhci_hcd:usb6
     20:           0           0  Phys-irq     uhci_hcd:usb4
     21:   116980430           0  Phys-irq     uhci_hcd:usb7, ata_piix, wctdm
     22:   116889062           0  Phys-irq     wctdm
    252:    14206766           0  Phys-irq     eth0
    256:    29345284           0  Dynamic-irq  timer0
    257:      668230           0  Dynamic-irq  resched0
    258:          59           0  Dynamic-irq  callfunc0
    259:           0     5380598  Dynamic-irq  resched1
    260:           0         112  Dynamic-irq  callfunc1
    261:           0     7870231  Dynamic-irq  timer1
    262:           0           0  Dynamic-irq  xenbus
    263:           0           0  Dynamic-irq  console
    NMI:           0           0
    LOC:           0           0
    ERR:           0
    MIS:           0
    [root@elastix ~]#

    How can I help my friend? He is desperate. He receives a lot of calls a day and he says he can't have a person checking on the PBX over and over or restarting it when it gets overloaded.

    Thank you in advance,

    Paul
     
  2. ramoncio

    Joined:
    May 12, 2010
    Messages:
    1,663
    Likes Received:
    0
    Hi striderec,
    You didn't mention the motherboard model or chipset. Some new chipsets have problems with Linux.
    You may need a newer CentOS kernel; sometimes that fixes many hardware problems.

    To try to solve your problems read through this document:

    http://www.x100p.com/support/doc/novavo ... issues.pdf

    It explains exactly what you need. It is a bit outdated, as it is based on the old Zaptel, but the DAHDI commands are very similar, and I think it can help you a lot.
     
  3. Siu

    Joined:
    Jan 15, 2010
    Messages:
    30
    Likes Received:
    0
    I think the hard disk has bad sectors.
     
  4. striderec

    The hard disk is BRAND NEW. All parts are brand new and Elastix took 3 long hours to format that huge 1 TB monster.....
     
  5. striderec

    Thank you Ramoncio.

    I'll take a look at your link to see what happens. By the way, I have asked my friend to mail me the mainboard model.
     
  6. ramoncio

    I have also noticed 2 issues with your interrupts, as you will learn from that great manual I've linked.
    1.- Your TDM cards' IRQs are very high. Try disabling in the BIOS everything you don't need, such as USB ports, the sound card, etc.
    2.- Try using irqbalance, as all your hardware interrupts are handled by CPU0.

    Read and absorb that manual, it is really good.
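
    For anyone who wants to check the two points above without reading the table by eye, here is a rough Python sketch. It assumes the column layout of the output posted in this thread (per-CPU counts, one IRQ type token, then a comma-separated device list); the helper names are made up for illustration, and the sample is only an excerpt of that output.

```python
# Excerpt of the /proc/interrupts output posted earlier in this thread.
SAMPLE = """\
CPU0 CPU1
14: 1170247 0 Phys-irq ide0
17: 1023946 0 Phys-irq ide2
21: 116980430 0 Phys-irq uhci_hcd:usb7, ata_piix, wctdm
22: 116889062 0 Phys-irq wctdm
252: 14206766 0 Phys-irq eth0
"""


def parse_interrupts(text):
    """Return a list of (irq, per_cpu_counts, devices) tuples."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < ncpu + 2:
            continue  # summary rows such as NMI or ERR carry no device list
        counts = [int(f) for f in fields[:ncpu]]
        # fields[ncpu] is the IRQ type (assumed to be one token, e.g. Phys-irq);
        # everything after it is the comma-separated device list.
        devices = [d.strip() for d in " ".join(fields[ncpu + 1:]).split(",")]
        rows.append((label.strip(), counts, devices))
    return rows


def shared_with(rows, driver):
    """Map IRQ -> other devices that share an interrupt line with `driver`."""
    return {irq: [d for d in devs if d != driver]
            for irq, _, devs in rows
            if driver in devs and len(devs) > 1}


def cpu_share(rows, cpu=0):
    """Fraction of all counted interrupts that were handled by `cpu`."""
    total = sum(sum(counts) for _, counts, _ in rows)
    return sum(counts[cpu] for _, counts, _ in rows) / total if total else 0.0


rows = parse_interrupts(SAMPLE)
print(shared_with(rows, "wctdm"))  # {'21': ['uhci_hcd:usb7', 'ata_piix']}
print(cpu_share(rows, 0))          # 1.0
```

    On this excerpt, one of the two wctdm cards shares IRQ 21 with a USB controller and the disk controller, and every hardware interrupt lands on CPU0.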
     
  7. striderec

    Ramoncio,

    I am reading the manual and it is good. However, I finally found the big problem here... The CentOS kernel is detecting the hard disk controller as the generic legacy ATA controller, so instead of seeing the hard disk as "sda" (SATA) I see it as "hda" (IDE!!!)

    See it for yourself:

    [root@elastix ~]# df -h
    Filesystem Size Used Avail Use% Mounted on
    /dev/mapper/VolGroup00-LogVol00
    901G 42G 813G 5% /
    /dev/hda1 99M 33M 62M 35% /boot
    tmpfs 3.8G 0 3.8G 0% /dev/shm
    [root@elastix ~]#

    But the worst part comes here... I ran the command hdparm -t /dev/hda, as the guide you linked suggests, and man, it really shocked the hell out of me...

    These are the horrifying results of running that command:

    [root@elastix ~]# hdparm -t /dev/hda

    /dev/hda:
    Timing buffered disk reads: 10 MB in 3.50 seconds = 2.86 MB/sec
    [root@elastix ~]#

    See the terrible transfer rate!! It could only read 10 MB, at a rate of 2.86 MB/sec!!! I tried calling my friend's DID while performing this test and I could barely hear the IVR; the sound was all distorted and unintelligible... I ran the same test on my own PBX and these are the results:

    [root@masterpbx ~]# hdparm -t /dev/sda

    /dev/sda:
    Timing buffered disk reads: 226 MB in 3.02 seconds = 74.89 MB/sec
    [root@masterpbx ~]#

    As you can see, *that is* a true benchmark for a fast hard disk. I even called my DID and the IVR played *flawlessly* unlike my friend's. So I see now this is the source of all my friend's problems.. a poor hard disk data read/write rate.
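
    To put a number on the gap between the two machines, here is a small Python sketch that pulls the MB/sec figure out of the two hdparm -t result lines quoted above and compares them. The function name and the regular expression are just for illustration.

```python
import re

# Matches the trailing "= 2.86 MB/sec" part of an `hdparm -t` result line.
RATE = re.compile(r"=\s*([\d.]+)\s*MB/sec")


def read_rate(hdparm_line):
    """Extract the MB/sec figure from an `hdparm -t` result line."""
    match = RATE.search(hdparm_line)
    if match is None:
        raise ValueError("no MB/sec figure found")
    return float(match.group(1))


slow = read_rate("Timing buffered disk reads: 10 MB in 3.50 seconds = 2.86 MB/sec")
fast = read_rate("Timing buffered disk reads: 226 MB in 3.02 seconds = 74.89 MB/sec")
print(f"{fast / slow:.1f}x")  # 26.2x
```

    In other words, the disk in the problem server reads roughly 26 times slower than the one in the working PBX.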

    Summarizing, It looks like I have a few issues here:

    1) I don't see the kernel balancing the threads and processes across the two cores (CPU0 and CPU1), even though the processor is an Intel Core 2 Duo. It's using just one core, not both.

    2) The hard disk is running in turtle mode and not in fast SATA mode. This is the core of the problem, and I don't know whether it can be fixed without reformatting the hard drive...

    3) As you mentioned in your last post, the IRQs for the TDM cards must be the lowest ones and not shared. It looks like I can disable many things the PBX doesn't need.

    Please let me know what you think...... I am strongly interested in finding a solution for the terrible hard disk performance without reinstallation...

    Thank you in advance,

    Paul
     
  8. dicko

    Joined:
    Oct 24, 2008
    Messages:
    4,099
    Likes Received:
    0
    I suggest you try booting the non-Xen kernel on this machine (unless, of course, you are using Xen :) ), just for comparison.

    JM2CWAE

    dicko
     
  9. striderec

    Hi Dicko:

    Will it make the Kernel recognize the HD as an sda device instead of hda? It seems to me like the Kernel is loading the wrong drivers for the hard disk.

    - Paul
     
  10. dicko

    Hi Paul:

    I can't answer that because it's not my machine. But the virtualization layer, the way the hardware handles interrupts, the chipset support for Xen on your motherboard, and the BIOS implementation can all affect what the OS actually sees, and definitely how it performs. Personally, I don't understand Elastix's fascination with the Xen kernel. To be quite honest, it runs like shit, if at all, on much of my hardware (legacy and not so legacy), and how many people need it anyway?
     
  11. striderec

    That's true, Dicko. I am going to ask my friend to let me check his BIOS settings and also try to boot the server without the Xen thing to see what happens. Of course it'll have to be after 6 PM, since that's when his business closes.

    Something else... I see I can use hdparm to set 32-bit transfer mode and UDMA-2 on this hard disk, as the OpenVox guide Ramoncio sent me suggests. The idea scares me, though, because I am afraid it may corrupt data on the hard disk, even though hdparm -i /dev/hda says the disk supports UDMA-2 (DMA is currently disabled):

    [root@elastix ~]# hdparm -i /dev/hda

    /dev/hda:

    Model=WDC WD10EADS-00L5B1, FwRev=01.01A01, SerialNo=WD-WCAU48016812
    Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
    RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
    BuffType=unknown, BuffSize=32767kB, MaxMultSect=16, MultSect=16
    CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
    IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
    PIO modes: pio0 pio3 pio4
    DMA modes: mdma0 mdma1 mdma2
    UDMA modes: udma0 udma1 udma2
    AdvancedPM=no WriteCache=enabled
    Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

    * signifies the current active mode

    [root@elastix ~]#

    and:

    [root@elastix ~]# hdparm /dev/hda

    /dev/hda:
    multcount = 16 (on)
    IO_support = 0 (default 16-bit)
    unmaskirq = 0 (off)
    using_dma = 0 (off)
    keepsettings = 0 (off)
    readonly = 0 (off)
    readahead = 256 (on)
    geometry = 65535/255/63, sectors = 1953525168, start = 0
    [root@elastix ~]#
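
    As a read-only sanity check before touching any settings, here is a Python sketch that parses the hdparm /dev/hda output above and flags the two values that stand out. It changes nothing on the disk. The function names and the wording of the notes are my own, and the sample leaves out the geometry line.

```python
import re

# The `hdparm /dev/hda` settings shown above (geometry line omitted).
SAMPLE = """\
/dev/hda:
multcount = 16 (on)
IO_support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 0 (off)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
"""

# Matches lines of the form "name = <integer> (description)".
SETTING = re.compile(r"\s*(\w+)\s*=\s*(\d+)\s*\((.*)\)")


def parse_hdparm(text):
    """Return {setting: (value, description)} for each 'name = N (desc)' line."""
    settings = {}
    for line in text.splitlines():
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = (int(match.group(2)), match.group(3))
    return settings


def flag_settings(settings):
    """List the settings that usually go along with slow IDE transfers."""
    notes = []
    if settings.get("using_dma", (1, ""))[0] == 0:
        notes.append("using_dma is off: transfers fall back to PIO, which ties up the CPU")
    if settings.get("IO_support", (1, ""))[0] == 0:
        notes.append("IO_support is at the default 16-bit mode")
    return notes


for note in flag_settings(parse_hdparm(SAMPLE)):
    print(note)
```

    Both notes fire for this disk, which fits the slow hdparm -t result: with DMA off, the CPU has to move every block itself.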

    Thanks,

    Paul
     
  12. dicko

    Again, I can't answer that because it's not my hard drive. Personally, I would be very conservative on a production machine.
    However, I would love to get my hands on that POS and find out WTF :) I'd clone the OS and bring it up on different hardware...
     
  13. dicko

    /dev/hda:
    multcount = 16 (on)
    IO_support = 0 (default 16-bit)
    unmaskirq = 0 (off)
    using_dma = 0 (off)
    keepsettings = 0 (off)
    readonly = 0 (off)
    readahead = 256 (on)
    geometry = 65535/255/63, sectors = 1953525168, start = 0

    looks a little conservative, did you FWI?

    I agree it would be better if the lower layers would present that particular hardware as sda, then you could do all that sdparm stuff
     
  14. striderec

    Dicko:

    That's the original configuration after the Elastix installation. In fact, the server has only been running for 3 weeks. I did nothing.. yet.. I don't want to cause havoc and panic for my friend with corrupted data :p
     
  15. ramoncio

    I'm afraid you'll need to reinstall Elastix, because once you get the hard disk controller right, the hard drive will be recognized as sda, I think.
    Get a new hard drive and keep the old one in a drawer, just in case you want to go back.
    Then play with your BIOS and select a different SATA mode. Try using compatibility mode instead of enhanced mode. You can run some tests from Linux live CDs until you get the right SATA config and good hdparm results.
    Then you can try again with your old hard drive to see if your previous installation works OK, but I'm afraid you'll need to reinstall the system.
    I had similar problems with an HP ML110 G5. Since then, I always run some hardware tests on any newly installed system.
    You can also try different CPU configs when you play with the BIOS; maybe enabling or disabling VT-x will make your kernel detect the controller correctly.
     
  16. dicko

    Joined:
    Oct 24, 2008
    Messages:
    4,099
    Likes Received:
    0
    Hola Ramoncio:

    If the state of the machine were important, I would personally use mondorescue. It allows cloning and makes it relatively easy to move between disk subsystems (hd to sd), once the hardware is set up right of course, without any downtime (until fixed).

    If you go the mondo route, a tip I learned well: use expert restore mode, and issue export TERM=vt100 before you run mondorestore, or you will go cross-eyed.
     
  17. striderec

    Ah guys... a pity it's bad news. This is exactly what I wanted to avoid: reinstallation. This is his production server and it cannot be shut down during regular hours. Besides, it took me 10 hours overall to install, configure and test all of my friend's previous configurations to have the PBX ready to go. He has about 80 extensions, a paging and intercom system, among other things. I wanted to avoid losing another 10 hours on this, and more importantly, I don't really know how to tell him yet. I know he will curl up into a ball and cry when he realizes another reinstallation needs to be done....

    Thank you so much Dicko and Ramoncio for your invaluable help.

    Best Regards,

    Paul
     
