BIG delay in call progress -help please

Discussion in 'General' started by witekprytek, Dec 3, 2010.

  1. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    I have very strange problem in my PBX -elastix 1.6.2-7.
    From call start to start dialing i have 20-40 sec delay.
    The delay appears between internal extensions and when somebody calling fron pstn or to pstn.
    Last time i Have similar issue -it was resolved by cleaning /var/spool/asterisk/record directory


    Example call progress (from 642 to 666 with 20 sec delay before ringing):
    [Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-record-enable:5] MacroExit("SIP/642-00000042", "") in new stack
    [Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-exten-vm:9] Macro("SIP/642-00000042", "dial||tr|666") in new stack
    [Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:1] GotoIf("SIP/642-00000042", "1?dial") in new stack
    [Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:3] AGI("SIP/642-00000042", "dialparties.agi") in new stack
    [Dec 3 12:36:32] VERBOSE[13023] logger.c: dialparties.agi: Caller ID name is 'tt /dd' number is '642'
    [Dec 3 12:36:32] VERBOSE[13023] logger.c: -- dialparties.agi: dbset CALLTRACE/666 to 642
    [Dec 3 12:36:32] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:7] Dial("SIP/642-00000042", "SIP/666||tr") in new stack
    [Dec 3 12:36:32] NOTICE[13023] app_dial.c: Hey! chan SIP/642-00000042's context='macro-dial', and exten='s'
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-dial, s, 7) exited non-zero on 'SIP/642-00000042' in macro 'dial'
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-exten-vm, s, 9) exited non-zero on 'SIP/642-00000042' in macro 'exten-vm'
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (from-internal, 666, 1) exited non-zero on 'SIP/642-00000042'
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [h@macro-dial:1] Macro("SIP/642-00000042", "hangupcall") in new stack
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:1] GotoIf("SIP/642-00000042", "1?skiprg") in new stack
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:4] GotoIf("SIP/642-00000042", "1?skipblkvm") in new stack
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:7] GotoIf("SIP/642-00000042", "1?theend") in new stack
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:9] Hangup("SIP/642-00000042", "") in new stack
    [Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-hangupcall, s, 9) exited non-zero on 'SIP/642-00000042' in macro '

    Could anybody help?
     
  2. oldrags

    Joined:
    Dec 3, 2010
    Messages:
    13
    Likes Received:
    0
    I am very interested in a solution for this, i have the same issue on an Elastix 2.0.0.
    The error comes and goes: sometimes there are some 10-20 seconds delay until the call is setup and sometimes the phones react quickly.
    The error happens, as mentioned by witekprytek, even calls between two internal hardphones), so it can't be a NAT or internet issue.
     
  3. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    I'm sure that this is some internal error, because when I'm calling from pstn to the IVR, server respond quickly, but when I choose number pointing to some internal extension e.g fax (IAX on server) the effect is identical as described above - server start dialing process and after some time usually 20-30 sec forward call to the fax without any problem.
     
  4. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    I have noticed, that the delay is introduced by agi scripts: recordingcheck and dialparties.

    time php -v
    PHP 5.1.6 (cli) (built: Nov 29 2010 16:47:37)
    Copyright (c) 1997-2006 The PHP Group
    Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies

    real 0m20.020s
    user 0m0.010s
    sys 0m0.007s

    so its look like php problem

    Is there any way to fix it?
     
  5. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    Oldrags: could you please give me some information about your system: processor, disks (do you have a raid?), kernel version

    The reason of this error is php. Strange thing is that during this happens php scripts called from web server works without any problems only php cli is affected.

    I have done "strace php -v" twice time and the system has started work in normal way.
    Php execution time after this is about 17ms and there is no problem with calling and other function which need php agi.

    I'm not php expert, but in my opinion some attached library could be broken or not linked correctly in the system.
     
  6. Bob

    Bob

    Joined:
    Nov 4, 2007
    Messages:
    2,400
    Likes Received:
    1
    Gentlemen,

    This is going to be one those issues that you have to be there to see it happening.

    When the issue is occurring, can you perform a top command at the Linux prompt and look to see if something is using CPU resources (e.g. more than 4%)....I think you might be on the right track with the AGI scripts....

    However I have only seen this issue with 1.5/1.6 Version.
    Can you also supply Asterisk version?
    Are you using or setup Queues on your system?


    I have not seen this issue on 2.x of Elastix....

    Regards
    Bob
     
  7. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    HI Bob.
    My asterisk version is: Asterisk 1.4.36
    Yes I have configured some queues in my system.

    I have made additional tests and there was no problem with other system commands or Asterisk in general. All additional processes e.g. tftp, http, mysql, had been working fine and processor and memory usage was on standard level.
    Only php commands needed about 20 sec to execute e.g.:echo "<?php phpinfo()?>" | php
    need 20 sec to be executed, but the same script called form web server via www was executed in 30ms

    I have noticed, that during this issue http server need "standard restart time" + 20 sec. to perform restart.
     
  8. oldrags

    Joined:
    Dec 3, 2010
    Messages:
    13
    Likes Received:
    0
    Hello

    Thanks a lot for offering me help. I'm sorry it took so long for my follow-up. I'm Swiss, so I had to serve in the militia for the last few days instead of worrying about real problems like this one. :(

    So here's what I figured out in the meantime:

    The error doesn't always happen. Only about every other time.

    I'm running Elastix on a asus eeepc laptop with an atom processor, 1 gig of ram.
    I've been supervising the system with top and with SAR, couldn't find any problems there (cpu always about 99.5% idle, sometimes only 80% idle)
    If I reproduce the error and watch what happens on TOP, i see just about nothing. (Asterisk process takes 4% CPU, as it would for any other call)

    I updated the whole system today with yum update.
    I've now got this kernel:
    Code:
    Linux 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:09:10 EDT 2010
    The FreePBX front end says there are 16 updates available, but I didn't apply them, because I read somewhere in this forum you shouldn't update FreePBX separately from the rest of the Elastix system.

    I have been doing some research about this. I noticed the dialparties agi script was taking so long, and i tried to follow these steps:
    http://www.freepbx.org/forum/freepbx/us ... arties-agi

    it wasn't WINS (i didn't have that configured)
    The solution that worked for that post didn't work for me. I set it to Europe/Zurich in php.ini, didn't help either.

    So I went on an tried this:
    I ran this command several times and figured out the following:
    sometimes it's fast (subjectively immediate) and sometimes it's about 11 seconds. This correpsonds to the behaviour with calls. Sometimes it's fast, sometimes a bit more than 10 seconds.
    When it takes 11 seconds to run the script, there is 2 times a 5 second break. Here's an extract of the log file, I marked those places in the text below with [BREAK].
    Code:
    gettimeofday({1292246080, 411647}, NULL) = 0
    poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
    send(3, "2I\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
    poll([{fd=3, events=POLLIN}], 1, 5000[BREAK])  = 0 (Timeout)
    gettimeofday({1292246085, 411295}, NULL) = 0
    poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
    send(3, "2I\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
    poll([{fd=3, events=POLLIN}], 1, 5000[BREAK])  = 0 (Timeout)
    close(3)                                = 0
    alarm(0)                                = 0
    futex(0xdef800, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    futex(0xdef720, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    futex(0x1ef8e0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    
    Here's what happens when it runs fine:
    Code:
    gettimeofday({1292246094, 103891}, NULL) = 0
    poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
    send(3, "\344\371\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
    poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
    ioctl(3, FIONREAD, [26])                = 0
    recvfrom(3, "\344\371\201\203\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.0.1")}, [16]) = 26
    close(3)                                = 0
    alarm(0)                                = 0
    futex(0xdef800, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    futex(0xdef720, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    futex(0x1ef8e0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    
    So apparently, php is waiting for 2x a 5 second timeout sometimes, event for that simple script. What I don't know is how to go after it...
     
  9. fraggle4

    Joined:
    Apr 22, 2009
    Messages:
    98
    Likes Received:
    0
  10. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    This issue is strictly related with php. ( command line interpreter). Maybe some library which need to DNS lookup is linked by mistake to the code. But for what agi scrip e.g. recording check need to use DNS?
     
  11. oldrags

    Joined:
    Dec 3, 2010
    Messages:
    13
    Likes Received:
    0
    Hi all

    The DNS solution worked! :woohoo:
    THANK YOU SO MUCH!

    All I did was change my resolv.conf. Commented the IP address of my Router-Firewall (A ZyWall 5) and used that of Google instead.
    Code:
    # cat /etc/resolv.conf
    nameserver 8.8.8.8
    #nameserver 192.168.0.1
    So the solution works, which is great, but I wouldn't mind getting a technical explanation for this behavior.
    First: why does PHP always attempt to access the DNS server, even when all the script is doing is print out a line?
    Second: why in the world would the name resolution work for my whole system (web interface, ping, browsing, email, everything can resolve DNS just fine with the ZyWall) EXCEPT for PHP?

    thanks a lot!

    oldRags
     
  12. fraggle4

    Joined:
    Apr 22, 2009
    Messages:
    98
    Likes Received:
    0
    From the trixbox thread cited:
    There are plenty of other posts around on this topic, but few clear answers, most of the fixes I have seen for this issue involve installing DNSmasq or BIND, which will also help spare you the "no-internal-calls-when-internet-goes-down" problem.

    Glad it worked....
     
  13. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    fraggle4: Thank you for this solution.
    So what does it finally mean? All Asterisx distro which uses php agi scripts need Internet connection, to get call process working correctly - is it true?
    What about ippabx installed in the places without internet connection?

    What guys do you think about installing dns cache by dj berstein cr.yp.to? Could it help a little?
     
  14. fraggle4

    Joined:
    Apr 22, 2009
    Messages:
    98
    Likes Received:
    0
  15. witekprytek

    Joined:
    Dec 19, 2007
    Messages:
    145
    Likes Received:
    0
    fraggle4 escribió:
    So it seems that this bug is related for all freepbx based distro.
    I could understand that some problems with dns resolution could affect call to other party via sip trunk behind the network. But why this issue does touch all internal calls or calls from PSTN via analog/digital/ cards?
     

Share This Page