BIG delay in call progress -help please

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#1
I have very strange problem in my PBX -elastix 1.6.2-7.
From call start to start dialing i have 20-40 sec delay.
The delay appears between internal extensions and when somebody calling fron pstn or to pstn.
Last time i Have similar issue -it was resolved by cleaning /var/spool/asterisk/record directory


Example call progress (from 642 to 666 with 20 sec delay before ringing):
[Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-record-enable:5] MacroExit("SIP/642-00000042", "") in new stack
[Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-exten-vm:9] Macro("SIP/642-00000042", "dial||tr|666") in new stack
[Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:1] GotoIf("SIP/642-00000042", "1?dial") in new stack
[Dec 3 12:36:12] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:3] AGI("SIP/642-00000042", "dialparties.agi") in new stack
[Dec 3 12:36:32] VERBOSE[13023] logger.c: dialparties.agi: Caller ID name is 'tt /dd' number is '642'
[Dec 3 12:36:32] VERBOSE[13023] logger.c: -- dialparties.agi: dbset CALLTRACE/666 to 642
[Dec 3 12:36:32] VERBOSE[13023] logger.c: -- Executing [s@macro-dial:7] Dial("SIP/642-00000042", "SIP/666||tr") in new stack
[Dec 3 12:36:32] NOTICE[13023] app_dial.c: Hey! chan SIP/642-00000042's context='macro-dial', and exten='s'
[Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-dial, s, 7) exited non-zero on 'SIP/642-00000042' in macro 'dial'
[Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-exten-vm, s, 9) exited non-zero on 'SIP/642-00000042' in macro 'exten-vm'
[Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (from-internal, 666, 1) exited non-zero on 'SIP/642-00000042'
[Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [h@macro-dial:1] Macro("SIP/642-00000042", "hangupcall") in new stack
[Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:1] GotoIf("SIP/642-00000042", "1?skiprg") in new stack
[Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:4] GotoIf("SIP/642-00000042", "1?skipblkvm") in new stack
[Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:7] GotoIf("SIP/642-00000042", "1?theend") in new stack
[Dec 3 12:36:35] VERBOSE[13023] logger.c: -- Executing [s@macro-hangupcall:9] Hangup("SIP/642-00000042", "") in new stack
[Dec 3 12:36:35] VERBOSE[13023] logger.c: == Spawn extension (macro-hangupcall, s, 9) exited non-zero on 'SIP/642-00000042' in macro '

Could anybody help?
 

oldrags

Joined
Dec 3, 2010
Messages
13
Likes
0
Points
0
#2
I am very interested in a solution for this, i have the same issue on an Elastix 2.0.0.
The error comes and goes: sometimes there are some 10-20 seconds delay until the call is setup and sometimes the phones react quickly.
The error happens, as mentioned by witekprytek, even calls between two internal hardphones), so it can't be a NAT or internet issue.
 

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#3
oldrags said:
I am very interested in a solution for this, i have the same issue on an Elastix 2.0.0.
The error comes and goes: sometimes there are some 10-20 seconds delay until the call is setup and sometimes the phones react quickly.
The error happens, as mentioned by witekprytek, even calls between two internal hardphones), so it can't be a NAT or internet issue.
I'm sure that this is some internal error, because when I'm calling from pstn to the IVR, server respond quickly, but when I choose number pointing to some internal extension e.g fax (IAX on server) the effect is identical as described above - server start dialing process and after some time usually 20-30 sec forward call to the fax without any problem.
 

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#4
I have noticed, that the delay is introduced by agi scripts: recordingcheck and dialparties.

time php -v
PHP 5.1.6 (cli) (built: Nov 29 2010 16:47:37)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies

real 0m20.020s
user 0m0.010s
sys 0m0.007s

so its look like php problem

Is there any way to fix it?
 

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#5
oldrags said:
I am very interested in a solution for this, i have the same issue on an Elastix 2.0.0.
....
Oldrags: could you please give me some information about your system: processor, disks (do you have a raid?), kernel version

The reason of this error is php. Strange thing is that during this happens php scripts called from web server works without any problems only php cli is affected.

I have done "strace php -v" twice time and the system has started work in normal way.
Php execution time after this is about 17ms and there is no problem with calling and other function which need php agi.

I'm not php expert, but in my opinion some attached library could be broken or not linked correctly in the system.
 

Bob

Joined
Nov 4, 2007
Messages
2,400
Likes
1
Points
36
#6
Gentlemen,

This is going to be one those issues that you have to be there to see it happening.

When the issue is occurring, can you perform a top command at the Linux prompt and look to see if something is using CPU resources (e.g. more than 4%)....I think you might be on the right track with the AGI scripts....

However I have only seen this issue with 1.5/1.6 Version.
Can you also supply Asterisk version?
Are you using or setup Queues on your system?


I have not seen this issue on 2.x of Elastix....

Regards
Bob
 

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#7
HI Bob.
My asterisk version is: Asterisk 1.4.36
Yes I have configured some queues in my system.

I have made additional tests and there was no problem with other system commands or Asterisk in general. All additional processes e.g. tftp, http, mysql, had been working fine and processor and memory usage was on standard level.
Only php commands needed about 20 sec to execute e.g.:echo "<?php phpinfo()?>" | php
need 20 sec to be executed, but the same script called form web server via www was executed in 30ms

I have noticed, that during this issue http server need "standard restart time" + 20 sec. to perform restart.
 

oldrags

Joined
Dec 3, 2010
Messages
13
Likes
0
Points
0
#8
Hello

Thanks a lot for offering me help. I'm sorry it took so long for my follow-up. I'm Swiss, so I had to serve in the militia for the last few days instead of worrying about real problems like this one. :(

So here's what I figured out in the meantime:

The error doesn't always happen. Only about every other time.

I'm running Elastix on a asus eeepc laptop with an atom processor, 1 gig of ram.
I've been supervising the system with top and with SAR, couldn't find any problems there (cpu always about 99.5% idle, sometimes only 80% idle)
If I reproduce the error and watch what happens on TOP, i see just about nothing. (Asterisk process takes 4% CPU, as it would for any other call)

I updated the whole system today with yum update.
I've now got this kernel:
Code:
Linux 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:09:10 EDT 2010
The FreePBX front end says there are 16 updates available, but I didn't apply them, because I read somewhere in this forum you shouldn't update FreePBX separately from the rest of the Elastix system.

I have been doing some research about this. I noticed the dialparties agi script was taking so long, and i tried to follow these steps:
http://www.freepbx.org/forum/freepbx/us ... arties-agi

it wasn't WINS (i didn't have that configured)
The solution that worked for that post didn't work for me. I set it to Europe/Zurich in php.ini, didn't help either.

So I went on an tried this:
create a file called test.php and put the following in it:
print"Do you have stairs in your house?";
?>
then run:
strace php ./test.php > /tmp/log 2>&1
then you can go through /tmp/log and see if you can find the delay.
It will be tedious. Just a friendly warning.
I ran this command several times and figured out the following:
sometimes it's fast (subjectively immediate) and sometimes it's about 11 seconds. This correpsonds to the behaviour with calls. Sometimes it's fast, sometimes a bit more than 10 seconds.
When it takes 11 seconds to run the script, there is 2 times a 5 second break. Here's an extract of the log file, I marked those places in the text below with [BREAK].
Code:
gettimeofday({1292246080, 411647}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
send(3, "2I\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
poll([{fd=3, events=POLLIN}], 1, 5000[BREAK])  = 0 (Timeout)
gettimeofday({1292246085, 411295}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
send(3, "2I\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
poll([{fd=3, events=POLLIN}], 1, 5000[BREAK])  = 0 (Timeout)
close(3)                                = 0
alarm(0)                                = 0
futex(0xdef800, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0xdef720, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x1ef8e0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
Here's what happens when it runs fine:
Code:
gettimeofday({1292246094, 103891}, NULL) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
send(3, "\344\371\1\0\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 26, MSG_NOSIGNAL) = 26
poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [26])                = 0
recvfrom(3, "\344\371\201\203\0\1\0\0\0\0\0\0\10lotertux\0\0\1\0\1", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.0.1")}, [16]) = 26
close(3)                                = 0
alarm(0)                                = 0
futex(0xdef800, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0xdef720, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x1ef8e0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
So apparently, php is waiting for 2x a 5 second timeout sometimes, event for that simple script. What I don't know is how to go after it...
 

fraggle4

Joined
Apr 22, 2009
Messages
98
Likes
0
Points
0
#9

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#10
fraggle4 said:
I have never experienced this but I see the issue come up now and again.
Maybe DNS Lookups?
http://fonality.com/trixbox/forums/trix ... thing-slow
This issue is strictly related with php. ( command line interpreter). Maybe some library which need to DNS lookup is linked by mistake to the code. But for what agi scrip e.g. recording check need to use DNS?
 

oldrags

Joined
Dec 3, 2010
Messages
13
Likes
0
Points
0
#11
Hi all

The DNS solution worked! :woohoo:
THANK YOU SO MUCH!

All I did was change my resolv.conf. Commented the IP address of my Router-Firewall (A ZyWall 5) and used that of Google instead.
Code:
# cat /etc/resolv.conf
nameserver 8.8.8.8
#nameserver 192.168.0.1
So the solution works, which is great, but I wouldn't mind getting a technical explanation for this behavior.
First: why does PHP always attempt to access the DNS server, even when all the script is doing is print out a line?
Second: why in the world would the name resolution work for my whole system (web interface, ping, browsing, email, everything can resolve DNS just fine with the ZyWall) EXCEPT for PHP?

thanks a lot!

oldRags
 

fraggle4

Joined
Apr 22, 2009
Messages
98
Likes
0
Points
0
#12
From the trixbox thread cited:
In the command line, I executed "#php -v" and it took between 15 to 30 seconds for the output to be displayed. I then temporarily moved /etc/php.d (where all the extension *.ini's are located) and "#php -v" ran instantly. After moving /etc/php.d back and systematically commenting out each module, I discovered imap.so was causing the problem. I'm not sure why, but perhaps this will help someone who is having similar delays.
There are plenty of other posts around on this topic, but few clear answers, most of the fixes I have seen for this issue involve installing DNSmasq or BIND, which will also help spare you the "no-internal-calls-when-internet-goes-down" problem.

Glad it worked....
 

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#13
fraggle4 said:
From the trixbox thread cited:
In the command line, I executed "#php -v" and it took between 15 to 30 seconds for the output to be displayed. I then temporarily moved /etc/php.d (where all the extension *.ini's are located) and "#php -v" ran instantly. After moving /etc/php.d back and systematically commenting out each module, I discovered imap.so was causing the problem. I'm not sure why, but perhaps this will help someone who is having similar delays.
There are plenty of other posts around on this topic, but few clear answers, most of the fixes I have seen for this issue involve installing DNSmasq or BIND, which will also help spare you the "no-internal-calls-when-internet-goes-down" problem.

Glad it worked....
fraggle4: Thank you for this solution.
So what does it finally mean? All Asterisx distro which uses php agi scripts need Internet connection, to get call process working correctly - is it true?
What about ippabx installed in the places without internet connection?

What guys do you think about installing dns cache by dj berstein cr.yp.to? Could it help a little?
 

fraggle4

Joined
Apr 22, 2009
Messages
98
Likes
0
Points
0
#14

witekprytek

Joined
Dec 19, 2007
Messages
145
Likes
0
Points
0
#15
fraggle4 escribió:
This topic has come up many times, in many places, including Elastix without Tears, and affects all flavours of Asterisk.
Here's some info for starters:
http://www.elastixconnection.com/index. ... &task=view
So it seems that this bug is related for all freepbx based distro.
I could understand that some problems with dns resolution could affect call to other party via sip trunk behind the network. But why this issue does touch all internal calls or calls from PSTN via analog/digital/ cards?
 

Members online

No members online now.

Latest posts

Forum statistics

Threads
30,902
Messages
130,886
Members
17,564
Latest member
Mai Tuyen
Top