[Gross] Rotate stuck
Eino Tuominen
eino at utu.fi
Sun Sep 27 11:26:05 EEST 2009
Eino Tuominen wrote:
> Steve Wardle wrote:
>> On Fri, 18 Sep 2009 19:00:20 +0300
>> Eino Tuominen <eino at utu.fi> wrote:
>>
>>> Hi,
>>>
>>> pstack looks fine to me. This is how the thing should work:
>>>
>>> ...
>> Hi Eino,
>>
>> Thanks for the explanation.
>>
>> I installed 1.0.2 this morning but I'm still seeing the same problem.
>>
>> There is no "received rotate command" in the log once the rotation is "stuck".
>>
>> I'll send you the log off list.
>
> [ a lot of debugging off list ]
>
> Can anybody replicate the issue Steve is seeing? I know of at least two
> major sites using Gross as a milter, but they are not running on
> Solaris. One is running on NetBSD and another one on FreeBSD, I think.
>
> The right thing to do is to separate milter from grossd and run it as a
> separate process.
Hello,
I just had another look at the pstack output of the grossd process. This
is from Steve's main thread:
---------------- lwp# 1 / thread# 1 --------------------
ff0cc21c pause ()
ff379cc0 sleep (ff390000, ffbff820, 4, 0, 12c, 3dbe0) + f4
0001aaec main (ffbff740, 2, ffbffcac, 2a400, 2a400, 2a400) + 320
00014590 _start (0, 0, 0, 0, 0, 0) + 5c
-
And this is from a running grossd on our own MTA:
----------------- lwp# 1 / thread# 1 --------------------
ff19c648 nanosleep (ffbff7a0, ffbff798)
ff09dc5c sleep (1, a22dd, 0, 488ebd, 1, ff3cdb8c) + 58
0001f51c main (1, ffbffcc4, ffbffccc, 47000, 0, 0) + a3c
00014110 _start (0, 0, 0, 0, 0, 0) + 108
-
There it is: Steve's grossd is in pause(), but I just cannot understand
what is causing that. It looks like sleep() is using the single-threaded
implementation (which I think uses pause() and alarm()). Check the
sleep() manpage on Solaris 10.
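
For illustration only, here is a minimal sketch of the textbook
alarm()/pause() style of sleep. It is not the Solaris libc code, and
alarm_sleep() and on_alarm() are hypothetical names. It shows why a
thread can end up parked in pause(): pause() returns only after a signal
handler has run, so if SIGALRM is blocked or never delivered to that
thread, the call never comes back.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Empty handler: its only job is to make pause() return. */
static void on_alarm(int signo)
{
    (void)signo;
}

/* Hypothetical sketch of an alarm()/pause() based sleep.
 * Returns the number of unslept seconds, like sleep(). */
static unsigned int alarm_sleep(unsigned int seconds)
{
    struct sigaction sa, old;
    unsigned int left;

    if (seconds == 0)
        return 0;               /* alarm(0) would only cancel the timer */

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return seconds;

    alarm(seconds);             /* ask for SIGALRM after 'seconds' */
    pause();                    /* blocks until some handler has run */
    left = alarm(0);            /* cancel the timer, fetch what is left */

    sigaction(SIGALRM, &old, NULL);
    return left;
}
```

In a multi-threaded process the SIGALRM may be handled by a different
thread, or be blocked in the calling thread, which would leave the
caller in pause() exactly as in the stack above.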
Could you send me the output of configure and make? I'm interested to
see which options get used.
What you could do is replace the sleep(1) at the end of gross.c with
this code segment:
    sleeptime.tv_sec = 1;
    sleeptime.tv_nsec = 0;
    do {
        ret = nanosleep(&sleeptime, &sleepleft);
        if (ret) {
            /* sleep was interrupted */
            sleeptime.tv_sec = sleepleft.tv_sec;
            sleeptime.tv_nsec = sleepleft.tv_nsec;
        }
    } while (ret);
And of course add
struct timespec sleeptime, sleepleft;
at the beginning of the main() function.
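
The same retry loop can also be written as a standalone helper. The
following is only a sketch: sleep_full() is a hypothetical name, not
something in the Gross sources. The one behavioural difference from the
snippet above is that it retries only when nanosleep() fails with EINTR,
so an invalid argument (EINVAL) does not loop forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper, not part of Gross: sleep for the full interval,
 * resuming after signal interruptions.
 * Returns 0 on success, -1 on a real error (errno set by nanosleep). */
static int sleep_full(time_t sec, long nsec)
{
    struct timespec req;
    struct timespec rem;

    req.tv_sec = sec;
    req.tv_nsec = nsec;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;      /* e.g. EINVAL: give up instead of spinning */
        req = rem;          /* interrupted: sleep the remaining time */
    }
    return 0;
}
```

With such a helper, the sleep(1) call would become sleep_full(1, 0) and
no extra variables are needed in main().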
There's also one sleep() in syncmgr.c (which looks a bit weird; it looks
like it should be replaced with a mutex), but that gets called only when
a sync peer connects.
--
Eino Tuominen