Re: [Jack-Devel] Place for bug reporting

Date: Thu, 05 Jun 2014 12:46:46 +0400
From: TimKa <[hidden] at yandex dot ru>
To: Adrian Knoth <[hidden] at drcomp dot erfurt dot thur dot de>
Cc: JACK <[hidden] at lists dot jackaudio dot org>
In-Reply-To: Adrian Knoth, Re: [Jack-Devel] Place for bug reporting
Please see below.

On 04.06.2014 20:52, Adrian Knoth wrote:
> On Wed, Jun 04, 2014 at 08:00:53PM +0400, TimKa wrote:
>
>> I do not believe that it's clock drift. I have two master clock
>> sides.
> I could end here. There are never two masters. There is exactly one
> clock master, everything else is slaved. If not, you need sample rate
> conversion (SRC). As simple as that.
>
> And this is what is happening. Whenever your two "masters" are off by
> more than your "jitter buffer" (read: yet another FIFO), you're into
> trouble. SRC or clock synchronisation, choose whatever suits your
> needs.
OK, maybe I have some misunderstanding of terms: I say "master mode" when 
talking about synchronisation to the hardware clock of the audio codec. 
Because I am working with voice, I can avoid both clock sync and 
resampling, and instead control the FIFO size at the period level during 
silence by adding or deleting whole periods in the buffer (see the sketch 
below). Because voice transfer is bidirectional, I need two "masters" in 
my terms.
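
Roughly what I mean, as a simplified sketch (the buffer layout, names and 
thresholds here are illustrative, not my actual code): during silence I 
drop or repeat whole periods so that the FIFO fill level stays near the 
middle, instead of resampling or syncing clocks.

/* Period-level FIFO control during silence; a sketch, not my real code. */
#include <string.h>

#define PERIOD      128              /* samples per period (8 kHz voice) */
#define MAX_PERIODS 16               /* capacity of the jitter FIFO      */

static float fifo[MAX_PERIODS][PERIOD];
static int   fill;                   /* periods currently buffered       */

static int is_silent(const float *p)
{
    float e = 0.0f;
    for (int i = 0; i < PERIOD; i++)
        e += p[i] * p[i];
    return e / PERIOD < 1e-6f;       /* arbitrary energy threshold       */
}

/* Called for every incoming period: instead of SRC, keep the fill level
 * near the middle by dropping or repeating whole periods, but only while
 * the signal is silent so the edit stays inaudible. */
static void push_period(const float *p)
{
    if (fill == MAX_PERIODS) {                   /* overflow: rewrite newest */
        memcpy(fifo[fill - 1], p, sizeof fifo[0]);
        return;
    }
    memcpy(fifo[fill++], p, sizeof fifo[0]);

    if (is_silent(p)) {
        if (fill > MAX_PERIODS * 3 / 4)
            fill--;                              /* too full: drop this one  */
        else if (fill < MAX_PERIODS / 4)
            memcpy(fifo[fill++], p, sizeof fifo[0]); /* too empty: repeat it */
    }
}

A real implementation would use a proper ring buffer with read/write 
indices, of course, but the idea is the same.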

>
>
>> I just send UDP buffers to Ethernet on the sender side. On the
>> receiver side I just check the availability of data with a zero timeout
>> (and read it into the buffer if present) and copy data out of the jitter
>> buffer. I also check for jitter buffer overflow, so, as I understand it,
>> clock drift is not the key issue in my situation.
> And what do you do when there is not enough data in the buffer?
> Inserting zeroes as needed? Likewise, I guess that you throw away data
> if the buffer is full, too. In that case, it should work, you're
> essentially doing some poor way of SRC.
Yes, it's a poor man's SRC, but at the period level rather than the sample 
level. It needs much less computation, and I believe it is OK for my 
embedded platform and my task. The receiver-side check looks roughly like 
the sketch below.
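
The availability check on the receiving side is just a zero-timeout poll 
on the UDP socket, roughly like this (a sketch, socket setup omitted; the 
names and sizes are illustrative):

#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

#define PERIOD_BYTES (128 * sizeof(float))

/* Return 1 if a full period was read, 0 if nothing was waiting. */
static int try_recv_period(int sock, void *dst)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    if (poll(&pfd, 1, 0) <= 0 || !(pfd.revents & POLLIN))
        return 0;                    /* no datagram queued: don't block  */

    return recv(sock, dst, PERIOD_BYTES, 0) == (ssize_t)PERIOD_BYTES;
}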
>
>
> Use some debugging information to verify that your buffer always
> maintains a valid fill level. In other words, print a warning whenever
> it passes a given low/high watermark.
Well, right now I don't check any low or high watermark on my buffer. If 
my buffer is empty I just return from the process function, and if my 
buffer is full I just overwrite the last period. Please keep in mind that 
this is just a test of plain jackd with net transfer on my OS/HW platform. 
The purpose of this test is just to find the problem in jackd or in my 
OS/platform and to check its stability. I expect 24/7 stability from this 
solution in the future.
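
If it helps to see it concretely, here is a sketch of how I could add such 
a watermark warning in the receive client's process callback; the fifo_* 
helpers and the watermark values are hypothetical, only the JACK calls 
themselves are real:

#include <stdio.h>
#include <string.h>
#include <jack/jack.h>

#define LOW_WATERMARK   2            /* periods; values are illustrative */
#define HIGH_WATERMARK 12

extern jack_port_t *out_port;        /* registered elsewhere             */
extern int fifo_fill(void);          /* hypothetical: periods buffered   */
extern int fifo_pop(jack_default_audio_sample_t *dst, jack_nframes_t n);

static int process(jack_nframes_t nframes, void *arg)
{
    jack_default_audio_sample_t *out = jack_port_get_buffer(out_port, nframes);
    int fill = fifo_fill();
    (void)arg;

    /* Debug-only warning; fprintf is not RT-safe, remove after testing. */
    if (fill < LOW_WATERMARK || fill > HIGH_WATERMARK)
        fprintf(stderr, "jitter buffer fill = %d periods\n", fill);

    if (!fifo_pop(out, nframes))                 /* buffer empty this cycle */
        memset(out, 0, nframes * sizeof *out);   /* output silence instead  */

    return 0;
}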

>
>> For VoIP (similar to my demo) there are some strategies that work
>> well enough even with two master clock sides.
> It's called SRC.
OK, if we can talk about resampling on a per-period basis, then you are right, it's SRC.
>
>
>>> Posting logs of underruns won't get you anywhere; this doesn't even
>>> qualify as a bug report. Neither does a segfault without a stacktrace.
>> Could you please advise what is enough for a bug report?
> Let's start with what is not a bug report:
>
>    * "I'm writing my own application, and it's not working."
>
>    * "Whatever I do, it segfaults."
Well, it's not like that. I really appreciate JACK and I would like to use 
it. I'm going to write my own client application for jackd, but right now 
I'm testing the jackd/OS/platform combination with a very simple net 
client and getting unexpected behaviour: the CPU is about 95% idle on 
average while jackd is running, yet sometimes jackd or the receive client 
stops.
>
>
> Xruns are not bugs, they are expected error conditions. It's up to the
> integration engineer (you) to figure out why they occur. Nobody except
> you knows your setup.
Yes, I understand that, but in my case it looks like the CPU becomes 
extremely loaded after an xrun happens.
But perhaps I am confusing cause and effect: maybe the high CPU load comes 
first, because of some platform problem, and the jackd problems follow.
>
> If something segfaults, fire up gdb and point out where the problem is
> or at least could be. We're talking multithreaded applications here,
> it's super easy to create a memory corruption somewhere. In that case,
> not even gdb would make sense. valgrind might help, but then again,
> you'd need to invoke jackd with a large timeout. Debugging RT
> applications is hard.
>
> If gdb is too much to ask for, then at least try to generalise your
> problem, try it on a different machine and see if it's really a jackd
> problem. It has to be reproducible.
Yes, I will check different hardware/software combinations to reproduce 
the problem.
Actually, gdb is my last option because of the complexity of jackd, so I 
really hope I can demonstrate the problem without it. Of course, if we are 
talking about a patch, then I think I must use gdb in any case.
>
>
>
> Let's do some expectation management: If you assume that you can pick
> two random ARM boards, install Linux and just use jackd to forward some
> audio from one board to the other, then this is likely not going to
> work. You can start with ordinary UDP streaming via gstreamer/vlc, make
> sure to have SRC on the receiving end, and see if that works.
I have already done that. The typical ALSA utils with aplay/arecord/nc 
work well without any problems. I will also try streaming on my platform, 
but I understand that a delay of 1-2 seconds is not suitable even for a 
simple demo.

>
> You'll notice that smaller buffers are more likely to cause trouble.
> Depending on your needs, you might never reach the desired latency.
> There are people with full-blown i7 machines out there who cannot go
> below a certain latency because of some obscure system details (graphics
> driver, wifi driver, firmware (SMI), IRQ assignment).
That's why I chose an embedded platform: I can review all the schematics, 
the hardware list, etc.
>
> As said before: start with large buffers (4k samples) and expect all
> kinds of trouble when you go lower. Be scientific, measure, take notes
> what is working and what is not. And most importantly: don't expect this
> stuff to work out of the box. It's neither a text editor nor a web
> radio, it's a real-time application.
Thanks for the advice, but (you will smile) in an RT application 
everything should be much more predictable and clearly specified. I 
migrated from QNX to Linux development not long ago, and I have a lot of 
experience with processes running much faster than audio at 8000 Hz with 
128-sample periods: my team was working on radiolocation (80 MHz sample 
rate). The last hardware we made was a DSP at a 600 MHz clock rate with 
Ethernet transmission (also with Linux as the OS), so I couldn't even 
imagine that my task would be "very complex" for an ARM at 720 MHz. Yes, I 
took it out of the box, but I haven't stopped being an engineer and I was 
ready for trouble. I still think my hardware has enough performance to 
transfer 8000 Hz audio over the network, even with a 128-sample period, 
but I am still exploring where the problem is.

Right now (after switching off the NO_HZ options in the kernel) I get 
quite nice behaviour from the Linux kernel.
I see the following two cases (jackd -d alsa -r8000 -n4 -p256 -s + my 
clients):
1. After some time my client silently stops on the receiving side. The CPU 
has no extreme load and there are no xruns on the console, but the 
following is present:
subgraph starting at netrecv timed out (subgraph_wait_fd=16, status = 0, 
state = Running, pollret = 0 revents = 0x0)
2. After some time jackd silently stops on the receiving side. The CPU has 
no extreme load and there are no xruns on the console.
The sending side works stably for a long time. Please point me at what I 
can check, or how to get additional verbose/debug information about this 
problem.

>
> And if it xruns, it's not a jackd bug. It's your system that is at
> fault. Your only chance is to spend more time in the lab figuring out
> why.
That is exactly what I am going to do.
>
>
> HTH
>
Cheers
