Overrun when doing sync meta transfers TX and RX

Bcd · Post by **Bcd** » Thu Dec 07, 2017 12:28 pm

Hi there,

I am working on a SDR program that requires using both RX and TX at specific timestamps. In testing the program, sometimes I get overruns and discontinued timestamps on RX.

I reproduced the bug with a small program by modifying the sync_rxtx.c (in order to reduce the clutter) and, yes, I am able to see the same overrun behaviour.

Looking at dmesg I see that the USB device is reset due to LPM from time to time. I correlate the messages with the moments when the reset occurs but I am not sure about that.

One strange thing is that the overrun occurs when I create the timestam for TX based on the one from RX like this:
meta_tx.timestamp = meta_rx.timestamp + 10000;

but it does not occur when I create it like this:

meta_tx.timestamp += 10000;

I am using Ubuntu 16.04.3 with kernel 4.10.

Is there something I am missing, is there something I need to do?

output from console:

.........
TX actual_count: 0, timestamp: 81667826
RX actual_count: 10000, timestamp: 81667826
TX actual_count: 0, timestamp: 81677826
RX actual_count: 10000, timestamp: 81677826
TX actual_count: 0, timestamp: 81687826
RX actual_count: 10000, timestamp: 81687826
TX actual_count: 0, timestamp: 81697826
RX actual_count: 10000, timestamp: 81697826
TX actual_count: 0, timestamp: 81707826
RX actual_count: 10000, timestamp: 81707826
TX actual_count: 0, timestamp: 81717826
RX actual_count: 10000, timestamp: 81717826
TX actual_count: 0, timestamp: 81727826
RX actual_count: 10000, timestamp: 81727826
TX actual_count: 0, timestamp: 81737826
RX overrun, count: 8792, timestamp: 81737826
RX actual_count: 8792, timestamp: 81737826
TX actual_count: 0, timestamp: 81747826
RX actual_count: 10000, timestamp: 81747646
Closing the device...

dmesg:

[70124.812673] usb 1-4: Product: bladeRF
[70124.812676] usb 1-4: Manufacturer: Nuand
[70124.812678] usb 1-4: SerialNumber: 9eeeeecf610343cba827a9889dcc9c23
[70135.899122] usb 1-4: reset high-speed USB device number 5 using xhci_hcd
[70189.999920] usb 1-4: USB disconnect, device number 5
[70266.959216] usb 2-2: new SuperSpeed USB device number 9 using xhci_hcd
[70266.979724] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[70266.980198] usb 2-2: New USB device found, idVendor=2cf0, idProduct=5246
[70266.980203] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[70266.980207] usb 2-2: Product: bladeRF
[70266.980210] usb 2-2: Manufacturer: Nuand
[70266.980213] usb 2-2: SerialNumber: 9eeeeecf610343cba827a9889dcc9c23
[70276.919235] usb 2-2: reset SuperSpeed USB device number 9 using xhci_hcd
[70276.939527] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[70317.299166] usb 2-2: reset SuperSpeed USB device number 9 using xhci_hcd
[70317.319220] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[70376.383426] usb 2-2: reset SuperSpeed USB device number 9 using xhci_hcd
[70376.403448] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[70821.463588] usb 2-2: reset SuperSpeed USB device number 9 using xhci_hcd
[70821.483728] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[70869.671442] usb 2-2: reset SuperSpeed USB device number 9 using xhci_hcd
[70869.691695] usb 2-2: LPM exit latency is zeroed, disabling LPM.

Link to download the sync_rxtx.c

https://we.tl/ZUJuq92pOy

Thanks

Bcd · Post by **Bcd** » Sat Dec 09, 2017 11:35 am

Since last post I did more investigation on the USB problem and I enabled USB logging in kernel and attach now a complete dmesg with all debug messages. Seems the EP is reset from time to time, not sure if this is ok or not or if it affects the transfer.

Also I attach also the complete log from the example_meta_rxtx.c app

Am am not able to attach logs. Is there a way for enabling attachments instead of external storage and post link here?

dmesg - https://we.tl/187wzTP9aS
run log - https://we.tl/GmOO7ODV2g

Thanks

Bcd · Post by **Bcd** » Wed Dec 13, 2017 6:41 am

More news on the problem.

I moved the bladerf to another PC, having a different USB 3.0 controller. I am using the same software stack and FW/FPGA versions: 0.6, 2.0.0

The result is the same, the app crashes as there is overrun. From logs it seems that the driver restarts and hensce the discontinuity in timestamps after reboot..

Anyone willing to help me with running the app (see above posts) to help me identify if this is a failing BladeRF?

Thanks

Bcd · Post by **Bcd** » Wed Dec 13, 2017 6:56 am

Well mixed news and feelings !!! I did a test by reducing the temperature of the BladeRF by taking it outside of the window. There are 12 degree C outside and I am using the plastic case that I bought from Nuand together with bladeRF.

Seems the problem disappears!!! And there is no overrun for long time. The app ran for like 10 hours without overrun and crashing. When I bring it inside and the temperature rises it starts giving overruns and I return back to the usual behavior.

Anyone owning a bladeRF wanting to test the same app to see if I have a defective unit? The unit is new and I just bought it after I returned the previous one due to being deaf (the first one was a new unit too).

Thanks

Bcd · Post by **Bcd** » Wed Dec 13, 2017 9:44 am

So I add a modified sync_rxtx.c without prints on the console, for those who want to test:

https://we.tl/Givbyq11vL

and also example_common.h - it defines the samplerate and frequencyes:

https://we.tl/YNyqv0g3QS

and example_common.c

https://we.tl/oYCQfrdyPo

bglod · Post by **bglod** » Wed Dec 13, 2017 11:01 am

Fascinating. If it's temperature dependent, it sounds very much like a timing issue. Can you open up Quartus' TimeQuest and report failing paths? That might tell you something. If you're using the stock image, I can take a gander at it.

Bcd · Post by **Bcd** » Wed Dec 13, 2017 12:58 pm

bglod wrote: ↑Wed Dec 13, 2017 11:01 am Fascinating. If it's temperature dependent, it sounds very much like a timing issue. Can you open up Quartus' TimeQuest and report failing paths? That might tell you something. If you're using the stock image, I can take a gander at it.

I do not have Quartus installed and did not use it before. I might try when have some time.
You can reproduce the problem using the above files. Your colleague Ray was able to reproduce it so far.

bglod · Post by **bglod** » Wed Dec 13, 2017 1:05 pm

I gotcha. It would help if I checked IRC, huh?

rtucker · Post by **rtucker** » Wed Dec 13, 2017 4:49 pm

So yep, was working with Bcd via IRC today...

The application in question alternates between doing a bladerf_sync_rx of 10000 samples, followed by a bladerf_sync_tx of 10000 samples, scheduled for 1024 ticks after the RX sample timestamp (meta_tx.timestamp = meta_rx.timestamp + 1024).

The parameters:
num_buffers = 256
buffer_size = 8192
num_transfers = 32
sample rate = 7.68 Msps

How things break down: discontinuities in the RX sample timestamps occur, and eventually a BLADERF_ERR_TIME_PAST is thrown because bladerf_sync_tx is called with a timestamp in the past.

Things that don't seem to make a difference on my end:

FPGA 0.6.0 vs 0.7.1
libbladeRF 1.7.2 vs 1.9.0[
Temperature (tried 5, 20, and 40 degrees C, no big difference)

Things that do seem to make a difference:

buffer_size (I cannot get it to fail with 8192, but I can with 1024)
samples_len (if sample_len is bigger than buffer_size, it seems more likely to end on a BLADERF_ERR_TIME_PAST)
I/O load on host computer

The I/O load one was a bit of a surprise. I had a disk-intensive background task running on my laptop while working with this earlier, and then stepped away for a few hours. That disk-intensive process was done, and sure enough, I could no longer reproduce the problem. But if I did a "dd if=/dev/zero of=largefile bs=1024 count=4000000", the discontinuities start up again and eventually it will crash out. (Note that the original background task and dd are both CPU-intensive, so it could be that too.)

The textbook way to end up in this situation is to have too long of a gap between calls to bladerf_sync_rx(). So, I threw in some instrumentation.

First, up top:

Code: Select all

#include <sys/time.h>

#define RX_TIME_USEC 1500
#define TX_TIME_USEC 1500

// https://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html
int
timeval_subtract (struct timeval *result, struct timeval *x, struct timeval *y)
{
  /* Perform the carry for the later subtraction by updating y. */
  if (x->tv_usec < y->tv_usec) {
    int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
    y->tv_usec -= 1000000 * nsec;
    y->tv_sec += nsec;
  }
  if (x->tv_usec - y->tv_usec > 1000000) {
    int nsec = (x->tv_usec - y->tv_usec) / 1000000;
    y->tv_usec += 1000000 * nsec;
    y->tv_sec -= nsec;
  }

  /* Compute the time remaining to wait.
     tv_usec is certainly positive. */
  result->tv_sec = x->tv_sec - y->tv_sec;
  result->tv_usec = x->tv_usec - y->tv_usec;

  /* Return 1 if result is negative. */
  return x->tv_sec < y->tv_sec;
}

Then, the rxtx_loop section looks like this... each of the lines containing a reference to "tv_" is added for instrumentation. I think I put a " // <<<" at the end of the new lines.

Code: Select all

    /** [rxtx_loop] */

    struct bladerf_metadata meta_rx;
    struct bladerf_metadata meta_tx;
    struct bladerf_metadata meta;

    struct timeval tv_now, tv_start, tv_end, tv_delta, tv_last_rx, tv_last_tx; // <<<

    memset(&meta_rx, 0, sizeof(meta_rx));
    memset(&meta_tx, 0, sizeof(meta_tx));
    memset(&meta, 0, sizeof(meta));

    meta_tx.flags = BLADERF_META_FLAG_TX_BURST_START;
    meta.flags = BLADERF_META_FLAG_TX_BURST_START;

    meta_rx.flags = BLADERF_META_FLAG_RX_NOW;
    status = bladerf_sync_rx(dev, rx_samples, samples_len, &meta_rx, timeout_rx);
    if (status != 0)
        printf("Failed to RX count: %u, timestamp: %lu\n", meta_rx.actual_count, meta_rx.timestamp);
    else if (!status && meta_rx.status == BLADERF_META_STATUS_OVERRUN)
        printf("RX overrun, count: %u, timestamp: %lu\n", meta_rx.actual_count, meta_rx.timestamp);

    gettimeofday(&tv_last_rx, NULL); // <<<

#if 1
    // obtain current timestamp
    status = bladerf_get_timestamp(dev, BLADERF_MODULE_TX, &meta_tx.timestamp);
    if (status != 0) {
        fprintf(stderr, "Failed to get current TX timestamp: %s\n", bladerf_strerror(status));
        return status;
    } else {
        printf("Current TX timestamp: %lu\n", meta_tx.timestamp);
    }
#endif

    gettimeofday(&tv_last_tx, NULL); // <<<

    // an initial delay
    meta_tx.timestamp = meta_rx.timestamp;;

    while (status == 0 && !done) {
        gettimeofday(&tv_start, NULL); // <<<

#if 1   
        /* Receive samples */
        status = bladerf_sync_rx(dev, rx_samples, samples_len, &meta_rx, timeout_rx);
    if (status != 0)
        printf("Failed to RX count: %u, timestamp: %lu\n", meta_rx.actual_count, meta_rx.timestamp);
    else if (!status && meta_rx.status == BLADERF_META_STATUS_OVERRUN) {
        printf("RX overrun, count: %u, timestamp: %lu\n", meta_rx.actual_count, meta_rx.timestamp);
    }

    gettimeofday(&tv_now, NULL); // <<<
    timeval_subtract(&tv_delta, &tv_now, &tv_last_rx); // <<<
    if (tv_delta.tv_sec > 0 || tv_delta.tv_usec > RX_TIME_USEC) { // <<<
        printf("%ld.%06ld since last RX (current ts = %lu)\n", tv_delta.tv_sec, tv_delta.tv_usec, meta_rx.timestamp); // <<<
    } // <<<
    gettimeofday(&tv_last_rx, NULL); // <<<

    //printf("RX actual_count: %u, timestamp: %lu\n", meta_rx.actual_count, meta_rx.timestamp);
#endif

    status = 0;
        if (status == 0) {

            /* Process these samples, and potentially produce a response
             * to transmit */

            //done = do_work(rx_samples, samples_len, &have_tx_data, tx_samples, samples_len);
            done = false;
            have_tx_data = 1;

            if (!done && have_tx_data) {

        meta_tx.timestamp = meta_rx.timestamp + 10000;
        //meta_tx.timestamp += 10000;

                /* Transmit a response */
                status = bladerf_sync_tx(dev, tx_samples, samples_len, &meta_tx, timeout_tx);
                if (status != 0) {
                    fprintf(stderr, "Failed to TX samples: %s\n",
                            bladerf_strerror(status));
                }
        else if (!status && meta_tx.status == BLADERF_META_STATUS_UNDERRUN)
            printf("TX actual_count: %u, timestamp: %lu\n", meta_tx.actual_count, meta_tx.timestamp);

        gettimeofday(&tv_now, NULL); // <<<
        timeval_subtract(&tv_delta, &tv_now, &tv_last_tx); // <<<
        if (tv_delta.tv_sec > 0 || tv_delta.tv_usec > TX_TIME_USEC) { // <<<
            printf("%ld.%06ld since last TX (current ts = %lu)\n", tv_delta.tv_sec, tv_delta.tv_usec, meta_tx.timestamp);  // <<<
        } // <<<
        gettimeofday(&tv_last_tx, NULL); // <<<

        meta_tx.flags = BLADERF_META_FLAG_TX_UPDATE_TIMESTAMP;


        //meta_tx.timestamp += samples_len + 10;

            }
        } else {
            fprintf(stderr, "Failed to RX samples: %s\n",
                    bladerf_strerror(status));
        }

        //done = (meta_tx.timestamp > 50000000);

        gettimeofday(&tv_end, NULL); // <<<

    }

    timeval_subtract(&tv_delta, &tv_end, &tv_start); // <<<
    printf("last loop took %ld.%06ld seconds\n", tv_delta.tv_sec, tv_delta.tv_usec); // <<<

    /** [rxtx_loop] */

With samples_len = 8192, at 7.68 Msamples/s, the time between each call to RX or TX should average out to about 0.001067 seconds. If it starts running substantially longer than this on a regular basis, an overflow can occur.

So, with this in mind, I would recommend making sure your system is as quiet as possible (CPU, disk, network, video, etc). Playing around with gprof might prove interesting, too. I don't think it's a hardware problem, although I could see this being exacerbated by a weird USB signal integrity issue. Try a different USB cable? I'd think that kind of issue would show up with slower sample rates as well, though.

Bcd · Post by **Bcd** » Thu Dec 14, 2017 3:29 am

Thanks for your effort Rey!

Below I am going through things I consider worth investigating:

- Where do I get the latest FPGA? You mentioned 0.7.1, I have 0.6.0

- I/O for me does not matter much. I did the dd test and there is no noticeable difference. Maybe because on my laptop the USB3.0 controller does not have another device connected to it. How is it in your case? lsusb -t shows the tree.

- I integrated the changes in your comments above and the result is that the average time for RX/TX is about 2.2ms so I needed to change the defines to have 2500 instead of 1500. Hmm, what is the actual calculus of transfer for 10000bytes RX/RX taking in consideration libbladerf, libusb and controllers? I will do later the same test on a PC where I had the same problems.

- the temperature matters a lot in my case. At the room temp (23 degree C) there happens the sample discontinuity that I believe it is due to a reset of device during transfer. If I take the bladeRF board outside in 12 degree C, there is no timestamp discontinuity and the timediff is kept below 2.5 and the app can run for long time (half an hour at least for now). See the next point regarding the reset.

See below an excerpt from a quick run and crash at room temp:

Running...
[VERBOSE @ nios_access.c:298] nios_config_read: Read 0x0003002f
[VERBOSE @ nios_access.c:309] nios_config_write: Wrote 0x0003002f
[VERBOSE @ nios_access.c:298] nios_config_read: Read 0x0003002f
[VERBOSE @ nios_access.c:309] nios_config_write: Wrote 0x0003002f
[DEBUG @ bladerf.c:352] Enable Module: RX - True
[VERBOSE @ nios_access.c:409] nios_lms6_read: Read 0x00 from addr 0x70
[VERBOSE @ nios_access.c:421] nios_lms6_write: Wrote 0x01 to addr 0x70
[DEBUG @ bladerf.c:352] Enable Module: TX - True
[VERBOSE @ nios_access.c:409] nios_lms6_read: Read 0x00 from addr 0x40
[VERBOSE @ nios_access.c:421] nios_lms6_write: Wrote 0x02 to addr 0x40
[DEBUG @ sync.c:326] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ sync.c:346] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ sync.c:361] sync_rx: Worker is now running.
[VERBOSE @ nios_access.c:370] nios_get_timestamp: Read TX timstamp: 0x140720752595968
Current TX timestamp: 18004
[DEBUG @ sync.c:849] sync_tx: Worker is now running.
0.004856 since last TX (current ts = 181036)
0.004253 since last RX (current ts = 190012)
0.002671 since last RX (current ts = 1820012)
0.002722 since last TX (current ts = 1821036)
[DEBUG @ sync.c:485] Sample discontinuity detected @ buffer 140, message 4: Expected t=48997628, got t=317433084
RX overrun, count: 7616, timestamp: 48990012
[DEBUG @ sync.c:485] Sample discontinuity detected @ buffer 140, message 5: Expected t=317433592, got t=48998136
RX overrun, count: 508, timestamp: 317433084
[DEBUG @ sync.c:704] Provided timestamp=48999160 is in past: current=317444108
Failed to TX samples: Requested timestamp is in the past
last loop took 0.000988 seconds
[DEBUG @ bladerf.c:352] Enable Module: RX - False
[VERBOSE @ nios_access.c:409] nios_lms6_read: Read 0x01 from addr 0x70

- there is a reset that happen at the device every times the app exists due to overruns. I can see it in dmesg (how is it in your case?).

[ 93.186439] usb 2-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 93.206379] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[ 294.510555] usb 2-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 294.530444] usb 2-2: LPM exit latency is zeroed, disabling LPM.
[ 354.166147] usb 2-2: reset SuperSpeed USB device number 2 using xhci_hcd
[ 354.186369] usb 2-2: LPM exit latency is zeroed, disabling LPM.

The reset happens during transfer and I think it is the cause of the bladerf timestamp mismatch after which the app exits. If the reset does not happen, even though the RX and TX tv timestamp diff go sometimes above 2.5ms there is no problem and the app keeps going. Could it be the FPGA or FX3 resetting? Is there a log for the FPGA/FX3 to check?

- I ordered a quality cable, I'll see how it goes.

Thanks

Bcd · Post by **Bcd** » Thu Dec 14, 2017 12:46 pm

Alright, I did the same tests on a server with an i7 / 8cores. I am using Ubuntu 16.04.3 (kernel 4.10.0-35) on both Laptop and server and the bladerf sources are from mainstream github.

The RX/TX diff timing is also 2.2ms like on the laptop and I needed to set

#define RX_TIME_USEC 2500
#define TX_TIME_USEC 2500

At room temp I get the same reset of the device during transfers which in turn make the app to exit (see below the log). There is actually no time diff above 2.5ms, good server I guess.

This is from the kernel log, the device reset:

[152500.042265] usb 4-2: reset SuperSpeed USB device number 2 using xhci_hcd
[152500.063196] usb 4-2: LPM exit latency is zeroed, disabling LPM.

Just cooling the device makes it run without reset and no overruns.

Basically there is no difference from running the app on Laptop and on the server.

The log below is from a run at room temp:

[VERBOSE @ "backend/usb/libusb.c":593] Using libusb version: 1.0.20.11004
[VERBOSE @ "backend/usb/libusb.c":470] Found a bladeRF (idx=3)
[VERBOSE @ "backend/usb/libusb.c":532] USB port reset succeeded for bladeRF 9eeeeecf610343cba827a9889dcc9c23
[VERBOSE @ "capabilities.c":45] Capability mask before FPGA load: 0x0000000700000000
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 0
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 1
[VERBOSE @ "backend/usb/usb.c":169] Read FPGA version: 0.6.0
[VERBOSE @ "capabilities.c":96] Capability mask after FPGA load: 0x00000007000003ff
[VERBOSE @ "backend/usb/usb.c":180] Using current packet handler formats
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 2
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 1
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 2
[VERBOSE @ "backend/usb/usb.c":112] Changing to USB alt setting 1
[VERBOSE @ "backend/usb/nios_access.c":298] nios_config_read: Read 0x0003002f
[DEBUG @ "tuning.c":82] Default tuning mode: FPGA
[VERBOSE @ "backend/usb/nios_access.c":298] nios_config_read: Read 0x0003002f
[VERBOSE @ "backend/usb/nios_access.c":745] Clearing RX retune queue.
[VERBOSE @ "backend/usb/nios_access.c":768] RX operation duration: 44
[VERBOSE @ "backend/usb/nios_access.c":745] Clearing TX retune queue.
[VERBOSE @ "backend/usb/nios_access.c":768] TX operation duration: 35
[DEBUG @ "tuning.c":135] Setting RX frequency to 2560000000
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":357] VCOCAP estimate: 42
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1677] ---- Frequency ----
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1678] x : 2
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1679] nint : 133
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1680] nfrac : 2796203
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1681] freqsel : 0x2c
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1682] reference: 38400000
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1683] freq : 2560000001
[VERBOSE @ "backend/usb/nios_access.c":750] nios_retune: module=RX timestamp=0 nint=133 nfrac=2796203
freqsel=0x2c vcocap=0x2a low_band=0 quick_tune=0
[VERBOSE @ "backend/usb/nios_access.c":765] RX retune operation: vcocap=44, duration=2305
[VERBOSE @ "si5338.c":525] Setting integer sample rate: 7680000
[VERBOSE @ "si5338.c":383] Found r value of: 1
[VERBOSE @ "si5338.c":393] MSx a + b/c: 165 + 0/1
[VERBOSE @ "si5338.c":422] MSx a + b/c: 165 + 0/1
[VERBOSE @ "si5338.c":211] MSx P1: 0x00005080 (20608) P2: 0x00000000 (0) P3: 0x00000001 (1)
[VERBOSE @ "si5338.c":347] Calculated multisynth frequency: 7680000 + 0/1
[VERBOSE @ "si5338.c":234] Writing MS1
[VERBOSE @ "backend/usb/nios_access.c":385] nios_si5338_read: Read 0x01 from addr 0x25
[VERBOSE @ "si5338.c":243] Wrote enable register: 0x01
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x01 to addr 0x25
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x80 to addr 0x40
[VERBOSE @ "si5338.c":257] Wrote regs[0]: 0x80
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x50 to addr 0x41
[VERBOSE @ "si5338.c":257] Wrote regs[1]: 0x50
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x42
[VERBOSE @ "si5338.c":257] Wrote regs[2]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x43
[VERBOSE @ "si5338.c":257] Wrote regs[3]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x44
[VERBOSE @ "si5338.c":257] Wrote regs[4]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x45
[VERBOSE @ "si5338.c":257] Wrote regs[5]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x01 to addr 0x46
[VERBOSE @ "si5338.c":257] Wrote regs[6]: 0x01
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x47
[VERBOSE @ "si5338.c":257] Wrote regs[7]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x48
[VERBOSE @ "si5338.c":257] Wrote regs[8]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x49
[VERBOSE @ "si5338.c":257] Wrote regs[9]: 0x00
[VERBOSE @ "si5338.c":272] Wrote r register: 0xc0
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0xc0 to addr 0x20
[VERBOSE @ "si5338.c":542] Set actual integer sample rate: 7680000
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x3e from addr 0x54
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x3e to addr 0x54
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x0c from addr 0x55
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x3e from addr 0x54
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x3e to addr 0x54
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0xe0 from addr 0x75
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0xe0 to addr 0x75
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x68 to addr 0x76
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x00 to addr 0x65
[DEBUG @ "tuning.c":135] Setting TX frequency to 2561920000
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":357] VCOCAP estimate: 42
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1677] ---- Frequency ----
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1678] x : 2
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1679] nint : 133
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1680] nfrac : 3635063
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1681] freqsel : 0x2c
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1682] reference: 38400000
[VERBOSE @ "/home/bogdan/projects/gnuradio/src/bladeRF/fpga_common/src/lms.c":1683] freq : 2561919999
[VERBOSE @ "backend/usb/nios_access.c":750] nios_retune: module=TX timestamp=0 nint=133 nfrac=3635063
freqsel=0x2c vcocap=0x2a low_band=0 quick_tune=0
[VERBOSE @ "backend/usb/nios_access.c":765] TX retune operation: vcocap=44, duration=2097
[VERBOSE @ "si5338.c":525] Setting integer sample rate: 7680000
[VERBOSE @ "si5338.c":383] Found r value of: 1
[VERBOSE @ "si5338.c":393] MSx a + b/c: 165 + 0/1
[VERBOSE @ "si5338.c":422] MSx a + b/c: 165 + 0/1
[VERBOSE @ "si5338.c":211] MSx P1: 0x00005080 (20608) P2: 0x00000000 (0) P3: 0x00000001 (1)
[VERBOSE @ "si5338.c":347] Calculated multisynth frequency: 7680000 + 0/1
[VERBOSE @ "si5338.c":234] Writing MS2
[VERBOSE @ "backend/usb/nios_access.c":385] nios_si5338_read: Read 0x03 from addr 0x26
[VERBOSE @ "si5338.c":243] Wrote enable register: 0x03
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x03 to addr 0x26
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x80 to addr 0x4b
[VERBOSE @ "si5338.c":257] Wrote regs[0]: 0x80
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x50 to addr 0x4c
[VERBOSE @ "si5338.c":257] Wrote regs[1]: 0x50
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x4d
[VERBOSE @ "si5338.c":257] Wrote regs[2]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x4e
[VERBOSE @ "si5338.c":257] Wrote regs[3]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x4f
[VERBOSE @ "si5338.c":257] Wrote regs[4]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x50
[VERBOSE @ "si5338.c":257] Wrote regs[5]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x01 to addr 0x51
[VERBOSE @ "si5338.c":257] Wrote regs[6]: 0x01
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x52
[VERBOSE @ "si5338.c":257] Wrote regs[7]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x53
[VERBOSE @ "si5338.c":257] Wrote regs[8]: 0x00
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0x00 to addr 0x54
[VERBOSE @ "si5338.c":257] Wrote regs[9]: 0x00
[VERBOSE @ "si5338.c":272] Wrote r register: 0xc0
[VERBOSE @ "backend/usb/nios_access.c":397] nios_si5338_write: Wrote 0xc0 to addr 0x21
[VERBOSE @ "si5338.c":542] Set actual integer sample rate: 7680000
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x3e from addr 0x34
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x3e to addr 0x34
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x0c from addr 0x35
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x3e from addr 0x34
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x3e to addr 0x34
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x0f to addr 0x41
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x00 from addr 0x45
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x00 to addr 0x45
[VERBOSE @ "backend/usb/nios_access.c":298] nios_config_read: Read 0x0003002f
[VERBOSE @ "backend/usb/nios_access.c":309] nios_config_write: Wrote 0x0003002f
[VERBOSE @ "backend/usb/nios_access.c":298] nios_config_read: Read 0x0003002f
[VERBOSE @ "backend/usb/nios_access.c":309] nios_config_write: Wrote 0x0003002f
[DEBUG @ "bladerf.c":356] Enable Module: RX - True
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x00 from addr 0x70
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x01 to addr 0x70
[DEBUG @ "bladerf.c":356] Enable Module: TX - True
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x00 from addr 0x40
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x02 to addr 0x40
[DEBUG @ "sync.c":326] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ "sync.c":346] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ "sync.c":361] sync_rx: Worker is now running.
[VERBOSE @ "backend/usb/nios_access.c":370] nios_get_timestamp: Read TX timstamp: 0x140728404499152
[DEBUG @ "sync.c":849] sync_tx: Worker is now running.
[DEBUG @ "sync.c":485] Sample discontinuity detected @ buffer 111, message 5: Expected t=92458551, got t=360894007
[DEBUG @ "sync.c":485] Sample discontinuity detected @ buffer 111, message 6: Expected t=360894515, got t=92459059
[DEBUG @ "sync.c":704] Provided timestamp=92460083 is in past: current=360905031
Failed to TX samples: Requested timestamp is in the past
[DEBUG @ "bladerf.c":356] Enable Module: RX - False
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x01 from addr 0x70
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x00 to addr 0x70
[DEBUG @ "bladerf.c":356] Enable Module: TX - False
[VERBOSE @ "backend/usb/nios_access.c":409] nios_lms6_read: Read 0x02 from addr 0x40
[VERBOSE @ "backend/usb/nios_access.c":421] nios_lms6_write: Wrote 0x00 to addr 0x40
[VERBOSE @ "backend/usb/nios_access.c":745] Clearing RX retune queue.
[VERBOSE @ "backend/usb/nios_access.c":768] RX operation duration: 43
[VERBOSE @ "backend/usb/nios_access.c":745] Clearing TX retune queue.
[VERBOSE @ "backend/usb/nios_access.c":768] TX operation duration: 35
Opening and initializing device...

RX frequency: 2560000000 Hz
RX samplerate: 7680000 sps
RX bandwidth: 1500000 Hz
RX LNA Gain: Max
RX VGA1 gain: 20
RX VGA2 gain: 0

TX frequency: 2561920000 Hz
TX samplerate: 7680000 sps
TX bandwidth: 1500000
TX VGA1 gain: -20
TX VGA2 gain: 0

Running...
Current TX timestamp: 17821
RX overrun, count: 8540, timestamp: 92450011
RX overrun, count: 508, timestamp: 360894007
last loop took 0.000987 seconds
Closing the device...

Bcd · Post by **Bcd** » Mon Dec 18, 2017 2:59 pm

So far, I did try some more things I had on my list:

- I powered bladeRF with an external power 5V/2A (changed J70) - same result
- changed the USB cable with another one - same result
- I installed the new FPGA 0.7.1 - same result.
Though, I could notice the some differences between 0.6.0 and 0.7.1 although I don't think they are important:

0.6.0
RX operation duration 43
TX operation duration 35

0.7.1
RX operation duration 380
TX operation duration 380

I even installed Quartus and built the FPGA from scratch. I assume that I need the JTAG USB cable in order to try TimeQuest as @bglod mentioned above? Or not?

- reinstalled the FX3 firmware - same result.

In all cases there is always the same temperature dependency as reported above.

Any ideas to try out?

Bcd · Post by **Bcd** » Thu Dec 21, 2017 4:20 am

Ok, yesterday I tried the TimeQuest option in Quartus as @bglod mentioned in a post above.

It uses the Slow 1200mV 85C, Slow 1200mv 0C, Fast 1200mV 0C models.

Although there are several failing paths with a negative slack there is just one that appears only at higher temperature, see below:

+--------------------------------------------------------------------------------+
; Slow 1200mV 85C Model Setup Summary ;
+-------------------------------------------------------+--------+---------------+
; Clock ; Slack ; End Point TNS ;
+-------------------------------------------------------+--------+---------------+
; lms_sclk_pin ; -4.705 ; -9.386 ;
; lms_rx_clock_out ; -3.549 ; -3.549 ;
; U_fx3_pll|altpll_component|auto_generated|pll1|clk[0] ; -0.021 ; -0.021 ;
; fx3_virtual ; 0.229 ; 0.000 ;
; U_pll|altpll_component|auto_generated|pll1|clk[0] ; 0.560 ; 0.000 ;

....

+--------------------------------------------------------------------------------+
; Slow 1200mV 0C Model Setup Summary ;
+-------------------------------------------------------+--------+---------------+
; Clock ; Slack ; End Point TNS ;
+-------------------------------------------------------+--------+---------------+
; lms_sclk_pin ; -4.345 ; -8.646 ;
; lms_rx_clock_out ; -3.629 ; -3.629 ;
; U_fx3_pll|altpll_component|auto_generated|pll1|clk[0] ; 0.031 ; 0.000 ;
; fx3_virtual ; 0.913 ; 0.000 ;
; U_pll|altpll_component|auto_generated|pll1|clk[0] ; 1.678 ; 0.000 ;

And here is the output form the 'worst case scenario' on Slow 1200mV 85C:
-0.021 fx3_ctl[5] fx3_gpif:U_fx3_gpif|can_tx fx3_virtual U_fx3_pll|altpll_component|auto_generated|pll1|clk[0] 10.100 1.428 3.285 1

jbassey · Post by **jbassey** » Thu Nov 12, 2020 10:14 am

Bcd wrote: ↑Wed Dec 13, 2017 9:44 am So I add a modified sync_rxtx.c without prints on the console, for those who want to test:

https://we.tl/Givbyq11vL

and also example_common.h - it defines the samplerate and frequencyes:

https://we.tl/YNyqv0g3QS

and example_common.c

https://we.tl/oYCQfrdyPo

Can I ask how exactly you run the example sync_rx_meta.c example file? The boilerplate.c example seems to work but the sync_rx_meta-c gives errors regarding header files.

Official bladeRF forum

Overrun when doing sync meta transfers TX and RX

Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX

Re: Overrun when doing sync meta transfers TX and RX