Hi Patrick, we retested this on the Ubuntu-Precise kernel (3.2.0-29-generic #46) with the latest ixgbe driver (3.11.33):
PF to PF (SR-IOV enabled) and VF to PF – 9.5 Gb/s
PF to VF and VF to VF – 5.5 Gb/s
It appears that the limiting factor is the VF receive queue. The interrupt assigned to it fires on a single CPU, and that CPU becomes 100% busy handling softirqs. We tried to spread the IRQs across several CPUs using /proc/irq/XXX/smp_affinity and /sys/class/net/XXX/queues/rx-0/rps_cpus (Receive Packet Steering), but all the interrupts are still handled on the same CPU. Is there a way to spread the interrupts for the receive queue across several CPUs?
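For reference, both smp_affinity and rps_cpus take a hexadecimal CPU bitmask (bit N set means CPU N is allowed), written as comma-separated 32-bit groups on machines with more than 32 CPUs. Below is a minimal sketch of how such a mask can be built. The helper name cpu_mask is made up for illustration and is not part of any tool we used.

```python
def cpu_mask(cpus):
    """Return the hex bitmask string for a list of CPU indices.

    Hypothetical helper: the result is in the format accepted by
    /proc/irq/<N>/smp_affinity and .../queues/rx-0/rps_cpus.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit N corresponds to CPU N

    # Split the mask into 32-bit groups, least significant first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break

    # The kernel expects the most significant group first.
    groups.reverse()
    head = format(groups[0], "x")
    tail = [format(g, "08x") for g in groups[1:]]
    return ",".join([head] + tail)


if __name__ == "__main__":
    # CPUs 0-3 -> "f"; writing the value to the files needs root.
    print(cpu_mask([0, 1, 2, 3]))
```

The resulting string would then be written (as root) to the smp_affinity or rps_cpus file in place of XXX above.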
Apart from that, we also ran a bi-directional test: VF-to-VF reaches 10 Gb/s in total, but PF-to-PF reaches 13-14 Gb/s in total, which is strange. Is the 10 Gb/s network bandwidth uni-directional or bi-directional?
Regarding the script: we tried it, but it is intended for multi-queue interfaces. When SR-IOV is enabled, each interface (VFs and PF) gets only one tx/rx queue pair (the PF gets a single TxRx queue), so the script does not look relevant for the SR-IOV case.
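The queue count per interface can be checked by listing /sys/class/net/<iface>/queues. Here is a rough sketch of that check. The function name count_queues and its sysfs_root parameter are assumptions made for illustration, and the interface name in the example is a placeholder.

```python
import os


def count_queues(iface, sysfs_root="/sys/class/net"):
    """Return (rx_count, tx_count) for a network interface.

    Hypothetical helper: it only counts the rx-*/tx-* directories
    that the kernel exposes under <sysfs_root>/<iface>/queues.
    """
    qdir = os.path.join(sysfs_root, iface, "queues")
    names = os.listdir(qdir)
    rx = sum(1 for n in names if n.startswith("rx-"))
    tx = sum(1 for n in names if n.startswith("tx-"))
    return rx, tx


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the PF or VF interface name.
    print(count_queues("eth0"))
```

A result of (1, 1) would correspond to the single tx/rx pair described above.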
Lastly, we disabled irqbalance, but didn't see any notable difference.
Alex.
P.S.: I will also reply to other threads that we are having in parallel with you:) Thank you for being so responsive.