Tuesday, 2022-05-10

*** tpb <[email protected]> has joined #litex00:00
*** Degi_ <[email protected]> has joined #litex00:09
*** Degi <[email protected]> has quit IRC (Ping timeout: 276 seconds)00:09
*** Degi_ is now known as Degi00:09
*** subthreshold <[email protected]> has quit IRC (Quit: Client closed)01:58
*** FabM <FabM!~FabM@armadeus/team/FabM> has joined #litex08:04
*** FabM <FabM!~FabM@armadeus/team/FabM> has quit IRC (Ping timeout: 248 seconds)10:05
*** TMM_ <[email protected]> has quit IRC (Quit: https://quassel-irc.org - Chat comfortably. Anywhere.)10:52
*** TMM_ <[email protected]> has joined #litex10:52
*** FabM <[email protected]> has joined #litex12:22
tntThe litepcie loopback when operating in prog mode should be 'blocking' right ? I mean if the dma writer doesn't have any descriptors it will not read from its input fifo. And the dma reader isn't going to read/use any of the descriptors if its output fifo is full ?12:54
tntOh ...but when disabled, they will drop anything.13:18
*** oter <oter!5e7a0135f3@2604:bf00:561:2000::25f> has quit IRC (Remote host closed the connection)13:26
*** oter <oter!5e7a0135f3@2604:bf00:561:2000::25f> has joined #litex13:27
_florent_tnt: the difference between prog and loop mode is only that in loop mode, descriptors read from the fifo are written back14:06
tnt_florent_: yeah, I know, but that means it doesn't need interaction from the sw to proceed since it always has descriptors.14:11
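[Editor's sketch] The loop/prog distinction described above can be illustrated with a toy model. This is not LitePCIe code: the class `DescriptorTable`, its methods, and the buffer names are hypothetical and only mirror the statement that in loop mode descriptors read from the FIFO are written back, so the DMA never runs dry without software interaction.

```python
from collections import deque


class DescriptorTable:
    """Toy model of a DMA descriptor FIFO (hypothetical, not the LitePCIe implementation)."""

    def __init__(self, loop_mode):
        self.loop_mode = loop_mode
        self.fifo = deque()

    def program(self, descriptor):
        # Software pushes a descriptor into the table.
        self.fifo.append(descriptor)

    def next_descriptor(self):
        # Returns None when the table is empty, i.e. the DMA would stall.
        if not self.fifo:
            return None
        descriptor = self.fifo.popleft()
        if self.loop_mode:
            # Loop mode: the descriptor just read is written back.
            self.fifo.append(descriptor)
        return descriptor


def run(table, count):
    """Ask the table for `count` descriptors in a row."""
    return [table.next_descriptor() for _ in range(count)]


prog = DescriptorTable(loop_mode=False)
loop = DescriptorTable(loop_mode=True)
for name in ("buf0", "buf1"):
    prog.program(name)
    loop.program(name)

prog_seq = run(prog, 4)  # runs out after two descriptors
loop_seq = run(loop, 4)  # keeps cycling through the same two
```

In this model prog mode needs software to keep calling `program()`, which is one plausible source of the extra overhead compared with loop mode.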
tntHere what I was observing is that due to some issue in my code, I wasn't enabling the DMA Writer, but I was seeing the DMA reader keep going ...14:12
tntand that surprised me because I was wondering where the data was going since the writer wasn't being run, it wouldn't consume any data and so the reader should block.14:13
tntBut turns out that when disabled the DMA writer just discards any data at its input.14:13
_florent_tnt: Ah sorry, so while programming the descriptors, you can keep the DMA disabled14:16
tntBTW, here's the current state : https://github.com/smunaut/litepcie/commits/rework14:17
tntand this now behaves like I'd expect it to.14:17
_florent_tnt: the fact that DMA is discarding the data is indeed wanted in some cases (to avoid propagating backpressure/blocking things)14:17
_florent_but this should be configurable14:17
_florent_I remember looking at this recently, give me a second14:18
_florent_OK, so I wanted to provide the behaviour you expect for the generator and did this: https://github.com/enjoy-digital/litepcie/commit/e60c97f946db0385320da5efeb9070ddcad2221814:22
_florent_so filtered the valid/ready with the enable14:23
_florent_but this makes behavior different on this point with direct LiteX integration of the core and with the generator14:25
_florent_so we could eventually add a parameter to configure this14:25
tntWait, this does the opposite of what the commit says.  It says "DMA Writer will not accept incoming stream when disabled."  but then you do "self.comb += sink.ready.eq(1)"14:25
_florent_the commit is describing the behavior for the generator14:27
_florent_the commit is also reverting this: https://github.com/enjoy-digital/litepcie/commit/7e63e459b8909f43cb60b45362c10a7397b8722414:28
_florent_(I was not really satisfied with it)14:29
tntOk.14:29
tntAnyway, I'm not really bothered by the behavior, I just wasn't expecting it, but my logic won't generate any input to the writer with it being disabled, so not a problem.14:30
tntWhat kind of performance (Gb/s or % of theoretical) should I be expecting btw ?14:31
_florent_ok, I understand it can be confusing. Maybe we should block DMA Writer by default and enable the discarding only when specified. I'll look at this.14:32
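[Editor's sketch] The two behaviours discussed above (a disabled DMA Writer that discards its input versus one that blocks it) can be compared with a small behavioural model. This is plain Python, not Migen or LitePCIe code, and the parameter name `discard_when_disabled` is a hypothetical stand-in for the configuration option being proposed.

```python
def sink_ready(enable, has_descriptor, discard_when_disabled):
    """Model of the writer's `sink.ready` signal (hypothetical, simplified)."""
    if not enable:
        # Disabled: either always accept (and drop) or never accept (block).
        return discard_when_disabled
    # Enabled: data is only consumed when a descriptor is available.
    return has_descriptor


def feed(words, enable, has_descriptor, discard_when_disabled):
    """Offer each word once and sort it by what happens to it."""
    written, dropped, stalled = [], [], []
    for word in words:
        if sink_ready(enable, has_descriptor, discard_when_disabled):
            (written if enable else dropped).append(word)
        else:
            stalled.append(word)  # backpressure reaches the upstream source
    return written, dropped, stalled


data = [0x10, 0x11, 0x12]

# Behaviour observed in the chat: disabled writer silently drops everything.
discarding = feed(data, enable=False, has_descriptor=False, discard_when_disabled=True)

# Behaviour tnt expected: disabled writer blocks, so the reader eventually stalls.
blocking = feed(data, enable=False, has_descriptor=False, discard_when_disabled=False)

# Normal operation: enabled with descriptors available.
running = feed(data, enable=True, has_descriptor=True, discard_when_disabled=False)
```

With discarding, the upstream side never sees backpressure, which matches the DMA reader continuing to run while the writer was disabled.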
_florent_That's generally around ~80-85% efficiency (on PCIe Gen2 / 7-Series), it should be similar for Gen3/Gen4/Ultrascale 14:35
_florent_So ~3.5Gbps per Gen2 lane. (theoretical max of 4Gbps with the 8b10b encoding).14:36
tntgen3 is not 8b/10b, you should get almost all of it so in 8x, I should get about 50G.  ATM I'm at 33G (in zero copy and without data check), so probably some overhead of using prog mode vs loop mode, I'll look into improving that.14:39
_florent_yes I'm aware gen3 is not 8b/10b, I was just providing the numbers I have on gen2 :)14:41
tntyeah, I was just providing explanation for my math :)14:42
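[Editor's sketch] The arithmetic behind the figures above can be worked out as follows. The line rates (5 GT/s for Gen2, 8 GT/s for Gen3) and encodings (8b/10b, 128b/130b) are standard PCIe values; the 80-85% efficiency range and the 33G measurement come from the chat. The calculation only accounts for line encoding and ignores packet and protocol overhead, so treat it as a rough upper bound.

```python
# (line rate in GT/s, payload bits, encoded bits) per PCIe generation
LINK = {
    "gen2": (5.0, 8, 10),     # 8b/10b encoding
    "gen3": (8.0, 128, 130),  # 128b/130b encoding
}


def raw_gbps(gen, lanes):
    """Bandwidth after line encoding only, in Gb/s."""
    rate, payload_bits, encoded_bits = LINK[gen]
    return rate * payload_bits / encoded_bits * lanes


gen2_lane = raw_gbps("gen2", 1)  # 4.0 Gb/s per lane
gen3_x8 = raw_gbps("gen3", 8)    # about 63 Gb/s for an x8 link

# Expected throughput at the 80-85% efficiency quoted in the chat.
expected_low = 0.80 * gen3_x8
expected_high = 0.85 * gen3_x8

# Efficiency of the 33 Gb/s measurement relative to the encoded maximum.
measured_efficiency = 33.0 / gen3_x8

print(f"Gen2 x1 max: {gen2_lane:.2f} Gb/s")
print(f"Gen3 x8 max: {gen3_x8:.2f} Gb/s")
print(f"Gen3 x8 at 80-85%: {expected_low:.1f} to {expected_high:.1f} Gb/s")
print(f"33 Gb/s is {measured_efficiency:.0%} of the Gen3 x8 maximum")
```

Under these assumptions, "about 50G" corresponds to the low end of the 80-85% range on a Gen3 x8 link, and 33G sits at a little over half of the encoded maximum.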
_florent_with high PCIe bandwidth, the DRAM bandwidth on the Host can also be a limiting factor14:44
_florent_be sure to activate dual/quad channel if available14:45
tntYeah, I filled the right DIMM slots, but on the chipset you can't "overclock" the RAM :/ It's stuck at like 2466M or something like that.14:46
*** futarisIRCcloud <[email protected]> has joined #litex15:25
*** FabM <FabM!~FabM@armadeus/team/FabM> has quit IRC (Quit: Leaving)15:30
*** futarisIRCcloud <[email protected]> has quit IRC (Quit: Connection closed for inactivity)17:35
somlogatecat: I just updated my toolchain to the latest (as of last night) yosys, trellis, and nextpnr. And it looks like I can now fit a FPU-enabled rocket core on the 85k ecp5, which is awesome!21:49
somlonextpnr FTW! :D21:49
gatecatoh, that's very good news, there were some ECP5 packing improvements that have hopefully helped21:49
somlotiming is less forgiving -- I still get nextpnr to report 25-ish MHz (I'm asking for 50). Without the FPU, it mostly boots linux and trundles along OK. But with the FPU, it now fails memtest21:51
somlonot sure how much lower than 50 I can take LiteX and still have it work (iirc, litedram really really doesn't like running at slow sysclock rates)21:52
somlobut I'm currently hammering at it with/without `nowidelut` and `abc9`, and with random nextpnr seeds, to see if I maybe get lucky with one of the runs, timing-wise :)21:53
somlobut anyway, TLDR -- I wanted to say thanks for the placement improvement, it's quite significant!21:54
swetlandooh I should update.  been squishing a VexRISCV RV32IM w/ U/S/M and MMU and peripherals in a 25F and it's crowded in there21:58
tntyou need to disable the DDR DLL to run at slow speed.22:03
