Wednesday, 2022-03-09

*** tpb <[email protected]> has joined #litex00:00
*** nelgau <[email protected]> has joined #litex00:24
*** Degi <[email protected]> has quit IRC (Ping timeout: 256 seconds)02:57
*** Degi <[email protected]> has joined #litex02:59
*** eigenform <[email protected]> has quit IRC (Remote host closed the connection)05:38
*** eigenform <[email protected]> has joined #litex05:38
*** FabM <FabM!~FabM@2a03:d604:103:600:582e:9e69:69b0:53da> has joined #litex06:41
*** FabM <FabM!~FabM@armadeus/team/FabM> has quit IRC (Remote host closed the connection)06:51
*** FabM <FabM!~FabM@2a03:d604:103:600:222b:6e3a:9732:5f7f> has joined #litex07:06
*** cr1901 <cr1901!~cr1901@2601:8d:8600:911:cf1:8507:720a:c17> has quit IRC (Read error: Connection reset by peer)07:44
*** cr1901 <cr1901!~cr1901@2601:8d:8600:911:4c40:72d2:6947:5aae> has joined #litex08:07
*** jryans <jryans!~jryans@2001:470:69fc:105::1d> has quit IRC (Quit: You have been kicked for being idle)09:00
*** FabM <FabM!~FabM@armadeus/team/FabM> has quit IRC (Ping timeout: 240 seconds)11:34
tntWell ... I implemented the 'streaming' mode in a way that works. (I'm not a jtag expert but I'm pretty sure openocd is doing DRPAUSE wrong and I had to "match their wrongness").12:47
tntIt still sucks :/  30MHz JTAGBone is 20x slower at downloading a litescope trace than a 2Mbaud UARTBone.12:47
tntUnfortunately I think there is a bunch of inefficiencies that compound:13:34
tnt- AFAICT LiteScope just uses plain old single-at-a-time register read to get the data, but each read is one command, then the latency to wait for the response and then get the data. No bursting or anything to avoid that long latency cycle.13:35
tnt- The whole valid/ready handshake on the jtagbone also means that it can only send one byte per drscan burst, because if the ready bit it reads back is 0, it would have no way to stop any further bytes in the burst from going through should the 'ready' bit flip to 1 during a burst.13:37
_florent_tnt: litex_server is able to automatically regroup read accesses into bursts with https://github.com/enjoy-digital/litex/blob/master/litex/tools/litex_server.py#L2414:09
_florent_tnt: this should happen during the LiteScope upload14:10
tnt_florent_: but they're not sequential reads14:10
tntI mean ... there are a few sequential ones to read the width of the monitored burst.14:10
tntbut then it's reading the same address again.14:10
tntSo let's say you monitor 64 bits, it will read address 0 4 0 4 0 4 0 4 0 4 ....14:10
tntSo sure, it regroups the 0 4 in a burst, but that's not much compared to all the reads to get the whole data.14:11
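The effect tnt describes can be illustrated with a minimal sketch of merging consecutive sequential reads into bursts. This is not the actual litex_server code; the function name `merge_sequential_reads`, the 4-byte stride and the `max_burst` limit are assumptions made for illustration only.

```python
def merge_sequential_reads(addrs, stride=4, max_burst=255):
    """Group reads into (base_address, length) bursts.

    A read joins the previous burst only if its address directly follows
    the last address of that burst (hypothetical merging rule).
    """
    bursts = []
    for addr in addrs:
        if bursts:
            base, length = bursts[-1]
            if addr == base + length * stride and length < max_burst:
                bursts[-1] = (base, length + 1)
                continue
        bursts.append((addr, 1))
    return bursts


# 64-bit capture: the 0 4 0 4 ... pattern only ever merges pairs.
print(merge_sequential_reads([0, 4, 0, 4, 0, 4]))
# Truly sequential addresses collapse into a single burst.
print(merge_sequential_reads([0, 4, 8, 12]))
```

With the alternating pattern, the number of bursts still grows linearly with the number of samples, which is why the merge helps little for narrow captures.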
tntAlso in litescope_cli it's calling regs.xxxx.read()  so it would want the result before issuing the next one, so it wouldn't have any opportunity to merge.14:13
_florent_tnt: it will indeed have more effect on large capture buses14:14
tntshould probably revert d8df6cb27d0d2611ab4e4fd41d303c111525581c too. I guess there was a reason it wasn't in the list and I just assumed it was an inadvertent omission.14:18
_florent_tnt: but yes, this is a simple upload protocol and this could be optimized14:20
tntThat's what I'm looking at now. It's probably the shortest path to increase performance.14:21
tnt(rather than keep banging my head against JTAGBone)14:21
_florent_tnt: to allow fixed bursts, we could also only expose the data on a 32-bit CSR and use DownConverter between the mem FIFO and the CSR interface.14:26
_florent_tnt: this would avoid the 0 4 0 4 0 4 etc... pattern and allow read_merger to generate a proper fixed burst14:26
tntYes, that's the plan.14:26
_florent_but we should also probably avoid checking mem_valid for each data: https://github.com/enjoy-digital/litescope/blob/master/litescope/software/driver/analyzer.py#L16214:27
tntAnd I also have a plan to deal with the 'mem_valid' it sticks in there.14:27
_florent_ok good :)14:28
_florent_instead of the valid, you could report the mem.level on a CSR14:29
tntI think the valid is mostly there to deal with the CDC fifo becoming empty14:29
tntin case the 'scope' clock domain is slow vs the 'sys' one.14:30
tntOr maybe not ... because it aborts if not valid.14:31
tntThen yeah, actually just reading the level at the beginning instead of storage_length would do.14:31
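The plan discussed above (one 32-bit data register, one level read up front, then fixed-address bursts) can be sketched on the host side as follows. Everything here is a hypothetical model, not the LiteScope driver: `FakeAnalyzer`, `read_level`, `read_fixed_burst` and the least-significant-word-first order are assumptions chosen for illustration.

```python
from collections import deque


class FakeAnalyzer:
    """Hypothetical device model: a sample FIFO drained through one 32-bit register."""

    def __init__(self, samples, data_width):
        self.words_per_sample = (data_width + 31) // 32
        self.words = deque()
        for sample in samples:
            # Assumed order: least-significant 32-bit word first.
            for i in range(self.words_per_sample):
                self.words.append((sample >> (32 * i)) & 0xFFFFFFFF)
        self.transactions = 0  # counts host round trips

    def read_level(self):
        self.transactions += 1
        return len(self.words) // self.words_per_sample

    def read_fixed_burst(self, length):
        # One round trip returns `length` words read from the same address.
        self.transactions += 1
        return [self.words.popleft() for _ in range(length)]


def download(dev, max_burst=255):
    """Read the level once, then fetch all words in fixed-address bursts."""
    n_words = dev.read_level() * dev.words_per_sample
    words = []
    while len(words) < n_words:
        words += dev.read_fixed_burst(min(max_burst, n_words - len(words)))
    wps = dev.words_per_sample
    # Reassemble each sample from its 32-bit words.
    return [
        sum(w << (32 * i) for i, w in enumerate(words[k:k + wps]))
        for k in range(0, n_words, wps)
    ]


samples = [0x1122334455667788, 0xDEADBEEFCAFEF00D, 1]
dev = FakeAnalyzer(samples, data_width=64)
assert download(dev) == samples
print(dev.transactions)  # round trips used for the whole download
```

In this model the number of round trips depends on `max_burst` rather than on a per-sample valid check, which is the point of reading the level once at the beginning.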
*** bl0x <bl0x!~bastii@p200300d7a7116d009ec14f33202f95b5.dip0.t-ipconnect.de> has joined #litex14:34
_florent_tnt: I did something very close to speed-up crossover UART, this can maybe be useful: https://github.com/enjoy-digital/litex/blob/master/litex/tools/litex_term.py#L122-L13114:35
_florent_tnt: here I was monitoring full/empty CSR but the principle will be similar with a level CSR.14:36
tnt_florent_: is there convenient way to get the native CSR bus width ?14:38
tnt(instead of just CSRStatus(32) ...)14:38
_florent_The SoC has it (self.csr_data_width), but this is not directly available from the core, this could be a parameter of LiteScopeAnalyzer14:42
_florent_tnt: csr_data_width of 8 is still supported, but not sure it's useful to have it optimized with LiteScope since it's mostly here for retro-compatibility and almost everyone is probably using csr_data_width=32 now14:44
*** r4d10n[m] <r4d10n[m]!~r4d10nmat@2001:470:69fc:105::1:6255> has quit IRC (Quit: You have been kicked for being idle)16:00
tntGotta run now, but first results are encouraging. ~ 7x speed up of the download phase and > 85% of the theoretical max bitrate of UART.16:19
*** daveb <daveb!~Thunderbi@cpc138570-newt42-2-0-cust29.19-3.cable.virginm.net> has joined #litex16:35
_florent_Great!16:41
jevinskie[m]About JTAGbone speed… I’ve been pondering for a while about adding another bit to the protocol. If set, it would initiate a DMA from a following addr/length pair and stream it out16:55
_florent_jevinskie[m]: we could think about an alternative protocol to speed up large transfers yes. (I was also thinking doing something similar for Etherbone where we could just stream the data on a specific UDP port to/from the Host)16:59
*** daveb <daveb!~Thunderbi@cpc138570-newt42-2-0-cust29.19-3.cable.virginm.net> has quit IRC (Quit: daveb)19:08
*** Znullptr <[email protected]> has joined #litex19:08
tntjevinskie[m]: well the protocol actually already supports that since it's the same UARTBone protocol and supports bursting.19:09
tntBut an improvement to the tunneling made in JTAGBone would be that instead of a 'ready' bit (in the device -> host direction), we use a 'not almost full' bit instead. So that if the buffer isn't almost full, we can safely push several chars because there is enough buffer space.19:10
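The 'not almost full' idea can be illustrated with a toy model. It is not the JTAGBone implementation; `RxFifo`, `margin` and `drain_per_scan` are hypothetical names, and the model assumes the device drains some bytes between scans (`drain_per_scan > 0`). A `margin` of 1 behaves like the existing one-byte 'ready' handshake.

```python
class RxFifo:
    """Hypothetical device-side receive buffer."""

    def __init__(self, depth, margin):
        self.depth = depth
        self.margin = margin  # free space guaranteed when the flag is set
        self.level = 0

    @property
    def not_almost_full(self):
        return self.depth - self.level >= self.margin

    def push(self, n):
        if self.level + n > self.depth:
            raise OverflowError("host pushed more than the buffer can hold")
        self.level += n

    def drain(self, n):
        self.level = max(0, self.level - n)


def scans_needed(fifo, total, drain_per_scan):
    """Count scans needed to send `total` bytes (requires drain_per_scan > 0)."""
    sent = 0
    scans = 0
    while sent < total:
        # The flag reflects the state seen at the end of the previous scan;
        # free space can only grow afterwards, so pushing `margin` bytes is safe.
        if fifo.not_almost_full:
            n = min(fifo.margin, total - sent)
            fifo.push(n)
            sent += n
        fifo.drain(drain_per_scan)
        scans += 1
    return scans


print(scans_needed(RxFifo(depth=16, margin=8), total=64, drain_per_scan=8))
print(scans_needed(RxFifo(depth=16, margin=1), total=64, drain_per_scan=8))
```

The larger the guaranteed margin, the more bytes each scan can carry without risking an overflow, at the cost of reserving that much buffer space on the device.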
*** znullptr[m] <znullptr[m]!~znullptrm@2001:470:69fc:105::1:d698> has joined #litex19:51
*** Znullptr <[email protected]> has quit IRC ()20:15
tntPushed a PR for the new proto. Should probably be tested by a few more people though :)21:08
znullptr[m]used all defaults for install ` litex_sim --cpu-type=vexriscv ` :   ../libc/libc.a(libc_ssp_chk_fail.c.o): in function `__chk_fail': undefined reference to `write'  22:49
