*** tpb has joined #yosys | 00:00 | |
*** maartenBE has quit IRC | 00:26 | |
*** Degi has quit IRC | 00:33 | |
*** maartenBE has joined #yosys | 00:34 | |
*** Degi has joined #yosys | 00:34 | |
*** emeb_mac has quit IRC | 01:19 | |
*** emeb has quit IRC | 01:19 | |
*** emeb has joined #yosys | 01:20 | |
*** emeb has quit IRC | 01:27 | |
*** emeb has joined #yosys | 01:28 | |
*** emeb_mac has joined #yosys | 01:28 | |
*** cr1901_modern has quit IRC | 02:00 | |
*** emeb has quit IRC | 02:01 | |
*** citypw has joined #yosys | 02:12 | |
*** cr1901_modern has joined #yosys | 02:28 | |
*** SpaceCoaster has joined #yosys | 03:45 | |
*** FFY00 has quit IRC | 03:53 | |
*** FFY00 has joined #yosys | 03:54 | |
*** az0re has joined #yosys | 04:32 | |
*** craigo has joined #yosys | 05:05 | |
*** xtro has quit IRC | 05:46 | |
*** FL4SHK has quit IRC | 06:06 | |
*** FL4SHK has joined #yosys | 06:08 | |
*** emeb_mac has quit IRC | 06:56 | |
*** craigo has quit IRC | 07:19 | |
*** FL4SHK has quit IRC | 07:20 | |
*** craigo has joined #yosys | 07:21 | |
*** Asu has joined #yosys | 07:57 | |
*** N2TOH_ has joined #yosys | 08:23 | |
*** N2TOH has quit IRC | 08:27 | |
*** kraiskil has joined #yosys | 08:58 | |
*** kraiskil has quit IRC | 09:32 | |
*** kraiskil has joined #yosys | 09:34 | |
pepijndevos | daveshah, in Gowin it seems pip delay depends on fanout. How is this represented in nextpnr(-generic)? | 09:35 |
---|---|---|
*** kraiskil has quit IRC | 09:42 | |
Lofty | pepijndevos: in nextpnr's API, that's up to you; you can probably model it as a base + constant * fanout model | 09:52 |
daveshah | This isn't something nextpnr-generic supports at the moment | 09:53 |
Lofty | But pip delay in general depends on fanout due to capacitance, I think | 09:53 |
daveshah | Yes although not all arches model this | 09:53 |
pepijndevos | right, that was more or less what I expected heh | 09:53 |
pepijndevos | ty | 09:53 |
daveshah | ECP5 does so that would be a starting point if you want inspiration | 09:53 |
pepijndevos | ah good to know | 09:53 |
pepijndevos | I think I'll keep postponing a gowin nextpnr target a while longer until at least clock routing works | 09:54 |
pepijndevos | I'll probably ping you at that time for some recommendations for a good starting point for a new arch. | 09:54 |
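Editor's note: the "base + constant * fanout" idea Lofty mentions above can be sketched in a few lines. This is a minimal illustration only. The class name, field names, and numeric values are hypothetical, and it is not nextpnr's API nor how the ECP5 architecture actually stores its timing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipDelayModel:
    """Hypothetical linear model: delay = base + per_fanout * fanout."""

    base_ns: float        # intrinsic delay of the pip (assumed value below)
    per_fanout_ns: float  # extra delay per driven load (assumed value below)

    def delay_ns(self, fanout: int) -> float:
        # A pip that drives nothing has no meaningful delay in this model.
        if fanout < 1:
            raise ValueError("fanout must be at least 1")
        return self.base_ns + self.per_fanout_ns * fanout


# Illustrative numbers only; not taken from any Gowin or ECP5 timing database.
model = PipDelayModel(base_ns=0.25, per_fanout_ns=0.125)
delays = {n: model.delay_ns(n) for n in (1, 2, 4, 8)}
```

In a real architecture the two coefficients would presumably come from the vendor timing data per pip class, and the fanout would be looked up from the net bound to the pip's destination wire.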
*** citypw has quit IRC | 10:02 | |
pepijndevos | ugh... seems Ghidra somehow forgot the analysis data, so I guess I'll be working on something else... | 10:07 |
*** citypw has joined #yosys | 10:14 | |
tux3 | Aw, I can't get Yosys to accept clocks in interfaces anymore, not even with hacky workarounds :/ | 11:24 |
tux3 | >Handling const CLK on $memory$flatten\<snipped path>.mem[437]$55298 ($dff) from module top (removing D path). | 11:24 |
mwk | hmm | 11:24 |
mwk | could you show an example? | 11:24 |
tux3 | I haven't minimized it, but essentially I use my AXI4-lite interface to talk to a small SRAM, so I have a `module my_sram(axi4lite.slave bus)`, and some master talks to it | 11:28 |
tux3 | Well, I did have this bug open https://github.com/YosysHQ/yosys/issues/1592 originally, but at the time I could workaround it by putting my clock in a modport | 11:28 |
tpb | Title: Clock in interface port mis-synthesized away (but accepted in modport) · Issue #1592 · YosysHQ/yosys · GitHub (at github.com) | 11:28 |
tux3 | It's probably not an easy fix or a recent regression, so I might just keep this project on the proprietary toolchains for a while more /shrug | 11:30 |
mwk | just to be sure, does it work with 4a05cad7f8a6ee57292e5360eb06305e13fc308b? | 11:31 |
mwk | because it may indeed be interfaces, or it may be my screwup in refactoring the pass that detects const clocks in the first place | 11:32 |
mwk | hmmm | 11:35 |
mwk | as for the original bug, it seems to be an issue with opt_clean | 11:36 |
mwk | you're in luck, I have repairing this godforsaken pass on my immediate todo list | 11:37 |
tux3 | yay. still compiling | 11:37 |
tux3 | waiting on abc, it seems to take a lot longer on 4a05cad7f8a6ee57292e5360eb06305e13fc308b? | 11:39 |
tux3 | yosys is now using 9.5GB RAM and rising, starting to wonder if a BRAM turned into a reg =] | 11:43 |
tux3 | oh it's done. very different output/performance, but same result on 4a05cad7f8a6ee57292e5360eb06305e13fc308b | 11:45 |
mwk | ... weird | 11:46 |
tux3 | hhhhm | 11:46 |
tux3 | uh | 11:46 |
mwk | ... hmmm | 11:46 |
mwk | so I fixed the clean issue, and it no longer nukes the submodule, but... clock is still not connected | 11:46 |
mwk | not good | 11:46 |
tux3 | I'm saying it "failed" because i see ICESTORM_RAM: 12/ 32, which is what I had when my clock got removed | 11:46 |
tux3 | But uh, actually on 4a05 I have ICESTORM_LC: 138640/ 7680 1805% | 11:47 |
whitequark | try (*ram_block*) | 11:47 |
tux3 | So maybe it actually "worked!" | 11:47 |
whitequark | this forces the memory into a BRAM or fails the build | 11:47 |
mwk | okay so forget about the opt_clean thing | 11:48 |
mwk | that's a legitimate bug triggered by your testcase, but there clearly is another problem, in the SV frontend | 11:48 |
mwk | and it was about (* keep *) wires getting removed, so nothing that would matter for synthesis | 11:48 |
tux3 | assuming (*ram_block*) \n logic [data_width-1:0] mem [(1<<addr_width)-1:0]; is the correct syntax, synthesizing | 11:50 |
tux3 | ERROR: cell type '$mem' is unsupported (instantiated as 'foo.bram_rdata[1]_$mem_RD_DATA_4') | 11:50 |
mwk | ... that's not the best error message ever, but it does mean that blockram inference is failing | 11:51 |
tux3 | More info above in the log: https://paste.debian.net/1159736/ | 11:54 |
tpb | Title: debian Pastezone (at paste.debian.net) | 11:54 |
mwk | ... we'd really have to look into the design | 11:55 |
tux3 | Happy to send it over if it helps, but fair warning the code is uh not very good | 11:56 |
tux3 | It's just a toy riscv core at about 7k lines | 11:57 |
whitequark | compiler maintainers don't really care about code quality for the most part | 11:57 |
mwk | whatever it is, I've seen worse | 11:57 |
whitequark | also that | 11:57 |
daveshah | The worse the code is the more bugs it usually finds :) | 11:58 |
mwk | (and if not, I reserve the right to tell random people the war story) | 11:58 |
whitequark | at worst we might point out non-synthesizable constructs | 11:58 |
whitequark | but as long as it's all valid synthesizable verilog i honestly can't be bothered even thinking whether it's elegant or not | 11:58 |
daveshah | In terms of Verilog, dodgy async stuff is often interesting from a finding weird edge cases point of view | 11:58 |
whitequark | right, that's a different point of view :) | 11:58 |
tux3 | well my testbenches pass, and avhdl is happy with it, but it's entirely possible I accidentally have nonsense verilog | 11:59 |
tux3 | let me tar something up with a Makefile that doesn't require my custom tools to build | 11:59 |
mwk | for completeness, I've opened https://github.com/YosysHQ/yosys/pull/2337 for the opt_clean issue affecting your example in #1592, but... this actually doesn't fix the main problem | 12:02 |
tpb | Title: opt_clean: Fix module keep rules. by mwkmwkmwk · Pull Request #2337 · YosysHQ/yosys · GitHub (at github.com) | 12:02 |
mwk | (the device module & instantiation is now kept alive as expected, but the clock/reset lines are unconnected) | 12:03 |
tux3 | Okay here's a tarball: https://drive.google.com/file/d/13IuEX5lXghHEg0bkuVL0CTuMjyp-D8Gq/view?usp=sharing | 12:08 |
tux3 | I've kept the (*ram_block*) so this just fails to build, but the "correct" result is if nextpnr reports something like 28/32 BRAMS used (or, I guess, 1800% utilization) | 12:08 |
tux3 | the offending memory is at ./src/dev/axi4lite_sram.sv:42 | 12:09 |
mwk | tux3: well at least the memory inference failure is quite obvious | 12:20 |
mwk | you have an asynchronous read port, so using blockram isn't possible | 12:20 |
mwk | ideally you should read synchronously from memory in the same module where it is defined, but yosys with flatten gives you a bit more freedom | 12:21 |
mwk | it would be fine if you had an async read port directly feeding a register, but you have a problem at this line: wire [data_width-1:0] rdata = bram_rdata[bram_read_index]; | 12:22 |
tux3 | oh right, I wanted to move the reg outside the bram module, I was hoping flatten would see it | 12:22 |
mwk | this inserts muxes between the $mem and the $dff, preventing merging it | 12:22 |
tux3 | Makes sense | 12:22 |
tux3 | I can just make my bus output comb and put it after the mux, then. Shouldn't be a problem | 12:23 |
tux3 | Thanks, I should have known this =] | 12:23 |
*** craigo has left #yosys | 12:24 | |
mwk | I don't quite understand what you want to do | 12:24 |
tux3 | I guess writing `.unregistered(0)` at line 44 is a good enough "fix" to clear that issue, even if it's technically wrong | 12:25 |
mwk | yeah you'd have to do some pipeline fixing | 12:27 |
mwk | also I don't know why you're so carefully splitting the bram into bram_blocks, that's something yosys does for you | 12:27 |
tux3 | for my caches the block size is tied to tag size, addr space, etc, I had in mind to make it "portable" by having each arch set reasonable parameters | 12:30 |
tux3 | not sure if that makes sense, I just got used to building things up from blocks and arch-specific params | 12:31 |
tux3 | I'm still not sure how the async ram "works" with 4a05cad but on master the whole module gets optimized out. But I guess keeping the clean synchronous ram works all the time, so I'll just do that. | 12:37 |
tux3 | I really appreciate the help, thank you. (And sorry that I don't have a more interesting bug to show for it!) | 12:38 |
*** _whitelogger has quit IRC | 12:48 | |
*** _whitelogger has joined #yosys | 12:50 | |
*** emeb has joined #yosys | 13:20 | |
*** SpaceCoaster has quit IRC | 13:46 | |
*** maartenBE has quit IRC | 14:08 | |
*** maartenBE has joined #yosys | 14:10 | |
pepijndevos | daveshah, I plotted wire delay for gowin, and it seems offset of 0.5 is pretty good, but the proportional part is more like 0.05, correct? https://ibb.co/bXTnSPL Y is ns, X is wire length | 14:30 |
tpb | Title: delay — ImgBB (at ibb.co) | 14:30 |
pepijndevos | Assuming wire length is measured in grid units | 14:31 |
daveshah | Yes, that sounds believable | 14:35 |
daveshah | It is manhattan distance yeah | 14:35 |
pepijndevos | ok thanks | 14:36 |
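Editor's note: the estimate discussed above (an offset of about 0.5 ns plus about 0.05 ns per grid unit of Manhattan distance) can be written out as a small sketch. The two coefficients are the ones quoted in the conversation. The function names and the `(x, y)` tuple convention are assumptions made for illustration and do not come from nextpnr.

```python
def manhattan(src, dst):
    """Manhattan distance between two (x, y) grid locations."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def estimate_delay_ns(src, dst, offset_ns=0.5, per_unit_ns=0.05):
    """Linear delay estimate: offset + slope * Manhattan distance.

    The defaults are the rough Gowin figures quoted in the log; they are
    read off a plot, not fitted here.
    """
    return offset_ns + per_unit_ns * manhattan(src, dst)


# Example: a route from tile (0, 0) to tile (3, 4) spans 7 grid units.
example_ns = estimate_delay_ns((0, 0), (3, 4))
```

An estimate like this is only a placement-time guess; the router would still need real per-wire and per-pip delays for final timing.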
pepijndevos | hmmm, so gowin wires can be tapped at the ends and halfway, so I'm not really sure how that works timing wise. They don't list delays for e.g. 4 distance | 15:16 |
pepijndevos | Although... I'm not sure but probably the actual wire length doesn't matter so much as the parasitics of the wire | 15:21 |
pepijndevos | huh... I'm confused... so there is a tile full of muxes that select from many inputs to a fixed output. Does the pip delay correspond to that output, or to the input it selects from? | 15:30 |
pepijndevos | I'm assuming the former | 15:31 |
*** citypw has quit IRC | 15:50 | |
*** kraiskil has joined #yosys | 18:05 | |
*** thardin has quit IRC | 18:38 | |
*** _whitelogger has quit IRC | 19:12 | |
*** _whitelogger has joined #yosys | 19:14 | |
*** kraiskil has quit IRC | 19:17 | |
*** m4ssi has joined #yosys | 19:32 | |
*** Asu has quit IRC | 19:44 | |
awygle | has anybody tried replicating the numbers from table 3.20 in the ECP5 datasheet with nextpnr and/or diamond? | 20:10 |
awygle | it's a list of "basic functions" and their "register-to-register performance" | 20:10 |
daveshah | Nope | 20:11 |
daveshah | Not that I know of | 20:11 |
*** m4ssi has quit IRC | 20:13 | |
awygle | i got 207.3 MHz for a 64-bit adder as compared to the 441 MHz suggested in the datasheet, using nextpnr | 20:13 |
awygle | (and yosys, which is probably more relevant now that i think about it) | 20:14 |
awygle | abc9 improves it slightly to 217.34 MHz | 20:19 |
daveshah | It might be related to register packing or something | 20:19 |
daveshah | There may also be issues with parts being pulled apart by the connections to the IO | 20:19 |
awygle | could be. i have inputs->reg x2->adder->reg->output | 20:20 |
daveshah | That is probably OK | 20:20 |
daveshah | Is the speed grade the same as the datasheet? | 20:20 |
awygle | yep, i'm running --speed 8 | 20:22 |
awygle | and the table says "-8 timings" | 20:22 |
daveshah | I see | 20:22 |
daveshah | It's probably suboptimal register placement for some reason | 20:22 |
daveshah | But quite frankly I have bigger things to worry about than microbenchmark performance | 20:23 |
awygle | yeah, definitely | 20:23 |
awygle | i just was curious about achievable speeds | 20:23 |
awygle | i don't have a 64-bit wide datapath in my critical section anyway so it's more or less irrelevant | 20:23 |
daveshah | The register placement here is quite different to a real design anyway | 20:24 |
daveshah | As the influence from the IO is going to be much higher | 20:24 |
awygle | yeah | 20:25 |
*** FL4SHK has joined #yosys | 20:25 | |
awygle | i'm gonna increase the "registers between this and the I/O" to like 16 each, and see if that changes anything | 20:26 |
awygle | and then i'll be done playing with it | 20:26 |
daveshah | You could also try --out-of-context --placer sa to disable IO insertion (and work around the resulting probably singular matrix) | 20:27 |
awygle | ok, will do | 20:27 |
awygle | that got me to ~250 MHz | 20:28 |
awygle | that's close to what they have for "64 bit counter" | 20:28 |
awygle | so i wonder if their 64-bit adder is actually a half-adder | 20:28 |
*** cr1901_modern has quit IRC | 20:29 | |
daveshah | Yeah, a counter and adder shouldn't be so different | 20:29 |
awygle | on a 64-bit counter i got 263 MHz, so yeah i think it's fair to say they're cheating | 20:30 |
awygle | er, 269 MHz rather, so beating their claimed 263 MHz | 20:30 |
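Editor's note: the Fmax figures in this exchange are easier to compare as clock periods. The sketch below only converts the numbers quoted above (441 MHz from the datasheet table, 207.3 MHz measured with nextpnr). The helper name is hypothetical.

```python
def period_ns(fmax_mhz):
    """Convert a maximum clock frequency in MHz to a clock period in ns."""
    if fmax_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / fmax_mhz


# Figures quoted in the log for the 64-bit adder.
datasheet_ns = period_ns(441.0)  # roughly 2.27 ns
measured_ns = period_ns(207.3)   # roughly 4.82 ns

# Extra delay per cycle in the measured result compared to the datasheet claim.
gap_ns = measured_ns - datasheet_ns
```

Seen this way, the measured path is about 2.5 ns longer than the datasheet figure implies, which is consistent with the suspicion that the two numbers do not describe the same circuit.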
awygle | oh daveshah, do you have any idea about the edgeclk vs sclk question i asked the other day? the TN for high-speed I/O says the SCLK topology must be used for frequencies <250 MHz and the eclk must be used for frequencies >400 MHz, does that mean i can do whatever in between those two? | 20:34 |
daveshah | No idea | 20:34 |
daveshah | litedram also uses edge clock at 100MHz+ fine | 20:34 |
awygle | mk, i'll probably just design for eclk then | 20:35 |
daveshah | I didn't know that rule was even a thing | 20:35 |
awygle | makes the timing a lot looser | 20:35 |
awygle | oh but litedram is using the DQS stuff so it probably doesn't count come to think of it | 20:35 |
daveshah | maybe | 20:35 |
daveshah | depends what they mean by SCLK/ECLK topology really | 20:35 |
awygle | it's section 5 of TN-02035-1.2 that i'm looking at, if you feel compelled to investigate | 20:36 |
awygle | but don't worry about it on my account | 20:36 |
*** emeb_mac has joined #yosys | 20:41 | |
*** strongsaxophone has joined #yosys | 21:07 | |
*** kristianpaul has quit IRC | 21:14 | |
*** kristianpaul has joined #yosys | 21:15 | |
*** cr1901_modern has joined #yosys | 21:24 | |
*** kraiskil has joined #yosys | 21:26 | |
*** kraiskil has quit IRC | 21:30 | |
*** m4ssi has joined #yosys | 21:39 | |
*** m4ssi has quit IRC | 22:00 | |
*** cr1901_modern has quit IRC | 22:33 | |
*** cr1901_modern has joined #yosys | 22:34 | |
*** strongsaxophone has quit IRC | 22:41 | |
*** m4ssi has joined #yosys | 22:53 | |
*** m4ssi has quit IRC | 23:07 | |
*** lf_ has quit IRC | 23:15 | |
*** lf has joined #yosys | 23:16 | |
*** emeb has quit IRC | 23:43 |