Saturday, 2020-08-15

*** tpb has joined #litex00:00
*** Degi has quit IRC00:21
*** Degi has joined #litex00:21
*** _whitelogger has quit IRC02:03
*** _whitelogger has joined #litex02:05
*** CarlFK has quit IRC02:40
*** levi has quit IRC02:56
*** mithro has quit IRC02:59
*** guan has quit IRC03:02
*** jaseg has quit IRC03:02
*** jaseg has joined #litex03:04
*** rohitksingh has quit IRC03:09
*** bubble_buster has quit IRC03:11
*** rohitksingh has joined #litex03:12
*** bubble_buster has joined #litex03:13
*** rohitksingh has quit IRC03:20
*** bubble_buster has quit IRC03:21
*** rohitksingh has joined #litex03:24
*** bubble_buster has joined #litex03:24
*** guan has joined #litex03:25
*** mithro has joined #litex03:25
*** levi has joined #litex03:26
*** HoloIRCUser1 has joined #litex03:35
*** HoloIRCUser has quit IRC03:39
*** davidcorrigan714 has joined #litex03:51
*** _whitelogger has quit IRC05:09
*** davidcorrigan714 has quit IRC05:11
*** _whitelogger has joined #litex05:11
*** kgugala_ has joined #litex06:05
*** kgugala has quit IRC06:08
*** kgugala has joined #litex06:14
*** kgugala_ has quit IRC06:17
*** CarlFK has joined #litex06:28
*** kgugala has quit IRC06:38
*** kgugala has joined #litex06:38
*** tcal has quit IRC07:56
*** kgugala_ has joined #litex10:49
*** kgugala has quit IRC10:53
lkclthat's interesting.  nextpnr-ecp5 compiling microwatt didn't complete (overnight!) and has locked up solid after 15 hours, with 4,000 / 140,000 routes left.12:31
daveshahthe routeability of large ecp5 designs isn't great12:32
lkclannoying as i need it in order to test DDR3 memtest with 64-bit12:33
daveshahyou can try Rocket, it isn't POWER but in the default, non-Linux-ready config, I think it should fit slightly better12:35
lkclah that's rocket-chip (written in chisel3).  yeah12:36
lkclarg probably need an updated rv cross compiler... hm there should be debian packages12:38
lkclah interesting, upgrading to gcc-9 didn't help12:41
lkcllitex/litex/soc/software/bios/isr.c:56: Error: Instruction csrr requires absolute expression12:42
lkclbizarre-ness.  picorv32 csrr() and microwatt csrr() macros in their respective system.h are identical12:45
*** kgugala has joined #litex12:46
lkclbut it's on mtval (in software/bios/isr.c) which i've commented out12:47
lkcli *think* this is because mtval was retired (or is optional)12:48
*** kgugala_ has quit IRC12:48
lkclahh interesting.  i'm getting data corruption on the UART terminal (minicom)12:50
lkclthis is on libresoc 64-bit 55mhz with latest git master12:51
lkcldaveshah: urrr.... rocketchip build - Info: 9.8 ns logic, 33.0 ns routing13:07
lkclWarning: Max frequency for clock              '$glbnet$sys_clk': 23.36 MHz (FAIL at 55.01 MHz)13:07
lkclaaa i just want something to work! :)13:08
lkclthe axi4 to tilelink converter creates a morass of critical path dependencies13:09
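The nextpnr figures quoted above can be cross-checked by hand: a critical path of 9.8 ns logic plus 33.0 ns routing gives a minimum clock period of 42.8 ns, which is consistent with the 23.36 MHz warning. A minimal sketch of that arithmetic, assuming the reported maximum frequency is simply the reciprocal of the summed path delay (the function names are illustrative and are not part of nextpnr or LiteX):

```python
# Figures quoted from the nextpnr output in the log above.
LOGIC_DELAY_NS = 9.8
ROUTING_DELAY_NS = 33.0
TARGET_MHZ = 55.01


def fmax_mhz(logic_ns: float, routing_ns: float) -> float:
    """Maximum clock frequency in MHz for a critical path of the given delay."""
    period_ns = logic_ns + routing_ns
    return 1000.0 / period_ns  # 1000 / ns = MHz


def routing_share(logic_ns: float, routing_ns: float) -> float:
    """Fraction of the critical path delay spent in routing."""
    return routing_ns / (logic_ns + routing_ns)


if __name__ == "__main__":
    fmax = fmax_mhz(LOGIC_DELAY_NS, ROUTING_DELAY_NS)
    share = routing_share(LOGIC_DELAY_NS, ROUTING_DELAY_NS)
    verdict = "PASS" if fmax >= TARGET_MHZ else "FAIL"
    print(f"fmax = {fmax:.2f} MHz ({verdict} at {TARGET_MHZ} MHz)")
    print(f"routing accounts for {share:.0%} of the path")
```

Roughly three quarters of the delay is routing rather than logic, which fits the remark about long dependency chains through the bus converter.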
* lkcl is trying 50mhz nextpnr-ecp5 builds and that seems to work (picorv32)13:30
lkcli suspect that the libresoc build, at 55mhz, is just too close to the limit estimated by nextpnr... nope.  arse.  a 50mhz libresoc build gets the same UART corruption.13:31
lkclahh interesting13:39
lkcl_florent_: when i re-added kwargs["integrated_main_ram_size"] = 0x1000 the UART corruption (black square "?" being the predominant char) went away13:40
*** proteusguy has quit IRC15:08
*** proteusguy has joined #litex15:22
*** kgugala_ has joined #litex15:44
*** kgugala has quit IRC15:47
*** HoloIRCUser has joined #litex18:20
*** HoloIRCUser1 has quit IRC18:20
somlolkcl: I'm probably late to this party, but I managed to get 30 "official" MHz timing for my 5g-versa board with this command line:19:43
somlolitex-boards/litex_boards/targets/versa_ecp5.py --sys-clk-freq 60e6 --csr-data-width 32 --cpu-type rocket --integrated-rom-size 0x10000 --build19:43
somloI can't test it right now, but the "advertised" fmax is usually *way* lower than what one can get away with (nextpnr is typically overly conservative w.r.t. timing)19:44
somloI also had to remove "-abc9" here: https://github.com/enjoy-digital/litex/blob/master/litex/build/lattice/trellis.py#L57 (latest abc crashes about 50% of the time when called from yosys' abc9 stage)19:44
tpbTitle: litex/trellis.py at master · enjoy-digital/litex · GitHub (at github.com)19:44
somloI think I'm going to revert litex commit 6c298cb7 and make abc9 optional via a command line flag19:45
somloand if you have a non-5g versa board, you'll probably need to make additional tweaks to the litex build environment (but if you've already been building stuff for your versa board, you've likely already figured that part out)19:47
somloonce I free up my (currently busy) usb port, I can actually test the bitstream I built and see if it actually works, and if it manages to pass memtest, but that'll have to wait for a couple of hours :)19:48
lkclsomlo: ah thank you20:30
lkclahh interesting.  csr data width normally defaults to 8 bit20:32
lkcli wonder if that has any impact, here.20:33
somlolkcl: historically, I don't think csr-data-width has had a major impact on timing, placement, or utilization20:39
lkclok appreciated20:44
lkclaw poop, forgot the --device=LFE5UM :)20:48
somlooh, so you do have a non-5g versa, then :)20:58
lkclsomlo: for lack of being able to purchase a 5g version via mouser 3 months ago... yes20:59
somloI'm still ripping a multi-cd audiobook to mp3 for my daughter, and the one usb port I could use to hook up my dev board is currently occupied by the cd drive -- so I can't test myself for a while :)20:59
somlolkcl: I initially ordered a 5g versa, they shipped me a non-5g one, daveshah helped me figure out why it wasn't working the way I thought it should have -- long story short, it took another month or so to sort out the mess :)21:01
lkcldoh21:01
somloeventually they sent me the right one (this was via the lattice web store from the US)21:01
lkcloh cool21:02
lkclbuilt - and fails.21:03
lkcl./versa_ecp5.py --sys-clk-freq 50e6 --device=LFE5UM --csr-data-width 32 --cpu-type rocket --integrated-rom-size 0x10000 --build21:03
lkclWarning: Max frequency for clock              '$glbnet$sys_clk': 20.58 MHz (FAIL at 50.00 MHz)21:03
somloaaah, that21:04
somlotry pushing the bitstream to the board anyway, see what happens21:04
lkclbecause of the massive combinatorial chain between tilelink and axi-lite21:04
lkclnope, fail21:04
somloI consistently get "fails at 30 (or 40) MHz" on a trellisboard (after requesting 60MHz) and it works fine :)21:05
lkcli mean: it loads successfully, it just doesn't report anything on the USB console21:05
lkcl:)21:05
daveshahsomlo: I remember Rocket giving 50MHz+ once upon a time21:05
somloright, so then it *really* fails21:05
daveshahdid we ever figure out why that changed?21:05
somlodaveshah: not sure if it was rocket, or yosys/trellis/nextpnr, or some interaction between the changes made to each of them21:06
somloI remember a point in time when bumping the rocket chip git version caused utilization to go up by 6-7 %21:06
daveshahyeah, seems likely21:07
daveshahthat was probably almost a year ago that I remember that number from21:07
somlobut I didn't pay as much attention to timing. I think at some point (before that) I had been complaining about no longer meeting timing, and litex switched to "--timing-allow-fail" based on the "looseness" of that calculation in nextpnr...21:08
somlothese days I can get 60MHz working on the trellisboard most of the time, and the "linux" version of rocket will no longer fit on the versa no matter what (105+ % utilization), so I haven't tried with that board in the last several months21:10
lkclsomlo: do you happen to have _anything_ that has a 64 bit bus, that you know (last time you tried it) "works"?21:11
lkclnextpnr-ecp5 on microwatt took 16 hours to fail to complete routing (locked up with 5% nets left)21:12
somlolkcl: http://www.contrib.andrew.cmu.edu/~somlo/BTCP/#sec_2_2_121:18
tpbTitle: A Trustworthy, Free (Libre), Linux Capable, Self-Hosting 64bit RISC-V Computer (at www.contrib.andrew.cmu.edu)21:18
somlosome git versions of things I used back a while ago. sadly, I don't remember which litex commit that was at the time21:18
lkclcpu-variant "linuxd"?  what's that one?21:19
somlolkcl: there's several versions of rocket with different mem-axi port widths, matching the native width of litedram on various dev boards21:23
somloso we can hook rocket's memory port directly to litedram, point-to-point, and bypass the wishbone bus used for MMIO21:23
somlobut if we're looking for something we know used to work, I'd check out litex cca. 1 year ago, e.g. commit 4cc40aad21:24
somloalso check out one of the earlier (earliest) versions of pythondata-cpu-rocket to match21:25
somlolkcl: there's another trick I remember from a while back: using `--nextpnr-timingstrict` with `versa_ecp5.py`, so that it *properly* fails when not meeting timing21:34
somlothen running that in a `for` loop in the shell, until you get lucky and it succeeds :)21:34
somlobut that might take a while (days)21:34
somloand it's not guaranteed to work21:34
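The retry trick somlo describes could be sketched as follows. This is only an illustration: the command line is adapted from the ones pasted in this log, the helper names are hypothetical, and whether repeated runs actually produce different results depends on the toolchain varying its placement between runs, which this sketch does not control.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Command adapted from the log; script path and flags may need adjusting
# for a particular LiteX checkout and board variant.
BUILD_CMD = [
    "./versa_ecp5.py", "--sys-clk-freq", "50e6", "--device=LFE5UM",
    "--csr-data-width", "32", "--cpu-type", "rocket",
    "--integrated-rom-size", "0x10000",
    "--nextpnr-timingstrict", "--build",
]


def run_build(cmd: Sequence[str] = BUILD_CMD) -> int:
    """Run one build and return its exit code (0 means timing was met)."""
    return subprocess.run(list(cmd)).returncode


def retry_until_success(attempt: Callable[[], int],
                        max_attempts: int) -> Optional[int]:
    """Call attempt() until it returns 0; give up after max_attempts tries.

    Returns the 1-based number of the successful attempt, or None.
    """
    for n in range(1, max_attempts + 1):
        if attempt() == 0:
            return n
    return None


if __name__ == "__main__":
    winner = retry_until_success(run_build, max_attempts=20)
    print("no passing build" if winner is None else f"passed on attempt {winner}")
```

Capping the number of attempts keeps the loop from running for days unattended, which is the failure mode warned about above.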
lkclyyeah21:41
lkcland i'd have to then try to get libresoc working against that version of litex, to be able to make a fair comparison21:41
lkcli think... i am tempted to track down a nmigen 64-to-32-bit converter and throw that in front, before the data requests reach litex21:42
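What such a 64-to-32-bit converter has to do can be shown with a small behavioural model. This is plain Python, not nmigen, and it assumes a little-endian layout (low word at the lower address) with Wishbone-style byte selects; all names here are hypothetical.

```python
from typing import List, Tuple

# One 32-bit bus beat: (byte_address, data32, sel4)
Beat = Tuple[int, int, int]

MASK32 = 0xFFFFFFFF


def split_write64(addr: int, data: int, sel: int = 0xFF) -> List[Beat]:
    """Split one aligned 64-bit write into up to two 32-bit beats.

    sel is an 8-bit byte-enable mask; a half with no bytes enabled is skipped.
    """
    if addr % 8 != 0:
        raise ValueError("64-bit access must be 8-byte aligned")
    beats = []
    for half in range(2):
        sel32 = (sel >> (4 * half)) & 0xF
        if sel32 == 0:
            continue  # nothing to write in this half
        data32 = (data >> (32 * half)) & MASK32
        beats.append((addr + 4 * half, data32, sel32))
    return beats


def join_read64(lo: int, hi: int) -> int:
    """Reassemble a 64-bit read result from its two 32-bit halves."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


if __name__ == "__main__":
    for beat in split_write64(0x1000, 0x1122334455667788):
        print(tuple(hex(x) for x in beat))
```

In hardware the converter would also have to hold off the 64-bit side's acknowledge until both narrow beats have completed.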
lkclhey i got the LUT4s down from 20,000 to around 16,000 by adding in sync delays into the regfile reads, that gave nextpnr a chance to go "hey i can use BRAMS now"21:45
lkclyaay21:45
daveshahit's yosys that does that not nextpnr, jfyi21:45
lkcldaveshah: oh, ta :)21:46
daveshahand yeah, that will definitely help :)21:46
lkcli'm very happy.  it was... a small but hair-raising change21:46
daveshahthis is the kind of interesting difference between what is efficient on ASIC vs FPGA21:46
lkcli'm doing an out-of-order design, which (for now) has a FSM, however all the infrastructure is there to add the dependency matrices in21:47
lkclwhich means that there are *combinatorial* "Pipeline Managers" that expect operands to come in with a "operand is ok" plus "actual operand" in the *same* cycle21:47
lkclnot, as things are done in regfiles, on the *next* cycle (after the read-enable)21:48
lkclso, of course, to make life easier, i made the regfile combinatorial as well...21:48
lkclbaad move21:49
lkcldaveshah: well, it's more that i've never done this before.21:51
lkcli had:21:51
lkcl* an instruction decoder that was combinatorial and21:52
lkcl* a regfile read likewise and21:52
lkcl* a pipeline manager likewise...21:52
lkclno wonder i was only getting 20 mhz, a couple days ago :)21:52
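The difference between the two regfile styles discussed above comes down to when the read data appears. A minimal behavioural sketch in plain Python (not nmigen, and not the libresoc code; class names are illustrative): the combinatorial version returns data in the same cycle as the address, while the synchronous version registers the output, so data appears one clock later. That registered output is what lets synthesis map the memory onto block RAM.

```python
class AsyncReadRegfile:
    """Combinatorial read: data is available in the same cycle as the address."""

    def __init__(self, depth: int = 32):
        self.mem = [0] * depth

    def write(self, addr: int, value: int) -> None:
        self.mem[addr] = value

    def read(self, addr: int) -> int:
        return self.mem[addr]


class SyncReadRegfile:
    """Registered read: data shows up on rdata after the next clock edge."""

    def __init__(self, depth: int = 32):
        self.mem = [0] * depth
        self.rdata = 0  # output register

    def write(self, addr: int, value: int) -> None:
        self.mem[addr] = value

    def tick(self, raddr: int) -> None:
        """Model one clock edge: latch the addressed word into rdata."""
        self.rdata = self.mem[raddr]


if __name__ == "__main__":
    a = AsyncReadRegfile()
    a.write(3, 0xABCD)
    print(hex(a.read(3)))        # same-cycle result

    s = SyncReadRegfile()
    s.write(3, 0xABCD)
    print(hex(s.rdata))          # address presented, data not there yet
    s.tick(raddr=3)
    print(hex(s.rdata))          # data visible one cycle later
```

Any consumer that expects the operand and its "ok" flag in the same cycle as the request has to be adapted to that extra cycle of latency.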
lkclnuts to it, i'm going to try libresoc @ 75mhz22:00
lkclwhat's a DPR16X4?22:04
somlolkcl: just tried that bitstream on the 5g versa, and it works (requested 50MHz, "passed" at 24-and-change, works fine, including passing memtest, at 50 when programmed)22:08
daveshahlkcl: 16x4 bit distributed RAM22:09
daveshahie a cluster of 4 LUTs can be used as RAM instead of ROM22:09
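As a rough illustration of the 16x4 figure, the number of such primitives needed to tile a register file can be estimated from its dimensions. This sketch assumes one primitive per 16-entry by 4-bit tile, one full copy of the memory per read port, and counts only the 4 storage LUTs mentioned above; extra write-control and depth-muxing logic is ignored, and the function name is hypothetical.

```python
import math

DPR_DEPTH = 16   # entries per DPR16X4
DPR_WIDTH = 4    # bits per entry
LUTS_PER_DPR = 4  # storage LUTs only, per the description in the log


def dpr16x4_count(depth: int, width: int, read_ports: int = 1) -> int:
    """Estimate how many 16x4 distributed RAM primitives a memory needs."""
    tiles_deep = math.ceil(depth / DPR_DEPTH)
    tiles_wide = math.ceil(width / DPR_WIDTH)
    return tiles_deep * tiles_wide * read_ports


if __name__ == "__main__":
    n = dpr16x4_count(depth=32, width=64, read_ports=1)
    print(f"{n} primitives, about {n * LUTS_PER_DPR} storage LUTs")
```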
lkcldaveshah: thx22:10
lkclsomlo: hmm interesting.22:10
lkclwhat versions of... "stuff" are you using?22:10
lkclbecause if you use latest rocket-chip, they've removed mtval22:11
lkclso it's guaranteed that the litex bios won't compile22:11
somlohttps://imgur.com/a/wpFgTbu22:12
tpbTitle: Imgur: The magic of the Internet (at imgur.com)22:12
lkcllitex git a0c8bb5522:13
somlo litex 35929c0f yosys c39ebe6 trellis f93243b nextpnr b39a2a522:14
somlo"latest" to within the last week, or thereabouts22:14
lkclok got it22:15
lkcli've got a version of trellis and nextpnr from about a month ago22:16
somlolkcl: I would bet against that making a relevant difference, but one never knows for sure :)22:19
lkclwait... is there anything that's determined by software read-levelling that is passed over to hardware?22:22
lkcli get this:22:27
lkclRead leveling:22:27
lkclm0, b00: |0| delays: -22:27
somloI think `read_delay_inc()` is used to modify `dq` based on calibration tests22:27
lkclargh ok22:27
lkclso, the core i'm doing is currently a FSM, not a pipelined design22:28
somlobut I'm dangerously approaching the limits of my understanding of litedram :)22:28
lkclbecause the dependency matrices are massive and particularly complex, i've left them out for now (to add later)22:29
lkclproblem is: the CPU therefore runs at a 0.1 to 0.2 IPC22:29
lkclsomlo: i kiiinda get it.  i'll take a look22:30
daveshahGiven that SERV runs litedram calibration just fine and that is IPC <022:31
daveshah<0.122:31
daveshahthere shouldn't be an IPC dependency22:31
* sorear thinks about the implications of ipc<022:31
daveshahEven PicoRV32 is an IPC of only 0.2 ish, less depending on the memory bus22:31
lkcldaveshah: ok.  even stranger, then22:32
daveshahIs there any cache involved?22:33
lkcldaveshah: no, no caches22:33
lkcl(yet)22:33
daveshahGood, that eliminates one cause of ddr3 calibration issues22:34
daveshahMy guess is that something somewhere is going wrong in the bus interface22:35
lkcldaveshah: i did have an issue with the load/store interface, dropping the wishbone "ack" far too early22:36
lkcli did fix that (so that sim.py works)22:36
lkcltck-tck... i need to run some of microwatt's unit tests under the litex sim.py22:37
lkclhow do you specify an alternative binary to be loaded in litex (at address 0x000000)?22:38
daveshahNot sure, I haven't used sim.py much22:39
daveshahhttps://github.com/litex-hub/linux-on-litex-vexriscv/issues/84 might be interesting though. Simulating part of the DRAM interface.22:39
lkclmy guess is, specify a different "BIOS"22:39
tpbTitle: sim: use SDRAM DFI model for simulation · Issue #84 · litex-hub/linux-on-litex-vexriscv · GitHub (at github.com)22:39
daveshahYeah22:39
lkclthat's an interesting one, i think i'll try that22:41
lkclof course, disable --integrated_ram_size (doh)22:44
*** lf has quit IRC23:08
*** lf has joined #litex23:09
lkclahhh :)23:11
lkcldaveshah: good call.23:11
lkclmanaged very quickly to put something together23:12
lkclmicrowatt: passed23:12
lkcllibresoc: fail23:12
lkcl*now* i have something i can test/compare against23:12
lkcland, as an added bonus, get a vcd trace23:12
lkclexcellent call23:12
daveshahGreat!23:15
lkclthank you for that.  vcd file's going to be massive... one for investigating tomorrow.23:21
lkclbtw do you have some trellisboards available?23:22
daveshahNo, unfortunately not23:27
lkcldaveshah: ok.  i noticed the ULX3S has an 85k version _and_ SDRAM.23:28
daveshahYep23:29
daveshahShould be a bit simpler to debug than DDR3, too23:29
daveshahFrom memory it was running reliably for me even at 16MHz23:29
lkclyeah that sounds not unreasonable for SDRAM.  i'd really like to test opencores sdram before putting it into an actual ASIC23:31
lkclok enough.  00:31 here :)23:31
daveshahDitto :)23:32
