Thursday, 2019-06-20

[00:00] *** tpb has joined #yosys
[00:14] *** emeb has quit IRC
[00:17] *** emeb_mac has joined #yosys
[01:04] *** proteusguy has quit IRC
[01:06] *** gsi__ has joined #yosys
[01:08] *** gsi_ has quit IRC
[01:34] *** rektide has joined #yosys
[02:09] *** proteusguy has joined #yosys
[02:25] *** AlexDaniel has quit IRC
[02:29] *** zignig has joined #yosys
[02:39] *** citypw has joined #yosys
[02:47] *** adjtm has joined #yosys
[02:49] *** adjtm_ has quit IRC
[02:50] *** proteusguy has quit IRC
[02:53] *** PyroPeter has quit IRC
[03:03] *** proteusguy has joined #yosys
[03:06] *** PyroPeter has joined #yosys
[03:26] *** adjtm has quit IRC
[03:26] *** adjtm has joined #yosys
[03:38] *** X-Scale has quit IRC
[03:55] *** vonnieda has joined #yosys
[04:06] <bwidawsk> is there like a go-to RISC-V I should be working with? I'd been looking at the Wishbone VexRiscv
[04:08] <sorear> are you looking for anything in particular?
[04:09] <bwidawsk> sorear› just looking for something that will synthesize with diamond and yosys, and is "complex" to do a comparison
[04:23] *** rohitksingh has joined #yosys
[04:35] *** rohitksingh has quit IRC
[04:50] *** rohitksingh_work has joined #yosys
[06:05] *** _whitelogger has quit IRC
[06:07] *** _whitelogger has joined #yosys
[06:33] <corecode> ac
[06:33] <corecode> woops
[06:39] *** emeb_mac has quit IRC
[06:44] *** voxadam has quit IRC
[06:45] *** voxadam has joined #yosys
[07:02] *** dys has quit IRC
[07:17] *** gsi__ is now known as gsi_
[07:27] *** fsasm_ has joined #yosys
[07:32] *** dys has joined #yosys
[07:37] *** m4ssi has joined #yosys
[07:41] *** citypw has quit IRC
[07:42] *** citypw has joined #yosys
[08:41] *** fsasm_ has quit IRC
[09:06] *** vidbina has joined #yosys
[09:12] *** rohitksingh has joined #yosys
[09:27] *** dys has quit IRC
[09:41] *** vidbina has quit IRC
[09:46] *** AlexDaniel has joined #yosys
[09:47] *** dys has joined #yosys
[10:23] *** AlexDaniel has quit IRC
[11:00] *** citypw has quit IRC
[11:04] *** citypw has joined #yosys
[11:43] *** jakobwenzel has quit IRC
[11:44] *** jakobwenzel has joined #yosys
[11:50] <ZirconiumX> So, adding DFFSRs (in the form of the only AC-family chip I could find, the 74AC11074) actually reduces the overall chip count, despite the '74 only having 2 DFFSRs
[11:51] <ZirconiumX> (thanks daveshah)
[12:06] *** fsasm has joined #yosys
[12:13] *** shorne has joined #yosys
[12:21] *** AlexDaniel has joined #yosys
[12:24] *** rohitksingh has quit IRC
[12:32] *** proteusguy has quit IRC
[12:35] *** rohitksingh has joined #yosys
[12:44] *** AlexDaniel has quit IRC
[12:47] *** rohitksingh has quit IRC
[13:11] *** citypw has quit IRC
[13:13] *** rohitksingh has joined #yosys
[13:30] *** rohitksingh has quit IRC
[13:33] *** X-Scale has joined #yosys
[13:44] *** citypw has joined #yosys
[13:48] *** vonnieda has quit IRC
[13:52] *** rohitksingh_work has quit IRC
[13:53] *** fsasm has quit IRC
[13:59] *** fsasm has joined #yosys
[14:44] *** m4ssi has quit IRC
[14:58] *** emeb has joined #yosys
[15:03] *** AlexDaniel has joined #yosys
[15:06] *** rohitksingh has joined #yosys
[15:08] *** unkraut has quit IRC
[15:10] *** unkraut has joined #yosys
[15:23] <ZirconiumX> daveshah: My adder techmap pass seems to produce more Yosys warnings. Mind telling me how I fucked up this time?
[15:23] <ZirconiumX> https://github.com/ZirconiumX/74xx-liberty/blob/master/74_adder.v
[15:23] <tpb> Title: 74xx-liberty/74_adder.v at master · ZirconiumX/74xx-liberty · GitHub (at github.com)
[15:24] <ZirconiumX> ../74_adder.v:32: Warning: Range [3:0] select out of bounds on signal `\AA': Setting 1 MSB bits to undef.
[15:24] <ZirconiumX> ../74_adder.v:33: Warning: Range [3:0] select out of bounds on signal `\BB': Setting 1 MSB bits to undef.
[15:24] <daveshah> ZirconiumX: AA needs to be WIDTH-1:0 not Y_WIDTH-1:0
[15:24] <daveshah> same for BB
[15:25] <ZirconiumX> Woops, thank you
[15:26] *** emeb_mac has joined #yosys
[15:31] *** rohitksingh has quit IRC
[15:32] *** vonnieda has joined #yosys
[15:34] *** emeb_mac has quit IRC
[15:39] <ZirconiumX> daveshah: While reading through the synth_ice40 pass, I noticed that the iCE40 uses DFFE cells. Is the E here an enable or something?
[15:39] <daveshah> Yes
[15:39] <daveshah> Clock enable
[15:43] *** rohitksingh has joined #yosys
[15:48] <ZirconiumX> daveshah: So from some research a DFFE is essentially a transparent latch?
[15:48] <daveshah> ZirconiumX: No, that's a D latch
[15:48] <daveshah> A DFFE is effectively a flipflop with an AND gate on the clock (or a mux in front of the data input)
[15:48] <daveshah> *a D flipflop
[15:50] <tnt> "an AND gate on the clock" ... well ... don't implement it like that :p
[15:51] <ZirconiumX> I'm confused, then
[15:51] <tnt> If clock-enable is low, a dffe will ignore rising edges.
[15:51] <ZirconiumX> https://cdn.eeweb.com/articles/quizzes/dff-1293487103_180201_061807.png <-- one of these?
[15:51] <tnt> yes
[15:52] <tnt> 74AC377
[15:52] <tnt> Octal D-Type Flip-Flop with Clock Enable
[15:52] *** rohitksingh has quit IRC
[15:53] <ZirconiumX> Sadly that particular part was not made in the AC family
[15:54] <tnt> http://www.mouser.com/ds/2/149/74ac377-288934.pdf ?
[15:55] <ZirconiumX> Oh, seems TI are lying then :P
[15:55] <tnt> Well ... not all manufacturer make all parts in each family ...
[15:55] <tnt> each has its unique set of gates they make in each family.
[15:56] <ZirconiumX> True, I suppose
[15:56] *** rohitksingh has joined #yosys
[16:01] <ZirconiumX> Hmmm
[16:02] <ZirconiumX> What advantage does a DFFE have over a plain DFF?
[16:02] <ZirconiumX> Probably more flexibility, at least
[16:02] <daveshah> It saves a mux, in situations when you only update the DFF sometimes
[16:02] <daveshah> any HDL of the form if (a) q <= d maps to a DFFE nicely
[16:03] *** flammit_ has joined #yosys
[16:03] <ZirconiumX> Ah, I see
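The equivalence daveshah describes above (a DFFE behaves like a plain D flip-flop with a 2:1 mux feeding the old Q back into D) can be sketched as a small behavioural model. This is a toy Python model for illustration only, not Yosys code; the class and function names (`DFF`, `DFFE`, `dff_with_mux`, `traces_match`) are made up here.

```python
class DFF:
    """Plain D flip-flop: q takes the value of d on every rising clock edge."""

    def __init__(self, init=0):
        self.q = init

    def clock(self, d):
        self.q = d
        return self.q


class DFFE:
    """D flip-flop with clock enable: rising edges are ignored while en is low."""

    def __init__(self, init=0):
        self.q = init

    def clock(self, d, en):
        if en:
            self.q = d
        return self.q


def dff_with_mux(ff, d, en):
    """Same behaviour from a plain DFF plus a 2:1 mux choosing d or the old q."""
    return ff.clock(d if en else ff.q)


def traces_match(stimulus):
    """Clock both implementations with (d, en) pairs and compare every output."""
    a, b = DFFE(), DFF()
    return all(a.clock(d, en) == dff_with_mux(b, d, en) for d, en in stimulus)
```

Calling `traces_match` with any sequence of `(d, en)` pairs should return `True`, which is the sense in which mapping `if (a) q <= d` onto a DFFE saves the explicit mux.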
[16:04] *** citypw has quit IRC
[16:07] *** nengel has joined #yosys
[16:10] *** Wolf481pl has joined #yosys
[16:11] *** rohitksingh has quit IRC
[16:11] *** flammit has quit IRC
[16:11] *** ZipCPU has quit IRC
[16:11] *** Wolf480pl has quit IRC
[16:11] *** attie has quit IRC
[16:11] *** flammit_ is now known as flammit
[16:14] *** ZipCPU has joined #yosys
[16:14] *** rrika has quit IRC
[16:14] *** rrika has joined #yosys
[16:19] <ZirconiumX> daveshah: It looks like dfflibmap can't match/create DFFE cells. Is this correct, or am I just blind?
[16:20] <daveshah> Yeah, looked like no-one has ever implemented this
[16:20] <daveshah> It might be that DFFEs are more common in an FPGA context, which doesn't use dfflibmap
[16:21] <ZirconiumX> Presumably then I should use techmap for this instead?
[16:21] <daveshah> Yes
[16:23] <ZirconiumX> DFF vs DFFE in 74-series logic is going to be an interesting tradeoff that might not pay off; you're saving muxes, sure, but the 16373 lets you fit twice as many DFFs in a chip.
[17:08] <ZirconiumX> Well, assuming my math on this (broken) pass is correct, it *should* be a fairly major gain
[17:09] <ZirconiumX> https://github.com/ZirconiumX/74xx-liberty/commit/4fa6b83d
[17:09] <tpb> Title: (broken) DFF to 74AC377 DFFE pass · ZirconiumX/74xx-liberty@4fa6b83 · GitHub (at github.com)
[17:10] <ZirconiumX> This leaks $_DFFE_PP_ cells, though
[17:10] *** citypw has joined #yosys
[17:19] *** citypw has quit IRC
[17:28] *** proteusguy has joined #yosys
[17:29] <daveshah> ZirconiumX: you need a general techmap call (`-map +/techmap.v`) before you try to map the $_DFFE_PP_
[17:31] <ZirconiumX> daveshah: Ah, thank you
[17:32] <ZirconiumX> Before: 7729
[17:32] <ZirconiumX> After: 6734
[17:32] <ZirconiumX> That's pretty huge
[17:32] <daveshah> I would expect to see a significant drop in the number of MUX2s?
[17:33] <ZirconiumX> Indeed, we go from 1,316 to 876
[17:37] <tnt> wiring 6700 chips is still going to be fun :p
[17:38] <ZirconiumX> This is for the whole benchmark
[17:40] <ZirconiumX> Biggest winner is axilxbar, with about 25% less gates
[17:41] <ZirconiumX> PicoRV32 is currently at 1,532 gates
[17:42] *** rohitksingh has joined #yosys
[17:45] <ZirconiumX> daveshah: Actually, I just had a thought. Yosys would expect each individual DFFE to have its own enable bit, but the 74AC377 has a single enable bit for 8 flops
[17:45] <ZirconiumX> So this would be technically incorrect, right?
[17:45] <ZirconiumX> Or at least, modelled incorrectly
[17:48] <daveshah> ZirconiumX: the iCE40 flipflops are similar (as are most FPGAs)
[17:48] <daveshah> have a look at how tnt implemented dffe_min_ce_use in synth_ice40
[17:54] <ZirconiumX> Ah, thank you, daveshah
[17:54] <ZirconiumX> It's still an improvement, but very much less so
[17:55] <ZirconiumX> At 7563 chips, currently
[18:11] <ZirconiumX> Adding an opt_merge before unmapping like synth_ice40 does helped bring that down to 7378
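The shared-enable constraint discussed above can be sketched as a packing estimate: flops may only share a 74AC377 if they share an enable net, and enables that drive too few flops can be handed back to a plain DFF plus mux. This is a rough Python illustration of the idea, not the actual synth_ice40 logic; `plan_dffe_chips` and its `min_ce_use` parameter are hypothetical names chosen to echo the option mentioned in the chat.

```python
from collections import Counter
from math import ceil

FLOPS_PER_CHIP = 8  # a 74AC377 holds eight flops that share one enable pin


def plan_dffe_chips(enable_of_flop, min_ce_use=1):
    """Estimate packing for a mapping of flop name -> enable net name.

    Returns (dffe_chips, plain_flops). Flops whose enable net drives fewer
    than min_ce_use flops are counted as plain DFFs (to be built with a mux).
    """
    loads = Counter(enable_of_flop.values())
    dffe_chips = 0
    plain_flops = 0
    for count in loads.values():
        if count < min_ce_use:
            plain_flops += count
        else:
            # Each enable net needs its own chips, filled eight flops at a time.
            dffe_chips += ceil(count / FLOPS_PER_CHIP)
    return dffe_chips, plain_flops


# Ten flops on one enable net and a single flop on another.
example = {f"a{i}": "en_a" for i in range(10)}
example["b0"] = "en_b"
```

With the example above, `plan_dffe_chips(example)` spends a whole chip on the lone `en_b` flop, while `plan_dffe_chips(example, min_ce_use=4)` returns that flop to the plain-DFF pool instead. That is one way to see why the gain shrinks once the shared enable is modelled.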
[18:12] *** maikmerten has joined #yosys
[18:57] *** rohitksingh has quit IRC
[18:58] <ZirconiumX> I'm reading the "memory_bram" documentation (as Clifford suggested); what is a transparent read?
[18:58] <ZirconiumX> (of SRAM)
[18:58] <daveshah> A transparent read is where the read port will reflect writes in the current clock cycle (aka read after write)
[19:00] <ZirconiumX> So if you write X to address Y on one port and simultaneously command a read from address Y, SRAM is transparent is you get X out?
[19:00] <ZirconiumX> *if
[19:00] <daveshah> Yes
[19:01] <daveshah> Yosys can fake it with a mux if the SRAM isn't capable natively
[19:01] <ZirconiumX> The SRAM I'm looking at at the moment appears to stall the read if you do that
[19:01] <ZirconiumX> Is that transparent?
[19:01] <daveshah> That sounds like not transparent, ie read before write
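The difference between a transparent read and read-before-write, and the mux trick daveshah mentions, can be illustrated with a toy memory model. This is a simplified Python sketch under the assumption of one read and at most one write per cycle; `SyncRam` and `transparent_via_mux` are illustrative names, not anything from Yosys.

```python
class SyncRam:
    """Toy single-cycle RAM with one read port and one optional write port."""

    def __init__(self, depth, transparent):
        self.mem = [0] * depth
        self.transparent = transparent

    def cycle(self, raddr, waddr=None, wdata=None):
        """Run one clock cycle and return the data seen on the read port."""
        old = self.mem[raddr]          # value stored before this cycle's write
        if waddr is not None:
            self.mem[waddr] = wdata
        # Transparent: the read reflects this cycle's write.
        # Read-before-write: the read sees the previous contents.
        return self.mem[raddr] if self.transparent else old


def transparent_via_mux(ram, raddr, waddr=None, wdata=None):
    """Emulate a transparent read on a read-before-write RAM with a bypass mux."""
    data = ram.cycle(raddr, waddr, wdata)
    # On an address collision, forward the write data instead of the stale read.
    return wdata if waddr == raddr else data
```

A part that stalls on a collision, like the one discussed here, fits neither branch of this model, which matches daveshah's remark below that it does not map cleanly onto either behaviour.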
[19:02] <ZirconiumX> https://www.idt.com/document/dst/713242-datasheet
[19:03] <ZirconiumX> My plan with this is to designate one port as write and one as read
[19:03] <daveshah> I'm not sure if this really fits one way or another
[19:04] <daveshah> Yosys doesn't have a concept of BRAM stalling - this wouldn't map from Verilog well either
[19:04] <ZirconiumX> So this chip wouldn't work?
[19:04] *** rohitksingh has joined #yosys
[19:05] <daveshah> You'd probably need to be a bit clever with how you drove it
[19:05] <daveshah> Read on one clock cycle and write on the other or something, so you didn't have the collision
[19:07] <ZirconiumX> Yeah, it'd need some anti-collision circuitry
[19:07] <ZirconiumX> Or even properties to verify collisions could not happen
[19:14] *** dys has quit IRC
[19:16] *** dys has joined #yosys
[19:32] <ZirconiumX> So, I managed to coerce memory_bram into working by fooling it into thinking the write port is clocked
[19:33] <ZirconiumX> 1,009 chips, even though the RAM chips are very underused
[19:40] *** jevinskie has joined #yosys
[19:43] <ZirconiumX> 6,140 ICs
[19:44] <ZirconiumX> Oh, hey jevinskie
[19:45] <jevinskie> Howdy!
[20:12] *** m4ssi has joined #yosys
[20:22] *** m4ssi has quit IRC
[20:32] <bwidawsk> daveshah› what is a SLICE in this context after PNR?
[20:33] <daveshah> bwidawsk: a unit of two LUT4s, two flipflops, two MUX2s and two bits of carry logic
[20:34] <bwidawsk> ah, this is like an ALM in altera parlance
[20:34] <bwidawsk> thanks
[20:34] <daveshah> Yup
[20:37] *** maikmerten has quit IRC
[20:51] <bwidawsk> daveshah› interestingly, synthesis time alone is a decent amount faster on blinky with gcc over clang
[20:52] <bwidawsk> roughly 25% faster (granted we're talking ~2s here)
[20:56] <daveshah> Interesting
[20:57] <daveshah> Would be good to know if that applies to bigger benchmarks too
[21:04] <bwidawsk> daveshah› I'll try to provide that info after I figure out how to get the same data from diamond on blinky
[21:10] *** Thorn has quit IRC
[21:11] *** Thorn has joined #yosys
[21:29] <bwidawsk> daveshah› actually, I had it backwards - clang is faster, and it's more like 10%
[21:29] <bwidawsk> it does fluctuate a bit...
[21:40] *** SpaceCoaster has quit IRC
[21:47] <bwidawsk> not sure if I did something wrong, but diamond and yosys use the same number of luts, but diamond uses half the number of slices
[22:00] <bwidawsk> daveshah› https://0x0.st/zedC.txt
[22:01] <daveshah> This is probably because nextpnr's packing density is pretty poor
[22:02] <daveshah> This isn't counting carries (CCU2Cs) which take up a slice too
[22:03] <bwidawsk> daveshah› does it make sense to add that?
[22:03] <daveshah> Yes
[22:05] <bwidawsk> daveshah› in nextpnr side, it's already counted by TRELLIS_SLICE, correct?
[22:05] <daveshah> Yes
[22:05] <bwidawsk> that brings it up to 184 vs. 232 then
[22:06] * bwidawsk needs to add a cost column :P
[22:06] <daveshah> I was referring to the synthesis side, in terms of LUT usage
[22:06] <daveshah> Diamond SLICEs should already include CCU2s too
[22:07] <bwidawsk> I have this
[22:07] <bwidawsk>   Number of SLICEs:       117 out of 41820 (0%)
[22:07] <bwidawsk>       SLICEs as Logic/ROM:    117 out of 41820 (0%)
[22:07] <bwidawsk>       SLICEs as RAM:            0 out of 31365 (0%)
[22:07] <bwidawsk>       SLICEs as Carry:         67 out of 41820 (0%)
[22:07] <bwidawsk>    Number of LUT4s:        233 out of 83640 (0%)
[22:07] <bwidawsk>       Number used as logic LUTs:         99
[22:07] <bwidawsk>       Number used as distributed RAM:     0
[22:07] <bwidawsk>       Number used as ripple logic:      134
[22:07] <bwidawsk>       Number used as shift registers:     0
[22:07] <daveshah> So that is equivalent to 117 TRELLIS_SLICE
[22:08] <daveshah> I'm curious what the Yosys output is
[22:08] <bwidawsk> daveshah› https://0x0.st/zedC.txt
[22:08] <bwidawsk> oops
[22:09] <bwidawsk> daveshah› https://0x0.st/zenc.txt
[22:10] <daveshah> So the total number of LUTs Yosys has inferred is 233 + 74*2
[22:11] <daveshah> Yosys doesn't include the two LUT4s in the CCU2C in its statistic, whereas I believe Diamond does
[22:11] <bwidawsk> so quite a bit worse then, huh?
[22:12] <bwidawsk> let me post the entirety of the map output
[22:12] <bwidawsk> daveshah› https://0x0.st/zenm.txt
[22:13] <daveshah> Yeah, Yosys has some serious area issues for ECP5 at the moment
[22:14] <daveshah> Mostly because the lack of proper LUT timings in ABC make it much too eager to use muxes to build large LUTs
[22:15] <bwidawsk> if I'm trying to paint open tools in a good light, should I do ice40 then?
[22:15] <daveshah> I expect you'll find it much the same
[22:15] <bwidawsk> :/
[22:16] <daveshah> Probably a bit better, but we still definitely lag behind
[22:16] <daveshah> Things will pick up once this PR is merged https://github.com/YosysHQ/yosys/pull/1098
[22:16] <tpb> Title: WIP "abc9" pass for timing-aware techmapping (experimental, FPGA only, no FFs) by eddiehung · Pull Request #1098 · YosysHQ/yosys · GitHub (at github.com)
[22:16] <daveshah> But it might be a month or two
[22:17] <bwidawsk> for my sake, would it make sense to just merge it and try that out?
[22:17] <bwidawsk> I don't care if it's in master so long as I can say in good faith, it will be
[22:18] <daveshah> It's probably not giving a massive improvement yet, there is still some work that isn't even pushed
[22:18] <daveshah> If you do try it, you'll need to add -abc9 to synth_ecp5 and synth_ice40
[22:18] <bwidawsk> daveshah› actually, if you read the notes from the diamond log, it seems like they misreport the total number of luts
[22:18] <bwidawsk>    Notes:-
[22:18] <bwidawsk>       1. Total number of LUT4s = (Number of logic LUT4s) + 2*(Number of
[22:18] <bwidawsk>      distributed RAMs) + 2*(Number of ripple logic)
[22:19] <bwidawsk> it's a bit confusing
[22:19] <daveshah> That is equivalent to in Yosys doing LUT4 count + 2*CCU2 count
[22:20] <daveshah> If you want to see Yosys doing better in area, you can try adding -nomux to synth_ecp4
[22:20] <daveshah> *synth_ecp5
[22:20] <bwidawsk> for diamond then, i should be doing 99 + 2 * 134, correct?
[22:21] <daveshah> No the Diamond number of LUT4s is correct
[22:21] <daveshah> Yosys has done badly and there's no escaping
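The LUT accounting argued over above can be checked with a few lines of arithmetic. This Python sketch takes the figures quoted in the chat at face value (233 LUT4 cells and 74 CCU2C cells from Yosys; 99 logic LUTs and 134 ripple-logic LUTs from Diamond); the helper name `luts_in_diamond_terms` is made up for illustration.

```python
from math import ceil

LUTS_PER_CCU2C = 2   # each carry cell contains two LUT4s
LUTS_PER_SLICE = 2   # a SLICE holds two LUT4s, per daveshah's description


def luts_in_diamond_terms(lut4_cells, ccu2c_cells):
    """Convert a Yosys cell count into Diamond-style total LUT4 usage."""
    return lut4_cells + LUTS_PER_CCU2C * ccu2c_cells


# Yosys side: LUT4 cells plus the LUTs hidden inside carry cells.
yosys_total = luts_in_diamond_terms(233, 74)

# Diamond side: the ripple-logic figure is already a LUT count
# (67 carry SLICEs * 2), so it is added once, not doubled again.
diamond_total = 99 + 134

# Lower bound on SLICEs needed to hold Diamond's LUTs.
diamond_min_slices = ceil(diamond_total / LUTS_PER_SLICE)

print(yosys_total, diamond_total, diamond_min_slices)
```

Under these assumptions the comparison is 381 LUT4s for Yosys against 233 for Diamond, and 233 LUT4s need at least 117 SLICEs, which matches the SLICE count in the Diamond report pasted above.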
[22:22] <bwidawsk> and presumably, Fmax goes up if you fix those somewhat
[22:23] *** AlexDaniel has quit IRC
[22:24] <daveshah> Yes
[22:24] *** fsasm has quit IRC
[22:25] <daveshah> Do you have the design somewhere?
[22:25] <bwidawsk> it's your blinky from prjtrellis
[22:41] <daveshah> bwidawsk: I'll push a PR tomorrow, turns out the subtractor mapping in Yosys was very suboptimal and hurting that design particularly badly
[22:42] <daveshah> Should be down to more like 284 LUT4s in Diamond terms
[22:42] <bwidawsk> daveshah› thanks!
[22:42] <bwidawsk> if you add me to the cc, I would be happy to test it
[22:50] *** vonnieda has quit IRC
[22:59] *** jevinskie has quit IRC
[22:59] *** jevinskie has joined #yosys
[23:14] <bwidawsk> sadly the soc_ecp5_evn project doesn't just work in diamond
[23:25] *** rohitksingh has quit IRC
[23:41] <bwidawsk> looks like it doesn't like how EHXPLLL is instantiated
[23:53] <bwidawsk> it all looks right afaict...
