Wednesday, 2018-06-13

*** tpb has joined #vtr-dev00:00
*** digshadow has quit IRC00:49
*** digshadow has joined #vtr-dev00:50
mithrodaveshah / digshadow / kem_: Just tested the output on real hardware and it worked!02:26
mithroAccording to icetime -- vpr - Total path delay: 7.28 ns (137.33 MHz), arachne - Total path delay: 10.84 ns (92.21 MHz)02:27
mithroHowever I'm not sure that makes much sense because I just have "random" values in the ice40 description... I guess it does just try and minimize them....02:27
mithroTook me a while to realize that iceblink != icestick02:32
digshadowmithro: nice! thats great news02:50
mithrodigshadow: Did you get anywhere with those rr_graph fixes?02:50
digshadowno02:52
*** digshadow has quit IRC03:00
*** digshadow has joined #vtr-dev03:04
*** digshadow has quit IRC03:33
kem_mithro: Awesome!12:39
kem_mithro: I would add, that if the timing model you've given VPR in the architecture file/RR graph is effectively 'random' then I'd expect to see further performance improvements with an accurate timing model12:40
mithrokem_: Yeah - it's mostly random I would say16:50
mithrokem_: when you have a moment, I have a question about clustering16:57
daveshahmithro: Congrats on the PnR!17:56
daveshahLet me know when you want help with icetime stuff17:57
daveshahI might need to learn a bit myself, but I can always ping clifford17:57
mithrodaveshah: How's now :-P17:57
daveshahSure, I can start giving you some pointers now17:57
mithrodaveshah: So there are two major parts right17:57
mithrodaveshah: The "routing timing" and the "internal to a block" timing17:58
daveshahYes, exactly17:58
mithrodaveshah: Where is the data stored?17:58
daveshahAlthough icetime uses the same model for both of those17:58
daveshahFor the routing, cells are used for buffers/muxes too17:58
mithroby "model" what does that mean?17:58
daveshahThe data is at https://github.com/cliffordwolf/icestorm/blob/master/icefuzz/timings_hx1k.txt17:59
tpbTitle: icestorm/timings_hx1k.txt at master · cliffordwolf/icestorm · GitHub (at github.com)17:59
daveshahmithro: the model in both cases is a cell17:59
daveshahWith rising and falling delays from each input to output plus setup and hold checks if applicable17:59
daveshahYou can make an HTML database view with the icefuzz makefile18:00
mithrodaveshah: By model - you just mean an object which has a path delay between In / Out  + checks?18:00
daveshahmithro: yea18:00
daveshahThe checks only apply if clocked of course18:00
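The cell model daveshah describes here (rising and falling delays per input-to-output arc, plus setup and hold checks on clocked cells) could be held in a structure like the one below. This is only a minimal sketch: the class names, field names, pin names and numeric values are hypothetical and are not taken from icetime or from the timing database.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Delay:
    """Rising and falling delay of one input-to-output arc."""
    rise: float
    fall: float


@dataclass
class TimingCell:
    """Hypothetical container for one timing cell."""
    name: str
    # (input pin, output pin) -> delay of that arc
    arcs: Dict[Tuple[str, str], Delay] = field(default_factory=dict)
    # data pin -> (setup, hold); left empty for purely combinational cells
    checks: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def is_clocked(self) -> bool:
        # Setup/hold checks only exist on clocked cells.
        return bool(self.checks)

    def worst_delay(self, src: str, dst: str) -> float:
        # Pessimistic delay of an arc: the slower of rise and fall.
        arc = self.arcs[(src, dst)]
        return max(arc.rise, arc.fall)


# Placeholder numbers, purely illustrative.
lut = TimingCell("lut", arcs={("in0", "out"): Delay(rise=0.4, fall=0.5)})
ff = TimingCell(
    "ff",
    arcs={("clk", "q"): Delay(rise=0.6, fall=0.6)},
    checks={"d": (0.2, 0.0)},
)
```

The same structure covers both the "internal to a block" cells and the routing buffers/muxes, which matches the point that icetime uses one model for both.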
mithrodaveshah: That isn't published on the site?18:00
daveshahmithro: No, I'll ping clifford18:00
daveshahFor what it's worth, using cells for IO muxes is a fairly primitive model. More advanced approaches would use a loading model, etc18:02
daveshahBut it's basically how icecube2 does it too18:02
mithrodaveshah: Lets start with primitive :-P18:03
daveshahThere are some heuristics as to which cell to pick for a connection in icecube, and to an extent in icetime too I think18:03
daveshahPrimitives are definitely where to start18:03
daveshahMay as well make a script to get the timing data and put it into the XML18:03
mithrodaveshah: I actually mean let's start with simple modelling :-)18:03
mithrodaveshah: I was going to just hand port over some values first to see how it maps18:04
daveshahWell, we don't have any data for the iCE40 for anything more anyway18:04
mithrodaveshah: Then once I have a hand coded version write a script to generate one18:04
daveshahSounds good18:04
sorearLoading model = distributed RC?18:04
daveshahYeap18:05
daveshahIn some way or another18:05
daveshahNeed to poke about with ecp5 timings18:05
daveshahEach wire can have only 0, 1 or 2 loads in ecp5 before a buffer18:05
daveshahSo loading model is either really simple or not needed at all18:05
daveshahOtherwise it's very regular so it's just a case of finding the delay cost of every mux arc and through every cell18:06
daveshahBit of linear algebra needed because you can't run the vendor timing analysis over a partial path, so you'll have to create lots of full paths and compare18:06
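The "bit of linear algebra" can be sketched as follows: each full path reported by the vendor tool gives one equation (the sum of the delays of the arcs it uses equals the reported total), and enough independent paths let you solve for the individual arc delays. The function name and the example numbers below are made up for illustration. It assumes exactly as many independent paths as unknown arcs; a real flow would more likely collect extra paths and fit them with least squares.

```python
from typing import List, Sequence


def solve_arc_delays(paths: Sequence[Sequence[float]],
                     totals: Sequence[float]) -> List[float]:
    """Solve for per-arc delays from full-path delays.

    paths[i][j] is how many times path i uses arc j, and totals[i] is the
    reported delay of path i. The system must be square and non-singular.
    """
    n = len(totals)
    # Augmented matrix [paths | totals].
    a = [[float(x) for x in row] + [float(t)] for row, t in zip(paths, totals)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("paths do not determine every arc delay")
        a[col], a[pivot] = a[pivot], a[col]
        # Gauss-Jordan step: clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]


# Three hypothetical arcs, three paths that each use two of them.
paths = [
    [1, 1, 0],  # path 0 uses arcs 0 and 1
    [0, 1, 1],  # path 1 uses arcs 1 and 2
    [1, 0, 1],  # path 2 uses arcs 0 and 2
]
totals = [3.0, 5.0, 4.0]
delays = solve_arc_delays(paths, totals)
```

If the chosen paths always use two arcs together, the system is singular and the function raises, which is a hint that more varied paths are needed.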
daveshahmithro: BTW comparing the icetime Verilog output to the routing is probably a good place to start working out where different timing cells are inserted18:09
mithrodaveshah: Maybe we should start with the FF and LUT timing?18:09
daveshahmithro: definitely18:09
daveshahThey are the LogicCell40 entry in the timing data18:10
daveshahhttps://github.com/cliffordwolf/icestorm/blob/master/icefuzz/timings_hx1k.txt#L5018:10
tpbTitle: icestorm/timings_hx1k.txt at master · cliffordwolf/icestorm · GitHub (at github.com)18:10
mithroAll the cells have the same timing?18:10
daveshahNumbers are min:typ:max18:10
daveshahYep, it's a repeated structure after all18:11
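For a script that ports values into the XML, a first step is splitting a min:typ:max field into its three numbers. The sketch below only handles a single field; the sample value is invented, and how the fields are laid out on each line of timings_hx1k.txt should be checked against the file itself.

```python
from typing import NamedTuple


class DelayTriple(NamedTuple):
    minimum: float
    typical: float
    maximum: float


def parse_triple(text: str) -> DelayTriple:
    """Parse one 'min:typ:max' field into three floats."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected min:typ:max, got {text!r}")
    minimum, typical, maximum = (float(p) for p in parts)
    return DelayTriple(minimum, typical, maximum)


# Hypothetical field value, not copied from the real database.
triple = parse_triple("1.0:1.5:2.0")
```

Which of the three values to feed into the architecture file is a separate decision; using the maximum would be the pessimistic choice.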
mithrodaveshah: Are there any simulation models for the stuff icetime outputs?18:11
daveshahmithro: yes, inside icecube18:11
daveshahI don't think there are any open source ones18:11
mithrodaveshah: But we don't have any open source ones?18:11
mithrodaveshah: From what I can see of the Artix-7 there are about 7 models for the slices depending on what is near the slice...18:12
daveshahmithro: luckily not the case in ice4018:12
daveshahmithro: the icetime verilog output is intended for use with a STA tool, not a simulator18:12
daveshahhence the lack of open models (thats what icebox_vlog is for)18:12
mithroSTA?18:12
daveshahstatic timing analysis18:12
daveshahi.e. opentimer18:12
daveshahwhich we hope will be the open source sign-off timing tool18:13
daveshahof SymbiFlow18:13
mithrodaveshah: VPR seems to output the right post routing data needed for opentimer from what I can see18:14
daveshahmithro: oh, thats18:14
daveshah*thats nice18:14
daveshahwe have been wanting to try opentimer at some point18:14
daveshahI think icetime+opentimer should work somehow, but may never have been tested...18:15
daveshahyou might need to get the delays into the right format, but the netlist should be OK at least18:15
mithrodaveshah: VPR outputs the .v and .sdf with specify annotations18:18
daveshahmithro: not sure if OpenTimer does support sdf actually18:19
daveshahsomething to investigate in the future18:20
kem_daveshah: I've actually worked a little bit with OpenTimer. It basically expects the standard ASIC file formats: verilog netlist, liberty cell delay models, and SPEF parasitics (wire resistance/capacitance).19:05
daveshahkem_: how was your overall experience? Do you think it would be something useful to add at the end of a VPR flow for more advanced analysis?19:07
kem_daveshah: It's a very ASIC oriented tool. The 'signoff' timing analyzer in FPGA vendor tools is usually quite a bit simpler, since the vendor's FPGA timing model is fairly abstracted from the real devices anyway.19:08
daveshahkem_: good to know. Clifford and I have been considering it but it sounds like it will probably be overkill and hard to get the data into the right format19:09
kem_daveshah: Yeah, I agree likely overkill19:10
kem_daveshah: I actually wrote a new STA engine for VPR which is designed to be a modular library re-usable in other tools19:10
kem_mithro: If you have clustering questions I'm around for the next half-hour or so19:16
mithrokem_: so with clustering, the clustering code understands that it can't route these atoms in the same cluster but it doesn't seem to start a new cluster for them - it just fails19:17
kem_mithro: Have you tried with --debug_clustering on?19:18
kem_mithro: It should try to open a new cluster if they didn't fit previously19:18
mithroYeah, that is how I know it says the routing is invalid19:18
kem_mithro: It should also retry in all the top-level block-types which are potentially legal for that primitive type19:19
mithrokem_: Yeah - that is what I thought the code should do -- but it doesn't seem to be....19:19
kem_mithro: Is it possible the atom is not legal in any of block types? That would be the case where it would give-up19:20
kem_mithro: Perhaps an architecture specification bug?19:20
mithrokem_: entirely possible19:21
daveshahmithro: what are you trying to pack19:23
mithrodaveshah: Your FF example19:23
daveshahmithro: that is of course a bit of a torture test19:23
daveshahWhich FF is failing?19:24
mithrokem_: Let me get the debug clustering output19:25
kem_mithro: Sure19:25
kem_mithro: Also does the architecture use any pack-patterns?19:25
mithrokem_: Not really19:26
mithrokem_: I should see that as atoms which have been built into molecules right?19:26
kem_mithro: Correct19:26
mithroI'm not seeing any molecules in the prepack .echo file19:27
mithrohttps://www.irccloud.com/pastebin/Zpq2gMiK/19:29
tpbTitle: Snippet | IRCCloud (at www.irccloud.com)19:29
mithrokem_: The SB_DFFNESS can't be packed into the same cluster as the SB_DFF19:30
kem_mithro: That's the architectural constraint right? That posedge/negedge can't be in the same block?19:31
daveshahkem_: yes the NegClk config bit is set per tile not per cell19:32
kem_mithro: Ok, how are you modelling that in your arch file?19:32
kem_mithro: The way I'd go about it is to have two top level <mode>s, one for posedge (containing the SB_DFF instances), and another mode for negedge (containing the SB_DFFNE)19:33
mithrokem_: A model which has two modes, one which connects CLK to POSCLK and one which connects CLK to NEGCLK -- the POSCLK is then routed to the SB_DFF and NEGCLK to the SB_DFFN19:34
mithrokem_: Why isn't the $auto$simplemap.cc:420:simplemap_dff$86 being moved to "complex block 5" ?19:35
kem_mithro: I think it has to do with how the architecture is being modelled19:36
mithrokem_: FYI the prepacking output looks like this19:36
mithrohttps://www.irccloud.com/pastebin/jiXHStdd/19:36
tpbTitle: Snippet | IRCCloud (at www.irccloud.com)19:37
kem_mithro: It sounds like you have a mode per-DFF instance in the cluster, so the packer assumes it is legal to 'place' it there. However it is illegal due to the routing constraints, and it gives up since it now thinks that *any* cluster will be illegal19:37
kem_mithro: I think the fix is to hoist the posedge/negedge mode to the top-level19:38
kem_mithro: Basically have two modes which encapsulate all the DFF primitives19:38
kem_mithro: One mode (e.g. posedge) has only SB_DFF primitive sites19:38
kem_mithro: The other mode (e.g. negedge) has only SB_DFFNE primitive sites19:38
mithrokem_: Can we just get the packer to not assume that *any* cluster will be illegal?19:39
kem_mithro: Potentially possible, but hard to come-up with stopping conditions in the general case then...19:40
kem_mithro: I think using two mutually exclusive posedge/negedge modes is really the right approach, as daveshah pointed out, there is only one config bit, and hence only one decision for the packer to make19:41
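The "one decision per cluster" idea can be illustrated with a toy grouping function: because NegClk is set per tile, every flip-flop in a cluster must share a clock polarity, so polarity is chosen once per cluster rather than once per flip-flop. This is not VPR's packing algorithm, only a sketch of the constraint. The function name, the input format and the capacity parameter are assumptions, and any other signals shared within a tile are ignored.

```python
from typing import Dict, List, Tuple


def pack_by_clock_polarity(
    ffs: List[Tuple[str, bool]], capacity: int = 8
) -> List[Tuple[bool, List[str]]]:
    """Group flip-flops into clusters that each use a single clock polarity.

    ffs is a list of (name, is_negedge). capacity is the number of FF sites
    per cluster (assumed to be 8 here). Returns (is_negedge, names) clusters.
    """
    by_polarity: Dict[bool, List[str]] = {False: [], True: []}
    for name, is_negedge in ffs:
        by_polarity[is_negedge].append(name)

    clusters: List[Tuple[bool, List[str]]] = []
    for is_negedge, names in by_polarity.items():
        # Polarity is fixed for the whole cluster; only then fill its sites.
        for start in range(0, len(names), capacity):
            clusters.append((is_negedge, names[start:start + capacity]))
    return clusters


# Hypothetical netlist: two SB_DFF-style cells and one SB_DFFN-style cell.
example = [("ff_a", False), ("ff_b", True), ("ff_c", False)]
clusters = pack_by_clock_polarity(example, capacity=2)
```

In the architecture file, the equivalent is two mutually exclusive top-level modes, each containing only primitive sites of one polarity.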
mithroI would expect the packer to eject the SB_DFFN from the cluster19:42
mithroThen it starts a new cluster and packs the SB_DFFNs together19:43
kem_mithro: It would if there were no legal sites for SB_DFFNEs given the cluster's current mode configuration19:43
kem_mithro: But as you described your architecture, the packer is viewing each DFF site as either a SB_DFF or a SB_DFFNE (i.e. it has a choice for each DFF) instead of a single choice between all SB_DFF or all SB_DFFNE19:44
mithrokem_: I can think of many packing problems where the only issue is internal routing19:44
kem_mithro: I think in those cases the packer should re-try in a new cluster19:46
mithroYeah19:47
mithroIt seems that at the moment a routing failure just causes the clustering to stop19:47
mithrokem_: so I just need to change the if statement which makes the packer stop at a routing failure?20:07
kem_mithro: The packer has multiple stages (it does its own internal placement and routing within the cluster), and multiple levels of legality checks. I suspect what is happening is that the routing failure is just a coincidental symptom of a placement issue (i.e. DFFs must be all posedge, or all negedge).21:44
kem_mithro: Can you send along the test case to reproduce? I can try to take a look when I've got a moment.21:45
mithrokem_: Sure21:45
mithrokem_: I'll have a dig a bit further first21:46
