Tuesday, 2018-04-17

*** tpb has joined #vtr-dev00:00
*** digshadow has quit IRC03:21
*** digshadow has joined #vtr-dev03:40
jholdigshadow: I started making a list of everything I find that needs to be in the rr_graph: https://docs.google.com/document/d/1kTehDgse8GA2af5HoQ9Ntr41uNL_NJ43CjA32DofK8E/edit?usp=sharing12:11
tpbTitle: Config list - Google Docs (at docs.google.com)12:11
jholis this useful? - if so I'll keep adding any muxes and wires I encounter12:11
jholcould do with some tidying12:11
jholperhaps you could use it as a check-list for the things that need to go into the rr_graph12:12
mithrojhol: Morning14:51
jholmithro: hi14:51
mithrojhol: How are things going? Did my email summary make sense?14:52
mithrokem_: When you have 30m -> 1h I could use some help understanding some very strange behaviour myself and digshadow are seeing around "fc" tag14:52
jholmithro: yeah, thanks for the email14:54
jholI've just been tinkering with the 3-bit counter a little more today14:54
jholgot it to route properly, now14:54
jholmore adjustments to the way modes are done14:55
mithrojhol: Oh?14:55
mithroPlease do share!14:55
mithrojhol: Did you end up putting together those more simple tests?14:55
jholhttp://www.clifford.at/icestorm/logic_tile.html14:55
tpbTitle: Project IceStorm LOGIC Tile Documentation (at www.clifford.at)14:55
jholso in the first paragraph of the Logic Block section...14:56
jholit mentions that CLK, CLKEN input and SR input are shared, but the bypass flip-flop mux, whether SR does set or reset, and the synchronous/asynchronous mode are indifividually configurable14:57
mithroEach logic tile has a logic block containing 8 logic cells. Each logic cell contains a 4-input LUT, a carry unit and a flip-flop. Clock, clock enable, and set/reset inputs are shared along the 8 logic cells. So is the bit that configures positive/negative edge for the flip flops. But the three configuration bits that specify if the flip flop should be used, if it is set or reset by the set/reset input, and if the set/reset is14:57
mithrosynchronous or asynchronous exist for each logic cell individually.14:57
jholthis means that the whole tile needs modes for NegClk, Enable/Disable, and Yes/No to Set/Reset14:58
jholand the individual LUTFFs need configs for Set or Reset and Sync/Async14:58
mithrojhol: Yes, that makes sense14:58
jhol-- I've got that implemented now, and VPR place my 3-bit counter into two tiles as it should do14:58
mithrojhol: WOOT!14:58
jholthen I've been hacking around some carry-chain issues, and it gets through to the end14:59
jhol1-sec, I will push code14:59
jholhttps://github.com/jhol/symbiflow-arch-defs/commits/modes-fixes215:00
tpbTitle: Commits · jhol/symbiflow-arch-defs · GitHub (at github.com)15:00
jholalso, your "tests: Make yosys write $true/$false/$undef definitions" was a helpful fix15:02
mithrojhol: Oh - it was you who had seen that problem too?15:03
mithroI thought it was digshadow but he was very confused when I started chatting to him about it...15:03
jholwell not as such, but now VPR is able to spawn a LUT4 so it can implement a $false for my counter15:03
jhol-- now -- with your patch15:03
jholso the $false + 1x LUT4 + 2x SB_DFFs go in 1 tile, and the SB_DFFE go in the other15:04
mithrojhol: Did you figure out SB_DFFE verse SB_DFF?15:07
jholI'm going to chat to clifford later15:08
jholit's not really an issue, just an observation15:08
mithroWhat is the difference between them?15:08
jholso a SB_DFFE is a D-flip flop, with an extra clock enable pin15:09
jholat the moment yosys always seems to implement counters by using the clk to toggle a SR_DFF for bit 0, then then using that signal to drive the EN pin in a SR_DFFE for bit 1 and driving that with the clock as well15:09
jholfor higher bits, it just adds more SR_DFFs in a carry chain15:09
jholso my question was just about WHY yosys implements the circuit this way, and this if optimal behaviour15:10
jhol*and if this is...15:10
mithrojhol: I would love someone to write that up :-P15:10
mithrojhol: And then they could also do a YouTube video about it ;-)15:11
jholyeah - I'm tempted to do one about VPR15:11
mithrojhol: Let's leave the vpr one until we have working ice40 and artix-7 support :-)15:11
jholthe only thing is that software-only videos risk being a bit dry15:11
jholyeah15:12
jholan artix-7 video would be beyond awesome15:12
mithrojhol: I think if we can get the ice40 working, the artix-7 won't be far behind15:13
jholamazing - you guys have been doing great things15:13
mithrojhol: In some ways I think the Artix-7 stuff was closer to being working before you started on the ice40 stuff15:14
jholreally? that's so exciting15:14
mithrojhol: But we switched back to the ice40 because we have a known timing model for that15:15
jholso you want to hear my big comment about symbiflow-arch-defs?15:15
mithrojhol: The Makefile system sucks and needs to be rewritten? :-P15:15
jholyes: the XML+XSLT+Makefile stuff is cumbersome15:15
jholmy approach was better15:15
mithrojhol: My eventual approach is to have everything in verilog and generate the XML from that15:16
mithrojhol: Take a look at the slice-l carry 4 on the artix-715:16
jholI just did it all with a python script15:16
jholhttps://github.com/jhol/icestorm/blob/vtr-py/icebox/vtr.py15:16
tpbTitle: icestorm/vtr.py at vtr-py · jhol/icestorm · GitHub (at github.com)15:16
jholbeutifully expressive - no repetition15:17
jholeasy to edit without developer conflicts15:17
mithrojhol: Yeah - you are probably right, once we have things working we can loop back to ways to improve the repo15:17
jholthat 387 lines implements all the SB_DFF modes, and takes a stab at the interconnect15:17
jholalso has RAM and PIOs15:18
jholmultiple things are busted, but structurally it does the job very concisely15:18
mithrojhol: Yeap15:18
mithrojhol: If I had known about your vtr-py before we started, then we probably would have gone that route15:19
jholyeah - I worked on it starting 2016, but I never go anything good enough to publish15:19
mithrojhol: But alas Clifford did not mention you had done anything and I didn't find it when searching...15:19
mithrojhol: So, you just said I could blame you for the xml mess? :-P15:20
jholI'm sorry :(15:20
mithrojhol: If you get the ice40 PLB working your forgiven ;-)15:21
mithrojhol: I'm open to the idea of moving to a vtr-py like thing in the future15:21
jholwhat are we doing about the PIOs?15:21
mithrojhol: Ignoring them? :-P15:21
jholcan we ignore them?15:22
jholthat's where the global injectors live15:22
mithrojhol: "Sticking our head in the sand until we can't ignore them any longer" :-)15:22
jholI would think getting the <pb_type> for them better would be pretty easy15:22
jholat the moment you've got PIO-LR and PIO-TB, but really there needs to be 4x PIOs - one for each side15:23
mithrojhol: Probably15:23
mithrojhol: But don't get distracted by that yet :-P15:23
jholno - I'm going to test the router with more things15:24
jholalso, you saw this, right?: https://docs.google.com/document/d/1kTehDgse8GA2af5HoQ9Ntr41uNL_NJ43CjA32DofK8E/edit?usp=sharing15:24
tpbTitle: Config list - Google Docs (at docs.google.com)15:24
mithrojhol: Yeah! That is pretty good15:24
jholthere are so many things to remember to put in the rr_graph, including things deeply burried inside the PLBs15:25
jholI'm not sure I've got everything there- but I think it might help to check those things off as they're coded up in python15:25
mithrojhol: Yeap!15:30
mithrojhol: See privmsg15:35
kem_mithro: I might be able to do later this afternoon (otherwise probably tomorrow afternoon) to discuss Fc related stuff15:37
mithrokem_: Okay, when is afternoon for you? :-P15:38
mithrokem_: My morning is pretty busy, but I would be free in like ~3 hours from now....15:40
kem_mithro: OK, that might work. I've got meetings before hand which may run long. Hopefully they should be done by 3 or 4pm Eastern.15:43
mithrokem_: I can do 4pm eastern easily15:45
mithrokem_: How about we just pencil that in now and if things change just mention it?15:45
kem_mithro: Sounds good!15:46
mithrojhol: Started adding some questions to your doc :-)15:50
daveshahjhol: I've cleared up a couple of the IO questions in the doc15:58
*** q3k has joined #vtr-dev16:39
*** digshadow has quit IRC17:51
*** digshadow has joined #vtr-dev18:25
mithroHey kem_, myself and digshadow are around now -- happy to chat any time from now19:48
kem_mithro: OK I'm free now19:57
mithrokem_: I was thinking we would do it via IRC, but we can do VC if you think that would be easier?19:58
kem_mithro: Either works for, me if we'll be sending files back and forth IRC is probably easiest19:59
mithroOkay, let's start here and move if we need20:00
kem_mithro: Ok20:00
mithrokem_: So myself and digshadow have been trying to understand how the fc parameter affects connection from pins to the fabric20:01
mithrokem_: It is my understanding that  <fc in_type="abs" in_val="1" out_type="abs" out_val="1"/> should connect each in pin and each out pin to one track of each segment type?20:02
kem_mithro: Not quite20:02
kem_mithro: That would connect to one track/segment in the entire channel20:03
kem_mithro: If you explicitly want one to each segment type you'd need to use <fc_override>20:03
mithrokem_: But each pin *should* be connected to one track20:04
mithroIE No pins should be unconnected?20:04
kem_mithro: Yes that sounds correct in general20:04
mithrokem_: Okay - let me check that I wasn't imagining things from last night20:05
mithrokem_: But I'm pretty sure that isn't happening in our email20:05
kem_mithro: There are some corner cases where that may not be the case (e.g. a direct connects where the target block doesn't exist)20:05
mithrokem_: https://usercontent.irccloud-cdn.com/file/OxokPIBA/image.png20:06
mithrokem_: That architecture has <fc in_type="abs" in_val="1" out_type="abs" out_val="1"/> in the purple block type20:07
mithroBut pins 0, 1, 2 seem to be connected to 4 tracks each, while 3,4,5 seem to be connected to none20:07
mithrokem_: In fact we get the following in the log output20:08
mithroWarning 3: in check_rr_node: RR node: 25 type: OPIN location: (3,1) pin: 7 pin_name: BLK_TI-LUTFF.OUT[1] has no out-going edges.20:08
kem_mithro: Interesting20:08
mithroError 1: in check_rr_graph: node 41 (IPIN) at (3,2) block=BLK_TI-LUTFF side=RIGHT has no fanin.20:08
mithrokem_: Ha - that is not a good interesting :-P20:09
mithrokem_: I can share the architecture XML - it's pretty simple20:09
kem_mithro: Yeah send along the architecture and I'll take a look20:09
mithrokem_: But your expectation would be that each one of those pins connected to at least one track?20:10
kem_mitrho: Yes unless I'm completely miss-understanding :)20:10
mithrokem_: Okay - that is the conclusion we came too too20:10
kem_mithro: One thing to note is that in a uni-directional routing architecture tracks are assigned in pairs (one in each direction), so I'm not surprised to see things connections which are multiples of 220:11
mithrodigshadow: kem seems to agree with our theory! :-P20:12
kem_mithro: Are you doing anything with multiple pin locations? For example the same pin on multiple sides of the block?20:13
mithrokem_: Nope! All pins are on the right side20:13
digshadowmithro: cool20:14
mithrokem_: https://gist.github.com/mithro/17a4a3b6cc22c6a340f23933a109ca5d20:19
tpbTitle: Makefile · GitHub (at gist.github.com)20:19
kem_mithro: Which device are you using?20:21
mithrokem_: 2x420:21
mithrokem_: Just confirmed that 4x4 also has the same behaviour...20:21
mithrohttps://www.irccloud.com/pastebin/1Dno9flx/20:22
tpbTitle: Snippet | IRCCloud (at www.irccloud.com)20:22
kem_mithro: Yep, I can reproduce the behavior20:23
mithrokem_: Great!20:27
mithrokem_: I have to run away to another meeting, but digshadow is still looking into this20:27
kem_mithro: OK I'll take a look too and see if I can track the cause down20:27
digshadowkem_: ping my nick if you have any questions, sounds like you have enough info to investigate your side20:29
digshadowIn the meantime I'll be using our rr_graph library to create a custom, simple interconnect20:29
kem_digshadow: will do20:30
mithrokem_: I tried changing the fc abs value to 2 and similar and it still didn't make any sense to me20:32
kem_mithro: I've tried a couple similar tweaks with the same result. Using an equivalent fc frac also shows problems20:34
digshadowmithro: FYI added note to the large track with ticket: https://github.com/SymbiFlow/symbiflow-arch-defs/issues/8820:38
tpbTitle: Investigate high channel width on test · Issue #88 · SymbiFlow/symbiflow-arch-defs · GitHub (at github.com)20:38
mithrokem_: Any theories on what is going on/21:03
mithrokem_: Some info / notes I collected21:04
mithrohttps://www.irccloud.com/pastebin/Irxi3P01/21:04
tpbTitle: Snippet | IRCCloud (at www.irccloud.com)21:04
kem_mithro: I've found the root cause, just thinking about the best way to fix it21:06
mithrokem_: Oh? I would love to know21:06
kem_mithro: Internal to build_rr_graph() the fc specifications from the architecture file are converted to per-pin absolute Fc values21:07
mithrokem_: When I looked at the arch.echo - the fc_value_type output seemed to make sense there...21:08
kem_mithro: Right now this process is leaving some of the pins with an Fc value of zero, which is why they have no connections21:09
mithrodigshadow - BTW I don't think I shared this -> https://gist.github.com/mithro/3d6ec5ff0e2eea7fa8ae23caffabc50c21:09
tpbTitle: Convert rr_graph file from Verilog-To-Routing into GraphViz `.dot` file · GitHub (at gist.github.com)21:09
kem_mithro: You can see this if you add '#define VERBOSE' to rr_graph.cpp21:09
kem_mithro: It produces output like the following when run:21:10
kem_mithro: Fc Actual Values: type = BLK_TI-LUTFF, pin = 3, seg = 0 (span), Fc_out = 0, Fc_in = 0.21:10
kem_mithro: which matches what's in the GUI21:11
mithroFYI I also also investigating some weirdness around IO blocks but ended up at this yak shave :-P21:11
mithrodigshadow: See what kem_ said above -- could be very useful to you21:13
kem_mithro: The challenge is that the easy fix makes the specified and actual Fc values quite different...21:15
kem_mithro: So I'm tempted to forbid very low Fc values in cases where it is infeasible/not well defined, and throw an error21:16
kem_mithro: So in the case of your architecture setting the Fc values to 2 ensures the pins all get connected21:21
mithrokem_: have to run away again21:31
mithroBut why doesn't abs 1 work?21:31
kem_mithro: We try to balance the Fc values within a single port. But in uni-dir architectures (since we assign tracks in pairs) we don't actually have enough pins to balance without leaving some pins disconnected21:34
kem_mithro: In the case where the absolute Fc is very low (e.g. 1)21:35
kem_mithro: Stepping back from this bug in particular, the semantics of the Fc specification are a bit odd, since if you ask for Fc abs = 1 on a uni-dir arch, you get an architecture with Fc abs = 2*K, where K is the number of wire types in the architecture21:37
kem_mithro: I'm thinking to set the minimum absolute Fc to 2*K to avoid this, since things aren't really well defined for abs Fc < 2*K (at least given how the rr graph generator is currently implemented).21:39
digshadowkem_: I guess what throws me off on this, if I'm understanding correction, is that I would expect type=abs,val=1 to connect each pin to at least one track. But if I'm understanding correctly, it connects one track to all pins21:56
digshadowcorrectly21:57
kem_digshadow: Your expectation is correct, that is the behaviour that makes sense21:57
digshadowI am unclear from above the relation between us having incorrect expectations vs bug in VPR21:58
kem_digshadow: It's definitely a bug in VPR21:59
digshadowAre you saying that it doesn't do that today, but should, except there are some complications to make that work the way that would be intuitive?21:59
digshadowdue to algorithm limitations21:59
kem_digshadow: Yep that's a good summary22:00
digshadowperfect, thanks22:00
digshadowokay, let me re-read now with that in mind22:00
digshadowone thing that would help clarify would be to better understand the unidirection arch components. I don't understand why you can't have an odd number of tracks. Is there a resource you can point me at?22:01
kem_It's not a fundamental limitation of uni-dir architectures, you could build an architecture with an odd number of tracks (although there wouldn't be much motivation too).22:06
kem_It is however a restriction/simplification made within the VPR's rr graph generator22:07
digshadowgotcha22:07
kem_This gives an overview of uni-dir support in VPR (although I'm not sure it talks about odd vs even channel widths): http://www.eecg.toronto.edu/~jayar/pubs/luu/luutrets11.pdf22:08
kem_It also has references to the original work on uni-dir architectures22:08
digshadowcool will take a look22:08
digshadowah did vpr start out as strictly bidir?22:08
kem_Yes, historically FPGAs used bidir architectures, but unidir is almost uniformly better so most modern FPGAs us unidir22:10
digshadowGot it22:12
digshadowOkay, so uni-dir will assign one to inc and one to dec22:13
kem_Yep22:13
digshadowso fabs=1 is unreasonable, since it should be either 0 or 222:13
digshadowhow does the number of wire types factor in? are you expecting it to assign to each of the different segment types?22:14
mithroI would have expected that fabs=1 connects you to one of each direction?22:14
digshadowmithro: eh well they are two separate wires22:14
mithroTrue22:15
kem_mithro: That's what the code currently tries to do, but it is somewhat inconsistent...22:15
mithrokem_: Okay22:15
kem_On the subject of wiretypes, we currently try to ensure each block pin connects to at least one wire of each type22:17
kem_So on a unid-dir with K wire types and Fc abs=1, you actually get 2*K connections per pin22:17
kem_Which is also somewhat inconsistent with the concept behind Fc...22:18
digshadowkem_: got it22:18
digshadowHmm yeah I'm not sure that Fc is really behaving like I was expecitng22:18
kem_Yeah, hence why I'm trying to think of a better way to handle these cases22:19
mithrokem_: I think the fc=1 connecting a pin to one track of /each/ type seems a pretty logical interpretation22:21
mithrokem_: But I would also be happy with fc=1 connect a pin to one track in total22:21
mithrokem_: But it should never connect a pin to no tracks :-P22:21
kem_Agreed!22:22
mithrokem_: Shall we leave the ball in your court to solve?22:23
mithrokem_: Oh - does fc_overrides work the way we would expect?22:23
mithrokem_: <fc_override fc_type="abs" fc_val="1" segment_name="short"><fc_override fc_type="abs" fc_val="1" segment_name="long"> do the right thing at the moment?22:24
kem_mithro: probably not22:24
mithrokem_: What about fc=2?22:25
kem_mithro: fc_overrides and the defaults are handled the same way. Internally the defaults and overrides are expanded into a single set of pins + seg Fc constraints22:25
kem_mithro: Trying fc=2 fixes the architecture you sent, although I haven't tried with overrides22:26
mithrokem_: Well, just discovered that fc_override for a segment name that doesn't exist do not cause vpr to fail....22:26
kem_mithro: We're probably a missed error check for that case :(22:27
mithrokem_: Okay - with fc=2, I get each pin connected to on pair of uni directional wires per type22:28
digshadowkem_: we found a few other things too, but I figured I'd bring them up later to not derail the main conversation22:29
mithrohttps://usercontent.irccloud-cdn.com/file/BCAWXIar/image.png22:29
mithroBTW - I think it would be nice to color-code the wires to different types22:29
mithrokem_: Maybe we should just say if the architecture has uni-directional wires fc must be mod 2 ?22:30
kem_mithro: Yep, that's what I'm adding now22:30
mithrokem_: Ha - great minds think alike :-P22:31
mithrokem_: It would also be nice for the pins which are connected to "long lines" which are not drawn (IE pin 8 and 9) had some type of indicator too22:31
kem_mithro: I think 8 and 9 are clock pins, which is why they aren't connected to anything22:32
kem_mithro: We are actually starting to look at clock networks so hopefully they will be connected to something soon :)22:33
digshadowkem_: unrelated if we are taking a break from the main issue, we also observed that you could have num_pb on a top level pb_type22:39
digshadowwould you expect an error to be thrown for this?22:39
digshadowmithro: also don't forget to rename BEL/BLK if you haven't done that yet22:40
kem_digshadow: Yes that should probably be an error22:41
digshadowkem_: I'll file a bug then22:42
kem_digshadow: Thanks22:42
kem_digshadow: There is a separate 'capacity' attribute which is conceptually similar and active only at the top level22:43
mithrokem_: So - should things fail if there is a mixture of bidir / unidir segment types?22:44
kem_mithro: I haven't tried it, but I expect the rr graph generator assumes only one of bidir/unidir22:45
mithrokem_: Yeap, it seems to just pick the first one22:46
kem_mithro: I think the router should be OK with a mixed bidir/unidir architecture (it's the same code for both routing types).22:46
mithrokem_: okay22:48
*** digshadow has left #vtr-dev22:49
*** digshadow has joined #vtr-dev23:08
digshadowkem_: also unrelated, I'm unclear why you need to specify --route_chan_width when reading an rr_graph. Doesn't the rr_graph define this and not vpr?23:10
kem_digshadow: I think that's mostly just a historic artifact.23:14
kem_digshadow:  VPR always took this from the command-line (or calculated it from binary search) so we just required it be specified23:14
kem_digshadow: when an rr graph was loaded, which allowed us to do some consistency checks that the <channel> info was consistent23:15
digshadowgot it23:19
digshadowI think there was also something related to fc, but I can't recall specifically. Along the lines of presumably fc doesn't matter when you give an rr_graph23:20
kem_That sounds correct. When you load an rr graph the rr graph generator is bypassed so the Fc values won't be used.23:22
kem_The only possible exception might be placement macros. I think we look for fc=0 to identify placement macros (e.g. carry chains) which must be kept together during placement to ensure they can be connected via dedicated routing.23:23
digshadowokay, thats good to know23:26
digshadowmithro: I'm clicking on pins in the vtr GUI, they only report 1 edge per connection. I'd be expecting n + 1 to connect to the net source23:40
mithrodigshadow: Hrm23:40
mithrodigshadow: Got look and see where that comes from?23:40
digshadowI'll dump the rr_graph and see what it looks like23:40
digshadowkem_: rr_graph has id in some locations and index in some others to refer to identifiers. Thoughts on standardizing the name?23:51
kem_digshadow: I'd lean towards 'id' (since how things are stored is an implementation detail)23:56
digshadowmithro: also I asked a while back why ptc is in loc for <node>. its presumably because the block type is inferred from the location23:56
digshadowkem_: okay, so I'll file a ticket on that then23:56

Generated by irclog2html.py 2.13.1 by Marius Gedminas - find it at mg.pov.lt!