*** tpb has joined #symbiflow | 00:00 | |
*** proteusguy has quit IRC | 02:48 | |
*** _whitelogger has quit IRC | 03:33 | |
*** _whitelogger has joined #symbiflow | 03:36 | |
*** citypw has joined #symbiflow | 03:59 | |
*** proteusguy has joined #symbiflow | 04:07 | |
*** proteusguy has quit IRC | 04:21 | |
*** proteusguy has joined #symbiflow | 04:46 | |
*** proteusguy has quit IRC | 06:25 | |
*** proteusguy has joined #symbiflow | 07:03 | |
*** proteusguy has quit IRC | 07:08 | |
*** proteusguy has joined #symbiflow | 07:20 | |
*** citypw has quit IRC | 09:03 | |
nats` | hello | 09:04 |
---|---|---|
nats` | kgugala, I have a problem running the fuzzer 005 but I remember seeing something on the bug tracker | 09:05 |
nats` | is it me or there is something broken ? | 09:05 |
*** Rahix has joined #symbiflow | 09:09 | |
nats` | ++ fgrep CRITICAL vivado.log | 09:10 |
nats` | + test -z '' | 09:10 |
nats` | + for x in design*.bit | 09:10 |
nats` | + /home/nats/project/symbiflow/prjxray/build/tools/bitread --part_file /home/nats/project/symbiflow/prjxray/database/zynq7/xc7z010clg400-1.yaml -F 0x00000000:0xffffffff -o design.bits -z -y design.bit | 09:10 |
nats` | Part file not found or invalid | 09:10 |
nats` | ../fuzzaddr/common.mk:12: recipe for target 'build/specimen_001/OK' failed | 09:10 |
nats` | uhhmm it may have been corrected by a commit I don't have let me retry | 09:11 |
noopwafel | heh is zynq7 'working' now? :) | 09:11 |
nats` | working on it (mainly kgugala, in fact; I'm fixing general perf problems in the fuzzers) | 09:12 |
noopwafel | very nice! | 09:12 |
nats` | oky kgugala seems fixed by a change in the 001 fuzzer :) | 09:13 |
nats` | I start a full run on zynq with time to see how long it'll take | 09:14 |
nats` | let's hope it doesn't fail mid way | 09:14 |
*** citypw has joined #symbiflow | 09:15 | |
nats` | + python3 /home/nats/project/symbiflow/prjxray/utils/parsedb.py --strict build/segbits_clbx.db | 09:42 |
nats` | Traceback (most recent call last): | 09:42 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/parsedb.py", line 61, in <module> | 09:42 |
nats` | main() | 09:42 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/parsedb.py", line 57, in main | 09:42 |
nats` | run(args.fin, args.fout, strict=args.strict, verbose=args.verbose) | 09:42 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/parsedb.py", line 27, in run | 09:42 |
nats` | bits, tag, bitss[bits]) | 09:42 |
nats` | AssertionError: strict: got duplicate bits frozenset({'31_16', '31_17'}): CLB.SLICE_X0.BLUT.RAM CLB.SLICE_X0.ALUT.RAM | 09:42 |
nats` | + finish | 09:42 |
nats` | + echo 'Cleaning up temp files' | 09:42 |
nats` | I'll switch to artix because I think kgugala is still fixing a lot of things on zynq target | 09:42 |
*** citypw has quit IRC | 10:10 | |
nats` | fuzzer 013 making error too in artix 7 | 10:54 |
nats` | + python3 /home/nats/project/symbiflow/prjxray/utils/mergedb.py --out ./tmp.7L99aPR736 /home/nats/project/symbiflow/prjxray/database/artix7/segbits_clbll_l.db ./tmp.HnAre7NUQC | 10:54 |
nats` | WARNING: got duplicate tag CLBLL_L.SLICEL_X1.CARRY4.ACY0 | 10:54 |
nats` | Orig line: CLBLL_L.SLICEL_X1.CARRY4.ACY0 31_01 31_04 31_14 31_15 | 10:54 |
nats` | New line : CLBLL_L.SLICEL_X1.CARRY4.ACY0 31_14 | 10:54 |
nats` | Traceback (most recent call last): | 10:54 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/mergedb.py", line 64, in <module> | 10:54 |
nats` | main() | 10:54 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/mergedb.py", line 60, in main | 10:54 |
nats` | verbose=args.verbose) | 10:54 |
nats` | File "/home/nats/project/symbiflow/prjxray/utils/mergedb.py", line 28, in run | 10:54 |
nats` | assert not strict, "strict: got duplicate tag" | 10:54 |
nats` | AssertionError: strict: got duplicate tag | 10:54 |
kgugala | yes, I've seen the problems with 01x fuzzers on zynq | 10:58 |
nats` | there it's on artix | 10:58 |
nats` | the second one | 10:58 |
kgugala | I think they were working before | 10:58 |
nats` | how about taking a break in new feature and working together to get a full run working for each target ? | 10:59 |
nats` | in my case the 013 always fail on same error | 10:59 |
kgugala | stable DB would be very good to have | 11:00 |
nats` | sure | 11:00 |
nats` | it would help people wanting to work on higher level stuff | 11:00 |
kgugala | can you create an issue with 013 on artix | 11:00 |
nats` | yep sure I wanted to be sure it's not already a known issue | 11:01 |
kgugala | note that about a week ago we switched to more strict error checking | 11:01 |
nats` | ahhhh | 11:01 |
kgugala | so the problem you have could be present before, but was masked | 11:01 |
nats` | oky good things anyway to clean everything :) | 11:02 |
kgugala | still, we need to fix this | 11:02 |
nats` | https://github.com/SymbiFlow/prjxray/issues/532 | 11:04 |
tpb | Title: Fuzzer 013 Duplicate Tag · Issue #532 · SymbiFlow/prjxray · GitHub (at github.com) | 11:04 |
nats` | you're copied in the issue | 11:04 |
nats` | so you'll see update | 11:04 |
nats` | do you have an idea about this 013 issue ? | 11:07 |
nats` | I don't know where I should look | 11:07 |
kgugala | I'd grep through logs to see which run actually instantiated CLBLL_L.SLICEL_X1.CARRY4.ACY0 | 11:31 |
kgugala | also, I'd check git log if there was something done with this fuzzer recently | 11:32 |
nats` | thanks good hints | 11:34 |
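
The log search kgugala suggests could be sketched in Python as below. This is only an illustration: `find_tag`, the `*.log` pattern, and the directory layout are assumptions, not part of prjxray.

```python
from pathlib import Path


def find_tag(root, tag, pattern="*"):
    """Return (path, line_number, line) for every line under root containing tag.

    Hypothetical helper, equivalent in spirit to `grep -rn TAG root`.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on logs with odd bytes.
        with path.open(errors="replace") as handle:
            for lineno, line in enumerate(handle, 1):
                if tag in line:
                    hits.append((str(path), lineno, line.rstrip("\n")))
    return hits
```

Calling `find_tag("build", "CLBLL_L.SLICEL_X1.CARRY4.ACY0", "*.log")` from a fuzzer directory would list the specimen logs that mention the tag, assuming the specimens live under `build/` as in the traces above.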
*** proteusguy has quit IRC | 11:39 | |
*** Rahix_ has joined #symbiflow | 11:52 | |
*** Rahix has quit IRC | 11:55 | |
nats` | I don't find any clue about that duplicated problem | 13:49 |
*** proteusguy has joined #symbiflow | 13:55 | |
kgugala | all those 01x bugs look like heisenbugs | 14:30 |
kgugala | when I run it on my machine I cannot reproduce it | 14:30 |
kgugala | however, on different machines they fail in exactly the same way as for nats` | 14:31 |
nats` | ouch I don't like that | 14:32 |
nats` | I rerun one after having explicitly make clean the database | 14:32 |
nats` | just to be sure | 14:32 |
nats` | schroedinbug | 14:32 |
nats` | uhh kgugala | 14:42 |
nats` | make clean in database cleared the problem... | 14:42 |
nats` | I guess we have to trigger the database make clean from fuzzer make | 14:42 |
kgugala | I got this bug even with clean db | 14:56 |
nats` | that becomes really "mystic" :D | 14:58 |
nats` | currently running the 026 no problem so far since I clean the database | 14:59 |
kgugala | for Artix? | 15:12 |
nats` | yep | 15:15 |
nats` | it's now Dumping pips from tile | 15:15 |
nats` | I don't know which fuzzer it is :D | 15:15 |
kgugala | I'd guess one of 05x | 15:34 |
*** citypw has joined #symbiflow | 15:34 | |
*** citypw has quit IRC | 16:21 | |
*** Rahix_ has quit IRC | 17:40 | |
litghost | kgugala: > I think they (CLB fuzzers) were working before -> No, the CLB fuzzers were broken for a while. They were running to completion, but returning invalid results by including some INT bits. | 18:14 |
litghost | I've been working on getting the false positive rate down, and artix7 seems more consistent. I'm currently doing a complete run on artix7 with the latest PR's, and then I'll do some repeat runs to ensure that the CLB fuzzers are stochastically failing | 18:16 |
litghost | Are not stochastically failing | 18:16 |
litghost | The changes I added for the CLB fuzzers will cause broken partial databases to generate errors, like the "strict: got duplicate tag" failure | 18:17 |
litghost | But the reason for the failure is that the fuzzer added bad bits in the earlier run, and needs to be cleaned out | 18:17 |
litghost | We could add smarts to detect when a new entry is narrower than the previous entry. Generally speaking narrower entries are more likely to be correct than wide entries | 18:18 |
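
A minimal sketch of the "prefer the narrower entry" idea, assuming a database line has the form `TAG bit bit ...` as in the mergedb.py output above. The function names and return values are hypothetical, and this is not what mergedb.py currently does.

```python
def parse_segbits_line(line):
    """Split 'TAG bit bit ...' into (tag, frozenset of bits)."""
    tag, *bits = line.split()
    return tag, frozenset(bits)


def merge_entry(db, tag, new_bits):
    """Merge one entry into db (dict: tag -> frozenset of bits).

    Assumed policy: if one entry is a strict subset of the other, keep the
    narrower one; anything else with differing bits is a real conflict.
    """
    old_bits = db.get(tag)
    if old_bits is None:
        db[tag] = new_bits
        return "added"
    if new_bits == old_bits:
        return "same"
    if new_bits < old_bits:  # new entry is strictly narrower
        db[tag] = new_bits
        return "narrowed"
    if old_bits < new_bits:  # existing entry is already narrower
        return "kept_narrower"
    raise ValueError("conflicting bits for %s: %s vs %s" % (
        tag, sorted(old_bits), sorted(new_bits)))
```

With the two `CARRY4.ACY0` lines from the failing 013 run, the second merge would return `"narrowed"` and keep only `31_14` instead of asserting. Whether the narrower entry is actually correct still has to be checked by rerunning the fuzzer.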
nats` | litghost, I think we need a massive rework of the dependencies and cleaning process | 19:17 |
nats` | because a lot of things explode because of that | 19:17 |
litghost | I don't disagree, however I don't see an immediately obvious solution. Part of the problem arises in the fact that the fuzzer process is additive, which makes the dependency tracking murky. Unclear what the "correct" structure looks like that enables parallelism and tighter dependency tracking | 19:24 |
nats` | oky | 19:29 |
*** Sonnenmann has joined #symbiflow | 19:48 | |
*** Rahix has joined #symbiflow | 20:10 | |
nats` | uhhhhh question... | 20:10 |
nats` | is it normal to loop over the fuzzer 050 since few hours ? | 20:10 |
litghost | yes and no | 20:13 |
litghost | https://github.com/SymbiFlow/prjxray/issues/537 | 20:13 |
tpb | Title: int_loop stability criteria is wasteful · Issue #537 · SymbiFlow/prjxray · GitHub (at github.com) | 20:13 |
litghost | I'm working on a patch | 20:13 |
litghost | lowers the runtime significantly | 20:13 |
litghost | on artix7 it appears to take all 12 iterations, but the last couple iterations only find less than 10 pips per iteration | 20:14 |
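
One way to picture a cheaper stopping rule is a diminishing-returns cutoff: stop as soon as an iteration finds fewer than some minimum number of new pips. The sketch below is an assumption for illustration only; it is not the patch litghost mentions, and `min_new` and `max_iters` are made-up parameter names.

```python
def iterations_needed(new_pips_per_iter, min_new=10, max_iters=12):
    """Return how many iterations run before the loop would stop.

    new_pips_per_iter: number of newly solved pips in each iteration.
    The loop stops after the first iteration that finds fewer than
    min_new pips, or after max_iters iterations, whichever comes first.
    """
    for index, found in enumerate(new_pips_per_iter[:max_iters], 1):
        if found < min_new:
            return index
    return min(len(new_pips_per_iter), max_iters)
```

For a run whose iterations find 500, 120, 40, 8 and 3 new pips, this rule stops after the fourth iteration rather than continuing to the cap.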
*** Sonnenmann has quit IRC | 20:27 | |
nats` | uhhmmm ok I need to understand how it works but anyway I think it loops since 3 or 4hours so I'll stop it | 20:27 |
nats` | litghost, do you think I should modify my 074 patch to do what we said and then redo a clean PR to merge anyway ? | 20:29 |
litghost | nats: can you rephrase? | 20:43 |
nats` | we discussed some modification to apply to my last code for fuzzer 074 | 20:45 |
nats` | I was wondering if I do them even if we couldn't fully test it because of dependencies to other fuzzer | 20:45 |
nats` | I'm tired I guess my english is pretty hard to understand :D | 20:46 |
nats` | short: do I update my patch to merde the PR for the 074 | 20:46 |
nats` | merge | 20:46 |
litghost | nats: Yes, apply the fixes. 072-074 are relatively independent of the other fuzzers, so you only really need to test 072-074 | 20:49 |
nats` | 074 is more complicated | 20:53 |
nats` | the post processing call stuff from lower fuzzer like 005 | 20:54 |
nats` | so I couldn't test post processing but I didn't modify that part so if the generation part of the fuzzer is good.... it should be | 20:54 |
nats` | just be aware that merge could break something | 20:55 |
nats` | but if everybody is aware it's not a real problem | 20:55 |
nats` | litghost, can we skip the move of my pool function to utils.py for now | 20:57 |
nats` | I would like to avoid spreading mistakes if there are some | 20:58 |
mithro | So, should "make -jXXX" work on the fuzzers? | 20:58 |
nats` | once well tested we could refactors things | 20:58 |
litghost | make -jXXX is working | 20:58 |
litghost | I've been using it to accelerate testing with success | 20:59 |
mithro | litghost: Does it work with QUICK=y ? | 21:05 |
mithro | litghost: and what should I set it to? The number of processors? smaller? greater? | 21:06 |
litghost | Unclear, I haven't been testing with QUICK, I'm mostly interested in fixing the fuzzer output to be correct right now | 21:06 |
litghost | Pretty much none of the tip of HEAD fuzzers output the correct bits | 21:07 |
litghost | As for number of processes, I've been using -j48 to good effect, but my machine is very large. | 21:13 |
mithro | litghost: This is for the CI | 21:20 |
mithro | litghost: It's almost working | 21:20 |
mithro | Currently trying "make --output-sync=target --warn-undefined-variables QUICK=y -j32" | 21:21 |
* nats` is dying of jealousy of all those huge -j :) | 21:23 | |
nats` | litghost, do you have an idea about the good default size for the number of item to process in each instance of vivado ? | 21:23 |
litghost | try 50 items? | 21:24 |
nats` | it's really low | 21:24 |
nats` | I made all my tests with 64 blocks, so something like hundreds of thousands of items per instance | 21:24 |
litghost | oh | 21:24 |
nats` | but it varies a lot following what you're processing | 21:24 |
nats` | tiles/nodes/pips | 21:25 |
litghost | so tiles should be ~10 - ~100 | 21:25 |
nats` | in fact it depends more on what is called in the loop | 21:25 |
litghost | pips should 10k - 100k | 21:25 |
litghost | same with nodes | 21:25 |
litghost | try that for a first cut | 21:25 |
litghost | watch for peak memory and runtime | 21:25 |
nats` | I'll do that yes | 21:27 |
litghost | nodes is ~10k | 21:28 |
nats` | how hard could it be to make a first fuzzer called bencher | 21:28 |
litghost | pips is ~100k | 21:28 |
nats` | to find the sweet spot | 21:28 |
nats` | it could be the next step | 21:28 |
litghost | I don't think we want it to be that dynamic | 21:28 |
litghost | But it is a fun exercise | 21:28 |
nats` | ok | 21:28 |
litghost | Keep in mind you are balancing a lot of variables | 21:28 |
litghost | CPU usage, disk usage, memory usage | 21:28 |
nats` | sure you're right | 21:29 |
nats` | I'll put a default at 10k in the python script and set things in the makefile | 21:29 |
litghost | I think we might tune the CI and larger systems differently than the defaults | 21:29 |
litghost | sounds good | 21:29 |
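
The block-size discussion above could look roughly like this in a Python script. The 10k default and the per-kind figures come from the conversation; the names `DEFAULT_BLOCK_SIZE`, `SUGGESTED_BLOCK_SIZE` and `split_blocks` are hypothetical and not taken from the 072-074 fuzzers.

```python
DEFAULT_BLOCK_SIZE = 10_000  # default suggested by nats`, overridable by the caller

# First-cut upper bounds quoted by litghost; tune while watching peak memory and runtime.
SUGGESTED_BLOCK_SIZE = {"tiles": 100, "nodes": 10_000, "pips": 100_000}


def split_blocks(n_items, block_size=DEFAULT_BLOCK_SIZE):
    """Return half-open (start, stop) ranges covering range(n_items).

    Each range is the slice of items handed to one Vivado instance.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(start, min(start + block_size, n_items))
            for start in range(0, n_items, block_size)]
```

The Makefile would then only need to pass a block size on the command line of the script, which leaves room to tune CI and larger machines differently from the default.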
nats` | CI ? | 21:29 |
litghost | continuous integration | 21:29 |
litghost | Once the memory requirements on 072-074 is fixed, we should be running a database generation once a day (or maybe every other day) | 21:30 |
litghost | mithro was alluding to a QUICK mode, which hopefully runs in <10 minutes | 21:31 |
litghost | to do sanity checking on PR's | 21:31 |
nats` | ah oky | 21:32 |
nats` | it seems complicated no ? | 21:32 |
nats` | I mean even with optimisation I'm pretty sure 072 and 074 can't be run in less than few hours | 21:33 |
nats` | by the way I was wondering if we should split 074 in two part, the generation and the post processing | 21:33 |
litghost | 074 is already split in a functional sense | 21:37 |
litghost | I'm not sure the value of splitting it in the fuzzer sense | 21:38 |
litghost | 072-074 is generally excluded from QUICK because of its large runtime | 21:38 |
nats` | the main advantage I see is in case of late failure in 074 you can easily rerun the post processing part | 21:39 |
litghost | that's what https://github.com/SymbiFlow/prjxray/blob/master/fuzzers/074-dump_all/generate_after_dump.sh is for | 21:39 |
tpb | Title: prjxray/generate_after_dump.sh at master · SymbiFlow/prjxray · GitHub (at github.com) | 21:39 |
litghost | I guess you mean from the standpoint of the makefile? | 21:39 |
nats` | yep | 21:39 |
litghost | the makefile could easily be refactored to know when the vivado portion is done | 21:40 |
nats` | I used to run generate_after_dump by hand but it's not really "documented" | 21:40 |
nats` | yep | 21:40 |
nats` | could be done in the same way I want to implement the "last target function" | 21:40 |
litghost | there are many ways if you are simply referring to keep 074 as one fuzzer, but making two targets instead of one | 21:41 |
nats` | https://github.com/SymbiFlow/prjxray/issues/528 <= I was talking about extending this "feature request" | 21:42 |
tpb | Title: Fuzzer storing last target · Issue #528 · SymbiFlow/prjxray · GitHub (at github.com) | 21:42 |
litghost | Not sure how that is related | 21:42 |
litghost | https://github.com/SymbiFlow/prjxray/blob/master/fuzzers/074-dump_all/Makefile#L7 the database target could invoke generate_after_dump.sh | 21:43 |
tpb | Title: prjxray/Makefile at master · SymbiFlow/prjxray · GitHub (at github.com) | 21:43 |
nats` | we could save the last valid step and the corresponding target (zynq/artix/kintex) | 21:43 |
nats` | so it would allow to keep everything coherent when restarting a failed fuzzer | 21:43 |
litghost | that feature feels orthogonal to the 074 refactoring | 21:43 |
nats` | sure it is ! | 21:43 |
nats` | sorry my mistake I made that confusing | 21:44 |
nats` | I think it's a better feature to do before refactoring | 21:44 |
nats` | my opinion is if we can be sure the dependency system covers almost all possible easy mistakes, the refactoring can be split into even smaller jobs | 21:45 |
litghost | Ultimately up to you | 21:46 |
nats` | ok I'll focus on making things work first and we will see later or I'll stack a lot of modification in my code without testing but I keep that in mind | 21:47 |
nats` | :wq | 22:02 |
nats` | ... | 22:03 |
nats` | -_- | 22:03 |
nats` | litghost, basic question with makefile but I don't remember if it ok to use $(var) to pass the argument to internal command line ? | 22:12 |
litghost | what do you mean "internal command line"? | 22:16 |
nats` | uhhmmm to pass as an argument for command line started by the Makefile | 22:19 |
litghost | https://www.gnu.org/software/make/manual/html_node/Reading-Makefiles.html | 22:22 |
tpb | Title: GNU make: Reading Makefiles (at www.gnu.org) | 22:22 |
litghost | If it's in the command portion, it is deferred, meaning command line overrides | 22:22 |
litghost | if you need something in the dependency list, see the link | 22:22 |
nats` | I think I see the link is useful thanks | 22:26 |
nats` | I'm sorry this evening my brain doesn't want to speak english.... I'll try to write better sentence | 22:27 |
nats` | cool thanks the arg passing through Makefile is working now with default value | 22:29 |
nats` | I have something that seems to work i'll push to update the PR if it's good I'll squash commit before merge is it ok ? | 22:39 |
litghost | nats: Sounds good | 22:41 |
nats` | can you keep old file to compare with the new version I did a stupid make clean without saving them :| | 23:00 |
mithro | litghost: It would be really nice if we solve https://github.com/SymbiFlow/prjxray/issues/494 -- then the CI should give us nice looking status outputs of which fuzzers ran / failed / etc | 23:11 |
tpb | Title: Make each fuzzer output a test_results.xml file · Issue #494 · SymbiFlow/prjxray · GitHub (at github.com) | 23:11 |
mithro | bblr, going to find some lunch | 23:13 |
nats` | litghost, saw your review i'll take care of that tomorrow I need to go to sleep :) | 23:33 |
nats` | good night ! | 23:33 |
mithro | nats`: thanks for all your work! | 23:34 |