Tuesday, 2022-06-28

*** tpb <[email protected]> has joined #openrisc00:00
shorneI am just playing with the best benchmarks to measure tlb misses on qemu automatically00:00
shornebut now there is one issue cropping up, kicking /init sometimes takes long like 10 seconds, and sometimes 1 second00:01
shorneit seems to flip-flop between fast/slow00:01
shornewell, kind of random00:02
shorneit's one of those things I'd rather look at later, as the system is still pretty stable00:02
zx2c4hmmmm00:02
shornebut its annoying to have to wait 10 seconds for init to start00:02
zx2c4alright, well, i sent the patch00:02
zx2c4yea that is quite odd. i've seen it too00:02
shornethanks00:02
zx2c4the big test will be whether build.wireguard.com gets slowed down waiting for it or not00:02
shorneso I am thinking phase 1 get stability patches out00:03
zx2c4(please don't modify the commit subject; wireguard commits are all uniform and i'd like to keep them that way)00:03
shornephase 2 get performance patches out, and try to fix this 10 second hang00:03
zx2c4yea. also also - once this is in tree, build.wireguard.com will be rebuilding and rerunning for every single commit pushed to a bunch of trees00:03
zx2c4which means it'll be a good way to find new bugs and crashes and stuff00:03
shorneI usually don't modify subjects, so no worries00:04
zx2c4i guess i'll have to patch QEMU on the CI server00:04
zx2c4alright, qemu on CI server patched00:10
zx2c4so if you send your PR to linus in the next few hours i guess that's good timing for pacific coast for him00:11
zx2c4and then this will be churning away by morning00:11
zx2c4okay it's churning away on the CI server now00:19
shorneI think I need my patches to be reviewed00:23
shornesorry, I am a bit slow :(00:23
shornebasically its a pci support and irqchip patch00:24
shorneprobably PCI is a bit too big for the -rc series00:25
shornebut also PCI is not needed probably for the wireguard CI00:26
zx2c4wgci uses mmio yea00:26
zx2c4hmm no console output? wonder what's up here00:27
zx2c4did i forget some patch00:27
zx2c4oh hah im dumb00:51
zx2c4it doesnt run because the other kernel pieces aren't there :-P00:51
zx2c4duh00:51
zx2c4alright ill just be patient and wait for you to submit it00:52
shorneI am testing a 5.19-fixes branch now00:58
shornewith hopefully just the basics of what we need00:58
shorneoh, its running good so far wireguard tests are running00:59
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc01:12
shornezx2c4: ok, I have this queue ready to go: https://github.com/openrisc/linux/tree/or1k-5.19-fixes01:30
shornewireguard test pass01:30
shornewaiting for01:30
shorne 1. irq patch to get acked01:30
shorne 2. ill push this to the next branch too for some more 'global' testing01:31
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Quit: leaving)07:25
shorneok, got ack on the IRQ patch08:32
shorneits in linux next right now08:32
shorneso likely Ill send to linux tomorrow AM08:33
shorneto linus08:33
zx2c4shorne: great! Sounds like you got your ack!10:06
zx2c4shorne: i just pushed things to the CI server with the irqchip fix in there and it's broken - https://xn--4db.cc/D5dbnrvy12:27
zx2c4this is running your branch of qemu12:27
zx2c4with irqchip patch 12:27
zx2c4for kernel12:27
zx2c4it failed like this after quite a bit of work done12:27
zx2c4indicating that this is a hard to hit bug that only happened under load...12:27
zx2c4here's the whole log https://א.cc/x3Vt402T12:28
zx2c4ooo it keeps splatting12:30
zx2c4https://xn--4db.cc/bKiNzmFE12:30
zx2c4shorne: what this indicates is there's probably still some kind of race12:32
zx2c4or ordering issue12:32
zx2c4because, if it works on a laptop, but fails when it's being run alongside 18 other concurrent tests on other archs,12:33
zx2c4then the differentiating factor is that the qemu processes are constantly being scheduled out and not allowed to run for very long, because theyre all fighting for a CPU12:33
zx2c4which in turn means the or1k cpus execute in a different order than usual and with longer delays than usual12:34
zx2c4shorne: okay eventually it recovered? but too slowly, so the thing still failed https://build.wireguard.com/wireguard-linux-stable/a913f377cf1dbe90786e99ca3661e57a382c4541/or1k.log12:51
zx2c4So yea these hangs or whatever it is needs to be fixed before this is ready12:54
*** Finde_ <[email protected]> has joined #openrisc13:06
*** Finde <[email protected]> has quit IRC (Read error: Connection reset by peer)13:06
zx2c4shorne: can you drop the wireguard patch for now?13:09
zx2c4Seems a bit premature13:10
*** Finde_ is now known as Finde14:34
*** Finde <[email protected]> has quit IRC (Quit: WeeChat 2.3)14:35
*** Finde <[email protected]> has joined #openrisc14:35
*** arnd <[email protected]> has quit IRC (*.net *.split)20:01
*** shorne <[email protected]> has quit IRC (*.net *.split)20:01
*** Finde <[email protected]> has quit IRC (*.net *.split)20:01
*** zx2c4 <zx2c4!sid204921@gentoo/developer/zx2c4> has quit IRC (*.net *.split)20:01
*** jcm <jcm!sid410222@2a03:5180:f::6:426e> has quit IRC (*.net *.split)20:01
*** knz <knz!~kena@2001:41d0:a:f6e9::1> has quit IRC (*.net *.split)20:01
*** Finde <[email protected]> has joined #openrisc20:15
*** arnd <[email protected]> has joined #openrisc20:15
*** shorne <[email protected]> has joined #openrisc20:15
*** jcm <jcm!sid410222@2a03:5180:f::6:426e> has joined #openrisc20:15
*** knz <knz!~kena@2001:41d0:a:f6e9::1> has joined #openrisc20:15
*** zx2c4 <zx2c4!sid204921@gentoo/developer/zx2c4> has joined #openrisc20:15
shornezx2c4: understood, let me have a look, its good to get these sorted out, maybe its related to that 10 second pause issue20:38
shorneI still did see some rcu timeouts myself before, but it did recover and the test passed20:40
zx2c4shorne: in this case the test failed because it took >20min20:56
shornezx2c4: yeah, it looks like the lockups are when waiting for IPI requests to complete. i.e. kernel sends a task to be done on multiple CPU's then some CPU's don't respond back21:04
shornethis issue I put some fixes for but there could still be race conditions in qemu that I am not covering21:04
shorneits better to track these down I agree21:04
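
The lockup pattern shorne describes (a sender asks several CPUs to do a task, then waits for every one of them to respond) can be illustrated with a small toy model. This is only a sketch under stated assumptions: threads stand in for CPUs, `threading.Event` stands in for the acknowledgement, and the names `send_ipi_and_wait`, `unresponsive` and `timeout_s` are hypothetical. It is not the kernel's IPI code and not QEMU's implementation.

```python
import threading


def send_ipi_and_wait(n_cpus, unresponsive, timeout_s=0.2):
    """Toy model: ask every simulated CPU to acknowledge a request.

    Returns the set of CPU indices that never acknowledged within the
    timeout. All names here are hypothetical, for illustration only.
    """
    # One acknowledgement flag per simulated CPU.
    acks = [threading.Event() for _ in range(n_cpus)]

    def cpu(idx):
        # A CPU that "lost" the request never sets its flag.
        if idx not in unresponsive:
            acks[idx].set()

    threads = [threading.Thread(target=cpu, args=(i,)) for i in range(n_cpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The sender waits on each flag; a missing acknowledgement shows up
    # as a timeout here instead of blocking forever.
    return {i for i, ev in enumerate(acks) if not ev.wait(timeout_s)}


if __name__ == "__main__":
    stuck = send_ipi_and_wait(n_cpus=4, unresponsive={2}, timeout_s=0.05)
    print("CPUs that never responded:", sorted(stuck))
```

In this model a sender that waited without a timeout would hang on CPU 2 indefinitely, which is the shape of the stall discussed above. Reporting which CPU failed to respond is one way to narrow down where a request was lost.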
shorneI would like to get most of the stuff for qemu merged upstream now though21:05
zx2c4Multiple pull requests seems reasonable 21:05
zx2c4One now, one later21:05
shorneyeah, also maybe someone will point out the issue in the PR review :)21:06
