*** tpb <[email protected]> has joined #openrisc | 00:00 | |
zx2c4 | SMP thing might be a red herring, but CONFIG_PREEMPT seems like a good differentiator | 00:18 |
---|---|---|
zx2c4 | btw, any interest in making `-accel tcg,thread=multi` work properly? | 00:26 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 04:15 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 05:32 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Remote host closed the connection) | 07:08 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 07:13 | |
shorne | zx2c4: I did have multi threading working before, I have some patches I'll try to find them | 07:18 |
shorne | ill try the CONFIG_PREEMPT setting to see if I can see anything | 07:19 |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 08:15 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 08:15 | |
zx2c4 | shorne: oh yea multithreading would be a very nice difference indeed | 08:20 |
zx2c4 | Any luck with CONFIG_PREEMPT? | 08:20 |
shorne | zx2c4: sorry only had time on the computer to reply to you, I have a few minutes now | 08:56 |
shorne | ill see if I can find my threading patches, it doesn't look like I pushed them anywhere | 08:56 |
shorne | this was the patch: https://github.com/stffrdhrn/qemu/commit/27bb7ae5837c18b0355643a53279aaa915f2b041 | 09:00 |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Remote host closed the connection) | 09:00 | |
shorne | All I did was move ttcr to be a per cpu property, but somehow that broke singlethread timer synchronization I think | 09:01 |
shorne | I did it to try to solve a TLB thrashing problem but since it didn't really help I just dropped it | 09:01 |
zx2c4 | Oh you should push that upstream to qemu | 09:10 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 09:10 | |
shorne | alright, let me work on it | 09:15 |
shorne | you are making my todo list pretty long ;), all good stuff for openrisc | 09:16 |
shorne | I hope you find it useful to have wireguard running on OpenRISC | 09:16 |
shorne | It sure uncovers some interesting dependencies | 09:17 |
zx2c4 | heh | 09:22 |
zx2c4 | well it succeeded for the first time on build.wireguard.com last night | 09:22 |
zx2c4 | (when i disabled preemption) | 09:22 |
zx2c4 | so it's now there if you ctrl+f for or1k | 09:22 |
zx2c4 | (working on wireguard-linux-stable) | 09:22 |
zx2c4 | it currently takes three times as long to run as other architectures though | 09:22 |
zx2c4 | not quite sure why tcg is so much slower | 09:23 |
shorne | I am running CONFIG_PREEMPT=y, CONFIG_SMP=y I am still not seeing the errors with curve25519 | 09:24 |
shorne | thats interesting | 09:25 |
zx2c4 | let me send you my VM image | 09:25 |
zx2c4 | that exhibits the issue | 09:25 |
zx2c4 | then you can see whether it's your qemu | 09:26 |
shorne | thanks | 09:26 |
zx2c4 | wait... | 09:26 |
zx2c4 | try with CONFIG_SMP=n | 09:26 |
zx2c4 | you misunderstood me earlier | 09:26 |
shorne | I tried with SMP=n too, let me try more | 09:27 |
shorne | if you don't mind me asking where do you live? | 09:27 |
shorne | Either you don't sleep much or we have pretty good time overlap | 09:28 |
shorne | I am in Japan | 09:28 |
zx2c4 | haha | 09:28 |
zx2c4 | I'm in Paris | 09:28 |
zx2c4 | I also dont sleep much | 09:28 |
zx2c4 | This one exhibits the issue. Run with `qemu-system-or1k -nodefaults -nographic -cpu or1200 -machine or1k-sim -smp 1 -m 256M -serial stdio -monitor none -kernel path/to/vmlinux` https://usercontent.irccloud-cdn.com/file/VYhUCJ7l/vmlinux | 09:29 |
zx2c4 | (and remember to use that patch that page aligns the kernel) | 09:30 |
zx2c4 | er, page aligns the dtc i mean | 09:30 |
zx2c4 | dtb | 09:30 |
shorne | yes | 09:30 |
shorne | I need to go away again, dinner time | 09:32 |
shorne | not sure I will be back until morning | 09:32 |
shorne | One strange thing about the page_align patch is my kernel boots ok even when I don't use an external initrd | 09:32 |
shorne | your vmlinux doesn't, I might just be lucky with the boundary | 09:33 |
shorne | mine is only about 6mb compared to 11mb | 09:33 |
shorne | got to go | 09:33 |
zx2c4 | yea, it'll boot okay but it uses the compiled-in DTB | 09:33 |
zx2c4 | rather than the provided one | 09:33 |
zx2c4 | im forward porting your multithreaded patch to recent qemu | 09:34 |
zx2c4 | hmm seems like that doesnt work straightforwardly | 10:08 |
zx2c4 | no idea | 10:08 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 11:10 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 11:10 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Remote host closed the connection) | 13:34 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 13:35 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 14:19 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 18:12 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 18:31 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 18:40 | |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 18:49 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 18:53 | |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 18:57 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 19:02 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 19:14 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 19:15 | |
arnd | shorne, zx2c4: I've seen compilers struggle with curve25519 a lot. Usually it manifests as exploding stack usage that shows up in the build warnings, and that can lead to data corruption, but I guess it could be anything. May be worth trying to see if it happens with both gcc and clang, or just one of them | 19:26 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 19:29 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 19:33 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 19:42 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 19:56 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 20:09 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 20:12 | |
zx2c4 | Ive seen chacha20poly1305 tests fail too fwiw | 20:13 |
zx2c4 | Less commonly but still | 20:13 |
shorne | good morning, I see | 20:14 |
shorne | BTW, booting your kernel | 20:29 |
shorne | Linux version 5.18.0-rc4+ (zx2c4@thinkpad) (or1k-linux-musl-gcc (GCC) 11.2.1 20211120, GNU ld (GNU Binutils) 2.37) #31 PREEMPT Tue May 3 11:27:57 CEST 2022 | 20:29 |
shorne | on my existing QEMU I can see the issue right away | 20:29 |
shorne | Linux version 5.18.0-rc4-00002-gd53a0fd87c26-dirty (shorne@antec) (or1k-linux-gcc (GCC) 12.0.1 20220210 (experimental), GNU ld (GNU Binutils) 2.38.50.20220202) #737 PREEMPT Tue May 3 18:28:10 JST 2022 | 20:31 |
shorne | basics seem fairly in line, compilers are different, I did fix a few things in gcc 12.x but I think mostly related to linking | 20:35 |
shorne | one was updating libgcc | 20:36 |
shorne | -#define _FP_TININESS_AFTER_ROUNDING 1 | 20:36 |
shorne | +#define _FP_TININESS_AFTER_ROUNDING 0 | 20:36 |
zx2c4 | So it's either: compiler config or linux config | 20:36 |
zx2c4 | https://xn--4db.cc/JZujg1aZ try this config i guess | 20:37 |
shorne | well, I tested with that compiler | 20:37 |
shorne | so it seems unlikely | 20:37 |
zx2c4 | Oh, and couldnt repro with my compiler | 20:37 |
zx2c4 | Alright so my config then... | 20:37 |
shorne | yes, testing with that | 20:39 |
shorne | btw this is my config I have been using to try to reproduce the issue: https://gist.github.com/8bffdbaf3eca34f100bee9962d3a1375 | 20:43 |
zx2c4 | No idea if it matters but my configs set CONFIG_HZ_250 instead of 100 | 20:51 |
zx2c4 | Your kernel has scheduler and lock debugging enabled | 20:57 |
zx2c4 | shorne: ^ | 21:01 |
zx2c4 | Also im using slub instead of slob fwiw | 21:01 |
zx2c4 | My guess would be that the scheduler or lock debugging hides the bug | 21:02 |
shorne | it could be | 21:09 |
shorne | I got your config booting, the only difference now is I am not packing your initrd into the kernel | 21:10 |
shorne | but its not reproducing, it could be that the extra work of unpacking initrd during boot changes something | 21:11 |
shorne | I can try with one of my initrds... | 21:12 |
shorne | ok, even with your config, and a big fat initrd, its not reproducing | 21:24 |
shorne | Ill try using the compiler too | 21:24 |
shorne | I am looking at your initramfs --- https://github.com/WireGuard/wireguard-linux/blob/2bc51cf02c400ec6ea230a0efed2f2d9cc2bc5f5/tools/testing/selftests/wireguard/qemu/Makefile#L307 | 21:46 |
shorne | its not much different from mine, its just you build a very minimal one, mine is here: https://github.com/stffrdhrn/or1k-utils initramfs/ initramfs.devnodes | 21:48 |
shorne | zx2c4: one difference I find is where Unpacking initramfs... happens: https://gist.github.com/stffrdhrn/b81c2fc73253a11a7ec1ccea472ddd7a | 22:06 |
shorne | Ill try to reproduce the order you are seeing | 22:07 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 22:15 | |
shorne | as per how the kernel is linked the order in your kernel should be more correct, but the rootfs init is run async so the order doesn't need to be guaranteed | 22:27 |
shorne | zx2c4: what is your setting for `locale`, i.e. LC_COLLATE="en_US.UTF-8" I wonder if sorting has anything to do with it | 22:29 |
zx2c4 | Ill be home in 10min | 22:40 |
zx2c4 | Same locale though i believe | 22:40 |
shorne | OK, probably not related then, not sure what is the difference now, Ill try to build using your wireguard-linux makefile | 22:41 |
shorne | sorry, I need to go out now | 22:41 |
zx2c4 | Darn. | 22:43 |
zx2c4 | alright home now | 22:50 |
zx2c4 | taking a look | 22:50 |
zx2c4 | LANG=en_US.UTF-8 | 22:50 |
zx2c4 | > In my builds Unpacking initramfs... runs after crypto self tests | 22:51 |
zx2c4 | Oh interesting observation | 22:51 |
zx2c4 | [ 3.380000] Initramfs unpacking failed: compression method gzip not configured | 22:51 |
zx2c4 | i wonder if i can repro if i remove the initramfs entirely | 22:52 |
zx2c4 | shorne: huh here's a new flake i havent seen before https://א.cc/WI4rWSXq and cant repro now | 22:55 |
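
For anyone retracing the repro, here is a minimal sketch of a helper that prints the QEMU command line quoted in the log, with the `-accel tcg,thread=...` option zx2c4 asked about added on. The function name `build_qemu_cmd` and its three parameters are hypothetical, introduced only for illustration. The base flags are copied from the log. Whether `thread=multi` behaves correctly on or1k is exactly the open question in the discussion, so treat this as a dry-run aid rather than a known-good setup.

```shell
#!/bin/sh
# Sketch only: assemble the qemu-system-or1k command line from the log.
# build_qemu_cmd and its parameters are hypothetical names for illustration.
#   $1 = path to the kernel image (vmlinux)
#   $2 = TCG threading mode: "single" (default here) or "multi"
#   $3 = number of vCPUs passed to -smp (default 1, as in the log)
build_qemu_cmd() {
    kernel="$1"
    thread="${2:-single}"
    smp="${3:-1}"
    printf '%s' "qemu-system-or1k -nodefaults -nographic -cpu or1200 -machine or1k-sim -smp $smp -m 256M -serial stdio -monitor none -accel tcg,thread=$thread -kernel $kernel"
}

# Dry run: print the command instead of executing it.
build_qemu_cmd path/to/vmlinux multi 2
echo
```

Printing the command rather than running it makes it easy to compare the single-threaded and multi-threaded invocations side by side. `thread=multi` only makes a difference when `-smp` is greater than 1, and per the log the kernel should be booted with the patch that page-aligns the DTB.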