*** tpb <[email protected]> has joined #openrisc | 00:00 | |
shorne | zx2c4: interesting SIGILL, the address doesn't seem right; 0x300a7ce4 is a user space address | 00:01 |
---|---|---|
zx2c4 | thats a crash when starting bash | 00:01 |
zx2c4 | ive seen various things sort of resembling that... | 00:02 |
zx2c4 | what this all makes me think is that sometimes there's a jump to the wrong code offset | 00:02 |
zx2c4 | btw, you wont be able to repro with wireguard-linux repo because or1k.config has PREEMPT_NONE=y in it. you have to remove that | 00:03 |
shorne | I see, any luck reproducing without the initramfs? | 00:03 |
zx2c4 | innnnteresting. no, when i totally disable initramfs it doesnt repro! | 00:08 |
zx2c4 | btw, on mine, i get | 00:11 |
zx2c4 | [ 0.000000] Initial ramdisk not found | 00:11 |
zx2c4 | in the failing case | 00:11 |
zx2c4 | because im using an in-built initramfs | 00:11 |
zx2c4 | rather than one passed via -initrd | 00:11 |
zx2c4 | CONFIG_BLK_DEV_INITRD=y | 00:11 |
zx2c4 | CONFIG_INITRAMFS_SOURCE="/home/zx2c4/Projects/wireguard-linux/tools/testing/selftests/wireguard/qemu/build/or1k/init-cpio-spec.txt" | 00:11 |
zx2c4 | CONFIG_INITRAMFS_ROOT_UID=0 | 00:11 |
zx2c4 | CONFIG_INITRAMFS_ROOT_GID=0 | 00:11 |
zx2c4 | where that file contains https://xn--4db.cc/8XjbyUbr | 00:12 |
shorne | yeah, I build with : | 00:13 |
shorne | make -j12 LC_ALL=en_US.UTF-8 ARCH=openrisc CROSS_COMPILE=or1k-linux-musl- CONFIG_INITRAMFS_SOURCE="/home/shorne/work/openrisc/or1k-utils/initramfs /home/shorne/work/openrisc/or1k-utils/initramfs.devnodes" | 00:13 |
shorne | to try to reproduce a similar CPIO build in image | 00:13 |
zx2c4 | I am able to reproduce even if i reduce that file to a single entry | 00:14 |
zx2c4 | file /init /home/zx2c4/Projects/wireguard-linux/tools/testing/selftests/wireguard/qemu/build/or1k/init 755 0 0 | 00:14 |
zx2c4 | it reproduced with just that | 00:14 |
zx2c4 | here's the init file https://usercontent.irccloud-cdn.com/file/zeCwEBDE/init | 00:14 |
shorne | ok, let me try that | 00:14 |
zx2c4 | here's the output of my `locale` | 00:14 |
zx2c4 | I'm also able to reproduce with the empty initramfs! | 00:16 |
zx2c4 | CONFIG_BLK_DEV_INITRD=y | 00:16 |
zx2c4 | CONFIG_INITRAMFS_SOURCE="" | 00:16 |
zx2c4 | (the kernel supplies a "default" one in that case) | 00:16 |
zx2c4 | So here's a complete kernel config that exhibits the issue and doesnt have any userland dependencies: https://xn--4db.cc/cCRlQ1AE | 00:17 |
zx2c4 | another interesting quirk: | 00:18 |
zx2c4 | [ 3.364000] Segment Routing with IPv6 | 00:18 |
zx2c4 | [ 3.364000] In-situ OAM (IOAM) with IPv6 | 00:18 |
zx2c4 | [ 200.944000] List of all partitions: | 00:18 |
zx2c4 | [ 200.944000] No filesystem could mount root, tried: | 00:18 |
zx2c4 | It jumped to 200 out of nowhere. I've seen this happen a few times too. The clock goes nuts | 00:19 |
shorne | oh, that is strange | 00:19 |
zx2c4 | More flawed computation? A different bug? Dunno | 00:19 |
zx2c4 | here's a minimal kernel with no userland that exhibits the bug. `qemu-system-or1k -nodefaults -nographic -cpu or1200 -machine or1k-sim -serial stdio -kernel vmlinux` https://usercontent.irccloud-cdn.com/file/CHD5g4be/vmlinux | 00:21 |
shorne | that one reproduces on my qemu | 00:24 |
shorne | but I can't get it with my own build yet | 00:24 |
zx2c4 | use https://xn--4db.cc/cCRlQ1AE with no modifications | 00:24 |
shorne | Tried minimal cpio | 00:24 |
shorne | Will try no cpio | 00:24 |
zx2c4 | Can you try exactly the contents of https://xn--4db.cc/cCRlQ1AE ? | 00:24 |
shorne | trying https://xn--4db.cc/cCRlQ1AE | 00:25 |
zx2c4 | and then maybe that + the musl.cc compiler? | 00:25 |
shorne | its on musl compiler | 00:26 |
shorne | make -j12 LC_ALL=en_US.UTF-8 ARCH=openrisc CROSS_COMPILE=or1k-linux-musl- | 00:26 |
shorne | trying that with your .config | 00:26 |
shorne | maybe its me passing LC_ALL, or other env vars? Should be little difference now | 00:27 |
zx2c4 | HOSTCC shouldnt be doing anything interesting | 00:28 |
zx2c4 | I'll try building with `env -i PATH=/usr/bin:/bin make ...` | 00:29 |
zx2c4 | totally empty environment | 00:30 |
zx2c4 | if youre using the same compiler and same everything else, in theory we should be able to compare vmlinux images, right? | 00:31 |
shorne | I cannot repro with the minimal config | 00:31 |
zx2c4 | i can repro the issue with `env -i PATH=/usr/bin:/bin make ...` | 00:31 |
zx2c4 | oh im setting KBUILD_BUILD_TIMESTAMP to an empty string in my build environment... | 00:32 |
zx2c4 | nope, that's not it. removed that and it still reproduces | 00:33 |
zx2c4 | are you compiling my wireguard tree, by the way, on the `stable` branch, or some other tree? | 00:34 |
shorne | It would be nice if we could compare images, my kernel version is | 00:34 |
shorne | 2022-05-01 d53a0fd87c26 Julia Lawall openrisc: fix typos in comments (HEAD -> master, shorne/master) | 00:34 |
shorne | 2022-05-01 9f10b44dcefc Jason A. Donenfeld openrisc: define nop command for simulator reboot | 00:34 |
shorne | 2022-04-24 af2d861d4cd2 Linus Torvalds Linux 5.18-rc4 (tag: v5.18-rc4) | 00:34 |
zx2c4 | alright same here | 00:34 |
shorne | I have 2 patches on top of v5.18-rc4 | 00:34 |
zx2c4 | oh, er | 00:34 |
shorne | ah, maybe thats it? | 00:35 |
shorne | let me try to get your tree | 00:35 |
zx2c4 | let me check out linus' tree | 00:35 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 00:37 | |
shorne | zx2c4: should I be using the stable branch? | 00:37 |
zx2c4 | yea | 00:37 |
shorne | ok, building it now | 00:39 |
zx2c4 | im building rc5 | 00:39 |
shorne | [ 3.320000] WARNING: CPU: 0 PID: 1 at lib/crypto/curve25519.c:19 curve25519_init+0x38/0x50 | 00:40 |
zx2c4 | BINGO | 00:40 |
zx2c4 | !!! | 00:40 |
zx2c4 | whewwww | 00:40 |
shorne | ok its happening right away | 00:40 |
zx2c4 | good | 00:40 |
zx2c4 | i can repro with rc5 | 00:41 |
zx2c4 | so ostensibly this is a regression somewhere between rc4 and rc5? | 00:41 |
shorne | yeah, it looks like it | 00:41 |
shorne | need to bisect | 00:41 |
shorne | but again, I need to go soon | 00:42 |
shorne | sorry, I am on vacation with my family this week, Japan has a 3 day holiday tue-thu | 00:42 |
zx2c4 | ahhh | 00:42 |
shorne | oh, wife is getting mad, too much computer | 00:42 |
zx2c4 | haha | 00:42 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 01:33 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 01:37 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 02:17 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 06:13 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 06:51 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 07:28 | |
shorne | bisected to e5be15767e7e284351853cbaba80cde8620341fb, hex2bin: make the function hex_to_bin constant-time | 08:05 |
shorne | after reverting that on your 'stable' branch it doesnt reproduce | 08:13 |
zx2c4 | Whaaaaa.... That cant possibly be it... | 09:17 |
zx2c4 | shorne: funny enough, https://lists.librecores.org/pipermail/openrisc/2022-May/003922.html also "fixes" it... so basically there's some really subtle bug that most things mask | 11:11 |
tpb | Title: [PATCH] openrisc: remove bogus nops and shutdowns (at lists.librecores.org) | 11:11 |
zx2c4 | and we're really lucky that we've unmasked it now | 11:11 |
zx2c4 | figuring out root cause would be a very very good idea before it's papered over by other things | 11:11 |
littlebobeep | Hmm so yall use QEMU? No FPGAs? | 12:18 |
zx2c4 | littlebobeep: im just a lowly CI admin | 14:05 |
zx2c4 | im sure shorne has some FGPAs running or1k though | 14:05 |
zx2c4 | FPGAs | 14:05 |
littlebobeep | I see okay | 14:09 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 14:24 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 14:27 | |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 14:33 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 14:35 | |
*** littlebo1eep <littlebo1eep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 14:39 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 14:45 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 15:16 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 15:23 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 15:29 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 15:41 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 15:55 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 16:01 | |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has quit IRC (Ping timeout: 240 seconds) | 16:42 | |
shorne | yeah, I can try on FPGAs | 19:39 |
shorne | but right now I want to try a few other things | 19:39 |
shorne | using my 12.x compiler I cannot trigger the issue | 20:14 |
shorne | Also, I cannot reproduce using v5.18-rc5 | 20:46 |
shorne | so it seems it takes that particular commit with that particular compiler version (11.2.1) and that config | 20:46 |
*** littlebobeep <littlebobeep!~alMalsamo@gateway/tor-sasl/almalsamo> has joined #openrisc | 21:32 | |
shorne | littlebobeep: I use FPGA and QEMU | 22:06 |
shorne | but for this testing we are discussing its QEMU | 22:07 |
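
Two ideas from the 00:29 to 00:31 exchange, building in an emptied environment and comparing the resulting vmlinux images, can be sketched as small shell helpers. The function names `clean_env_run` and `same_image` are hypothetical and do not come from either participant's scripts. Even with the same compiler and tree, values embedded at build time (for example the build timestamp that `KBUILD_BUILD_TIMESTAMP` controls) may make two images differ, so a "different" result is only a starting point.

```shell
# Hypothetical helper: run any command with the environment cleared,
# mirroring the `env -i PATH=/usr/bin:/bin make ...` approach from the log.
clean_env_run() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Hypothetical helper: report whether two build outputs are byte-identical.
# A missing or unreadable file is also reported as "different".
same_image() {
    if cmp -s "$1" "$2"; then
        echo identical
    else
        echo different
    fi
}
```

For example, `same_image vmlinux.mine vmlinux.theirs` (placeholder file names) prints `identical` only when every byte matches.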
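
The bisection mentioned at 00:41 and reported at 08:05 could be automated with `git bisect run`, which treats exit status 0 as good, 125 as skip, and other statuses from 1 to 127 as bad. The sketch below is not shorne's actual procedure. It assumes a boot log has already been captured to a file, and it checks only for the `curve25519_init` warning quoted in the log, not for the SIGILL crashes or the clock jump. The names `classify_boot_log`, `bisect-step.sh`, and `boot.log` are hypothetical.

```shell
# Hypothetical helper: classify one captured boot log for `git bisect run`.
# Returns 1 (bad) if the warning is present, 0 (good) if absent,
# and 125 (skip) if there is no usable log to judge.
classify_boot_log() {
    log=$1
    # A missing or empty log says nothing about this commit; skip it.
    [ -s "$log" ] || return 125
    # Symptom quoted in the channel: a WARNING raised in curve25519_init.
    if grep -q 'WARNING: .*curve25519_init' "$log"; then
        return 1
    fi
    return 0
}

# Possible driver, shown as comments because it needs a kernel tree:
#   git bisect start v5.18-rc5 v5.18-rc4    # bad first, then good (assumed endpoints)
#   git bisect run ./bisect-step.sh
# where the hypothetical bisect-step.sh builds the kernel, boots it under
# QEMU with the serial output captured to boot.log, and then ends with:
#   classify_boot_log boot.log
```

Returning 125 for an empty log keeps a failed build or a hung boot from being counted as a good commit.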