*** tpb has joined #tomu | 00:00 | |
*** alon_ has joined #tomu | 00:09 | |
*** alon has quit IRC | 00:12 | |
*** CarlFK has quit IRC | 00:33 | |
*** alon_ has quit IRC | 01:33 | |
*** alon_ has joined #tomu | 01:34 | |
*** alon_ has quit IRC | 02:00 | |
xobs | vesim: what if you place it in ram? | 02:32 |
---|---|---|
futarisIRCcloud | vesim: V (vector) extensions? | 03:02 |
xobs | It'd be nice if we added -M | 03:17 |
*** alon has joined #tomu | 03:34 | |
*** alon has quit IRC | 03:52 | |
*** rohitksingh has joined #tomu | 04:08 | |
*** buZz__ has quit IRC | 06:45 | |
*** buZz__ has joined #tomu | 06:45 | |
*** buZz__ is now known as buZz | 06:45 | |
*** CarlFK has joined #tomu | 10:13 | |
mithro | Happy new year! | 11:32 |
xobs | Happy New Year! | 11:40 |
vesim | xobs: i don't think that's going to change a lot, the lack of m extension is killing the performance | 12:29 |
xobs | vesim: let me see how much a multiply unit adds. | 12:29 |
vesim | i've tried to enable m extension and it is working fine, but the USB stack changed | 12:31 |
vesim | from eptri to epfifo | 12:31 |
vesim | and tinyusb doesn't work with that | 12:31 |
vesim | i don't want to mess too much with foboot because i don't have jtag jig for fomu | 12:32 |
xobs | vesim: You can always load top.bin, then load your program on top of it. That's not permanent. | 12:33 |
vesim | so, only the booster is overriding it? | 12:34 |
xobs | Correct! | 12:34 |
vesim | nice, thx | 12:35 |
vesim | btw, having https://github.com/pulp-platform/ariane would be awesome :D | 12:35 |
tpb | Title: GitHub - pulp-platform/ariane: Ariane is a 6-stage RISC-V CPU capable of booting Linux (at github.com) | 12:35 |
xobs | Now I'm trying various vexriscv configurations to see what'll fit for you. | 12:56 |
xobs | If only it were slightly larger. | 12:56 |
vesim | for now i just increased the timeouts to 2 minutes in u2f host library | 13:07 |
xobs | That does seem like an unreasonable amount of time. | 13:28 |
vesim | it is like 20-40 seconds but i've set it to 2 minutes just to be safe | 14:02 |
vesim | the default timeout is 2 seconds | 14:02 |
xobs | I'm having to patch vexriscv to let the multiplier and divider work with our short pipeline. | 14:03 |
vesim | i wonder if there is a way to get good source of entropy on fomu | 14:07 |
vesim | because without that getting u2f on fomu is kinda useless | 14:12 |
xobs | So I can get the multiply unit to fit if I disable debug, which actually might be fine for you. | 14:14 |
xobs | Sure you can't debug it anymore, but you have a workable solution for debug already, with the default firmware. | 14:15 |
xobs | For entropy, we can try creating a Random block based on something like this: https://github.com/dpiegdon/verilog-buildingblocks/tree/4f19cbaf35c7f3fee21b497a8f0ca2b3bf252950/lattice_ice40 | 14:16 |
tpb | Title: verilog-buildingblocks/lattice_ice40 at 4f19cbaf35c7f3fee21b497a8f0ca2b3bf252950 · dpiegdon/verilog-buildingblocks · GitHub (at github.com) | 14:16 |
vesim | is there space for that and m extension? :D | 14:17 |
xobs | The ring oscillator is super tiny. With M (but without an MMU or USB Debug), we're at 93%. | 14:18 |
xobs | `ICESTORM_LC: 4944/ 5280 93%` (with a full single-cycle multiply + divide) | 14:19 |
vesim | nice | 14:21 |
xobs | Also: `ICESTORM_DSP: 4/ 8 50%` (so it's actually using the DSP, yay) | 14:22 |
vesim | did you change anything other than cpu_variant in foboot-bitstream.py? | 14:27 |
xobs | I mean, I replaced the CPU. I need to get some changes merged back up to vexriscv to get the multiply and divide plugins to work with a two-stage pipeline. | 14:29 |
xobs | I can generate a trial .bin file for you if you'd like to experiment some. I haven't had a chance to test the multiply unit yet. | 14:35 |
xobs | You should be able to load it onto your Fomu, then it will show up as "2.0.3-[something]", then you can load your program recompiled with -M. | 14:36 |
xobs | When you unplug it, it will go back to 2.0.3. | 14:36 |
vesim | which usb stack did you use? eptri? | 14:36 |
xobs | Yeah, eptri | 14:36 |
vesim | because tinyusb is not working with epfifo | 14:36 |
xobs | Right, tinyusb only works with eptri. | 14:37 |
vesim | which one is better/newer? | 14:37 |
xobs | eptri is newer -- it supports more software (because of tinyusb), and supports up to 32 USB endpoints (as opposed to ~3) | 14:37 |
* xobs posted a file: top.bin (102KB) < https://matrix.org/_matrix/media/r0/download/matrix.org/kvWgIlPCjyfpybgQECgogJyE > | 14:38 | |
xobs | eptri is documented at https://rm.fomu.im/usb.html | 14:38 |
tpb | Title: USB Fomu Bootloader documentation (at rm.fomu.im) | 14:38 |
vesim | unfortunately tinyusb is not working with -march=rv32im and your top.bin | 14:47 |
vesim | but it is working fine w/o m | 14:47 |
vesim | on your top.bin | 14:47 |
xobs | Guess "m" is broken somehow. | 14:47 |
vesim | [121280.118519] usbhid 1-2:1.0: can't add hid device: -110 [121280.118549] probe of 1-2:1.0 returned 0 after 20738188 usecs | 14:47 |
vesim | but it recognized the usb device | 14:48 |
vesim | [121259.374291] usb 1-2: Product: TinyUSB Device | 14:48 |
xobs | So it enumerated, but isn't working? | 14:48 |
vesim | yep | 14:48 |
vesim | i've tried stock hid_composite example | 14:50 |
vesim | after commenting out hid_task(); it is working (the led is blinking) | 14:51 |
vesim | ok, i narrowed down the issue, board_delay() is not working for some reason | 14:53 |
vesim | it is stuck at that function | 14:53 |
xobs | That probably does a multiply. | 14:56 |
vesim | nope, just subtraction | 14:57 |
xobs | Oops. I don't know how to Git. I pulled "master", but it didn't get merged. | 14:57 |
vesim | it looks more like an issue with the interrupt controller | 14:58 |
vesim | after commenting out tud_task(); in board_delay() it is working but "sleeping" less than the specified amount of time | 14:58 |
vesim | https://github.com/hathach/tinyusb/blob/master/hw/bsp/board.h#L105 | 14:59 |
tpb | Title: tinyusb/board.h at master · hathach/tinyusb · GitHub (at github.com) | 14:59 |
xobs | Hmm... I need to run for the night. But you can try this to see if it helps -- it's based on upstream vex, which seems to have fixed the pipeline issues in a different way: | 15:05 |
* xobs posted a file: top.bin (102KB) < https://matrix.org/_matrix/media/r0/download/matrix.org/yjfrqtdREsTEuPVwNcJJSlFB > | 15:05 | |
xobs | Anyway, goodnight! | 15:05 |
vesim | thx for the help | 15:06 |
vesim | it is working :D | 15:07 |
vesim | it went down from 53 seconds to 8 seconds | 15:14 |
vesim | it went down from 53 seconds to 8 seconds | 15:44 |
vesim | 5.96s i think i won't be able to get lower | 15:49 |
vesim | so... it is time to learn risc-v assembler | 15:49 |
CarlFK | vesim: it=256 mul ? | 16:30 |
vesim | CarlFK: what do you mean? | 16:31 |
CarlFK | vesim: wondering what it is that is now working | 16:32 |
vesim | CarlFK: standard m extension for risc-v | 16:33 |
CarlFK | I remember discussion about multiplying 256 bits | 16:33 |
vesim | it is just mul/div/mod on 32-bit registers | 16:33 |
vesim | so, i need to get some fast 256 bit integer multiplication for risc-v | 16:34 |
vesim | that's necessary to get u2f working on fomu | 16:35 |
CarlFK | I didn't realize it was at 50 seconds | 16:42 |
vesim | the 50s was for the whole registration process needed by u2f | 16:43 |
vesim | it is calculation of p256r1 public key, calculating sha256, and p256r1 signing of that sha256 hash | 16:45 |
CarlFK | ah. I was thinking that was just for the mul instruction (or whatever the mnemonic is) | 16:46 |
vesim | calculating p256r1 public key is taking around 1.5s | 16:46 |
vesim | but that's really hacky, it varies between 1.5s and 4s | 16:47 |
vesim | recompiling the code is enough to double the execution time | 16:47 |
CarlFK | Sounds like a good catalyst to attract optimization efforts ;) | 16:49 |
CarlFK | a few months ago someone was talking to me about the various ways of doing math with gates... | 16:50 |
vesim | i don't think we have enough space for doing 256bit multiplication that way :P | 16:51 |
vesim | https://github.com/gl-sergei/u2f-token/blob/master/src/muladd_256.h i just need to port this to risc-v | 16:51 |
tpb | Title: u2f-token/muladd_256.h at master · gl-sergei/u2f-token · GitHub (at github.com) | 16:51 |
vesim | did anyone try to increase the clock? :D | 17:07 |
*** emeb has joined #tomu | 17:23 | |
*** Vercas has quit IRC | 19:02 | |
*** Vercas has joined #tomu | 19:10 | |
*** Dan_au has quit IRC | 19:16 | |
*** Dan_au has joined #tomu | 19:23 | |
*** rohitksingh has quit IRC | 19:24 | |
vesim | 2.76s for whole u2f registration process... i think that's the limit of compiler optimizations | 21:45 |
futarisIRCcloud | https://twitter.com/jonoxer/status/1212236630133628928 | 22:33 |
*** Vercas has quit IRC | 23:16 | |
*** Vercas has joined #tomu | 23:22 |
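
For readers following the 256-bit multiplication thread above: the M extension only multiplies 32-bit registers, so a 256-bit product has to be assembled from 32-bit limbs. The block below is a minimal portable C sketch of that idea using schoolbook multiplication. It is not the code from `muladd_256.h` and not what vesim ended up using; the names `mul256` and `LIMBS` are made up for illustration, and the limb order (least significant first) is an assumption. On an `rv32im` target a compiler would typically lower each 32x32 to 64-bit product to a `mul`/`mulhu` pair, though that depends on the toolchain and flags.

```c
#include <stdint.h>

/* Hypothetical layout: 8 x 32-bit limbs = 256 bits, least significant limb first. */
#define LIMBS 8

/*
 * Schoolbook multiplication: out[0..15] = a[0..7] * b[0..7].
 * Each inner step does one 32x32 -> 64-bit multiply and folds in the
 * partial result already stored in out[] plus the running carry.
 */
static void mul256(uint32_t out[2 * LIMBS],
                   const uint32_t a[LIMBS],
                   const uint32_t b[LIMBS])
{
    for (int i = 0; i < 2 * LIMBS; i++)
        out[i] = 0;

    for (int i = 0; i < LIMBS; i++) {
        uint32_t carry = 0;
        for (int j = 0; j < LIMBS; j++) {
            /* (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so this sum never overflows 64 bits. */
            uint64_t t = (uint64_t)a[i] * b[j] + out[i + j] + carry;
            out[i + j] = (uint32_t)t;        /* low 32 bits stay in place */
            carry = (uint32_t)(t >> 32);     /* high 32 bits move to the next limb */
        }
        /* out[i + LIMBS] has not been written by earlier rows, so a plain store is enough. */
        out[i + LIMBS] = carry;
    }
}
```

This performs 64 limb multiplies per 256-bit product. A hand-written RISC-V port in the spirit of `muladd_256.h` would presumably aim to keep more of the accumulator in registers, which is the kind of gain a compiler may not find on its own.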