*** tpb has joined #timvideos | 00:00 | |
*** andi-m has joined #timvideos | 00:03 | |
*** TheAssassin has quit IRC | 00:12 | |
mithro | xobs: ping? I think I'm at the state of needing grainuum now | 00:15 |
---|---|---|
*** TheAssassin has joined #timvideos | 00:17 | |
*** andi-m has quit IRC | 00:39 | |
*** andi-m has joined #timvideos | 00:44 | |
*** andi-m has quit IRC | 01:14 | |
*** andi-m has joined #timvideos | 01:17 | |
xobs | CarlFK: usually it's "patch -p1 < [file].patch" | 02:08 |
xobs | Or, since that's git, "git am [file].patch" | 02:08 |
xobs | mithro: fast. what do you have? | 02:08 |
mithro | I have input + output FIFO and it seems to meet timing | 02:09 |
mithro | And it passes my test suite | 02:10 |
xobs | Okay. What's the interface look like? Is it in your repo? | 02:11 |
mithro | Yes, it's basically two csrs in both directions | 02:11 |
mithro | xobs: Just pushing now | 02:17 |
xobs | mithro: alright, I've got about 90 minutes before I need to go for about three hours. | 02:18 |
mithro | xobs: pushed | 02:19 |
xobs | still lm32? | 02:23 |
mithro | Yes | 02:23 |
mithro | Need to figure out why we can't disable the multiplier / divider | 02:24 |
cr1901_modern | Knowing my luck it'll still work w/ simulation... | 02:26 |
mithro | I'm a bit worried about not being able to process the packets fast enough | 02:26 |
cr1901_modern | mithro: Does the mult/div fail on any targets where the BIOS runs from BRAM? | 02:28 |
mithro | cr1901_modern: Dunno | 02:29 |
mithro | xobs: Any luck? | 02:40 |
xobs | mithro: still trying to build it. what configuration are you using? | 02:40 |
mithro | LX P=tomu_fpga_hacker T=usb.minimal F=tinyusb R=tinyusb-work | 02:40 |
xobs | did you check in usb.py? | 02:41 |
mithro | xobs: Nope! sorry | 02:41 |
mithro | xobs: Pushed | 02:41 |
xobs | Thanks, building now. | 02:49 |
mithro | https://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3958-L3987 | 02:50 |
tpb | Title: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com) | 02:50 |
mithro | That is how to queue stuff for sending | 02:50 |
mithro | xobs: How to read -> https://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3936-L3956 | 02:51 |
tpb | Title: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com) | 02:51 |
xobs | That's good. | 02:51 |
xobs | I was thinking we may need to wrap it around a simple state machine to handle things like sending ACK and NAK. Is that there yet? | 02:52 |
mithro | xobs: Well, I have Python code which does that in the tests | 02:52 |
mithro | https://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3994-L4111 | 02:53 |
tpb | Title: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com) | 02:53 |
xobs | Okay, I see. | 02:53 |
xobs | I actually realized last night that for comparing the PID to generate a response, as a tiny optimization we can look at only the first 4 bits. That way each PID uses only a single LUT. | 02:53 |
mithro | xobs: I can see a lot of room for optimisation | 02:55 |
mithro | xobs: We should be using the FFs in the IO tile | 02:57 |
xobs | We can also use LVDS pins, though I'm still not sure what that actually means. The reference manual isn't very clear what benefits that has over normal IO. | 02:57 |
mithro | The two FFs in both IO tiles can be fed directly into the rx state machine | 02:59 |
xobs | "Info: Max frequency for clock 'usb_48_clk_$glb_clk': 48.49 MHz (PASS at 48.00 MHz)" Wow, that is shaving it close. | 03:00 |
mithro | xobs: you can increase the margin by reducing the FIFO depth | 03:01 |
mithro | xobs: Need to figure out why the margin is so low | 03:02 |
mithro | Info: 8.3 ns logic, 12.3 ns routing | 03:03 |
xobs | Oh, you're still calculating CRC16 in hardware? | 03:03 |
mithro | xobs: no | 03:03 |
xobs | In top.v I see things in clk_usb_48 like "usb_tx_crc[13] <= usb_tx_crc[12];" | 03:03 |
mithro | xobs: Hrm? Are you looking at an old one? | 03:04 |
xobs | Maybe. Let me remove build and try again. | 03:04 |
xobs | Oh, oops. And I've just discovered that removing "build" also removes conda. | 03:04 |
mithro | grep crc build/tomu_fpga_hacker_usb_lm32.minimal/gateware/top.v | 03:04 |
mithro | That returns nothing for me... | 03:05 |
cr1901_modern | > And I've just discovered that removing "build" also removes conda. | 03:06 |
cr1901_modern | Congrats. That's a rite of passage :). | 03:06 |
mithro | Internet is nice and fast in singapore right? :-P | 03:07 |
xobs | mithro: yeah, it spends most of its time "solving environment" | 03:07 |
cr1901_modern | faster than AUS? | 03:10 |
* cr1901_modern has heard horror stories | 03:10 | |
xobs | cr1901_modern: gigabit, but it depends on if a CDN is on-island or not. | 03:11 |
mithro | xobs: I want to move everything apart from the shift register and bitstuffer out of the 48MHz domain | 03:19 |
xobs | mithro: at the very least, you'll need a simple state machine that responds to packets with ACK, NAK, or (if the epnum matches) DATA. | 03:20 |
xobs | But that state machine has like 5 states. | 03:21 |
xobs | I wonder why Conda doesn't install newlib by default. | 03:47 |
mithro | xobs: Our compiler doesn't use newlib by default | 03:48 |
xobs | mithro: but the bios does, so "make gateware" fails. | 03:48 |
mithro | xobs: It only does in my branch at the moment | 03:49 |
mithro | because of lto | 03:49 |
xobs | Oh! I see. | 03:49 |
mithro | https://github.com/enjoy-digital/litex/issues/139 | 03:49 |
tpb | Title: Enable link time optimisation in builds · Issue #139 · enjoy-digital/litex · GitHub (at github.com) | 03:49 |
xobs | Okay, I got things building again. And I don't see crc in the 48 MHz domain anymore. | 03:50 |
mithro | xobs: I think we should move towards the stuff at the bottom of this diagram -> https://docs.google.com/drawings/d/1olpdWXglPGzJdW_R1DwWviBumlCetyuSZjl-luO_Xe8/edit | 03:57 |
tpb | Title: valentyusb - Parts - Google Drawings (at docs.google.com) | 03:57 |
xobs | mithro: that's actually what I was about to try to implement. The simple FSM can work in the slow domain, once the fast domain collects bytes. | 03:58 |
mithro | I actually think it could be even easier | 03:59 |
xobs | That way we can shrink the FIFO down from 16 bytes to 2 or 3. | 03:59 |
mithro | Hrm... | 04:03 |
mithro | xobs: We get 1 bit every 4 clock cycles of the 48MHz -- so the actual bit rate is 12 MHz? | 04:04 |
xobs | That's correct. If you group those into bytes, it gets even better. | 04:05 |
mithro | So, bytes occur at 1.5MHz | 04:05 |
cr1901_modern | and transition rate (0->1 or 1->0) is at 6 MHz | 04:06 |
mithro | cr1901_modern: Hrm? | 04:06 |
cr1901_modern | Oh I'm just adding some commentary :P | 04:07 |
cr1901_modern | USB 1.x is 12 Mbps, but the transition rate is half that | 04:07 |
xobs | I'll be back in about three hours. | 04:16 |
xobs | Strange how it doesn't meet timing if I set the FIFO depth to 8 or 2. Only 16 works. I'll need to investigate. | 04:18 |
cr1901_modern | xobs: When you get back... do you have a demo of your BRAM reload? | 04:18 |
xobs | cr1901_modern: sure, I'd love to help someone else get it working, too. What board are you targeting? | 04:19 |
cr1901_modern | litex-buildenv in general... so no board in particular. I can at least use your idea for inspiration | 04:20 |
cr1901_modern | xobs: Oh yes, do you still have your working minimal vexriscv? https://logs.timvideos.us/%23timvideos/%23timvideos.2018-12-19.log.html#t2018-12-19T16:28:11 | 04:21 |
cr1901_modern | Gaaaah, I'm bad at multitasking lol | 04:21 |
xobs | No worries. | 04:23 |
cr1901_modern | your minimal vexriscv should _also_ give me some hints as to why it behaves so poorly when doing XIP | 04:24 |
xobs | It'll be for the Fomu evt board, which is what I actually have. I'll try it again once I get home. | 04:26 |
cr1901_modern | alright cool. Just link your code in here when you can. I'll prob be asleep (holy crap, regular sleep schedule) | 04:27 |
mithro | xobs: It probably stops using the bram? | 04:30 |
mithro | xobs: I'm going to have to switch to doing SymbiFlow stuff | 04:34 |
mithro | xobs: It would be pretty easy to just map the RX / TX to a pair of CSR registers | 04:35 |
xobs | mithro: alright, does that mean I'm going to take over the project now? | 04:35 |
mithro | xobs: Can work more on it at CCC after my talk and next year | 04:36 |
mithro | xobs: You could even preload the TX CSR register with the sync byte? | 04:37 |
xobs | Okay. I'd still like to get an MVP working to prove it's doable. So I'll concentrate on it now. | 04:37 |
mithro | xobs: I'm very unclear how tinyfpga was getting the frequency he was... | 04:38 |
mithro | xobs: It would also be interesting to see if my other reworked CPU interface is able to meet timing | 04:39 |
mithro | ./test_usbcore.py should run the same set of tests against all 3 versions | 04:39 |
mithro | It's probably broken on everything other than TestUsbDeviceSimple at the moment | 04:40 |
mithro | xobs: If we convert things into 48MHz and 12MHz clock domains, then it is going to meet timing easily | 04:41 |
mithro | I think we may possibly need to use the SB_PLL40_2_PAD to pass the 48MHz clock through while also generating a 12MHz clock... | 04:43 |
xobs | mithro: I don't understand what that means yet, but I'll give it a shot. | 04:47 |
cr1901_modern | xobs: SB_PLL40_* is an ice40 primitive. Consult your neighborhood friendly "SiliconBlue ICE Technology Library" manual for more information | 04:48 |
cr1901_modern | (Read as: I don't know what it does either ;)...) | 04:49 |
cr1901_modern | if I had to guess (cc: daveshah) it's a hint that the FPGA PLL should be fed from an input dedicated to the PLL | 04:50 |
mithro | The SB_PLL40_2*** primitives seem to be able to generate two output clocks | 04:50 |
mithro | Oh, although we already have the CPU running at 12MHz, so I guess we could just use that.... | 04:51 |
cr1901_modern | mithro: Are you omitting clock domain crossing logic between 12 and 48MHz domain? | 04:52 |
xobs | Yeah, and the timing doesn't need to be so precise. | 04:52 |
mithro | I might take a look when I get home | 04:54 |
*** andi-m has quit IRC | 04:58 | |
*** andi-m has joined #timvideos | 04:58 | |
*** cr1901_modern1 has joined #timvideos | 05:28 | |
*** cr1901_modern has quit IRC | 05:28 | |
*** cr1901_modern1 has quit IRC | 05:29 | |
*** cr1901_modern has joined #timvideos | 05:29 | |
*** rohitksingh has joined #timvideos | 06:03 | |
*** rohitksingh has quit IRC | 06:12 | |
*** rohitksingh has joined #timvideos | 06:33 | |
*** rohitksingh has quit IRC | 08:05 | |
xobs | Okay, that took a while. But time to look into this USB approach again. | 08:21 |
*** rohitksingh has joined #timvideos | 08:32 | |
*** rohitksingh has quit IRC | 09:04 | |
daveshah | mithro: the PAD PLL uses dedicated routing from an input pin to the PLL, the CORE variant through fabric | 09:16 |
daveshah | You can't really choose one of the two on a one PLL device, you have to use the PAD PLL if your input is the dedicated input pin or CORE otherwise | 09:17 |
xobs | daveshah: that's why some pins are defined as clock-capable? | 09:17 |
daveshah | Those have dedicated routing to the global networks | 09:18 |
daveshah | One in particular has dedicated routing to the PLL | 09:18 |
xobs | Our approach is to use the external 48 MHz clock for the small USB domain, and use the oscillator for the low-speed 12 MHz domain. | 09:18 |
daveshah | Might as well just use two registers to divide down 48MHz | 09:18 |
xobs | Would that simplify routing at all? | 09:19 |
daveshah | No, but it would improve accuracy of the 12MHz domain | 09:19 |
daveshah | BTW the _2_ PLL variants pass through the input to the second output (with PAD PLLs the dedicated input is otherwise unusable) | 09:20 |
daveshah | The _2F_ variants have two outputs, although the possibilities for differing output frequencies are limited. | 09:20 |
daveshah | Those variants can also give you a variable phase shift between the two for certain IO applications | 09:21 |
*** rohitksingh has joined #timvideos | 09:26 | |
*** rohitksingh has quit IRC | 09:38 | |
*** rohitksingh has joined #timvideos | 10:01 | |
*** rohitksingh has quit IRC | 10:32 | |
xobs | Wow, okay. Definitely enjoying vexriscv over lm32. It's slightly larger (by about 300 gates - 4552 vs 4228), but the code size is smaller so I can stuff more in ROM. | 10:51 |
xobs | Also: "Max frequency for clock 'clk48_$glb_clk': 56.74 MHz (PASS at 48.00 MHz)" | 10:51 |
xobs | ...though presumably the larger register file uses more BRAM, since I have to reduce the ROM to 0x2400 bytes (down from 0x2c00 for lm32) | 12:31 |
xobs | cr1901_modern: The target I'm using is at https://github.com/xobs/litex-buildenv/blob/fomu-evt-usb/targets/fomu_evt/usb.py but the magic field is mostly "self.submodules.random_rom = RandomFirmwareROM(bios_size)" | 14:14 |
tpb | Title: litex-buildenv/usb.py at fomu-evt-usb · xobs/litex-buildenv · GitHub (at github.com) | 14:14 |
xobs | The trick is you need to figure out what file gets generated. Usually it's "mem_1.init". Then you run it through https://github.com/xobs/ice40-repack with "ice40-repack top.bin mem_1.init $BIOS top-patched.bin" to replace the random memory with the contents of $BIOS, and save it to top-patched.bin. | 14:16 |
tpb | Title: GitHub - xobs/ice40-repack: Repack an ICE40 bitstream image with new memory contents (at github.com) | 14:16 |
xobs | There's some endianness weirdness I need to fix (lm32 vs riscv), and I'd like to simplify ice40-repack (or even replace it with a python script), but it works great, and is the process I'm using right now. | 14:17 |
*** heyy has joined #timvideos | 16:36 | |
*** heyy has quit IRC | 16:38 | |
*** rohitksingh has joined #timvideos | 16:51 | |
*** rohitksingh has quit IRC | 19:07 | |
mithro | xobs: Where did you get to? | 21:14 |
CarlFK | mithro: when do you leave for CCC or whatever your first stop is? | 21:28 |
mithro | 25th | 21:29 |
CarlFK | mithro: will you be back before LCA? | 21:32 |
mithro | CarlFK: for a couple of days, yes | 21:33 |
CarlFK | I was wondering how crazy you were ;) | 21:33 |
CarlFK | tumbleweed still has the more crazy schedule | 21:34 |
cr1901_modern | xobs: What's the possibility you could submit your work upstream to icestorm (if only to explain to clifford about the shortcomings of icebram alone)? | 21:42 |
*** techman83 has quit IRC | 22:07 | |
*** techman83 has joined #timvideos | 22:08 | |
*** ChanServ sets mode: +v techman83 | 22:08 |
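
Editorial sketch, not code from the valentyusb repo: xobs's 02:53 remark about comparing only the first 4 bits of the PID can be illustrated in plain Python. The PID codes are the standard USB ones; a PID byte carries the 4-bit code in its low nibble (sent first on the wire) and the bitwise complement in its high nibble. The helper names `make_pid_byte`, `decode_pid`, and `pid_check_ok` are made up for this example, and a real implementation would be gateware rather than software.

```python
# Standard USB PID codes (the 4-bit value carried in the low nibble).
PIDS = {
    "OUT": 0b0001, "IN": 0b1001, "SOF": 0b0101, "SETUP": 0b1101,
    "DATA0": 0b0011, "DATA1": 0b1011,
    "ACK": 0b0010, "NAK": 0b1010, "STALL": 0b1110,
}
PID_NAMES = {code: name for name, code in PIDS.items()}


def make_pid_byte(pid):
    """Build the full PID byte: code in bits 0-3, its complement in bits 4-7."""
    return (pid & 0xF) | (((pid ^ 0xF) & 0xF) << 4)


def decode_pid(byte):
    """Identify the PID from the low nibble only (a 4-input comparison)."""
    return PID_NAMES.get(byte & 0xF)


def pid_check_ok(byte):
    """Optional integrity check: high nibble must be the complement of the low."""
    return (byte & 0xF) == ((byte >> 4) ^ 0xF)


if __name__ == "__main__":
    for name, code in PIDS.items():
        byte = make_pid_byte(code)
        print(f"{name:5s} 0x{byte:02X} -> {decode_pid(byte)}")
```

The trade-off is that a decode based only on the low nibble will also accept a byte whose complement nibble is corrupted, so the complement check (or the packet CRC, for data packets) has to happen somewhere else if it matters.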