Sunday, 2018-12-23

*** tpb has joined #timvideos00:00
*** andi-m has joined #timvideos00:03
*** TheAssassin has quit IRC00:12
mithroxobs: ping? I think I'm at the state of needing grainuum now00:15
*** TheAssassin has joined #timvideos00:17
*** andi-m has quit IRC00:39
*** andi-m has joined #timvideos00:44
*** andi-m has quit IRC01:14
*** andi-m has joined #timvideos01:17
xobsCarlFK: usually it's "patch -p1 < [file].patch"02:08
xobsOr, since that's git, "git am [file].patch"02:08
xobsmithro: fast. what do you have?02:08
mithroI have input + output FIFO and it seems to meet timing02:09
mithroAnd it passes my test suite02:10
xobsOkay.  What's the interface look like?  Is it in your repo?02:11
mithroYes, it's basically two csrs in both directions02:11
mithroxobs: Just pushing now02:17
xobsmithro: alright, I've got about 90 minutes before I need to go for about three hours.02:18
mithroxobs: pushed02:19
xobsstill lm32?02:23
mithroYes02:23
mithroNeed to figure out why we can't disable the multiplier / divider02:24
cr1901_modernKnowing my luck it'll still work w/ simulation...02:26
mithroI'm a bit worried about not being able to process the packets fast enough02:26
cr1901_modernmithro: Does the mult/div fail on any targets where the BIOS runs from BRAM?02:28
mithrocr1901_modern: Dunno02:29
mithroxobs: Any luck?02:40
xobsmithro: still trying to build it. what configuration are you using?02:40
mithroLX P=tomu_fpga_hacker T=usb.minimal F=tinyusb R=tinyusb-work02:40
xobsdid you check in usb.py?02:41
mithroxobs: Nope! sorry02:41
mithroxobs: Pushed02:41
xobsThanks, building now.02:49
mithrohttps://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3958-L398702:50
tpbTitle: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com)02:50
mithroThat is how to queue stuff for sending02:50
mithroxobs: How to read -> https://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3936-L395602:51
tpbTitle: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com)02:51
xobsThat's good.02:51
xobsI was thinking we may need to wrap it around a simple state machine to handle things like sending ACK and NAK.  Is that there yet?02:52
mithroxobs: Well, I have Python code which does that in the tests02:52
mithrohttps://github.com/mithro/valentyusb/blob/master/valentyusb/test_usbcore.py#L3994-L411102:53
tpbTitle: valentyusb/test_usbcore.py at master · mithro/valentyusb · GitHub (at github.com)02:53
xobsOkay, I see.02:53
xobsI actually realized last night that for comparing the PID to generate a response, as a tiny optimization we can look at only the first 4 bits.  That way each PID uses only a single LUT.02:53
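A minimal sketch of the PID shortcut xobs describes. A USB PID byte carries a 4-bit type in its low nibble (the first four bits on the wire, since USB sends LSB first) and the bitwise complement in its high nibble, so the type alone identifies the packet. The PID values are the standard USB ones; the function names are made up for illustration and are not taken from valentyusb.

```python
# Standard USB PID type values (the 4-bit field in the low nibble).
PID_TYPES = {
    0b0001: "OUT",
    0b1001: "IN",
    0b0101: "SOF",
    0b1101: "SETUP",
    0b0011: "DATA0",
    0b1011: "DATA1",
    0b0010: "ACK",
    0b1010: "NAK",
    0b1110: "STALL",
}


def make_pid_byte(pid4):
    """Build the full PID byte: low nibble is the type, high nibble its complement."""
    return (pid4 & 0xF) | ((~pid4 & 0xF) << 4)


def classify_pid(pid_byte):
    """Classify a PID by looking only at the low nibble (four input bits)."""
    return PID_TYPES.get(pid_byte & 0xF, "UNKNOWN")


def pid_check_ok(pid_byte):
    """Full integrity check (type vs. complement) that the shortcut skips."""
    return (pid_byte & 0xF) == (~(pid_byte >> 4) & 0xF)


if __name__ == "__main__":
    ack = make_pid_byte(0b0010)
    print(hex(ack), classify_pid(ack), pid_check_ok(ack))
```

Comparing four bits fits one 4-input LUT per PID, which is what the iCE40 provides. The trade-off is that a corrupted high nibble goes unnoticed unless the complement check is done somewhere else.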
mithroxobs: I can see a lot of room for optimisation02:55
mithroxobs: We should be using the FFs in the IO tile02:57
xobsWe can also use LVDS pins, though I'm still not sure what that actually means.  The reference manual isn't very clear what benefits that has over normal IO.02:57
mithroThe two FFs in both IO tiles can be fed directly into the rx state machine02:59
xobs"Info: Max frequency for clock 'usb_48_clk_$glb_clk': 48.49 MHz (PASS at 48.00 MHz)" Wow, that is shaving it close.03:00
mithroxobs: you can increase the margin by reducing the FIFO depth03:01
mithroxobs: Need to figure out why the margin is so low03:02
mithroInfo: 8.3 ns logic, 12.3 ns routing03:03
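To put the margin in numbers, here is a small sketch that converts the reported figures into a clock period and slack. It only uses values quoted above (48.49 MHz achieved, 48.00 MHz target, 8.3 ns logic, 12.3 ns routing); the two reports may come from different builds, so the figures are not expected to agree exactly. The helper names are illustrative.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(fmax_mhz, target_mhz):
    """Positive slack means the design meets timing at the target frequency."""
    return period_ns(target_mhz) - period_ns(fmax_mhz)


def routing_fraction(logic_ns, routing_ns):
    """Share of the critical path spent in routing rather than logic."""
    return routing_ns / (logic_ns + routing_ns)


if __name__ == "__main__":
    print(f"target period: {period_ns(48.00):.3f} ns")
    print(f"slack at 48.49 MHz: {slack_ns(48.49, 48.00):.3f} ns")
    print(f"routing share: {routing_fraction(8.3, 12.3):.1%}")
```

With routing taking more than half of the path, placement and fan-out are likely candidates for the thin margin, though that is a guess rather than something the log confirms.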
xobsOh, you're still calculating CRC16 in hardware?03:03
mithroxobs: no03:03
xobsIn top.v I see things in clk_usb_48 like "usb_tx_crc[13] <= usb_tx_crc[12];"03:03
mithroxobs: Hrm? Are you looking at an old one?03:04
xobsMaybe.  Let me remove build and try again.03:04
xobsOh, oops.  And I've just discovered that removing "build" also removes conda.03:04
mithrogrep crc build/tomu_fpga_hacker_usb_lm32.minimal/gateware/top.v03:04
mithroThat returns nothing for me...03:05
cr1901_modern> And I've just discovered that removing "build" also removes conda.03:06
cr1901_modernCongrats. That's a rite of passage :).03:06
mithroInternet is nice and fast in singapore right? :-P03:07
xobsmithro: yeah, it spends most of its time "solving environment"03:07
cr1901_modernfaster than AUS?03:10
* cr1901_modern has heard horror stories03:10
xobscr1901_modern: gigabit, but it depends on if a CDN is on-island or not.03:11
mithroxobs: I want to move everything apart from the shift register and bitstuffer out of the 48MHz domain03:19
xobsmithro: at the very least, you'll need a simple state machine that responds to packets with ACK, NAK, or (if the epnum matches) DATA.03:20
xobsBut that state machine has like 5 states.03:21
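A rough sketch of the decision such a state machine has to make, written as a plain function rather than gateware. Everything here is an assumption for illustration: the names, the single-endpoint model, and the `tx_ready` / `rx_has_room` flags are not from valentyusb. The rules themselves follow standard USB behaviour: answer IN with DATA when something is queued and NAK otherwise, always ACK SETUP data, and ACK or NAK OUT data depending on buffer space.

```python
from enum import Enum


class Response(Enum):
    NONE = "none"   # stay silent, the packet is not for this endpoint
    ACK = "ACK"
    NAK = "NAK"
    DATA = "DATA"


def respond(token, epnum, my_epnum, tx_ready, rx_has_room):
    """Pick the device's reply to a transaction.

    token:       "IN", "OUT" or "SETUP" (for OUT/SETUP the reply follows the data packet)
    epnum:       endpoint number from the token
    my_epnum:    endpoint this sketch serves (hypothetical single-endpoint device)
    tx_ready:    True if data is queued for the host
    rx_has_room: True if the receive buffer can take a packet
    """
    if epnum != my_epnum:
        return Response.NONE
    if token == "IN":
        return Response.DATA if tx_ready else Response.NAK
    if token == "SETUP":
        # A device must always accept SETUP data.
        return Response.ACK
    if token == "OUT":
        return Response.ACK if rx_has_room else Response.NAK
    return Response.NONE


if __name__ == "__main__":
    print(respond("IN", 0, 0, tx_ready=False, rx_has_room=True))
```

A real implementation would also track DATA0/DATA1 toggling and CRC results; those are left out to keep the sketch close to the handful of states mentioned above.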
xobsI wonder why Conda doesn't install newlib by default.03:47
mithroxobs: Our compiler doesn't use newlib by default03:48
xobsmithro: but the bios does, so "make gateware" fails.03:48
mithroxobs: It only does in my branch at the moment03:49
mithrobecause of lto03:49
xobsOh!  I see.03:49
mithrohttps://github.com/enjoy-digital/litex/issues/13903:49
tpbTitle: Enable link time optimisation in builds · Issue #139 · enjoy-digital/litex · GitHub (at github.com)03:49
xobsOkay, I got things building again.  And I don't see crc in the 48 MHz domain anymore.03:50
mithroxobs: I think we should move towards the stuff at the bottom of this diagram -> https://docs.google.com/drawings/d/1olpdWXglPGzJdW_R1DwWviBumlCetyuSZjl-luO_Xe8/edit03:57
tpbTitle: valentyusb - Parts - Google Drawings (at docs.google.com)03:57
xobsmithro: that's actually what I was about to try to implement.  The simple FSM can work in the slow domain, once the fast domain collects bytes.03:58
mithroI actually think it could be even easier03:59
xobsThat way we can shrink the FIFO down from 16 bytes to 2 or 3.03:59
mithroHrm...04:03
mithroxobs: We get 1 bit every 4 clock cycles of the 48MHz -- so the actual bit rate is 12 MHz?04:04
xobsThat's correct.  If you group those into bytes, it gets even better.04:05
mithroSo, bytes occur at 1.5MHz04:05
cr1901_modernand transition rate (0->1 or 1->0) is at 6 MHz04:06
mithrocr1901_modern: Hrm?04:06
cr1901_modernOh I'm just adding some commentary :P04:07
cr1901_modernUSB 1.x is 12 Mbps, but the transition rate is half that04:07
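The rates discussed here follow from simple division; a short sketch for checking them. The 48 MHz clock and 4 clocks per bit come from the conversation. The 12 MHz CPU clock used for the cycles-per-byte figure is an assumption at this point in the log, and bit stuffing is ignored.

```python
USB_CLK_HZ = 48_000_000   # sampling clock from the discussion
CLOCKS_PER_BIT = 4        # one USB bit every four 48 MHz cycles
CPU_CLK_HZ = 12_000_000   # assumed CPU clock, for the budget estimate only

bit_rate = USB_CLK_HZ // CLOCKS_PER_BIT   # full-speed USB bit rate
byte_rate = bit_rate // 8                 # bytes per second, ignoring bit stuffing
# The line can change state at most once per bit, so a pattern that toggles
# on every bit looks like a square wave at half the bit rate.
max_toggle_freq = bit_rate // 2
cpu_cycles_per_byte = CPU_CLK_HZ // byte_rate

if __name__ == "__main__":
    print(bit_rate, byte_rate, max_toggle_freq, cpu_cycles_per_byte)
```

Under these assumptions a 12 MHz CPU has only 8 cycles per incoming byte, which is why the FIFO and the packet-processing speed matter.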
xobsI'll be back in about three hours.04:16
xobsStrange how it doesn't meet timing if I set the FIFO depth to 8 or 2. Only 16 works. I'll need to investigate.04:18
cr1901_modernxobs: When you get back... do you have a demo of your BRAM reload?04:18
xobscr1901_modern: sure, I'd love to help someone else get it working, too. What board are you targeting?04:19
cr1901_modernlitex-buildenv in general... so no board in particular. I can at least use your idea for inspiration04:20
cr1901_modernxobs: Oh yes, do you still have your working minimal vexriscv? https://logs.timvideos.us/%23timvideos/%23timvideos.2018-12-19.log.html#t2018-12-19T16:28:1104:21
cr1901_modernGaaaah, I'm bad at multitasking lol04:21
xobsNo worries.04:23
cr1901_modernyour minimal vexriscv should _also_ give me some hints as to why it behaves so poorly when doing XIP04:24
xobsIt'll be for the Fomu evt board, which is what I actually have. I'll try it again once I get home.04:26
cr1901_modernalright cool. Just link your code in here when you can. I'll prob be asleep (holy crap, regular sleep schedule)04:27
mithroxobs: It probably stops using the bram?04:30
mithroxobs: I'm going to have to switch to doing SymbiFlow stuff04:34
mithroxobs: It would be pretty easy to just map the RX / TX to a pair of CSR registers04:35
xobsmithro: alright, does that mean I'm going to take over the project now?04:35
mithroxobs: Can work more on it at CCC after my talk and next year04:36
mithroxobs: You could even preload the TX CSR register with the sync byte?04:37
xobsOkay. I'd still like to get an MVP working to prove it's doable. So I'll concentrate on it now.04:37
mithroxobs: I'm very unclear how tinyfpga was getting the frequency he was...04:38
mithroxobs: It would also be interesting to see if my other reworked CPU interface is able to meet timing04:39
mithro./test_usbcore.py should run the same set of tests against all 3 versions04:39
mithroIt's probably broken on everything other than TestUsbDeviceSimple at the moment04:40
mithroxobs: If we convert things into 48MHz and 12MHz clock domains, then it is going to meet timing easily04:41
mithroI think we may possibly need to use the SB_PLL40_2_PAD to pass the 48MHz clock through while also generating a 12MHz clock...04:43
xobsmithro: I don't understand what that means yet, but I'll give it a shot.04:47
cr1901_modernxobs: SB_PLL40_* is an ice40 primitive. Consult your neighborhood friendly "SiliconBlue ICE Technology Library" manual for more information04:48
cr1901_modern(Read as: I don't know what it does either ;)...)04:49
cr1901_modernif I had to guess (cc: daveshah) it's a hint that the FPGA PLL should be fed from an input dedicated to the PLL04:50
mithroThe SB_PLL40_2*** primitives seem to be able to generate two output clocks04:50
mithroOh, although we already have the CPU running at 12MHz, so I guess we could just use that....04:51
cr1901_modernmithro: Are you omitting clock domain crossing logic between 12 and 48MHz domain?04:52
xobsYeah, and the timing doesn't need to be so precise.04:52
mithroI might take a look when I get home04:54
*** andi-m has quit IRC04:58
*** andi-m has joined #timvideos04:58
*** cr1901_modern1 has joined #timvideos05:28
*** cr1901_modern has quit IRC05:28
*** cr1901_modern1 has quit IRC05:29
*** cr1901_modern has joined #timvideos05:29
*** rohitksingh has joined #timvideos06:03
*** rohitksingh has quit IRC06:12
*** rohitksingh has joined #timvideos06:33
*** rohitksingh has quit IRC08:05
xobsOkay, that took a while.  But time to look into this USB approach again.08:21
*** rohitksingh has joined #timvideos08:32
*** rohitksingh has quit IRC09:04
daveshahmithro: the PAD PLL uses dedicated routing from an input pin to the PLL, the CORE variant through fabric09:16
daveshahYou can't really choose one of the two on a one PLL device, you have to use the PAD PLL if your input is the dedicated input pin or CORE otherwise09:17
xobsdaveshah: that's why some pins are defined as clock-capable?09:17
daveshahThose have dedicated routing to the global networks09:18
daveshahOne in particular has dedicated routing to the PLL09:18
xobsOur approach is to use the external 48 MHz clock for the small USB domain, and use the oscillator for the low-speed 12 MHz domain.09:18
daveshahMight as well just use two registers to divide down 48MHz09:18
xobsWould that simplify routing at all?09:19
daveshahNo, but it would improve accuracy of the 12MHz domain09:19
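A behavioural sketch of the "two registers" divider daveshah suggests, simulated in plain Python rather than written as gateware. The two flip-flops form a 2-bit counter clocked at 48 MHz, and the upper bit toggles at one quarter of that rate. The function names are illustrative only.

```python
def divide_by_four(num_cycles):
    """Simulate a 2-bit synchronous counter; return the upper bit per input cycle."""
    q0 = q1 = 0
    wave = []
    for _ in range(num_cycles):
        wave.append(q1)
        # Both registers update on the same clock edge:
        # q0 toggles every cycle, q1 toggles whenever q0 was 1.
        q0, q1 = q0 ^ 1, q1 ^ q0
    return wave


def count_rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)


if __name__ == "__main__":
    wave = divide_by_four(16)
    print(wave)
    print(count_rising_edges(wave))
```

Because the 12 MHz signal is derived from the 48 MHz clock, the two domains keep a fixed phase relationship, which is the accuracy benefit mentioned above compared with a separate oscillator.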
daveshahBTW the _2_ PLL variants pass through the input to the second output (with PAD PLLs the dedicated input is otherwise unusable)09:20
daveshahThe _2F_ variants have two outputs, although the possibilities for differing output frequencies are limited.09:20
daveshahThose variants can also give you a variable phase shift between the two for certain IO applications09:21
*** rohitksingh has joined #timvideos09:26
*** rohitksingh has quit IRC09:38
*** rohitksingh has joined #timvideos10:01
*** rohitksingh has quit IRC10:32
xobsWow, okay.  Definitely enjoying vexriscv over lm32.  It's slightly larger (by about 300 gates - 4552 vs 4228), but the code size is smaller so I can stuff more in ROM.10:51
xobsAlso: "Max frequency for clock 'clk48_$glb_clk': 56.74 MHz (PASS at 48.00 MHz)"10:51
xobs...though presumably the larger register file uses more BRAM, since I have to reduce the ROM to 0x2400 bytes (down from 0x2c00 for lm32)12:31
xobscr1901_modern: The target I'm using is at https://github.com/xobs/litex-buildenv/blob/fomu-evt-usb/targets/fomu_evt/usb.py but the magic field is mostly "self.submodules.random_rom = RandomFirmwareROM(bios_size)"14:14
tpbTitle: litex-buildenv/usb.py at fomu-evt-usb · xobs/litex-buildenv · GitHub (at github.com)14:14
xobsThe trick is you need to figure out what file gets generated.  Usually it's "mem_1.init".  Then you run it through https://github.com/xobs/ice40-repack with "ice40-repack top.bin mem_1.init $BIOS top-patched.bin" to replace the random memory with the contents of $BIOS, and save it to top-patched.bin.14:16
tpbTitle: GitHub - xobs/ice40-repack: Repack an ICE40 bitstream image with new memory contents (at github.com)14:16
xobsThere's some endianness weirdness I need to fix (lm32 vs riscv), and I'd like to simplify ice40-repack (or even replace it with a python script), but it works great, and is the process I'm using right now.14:17
*** heyy has joined #timvideos16:36
*** heyy has quit IRC16:38
*** rohitksingh has joined #timvideos16:51
*** rohitksingh has quit IRC19:07
mithroxobs: Where did you get to?21:14
CarlFKmithro: when do you leave for CCC or whatever your first stop is?21:28
mithro25th21:29
CarlFKmithro: will you be back before LCA?21:32
mithroCarlFK: for a couple of days, yes21:33
CarlFKI was wondering how crazy you were ;)21:33
CarlFKtumbleweed still has the more crazy schedule21:34
cr1901_modernxobs: What's the possibility you could submit your work upstream to icestorm (if only to explain to clifford about the shortcomings of icebram alone)?21:42
*** techman83 has quit IRC22:07
*** techman83 has joined #timvideos22:08
*** ChanServ sets mode: +v techman8322:08
