Saturday, 2018-06-30

*** tpb has joined #yosys00:00
*** seldridge has joined #yosys00:03
*** digshadow has quit IRC00:14
*** promach_ has joined #yosys00:25
*** digshadow has joined #yosys00:45
*** dxld has quit IRC00:55
*** dxld has joined #yosys00:57
*** m_w has quit IRC01:43
promach_$global_clock is the same as smt_clk , right ?02:05
promach_smt_clock02:05
*** promach_ has quit IRC02:20
*** promach_ has joined #yosys02:21
ZipCPUemeb_mac: It's pipelined.  That part isn't configurable.  However, you can configure the size of the FFT, the number of bits in the input, the number of bits in the output, the number of multiplies used, whether or not the FFT is to accept two samples per clock, 1 sample per clock, 1 sample every two clocks, or 1 sample every three clocks.02:59
ZipCPUBeyond that, the FFT is limited by your hardware ...03:00
ZipCPUand by the fact that the updates I'm working through aren't (yet) working.  So ... without the new updates that I'm working on, the FFT only does 2 samples per clock plus the other configurables.03:01
emeb_macZipCPU: sounds pretty useful.03:11
ZipCPUThe 1 clock per sample and the 2 clocks per sample just passed my test at 2048 points!  Yaaay ...  (3 clocks per sample still fails)03:11
ZipCPUOh, and thanks!03:12
emeb_macthe radio I'm working on now needs two 1024-pt 16-bit in/out transforms for the RX and a 2048 16 in/out on the TX.03:12
emeb_macwe're using the Xilinx IP core for these and we've been pretty lucky that they are working well03:13
emeb_mac(other IP cores from Xilinx have turned out to be disasters)03:13
ZipCPU;)  Yeah, the Xilinx cores seem to be reliable enough.03:15
ZipCPUOnce mine start working, they'll still need some tuning and optimization to match what Xilinx has done ... if I can do it at all.03:16
emeb_macZipCPU: that's strong competition - they've got good performance and well optimized resource usage.03:17
ZipCPUExactly.  Like I said, I don't know if I'll make a good showing in the end there, but at least what I have will work.03:17
emeb_macIIRC the 1kpt transforms we use require only 18 of the MAC cores03:18
ZipCPULet's see ... in the one I'm doing, you can tell it how many DSP cores you have.  If you go full bore, 1 sample per clock, you'll need (10-2)*3 = 24 multiplies.03:19
ZipCPUYou could do it for less, at the cost of more LUTs.03:19
ZipCPUOn the other hand, if you want 2 clocks per sample, it would take (10-2)*2 = 16 multiplies, or if you want 3 clocks per sample it will take 8 multiplies.03:20
emeb_macThat's not bad03:20
ZipCPUOn the other hand, if you are doing two samples per clock, then you'll want (10-2)*6 = 48 multiplies.  It's all a tradeoff.03:21
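[Editor's sketch] The multiplier counts quoted above all follow the pattern (log2(N) - 2) times a per-stage factor. The short Python sketch below tabulates that arithmetic. The function name, the mode labels, and the generalization from the 1024-point figures to other power-of-two sizes are assumptions made for illustration; they are not taken from the FFT generator itself.

```python
# Hardware multiplies per multiplying stage for each throughput mode,
# read off the figures quoted in the chat for a 1024-point FFT.
MULTS_PER_STAGE = {
    "2 samples/clock": 6,    # (10-2)*6 = 48
    "1 sample/clock": 3,     # (10-2)*3 = 24
    "1 sample/2 clocks": 2,  # (10-2)*2 = 16
    "1 sample/3 clocks": 1,  # (10-2)*1 = 8
}


def hard_multiplies(fft_size, mode):
    """Estimate the hardware multiplies for a power-of-two FFT size.

    Assumption: the (stages - 2) pattern quoted for 1024 points also
    holds for other sizes; the chat only gives the 1024-point numbers.
    """
    if fft_size < 1 or fft_size & (fft_size - 1):
        raise ValueError("fft_size must be a power of two")
    stages = fft_size.bit_length() - 1  # log2(fft_size)
    return max(stages - 2, 0) * MULTS_PER_STAGE[mode]


if __name__ == "__main__":
    for mode in MULTS_PER_STAGE:
        print(mode, hard_multiplies(1024, mode))
```

Fewer clocks per sample buys throughput at the cost of more multipliers, which is the tradeoff described above.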
ZipCPUThe soft multiply option isn't all that efficient though.  I've got a slower option that's more efficient still, and I've thought of integrating that one in later.03:22
ZipCPUThe current soft multiply used is fully pipelined, so ... it requires a lot of flip flops and luts at every stage of the multiply.03:22
emeb_macNice to have the option for soft multiplies03:24
ZipCPUYep!03:25
promach_ZipCPU: which soft multiply are you referring to ?03:25
ZipCPUWhat do you mean?03:25
promach_you have your own multiply algo ?03:25
ZipCPUYes.03:25
emeb_macI try to avoid FPGAs w/o some hard multiplier resources for DSP stuff, but sometimes you gotta go with what's available03:26
promach_wallace, I suppose ?03:26
ZipCPUIt's a basic shift/add multiply, nothing fancy in this case.03:26
promach_ok03:26
ZipCPUI've built a wallace before, but ... the FFT doesn't use it.03:26
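[Editor's sketch] For readers unfamiliar with the term, a basic shift/add multiply can be sketched in Python as below. This is a generic software illustration of the technique, not the Verilog used in the FFT; the function name and the unsigned, fixed-width framing are assumptions.

```python
def shift_add_multiply(a, b, width=16):
    """Multiply two unsigned `width`-bit integers by shift-and-add.

    For each set bit of `b`, a copy of `a` shifted left by that bit
    position is added to the accumulator.
    """
    limit = 1 << width
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must fit in `width` unsigned bits")
    acc = 0
    for bit in range(width):
        if (b >> bit) & 1:
            acc += a << bit  # add the shifted partial product
    return acc


if __name__ == "__main__":
    print(shift_add_multiply(1234, 567))
```

In a fully pipelined hardware version, each loop iteration would become its own stage with its own registers and adder, which is consistent with the remark above that it needs a lot of flip-flops and LUTs at every stage.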
ZipCPUOk, 1, 2, and 3 clocks per sample now works using hardware multiplies, time to double check the soft multiplies03:31
emeb_macso does 1clk/sample allow continuous feed w/o any gaps?03:34
ZipCPUYes.03:34
ZipCPUYou can also feed it with unpredictable gaps too.03:34
emeb_macroughly what latency do you see from input to output?03:35
ZipCPUDepends on the size of the FFT.  Curious about a 1k FFT?  I can go measure that.03:35
emeb_macyeah!03:35
ZipCPULooks like about 4176 clocks from the start of the first frame going in to the start of the first output frame.03:38
*** AlexDaniel has quit IRC03:38
*** seldridge has quit IRC03:39
*** AlexDaniel has joined #yosys03:39
ZipCPUThere's probably a couple clocks in there I could whittle out if latency was an issue, but that's what it is currently.03:39
emeb_macThat's not bad.03:39
ZipCPUAre you looking for low latency?03:39
emeb_macGenerally yes - these radio designs tend to have fairly long datapaths with lots of things going on in them.03:40
ZipCPUI'm not quite sure how I would, or if I would, redesign things for lower latency.03:43
emeb_macIIRC the cores we use have about 3k clocks latency. I don't think 4k would be a huge disadvantage tho03:44
ZipCPUHmm ... not sure where I'd find a full 1k latency from this design ....03:45
ZipCPUSure, there's a clock or two in each stage, but at ten stages that'd be at most 20 clocks.03:45
emeb_macWell, you're ahead of me. I've never thought too much about how to build an FFT.03:46
emeb_macabout 20 years ago a guy I shared an office with architected one as a single-chip ASIC so I've only had peripheral exposure to it from discussing w/ him.03:47
ZipCPU:)03:48
ZipCPUI suppose I might go faster if I did something other than a Radix two FFT ...03:48
* ZipCPU tugs at his beard03:48
emeb_macAha - that must be it. Radix-4 was part of the optimization he did on his.03:48
ZipCPUI might have to look into that in the future.03:50
ZipCPUFor now, I just want to get it running in the first place.03:51
ZipCPUI'm pretty close, but ... not all cases work (yet)03:51
*** ar3itrary has quit IRC04:08
*** ar3itrary has joined #yosys04:13
*** AlexDaniel has quit IRC04:21
*** AlexDaniel has joined #yosys04:22
*** ar3itrary has quit IRC04:29
*** ar3itrary has joined #yosys04:36
cr1901_modernFFT was one of those things where I had to derive "how it works" exactly once and now I don't remember how to do it :(. I know you can split into bins by time or frequency (either works), but Idk if any way is better04:52
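[Editor's sketch] The two ways of splitting mentioned here are usually called decimation in time (DIT) and decimation in frequency (DIF). The Python sketch below implements both as recursive radix-2 transforms and can be checked against a direct DFT. It is a textbook illustration, assuming power-of-two lengths, and says nothing about which form suits hardware better.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_dit(x):
    """Radix-2 decimation in time: split the input into even/odd samples."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_dit(x[0::2])
    odd = fft_dit(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle, then butterfly
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_dif(x):
    """Radix-2 decimation in frequency: split the output into even/odd bins."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Butterfly first, then twiddle the difference branch.
    top = [x[i] + x[i + half] for i in range(half)]
    bottom = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
              for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(top)     # even-numbered frequency bins
    out[1::2] = fft_dif(bottom)  # odd-numbered frequency bins
    return out


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 0, -1, -2, -3]
    print(max(abs(a - b) for a, b in zip(fft_dit(samples), fft_dif(samples))))
```

Both functions compute the same transform; they differ in whether the twiddle multiply comes before or after the butterfly's add and subtract.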
*** emeb_mac has quit IRC05:30
*** xerpi has joined #yosys06:25
*** marbler has quit IRC06:27
*** jfng has quit IRC06:27
*** samayra has quit IRC06:27
*** indefini has quit IRC06:27
*** nrossi has quit IRC06:27
*** lok[m] has quit IRC06:27
*** swick has quit IRC06:27
*** pointfree1 has quit IRC06:28
*** Guest18568 has quit IRC06:28
*** fevv8[m] has quit IRC06:28
*** weebull[m] has quit IRC06:28
*** cr1901_modern1 has joined #yosys06:31
*** cr1901_modern1 has quit IRC06:33
*** cr1901_modern has quit IRC06:33
*** cr1901_modern1 has joined #yosys06:34
*** cr1901_modern1 has quit IRC06:34
*** cr1901_modern has joined #yosys06:34
*** promach_ has quit IRC06:35
*** promach_ has joined #yosys06:36
*** cr1901_modern has quit IRC06:43
*** cr1901_modern has joined #yosys06:44
*** samayra has joined #yosys07:41
*** promach_ has quit IRC07:43
*** Guest16831 has joined #yosys08:24
*** lok[m] has joined #yosys08:24
*** jfng has joined #yosys08:24
*** fevv8[m] has joined #yosys08:24
*** pointfree1 has joined #yosys08:24
*** swick has joined #yosys08:24
*** marbler has joined #yosys08:24
*** indefini has joined #yosys08:24
*** nrossi has joined #yosys08:24
*** weebull[m] has joined #yosys08:24
*** indy has quit IRC08:34
*** promach_ has joined #yosys08:45
*** pie_ has quit IRC09:56
*** dys has joined #yosys10:37
*** indy has joined #yosys10:45
*** m_t has joined #yosys12:50
*** emeb_mac has joined #yosys15:00
emeb_macZipCPU: You've spoken before about the difficulty of applying formal to multipliers. Would it be a safe assumption that formal is generally not practical for DSP datapaths which rely heavily on math operations like multiplication / division / transformation?15:02
ZipCPUYes and no .... there are some ways around the problems.15:02
emeb_macI get the impression that the best way to apply formal in these types of designs is to partition complex control logic out and apply formal at the unit level.15:02
ZipCPUI've had mixed success with data paths including multiplication or division.15:02
* ZipCPU rummages through his designs for an example ....15:03
ZipCPUHere's an example of an FIR filter (multiplies and all) where I manage to formally verify that the impulse response is correct using an abstract multiply: https://github.com/ZipCPU/dspfilters/blob/master/rtl/fastfir.v15:04
tpbTitle: dspfilters/fastfir.v at master · ZipCPU/dspfilters · GitHub (at github.com)15:04
ZipCPUYou might find the abstract multiply a fascinating read in and of itself: https://github.com/ZipCPU/dspfilters/blob/master/bench/formal/abs_mpy.v15:05
tpbTitle: dspfilters/abs_mpy.v at master · ZipCPU/dspfilters · GitHub (at github.com)15:05
emeb_macInteresting.15:06
emeb_macWould you call that exercise difficult? I have very little basis for comparison, but it seems somewhat contorted compared to simply running a stimulus / response simulation. Does it provide you with significantly more confidence in the design than a simpler approach?15:09
ZipCPUNot sure.15:20
ZipCPULet's just say that, in this example, the jury is still out.15:20
ZipCPUConsider this, I'm working with a perfect example right now ... I have 6 types of code for a butterfly.  Three use DSP elements, three do not.15:21
ZipCPUThe three that use DSP elements work, the three that do not ... don't.15:21
ZipCPUI'm trying to find out why.15:21
ZipCPUIf I try to apply formal methods to those other three right now, the formal methods don't complete.  The multiply is just too difficult for them.15:21
ZipCPUEven when I bring it down to a three bit multiply they are struggling.15:22
ZipCPUFor example, one of those soft multiply-based butterflies has now run its formal proof for over 12 hours, and has only made it to state 14 of 30.15:23
ZipCPUOn the other hand, the two butterflies that didn't require hardware multiplies could be formally verified quite quickly.15:25
awygleiceradio continues to look awesome, btw15:33
ZipCPUhttp://www.iceradio.ca ?15:37
promach_Can https://github.com/ZipCPU/dspfilters/blob/master/bench/formal/abs_mpy.v actually multiply ? why "abstract" ?15:43
tpbTitle: dspfilters/abs_mpy.v at master · ZipCPU/dspfilters · GitHub (at github.com)15:43
ZipCPUIt's abstract because it's not really a multiply, but yet it still maintains many of the properties of a multiply.15:43
ZipCPUThe idea behind abstraction is that if (AB)->C and I can prove that A->C irrespective of B, then I've proved AB->C as well.15:46
ZipCPUIt's useful in those cases where B is really hard to express or work with.15:46
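[Editor's sketch] The abstraction argument can be illustrated with a brute-force Python toy. Here A is a small set of constraints that any real multiply satisfies, B is the exact arithmetic, and C is an impulse-response style property. The constraints below are hypothetical and chosen for illustration; they are not copied from abs_mpy.v, and a formal tool would prove this symbolically rather than by enumeration.

```python
WIDTH = 3  # tiny signed operands keep the enumeration small
OPERANDS = range(-(1 << (WIDTH - 1)), 1 << (WIDTH - 1))  # -4 .. 3
LIMIT = 1 << (2 * WIDTH - 2)                              # largest |a*b| is 16
PRODUCTS = range(-LIMIT, LIMIT + 1)


def abstract_allows(a, b, p):
    """Hypothetical abstract multiply: may the result of a*b be p?

    Only a few properties of multiplication are kept (these are A).
    """
    if a == 0 or b == 0:
        return p == 0                # anything times zero is zero
    if a == 1:
        return p == b                # one is the identity
    if b == 1:
        return p == a
    same_sign = (a > 0) == (b > 0)
    return p != 0 and (p > 0) == same_sign  # only the sign is constrained


def abstraction_covers_real_multiply():
    """The real product must always be one of the allowed results."""
    return all(abstract_allows(a, b, a * b) for a in OPERANDS for b in OPERANDS)


def impulse_property_holds():
    """C: a unit sample times coefficient h gives h, a zero sample gives 0.

    Checked for every result the abstraction allows, so it also holds
    for the real multiply, which is one of those results.
    """
    for h in OPERANDS:
        for p in PRODUCTS:
            if abstract_allows(1, h, p) and p != h:
                return False
            if abstract_allows(0, h, p) and p != 0:
                return False
    return True


if __name__ == "__main__":
    print(abstraction_covers_real_multiply(), impulse_property_holds())
```

The abstraction is deliberately weaker than a real multiply (it accepts 5 as a result of 2 times 3, for instance), yet the property still holds for everything it allows, so it holds for the exact multiply as well.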
emeb_macawygle: Thanks! I haven't done much with it for the last year due to $DAYJOB getting in the way, but I've got plans for more features.15:54
emeb_macZipCPU: Thanks for the insight.15:56
*** m_t has quit IRC15:59
*** dxld has quit IRC16:01
*** dxld has joined #yosys16:02
*** dxld has quit IRC16:12
*** dxld has joined #yosys16:14
*** luismarques has joined #yosys16:15
awygleThe biggest upgrade to the iceradio for my purposes would be a lower power adc. I can't afford 73mW :-(16:16
awygleBut the DSP is the hard part from my perspective, I can always replace the afe if I decide to do something with it16:17
awygleOh it's actually much more. Idk what adc I was looking at lol16:19
*** emeb_mac has quit IRC16:26
*** luismarques has quit IRC17:38
*** luismarques has joined #yosys17:39
*** promach_ has quit IRC17:40
*** luismarques has quit IRC17:46
*** luismarques has joined #yosys17:48
*** dxld has quit IRC17:49
*** dxld has joined #yosys17:49
*** xerpi has quit IRC18:02
*** luismarques has quit IRC18:04
cr1901_modernZipCPU: I think awygle means this: http://ebrombaugh.studionebula.com/radio/iceRadio/index.html18:09
tpbTitle: iceRadio (at ebrombaugh.studionebula.com)18:09
cr1901_modernit does look cool. Idk what I could do w/ it tho18:10
awyglecr1901_modern is correct18:10
ZipCPUThanks, that makes a lot more sense than the other.18:10
cr1901_modernI've had my ham radio license since the end of 2013; I've made like 4 or so contacts b/c I don't like voice all that much, and there's little to no digital activity18:12
*** luismarques has joined #yosys18:16
*** luismarques has quit IRC18:21
*** luismarques has joined #yosys18:51
*** pie_ has joined #yosys19:05
*** luismarques has quit IRC19:12
*** proteus-guy has quit IRC19:16
*** X-Scale has joined #yosys19:27
*** proteus-guy has joined #yosys19:28
*** luismarques has joined #yosys19:37
*** luismarques has quit IRC19:42
*** luismarques has joined #yosys19:47
*** digshadow has quit IRC19:57
*** luismarques has quit IRC20:04
*** m_w has joined #yosys20:06
*** luismarques has joined #yosys20:31
*** luismarques has quit IRC20:37
*** sklv has quit IRC20:40
*** sklv has joined #yosys20:41
*** luismarques has joined #yosys20:43
*** sklv has quit IRC20:46
*** luismarques has quit IRC20:48
*** luismarques has joined #yosys20:56
*** [X-Scale] has joined #yosys21:04
*** X-Scale has quit IRC21:06
*** [X-Scale] is now known as X-Scale21:06
*** luismarques has quit IRC21:13
*** emeb_mac has joined #yosys21:19
*** sklv has joined #yosys21:28
*** digshadow has joined #yosys21:29
*** luismarques has joined #yosys21:34
*** luismarques has quit IRC21:42
*** dys has quit IRC21:45
*** luismarques has joined #yosys21:51
*** luismarques has quit IRC21:58
*** luismarques has joined #yosys22:03
*** luismarques has quit IRC22:11
*** luismarques has joined #yosys22:12
*** luismarques has quit IRC22:25
*** luismarques has joined #yosys22:29
*** luismarques has quit IRC22:34
*** m_w has quit IRC22:35
*** luismarques has joined #yosys22:38
*** ar3itrary has quit IRC22:38
*** digshadow has quit IRC22:39
*** luismarques has quit IRC22:43
*** m_w has joined #yosys22:43
*** luismarques has joined #yosys22:49
*** X-Scale has quit IRC23:05
*** luismarques has quit IRC23:10
*** X-Scale has joined #yosys23:11
*** seldridge has joined #yosys23:12
*** luismarques has joined #yosys23:14
*** luismarques has quit IRC23:22
*** luismarques has joined #yosys23:23
*** luismarques has quit IRC23:40
*** luismarques has joined #yosys23:41
*** promach_ has joined #yosys23:46
*** luismarques has quit IRC23:51
*** luismarques has joined #yosys23:56
