DSP/ML computing libraries for IoT

Laurent Le Faucheur - Duration: 28:01

CMSIS-NN and CMSIS-DSP provide developers with collections of efficient neural network kernels and DSP functions aimed at maximizing performance and minimizing the memory footprint on Cortex-M processors, for applications that require machine learning and DSP capabilities. Join Laurent Le Faucheur, Principal IoT Software Engineer at Arm, as he shares the latest developments for these computing libraries and how they can be used efficiently with future processing technology, including the Arm Cortex-M55 processor.

Score: 0 | 2 years ago | no reply

Yes, the renormalization ends with an arithmetic right shift.
Cortex-M processors have a short pipeline with internal bypass, and the compiler takes care of instruction reordering when needed.

Score: 0 | 2 years ago | no reply

I also have a question about many sequential shifts and MAC. The slide on the processors calculates the most optimal times, yes? Back when I worked with processors that could reorder instructions, it seemed that we had to be careful using the raw instruction times, because the processor could skip a load (if I remember correctly) by chaining operations, and in the other direction cache misses were critical to minimize.
A potential reference, that I've only just skimmed through:
(Aside: It'd be nice to have slide numbers to reference questions to :)

Score: 0 | 2 years ago | no reply

On the CSD slide, the final line would read A x 76 = -(A<<2) + (A<<4) + (A<<6), correct? I assume the final line would then be A x 0.594 = (-(A<<2) + (A<<4) + (A<<6)) >> 7?
CSD looks fascinating.
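
For anyone who wants to check the arithmetic in the question above, here is a minimal C sketch of that shift-and-add form. It is not taken from the slides or from CMSIS-DSP; the function names `mul76_csd` and `scale_0p594_csd` are made up for illustration, and the input range check is an assumption added to keep the 32-bit intermediate from overflowing. The left shifts go through `uint32_t` because shifting a negative signed value left is undefined in C, and the final `>> 7` relies on the compiler implementing signed right shift as an arithmetic shift (GCC, Clang, and the Arm compilers do).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: A * 76 using the CSD digits 76 = 64 + 16 - 4. */
static int32_t mul76_csd(int32_t a)
{
    /* Assumed input range so that 76 * a fits in 32 bits. */
    assert(a > -(1 << 24) && a < (1 << 24));

    /* Shift as unsigned to avoid undefined behaviour for negative a;
       the wrap-around arithmetic still yields the two's complement result. */
    uint32_t u = (uint32_t)a;
    return (int32_t)((u << 6) + (u << 4) - (u << 2));
}

/* Hypothetical helper: A * 0.594, approximated as (A * 76) >> 7,
   since 76 / 128 = 0.59375. The right shift rounds toward minus infinity. */
static int32_t scale_0p594_csd(int32_t a)
{
    return mul76_csd(a) >> 7;
}
```

With these definitions, `scale_0p594_csd(1000)` evaluates 76000 >> 7, which truncates 593.75 down to 593, so the result is one below the rounded value of 1000 x 0.594.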