DSP/ML computing libraries for IoT

Laurent Le Faucheur - DSP Online Conference 2021 - Duration: 28:01

Abstract Questions & Comments (3)

CMSIS-NN and CMSIS-DSP provide developers with collections of efficient neural network and signal processing kernels, aimed at maximizing performance and minimizing memory footprint on Cortex-M processors for applications that require machine learning and DSP capabilities. Join Laurent Le Faucheur, Principal IoT Software Engineer at Arm, as he shares the latest developments in these computing libraries and how they can be used efficiently with future processing technology, including the Arm Cortex-M55 processor.

laurentlefaucheur (Speaker)

Score: 0 | 4 years ago | no reply

Yes, the renormalization ends with an arithmetic right shift.
Cortex-M processors have a short pipeline with internal bypass, and the compiler takes care of instruction reordering when needed.

Darkphibre

Score: 0 | 4 years ago | no reply

I also have a question about long sequences of shifts and MACs. The slide on the processors shows best-case cycle counts, yes? Back when I worked with processors that could reorder instructions, we had to be careful about relying on raw instruction timings: the processor could skip a load (if I remember correctly) by chaining operations, and, in the other direction, cache misses were critical to minimize.
A potential reference that I've only skimmed so far:
https://blog.stuffedcow.net/2013/05/measuring-rob-capacity/
(Aside: it'd be nice to have slide numbers to refer to in questions :)

Darkphibre

Score: 0 | 4 years ago | no reply

On the CSD slide, the final line would read A x 76 = -(A<<2) + (A<<4) + (A<<6), correct? I assume the scaled result would then be A x 0.594 = (-(A<<2) + (A<<4) + (A<<6)) >> 7?
CSD looks fascinating.
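
For readers following along, here is a minimal C sketch of the arithmetic in this question. It assumes 32-bit integers and intermediate results that fit in 32 bits. The helper names `shl`, `mul76_csd`, and `scale_0594` are hypothetical and come neither from the slides nor from CMSIS-DSP. The parentheses matter in C, where `+` binds tighter than `<<`.

```c
#include <stdint.h>

/* Left shift performed on the unsigned representation, because shifting a
   negative signed value left is undefined behaviour in C. */
static int32_t shl(int32_t a, unsigned n)
{
    return (int32_t)((uint32_t)a << n);
}

/* CSD form of the constant: 76 = 64 + 16 - 4,
   so A * 76 = -(A << 2) + (A << 4) + (A << 6). */
static int32_t mul76_csd(int32_t a)
{
    return -shl(a, 2) + shl(a, 4) + shl(a, 6);
}

/* 76 / 128 = 0.59375, which approximates 0.594.
   The renormalization is a right shift by 7. For negative values this
   relies on the compiler using an arithmetic shift, which is
   implementation-defined in C but is what common Arm toolchains do. */
static int32_t scale_0594(int32_t a)
{
    return mul76_csd(a) >> 7;
}
```

With A = 1000, `mul76_csd` gives 76000 and the shift gives 593 (76000 / 128 = 593.75, truncated by the shift), against an exact 1000 x 0.594 = 594.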
