The best way to understand the requirements is to examine typical DSP algorithms and identify how their compositional requirements have influenced the architectures of DSP processor. Let us consider one of the most common processing tasks the finite impulse response filter.
Description of DSP Processor
For each tap of the filter a data sample is multiplied by a filter coefficient with result added to a running sum for all of the taps .Hence the main component of the FIR filter is dot product: multiply and add .These options are not unique to the FIR filter algorithm; in fact multiplication is one of the most common operation performed in signal processing -convolution, IIR filtering and Fourier transform also involve heavy use of multiply -accumulate operation.
Originally, microprocessors implemented multiplication by a series of shift and add operation, each of which consumes one or more clock cycle .First a DSP processor requires a hardware which can multiply in one single cycle. Most of the DSP algorithm require a multiply and accumulate unit (MAC).In comparison to other type of computing tasks, DSP application typically have very high computational requirements since they often must execute DSP algorithms in real time on lengthy segments ,therefore parallel operation of several independent execution units is a must -for example in addition to MAC unit an ALU and shifter is also required .
Executing a MAC in every clock cycle requires more than just single cycle MAC unit. It also requires the ability to fetch the MAC instruction, a data sample, and a filter coefficient from a memory in a single cycle.