Published December 23, 2020 © CC BY-NC

ApproxFFT: Fastest FFT Function for Arduino

This function performs FFT with very good speed while maintaining accuracy.

IntermediateFull instructions provided17,888

ApproxFFT: Fastest FFT Function for Arduino

Things used in this project

Hardware components

Arduino Mega 2560

Software apps and online services

Arduino IDE

Story

FFT or Fast Fourier transform is one of the most important tools in signal processing. This can be used to detect frequencies with good speed. However, it is a computationally-intensive process and takes a good amount of processing power as well as memory especially on Arduino. This makes it difficult to use for real-time applications where multiple FFT results required every second.

In this tutorial, the fastest program to perform FFT on Arduino is presented. I am naming it ApproxFFT as there are so many approximations taken at multiple stages. However, with very little compromise inaccuracy, I was able to achieve around 3 times speed.

If you want to only know how to use it directly go to 5.

The complete project was also explained in a half-hour long explanation video. you may choose to go through that or read this tutorial.

Video Explaining The project

I have prepared two code previously for the same application that can be accessed here:EasyFFT: this code gives accurate output for FFT, however, it is relatively slow and consumes high memory.It is recommended to go through this tutorial for background on FFT prior to this tutorial.

QuickFFT: This code gives a very fast output. However, amplitudes are far off and not usable for most of the applications. It also gives multiple peaks that might mislead the results.

1: Need of the ApproxFFT Function

If you refer to the attached image, it is very clear that QuickFFT has some significant advantages over conventional approaches (speed calculated with inbuilt sine/cosine functions for EasyFFT). It is almost 4 times faster but lacks accuracy. As can be seen in the second image, the amplitude is way off. The main reason behind this the fact that we have used the square wave (Approximate sine wave!!!) that consists of multiple frequencies (harmonics). there are multiple waves present in the result that makes it difficult for various applications.

1 / 2 • Comparision

Here we have assumed the sine/cosine wave as a square wave. If we take some better approximation for these waves we can reach close to the exact output. In this code better approximations were considered for these waves. we also have some provisions to control the accuracy of these waves. By tweaking this setting we can also play with various accuracy/speed relations.

2: Fast Sine and Cosine Function

If you refer to the previous EasyFFT function, we had two options available. One of those was accurate one to make use of inbuilt sine/cosine function which are accurate but slow. If we go with another method where we have used a lookup table, speed was improved slightly with a small loss inaccuracy.

1 / 2

Here completely different approach was used to calculate this value, which is around 10-12 times faster than conventional ones. These function intensively make use of only bitshift operation for multiplication and (approximate) division. These operations are significantly fast than normal division or multiplication. however, these may cause a precision loss that we need to somehow take care of.

below two lines are the ones that made the speed difference between EasyFFT and QuickFFT. Typically first we have to calculate the sine/cosine for a given value. once the value is available that needs to be multiplied to some another float value. Both of these processes are slow and takes too much time.

tr=c*out_r[i10+n1]-s*out_im[i10+n1];<br>ti=s*out_r[i10+n1]+c*out_im[i10+n1];  // c is cosine and s is sine of some value

Here we have used something similar to the binary shoring algorithm. It will directly give Approximate output for A*sin(θ) and we do not need to do multiplication. In the first plot, ArcsinX vs X plot is shown. a similar plot is a store as the table in Arduino. where angles are mapped from 0-1024 which are equivalent to 0-360 degrees. Arcsine values are also considered as multiple of 128.

this is how the flow of calculation works: here we are ignoring negative values and will only consider for 0-90. (i.e. 500*sin 50 (in degree))

1. For any value (i.e. 500) sine multiple can be 0 (for 0 degrees) to that value (for 90 degrees) (i.e.0-500). we will check angle value for midpoint. we will half the value of the input (500) and check the angle for 0.5. based on that we can check out output will be 0-mid point or midpoint-max (in our case angle for 0.5 will be 30 degrees which is lower than 50, so we can conclude that the output will be somewhere in between 250-500). At this point of time, we have halved the size of the possible range. (previous 0-500 to 250-500 )

2. newly defined range we will again do a similar step repeated. In our example, the midpoint will be 375, where we will check the angle for.75 that will be 48 degrees. so our answer will lie in between 375 to 500.

similar steps need to be repeated multiple times for accurate output. As can be shown in the second image (graph). The more level we go to, the output will be more close to the exact value.

Here we can perform all operations by only a bit shift and we also eliminated the floating multiplication.

3: Fast Root of Squared Sum (FastRSS)

This function makes some approximation for the RSS value. its impact on overall speed is not very significant. However, it saves a few Milliseconds of time.

The attached plot shows the value of error if we assume root off squared sum output same as bigger value. The Error associated with this assumption decreases as the ratio increases. In this code, if the ratio exceeds it simply assumes a larger value as output that saves time. If the ratio is lower than that, based on the value some predefined scaling is set that is applied by doing some repetitions. All these values are stored as the RSSdata.

4: Overall Calculation Flow:

Flow for overall calculation will be as follow.

1. Input data is scaled to +512 to -512: this step is important to preserve the precision of data. as we are using bit shift operation, there will be precision loss in case of an odd number. The bigger the number lower the loss.

2. Bit reversed order is generated,

3. Input data is ordered as per generated bit reverse order.

4. Frequency transformed is performed. fast sine and cosine were used wherever required. after every loop, the value of all integer checks if the amplitude is higher than 15000, both real and imaginary arrays are divided by 2.

wherever the scaling is performed, scale value is recorded.

5. Root of the squared sum is calculated using the FastRSS function.

6. Max value is detected and returned as output.

5: Application

1. Lookup data needs to be declared as a global variable. we need to simply paste the below data at the top of the code.

//---------------------------------lookup data------------------------------------//<br>byte isin_data[128]=
{0,  1,   3,   4,   5,   6,   8,   9,   10,  11,  13,  14,  15,  17,  18,  19,  20, 
22,  23,  24,  26,  27,  28,  29,  31,  32,  33,  35,  36,  37,  39,  40,  41,  42, 
44,  45,  46,  48,  49,  50,  52,  53,  54,  56,  57,  59,  60,  61,  63,  64,  65, 
67,  68,  70,  71,  72,  74,  75,  77,  78,  80,  81,  82,  84,  85,  87,  88,  90, 
91,  93,  94,  96,  97,  99,  100, 102, 104, 105, 107, 108, 110, 112, 113, 115, 117, 
118, 120, 122, 124, 125, 127, 129, 131, 133, 134, 136, 138, 140, 142, 144, 146, 148, 
150, 152, 155, 157, 159, 161, 164, 166, 169, 171, 174, 176, 179, 182, 185, 188, 191, 
195, 198, 202, 206, 210, 215, 221, 227, 236};
unsigned int Pow2[14]={1,2,4,8,16,32,64,128,256,512,1024,2048,4096};
byte RSSdata[20]={7,6,6,5,5,5,4,4,4,4,3,3,3,3,3,3,3,2,2,2};
//---------------------------------------------------------------------------------//

2. ApproxFFT, fast_sine, fast_cosine, and fastRSS function need to be pasted at the end of the code.

3. Using function:

float f=Approx_FFT(data,sample,sampling_rate);

This function will return the value of frequency with max amplitude by default. This is exactly the same as EasyFFT and QuickFFT function.

first is the array of which we need to perform FFT,

second is the number of samples: ideally, it should be 2^n, which can be 2, 4, 8, 16, 32, 64, 128,..here the max number of sample is limited by the memory available

the third input is the sampling rate.

4. Accuracy setting:

this value can be set from 1 to 7. higher the number better accuracy. here improving the accuracy will improve the fit of our approximate sine wave. By default, the accuracy value is set to 5. In most cases, this value works well. All result and speed claims are based on this value. Speed and accuracy can be tweaked by changing this value.

int fast_sine(int Amp, int th)<br>{
int temp3,m1,m2;
byte temp1,temp2, test,quad,accuracy;
accuracy=5;    // set it value from 1 to 7, where 7 being most accurate but slowest
               // accuracy value of 5 recommended for typical applicaiton
while(th>1024){th=th-1024;}   // here 1024 = 2*pi or 360 deg
while(th<0){th=th+1024;}
.
.
.

5. This code can be modified to display (and use) Raw output or amplitude of various frequencies. the output will as multiple of 2^n. As integer can only hold up to +-32000 values, the output is represented as multiple.

6: Conclusion

1 / 2

All the test shown on the above image is done on Arduino mega with an accuracy level of 5. below are some of the significant observation on the test:

Speed is more than 3 times faster than conventional FFT,

Memory consumption is low (almost half),

The output is comparable with exact values (low error),

Hope this code will useful for projects. In case of any query or feedback, do let me know by comment or Email.

//---------------------------------lookup data------------------------------------//
byte isin_data[128]=
{0,  1,   3,   4,   5,   6,   8,   9,   10,  11,  13,  14,  15,  17,  18,  19,  20, 
22,  23,  24,  26,  27,  28,  29,  31,  32,  33,  35,  36,  37,  39,  40,  41,  42, 
44,  45,  46,  48,  49,  50,  52,  53,  54,  56,  57,  59,  60,  61,  63,  64,  65, 
67,  68,  70,  71,  72,  74,  75,  77,  78,  80,  81,  82,  84,  85,  87,  88,  90, 
91,  93,  94,  96,  97,  99,  100, 102, 104, 105, 107, 108, 110, 112, 113, 115, 117, 
118, 120, 122, 124, 125, 127, 129, 131, 133, 134, 136, 138, 140, 142, 144, 146, 148, 
150, 152, 155, 157, 159, 161, 164, 166, 169, 171, 174, 176, 179, 182, 185, 188, 191, 
195, 198, 202, 206, 210, 215, 221, 227, 236};
unsigned int Pow2[14]={1,2,4,8,16,32,64,128,256,512,1024,2048,4096};
byte RSSdata[20]={7,6,6,5,5,5,4,4,4,4,3,3,3,3,3,3,3,2,2,2};
//---------------------------------------------------------------------------------//



void setup() 
        {
     Serial.begin(250000);  
        }

void loop() {

//float f=Approx_FFT(data,256,100);
//Serial.println(f);
//delay(99999);
            }




//-----------------------------FFT Function----------------------------------------------//
/*
Code to perform High speed and Accurate FFT on arduino,
setup:

1. in[]     : Data array, 
2. N        : Number of sample (recommended sample size 2,4,8,16,32,64,128,256,512...)
3. Frequency: sampling frequency required as input (Hz)

It will by default return frequency with max aplitude,
if you need complex output or magnitudes uncomment required sections

If sample size is not in power of 2 it will be clipped to lower side of number. 
i.e, for 150 number of samples, code will consider first 128 sample, remaining sample  will be omitted.
For Arduino nano, FFT of more than 256 sample not possible due to mamory limitation 
Code by ABHILASH
Contact: abhilashpatel121@gmail.com
Documentation & details: https://www.instructables.com/member/abhilash_patel/instructables/

Update(06/05/21): Correction made for support on Arduino Due
*/

float Approx_FFT(int in[],int N,float Frequency)
{ 
int a,c1,f,o,x,data_max,data_min=0;
long data_avg,data_mag,temp11;         
byte scale,check=0;

data_max=0;
data_avg=0;
data_min=0;

      for(int i=0;i<12;i++)                 //calculating the levels
         { if(Pow2[i]<=N){o=i;}}
     a=Pow2[o];  
int out_r[a];   //real part of transform
int out_im[a];  //imaginory part of transform

      for(int i=0;i<a;i++)                //getting min max and average for scalling
          { out_r[i]=0; out_im[i]=0;
            data_avg=data_avg+in[i];
            if(in[i]>data_max){data_max=in[i];}
            if(in[i]<data_min){data_min=in[i];} 
          }

data_avg=data_avg>>o;
scale=0;
data_mag=data_max-data_min;
temp11=data_mag;

 //scalling data  from +512 to -512

      if(data_mag>1024)
          {while(temp11>1024)
                {temp11=temp11>>1;
                scale=scale+1;
                }   
          }
 
      if(data_mag<1024)
          {while(temp11<1024)
                {temp11=temp11<<1;
                scale=scale+1;
                }
          }


      if(data_mag>1024)
          {
              for(int i=0;i<a;i++)
                    { in[i]=in[i]-data_avg;
                      in[i]=in[i]>>scale;
                    }
                    scale=128-scale;
          }

      if(data_mag<1024)
          { scale=scale-1;
            for(int i=0;i<a;i++)
                    {
                      in[i]=in[i]-data_avg;
                      in[i]=in[i]<<scale;
                    }

                    scale=128+scale;
          }


x=0;  
      for(int b=0;b<o;b++)                     // bit reversal order stored in im_out array
         {
          c1=Pow2[b];
          f=Pow2[o]/(c1+c1);
                for(int j=0;j<c1;j++)
                    { 
                     x=x+1;
                     out_im[x]=out_im[j]+f;
                    }
         }

      for(int i=0;i<a;i++)            // update input array as per bit reverse order
         {
          out_r[i]=in[out_im[i]]; 
          out_im[i]=0;
         }


int i10,i11,n1,tr,ti;
float e;
int c,s,temp4;
    for(int i=0;i<o;i++)                                    //fft
    {
     i10=Pow2[i];              // overall values of sine/cosine  
     i11=Pow2[o]/Pow2[i+1];    // loop with similar sine cosine
     e=1024/Pow2[i+1];  //1024 is equivalent to 360 deg
     e=0-e;
     n1=0;

          for(int j=0;j<i10;j++)
          {
            c=e*j;    //c is angle as where 1024 unit is 360 deg
  while(c<0){c=c+1024;}
  while(c>1024){c=c-1024;}

          n1=j;
          
          for(int k=0;k<i11;k++)
                 {
                   temp4=i10+n1;
       if(c==0)   {tr=out_r[temp4];
                   ti=out_im[temp4];}
  else if(c==256) {tr= -out_im[temp4];
                   ti=out_r[temp4];}
  else if(c==512) {tr=-out_r[temp4];
                  ti=-out_im[temp4];}
  else if(c==768) {tr=out_im[temp4];
                   ti=-out_r[temp4];}
  else if(c==1024){tr=out_r[temp4];
                   ti=out_im[temp4];}
  else{
    tr=fast_cosine(out_r[temp4],c)-fast_sine(out_im[temp4],c);            //the fast sine/cosine function gives direct (approx) output for A*sinx
    ti=fast_sine(out_r[temp4],c)+fast_cosine(out_im[temp4],c);            
      }
          
                 out_r[n1+i10]=out_r[n1]-tr;
                 out_r[n1]=out_r[n1]+tr;
                 if(out_r[n1]>15000 || out_r[n1]<-15000){check=1;}   //check for int size, it can handle only +31000 to -31000,
          
                 out_im[n1+i10]=out_im[n1]-ti;
                 out_im[n1]=out_im[n1]+ti;
                 if(out_im[n1]>15000 || out_im[n1]<-15000){check=1;}          
          
                 n1=n1+i10+i10;
                  }       
             }

    if(check==1){                                             // scalling the matrics if value higher than 15000 to prevent varible from overflowing
                for(int i=0;i<a;i++)
                    {
                     out_r[i]=out_r[i]>>1;           
                     out_im[i]=out_im[i]>>1; 
                    }
                     check=0; 
                     scale=scale-1;                 // tracking overall scalling of input data
                }           

     }


if(scale>128)
    {scale=scale-128;
     for(int i=0;i<a;i++)
      {out_r[i]=out_r[i]>>scale;
       out_im[i]=out_im[i]>>scale;
      }
      scale=0;
    }                                                   // revers all scalling we done till here,
else{scale=128-scale;}                             // in case of nnumber getting higher than 32000, we will represent in as multiple of 2^scale

/*
for(int i=0;i<a;i++)
{
Serial.print(out_r[i]);Serial.print("\t");                    // un comment to print RAW o/p    
Serial.print(out_im[i]); 
Serial.print("i");Serial.print("\t"); 
Serial.print("*2^");Serial.println(scale); 
}
*/

//---> here onward out_r contains amplitude and our_in conntains frequency (Hz)
int fout,fm,fstp;
float fstep;
fstep=Frequency/N;
fstp=fstep;
fout=0;fm=0;

    for(int i=1;i<Pow2[o-1];i++)               // getting amplitude from compex number
        { 
              out_r[i]=fastRSS(out_r[i],out_im[i]);
   // Approx RSS function used to calculated magnitude quickly
        
out_im[i]=out_im[i-1]+fstp;
if (fout<out_r[i]){fm=i; fout=out_r[i];}
         
         // un comment to print Amplitudes (1st value (offset) is not printed)
         //Serial.print(out_r[i]); Serial.print("\t"); 
         //Serial.print("*2^");Serial.println(scale); 
        }


float fa,fb,fc;
fa=out_r[fm-1];
fb=out_r[fm]; 
fc=out_r[fm+1];
fstep=(fa*(fm-1)+fb*fm+fc*(fm+1))/(fa+fb+fc);

return(fstep*Frequency/N);
}

//---------------------------------fast sine/cosine---------------------------------------//

int fast_sine(int Amp, int th)
{
int temp3,m1,m2;
byte temp1,temp2, test,quad,accuracy;
accuracy=5;    // set it value from 1 to 7, where 7 being most accurate but slowest
               // accuracy value of 5 recommended for typical applicaiton
while(th>1024){th=th-1024;}   // here 1024 = 2*pi or 360 deg
while(th<0){th=th+1024;}
quad=th>>8;

       if(quad==1){th= 512-th;}
  else if(quad==2){th= th-512;}
  else if(quad==3){th= 1024-th;}

temp1= 0;
temp2= 128;     //2 multiple
m1=0;
m2=Amp;

    temp3=(m1+m2)>>1;
    Amp=temp3;
      for(int i=0;i<accuracy;i++)
        { test=(temp1+temp2)>>1;
          temp3=temp3>>1; 
          if(th>isin_data[test]){temp1=test; Amp=Amp+temp3; m1=Amp;}
          else if(th<isin_data[test]){temp2=test; Amp=Amp-temp3; m2=Amp;}
        }

         if(quad==2){Amp= 0-Amp;}
    else if(quad==3){Amp= 0-Amp;}
return(Amp);
}

int fast_cosine(int Amp, int th)
  {  
  th=256-th;  //cos th = sin (90-th) formula
  return(fast_sine(Amp,th)); 
  }

//--------------------------------------------------------------------------------//


//--------------------------------Fast RSS----------------------------------------//
int fastRSS(int a, int b)
{ if(a==0 && b==0){return(0);}
  int min,max,temp1,temp2;
  byte clevel;
  if(a<0){a=-a;}
  if(b<0){b=-b;}
clevel=0;
if(a>b){max=a;min=b;} else{max=b;min=a;}

  if(max>(min+min+min))
    {return max;}
  else
    {
     temp1=min>>3; if(temp1==0){temp1=1;}
     temp2=min;
     while(temp2<max){temp2=temp2+temp1;clevel=clevel+1;}
     temp2=RSSdata[clevel];temp1=temp1>>1;  
     for(int i=0;i<temp2;i++){max=max+temp1;}
     return(max);
    }
}
//--------------------------------------------------------------------------------//

//---------------------------------lookup data------------------------------------//
byte isin_data[128]=
{0,  1,   3,   4,   5,   6,   8,   9,   10,  11,  13,  14,  15,  17,  18,  19,  20, 
22,  23,  24,  26,  27,  28,  29,  31,  32,  33,  35,  36,  37,  39,  40,  41,  42, 
44,  45,  46,  48,  49,  50,  52,  53,  54,  56,  57,  59,  60,  61,  63,  64,  65, 
67,  68,  70,  71,  72,  74,  75,  77,  78,  80,  81,  82,  84,  85,  87,  88,  90, 
91,  93,  94,  96,  97,  99,  100, 102, 104, 105, 107, 108, 110, 112, 113, 115, 117, 
118, 120, 122, 124, 125, 127, 129, 131, 133, 134, 136, 138, 140, 142, 144, 146, 148, 
150, 152, 155, 157, 159, 161, 164, 166, 169, 171, 174, 176, 179, 182, 185, 188, 191, 
195, 198, 202, 206, 210, 215, 221, 227, 236};
unsigned int Pow2[14]={1,2,4,8,16,32,64,128,256,512,1024,2048,4096};
byte RSSdata[20]={7,6,6,5,5,5,4,4,4,4,3,3,3,3,3,3,3,2,2,2};
//---------------------------------------------------------------------------------//


//int data[256]={};


void setup() 
        {
     Serial.begin(250000);  
        }


void loop() {

//float f=Approx_FFT(data,512,100);
//Serial.println(f);
            }


//-----------------------------FFT Function----------------------------------------------//
/*
Code to perform High speed and Accurate FFT on arduino,
setup:

1. in[]     : Data array, 
2. N        : Number of sample (recommended sample size 2,4,8,16,32,64,128,256,512...)
3. Frequency: sampling frequency required as input (Hz)

It will by default return frequency with max aplitude,
if you need complex output or magnitudes uncomment required sections

If sample size is not in power of 2 it will be clipped to lower side of number. 
i.e, for 150 number of samples, code will consider first 128 sample, remaining sample  will be omitted.
For Arduino nano, FFT of more than 256 sample not possible due to mamory limitation 
Code by ABHILASH
Contact: abhilashpatel121@gmail.com
Documentation & details: https://www.instructables.com/member/abhilash_patel/instructables/
*/

float Approx_FFT(int in[],int N,float Frequency)
{ 
int a,c1,f,o,x,data_max,data_min=0;
long data_avg,data_mag,temp11;         
byte scale,check=0;

data_max=0;
data_avg=0;
data_min=0;

      for(int i=0;i<12;i++)                 //calculating the levels
         { if(Pow2[i]<=N){o=i;} }
     a=Pow2[o];  
int out_r[a];   //real part of transform
int out_im[a];  //imaginory part of transform

      for(int i=0;i<a;i++)                //getting min max and average for scalling
          { out_r[i]=0; out_r[i]=0;
            data_avg=data_avg+in[i];
            if(in[i]>data_max){data_max=in[i];}
            if(in[i]<data_min){data_min=in[i];}
          }

data_avg=data_avg>>o;
scale=0;
data_mag=data_max-data_min;
temp11=data_mag;

for(int i;i<128;i++)              //scalling data  from +512 to -512

      if(data_mag>1024)
          {while(temp11>1024)
                {temp11=temp11>>1;
                scale=scale+1;
                }   
          }
 
      if(data_mag<1024)
          {while(temp11<1024)
                {temp11=temp11<<1;
                scale=scale+1;
                }
          }


      if(data_mag>1024)
          {
              for(int i=0;i<a;i++)
                    { in[i]=in[i]-data_avg;
                      in[i]=in[i]>>scale;
                    }
                    scale=128-scale;
          }

      if(data_mag<1024)
          { scale=scale-1;
            for(int i=0;i<a;i++)
                    {
                      in[i]=in[i]-data_avg;
                      in[i]=in[i]<<scale;
                    }

                    scale=128+scale;
          }

       
x=0;  
      for(int b=0;b<o;b++)                     // bit reversal order stored in im_out array
         {
          c1=Pow2[b];
          f=Pow2[o]/(c1+c1);
                for(int j=0;j<c1;j++)
                    { 
                     x=x+1;
                     out_im[x]=out_im[j]+f;
                    }
         }

 
      for(int i=0;i<a;i++)            // update input array as per bit reverse order
         {
          out_r[i]=in[out_im[i]]; 
          out_im[i]=0;
         }


int i10,i11,n1,tr,ti;
float e;
int c,s,temp4;
    for(int i=0;i<o;i++)                                    //fft
    {
     i10=Pow2[i];              // overall values of sine/cosine  
     i11=Pow2[o]/Pow2[i+1];    // loop with similar sine cosine
     e=1024/Pow2[i+1];  //1024 is equivalent to 360 deg
     e=0-e;
     n1=0;

          for(int j=0;j<i10;j++)
          {
            c=e*j;    //c is angle as where 1024 unit is 360 deg
  while(c<0){c=c+1024;}
  while(c>1024){c=c-1024;}

          n1=j;
          
          for(int k=0;k<i11;k++)
                 {
                   temp4=i10+n1;
       if(c==0)   {tr=out_r[temp4];
                   ti=out_im[temp4];}
  else if(c==256) {tr= -out_im[temp4];
                   ti=out_r[temp4];}
  else if(c==512) {tr=-out_r[temp4];
                  ti=-out_im[temp4];}
  else if(c==768) {tr=out_im[temp4];
                   ti=-out_r[temp4];}
  else if(c==1024){tr=out_r[temp4];
                   ti=out_im[temp4];}
  else{
    tr=fast_cosine(out_r[temp4],c)-fast_sine(out_im[temp4],c);            //the fast sine/cosine function gives direct (approx) output for A*sinx
    ti=fast_sine(out_r[temp4],c)+fast_cosine(out_im[temp4],c);            
      }
          
                 out_r[n1+i10]=out_r[n1]-tr;
                 out_r[n1]=out_r[n1]+tr;
                 if(out_r[n1]>15000 || out_r[n1]<-15000){check=1;}   //check for int size, it can handle only +31000 to -31000,
          
                 out_im[n1+i10]=out_im[n1]-ti;
                 out_im[n1]=out_im[n1]+ti;
                 if(out_im[n1]>15000 || out_im[n1]<-15000){check=1;}          
          
                 n1=n1+i10+i10;
                  }       
             }

    if(check==1){                                             // scalling the matrics if value higher than 15000 to prevent varible from overflowing
                for(int i=0;i<a;i++)
                    {
                     out_r[i]=out_r[i]>>1;           
                     out_im[i]=out_im[i]>>1; 
                    }
                     check=0; 
                     scale=scale-1;                 // tracking overall scalling of input data
                }           

     }


if(scale>128)
    {scale=scale-128;
     for(int i=0;i<a;i++)
      {out_r[i]=out_r[i]>>scale;
       out_im[i]=out_im[i]>>scale;
      }
      scale=0;
    }                                                   // revers all scalling we done till here,
else{scale=128-scale;}                             // in case of nnumber getting higher than 32000, we will represent in as multiple of 2^scale

/*
for(int i=0;i<a;i++)
{
Serial.print(out_r[i]);Serial.print("\t");                    // un comment to print RAW o/p    
Serial.print(out_im[i]); 
Serial.print("i");Serial.print("\t"); 
Serial.print("*2^");Serial.println(scale); 
}
*/

//---> here onward out_r contains amplitude and our_in conntains frequency (Hz)
int fout,fm,fstp;
float fstep;
fstep=Frequency/N;
fstp=fstep;
fout=0;fm=0;

    for(int i=1;i<Pow2[o-1];i++)               // getting amplitude from compex number
        { 
              out_r[i]=fastRSS(out_r[i],out_im[i]);
   // Approx RSS function used to calculated magnitude quickly
        
out_im[i]=out_im[i-1]+fstp;
if (fout<out_r[i]){fm=i; fout=out_r[i];}
         /*
         // un comment to print Amplitudes (1st value (offset) is not printed)
         Serial.print(out_r[i]); Serial.print("\t"); 
         Serial.print("*2^");Serial.println(scale); 
         */ 
        }


float fa,fb,fc;
fa=out_r[fm-1];
fb=out_r[fm]; 
fc=out_r[fm+1];
fstep=(fa*(fm-1)+fb*fm+fc*(fm+1))/(fa+fb+fc);

return(fstep*Frequency/N);
}

//---------------------------------fast sine/cosine---------------------------------------//

int fast_sine(int Amp, int th)
{
int temp3,m1,m2;
byte temp1,temp2, test,quad,accuracy;
accuracy=5;    // set it value from 1 to 7, where 7 being most accurate but slowest
               // accuracy value of 5 recommended for typical applicaiton
while(th>1024){th=th-1024;}   // here 1024 = 2*pi or 360 deg
while(th<0){th=th+1024;}
quad=th>>8;

       if(quad==1){th= 512-th;}
  else if(quad==2){th= th-512;}
  else if(quad==3){th= 1024-th;}

temp1= 0;
temp2= 128;     //2 multiple
m1=0;
m2=Amp;

    temp3=(m1+m2)>>1;
    Amp=temp3;
      for(int i=0;i<accuracy;i++)
        { test=(temp1+temp2)>>1;
          temp3=temp3>>1; 
          if(th>isin_data[test]){temp1=test; Amp=Amp+temp3; m1=Amp;}
          else if(th<isin_data[test]){temp2=test; Amp=Amp-temp3; m2=Amp;}
        }

         if(quad==2){Amp= 0-Amp;}
    else if(quad==3){Amp= 0-Amp;}
return(Amp);
}

int fast_cosine(int Amp, int th)
  {  
  th=256-th;  //cos th = sin (90-th) formula
  return(fast_sine(Amp,th)); 
  }

//--------------------------------------------------------------------------------//


//--------------------------------Fast RSS----------------------------------------//
int fastRSS(int a, int b)
{ if(a==0 && b==0){return(0);}
  int min,max,temp1,temp2;
  byte clevel;
  if(a<0){a=-a;}
  if(b<0){b=-b;}
clevel=0;
if(a>b){max=a;min=b;} else{max=b;min=a;}

  if(max>(min+min+min))
    {return max;}
  else
    {
     temp1=min>>3; if(temp1==0){temp1=1;}
     temp2=min;
     while(temp2<max){temp2=temp2+temp1;clevel=clevel+1;}
     temp2=RSSdata[clevel];temp1=temp1>>1;  
     for(int i=0;i<temp2;i++){max=max+temp1;}
     return(max);
    }
}
//--------------------------------------------------------------------------------//

ApproxFFT: Fastest FFT Function for Arduino

Things used in this project

Hardware components

Software apps and online services

Story

1: Need of the ApproxFFT Function

2: Fast Sine and Cosine Function

3: Fast Root of Squared Sum (FastRSS)

4: Overall Calculation Flow:

5: Application

6: Conclusion

Code

ApproxFFT_V2_060521

ApproxFFT

ApproxFFT.ino

Credits

abhilash_patel

Comments

Embed the widget on your own site

ApproxFFT: Fastest FFT Function for Arduino

ApproxFFT: Fastest FFT Function for Arduino

Things used in this project

Hardware components

Software apps and online services

Story

1: Need of the ApproxFFT Function

2: Fast Sine and Cosine Function

3: Fast Root of Squared Sum (FastRSS)

4: Overall Calculation Flow:

5: Application

6: Conclusion

Code

ApproxFFT_V2_060521

ApproxFFT

ApproxFFT.ino

Credits

abhilash_patel

Comments

Related channels and tags