main () { short input[256] __attribute__((aligned(16))); short input2[256] __attribute__((aligned(16))); short output[256] __attribute__((aligned(16))); int i; Zero_Buffer(output,256*2); for (i=0;i<16;i+=2) printf("output[%d] = %d + %d i\n",i,output[i],output[i+1]); input[0] = 100; input[1] = 200; input[2] = 100; input[3] = 200; input[4] = 1234; input[5] = -1234; input[6] = 1234; input[7] = -1234; input[8] = 100; input[9] = 200; input[10] = 100; input[11] = 200; input[12] = 1000; input[13] = 2000; input[14] = 1000; input[15] = 2000; input2[0] = 1; input2[1] = 2; input2[2] = 1; input2[3] = 2; input2[4] = 10; input2[5] = 20; input2[6] = 10; input2[7] = 20; input2[8] = 1; input2[9] = 2; input2[10] = 1; input2[11] = 2; input2[12] = 1000; input2[13] = 2000; input2[14] = 1000; input2[15] = 2000; mult_cpx_vector_h(input2,input2,output,8,0,1); for (i=0;i<16;i+=2) printf("output[%d] = %d + %d i\n",i,output[i],output[i+1]); Zero_Buffer(output,256*2); mult_cpx_vector_h(input2,input2,output,8,0,-1); for (i=0;i<16;i+=2) printf("output[%d] = %d + %d i\n",i,output[i],output[i+1]); }
int mult_cpx_matrix_vector(int *x1[2][2], int *x2[2], int *y[2], unsigned int N, unsigned short output_shift) { Zero_Buffer(y[0],N*8); Zero_Buffer(y[1],N*8); // this computes x1*x2 and stores it in y (32 bit) mult_cpx_vector_add32((short*)x2[0],(short*)x1[0][0],(short*)y[0],N); mult_cpx_vector_add32((short*)x2[1],(short*)x1[0][1],(short*)y[0],N); mult_cpx_vector_add32((short*)x2[0],(short*)x1[1][0],(short*)y[1],N); mult_cpx_vector_add32((short*)x2[1],(short*)x1[1][1],(short*)y[1],N); // shift and pack shift_and_pack((short*)y[0],N,output_shift); shift_and_pack((short*)y[1],N,output_shift); }
main () { short input[256] __attribute__((aligned(16))); short output[256] __attribute__((aligned(16))); int i; struct complex16 alpha; Zero_Buffer(output,256*2); input[0] = 100; input[1] = 200; input[2] = 100; input[3] = 200; input[4] = 1234; input[5] = -1234; input[6] = 1234; input[7] = -1234; input[8] = 100; input[9] = 200; input[10] = 100; input[11] = 200; input[12] = 1000; input[13] = 2000; input[14] = 1000; input[15] = 2000; alpha.r = 10; alpha.i = -10; add_cpx_vector(input,(short*) &alpha,input,8); for (i=0; i<16; i+=2) printf("output[%d] = %d + %d i\n",i,input[i],input[i+1]); }
main () { short x1_00[256] __attribute__((aligned(16))); short x1_10[256] __attribute__((aligned(16))); short x1_01[256] __attribute__((aligned(16))); short x1_11[256] __attribute__((aligned(16))); short x2_0[256] __attribute__((aligned(16))); short x2_1[256] __attribute__((aligned(16))); short y_0[256] __attribute__((aligned(16))); short y_1[256] __attribute__((aligned(16))); int *x1[2][2]; int *x2[2]; int *y[2]; int i,m,n; x1[0][0] = (int*)x1_00; x1[0][1] = (int*)x1_01; x1[1][0] = (int*)x1_10; x1[1][1] = (int*)x1_11; x2[0] = (int*)x2_0; x2[1] = (int*)x2_1; y[0] = (int*)y_0; y[1] = (int*)y_1; for(m=0; m<2; m++) { for(n=0; n<2; n++) { for(i=0; i<256; i+=4) { ((short*)x1[m][n])[i] = ((short) rand())/4; ((short*)x1[m][n])[i+1] = ((short) rand())/4; ((short*)x1[m][n])[i+2] = -((short*)x1[m][n])[i+1]; ((short*)x1[m][n])[i+3] = ((short*)x1[m][n])[i]; } } for(i=0; i<256; i+=4) { ((short*)x2[m])[i] = ((short) rand())/4; ((short*)x2[m])[i+1] = ((short) rand())/4; ((short*)x2[m])[i+2] = ((short*)x2[m])[i]; ((short*)x2[m])[i+3] = ((short*)x2[m])[i+1]; } Zero_Buffer(y[m],512); } /* input[0] = 100; input[1] = 200; input[2] = -200; input[3] = 100; input[4] = 1000; input[5] = 2000; input[6] = -2000; input[7] = 1000; input[8] = 100; input[9] = 200; input[10] = -200; input[11] = 100; input[12] = 1000; input[13] = 2000; input[14] = -2000; input[15] = 1000; input2[0] = 2; input2[1] = 1; input2[2] = 2; input2[3] = 1; input2[4] = 20; input2[5] = 10; input2[6] = 20; input2[7] = 10; input2[8] = 2; input2[9] = 1; input2[10] = 2; input2[11] = 1; input2[12] = 2000; input2[13] = 1000; input2[14] = 2000; input2[15] = 1000; x1[0][0] = (int*)input; x1[0][1] = (int*)input; x1[1][0] = (int*)input; x1[1][1] = (int*)input; x2[0] = (int*)input2; x2[1] = (int*)input2; y[0] = (int*)output; y[1] = (int*)output2; */ mult_cpx_matrix_vector(x1,x2,y,64,15); //mult_cpx_vector_add32(x2[0],x1[0][0],y[0],64); for (i=0; i<128; i+=2) printf("i=%d, x1 = [%d+1i*%d %d+1i*%d; %d+1i*%d %d+1i*%d]; x2 = [%d+1i*%d; %d+1i*%d]; y = [%d+1i*%d; %d+1i*%d]; y_m= round(x1*x2./pow2(15)); y-y_m \n", i, ((short*)x1[0][0])[2*i], ((short*)x1[0][0])[2*i+2], ((short*)x1[0][1])[2*i], ((short*)x1[0][1])[2*i+2], ((short*)x1[1][0])[2*i], ((short*)x1[1][0])[2*i+2], ((short*)x1[1][1])[2*i], ((short*)x1[1][1])[2*i+2], ((short*)x2[0])[2*i], ((short*)x2[0])[2*i+1], ((short*)x2[1])[2*i], ((short*)x2[1])[2*i+1], ((short*)y[0])[2*i], ((short*)y[0])[2*i+1], ((short*)y[1])[2*i], ((short*)y[1])[2*i+1]); //((int*)y[0])[i], ((int*)y[0])[i+1], //((int*)y[1])[i], ((int*)y[1])[i+1]); }