Array2D<Real> SingleGaussian::inverseMatrix(const Array2D<Real>& matrix) const {
  if (matrix.dim1() != matrix.dim2()) {
    throw EssentiaException("SingleGaussian: Cannot solve linear system because matrix is not a square matrix");
  }

  // make a copy to ensure that the computation of the inverse matrix is done with double precision
  Array2D<double> matrixDouble(matrix.dim1(), matrix.dim2());
  for (int row=0; row<matrix.dim1(); ++row)
    for (int col=0; col<matrix.dim2(); ++col)
      matrixDouble[row][col] = matrix[row][col];

  LU<double> solver(matrixDouble);
  if (!solver.isNonsingular()) {
    throw EssentiaException("SingleGaussian: Cannot solve linear system because matrix is singular");
  }

  int dim = matrixDouble.dim1();
  Array2D<double> identity(dim, dim, 0.0);
  for (int i=0; i<dim; i++) {
    identity[i][i] = 1.0;
  }

  // solve A*X = I; the solution X is the inverse of A
  Array2D<double> inverseDouble = solver.solve(identity);

  Array2D<Real> inverse(inverseDouble.dim1(), inverseDouble.dim2());
  for (int row=0; row<inverseDouble.dim1(); ++row)
    for (int col=0; col<inverseDouble.dim2(); ++col)
      inverse[row][col] = inverseDouble[row][col];

  return inverse;
}
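// A minimal sanity-check sketch, not part of the original source: it verifies that
// `inv` behaves as the inverse of `a` by checking that a * inv is close to the
// identity matrix. It only assumes the TNT-style Array2D interface used above; the
// function name and tolerance are illustrative.
bool isInverseOf(const Array2D<Real>& a, const Array2D<Real>& inv, Real tolerance) {
  if (a.dim1() != a.dim2() || inv.dim1() != inv.dim2() || a.dim1() != inv.dim1()) return false;
  int n = a.dim1();
  for (int i=0; i<n; ++i) {
    for (int j=0; j<n; ++j) {
      Real sum = 0.0;
      for (int k=0; k<n; ++k) sum += a[i][k] * inv[k][j];
      Real expected = (i == j) ? 1.0 : 0.0; // corresponding entry of the identity matrix
      if (fabs(sum - expected) > tolerance) return false;
    }
  }
  return true;
}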
// copy the contents of an Array2D into a Matrix, element by element, in row-major order
void assign_mat_ar2d(Matrix& m, const Array2D<double>& a) {
  m.NewMatrix(a.dim1(), a.dim2());
  Matrix::r_iterator p_m(m.begin());
  for (long i = 0; i < a.dim1(); ++i)
    for (long j = 0; j < a.dim2(); ++j) {
      *p_m = a[i][j];
      ++p_m;
    }
}
// z = x * y
void multiply(const Array2D<double>& x, const Array2D<double>& y, Array2D<double>& z) {
  assert(x.dim2() == y.dim1());
  int d = y.dim1();
  for (int i=0; i<x.dim1(); ++i) {
    for (int j=0; j<y.dim2(); ++j) {
      double sum = 0;
      for (int k=0; k<d; k++) {
        sum += x[i][k] * y[k][j];
      }
      z[i][j] = sum;
    }
  }
}
// target = transpose(src); target is assumed to be sized src.dim2() x src.dim1()
void transpose(const Array2D<double>& src, Array2D<double>& target) {
  for (int i=0; i<src.dim1(); ++i) {
    for (int j=0; j<src.dim2(); ++j) {
      target[j][i] = src[i][j];
    }
  }
}
vector<Real> SingleGaussian::meanMatrix(const Array2D<Real>& matrix, int dim = 1) const {
  int rows = matrix.dim1();
  int columns = matrix.dim2();
  vector<Real> means;

  if (dim == 1) {
    // mean of each column
    means.resize(columns);
    for (int j=0; j<columns; j++) {
      Real m = 0;
      for (int i=0; i<rows; i++) {
        m += matrix[i][j];
      }
      means[j] = m / rows;
    }
  }
  else if (dim == 2) {
    // mean of each row
    means.resize(rows);
    for (int i=0; i<rows; i++) {
      Real m = 0;
      for (int j=0; j<columns; j++) {
        m += matrix[i][j];
      }
      means[i] = m / columns;
    }
  }
  else {
    throw EssentiaException("SingleGaussian: The dimension for meanMatrix must be 1 or 2");
  }

  return means;
}
void compute_covariance_matrix(const Array2D<double>& d, Array2D<double>& covar_matrix) {
  int dim = d.dim2();
  assert(dim == covar_matrix.dim1());
  assert(dim == covar_matrix.dim2());

  // compute the upper triangle (the covariance matrix is symmetric)
  for (int i=0; i<dim; ++i) {
    for (int j=i; j<dim; ++j) {
      covar_matrix[i][j] = compute_covariance(d, i, j);
    }
  }

  // mirror it into the lower triangle
  for (int i=1; i<dim; i++) {
    for (int j=0; j<i; ++j) {
      covar_matrix[i][j] = covar_matrix[j][i];
    }
  }
}
Array2D<Real> SingleGaussian::transposeMatrix(const Array2D<Real>& matrix) const {
  int rows = matrix.dim1();
  int columns = matrix.dim2();
  Array2D<Real> transpose(columns, rows);
  for (int j=0; j<columns; j++) {
    for (int i=0; i<rows; i++) {
      transpose[j][i] = matrix[i][j];
    }
  }
  return transpose;
}
// center the data: subtract each column's mean in place and store the means
void adjust_data(Array2D<double>& d, Array1D<double>& means) {
  for (int i=0; i<d.dim2(); ++i) {
    // compute the mean of column i
    double mean = 0;
    for (int j=0; j<d.dim1(); ++j) {
      mean += d[j][i];
    }
    mean /= d.dim1();

    // store the mean
    means[i] = mean;

    // subtract the mean
    for (int j=0; j<d.dim1(); ++j) {
      d[j][i] -= mean;
    }
  }
}
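// Minimal usage sketch, not in the original source: the typical call order for these
// helpers is to center the data with adjust_data() and then build the covariance from
// the centered samples. The sizes are illustrative, and compute_covariance() is assumed
// to be defined elsewhere, as in compute_covariance_matrix() above.
void covariancePipelineExample() {
  int nSamples = 100, nFeatures = 12;
  Array2D<double> data(nSamples, nFeatures); // one sample per row, filled elsewhere
  Array1D<double> means(nFeatures);
  Array2D<double> covar(nFeatures, nFeatures);

  adjust_data(data, means);               // subtract per-column means in place
  compute_covariance_matrix(data, covar); // symmetric nFeatures x nFeatures result
}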
// This function finds the next change point in the matrix
int SBic::bicChangeSearch(const Array2D<Real>& matrix, int inc, int current) const {
  int nFeatures = matrix.dim1();
  int nFrames = matrix.dim2();
  Real d, dmin, penalty;
  Real s, s1, s2;
  Array2D<Real> half;
  int n1, n2, seg = 0, shift = inc-1;

  // according to the paper, the penalty coefficient should be:
  // penalty = 0.5*(3*nFeatures + nFeatures*nFeatures);
  penalty = _cpw * _cp * log(Real(nFrames));
  dmin = numeric_limits<Real>::max();

  // log-determinant for the entire window
  s = logDet(matrix);

  // loop over all mid positions
  while (shift < nFrames - inc) {
    // first part
    n1 = shift + 1;
    half = subarray(matrix, 0, nFeatures-1, 0, shift);
    s1 = logDet(half);

    // second part
    n2 = nFrames - n1;
    half = subarray(matrix, 0, nFeatures-1, shift+1, nFrames-1);
    s2 = logDet(half);

    d = 0.5 * (n1*s1 + n2*s2 - nFrames*s + penalty);

    if (d < dmin) {
      seg = shift;
      dmin = d;
    }

    shift += inc;
  }

  // no change point found: even the best candidate split does not lower the BIC
  if (dmin > 0) return 0;

  return current + seg;
}
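// Hypothetical usage sketch, not in the original source: scan a feature matrix
// (nFeatures x nFrames) for a single BIC change point, trying a candidate split every
// `inc` frames. It assumes access to an SBic instance and to bicChangeSearch(), which
// may be private in the actual class; a return value of 0 means no change point was
// detected inside the window.
void bicSearchExample(const SBic& sbic, const Array2D<Real>& features, int windowStart) {
  int inc = 60; // search resolution in frames; illustrative value
  int change = sbic.bicChangeSearch(features, inc, windowStart);
  if (change == 0) {
    // the best candidate split did not lower the BIC: treat the window as homogeneous
  }
  else {
    // `change` is the absolute frame index of the detected segment boundary
  }
}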
// This function computes the delta BIC. It is used to determine whether two
// consecutive segments follow the same probability distribution; if they do,
// the segments are joined.
Real SBic::delta_bic(const Array2D<Real>& matrix, Real segPoint) const {
  int nFeatures = matrix.dim1();
  int nFrames = matrix.dim2();
  Array2D<Real> half;
  Real s, s1, s2;

  // entire segment
  s = logDet(matrix);

  // first half
  half = subarray(matrix, 0, nFeatures-1, 0, int(segPoint));
  s1 = logDet(half);

  // second half
  half = subarray(matrix, 0, nFeatures-1, int(segPoint + 1), nFrames-1);
  s2 = logDet(half);

  return 0.5 * (segPoint*s1 + (nFrames - segPoint)*s2 - nFrames*s + _cpw*_cp*log(Real(nFrames)));
}
// This function returns the logarithm of the determinant of the (covariance) matrix.
// It may seem like magic that all of this can be computed in just a few lines...
Real SBic::logDet(const Array2D<Real>& matrix) const {
  // Remember dimensions are swapped: dim1 is the number of features and dim2 is the number of frames!
  // As we are computing the determinant of the covariance matrix, and this matrix is known to be
  // symmetric and positive definite, we can apply the Cholesky decomposition: A = LL*.
  // The determinant is then the product of the squares of the diagonal elements of the decomposed
  // matrix L. Since only the diagonal of the covariance matrix is used here (i.e. A is treated as
  // diagonal), l_ii = sqrt(a_ii), so prod(sqr(l_ii)) = prod(sqr(sqrt(a_ii))) = prod(a_ii): the
  // determinant of A is the product of its diagonal elements. And since we are computing the
  // log-determinant, log(prod(a_ii)) = sum(log(a_ii)).
  // http://en.wikipedia.org/wiki/Cholesky_decomposition
  int dim1 = matrix.dim1();
  int dim2 = matrix.dim2();
  vector<Real> mp(dim1, 0.0);
  vector<Real> vp(dim1, 0.0);
  Real a, logd = 0.0;
  Real z = 1.0 / Real(dim2);
  Real zz = z * z;

  // For the determinant we are only interested in the diagonal of the covariance matrix,
  // which for each feature i is:
  // 1/n * sum((x_i - mu_i)^2) = 1/n * (sum(x_i^2) - 2*mu_i*sum(x_i) + n*mu_i^2)
  //                           = 1/n * (sum(x_i^2) - 2*n*mu_i^2 + n*mu_i^2)
  //                           = 1/n * (sum(x_i^2) - n*mu_i^2)
  //                           = 1/n * sum(x_i^2) - mu_i^2
  // where mu_i is the mean of feature i, and n is the number of frames
  for (int i=0; i<dim1; ++i) {
    for (int j=0; j<dim2; ++j) {
      a = matrix[i][j];
      mp[i] += a;     // running sum, for the mean
      vp[i] += a * a; // running sum of squares
    }
  }

  // This code accumulates rounding errors, which causes bad behaviour when the input features
  // are constant. A possible solution would be to check against a higher threshold (1e-6), as
  // constant features should give a covariance of zero, because (x_i - mu)^2 = 0.
  Real diag_cov = 0.0; // diagonal values of the covariance matrix
  for (int j=0; j<dim1; ++j) {
    diag_cov = vp[j] * z - mp[j] * mp[j] * zz; // 1/n*sum(x_i^2) - mu_i^2
    // Although this could be zero when the input is constant, by definition it can never be
    // negative. However, due to rounding errors it does occasionally go negative, with values
    // of order 1e-9; hence values below the 1e-5 threshold are clamped, bounding that term's
    // contribution to the logarithm at -5.
    logd += diag_cov > 1e-5 ? log(diag_cov) : -5;
  }
  return logd;

  // Another way of computing the same as above, with possibly fewer rounding errors, but more expensive:
  //vector<Real> cov(dim1, 0.0);
  //for (int i=0; i<dim1; ++i) {
  //  Real mean = mp[i]/dim2;
  //  Real cov = 0.0;
  //  for (int j=0; j<dim2; ++j) {
  //    a = matrix[i][j];
  //    cov += (a-mean)*(a-mean);
  //  }
  //  cov /= dim2;
  //  logd += cov > 0 ? log(cov):-30;
  //}
}
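// Hypothetical regression sketch, not in the original source, for the rounding issue
// noted above: with constant features the true per-feature variance is exactly zero,
// so with the 1e-5 clamp each feature should contribute exactly -5 to the result.
// It assumes access to logDet(), which may be private in the actual class.
void constantInputExample(const SBic& sbic) {
  int nFeatures = 3, nFrames = 100;
  Array2D<Real> constant(nFeatures, nFrames, 1.0); // every frame identical
  Real logd = sbic.logDet(constant);
  // expected: logd == nFeatures * -5 == -15, since every per-feature variance falls
  // below the clamp threshold (possibly after going slightly negative due to rounding)
}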
Array2D<Real> SingleGaussian::covarianceMatrix(const Array2D<Real>& matrix, bool lowmem) const {
  int rows = matrix.dim1();
  int columns = matrix.dim2();

  vector<Real> means(columns, 0.0);
  Array2D<Real> cov(columns, columns);

  if (lowmem) {
    // compute the means first
    means = meanMatrix(matrix, 1);

    // compute the covariance matrix
    vector<Real> dim1(rows);
    for (int i=0; i<columns; i++) {
      Real m1 = means[i];
      for (int k=0; k<rows; k++) {
        dim1[k] = matrix[k][i] - m1;
      }
      for (int j=0; j<=i; j++) {
        // compute cov(i,j)
        Real covij = 0.0;
        Real m2 = means[j];
        for (int k=0; k<rows; k++) {
          covij += dim1[k] * (matrix[k][j] - m2);
        }
        covij /= (rows - 1); // unbiased estimator
        cov[i][j] = cov[j][i] = covij;
      }
    }
  }
  else {
    // much faster version, but uses a bit more memory

    // speed optimization: transpose the matrix so that it's in row-major order
    Array2D<Real> transpose = transposeMatrix(matrix);

    // compute the means first
    means = meanMatrix(matrix, 1);

    // end of optimization: subtract the means
    for (int i=0; i<columns; i++) {
      for (int j=0; j<rows; j++) {
        transpose[i][j] -= means[i];
      }
    }

    // compute the covariance matrix
    for (int i=0; i<columns; i++) {
      for (int j=0; j<=i; j++) {
        // compute cov(i,j)
        Real covij = 0.0;
        for (int k=0; k<rows; k++) {
          covij += transpose[i][k] * transpose[j][k];
        }
        covij /= (rows - 1); // unbiased estimator
        cov[i][j] = cov[j][i] = covij;
      }
    }
  }

  return cov;
}
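// Minimal usage sketch, not in the original source: both branches compute the same
// unbiased covariance estimate, so their outputs should agree within floating-point
// tolerance; lowmem=true avoids allocating the transposed copy at the cost of extra
// passes over the data. It assumes access to covarianceMatrix(), which may be private
// in the actual class.
void covarianceModesExample(const SingleGaussian& sg, const Array2D<Real>& features) {
  Array2D<Real> fast = sg.covarianceMatrix(features, false); // more memory, faster
  Array2D<Real> low  = sg.covarianceMatrix(features, true);  // less memory
  // fast[i][j] and low[i][j] should match up to rounding for all i, j
}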
void PCA::compute() {
  const Pool& poolIn = _poolIn.get();
  Pool& poolOut = _poolOut.get();

  // get data from the pool
  string nameIn = parameter("namespaceIn").toString();
  string nameOut = parameter("namespaceOut").toString();
  vector<vector<Real> > rawFeats = poolIn.value<vector<vector<Real> > >(nameIn);

  // how many dimensions are there?
  int bands = rawFeats[0].size();

  // calculate the covariance for this song's frames.
  // There used to be a covariance implementation by Vincent Akkerman here. I (eaylon) think it is
  // better and more maintainable to reuse an existing algorithm that computes the covariance.
  // Using the SingleGaussian algorithm seems to give slightly different results for the variances
  // (at the 8th decimal place).
  Array2D<Real> matrix, covMatrix, icov;
  vector<Real> means;
  matrix = vecvecToArray2D(rawFeats);

  Algorithm* sg = AlgorithmFactory::create("SingleGaussian");
  sg->input("matrix").set(matrix);
  sg->output("mean").set(means);
  sg->output("covariance").set(covMatrix);
  sg->output("inverseCovariance").set(icov);
  sg->compute();
  delete sg;

  // calculate the eigenvectors, get the eigenvector matrix
  Eigenvalue<Real> eigMatrixCalc(covMatrix);
  Array2D<Real> eigMatrix;
  eigMatrixCalc.getV(eigMatrix);

  // center the data
  int nFrames = rawFeats.size();
  for (int row=0; row<nFrames; row++) {
    for (int col=0; col<bands; col++) {
      rawFeats[row][col] -= means[col];
    }
  }

  // reduce the dimensions of eigMatrix, keeping the last columns (largest eigenvalues)
  int requiredDimensions = parameter("dimensions").toInt();
  if (requiredDimensions > eigMatrix.dim2() || requiredDimensions < 1)
    requiredDimensions = eigMatrix.dim2();

  Array2D<Real> reducedEig(eigMatrix.dim1(), requiredDimensions);
  for (int row=0; row<eigMatrix.dim1(); row++) {
    for (int column=0; column<requiredDimensions; column++) {
      reducedEig[row][column] = eigMatrix[row][column + eigMatrix.dim2() - requiredDimensions];
    }
  }

  // transform all the frames and add them to the output
  Array2D<Real> featVector(1, bands, 0.0);
  vector<Real> results = vector<Real>(requiredDimensions, 0.0);
  for (int row=0; row<nFrames; row++) {
    for (int col=0; col<bands; col++) {
      featVector[0][col] = rawFeats[row][col];
    }
    // store the product in a separate array: reassigning featVector here would shrink
    // it to 1 x requiredDimensions and make the writes above go out of bounds on the
    // next iteration when requiredDimensions < bands
    Array2D<Real> projected = matmult(featVector, reducedEig);
    for (int i=0; i<requiredDimensions; i++) {
      results[i] = projected[0][i];
    }
    poolOut.add(nameOut, results);
  }
}
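// Minimal standalone sketch, not in the original source, of the projection step that
// PCA::compute() performs for each frame: subtract the means, then multiply by the
// reduced eigenvector matrix (bands x requiredDimensions) whose columns correspond to
// the largest eigenvalues. All names here are illustrative.
vector<Real> projectFrame(const vector<Real>& frame, const vector<Real>& means,
                          const Array2D<Real>& reducedEig) {
  int dims = reducedEig.dim2();
  vector<Real> projected(dims, 0.0);
  for (int d=0; d<dims; ++d) {
    Real sum = 0.0;
    for (int b=0; b<(int)frame.size(); ++b) {
      sum += (frame[b] - means[b]) * reducedEig[b][d]; // (x - mu) . v_d
    }
    projected[d] = sum;
  }
  return projected;
}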
int BfrmScreener::SetData(Array2D<double>& AllData, Array2D<double>& AllMask, Array1D<int>& indicator,
                          Array2D<double>& FH, Array1D<double>& Weight, int yfactors,
                          Array1D<int>& VariablesIn, int nVarIn, bool bHasXMask) {
  int i, j;
  maOriginalIndex.clear();

  // X first: copy the y factors, then the selected variables
  mnVariables = indicator.dim() + yfactors;
  mbHasXMask = bHasXMask;
  mnSampleSize = FH.dim2();
  mX = Array2D<double>(mnVariables, mnSampleSize);
  if (mbHasXMask) {
    mXMask = Array2D<double>(mnVariables - yfactors, mnSampleSize);
  }
  for (j = 0; j < mnSampleSize; j++) {
    for (i = 0; i < yfactors; i++) {
      mX[i][j] = AllData[i][j];
    }
    for (i = 0; i < nVarIn; i++) {
      mX[i+yfactors][j] = AllData[yfactors + VariablesIn[i]][j];
    }
    if (mbHasXMask) {
      for (i = 0; i < nVarIn; i++) {
        mXMask[i][j] = AllMask[VariablesIn[i]][j];
      }
    }
  }
  for (i = 0; i < nVarIn; i++) {
    maOriginalIndex.push_back(VariablesIn[i]);
  }

  // append the remaining variables flagged by the indicator
  int count = nVarIn + yfactors;
  for (i = 0; i < indicator.dim(); i++) {
    if (indicator[i]) {
      for (j = 0; j < mnSampleSize; j++) {
        mX[count][j] = AllData[yfactors + i][j];
      }
      if (mbHasXMask) {
        for (j = 0; j < mnSampleSize; j++) {
          mXMask[count-yfactors][j] = AllMask[i][j];
        }
      }
      count++;
      maOriginalIndex.push_back(i);
    }
  }

  // mH: store the transpose of FH
  mH = Array2D<double>(FH.dim2(), FH.dim1());
  for (i = 0; i < FH.dim1(); i++) {
    for (j = 0; j < FH.dim2(); j++) {
      mH[j][i] = FH[i][j];
    }
  }

  // mWeight: copy the weights and count the leave-out samples (weight == 0)
  mnLeaveout = 0;
  mWeight = Array1D<double>(Weight.dim());
  for (i = 0; i < mWeight.dim(); i++) {
    mWeight[i] = Weight[i];
    if (mWeight[i] == 0) {
      mnLeaveout++;
    }
  }
  return 1;
}