bool operator< (const rset<Type1, Type2, Type1Equality, Type2Equality>& x, const rset<Type1, Type2, Type1Equality, Type2Equality>& y) { typedef typename rset<Type1, Type2, Type1Equality, Type2Equality>::reference reference; const std::size_t x_size(x.size()); const std::size_t y_size(y.size()); for (std::size_t i(0); ; ++i) { if (i >= y_size) return false; if (i >= x_size) return true; reference x_ref(x.at(i)); reference y_ref(y.at(i)); if (y_ref.first < x_ref.first || (y_ref.first == x_ref.first && y_ref.second < x_ref.second)) return false; if (x_ref.first < y_ref.first || (x_ref.first == y_ref.first && x_ref.second < y_ref.second)) return true; } }
BOOST_AUTO_TEST_CASE_TEMPLATE( prefix_sum, DeviceType, DTK_SEARCH_DEVICE_TYPES ) { int const n = 10; Kokkos::View<int *, DeviceType> x( "x", n ); std::vector<int> x_ref( n, 1 ); x_ref.back() = 0; auto x_host = Kokkos::create_mirror_view( x ); for ( int i = 0; i < n; ++i ) x_host( i ) = x_ref[i]; Kokkos::deep_copy( x, x_host ); Kokkos::View<int *, DeviceType> y( "y", n ); ArborX::exclusivePrefixSum( x, y ); std::vector<int> y_ref( n ); std::iota( y_ref.begin(), y_ref.end(), 0 ); auto y_host = Kokkos::create_mirror_view( y ); Kokkos::deep_copy( y_host, y ); Kokkos::deep_copy( x_host, x ); BOOST_TEST( y_host == y_ref, tt::per_element() ); BOOST_TEST( x_host == x_ref, tt::per_element() ); // in-place ArborX::exclusivePrefixSum( x, x ); Kokkos::deep_copy( x_host, x ); BOOST_TEST( x_host == y_ref, tt::per_element() ); int const m = 11; BOOST_TEST( n != m ); Kokkos::View<int *, DeviceType> z( "z", m ); BOOST_CHECK_THROW( ArborX::exclusivePrefixSum( x, z ), ArborX::SearchException ); Kokkos::View<double[3], DeviceType> v( "v" ); auto v_host = Kokkos::create_mirror_view( v ); v_host( 0 ) = 1.; v_host( 1 ) = 1.; v_host( 2 ) = 0.; Kokkos::deep_copy( v, v_host ); ArborX::exclusivePrefixSum( v ); Kokkos::deep_copy( v_host, v ); std::vector<double> v_ref = {0., 1., 2.}; BOOST_TEST( v_host == v_ref, tt::per_element() ); Kokkos::View<double *, DeviceType> w( "w", 4 ); BOOST_CHECK_THROW( ArborX::exclusivePrefixSum( v, w ), ArborX::SearchException ); v_host( 0 ) = 1.; v_host( 1 ) = 0.; v_host( 2 ) = 0.; Kokkos::deep_copy( v, v_host ); Kokkos::resize( w, 3 ); ArborX::exclusivePrefixSum( v, w ); auto w_host = Kokkos::create_mirror_view( w ); Kokkos::deep_copy( w_host, w ); std::vector<double> w_ref = {0., 1., 1.}; BOOST_TEST( w_host == w_ref, tt::per_element() ); }
// Ans = Ax static void compute_Ax( char ***A, FLOAT ***L, FLOAT ***x, FLOAT ***ans, int n ) { FLOAT h2 = 1.0/(n*n); OPENMP_FOR FOR_EVERY_COMP(n) { if( A[i][j][k] == FLUID ) { ans[i][j][k] = (6.0*x[i][j][k] -x_ref(A,L,x,i,j,k,i+1,j,k,n)-x_ref(A,L,x,i,j,k,i-1,j,k,n) -x_ref(A,L,x,i,j,k,i,j+1,k,n)-x_ref(A,L,x,i,j,k,i,j-1,k,n) -x_ref(A,L,x,i,j,k,i,j,k+1,n)-x_ref(A,L,x,i,j,k,i,j,k-1,n))/h2; } else { ans[i][j][k] = 0.0; } } END_FOR; }
int vectester(const PairScalars& example, const std::string& testname) { typedef typename Antioch::value_type<PairScalars>::type Scalar; const unsigned int n_data(40); std::vector<Scalar> x_ref(n_data,0),y_ref(n_data,0); const Scalar min = -5L; const Scalar max = 8L; fill_ref(x_ref,y_ref,n_data,min, max); Antioch::GSLSpliner gsl_spline(x_ref,y_ref); // Construct from example to avoid resizing issues PairScalars x = example; for (unsigned int tuple=0; tuple != ANTIOCH_N_TUPLES; ++tuple) { x[2*tuple] = -3.5; x[2*tuple+1] = 5.1; } int return_flag = 0; #ifdef ANTIOCH_HAVE_GRVY gt.BeginTimer(testname); #endif const PairScalars gsl = gsl_spline.interpolated_value(x); #ifdef ANTIOCH_HAVE_GRVY gt.EndTimer(testname); #endif const PairScalars exact = function(x); for (unsigned int tuple=0; tuple != ANTIOCH_N_TUPLES; ++tuple) { return_flag = check_value<Scalar>(exact[2*tuple], gsl[2*tuple], x[2*tuple], "gsl vectorized") || return_flag; return_flag = check_value<Scalar>(exact[2*tuple+1], gsl[2*tuple+1], x[2*tuple+1], "gsl vectorized") || return_flag; } return return_flag; }
/* Subroutine */ int dlaexc_(logical *wantq, integer *n, doublereal *t, integer *ldt, doublereal *q, integer *ldq, integer *j1, integer *n1, integer *n2, doublereal *work, integer *info) { /* -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= DLAEXC swaps adjacent diagonal blocks T11 and T22 of order 1 or 2 in an upper quasi-triangular matrix T by an orthogonal similarity transformation. T must be in Schur canonical form, that is, block upper triangular with 1-by-1 and 2-by-2 diagonal blocks; each 2-by-2 diagonal block has its diagonal elemnts equal and its off-diagonal elements of opposite sign. Arguments ========= WANTQ (input) LOGICAL = .TRUE. : accumulate the transformation in the matrix Q; = .FALSE.: do not accumulate the transformation. N (input) INTEGER The order of the matrix T. N >= 0. T (input/output) DOUBLE PRECISION array, dimension (LDT,N) On entry, the upper quasi-triangular matrix T, in Schur canonical form. On exit, the updated matrix T, again in Schur canonical form. LDT (input) INTEGER The leading dimension of the array T. LDT >= max(1,N). Q (input/output) DOUBLE PRECISION array, dimension (LDQ,N) On entry, if WANTQ is .TRUE., the orthogonal matrix Q. On exit, if WANTQ is .TRUE., the updated matrix Q. If WANTQ is .FALSE., Q is not referenced. LDQ (input) INTEGER The leading dimension of the array Q. LDQ >= 1; and if WANTQ is .TRUE., LDQ >= N. J1 (input) INTEGER The index of the first row of the first block T11. N1 (input) INTEGER The order of the first block T11. N1 = 0, 1 or 2. N2 (input) INTEGER The order of the second block T22. N2 = 0, 1 or 2. WORK (workspace) DOUBLE PRECISION array, dimension (N) INFO (output) INTEGER = 0: successful exit = 1: the transformed matrix T would be too far from Schur form; the blocks are not swapped and T and Q are unchanged. ===================================================================== Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; static integer c__4 = 4; static logical c_false = FALSE_; static integer c_n1 = -1; static integer c__2 = 2; static integer c__3 = 3; /* System generated locals */ integer q_dim1, q_offset, t_dim1, t_offset, i__1; doublereal d__1, d__2, d__3, d__4, d__5, d__6; /* Local variables */ static integer ierr; static doublereal temp; extern /* Subroutine */ int drot_(integer *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *); static doublereal d__[16] /* was [4][4] */; static integer k; static doublereal u[3], scale, x[4] /* was [2][2] */, dnorm; static integer j2, j3, j4; static doublereal xnorm, u1[3], u2[3]; extern /* Subroutine */ int dlanv2_(doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *), dlasy2_( logical *, logical *, integer *, integer *, integer *, doublereal *, integer *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *); static integer nd; static doublereal cs, t11, t22; extern doublereal dlamch_(char *); static doublereal t33; extern doublereal dlange_(char *, integer *, integer *, doublereal *, integer *, doublereal *); extern /* Subroutine */ int dlarfg_(integer *, doublereal *, doublereal *, integer *, doublereal *); static doublereal sn; extern /* Subroutine */ int dlacpy_(char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), dlartg_(doublereal *, doublereal *, doublereal *, doublereal *, doublereal *), dlarfx_(char *, integer *, integer *, doublereal *, doublereal *, doublereal *, integer *, doublereal *); static doublereal thresh, smlnum, wi1, wi2, wr1, wr2, eps, tau, tau1, tau2; #define d___ref(a_1,a_2) d__[(a_2)*4 + a_1 - 5] #define q_ref(a_1,a_2) q[(a_2)*q_dim1 + a_1] #define t_ref(a_1,a_2) t[(a_2)*t_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*2 + a_1 - 3] t_dim1 = *ldt; t_offset = 1 + t_dim1 * 1; t -= t_offset; q_dim1 = *ldq; q_offset = 1 + q_dim1 * 1; q -= q_offset; --work; /* Function Body */ *info = 0; /* Quick return if possible */ if (*n == 0 || *n1 == 0 || *n2 == 0) { return 0; } if (*j1 + *n1 > *n) { return 0; } j2 = *j1 + 1; j3 = *j1 + 2; j4 = *j1 + 3; if (*n1 == 1 && *n2 == 1) { /* Swap two 1-by-1 blocks. */ t11 = t_ref(*j1, *j1); t22 = t_ref(j2, j2); /* Determine the transformation to perform the interchange. */ d__1 = t22 - t11; dlartg_(&t_ref(*j1, j2), &d__1, &cs, &sn, &temp); /* Apply transformation to the matrix T. */ if (j3 <= *n) { i__1 = *n - *j1 - 1; drot_(&i__1, &t_ref(*j1, j3), ldt, &t_ref(j2, j3), ldt, &cs, &sn); } i__1 = *j1 - 1; drot_(&i__1, &t_ref(1, *j1), &c__1, &t_ref(1, j2), &c__1, &cs, &sn); t_ref(*j1, *j1) = t22; t_ref(j2, j2) = t11; if (*wantq) { /* Accumulate transformation in the matrix Q. */ drot_(n, &q_ref(1, *j1), &c__1, &q_ref(1, j2), &c__1, &cs, &sn); } } else { /* Swapping involves at least one 2-by-2 block. Copy the diagonal block of order N1+N2 to the local array D and compute its norm. */ nd = *n1 + *n2; dlacpy_("Full", &nd, &nd, &t_ref(*j1, *j1), ldt, d__, &c__4); dnorm = dlange_("Max", &nd, &nd, d__, &c__4, &work[1]); /* Compute machine-dependent threshold for test for accepting swap. */ eps = dlamch_("P"); smlnum = dlamch_("S") / eps; /* Computing MAX */ d__1 = eps * 10. * dnorm; thresh = max(d__1,smlnum); /* Solve T11*X - X*T22 = scale*T12 for X. */ dlasy2_(&c_false, &c_false, &c_n1, n1, n2, d__, &c__4, &d___ref(*n1 + 1, *n1 + 1), &c__4, &d___ref(1, *n1 + 1), &c__4, &scale, x, & c__2, &xnorm, &ierr); /* Swap the adjacent diagonal blocks. */ k = *n1 + *n1 + *n2 - 3; switch (k) { case 1: goto L10; case 2: goto L20; case 3: goto L30; } L10: /* N1 = 1, N2 = 2: generate elementary reflector H so that: ( scale, X11, X12 ) H = ( 0, 0, * ) */ u[0] = scale; u[1] = x_ref(1, 1); u[2] = x_ref(1, 2); dlarfg_(&c__3, &u[2], u, &c__1, &tau); u[2] = 1.; t11 = t_ref(*j1, *j1); /* Perform swap provisionally on diagonal block in D. */ dlarfx_("L", &c__3, &c__3, u, &tau, d__, &c__4, &work[1]); dlarfx_("R", &c__3, &c__3, u, &tau, d__, &c__4, &work[1]); /* Test whether to reject swap. Computing MAX */ d__4 = (d__1 = d___ref(3, 1), abs(d__1)), d__5 = (d__2 = d___ref(3, 2) , abs(d__2)), d__4 = max(d__4,d__5), d__5 = (d__3 = d___ref(3, 3) - t11, abs(d__3)); if (max(d__4,d__5) > thresh) { goto L50; } /* Accept swap: apply transformation to the entire matrix T. */ i__1 = *n - *j1 + 1; dlarfx_("L", &c__3, &i__1, u, &tau, &t_ref(*j1, *j1), ldt, &work[1]); dlarfx_("R", &j2, &c__3, u, &tau, &t_ref(1, *j1), ldt, &work[1]); t_ref(j3, *j1) = 0.; t_ref(j3, j2) = 0.; t_ref(j3, j3) = t11; if (*wantq) { /* Accumulate transformation in the matrix Q. */ dlarfx_("R", n, &c__3, u, &tau, &q_ref(1, *j1), ldq, &work[1]); } goto L40; L20: /* N1 = 2, N2 = 1: generate elementary reflector H so that: H ( -X11 ) = ( * ) ( -X21 ) = ( 0 ) ( scale ) = ( 0 ) */ u[0] = -x_ref(1, 1); u[1] = -x_ref(2, 1); u[2] = scale; dlarfg_(&c__3, u, &u[1], &c__1, &tau); u[0] = 1.; t33 = t_ref(j3, j3); /* Perform swap provisionally on diagonal block in D. */ dlarfx_("L", &c__3, &c__3, u, &tau, d__, &c__4, &work[1]); dlarfx_("R", &c__3, &c__3, u, &tau, d__, &c__4, &work[1]); /* Test whether to reject swap. Computing MAX */ d__4 = (d__1 = d___ref(2, 1), abs(d__1)), d__5 = (d__2 = d___ref(3, 1) , abs(d__2)), d__4 = max(d__4,d__5), d__5 = (d__3 = d___ref(1, 1) - t33, abs(d__3)); if (max(d__4,d__5) > thresh) { goto L50; } /* Accept swap: apply transformation to the entire matrix T. */ dlarfx_("R", &j3, &c__3, u, &tau, &t_ref(1, *j1), ldt, &work[1]); i__1 = *n - *j1; dlarfx_("L", &c__3, &i__1, u, &tau, &t_ref(*j1, j2), ldt, &work[1]); t_ref(*j1, *j1) = t33; t_ref(j2, *j1) = 0.; t_ref(j3, *j1) = 0.; if (*wantq) { /* Accumulate transformation in the matrix Q. */ dlarfx_("R", n, &c__3, u, &tau, &q_ref(1, *j1), ldq, &work[1]); } goto L40; L30: /* N1 = 2, N2 = 2: generate elementary reflectors H(1) and H(2) so that: H(2) H(1) ( -X11 -X12 ) = ( * * ) ( -X21 -X22 ) ( 0 * ) ( scale 0 ) ( 0 0 ) ( 0 scale ) ( 0 0 ) */ u1[0] = -x_ref(1, 1); u1[1] = -x_ref(2, 1); u1[2] = scale; dlarfg_(&c__3, u1, &u1[1], &c__1, &tau1); u1[0] = 1.; temp = -tau1 * (x_ref(1, 2) + u1[1] * x_ref(2, 2)); u2[0] = -temp * u1[1] - x_ref(2, 2); u2[1] = -temp * u1[2]; u2[2] = scale; dlarfg_(&c__3, u2, &u2[1], &c__1, &tau2); u2[0] = 1.; /* Perform swap provisionally on diagonal block in D. */ dlarfx_("L", &c__3, &c__4, u1, &tau1, d__, &c__4, &work[1]) ; dlarfx_("R", &c__4, &c__3, u1, &tau1, d__, &c__4, &work[1]) ; dlarfx_("L", &c__3, &c__4, u2, &tau2, &d___ref(2, 1), &c__4, &work[1]); dlarfx_("R", &c__4, &c__3, u2, &tau2, &d___ref(1, 2), &c__4, &work[1]); /* Test whether to reject swap. Computing MAX */ d__5 = (d__1 = d___ref(3, 1), abs(d__1)), d__6 = (d__2 = d___ref(3, 2) , abs(d__2)), d__5 = max(d__5,d__6), d__6 = (d__3 = d___ref(4, 1), abs(d__3)), d__5 = max(d__5,d__6), d__6 = (d__4 = d___ref(4, 2), abs(d__4)); if (max(d__5,d__6) > thresh) { goto L50; } /* Accept swap: apply transformation to the entire matrix T. */ i__1 = *n - *j1 + 1; dlarfx_("L", &c__3, &i__1, u1, &tau1, &t_ref(*j1, *j1), ldt, &work[1]); dlarfx_("R", &j4, &c__3, u1, &tau1, &t_ref(1, *j1), ldt, &work[1]); i__1 = *n - *j1 + 1; dlarfx_("L", &c__3, &i__1, u2, &tau2, &t_ref(j2, *j1), ldt, &work[1]); dlarfx_("R", &j4, &c__3, u2, &tau2, &t_ref(1, j2), ldt, &work[1]); t_ref(j3, *j1) = 0.; t_ref(j3, j2) = 0.; t_ref(j4, *j1) = 0.; t_ref(j4, j2) = 0.; if (*wantq) { /* Accumulate transformation in the matrix Q. */ dlarfx_("R", n, &c__3, u1, &tau1, &q_ref(1, *j1), ldq, &work[1]); dlarfx_("R", n, &c__3, u2, &tau2, &q_ref(1, j2), ldq, &work[1]); } L40: if (*n2 == 2) { /* Standardize new 2-by-2 block T11 */ dlanv2_(&t_ref(*j1, *j1), &t_ref(*j1, j2), &t_ref(j2, *j1), & t_ref(j2, j2), &wr1, &wi1, &wr2, &wi2, &cs, &sn); i__1 = *n - *j1 - 1; drot_(&i__1, &t_ref(*j1, *j1 + 2), ldt, &t_ref(j2, *j1 + 2), ldt, &cs, &sn); i__1 = *j1 - 1; drot_(&i__1, &t_ref(1, *j1), &c__1, &t_ref(1, j2), &c__1, &cs, & sn); if (*wantq) { drot_(n, &q_ref(1, *j1), &c__1, &q_ref(1, j2), &c__1, &cs, & sn); } } if (*n1 == 2) { /* Standardize new 2-by-2 block T22 */ j3 = *j1 + *n2; j4 = j3 + 1; dlanv2_(&t_ref(j3, j3), &t_ref(j3, j4), &t_ref(j4, j3), &t_ref(j4, j4), &wr1, &wi1, &wr2, &wi2, &cs, &sn); if (j3 + 2 <= *n) { i__1 = *n - j3 - 1; drot_(&i__1, &t_ref(j3, j3 + 2), ldt, &t_ref(j4, j3 + 2), ldt, &cs, &sn); } i__1 = j3 - 1; drot_(&i__1, &t_ref(1, j3), &c__1, &t_ref(1, j4), &c__1, &cs, &sn) ; if (*wantq) { drot_(n, &q_ref(1, j3), &c__1, &q_ref(1, j4), &c__1, &cs, &sn) ; } } } return 0; /* Exit with INFO = 1 if swap was rejected. */ L50: *info = 1; return 0; /* End of DLAEXC */ } /* dlaexc_ */
/* Subroutine */ int stbt03_(char *uplo, char *trans, char *diag, integer *n, integer *kd, integer *nrhs, real *ab, integer *ldab, real *scale, real *cnorm, real *tscal, real *x, integer *ldx, real *b, integer * ldb, real *work, real *resid) { /* System generated locals */ integer ab_dim1, ab_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; real r__1, r__2, r__3; /* Local variables */ static integer j; extern logical lsame_(char *, char *); extern /* Subroutine */ int sscal_(integer *, real *, real *, integer *); static real xscal; extern /* Subroutine */ int stbmv_(char *, char *, char *, integer *, integer *, real *, integer *, real *, integer *), scopy_(integer *, real *, integer *, real *, integer *); static real tnorm, xnorm; extern /* Subroutine */ int saxpy_(integer *, real *, real *, integer *, real *, integer *), slabad_(real *, real *); static integer ix; extern doublereal slamch_(char *); static real bignum; extern integer isamax_(integer *, real *, integer *); static real smlnum, eps, err; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define ab_ref(a_1,a_2) ab[(a_2)*ab_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= STBT03 computes the residual for the solution to a scaled triangular system of equations A*x = s*b or A'*x = s*b when A is a triangular band matrix. Here A' is the transpose of A, s is a scalar, and x and b are N by NRHS matrices. The test ratio is the maximum over the number of right hand sides of norm(s*b - op(A)*x) / ( norm(op(A)) * norm(x) * EPS ), where op(A) denotes A or A' and EPS is the machine epsilon. Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the operation applied to A. = 'N': A *x = b (No transpose) = 'T': A'*x = b (Transpose) = 'C': A'*x = b (Conjugate transpose = Transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The order of the matrix A. N >= 0. KD (input) INTEGER The number of superdiagonals or subdiagonals of the triangular band matrix A. KD >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices X and B. NRHS >= 0. AB (input) REAL array, dimension (LDAB,N) The upper or lower triangular band matrix A, stored in the first kd+1 rows of the array. The j-th column of A is stored in the j-th column of the array AB as follows: if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). LDAB (input) INTEGER The leading dimension of the array AB. LDAB >= KD+1. SCALE (input) REAL The scaling factor s used in solving the triangular system. CNORM (input) REAL array, dimension (N) The 1-norms of the columns of A, not counting the diagonal. TSCAL (input) REAL The scaling factor used in computing the 1-norms in CNORM. CNORM actually contains the column norms of TSCAL*A. X (input) REAL array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). B (input) REAL array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). WORK (workspace) REAL array, dimension (N) RESID (output) REAL The maximum over the number of right hand sides of norm(op(A)*x - s*b) / ( norm(op(A)) * norm(x) * EPS ). ===================================================================== Quick exit if N = 0 Parameter adjustments */ ab_dim1 = *ldab; ab_offset = 1 + ab_dim1 * 1; ab -= ab_offset; --cnorm; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --work; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { *resid = 0.f; return 0; } eps = slamch_("Epsilon"); smlnum = slamch_("Safe minimum"); bignum = 1.f / smlnum; slabad_(&smlnum, &bignum); /* Compute the norm of the triangular matrix A using the column norms already computed by SLATBS. */ tnorm = 0.f; if (lsame_(diag, "N")) { if (lsame_(uplo, "U")) { i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ r__2 = tnorm, r__3 = *tscal * (r__1 = ab_ref(*kd + 1, j), dabs(r__1)) + cnorm[j]; tnorm = dmax(r__2,r__3); /* L10: */ } } else { i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ r__2 = tnorm, r__3 = *tscal * (r__1 = ab_ref(1, j), dabs(r__1) ) + cnorm[j]; tnorm = dmax(r__2,r__3); /* L20: */ } } } else { i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ r__1 = tnorm, r__2 = *tscal + cnorm[j]; tnorm = dmax(r__1,r__2); /* L30: */ } } /* Compute the maximum over the number of right hand sides of norm(op(A)*x - s*b) / ( norm(op(A)) * norm(x) * EPS ). */ *resid = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { scopy_(n, &x_ref(1, j), &c__1, &work[1], &c__1); ix = isamax_(n, &work[1], &c__1); /* Computing MAX */ r__2 = 1.f, r__3 = (r__1 = x_ref(ix, j), dabs(r__1)); xnorm = dmax(r__2,r__3); xscal = 1.f / xnorm / (real) (*kd + 1); sscal_(n, &xscal, &work[1], &c__1); stbmv_(uplo, trans, diag, n, kd, &ab[ab_offset], ldab, &work[1], & c__1); r__1 = -(*scale) * xscal; saxpy_(n, &r__1, &b_ref(1, j), &c__1, &work[1], &c__1); ix = isamax_(n, &work[1], &c__1); err = *tscal * (r__1 = work[ix], dabs(r__1)); ix = isamax_(n, &x_ref(1, j), &c__1); xnorm = (r__1 = x_ref(ix, j), dabs(r__1)); if (err * smlnum <= xnorm) { if (xnorm > 0.f) { err /= xnorm; } } else { if (err > 0.f) { err = 1.f / eps; } } if (err * smlnum <= tnorm) { if (tnorm > 0.f) { err /= tnorm; } } else { if (err > 0.f) { err = 1.f / eps; } } *resid = dmax(*resid,err); /* L40: */ } return 0; /* End of STBT03 */ } /* stbt03_ */
/* Subroutine */ int dlapmt_(logical *forwrd, integer *m, integer *n, doublereal *x, integer *ldx, integer *k) { /* -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University March 31, 1993 Purpose ======= DLAPMT rearranges the columns of the M by N matrix X as specified by the permutation K(1),K(2),...,K(N) of the integers 1,...,N. If FORWRD = .TRUE., forward permutation: X(*,K(J)) is moved X(*,J) for J = 1,2,...,N. If FORWRD = .FALSE., backward permutation: X(*,J) is moved to X(*,K(J)) for J = 1,2,...,N. Arguments ========= FORWRD (input) LOGICAL = .TRUE., forward permutation = .FALSE., backward permutation M (input) INTEGER The number of rows of the matrix X. M >= 0. N (input) INTEGER The number of columns of the matrix X. N >= 0. X (input/output) DOUBLE PRECISION array, dimension (LDX,N) On entry, the M by N matrix X. On exit, X contains the permuted matrix X. LDX (input) INTEGER The leading dimension of the array X, LDX >= MAX(1,M). K (input) INTEGER array, dimension (N) On entry, K contains the permutation vector. ===================================================================== Parameter adjustments */ /* System generated locals */ integer x_dim1, x_offset, i__1, i__2; /* Local variables */ static doublereal temp; static integer i__, j, ii, in; #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --k; /* Function Body */ if (*n <= 1) { return 0; } i__1 = *n; for (i__ = 1; i__ <= i__1; ++i__) { k[i__] = -k[i__]; /* L10: */ } if (*forwrd) { /* Forward permutation */ i__1 = *n; for (i__ = 1; i__ <= i__1; ++i__) { if (k[i__] > 0) { goto L40; } j = i__; k[j] = -k[j]; in = k[j]; L20: if (k[in] > 0) { goto L40; } i__2 = *m; for (ii = 1; ii <= i__2; ++ii) { temp = x_ref(ii, j); x_ref(ii, j) = x_ref(ii, in); x_ref(ii, in) = temp; /* L30: */ } k[in] = -k[in]; j = in; in = k[in]; goto L20; L40: /* L50: */ ; } } else { /* Backward permutation */ i__1 = *n; for (i__ = 1; i__ <= i__1; ++i__) { if (k[i__] > 0) { goto L80; } k[i__] = -k[i__]; j = k[i__]; L60: if (j == i__) { goto L80; } i__2 = *m; for (ii = 1; ii <= i__2; ++ii) { temp = x_ref(ii, i__); x_ref(ii, i__) = x_ref(ii, j); x_ref(ii, j) = temp; /* L70: */ } k[j] = -k[j]; j = k[j]; goto L60; L80: /* L90: */ ; } } return 0; /* End of DLAPMT */ } /* dlapmt_ */
/* Subroutine */ int dtrt03_(char *uplo, char *trans, char *diag, integer *n, integer *nrhs, doublereal *a, integer *lda, doublereal *scale, doublereal *cnorm, doublereal *tscal, doublereal *x, integer *ldx, doublereal *b, integer *ldb, doublereal *work, doublereal *resid) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; doublereal d__1, d__2, d__3; /* Local variables */ static integer j; extern /* Subroutine */ int dscal_(integer *, doublereal *, doublereal *, integer *); extern logical lsame_(char *, char *); static doublereal xscal; extern /* Subroutine */ int dcopy_(integer *, doublereal *, integer *, doublereal *, integer *), daxpy_(integer *, doublereal *, doublereal *, integer *, doublereal *, integer *), dtrmv_(char *, char *, char *, integer *, doublereal *, integer *, doublereal *, integer *); static doublereal tnorm, xnorm; extern /* Subroutine */ int dlabad_(doublereal *, doublereal *); extern doublereal dlamch_(char *); static integer ix; extern integer idamax_(integer *, doublereal *, integer *); static doublereal bignum, smlnum, eps, err; #define a_ref(a_1,a_2) a[(a_2)*a_dim1 + a_1] #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= DTRT03 computes the residual for the solution to a scaled triangular system of equations A*x = s*b or A'*x = s*b. Here A is a triangular matrix, A' is the transpose of A, s is a scalar, and x and b are N by NRHS matrices. The test ratio is the maximum over the number of right hand sides of norm(s*b - op(A)*x) / ( norm(op(A)) * norm(x) * EPS ), where op(A) denotes A or A' and EPS is the machine epsilon. Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the operation applied to A. = 'N': A *x = s*b (No transpose) = 'T': A'*x = s*b (Transpose) = 'C': A'*x = s*b (Conjugate transpose = Transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices X and B. NRHS >= 0. A (input) DOUBLE PRECISION array, dimension (LDA,N) The triangular matrix A. If UPLO = 'U', the leading n by n upper triangular part of the array A contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If UPLO = 'L', the leading n by n lower triangular part of the array A contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced. If DIAG = 'U', the diagonal elements of A are also not referenced and are assumed to be 1. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,N). SCALE (input) DOUBLE PRECISION The scaling factor s used in solving the triangular system. CNORM (input) DOUBLE PRECISION array, dimension (N) The 1-norms of the columns of A, not counting the diagonal. TSCAL (input) DOUBLE PRECISION The scaling factor used in computing the 1-norms in CNORM. CNORM actually contains the column norms of TSCAL*A. X (input) DOUBLE PRECISION array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). B (input) DOUBLE PRECISION array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). WORK (workspace) DOUBLE PRECISION array, dimension (N) RESID (output) DOUBLE PRECISION The maximum over the number of right hand sides of norm(op(A)*x - s*b) / ( norm(op(A)) * norm(x) * EPS ). ===================================================================== Quick exit if N = 0 Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; --cnorm; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --work; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { *resid = 0.; return 0; } eps = dlamch_("Epsilon"); smlnum = dlamch_("Safe minimum"); bignum = 1. / smlnum; dlabad_(&smlnum, &bignum); /* Compute the norm of the triangular matrix A using the column norms already computed by DLATRS. */ tnorm = 0.; if (lsame_(diag, "N")) { i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ d__2 = tnorm, d__3 = *tscal * (d__1 = a_ref(j, j), abs(d__1)) + cnorm[j]; tnorm = max(d__2,d__3); /* L10: */ } } else { i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ d__1 = tnorm, d__2 = *tscal + cnorm[j]; tnorm = max(d__1,d__2); /* L20: */ } } /* Compute the maximum over the number of right hand sides of norm(op(A)*x - s*b) / ( norm(op(A)) * norm(x) * EPS ). */ *resid = 0.; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dcopy_(n, &x_ref(1, j), &c__1, &work[1], &c__1); ix = idamax_(n, &work[1], &c__1); /* Computing MAX */ d__2 = 1., d__3 = (d__1 = x_ref(ix, j), abs(d__1)); xnorm = max(d__2,d__3); xscal = 1. / xnorm / (doublereal) (*n); dscal_(n, &xscal, &work[1], &c__1); dtrmv_(uplo, trans, diag, n, &a[a_offset], lda, &work[1], &c__1); d__1 = -(*scale) * xscal; daxpy_(n, &d__1, &b_ref(1, j), &c__1, &work[1], &c__1); ix = idamax_(n, &work[1], &c__1); err = *tscal * (d__1 = work[ix], abs(d__1)); ix = idamax_(n, &x_ref(1, j), &c__1); xnorm = (d__1 = x_ref(ix, j), abs(d__1)); if (err * smlnum <= xnorm) { if (xnorm > 0.) { err /= xnorm; } } else { if (err > 0.) { err = 1. / eps; } } if (err * smlnum <= tnorm) { if (tnorm > 0.) { err /= tnorm; } } else { if (err > 0.) { err = 1. / eps; } } *resid = max(*resid,err); /* L30: */ } return 0; /* End of DTRT03 */ } /* dtrt03_ */
/* Subroutine */ int ztrt02_(char *uplo, char *trans, char *diag, integer *n, integer *nrhs, doublecomplex *a, integer *lda, doublecomplex *x, integer *ldx, doublecomplex *b, integer *ldb, doublecomplex *work, doublereal *rwork, doublereal *resid) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; doublereal d__1, d__2; /* Local variables */ static integer j; extern logical lsame_(char *, char *); static doublereal anorm, bnorm, xnorm; extern /* Subroutine */ int zcopy_(integer *, doublecomplex *, integer *, doublecomplex *, integer *), zaxpy_(integer *, doublecomplex *, doublecomplex *, integer *, doublecomplex *, integer *), ztrmv_( char *, char *, char *, integer *, doublecomplex *, integer *, doublecomplex *, integer *); extern doublereal dlamch_(char *), dzasum_(integer *, doublecomplex *, integer *), zlantr_(char *, char *, char *, integer *, integer *, doublecomplex *, integer *, doublereal *); static doublereal eps; #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= ZTRT02 computes the residual for the computed solution to a triangular system of linear equations A*x = b, A**T *x = b, or A**H *x = b. Here A is a triangular matrix, A**T is the transpose of A, A**H is the conjugate transpose of A, and x and b are N by NRHS matrices. The test ratio is the maximum over the number of right hand sides of norm(b - op(A)*x) / ( norm(op(A)) * norm(x) * EPS ), where op(A) denotes A, A**T, or A**H, and EPS is the machine epsilon. Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the operation applied to A. = 'N': A *x = b (No transpose) = 'T': A**T *x = b (Transpose) = 'C': A**H *x = b (Conjugate transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices X and B. NRHS >= 0. A (input) COMPLEX*16 array, dimension (LDA,N) The triangular matrix A. If UPLO = 'U', the leading n by n upper triangular part of the array A contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If UPLO = 'L', the leading n by n lower triangular part of the array A contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced. If DIAG = 'U', the diagonal elements of A are also not referenced and are assumed to be 1. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,N). X (input) COMPLEX*16 array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). B (input) COMPLEX*16 array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). WORK (workspace) COMPLEX*16 array, dimension (N) RWORK (workspace) DOUBLE PRECISION array, dimension (N) RESID (output) DOUBLE PRECISION The maximum over the number of right hand sides of norm(op(A)*x - b) / ( norm(op(A)) * norm(x) * EPS ). ===================================================================== Quick exit if N = 0 or NRHS = 0 Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --work; --rwork; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { *resid = 0.; return 0; } /* Compute the 1-norm of A or A**H. */ if (lsame_(trans, "N")) { anorm = zlantr_("1", uplo, diag, n, n, &a[a_offset], lda, &rwork[1]); } else { anorm = zlantr_("I", uplo, diag, n, n, &a[a_offset], lda, &rwork[1]); } /* Exit with RESID = 1/EPS if ANORM = 0. */ eps = dlamch_("Epsilon"); if (anorm <= 0.) { *resid = 1. / eps; return 0; } /* Compute the maximum over the number of right hand sides of norm(op(A)*x - b) / ( norm(op(A)) * norm(x) * EPS ) */ *resid = 0.; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { zcopy_(n, &x_ref(1, j), &c__1, &work[1], &c__1); ztrmv_(uplo, trans, diag, n, &a[a_offset], lda, &work[1], &c__1); zaxpy_(n, &c_b12, &b_ref(1, j), &c__1, &work[1], &c__1); bnorm = dzasum_(n, &work[1], &c__1); xnorm = dzasum_(n, &x_ref(1, j), &c__1); if (xnorm <= 0.) { *resid = 1. / eps; } else { /* Computing MAX */ d__1 = *resid, d__2 = bnorm / anorm / xnorm / eps; *resid = max(d__1,d__2); } /* L10: */ } return 0; /* End of ZTRT02 */ } /* ztrt02_ */
/* Subroutine */ int ztpt05_(char *uplo, char *trans, char *diag, integer *n, integer *nrhs, doublecomplex *ap, doublecomplex *b, integer *ldb, doublecomplex *x, integer *ldx, doublecomplex *xact, integer *ldxact, doublereal *ferr, doublereal *berr, doublereal *reslts) { /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, xact_dim1, xact_offset, i__1, i__2, i__3, i__4, i__5; doublereal d__1, d__2, d__3, d__4; doublecomplex z__1, z__2; /* Builtin functions */ double d_imag(doublecomplex *); /* Local variables */ static doublereal diff, axbi; static integer imax; static doublereal unfl, ovfl; static logical unit; static integer i__, j, k; extern logical lsame_(char *, char *); static logical upper; static doublereal xnorm; static integer jc; extern doublereal dlamch_(char *); static doublereal errbnd; extern integer izamax_(integer *, doublecomplex *, integer *); static logical notran; static integer ifu; static doublereal eps, tmp; #define xact_subscr(a_1,a_2) (a_2)*xact_dim1 + a_1 #define xact_ref(a_1,a_2) xact[xact_subscr(a_1,a_2)] #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= ZTPT05 tests the error bounds from iterative refinement for the computed solution to a system of equations A*X = B, where A is a triangular matrix in packed storage format. RESLTS(1) = test of the error bound = norm(X - XACT) / ( norm(X) * FERR ) A large value is returned if this ratio is not less than one. RESLTS(2) = residual from the iterative refinement routine = the maximum of BERR / ( (n+1)*EPS + (*) ), where (*) = (n+1)*UNFL / (min_i (abs(A)*abs(X) +abs(b))_i ) Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the form of the system of equations. = 'N': A * X = B (No transpose) = 'T': A'* X = B (Transpose) = 'C': A'* X = B (Conjugate transpose = Transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The number of rows of the matrices X, B, and XACT, and the order of the matrix A. N >= 0. NRHS (input) INTEGER The number of columns of the matrices X, B, and XACT. NRHS >= 0. AP (input) COMPLEX*16 array, dimension (N*(N+1)/2) The upper or lower triangular matrix A, packed columnwise in a linear array. The j-th column of A is stored in the array AP as follows: if UPLO = 'U', AP(i + (j-1)*j/2) = A(i,j) for 1<=i<=j; if UPLO = 'L', AP(i + (j-1)*(2n-j)/2) = A(i,j) for j<=i<=n. If DIAG = 'U', the diagonal elements of A are not referenced and are assumed to be 1. B (input) COMPLEX*16 array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input) COMPLEX*16 array, dimension (LDX,NRHS) The computed solution vectors. Each vector is stored as a column of the matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). XACT (input) COMPLEX*16 array, dimension (LDX,NRHS) The exact solution vectors. Each vector is stored as a column of the matrix XACT. LDXACT (input) INTEGER The leading dimension of the array XACT. LDXACT >= max(1,N). FERR (input) DOUBLE PRECISION array, dimension (NRHS) The estimated forward error bounds for each solution vector X. If XTRUE is the true solution, FERR bounds the magnitude of the largest entry in (X - XTRUE) divided by the magnitude of the largest entry in X. BERR (input) DOUBLE PRECISION array, dimension (NRHS) The componentwise relative backward error of each solution vector (i.e., the smallest relative change in any entry of A or B that makes X an exact solution). RESLTS (output) DOUBLE PRECISION array, dimension (2) The maximum over the NRHS solution vectors of the ratios: RESLTS(1) = norm(X - XACT) / ( norm(X) * FERR ) RESLTS(2) = BERR / ( (n+1)*EPS + (*) ) ===================================================================== Quick exit if N = 0 or NRHS = 0. Parameter adjustments */ --ap; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; xact_dim1 = *ldxact; xact_offset = 1 + xact_dim1 * 1; xact -= xact_offset; --ferr; --berr; --reslts; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { reslts[1] = 0.; reslts[2] = 0.; return 0; } eps = dlamch_("Epsilon"); unfl = dlamch_("Safe minimum"); ovfl = 1. / unfl; upper = lsame_(uplo, "U"); notran = lsame_(trans, "N"); unit = lsame_(diag, "U"); /* Test 1: Compute the maximum of norm(X - XACT) / ( norm(X) * FERR ) over all the vectors X and XACT using the infinity-norm. */ errbnd = 0.; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { imax = izamax_(n, &x_ref(1, j), &c__1); /* Computing MAX */ i__2 = x_subscr(imax, j); d__3 = (d__1 = x[i__2].r, abs(d__1)) + (d__2 = d_imag(&x_ref(imax, j)) , abs(d__2)); xnorm = max(d__3,unfl); diff = 0.; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = x_subscr(i__, j); i__4 = xact_subscr(i__, j); z__2.r = x[i__3].r - xact[i__4].r, z__2.i = x[i__3].i - xact[i__4] .i; z__1.r = z__2.r, z__1.i = z__2.i; /* Computing MAX */ d__3 = diff, d__4 = (d__1 = z__1.r, abs(d__1)) + (d__2 = d_imag(& z__1), abs(d__2)); diff = max(d__3,d__4); /* L10: */ } if (xnorm > 1.) { goto L20; } else if (diff <= ovfl * xnorm) { goto L20; } else { errbnd = 1. / eps; goto L30; } L20: if (diff / xnorm <= ferr[j]) { /* Computing MAX */ d__1 = errbnd, d__2 = diff / xnorm / ferr[j]; errbnd = max(d__1,d__2); } else { errbnd = 1. / eps; } L30: ; } reslts[1] = errbnd; /* Test 2: Compute the maximum of BERR / ( (n+1)*EPS + (*) ), where (*) = (n+1)*UNFL / (min_i (abs(A)*abs(X) +abs(b))_i ) */ ifu = 0; if (unit) { ifu = 1; } i__1 = *nrhs; for (k = 1; k <= i__1; ++k) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = b_subscr(i__, k); tmp = (d__1 = b[i__3].r, abs(d__1)) + (d__2 = d_imag(&b_ref(i__, k)), abs(d__2)); if (upper) { jc = (i__ - 1) * i__ / 2; if (! notran) { i__3 = i__ - ifu; for (j = 1; j <= i__3; ++j) { i__4 = jc + j; i__5 = x_subscr(j, k); tmp += ((d__1 = ap[i__4].r, abs(d__1)) + (d__2 = d_imag(&ap[jc + j]), abs(d__2))) * ((d__3 = x[ i__5].r, abs(d__3)) + (d__4 = d_imag(&x_ref(j, k)), abs(d__4))); /* L40: */ } if (unit) { i__3 = x_subscr(i__, k); tmp += (d__1 = x[i__3].r, abs(d__1)) + (d__2 = d_imag( &x_ref(i__, k)), abs(d__2)); } } else { jc += i__; if (unit) { i__3 = x_subscr(i__, k); tmp += (d__1 = x[i__3].r, abs(d__1)) + (d__2 = d_imag( &x_ref(i__, k)), abs(d__2)); jc += i__; } i__3 = *n; for (j = i__ + ifu; j <= i__3; ++j) { i__4 = jc; i__5 = x_subscr(j, k); tmp += ((d__1 = ap[i__4].r, abs(d__1)) + (d__2 = d_imag(&ap[jc]), abs(d__2))) * ((d__3 = x[ i__5].r, abs(d__3)) + (d__4 = d_imag(&x_ref(j, k)), abs(d__4))); jc += j; /* L50: */ } } } else { if (notran) { jc = i__; i__3 = i__ - ifu; for (j = 1; j <= i__3; ++j) { i__4 = jc; i__5 = x_subscr(j, k); tmp += ((d__1 = ap[i__4].r, abs(d__1)) + (d__2 = d_imag(&ap[jc]), abs(d__2))) * ((d__3 = x[ i__5].r, abs(d__3)) + (d__4 = d_imag(&x_ref(j, k)), abs(d__4))); jc = jc + *n - j; /* L60: */ } if (unit) { i__3 = x_subscr(i__, k); tmp += (d__1 = x[i__3].r, abs(d__1)) + (d__2 = d_imag( &x_ref(i__, k)), abs(d__2)); } } else { jc = (i__ - 1) * (*n - i__) + i__ * (i__ + 1) / 2; if (unit) { i__3 = x_subscr(i__, k); tmp += (d__1 = x[i__3].r, abs(d__1)) + (d__2 = d_imag( &x_ref(i__, k)), abs(d__2)); } i__3 = *n; for (j = i__ + ifu; j <= i__3; ++j) { i__4 = jc + j - i__; i__5 = x_subscr(j, k); tmp += ((d__1 = ap[i__4].r, abs(d__1)) + (d__2 = d_imag(&ap[jc + j - i__]), abs(d__2))) * (( d__3 = x[i__5].r, abs(d__3)) + (d__4 = d_imag( &x_ref(j, k)), abs(d__4))); /* L70: */ } } } if (i__ == 1) { axbi = tmp; } else { axbi = min(axbi,tmp); } /* L80: */ } /* Computing MAX */ d__1 = axbi, d__2 = (*n + 1) * unfl; tmp = berr[k] / ((*n + 1) * eps + (*n + 1) * unfl / max(d__1,d__2)); if (k == 1) { reslts[2] = tmp; } else { reslts[2] = max(reslts[2],tmp); } /* L90: */ } return 0; /* End of ZTPT05 */ } /* ztpt05_ */
/* Subroutine */ int ctrrfs_(char *uplo, char *trans, char *diag, integer *n, integer *nrhs, complex *a, integer *lda, complex *b, integer *ldb, complex *x, integer *ldx, real *ferr, real *berr, complex *work, real *rwork, integer *info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= CTRRFS provides error bounds and backward error estimates for the solution to a system of linear equations with a triangular coefficient matrix. The solution matrix X must be computed by CTRTRS or some other means before entering this routine. CTRRFS does not do iterative refinement because doing so cannot improve the backward error. Arguments ========= UPLO (input) CHARACTER*1 = 'U': A is upper triangular; = 'L': A is lower triangular. TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Conjugate transpose) DIAG (input) CHARACTER*1 = 'N': A is non-unit triangular; = 'U': A is unit triangular. N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. A (input) COMPLEX array, dimension (LDA,N) The triangular matrix A. If UPLO = 'U', the leading N-by-N upper triangular part of the array A contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If UPLO = 'L', the leading N-by-N lower triangular part of the array A contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced. If DIAG = 'U', the diagonal elements of A are also not referenced and are assumed to be 1. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,N). B (input) COMPLEX array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input) COMPLEX array, dimension (LDX,NRHS) The solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) REAL array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) COMPLEX array, dimension (2*N) RWORK (workspace) REAL array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3, i__4, i__5; real r__1, r__2, r__3, r__4; complex q__1; /* Builtin functions */ double r_imag(complex *); /* Local variables */ static integer kase; static real safe1, safe2; static integer i__, j, k; static real s; extern logical lsame_(char *, char *); extern /* Subroutine */ int ccopy_(integer *, complex *, integer *, complex *, integer *), caxpy_(integer *, complex *, complex *, integer *, complex *, integer *); static logical upper; extern /* Subroutine */ int ctrmv_(char *, char *, char *, integer *, complex *, integer *, complex *, integer *), ctrsv_(char *, char *, char *, integer *, complex *, integer *, complex *, integer *), clacon_( integer *, complex *, complex *, real *, integer *); static real xk; extern doublereal slamch_(char *); static integer nz; static real safmin; extern /* Subroutine */ int xerbla_(char *, integer *); static logical notran; static char transn[1], transt[1]; static logical nounit; static real lstres, eps; #define a_subscr(a_1,a_2) (a_2)*a_dim1 + a_1 #define a_ref(a_1,a_2) a[a_subscr(a_1,a_2)] #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --rwork; /* Function Body */ *info = 0; upper = lsame_(uplo, "U"); notran = lsame_(trans, "N"); nounit = lsame_(diag, "N"); if (! upper && ! lsame_(uplo, "L")) { *info = -1; } else if (! notran && ! lsame_(trans, "T") && ! lsame_(trans, "C")) { *info = -2; } else if (! nounit && ! lsame_(diag, "U")) { *info = -3; } else if (*n < 0) { *info = -4; } else if (*nrhs < 0) { *info = -5; } else if (*lda < max(1,*n)) { *info = -7; } else if (*ldb < max(1,*n)) { *info = -9; } else if (*ldx < max(1,*n)) { *info = -11; } if (*info != 0) { i__1 = -(*info); xerbla_("CTRRFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.f; berr[j] = 0.f; /* L10: */ } return 0; } if (notran) { *(unsigned char *)transn = 'N'; *(unsigned char *)transt = 'C'; } else { *(unsigned char *)transn = 'C'; *(unsigned char *)transt = 'N'; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = *n + 1; eps = slamch_("Epsilon"); safmin = slamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { /* Compute residual R = B - op(A) * X, where op(A) = A, A**T, or A**H, depending on TRANS. */ ccopy_(n, &x_ref(1, j), &c__1, &work[1], &c__1); ctrmv_(uplo, trans, diag, n, &a[a_offset], lda, &work[1], &c__1); q__1.r = -1.f, q__1.i = 0.f; caxpy_(n, &q__1, &b_ref(1, j), &c__1, &work[1], &c__1); /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = b_subscr(i__, j); rwork[i__] = (r__1 = b[i__3].r, dabs(r__1)) + (r__2 = r_imag(& b_ref(i__, j)), dabs(r__2)); /* L20: */ } if (notran) { /* Compute abs(A)*abs(X) + abs(B). */ if (upper) { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = k; for (i__ = 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + ( r__2 = r_imag(&a_ref(i__, k)), dabs(r__2)) ) * xk; /* L30: */ } /* L40: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = k - 1; for (i__ = 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + ( r__2 = r_imag(&a_ref(i__, k)), dabs(r__2)) ) * xk; /* L50: */ } rwork[k] += xk; /* L60: */ } } } else { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = *n; for (i__ = k; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + ( r__2 = r_imag(&a_ref(i__, k)), dabs(r__2)) ) * xk; /* L70: */ } /* L80: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = *n; for (i__ = k + 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + ( r__2 = r_imag(&a_ref(i__, k)), dabs(r__2)) ) * xk; /* L90: */ } rwork[k] += xk; /* L100: */ } } } } else { /* Compute abs(A**H)*abs(X) + abs(B). */ if (upper) { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; i__3 = k; for (i__ = 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L110: */ } rwork[k] += s; /* L120: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); s = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = k - 1; for (i__ = 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L130: */ } rwork[k] += s; /* L140: */ } } } else { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; i__3 = *n; for (i__ = k; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L150: */ } rwork[k] += s; /* L160: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { i__3 = x_subscr(k, j); s = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(& x_ref(k, j)), dabs(r__2)); i__3 = *n; for (i__ = k + 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L170: */ } rwork[k] += s; /* L180: */ } } } } s = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (rwork[i__] > safe2) { /* Computing MAX */ i__3 = i__; r__3 = s, r__4 = ((r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2))) / rwork[i__]; s = dmax(r__3,r__4); } else { /* Computing MAX */ i__3 = i__; r__3 = s, r__4 = ((r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + safe1) / (rwork[i__] + safe1); s = dmax(r__3,r__4); } /* L190: */ } berr[j] = s; /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(op(A)))* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(op(A)) is the inverse of op(A) abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(op(A))*abs(X) + abs(B) is less than SAFE2. Use CLACON to estimate the infinity-norm of the matrix inv(op(A)) * diag(W), where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (rwork[i__] > safe2) { i__3 = i__; rwork[i__] = (r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + nz * eps * rwork[ i__]; } else { i__3 = i__; rwork[i__] = (r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + nz * eps * rwork[ i__] + safe1; } /* L200: */ } kase = 0; L210: clacon_(n, &work[*n + 1], &work[1], &ferr[j], &kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(op(A)**H). */ ctrsv_(uplo, transt, diag, n, &a[a_offset], lda, &work[1], & c__1); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = i__; i__4 = i__; i__5 = i__; q__1.r = rwork[i__4] * work[i__5].r, q__1.i = rwork[i__4] * work[i__5].i; work[i__3].r = q__1.r, work[i__3].i = q__1.i; /* L220: */ } } else { /* Multiply by inv(op(A))*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = i__; i__4 = i__; i__5 = i__; q__1.r = rwork[i__4] * work[i__5].r, q__1.i = rwork[i__4] * work[i__5].i; work[i__3].r = q__1.r, work[i__3].i = q__1.i; /* L230: */ } ctrsv_(uplo, transn, diag, n, &a[a_offset], lda, &work[1], & c__1); } goto L210; } /* Normalize error. */ lstres = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ i__3 = x_subscr(i__, j); r__3 = lstres, r__4 = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(&x_ref(i__, j)), dabs(r__2)); lstres = dmax(r__3,r__4); /* L240: */ } if (lstres != 0.f) { ferr[j] /= lstres; } /* L250: */ } return 0; /* End of CTRRFS */ } /* ctrrfs_ */
/* Subroutine */ int slagtm_(char *trans, integer *n, integer *nrhs, real * alpha, real *dl, real *d__, real *du, real *x, integer *ldx, real * beta, real *b, integer *ldb) { /* -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 Purpose ======= SLAGTM performs a matrix-vector product of the form B := alpha * A * X + beta * B where A is a tridiagonal matrix of order N, B and X are N by NRHS matrices, and alpha and beta are real scalars, each of which may be 0., 1., or -1. Arguments ========= TRANS (input) CHARACTER Specifies the operation applied to A. = 'N': No transpose, B := alpha * A * X + beta * B = 'T': Transpose, B := alpha * A'* X + beta * B = 'C': Conjugate transpose = Transpose N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices X and B. ALPHA (input) REAL The scalar alpha. ALPHA must be 0., 1., or -1.; otherwise, it is assumed to be 0. DL (input) REAL array, dimension (N-1) The (n-1) sub-diagonal elements of T. D (input) REAL array, dimension (N) The diagonal elements of T. DU (input) REAL array, dimension (N-1) The (n-1) super-diagonal elements of T. X (input) REAL array, dimension (LDX,NRHS) The N by NRHS matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(N,1). BETA (input) REAL The scalar beta. BETA must be 0., 1., or -1.; otherwise, it is assumed to be 1. B (input/output) REAL array, dimension (LDB,NRHS) On entry, the N by NRHS matrix B. On exit, B is overwritten by the matrix expression B := alpha * A * X + beta * B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(N,1). ===================================================================== Parameter adjustments */ /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, i__1, i__2; /* Local variables */ static integer i__, j; extern logical lsame_(char *, char *); #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] --dl; --d__; --du; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; /* Function Body */ if (*n == 0) { return 0; } /* Multiply B by BETA if BETA.NE.1. */ if (*beta == 0.f) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { b_ref(i__, j) = 0.f; /* L10: */ } /* L20: */ } } else if (*beta == -1.f) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { b_ref(i__, j) = -b_ref(i__, j); /* L30: */ } /* L40: */ } } if (*alpha == 1.f) { if (lsame_(trans, "N")) { /* Compute B := B + A*X */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { if (*n == 1) { b_ref(1, j) = b_ref(1, j) + d__[1] * x_ref(1, j); } else { b_ref(1, j) = b_ref(1, j) + d__[1] * x_ref(1, j) + du[1] * x_ref(2, j); b_ref(*n, j) = b_ref(*n, j) + dl[*n - 1] * x_ref(*n - 1, j) + d__[*n] * x_ref(*n, j); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { b_ref(i__, j) = b_ref(i__, j) + dl[i__ - 1] * x_ref( i__ - 1, j) + d__[i__] * x_ref(i__, j) + du[ i__] * x_ref(i__ + 1, j); /* L50: */ } } /* L60: */ } } else { /* Compute B := B + A'*X */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { if (*n == 1) { b_ref(1, j) = b_ref(1, j) + d__[1] * x_ref(1, j); } else { b_ref(1, j) = b_ref(1, j) + d__[1] * x_ref(1, j) + dl[1] * x_ref(2, j); b_ref(*n, j) = b_ref(*n, j) + du[*n - 1] * x_ref(*n - 1, j) + d__[*n] * x_ref(*n, j); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { b_ref(i__, j) = b_ref(i__, j) + du[i__ - 1] * x_ref( i__ - 1, j) + d__[i__] * x_ref(i__, j) + dl[ i__] * x_ref(i__ + 1, j); /* L70: */ } } /* L80: */ } } } else if (*alpha == -1.f) { if (lsame_(trans, "N")) { /* Compute B := B - A*X */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { if (*n == 1) { b_ref(1, j) = b_ref(1, j) - d__[1] * x_ref(1, j); } else { b_ref(1, j) = b_ref(1, j) - d__[1] * x_ref(1, j) - du[1] * x_ref(2, j); b_ref(*n, j) = b_ref(*n, j) - dl[*n - 1] * x_ref(*n - 1, j) - d__[*n] * x_ref(*n, j); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { b_ref(i__, j) = b_ref(i__, j) - dl[i__ - 1] * x_ref( i__ - 1, j) - d__[i__] * x_ref(i__, j) - du[ i__] * x_ref(i__ + 1, j); /* L90: */ } } /* L100: */ } } else { /* Compute B := B - A'*X */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { if (*n == 1) { b_ref(1, j) = b_ref(1, j) - d__[1] * x_ref(1, j); } else { b_ref(1, j) = b_ref(1, j) - d__[1] * x_ref(1, j) - dl[1] * x_ref(2, j); b_ref(*n, j) = b_ref(*n, j) - du[*n - 1] * x_ref(*n - 1, j) - d__[*n] * x_ref(*n, j); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { b_ref(i__, j) = b_ref(i__, j) - du[i__ - 1] * x_ref( i__ - 1, j) - d__[i__] * x_ref(i__, j) - dl[ i__] * x_ref(i__ + 1, j); /* L110: */ } } /* L120: */ } } } return 0; /* End of SLAGTM */ } /* slagtm_ */
/* Subroutine */ int slatm6_(integer *type__, integer *n, real *a, integer * lda, real *b, real *x, integer *ldx, real *y, integer *ldy, real * alpha, real *beta, real *wx, real *wy, real *s, real *dif) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, y_dim1, y_offset, i__1, i__2; /* Builtin functions */ double sqrt(doublereal); /* Local variables */ static integer info; static real work[100]; static integer i__, j; static real z__[144] /* was [12][12] */; extern /* Subroutine */ int slakf2_(integer *, integer *, real *, integer *, real *, real *, real *, real *, integer *), sgesvd_(char *, char *, integer *, integer *, real *, integer *, real *, real *, integer *, real *, integer *, real *, integer *, integer *), slacpy_(char *, integer *, integer *, real *, integer *, real *, integer *); #define a_ref(a_1,a_2) a[(a_2)*a_dim1 + a_1] #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define y_ref(a_1,a_2) y[(a_2)*y_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1999 Purpose ======= SLATM6 generates test matrices for the generalized eigenvalue problem, their corresponding right and left eigenvector matrices, and also reciprocal condition numbers for all eigenvalues and the reciprocal condition numbers of eigenvectors corresponding to the 1th and 5th eigenvalues. Test Matrices ============= Two kinds of test matrix pairs (A, B) = inverse(YH) * (Da, Db) * inverse(X) are used in the tests: Type 1: Da = 1+a 0 0 0 0 Db = 1 0 0 0 0 0 2+a 0 0 0 0 1 0 0 0 0 0 3+a 0 0 0 0 1 0 0 0 0 0 4+a 0 0 0 0 1 0 0 0 0 0 5+a , 0 0 0 0 1 , and Type 2: Da = 1 -1 0 0 0 Db = 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1+a 1+b 0 0 0 1 0 0 0 0 -1-b 1+a , 0 0 0 0 1 . In both cases the same inverse(YH) and inverse(X) are used to compute (A, B), giving the exact eigenvectors to (A,B) as (YH, X): YH: = 1 0 -y y -y X = 1 0 -x -x x 0 1 -y y -y 0 1 x -x -x 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1, 0 0 0 0 1 , where a, b, x and y will have all values independently of each other. Arguments ========= TYPE (input) INTEGER Specifies the problem type (see futher details). N (input) INTEGER Size of the matrices A and B. A (output) REAL array, dimension (LDA, N). On exit A N-by-N is initialized according to TYPE. LDA (input) INTEGER The leading dimension of A and of B. B (output) REAL array, dimension (LDA, N). On exit B N-by-N is initialized according to TYPE. X (output) REAL array, dimension (LDX, N). On exit X is the N-by-N matrix of right eigenvectors. LDX (input) INTEGER The leading dimension of X. Y (output) REAL array, dimension (LDY, N). On exit Y is the N-by-N matrix of left eigenvectors. LDY (input) INTEGER The leading dimension of Y. ALPHA (input) REAL BETA (input) REAL Weighting constants for matrix A. WX (input) REAL Constant for right eigenvector matrix. WY (input) REAL Constant for left eigenvector matrix. S (output) REAL array, dimension (N) S(i) is the reciprocal condition number for eigenvalue i. DIF (output) REAL array, dimension (N) DIF(i) is the reciprocal condition number for eigenvector i. ===================================================================== Generate test problem ... (Da, Db) ... Parameter adjustments */ b_dim1 = *lda; b_offset = 1 + b_dim1 * 1; b -= b_offset; a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; y_dim1 = *ldy; y_offset = 1 + y_dim1 * 1; y -= y_offset; --s; --dif; /* Function Body */ i__1 = *n; for (i__ = 1; i__ <= i__1; ++i__) { i__2 = *n; for (j = 1; j <= i__2; ++j) { if (i__ == j) { a_ref(i__, i__) = (real) i__ + *alpha; b_ref(i__, i__) = 1.f; } else { a_ref(i__, j) = 0.f; b_ref(i__, j) = 0.f; } /* L10: */ } /* L20: */ } /* Form X and Y */ slacpy_("F", n, n, &b[b_offset], lda, &y[y_offset], ldy); y_ref(3, 1) = -(*wy); y_ref(4, 1) = *wy; y_ref(5, 1) = -(*wy); y_ref(3, 2) = -(*wy); y_ref(4, 2) = *wy; y_ref(5, 2) = -(*wy); slacpy_("F", n, n, &b[b_offset], lda, &x[x_offset], ldx); x_ref(1, 3) = -(*wx); x_ref(1, 4) = -(*wx); x_ref(1, 5) = *wx; x_ref(2, 3) = *wx; x_ref(2, 4) = -(*wx); x_ref(2, 5) = -(*wx); /* Form (A, B) */ b_ref(1, 3) = *wx + *wy; b_ref(2, 3) = -(*wx) + *wy; b_ref(1, 4) = *wx - *wy; b_ref(2, 4) = *wx - *wy; b_ref(1, 5) = -(*wx) + *wy; b_ref(2, 5) = *wx + *wy; if (*type__ == 1) { a_ref(1, 3) = *wx * a_ref(1, 1) + *wy * a_ref(3, 3); a_ref(2, 3) = -(*wx) * a_ref(2, 2) + *wy * a_ref(3, 3); a_ref(1, 4) = *wx * a_ref(1, 1) - *wy * a_ref(4, 4); a_ref(2, 4) = *wx * a_ref(2, 2) - *wy * a_ref(4, 4); a_ref(1, 5) = -(*wx) * a_ref(1, 1) + *wy * a_ref(5, 5); a_ref(2, 5) = *wx * a_ref(2, 2) + *wy * a_ref(5, 5); } else if (*type__ == 2) { a_ref(1, 3) = *wx * 2.f + *wy; a_ref(2, 3) = *wy; a_ref(1, 4) = -(*wy) * (*alpha + 2.f + *beta); a_ref(2, 4) = *wx * 2.f - *wy * (*alpha + 2.f + *beta); a_ref(1, 5) = *wx * -2.f + *wy * (*alpha - *beta); a_ref(2, 5) = *wy * (*alpha - *beta); a_ref(1, 1) = 1.f; a_ref(1, 2) = -1.f; a_ref(2, 1) = 1.f; a_ref(2, 2) = a_ref(1, 1); a_ref(3, 3) = 1.f; a_ref(4, 4) = *alpha + 1.f; a_ref(4, 5) = *beta + 1.f; a_ref(5, 4) = -a_ref(4, 5); a_ref(5, 5) = a_ref(4, 4); } /* Compute condition numbers */ if (*type__ == 1) { s[1] = 1.f / sqrt((*wy * 3.f * *wy + 1.f) / (a_ref(1, 1) * a_ref(1, 1) + 1.f)); s[2] = 1.f / sqrt((*wy * 3.f * *wy + 1.f) / (a_ref(2, 2) * a_ref(2, 2) + 1.f)); s[3] = 1.f / sqrt((*wx * 2.f * *wx + 1.f) / (a_ref(3, 3) * a_ref(3, 3) + 1.f)); s[4] = 1.f / sqrt((*wx * 2.f * *wx + 1.f) / (a_ref(4, 4) * a_ref(4, 4) + 1.f)); s[5] = 1.f / sqrt((*wx * 2.f * *wx + 1.f) / (a_ref(5, 5) * a_ref(5, 5) + 1.f)); slakf2_(&c__1, &c__4, &a[a_offset], lda, &a_ref(2, 2), &b[b_offset], & b_ref(2, 2), z__, &c__12); sgesvd_("N", "N", &c__8, &c__8, z__, &c__12, work, &work[8], &c__1, & work[9], &c__1, &work[10], &c__40, &info); dif[1] = work[7]; slakf2_(&c__4, &c__1, &a[a_offset], lda, &a_ref(5, 5), &b[b_offset], & b_ref(5, 5), z__, &c__12); sgesvd_("N", "N", &c__8, &c__8, z__, &c__12, work, &work[8], &c__1, & work[9], &c__1, &work[10], &c__40, &info); dif[5] = work[7]; } else if (*type__ == 2) { s[1] = 1.f / sqrt(*wy * *wy + .33333333333333331f); s[2] = s[1]; s[3] = 1.f / sqrt(*wx * *wx + .5f); s[4] = 1.f / sqrt((*wx * 2.f * *wx + 1.f) / ((*alpha + 1.f) * (*alpha + 1.f) + 1.f + (*beta + 1.f) * (*beta + 1.f))); s[5] = s[4]; slakf2_(&c__2, &c__3, &a[a_offset], lda, &a_ref(3, 3), &b[b_offset], & b_ref(3, 3), z__, &c__12); sgesvd_("N", "N", &c__12, &c__12, z__, &c__12, work, &work[12], &c__1, &work[13], &c__1, &work[14], &c__60, &info); dif[1] = work[11]; slakf2_(&c__3, &c__2, &a[a_offset], lda, &a_ref(4, 4), &b[b_offset], & b_ref(4, 4), z__, &c__12); sgesvd_("N", "N", &c__12, &c__12, z__, &c__12, work, &work[12], &c__1, &work[13], &c__1, &work[14], &c__60, &info); dif[5] = work[11]; } return 0; /* End of SLATM6 */ } /* slatm6_ */
/* Subroutine */ int dppsvx_(char *fact, char *uplo, integer *n, integer * nrhs, doublereal *ap, doublereal *afp, char *equed, doublereal *s, doublereal *b, integer *ldb, doublereal *x, integer *ldx, doublereal * rcond, doublereal *ferr, doublereal *berr, doublereal *work, integer * iwork, integer *info) { /* -- LAPACK driver routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1999 Purpose ======= DPPSVX uses the Cholesky factorization A = U**T*U or A = L*L**T to compute the solution to a real system of linear equations A * X = B, where A is an N-by-N symmetric positive definite matrix stored in packed format and X and B are N-by-NRHS matrices. Error bounds on the solution and a condition estimate are also provided. Description =========== The following steps are performed: 1. If FACT = 'E', real scaling factors are computed to equilibrate the system: diag(S) * A * diag(S) * inv(diag(S)) * X = diag(S) * B Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if equilibration is used, A is overwritten by diag(S)*A*diag(S) and B by diag(S)*B. 2. If FACT = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after equilibration if FACT = 'E') as A = U**T* U, if UPLO = 'U', or A = L * L**T, if UPLO = 'L', where U is an upper triangular matrix and L is a lower triangular matrix. 3. If the leading i-by-i principal minor is not positive definite, then the routine returns with INFO = i. Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the condition number is less than machine precision, INFO = N+1 is returned as a warning, but the routine still goes on to solve for X and compute error bounds as described below. 4. The system of equations is solved for X using the factored form of A. 5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and backward error estimates for it. 6. If equilibration was used, the matrix X is premultiplied by diag(S) so that it solves the original system before equilibration. Arguments ========= FACT (input) CHARACTER*1 Specifies whether or not the factored form of the matrix A is supplied on entry, and if not, whether the matrix A should be equilibrated before it is factored. = 'F': On entry, AFP contains the factored form of A. If EQUED = 'Y', the matrix A has been equilibrated with scaling factors given by S. AP and AFP will not be modified. = 'N': The matrix A will be copied to AFP and factored. = 'E': The matrix A will be equilibrated if necessary, then copied to AFP and factored. UPLO (input) CHARACTER*1 = 'U': Upper triangle of A is stored; = 'L': Lower triangle of A is stored. N (input) INTEGER The number of linear equations, i.e., the order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. AP (input/output) DOUBLE PRECISION array, dimension (N*(N+1)/2) On entry, the upper or lower triangle of the symmetric matrix A, packed columnwise in a linear array, except if FACT = 'F' and EQUED = 'Y', then A must contain the equilibrated matrix diag(S)*A*diag(S). The j-th column of A is stored in the array AP as follows: if UPLO = 'U', AP(i + (j-1)*j/2) = A(i,j) for 1<=i<=j; if UPLO = 'L', AP(i + (j-1)*(2n-j)/2) = A(i,j) for j<=i<=n. See below for further details. A is not modified if FACT = 'F' or 'N', or if FACT = 'E' and EQUED = 'N' on exit. On exit, if FACT = 'E' and EQUED = 'Y', A is overwritten by diag(S)*A*diag(S). AFP (input or output) DOUBLE PRECISION array, dimension (N*(N+1)/2) If FACT = 'F', then AFP is an input argument and on entry contains the triangular factor U or L from the Cholesky factorization A = U'*U or A = L*L', in the same storage format as A. If EQUED .ne. 'N', then AFP is the factored form of the equilibrated matrix A. If FACT = 'N', then AFP is an output argument and on exit returns the triangular factor U or L from the Cholesky factorization A = U'*U or A = L*L' of the original matrix A. If FACT = 'E', then AFP is an output argument and on exit returns the triangular factor U or L from the Cholesky factorization A = U'*U or A = L*L' of the equilibrated matrix A (see the description of AP for the form of the equilibrated matrix). EQUED (input or output) CHARACTER*1 Specifies the form of equilibration that was done. = 'N': No equilibration (always true if FACT = 'N'). = 'Y': Equilibration was done, i.e., A has been replaced by diag(S) * A * diag(S). EQUED is an input argument if FACT = 'F'; otherwise, it is an output argument. S (input or output) DOUBLE PRECISION array, dimension (N) The scale factors for A; not accessed if EQUED = 'N'. S is an input argument if FACT = 'F'; otherwise, S is an output argument. If FACT = 'F' and EQUED = 'Y', each element of S must be positive. B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) On entry, the N-by-NRHS right hand side matrix B. On exit, if EQUED = 'N', B is not modified; if EQUED = 'Y', B is overwritten by diag(S) * B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (output) DOUBLE PRECISION array, dimension (LDX,NRHS) If INFO = 0 or INFO = N+1, the N-by-NRHS solution matrix X to the original system of equations. Note that if EQUED = 'Y', A and B are modified on exit, and the solution to the equilibrated system is inv(diag(S))*X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). RCOND (output) DOUBLE PRECISION The estimate of the reciprocal condition number of the matrix A after equilibration (if done). If RCOND is less than the machine precision (in particular, if RCOND = 0), the matrix is singular to working precision. This condition is indicated by a return code of INFO > 0. FERR (output) DOUBLE PRECISION array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) DOUBLE PRECISION array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) DOUBLE PRECISION array, dimension (3*N) IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value > 0: if INFO = i, and i is <= N: the leading minor of order i of A is not positive definite, so the factorization could not be completed, and the solution has not been computed. RCOND = 0 is returned. = N+1: U is nonsingular, but RCOND is less than machine precision, meaning that the matrix is singular to working precision. Nevertheless, the solution and error bounds are computed because there are a number of situations where the computed solution can be more accurate than the value of RCOND would suggest. Further Details =============== The packed storage scheme is illustrated by the following example when N = 4, UPLO = 'U': Two-dimensional storage of the symmetric matrix A: a11 a12 a13 a14 a22 a23 a24 a33 a34 (aij = conjg(aji)) a44 Packed storage of the upper triangle of A: AP = [ a11, a12, a22, a13, a23, a33, a14, a24, a34, a44 ] ===================================================================== Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, i__1, i__2; doublereal d__1, d__2; /* Local variables */ static doublereal amax, smin, smax; static integer i__, j; extern logical lsame_(char *, char *); static doublereal scond, anorm; extern /* Subroutine */ int dcopy_(integer *, doublereal *, integer *, doublereal *, integer *); static logical equil, rcequ; extern doublereal dlamch_(char *); static logical nofact; extern /* Subroutine */ int dlacpy_(char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), xerbla_(char *, integer *); static doublereal bignum; extern doublereal dlansp_(char *, char *, integer *, doublereal *, doublereal *); extern /* Subroutine */ int dppcon_(char *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, integer *, integer *), dlaqsp_(char *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, char *); static integer infequ; extern /* Subroutine */ int dppequ_(char *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, integer *), dpprfs_(char *, integer *, integer *, doublereal *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, doublereal *, integer *, integer *), dpptrf_(char *, integer *, doublereal *, integer *); static doublereal smlnum; extern /* Subroutine */ int dpptrs_(char *, integer *, integer *, doublereal *, doublereal *, integer *, integer *); #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] --ap; --afp; --s; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; nofact = lsame_(fact, "N"); equil = lsame_(fact, "E"); if (nofact || equil) { *(unsigned char *)equed = 'N'; rcequ = FALSE_; } else { rcequ = lsame_(equed, "Y"); smlnum = dlamch_("Safe minimum"); bignum = 1. / smlnum; } /* Test the input parameters. */ if (! nofact && ! equil && ! lsame_(fact, "F")) { *info = -1; } else if (! lsame_(uplo, "U") && ! lsame_(uplo, "L")) { *info = -2; } else if (*n < 0) { *info = -3; } else if (*nrhs < 0) { *info = -4; } else if (lsame_(fact, "F") && ! (rcequ || lsame_( equed, "N"))) { *info = -7; } else { if (rcequ) { smin = bignum; smax = 0.; i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MIN */ d__1 = smin, d__2 = s[j]; smin = min(d__1,d__2); /* Computing MAX */ d__1 = smax, d__2 = s[j]; smax = max(d__1,d__2); /* L10: */ } if (smin <= 0.) { *info = -8; } else if (*n > 0) { scond = max(smin,smlnum) / min(smax,bignum); } else { scond = 1.; } } if (*info == 0) { if (*ldb < max(1,*n)) { *info = -10; } else if (*ldx < max(1,*n)) { *info = -12; } } } if (*info != 0) { i__1 = -(*info); xerbla_("DPPSVX", &i__1); return 0; } if (equil) { /* Compute row and column scalings to equilibrate the matrix A. */ dppequ_(uplo, n, &ap[1], &s[1], &scond, &amax, &infequ); if (infequ == 0) { /* Equilibrate the matrix. */ dlaqsp_(uplo, n, &ap[1], &s[1], &scond, &amax, equed); rcequ = lsame_(equed, "Y"); } } /* Scale the right-hand side. */ if (rcequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { b_ref(i__, j) = s[i__] * b_ref(i__, j); /* L20: */ } /* L30: */ } } if (nofact || equil) { /* Compute the Cholesky factorization A = U'*U or A = L*L'. */ i__1 = *n * (*n + 1) / 2; dcopy_(&i__1, &ap[1], &c__1, &afp[1], &c__1); dpptrf_(uplo, n, &afp[1], info); /* Return if INFO is non-zero. */ if (*info != 0) { if (*info > 0) { *rcond = 0.; } return 0; } } /* Compute the norm of the matrix A. */ anorm = dlansp_("I", uplo, n, &ap[1], &work[1]); /* Compute the reciprocal of the condition number of A. */ dppcon_(uplo, n, &afp[1], &anorm, rcond, &work[1], &iwork[1], info); /* Set INFO = N+1 if the matrix is singular to working precision. */ if (*rcond < dlamch_("Epsilon")) { *info = *n + 1; } /* Compute the solution matrix X. */ dlacpy_("Full", n, nrhs, &b[b_offset], ldb, &x[x_offset], ldx); dpptrs_(uplo, n, nrhs, &afp[1], &x[x_offset], ldx, info); /* Use iterative refinement to improve the computed solution and compute error bounds and backward error estimates for it. */ dpprfs_(uplo, n, nrhs, &ap[1], &afp[1], &b[b_offset], ldb, &x[x_offset], ldx, &ferr[1], &berr[1], &work[1], &iwork[1], info); /* Transform the solution matrix X to a solution of the original system. */ if (rcequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { x_ref(i__, j) = s[i__] * x_ref(i__, j); /* L40: */ } /* L50: */ } i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] /= scond; /* L60: */ } } return 0; /* End of DPPSVX */ } /* dppsvx_ */
/* Subroutine */ int sget31_(real *rmax, integer *lmax, integer *ninfo, integer *knt) { /* Initialized data */ static logical ltrans[2] = { FALSE_,TRUE_ }; /* System generated locals */ real r__1, r__2, r__3, r__4, r__5, r__6, r__7, r__8, r__9, r__10, r__11, r__12, r__13, r__14, r__15, r__16, r__17; /* Builtin functions */ double sqrt(doublereal); /* Local variables */ static integer info; static real unfl, smin, a[4] /* was [2][2] */, b[4] /* was [2][2] */, scale, x[4] /* was [2][2] */; static integer ismin; static real d1, d2, vsmin[4], xnorm; extern /* Subroutine */ int slaln2_(logical *, integer *, integer *, real *, real *, real *, integer *, real *, real *, real *, integer *, real *, real *, real *, integer *, real *, real *, integer *); static real ca; static integer ia, ib, na; extern /* Subroutine */ int slabad_(real *, real *); static real wi; static integer nw; extern doublereal slamch_(char *); static real wr, bignum; static integer id1, id2, itrans; static real smlnum; static integer ica; static real den, vab[3], vca[5], vdd[4], eps; static integer iwi; static real res, tmp; static integer iwr; static real vwi[4], vwr[4]; #define a_ref(a_1,a_2) a[(a_2)*2 + a_1 - 3] #define b_ref(a_1,a_2) b[(a_2)*2 + a_1 - 3] #define x_ref(a_1,a_2) x[(a_2)*2 + a_1 - 3] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1999 Purpose ======= SGET31 tests SLALN2, a routine for solving (ca A - w D)X = sB where A is an NA by NA matrix (NA=1 or 2 only), w is a real (NW=1) or complex (NW=2) constant, ca is a real constant, D is an NA by NA real diagonal matrix, and B is an NA by NW matrix (when NW=2 the second column of B contains the imaginary part of the solution). The code returns X and s, where s is a scale factor, less than or equal to 1, which is chosen to avoid overflow in X. If any singular values of ca A-w D are less than another input parameter SMIN, they are perturbed up to SMIN. The test condition is that the scaled residual norm( (ca A-w D)*X - s*B ) / ( max( ulp*norm(ca A-w D), SMIN )*norm(X) ) should be on the order of 1. Here, ulp is the machine precision. Also, it is verified that SCALE is less than or equal to 1, and that XNORM = infinity-norm(X). Arguments ========== RMAX (output) REAL Value of the largest test ratio. LMAX (output) INTEGER Example number where largest test ratio achieved. NINFO (output) INTEGER array, dimension (3) NINFO(1) = number of examples with INFO less than 0 NINFO(2) = number of examples with INFO greater than 0 KNT (output) INTEGER Total number of examples tested. ===================================================================== Parameter adjustments */ --ninfo; /* Function Body Get machine parameters */ eps = slamch_("P"); unfl = slamch_("U"); smlnum = slamch_("S") / eps; bignum = 1.f / smlnum; slabad_(&smlnum, &bignum); /* Set up test case parameters */ vsmin[0] = smlnum; vsmin[1] = eps; vsmin[2] = .01f; vsmin[3] = 1.f / eps; vab[0] = sqrt(smlnum); vab[1] = 1.f; vab[2] = sqrt(bignum); vwr[0] = 0.f; vwr[1] = .5f; vwr[2] = 2.f; vwr[3] = 1.f; vwi[0] = smlnum; vwi[1] = eps; vwi[2] = 1.f; vwi[3] = 2.f; vdd[0] = sqrt(smlnum); vdd[1] = 1.f; vdd[2] = 2.f; vdd[3] = sqrt(bignum); vca[0] = 0.f; vca[1] = sqrt(smlnum); vca[2] = eps; vca[3] = .5f; vca[4] = 1.f; *knt = 0; ninfo[1] = 0; ninfo[2] = 0; *lmax = 0; *rmax = 0.f; /* Begin test loop */ for (id1 = 1; id1 <= 4; ++id1) { d1 = vdd[id1 - 1]; for (id2 = 1; id2 <= 4; ++id2) { d2 = vdd[id2 - 1]; for (ica = 1; ica <= 5; ++ica) { ca = vca[ica - 1]; for (itrans = 0; itrans <= 1; ++itrans) { for (ismin = 1; ismin <= 4; ++ismin) { smin = vsmin[ismin - 1]; na = 1; nw = 1; for (ia = 1; ia <= 3; ++ia) { a_ref(1, 1) = vab[ia - 1]; for (ib = 1; ib <= 3; ++ib) { b_ref(1, 1) = vab[ib - 1]; for (iwr = 1; iwr <= 4; ++iwr) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wr = vwr[iwr - 1] * a_ref(1, 1); } else { wr = vwr[iwr - 1]; } wi = 0.f; slaln2_(<rans[itrans], &na, &nw, &smin, &ca, a, &c__2, &d1, &d2, b, &c__2, &wr, &wi, x, &c__2, &scale, & xnorm, &info); if (info < 0) { ++ninfo[1]; } if (info > 0) { ++ninfo[2]; } res = (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 1) - scale * b_ref(1, 1), dabs(r__1)); if (info == 0) { /* Computing MAX */ r__2 = eps * (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 1), dabs(r__1)); den = dmax(r__2,smlnum); } else { /* Computing MAX */ r__2 = smin * (r__1 = x_ref(1, 1), dabs(r__1)); den = dmax(r__2,smlnum); } res /= den; if ((r__1 = x_ref(1, 1), dabs(r__1)) < unfl && (r__3 = b_ref(1, 1), dabs( r__3)) <= smlnum * (r__2 = ca * a_ref(1, 1) - wr * d1, dabs(r__2)) ) { res = 0.f; } if (scale > 1.f) { res += 1.f / eps; } res += (r__2 = xnorm - (r__1 = x_ref(1, 1) , dabs(r__1)), dabs(r__2)) / dmax( smlnum,xnorm) / eps; if (info != 0 && info != 1) { res += 1.f / eps; } ++(*knt); if (res > *rmax) { *lmax = *knt; *rmax = res; } /* L10: */ } /* L20: */ } /* L30: */ } na = 1; nw = 2; for (ia = 1; ia <= 3; ++ia) { a_ref(1, 1) = vab[ia - 1]; for (ib = 1; ib <= 3; ++ib) { b_ref(1, 1) = vab[ib - 1]; b_ref(1, 2) = vab[ib - 1] * -.5f; for (iwr = 1; iwr <= 4; ++iwr) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wr = vwr[iwr - 1] * a_ref(1, 1); } else { wr = vwr[iwr - 1]; } for (iwi = 1; iwi <= 4; ++iwi) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wi = vwi[iwi - 1] * a_ref(1, 1); } else { wi = vwi[iwi - 1]; } slaln2_(<rans[itrans], &na, &nw, & smin, &ca, a, &c__2, &d1, &d2, b, &c__2, &wr, &wi, x, &c__2, &scale, &xnorm, &info); if (info < 0) { ++ninfo[1]; } if (info > 0) { ++ninfo[2]; } res = (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 1) + wi * d1 * x_ref(1, 2) - scale * b_ref( 1, 1), dabs(r__1)); res += (r__1 = -wi * d1 * x_ref(1, 1) + (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 2) - scale * b_ref(1, 2), dabs(r__1)); if (info == 0) { /* Computing MAX Computing MAX */ r__6 = (r__3 = ca * a_ref(1, 1) - wr * d1, dabs(r__3)), r__7 = (r__4 = d1 * wi, dabs(r__4)); r__5 = eps * (dmax(r__6,r__7) * (( r__1 = x_ref(1, 1), dabs( r__1)) + (r__2 = x_ref(1, 2), dabs(r__2)))); den = dmax(r__5,smlnum); } else { /* Computing MAX */ r__3 = smin * ((r__1 = x_ref(1, 1) , dabs(r__1)) + (r__2 = x_ref(1, 2), dabs(r__2))); den = dmax(r__3,smlnum); } res /= den; if ((r__1 = x_ref(1, 1), dabs(r__1)) < unfl && (r__2 = x_ref(1, 2), dabs(r__2)) < unfl && (r__4 = b_ref(1, 1), dabs(r__4)) <= smlnum * (r__3 = ca * a_ref(1, 1) - wr * d1, dabs(r__3))) { res = 0.f; } if (scale > 1.f) { res += 1.f / eps; } res += (r__3 = xnorm - (r__1 = x_ref( 1, 1), dabs(r__1)) - (r__2 = x_ref(1, 2), dabs(r__2)), dabs(r__3)) / dmax(smlnum, xnorm) / eps; if (info != 0 && info != 1) { res += 1.f / eps; } ++(*knt); if (res > *rmax) { *lmax = *knt; *rmax = res; } /* L40: */ } /* L50: */ } /* L60: */ } /* L70: */ } na = 2; nw = 1; for (ia = 1; ia <= 3; ++ia) { a_ref(1, 1) = vab[ia - 1]; a_ref(1, 2) = vab[ia - 1] * -3.f; a_ref(2, 1) = vab[ia - 1] * -7.f; a_ref(2, 2) = vab[ia - 1] * 21.f; for (ib = 1; ib <= 3; ++ib) { b_ref(1, 1) = vab[ib - 1]; b_ref(2, 1) = vab[ib - 1] * -2.f; for (iwr = 1; iwr <= 4; ++iwr) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wr = vwr[iwr - 1] * a_ref(1, 1); } else { wr = vwr[iwr - 1]; } wi = 0.f; slaln2_(<rans[itrans], &na, &nw, &smin, &ca, a, &c__2, &d1, &d2, b, &c__2, &wr, &wi, x, &c__2, &scale, & xnorm, &info); if (info < 0) { ++ninfo[1]; } if (info > 0) { ++ninfo[2]; } if (itrans == 1) { tmp = a_ref(1, 2); a_ref(1, 2) = a_ref(2, 1); a_ref(2, 1) = tmp; } res = (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 1) + ca * a_ref(1, 2) * x_ref(2, 1) - scale * b_ref(1, 1), dabs(r__1)); res += (r__1 = ca * a_ref(2, 1) * x_ref(1, 1) + (ca * a_ref(2, 2) - wr * d2) * x_ref(2, 1) - scale * b_ref(2, 1), dabs(r__1)); if (info == 0) { /* Computing MAX Computing MAX */ r__8 = (r__1 = ca * a_ref(1, 1) - wr * d1, dabs(r__1)) + (r__2 = ca * a_ref(1, 2), dabs(r__2)), r__9 = (r__3 = ca * a_ref(2, 1), dabs(r__3)) + (r__4 = ca * a_ref(2, 2) - wr * d2, dabs( r__4)); /* Computing MAX */ r__10 = (r__5 = x_ref(1, 1), dabs( r__5)), r__11 = (r__6 = x_ref( 2, 1), dabs(r__6)); r__7 = eps * (dmax(r__8,r__9) * dmax( r__10,r__11)); den = dmax(r__7,smlnum); } else { /* Computing MAX Computing MAX Computing MAX */ r__10 = (r__1 = ca * a_ref(1, 1) - wr * d1, dabs(r__1)) + (r__2 = ca * a_ref(1, 2), dabs(r__2)), r__11 = (r__3 = ca * a_ref(2, 1), dabs(r__3)) + (r__4 = ca * a_ref(2, 2) - wr * d2, dabs( r__4)); r__8 = smin / eps, r__9 = dmax(r__10, r__11); /* Computing MAX */ r__12 = (r__5 = x_ref(1, 1), dabs( r__5)), r__13 = (r__6 = x_ref( 2, 1), dabs(r__6)); r__7 = eps * (dmax(r__8,r__9) * dmax( r__12,r__13)); den = dmax(r__7,smlnum); } res /= den; if ((r__1 = x_ref(1, 1), dabs(r__1)) < unfl && (r__2 = x_ref(2, 1), dabs( r__2)) < unfl && (r__3 = b_ref(1, 1), dabs(r__3)) + (r__4 = b_ref(2, 1), dabs(r__4)) <= smlnum * (( r__5 = ca * a_ref(1, 1) - wr * d1, dabs(r__5)) + (r__6 = ca * a_ref( 1, 2), dabs(r__6)) + (r__7 = ca * a_ref(2, 1), dabs(r__7)) + (r__8 = ca * a_ref(2, 2) - wr * d2, dabs( r__8)))) { res = 0.f; } if (scale > 1.f) { res += 1.f / eps; } /* Computing MAX */ r__4 = (r__1 = x_ref(1, 1), dabs(r__1)), r__5 = (r__2 = x_ref(2, 1), dabs( r__2)); res += (r__3 = xnorm - dmax(r__4,r__5), dabs(r__3)) / dmax(smlnum,xnorm) / eps; if (info != 0 && info != 1) { res += 1.f / eps; } ++(*knt); if (res > *rmax) { *lmax = *knt; *rmax = res; } /* L80: */ } /* L90: */ } /* L100: */ } na = 2; nw = 2; for (ia = 1; ia <= 3; ++ia) { a_ref(1, 1) = vab[ia - 1] * 2.f; a_ref(1, 2) = vab[ia - 1] * -3.f; a_ref(2, 1) = vab[ia - 1] * -7.f; a_ref(2, 2) = vab[ia - 1] * 21.f; for (ib = 1; ib <= 3; ++ib) { b_ref(1, 1) = vab[ib - 1]; b_ref(2, 1) = vab[ib - 1] * -2.f; b_ref(1, 2) = vab[ib - 1] * 4.f; b_ref(2, 2) = vab[ib - 1] * -7.f; for (iwr = 1; iwr <= 4; ++iwr) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wr = vwr[iwr - 1] * a_ref(1, 1); } else { wr = vwr[iwr - 1]; } for (iwi = 1; iwi <= 4; ++iwi) { if (d1 == 1.f && d2 == 1.f && ca == 1.f) { wi = vwi[iwi - 1] * a_ref(1, 1); } else { wi = vwi[iwi - 1]; } slaln2_(<rans[itrans], &na, &nw, & smin, &ca, a, &c__2, &d1, &d2, b, &c__2, &wr, &wi, x, &c__2, &scale, &xnorm, &info); if (info < 0) { ++ninfo[1]; } if (info > 0) { ++ninfo[2]; } if (itrans == 1) { tmp = a_ref(1, 2); a_ref(1, 2) = a_ref(2, 1); a_ref(2, 1) = tmp; } res = (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 1) + ca * a_ref(1, 2) * x_ref(2, 1) + wi * d1 * x_ref(1, 2) - scale * b_ref(1, 1), dabs(r__1)); res += (r__1 = (ca * a_ref(1, 1) - wr * d1) * x_ref(1, 2) + ca * a_ref(1, 2) * x_ref(2, 2) - wi * d1 * x_ref(1, 1) - scale * b_ref(1, 2), dabs(r__1)); res += (r__1 = ca * a_ref(2, 1) * x_ref(1, 1) + (ca * a_ref(2, 2) - wr * d2) * x_ref(2, 1) + wi * d2 * x_ref(2, 2) - scale * b_ref(2, 1), dabs(r__1)); res += (r__1 = ca * a_ref(2, 1) * x_ref(1, 2) + (ca * a_ref(2, 2) - wr * d2) * x_ref(2, 2) - wi * d2 * x_ref(2, 1) - scale * b_ref(2, 2), dabs(r__1)); if (info == 0) { /* Computing MAX Computing MAX */ r__12 = (r__1 = ca * a_ref(1, 1) - wr * d1, dabs(r__1)) + ( r__2 = ca * a_ref(1, 2), dabs(r__2)) + (r__3 = wi * d1, dabs(r__3)), r__13 = (r__4 = ca * a_ref(2, 1), dabs(r__4)) + (r__5 = ca * a_ref(2, 2) - wr * d2, dabs(r__5)) + (r__6 = wi * d2, dabs(r__6)); /* Computing MAX */ r__14 = (r__7 = x_ref(1, 1), dabs( r__7)) + (r__8 = x_ref(2, 1), dabs(r__8)), r__15 = ( r__9 = x_ref(1, 2), dabs( r__9)) + (r__10 = x_ref(2, 2), dabs(r__10)); r__11 = eps * (dmax(r__12,r__13) * dmax(r__14,r__15)); den = dmax(r__11,smlnum); } else { /* Computing MAX Computing MAX Computing MAX */ r__14 = (r__1 = ca * a_ref(1, 1) - wr * d1, dabs(r__1)) + ( r__2 = ca * a_ref(1, 2), dabs(r__2)) + (r__3 = wi * d1, dabs(r__3)), r__15 = (r__4 = ca * a_ref(2, 1), dabs(r__4)) + (r__5 = ca * a_ref(2, 2) - wr * d2, dabs(r__5)) + (r__6 = wi * d2, dabs(r__6)); r__12 = smin / eps, r__13 = dmax( r__14,r__15); /* Computing MAX */ r__16 = (r__7 = x_ref(1, 1), dabs( r__7)) + (r__8 = x_ref(2, 1), dabs(r__8)), r__17 = ( r__9 = x_ref(1, 2), dabs( r__9)) + (r__10 = x_ref(2, 2), dabs(r__10)); r__11 = eps * (dmax(r__12,r__13) * dmax(r__16,r__17)); den = dmax(r__11,smlnum); } res /= den; if ((r__1 = x_ref(1, 1), dabs(r__1)) < unfl && (r__2 = x_ref(2, 1), dabs(r__2)) < unfl && (r__3 = x_ref(1, 2), dabs(r__3)) < unfl && (r__4 = x_ref(2, 2), dabs(r__4)) < unfl && (r__5 = b_ref(1, 1), dabs(r__5)) + ( r__6 = b_ref(2, 1), dabs(r__6) ) <= smlnum * ((r__7 = ca * a_ref(1, 1) - wr * d1, dabs( r__7)) + (r__8 = ca * a_ref(1, 2), dabs(r__8)) + (r__9 = ca * a_ref(2, 1), dabs(r__9)) + ( r__10 = ca * a_ref(2, 2) - wr * d2, dabs(r__10)) + (r__11 = wi * d2, dabs(r__11)) + ( r__12 = wi * d1, dabs(r__12))) ) { res = 0.f; } if (scale > 1.f) { res += 1.f / eps; } /* Computing MAX */ r__6 = (r__1 = x_ref(1, 1), dabs(r__1) ) + (r__2 = x_ref(1, 2), dabs( r__2)), r__7 = (r__3 = x_ref( 2, 1), dabs(r__3)) + (r__4 = x_ref(2, 2), dabs(r__4)); res += (r__5 = xnorm - dmax(r__6,r__7) , dabs(r__5)) / dmax(smlnum, xnorm) / eps; if (info != 0 && info != 1) { res += 1.f / eps; } ++(*knt); if (res > *rmax) { *lmax = *knt; *rmax = res; } /* L110: */ } /* L120: */ } /* L130: */ } /* L140: */ } /* L150: */ } /* L160: */ } /* L170: */ } /* L180: */ } /* L190: */ } return 0; /* End of SGET31 */ } /* sget31_ */
/* Subroutine */ int dlasy2_(logical *ltranl, logical *ltranr, integer *isgn, integer *n1, integer *n2, doublereal *tl, integer *ldtl, doublereal * tr, integer *ldtr, doublereal *b, integer *ldb, doublereal *scale, doublereal *x, integer *ldx, doublereal *xnorm, integer *info) { /* -- LAPACK auxiliary routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University October 31, 1992 Purpose ======= DLASY2 solves for the N1 by N2 matrix X, 1 <= N1,N2 <= 2, in op(TL)*X + ISGN*X*op(TR) = SCALE*B, where TL is N1 by N1, TR is N2 by N2, B is N1 by N2, and ISGN = 1 or -1. op(T) = T or T', where T' denotes the transpose of T. Arguments ========= LTRANL (input) LOGICAL On entry, LTRANL specifies the op(TL): = .FALSE., op(TL) = TL, = .TRUE., op(TL) = TL'. LTRANR (input) LOGICAL On entry, LTRANR specifies the op(TR): = .FALSE., op(TR) = TR, = .TRUE., op(TR) = TR'. ISGN (input) INTEGER On entry, ISGN specifies the sign of the equation as described before. ISGN may only be 1 or -1. N1 (input) INTEGER On entry, N1 specifies the order of matrix TL. N1 may only be 0, 1 or 2. N2 (input) INTEGER On entry, N2 specifies the order of matrix TR. N2 may only be 0, 1 or 2. TL (input) DOUBLE PRECISION array, dimension (LDTL,2) On entry, TL contains an N1 by N1 matrix. LDTL (input) INTEGER The leading dimension of the matrix TL. LDTL >= max(1,N1). TR (input) DOUBLE PRECISION array, dimension (LDTR,2) On entry, TR contains an N2 by N2 matrix. LDTR (input) INTEGER The leading dimension of the matrix TR. LDTR >= max(1,N2). B (input) DOUBLE PRECISION array, dimension (LDB,2) On entry, the N1 by N2 matrix B contains the right-hand side of the equation. LDB (input) INTEGER The leading dimension of the matrix B. LDB >= max(1,N1). SCALE (output) DOUBLE PRECISION On exit, SCALE contains the scale factor. SCALE is chosen less than or equal to 1 to prevent the solution overflowing. X (output) DOUBLE PRECISION array, dimension (LDX,2) On exit, X contains the N1 by N2 solution. LDX (input) INTEGER The leading dimension of the matrix X. LDX >= max(1,N1). XNORM (output) DOUBLE PRECISION On exit, XNORM is the infinity-norm of the solution. INFO (output) INTEGER On exit, INFO is set to 0: successful exit. 1: TL and TR have too close eigenvalues, so TL or TR is perturbed to get a nonsingular equation. NOTE: In the interests of speed, this routine does not check the inputs for errors. ===================================================================== Parameter adjustments */ /* Table of constant values */ static integer c__4 = 4; static integer c__1 = 1; static integer c__16 = 16; static integer c__0 = 0; /* Initialized data */ static integer locu12[4] = { 3,4,1,2 }; static integer locl21[4] = { 2,1,4,3 }; static integer locu22[4] = { 4,3,2,1 }; static logical xswpiv[4] = { FALSE_,FALSE_,TRUE_,TRUE_ }; static logical bswpiv[4] = { FALSE_,TRUE_,FALSE_,TRUE_ }; /* System generated locals */ integer b_dim1, b_offset, tl_dim1, tl_offset, tr_dim1, tr_offset, x_dim1, x_offset; doublereal d__1, d__2, d__3, d__4, d__5, d__6, d__7, d__8; /* Local variables */ static doublereal btmp[4], smin; static integer ipiv; static doublereal temp; static integer jpiv[4]; static doublereal xmax; static integer ipsv, jpsv, i__, j, k; static logical bswap; extern /* Subroutine */ int dcopy_(integer *, doublereal *, integer *, doublereal *, integer *), dswap_(integer *, doublereal *, integer *, doublereal *, integer *); static logical xswap; static doublereal x2[2], l21, u11, u12; static integer ip, jp; static doublereal u22, t16[16] /* was [4][4] */; extern doublereal dlamch_(char *); extern integer idamax_(integer *, doublereal *, integer *); static doublereal smlnum, gam, bet, eps, sgn, tmp[4], tau1; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define t16_ref(a_1,a_2) t16[(a_2)*4 + a_1 - 5] #define tl_ref(a_1,a_2) tl[(a_2)*tl_dim1 + a_1] #define tr_ref(a_1,a_2) tr[(a_2)*tr_dim1 + a_1] tl_dim1 = *ldtl; tl_offset = 1 + tl_dim1 * 1; tl -= tl_offset; tr_dim1 = *ldtr; tr_offset = 1 + tr_dim1 * 1; tr -= tr_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; /* Function Body Do not check the input parameters for errors */ *info = 0; /* Quick return if possible */ if (*n1 == 0 || *n2 == 0) { return 0; } /* Set constants to control overflow */ eps = dlamch_("P"); smlnum = dlamch_("S") / eps; sgn = (doublereal) (*isgn); k = *n1 + *n1 + *n2 - 2; switch (k) { case 1: goto L10; case 2: goto L20; case 3: goto L30; case 4: goto L50; } /* 1 by 1: TL11*X + SGN*X*TR11 = B11 */ L10: tau1 = tl_ref(1, 1) + sgn * tr_ref(1, 1); bet = abs(tau1); if (bet <= smlnum) { tau1 = smlnum; bet = smlnum; *info = 1; } *scale = 1.; gam = (d__1 = b_ref(1, 1), abs(d__1)); if (smlnum * gam > bet) { *scale = 1. / gam; } x_ref(1, 1) = b_ref(1, 1) * *scale / tau1; *xnorm = (d__1 = x_ref(1, 1), abs(d__1)); return 0; /* 1 by 2: TL11*[X11 X12] + ISGN*[X11 X12]*op[TR11 TR12] = [B11 B12] [TR21 TR22] */ L20: /* Computing MAX Computing MAX */ d__7 = (d__1 = tl_ref(1, 1), abs(d__1)), d__8 = (d__2 = tr_ref(1, 1), abs( d__2)), d__7 = max(d__7,d__8), d__8 = (d__3 = tr_ref(1, 2), abs( d__3)), d__7 = max(d__7,d__8), d__8 = (d__4 = tr_ref(2, 1), abs( d__4)), d__7 = max(d__7,d__8), d__8 = (d__5 = tr_ref(2, 2), abs( d__5)); d__6 = eps * max(d__7,d__8); smin = max(d__6,smlnum); tmp[0] = tl_ref(1, 1) + sgn * tr_ref(1, 1); tmp[3] = tl_ref(1, 1) + sgn * tr_ref(2, 2); if (*ltranr) { tmp[1] = sgn * tr_ref(2, 1); tmp[2] = sgn * tr_ref(1, 2); } else { tmp[1] = sgn * tr_ref(1, 2); tmp[2] = sgn * tr_ref(2, 1); } btmp[0] = b_ref(1, 1); btmp[1] = b_ref(1, 2); goto L40; /* 2 by 1: op[TL11 TL12]*[X11] + ISGN* [X11]*TR11 = [B11] [TL21 TL22] [X21] [X21] [B21] */ L30: /* Computing MAX Computing MAX */ d__7 = (d__1 = tr_ref(1, 1), abs(d__1)), d__8 = (d__2 = tl_ref(1, 1), abs( d__2)), d__7 = max(d__7,d__8), d__8 = (d__3 = tl_ref(1, 2), abs( d__3)), d__7 = max(d__7,d__8), d__8 = (d__4 = tl_ref(2, 1), abs( d__4)), d__7 = max(d__7,d__8), d__8 = (d__5 = tl_ref(2, 2), abs( d__5)); d__6 = eps * max(d__7,d__8); smin = max(d__6,smlnum); tmp[0] = tl_ref(1, 1) + sgn * tr_ref(1, 1); tmp[3] = tl_ref(2, 2) + sgn * tr_ref(1, 1); if (*ltranl) { tmp[1] = tl_ref(1, 2); tmp[2] = tl_ref(2, 1); } else { tmp[1] = tl_ref(2, 1); tmp[2] = tl_ref(1, 2); } btmp[0] = b_ref(1, 1); btmp[1] = b_ref(2, 1); L40: /* Solve 2 by 2 system using complete pivoting. Set pivots less than SMIN to SMIN. */ ipiv = idamax_(&c__4, tmp, &c__1); u11 = tmp[ipiv - 1]; if (abs(u11) <= smin) { *info = 1; u11 = smin; } u12 = tmp[locu12[ipiv - 1] - 1]; l21 = tmp[locl21[ipiv - 1] - 1] / u11; u22 = tmp[locu22[ipiv - 1] - 1] - u12 * l21; xswap = xswpiv[ipiv - 1]; bswap = bswpiv[ipiv - 1]; if (abs(u22) <= smin) { *info = 1; u22 = smin; } if (bswap) { temp = btmp[1]; btmp[1] = btmp[0] - l21 * temp; btmp[0] = temp; } else { btmp[1] -= l21 * btmp[0]; } *scale = 1.; if (smlnum * 2. * abs(btmp[1]) > abs(u22) || smlnum * 2. * abs(btmp[0]) > abs(u11)) { /* Computing MAX */ d__1 = abs(btmp[0]), d__2 = abs(btmp[1]); *scale = .5 / max(d__1,d__2); btmp[0] *= *scale; btmp[1] *= *scale; } x2[1] = btmp[1] / u22; x2[0] = btmp[0] / u11 - u12 / u11 * x2[1]; if (xswap) { temp = x2[1]; x2[1] = x2[0]; x2[0] = temp; } x_ref(1, 1) = x2[0]; if (*n1 == 1) { x_ref(1, 2) = x2[1]; *xnorm = (d__1 = x_ref(1, 1), abs(d__1)) + (d__2 = x_ref(1, 2), abs( d__2)); } else { x_ref(2, 1) = x2[1]; /* Computing MAX */ d__3 = (d__1 = x_ref(1, 1), abs(d__1)), d__4 = (d__2 = x_ref(2, 1), abs(d__2)); *xnorm = max(d__3,d__4); } return 0; /* 2 by 2: op[TL11 TL12]*[X11 X12] +ISGN* [X11 X12]*op[TR11 TR12] = [B11 B12] [TL21 TL22] [X21 X22] [X21 X22] [TR21 TR22] [B21 B22] Solve equivalent 4 by 4 system using complete pivoting. Set pivots less than SMIN to SMIN. */ L50: /* Computing MAX */ d__5 = (d__1 = tr_ref(1, 1), abs(d__1)), d__6 = (d__2 = tr_ref(1, 2), abs( d__2)), d__5 = max(d__5,d__6), d__6 = (d__3 = tr_ref(2, 1), abs( d__3)), d__5 = max(d__5,d__6), d__6 = (d__4 = tr_ref(2, 2), abs( d__4)); smin = max(d__5,d__6); /* Computing MAX */ d__5 = smin, d__6 = (d__1 = tl_ref(1, 1), abs(d__1)), d__5 = max(d__5, d__6), d__6 = (d__2 = tl_ref(1, 2), abs(d__2)), d__5 = max(d__5, d__6), d__6 = (d__3 = tl_ref(2, 1), abs(d__3)), d__5 = max(d__5, d__6), d__6 = (d__4 = tl_ref(2, 2), abs(d__4)); smin = max(d__5,d__6); /* Computing MAX */ d__1 = eps * smin; smin = max(d__1,smlnum); btmp[0] = 0.; dcopy_(&c__16, btmp, &c__0, t16, &c__1); t16_ref(1, 1) = tl_ref(1, 1) + sgn * tr_ref(1, 1); t16_ref(2, 2) = tl_ref(2, 2) + sgn * tr_ref(1, 1); t16_ref(3, 3) = tl_ref(1, 1) + sgn * tr_ref(2, 2); t16_ref(4, 4) = tl_ref(2, 2) + sgn * tr_ref(2, 2); if (*ltranl) { t16_ref(1, 2) = tl_ref(2, 1); t16_ref(2, 1) = tl_ref(1, 2); t16_ref(3, 4) = tl_ref(2, 1); t16_ref(4, 3) = tl_ref(1, 2); } else { t16_ref(1, 2) = tl_ref(1, 2); t16_ref(2, 1) = tl_ref(2, 1); t16_ref(3, 4) = tl_ref(1, 2); t16_ref(4, 3) = tl_ref(2, 1); } if (*ltranr) { t16_ref(1, 3) = sgn * tr_ref(1, 2); t16_ref(2, 4) = sgn * tr_ref(1, 2); t16_ref(3, 1) = sgn * tr_ref(2, 1); t16_ref(4, 2) = sgn * tr_ref(2, 1); } else { t16_ref(1, 3) = sgn * tr_ref(2, 1); t16_ref(2, 4) = sgn * tr_ref(2, 1); t16_ref(3, 1) = sgn * tr_ref(1, 2); t16_ref(4, 2) = sgn * tr_ref(1, 2); } btmp[0] = b_ref(1, 1); btmp[1] = b_ref(2, 1); btmp[2] = b_ref(1, 2); btmp[3] = b_ref(2, 2); /* Perform elimination */ for (i__ = 1; i__ <= 3; ++i__) { xmax = 0.; for (ip = i__; ip <= 4; ++ip) { for (jp = i__; jp <= 4; ++jp) { if ((d__1 = t16_ref(ip, jp), abs(d__1)) >= xmax) { xmax = (d__1 = t16_ref(ip, jp), abs(d__1)); ipsv = ip; jpsv = jp; } /* L60: */ } /* L70: */ } if (ipsv != i__) { dswap_(&c__4, &t16_ref(ipsv, 1), &c__4, &t16_ref(i__, 1), &c__4); temp = btmp[i__ - 1]; btmp[i__ - 1] = btmp[ipsv - 1]; btmp[ipsv - 1] = temp; } if (jpsv != i__) { dswap_(&c__4, &t16_ref(1, jpsv), &c__1, &t16_ref(1, i__), &c__1); } jpiv[i__ - 1] = jpsv; if ((d__1 = t16_ref(i__, i__), abs(d__1)) < smin) { *info = 1; t16_ref(i__, i__) = smin; } for (j = i__ + 1; j <= 4; ++j) { t16_ref(j, i__) = t16_ref(j, i__) / t16_ref(i__, i__); btmp[j - 1] -= t16_ref(j, i__) * btmp[i__ - 1]; for (k = i__ + 1; k <= 4; ++k) { t16_ref(j, k) = t16_ref(j, k) - t16_ref(j, i__) * t16_ref(i__, k); /* L80: */ } /* L90: */ } /* L100: */ } if ((d__1 = t16_ref(4, 4), abs(d__1)) < smin) { t16_ref(4, 4) = smin; } *scale = 1.; if (smlnum * 8. * abs(btmp[0]) > (d__1 = t16_ref(1, 1), abs(d__1)) || smlnum * 8. * abs(btmp[1]) > (d__2 = t16_ref(2, 2), abs(d__2)) || smlnum * 8. * abs(btmp[2]) > (d__3 = t16_ref(3, 3), abs(d__3)) || smlnum * 8. * abs(btmp[3]) > (d__4 = t16_ref(4, 4), abs(d__4))) { /* Computing MAX */ d__1 = abs(btmp[0]), d__2 = abs(btmp[1]), d__1 = max(d__1,d__2), d__2 = abs(btmp[2]), d__1 = max(d__1,d__2), d__2 = abs(btmp[3]); *scale = .125 / max(d__1,d__2); btmp[0] *= *scale; btmp[1] *= *scale; btmp[2] *= *scale; btmp[3] *= *scale; } for (i__ = 1; i__ <= 4; ++i__) { k = 5 - i__; temp = 1. / t16_ref(k, k); tmp[k - 1] = btmp[k - 1] * temp; for (j = k + 1; j <= 4; ++j) { tmp[k - 1] -= temp * t16_ref(k, j) * tmp[j - 1]; /* L110: */ } /* L120: */ } for (i__ = 1; i__ <= 3; ++i__) { if (jpiv[4 - i__ - 1] != 4 - i__) { temp = tmp[4 - i__ - 1]; tmp[4 - i__ - 1] = tmp[jpiv[4 - i__ - 1] - 1]; tmp[jpiv[4 - i__ - 1] - 1] = temp; } /* L130: */ } x_ref(1, 1) = tmp[0]; x_ref(2, 1) = tmp[1]; x_ref(1, 2) = tmp[2]; x_ref(2, 2) = tmp[3]; /* Computing MAX */ d__1 = abs(tmp[0]) + abs(tmp[2]), d__2 = abs(tmp[1]) + abs(tmp[3]); *xnorm = max(d__1,d__2); return 0; /* End of DLASY2 */ } /* dlasy2_ */
/* Subroutine */ int dgtrfs_(char *trans, integer *n, integer *nrhs, doublereal *dl, doublereal *d__, doublereal *du, doublereal *dlf, doublereal *df, doublereal *duf, doublereal *du2, integer *ipiv, doublereal *b, integer *ldb, doublereal *x, integer *ldx, doublereal * ferr, doublereal *berr, doublereal *work, integer *iwork, integer * info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= DGTRFS improves the computed solution to a system of linear equations when the coefficient matrix is tridiagonal, and provides error bounds and backward error estimates for the solution. Arguments ========= TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Conjugate transpose = Transpose) N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrix B. NRHS >= 0. DL (input) DOUBLE PRECISION array, dimension (N-1) The (n-1) subdiagonal elements of A. D (input) DOUBLE PRECISION array, dimension (N) The diagonal elements of A. DU (input) DOUBLE PRECISION array, dimension (N-1) The (n-1) superdiagonal elements of A. DLF (input) DOUBLE PRECISION array, dimension (N-1) The (n-1) multipliers that define the matrix L from the LU factorization of A as computed by DGTTRF. DF (input) DOUBLE PRECISION array, dimension (N) The n diagonal elements of the upper triangular matrix U from the LU factorization of A. DUF (input) DOUBLE PRECISION array, dimension (N-1) The (n-1) elements of the first superdiagonal of U. DU2 (input) DOUBLE PRECISION array, dimension (N-2) The (n-2) elements of the second superdiagonal of U. IPIV (input) INTEGER array, dimension (N) The pivot indices; for 1 <= i <= n, row i of the matrix was interchanged with row IPIV(i). IPIV(i) will always be either i or i+1; IPIV(i) = i indicates a row interchange was not required. B (input) DOUBLE PRECISION array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input/output) DOUBLE PRECISION array, dimension (LDX,NRHS) On entry, the solution matrix X, as computed by DGTTRS. On exit, the improved solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) DOUBLE PRECISION array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) DOUBLE PRECISION array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) DOUBLE PRECISION array, dimension (3*N) IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value Internal Parameters =================== ITMAX is the maximum number of steps of iterative refinement. ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; static doublereal c_b18 = -1.; static doublereal c_b19 = 1.; /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, i__1, i__2; doublereal d__1, d__2, d__3, d__4; /* Local variables */ static integer kase; static doublereal safe1, safe2; static integer i__, j; static doublereal s; extern logical lsame_(char *, char *); extern /* Subroutine */ int dcopy_(integer *, doublereal *, integer *, doublereal *, integer *), daxpy_(integer *, doublereal *, doublereal *, integer *, doublereal *, integer *); static integer count; extern doublereal dlamch_(char *); extern /* Subroutine */ int dlacon_(integer *, doublereal *, doublereal *, integer *, doublereal *, integer *); static integer nz; extern /* Subroutine */ int dlagtm_(char *, integer *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, integer *, doublereal *, doublereal *, integer *); static doublereal safmin; extern /* Subroutine */ int xerbla_(char *, integer *); static logical notran; static char transn[1]; extern /* Subroutine */ int dgttrs_(char *, integer *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, integer *, doublereal *, integer *, integer *); static char transt[1]; static doublereal lstres, eps; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] --dl; --d__; --du; --dlf; --df; --duf; --du2; --ipiv; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; notran = lsame_(trans, "N"); if (! notran && ! lsame_(trans, "T") && ! lsame_( trans, "C")) { *info = -1; } else if (*n < 0) { *info = -2; } else if (*nrhs < 0) { *info = -3; } else if (*ldb < max(1,*n)) { *info = -13; } else if (*ldx < max(1,*n)) { *info = -15; } if (*info != 0) { i__1 = -(*info); xerbla_("DGTRFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.; berr[j] = 0.; /* L10: */ } return 0; } if (notran) { *(unsigned char *)transn = 'N'; *(unsigned char *)transt = 'T'; } else { *(unsigned char *)transn = 'T'; *(unsigned char *)transt = 'N'; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = 4; eps = dlamch_("Epsilon"); safmin = dlamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { count = 1; lstres = 3.; L20: /* Loop until stopping criterion is satisfied. Compute residual R = B - op(A) * X, where op(A) = A, A**T, or A**H, depending on TRANS. */ dcopy_(n, &b_ref(1, j), &c__1, &work[*n + 1], &c__1); dlagtm_(trans, n, &c__1, &c_b18, &dl[1], &d__[1], &du[1], &x_ref(1, j) , ldx, &c_b19, &work[*n + 1], n); /* Compute abs(op(A))*abs(x) + abs(b) for use in the backward error bound. */ if (notran) { if (*n == 1) { work[1] = (d__1 = b_ref(1, j), abs(d__1)) + (d__2 = d__[1] * x_ref(1, j), abs(d__2)); } else { work[1] = (d__1 = b_ref(1, j), abs(d__1)) + (d__2 = d__[1] * x_ref(1, j), abs(d__2)) + (d__3 = du[1] * x_ref(2, j), abs(d__3)); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { work[i__] = (d__1 = b_ref(i__, j), abs(d__1)) + (d__2 = dl[i__ - 1] * x_ref(i__ - 1, j), abs(d__2)) + ( d__3 = d__[i__] * x_ref(i__, j), abs(d__3)) + ( d__4 = du[i__] * x_ref(i__ + 1, j), abs(d__4)); /* L30: */ } work[*n] = (d__1 = b_ref(*n, j), abs(d__1)) + (d__2 = dl[*n - 1] * x_ref(*n - 1, j), abs(d__2)) + (d__3 = d__[*n] * x_ref(*n, j), abs(d__3)); } } else { if (*n == 1) { work[1] = (d__1 = b_ref(1, j), abs(d__1)) + (d__2 = d__[1] * x_ref(1, j), abs(d__2)); } else { work[1] = (d__1 = b_ref(1, j), abs(d__1)) + (d__2 = d__[1] * x_ref(1, j), abs(d__2)) + (d__3 = dl[1] * x_ref(2, j), abs(d__3)); i__2 = *n - 1; for (i__ = 2; i__ <= i__2; ++i__) { work[i__] = (d__1 = b_ref(i__, j), abs(d__1)) + (d__2 = du[i__ - 1] * x_ref(i__ - 1, j), abs(d__2)) + ( d__3 = d__[i__] * x_ref(i__, j), abs(d__3)) + ( d__4 = dl[i__] * x_ref(i__ + 1, j), abs(d__4)); /* L40: */ } work[*n] = (d__1 = b_ref(*n, j), abs(d__1)) + (d__2 = du[*n - 1] * x_ref(*n - 1, j), abs(d__2)) + (d__3 = d__[*n] * x_ref(*n, j), abs(d__3)); } } /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ s = 0.; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { /* Computing MAX */ d__2 = s, d__3 = (d__1 = work[*n + i__], abs(d__1)) / work[ i__]; s = max(d__2,d__3); } else { /* Computing MAX */ d__2 = s, d__3 = ((d__1 = work[*n + i__], abs(d__1)) + safe1) / (work[i__] + safe1); s = max(d__2,d__3); } /* L50: */ } berr[j] = s; /* Test stopping criterion. Continue iterating if 1) The residual BERR(J) is larger than machine epsilon, and 2) BERR(J) decreased by at least a factor of 2 during the last iteration, and 3) At most ITMAX iterations tried. */ if (berr[j] > eps && berr[j] * 2. <= lstres && count <= 5) { /* Update solution and try again. */ dgttrs_(trans, n, &c__1, &dlf[1], &df[1], &duf[1], &du2[1], &ipiv[ 1], &work[*n + 1], n, info); daxpy_(n, &c_b19, &work[*n + 1], &c__1, &x_ref(1, j), &c__1); lstres = berr[j]; ++count; goto L20; } /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(op(A)))* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(op(A)) is the inverse of op(A) abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(op(A))*abs(X) + abs(B) is less than SAFE2. Use DLACON to estimate the infinity-norm of the matrix inv(op(A)) * diag(W), where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { work[i__] = (d__1 = work[*n + i__], abs(d__1)) + nz * eps * work[i__]; } else { work[i__] = (d__1 = work[*n + i__], abs(d__1)) + nz * eps * work[i__] + safe1; } /* L60: */ } kase = 0; L70: dlacon_(n, &work[(*n << 1) + 1], &work[*n + 1], &iwork[1], &ferr[j], & kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(op(A)**T). */ dgttrs_(transt, n, &c__1, &dlf[1], &df[1], &duf[1], &du2[1], & ipiv[1], &work[*n + 1], n, info); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L80: */ } } else { /* Multiply by inv(op(A))*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L90: */ } dgttrs_(transn, n, &c__1, &dlf[1], &df[1], &duf[1], &du2[1], & ipiv[1], &work[*n + 1], n, info); } goto L70; } /* Normalize error. */ lstres = 0.; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ d__2 = lstres, d__3 = (d__1 = x_ref(i__, j), abs(d__1)); lstres = max(d__2,d__3); /* L100: */ } if (lstres != 0.) { ferr[j] /= lstres; } /* L110: */ } return 0; /* End of DGTRFS */ } /* dgtrfs_ */
/* Subroutine */ int cqrt16_(char *trans, integer *m, integer *n, integer * nrhs, complex *a, integer *lda, complex *x, integer *ldx, complex *b, integer *ldb, real *rwork, real *resid) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; real r__1, r__2; complex q__1; /* Local variables */ static integer j; extern /* Subroutine */ int cgemm_(char *, char *, integer *, integer *, integer *, complex *, complex *, integer *, complex *, integer *, complex *, complex *, integer *); extern logical lsame_(char *, char *); static real anorm, bnorm; static integer n1, n2; static real xnorm; extern doublereal clange_(char *, integer *, integer *, complex *, integer *, real *), slamch_(char *), scasum_( integer *, complex *, integer *); static real eps; #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= CQRT16 computes the residual for a solution of a system of linear equations A*x = b or A'*x = b: RESID = norm(B - A*X) / ( max(m,n) * norm(A) * norm(X) * EPS ), where EPS is the machine epsilon. Arguments ========= TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A *x = b = 'T': A^T*x = b, where A^T is the transpose of A = 'C': A^H*x = b, where A^H is the conjugate transpose of A M (input) INTEGER The number of rows of the matrix A. M >= 0. N (input) INTEGER The number of columns of the matrix A. N >= 0. NRHS (input) INTEGER The number of columns of B, the matrix of right hand sides. NRHS >= 0. A (input) COMPLEX array, dimension (LDA,N) The original M x N matrix A. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,M). X (input) COMPLEX array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. If TRANS = 'N', LDX >= max(1,N); if TRANS = 'T' or 'C', LDX >= max(1,M). B (input/output) COMPLEX array, dimension (LDB,NRHS) On entry, the right hand side vectors for the system of linear equations. On exit, B is overwritten with the difference B - A*X. LDB (input) INTEGER The leading dimension of the array B. IF TRANS = 'N', LDB >= max(1,M); if TRANS = 'T' or 'C', LDB >= max(1,N). RWORK (workspace) REAL array, dimension (M) RESID (output) REAL The maximum over the number of right hand sides of norm(B - A*X) / ( max(m,n) * norm(A) * norm(X) * EPS ). ===================================================================== Quick exit if M = 0 or N = 0 or NRHS = 0 Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --rwork; /* Function Body */ if (*m <= 0 || *n <= 0 || *nrhs == 0) { *resid = 0.f; return 0; } if (lsame_(trans, "T") || lsame_(trans, "C")) { anorm = clange_("I", m, n, &a[a_offset], lda, &rwork[1]); n1 = *n; n2 = *m; } else { anorm = clange_("1", m, n, &a[a_offset], lda, &rwork[1]); n1 = *m; n2 = *n; } eps = slamch_("Epsilon"); /* Compute B - A*X (or B - A'*X ) and store in B. */ q__1.r = -1.f, q__1.i = 0.f; cgemm_(trans, "No transpose", &n1, nrhs, &n2, &q__1, &a[a_offset], lda, & x[x_offset], ldx, &c_b1, &b[b_offset], ldb) ; /* Compute the maximum over the number of right hand sides of norm(B - A*X) / ( max(m,n) * norm(A) * norm(X) * EPS ) . */ *resid = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { bnorm = scasum_(&n1, &b_ref(1, j), &c__1); xnorm = scasum_(&n2, &x_ref(1, j), &c__1); if (anorm == 0.f && bnorm == 0.f) { *resid = 0.f; } else if (anorm <= 0.f || xnorm <= 0.f) { *resid = 1.f / eps; } else { /* Computing MAX */ r__1 = *resid, r__2 = bnorm / anorm / xnorm / (max(*m,*n) * eps); *resid = dmax(r__1,r__2); } /* L10: */ } return 0; /* End of CQRT16 */ } /* cqrt16_ */
/* Subroutine */ int dgbsvx_(char *fact, char *trans, integer *n, integer *kl, integer *ku, integer *nrhs, doublereal *ab, integer *ldab, doublereal *afb, integer *ldafb, integer *ipiv, char *equed, doublereal *r__, doublereal *c__, doublereal *b, integer *ldb, doublereal *x, integer *ldx, doublereal *rcond, doublereal *ferr, doublereal *berr, doublereal *work, integer *iwork, integer *info) { /* -- LAPACK driver routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University June 30, 1999 Purpose ======= DGBSVX uses the LU factorization to compute the solution to a real system of linear equations A * X = B, A**T * X = B, or A**H * X = B, where A is a band matrix of order N with KL subdiagonals and KU superdiagonals, and X and B are N-by-NRHS matrices. Error bounds on the solution and a condition estimate are also provided. Description =========== The following steps are performed by this subroutine: 1. If FACT = 'E', real scaling factors are computed to equilibrate the system: TRANS = 'N': diag(R)*A*diag(C) *inv(diag(C))*X = diag(R)*B TRANS = 'T': (diag(R)*A*diag(C))**T *inv(diag(R))*X = diag(C)*B TRANS = 'C': (diag(R)*A*diag(C))**H *inv(diag(R))*X = diag(C)*B Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if equilibration is used, A is overwritten by diag(R)*A*diag(C) and B by diag(R)*B (if TRANS='N') or diag(C)*B (if TRANS = 'T' or 'C'). 2. If FACT = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if FACT = 'E') as A = L * U, where L is a product of permutation and unit lower triangular matrices with KL subdiagonals, and U is upper triangular with KL+KU superdiagonals. 3. If some U(i,i)=0, so that U is exactly singular, then the routine returns with INFO = i. Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the condition number is less than machine precision, INFO = N+1 is returned as a warning, but the routine still goes on to solve for X and compute error bounds as described below. 4. The system of equations is solved for X using the factored form of A. 5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and backward error estimates for it. 6. If equilibration was used, the matrix X is premultiplied by diag(C) (if TRANS = 'N') or diag(R) (if TRANS = 'T' or 'C') so that it solves the original system before equilibration. Arguments ========= FACT (input) CHARACTER*1 Specifies whether or not the factored form of the matrix A is supplied on entry, and if not, whether the matrix A should be equilibrated before it is factored. = 'F': On entry, AFB and IPIV contain the factored form of A. If EQUED is not 'N', the matrix A has been equilibrated with scaling factors given by R and C. AB, AFB, and IPIV are not modified. = 'N': The matrix A will be copied to AFB and factored. = 'E': The matrix A will be equilibrated if necessary, then copied to AFB and factored. TRANS (input) CHARACTER*1 Specifies the form of the system of equations. = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Transpose) N (input) INTEGER The number of linear equations, i.e., the order of the matrix A. N >= 0. KL (input) INTEGER The number of subdiagonals within the band of A. KL >= 0. KU (input) INTEGER The number of superdiagonals within the band of A. KU >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. AB (input/output) DOUBLE PRECISION array, dimension (LDAB,N) On entry, the matrix A in band storage, in rows 1 to KL+KU+1. The j-th column of A is stored in the j-th column of the array AB as follows: AB(KU+1+i-j,j) = A(i,j) for max(1,j-KU)<=i<=min(N,j+kl) If FACT = 'F' and EQUED is not 'N', then A must have been equilibrated by the scaling factors in R and/or C. AB is not modified if FACT = 'F' or 'N', or if FACT = 'E' and EQUED = 'N' on exit. On exit, if EQUED .ne. 'N', A is scaled as follows: EQUED = 'R': A := diag(R) * A EQUED = 'C': A := A * diag(C) EQUED = 'B': A := diag(R) * A * diag(C). LDAB (input) INTEGER The leading dimension of the array AB. LDAB >= KL+KU+1. AFB (input or output) DOUBLE PRECISION array, dimension (LDAFB,N) If FACT = 'F', then AFB is an input argument and on entry contains details of the LU factorization of the band matrix A, as computed by DGBTRF. U is stored as an upper triangular band matrix with KL+KU superdiagonals in rows 1 to KL+KU+1, and the multipliers used during the factorization are stored in rows KL+KU+2 to 2*KL+KU+1. If EQUED .ne. 'N', then AFB is the factored form of the equilibrated matrix A. If FACT = 'N', then AFB is an output argument and on exit returns details of the LU factorization of A. If FACT = 'E', then AFB is an output argument and on exit returns details of the LU factorization of the equilibrated matrix A (see the description of AB for the form of the equilibrated matrix). LDAFB (input) INTEGER The leading dimension of the array AFB. LDAFB >= 2*KL+KU+1. IPIV (input or output) INTEGER array, dimension (N) If FACT = 'F', then IPIV is an input argument and on entry contains the pivot indices from the factorization A = L*U as computed by DGBTRF; row i of the matrix was interchanged with row IPIV(i). If FACT = 'N', then IPIV is an output argument and on exit contains the pivot indices from the factorization A = L*U of the original matrix A. If FACT = 'E', then IPIV is an output argument and on exit contains the pivot indices from the factorization A = L*U of the equilibrated matrix A. EQUED (input or output) CHARACTER*1 Specifies the form of equilibration that was done. = 'N': No equilibration (always true if FACT = 'N'). = 'R': Row equilibration, i.e., A has been premultiplied by diag(R). = 'C': Column equilibration, i.e., A has been postmultiplied by diag(C). = 'B': Both row and column equilibration, i.e., A has been replaced by diag(R) * A * diag(C). EQUED is an input argument if FACT = 'F'; otherwise, it is an output argument. R (input or output) DOUBLE PRECISION array, dimension (N) The row scale factors for A. If EQUED = 'R' or 'B', A is multiplied on the left by diag(R); if EQUED = 'N' or 'C', R is not accessed. R is an input argument if FACT = 'F'; otherwise, R is an output argument. If FACT = 'F' and EQUED = 'R' or 'B', each element of R must be positive. C (input or output) DOUBLE PRECISION array, dimension (N) The column scale factors for A. If EQUED = 'C' or 'B', A is multiplied on the right by diag(C); if EQUED = 'N' or 'R', C is not accessed. C is an input argument if FACT = 'F'; otherwise, C is an output argument. If FACT = 'F' and EQUED = 'C' or 'B', each element of C must be positive. B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) On entry, the right hand side matrix B. On exit, if EQUED = 'N', B is not modified; if TRANS = 'N' and EQUED = 'R' or 'B', B is overwritten by diag(R)*B; if TRANS = 'T' or 'C' and EQUED = 'C' or 'B', B is overwritten by diag(C)*B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (output) DOUBLE PRECISION array, dimension (LDX,NRHS) If INFO = 0 or INFO = N+1, the N-by-NRHS solution matrix X to the original system of equations. Note that A and B are modified on exit if EQUED .ne. 'N', and the solution to the equilibrated system is inv(diag(C))*X if TRANS = 'N' and EQUED = 'C' or 'B', or inv(diag(R))*X if TRANS = 'T' or 'C' and EQUED = 'R' or 'B'. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). RCOND (output) DOUBLE PRECISION The estimate of the reciprocal condition number of the matrix A after equilibration (if done). If RCOND is less than the machine precision (in particular, if RCOND = 0), the matrix is singular to working precision. This condition is indicated by a return code of INFO > 0. FERR (output) DOUBLE PRECISION array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) DOUBLE PRECISION array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace/output) DOUBLE PRECISION array, dimension (3*N) On exit, WORK(1) contains the reciprocal pivot growth factor norm(A)/norm(U). The "max absolute element" norm is used. If WORK(1) is much less than 1, then the stability of the LU factorization of the (equilibrated) matrix A could be poor. This also means that the solution X, condition estimator RCOND, and forward error bound FERR could be unreliable. If factorization fails with 0<INFO<=N, then WORK(1) contains the reciprocal pivot growth factor for the leading INFO columns of A. IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value > 0: if INFO = i, and i is <= N: U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, so the solution and error bounds could not be computed. RCOND = 0 is returned. = N+1: U is nonsingular, but RCOND is less than machine precision, meaning that the matrix is singular to working precision. Nevertheless, the solution and error bounds are computed because there are a number of situations where the computed solution can be more accurate than the value of RCOND would suggest. ===================================================================== Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; /* System generated locals */ integer ab_dim1, ab_offset, afb_dim1, afb_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3, i__4, i__5; doublereal d__1, d__2, d__3; /* Local variables */ static doublereal amax; static char norm[1]; static integer i__, j; extern logical lsame_(char *, char *); static doublereal rcmin, rcmax, anorm; extern /* Subroutine */ int dcopy_(integer *, doublereal *, integer *, doublereal *, integer *); static logical equil; static integer j1, j2; extern doublereal dlamch_(char *), dlangb_(char *, integer *, integer *, integer *, doublereal *, integer *, doublereal *); extern /* Subroutine */ int dlaqgb_(integer *, integer *, integer *, integer *, doublereal *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, char *), dgbcon_(char *, integer *, integer *, integer *, doublereal *, integer *, integer *, doublereal *, doublereal *, doublereal *, integer *, integer *); static doublereal colcnd; extern doublereal dlantb_(char *, char *, char *, integer *, integer *, doublereal *, integer *, doublereal *); extern /* Subroutine */ int dgbequ_(integer *, integer *, integer *, integer *, doublereal *, integer *, doublereal *, doublereal *, doublereal *, doublereal *, doublereal *, integer *), dgbrfs_( char *, integer *, integer *, integer *, integer *, doublereal *, integer *, doublereal *, integer *, integer *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, doublereal *, integer *, integer *), dgbtrf_(integer *, integer *, integer *, integer *, doublereal *, integer *, integer *, integer *); static logical nofact; extern /* Subroutine */ int dlacpy_(char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), xerbla_(char *, integer *); static doublereal bignum; extern /* Subroutine */ int dgbtrs_(char *, integer *, integer *, integer *, integer *, doublereal *, integer *, integer *, doublereal *, integer *, integer *); static integer infequ; static logical colequ; static doublereal rowcnd; static logical notran; static doublereal smlnum; static logical rowequ; static doublereal rpvgrw; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define ab_ref(a_1,a_2) ab[(a_2)*ab_dim1 + a_1] #define afb_ref(a_1,a_2) afb[(a_2)*afb_dim1 + a_1] ab_dim1 = *ldab; ab_offset = 1 + ab_dim1 * 1; ab -= ab_offset; afb_dim1 = *ldafb; afb_offset = 1 + afb_dim1 * 1; afb -= afb_offset; --ipiv; --r__; --c__; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; nofact = lsame_(fact, "N"); equil = lsame_(fact, "E"); notran = lsame_(trans, "N"); if (nofact || equil) { *(unsigned char *)equed = 'N'; rowequ = FALSE_; colequ = FALSE_; } else { rowequ = lsame_(equed, "R") || lsame_(equed, "B"); colequ = lsame_(equed, "C") || lsame_(equed, "B"); smlnum = dlamch_("Safe minimum"); bignum = 1. / smlnum; } /* Test the input parameters. */ if (! nofact && ! equil && ! lsame_(fact, "F")) { *info = -1; } else if (! notran && ! lsame_(trans, "T") && ! lsame_(trans, "C")) { *info = -2; } else if (*n < 0) { *info = -3; } else if (*kl < 0) { *info = -4; } else if (*ku < 0) { *info = -5; } else if (*nrhs < 0) { *info = -6; } else if (*ldab < *kl + *ku + 1) { *info = -8; } else if (*ldafb < (*kl << 1) + *ku + 1) { *info = -10; } else if (lsame_(fact, "F") && ! (rowequ || colequ || lsame_(equed, "N"))) { *info = -12; } else { if (rowequ) { rcmin = bignum; rcmax = 0.; i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MIN */ d__1 = rcmin, d__2 = r__[j]; rcmin = min(d__1,d__2); /* Computing MAX */ d__1 = rcmax, d__2 = r__[j]; rcmax = max(d__1,d__2); /* L10: */ } if (rcmin <= 0.) { *info = -13; } else if (*n > 0) { rowcnd = max(rcmin,smlnum) / min(rcmax,bignum); } else { rowcnd = 1.; } } if (colequ && *info == 0) { rcmin = bignum; rcmax = 0.; i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MIN */ d__1 = rcmin, d__2 = c__[j]; rcmin = min(d__1,d__2); /* Computing MAX */ d__1 = rcmax, d__2 = c__[j]; rcmax = max(d__1,d__2); /* L20: */ } if (rcmin <= 0.) { *info = -14; } else if (*n > 0) { colcnd = max(rcmin,smlnum) / min(rcmax,bignum); } else { colcnd = 1.; } } if (*info == 0) { if (*ldb < max(1,*n)) { *info = -16; } else if (*ldx < max(1,*n)) { *info = -18; } } } if (*info != 0) { i__1 = -(*info); xerbla_("DGBSVX", &i__1); return 0; } if (equil) { /* Compute row and column scalings to equilibrate the matrix A. */ dgbequ_(n, n, kl, ku, &ab[ab_offset], ldab, &r__[1], &c__[1], &rowcnd, &colcnd, &amax, &infequ); if (infequ == 0) { /* Equilibrate the matrix. */ dlaqgb_(n, n, kl, ku, &ab[ab_offset], ldab, &r__[1], &c__[1], & rowcnd, &colcnd, &amax, equed); rowequ = lsame_(equed, "R") || lsame_(equed, "B"); colequ = lsame_(equed, "C") || lsame_(equed, "B"); } } /* Scale the right hand side. */ if (notran) { if (rowequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { b_ref(i__, j) = r__[i__] * b_ref(i__, j); /* L30: */ } /* L40: */ } } } else if (colequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { b_ref(i__, j) = c__[i__] * b_ref(i__, j); /* L50: */ } /* L60: */ } } if (nofact || equil) { /* Compute the LU factorization of the band matrix A. */ i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ i__2 = j - *ku; j1 = max(i__2,1); /* Computing MIN */ i__2 = j + *kl; j2 = min(i__2,*n); i__2 = j2 - j1 + 1; dcopy_(&i__2, &ab_ref(*ku + 1 - j + j1, j), &c__1, &afb_ref(*kl + *ku + 1 - j + j1, j), &c__1); /* L70: */ } dgbtrf_(n, n, kl, ku, &afb[afb_offset], ldafb, &ipiv[1], info); /* Return if INFO is non-zero. */ if (*info != 0) { if (*info > 0) { /* Compute the reciprocal pivot growth factor of the leading rank-deficient INFO columns of A. */ anorm = 0.; i__1 = *info; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ i__2 = *ku + 2 - j; /* Computing MIN */ i__4 = *n + *ku + 1 - j, i__5 = *kl + *ku + 1; i__3 = min(i__4,i__5); for (i__ = max(i__2,1); i__ <= i__3; ++i__) { /* Computing MAX */ d__2 = anorm, d__3 = (d__1 = ab_ref(i__, j), abs(d__1) ); anorm = max(d__2,d__3); /* L80: */ } /* L90: */ } /* Computing MAX */ i__1 = 1, i__3 = *kl + *ku + 2 - *info; /* Computing MIN */ i__4 = *info - 1, i__5 = *kl + *ku; i__2 = min(i__4,i__5); rpvgrw = dlantb_("M", "U", "N", info, &i__2, &afb_ref(max( i__1,i__3), 1), ldafb, &work[1]); if (rpvgrw == 0.) { rpvgrw = 1.; } else { rpvgrw = anorm / rpvgrw; } work[1] = rpvgrw; *rcond = 0.; } return 0; } } /* Compute the norm of the matrix A and the reciprocal pivot growth factor RPVGRW. */ if (notran) { *(unsigned char *)norm = '1'; } else { *(unsigned char *)norm = 'I'; } anorm = dlangb_(norm, n, kl, ku, &ab[ab_offset], ldab, &work[1]); i__1 = *kl + *ku; rpvgrw = dlantb_("M", "U", "N", n, &i__1, &afb[afb_offset], ldafb, &work[ 1]); if (rpvgrw == 0.) { rpvgrw = 1.; } else { rpvgrw = dlangb_("M", n, kl, ku, &ab[ab_offset], ldab, &work[1]) / rpvgrw; } /* Compute the reciprocal of the condition number of A. */ dgbcon_(norm, n, kl, ku, &afb[afb_offset], ldafb, &ipiv[1], &anorm, rcond, &work[1], &iwork[1], info); /* Set INFO = N+1 if the matrix is singular to working precision. */ if (*rcond < dlamch_("Epsilon")) { *info = *n + 1; } /* Compute the solution matrix X. */ dlacpy_("Full", n, nrhs, &b[b_offset], ldb, &x[x_offset], ldx); dgbtrs_(trans, n, kl, ku, nrhs, &afb[afb_offset], ldafb, &ipiv[1], &x[ x_offset], ldx, info); /* Use iterative refinement to improve the computed solution and compute error bounds and backward error estimates for it. */ dgbrfs_(trans, n, kl, ku, nrhs, &ab[ab_offset], ldab, &afb[afb_offset], ldafb, &ipiv[1], &b[b_offset], ldb, &x[x_offset], ldx, &ferr[1], & berr[1], &work[1], &iwork[1], info); /* Transform the solution matrix X to a solution of the original system. */ if (notran) { if (colequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__3 = *n; for (i__ = 1; i__ <= i__3; ++i__) { x_ref(i__, j) = c__[i__] * x_ref(i__, j); /* L100: */ } /* L110: */ } i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] /= colcnd; /* L120: */ } } } else if (rowequ) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { i__3 = *n; for (i__ = 1; i__ <= i__3; ++i__) { x_ref(i__, j) = r__[i__] * x_ref(i__, j); /* L130: */ } /* L140: */ } i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] /= rowcnd; /* L150: */ } } work[1] = rpvgrw; return 0; /* End of DGBSVX */ } /* dgbsvx_ */
/* Subroutine */ int dgbt02_(char *trans, integer *m, integer *n, integer *kl, integer *ku, integer *nrhs, doublereal *a, integer *lda, doublereal * x, integer *ldx, doublereal *b, integer *ldb, doublereal *resid) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3; doublereal d__1, d__2; /* Local variables */ static integer j; extern /* Subroutine */ int dgbmv_(char *, integer *, integer *, integer * , integer *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *); extern logical lsame_(char *, char *); extern doublereal dasum_(integer *, doublereal *, integer *); static doublereal anorm, bnorm; static integer i1, i2, n1; static doublereal xnorm; static integer kd; extern doublereal dlamch_(char *); static doublereal eps; #define a_ref(a_1,a_2) a[(a_2)*a_dim1 + a_1] #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= DGBT02 computes the residual for a solution of a banded system of equations A*x = b or A'*x = b: RESID = norm( B - A*X ) / ( norm(A) * norm(X) * EPS). where EPS is the machine precision. Arguments ========= TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A *x = b = 'T': A'*x = b, where A' is the transpose of A = 'C': A'*x = b, where A' is the transpose of A M (input) INTEGER The number of rows of the matrix A. M >= 0. N (input) INTEGER The number of columns of the matrix A. N >= 0. KL (input) INTEGER The number of subdiagonals within the band of A. KL >= 0. KU (input) INTEGER The number of superdiagonals within the band of A. KU >= 0. NRHS (input) INTEGER The number of columns of B. NRHS >= 0. A (input) DOUBLE PRECISION array, dimension (LDA,N) The original matrix A in band storage, stored in rows 1 to KL+KU+1. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,KL+KU+1). X (input) DOUBLE PRECISION array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. If TRANS = 'N', LDX >= max(1,N); if TRANS = 'T' or 'C', LDX >= max(1,M). B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) On entry, the right hand side vectors for the system of linear equations. On exit, B is overwritten with the difference B - A*X. LDB (input) INTEGER The leading dimension of the array B. IF TRANS = 'N', LDB >= max(1,M); if TRANS = 'T' or 'C', LDB >= max(1,N). RESID (output) DOUBLE PRECISION The maximum over the number of right hand sides of norm(B - A*X) / ( norm(A) * norm(X) * EPS ). ===================================================================== Quick return if N = 0 pr NRHS = 0 Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; /* Function Body */ if (*m <= 0 || *n <= 0 || *nrhs <= 0) { *resid = 0.; return 0; } /* Exit with RESID = 1/EPS if ANORM = 0. */ eps = dlamch_("Epsilon"); kd = *ku + 1; anorm = 0.; i__1 = *n; for (j = 1; j <= i__1; ++j) { /* Computing MAX */ i__2 = kd + 1 - j; i1 = max(i__2,1); /* Computing MIN */ i__2 = kd + *m - j, i__3 = *kl + kd; i2 = min(i__2,i__3); /* Computing MAX */ i__2 = i2 - i1 + 1; d__1 = anorm, d__2 = dasum_(&i__2, &a_ref(i1, j), &c__1); anorm = max(d__1,d__2); /* L10: */ } if (anorm <= 0.) { *resid = 1. / eps; return 0; } if (lsame_(trans, "T") || lsame_(trans, "C")) { n1 = *n; } else { n1 = *m; } /* Compute B - A*X (or B - A'*X ) */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dgbmv_(trans, m, n, kl, ku, &c_b8, &a[a_offset], lda, &x_ref(1, j), & c__1, &c_b10, &b_ref(1, j), &c__1); /* L20: */ } /* Compute the maximum over the number of right hand sides of norm(B - A*X) / ( norm(A) * norm(X) * EPS ). */ *resid = 0.; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { bnorm = dasum_(&n1, &b_ref(1, j), &c__1); xnorm = dasum_(&n1, &x_ref(1, j), &c__1); if (xnorm <= 0.) { *resid = 1. / eps; } else { /* Computing MAX */ d__1 = *resid, d__2 = bnorm / anorm / xnorm / eps; *resid = max(d__1,d__2); } /* L30: */ } return 0; /* End of DGBT02 */ } /* dgbt02_ */
doublereal zqrt14_(char *trans, integer *m, integer *n, integer *nrhs, doublecomplex *a, integer *lda, doublecomplex *x, integer *ldx, doublecomplex *work, integer *lwork) { /* System generated locals */ integer a_dim1, a_offset, x_dim1, x_offset, i__1, i__2, i__3; doublereal ret_val, d__1, d__2; doublecomplex z__1; /* Builtin functions */ double z_abs(doublecomplex *); void d_cnjg(doublecomplex *, doublecomplex *); /* Local variables */ static integer info; static doublereal anrm; static logical tpsd; static doublereal xnrm; static integer i__, j; extern logical lsame_(char *, char *); static doublereal rwork[1]; extern /* Subroutine */ int zgelq2_(integer *, integer *, doublecomplex *, integer *, doublecomplex *, doublecomplex *, integer *), zgeqr2_( integer *, integer *, doublecomplex *, integer *, doublecomplex *, doublecomplex *, integer *); extern doublereal dlamch_(char *); extern /* Subroutine */ int xerbla_(char *, integer *); extern doublereal zlange_(char *, integer *, integer *, doublecomplex *, integer *, doublereal *); extern /* Subroutine */ int zlascl_(char *, integer *, integer *, doublereal *, doublereal *, integer *, integer *, doublecomplex *, integer *, integer *); static integer ldwork; extern /* Subroutine */ int zlacpy_(char *, integer *, integer *, doublecomplex *, integer *, doublecomplex *, integer *); static doublereal err; #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= ZQRT14 checks whether X is in the row space of A or A'. It does so by scaling both X and A such that their norms are in the range [sqrt(eps), 1/sqrt(eps)], then computing a QR factorization of [A,X] (if TRANS = 'C') or an LQ factorization of [A',X]' (if TRANS = 'N'), and returning the norm of the trailing triangle, scaled by MAX(M,N,NRHS)*eps. Arguments ========= TRANS (input) CHARACTER*1 = 'N': No transpose, check for X in the row space of A = 'C': Conjugate transpose, check for X in row space of A'. M (input) INTEGER The number of rows of the matrix A. N (input) INTEGER The number of columns of the matrix A. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of X. A (input) COMPLEX*16 array, dimension (LDA,N) The M-by-N matrix A. LDA (input) INTEGER The leading dimension of the array A. X (input) COMPLEX*16 array, dimension (LDX,NRHS) If TRANS = 'N', the N-by-NRHS matrix X. IF TRANS = 'C', the M-by-NRHS matrix X. LDX (input) INTEGER The leading dimension of the array X. WORK (workspace) COMPLEX*16 array dimension (LWORK) LWORK (input) INTEGER length of workspace array required If TRANS = 'N', LWORK >= (M+NRHS)*(N+2); if TRANS = 'C', LWORK >= (N+NRHS)*(M+2). ===================================================================== Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --work; /* Function Body */ ret_val = 0.; if (lsame_(trans, "N")) { ldwork = *m + *nrhs; tpsd = FALSE_; if (*lwork < (*m + *nrhs) * (*n + 2)) { xerbla_("ZQRT14", &c__10); return ret_val; } else if (*n <= 0 || *nrhs <= 0) { return ret_val; } } else if (lsame_(trans, "C")) { ldwork = *m; tpsd = TRUE_; if (*lwork < (*n + *nrhs) * (*m + 2)) { xerbla_("ZQRT14", &c__10); return ret_val; } else if (*m <= 0 || *nrhs <= 0) { return ret_val; } } else { xerbla_("ZQRT14", &c__1); return ret_val; } /* Copy and scale A */ zlacpy_("All", m, n, &a[a_offset], lda, &work[1], &ldwork); anrm = zlange_("M", m, n, &work[1], &ldwork, rwork); if (anrm != 0.) { zlascl_("G", &c__0, &c__0, &anrm, &c_b15, m, n, &work[1], &ldwork, & info); } /* Copy X or X' into the right place and scale it */ if (tpsd) { /* Copy X into columns n+1:n+nrhs of work */ zlacpy_("All", m, nrhs, &x[x_offset], ldx, &work[*n * ldwork + 1], & ldwork); xnrm = zlange_("M", m, nrhs, &work[*n * ldwork + 1], &ldwork, rwork); if (xnrm != 0.) { zlascl_("G", &c__0, &c__0, &xnrm, &c_b15, m, nrhs, &work[*n * ldwork + 1], &ldwork, &info); } i__1 = *n + *nrhs; anrm = zlange_("One-norm", m, &i__1, &work[1], &ldwork, rwork); /* Compute QR factorization of X */ i__1 = *n + *nrhs; /* Computing MIN */ i__2 = *m, i__3 = *n + *nrhs; zgeqr2_(m, &i__1, &work[1], &ldwork, &work[ldwork * (*n + *nrhs) + 1], &work[ldwork * (*n + *nrhs) + min(i__2,i__3) + 1], &info); /* Compute largest entry in upper triangle of work(n+1:m,n+1:n+nrhs) */ err = 0.; i__1 = *n + *nrhs; for (j = *n + 1; j <= i__1; ++j) { i__2 = min(*m,j); for (i__ = *n + 1; i__ <= i__2; ++i__) { /* Computing MAX */ d__1 = err, d__2 = z_abs(&work[i__ + (j - 1) * *m]); err = max(d__1,d__2); /* L10: */ } /* L20: */ } } else { /* Copy X' into rows m+1:m+nrhs of work */ i__1 = *n; for (i__ = 1; i__ <= i__1; ++i__) { i__2 = *nrhs; for (j = 1; j <= i__2; ++j) { i__3 = *m + j + (i__ - 1) * ldwork; d_cnjg(&z__1, &x_ref(i__, j)); work[i__3].r = z__1.r, work[i__3].i = z__1.i; /* L30: */ } /* L40: */ } xnrm = zlange_("M", nrhs, n, &work[*m + 1], &ldwork, rwork) ; if (xnrm != 0.) { zlascl_("G", &c__0, &c__0, &xnrm, &c_b15, nrhs, n, &work[*m + 1], &ldwork, &info); } /* Compute LQ factorization of work */ zgelq2_(&ldwork, n, &work[1], &ldwork, &work[ldwork * *n + 1], &work[ ldwork * (*n + 1) + 1], &info); /* Compute largest entry in lower triangle in work(m+1:m+nrhs,m+1:n) */ err = 0.; i__1 = *n; for (j = *m + 1; j <= i__1; ++j) { i__2 = ldwork; for (i__ = j; i__ <= i__2; ++i__) { /* Computing MAX */ d__1 = err, d__2 = z_abs(&work[i__ + (j - 1) * ldwork]); err = max(d__1,d__2); /* L50: */ } /* L60: */ } } /* Computing MAX */ i__1 = max(*m,*n); ret_val = err / ((doublereal) max(i__1,*nrhs) * dlamch_("Epsilon")); return ret_val; /* End of ZQRT14 */ } /* zqrt14_ */
/* Subroutine */ int dtbrfs_(char *uplo, char *trans, char *diag, integer *n, integer *kd, integer *nrhs, doublereal *ab, integer *ldab, doublereal *b, integer *ldb, doublereal *x, integer *ldx, doublereal *ferr, doublereal *berr, doublereal *work, integer *iwork, integer *info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= DTBRFS provides error bounds and backward error estimates for the solution to a system of linear equations with a triangular band coefficient matrix. The solution matrix X must be computed by DTBTRS or some other means before entering this routine. DTBRFS does not do iterative refinement because doing so cannot improve the backward error. Arguments ========= UPLO (input) CHARACTER*1 = 'U': A is upper triangular; = 'L': A is lower triangular. TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Conjugate transpose = Transpose) DIAG (input) CHARACTER*1 = 'N': A is non-unit triangular; = 'U': A is unit triangular. N (input) INTEGER The order of the matrix A. N >= 0. KD (input) INTEGER The number of superdiagonals or subdiagonals of the triangular band matrix A. KD >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. AB (input) DOUBLE PRECISION array, dimension (LDAB,N) The upper or lower triangular band matrix A, stored in the first kd+1 rows of the array. The j-th column of A is stored in the j-th column of the array AB as follows: if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). If DIAG = 'U', the diagonal elements of A are not referenced and are assumed to be 1. LDAB (input) INTEGER The leading dimension of the array AB. LDAB >= KD+1. B (input) DOUBLE PRECISION array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input) DOUBLE PRECISION array, dimension (LDX,NRHS) The solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) DOUBLE PRECISION array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) DOUBLE PRECISION array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) DOUBLE PRECISION array, dimension (3*N) IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; static doublereal c_b19 = -1.; /* System generated locals */ integer ab_dim1, ab_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3, i__4, i__5; doublereal d__1, d__2, d__3; /* Local variables */ static integer kase; static doublereal safe1, safe2; static integer i__, j, k; static doublereal s; extern logical lsame_(char *, char *); extern /* Subroutine */ int dtbmv_(char *, char *, char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), dcopy_(integer *, doublereal *, integer * , doublereal *, integer *), dtbsv_(char *, char *, char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), daxpy_(integer *, doublereal * , doublereal *, integer *, doublereal *, integer *); static logical upper; extern doublereal dlamch_(char *); extern /* Subroutine */ int dlacon_(integer *, doublereal *, doublereal *, integer *, doublereal *, integer *); static doublereal xk; static integer nz; static doublereal safmin; extern /* Subroutine */ int xerbla_(char *, integer *); static logical notran; static char transt[1]; static logical nounit; static doublereal lstres, eps; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define ab_ref(a_1,a_2) ab[(a_2)*ab_dim1 + a_1] ab_dim1 = *ldab; ab_offset = 1 + ab_dim1 * 1; ab -= ab_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; upper = lsame_(uplo, "U"); notran = lsame_(trans, "N"); nounit = lsame_(diag, "N"); if (! upper && ! lsame_(uplo, "L")) { *info = -1; } else if (! notran && ! lsame_(trans, "T") && ! lsame_(trans, "C")) { *info = -2; } else if (! nounit && ! lsame_(diag, "U")) { *info = -3; } else if (*n < 0) { *info = -4; } else if (*kd < 0) { *info = -5; } else if (*nrhs < 0) { *info = -6; } else if (*ldab < *kd + 1) { *info = -8; } else if (*ldb < max(1,*n)) { *info = -10; } else if (*ldx < max(1,*n)) { *info = -12; } if (*info != 0) { i__1 = -(*info); xerbla_("DTBRFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.; berr[j] = 0.; /* L10: */ } return 0; } if (notran) { *(unsigned char *)transt = 'T'; } else { *(unsigned char *)transt = 'N'; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = *kd + 2; eps = dlamch_("Epsilon"); safmin = dlamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { /* Compute residual R = B - op(A) * X, where op(A) = A or A', depending on TRANS. */ dcopy_(n, &x_ref(1, j), &c__1, &work[*n + 1], &c__1); dtbmv_(uplo, trans, diag, n, kd, &ab[ab_offset], ldab, &work[*n + 1], &c__1); daxpy_(n, &c_b19, &b_ref(1, j), &c__1, &work[*n + 1], &c__1); /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[i__] = (d__1 = b_ref(i__, j), abs(d__1)); /* L20: */ } if (notran) { /* Compute abs(A)*abs(X) + abs(B). */ if (upper) { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { xk = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MAX */ i__3 = 1, i__4 = k - *kd; i__5 = k; for (i__ = max(i__3,i__4); i__ <= i__5; ++i__) { work[i__] += (d__1 = ab_ref(*kd + 1 + i__ - k, k), abs(d__1)) * xk; /* L30: */ } /* L40: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { xk = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MAX */ i__5 = 1, i__3 = k - *kd; i__4 = k - 1; for (i__ = max(i__5,i__3); i__ <= i__4; ++i__) { work[i__] += (d__1 = ab_ref(*kd + 1 + i__ - k, k), abs(d__1)) * xk; /* L50: */ } work[k] += xk; /* L60: */ } } } else { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { xk = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MIN */ i__5 = *n, i__3 = k + *kd; i__4 = min(i__5,i__3); for (i__ = k; i__ <= i__4; ++i__) { work[i__] += (d__1 = ab_ref(i__ + 1 - k, k), abs( d__1)) * xk; /* L70: */ } /* L80: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { xk = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MIN */ i__5 = *n, i__3 = k + *kd; i__4 = min(i__5,i__3); for (i__ = k + 1; i__ <= i__4; ++i__) { work[i__] += (d__1 = ab_ref(i__ + 1 - k, k), abs( d__1)) * xk; /* L90: */ } work[k] += xk; /* L100: */ } } } } else { /* Compute abs(A')*abs(X) + abs(B). */ if (upper) { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.; /* Computing MAX */ i__4 = 1, i__5 = k - *kd; i__3 = k; for (i__ = max(i__4,i__5); i__ <= i__3; ++i__) { s += (d__1 = ab_ref(*kd + 1 + i__ - k, k), abs( d__1)) * (d__2 = x_ref(i__, j), abs(d__2)) ; /* L110: */ } work[k] += s; /* L120: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MAX */ i__3 = 1, i__4 = k - *kd; i__5 = k - 1; for (i__ = max(i__3,i__4); i__ <= i__5; ++i__) { s += (d__1 = ab_ref(*kd + 1 + i__ - k, k), abs( d__1)) * (d__2 = x_ref(i__, j), abs(d__2)) ; /* L130: */ } work[k] += s; /* L140: */ } } } else { if (nounit) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.; /* Computing MIN */ i__3 = *n, i__4 = k + *kd; i__5 = min(i__3,i__4); for (i__ = k; i__ <= i__5; ++i__) { s += (d__1 = ab_ref(i__ + 1 - k, k), abs(d__1)) * (d__2 = x_ref(i__, j), abs(d__2)); /* L150: */ } work[k] += s; /* L160: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = (d__1 = x_ref(k, j), abs(d__1)); /* Computing MIN */ i__3 = *n, i__4 = k + *kd; i__5 = min(i__3,i__4); for (i__ = k + 1; i__ <= i__5; ++i__) { s += (d__1 = ab_ref(i__ + 1 - k, k), abs(d__1)) * (d__2 = x_ref(i__, j), abs(d__2)); /* L170: */ } work[k] += s; /* L180: */ } } } } s = 0.; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { /* Computing MAX */ d__2 = s, d__3 = (d__1 = work[*n + i__], abs(d__1)) / work[ i__]; s = max(d__2,d__3); } else { /* Computing MAX */ d__2 = s, d__3 = ((d__1 = work[*n + i__], abs(d__1)) + safe1) / (work[i__] + safe1); s = max(d__2,d__3); } /* L190: */ } berr[j] = s; /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(op(A)))* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(op(A)) is the inverse of op(A) abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(op(A))*abs(X) + abs(B) is less than SAFE2. Use DLACON to estimate the infinity-norm of the matrix inv(op(A)) * diag(W), where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { work[i__] = (d__1 = work[*n + i__], abs(d__1)) + nz * eps * work[i__]; } else { work[i__] = (d__1 = work[*n + i__], abs(d__1)) + nz * eps * work[i__] + safe1; } /* L200: */ } kase = 0; L210: dlacon_(n, &work[(*n << 1) + 1], &work[*n + 1], &iwork[1], &ferr[j], & kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(op(A)'). */ dtbsv_(uplo, transt, diag, n, kd, &ab[ab_offset], ldab, &work[ *n + 1], &c__1); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L220: */ } } else { /* Multiply by inv(op(A))*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L230: */ } dtbsv_(uplo, trans, diag, n, kd, &ab[ab_offset], ldab, &work[* n + 1], &c__1); } goto L210; } /* Normalize error. */ lstres = 0.; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ d__2 = lstres, d__3 = (d__1 = x_ref(i__, j), abs(d__1)); lstres = max(d__2,d__3); /* L240: */ } if (lstres != 0.) { ferr[j] /= lstres; } /* L250: */ } return 0; /* End of DTBRFS */ } /* dtbrfs_ */
/* Subroutine */ int sgerfs_(char *trans, integer *n, integer *nrhs, real *a, integer *lda, real *af, integer *ldaf, integer *ipiv, real *b, integer *ldb, real *x, integer *ldx, real *ferr, real *berr, real * work, integer *iwork, integer *info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= SGERFS improves the computed solution to a system of linear equations and provides error bounds and backward error estimates for the solution. Arguments ========= TRANS (input) CHARACTER*1 Specifies the form of the system of equations: = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Conjugate transpose = Transpose) N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. A (input) REAL array, dimension (LDA,N) The original N-by-N matrix A. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,N). AF (input) REAL array, dimension (LDAF,N) The factors L and U from the factorization A = P*L*U as computed by SGETRF. LDAF (input) INTEGER The leading dimension of the array AF. LDAF >= max(1,N). IPIV (input) INTEGER array, dimension (N) The pivot indices from SGETRF; for 1<=i<=N, row i of the matrix was interchanged with row IPIV(i). B (input) REAL array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input/output) REAL array, dimension (LDX,NRHS) On entry, the solution matrix X, as computed by SGETRS. On exit, the improved solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) REAL array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) REAL array, dimension (3*N) IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value Internal Parameters =================== ITMAX is the maximum number of steps of iterative refinement. ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; static real c_b15 = -1.f; static real c_b17 = 1.f; /* System generated locals */ integer a_dim1, a_offset, af_dim1, af_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3; real r__1, r__2, r__3; /* Local variables */ static integer kase; static real safe1, safe2; static integer i__, j, k; static real s; extern logical lsame_(char *, char *); extern /* Subroutine */ int sgemv_(char *, integer *, integer *, real *, real *, integer *, real *, integer *, real *, real *, integer *); static integer count; extern /* Subroutine */ int scopy_(integer *, real *, integer *, real *, integer *), saxpy_(integer *, real *, real *, integer *, real *, integer *); static real xk; extern doublereal slamch_(char *); static integer nz; static real safmin; extern /* Subroutine */ int xerbla_(char *, integer *), slacon_( integer *, real *, real *, integer *, real *, integer *); static logical notran; extern /* Subroutine */ int sgetrs_(char *, integer *, integer *, real *, integer *, integer *, real *, integer *, integer *); static char transt[1]; static real lstres, eps; #define a_ref(a_1,a_2) a[(a_2)*a_dim1 + a_1] #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; af_dim1 = *ldaf; af_offset = 1 + af_dim1 * 1; af -= af_offset; --ipiv; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; notran = lsame_(trans, "N"); if (! notran && ! lsame_(trans, "T") && ! lsame_( trans, "C")) { *info = -1; } else if (*n < 0) { *info = -2; } else if (*nrhs < 0) { *info = -3; } else if (*lda < max(1,*n)) { *info = -5; } else if (*ldaf < max(1,*n)) { *info = -7; } else if (*ldb < max(1,*n)) { *info = -10; } else if (*ldx < max(1,*n)) { *info = -12; } if (*info != 0) { i__1 = -(*info); xerbla_("SGERFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.f; berr[j] = 0.f; /* L10: */ } return 0; } if (notran) { *(unsigned char *)transt = 'T'; } else { *(unsigned char *)transt = 'N'; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = *n + 1; eps = slamch_("Epsilon"); safmin = slamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { count = 1; lstres = 3.f; L20: /* Loop until stopping criterion is satisfied. Compute residual R = B - op(A) * X, where op(A) = A, A**T, or A**H, depending on TRANS. */ scopy_(n, &b_ref(1, j), &c__1, &work[*n + 1], &c__1); sgemv_(trans, n, n, &c_b15, &a[a_offset], lda, &x_ref(1, j), &c__1, & c_b17, &work[*n + 1], &c__1); /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[i__] = (r__1 = b_ref(i__, j), dabs(r__1)); /* L30: */ } /* Compute abs(op(A))*abs(X) + abs(B). */ if (notran) { i__2 = *n; for (k = 1; k <= i__2; ++k) { xk = (r__1 = x_ref(k, j), dabs(r__1)); i__3 = *n; for (i__ = 1; i__ <= i__3; ++i__) { work[i__] += (r__1 = a_ref(i__, k), dabs(r__1)) * xk; /* L40: */ } /* L50: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; i__3 = *n; for (i__ = 1; i__ <= i__3; ++i__) { s += (r__1 = a_ref(i__, k), dabs(r__1)) * (r__2 = x_ref( i__, j), dabs(r__2)); /* L60: */ } work[k] += s; /* L70: */ } } s = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { /* Computing MAX */ r__2 = s, r__3 = (r__1 = work[*n + i__], dabs(r__1)) / work[ i__]; s = dmax(r__2,r__3); } else { /* Computing MAX */ r__2 = s, r__3 = ((r__1 = work[*n + i__], dabs(r__1)) + safe1) / (work[i__] + safe1); s = dmax(r__2,r__3); } /* L80: */ } berr[j] = s; /* Test stopping criterion. Continue iterating if 1) The residual BERR(J) is larger than machine epsilon, and 2) BERR(J) decreased by at least a factor of 2 during the last iteration, and 3) At most ITMAX iterations tried. */ if (berr[j] > eps && berr[j] * 2.f <= lstres && count <= 5) { /* Update solution and try again. */ sgetrs_(trans, n, &c__1, &af[af_offset], ldaf, &ipiv[1], &work[*n + 1], n, info); saxpy_(n, &c_b17, &work[*n + 1], &c__1, &x_ref(1, j), &c__1); lstres = berr[j]; ++count; goto L20; } /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(op(A)))* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(op(A)) is the inverse of op(A) abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(op(A))*abs(X) + abs(B) is less than SAFE2. Use SLACON to estimate the infinity-norm of the matrix inv(op(A)) * diag(W), where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { work[i__] = (r__1 = work[*n + i__], dabs(r__1)) + nz * eps * work[i__]; } else { work[i__] = (r__1 = work[*n + i__], dabs(r__1)) + nz * eps * work[i__] + safe1; } /* L90: */ } kase = 0; L100: slacon_(n, &work[(*n << 1) + 1], &work[*n + 1], &iwork[1], &ferr[j], & kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(op(A)**T). */ sgetrs_(transt, n, &c__1, &af[af_offset], ldaf, &ipiv[1], & work[*n + 1], n, info); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L110: */ } } else { /* Multiply by inv(op(A))*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L120: */ } sgetrs_(trans, n, &c__1, &af[af_offset], ldaf, &ipiv[1], & work[*n + 1], n, info); } goto L100; } /* Normalize error. */ lstres = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ r__2 = lstres, r__3 = (r__1 = x_ref(i__, j), dabs(r__1)); lstres = dmax(r__2,r__3); /* L130: */ } if (lstres != 0.f) { ferr[j] /= lstres; } /* L140: */ } return 0; /* End of SGERFS */ } /* sgerfs_ */
/* Subroutine */ int dlarhs_(char *path, char *xtype, char *uplo, char *trans, integer *m, integer *n, integer *kl, integer *ku, integer *nrhs, doublereal *a, integer *lda, doublereal *x, integer *ldx, doublereal * b, integer *ldb, integer *iseed, integer *info) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; /* Builtin functions Subroutine */ int s_copy(char *, char *, ftnlen, ftnlen); /* Local variables */ static logical band; static char diag[1]; static logical tran; static integer j; extern /* Subroutine */ int dgemm_(char *, char *, integer *, integer *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *), dgbmv_(char *, integer *, integer *, integer *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *); extern logical lsame_(char *, char *); extern /* Subroutine */ int dsbmv_(char *, integer *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *), dtbmv_(char *, char *, char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), dtrmm_(char *, char *, char *, char *, integer *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *); static char c1[1], c2[2]; extern /* Subroutine */ int dspmv_(char *, integer *, doublereal *, doublereal *, doublereal *, integer *, doublereal *, doublereal *, integer *), dsymm_(char *, char *, integer *, integer *, doublereal *, doublereal *, integer *, doublereal *, integer *, doublereal *, doublereal *, integer *), dtpmv_( char *, char *, char *, integer *, doublereal *, doublereal *, integer *); static integer mb, nx; extern /* Subroutine */ int dlacpy_(char *, integer *, integer *, doublereal *, integer *, doublereal *, integer *), xerbla_(char *, integer *); extern logical lsamen_(integer *, char *, char *); extern /* Subroutine */ int dlarnv_(integer *, integer *, integer *, doublereal *); static logical notran, gen, tri, qrs, sym; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= DLARHS chooses a set of NRHS random solution vectors and sets up the right hand sides for the linear system op( A ) * X = B, where op( A ) may be A or A' (transpose of A). Arguments ========= PATH (input) CHARACTER*3 The type of the real matrix A. PATH may be given in any combination of upper and lower case. Valid types include xGE: General m x n matrix xGB: General banded matrix xPO: Symmetric positive definite, 2-D storage xPP: Symmetric positive definite packed xPB: Symmetric positive definite banded xSY: Symmetric indefinite, 2-D storage xSP: Symmetric indefinite packed xSB: Symmetric indefinite banded xTR: Triangular xTP: Triangular packed xTB: Triangular banded xQR: General m x n matrix xLQ: General m x n matrix xQL: General m x n matrix xRQ: General m x n matrix where the leading character indicates the precision. XTYPE (input) CHARACTER*1 Specifies how the exact solution X will be determined: = 'N': New solution; generate a random X. = 'C': Computed; use value of X on entry. UPLO (input) CHARACTER*1 Specifies whether the upper or lower triangular part of the matrix A is stored, if A is symmetric. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the operation applied to the matrix A. = 'N': System is A * x = b = 'T': System is A'* x = b = 'C': System is A'* x = b M (input) INTEGER The number or rows of the matrix A. M >= 0. N (input) INTEGER The number of columns of the matrix A. N >= 0. KL (input) INTEGER Used only if A is a band matrix; specifies the number of subdiagonals of A if A is a general band matrix or if A is symmetric or triangular and UPLO = 'L'; specifies the number of superdiagonals of A if A is symmetric or triangular and UPLO = 'U'. 0 <= KL <= M-1. KU (input) INTEGER Used only if A is a general band matrix or if A is triangular. If PATH = xGB, specifies the number of superdiagonals of A, and 0 <= KU <= N-1. If PATH = xTR, xTP, or xTB, specifies whether or not the matrix has unit diagonal: = 1: matrix has non-unit diagonal (default) = 2: matrix has unit diagonal NRHS (input) INTEGER The number of right hand side vectors in the system A*X = B. A (input) DOUBLE PRECISION array, dimension (LDA,N) The test matrix whose type is given by PATH. LDA (input) INTEGER The leading dimension of the array A. If PATH = xGB, LDA >= KL+KU+1. If PATH = xPB, xSB, xHB, or xTB, LDA >= KL+1. Otherwise, LDA >= max(1,M). X (input or output) DOUBLE PRECISION array, dimension(LDX,NRHS) On entry, if XTYPE = 'C' (for 'Computed'), then X contains the exact solution to the system of linear equations. On exit, if XTYPE = 'N' (for 'New'), then X is initialized with random values. LDX (input) INTEGER The leading dimension of the array X. If TRANS = 'N', LDX >= max(1,N); if TRANS = 'T', LDX >= max(1,M). B (output) DOUBLE PRECISION array, dimension (LDB,NRHS) The right hand side vector(s) for the system of equations, computed from B = op(A) * X, where op(A) is determined by TRANS. LDB (input) INTEGER The leading dimension of the array B. If TRANS = 'N', LDB >= max(1,M); if TRANS = 'T', LDB >= max(1,N). ISEED (input/output) INTEGER array, dimension (4) The seed vector for the random number generator (used in DLATMS). Modified on exit. INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value ===================================================================== Test the input parameters. Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --iseed; /* Function Body */ *info = 0; *(unsigned char *)c1 = *(unsigned char *)path; s_copy(c2, path + 1, (ftnlen)2, (ftnlen)2); tran = lsame_(trans, "T") || lsame_(trans, "C"); notran = ! tran; gen = lsame_(path + 1, "G"); qrs = lsame_(path + 1, "Q") || lsame_(path + 2, "Q"); sym = lsame_(path + 1, "P") || lsame_(path + 1, "S"); tri = lsame_(path + 1, "T"); band = lsame_(path + 2, "B"); if (! lsame_(c1, "Double precision")) { *info = -1; } else if (! (lsame_(xtype, "N") || lsame_(xtype, "C"))) { *info = -2; } else if ((sym || tri) && ! (lsame_(uplo, "U") || lsame_(uplo, "L"))) { *info = -3; } else if ((gen || qrs) && ! (tran || lsame_(trans, "N"))) { *info = -4; } else if (*m < 0) { *info = -5; } else if (*n < 0) { *info = -6; } else if (band && *kl < 0) { *info = -7; } else if (band && *ku < 0) { *info = -8; } else if (*nrhs < 0) { *info = -9; } else if (! band && *lda < max(1,*m) || band && (sym || tri) && *lda < * kl + 1 || band && gen && *lda < *kl + *ku + 1) { *info = -11; } else if (notran && *ldx < max(1,*n) || tran && *ldx < max(1,*m)) { *info = -13; } else if (notran && *ldb < max(1,*m) || tran && *ldb < max(1,*n)) { *info = -15; } if (*info != 0) { i__1 = -(*info); xerbla_("DLARHS", &i__1); return 0; } /* Initialize X to NRHS random vectors unless XTYPE = 'C'. */ if (tran) { nx = *m; mb = *n; } else { nx = *n; mb = *m; } if (! lsame_(xtype, "C")) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dlarnv_(&c__2, &iseed[1], n, &x_ref(1, j)); /* L10: */ } } /* Multiply X by op( A ) using an appropriate matrix multiply routine. */ if (lsamen_(&c__2, c2, "GE") || lsamen_(&c__2, c2, "QR") || lsamen_(&c__2, c2, "LQ") || lsamen_(&c__2, c2, "QL") || lsamen_(&c__2, c2, "RQ")) { /* General matrix */ dgemm_(trans, "N", &mb, nrhs, &nx, &c_b32, &a[a_offset], lda, &x[ x_offset], ldx, &c_b33, &b[b_offset], ldb); } else if (lsamen_(&c__2, c2, "PO") || lsamen_(& c__2, c2, "SY")) { /* Symmetric matrix, 2-D storage */ dsymm_("Left", uplo, n, nrhs, &c_b32, &a[a_offset], lda, &x[x_offset], ldx, &c_b33, &b[b_offset], ldb); } else if (lsamen_(&c__2, c2, "GB")) { /* General matrix, band storage */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dgbmv_(trans, &mb, &nx, kl, ku, &c_b32, &a[a_offset], lda, &x_ref( 1, j), &c__1, &c_b33, &b_ref(1, j), &c__1); /* L20: */ } } else if (lsamen_(&c__2, c2, "PB")) { /* Symmetric matrix, band storage */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dsbmv_(uplo, n, kl, &c_b32, &a[a_offset], lda, &x_ref(1, j), & c__1, &c_b33, &b_ref(1, j), &c__1); /* L30: */ } } else if (lsamen_(&c__2, c2, "PP") || lsamen_(& c__2, c2, "SP")) { /* Symmetric matrix, packed storage */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dspmv_(uplo, n, &c_b32, &a[a_offset], &x_ref(1, j), &c__1, &c_b33, &b_ref(1, j), &c__1); /* L40: */ } } else if (lsamen_(&c__2, c2, "TR")) { /* Triangular matrix. Note that for triangular matrices, KU = 1 => non-unit triangular KU = 2 => unit triangular */ dlacpy_("Full", n, nrhs, &x[x_offset], ldx, &b[b_offset], ldb); if (*ku == 2) { *(unsigned char *)diag = 'U'; } else { *(unsigned char *)diag = 'N'; } dtrmm_("Left", uplo, trans, diag, n, nrhs, &c_b32, &a[a_offset], lda, &b[b_offset], ldb) ; } else if (lsamen_(&c__2, c2, "TP")) { /* Triangular matrix, packed storage */ dlacpy_("Full", n, nrhs, &x[x_offset], ldx, &b[b_offset], ldb); if (*ku == 2) { *(unsigned char *)diag = 'U'; } else { *(unsigned char *)diag = 'N'; } i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dtpmv_(uplo, trans, diag, n, &a[a_offset], &b_ref(1, j), &c__1); /* L50: */ } } else if (lsamen_(&c__2, c2, "TB")) { /* Triangular matrix, banded storage */ dlacpy_("Full", n, nrhs, &x[x_offset], ldx, &b[b_offset], ldb); if (*ku == 2) { *(unsigned char *)diag = 'U'; } else { *(unsigned char *)diag = 'N'; } i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { dtbmv_(uplo, trans, diag, n, kl, &a[a_offset], lda, &b_ref(1, j), &c__1); /* L60: */ } } else { /* If PATH is none of the above, return with an error code. */ *info = -1; i__1 = -(*info); xerbla_("DLARHS", &i__1); } return 0; /* End of DLARHS */ } /* dlarhs_ */
/* Subroutine */ int ctpt02_(char *uplo, char *trans, char *diag, integer *n, integer *nrhs, complex *ap, complex *x, integer *ldx, complex *b, integer *ldb, complex *work, real *rwork, real *resid) { /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, i__1; real r__1, r__2; /* Local variables */ static integer j; extern logical lsame_(char *, char *); static real anorm, bnorm; extern /* Subroutine */ int ccopy_(integer *, complex *, integer *, complex *, integer *), caxpy_(integer *, complex *, complex *, integer *, complex *, integer *), ctpmv_(char *, char *, char *, integer *, complex *, complex *, integer *); static real xnorm; extern doublereal slamch_(char *), clantp_(char *, char *, char *, integer *, complex *, real *), scasum_( integer *, complex *, integer *); static real eps; #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= CTPT02 computes the residual for the computed solution to a triangular system of linear equations A*x = b, A**T *x = b, or A**H *x = b, when the triangular matrix A is stored in packed format. Here A**T denotes the transpose of A, A**H denotes the conjugate transpose of A, and x and b are N by NRHS matrices. The test ratio is the maximum over the number of right hand sides of the maximum over the number of right hand sides of norm(b - op(A)*x) / ( norm(op(A)) * norm(x) * EPS ), where op(A) denotes A, A**T, or A**H, and EPS is the machine epsilon. Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the operation applied to A. = 'N': A *x = b (No transpose) = 'T': A**T *x = b (Transpose) = 'C': A**H *x = b (Conjugate transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices X and B. NRHS >= 0. AP (input) COMPLEX array, dimension (N*(N+1)/2) The upper or lower triangular matrix A, packed columnwise in a linear array. The j-th column of A is stored in the array AP as follows: if UPLO = 'U', AP((j-1)*j/2 + i) = A(i,j) for 1<=i<=j; if UPLO = 'L', AP((j-1)*(n-j) + j*(j+1)/2 + i-j) = A(i,j) for j<=i<=n. X (input) COMPLEX array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). B (input) COMPLEX array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). WORK (workspace) COMPLEX array, dimension (N) RWORK (workspace) REAL array, dimension (N) RESID (output) REAL The maximum over the number of right hand sides of norm(op(A)*x - b) / ( norm(op(A)) * norm(x) * EPS ). ===================================================================== Quick exit if N = 0 or NRHS = 0 Parameter adjustments */ --ap; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --work; --rwork; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { *resid = 0.f; return 0; } /* Compute the 1-norm of A or A**H. */ if (lsame_(trans, "N")) { anorm = clantp_("1", uplo, diag, n, &ap[1], &rwork[1]); } else { anorm = clantp_("I", uplo, diag, n, &ap[1], &rwork[1]); } /* Exit with RESID = 1/EPS if ANORM = 0. */ eps = slamch_("Epsilon"); if (anorm <= 0.f) { *resid = 1.f / eps; return 0; } /* Compute the maximum over the number of right hand sides of norm(op(A)*x - b) / ( norm(op(A)) * norm(x) * EPS ). */ *resid = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ccopy_(n, &x_ref(1, j), &c__1, &work[1], &c__1); ctpmv_(uplo, trans, diag, n, &ap[1], &work[1], &c__1); caxpy_(n, &c_b12, &b_ref(1, j), &c__1, &work[1], &c__1); bnorm = scasum_(n, &work[1], &c__1); xnorm = scasum_(n, &x_ref(1, j), &c__1); if (xnorm <= 0.f) { *resid = 1.f / eps; } else { /* Computing MAX */ r__1 = *resid, r__2 = bnorm / anorm / xnorm / eps; *resid = dmax(r__1,r__2); } /* L10: */ } return 0; /* End of CTPT02 */ } /* ctpt02_ */
/* Subroutine */ int stbt05_(char *uplo, char *trans, char *diag, integer *n, integer *kd, integer *nrhs, real *ab, integer *ldab, real *b, integer *ldb, real *x, integer *ldx, real *xact, integer *ldxact, real *ferr, real *berr, real *reslts) { /* System generated locals */ integer ab_dim1, ab_offset, b_dim1, b_offset, x_dim1, x_offset, xact_dim1, xact_offset, i__1, i__2, i__3, i__4; real r__1, r__2, r__3; /* Local variables */ static real diff, axbi; static integer imax; static real unfl, ovfl; static logical unit; static integer i__, j, k; extern logical lsame_(char *, char *); static logical upper; static real xnorm; extern doublereal slamch_(char *); static integer nz; static real errbnd; extern integer isamax_(integer *, real *, integer *); static logical notran; static integer ifu; static real eps, tmp; #define xact_ref(a_1,a_2) xact[(a_2)*xact_dim1 + a_1] #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] #define ab_ref(a_1,a_2) ab[(a_2)*ab_dim1 + a_1] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= STBT05 tests the error bounds from iterative refinement for the computed solution to a system of equations A*X = B, where A is a triangular band matrix. RESLTS(1) = test of the error bound = norm(X - XACT) / ( norm(X) * FERR ) A large value is returned if this ratio is not less than one. RESLTS(2) = residual from the iterative refinement routine = the maximum of BERR / ( NZ*EPS + (*) ), where (*) = NZ*UNFL / (min_i (abs(A)*abs(X) +abs(b))_i ) and NZ = max. number of nonzeros in any row of A, plus 1 Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the matrix A is upper or lower triangular. = 'U': Upper triangular = 'L': Lower triangular TRANS (input) CHARACTER*1 Specifies the form of the system of equations. = 'N': A * X = B (No transpose) = 'T': A'* X = B (Transpose) = 'C': A'* X = B (Conjugate transpose = Transpose) DIAG (input) CHARACTER*1 Specifies whether or not the matrix A is unit triangular. = 'N': Non-unit triangular = 'U': Unit triangular N (input) INTEGER The number of rows of the matrices X, B, and XACT, and the order of the matrix A. N >= 0. KD (input) INTEGER The number of super-diagonals of the matrix A if UPLO = 'U', or the number of sub-diagonals if UPLO = 'L'. KD >= 0. NRHS (input) INTEGER The number of columns of the matrices X, B, and XACT. NRHS >= 0. AB (input) REAL array, dimension (LDAB,N) The upper or lower triangular band matrix A, stored in the first kd+1 rows of the array. The j-th column of A is stored in the j-th column of the array AB as follows: if UPLO = 'U', AB(kd+1+i-j,j) = A(i,j) for max(1,j-kd)<=i<=j; if UPLO = 'L', AB(1+i-j,j) = A(i,j) for j<=i<=min(n,j+kd). If DIAG = 'U', the diagonal elements of A are not referenced and are assumed to be 1. LDAB (input) INTEGER The leading dimension of the array AB. LDAB >= KD+1. B (input) REAL array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input) REAL array, dimension (LDX,NRHS) The computed solution vectors. Each vector is stored as a column of the matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). XACT (input) REAL array, dimension (LDX,NRHS) The exact solution vectors. Each vector is stored as a column of the matrix XACT. LDXACT (input) INTEGER The leading dimension of the array XACT. LDXACT >= max(1,N). FERR (input) REAL array, dimension (NRHS) The estimated forward error bounds for each solution vector X. If XTRUE is the true solution, FERR bounds the magnitude of the largest entry in (X - XTRUE) divided by the magnitude of the largest entry in X. BERR (input) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector (i.e., the smallest relative change in any entry of A or B that makes X an exact solution). RESLTS (output) REAL array, dimension (2) The maximum over the NRHS solution vectors of the ratios: RESLTS(1) = norm(X - XACT) / ( norm(X) * FERR ) RESLTS(2) = BERR / ( NZ*EPS + (*) ) ===================================================================== Quick exit if N = 0 or NRHS = 0. Parameter adjustments */ ab_dim1 = *ldab; ab_offset = 1 + ab_dim1 * 1; ab -= ab_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; xact_dim1 = *ldxact; xact_offset = 1 + xact_dim1 * 1; xact -= xact_offset; --ferr; --berr; --reslts; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { reslts[1] = 0.f; reslts[2] = 0.f; return 0; } eps = slamch_("Epsilon"); unfl = slamch_("Safe minimum"); ovfl = 1.f / unfl; upper = lsame_(uplo, "U"); notran = lsame_(trans, "N"); unit = lsame_(diag, "U"); /* Computing MIN */ i__1 = *kd, i__2 = *n - 1; nz = min(i__1,i__2) + 1; /* Test 1: Compute the maximum of norm(X - XACT) / ( norm(X) * FERR ) over all the vectors X and XACT using the infinity-norm. */ errbnd = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { imax = isamax_(n, &x_ref(1, j), &c__1); /* Computing MAX */ r__2 = (r__1 = x_ref(imax, j), dabs(r__1)); xnorm = dmax(r__2,unfl); diff = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ r__2 = diff, r__3 = (r__1 = x_ref(i__, j) - xact_ref(i__, j), dabs(r__1)); diff = dmax(r__2,r__3); /* L10: */ } if (xnorm > 1.f) { goto L20; } else if (diff <= ovfl * xnorm) { goto L20; } else { errbnd = 1.f / eps; goto L30; } L20: if (diff / xnorm <= ferr[j]) { /* Computing MAX */ r__1 = errbnd, r__2 = diff / xnorm / ferr[j]; errbnd = dmax(r__1,r__2); } else { errbnd = 1.f / eps; } L30: ; } reslts[1] = errbnd; /* Test 2: Compute the maximum of BERR / ( NZ*EPS + (*) ), where (*) = NZ*UNFL / (min_i (abs(A)*abs(X) +abs(b))_i ) */ ifu = 0; if (unit) { ifu = 1; } i__1 = *nrhs; for (k = 1; k <= i__1; ++k) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { tmp = (r__1 = b_ref(i__, k), dabs(r__1)); if (upper) { if (! notran) { /* Computing MAX */ i__3 = i__ - *kd; i__4 = i__ - ifu; for (j = max(i__3,1); j <= i__4; ++j) { tmp += (r__1 = ab_ref(*kd + 1 - i__ + j, i__), dabs( r__1)) * (r__2 = x_ref(j, k), dabs(r__2)); /* L40: */ } if (unit) { tmp += (r__1 = x_ref(i__, k), dabs(r__1)); } } else { if (unit) { tmp += (r__1 = x_ref(i__, k), dabs(r__1)); } /* Computing MIN */ i__3 = i__ + *kd; i__4 = min(i__3,*n); for (j = i__ + ifu; j <= i__4; ++j) { tmp += (r__1 = ab_ref(*kd + 1 + i__ - j, j), dabs( r__1)) * (r__2 = x_ref(j, k), dabs(r__2)); /* L50: */ } } } else { if (notran) { /* Computing MAX */ i__4 = i__ - *kd; i__3 = i__ - ifu; for (j = max(i__4,1); j <= i__3; ++j) { tmp += (r__1 = ab_ref(i__ + 1 - j, j), dabs(r__1)) * ( r__2 = x_ref(j, k), dabs(r__2)); /* L60: */ } if (unit) { tmp += (r__1 = x_ref(i__, k), dabs(r__1)); } } else { if (unit) { tmp += (r__1 = x_ref(i__, k), dabs(r__1)); } /* Computing MIN */ i__4 = i__ + *kd; i__3 = min(i__4,*n); for (j = i__ + ifu; j <= i__3; ++j) { tmp += (r__1 = ab_ref(j + 1 - i__, i__), dabs(r__1)) * (r__2 = x_ref(j, k), dabs(r__2)); /* L70: */ } } } if (i__ == 1) { axbi = tmp; } else { axbi = dmin(axbi,tmp); } /* L80: */ } /* Computing MAX */ r__1 = axbi, r__2 = nz * unfl; tmp = berr[k] / (nz * eps + nz * unfl / dmax(r__1,r__2)); if (k == 1) { reslts[2] = tmp; } else { reslts[2] = dmax(reslts[2],tmp); } /* L90: */ } return 0; /* End of STBT05 */ } /* stbt05_ */
/* Subroutine */ int cgbt05_(char *trans, integer *n, integer *kl, integer * ku, integer *nrhs, complex *ab, integer *ldab, complex *b, integer * ldb, complex *x, integer *ldx, complex *xact, integer *ldxact, real * ferr, real *berr, real *reslts) { /* System generated locals */ integer ab_dim1, ab_offset, b_dim1, b_offset, x_dim1, x_offset, xact_dim1, xact_offset, i__1, i__2, i__3, i__4, i__5; real r__1, r__2, r__3, r__4; complex q__1, q__2; /* Builtin functions */ double r_imag(complex *); /* Local variables */ static real diff, axbi; static integer imax; static real unfl, ovfl; static integer i__, j, k; extern logical lsame_(char *, char *); static real xnorm; extern integer icamax_(integer *, complex *, integer *); extern doublereal slamch_(char *); static integer nz; static real errbnd; static logical notran; static real eps, tmp; #define xact_subscr(a_1,a_2) (a_2)*xact_dim1 + a_1 #define xact_ref(a_1,a_2) xact[xact_subscr(a_1,a_2)] #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] #define ab_subscr(a_1,a_2) (a_2)*ab_dim1 + a_1 #define ab_ref(a_1,a_2) ab[ab_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University February 29, 1992 Purpose ======= CGBT05 tests the error bounds from iterative refinement for the computed solution to a system of equations op(A)*X = B, where A is a general band matrix of order n with kl subdiagonals and ku superdiagonals and op(A) = A or A**T, depending on TRANS. RESLTS(1) = test of the error bound = norm(X - XACT) / ( norm(X) * FERR ) A large value is returned if this ratio is not less than one. RESLTS(2) = residual from the iterative refinement routine = the maximum of BERR / ( NZ*EPS + (*) ), where (*) = NZ*UNFL / (min_i (abs(op(A))*abs(X) +abs(b))_i ) and NZ = max. number of nonzeros in any row of A, plus 1 Arguments ========= TRANS (input) CHARACTER*1 Specifies the form of the system of equations. = 'N': A * X = B (No transpose) = 'T': A**T * X = B (Transpose) = 'C': A**H * X = B (Conjugate transpose = Transpose) N (input) INTEGER The number of rows of the matrices X, B, and XACT, and the order of the matrix A. N >= 0. KL (input) INTEGER The number of subdiagonals within the band of A. KL >= 0. KU (input) INTEGER The number of superdiagonals within the band of A. KU >= 0. NRHS (input) INTEGER The number of columns of the matrices X, B, and XACT. NRHS >= 0. AB (input) COMPLEX array, dimension (LDAB,N) The original band matrix A, stored in rows 1 to KL+KU+1. The j-th column of A is stored in the j-th column of the array AB as follows: AB(ku+1+i-j,j) = A(i,j) for max(1,j-ku)<=i<=min(n,j+kl). LDAB (input) INTEGER The leading dimension of the array AB. LDAB >= KL+KU+1. B (input) COMPLEX array, dimension (LDB,NRHS) The right hand side vectors for the system of linear equations. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input) COMPLEX array, dimension (LDX,NRHS) The computed solution vectors. Each vector is stored as a column of the matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). XACT (input) COMPLEX array, dimension (LDX,NRHS) The exact solution vectors. Each vector is stored as a column of the matrix XACT. LDXACT (input) INTEGER The leading dimension of the array XACT. LDXACT >= max(1,N). FERR (input) REAL array, dimension (NRHS) The estimated forward error bounds for each solution vector X. If XTRUE is the true solution, FERR bounds the magnitude of the largest entry in (X - XTRUE) divided by the magnitude of the largest entry in X. BERR (input) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector (i.e., the smallest relative change in any entry of A or B that makes X an exact solution). RESLTS (output) REAL array, dimension (2) The maximum over the NRHS solution vectors of the ratios: RESLTS(1) = norm(X - XACT) / ( norm(X) * FERR ) RESLTS(2) = BERR / ( NZ*EPS + (*) ) ===================================================================== Quick exit if N = 0 or NRHS = 0. Parameter adjustments */ ab_dim1 = *ldab; ab_offset = 1 + ab_dim1 * 1; ab -= ab_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; xact_dim1 = *ldxact; xact_offset = 1 + xact_dim1 * 1; xact -= xact_offset; --ferr; --berr; --reslts; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { reslts[1] = 0.f; reslts[2] = 0.f; return 0; } eps = slamch_("Epsilon"); unfl = slamch_("Safe minimum"); ovfl = 1.f / unfl; notran = lsame_(trans, "N"); /* Computing MIN */ i__1 = *kl + *ku + 2, i__2 = *n + 1; nz = min(i__1,i__2); /* Test 1: Compute the maximum of norm(X - XACT) / ( norm(X) * FERR ) over all the vectors X and XACT using the infinity-norm. */ errbnd = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { imax = icamax_(n, &x_ref(1, j), &c__1); /* Computing MAX */ i__2 = x_subscr(imax, j); r__3 = (r__1 = x[i__2].r, dabs(r__1)) + (r__2 = r_imag(&x_ref(imax, j) ), dabs(r__2)); xnorm = dmax(r__3,unfl); diff = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = x_subscr(i__, j); i__4 = xact_subscr(i__, j); q__2.r = x[i__3].r - xact[i__4].r, q__2.i = x[i__3].i - xact[i__4] .i; q__1.r = q__2.r, q__1.i = q__2.i; /* Computing MAX */ r__3 = diff, r__4 = (r__1 = q__1.r, dabs(r__1)) + (r__2 = r_imag(& q__1), dabs(r__2)); diff = dmax(r__3,r__4); /* L10: */ } if (xnorm > 1.f) { goto L20; } else if (diff <= ovfl * xnorm) { goto L20; } else { errbnd = 1.f / eps; goto L30; } L20: if (diff / xnorm <= ferr[j]) { /* Computing MAX */ r__1 = errbnd, r__2 = diff / xnorm / ferr[j]; errbnd = dmax(r__1,r__2); } else { errbnd = 1.f / eps; } L30: ; } reslts[1] = errbnd; /* Test 2: Compute the maximum of BERR / ( NZ*EPS + (*) ), where (*) = NZ*UNFL / (min_i (abs(op(A))*abs(X) +abs(b))_i ) */ i__1 = *nrhs; for (k = 1; k <= i__1; ++k) { i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = b_subscr(i__, k); tmp = (r__1 = b[i__3].r, dabs(r__1)) + (r__2 = r_imag(&b_ref(i__, k)), dabs(r__2)); if (notran) { /* Computing MAX */ i__3 = i__ - *kl; /* Computing MIN */ i__5 = i__ + *ku; i__4 = min(i__5,*n); for (j = max(i__3,1); j <= i__4; ++j) { i__3 = ab_subscr(*ku + 1 + i__ - j, j); i__5 = x_subscr(j, k); tmp += ((r__1 = ab[i__3].r, dabs(r__1)) + (r__2 = r_imag(& ab_ref(*ku + 1 + i__ - j, j)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(& x_ref(j, k)), dabs(r__4))); /* L40: */ } } else { /* Computing MAX */ i__4 = i__ - *ku; /* Computing MIN */ i__5 = i__ + *kl; i__3 = min(i__5,*n); for (j = max(i__4,1); j <= i__3; ++j) { i__4 = ab_subscr(*ku + 1 + j - i__, i__); i__5 = x_subscr(j, k); tmp += ((r__1 = ab[i__4].r, dabs(r__1)) + (r__2 = r_imag(& ab_ref(*ku + 1 + j - i__, i__)), dabs(r__2))) * (( r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(& x_ref(j, k)), dabs(r__4))); /* L50: */ } } if (i__ == 1) { axbi = tmp; } else { axbi = dmin(axbi,tmp); } /* L60: */ } /* Computing MAX */ r__1 = axbi, r__2 = nz * unfl; tmp = berr[k] / (nz * eps + nz * unfl / dmax(r__1,r__2)); if (k == 1) { reslts[2] = tmp; } else { reslts[2] = dmax(reslts[2],tmp); } /* L70: */ } return 0; /* End of CGBT05 */ } /* cgbt05_ */
/* Subroutine */ int cpbt02_(char *uplo, integer *n, integer *kd, integer * nrhs, complex *a, integer *lda, complex *x, integer *ldx, complex *b, integer *ldb, real *rwork, real *resid) { /* System generated locals */ integer a_dim1, a_offset, b_dim1, b_offset, x_dim1, x_offset, i__1; real r__1, r__2; complex q__1; /* Local variables */ static integer j; extern /* Subroutine */ int chbmv_(char *, integer *, integer *, complex * , complex *, integer *, complex *, integer *, complex *, complex * , integer *); static real anorm, bnorm, xnorm; extern doublereal clanhb_(char *, char *, integer *, integer *, complex *, integer *, real *), slamch_(char *), scasum_(integer *, complex *, integer *); static real eps; #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] /* -- LAPACK test routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= CPBT02 computes the residual for a solution of a Hermitian banded system of equations A*x = b: RESID = norm( B - A*X ) / ( norm(A) * norm(X) * EPS) where EPS is the machine precision. Arguments ========= UPLO (input) CHARACTER*1 Specifies whether the upper or lower triangular part of the Hermitian matrix A is stored: = 'U': Upper triangular = 'L': Lower triangular N (input) INTEGER The number of rows and columns of the matrix A. N >= 0. KD (input) INTEGER The number of super-diagonals of the matrix A if UPLO = 'U', or the number of sub-diagonals if UPLO = 'L'. KD >= 0. A (input) COMPLEX array, dimension (LDA,N) The original Hermitian band matrix A. If UPLO = 'U', the upper triangular part of A is stored as a band matrix; if UPLO = 'L', the lower triangular part of A is stored. The columns of the appropriate triangle are stored in the columns of A and the diagonals of the triangle are stored in the rows of A. See CPBTRF for further details. LDA (input) INTEGER. The leading dimension of the array A. LDA >= max(1,KD+1). X (input) COMPLEX array, dimension (LDX,NRHS) The computed solution vectors for the system of linear equations. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). B (input/output) COMPLEX array, dimension (LDB,NRHS) On entry, the right hand side vectors for the system of linear equations. On exit, B is overwritten with the difference B - A*X. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). RWORK (workspace) REAL array, dimension (N) RESID (output) REAL The maximum over the number of right hand sides of norm(B - A*X) / ( norm(A) * norm(X) * EPS ). ===================================================================== Quick exit if N = 0 or NRHS = 0. Parameter adjustments */ a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; --rwork; /* Function Body */ if (*n <= 0 || *nrhs <= 0) { *resid = 0.f; return 0; } /* Exit with RESID = 1/EPS if ANORM = 0. */ eps = slamch_("Epsilon"); anorm = clanhb_("1", uplo, n, kd, &a[a_offset], lda, &rwork[1]); if (anorm <= 0.f) { *resid = 1.f / eps; return 0; } /* Compute B - A*X */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { q__1.r = -1.f, q__1.i = 0.f; chbmv_(uplo, n, kd, &q__1, &a[a_offset], lda, &x_ref(1, j), &c__1, & c_b1, &b_ref(1, j), &c__1); /* L10: */ } /* Compute the maximum over the number of right hand sides of norm( B - A*X ) / ( norm(A) * norm(X) * EPS ) */ *resid = 0.f; i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { bnorm = scasum_(n, &b_ref(1, j), &c__1); xnorm = scasum_(n, &x_ref(1, j), &c__1); if (xnorm <= 0.f) { *resid = 1.f / eps; } else { /* Computing MAX */ r__1 = *resid, r__2 = bnorm / anorm / xnorm / eps; *resid = dmax(r__1,r__2); } /* L20: */ } return 0; /* End of CPBT02 */ } /* cpbt02_ */
/* Subroutine */ int cherfs_(char *uplo, integer *n, integer *nrhs, complex * a, integer *lda, complex *af, integer *ldaf, integer *ipiv, complex * b, integer *ldb, complex *x, integer *ldx, real *ferr, real *berr, complex *work, real *rwork, integer *info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= CHERFS improves the computed solution to a system of linear equations when the coefficient matrix is Hermitian indefinite, and provides error bounds and backward error estimates for the solution. Arguments ========= UPLO (input) CHARACTER*1 = 'U': Upper triangle of A is stored; = 'L': Lower triangle of A is stored. N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. A (input) COMPLEX array, dimension (LDA,N) The Hermitian matrix A. If UPLO = 'U', the leading N-by-N upper triangular part of A contains the upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced. If UPLO = 'L', the leading N-by-N lower triangular part of A contains the lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced. LDA (input) INTEGER The leading dimension of the array A. LDA >= max(1,N). AF (input) COMPLEX array, dimension (LDAF,N) The factored form of the matrix A. AF contains the block diagonal matrix D and the multipliers used to obtain the factor U or L from the factorization A = U*D*U**H or A = L*D*L**H as computed by CHETRF. LDAF (input) INTEGER The leading dimension of the array AF. LDAF >= max(1,N). IPIV (input) INTEGER array, dimension (N) Details of the interchanges and the block structure of D as determined by CHETRF. B (input) COMPLEX array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input/output) COMPLEX array, dimension (LDX,NRHS) On entry, the solution matrix X, as computed by CHETRS. On exit, the improved solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) REAL array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) COMPLEX array, dimension (2*N) RWORK (workspace) REAL array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value Internal Parameters =================== ITMAX is the maximum number of steps of iterative refinement. ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static complex c_b1 = {1.f,0.f}; static integer c__1 = 1; /* System generated locals */ integer a_dim1, a_offset, af_dim1, af_offset, b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3, i__4, i__5; real r__1, r__2, r__3, r__4; complex q__1; /* Builtin functions */ double r_imag(complex *); /* Local variables */ static integer kase; static real safe1, safe2; static integer i__, j, k; static real s; extern logical lsame_(char *, char *); extern /* Subroutine */ int chemv_(char *, integer *, complex *, complex * , integer *, complex *, integer *, complex *, complex *, integer * ), ccopy_(integer *, complex *, integer *, complex *, integer *), caxpy_(integer *, complex *, complex *, integer *, complex *, integer *); static integer count; static logical upper; extern /* Subroutine */ int clacon_(integer *, complex *, complex *, real *, integer *); static real xk; extern doublereal slamch_(char *); static integer nz; static real safmin; extern /* Subroutine */ int xerbla_(char *, integer *), chetrs_( char *, integer *, integer *, complex *, integer *, integer *, complex *, integer *, integer *); static real lstres, eps; #define a_subscr(a_1,a_2) (a_2)*a_dim1 + a_1 #define a_ref(a_1,a_2) a[a_subscr(a_1,a_2)] #define b_subscr(a_1,a_2) (a_2)*b_dim1 + a_1 #define b_ref(a_1,a_2) b[b_subscr(a_1,a_2)] #define x_subscr(a_1,a_2) (a_2)*x_dim1 + a_1 #define x_ref(a_1,a_2) x[x_subscr(a_1,a_2)] a_dim1 = *lda; a_offset = 1 + a_dim1 * 1; a -= a_offset; af_dim1 = *ldaf; af_offset = 1 + af_dim1 * 1; af -= af_offset; --ipiv; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --rwork; /* Function Body */ *info = 0; upper = lsame_(uplo, "U"); if (! upper && ! lsame_(uplo, "L")) { *info = -1; } else if (*n < 0) { *info = -2; } else if (*nrhs < 0) { *info = -3; } else if (*lda < max(1,*n)) { *info = -5; } else if (*ldaf < max(1,*n)) { *info = -7; } else if (*ldb < max(1,*n)) { *info = -10; } else if (*ldx < max(1,*n)) { *info = -12; } if (*info != 0) { i__1 = -(*info); xerbla_("CHERFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.f; berr[j] = 0.f; /* L10: */ } return 0; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = *n + 1; eps = slamch_("Epsilon"); safmin = slamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { count = 1; lstres = 3.f; L20: /* Loop until stopping criterion is satisfied. Compute residual R = B - A * X */ ccopy_(n, &b_ref(1, j), &c__1, &work[1], &c__1); q__1.r = -1.f, q__1.i = 0.f; chemv_(uplo, n, &q__1, &a[a_offset], lda, &x_ref(1, j), &c__1, &c_b1, &work[1], &c__1); /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(A)*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = b_subscr(i__, j); rwork[i__] = (r__1 = b[i__3].r, dabs(r__1)) + (r__2 = r_imag(& b_ref(i__, j)), dabs(r__2)); /* L30: */ } /* Compute abs(A)*abs(X) + abs(B). */ if (upper) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(&x_ref(k, j)), dabs(r__2)); i__3 = k - 1; for (i__ = 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * xk; i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(& a_ref(i__, k)), dabs(r__2))) * ((r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L40: */ } i__3 = a_subscr(k, k); rwork[k] = rwork[k] + (r__1 = a[i__3].r, dabs(r__1)) * xk + s; /* L50: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; i__3 = x_subscr(k, j); xk = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(&x_ref(k, j)), dabs(r__2)); i__3 = a_subscr(k, k); rwork[k] += (r__1 = a[i__3].r, dabs(r__1)) * xk; i__3 = *n; for (i__ = k + 1; i__ <= i__3; ++i__) { i__4 = a_subscr(i__, k); rwork[i__] += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(&a_ref(i__, k)), dabs(r__2))) * xk; i__4 = a_subscr(i__, k); i__5 = x_subscr(i__, j); s += ((r__1 = a[i__4].r, dabs(r__1)) + (r__2 = r_imag(& a_ref(i__, k)), dabs(r__2))) * ((r__3 = x[i__5].r, dabs(r__3)) + (r__4 = r_imag(&x_ref(i__, j)), dabs(r__4))); /* L60: */ } rwork[k] += s; /* L70: */ } } s = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (rwork[i__] > safe2) { /* Computing MAX */ i__3 = i__; r__3 = s, r__4 = ((r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2))) / rwork[i__]; s = dmax(r__3,r__4); } else { /* Computing MAX */ i__3 = i__; r__3 = s, r__4 = ((r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + safe1) / (rwork[i__] + safe1); s = dmax(r__3,r__4); } /* L80: */ } berr[j] = s; /* Test stopping criterion. Continue iterating if 1) The residual BERR(J) is larger than machine epsilon, and 2) BERR(J) decreased by at least a factor of 2 during the last iteration, and 3) At most ITMAX iterations tried. */ if (berr[j] > eps && berr[j] * 2.f <= lstres && count <= 5) { /* Update solution and try again. */ chetrs_(uplo, n, &c__1, &af[af_offset], ldaf, &ipiv[1], &work[1], n, info); caxpy_(n, &c_b1, &work[1], &c__1, &x_ref(1, j), &c__1); lstres = berr[j]; ++count; goto L20; } /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(A))* ( abs(R) + NZ*EPS*( abs(A)*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(A) is the inverse of A abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(A)*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(A)*abs(X) + abs(B) is less than SAFE2. Use CLACON to estimate the infinity-norm of the matrix inv(A) * diag(W), where W = abs(R) + NZ*EPS*( abs(A)*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (rwork[i__] > safe2) { i__3 = i__; rwork[i__] = (r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + nz * eps * rwork[ i__]; } else { i__3 = i__; rwork[i__] = (r__1 = work[i__3].r, dabs(r__1)) + (r__2 = r_imag(&work[i__]), dabs(r__2)) + nz * eps * rwork[ i__] + safe1; } /* L90: */ } kase = 0; L100: clacon_(n, &work[*n + 1], &work[1], &ferr[j], &kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(A'). */ chetrs_(uplo, n, &c__1, &af[af_offset], ldaf, &ipiv[1], &work[ 1], n, info); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = i__; i__4 = i__; i__5 = i__; q__1.r = rwork[i__4] * work[i__5].r, q__1.i = rwork[i__4] * work[i__5].i; work[i__3].r = q__1.r, work[i__3].i = q__1.i; /* L110: */ } } else if (kase == 2) { /* Multiply by inv(A)*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { i__3 = i__; i__4 = i__; i__5 = i__; q__1.r = rwork[i__4] * work[i__5].r, q__1.i = rwork[i__4] * work[i__5].i; work[i__3].r = q__1.r, work[i__3].i = q__1.i; /* L120: */ } chetrs_(uplo, n, &c__1, &af[af_offset], ldaf, &ipiv[1], &work[ 1], n, info); } goto L100; } /* Normalize error. */ lstres = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ i__3 = x_subscr(i__, j); r__3 = lstres, r__4 = (r__1 = x[i__3].r, dabs(r__1)) + (r__2 = r_imag(&x_ref(i__, j)), dabs(r__2)); lstres = dmax(r__3,r__4); /* L130: */ } if (lstres != 0.f) { ferr[j] /= lstres; } /* L140: */ } return 0; /* End of CHERFS */ } /* cherfs_ */
/* Subroutine */ int spprfs_(char *uplo, integer *n, integer *nrhs, real *ap, real *afp, real *b, integer *ldb, real *x, integer *ldx, real *ferr, real *berr, real *work, integer *iwork, integer *info) { /* -- LAPACK routine (version 3.0) -- Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., Courant Institute, Argonne National Lab, and Rice University September 30, 1994 Purpose ======= SPPRFS improves the computed solution to a system of linear equations when the coefficient matrix is symmetric positive definite and packed, and provides error bounds and backward error estimates for the solution. Arguments ========= UPLO (input) CHARACTER*1 = 'U': Upper triangle of A is stored; = 'L': Lower triangle of A is stored. N (input) INTEGER The order of the matrix A. N >= 0. NRHS (input) INTEGER The number of right hand sides, i.e., the number of columns of the matrices B and X. NRHS >= 0. AP (input) REAL array, dimension (N*(N+1)/2) The upper or lower triangle of the symmetric matrix A, packed columnwise in a linear array. The j-th column of A is stored in the array AP as follows: if UPLO = 'U', AP(i + (j-1)*j/2) = A(i,j) for 1<=i<=j; if UPLO = 'L', AP(i + (j-1)*(2n-j)/2) = A(i,j) for j<=i<=n. AFP (input) REAL array, dimension (N*(N+1)/2) The triangular factor U or L from the Cholesky factorization A = U**T*U or A = L*L**T, as computed by SPPTRF/CPPTRF, packed columnwise in a linear array in the same format as A (see AP). B (input) REAL array, dimension (LDB,NRHS) The right hand side matrix B. LDB (input) INTEGER The leading dimension of the array B. LDB >= max(1,N). X (input/output) REAL array, dimension (LDX,NRHS) On entry, the solution matrix X, as computed by SPPTRS. On exit, the improved solution matrix X. LDX (input) INTEGER The leading dimension of the array X. LDX >= max(1,N). FERR (output) REAL array, dimension (NRHS) The estimated forward error bound for each solution vector X(j) (the j-th column of the solution matrix X). If XTRUE is the true solution corresponding to X(j), FERR(j) is an estimated upper bound for the magnitude of the largest element in (X(j) - XTRUE) divided by the magnitude of the largest element in X(j). The estimate is as reliable as the estimate for RCOND, and is almost always a slight overestimate of the true error. BERR (output) REAL array, dimension (NRHS) The componentwise relative backward error of each solution vector X(j) (i.e., the smallest relative change in any element of A or B that makes X(j) an exact solution). WORK (workspace) REAL array, dimension (3*N) IWORK (workspace) INTEGER array, dimension (N) INFO (output) INTEGER = 0: successful exit < 0: if INFO = -i, the i-th argument had an illegal value Internal Parameters =================== ITMAX is the maximum number of steps of iterative refinement. ===================================================================== Test the input parameters. Parameter adjustments */ /* Table of constant values */ static integer c__1 = 1; static real c_b12 = -1.f; static real c_b14 = 1.f; /* System generated locals */ integer b_dim1, b_offset, x_dim1, x_offset, i__1, i__2, i__3; real r__1, r__2, r__3; /* Local variables */ static integer kase; static real safe1, safe2; static integer i__, j, k; static real s; extern logical lsame_(char *, char *); static integer count; static logical upper; extern /* Subroutine */ int scopy_(integer *, real *, integer *, real *, integer *), saxpy_(integer *, real *, real *, integer *, real *, integer *), sspmv_(char *, integer *, real *, real *, real *, integer *, real *, real *, integer *); static integer ik, kk; static real xk; extern doublereal slamch_(char *); static integer nz; static real safmin; extern /* Subroutine */ int xerbla_(char *, integer *), slacon_( integer *, real *, real *, integer *, real *, integer *); static real lstres; extern /* Subroutine */ int spptrs_(char *, integer *, integer *, real *, real *, integer *, integer *); static real eps; #define b_ref(a_1,a_2) b[(a_2)*b_dim1 + a_1] #define x_ref(a_1,a_2) x[(a_2)*x_dim1 + a_1] --ap; --afp; b_dim1 = *ldb; b_offset = 1 + b_dim1 * 1; b -= b_offset; x_dim1 = *ldx; x_offset = 1 + x_dim1 * 1; x -= x_offset; --ferr; --berr; --work; --iwork; /* Function Body */ *info = 0; upper = lsame_(uplo, "U"); if (! upper && ! lsame_(uplo, "L")) { *info = -1; } else if (*n < 0) { *info = -2; } else if (*nrhs < 0) { *info = -3; } else if (*ldb < max(1,*n)) { *info = -7; } else if (*ldx < max(1,*n)) { *info = -9; } if (*info != 0) { i__1 = -(*info); xerbla_("SPPRFS", &i__1); return 0; } /* Quick return if possible */ if (*n == 0 || *nrhs == 0) { i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { ferr[j] = 0.f; berr[j] = 0.f; /* L10: */ } return 0; } /* NZ = maximum number of nonzero elements in each row of A, plus 1 */ nz = *n + 1; eps = slamch_("Epsilon"); safmin = slamch_("Safe minimum"); safe1 = nz * safmin; safe2 = safe1 / eps; /* Do for each right hand side */ i__1 = *nrhs; for (j = 1; j <= i__1; ++j) { count = 1; lstres = 3.f; L20: /* Loop until stopping criterion is satisfied. Compute residual R = B - A * X */ scopy_(n, &b_ref(1, j), &c__1, &work[*n + 1], &c__1); sspmv_(uplo, n, &c_b12, &ap[1], &x_ref(1, j), &c__1, &c_b14, &work[*n + 1], &c__1); /* Compute componentwise relative backward error from formula max(i) ( abs(R(i)) / ( abs(A)*abs(X) + abs(B) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. If the i-th component of the denominator is less than SAFE2, then SAFE1 is added to the i-th components of the numerator and denominator before dividing. */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[i__] = (r__1 = b_ref(i__, j), dabs(r__1)); /* L30: */ } /* Compute abs(A)*abs(X) + abs(B). */ kk = 1; if (upper) { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; xk = (r__1 = x_ref(k, j), dabs(r__1)); ik = kk; i__3 = k - 1; for (i__ = 1; i__ <= i__3; ++i__) { work[i__] += (r__1 = ap[ik], dabs(r__1)) * xk; s += (r__1 = ap[ik], dabs(r__1)) * (r__2 = x_ref(i__, j), dabs(r__2)); ++ik; /* L40: */ } work[k] = work[k] + (r__1 = ap[kk + k - 1], dabs(r__1)) * xk + s; kk += k; /* L50: */ } } else { i__2 = *n; for (k = 1; k <= i__2; ++k) { s = 0.f; xk = (r__1 = x_ref(k, j), dabs(r__1)); work[k] += (r__1 = ap[kk], dabs(r__1)) * xk; ik = kk + 1; i__3 = *n; for (i__ = k + 1; i__ <= i__3; ++i__) { work[i__] += (r__1 = ap[ik], dabs(r__1)) * xk; s += (r__1 = ap[ik], dabs(r__1)) * (r__2 = x_ref(i__, j), dabs(r__2)); ++ik; /* L60: */ } work[k] += s; kk += *n - k + 1; /* L70: */ } } s = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { /* Computing MAX */ r__2 = s, r__3 = (r__1 = work[*n + i__], dabs(r__1)) / work[ i__]; s = dmax(r__2,r__3); } else { /* Computing MAX */ r__2 = s, r__3 = ((r__1 = work[*n + i__], dabs(r__1)) + safe1) / (work[i__] + safe1); s = dmax(r__2,r__3); } /* L80: */ } berr[j] = s; /* Test stopping criterion. Continue iterating if 1) The residual BERR(J) is larger than machine epsilon, and 2) BERR(J) decreased by at least a factor of 2 during the last iteration, and 3) At most ITMAX iterations tried. */ if (berr[j] > eps && berr[j] * 2.f <= lstres && count <= 5) { /* Update solution and try again. */ spptrs_(uplo, n, &c__1, &afp[1], &work[*n + 1], n, info); saxpy_(n, &c_b14, &work[*n + 1], &c__1, &x_ref(1, j), &c__1); lstres = berr[j]; ++count; goto L20; } /* Bound error from formula norm(X - XTRUE) / norm(X) .le. FERR = norm( abs(inv(A))* ( abs(R) + NZ*EPS*( abs(A)*abs(X)+abs(B) ))) / norm(X) where norm(Z) is the magnitude of the largest component of Z inv(A) is the inverse of A abs(Z) is the componentwise absolute value of the matrix or vector Z NZ is the maximum number of nonzeros in any row of A, plus 1 EPS is machine epsilon The i-th component of abs(R)+NZ*EPS*(abs(A)*abs(X)+abs(B)) is incremented by SAFE1 if the i-th component of abs(A)*abs(X) + abs(B) is less than SAFE2. Use SLACON to estimate the infinity-norm of the matrix inv(A) * diag(W), where W = abs(R) + NZ*EPS*( abs(A)*abs(X)+abs(B) ))) */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { if (work[i__] > safe2) { work[i__] = (r__1 = work[*n + i__], dabs(r__1)) + nz * eps * work[i__]; } else { work[i__] = (r__1 = work[*n + i__], dabs(r__1)) + nz * eps * work[i__] + safe1; } /* L90: */ } kase = 0; L100: slacon_(n, &work[(*n << 1) + 1], &work[*n + 1], &iwork[1], &ferr[j], & kase); if (kase != 0) { if (kase == 1) { /* Multiply by diag(W)*inv(A'). */ spptrs_(uplo, n, &c__1, &afp[1], &work[*n + 1], n, info); i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L110: */ } } else if (kase == 2) { /* Multiply by inv(A)*diag(W). */ i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { work[*n + i__] = work[i__] * work[*n + i__]; /* L120: */ } spptrs_(uplo, n, &c__1, &afp[1], &work[*n + 1], n, info); } goto L100; } /* Normalize error. */ lstres = 0.f; i__2 = *n; for (i__ = 1; i__ <= i__2; ++i__) { /* Computing MAX */ r__2 = lstres, r__3 = (r__1 = x_ref(i__, j), dabs(r__1)); lstres = dmax(r__2,r__3); /* L130: */ } if (lstres != 0.f) { ferr[j] /= lstres; } /* L140: */ } return 0; /* End of SPPRFS */ } /* spprfs_ */