NAG FL Interface
f08rnf (zuncsd)

Settings help

FL Name Style:


FL Specification Language:


1 Purpose

f08rnf computes the CS decomposition of a complex m×m unitary matrix X, partitioned into a 2×2 array of submatrices.

2 Specification

Fortran Interface
Subroutine f08rnf ( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t, work, lwork, rwork, lrwork, iwork, info)
Integer, Intent (In) :: m, p, q, ldx11, ldx12, ldx21, ldx22, ldu1, ldu2, ldv1t, ldv2t, lwork, lrwork
Integer, Intent (Out) :: iwork(m-min(p,m-p,q,m-q)), info
Real (Kind=nag_wp), Intent (Out) :: theta(min(p,m-p,q,m-q)), rwork(max(1,lrwork))
Complex (Kind=nag_wp), Intent (Inout) :: x11(ldx11,*), x12(ldx12,*), x21(ldx21,*), x22(ldx22,*), u1(ldu1,*), u2(ldu2,*), v1t(ldv1t,*), v2t(ldv2t,*)
Complex (Kind=nag_wp), Intent (Out) :: work(max(1,lwork))
Character (1), Intent (In) :: jobu1, jobu2, jobv1t, jobv2t, trans, signs
C Header Interface
#include <nag.h>
void  f08rnf_ (const char *jobu1, const char *jobu2, const char *jobv1t, const char *jobv2t, const char *trans, const char *signs, const Integer *m, const Integer *p, const Integer *q, Complex x11[], const Integer *ldx11, Complex x12[], const Integer *ldx12, Complex x21[], const Integer *ldx21, Complex x22[], const Integer *ldx22, double theta[], Complex u1[], const Integer *ldu1, Complex u2[], const Integer *ldu2, Complex v1t[], const Integer *ldv1t, Complex v2t[], const Integer *ldv2t, Complex work[], const Integer *lwork, double rwork[], const Integer *lrwork, Integer iwork[], Integer *info, const Charlen length_jobu1, const Charlen length_jobu2, const Charlen length_jobv1t, const Charlen length_jobv2t, const Charlen length_trans, const Charlen length_signs)
The routine may be called by the names f08rnf, nagf_lapackeig_zuncsd or its LAPACK name zuncsd.

3 Description

The m×m unitary matrix X is partitioned as
X= ( X11 X12 X21 X22 )  
where X11 is a p×q submatrix and the dimensions of the other submatrices X12, X21 and X22 are such that X remains m×m.
The CS decomposition of X is X= UΣpVT where U, V and Σp are m×m matrices, such that
U= ( U1 0 0 U2 )  
is a unitary matrix containing the p×p unitary matrix U1 and the (m-p)×(m-p) unitary matrix U2;
V= ( V1 0 0 V2 )  
is a unitary matrix containing the q×q unitary matrix V1 and the (m-q)×(m-q) unitary matrix V2; and
Σp= ( I11 0 0 0 C 0 0 -S 0 0 0 -I12 0 0 I22 0 0 S C 0 0 I21 0 0 )  
contains the r×r non-negative diagonal submatrices C and S satisfying C2+S2=I, where r=min(p,m-p,q,m-q) and the top left partition is p×q.
The identity matrix I11 is of order min(p,q)-r and vanishes if min(p,q)=r.
The identity matrix I12 is of order min(p,m-q)-r and vanishes if min(p,m-q)=r.
The identity matrix I21 is of order min(m-p,q)-r and vanishes if min(m-p,q)=r.
The identity matrix I22 is of order min(m-p,m-q)-r and vanishes if min(m-p,m-q)=r.
In each of the four cases r=p,q,m-p,m-q at least two of the identity matrices vanish.
The indicated zeros represent augmentations by additional rows or columns (but not both) to the square diagonal matrices formed by Iij and C or S.
Σp does not need to be stored in full; it is sufficient to return only the values θi for i=1,2,,r where Cii=cos(θi) and Sii=sin(θi).
The algorithm used to perform the complete CS decomposition is described fully in Sutton (2009) including discussions of the stability and accuracy of the algorithm.

4 References

Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999) LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia https://www.netlib.org/lapack/lug
Golub G H and Van Loan C F (2012) Matrix Computations (4th Edition) Johns Hopkins University Press, Baltimore
Sutton B D (2009) Computing the complete CS decomposition Numerical Algorithms (Volume 50) 1017–1398 Springer US 33–65 https://dx.doi.org/10.1007/s11075-008-9215-6

5 Arguments

1: jobu1 Character(1) Input
On entry:
  • if jobu1='Y', U1 is computed;
  • otherwise, U1 is not computed.
2: jobu2 Character(1) Input
On entry:
  • if jobu2='Y', U2 is computed;
  • otherwise, U2 is not computed.
3: jobv1t Character(1) Input
On entry:
  • if jobv1t='Y', V1T is computed;
  • otherwise, V1T is not computed.
4: jobv2t Character(1) Input
On entry:
  • if jobv2t='Y', V2T is computed;
  • otherwise, V2T is not computed.
5: trans Character(1) Input
On entry:
  • if trans='T', X, U1, U2, V1T and V2T are stored in row-major order;
  • otherwise, X, U1, U2, V1T and V2T are stored in column-major order.
6: signs Character(1) Input
On entry:
  • if signs='O', the lower-left block is made nonpositive (the other convention);
  • otherwise, the upper-right block is made nonpositive (the default convention).
7: m Integer Input
On entry: m, the number of rows and columns in the unitary matrix X.
Constraint: m0.
8: p Integer Input
On entry: p, the number of rows in X11 and X12.
Constraint: 0pm.
9: q Integer Input
On entry: q, the number of columns in X11 and X21.
Constraint: 0qm.
10: x11(ldx11,*) Complex (Kind=nag_wp) array Input/Output
Note: the second dimension of the array x11 must be at least max(1,p) if trans='T', and at least q otherwise.
On entry: the upper left partition of the unitary matrix X whose CSD is desired.
On exit: contains details of the unitary matrix used in a simultaneous bidiagonalization process.
11: ldx11 Integer Input
On entry: the first dimension of the array x11 as declared in the (sub)program from which f08rnf is called.
Constraints:
  • if trans='T', ldx11max(1,q);
  • otherwise ldx11max(1,p).
12: x12(ldx12,*) Complex (Kind=nag_wp) array Input/Output
Note: the second dimension of the array x12 must be at least max(1,p) if trans='T', and at least m-q otherwise.
On entry: the upper right partition of the unitary matrix X whose CSD is desired.
On exit: contains details of the unitary matrix used in a simultaneous bidiagonalization process.
13: ldx12 Integer Input
On entry: the first dimension of the array x12 as declared in the (sub)program from which f08rnf is called.
Constraints:
  • if trans='T', ldx12max(1,m-q);
  • otherwise ldx12max(1,p).
14: x21(ldx21,*) Complex (Kind=nag_wp) array Input/Output
Note: the second dimension of the array x21 must be at least max(1,m-p) if trans='T', and at least q otherwise.
On entry: the lower left partition of the unitary matrix X whose CSD is desired.
On exit: contains details of the unitary matrix used in a simultaneous bidiagonalization process.
15: ldx21 Integer Input
On entry: the first dimension of the array x21 as declared in the (sub)program from which f08rnf is called.
Constraints:
  • if trans='T', ldx21max(1,q);
  • otherwise ldx21max(1,m-p).
16: x22(ldx22,*) Complex (Kind=nag_wp) array Input/Output
Note: the second dimension of the array x22 must be at least max(1,m-p) if trans='T', and at least m-q otherwise.
On entry: the lower right partition of the unitary matrix X CSD is desired.
On exit: contains details of the unitary matrix used in a simultaneous bidiagonalization process.
17: ldx22 Integer Input
On entry: the first dimension of the array x22 as declared in the (sub)program from which f08rnf is called.
Constraints:
  • if trans='T', ldx22max(1,m-q);
  • otherwise ldx22max(1,m-p).
18: theta(min(p,m-p,q,m-q)) Real (Kind=nag_wp) array Output
On exit: the values θi for i=1,2,,r where r=min(p,m-p,q,m-q). The diagonal submatrices C and S of Σp are constructed from these values as
  • C= diag(cos(theta(1)),,cos(theta(r))) and
  • S= diag(sin(theta(1)),,sin(theta(r))) .
19: u1(ldu1,*) Complex (Kind=nag_wp) array Output
Note: the second dimension of the array u1 must be at least max(1,p) if jobu1='Y', and at least 1 otherwise.
On exit: if jobu1='Y', u1 contains the p×p unitary matrix U1.
20: ldu1 Integer Input
On entry: the first dimension of the array u1 as declared in the (sub)program from which f08rnf is called.
Constraint: if jobu1='Y', ldu1max(1,p).
21: u2(ldu2,*) Complex (Kind=nag_wp) array Output
Note: the second dimension of the array u2 must be at least max(1,m-p) if jobu2='Y', and at least 1 otherwise.
On exit: if jobu2='Y', u2 contains the m-p×m-p unitary matrix U2.
22: ldu2 Integer Input
On entry: the first dimension of the array u2 as declared in the (sub)program from which f08rnf is called.
Constraint: if jobu2='Y', ldu2max(1,m-p).
23: v1t(ldv1t,*) Complex (Kind=nag_wp) array Output
Note: the second dimension of the array v1t must be at least max(1,q) if jobv1t='Y', and at least 1 otherwise.
On exit: if jobv1t='Y', v1t contains the q×q unitary matrix V1H.
24: ldv1t Integer Input
On entry: the first dimension of the array v1t as declared in the (sub)program from which f08rnf is called.
Constraint: if jobv1t='Y', ldv1tmax(1,q).
25: v2t(ldv2t,*) Complex (Kind=nag_wp) array Output
Note: the second dimension of the array v2t must be at least max(1,m-q) if jobv2t='Y', and at least 1 otherwise.
On exit: if jobv2t='Y', v2t contains the m-q×m-q unitary matrix V2H.
26: ldv2t Integer Input
On entry: the first dimension of the array v2t as declared in the (sub)program from which f08rnf is called.
Constraint: if jobv2t='Y', ldv2tmax(1,m-q).
27: work(max(1,lwork)) Complex (Kind=nag_wp) array Workspace
On exit: if info=0, work(1) returns the optimal lwork.
If info>0 on exit, work(2:r) contains the values PHI(1),PHI(r-1) that, together with xrefarg(1),xrefarg(r), define the matrix in intermediate bidiagonal-block form remaining after nonconvergence. info specifies the number of nonzero PHI's.
28: lwork Integer Input
On entry: the dimension of the array work as declared in the (sub)program from which f08rnf is called.
If lwork=−1, a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued.
The minimum workspace required is max(1,p) + max(1,m-p) + max(1,q) + max(1,m-q) + max(1,p,m-p,q,m-q) + 1 ; the optimal amount of workspace depends on internal block sizes and the relative problem dimensions.
Constraint:
lwork=−1 or lwork max(1,p) + max(1,m-p) + max(1,q) + max(1,m-q) + max(1,p,m-p,q,m-q) + 1 .
29: rwork(max(1,lrwork)) Real (Kind=nag_wp) array Workspace
30: lrwork Integer Input
On entry: the dimension of the array rwork as declared in the (sub)program from which f08rnf is called.
If lrwork=−1, a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lrwork is issued. Otherwise the required workspace is 5 × max(1,q-1) + 4 × max(1,q) + max(1,8×q) +1 which equates to 11 for q=0, 18 for q=1 and 17×q-4 when q>1.
Constraint:
lrwork=−1 or lrwork5×max(1,q-1)+4×max(1,q)+max(1,8×q)+1.
31: iwork(m-min(p,m-p,q,m-q)) Integer array Workspace
32: info Integer Output
On exit: info=0 unless the routine detects an error (see Section 6).

6 Error Indicators and Warnings

info<0
If info=-i, argument i had an illegal value. An explanatory message is output, and execution of the program is terminated.
info>0
The Jacobi-type procedure failed to converge during an internal reduction to bidiagonal-block form. The process requires convergence to min(p,m-p,q,m-q) values, the value of info gives the number of converged values.

7 Accuracy

The computed CS decomposition is nearly the exact CS decomposition for the nearby matrix (X+E) , where
E2 = O(ε) ,  
and ε is the machine precision.

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.
f08rnf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
f08rnf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

The total number of floating-point operations required to perform the full CS decomposition is approximately 2m3.
The real analogue of this routine is f08raf.

10 Example

This example finds the full CS decomposition of a unitary 6×6 matrix X (see Section 10.2) partitioned so that the top left block is 2 by 4.
The decomposition is performed both on submatrices of the unitary matrix X and on separated partition matrices. Code is also provided to perform a recombining check if required.

10.1 Program Text

Program Text (f08rnfe.f90)

10.2 Program Data

Program Data (f08rnfe.d)

10.3 Program Results

Program Results (f08rnfe.r)