NAG Library for SMP & Multicore, Mark 24

FSSO624DC

Oracle SPARC64VII 64-bit, Solaris, Oracle Fortran, Double Precision

Users' Note



1. Introduction

This document is essential reading for every user of the NAG Library for SMP & Multicore implementation specified in the title. It provides implementation-specific detail that augments the information provided in the NAG Mark 24 Library Manual (which we will refer to as the Library Manual). Wherever that manual refers to the "Users' Note for your implementation", you should consult this note.

In addition, NAG recommends that before calling any Library routine you should read the following reference material (see Section 5):

(a) Essential Introduction
(b) Chapter Introduction
(c) Routine Document

The libraries supplied with this implementation have been compiled in a manner that facilitates their use within a multithreaded application. If you intend to use the NAG Library within a multithreaded application, please refer to the document on Thread Safety in the Library Manual (see Section 5).

2. Post Release Information

Please check the following URL:

http://www.nag.co.uk/doc/inun/fs24/so6dc/postrelease.html

for details of any new information related to the applicability or usage of this implementation.

3. General Information

3.1. Accessing the Library

In this section we assume that the library has been installed in the directory [INSTALL_DIR].

By default, [INSTALL_DIR] (see the Installer's Note, in.html) is /opt/NAG/fsso624dc or /usr/local/NAG/fsso624dc, depending on your system; however, it may have been changed by the person who performed the installation. To identify [INSTALL_DIR] for this installation, ask the person who performed the installation.
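
As a rough sketch, the two default locations can be checked from the Bourne shell. The function name find_nag_install_dir is hypothetical and not part of the NAG distribution; the check assumes that a valid installation contains lib/libnagsmp.a, the static library named in the link commands in this section.

```shell
# Hypothetical helper: print the first candidate directory that contains
# lib/libnagsmp.a. Returns non-zero if none of the candidates match.
find_nag_install_dir() {
    for dir in "$@"; do
        if [ -f "$dir/lib/libnagsmp.a" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Check the two default locations named in this note.
if NAG_DIR=`find_nag_install_dir /opt/NAG/fsso624dc /usr/local/NAG/fsso624dc`; then
    echo "NAG library found in $NAG_DIR"
else
    echo "NAG library not found in the default locations"
fi
```

If neither default location matches, the library was probably installed elsewhere, and the directory must be obtained from whoever performed the installation.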

To use the NAG Library for SMP & Multicore with the Sun Performance Library (Sunperf), you may link in the following manner:

  f90 -O3 -dalign -stackvar -openmp -m64 -I[INSTALL_DIR]/nag_interface_blocks driver.f90 \
      [INSTALL_DIR]/lib/libnagsmp.a \
      -xlic_lib=sunperf
where driver.f90 is your application program; or

  f90 -O3 -dalign -stackvar -openmp -m64 -I[INSTALL_DIR]/nag_interface_blocks driver.f90 \
      [INSTALL_DIR]/lib/libnagsmp.so \
      -xlic_lib=sunperf
if the shareable library is required.

If your application has been linked with the shareable NAG library then the environment variable LD_LIBRARY_PATH_64 must be set or extended, as follows, to allow run-time linkage.

In the C shell, type:

  setenv LD_LIBRARY_PATH_64 [INSTALL_DIR]/lib
to set LD_LIBRARY_PATH_64, or
  setenv LD_LIBRARY_PATH_64 [INSTALL_DIR]/lib:${LD_LIBRARY_PATH_64}
to extend LD_LIBRARY_PATH_64 if you already have it set.

In the Bourne shell, type:

  LD_LIBRARY_PATH_64=[INSTALL_DIR]/lib
  export LD_LIBRARY_PATH_64
to set LD_LIBRARY_PATH_64, or
  LD_LIBRARY_PATH_64=[INSTALL_DIR]/lib:${LD_LIBRARY_PATH_64}
  export LD_LIBRARY_PATH_64
to extend LD_LIBRARY_PATH_64 if you already have it set.

Note that you may also need to set LD_LIBRARY_PATH_64 to point at other items such as compiler run-time libraries, for example if you are using a newer version of the compiler.
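
The "set" and "extend" cases for the Bourne shell can be folded into one assignment. The sketch below is only an illustration: prepend_path is a hypothetical helper, and NAG_LIB holds a placeholder path that should be replaced by your actual [INSTALL_DIR]/lib. It relies on the standard ${parameter:+word} expansion so that no stray colon is added when LD_LIBRARY_PATH_64 was previously empty or unset.

```shell
# Hypothetical helper: put directory $1 in front of the existing path value $2.
# ${2:+:$2} expands to ":$2" only when $2 is non-empty, so a previously
# empty value does not produce a trailing colon.
prepend_path() {
    echo "$1${2:+:$2}"
}

# Placeholder location; substitute your actual [INSTALL_DIR]/lib.
NAG_LIB=/opt/NAG/fsso624dc/lib

LD_LIBRARY_PATH_64=`prepend_path "$NAG_LIB" "${LD_LIBRARY_PATH_64:-}"`
export LD_LIBRARY_PATH_64
echo "LD_LIBRARY_PATH_64=$LD_LIBRARY_PATH_64"
```

Avoiding the trailing colon matters because some run-time linkers may treat an empty path entry as an additional search location.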

3.1.1. Setting the number of threads to use

Set the environment variable OMP_NUM_THREADS to the number of threads required, up to the maximum available on your system.

In the C shell, type:

  setenv OMP_NUM_THREADS N
In the Bourne shell, type:

  OMP_NUM_THREADS=N
  export OMP_NUM_THREADS
where N is the number of threads required. OMP_NUM_THREADS may be reset between executions of the program, as desired.

In general, the maximum number of threads you are recommended to use is the number of physical cores on your SMP system.
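
The recommendation above can be applied mechanically by capping a requested thread count at the number of physical cores. In this minimal Bourne shell sketch, clamp_threads is a hypothetical helper and PHYSICAL_CORES is an assumed value that you must supply for your own system; the sketch does not attempt to detect the core count.

```shell
# Hypothetical helper: cap a requested thread count ($1) at the number of
# physical cores ($2). Both values are supplied by the caller.
clamp_threads() {
    requested=$1
    cores=$2
    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}

PHYSICAL_CORES=8    # assumed value; replace with the count for your system
OMP_NUM_THREADS=`clamp_threads 16 "$PHYSICAL_CORES"`
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"    # prints OMP_NUM_THREADS=8
```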

3.1.2. Calling the Library from C or C++

With care, the NAG Library for SMP & Multicore may be used from within a C or C++ environment. To help users map between Fortran and C types, a C/C++ header file ([INSTALL_DIR]/c_headers/nagmk24.h) is provided. Users wishing to call a Library routine should either copy and paste the relevant section of the file into their C or C++ application (making sure that the relevant #defines etc. are also copied from the top of the file) or simply include the header file in their application.

A document, techdoc.html, giving advice on calling the NAG Library for SMP & Multicore from C and C++ is also available in [INSTALL_DIR]/c_headers.

3.2. Interface Blocks

The NAG Library for SMP & Multicore interface blocks define the type and arguments of each user-callable NAG Library for SMP & Multicore routine. They are not essential for calling the Library from Fortran programs, but they are required if the supplied examples are used. Their purpose is to allow the Fortran compiler to check that Library routines are called correctly. The interface blocks enable the compiler to check that:

(a) subroutines are called as such;
(b) functions are declared with the right type;
(c) the correct number of arguments is passed; and
(d) all arguments match in type and structure.

The NAG Library for SMP & Multicore interface block files are organised by Library chapter. They are aggregated into one module named

  nag_library
The modules are supplied in pre-compiled form (.mod files) and they can be accessed by specifying the -Ipathname option on each compiler invocation, where pathname ([INSTALL_DIR]/nag_interface_blocks) is the path of the directory containing the compiled interface blocks.

The .mod module files were compiled with the compiler shown in Section 2.1 of the Installer's Note. Such module files are compiler-dependent, so if you wish to use the NAG example programs, or use the interface blocks in your own programs, when using a compiler that is incompatible with these modules, you will first need to create your own module files. See the Post Release Information page

http://www.nag.co.uk/doc/inun/fs24/so6dc/postrelease.html

where more information may be available, or contact NAG for further help.

3.3. Example Programs

The example results distributed were generated at Mark 24, using the software described in Section 2.2 of the Installer's Note. These example results may not be exactly reproducible if the example programs are run in a slightly different environment (for example, a different Fortran compiler, a different compiler library, or a different set of Basic Linear Algebra Subprograms (BLAS) or Linear Algebra PACKage (LAPACK) routines). The results which are most sensitive to such differences are: eigenvectors (which may differ by a scalar multiple, often -1, but sometimes complex); numbers of iterations and function evaluations; and residuals and other "small" quantities of the same order as the machine precision.

Note that the example material has been adapted, if necessary, from that published in the Library Manual, so that programs are suitable for execution with this implementation with no further changes. The distributed example programs should be used in preference to the versions in the Library Manual wherever possible.

The example programs are most easily accessed using one of the two scripts in the directory [INSTALL_DIR]/scripts, nagsmp_example and nagsmp_example_shar.

Each command will provide you with a copy of an example program (and its data and options file, if any), compile the program and link it with the appropriate libraries (showing you the compile command so that you can recompile your own version of the program). Finally, the executable program will be run with appropriate arguments specifying data, options and results files as needed.

The example program concerned, and the number of OpenMP threads to use, are specified by the arguments to the command, e.g.

  nagsmp_example e04nrf 4
will copy the example program and its data and options files (e04nrfe.f90, e04nrfe.d and e04nrfe.opt) into the current directory, compile the program and run it using 4 OpenMP threads to produce the example program results in the file e04nrfe.r.
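
The file naming pattern in this example (the routine name followed by "e", then an extension) can be sketched as a small shell function. The name example_files is hypothetical and is not one of the supplied scripts. Not every example has a data or options file, so the .d and .opt names should be read as candidates only.

```shell
# Hypothetical helper: list the file names associated with one example
# program, following the pattern routine name + "e" + extension.
example_files() {
    base="${1}e"
    echo "$base.f90 $base.d $base.opt $base.r"
}

example_files e04nrf    # prints: e04nrfe.f90 e04nrfe.d e04nrfe.opt e04nrfe.r
```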

3.4. Fortran Types and Interpretation of Bold Italicised Terms

The NAG Library and documentation use parameterized types for floating-point variables. Thus, the type

      REAL(KIND=nag_wp)
appears in documentation of all NAG Library for SMP & Multicore routines, where nag_wp is a Fortran KIND parameter. The value of nag_wp will vary between implementations, and its value can be obtained by use of the nag_library module. We refer to the type nag_wp as the NAG Library "working precision" type, because most floating-point arguments and internal variables used in the library are of this type.

In addition, a small number of routines use the type

      REAL(KIND=nag_rp)
where nag_rp stands for "reduced precision type". Another type, not currently used in the library, is
      REAL(KIND=nag_hp)
for "higher precision type" or "additional precision type".

For correct use of these types, see almost any of the example programs distributed with the Library.

For this implementation, these types have the following meanings:

      REAL (kind=nag_rp)      means REAL (i.e. single precision)
      REAL (kind=nag_wp)      means DOUBLE PRECISION
      COMPLEX (kind=nag_rp)   means COMPLEX (i.e. single precision complex)
      COMPLEX (kind=nag_wp)   means double precision complex (e.g. COMPLEX*16)

In addition, the Library Manual has adopted a convention of using bold italics to distinguish some terms.

One important bold italicised term is machine precision, which denotes the relative precision to which DOUBLE PRECISION floating-point numbers are stored in the computer, e.g. in an implementation with approximately 16 decimal digits of precision, machine precision has a value of approximately 1.0D-16.

The precise value of machine precision is given by the routine X02AJF. Other routines in Chapter X02 return the values of other implementation-dependent constants, such as the overflow threshold, or the largest representable integer. Refer to the X02 Chapter Introduction for more details.
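
As a cross-check, the value listed for X02AJF in Section 4 can be reproduced from the model parameters listed there (base X02BHF = 2 and X02BJF = 53 significand digits). The sketch below assumes that machine precision is defined as half of b^(1-p), which corresponds to rounded arithmetic; that definition is an assumption here, and the X02 Chapter Introduction should be consulted for the authoritative one. The function name machine_precision is hypothetical.

```shell
# Compute (1/2) * b^(1-p) for b = 2, p = 53, that is 2^(-53), and print it
# with the same number of digits as the X02AJF entry in Section 4.
machine_precision() {
    awk 'BEGIN { b = 2; p = 53; printf "%.14E\n", 0.5 * b ^ (1 - p) }'
}

machine_precision    # prints 1.11022302462516E-16
```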

The bold italicised term block size is used only in Chapters F07 and F08. It denotes the block size used by block algorithms in these chapters. You only need to be aware of its value when it affects the amount of workspace to be supplied – see the parameters WORK and LWORK of the relevant routine documents and the Chapter Introduction.

3.5. Explicit Output from NAG Routines

Certain routines produce explicit error messages and advisory messages via output units which have default values that can be reset by using X04AAF for error messages and X04ABF for advisory messages. (The default values are given in Section 4.) These routines may not be thread safe, and in general, producing output from them is not recommended in a multithreaded environment.

4. Routine-specific Information

Any further information which applies to one or more routines in this implementation is listed below, chapter by chapter.
  1. C06

    In this implementation calls to the following FFT routines, from the Sunperf library, are made whenever possible:
     DFFTB  DFFTF  DFFTI  ZFFTB  ZFFTF  ZFFTI
     VDFFTB VDFFTF VDFFTI VZFFTB VZFFTF VZFFTI
     ZFFT2B ZFFT2F ZFFT2I ZFFT3B ZFFT3F ZFFT3I
    
    in the following NAG routines:
     C06PAF  C06PCF  C06PFF  C06PJF  C06PKF  C06PPF  C06PQF  C06PRF
     C06PSF  C06PUF  C06PXF
    
    The required size of the workspace array WORK for each routine will depend upon the parameters used and thus the choice of Sun or NAG FFT kernels used within the FFT routines. The values specified in the NAG routine documents should be sufficient in all cases.
  2. F06, F07, F08 and F16

    In Chapters F06, F07, F08 and F16, alternate routine names are available for BLAS and LAPACK derived routines. For details of the alternate routine names please refer to the relevant Chapter Introduction. Note that applications should reference routines by their BLAS/LAPACK names, rather than their NAG-style names, for optimum performance.

    Many LAPACK routines have a "workspace query" mechanism which allows a caller to interrogate the routine to determine how much workspace to supply. Note that LAPACK routines from the Sunperf library may require a different amount of workspace from the equivalent NAG versions of these routines. Care should be taken when using the workspace query mechanism.

    In this implementation calls to BLAS and LAPACK routines are implemented by calls to Sunperf, except for the following routines:

    BLAS_DMAX_VAL   BLAS_DMIN_VAL
    DBDSDC  DBDSQR  DGBRFS  DGBSV   DGBSVX  DGBTRF  DGBTRS  DGEBAL
    DGEBRD  DGEES   DGEESX  DGEEV   DGEEVX  DGEHRD  DGEJSV  DGELS
    DGELSD  DGELSS  DGELSY  DGEMV   DGEQP3  DGEQRF  DGER    DGERFS
    DGESDD  DGESV   DGESVD  DGESVJ  DGESVX  DGETRF  DGETRS  DGGES
    DGGESX  DGGEV   DGGEVX  DGGGLM  DGGLSE  DGGQRF  DGGRQF  DGTRFS
    DGTSVX  DHSEIN  DHSEQR  DLANSF  DNRM2   DOPGTR  DORGBR  DORGHR
    DORGQR  DORGTR  DORMBR  DORMHR  DORMQR  DORMTR  DPBRFS  DPBSV
    DPBSVX  DPBTRS  DPFTRF  DPFTRI  DPFTRS  DPORFS  DPOSV   DPOSVX
    DPOTRF  DPOTRS  DPPRFS  DPPSV   DPPSVX  DPPTRS  DPSTRF  DPTEQR
    DPTRFS  DPTSVX  DROT    DSBEV   DSBEVD  DSBEVX  DSBGV   DSBGVD
    DSBGVX  DSBTRD  DSFRK   DSGESV  DSPEV   DSPEVD  DSPEVX  DSPGV
    DSPGVD  DSPGVX  DSPOSV  DSPRFS  DSPSVX  DSTEBZ  DSTEDC  DSTEGR
    DSTEIN  DSTEQR  DSTEV   DSTEVD  DSTEVR  DSTEVX  DSYEV   DSYEVD
    DSYEVR  DSYEVX  DSYGV   DSYGVD  DSYGVX  DSYRFS  DSYSV   DSYSVX
    DSYTRD  DSYTRF  DTBRFS  DTBTRS  DTFSM   DTFTRI  DTFTTP  DTFTTR
    DTGSNA  DTGSYL  DTPRFS  DTPTRS  DTPTTF  DTPTTR  DTRRFS  DTRSEN
    DTRSV   DTRTTF  DTRTTP  ZBDSQR  ZCGESV  ZCPOSV  ZDSCAL  ZGBRFS
    ZGBSV   ZGBSVX  ZGBTRF  ZGBTRS  ZGEBAL  ZGEBRD  ZGEES   ZGEESX
    ZGEEV   ZGEEVX  ZGELS   ZGELSD  ZGELSS  ZGELSY  ZGEQP3  ZGEQRF
    ZGERFS  ZGESDD  ZGESV   ZGESVD  ZGESVX  ZGETRF  ZGETRS  ZGGBAK
    ZGGES   ZGGESX  ZGGEV   ZGGEVX  ZGGGLM  ZGGLSE  ZGGQRF  ZGGRQF
    ZGTRFS  ZGTSVX  ZHBEV   ZHBEVD  ZHBEVX  ZHBGV   ZHBGVD  ZHBGVX
    ZHBTRD  ZHEEV   ZHEEVD  ZHEEVR  ZHEEVX  ZHEGV   ZHEGVD  ZHEGVX
    ZHERFS  ZHESVX  ZHETRD  ZHFRK   ZHPEV   ZHPEVD  ZHPEVX  ZHPGV
    ZHPGVD  ZHPGVX  ZHPRFS  ZHPSVX  ZHSEIN  ZLANHF  ZPBRFS  ZPBSV
    ZPBSVX  ZPBTRF  ZPBTRS  ZPFTRF  ZPFTRI  ZPFTRS  ZPORFS  ZPOSV
    ZPOSVX  ZPOTRF  ZPOTRI  ZPOTRS  ZPPRFS  ZPPSV   ZPPSVX  ZPPTRS
    ZPSTRF  ZPTEQR  ZPTRFS  ZPTSVX  ZSPRFS  ZSPSVX  ZSTEDC  ZSTEGR
    ZSTEIN  ZSTEQR  ZSYRFS  ZSYSVX  ZTBRFS  ZTBTRS  ZTFSM   ZTFTRI
    ZTFTTP  ZTFTTR  ZTPRFS  ZTPTRS  ZTPTTF  ZTPTTR  ZTRRFS  ZTRTTF
    ZTRTTP  ZUNGBR  ZUNGHR  ZUNGQR  ZUNGTR  ZUNMBR  ZUNMHR  ZUNMQR
    ZUNMTR  ZUPGTR
    
  3. G02

    The value of ACC, the machine-dependent constant mentioned in several documents in the chapter, is 1.0D-13.
  4. P01

    On hard failure, P01ABF writes the error message to the error message unit specified by X04AAF and then stops.
  5. S07 - S21

    The behaviour of functions in these Chapters may depend on implementation-specific values.

    General details are given in the Library Manual, but the specific values used in this implementation are as follows:

    S07AAF  F_1 = 1.0E+13
            F_2 = 1.0E-14
    
    S10AAF  E_1 = 1.8715E+1
    S10ABF  E_1 = 7.080E+2
    S10ACF  E_1 = 7.080E+2
    
    S13AAF  x_hi = 7.083E+2
    S13ACF  x_hi = 1.0E+16
    S13ADF  x_hi = 1.0E+17
    
    S14AAF  IFAIL = 1 if X > 1.70E+2
            IFAIL = 2 if X < -1.70E+2
            IFAIL = 3 if abs(X) < 2.23E-308
    S14ABF  IFAIL = 2 if X > x_big = 2.55E+305
    
    S15ADF  x_hi = 2.65E+1
    S15AEF  x_hi = 2.65E+1
    S15AGF  IFAIL = 1 if X >= 2.53E+307
            IFAIL = 2 if 4.74E+7 <= X < 2.53E+307
            IFAIL = 3 if X < -2.66E+1
    
    S17ACF  IFAIL = 1 if X > 1.0E+16
    S17ADF  IFAIL = 1 if X > 1.0E+16
            IFAIL = 3 if 0 < X <= 2.23E-308
    S17AEF  IFAIL = 1 if abs(X) > 1.0E+16
    S17AFF  IFAIL = 1 if abs(X) > 1.0E+16
    S17AGF  IFAIL = 1 if X > 1.038E+2
            IFAIL = 2 if X < -5.7E+10
    S17AHF  IFAIL = 1 if X > 1.041E+2
            IFAIL = 2 if X < -5.7E+10
    S17AJF  IFAIL = 1 if X > 1.041E+2
            IFAIL = 2 if X < -1.9E+9
    S17AKF  IFAIL = 1 if X > 1.041E+2
            IFAIL = 2 if X < -1.9E+9
    S17DCF  IFAIL = 2 if abs(Z) < 3.92223E-305
            IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27679E+4
            IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07374E+9
    S17DEF  IFAIL = 2 if AIMAG(Z) > 7.00921E+2
            IFAIL = 3 if abs(Z) or FNU+N-1 > 3.27679E+4
            IFAIL = 4 if abs(Z) or FNU+N-1 > 1.07374E+9
    S17DGF  IFAIL = 3 if abs(Z) > 1.02399E+3
            IFAIL = 4 if abs(Z) > 1.04857E+6
    S17DHF  IFAIL = 3 if abs(Z) > 1.02399E+3
            IFAIL = 4 if abs(Z) > 1.04857E+6
    S17DLF  IFAIL = 2 if abs(Z) < 3.92223E-305
            IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27679E+4
            IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07374E+9
    
    S18ADF  IFAIL = 2 if 0 < X <= 2.23E-308
    S18AEF  IFAIL = 1 if abs(X) > 7.116E+2
    S18AFF  IFAIL = 1 if abs(X) > 7.116E+2
    S18DCF  IFAIL = 2 if abs(Z) < 3.92223E-305
            IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27679E+4
            IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07374E+9
    S18DEF  IFAIL = 2 if REAL(Z) > 7.00921E+2
            IFAIL = 3 if abs(Z) or FNU+N-1 > 3.27679E+4
            IFAIL = 4 if abs(Z) or FNU+N-1 > 1.07374E+9
    
    S19AAF  IFAIL = 1 if abs(X) >= 5.04818E+1
    S19ABF  IFAIL = 1 if abs(X) >= 5.04818E+1
    S19ACF  IFAIL = 1 if X > 9.9726E+2
    S19ADF  IFAIL = 1 if X > 9.9726E+2
    
    S21BCF  IFAIL = 3 if an argument < 1.583E-205
            IFAIL = 4 if an argument >= 3.765E+202
    S21BDF  IFAIL = 3 if an argument < 2.813E-103
            IFAIL = 4 if an argument >= 1.407E+102
    
  6. X01

    The values of the mathematical constants are:

    X01AAF (pi) = 3.1415926535897932
    X01ABF (gamma) = 0.5772156649015328
    
  7. X02

    The values of the machine constants are:

    The basic parameters of the model

    X02BHF   = 2
    X02BJF   = 53
    X02BKF   = -1021
    X02BLF   = 1024
    

    Derived parameters of the floating-point arithmetic

    X02AJF   = 1.11022302462516E-16
    X02AKF   = 2.22507385850721E-308
    X02ALF   = 1.79769313486231E+308
    X02AMF   = 2.22507385850721E-308
    X02ANF   = 2.22507385850721E-308
    

    Parameters of other aspects of the computing environment

    X02AHF   = 1.42724769270596E+45
    X02BBF   = 2147483647
    X02BEF   = 15
    
  8. X04

    The default output units for error and advisory messages for those routines which can produce explicit output are both Fortran Unit 6.

5. Documentation

The Library Manual is available as part of the installation or via download from the NAG website. The most up-to-date version of the documentation is accessible via the NAG website at http://www.nag.co.uk/numeric/FL/FSdocumentation.asp.

The Library Manual is supplied in HTML and PDF formats. The following main index files have been provided for these formats:

	nagdoc_fl24/html/FRONTMATTER/manconts.html
	nagdoc_fl24/pdf/FRONTMATTER/manconts.pdf
	nagdoc_fl24/pdf/FRONTMATTER/manconts.html
Use your web browser to navigate from here. For convenience, a master index file containing links to the above files has been provided at
	nagdoc_fl24/index.html

Advice on viewing and navigating the formats available can be found in the Online Documentation document.

In addition the following are provided:

Please see the Oracle web site for further information about Sunperf (http://docs.oracle.com/cd/E24457_01/pdf/E21997.pdf).

6. Support from NAG

(a) Contact with NAG

Queries concerning this document or the implementation generally should be directed to NAG at one of the addresses given in the Appendix. Users subscribing to the support service are encouraged to contact our support team (see below).

(b) NAG Technical Support Service

The NAG Technical Support Service is available for general enquiries from all users and also for technical queries from sites with an annually licensed product or support service.

The technical support desks are open during office hours, but contact is possible by email and phone (answering machine) at all times.

When contacting us, it helps us deal with your enquiry quickly if you can quote your NAG customer reference number and NAG product code (in this case FSSO624DC).

(c) NAG Websites

The NAG websites provide information about implementation availability, descriptions of products, downloadable software, product documentation and technical reports. The NAG websites can be accessed at the following URLs:

http://www.nag.co.uk/, http://www.nag.com/ or http://www.nag-j.co.jp/

(d) NAG Electronic Newsletter

If you would like to be kept up to date with news from NAG then please register to receive our free electronic newsletter, which will alert you to announcements about new products or product/service enhancements, technical tips, customer stories and NAG's event diary. You can register via one of our websites, or by contacting us at nagnews@nag.co.uk.

(e) Product Registration

To ensure that you receive information on updates and other relevant announcements, please register this product with us. For NAG Library products this may be accomplished by filling in the online registration form at http://www.nag.co.uk/numeric/Library_Registration.asp.

7. User Feedback

Many factors influence the way that NAG's products and services evolve, and your ideas are invaluable in helping us to ensure that we meet your needs. If you would like to contribute to this process, we would be delighted to receive your comments. Please contact any of the NAG offices (shown below).

Appendix - Contact Addresses

NAG Ltd
Wilkinson House
Jordan Hill Road
OXFORD  OX2 8DR                         Technical Support (Europe & ROW)
United Kingdom                          email: support@nag.co.uk

Tel: +44 (0)1865 511245                 Tel: +44 (0)1865 311744

NAG Inc
801 Warrenville Road
Suite 185
Lisle, IL  60532-4332                   Technical Support (North America)
USA                                     email: support@nag.com

Tel: +1 630 971 2337                    Tel: +1 630 971 2337

Nihon NAG KK
Hatchobori Frontier Building 2F
4-9-9
Hatchobori
Chuo-ku
Tokyo 104-0032                          Technical Support (Japan)
Japan                                   email: naghelp@nag-j.co.jp

Tel: +81 3 5542 6311                    Tel: +81 3 5542 6311