This chapter describes how you can use flexible file I/O (FFIO) with common file structures and how to enhance code performance without changing your source code.
The FFIO library on IRIX systems calls the aio_sgi_init library routine the first time the library issues an asynchronous I/O call. It passes the following parameters to aio_sgi_init:
aio_numusers=MAX(64,sysconf(_SC_NPROC_CONF))
aio_threads=5
aio_locks=3
If a program is using multiple threads and asynchronous I/O, it is important that the value in aio_numusers be at least as large as the number of sprocs or pthreads that the application contains. See the aio_sgi_init man page on your IRIX system for more details.
Users can change these values by setting the following environment variables to the desired value:
change FF_IO_AIO_THREADS to modify aio_threads
change FF_IO_AIO_LOCKS to modify aio_locks
change FF_IO_AIO_NUMUSERS to modify aio_numusers
In the following example, aio_threads is set to 8 when the FFIO routines call aio_sgi_init:
setenv FF_IO_AIO_THREADS 8
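The way the defaults and the environment variables combine can be modeled with a short sketch. This is an illustration in Python, not the FFIO library's code: the function name `resolve_aio_params` and the dictionary layout are assumptions made for the example, and the processor count is passed in as a plain integer instead of being read with sysconf.

```python
import os

# Defaults quoted in the text above.
DEFAULT_AIO_THREADS = 5
DEFAULT_AIO_LOCKS = 3
MIN_AIO_NUMUSERS = 64

# Environment variable -> aio_sgi_init parameter it overrides.
ENV_TO_PARAM = {
    "FF_IO_AIO_THREADS": "aio_threads",
    "FF_IO_AIO_LOCKS": "aio_locks",
    "FF_IO_AIO_NUMUSERS": "aio_numusers",
}


def resolve_aio_params(nproc_conf, env=None):
    """Model the values handed to aio_sgi_init (hypothetical helper).

    nproc_conf stands in for sysconf(_SC_NPROC_CONF).
    """
    env = os.environ if env is None else env
    params = {
        "aio_numusers": max(MIN_AIO_NUMUSERS, nproc_conf),
        "aio_threads": DEFAULT_AIO_THREADS,
        "aio_locks": DEFAULT_AIO_LOCKS,
    }
    # Any variable that is set replaces the corresponding default.
    for var, name in ENV_TO_PARAM.items():
        if var in env:
            params[name] = int(env[var])
    return params


if __name__ == "__main__":
    print(resolve_aio_params(128, {"FF_IO_AIO_THREADS": "8"}))
```

With 128 configured processors and FF_IO_AIO_THREADS set to 8, the model yields aio_numusers=128, aio_threads=8, and aio_locks=3.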
Users can also supersede the FFIO library's call to aio_sgi_init by calling it themselves, before the first I/O statement in their programs.
The following FFIO layers may issue asynchronous I/O calls on IRIX systems:
cos: see the INTRO_FFIO(3f) man page for the circumstances in which the cos layer uses asynchronous I/O.
cachea and bufa: users should assume that these layers may issue asynchronous I/O calls.
system or syscall: these layers may issue asynchronous I/O calls if called from a BUFFER IN or BUFFER OUT Fortran statement, or if called from one of the listed layers. The system and syscall layers may also issue asynchronous I/O calls if called via the ffreada(3c), ffwritea(3c), or fflistio(3c) routines (deferred implementation on IRIX systems).
This section describes the use of FFIO with common file structures and describes the correlation between the common and/or default file structures and the FFIO usage that handles them.
Most human-readable files are in text format; this format contains records comprised of ASCII characters with each record terminated by an ASCII line-feed character, which is the newline character in UNIX terminology. The FFIO specification that selects this file structure is assign -F text.
The FFIO package is seldom required to handle text files. In the following types of cases, however, using FFIO may be necessary:
Optimizing text file access to reduce I/O wait time
Handling multiple EOF records in text files
Converting data files to and from other formats
I/O speed is important when optimizing text file access. Using assign -F text is expensive in terms of CPU time, but it lets you use memory-resident and SDS files, which can reduce or eliminate I/O wait time.
The FFIO system also can process text files that have embedded EOF records. The ~e string alone in a text record is used as an EOF record. Editors such as sed(1) or other standard utilities can process these files, but it is sometimes easier with the FFIO system.
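The effect of an embedded EOF record can be illustrated with a small sketch. It is written in Python for illustration only and is not how the text layer is implemented; the function name `split_at_eof_records` is hypothetical, and the sketch assumes that an EOF record is a line consisting of exactly `~e` with no surrounding white space.

```python
def split_at_eof_records(text):
    """Split text-format data into groups of records at '~e' lines.

    Each line is one record. A record that contains only '~e' is
    treated as an EOF record: it ends the current group and starts
    the next one. A trailing '~e' therefore leaves an empty final group.
    """
    groups = [[]]
    for record in text.splitlines():
        if record == "~e":
            groups.append([])        # EOF record: start a new group
        else:
            groups[-1].append(record)
    return groups


if __name__ == "__main__":
    sample = "alpha\nbeta\n~e\ngamma\n"
    for number, group in enumerate(split_at_eof_records(sample), start=1):
        print(number, group)
```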
On UNICOS and UNICOS/mk systems, the text layer is also useful in conjunction with the fdcp(1) command. The text layer provides a standard output format. Many forms of data that are not considered foreign are sometimes encountered in a heterogeneous computing environment. If a record format can be described with an FFIO specification, it can usually be converted to text format by using the following script:
OTHERSPEC=$1
INFILE=$2
OUTFILE=$3
assign -F ${OTHERSPEC} ${INFILE}
assign -F text ${OUTFILE}
fdcp ${INFILE} ${OUTFILE}
Use the fdcp command to copy files while converting record blocking.
The simplest form of data file format is the simple binary stream or unblocked data. It contains no record marks, file marks, or control words. This is usually the fastest way to move large amounts of data, because it involves a minimal amount of CPU and system overhead.
The FFIO package provides several layers designed specifically to handle this binary stream of data. These layers are syscall, sds, and mr. These layers behave the same from the user's perspective; they only use different system resources. The unblocked binary stream is usually used for unformatted data transfer. It is not usually useful for text files or when record boundaries or backspace operations are required. The complete burden is placed on the application to know the format of the file and the structure and type of the data contained in it.
This lack of structure also allows flexibility; for example, a file declared with one of these layers can be manipulated as a direct-access file with any desired record length.
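Because an unblocked stream carries no control words, direct access reduces to offset arithmetic: record n of length recl starts at byte (n-1)*recl. The following Python sketch illustrates the idea on an in-memory stream; the helper name `read_record` is hypothetical and the code does not use FFIO.

```python
import io


def read_record(stream, recno, recl):
    """Read 1-based record recno of recl bytes from an unblocked stream."""
    stream.seek((recno - 1) * recl)      # records are laid end to end
    data = stream.read(recl)
    if len(data) != recl:
        raise EOFError(f"record {recno} is incomplete or past end of file")
    return data


if __name__ == "__main__":
    stream = io.BytesIO(bytes(range(24)))
    # The same bytes can be viewed with any record length.
    print(read_record(stream, 2, 8))     # second 8-byte record
    print(read_record(stream, 3, 4))     # third 4-byte record
```

The same 24 bytes are read once as 8-byte records and once as 4-byte records, which is the flexibility the text describes.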
In this context, fdcp can be called to do the equivalent of the cp(1) command only if the input file is a binary stream and to remove blocking information only if the output file is a binary stream.
The most common use for fixed-length record files is for Fortran direct access. Both unformatted and formatted direct-access files use a form of fixed-length records. The simplest way to handle these files with the FFIO system is with binary stream layers, such as system, syscall, cache, and cachea (all available on UNICOS, UNICOS/mk, and IRIX systems), and sds and mr (available only on UNICOS and UNICOS/mk systems). These layers allow any requested pattern of access and also work with direct-access files. The syscall and system layers, however, are unbuffered and do not give optimal performance for small records.
The FFIO system also directly supports some fixed-length record formats.
The COS blocking format is the default file structure for all Fortran sequential unformatted files on UNICOS and UNICOS/mk systems, except tape files. The cos layer is provided to handle these files. It provides for COS blocked files on disk and on magnetic tape and it supports multifile COS blocked datasets.
The cos layer must be specified for COS blocked files. If COS is not the default file structure, or if you specify another layer, such as sds, you may have to specify a cos layer to get COS blocking.
FFIO can be used to enhance performance in a program without changing the source code or recompiling the code. This section describes some basic techniques used to optimize I/O performance. Additional optimization options are discussed in Chapter 13, “I/O Optimization”.
In the FFIO system, buffering is the responsibility of the individual layers; therefore, you must understand the individual layers in order to control the use and size of buffers.
The cos layer has high payoff potential to the user who wants to extract top performance by manipulating buffer sizes. As the following example shows, the cos layer accepts a buffer size as the first numeric parameter:
assign -F cos:42 u:1
The preceding example declares a working buffer size for the cos layer of forty-two 4096-byte blocks. This is an excellent size for a file that resides on a DD-49 disk drive because a track on a DD-49 disk drive is comprised of forty-two 4096-byte blocks (sectors).
If the buffer is sufficiently large, the cos layer also lets you keep an entire file in the buffer and avoid almost all I/O operations.
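The buffer arithmetic above can be checked with a short Python sketch. The helper names are hypothetical, and the second function ignores the space taken by COS control words, so it gives only a lower bound on the buffer needed to hold a whole file.

```python
BLOCK_BYTES = 4096  # block (sector) size used in the text


def cos_buffer_bytes(nblocks):
    """Bytes of working buffer requested by 'assign -F cos:nblocks'."""
    return nblocks * BLOCK_BYTES


def blocks_to_hold(file_bytes):
    """Smallest whole number of blocks that covers file_bytes of data.

    Control-word overhead is not counted (simplifying assumption).
    """
    return -(-file_bytes // BLOCK_BYTES)   # ceiling division


if __name__ == "__main__":
    print(cos_buffer_bytes(42))
    print(blocks_to_hold(409600))
```

A 42-block buffer is 172032 bytes, and 409600 bytes of data need at least 100 blocks.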
I/O optimization usually consists of reducing overhead. One part of the overhead in doing I/O is the CPU time spent in record blocking. For many files in many programs, this blocking is unnecessary. If this is the case, the FFIO system can be used to deselect record blocking and thus obtain appropriate performance advantages.
The following layers offer unblocked data transfer:
Layer | Definition
syscall | System call I/O
bufa | Buffering layer
cachea | Asynchronous cache layer
sds | SDS-resident I/O (not available on IRIX systems)
cache | Memory-resident buffer cache
mr | Memory-resident (MR) I/O (deferred implementation on IRIX systems)
You can use any of these layers alone for any file that does not require the existence of record boundaries. This includes any applications that are written in C that require a byte stream file.
The syscall layer offers a simple direct system interface with a minimum of system and library overhead. If requests are larger than approximately 32 Kbytes, this method can be appropriate, especially if the requests are a uniform multiple of 4096 bytes.
The other layers are discussed in the following sections.
The bufa and cachea layers permit efficient file processing. Both layers provide library-managed asynchronous buffering, and the cachea layer allows recently accessed parts of a file to be cached either in main memory or in a secondary data segment.
The number of buffers and the size of each buffer are tunable. In the bufa:bs:nbufs or cachea:bs:nbufs FFIO specifications, the bs argument specifies the size of each buffer in 4096-byte blocks. The default on UNICOS and UNICOS/mk systems depends on the st_oblksize field returned from a stat(2) system call on the file; if this return value is 0, the default is 489 for ER90 files and 8 for all other files. The nbufs argument specifies the number of buffers to use.
The sds layer is not available on UNICOS/mk systems or on IRIX systems. It is only available on UNICOS systems.
The sds layer lets you use the secondary data segment (SDS) feature as an I/O device for almost any file. SDS is one use of the solid-state storage device (SSD). SDS as a device is described in the UNICOS File Formats and Special Files Reference Manual. If SDS is available, the sds layer can yield very high performance. The sds transfer rate can approach 2 Gbit/s.
Used in combination with the other layers, COS blocked files, text files, and direct-access files can reside in SDS without recoding. This can provide excellent performance for any file or part of a file that can reside in SDS.
The sds layer offers the capability to declare a file to be SDS resident. It features both scratch and save mode, and it performs overflow to the next lower layer (usually disk) automatically. You can declare that a file should reside in SDS to the extent possible. The simplest specification is assign -F sds fort.1.
This specification assumes default values for all options on the sds layer. By default, the sds layer is in save mode, which makes the SDS appear like an ordinary file. Because save is the assumed mode, any existing file is loaded into SDS when the file is opened. When the file is closed, the data is written back to the disk if the data was changed.
The sds layer overflows if necessary. Data that does not fit in the SDS space overflows to the next lower-level layer. This happens regardless of the reason for insufficient SDS space. For example, if you are not validated to use SDS, all of the files that are declared to be SDS-resident immediately overflow. In the previous assign(1) example, the overflow goes to disk file fort.1. The part of the file that fits in SDS remains there until the file is closed, but the overflowed portion resides on disk.
The assign -F command specifies the entire set of processing steps that are performed when I/O is requested. You can use other layers in conjunction with the sds layer to produce the desired file structures.
In the previous example, no specification exists for blocking on the file. Therefore, the resulting file structure is identical to the following:
assign -s u fort.1
This is also identical to the following:
assign -F syscall fort.1
If a file is COS blocked, a specification must be used that handles block and record control words. The following three examples produce identical files:
assign -s cos fort.1
assign -F cos fort.1
assign -F cos,sds fort.1
If the file is read or written more than once, adding sds to the assign command provides speed.
If SDS space is unlimited, almost any unformatted sequential file referenced from Fortran I/O can be declared by using the following command:
assign -F cos,sds unf_seq
Any formatted sequential file could be declared by using the following command:
assign -F text,sds fmt_seq
Record blocking is not required for unformatted direct-access files; therefore, any unformatted direct-access file can be declared by using the following command:
assign -F sds fort.1
In many cases, the cos specification is not necessary, but that decision must be made based on the specifics of the particular file and program.
All SDS space that the sds layer uses is obtained from the sdsalloc(3) library routines. Parameters, environment variables, and rules that pertain to these routines are fully applicable to this I/O technique.
For information about possible fragmentation with SDS, see the ldcache(8) man page.
“Sample Programs for UNICOS Systems” contains several sds layer examples.
The mr layer lets you use main memory as an I/O device for many files. Used in combination with the other layers, COS blocked files, text files, and direct-access files can all reside in memory without recoding. This can result in excellent performance for any file or part of a file that can reside in memory.
If the file is small enough to fit in memory and is traversed many times, the wall-clock time can be reduced dramatically by using the mr layer to keep the file entirely in memory.
The mr layer lets you declare that a file is memory resident. It features both scratch and save mode, and it performs overflow to the next lower layer (usually disk) automatically.
Memory-resident files can run either in interactive or batch mode. The format for the mr layer on the assign(1) command is as follows:
assign -F mr.savscr.ovfopt:min:max:incr
The assign -F command specifies the entire set of processing steps that are performed when I/O is requested. If the mr layer is specified alone, the resulting file structure is identical to the following:
assign -s unblocked fort.1
If a file is COS blocked, you must specify the handling of block and record control words as in the following example:
assign -s cos fort.1
The previous assign specification is identical to both of the following:
assign -F cos fort.1
assign -F cos,mr fort.1
“Sample Programs for UNICOS Systems” contains several mr program examples.
The cache layer permits efficient file processing for repeated access to one or more regions of a file. It is a library-managed buffer cache that contains a tunable number of pages of tunable size.
To specify the cache layer, use the following option:
assign -F cache[:[bs][:[nbufs]]]
The bs argument specifies the size in 4096-byte blocks of each cache page; the default is 8. The nbufs argument specifies the number of cache pages to use. The default is 4. You can achieve improved I/O performance by using one or more of the following strategies:
Use a cache page size (bs) that is a multiple of the disk sector or track size. This improves the performance when flushing and filling cache pages.
Use a cache page size that is a multiple of the user's record size. This ensures that no user record straddles two cache pages. If this is not possible or desirable, it is best to allocate a few additional cache pages (nbufs).
Use a number of cache pages that is greater than or equal to the number of file regions the code accesses at one time.
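The last strategy can be illustrated with a small simulation. The text does not say which replacement policy the cache layer uses, so the Python sketch below assumes least-recently-used (LRU) replacement purely for illustration; the function names are hypothetical.

```python
from collections import OrderedDict


def count_page_misses(page_sequence, nbufs):
    """Count page misses for a cache of nbufs pages (nbufs >= 1).

    LRU replacement is an assumption of this sketch, not a documented
    property of the cache layer.
    """
    cache = OrderedDict()
    misses = 0
    for page in page_sequence:
        if page in cache:
            cache.move_to_end(page)          # mark as most recently used
        else:
            misses += 1
            if len(cache) >= nbufs:
                cache.popitem(last=False)    # evict least recently used
            cache[page] = True
    return misses


def round_robin(regions, passes):
    """Page sequence for code that cycles over a fixed set of regions."""
    return [r for _ in range(passes) for r in range(regions)]


if __name__ == "__main__":
    accesses = round_robin(regions=10, passes=100)
    print(count_page_misses(accesses, nbufs=10))
    print(count_page_misses(accesses, nbufs=9))
```

Under this model, code that cycles over 10 regions misses only 10 times in 1000 accesses with 10 pages, but misses on every access with 9 pages, because each page is evicted just before it is needed again.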
If the number of regions accessed within a file is known, the number of cache pages can be chosen first. To determine the cache page size, divide the amount of memory to be used by the number of cache pages. For example, suppose a program uses direct access to read 10 vectors from a file and then writes the sum to a different file:
      integer VECTSIZE, NUMCHUNKS, CHUNKSIZE
      parameter(VECTSIZE=1000*512)
      parameter(NUMCHUNKS=100)
      parameter(CHUNKSIZE=VECTSIZE/NUMCHUNKS)
      real a(CHUNKSIZE), sum(CHUNKSIZE)
      open(11,access='direct',recl=CHUNKSIZE*8)
      call asnunit (2,'-s unblocked',ier)
      open (2,form='unformatted')
      do i = 1,NUMCHUNKS
         sum = 0.0
         do j = 1,10
            read(11,rec=(j-1)*NUMCHUNKS+i)a
            sum=sum+a
         enddo
         write(2) sum
      enddo
      end
If 4 Mbytes of memory are allocated for buffers for unit 11, 10 cache pages should be used, each of the following size:
4MB/10 = 400000 bytes = 97 blocks
Make the buffer size an even multiple of the record length of 40960 bytes by rounding it up to 100 blocks (= 409600 bytes), then use the following assign command:
assign -F cache:100:10 u:11
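The page-size calculation can be written out as a short Python sketch. It assumes that 4 Mbytes means 4,000,000 bytes (which gives the 97-block figure) and that the record length is a whole number of 4096-byte blocks; the function name is hypothetical.

```python
BLOCK_BYTES = 4096


def cache_page_blocks(memory_bytes, nbufs, record_bytes):
    """Cache page size in blocks for 'assign -F cache:bs:nbufs'.

    The memory budget is divided among nbufs pages, and the result is
    rounded up to a whole number of records so that no record straddles
    two pages. Rounding up can exceed the budget slightly.
    """
    if record_bytes % BLOCK_BYTES:
        raise ValueError("record length must be a multiple of the block size")
    raw_blocks = memory_bytes // nbufs // BLOCK_BYTES
    record_blocks = record_bytes // BLOCK_BYTES
    records_per_page = -(-raw_blocks // record_blocks)   # ceiling division
    return records_per_page * record_blocks


if __name__ == "__main__":
    # CHUNKSIZE*8 = 5120 words * 8 bytes = 40960-byte records
    bs = cache_page_blocks(4_000_000, 10, 40960)
    print(f"assign -F cache:{bs}:10 u:11")
```

For the example program, 400,000 bytes per page is 97 blocks, and rounding up to a multiple of the 10-block record gives 100 blocks.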
The following examples show programs that use the layers discussed previously.
Example 11-1. sds using buffer I/O
The following batch request shell script uses an sds layer with buffer I/O. The batch job, named exam1, contains the following statements:
#QSUB -r exam1 -lT 10 -lQ 500000
#QSUB -eo -o exam1.out
set -x
cd $TMPDIR
cat > ex1.f <<EOF
      program example1
      double precision r(512),r1(512)
      open(1,form='unformatted')
      do 100 k=1,100
         do 10 j=1,512
 10      r(j)=j+k
         buffer out(1,1)(r(1),r(512))
         if(unit(1).ne.-1.0)then
            print *,"error on 1 bufferout=",unit(1)," rec=",k
         end if
 100  continue
      rewind 1
      do 200 k=1,100
         buffer in(1,1)(r1(1),r1(512))
         if(unit(1).ne.-1.0)then
            print *,"error on bufferin 1=",unit(1)," rec=",k
         end if
 200  continue
      close(1)
      end
EOF
f90 ex1.f -o ex1             # compile and load
assign -R                    # reset assign parameters
assign -F sds f:fort.1       # assign fort.1 to SDS
./ex1                        # execute
If fort.1 does not exist, the example could use a blocked or unblocked file structure for the fort.1 file. If COS record blocking is preferred, the last assign command should be changed to add the following cos layer:
assign -F cos,sds fort.1
The -lQ 500000 parameter on line 1 of the example indicates to NQS that the job needs 500,000 bytes of SDS for execution. It is rounded up to the nearest sector. The size was calculated from the formula (512 × 100 × 8 + numctl); numctl is the number of control words needed for the number of records and sectors in the file.
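The sizing formula can be checked with a short Python sketch. The value of numctl is not given in the text, so the sketch only computes the data portion and the headroom that the 500,000-byte request leaves for control words. The 4096-byte sector size is an assumption carried over from the block size used elsewhere in this chapter, and the helper names are hypothetical.

```python
SECTOR_BYTES = 4096  # assumption: one sector equals one 4096-byte block


def data_bytes(words_per_record, nrecords, bytes_per_word=8):
    """Bytes of user data: words per record x records x bytes per word."""
    return words_per_record * nrecords * bytes_per_word


def round_up_to_sector(nbytes):
    """Round a byte count up to the next whole sector."""
    return -(-nbytes // SECTOR_BYTES) * SECTOR_BYTES


if __name__ == "__main__":
    requested = 500000
    payload = data_bytes(512, 100)
    print(payload)                        # data portion of the formula
    print(requested - payload)            # headroom left for control words
    print(round_up_to_sector(requested))  # request after sector rounding
```

Under these assumptions the data occupies 409600 bytes, 90400 bytes remain for control words, and the request rounds up to 503808 bytes.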
This example assumes that save and overflow are the default options for the sds layer. File fort.1 is saved in the current directory. If the data exceeds the maximum available space, the file overflows to disk. As shown, the example does not exceed the maximum available space.
Example 11-2. Unformatted sequential sds example
In the following batch job, ex3, the program performs unformatted sequential I/O to a file by using the cos and sds layers:
#QSUB -r ex3 -lT 10 -lQ 500000
#QSUB -eo -o exam3.out
date
set -x
cd $TMPDIR
cat > test.f <<EOF
      program example3
      integer r(512),r1(512)
      data r/512*0.0/
      open(1,form='unformatted')
      do 100 k=1,100
         write(1)r
 100  continue
      rewind(1)
      read(1)r1
      endfile(1)
      backspace(1)
      backspace(1)
      read(1)r1
      close(1)
      end
EOF
f90 -o test test.f           # compile and load
assign -R                    # reset assign parameters
assign -F cos,sds.scr.novfl fort.1
                             # assign file fort.1 to be COS blocked, and
                             # reside in SDS space with scr scratch
                             # novfl overflow option enabled
./test                       # execute
In this example, the program uses BACKSPACE statements; therefore, the specified file must be a COS blocked file structure. Record delimiters are needed for repositioning through a backspace. If the program did not use BACKSPACE statements, the file could be assigned an unblocked structure by using the following command:
assign -F sds.scr.novfl fort.1
The scr parameter directs the library not to save the file on disk when unit 1 is closed. The novfl parameter in this example specifies that the program aborts if the amount of data written exceeds the maximum available space.
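Why BACKSPACE needs record delimiters can be illustrated with a generic framing scheme. The Python sketch below is not the COS blocking format: it simply stores a 4-byte length before and after each record so that the start of the previous record can be found from the current position. In an unblocked stream there is nothing comparable to read, so the library cannot tell where the previous record began. All names are hypothetical.

```python
import io
import struct

# 4-byte little-endian length marker. This is an illustrative framing,
# NOT the layout of COS block and record control words.
LEN = struct.Struct("<I")


def write_framed(stream, payload):
    """Write one record as: length, payload, length."""
    stream.write(LEN.pack(len(payload)))
    stream.write(payload)
    stream.write(LEN.pack(len(payload)))


def read_framed(stream):
    """Read the record that starts at the current position."""
    (n,) = LEN.unpack(stream.read(LEN.size))
    payload = stream.read(n)
    stream.seek(LEN.size, io.SEEK_CUR)       # skip the trailing marker
    return payload


def backspace(stream):
    """Reposition to the start of the record that ends here."""
    pos = stream.tell()
    if pos == 0:
        return                               # already at start of file
    stream.seek(pos - LEN.size)              # trailing marker of that record
    (n,) = LEN.unpack(stream.read(LEN.size))
    stream.seek(pos - n - 2 * LEN.size)


if __name__ == "__main__":
    f = io.BytesIO()
    write_framed(f, b"first")
    write_framed(f, b"second")
    backspace(f)
    print(read_framed(f))
```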
Example 11-3. sds and mr with WAIO
The following batch job, exam5, uses the word-addressable (WA) package with an sds and mr layer. The arguments to the assign(1) command in this example are as follows:
-F mr.scr.ovfl::25:,sds.scr.ovfl::50: fort.1
The arguments specify that the first 25 blocks of the file reside in main memory. If the program writes more than 25 blocks, the overflow is written to the next lower layer (sds in this example). If the program writes more than 75 blocks, the remainder or overflow is written to the next lower layer (disk). No space is reserved on disk for the amount of blocks that are memory-resident or SDS-resident because nothing will be saved. The file is deleted when the file is closed. If the save option is specified as in the following example, the first 25 blocks of the written file will be memory resident:
-F mr.save.ovfl::25:,sds.save.ovfl::50: fort.1
If the program writes another 50 blocks to the file, 25 blocks are skipped in the next lower layer (sds) and the 50 blocks are written after skipping the first 25 blocks. If the program writes more than 75 blocks to the file, the first 75 blocks of the next lower layer (disk) are skipped and the write starts at the 76th block of the disk file. When the file is closed, the mr and sds layers are written to the file.
#QSUB -r exam5 -lT 10 -lQ 500000
#QSUB -eo -o ex5.out
date
set -x
cd $TMPDIR
cat > ex5.f <<EOF
      program example5
      dimension r(512),r1(512)
      iblks=10           !use a 10 block buffer
      istats=1           !print out I/O stats
      call wopen(1,iblks,istats,ier)
      if(ier.ne.0)then
         print *,"error on wopen=",ier
         goto 300
      end if
      iaddr=1
      do 100 k=1,100
         do 10 j=1,512
 10      r(j)=j+k
         call putwa(1,r,iaddr,512,ier)
         if(ier.ne.0)then
            print *,"error on putwa=",ier," rec=",k
            goto 300
         end if
         iaddr=iaddr+512
 100  continue
      iaddr=1
      do 200 k=1,100
         call getwa(1,r1,iaddr,512,ier)
         if(ier.ne.0)then
            print *,"error on getwa=",ier," rec=",k
            goto 300
         end if
         iaddr=iaddr+512
 200  continue
 300  continue
      call wclose(1)
      end
EOF
f90 ex5.f -o ex5         # compile and load
assign -R                # reset assign parameters
assign -F mr.scr.ovfl::25:,sds.scr.ovfl::50 f:fort.1
                         # assign mr layer with 25 block limit &
                         # sds layer with 50 block limit to fort.1
./ex5                    # execute
Example 11-4. Unformatted direct sds and mr example
In the following example, batch job ex8 contains a program that uses unformatted direct-access I/O with both an sds and an mr layer:
#QSUB -r ex8 -lT 10 -lQ 500000
#QSUB -eo -o ex8.out
date
set -x
cd $TMPDIR
cat > ex8.f <<EOF
      program example8
      dimension r(512)
      data r/512*2.0/
      open(1,form='unformatted',access='direct',recl=4096)
      do 100 i=1,100
         write(1,rec=i,iostat=ier)r
         if(ier.ne.0)then
            if(ier.eq.5034)then
               print *,"overflow to disk at record=",i
            else
               print *,"error on write=",ier
            end if
         end if
 100  continue
      do 200 i=100,1,-1
         read(1,rec=i,iostat=ier)r
         if(ier.ne.0)then
            print *,"error on read=",ier
         end if
 200  continue
      close(1)
      end
EOF
f90 ex8.f -o ex8         # compile and load
assign -R                # reset assign parameters
assign -F mr.scr.ovfl::50:,sds.scr.ovfl::100: fort.1
                         # assign file fort.1 to be mr with a
                         # 50 block limit, then write the next
                         # 100 blocks to sds.
./ex8                    # execute
The program writes the first 50 blocks of fort.1 to the memory-resident layer. The next 50 blocks overflow the mr buffer and will be written to the sds layer. No space is reserved in SDS for the first 50 blocks of the file. If the program writes more than 150 blocks to fort.1, the overflow is written to the next lower layer (disk). Because the scr option is specified, the file is not saved when fort.1 is closed.
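The way blocks spill from one layer to the next can be modeled in a few lines of Python. This is a simplified illustration, not the library's algorithm: it considers only the maximum size of each layer, ignores the min and incr fields, and assumes that overflow is enabled on every layer. The function name is hypothetical.

```python
def distribute_blocks(total_blocks, layer_limits):
    """Place blocks in layers from top to bottom; the rest goes to disk.

    layer_limits is an ordered list of (layer_name, max_blocks) pairs,
    listed in the same order as on the assign -F specification.
    """
    placement = {}
    remaining = total_blocks
    for name, limit in layer_limits:
        used = min(remaining, limit)     # fill this layer up to its limit
        placement[name] = used
        remaining -= used
    placement["disk"] = remaining        # whatever no layer could hold
    return placement


if __name__ == "__main__":
    limits = [("mr", 50), ("sds", 100)]
    print(distribute_blocks(100, limits))
    print(distribute_blocks(200, limits))
```

With the limits from this example, 100 blocks split into 50 in memory and 50 in SDS with nothing on disk; writing 200 blocks would leave 50 on disk.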
Example 11-5. sds with MS package example
In this example, batch job ex9 contains a program that uses the MS record-addressable package and an sds layer:
#QSUB -r ex9 -lT 10 -lQ 500000
#QSUB -eo -o ex9.out
set -x
cd $TMPDIR
cat > ex9.f <<EOF
      program example9
      dimension r(512)
      dimension idx(512)
      data r/512*2.0/
      irflag=0
      call openms(1,idx,100,0,ier)
      do 100 i=1,100
         call writms(1,r,512,i,irflag,0,ier)
         if(ier.ne.0)then
            print *,"error on writms=",ier
            goto 300
         end if
 100  continue
      do 200 i=1,100
         call readms(1,r,512,i,irflag,0,ier)
         if(ier.ne.0)then
            print *,"error on readms=",ier
            goto 300
         end if
 200  continue
 300  continue
      call closms(1,ier)
      end
EOF
f90 ex9.f -o ex9         # compile
assign -R                # reset assign parameters
assign -F sds.scr.novfl:50:100: fort.1
                         # assign file fort.1 to sds minimum 50
                         # block allocation and maximum 100 blocks.
                         # If more than 100 blocks are written, that
                         # portion of data overflows to disk.
./ex9                    # execute
Because the scr option was specified, the file is not saved when fort.1 is closed.
Example 11-6. mr with buffer I/O example
The following program uses a memory-resident layer with buffer I/O:
cat > ex2.f <<EOF
      program example2
      integer r(512),r1(512)
      integer ipos1(100)
      open(1,form='unformatted')
      do 100 k=1,100
         do 10 j=1,512
 10      r(j)=j+k
         call getpos(1,1,ipos1(k))
         buffer out(1,1)(r(1),r(512))
         if(unit(1).ne.-1.0)then
            print *,"error on bufferout=",unit(1)," rec=",k
            goto 300
         end if
 100  continue
      do 200 k=100,1,-2
         call setpos(1,1,ipos1(k))
         buffer in(1,1)(r1(1),r1(512))
         if(unit(1).ne.-1.0)then
            print *,"error on bufferin =",unit(1)," rec=",k
            goto 300
         end if
 200  continue
 300  continue
      close(1)
      end
EOF
f90 ex2.f -o ex2         # compile and load
assign -R                # reset assign parameters
assign -F mr fort.1      # assign file fort.1 to memory
./ex2
You may specify a blocked or unblocked file structure for file fort.1 if the file does not exist.
Example 11-7. Unformatted sequential mr examples
The following program uses an mr layer with unformatted sequential I/O:
      program example4a
      integer r(512)
      data r/512*1.0/
C     Reset assign environment, then assign file without FFIO
C     to be read back in by subsequent program.
      call assign('assign -R',ier1)
      call assign('assign -a /tmp/file1 -s unblocked f:fort.1',ier2)
      if(ier1.ne.0.or.ier2.ne.0)then
         print *,"assign error"
         goto 200
      end if
      open(1,form='unformatted')
C     write out 100 records to disk file: /tmp/file1
      do 100 k=1,100
         write(1)r
 100  continue
      close(1)
 200  continue
      end
To use a blocked file structure, the assign arguments in program unit example4b, which follows, should contain the following options:
assign -R
assign -a /tmp/file1 -F cos,mr.save.ovfl u:3
As written, example4a writes an unblocked disk file, /tmp/file1. If you want to use a blocked file structure, the assign command arguments in program unit example4a should contain the following:
assign -R
assign -a /tmp/file1 f:fort.1
Program unit example4b is as follows:
      program example4b
      integer r(512)
C     Reset assign environment, then assign file
C     with an mr layer.
      call assign('assign -R',ier1)
      call assign('assign -a /tmp/file1 -F mr.save.ovfl u:3',ier2)
      if(ier1.ne.0.or.ier2.ne.0)then
         print *,"assign error"
         goto 300
      end if
C     open the previously written file '/tmp/file1',
C     load it into memory
      open(3,form='unformatted')
C     read 5 records
      do 200 k=1,5
         read(3)r
 200  continue
      rewind(3)
      close(3)
 300  continue
      end
A sequential formatted file must always have a text specification before the residency layer specification so that the I/O library can determine the end of a record.
Example 11-8. mr and MS package example
The following program uses an mr layer with the MS record-addressable package:
      program example10
      dimension r(512)
      dimension idx(512)
      data r/512*2.0/
      irflag=0
      call assign('assign -R',ier1)
C
C     Assign memory-resident file with overflow, an initial
C     size of 10 512-word blocks, a 200 block limit before
C     overflow and an increment size of 20 blocks.
      call asnunit(1,'-F mr.save.ovfl:10:200:20',ier2)
      if(ier1.ne.0.or.ier2.ne.0)then
         print *,"assign error"
         goto 400
      end if
      call opendr(1,idx,100,0,ier)
      do 100 i=1,100
         call writdr(1,r,512,i,irflag,0,ier)
         if(ier.ne.0)then
            print *,"error on writdr=",ier
            goto 300
         end if
 100  continue
      do 200 i=1,100
         call readdr(1,r,512,i,irflag,0,ier)
         if(ier.ne.0)then
            print *,"error on readdr=",ier
            goto 300
         end if
 200  continue
 300  call closdr(1,ier)
 400  continue
      end