ANZFLD -- Analyze Field

ANZFLD ANALYZE FIELD TAAQRYD
The Analyze Field command sequences an externally described file on a named field and allows analysis by percentiles, ranges, or unique values. For each summary line printed, decimal fields may be summed, averaged, the maximum and minimum values determined, and the standard deviation calculated. Selection criteria may be specified to limit what is processed. A physical, logical, or joined logical file may be processed (only single format files are valid). A typical command to print the unique values for the field CODE and summarize the AMT field would be: ANZFLD FILE(xxx) ANZTYPE(UNIQUE) ANZFLD(CODE) SUMFLD((AMT SUM)) The TAA SORTDBF command is used to sequence the file on the CODE field and create a temporary file in QTEMP. SORTDBF uses OPNQRYF for selection and the Sort API to sort the data. The ANZFLD program reads the sorted file and outputs a spooled file. There would be one line for each unique CODE value including the sum of the AMT field and a count of the number of records. A final total line is also printed along with the median value found (the record at the 50th percentile), and the mode value found (the value with the most number of occurrences). The default is to display the information. The spooled file is retained. Up to 30 fields may be summarized. The width of the spooled file is 378 bytes (maximum supported by DSPSPLF). If you want a hard copy of the results, you will probably not be able to request 30 summary fields. You may also get the average, maximum, minimum, and standard deviation for AMT by modifying the SUMFLD parameter: SUMFLD((AMT AVG)(AMT MAX)(AMT MIN)(AMT STDDEV) You may use different fields such as a SUMFLD parameter which specifies the SALES and COST fields: SUMFLD((SALES AVG)(COST AVG) Percentiles ----------- The percentile function allows you to print a line for a group of records that make up one or more percentiles. The records are sequenced by the named field to determine the percentiles. For example, if there are 1000 records in the file, each 10 records represent one percentile. You can print all 100 percentiles or a group of percentiles. For example, to print a line for every 10 percentiles you would specify: ANZFLD FILE(xxx) ANZTYPE(PERCENTILE) ANZFLD(CODE) PERCENTILE(10) SUMFLE(...) The PERCENTILE(10) request provides for 10 percentiles (1-10, 11-20, ... 91-100). Each number represents how many percentile groups will be printed. You may request 2, 3, 4, 5, 10, 20, or 100 percentiles. Ranges ------ You may identify groups to be summarized by providing a range for each group. Up to 50 ranges may be entered. Either a decimal or character field may be used as the field name for the ANZFLD parameter. The ranges must be specified in ascending sequence and cannot overlap. A typical command would be: ANZFLD FILE(xxx) ANZTYPE(RANGES) ANZFLD(CODE) RANGES((AAA BBB)(CCC DDA)(DDB EEE)) SUMFLD(...) The 'From value' for each range must be less than the 'To value' for the same range and must be greater than the 'To value' of the previous range. One line would be printed for each range that contains records. If any records exist that do not fall into any of the specified ranges, an 'Other' total is summarized. When a decimal field is specified for a range of values, a decimal type of comparison is used. Consequently, you may specify a low value of 0 or 1 even though the field may have several digits. Negative values are valid such as a range of -10 to -2. Since ascending values are required, you must begin with the largest negative value. Range values should be entered with a decimal notation. For example, if you have a 5 digit field with 2 decimal positions and want a range of 1.00 to 2.00, enter 1.00 to 2.00. You must enter the same number of digits to the right of the decimal character as are defined for the field specified for the ANZFLD parameter. When a character field is specified for a range of values, a character comparison is used. You may use a shorter length than the full field, but you must be careful to specify the proper range. For example, assume you have a 2 byte character field and you specify ranges of A to B, and C to D. If the data in the field to be analyzed is BA, it will not fall in either range (it would appear as an 'Other' value). You could use a range such as A to B9 to allow all data such as BA to fall into the first range. Selection --------- Selection may be specified, to minimize the number of records that are to be processed such as: ANZFLD FILE(xxx) ANZTYPE(UNIQUE) ANZFLD(CODE) SUMFLD((AMT SUM)) SELFLD((STATE EQ NY)) You may select on the field named for ANZFLD or different fields. Multiple selection fields may be named and an AND/OR relationship described. The typical boolean operators EQ, NE, etc are supported along with: * The CT operator provides 'contains' which causes a scan of the data. For example, you may select based on a string of characters being anywhere in the field. * The WC operator provides for a 'wildcard' function. The simplest type is a generic value such as ABC meaning any value beginning with ABC. Other wildcard functions are supported. Saving the ANZFLD command ------------------------- The ANZFLD command is typically entered interactively and refined with multiple iterations until you achieve the results you are looking for. A subset of a large file may be used to provide better performance until you are satisfied with the specifications. Once you have a version of ANZFLD that meets your requirements, you should consider saving the command for future use. A good tool to consider is the FAVCMD TAA tool. This allows you to store the command with an ID that may be used in the future. To use FAVCMD, you must first create the required files in a library on your library list with the command: CRTFAVCMD FAVCMDPLIB(xxx) Then you follow the ANZFLD command with ADDFAVCMD and describe an ID such as: ANZFLD ... ADDFAVCMD FAVID(yyy) ADDFAVCMD extracts the previous command from the job's message queue and places it in the FAVCMD file. If you want to run the command at a later date, you would enter: EXCFAVCMD FAVID(yyy) There is also a shorthand version of the command: FV FAVID(yyy) The WRKFAVCMD command will let you work with your stored versions. You can copy, change, display, etc from a standard work type of display. Using FAVCMD allows you the advantage of having a solution that is like a CL program. You can easily display, rerun, modify, and copy to meet the same or similar requirements. Reusing the SORTDBF results --------------------------- When ANZFLD runs, a file named TAATMPQRY is output to QTEMP as the result of the internal SORTDBF TAA command. The TAATMPQRY file is processed by the ANZFLD program to produce the spooled output. TAATMPQRY is deleted by default at the end of the command. You can retain the file by specifying RTNOUTFILE(YES). The file may be used for your own purposes, but the typical use is in conjunction with another use of ANZFLD to avoid re-running SORTDBF. When FILE(QTEMP) is specified, the SORTDBF command is bypassed and processing begins immediately on the TAATMPQRY file. Thus the sequence of commands would be: ANZFLD FILE(xxx) ... RTNOUTFILE(YES) ANZFLD FILE(QTEMP) ... ANZFLD FILE(QTEMP) ... After the first use of ANZFLD, the TAATMPQRY file is used again for additional ANZFLD commands. This allows you to vary the ANZTYPE and SUMFLD parameters to see different results. When FILE(QTEMP) is specified, the TAATMPQRY file is automatically retained (the RTNOUTFILE parameter is ignored). Note that when FILE(QTEMP) is specified, the library qualifier for the FILE parameter (such as LIBL) is ignored. The following parameters are also ignored because they are only specified when SORTDBF is used: SELFLD MBR TRNTBL FLTWLDCRD FIXWLDCRD SORTTBL CCSID LANGID OPNQRYFCMD When FILE(QTEMP) is specified, the value of the ANZFLD parameter must be the same as that used to create the TAATMPQRY file. ANZFLD escape messages you can monitor for ------------------------------------------ TAA9891 The member to be processed has no records TAA9895 No records met the selection criteria Escape messages from based on functions will be re-sent. Command parameters CMD ------------------ FILE The qualified name of the file to be read. The library value defaults to LIBL. CURLIB may also be used. A physical or logical file (including a join logical file) may be named. Logical files may have only a single format. The special value QTEMP may be entered if a prior use of ANZFLD has specified RTNOUTFILE(YES). The allows reuse of the TAATMPQRY file in QTEMP and avoids the selection and sorting function. See the RTNOUTFILE parameter for a complete discussion. ANZTYPE The type of analysis to be performed. PERCNTILE. One line is listed for the number of percentiles specified for the PERCENTILE parameter. RANGES. One line is listed for each range identified in the RANGES parameter. UNIQUE. The values in the ANZFLD field are sequenced in ascending order and one line for each unique value is listed. FNLTOT. One line is listed for all selected records. Fields named in SUMFLD will be listed in the total as well as a count of the selected records. ANZFLD The field to be analyzed. Either a decimal field (packed, zoned, or binary) or a character field may be specified. The field will be sequenced in ascending order. The median value (the value of the record at the 50th percentile) is always printed at the end of the listing. The mode value (the value having the most number of frequencies) is always printed at the end of the listing. A character field may be up to 32 bytes in length. A decimal field may be up to 15 bytes in length. PERCENTILE The number of percentiles to be listed. An entry must be made when ANZTYPE(PERCNTILE) is specified. The values that may be entered are 2, 3, 4, 5, 10, 20, or 100. RANGES The ranges to be summarized. Up to 50 ranges may be entered when ANZTYPE(RANGES) is specified. Each range is specified as a 'From' and 'To' value. The 'From value' must be less than or equal to the 'To value' for each range. Ranges must be specified in ascending order and cannot overlap. The 'From value' for each range must be greater than the 'To value' of the preceding range. When a decimal field is specified for a range of values, a decimal type of comparison is used. Consequently, you may specify a low value of 0 or 1 even though the field may have several digits. Negative values are valid such as a range of -10 to -2. Since ascending values are required, you must begin with the largest negative value. Range values should be entered with a decimal notation. For example, if you have a 5 digit field with 2 decimal positions and want a range of 1.00 to 2.00, enter 1.00 to 2.00. You must enter the same number of digits to the right of the decimal character as are defined for the field specified for the ANZFLD parameter. When a character field is specified for a range of values, a character comparison is used. You may use a shorter length than the full field, but you must be careful to specify the proper range. For example, assume you have a 2 byte character field and you specify ranges of A to B, and C to D. If the data in the field to be analyzed is BA, it will not fall in either range (it would appear as an 'Other' value). You could use a range such as A to B9 to allow all data such as BA to fall into the first range. If a blank is required for a character field range, enter BLANK or BLANKS. If a zero is required for a decimal field range, enter 0, ZERO, or ZEROS. SUMFLD Up to 30 fields to be summarized. Only decimal fields of packed, zoned, or binary types of up to 15 digits in length may be specified. The number of decimal positions may not exceed 9. Each field will appear for each line that is printed along with a count of the number of records that exist for the summary line. For each field specified, the type of summary must be included: SUM. The sum of the values. AVG. The average of the values (the sum divided by the number of records making up the sum). The same number of decimal positions will be shown in the answer as exist in the definition of the field being averaged. MAX. The maximum value of the values. MIN. The minimum value of the values. STDDEV. The standard deviation of the values. Standard deviation allows you to express as a number the shape of the distribution curve. The value shown for standard deviation is a number which if added and subtracted from the average would provide a range for approximately 67% of the values in a normal distribution. For example, if numbers 1,2,3...100 exist as values, the median value is 50 and the standard deviation would be 28. This means that between 22 (50-28) and 78 (50+28) you should find approximately 67% of the values. Of the 100 values, 67 lie between 22 and 78. The lower the standard deviation value, the tighter the curve is. A value of 0 means that all values fall at the median. Two standard deviations should account for approximately 95% of the values. Three standard deviations should account for approximately 95+% of the values. Standard deviation is calculated by using the square root of the variance. The variance is calculated as ((b-(aa)/c))/c where a = the sum of the values b = the sum of the square of the values c = the count of the values SELFLD The field to perform selection on. This is a 'list parameter' that allows up to 30 fields to be described for selection. 1) Field name. The field name to perform selection on. The default is NONE. The field must exist in the file unless NONE is specified. Decimal fields (DDS Data types of P = Packed, S = Zoned decimal, B = Binary, F = Floating point) have certain restrictions as described in the following parts. You may use NONE if you have multiple selection fields that are primed by multiple conditions and not all conditions have the same number of selection values. For example if CODE = A you want to select on FLDB, but if CODE = B you want to select on FLDA and FLDB. You may prime the selection fields with IF logic and then have a single ANZFLD command. 2) Operator. The type of comparison to perform. EQ is the default. NE, GT, LT, GE, LE, CT (Contains) and WC (Wildcard) are supported. Decimal fields do not allow CT or WC operators. CT (contains) means the entire field is scanned for the value). Only character fields may be used with CT. WC (Wildcard) causes an EQ search against the data base data with one or more wildcards. Only character fields may be used with WC. Both 'fixed' and 'floating' wildcard functions are supported. The simplest use of wildcards is for either generic requests or where a position within a field should be ignored when selecting. When a generic request is needed such as selecting all values beginning with ABC, just enter ABC or ABC. The default floating wildcard character is ''. If the floating wildcard character does not exist at the end of the value, it is added automatically (there must be room at the end of the value to add a wildcard if none exists). The fixed form of wildcard support allows any value to exist in the search field at a single position. You would enter A_C to select ABC, ADC, AEC, etc. If the field is longer than 3 positions, the floating wildcard is automatically added at the end (if not specified) so you would select values such as ABC, ABCDEF, ADC, and ADCX. '_' is the default fixed wildcard character. When WC is used, if the length of the compare value is equal to the length of the field in the data base, at least one wild card character (either fixed or floating) must exist within the compare value. Multiple fixed wildcards can be specified such as A_D_F. Multiple floating wildcards can be specified such as AA meaning that any number of values may exist between the A's. You would select AA, ABA, ABBBA, etc. You may specify your own characters for the fixed and floating wildcards. See the FLTWLDCRD and FIXWLDCRD parameters. For additional examples and explanation, see the section on 'Wildcard processing' described for the SORTDBF command. 3) Compare value. The compare value (literal) to be used to select data. Up to 32 characters may be entered. The length of data may not exceed the length of the field length in the data base. See the Translate option and WC operator for special handling. For decimal data types, enter a value left adjusted. For example, to select on a 5 digit field with a value of 10, enter 10. If the field to select on has decimal positions, you can select using a whole number (such as 10) or a number with decimal positions (such as 10.5). The decimal notation (US = '.') should be entered in job decimal format. A compare value must be entered (a blank value is invalid). The special values BLANK or BLANKS must be used for character fields to specify a 'blank value'. The special values ZERO or ZEROS may be used (or enter a 0) for decimal fields to specify a 'zero value'. Another field name in the same data base file cannot be used as a compare value. 4) Translate. Whether to translate the values in the data base to upper case before making a comparison. The default is XLATE. Translate is ignored for Decimal fields. If XLATE is used, any character fields in the data base are translated using a translate table (see the TRNTBL parameter). For example, if you enter a compare value of 'ABC', the data will match in the data base for ABC, abc, Abc, AbC, etc. Note that the default is for the situation where you want any 'ABC' value regardless of the case (upper or lower) of the data in the data base. Just enter any case (upper or lower) into the command prompt without surrounding it with quotes (the command prompter will fold the value to upper case). Note that translation operates on the data in the data base and not on the compare value. If you enter a compare value with surrounding quotes such as 'Abc' and take the default for translation, you will not find any values in the data base. All the data base values have been translated to upper case by default before the comparison is made. If you only want the 'Abc' values and not values like 'ABC' or 'AbC', enter a value of 'Abc' (use quotes surrounding the compare value) and specify NOXLATE for translation. There is some performance degradation by using translation. In most applications this is of minor concern. If your data base data is all the same case, you can save some overhead by specifying NOXLATE. 5) And/Or. The relationship of this select field with the next select field. The entry allows you to have one or more fields that form a group. All fields in the group must match the selection criteria to select the record. AND is the default meaning the next selection is considered part of the same group. The value is ignored for the last select field. OR may be specified to start a new group of selections. For example, you may specify FLD1 and FLD2 as one group and FLD3 and FLD4 as a second group. All And/Or values would specify AND except for FLD2 which would specify OR. If you want either FLDA or FLDB equal to certain values as long as FLDX is greater than some value, you must specify two groups (FLDA and FLDX is one group, FLDB and FLDX is a second group). For more details, see the section on 'And/Or Relationships' described for the SORTDBF tool. OUTPUT How to output the results. is the default which displays the spooled file if the command is run interactively. The spooled file is retained. If the command is run in batch, the spooled file is output. PRINT may be specified to just output the spooled file. The spooled file is not automatically displayed. MBR The member to be processed. The default is FIRST for the first member of the file. RTNOUTFILE Whether to retain the TAATMPQRY outfile created internally into QTEMP. The default is DFT meaning the file is not retained unless FILE(QTEMP) was specified. If you are doing a single ANZFLD and do not need the outfile, take the default. YES may be specified to retain the outfile. The purpose of retaining the outfile is to allow the use of multiple ANZFLD commands that reuse the file output by the TAA SORTDBF command. This provides a faster solution in that no selection or sorting is done. To reuse the TAATMPQRY file in subsequent ANZFLD commands, specify FILE(QTEMP) and use the same name for the ANZFLD that was used to create the outfile. You may specify a different ANZTYPE value, different ranges, percentiles, summary fields, etc. When reusing TAATMPQRY (by specifying FILE(QTEMP), the SELFLD parameter and other parameters such as TRNTBL, and SORTTBL which relate to SORTDBF are ignored. The RTNOUTFILE parameter is also ignored. TRNTBL The translate table to be used for field translation. DFT is the default meaning to use the translate table defined in the TAATRNTBL data area in TAASECURE. The shipped default is QSYSTRNTBL (the system supplied translate table for translating lower case US English letters to upper case). The data area may be changed with the command EDTCONARR DTAARA(TAASECURE/TAATRNTBL). For more information about this function, see the TAA Tool RTVTRNTBL. FLTWLDCRD The Floating Wildcard character to be used. The default is ''. A blank is not valid. For more information, see the discussion with the SELFLD parameter and/or the SORTDBF command. FIXWLDCRD The Fixed Wildcard character to be used. The default is '_'. A blank is not valid. For more information, see the discussion with the SELFLD parameter and/or the SORTDBF command. SORTTBL The qualified name of the sort table to use. The default is JOB meaning to use the sort sequence of the job. LANGIDUNQ may be entered to mean the unique-weight sort sequence table that is associated with the LANGID parameter. LANGIDSHR may be entered to mean the shared-weight sort sequence table that is associated with the LANGID parameter. CCSID The sort sequence CCSID to be used along with the LANGID value for retrieving the national language sort sequence table for sorting character data. The default is JOB meaning to use the CCSID of the job. A specific CCSID in the range of 1 to 65535 may be entered. LANGID The language ID to be used to obtain a national language sort sequence table for sorting character data. The default is JOB meaning to use the language ID of the job. OPNQRYFCMD A YES/NO parameter for whether the SORTDBF command that is generated and the OPNQRYF command used by SORTDBF should be sent as messages. NO is the default. YES may be specified to assist in understanding how SORTDBF operates or for problem determination. Restrictions ------------ The maximum record length supported is 9999. The maximum printer width supported is 378 bytes. The maximum number of fields that may exist in a file to be analyzed is 999. Character fields to be analyzed or summarized cannot exceed 32 characters. Decimal fields to be analyzed or summarized cannot exceed 15 digits nor 9 decimal positions. When a field is summed, 3 extra digits are added to the value to a maximum of 15 digits. Prerequisites ------------- The following TAA Tools must be on your system: CHKOBJ2 Check object 2 CHKOBJ3 Check object 3 EDTVAR Edit variable EDTVAR2 Edit variable 2 EXTLST Extract list HLRMVMSG HLL Remove message RTVDBFA Retrieve data base file attributes RTVFLDA Retrieve field attributes RTVSYSVAL3 Retrieve system value 3 RTVTRNTBL Retrieve translate table RTVVALA Retrieve value attributes SCNVAR Scan variable SNDCOMPMSG Send completion message SNDESCMSG Send escape message SNDSTSMSG Send status message Implementation -------------- None, the tool is ready to use. Objects used by the tool ------------------------ Object Type Attribute Src member Src file ------ ---- --------- ---------- ---------- ANZFLD CMD TAAQRYD QATTCMD TAAQRYDC PGM CLP TAAQRYDC QATTCL TAAQRYDR *PGM RPG TAAQRYDR QATTRPG

ANZFLD          ANALYZE FIELD                          TAAQRYD


The Analyze Field command  sequences an externally described file  on a
named  field  and allows  analysis  by percentiles,  ranges,  or unique
values.   For each summary line  printed, decimal fields may be summed,
averaged, the maximum and  minimum values determined, and the  standard
deviation  calculated.  Selection  criteria may  be specified  to limit
what is processed.

A  physical, logical,  or joined  logical file  may be  processed (only
single format files are valid).

A typical command  to print the  unique values for  the field CODE  and
summarize the AMT field would be:

             ANZFLD        FILE(xxx) ANZTYPE(*UNIQUE) ANZFLD(CODE)
                             SUMFLD((AMT *SUM))

The  TAA SORTDBF  command is  used to  sequence  the file  on the  CODE
field and  create a temporary file in QTEMP.   SORTDBF uses OPNQRYF for
selection and the Sort API to sort the data.

The ANZFLD program reads  the sorted file and  outputs a spooled  file.
There would be  one line for each  unique CODE value including  the sum
of the AMT  field and a count of the number of  records.  A final total
line  is also printed along with the  median value found (the record at
the 50th  percentile), and  the mode value  found (the  value with  the
most number of occurrences).

The  default  is to  display  the information.    The  spooled file  is
retained.

Up  to 30 fields may be  summarized.  The width of  the spooled file is
378 bytes (maximum supported by DSPSPLF).   If you want a hard copy  of
the  results, you  will  probably not  be able  to  request 30  summary
fields.

You  may   also  get  the  average,   maximum,  minimum,  and  standard
deviation for AMT by modifying the SUMFLD parameter:

           SUMFLD((AMT *AVG)(AMT *MAX)(AMT *MIN)(AMT *STDDEV)

You  may  use  different  fields  such  as  a  SUMFLD  parameter  which
specifies the SALES and COST fields:

           SUMFLD((SALES *AVG)(COST *AVG)

Percentiles
-----------

The percentile  function allows  you to  print a  line for  a group  of
records  that  make  up one  or  more  percentiles.    The records  are
sequenced  by  the  named  field to  determine  the  percentiles.   For
example, if  there  are  1000 records  in  the  file, each  10  records
represent one percentile.

You  can print  all 100  percentiles or  a group  of percentiles.   For
example, to print a line for every 10 percentiles you would specify:

             ANZFLD        FILE(xxx) ANZTYPE(*PERCENTILE) ANZFLD(CODE)
                             PERCENTILE(10) SUMFLE(...)

The  PERCENTILE(10) request provides  for 10  percentiles (1-10, 11-20,
...  91-100).  Each  number represents how many percentile groups  will
be printed.   You may request 2, 3,  4, 5, 10, 20,  or 100 percentiles.

Ranges
------

You  may identify  groups to  be  summarized by  providing a  range for
each group.   Up to  50 ranges  may be entered.   Either  a decimal  or
character  field  may  be  used  as  the  field  name  for  the  ANZFLD
parameter.

The  ranges  must  be   specified  in  ascending  sequence  and  cannot
overlap.  A typical command would be:

             ANZFLD        FILE(xxx) ANZTYPE(*RANGES) ANZFLD(CODE)
                             RANGES((AAA BBB)(CCC DDA)(DDB EEE))
                             SUMFLD(...)

The  'From value' for each  range must be less than  the 'To value' for
the  same range  and  must  be  greater  than the  'To  value'  of  the
previous range.

One line would be printed for each range that contains records.

If  any records  exist  that do  not  fall into  any  of the  specified
ranges, an 'Other' total is summarized.

When  a decimal  field is specified  for a  range of  values, a decimal
type of  comparison  is used.   Consequently,  you  may specify  a  low
value of 0 or 1 even though the field may have several digits.

Negative  values are  valid  such as  a  range of  -10  to  -2.   Since
ascending  values  are  required,  you  must  begin  with  the  largest
negative value.

Range values should be entered with  a decimal notation.  For  example,
if you have a 5  digit field with 2 decimal positions and  want a range
of 1.00  to 2.00, enter 1.00 to  2.00.  You must enter  the same number
of  digits to  the right  of the decimal  character as  are defined for
the field specified for the ANZFLD parameter.

When  a  character  field  is  specified  for  a  range  of  values,  a
character comparison  is used.  You  may use a shorter  length than the
full field,  but you must be careful to  specify the proper range.  For
example, assume  you have  a 2  byte character  field  and you  specify
ranges  of A  to B,  and  C to  D.   If the  data  in the  field to  be
analyzed is  BA, it will not  fall in either range  (it would appear as
an 'Other' value).   You could  use a range such  as A to  B9 to  allow
all data such as BA to fall into the first range.

Selection
---------

Selection may  be specified,  to minimize  the number  of records  that
are to be processed such as:

             ANZFLD        FILE(xxx) ANZTYPE(*UNIQUE) ANZFLD(CODE)
                             SUMFLD((AMT *SUM))
                             SELFLD((STATE *EQ NY))

You  may select  on the  field named  for  ANZFLD or  different fields.
Multiple  selection fields  may be  named and  an *AND/*OR relationship
described.

The typical boolean operators *EQ, *NE, etc are supported along with:

  **   The *CT  operator provides  'contains'  which causes  a scan  of
       the data.   For  example, you  may select based  on a  string of
       characters being anywhere in the field.

  **   The  *WC  operator  provides for  a  'wildcard'  function.   The
       simplest type  is  a generic  value  such  as ABC*  meaning  any
       value  beginning  with  ABC.     Other  wildcard  functions  are
       supported.

Saving the ANZFLD command
-------------------------

The  ANZFLD  command is  typically  entered  interactively and  refined
with  multiple  iterations  until  you  achieve  the  results  you  are
looking for.   A subset of a large  file may be used to  provide better
performance until you are satisfied with the specifications.

Once  you have a  version of ANZFLD  that meets your  requirements, you
should consider saving the command for future use.

A  good tool to  consider is the FAVCMD  TAA tool.   This allows you to
store the command with an ID that may be used in the future.

To use FAVCMD, you  must first create the  required files in a  library
on your library list with the command:

             CRTFAVCMD     FAVCMDPLIB(xxx)

Then you  follow the ANZFLD command  with ADDFAVCMD and  describe an ID
such as:

             ANZFLD        ...
             ADDFAVCMD     FAVID(yyy)

ADDFAVCMD  extracts the previous  command from the  job's message queue
and places it in the FAVCMD file.

If you want to run the command at a later date, you would enter:

             EXCFAVCMD     FAVID(yyy)

There is also a shorthand version of the command:

             FV            FAVID(yyy)

The WRKFAVCMD  command will  let you  work with  your stored  versions.
You  can copy,  change,  display,  etc from  a  standard  work type  of
display.   Using FAVCMD allows  you the advantage of  having a solution
that is like  a CL  program.  You  can easily  display, rerun,  modify,
and copy to meet the same or similar requirements.

Reusing the SORTDBF results
---------------------------

When ANZFLD  runs, a  file named TAATMPQRY  is output  to QTEMP  as the
result  of the  internal SORTDBF TAA  command.   The TAATMPQRY  file is
processed  by  the  ANZFLD  program  to  produce  the  spooled  output.
TAATMPQRY is deleted  by default at  the end of  the command.   You can
retain the file by specifying RTNOUTFILE(*YES).

The file may be  used for your own purposes, but the  typical use is in
conjunction with another use of ANZFLD to avoid re-running SORTDBF.

When  FILE(*QTEMP) is  specified, the  SORTDBF command is  bypassed and
processing begins immediately on the TAATMPQRY file.

Thus the sequence of commands would be:

             ANZFLD        FILE(xxx) ... RTNOUTFILE(*YES)
             ANZFLD        FILE(*QTEMP) ...
             ANZFLD        FILE(*QTEMP) ...

After the first  use of ANZFLD,  the TAATMPQRY file  is used again  for
additional ANZFLD  commands.  This allows  you to vary the  ANZTYPE and
SUMFLD parameters to see different results.

When  FILE(*QTEMP) is  specified, the  TAATMPQRY file  is automatically
retained (the RTNOUTFILE parameter is ignored).

Note that when  FILE(*QTEMP) is  specified, the  library qualifier  for
the  FILE  parameter  (such  as  *LIBL)  is  ignored.    The  following
parameters  are  also  ignored because  they  are  only specified  when
SORTDBF is used:

          SELFLD
          MBR
          TRNTBL
          FLTWLDCRD
          FIXWLDCRD
          SORTTBL
          CCSID
          LANGID
          OPNQRYFCMD

When  FILE(*QTEMP)  is specified,  the  value of  the  ANZFLD parameter
must be the same as that used to create the TAATMPQRY file.

ANZFLD escape messages you can monitor for
------------------------------------------

      TAA9891    The member to be processed has no records
      TAA9895    No records met the selection criteria

Escape messages from based on functions will be re-sent.

Command parameters                                    *CMD
------------------

   FILE          The qualified  name  of  the file  to  be read.    The
                 library  value defaults  to *LIBL.   *CURLIB  may also
                 be used.

                 A  physical or logical file  (including a join logical
                 file) may be  named.   Logical files may  have only  a
                 single format.

                 The special  value *QTEMP  may be  entered if  a prior
                 use  of ANZFLD  has  specified RTNOUTFILE(*YES).   The
                 allows reuse  of  the  TAATMPQRY  file  in  QTEMP  and
                 avoids the  selection and sorting  function.   See the
                 RTNOUTFILE parameter for a complete discussion.

   ANZTYPE       The type of analysis to be performed.

                 *PERCNTILE.   One  line is  listed  for the  number of
                 percentiles  specified  for the  PERCENTILE parameter.

                 *RANGES.     One  line  is   listed  for  each   range
                 identified in the RANGES parameter.

                 *UNIQUE.     The  values  in  the   ANZFLD  field  are
                 sequenced  in ascending  order and  one line  for each
                 unique value is listed.

                 *FNLTOT.    One  line  is  listed  for   all  selected
                 records.   Fields named  in SUMFLD  will be listed  in
                 the  total  as   well  as  a  count  of  the  selected
                 records.

   ANZFLD        The field  to be  analyzed.   Either a  decimal  field
                 (packed, zoned,  or binary) or  a character  field may
                 be  specified.     The  field  will  be  sequenced  in
                 ascending order.

                 The  median  value (the  value  of the  record  at the
                 50th percentile) is always  printed at the end of  the
                 listing.

                 The mode  value (the value  having the most  number of
                 frequencies)  is  always  printed at  the  end  of the
                 listing.

                 A character field  may be  up to 32  bytes in  length.
                 A decimal field may be up to 15 bytes in length.

   PERCENTILE    The  number of  percentiles to  be listed.   An  entry
                 must  be made  when ANZTYPE(*PERCNTILE)  is specified.
                 The values that  may be entered  are 2, 3,  4, 5,  10,
                 20, or 100.

   RANGES        The ranges to be  summarized.  Up to 50  ranges may be
                 entered when ANZTYPE(*RANGES) is specified.

                 Each range  is specified as  a 'From' and  'To' value.
                 The  'From value'  must be less  than or  equal to the
                 'To value' for each range.

                 Ranges  must  be  specified  in  ascending  order  and
                 cannot  overlap.   The  'From  value'  for each  range
                 must  be greater than the 'To  value' of the preceding
                 range.

                 When a  decimal  field is  specified  for a  range  of
                 values,  a   decimal  type  of  comparison   is  used.
                 Consequently,  you may specify a  low value of  0 or 1
                 even though the field may have several digits.

                 Negative values are valid  such as a  range of -10  to
                 -2.   Since ascending  values are  required, you  must
                 begin with the largest negative value.

                 Range  values   should  be  entered   with  a  decimal
                 notation.   For example,  if you have  a 5 digit field
                 with 2 decimal positions  and want a range of  1.00 to
                 2.00, enter  1.00 to  2.00.  You  must enter  the same
                 number   of  digits  to  the   right  of  the  decimal
                 character as are defined  for the field specified  for
                 the ANZFLD parameter.

                 When a  character field  is specified  for a  range of
                 values, a  character comparison is used.   You may use
                 a shorter length  than the  full field,  but you  must
                 be careful to specify the proper range.

                 For  example,  assume  you have  a  2  byte  character
                 field and  you specify ranges of  A to B, and  C to D.
                 If  the data  in the  field to  be analyzed is  BA, it
                 will not fall in  either range (it would appear  as an
                 'Other' value).   You could use  a range such as  A to
                 B9  to allow  all  data such  as BA  to fall  into the
                 first range.

                 If a blank  is required for  a character field  range,
                 enter *BLANK or *BLANKS.

                 If  a zero  is  required for  a  decimal field  range,
                 enter 0, *ZERO, or *ZEROS.

   SUMFLD        Up  to  30  fields to  be  summarized.   Only  decimal
                 fields of packed, zoned, or  binary types of up to  15
                 digits in  length  may be  specified.   The number  of
                 decimal positions may not exceed 9.

                 Each field  will appear for each line  that is printed
                 along  with  a count  of  the number  of  records that
                 exist for the summary line.

                 For each  field specified,  the type  of summary  must
                 be included:

                 *SUM.  The sum of the values.

                 *AVG.  The  average of the values (the  sum divided by
                 the  number of records  making up the sum).   The same
                 number of  decimal  positions  will be  shown  in  the
                 answer as exist  in the definition of  the field being
                 averaged.

                 *MAX.  The maximum value of the values.

                 *MIN.  The minimum value of the values.

                 *STDDEV.    The  standard  deviation  of  the  values.
                 Standard deviation allows you  to express as a  number
                 the  shape  of  the distribution  curve.    The  value
                 shown  for standard  deviation  is a  number  which if
                 added  and subtracted  from the  average would provide
                 a range  for  approximately 67%  of  the values  in  a
                 normal distribution.

                 For example,  if numbers 1,2,3...100  exist as values,
                 the  median  value is  50 and  the  standard deviation
                 would be 28.  This  means that between 22 (50-28)  and
                 78 (50+28)  you should find  approximately 67%  of the
                 values.   Of  the 100  values, 67  lie between  22 and
                 78.

                 The lower the  standard deviation  value, the  tighter
                 the curve  is.   A value  of 0 means  that all  values
                 fall at the median.

                 Two    standard   deviations   should    account   for
                 approximately  95%  of  the  values.    Three standard
                 deviations should  account for  approximately 95+%  of
                 the values.

                 Standard deviation  is calculated by  using the square
                 root  of the variance.  The  variance is calculated as
                 ((b-(a*a)/c))/c where

                 a = the sum of the values
                 b = the sum of the square of the values
                 c = the count of the values

   SELFLD        The field to  perform selection on.   This is a  'list
                 parameter'  that  allows   up  to  30  fields   to  be
                 described for selection.

                 1)  Field name.   The field name  to perform selection
                 on.  The default  is *NONE.  The  field must exist  in
                 the file  unless *NONE is  specified.   Decimal fields
                 (DDS Data  types of P  = Packed, S =  Zoned decimal, B
                 =   Binary,   F   =  Floating   point)   have  certain
                 restrictions as described in the following parts.

                 You  may use  *NONE  if  you have  multiple  selection
                 fields  that are  primed  by  multiple conditions  and
                 not  all conditions have the  same number of selection
                 values.  For example  if CODE = A  you want to  select
                 on FLDB, but  if CODE = B  you want to select  on FLDA
                 and  FLDB.   You may prime  the selection  fields with
                 IF logic and then have a single ANZFLD command.

                 2) Operator.    The  type of  comparison  to  perform.
                 *EQ is  the default.   *NE,  *GT, *LT,  *GE, *LE,  *CT
                 (Contains)   and   *WC   (Wildcard)   are   supported.
                 Decimal fields do not allow *CT or *WC operators.

                 *CT  (contains) means the entire  field is scanned for
                 the value).   Only character fields  may be used  with
                 *CT.

                 *WC (Wildcard)  causes an *EQ search  against the data
                 base   data  with  one   or  more   wildcards.    Only
                 character fields may be used with *WC.

                 Both 'fixed'  and  'floating' wildcard  functions  are
                 supported.   The  simplest  use  of wildcards  is  for
                 either  generic requests or where  a position within a
                 field should be ignored when selecting.

                 When a  generic request  is needed  such as  selecting
                 all  values beginning  with  ABC,  just enter  ABC  or
                 ABC*.    The default  floating  wildcard character  is
                 '*'.    If the  floating  wildcard character  does not
                 exist  at  the   end  of  the   value,  it  is   added
                 automatically (there  must be room  at the end  of the
                 value to add a wildcard if none exists).

                 The  fixed form of  wildcard support  allows any value
                 to exist in  the search  field at  a single  position.
                 You  would enter  A_C to  select ABC,  ADC, AEC,  etc.
                 If  the   field  is  longer  than   3  positions,  the
                 floating wildcard is  automatically added  at the  end
                 (if not  specified) so  you would  select values  such
                 as  ABC, ABCDEF, ADC,  and ADCX.   '_' is  the default
                 fixed wildcard character.

                 When  *WC is used, if the  length of the compare value
                 is equal  to  the  length of  the  field in  the  data
                 base, at  least one wild card  character (either fixed
                 or floating) must exist within the compare value.

                 Multiple  fixed  wildcards can  be  specified  such as
                 A_D_F.  Multiple floating  wildcards can be  specified
                 such  as A*A  meaning that  any number  of values  may
                 exist  between the  A's.   You  would select  AA, ABA,
                 ABBBA, etc.

                 You may  specify your  own  characters for  the  fixed
                 and  floating  wildcards.    See   the  FLTWLDCRD  and
                 FIXWLDCRD parameters.

                 For  additional  examples  and  explanation,  see  the
                 section  on  'Wildcard processing'  described  for the
                 SORTDBF command.

                 3) Compare value.   The compare value (literal)  to be
                 used  to select  data.   Up  to 32  characters may  be
                 entered.    The  length of  data  may  not  exceed the
                 length of  the field  length in  the data  base.   See
                 the  Translate option  and  *WC  operator for  special
                 handling.

                 For decimal  data types, enter a  value left adjusted.
                 For example,  to select  on  a 5  digit field  with  a
                 value of  10, enter  10.   If the field  to select  on
                 has  decimal positions, you  can select using  a whole
                 number   (such  as  10)  or   a  number  with  decimal
                 positions (such as  10.5).   The decimal notation  (US
                 = '.') should be entered in job decimal format.

                 A  compare value  must be  entered (a  blank  value is
                 invalid).   The special values  *BLANK or *BLANKS must
                 be used  for  character  fields to  specify  a  'blank
                 value'.

                 The special  values *ZERO  or *ZEROS  may be  used (or
                 enter  a  0) for  decimal  fields to  specify  a 'zero
                 value'.

                 Another field name in the  same data base file  cannot
                 be used as a compare value.

                 4)  Translate.   Whether to  translate  the values  in
                 the   data  base  to   upper  case  before   making  a
                 comparison.    The default  is  *XLATE.   Translate is
                 ignored for Decimal  fields.  If  *XLATE is used,  any
                 character  fields  in  the  data base  are  translated
                 using  a translate  table (see the  TRNTBL parameter).

                 For example, if  you enter a  compare value of  'ABC',
                 the data  will match  in the data  base for  ABC, abc,
                 Abc,  AbC, etc.    Note that  the default  is  for the
                 situation where you  want any  'ABC' value  regardless
                 of the case (upper  or lower) of the data  in the data
                 base.   Just enter any case (upper  or lower) into the
                 command  prompt  without  surrounding  it  with quotes
                 (the command  prompter will  fold the  value to  upper
                 case).

                 Note  that translation  operates  on the  data in  the
                 data base and not on the compare value.

                 If  you enter a compare  value with surrounding quotes
                 such as 'Abc'  and take  the default for  translation,
                 you will  not find any values  in the data base.   All
                 the  data base  values have  been translated  to upper
                 case by default before the comparison is made.

                 If you  only  want the  'Abc'  values and  not  values
                 like  'ABC' or  'AbC',  enter a  value  of 'Abc'  (use
                 quotes  surrounding  the  compare value)  and  specify
                 *NOXLATE for translation.

                 There   is  some  performance   degradation  by  using
                 translation.   In most applications  this is of  minor
                 concern.   If  your data  base  data is  all the  same
                 case,  you   can  save  some  overhead  by  specifying
                 *NOXLATE.

                 5) And/Or.   The  relationship  of this  select  field
                 with the next  select field.  The entry  allows you to
                 have  one  or more  fields  that form  a  group.   All
                 fields  in  the   group  must   match  the   selection
                 criteria to select the record.

                 *AND  is the  default meaning  the  next selection  is
                 considered  part  of the  same  group.   The  value is
                 ignored for the last select field.

                 *OR   may  be  specified  to  start  a  new  group  of
                 selections.   For example,  you may  specify FLD1  and
                 FLD2  as one  group  and  FLD3 and  FLD4  as a  second
                 group.   All And/Or  values would  specify *AND except
                 for FLD2 which would specify *OR.

                 If you  want  either FLDA  or  FLDB equal  to  certain
                 values as  long as  FLDX is  greater than  some value,
                 you  must specify  two groups  (FLDA  and FLDX  is one
                 group, FLDB and FLDX is a second group).

                 For  more  details,   see  the   section  on   'And/Or
                 Relationships' described for the SORTDBF tool.

   OUTPUT        How to  output the results.   *  is the default  which
                 displays  the  spooled  file  if  the command  is  run
                 interactively.    The spooled  file  is retained.   If
                 the command  is  run in  batch,  the spooled  file  is
                 output.

                 *PRINT  may be  specified to  just output  the spooled
                 file.     The   spooled  file  is   not  automatically
                 displayed.

   MBR           The member to  be processed.   The  default is  *FIRST
                 for the first member of the file.

   RTNOUTFILE    Whether  to  retain  the   TAATMPQRY  outfile  created
                 internally  into QTEMP.   The default  is *DFT meaning
                 the file  is  not  retained  unless  FILE(*QTEMP)  was
                 specified.  If  you are doing  a single ANZFLD  and do
                 not need the outfile, take the default.

                 *YES  may be  specified to  retain  the outfile.   The
                 purpose  of retaining the outfile  is to allow the use
                 of  multiple  ANZFLD  commands  that  reuse  the  file
                 output by  the TAA SORTDBF  command.  This  provides a
                 faster  solution in  that no  selection or  sorting is
                 done.

                 To reuse  the  TAATMPQRY  file  in  subsequent  ANZFLD
                 commands, specify  FILE(*QTEMP) and use the  same name
                 for  the ANZFLD that  was used to  create the outfile.
                 You may specify a  different ANZTYPE value,  different
                 ranges, percentiles, summary fields, etc.

                 When reusing  TAATMPQRY  (by specifying  FILE(*QTEMP),
                 the  SELFLD  parameter and  other  parameters such  as
                 TRNTBL,  and  SORTTBL  which  relate  to  SORTDBF  are
                 ignored.   The RTNOUTFILE  parameter is also  ignored.

   TRNTBL        The   translate   table   to   be   used   for   field
                 translation.   *DFT is the default  meaning to use the
                 translate table  defined in  the TAATRNTBL  data  area
                 in  TAASECURE.    The shipped  default  is  QSYSTRNTBL
                 (the  system supplied translate  table for translating
                 lower case US English letters to upper case).

                 The  data  area  may  be  changed  with  the   command
                 EDTCONARR  DTAARA(TAASECURE/TAATRNTBL).     For   more
                 information  about  this function,  see  the TAA  Tool
                 RTVTRNTBL.

   FLTWLDCRD     The  Floating  Wildcard  character to  be  used.   The
                 default is  '*'.   A blank  is  not valid.   For  more
                 information,  see  the   discussion  with  the  SELFLD
                 parameter and/or the SORTDBF command.

   FIXWLDCRD     The  Fixed   Wildcard  character  to  be  used.    The
                 default is  '_'.   A blank  is not  valid.   For  more
                 information,  see  the  discussion   with  the  SELFLD
                 parameter and/or the SORTDBF command.

   SORTTBL       The  qualified name  of the  sort table  to use.   The
                 default is *JOB  meaning to use  the sort sequence  of
                 the job.

                 *LANGIDUNQ may  be entered  to mean the  unique-weight
                 sort  sequence  table  that  is  associated  with  the
                 LANGID parameter.

                 *LANGIDSHR  may be  entered to  mean the shared-weight
                 sort  sequence  table  that  is  associated  with  the
                 LANGID parameter.

   CCSID         The  sort sequence  CCSID to  be used  along  with the
                 LANGID  value  for  retrieving  the national  language
                 sort sequence table for  sorting character data.   The
                 default is *JOB  meaning to use the CCSID  of the job.
                 A  specific CCSID in  the range of  1 to 65535  may be
                 entered.

   LANGID        The language  ID  to  be used  to  obtain  a  national
                 language  sort sequence  table  for sorting  character
                 data.    The  default  is  *JOB  meaning  to  use  the
                 language ID of the job.

   OPNQRYFCMD    A  *YES/*NO parameter for  whether the SORTDBF command
                 that is  generated  and the  OPNQRYF command  used  by
                 SORTDBF should be sent as messages.

                 *NO is the default.

                 *YES may  be specified to assist  in understanding how
                 SORTDBF operates or for problem determination.

Restrictions
------------

The maximum record length supported is 9999.

The maximum printer width supported is 378 bytes.

The  maximum number of fields that  may exist in a  file to be analyzed
is 999.

Character  fields  to  be  analyzed  or  summarized  cannot  exceed  32
characters.

Decimal fields  to be  analyzed or summarized  cannot exceed  15 digits
nor 9 decimal positions.

When  a field is  summed, 3 extra  digits are added  to the  value to a
maximum of 15 digits.

Prerequisites
-------------

The following TAA Tools must be on your system:

     CHKOBJ2         Check object 2
     CHKOBJ3         Check object 3
     EDTVAR          Edit variable
     EDTVAR2         Edit variable 2
     EXTLST          Extract list
     HLRMVMSG        HLL Remove message
     RTVDBFA         Retrieve data base file attributes
     RTVFLDA         Retrieve field attributes
     RTVSYSVAL3      Retrieve system value 3
     RTVTRNTBL       Retrieve translate table
     RTVVALA         Retrieve value attributes
     SCNVAR          Scan variable
     SNDCOMPMSG      Send completion message
     SNDESCMSG       Send escape message
     SNDSTSMSG       Send status message

Implementation
--------------

None, the tool is ready to use.

Objects used by the tool
------------------------

   Object        Type    Attribute      Src member    Src file
   ------        ----    ---------      ----------    ----------

   ANZFLD        *CMD                   TAAQRYD       QATTCMD
   TAAQRYDC      *PGM       CLP         TAAQRYDC      QATTCL
   TAAQRYDR      *PGM       RPG         TAAQRYDR      QATTRPG

Added to TAA Productivity tools August 15, 2003

Documentation Home

Return to the Tool List

Up to Top