The Copy From Comma Delimited File tool provides two commands to copy
comma delimited files (such as from a spread sheet) to an externally
described file. CPYFRMCSV should be used for one time functions or
for the first time to help you create a customized externally
described file. CPYFRMCSV2 should be used when an externally
described file already exists. You may optionally specify a period
as a delimiter and a comma as a decimal point.
Either a stream file with the comma delimited data or a data base
file (normally created from CPYFRMSTMF) may be input.
One time usage of CPYFRMCSV
---------------------------
Assume you have copied spread sheet data to the PF200A file (probably
using CPYFRMSTMF). Comma delimiters exist in the file to separate
the fields. You want to convert the file to a new externally
described file named PF200B. Assume this is a one time use (you have
no plans to copy the similar information in the future). You would
specify:
CPYFRMCSV FROMFILE(PF200A) TOFILE(xxx/PF200B)
A library must be named for the TOFILE parameter. The PF200B file
must not exist and will be created by CPYFRMCSV.
The data in PF200A would be read twice. The first pass determines
the type and longest length of each field (and number of decimal
positions). Only two types of DDS field types are created (Character
or Packed Decimal). If only decimal data exists for a field (a
decimal point may be included), the DDS will be generated for a
packed decimal type of field. If any character data exists, the DDS
will be generated for a character field (see the exception for
negative values in the section on 'Numeric data').
After the DDS is generated to a temporary source file, the PF200B
file is created as an externally described file.
The second pass of the data writes the data to the newly created
file. The file could be placed in QTEMP. The file is created with
AUT(*USE).
The field names used would be FLD001, FLD002, etc (see an an
alternative in the next section). You can then process the file as a
normal externally described file.
If a stream file existed, a typical command would be:
CPYFRMCSV FROMSTMF('/home/usera/PF200A.csv')
TOFILE(xxx/PF200B)
A temporary file would be created in QTEMP and CPYFRMSTMF used to
copy from the stream file to the QTEMP file. The processing is then
the same as if the FROMFILE parameter had been used.
First time usage of CPYFRMCSV
-----------------------------
Assume the same PF200A file, but now you have plans to make a copy
periodically. Therefore, you want to set up a customized file
definition by modifying the generated DDS for your requirements.
Instead of the default field names (FLD001, FLD002 etc), you want to
supply your own field names as this will make the DDS easier to
change.
You would specify:
CPYFRMCSV FROMFILE(PF200A) TOFILE(xxx/PF200B)
DLTDDSSRC(*NO)
FLDNAM(CUST ADDRESS AMT ...)
The field names should be listed in order in which the data occurs.
When the command completes, the file would be created as in the
previous example. The source file TAATMPSRC in QTEMP with member
PF200B will exist with the generated DDS because DLTDDSSRC(*NO) was
specified.
You would then use a source editor and copy the generated DDS to a
permanent source file. You would probably want to edit the source to
add such things as column headings, edit codes, etc. You can
increase the length of the fields, but you must not decrease the
lengths or change the type of fields. Additional fields (character
or packed decimal) may exist at the end of the file that are not used
during CPYFRMCSV (they will default to blanks and zeros). Create the
file in a permanent library (assume you used the name PF200C in
library ABC).
You could then copy and map the data as follows:
CPYF FROMFILE(xxx/PF200B)
TOFILE(ABC/PF200C)
MBROPT(*ADD)
FMTOPT(*MAP)
The file is now ready to be processed.
Typical usage of CPYFRMCSV2
---------------------------
Assume you did the previous example, and now have file PF200C created
as an externally described file. The next time a copy is needed to
refresh the data from the comma delimited file, you would specify:
CPYFRMCSV2 FROMFILE(PF200A) TOFILE(ABC/PF200C)
REPLACE(*YES)
If the data does not agree with the file definition, an escape
message will be issued. For example, if there are 10 characters of
data that are read before the next comma, it cannot be placed into a
*CHAR LEN(8) field. The text of the escape message will describe the
record number and field that caused the mismatch.
Copying directly from spread sheet data
---------------------------------------
The CPYFRMSTMF command supports copying from an IFS named file. If
it is a spread sheet, the normal method of copying the file is to use
'.csv' at the end of the name. This causes the spread sheet data to
be formatted so there are comma separators.
A typical command to convert from a spread sheet directly to the BBB
file in library AAA would be:
CPYFRMCSV FROMSTMF(XXX.csv) TOFILE(AAA/BBB)
Blank records
-------------
Records with all blank values are implicitly bypassed.
Additional options on CPYFRMCSV
-------------------------------
** If you have column headings at the beginning of the spread
sheet data, you will not want to consider these as data.
Specify the BYPHDGLIN parameter for how many heading lines you
have. Note that if the column headings are also embedded in
the data, you must remove these or they will treated as data
(see the section on 'Deleting data').
** If you have totals at the end of the spread sheet data, you
will not want to consider these as data. Specify the
BYPFNLLIN parameter for how many total lines exist. Note that
if sub totals are embedded in the data you must remove these
or they will treated as data (see the section on 'Deleting
data').
** Debugging aids. One of the typical problems that can occur is
that a field that you want to be considered as numeric is
created as a character type in DDS. This is due to some
non-numeric data in the field. See the later discussion of
'Debugging Aids'.
Field ending considerations
----------------------------
The separator character (normally a comma) must exist to separate the
fields in the From file. Character field data may optionally be
enclosed in double or single quotes. If an embedded separator
character exists in a field such as 'Jones,J' the field must be
quoted with a single or double quote.
The following are valid conditions for character fields:
Input Output
----- ---------------
Fld1 Fld2
hello,abc, hello abc
"hello","abc", hello abc
"hello",abc, hello abc
'hello','abc' hello abc
'Jones,J','abc' Jones,J abc
"Jones,J",'abc' Jones,J abc
'he"llo','a'bc' he"llo a'bc
"he"llo",'a"bc' he"llo a"bc
he"llo,abc he"llo abc
""hello"",abc "hello" abc
"hello,abc hello,abc
"123","789" 123 789
If a single or double quote immediately follows a comma, it is
considered the beginning of a character field and the quote character
is not placed in the data. The field is not considered ended until a
comma immediately follows a quote character and the quote character
is not placed in the data.
If a single or double quote does not immediately follow or precede a
comma, it is placed in the data.
If you begin the field with a quote and do not have an ending quote
followed by a comma, the next set of data is considered to be part of
the same field.
Null data (no value) is also valid such as:
Input Output
----- ---------------
Fld1 Fld2
hello,abc hello abc
hello,, hello
"hello","" hello
Deleting records
----------------
If you have records that should be deleted before using CPYFRMCSV
(such as embedded heading or sub totals), two tools that could be
considered are:
** CNFDLTRCD. You specify the relative record number and a
confirmation prompt appears along with the data from the
record.
** DLTDBFRCD. Allows deletion of a record based on a contains
(*CT) value.
Numeric data
------------
If a field contains only digits (0-9) (and/or an optional decimal
point character) in all records, the data will be considered numeric
and a packed decimal field definition will be generated to hold the
data.
A leading or trailing minus sign is valid to describe negative values
such as:
500-
-500
A 'CR' trailing symbol may also be used to denote negative data such
as:
500CR
Blanks may not exist in a field to be recognized as numeric. For
example, '500 CR' would be considered a character field.
Separator characters may not exist in the field. For example,
'123,456,789' would be considered as 3 different fields.
If a decimal notation character (eg '.') exists, the data with the
largest number of decimal positions determines the maximum. The data
with the largest number of whole numbers determines the maximum
number. The sum of the two maximums determines the number of digits
defined for DDS.
You may optionally specify a comma as a decimal point instead of a
period.
Assume you have the following data for a field:
312.3
56.78
90
The maximum number of decimal positions is 2 and the maximum number
of whole numbers is 3. Therefore, the field would be defined in DDS
as 5 digits with 2 decimals.
The data is decimal aligned when placed in the data base and would be
formatted as:
312.30
56.78
90.00
Fields containing both character and numeric data
-------------------------------------------------
If a field contains both character and decimal data, the field will
be specified as character. The length of the field may be increased
to account for the decimal point and possible minus sign.
For example, assume you have the following:
123456.78-
ABCDEFGHI
If only the first record existed, the field would be defined as *DEC
LEN(8 2). Since the character field is then processed, it has a
length of 9. However, the field length will be assigned as 10 to
account for the decimal point and minus sign. The value 123456.78-
would be placed in the character field.
Debugging aids
--------------
Two listings are optional:
** Summary of all fields (PRTSUM). This provides one line for
each field with the name and DDS definition that was created.
A count exists for the number of processed records that have:
- No data (null)
- Character data
- Numeric data
if you have fields that should be defined as numeric, but are
being defined as character, you can scan the listing and
determine the fields that have a large number of records with
numeric data that are specified as character types because of
one or a few records that have character data. You may want
to review the specific records that have character data to see
if they should be deleted.
** Single field data (PRTFLD). This will list by default each
record processed that causes a change in the definition of a
specified field. This will allow you to easily determine
which record(s) are creating the definition of the field.
You may specify PRTOPT(*CHAR) to list every record that has
character data. This could be useful if you want the field to
have only numeric data, but some records are causing a
character definition.
Use PRTOPT(*ALL) to list every value for the field.
Null fields
-----------
If a field contains only null data, a character field of length 10 is
assumed.
CPYFRMCSV escape messages you can monitor for
---------------------------------------------
TAA9892 No records exist in the From file
Escape messages from based on functions will be re-sent.
CPYFRMCSV2 escape messages you can monitor for
----------------------------------------------
TAA9891 A mismatch exists between the data and the
file definition.
TAA9892 No records exist in the From file
Escape messages from based on functions will be re-sent.
CPYFRMCSV Command parameters *CMD
----------------------------
FROMSTMF The From stream file that contains the delimited
data. Either the FROMSTMF or FROMFILE parameter
must be entered, but not both.
If the data is being read directly from a spread
sheet, you would normally name the stream extension
as '.csv'. This will cause the system to convert
the data to a comma separated file.
If the spread sheet file is in your home directory,
you would typically specify FROMSTMF(stmf.csv).
FROMFILE This is the qualified name of the From file. Either
the FROMSTMF or FROMFILE parameter must be entered,
but not both.
The library qualifier defaults to *LIBL. A specific
library or *CURLIB may also be used.
TOFILE The Externally Described To file that will be
created by CPYFRMCSV. The file must not exist as it
will be created. A library qualifier must be
specified or the special value *CURLIB.
A member of the same name will be added to the file
and will contain the data.
FROMMBR The member of the From file to be used. *FIRST is
the default. *LAST or a specific member may be
named.
TOTXT The text description of the To file which will be
created. *DFT is the default which produces a text
description of 'Created by TAA CPYFRMCSV'. Up to 50
bytes of text may be entered.
BYPHDGLIN The number of heading records that should be
bypassed and not considered as data.
0 is the default.
A value of 1-9 may be entered to describe that
heading records exist in the From file that should
be bypassed.
Note that if heading data is embedded in the data
(not just in the first n records), the heading data
must be removed or it will be considered as data to
be converted.
The tools CNFDLTRCD and DLTDBFRCD may assist you in
deleting records.
BYPFNLLIN The number of final records such as totals that
should be bypassed and not considered as data.
0 is the default.
A value of 1-9 may be entered to describe that a
number of records at the end of the file should be
bypassed.
Note that if sub total records are embedded in the
data, they must be removed or they will be
considered as data.
The tools CNFDLTRCD and DLTDBFRCD may assist you in
deleting records.
DLTDDSSRC A *YES/*NO parameter that determines if the
internally created source file (TAATMPSRC in QTEMP)
will be deleted when the command completes.
*YES is the default, meaning the generated DDS
source in file TAATMPSRC is deleted. If you deleted
the source, but want a copy of it, you can use the
RTVPFSRC command.
*NO may be specified to retain the source in
TAATMPSRC in QTEMP. The member name used will be
the name of the To file. This allows you to copy
the source to a permanent source file and make
changes such as column headings, edit codes, etc.
FLDDLM The field delimiter to be used. A comma ',' is the
default.
A period '.' or a bar '|' may be specified.
DECPNT The decimal point character to be used. A period
'.' is the default.
A comma ',' may be specified.
FLDNAM A list of field names to be used instead of the
internally generated names (FLD001, FLD002 etc).
The default is *GEN meaning to generate the names as
FLD001, FLD002, etc.)
A list of names may be provided which will be used
instead of the internally generated names. The
names must appear in the same order as the data.
If the list of names does not match the count of
fields found in the data, the command will complete
normally and a diagnostic message will be issued.
If there are not enough field names provided, FLDnnn
will be used.
PRTSUM A *YES/*NO option for whether a one line summary by
field should be listed.
*NO is the default to prevent the listing.
*YES may be specified to output the summary. Each
field is shown with the DDS definition and the
number of records that contain 1) no data (null), 2)
character data and 3) numeric data.
PRTFLD The field to be printed. This parameter and the
PRTOPT parameter are intended as debugging aids to
assist in determining how the field definition
created by the program was arrived at.
The default is *NONE which means that no listing
will occur.
If a field name is entered, the PRTOPT determines
the type of printing that will occur.
PRTOPT The type of printing to be performed if the PRTFLD
parameter names a field.
*CHGS is the default to print a line only when a new
record is read that changes the definition of a
field.
For example, if the first record has a value of
'1234.56', the field would be defined as decimal
with 6 digits and 2 decimal positions. If a
subsequent record had a value of '12.567', the field
would be defined as 7 digits with 3 decimal
positions. If a subsequent record had a value of
'AB32', the field would be defined as a character
type with a length of 7. This allows room for the
decimal point. If a negative field was input, the
size would also be increased to contain the minus or
CR symbol. The last record shown will describe the
DDS field definition used.
*CHAR may be specified to list all records that have
character data. This would be useful if you want a
field to be defined as numeric, but some records
have character data which cause a character
definition.
*ALL may be specified to print all values for the
field for each processed record.
CPYFRMCSV2 Command parameters *CMD
-----------------------------
FROMSTMF The From stream file that contains the delimited
data. Either the FROMSTMF or FROMFILE parameter
must be entered, but not both.
If the data is being read directly from a spread
sheet, you would normally name the stream extension
as '.csv'. This will cause the system to convert
the data to a comma separated file.
If the spread sheet file is in your home directory,
you would typically specify FROMSTMF(stmf.csv).
FROMFILE The qualified name of the From file. Either the
FROMSTMF or FROMFILE parameter must be entered, but
not both.
The library qualifier and defaults to *LIBL. A
specific library or *CURLIB may also be used.
TOFILE The qualified name of the Externally Described file
to copy to. The file must exist.
*LIBL is the default. A specific library or the
special value *CURLIB may also be used.
FROMMBR The member of the From file to be used. *FIRST is
the default. *LAST or a specific member name may be
entered.
TOMBR The member of the To file to be used. *FIRST is the
default. *LAST or a specific member may be entered.
MBROPT The member option to be used.
*NONE is the default which will cause the command to
fail. You must specify *ADD or *REPLACE.
*ADD causes the copied records to be added to the
member.
*REPLACE causes the member to be cleared first and
then the records are copied.
BYPHDGLIN The number of heading records that should be
bypassed and not considered as data.
0 is the default.
A value of 1-9 may be entered to describe that
heading records exist in the From file that should
be bypassed.
Note that if heading data is embedded in the data
(not just in the first n records), the heading data
must be removed or it will be considered as data to
be converted.
The tools CNFDLTRCD and DLTDBFRCD may assist you in
deleting records.
BYPFNLLIN The number of final records that should be bypassed
and not considered as data.
0 is the default.
A value of 1-9 may be entered to describe that a
number of records at the end of the file (such as
containing totals) should be bypassed.
Note that if sub total records are embedded in the
data, they must be removed or they will be
considered as data.
The tools CNFDLTRCD and DLTDBFRCD may assist you in
deleting records.
FLDDLM The field delimiter to be used. A comma ',' is the
default.
A period '.' or a bar '|' may be specified.
DECPNT The decimal point character to be used. A period
'.' is the default.
A comma ',' may be specified.
Restrictions
------------
The maximum file length is 9998 using CPYFRMCSV.
The data must appear as described earlier.
When CPYFRMCSV2 is used, the data must match or be a subset of the
definition.
Prerequisites
-------------
The following TAA Tools must be on your system:
CHKOBJ3 Check object 3
EDTVAR Edit variable
EXTLST2 Extract list 2
HLRMVMSG HLL Remove message
RSNLSTMSG Resend last message
RTVDBFA Retrieve data base file attributes
RTVIFSED Retrieve IFS entry description
SNDCOMPMSG Send completion message
SNDDIAGMSG Send diagnostic message
SNDESCINF Send escape information
SNDESCMSG Send escape message
Implementation
--------------
None, the tool is ready to use.
Objects used by the tool
------------------------
Object Type Attribute Src member Src file
------ ---- --------- ---------- ----------
CPYFRMCSV *CMD TAADBLX QATTCMD
CPYFRMCSV2 *CMD TAADBLX2 QATTCMD
TAADBLXC *PGM CLP TAADBLXC QATTCL
TAADBLXC2 *PGM CLP TAADBLXC2 QATTCL
TAADBLXR *PGM RPG TAADBLXR QATTRPG
TAADBLXR2 *PGM RPGLE TAADBLXR2 QATTRPG
Structure
---------
CPYFRMCSV Cmd
TAADBLXC CL pgm
TAADBLXR RPG Pgm
TAADBLXC11 CL pgm Creates the Ext Dsc file
CPYFRMCSV2 Cmd
TAADBLXC2 CL pgm
TAADBLXR2 RPGLE Pgm
|