PCPYFRMCSV Command for IBM ISeries AS/400 System i

Introduction

PCPYFRMCSV builds regular ISeries (AS/400, AS400, System i) database file (PF, LF etc) records from their CSV (comma-separated value) equivalents. This is one of the commands in our ISeries utilities library (click this URL for full Prospero Software PSPUTIL contact and pricing details).

You can use any single character to separate field values e.g. a comma, the bar character '|' or the tab character. We use the term 'CSV' to mean any of these.

Conversion of ISeries data from CSV typically occurs where data must be imported from a non-DB/2 database. For example, via PC upload of records exported as CSV from Microsoft Access, Excel or SQL Server. Some file upload tools (like file transfer in Client Access or Rumba or Netsoft) will do conversion from CSV as part of the upload, but this tends to be very slow.

Using PCPYFRMCSV, you upload the data as CSV ‘plain text’ with just a simple ASCII->EBCDIC conversion, then import from CSV on the ISeries itself using PCPYFRMCSV. So the actual conversion is done on the ISeries, which compared to a PC has much better background (batch) job facilities and faster I/O with blocked database reads. The PC job itself will also run much more quickly if the conversion work is done by the ISeries.

See also the PCPYTOCSV command for the reverse conversion to CSV.


Examples

PCPYFRMCSV FROMFILE(GLMASTCSV) TOFILE(GLMAST)

The above command builds records in the file GLMAST in *LIBL from CSV records in the file GLMASTCSV also in *LIBL. Files such as GLMASTCSV typically have a single field with a suitable record length e.g. CRTPF FILE(QGPL/GLMASTCSV) RCDLEN(500). The PCPYFRMCSV command does not update any records in GLMASTCSV. PCPYFRMCSV command defaults mean that records are added into GLMAST rather than replaced, and that CR/LF (CarriageReturn/Linefeed) characters are presumed at the end of every record in GLMASTCSV.

PCPYFRMCSV FROMFILE(GLDATA/GLMASTCSV) TOFILE(GLDATA/GLMAST) FLDDLM('|') MBROPT(*REPLACE) EOR(*CHAR) EORCHAR(X'0B')

The above command converts all CSV records in the file GLMASTCSV in GLDATA to the file GLMAST also in GLDATA. Fields in the input file GLMASTCV are separated by the bar character '|'. Records are replaced in GLMAST rather than appended, and the hexadecimal character X'0B' is expected at the end of every record in GLMASTCSV.


Restrictions

The TOFILE must already exist. It is not practical for PCPYFRMCSV to 'guess' what file layout might be required and then create the target file automatically.

There is a hard-coded limitation of 4096 characters (4KB) on the size of any CSV record that PCPYFRMCSV processes. This could be increased easily if it represents a problem.


Processing

CSV fields are converted to database file fields using the following rules:

1) Decimal numeric fields: Full stop '.' for decimal place. Leading '-' for negative values (+ve values unsigned). No other characters except 0-9.
2) Dates: always CCYYMMDD (8 digits numeric - no separators)
3) Times: always HHMMSS (6 digits numeric - no separators)
4) Timestamp: always CCYYMMDDHHMMSSnnnnnn where nnnnnn is milli-seconds (20 digits numeric - no separators).
5) All data types 1) to 4) above must not be delimited by quotes.
6) The FLDDLM keyword on the command determines what character is used to separate fields. This defaults to ',' for comma-separation. You can use hexadecimal values for non-visible characters e.g. X'07'. The separator character should not appear before the first field or after the last field on each input record - only ever between fields.
7) The EOR keyword on the command determines what character(s) are used to indicate end of record. This defaults to *CRLF for the CarriageReturn/LineFeed (hexadecimal X'0D0A') character combination - typical on Windows-based systems. If you specify EOR(*CHAR), then you can use the EORCHAR keyword to specify any single character to use for end of record e.g. EORCHAR('$') or EORCHAR(X'FF').
8) Alphanumeric fields: May be delimited by quotes "". In this case, any quotes embedded in alphanumeric field values will cause problems during our conversion process. If not delimited by quotes, then any separator character (comma etc.) embedded in alphanumeric field values will also cause problems during our conversion process. We recommend you delimit alphanumeric fields by quotes if the field value may contain separator characters (e.g. commas in an address field). If the alphanumeric field will not contain separator characters (e.g. a customer code), then you need not delimit it by quotes.
9) If a source CSV field has more characters than the target database field, excess right-hand characters are silently ignored. For example if you copy "ABCD" to a 3-character database field, it will silently truncate to "ABC". It is probably not good practice to do this, because PCPYFRMCSV should be used for simple copying, not data manipulation.
10) If the database file has more fields than the CSV file, the additional fields in the database file are set to default values - blank, zero etc. If the CSV file has more fields than the database file, the additional CSV fields are ignored.


IBM rebranded the AS/400 - some users called it AS400 - to ISeries several years ago and then to System i. All 4 names - ISeries, AS/400, AS400, System i - refer to exactly the same system.