z/OS Metal I/O – Making Developers' Lives Better


Photo of a historic hard disk drive by Bernd 📷 Dittrich on Unsplash

Updated June 5th, 2025

Motivation

z/OS I/O has always been truly impressive, and it’s a fundamental strength of the operating system. Most people access the I/O services through middleware such as CICS, IMS, or Db2. Since I like tinkering with things (and am somewhat of a glutton for punishment), I wanted to be able to read and update a number of PDSE members really efficiently using BPAM services. BPAM has a performance advantage over BSAM services and QSAM services when working with multiple members because you can open the PDSE, work with the members you need, then close the PDSE. BSAM and QSAM open up a particular PDSE member, process it, then close it. So if you want to work with several PDSE members using BSAM or QSAM, you end up opening and closing the PDSE multiple times.

If you use fopen, fread/fwrite, and fclose in C, there is no way (that I know of) to use the BPAM services; the runtime uses BSAM or QSAM instead. C does let you control which of BSAM or QSAM you get, depending on the string you pass to fopen.

I also wanted to be able to read and write extended attributes on PDSE members including the CCSID, timestamp, and user for an individual member. The Directory Entry Services (DESERV) and the Update Partitioned Dataset Directory (STOW) macros are used to read and write extended attributes. Those require BPAM. They also require that you write the code in HLASM (High Level Assembler). I do wonder if anyone thought High Level Assembler was an oxymoron when they came up with the name…

As an aside, if you are asking ‘wtf is a CCSID?’ it’s a ‘tag’ that describes how you should interpret data. Wikipedia can tell you more. Much more. Most people on z/OS use one of 3 tags: ASCII (819), EBCDIC (often 1047, but could be another variant of EBCDIC like 37), and UTF-8 (1208).
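To make the idea of a tag concrete, here is a minimal sketch of a lookup table for the CCSIDs mentioned above. The struct and function names are made up for illustration and are not part of the project's code.

```c
#include <stddef.h>

/* Hypothetical table of the CCSIDs named in the text. */
struct ccsid_name {
    unsigned short ccsid;
    const char    *name;
};

static const struct ccsid_name ccsid_names[] = {
    {  819, "ISO8859-1 (ASCII)" },
    { 1047, "IBM-1047 (EBCDIC)" },
    {   37, "IBM-037 (EBCDIC)"  },
    { 1208, "UTF-8"             },
};

/* Return a printable name for a CCSID, or "unknown" if it is not in the table. */
const char *ccsid_to_name(unsigned short ccsid)
{
    for (size_t i = 0; i < sizeof ccsid_names / sizeof ccsid_names[0]; i++) {
        if (ccsid_names[i].ccsid == ccsid) {
            return ccsid_names[i].name;
        }
    }
    return "unknown";
}

/* The letter 'A' is byte 0x41 in ASCII and UTF-8, but 0xC1 in EBCDIC 1047 and 37. */
unsigned char letter_a_for_ccsid(unsigned short ccsid)
{
    return (ccsid == 1047 || ccsid == 37) ? 0xC1 : 0x41;
}
```

The second function is the whole point of the tag: the same character is stored as a different byte depending on the CCSID, so a tool has to know the tag before it can display or compare the data.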

Looking into the future a bit, having architected timestamps, CCSIDs, and user information for PDSE members opens up many possibilities. Tools that compare timestamps, such as build tools, could be enhanced to support PDSE members. Tools that edit, search, or compare PDSE members could be enhanced to support CCSIDs. Imagine editing an ASCII or UTF-8 PDSE member with your favourite editor and having it just work.

As compelling a story as it is, it also seemed overwhelming. I ended up pushing the idea into the back of my head for a few years. I found the energy to tackle the challenge in the spring of 2024, which is when I started creating C interfaces that front-ended the assembler services.

Overview

The code is organized in a few layers. The lowest layer is a set of assembler routines that call the low-level macros. These are all in dioa.s and mema.s and include:

S99A: dynamically allocate a DDName to a dataset
S99MSGA: print out a human-readable dynamic allocation error message
OPENA: open a dataset
FINDA: find the start of a PDS or PDSE member
READA: read a block of data
WRITEA: write a block of data
CHECKA: check if the input or output operation is complete
NOTEA: get the current location in the dataset
DESERVA: get directory information for one or more members
STOWA: update directory information for a member
CLOSEA: close a dataset
MALOC24A: allocate memory ‘below the line’ (24-bit addressable memory)
FREE24A: free 24-bit allocated memory

The next layer in the stack is the set of C services in dio.c that wrap the assembler routines. The wrapping lets the C code be built as either 31- or 64-bit even though the assembler routines are all 31-bit. If you’re interested in the gory details, check out the 64-31 thunking code in call31a.s. There you’ll find the assembler macro magic and C wizardry that makes it all tick.

One of the harder bits of work was understanding the control blocks (or data structures if you prefer) that are used to interface with these low-level services. In C terms, they typically map to a union of structures that are self-describing based on the value of bits in fields. The most impressive control block is likely IHADCB, which I mapped to ‘struct ihadcb’ using the DSECT conversion utility provided with the IBM C compiler.
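As a rough illustration of what "a union of structures that are self-describing" looks like in C, here is a small sketch. The layout, field names, and flag bit are invented for this example and do not correspond to IHADCB or any real control block.

```c
#include <stdint.h>

/* Hypothetical flag bit: when set, the extended layout is in use. */
#define CB_FLAG_EXTENDED 0x80u

struct cb_basic {
    uint8_t  flags;
    uint8_t  reserved;
    uint16_t block_size;
};

struct cb_extended {
    uint8_t  flags;
    uint8_t  reserved;
    uint16_t block_size;
    uint32_t ccsid;        /* only present in the extended layout */
};

union control_block {
    uint8_t            flags;  /* first byte is common to every layout */
    struct cb_basic    basic;
    struct cb_extended ext;
};

/* Return the CCSID if the block carries one, otherwise 0. */
uint32_t cb_ccsid(const union control_block *cb)
{
    return (cb->flags & CB_FLAG_EXTENDED) ? cb->ext.ccsid : 0;
}
```

The reader always inspects the flag bits first and only then decides which structure in the union is valid. Real control blocks follow the same pattern, just with many more fields and bits.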

Many of the I/O control blocks have to be allocated in 24-bit storage. As a result, you can’t use the C stack or malloc(), because both live in 31- or 64-bit storage. I ended up creating the MALOC24A and FREE24A services to fill that gap.
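For anyone unfamiliar with the terminology, "below the line" means below 16 MB (2^24) and "below the bar" means below 2 GB (2^31). The following sketch shows the address arithmetic only. The helper names are hypothetical and are not part of dio.c, and the checks say nothing about how MALOC24A actually obtains the storage.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_16MB 0x01000000ull   /* 2^24: the 'line' */
#define BAR_2GB   0x80000000ull   /* 2^31: the 'bar'  */

/* True if the len bytes starting at addr are all 24-bit addressable. */
bool fits_below_line(uint64_t addr, uint64_t len)
{
    return len <= LINE_16MB && addr <= LINE_16MB - len;
}

/* True if the len bytes starting at addr are all 31-bit addressable. */
bool fits_below_bar(uint64_t addr, uint64_t len)
{
    return len <= BAR_2GB && addr <= BAR_2GB - len;
}
```

A check like fits_below_line() is handy as a sanity assertion before handing a control block to a service that requires 24-bit storage.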

I created some simple test routines to try out the services – in particular:

basicalloc:
- allocate a DDNAME to SYS1.MACLIB (*)¹
- use standard C I/O to read the member DYNALLOC using the special fopen path syntax “//DD:DDNAME(DYNALLOC)”.
basiccreate: Create a PDSE member:
- allocate a system-generated DDNAME
- open the PDSE passed in
- write out a block of data in the PDSE (ASCII A characters)
- perform a CHECK and NOTE to ensure the data was written
- perform a STOW which will create the PDSE member with extended attributes, including setting the CCSID of the member to 819 (ASCII)
- close the PDSE
- free the DDNAME
basicread: Read a PDSE member:
- allocate a system-generated DDNAME
- open the PDSE passed in
- perform a DESERV to get the extended attributes of the PDSE member and print out the CCSID, userid, timestamp
- FIND the member to read
- READ the first block of the member
- CHECK the block was read
- READ another block
- CLOSE the PDSE
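The special fopen path that basicalloc relies on can be assembled with a small helper. This is a sketch, not code from the test itself: the function names are hypothetical, the "r" open mode is an assumption, and the fopen call will only succeed on z/OS when the DD name has already been allocated.

```c
#include <stdio.h>

/* Build "//DD:ddname(member)" into buf.
 * Returns 0 on success, -1 if the result did not fit. */
int build_dd_path(char *buf, size_t size, const char *ddname, const char *member)
{
    int n = snprintf(buf, size, "//DD:%s(%s)", ddname, member);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Open a member for reading through an already-allocated DD name.
 * Returns NULL if the path is too long or the open fails. */
FILE *open_dd_member(const char *ddname, const char *member)
{
    char path[64];
    if (build_dd_path(path, sizeof path, ddname, member) != 0) {
        return NULL;
    }
    return fopen(path, "r");   /* assumed mode: plain text read */
}
```

For example, build_dd_path(buf, sizeof buf, "MYDD", "DYNALLOC") produces the string //DD:MYDD(DYNALLOC), where MYDD stands in for whatever DD name the allocation step returned.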

These simple routines let me understand how the BPAM services work in practice for the different dataset record formats, which I hope will be helpful to others. I wasn’t able to find examples of how to do a basic read or write of a member using BPAM and working with the extended attributes. I also created a reference where I found information, which seemed to be scattered across a number of different books.

The assembler, C, and testcases are all under the core directory and can be treated as a stand-alone set of code that is built into an archive.

C BPAM I/O Services

In the spring of 2025, I went back to the code and refactored it, creating a new layer called services. The header files describe the functions that this directory of code provides. Like core, the services are built into their own archive, and can be used by binding both this archive and the core archive with your code (because these services are built on the core archive code). The important functions are described in:

bpamio.h : open, find, read, write, flush, close, ENQ, DEQ
memdir.h : openmemdir, readmemdir, writememdir, closememdir, create_mstat, print_member

Copying Files to PDSE Members

With these core services under my belt, I decided to tackle a problem I occasionally trip over: I use git to manage all my source code, but the build process on z/OS expects the code to be in PDSE members. In the past I have used various forms of copy to get the files into datasets, but it’s a hassle, and when many of my files are tagged with different CCSIDs (code pages), it can become a bit of a mess. So I decided to use my newfound knowledge to create a utility to copy a directory of files to a set of datasets. I wanted to be able to:

specify a directory of files that might be in a variety of code pages.
specify a target dataset pattern to copy those files to.
only open each PDSE once to write all the members, making it fast.
use a common buffer for the input and output, making it fast.
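The "common buffer" goal boils down to reading into a buffer and writing from that same buffer, with no intermediate copy. Here is a minimal sketch of the idea using plain stdio streams on both sides. In f2m the output side is a BPAM write rather than fwrite, so treat this as an illustration of the pattern, not of the actual code.

```c
#include <stdio.h>

/* Copy everything from in to out through one caller-supplied buffer.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out, char *buf, size_t bufsize)
{
    long   total = 0;
    size_t n;

    while ((n = fread(buf, 1, bufsize, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            return -1;
        }
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

Because the caller owns the buffer, the same storage can be reused for every file in the directory instead of being allocated and freed per file.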

As an example, I might have C files, COBOL files, HLASM files, listings, and macros in a directory called src that I want to copy into datasets. I also use the low-level qualifier of the dataset to indicate the ‘type’ of the dataset, e.g. the FULTONM.DIO.C PDSE would hold C code and FULTONM.DIO.LST would hold listings.

In the preceding example, I can use the file-to-member copy program, which I call f2m, to do the copy as follows:

f2m src usr.dio '*.c' '*.lst' '*.cbl' '*.s' '*.mac'

or if I wanted to just copy everything from the src directory, I could use the shorter:

f2m src usr.dio '*.*'

Note the single quotes are important – you don’t want the shell to perform file expansion on these file patterns.
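When the patterns reach the program unexpanded, the program has to do the matching itself. One way to do that in C is the POSIX fnmatch() function. The sketch below is an assumption about how such matching could be done, not a description of what f2m does internally, and matches_any is a hypothetical helper name.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if name matches any of the n shell-style patterns, else 0. */
int matches_any(const char *name, const char *const patterns[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (fnmatch(patterns[i], name, 0) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With the patterns '*.c' and '*.lst', the name dio.c matches and dio.h does not. Note that under fnmatch() rules a pattern such as '*.*' only matches names that contain a dot.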

The syntax is a bit clunky. I should probably clean it up a bit. But it’s also open source with an Apache Version 2.0 License, so feel free to fork it and hack away.

The files will be copied to the correct datasets, and they retain the CCSID they are tagged with. So an ASCII file is copied as ASCII and an EBCDIC file is copied as EBCDIC. For tools like ISPF edit, you will need to specify that the file is in ASCII when you go to edit it, although perhaps ISPF edit will add support for the CCSID encoding in the PDSE member at some point.
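To show what "the correct datasets" means in terms of names, here is a sketch that maps a file path to a dataset and member name, using the file extension as the low-level qualifier. The mapping rule and the function name are my assumptions based on the FULTONM.DIO.C example above, and f2m may handle details differently. The sketch only enforces the 8-character limit on qualifiers and member names, not the other naming rules.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Map e.g. ("FULTONM.DIO", "src/dio.c") to "FULTONM.DIO.C(DIO)".
 * Returns 0 on success, -1 if the name has no usable extension,
 * is too long for a member or qualifier, or does not fit in out. */
int file_to_member(const char *prefix, const char *path, char *out, size_t size)
{
    const char *base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    const char *dot = strrchr(base, '.');
    if (dot == NULL || dot == base || dot[1] == '\0') {
        return -1;
    }

    size_t name_len = (size_t)(dot - base);
    size_t ext_len  = strlen(dot + 1);
    if (name_len > 8 || ext_len > 8) {
        return -1;
    }

    char member[9];
    char llq[9];
    for (size_t i = 0; i < name_len; i++) {
        member[i] = (char)toupper((unsigned char)base[i]);
    }
    member[name_len] = '\0';
    for (size_t i = 0; i < ext_len; i++) {
        llq[i] = (char)toupper((unsigned char)dot[1 + i]);
    }
    llq[ext_len] = '\0';

    int n = snprintf(out, size, "%s.%s(%s)", prefix, llq, member);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Grouping the results by dataset name is what makes it possible to open each PDSE once and write all of its members in a single pass.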

The code to do the copy builds on the C functions outlined above. The ‘main’ program is f2m.c which uses the higher-level services. The I/O buffer management is messier than I would like – I’m hoping for some inspiration to refactor it a bit to make it simpler.

f2m is just one of the samples I have written to highlight these BPAM services. The list of samples so far includes:

readwriterec: create a PDSE member and then read it back again.
f2m: copy files to members
m2f: copy members to files
mlsx: list members – extended – which also shows extended attributes and ISPF attributes

If anyone else is interested in contributing and fleshing out some more services and/or samples, please reach out.

My hope is that these samples and tests will make it easier for others to exploit low-level z/OS I/O services. Please note that the code is only lightly tested, so beware. If you do find a bug, feel free to open an issue.

If you want to try out the services, I would recommend reading the code for readwriterec, which shows how to read and write PDSE members with BPAM I/O.

As always, I asked A LOT of questions to figure out how this stuff works. I am indebted to many people in both IBM and outside of IBM who helped me out as I learned the basics of z/OS HLASM I/O.

In particular, Tom Brennan, David Crayford, @wizardofzos, Kirk Wolf, Lionel Dyck, Wayne Rhoten, Steve Smith, James Mulder, Peter Relson, Tabari Alexander, David Shackelford, Joseph Reynolds, and Sri Hari Kolusu provided significant technical guidance in response to my barrage of questions. My apologies to those I missed – you are appreciated!

My special thanks to Anthony Giorgio and Igor Todorovski for providing guidance and feedback in the writing of this article.
