Hardware Setup Verification
Verify Device Availability
Use lspci to check RoCE and ISM device availability:
$ lspci
0000:00:00.0 Ethernet controller: Mellanox Technologies MT27500/MT27520
Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
0001:00:00.0 Non-VGA unclassified device: IBM Internal Shared Memory (ISM)
virtual PCI device
z/VM Only: Verify Card Attachment
$ vmcp 'QUERY PCIFUNCTION'
PCIF 00000280 ATTACHED TO S8360018 00000280 ENABLED 10GbE RoCE
PCIF 000002E2 ATTACHED TO S8360018 000002E2 ENABLED ISM
IBM Z Only: PNET ID Verification
Verify that PNET IDs are set and match. Use smc_rnics and/or smc_dbg if available. Otherwise:
# RoCE device
$ cat /sys/class/net/ens1/device/util_string | iconv -f IBM-1047 -t ASCII
NetworkA
# ISM device with PCI ID 0001:00:00.0 according to lspci:
$ cat /sys/bus/pci/devices/0001:00:00.0/util_string | iconv -f IBM-1047 \
-t ASCII
NetworkA
# OSA or HiperSockets device
$ cat /sys/devices/css0/chp0.`cat /sys/class/net/enccw0.0.f500/device/chpid`/util_string | iconv -f IBM-1047 -t ASCII
NetworkA
OS Setup Verification
RoCE Express Cards: Verify Port and NIC States:
$ cat /sys/class/infiniband/mlx4_0/ports/1/phys_state
5: LinkUp
$ cat /sys/class/infiniband/mlx4_0/ports/1/state
4: ACTIVE
$ ip link show ens2
3: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 \
qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 82:03:14:32:f1:a0 brd ff:ff:ff:ff:ff:ff
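The two sysfs checks above can be wrapped in a small helper. This is a minimal sketch, not part of smc-tools: the function name check_roce_port is hypothetical, and it assumes that the healthy values are exactly the "5: LinkUp" and "4: ACTIVE" strings shown in the sample output.

```shell
# check_roce_port PORT_DIR   (hypothetical helper name)
# PORT_DIR is a sysfs port directory such as
# /sys/class/infiniband/mlx4_0/ports/1
# Returns 0 if the port reports "5: LinkUp" and "4: ACTIVE",
# 1 if either value differs, 2 if a file cannot be read.
check_roce_port() {
    port_dir=$1
    phys=$(cat "$port_dir/phys_state" 2>/dev/null) || return 2
    state=$(cat "$port_dir/state" 2>/dev/null) || return 2
    [ "$phys" = "5: LinkUp" ] && [ "$state" = "4: ACTIVE" ]
}
```

A call such as check_roce_port /sys/class/infiniband/mlx4_0/ports/1 followed by a test of $? would cover both cat commands in one step. The interface state reported by ip link still needs to be checked separately.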
RoCE Express Cards with VLANs: Verify Interfaces
Make sure the VLAN interface exists and has an IP address assigned. Note that the interface can be down.
$ ip addr show ens2.201
7: ens2.201: <BROADCAST,MULTICAST,DOWN,LOWER_UP> mtu 1500 qdisc \
fq_codel master virbr0 state UNKNOWN group default qlen 1000 \
link/ether fe:54:00:f9:cf:be brd ff:ff:ff:ff:ff:ff
inet 192.168.23.42/24 scope global vnet3
valid_lft forever preferred_lft forever
Check Available Free Memory via /proc/meminfo
$ grep "^Mem" /proc/meminfo
MemTotal: 1710584 kB
MemFree: 83404 kB
MemAvailable: 1125752 kB
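For scripted checks, the MemAvailable value can be extracted on its own. The following is a minimal sketch; the function name mem_available_kb is hypothetical, and the optional file argument exists only so the function can be tried against a saved copy of /proc/meminfo.

```shell
# mem_available_kb [FILE]   (hypothetical helper name)
# Prints the MemAvailable value in kB from FILE (default: /proc/meminfo).
# Returns non-zero if the field is not present.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2; found = 1 }
         END { exit !found }' "${1:-/proc/meminfo}"
}
```

With the sample output above, mem_available_kb prints 1125752. What counts as "enough" depends on the workload, so the sketch deliberately leaves any threshold to the caller.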
If things do not work: Check smcss Output for Reason Code
$ smcss -a
State UID Inode Local Address [...] Intf Mode
ACTIVE 20000 115762 10.101.4.8:60594 [...] 0000 TCP 0x05000000/0x0000521e
See the smcss man page (smc-tools v1.3 or later, or see here for the latest version) for an explanation of the reason codes.
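On a busy system it can help to list only the connections that fell back to TCP. The following is a rough sketch that assumes the column layout of the sample output above: the local address is the fourth whitespace-separated field, and a fallback shows up as the word TCP followed by a reason code starting with 0x. The function name smc_fallbacks is hypothetical, and other smc-tools versions may format the output differently.

```shell
# smc_fallbacks   (hypothetical helper name)
# Reads `smcss -a` output on stdin and prints, for every connection in
# TCP mode, the local address and the reason code.
# Assumes field 4 is the local address and that the mode column reads
# "TCP" followed by a code beginning with "0x", as in the sample output.
smc_fallbacks() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "TCP" && $(i + 1) ~ /^0x/)
                print $4, $(i + 1)
    }'
}
```

Typical usage would be smcss -a | smc_fallbacks, which for the sample line above prints 10.101.4.8:60594 0x05000000/0x0000521e.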
Troubleshooting z/OS
The “Autonomics” function might disable SMC for connections to peers that are unlikely to benefit from it (most typically short-lived connections exchanging little data) for a certain period of time. To turn this off, add NOAUTOSMC to the TCP/IP profile as follows:
GLOBALCONFIG SMCGLOBAL NOAUTOSMC
See here for further details.
How to enable Unruly Applications
Some applications have involved startup procedures that will not easily work with smc_run.
Example
DB2 requires registration of environment variables through the db2set command:
$ db2set -i db2inst1 DB2ENVLIST="LD_LIBRARY_PATH LD_PRELOAD"
$ smc_run db2start
If smc_run does not work for an application with PID <p>:
Check /proc/<p>/environ whether LD_PRELOAD is set correctly, pointing to libsmc-preload.so!
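Entries in /proc/<p>/environ are separated by NUL bytes, so a plain cat or grep is hard to read. The following is a minimal sketch of the check; the function names preload_from_environ and has_smc_preload are hypothetical and take the path of the environ file as their argument.

```shell
# preload_from_environ ENVIRON_FILE   (hypothetical helper name)
# Prints the value of LD_PRELOAD found in a NUL-separated environ file,
# or nothing if the variable is not set.
preload_from_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^LD_PRELOAD=//p'
}

# has_smc_preload ENVIRON_FILE   (hypothetical helper name)
# Returns 0 if LD_PRELOAD mentions libsmc-preload.so.
has_smc_preload() {
    preload_from_environ "$1" | grep -q 'libsmc-preload\.so'
}
```

For a process with PID 1234, has_smc_preload /proc/1234/environ succeeds only if the preload library was part of the environment the process was started with. Reading another user's environ file usually requires root.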
Alternative Approaches
Set LD_PRELOAD in the user ID’s profile that starts the respective processes, e.g. the DB2 instance owner:
$ echo "export LD_PRELOAD=libsmc-preload.so" >> ~/.profile
Or use /etc/ld.so.preload to enable the entire system:
$ cat /etc/ld.so.preload
libsmc-preload.so
Note: This will add the preload library to all processes on the entire system or all processes started by the respective user! This includes processes not performing any socket calls at all, e.g. the ls command.
Therefore, always prefer usage of smc_run.