AIX Replacing a failed physical volume in a mirrored volume group (P-Series 5.3)

Credit: IBM web
Replacing a failed physical volume in a mirrored volume group
The following procedures replace a failed physical volume (PV) within a mirrored volume group. The replacepv command provides a method for replacing a failed PV in most configurations. An alternative procedure is also provided for configurations where the replacepv command cannot be used.
The information in this how-to was tested using AIX® 5.3. If you are using a different version or level of AIX, the results you obtain might vary significantly.
Prerequisites
· All logical volumes using the failed PV have valid copies on other available PVs (with the possible exception of a dedicated dump logical volume).
Replacing a failed PV using the replacepv command
Prerequisites
If any of the prerequisites listed below cannot be met, see the alternate procedure.
· The volume group containing the failed PV is not rootvg.
· The replacement PV can be added to the volume group containing the failed PV (this might not be possible depending on the PV size and volume group characteristics, such as MAX PPs per PV).
· The replacement PV must be able to be configured into the system at the same time as the failing PV.
· The replacement PV's name can differ from the failed PV's name.
· The size of the replacement PV must be at least the size of the failed PV.
· The volume group containing the failed PV must not be a snapshot volume group or have a snapshot volume group.
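Two of these prerequisites (the volume group is not rootvg, and the replacement PV is at least as large as the failed PV) can be checked mechanically before choosing a procedure. The following is a minimal sketch, not part of the IBM procedure: the function name can_use_replacepv is hypothetical, and the caller is assumed to supply the volume group name and both PV sizes in MB. The remaining prerequisites still need to be checked by hand.

```shell
# Hypothetical helper: checks two of the replacepv prerequisites listed above.
# Arguments: volume group name, failed PV size in MB, replacement PV size in MB.
# How the sizes are obtained is left to the caller (an assumption of this sketch).
can_use_replacepv() {
    vg=$1
    failed_mb=$2
    new_mb=$3

    # replacepv cannot be used when the failed PV belongs to rootvg.
    if [ "$vg" = "rootvg" ]; then
        echo "no: volume group is rootvg"
        return 1
    fi

    # The replacement PV must be at least the size of the failed PV.
    if [ "$new_mb" -lt "$failed_mb" ]; then
        echo "no: replacement PV is smaller than the failed PV"
        return 1
    fi

    echo "ok"
    return 0
}
```

For example, can_use_replacepv datavg 70000 140000 prints ok, while any call with rootvg as the first argument points to the alternate procedure.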
Complete the following steps, assuming that the failed PV is hdisk2 and the replacement PV is hdisk10:
1. If the replacement PV is not yet installed on the system, perform the steps necessary to install it. To use the configuration manager to define a new PV, run the following command:
cfgmgr
Use the lspv command to determine the name assigned to the PV. For this example, assume that the new PV is named hdisk10.
2. To replace the failed PV with the one defined in Step 1, run the following command:
replacepv hdisk2 hdisk10
When the command runs, hdisk2 is replaced by hdisk10, and hdisk2 is no longer assigned to a volume group.
3. To undefine the failed PV, run the following command:
rmdev -dl hdisk2
4. Physically remove the failed disk from the system.
5. Verify that the procedure was successful by completing the following steps:
· To check that all logical volumes are mirrored to the new PV as desired, run the following command:
lslv lvname
Check the COPIES attribute of each logical volume affected by the failed PV to ensure that the desired number of copies now exist. If the number of copies of the logical volume is below the desired number, use the mklvcopy command to create additional copies.
· To verify that all logical volume partitions are synchronized and there are no stale partitions, run the following command:
lspv hdisk10
Check the STALE PARTITIONS attribute of the replaced PV to ensure that the count is zero. If there are stale partitions, use the syncvg command to synchronize them.
Step 5 completes the replacement procedure for a failed PV.
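The stale-partition check in Step 5 can be scripted by extracting the count from the lspv output. The sketch below assumes that the output contains a line of the form "STALE PARTITIONS:   0   ALLOCATABLE:   yes"; the function name stale_count is hypothetical, and the exact layout should be confirmed on your AIX level before relying on it.

```shell
# Hypothetical helper: reads `lspv <pv>` output on stdin and prints the
# number that follows the "STALE PARTITIONS:" label.
# Assumes the label and its value appear on the same line.
stale_count() {
    awk '/STALE PARTITIONS:/ {
        for (i = 2; i < NF; i++) {
            if ($(i - 1) == "STALE" && $i == "PARTITIONS:") {
                print $(i + 1)
                exit
            }
        }
    }'
}

# Usage on an AIX system (not run here):
#   lspv hdisk10 | stale_count
```

A result other than 0 indicates that syncvg still needs to be run for the volume group.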
Replacing a failed PV when the configuration does not allow the use of the replacepv command
Assume that the failed physical volume, hdisk0, and its mirror, hdisk1, are part of the yourvg volume group.
1. To remove mirror copies from the failed PV, run the following command:
unmirrorvg yourvg hdisk0
2. If the PV failure occurred on rootvg, remove hdisk0 from the boot list by running the following command:
Note: If your configuration uses boot devices other than hdisk0 and hdisk1, add them to the command syntax.
bootlist -om normal hdisk1
This step requires that hdisk1 remains a bootable device in rootvg. After completing this step, ensure that hdisk0 does not appear in the command output.
3. If the PV failure occurred on rootvg, recreate any dedicated dump devices from the failed PV.
If you have a dedicated dump device that was on the failed PV, you can use the mklv command to create a new logical volume on an existing PV. Use the sysdumpdev command to set the new logical volume as the primary dump device.
4. To remove the failed PV from the volume group and undefine it, run the following commands:
Note: Removing the disk device entry will also remove the /dev/ipldevice hard link if the failed PV is the PV used to boot the system.
reducevg yourvg hdisk0
rmdev -dl hdisk0
5. If the failed PV is the most recently used boot device, recreate the /dev/ipldevice hard link that was removed in Step 4 by running the following command:
ln /dev/rhdisk1 /dev/ipldevice
Note the r prefixed to the PV name.
To verify that your /dev/ipldevice hard link has been recreated, run the following command:
ls /dev/ipldevice
6. Replace the failed disk.
7. To define the new PV, run the following command:
cfgmgr
The cfgmgr command assigns a PV name to the replacement PV. The assigned PV name is likely to be the same as the PV name previously assigned to the failed PV. In this example, assume that the device hdisk0 is assigned to the replacement PV.
8. To add the new PV to the volume group, run the following command:
extendvg yourvg hdisk0
You might encounter the following error message:
0516-050 Not enough descriptor space left in this volume group.
Either try adding a smaller PV or use another volume group.
If you encounter this error and cannot add the PV to the volume group, you can try to mirror logical volumes to another PV that already exists in the volume group or add a smaller PV. If neither option is possible, you can try to bypass this limitation by upgrading the volume group to a Big-type or Scalable-type volume group using the chvg command.
9. Mirror the volume group.
Note: The mirrorvg command cannot be used if all of the following conditions exist:
· The target system is a logical partition (LPAR).
· A copy of the boot logical volume (by default, hd5) resides on the failed PV.
· The replacement PV's adapter was dynamically configured into the LPAR since the last cold boot.
If all of the above conditions exist, use the mklvcopy command to recreate mirror copies for each logical volume as follows:
a. Create copies of the boot logical volume to ensure that it is allocated to a contiguous series of physical partitions.
b. Create copies of the remaining logical volumes, and synchronize the copies using the syncvg command.
c. Make the disk bootable by shutting down the LPAR and then activating it, rather than rebooting with the shutdown or reboot commands. This shutdown does not have to be done immediately, but it is necessary for the system to boot from the new PV.
Otherwise, create new copies of logical volumes in the volume group using the new PV with the following command:
Note: The mirrorvg command disables quorum by default. For rootvg, you will want to use the -m option to ensure that the new logical volume copies are mapped to hdisk0 in the same way as the working disk.
mirrorvg yourvg hdisk0
10. If your configuration holds third copies of some logical volumes, you might need to recreate those copies with the mklvcopy command (the -k flag synchronizes the new copies):
mklvcopy -k
11. If the PV failure occurred on rootvg, initialize the boot record by running the following command:
bosboot -a
12. If the PV failure occurred on rootvg, update the boot list by running the following command:
Note: If your configuration uses boot devices other than hdisk0 and hdisk1, add them to the command.
bootlist -om normal hdisk0 hdisk1
13. Verify that the procedure was successful.
· To verify that all logical volumes are mirrored to the new PV, run the following command:
lslv lvname
Check the COPIES attribute of each logical volume affected by the failed PV to ensure that the desired number of copies now exist. If the number of copies of the logical volume is below the desired number, use the mklvcopy command to create additional copies.
· To verify that all the logical volume partitions are synchronized, check that there are no stale partitions by running the following command:
lspv hdisk0
Check the STALE PARTITIONS attribute of the replaced PV to ensure that the count is zero. If there are stale partitions, use the syncvg command to synchronize them.
14. If the PV failure occurred on rootvg, use the following steps to verify other aspects of this procedure:
· To verify the boot list, run the following command:
bootlist -om normal
· To verify the dump device, run the following command:
sysdumpdev -l
· To verify the list of bootable PVs, run the following command:
ipl_varyon -i
· To verify the /dev/ipldevice hard link, run the following command:
ls -i /dev/rhdisk1 /dev/ipldevice
Ensure the output of the ls command has the same i-node number for both entries.
15. This step completes the procedure.
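The i-node comparison in Step 14 can also be done by a small script rather than by eye. This is a minimal sketch under the assumption that ls -id prints the i-node number as the first field; the function name same_inode is hypothetical and not part of the IBM procedure.

```shell
# Hypothetical helper: succeeds (exit status 0) when both paths exist and
# share the same i-node number, that is, when they are hard links to one file.
same_inode() {
    a=$(ls -id "$1" 2>/dev/null | awk '{print $1}')
    b=$(ls -id "$2" 2>/dev/null | awk '{print $1}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage on an AIX system (not run here):
#   same_inode /dev/rhdisk1 /dev/ipldevice && echo "ipldevice link is correct"
```

If the function fails, recreate the hard link as described in Step 5 and run the check again.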
