Daemons
clexecd | This is used by cluster kernel threads to execute userland commands (such as the run_reserve and dofsck commands). It is also used to run cluster commands remotely (like the cluster shutdown command). This daemon registers with failfastd so that a failfast device driver will panic the kernel if this daemon is killed and not restarted in 30 seconds. |
cl_ccrad | This daemon provides access from userland management applications to the CCR. It is automatically restarted if it is stopped. |
cl_eventd | The cluster event daemon registers and forwards cluster events (such as nodes entering and leaving the cluster). There is also a protocol whereby user applications can register themselves to receive cluster events. The daemon is automatically respawned if it is killed. |
cl_eventlogd | The cluster event log daemon logs cluster events into a binary log file. At the time of writing there is no published interface to this log. It is automatically restarted if it is stopped. |
failfastd | This daemon is the failfast proxy server. The failfast daemon allows the kernel to panic if certain essential daemons have failed. |
rgmd | The resource group management daemon, which manages the state of all cluster-unaware applications. A failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds. |
rpc.fed | This is the fork-and-exec daemon, which handles requests from rgmd to spawn methods for specific data services. A failfast driver panics the kernel if this daemon is killed and not restarted in 30 seconds. |
rpc.pmfd | This is the process monitoring facility. It is used as a general mechanism to initiate restarts and failure action scripts for some cluster framework daemons (in Solaris 9 OS), and for most application daemons and application fault monitors (in Solaris 9 and 10 OS). A failfast driver panics the kernel if this daemon is stopped and not restarted in 30 seconds. |
pnmd | The public network management (PNM) daemon manages network status information received from the local IPMP daemon running on each node and facilitates application failovers caused by complete public network failures on nodes. It is automatically restarted if it is stopped. |
scdpmd | Disk path monitoring daemon monitors the status of disk paths, so that they can be reported in the output of the cldev status command. It is automatically restarted if it is stopped. |
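A quick way to confirm that these daemons are actually running on a node is simply to check the process list (a minimal sketch; the exact set of processes varies between Sun Cluster releases):
# ps -ef | egrep 'clexecd|cl_ccrad|cl_eventd|cl_eventlogd|failfastd|rgmd|rpc.fed|rpc.pmfd|pnmd|scdpmd'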
File locations
man pages | /usr/cluster/man |
log files | /var/cluster/logs, /var/adm/messages |
sccheck logs | /var/cluster/sccheck/report.<date> |
CCR files | /etc/cluster/ccr |
Cluster infrastructure file | /etc/cluster/ccr/infrastructure |
SCSI reservations
Display reservation keys | SCSI-2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
SCSI-3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2 |
Determine the device owner | SCSI-2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
SCSI-3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2 |
Cluster information
Quorum info | scstat -q |
Cluster components | scstat -pv |
Resource/Resource group status | scstat -g |
IP Networking Multipathing | scstat -i |
Status of all nodes | scstat -n |
Disk device groups | scstat -D |
Transport info | scstat -W |
Detailed resource/resource group | scrgadm -pv |
Cluster configuration info | scconf -p |
Installation info (prints packages and version) | scinstall -pv |
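The status commands above can be combined into a quick health check. This is only a convenience sketch (the script name is made up), using nothing but the scstat and scconf options already listed:
#!/bin/sh
# cluster-health.sh - run the common scstat checks in one pass
for opt in n q D W g i; do
        echo "=== scstat -$opt ==="
        scstat -$opt
done
# print the full cluster configuration for reference
scconf -p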
Cluster Configuration
Integrity check | sccheck |
Configure the cluster (add nodes, add data services, etc.) | scinstall |
Cluster configuration utility (quorum, data services, resource groups, etc.) | scsetup |
Add a node | scconf -a -T node=<host> |
Remove a node | scconf -r -T node=<host> |
Prevent new nodes from entering | scconf -a -T node=. |
Put a node into maintenance state | scconf -c -q node=<node>,maintstate
Note: use the scstat -q command to verify; the vote count for that node should be zero. |
Get a node out of maintenance state | scconf -c -q node=<node>,reset
Note: use the scstat -q command to verify that the node is out of maintenance state; the vote count for that node should be back to one. See the example sequence below. |
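Putting the node maintenance commands together, a typical cycle looks roughly like this (node2 is a placeholder; the node is normally shut down before it is placed in maintenance state):
Put the node into maintenance state (run from another cluster node)
# scconf -c -q node=node2,maintstate
Verify - the vote count for node2 should now be zero
# scstat -q
After node2 has rejoined the cluster, reset its vote count
# scconf -c -q node=node2,reset
Verify again - the vote count for node2 should be back to one
# scstat -q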
Quorum
Quorum votes come from both nodes and quorum disk devices, so the total vote count is the sum of all node and quorum device votes.
You can use the menu-driven scsetup utility to add or remove quorum devices, or use the commands below (a worked example follows them).
Adding a device to the quorum | scconf -a -q globaldev=d11
Note: if you get the error message "unable to scrub device", use scgdevs to add the device to the global device namespace. |
Removing a device from the quorum | scconf -r -q globaldev=d11 |
Remove the last quorum device | Evacuate all nodes and put the cluster into install mode:
# scconf -c -q installmode
Remove the quorum device:
# scconf -r -q globaldev=d11
Check the quorum devices:
# scstat -q |
Resetting quorum info | scconf -c -q reset
Note: this will bring all offline quorum devices online |
Bring a quorum device into maintenance mode | Obtain the device number:
# scdidadm -L
# scconf -c -q globaldev=<device>,maintstate |
Bring a quorum device out of maintenance mode | scconf -c -q globaldev=<device>,reset |
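As a worked example of the quorum commands above, adding a new quorum disk and confirming the votes might look like this (d11 is only an example DID device):
Find the DID device you want to use as a quorum device
# scdidadm -L
Add it to the quorum
# scconf -a -q globaldev=d11
Confirm the device and the new vote counts
# scstat -q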
Devices
List all the configured devices, including paths, across all nodes | scdidadm -L |
List all the configured devices, including paths, on the local node only | scdidadm -l |
Reconfigure the device database, creating new instance numbers if required | scdidadm -r |
Perform the repair procedure for a particular path (use this when a disk has been replaced) | scdidadm -R <c0t0d0s0> (by device name)
scdidadm -R 2 (by device ID) |
Configure the global device namespace | scgdevs |
Status of all disk paths | scdpm -p all:all
Note: the format is <host>:<disk> |
Monitor device path | scdpm -m <node:disk path> |
Unmonitor device path | scdpm -u <node:disk path> |
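A typical sequence after physically replacing a failed disk, based on the device commands above (the DID instance number 2 and the c1t3d0 name are placeholders):
Find the DID instance that maps to the replaced disk
# scdidadm -L | grep c1t3d0
Run the repair procedure so the new disk keeps the existing DID instance
# scdidadm -R 2
Rebuild the global device namespace if new devices were added
# scgdevs
Check that the disk paths are healthy again
# scdpm -p all:all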
Disk groups
Adding/Registering | scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true |
Removing | scconf -r -D name=<disk group> |
Adding a single node | scconf -a -D type=vxvm,name=appdg,nodelist=<host> |
Removing a single node | scconf -r -D name=<disk group>,nodelist=<host> |
Switch | scswitch -z -D <disk group> -h <host> |
Put into maintenance mode | scswitch -m -D <disk group> |
Take out of maintenance mode | scswitch -z -D <disk group> -h <host> |
Onlining a disk group | scswitch -z -D <disk group> -h <host> |
Offlining a disk group | scswitch -F -D <disk group> |
Resync a disk group | scconf -c -D name=appdg,sync |
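Putting the disk group commands together, registering a VxVM disk group and then moving it to another node might look like this (appdg, node1 and node2 are placeholders):
Register the disk group with the cluster
# scconf -a -D type=vxvm,name=appdg,nodelist=node1:node2,preferenced=true
Check the device group status
# scstat -D
Switch the disk group over to node2
# scswitch -z -D appdg -h node2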
Transport cables
Enable | scconf -c -m endpoint=<host>:qfe1,state=enabled |
Disable | scconf -c -m endpoint=<host>:qfe1,state=disabled
Note: it gets deleted |
Resource Groups
Adding | scrgadm -a -g <res_group> -h <host>,<host> |
Removing | scrgadm -r -g <group> |
Changing properties | scrgadm -c -g <resource group> -y <property=value> |
Listing | scstat -g |
Detailed List | scrgadm -pv -g <res_group> |
Display mode type (failover or scalable) | scrgadm -pv -g <res_group> | grep 'Res Group mode' |
Offlining | scswitch -F -g <res_group> |
Onlining | scswitch -Z -g <res_group> |
Unmanaging | scswitch -u -g <res_group>
Note: all resources in the group must be disabled |
Managing | scswitch -o -g <res_group> |
Switching | scswitch -z -g <res_group> -h <host> |
Adding failover network resource | scrgadm -a -L -g <res_group> -l <logicalhost> |
Adding shared network resource | scrgadm -a -S -g <res_group> -l <logicalhost> |
Adding a failover Apache application and attaching the network resource | scrgadm -a -j apache_res -g <res_group> \
-t SUNW.apache -y Network_resources_used=<logicalhost> \
-y Scalable=False -y Port_list=80/tcp \
-x Bin_dir=/usr/apache/bin |
Adding a shared (scalable) Apache application and attaching the network resource | scrgadm -a -j apache_res -g <res_group> \
-t SUNW.apache -y Network_resources_used=<logicalhost> \
-y Scalable=True -y Port_list=80/tcp \
-x Bin_dir=/usr/apache/bin |
Create an HAStoragePlus failover resource | scrgadm -a -g rg_oracle -j hasp_data01 -t SUNW.HAStoragePlus \
-x FileSystemMountPoints=/oracle/data01 \
-x AffinityOn=true |
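Tying the resource group commands together, a minimal failover group with a logical hostname and an HAStoragePlus resource could be built roughly as follows (all names, hosts and mount points are placeholders, and the SUNW.HAStoragePlus type must already be registered - see the resource type commands further down):
Create the failover resource group
# scrgadm -a -g app-rg -h node1,node2
Add a logical hostname resource
# scrgadm -a -L -g app-rg -l app-lh
Add the storage resource
# scrgadm -a -g app-rg -j app-hasp -t SUNW.HAStoragePlus \
  -x FileSystemMountPoints=/app/data -x AffinityOn=true
Bring the whole group (and its resources) online
# scswitch -Z -g app-rg
Check the result
# scstat -g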
Resources
Removing | scrgadm -r -j res-ip
Note: you must disable the resource first |
Changing properties | scrgadm -c -j <resource> -y <property=value> |
List | scstat -g |
Detailed List | scrgadm -pv -j res-ip scrgadm -pvv -j res-ip |
Disable resource monitor | scrgadm -n -M -j res-ip |
Enable resource monitor | scrgadm -e -M -j res-ip |
Disabling | scswitch -n -j res-ip |
Enabling | scswitch -e -j res-ip |
Clearing a failed resource | scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED |
Find the network of a resource | # scrgadm -pvv -j <resource> | grep -i network |
Removing a resource and resource group | Offline the group:
# scswitch -F -g rgroup-1
Remove the resource:
# scrgadm -r -j res-ip
Remove the resource group:
# scrgadm -r -g rgroup-1 |
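As an example of recovering a resource that is stuck in STOP_FAILED state (resource and host names are placeholders):
See which resource and node are affected
# scstat -g
Clear the STOP_FAILED flag on the nodes where it failed
# scswitch -c -h node1,node2 -j app-res -f STOP_FAILED
Once the underlying problem is fixed, re-enable the resource
# scswitch -e -j app-res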
Resource Types
Adding | scrgadm -a -t <resource type>, e.g. SUNW.HAStoragePlus |
Deleting | scrgadm -r -t <resource type> |
Listing | scrgadm -pv | grep 'Res Type name' |
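For example, registering the HAStoragePlus resource type before creating an HAStoragePlus resource (as in the earlier storage example):
Register the type
# scrgadm -a -t SUNW.HAStoragePlus
Confirm it is registered
# scrgadm -pv | grep 'Res Type name'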