
Refreshing Important Unix/Linux Commands

This is a refresher on the Unix/Linux commands I use quite often, shared here for fellow IT professionals.

Disk Usage

du -ah    display disk usage of all files and directories
du -ahc   display disk usage of all files and directories, plus a grand total
du -m     display disk usage in megabyte (MB) blocks
du -sh    display only a summary of the total used size
du -ahc --exclude="*.txt"   exclude matching files from the output
du -ahc --time   include the last modification time in the output
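These flags combine naturally with sort; a small sketch (assuming GNU/BSD coreutils, where sort -h understands human-readable sizes) to find the largest subdirectories:

```shell
# List subdirectories of the current directory, largest first.
# sort -rh sorts the human-readable sizes (K, M, G) in reverse order.
du -sh -- */ 2>/dev/null | sort -rh | head -5
```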

Secure Copy

Copy the file “snowy.txt” from a remote host to the local host
$ scp your_username@remotehost:snowy.txt /some/local/directory

Copy the file “snowy.txt” from the local host to a remote host
$ scp snowy.txt your_username@remotehost:/some/remote/directory

Copy the directory “tardis” from the local host to a remote host’s directory “bar”
$ scp -r tardis your_username@remotehost:/some/remote/directory/bar

Copy the file “snowy.txt” from remote host “rh1” to remote host “rh2”
$ scp your_username@rh1:/some/remote/directory/snowy.txt \
your_username@rh2:/some/remote/directory/

Copy the files “tardis.txt” and “bar.txt” from the local host to your home directory on the remote host
$ scp tardis.txt bar.txt your_username@remotehost:~

Copy the file “snowy.txt” from the local host to a remote host using port 2264
$ scp -P 2264 snowy.txt your_username@remotehost:/some/remote/directory

Copy multiple files from the remote host to your current directory on the local host
$ scp your_username@remotehost:/some/remote/directory/\{a,b,c\} .
$ scp your_username@remotehost:~/\{tardis.txt,bar.txt\} .
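Two more OpenSSH scp flags worth knowing are -C and -p; a sketch (your_username and remotehost are placeholders, as in the examples above):

```shell
# -C compresses data in transit; -p preserves modification times and modes.
scp -Cp snowy.txt your_username@remotehost:/some/remote/directory
```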

File Transfer Protocol (FTP)

Connect to the FTP server
ftp IP/Hostname
ftp> bin    (set binary transfer mode)
ftp> hash   (print a hash mark for each block transferred)
Change remote server current directory
ftp> pwd
257 "/myftpserver" is current directory.
ftp> cd dir1
250 CWD command successful. "/myftpserver/dir1" is current directory.
ftp> pwd
257 "/myftpserver/dir1" is current directory.
Change local machine current directory
ftp> pwd
/home/thetecharch/FTP
ftp> lcd /tmp
Local directory now /tmp
ftp> pwd
/tmp
Download Files from FTP Server
ftp> ls
ftp> get FILENAME
ftp> mget *.html (download multiple files using mget)
ftp> mls *.html (view the file names before downloading)
Upload files to FTP Server
ftp> mput *.html
Close FTP connection
ftp> close
OR
ftp> bye
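Sessions like the above can also be scripted; a sketch (the server name, credentials and file are placeholders) using -n to suppress auto-login and a here-document to feed the commands:

```shell
# -n stops ftp from attempting auto-login, so credentials are supplied
# explicitly with "user". All names below are placeholders.
ftp -n ftp.example.com <<'END'
user anonymous guest@example.com
bin
hash
cd pub
get README
bye
END
```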

 

AWK Concepts

Files with whitespace/TAB-separated columns
Print the 1st column of a file

awk '{print $1}' messages
Display two columns with a separator
awk '{print $1" "$2}' messages
Print all the columns from a file; this is the same as "cat messages"
awk '{print $0}' messages

For "," (CSV) separated columns use the "-F" option or the "FS" variable
awk -F "," '{print $1}' BillingFile_20170103-0919-54.csv
awk '{print $1}' FS="," BillingFile_20170103-0919-54.csv

Do not display the first row (headers in a file)
awk 'NR!=1{print $1" "$2}' FS="," BillingFile_20170103-0919-54.csv
Display all rows after row 20
awk 'NR>20{print $1" "$2}' FS="," BillingFile_20170103-0919-54.csv
Other comparisons work the same way: NR>=20, NR<20, NR<=20

Match a column value, then filter the data (match column 3 with value 13)
awk '$3==13{print $0}' messages | grep "vol006_smb02"
Using OR (||)
awk '$1==13 || $3==10000{print $0}' messages
AND works the same way, using &&

Search for a specific string
awk -F "," '/NAS/{print $0}' BillingFile_20170103-0919-54.csv
Exclude a specific string
awk -F "," '!/NAS/{print $0}' BillingFile_20170103-0919-54.csv
awk -F "," '!/au2106/ && !/AU2106/ && NR!=1{print $1}' BillingFile_20170103-0919-54.csv
Display all records where a string matches in a particular column
awk '$1~/Fri/{print $0}' messages
Search for a string at the beginning of a column
awk -F "," '$1!~/^N/{print $0}' BillingFile_20170103-0919-54.csv
For the end of a column, anchor with $ instead: /N$/
Using an "if" statement

awk -F "," '{if ($3>=20000) print $0}' test.csv
awk -F "," '{if($3==10000){ $3=$3*2; print $0;}}' test.csv
awk -F "," 'BEGIN{count=0}{if($3==10000) {count++;print $0}}END{print "Total Records: " count}' test.csv
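These pieces combine naturally; for example, a sketch that totals column 3 of test.csv (a hypothetical file, as above), skipping the header row:

```shell
# Sum the 3rd CSV column across all rows except the header (NR!=1),
# then print the total in the END block.
awk -F "," 'NR!=1{total+=$3}END{print "Total Salary: " total}' test.csv
```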
 

Analyze messages file from NetApp Storage Controller

Download multiple messages files from the storage controller to a Unix/Linux host's /tmp
$ cd /tmp
Consolidate the multiple files into one (ls -t orders them newest first)
$ cat $(ls -t) > outputfile.txt

Read a file, use "awk" to select a column, then sort the column contents and count the unique values
$ cat outputfile.txt | awk '{print $6}' | sort | uniq -c | sort -rn
Read a file, use "grep" to search for a string, then use "awk" to select a column from the matching rows and display unique strings with their count of occurrences
$ cat outputfile.txt | grep SNOWY02:wafl.volume.snap.autoDelete:info | awk '{print $9}' | sort | uniq -c | sort -rn
Read a file, use "grep" to search for a string, then use "awk" to select a column from the matching rows and display unique strings with customised output
$ cat SNOWY02_messages.txt | grep SNOWY02:wafl.volume.snap.autoDelete:info | awk '{print $9" was deleted on "$1,$2,$3}' | uniq -c | sort -rn
Display anything within square brackets [ ]
$ cat snowy_02.txt | sed 's/.*\[\([^]]*\)\].*/\1/g' | sort | uniq -c | sort -rn
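A shorter alternative (a sketch; note it keeps the brackets in the output, unlike the sed version) uses grep -o to print only the matched text:

```shell
# -o prints only the part of each line that matches the pattern,
# so each bracketed token becomes its own output line.
grep -o '\[[^]]*\]' snowy_02.txt | sort | uniq -c | sort -rn
```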
 

Connect your Mac to Storage System’s Serial Console

Search for Serial Port
ls /dev/tty*
If that doesn’t work try this:
ioreg -c IOSerialBSDClient  | grep usb

Use screen from Mac Terminal to connect to storage system’s serial console
screen /dev/cu.usbserial 9600
Capture Screen Terminal Output on mac terminal:
script -a -t 0 out.txt screen /dev/ttyUSB0 115200
Details:
• script – built-in app to "make a typescript of terminal session"
• -a – append to the output file
• -t 0 – the time between writes to the output file is 0 seconds, so out.txt is updated for every new character
• out.txt – the output file name
• screen /dev/ttyUSB0 115200 – the command for connecting to the external device
You can then use tail to see that the file is updating
tail -100 out.txt
 

Activate Web Server on Mac

macOS comes with Python installed. Python has a built-in web server, which can be used to serve firmware files to storage controllers.
python -m SimpleHTTPServer 8000
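SimpleHTTPServer is the Python 2 module name; on newer macOS releases, where only Python 3 is available, the equivalent is http.server:

```shell
# Serves the current directory over HTTP on port 8000 (Python 3).
python3 -m http.server 8000
```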

Rendezvous with NetApp AFF A700 All Flash Array

A few days back I got an opportunity to visit a data centre to set up a NetApp AFF A700 All Flash Array. I was very excited; it had been a long wait to get hands-on with the shiny new flash arrays from NetApp.
Although this storage system looks small, when it comes to performance it's a beast.
We got this system for a POC (proof of concept): an A700 storage controller and a disk shelf with 3.8TB SSD drives.

I clicked some pictures while setting up the storage array.

A700 Front
A700-Front

A700 Rear
A700-Rear

Controller
Controller

Racked
Final

Hiccups after Powering on the Storage Controller
After powering on the disk shelf and the storage controller, the console on my Mac showed garbled ASCII characters. I immediately figured out this was an incorrect baud rate issue: the usual baud rate of 9600 was not going to work. After trying multiple values, I used the baud rate I use to set up NetApp E-Series systems, and it worked fine. It took about an hour to figure out the correct value; the documentation provided by NetApp didn't list the proper baud rate for connecting to the storage console.

If any of you reading this post are going to be working with NetApp AFF A-series systems, please use baud rate 115200 for connecting to the storage array console.

Powershell Script to Setup new NetApp clustered Data ONTAP system

Setting up a new NetApp clustered Data ONTAP system involves a number of steps. I have tried to automate these steps using the NetApp PowerShell Toolkit. This saves time and reduces human error while configuring new systems.
This script assumes that the hardware is installed, “cluster setup” has been run and all nodes have joined. I use a suffix of “-mgmt” with the cluster name when I configure a new clustered Data ONTAP system. This script has been tested to work with ONTAP 8.3 (simulators).

At present I have automated the following tasks:
1.  Rename nodes
2.  Rename root aggregates
3.  Create failover groups for cluster-mgmt and node-mgmt interfaces
4.  Add feature licenses
5.  Configure storage failover
6.  Unlock the diag user
7.  Set the diag user password
8.  Create an admin user for access to logs through http
9.  Set the timezone and NTP server
10. Remove 10 GbE ports from the Default broadcast domain
11. Create ifgroups and add ports to ifgroups
12. Enable Cisco Discovery Protocol (cdpd) on all of the nodes
13. Set up disk auto assignment
14. Set up flexscale options
15. Disable flowcontrol on all the ports
16. Create data aggregates

The script displays the results on the PowerShell console as it iterates through the setup tasks. A transcript is also saved as a text file.

Cluster_config_screenshot

Source Code

 

<#
.SYNOPSIS
    Automate setup of a new NetApp clustered Data ONTAP cluster install.
.DESCRIPTION
    The script assumes the basic cluster setup is completed and all nodes have joined.
    The script automates the following tasks:
    1.  Rename nodes
    2.  Rename root aggregates
    3.  Create failover groups for cluster-mgmt and node-mgmt interfaces
    4.  Add feature licenses
    5.  Configure storage failover
    6.  Unlock the diag user
    7.  Set the diag user password
    8.  Create an admin user for access to logs through http
    9.  Set the timezone and NTP server
    10. Remove 10 GbE ports from the Default broadcast domain
    11. Create ifgroups and add ports to ifgrps
    12. Enable Cisco Discovery Protocol (cdpd) on all of the nodes
    13. Set up disk auto assignment
    14. Set up flexscale options
    15. Disable flowcontrol on all the ports
    16. Create data aggregates
.PARAMETER settingsFilePath
    Location of the File with User defined Parameters.
.EXAMPLE
    PS C:\Users\vadmin\Documents\pshell-scripts> .\cluster_config_v1.5.ps1
#>
#####################
# Declare Variables
#####################
$ClusterName             = "ntapclu1-mgmt"
$mgmtIP                  = "aa.bb.cc.dd"
$mgmtSubnet              = "aaa.bbb.ccc.ddd"
$mgmtGateway             = "aa.bb.cc.xx"
$ntpServer               = "ntp-server1"
$ClusterNameMgmtPort     = "e0d"
$NodeMgmtPort            = "e0c"
$timezone                = "Australia/Sydney"
[int]$maxraidsize        = 17 #raid group size for creating an aggregate
[int]$diskCount          = 51 
$TranscriptPath          = "c:\temp\cluster_setup_transcript_$(get-date -format "yyyyMMdd_hhmmtt").txt"
$licensesPath            = "c:\temp\licenses.txt" 
$ifgrp_a0a_port1         = "e0e"
$ifgrp_a0a_port2         = "e0f"

###########################
# Declare the functions
###########################
function Write-ErrMsg ($msg) {
    $fg_color = "White"
    $bg_color = "Red"
    Write-host " "
    Write-host $msg -ForegroundColor $fg_color -BackgroundColor $bg_color
    Write-host " "
}
#'------------------------------------------------------------------------------
function Write-Msg ($msg) {
    $color = "yellow"
    Write-host " "
    Write-host $msg -foregroundcolor $color
    Write-host " "
}
#'------------------------------------------------------------------------------
function Invoke-SshCmd ($cmd){
    try {
        Invoke-NcSsh $cmd -ErrorAction stop | out-null
        "The command completed successfully"
    }
    catch {
       Write-ErrMsg "The command did not complete successfully"
    }
}
#'------------------------------------------------------------------------------
function Check-LoadedModule {
  Param( 
    [parameter(Mandatory = $true)]
    [string]$ModuleName
  )
  $LoadedModules = Get-Module | Select Name
  if ($LoadedModules -notlike "*$ModuleName*") {
    try {
        Import-Module -Name $ModuleName -ErrorAction Stop
        Write-Msg ("The module DataONTAP is imported")
    }
    catch {
        Write-ErrMsg ("Could not find the Module DataONTAP on this system. Please download from NetApp Support")
        stop-transcript
        exit 
    }
  }
}
#'------------------------------------------------------------------------------
##############################
# Begin Cluster Setup Process
##############################
#'------------------------------------------------------------------------------
## Load Data ONTAP Module
start-transcript -path $TranscriptPath
#'------------------------------------------------------------------------------
Write-Msg  "##### Beginning Cluster Setup #####"
Check-LoadedModule -ModuleName DataONTAP
try {
    Connect-nccontroller $ClusterName -ErrorAction Stop | Out-Null   
    "connected to " + $ClusterName
    }
catch {
    Write-ErrMsg ("Failed connecting to Cluster " + $ClusterName + " : $_.")
    stop-transcript
    exit
}
#'------------------------------------------------------------------------------
## Get the nodes in the cluster
$nodes = (get-ncnode).node
#'------------------------------------------------------------------------------
## Rename the nodes (remove "-mgmt" string)
Write-Msg  "+++ Renaming Node SVMs +++"
foreach ($node in $nodes) { 
    Rename-NcNode -node $node -newname ($node -replace "-mgmt") -Confirm:$false |Out-Null
} 
Get-NcNode |select Node,NodeModel,IsEpsilonNode | Format-Table -AutoSize
$nodes = (get-ncnode).node
#'------------------------------------------------------------------------------
## Rename root aggregates
Write-Msg  "+++ Renaming root aggregates +++"
# get each of the nodes
Get-NcNode | %{ 
    $nodeName = $_.Node
    # determine the current root aggregate name
    $currentAggrName = (
        Get-NcAggr | ?{ 
             $_.AggrOwnershipAttributes.HomeName -eq $nodeName `
               -and $_.AggrRaidAttributes.HasLocalRoot -eq $true 
        }).Name
    # no dashes
    $newAggrName = $nodeName -replace "-", "_"
    # aggregate names can't start with numbers; strip any leading digits
    $newAggrName = $newAggrName -replace "^\d+", ""
    # append the root identifier
    $newAggrName = "$($newAggrName)_root"
    if ($currentAggrName -ne $newAggrName) {
        Rename-NcAggr -Name $currentAggrName -NewName $newAggrName | Out-Null 
    }
    sleep -s 5
    Write-Host "Renamed aggregates containing node root volumes"
    (Get-NcAggr | ?{ $_.AggrOwnershipAttributes.HomeName -eq $nodeName -and $_.AggrRaidAttributes.HasLocalRoot -eq $true }).Name 
}
#'------------------------------------------------------------------------------
## Create failover groups for cluster-mgmt and node-mgmt interfaces
Write-Msg  "+++ Create failover groups for cluster-mgmt and node-mgmt interfaces +++"
# get admin vserver name
$adminSVMTemplate = Get-NcVserver -Template
Initialize-NcObjectProperty -Object $adminSVMTemplate -Name VserverType | Out-Null
$adminSVMTemplate.VserverType = "admin"
$adminSVM         = (Get-NcVserver -Query $adminSVMTemplate).Vserver
# create cluster-mgmt failover group 
$clusterPorts     = ((get-ncnode).Node | % { $_,$ClusterNameMgmtPort -join ":" })
$nodePorts        = ((get-ncnode).Node | % { $_,$NodeMgmtPort -join ":" })
$firstClusterPort = $clusterPorts[0]
$allClusterPorts  = $clusterPorts[1..($clusterPorts.Length-1)]
New-NcNetFailoverGroup -Name cluster_mgmt -Vserver $adminSVM -Target $firstClusterPort | Out-Null
foreach ($cPort in $allClusterPorts) {
    Add-NcNetFailoverGroupTarget -Name cluster_mgmt -Vserver $adminSVM -Target $cPort | Out-Null
}
Set-NcNetInterface -Name cluster_mgmt -Vserver $adminSVM -FailoverPolicy broadcast_domain_wide -FailoverGroup cluster_mgmt | Out-Null
Write-Host "Created cluster-mgmt failover group"
Get-NcNetInterface -Name cluster_mgmt  | select InterfaceName,FailoverGroup,FailoverPolicy
# create node-mgmt failover-group for each node
foreach ($node in $nodes) {
    $prt1 = ($node,$NodeMgmtPort -join ":")
    $prt2 = ($node,$ClusterNameMgmtPort -join ":")
    New-NcNetFailoverGroup -Name $node"_mgmt" -Vserver $adminSVM -Target $prt1 | Out-Null
    Add-NcNetFailoverGroupTarget -Name $node"_mgmt" -Vserver $adminSVM -Target $prt2 | Out-Null
    $nodeMgmtLif = (Get-NcNetInterface -Role node-mgmt | Where-Object {$_.HomeNode -match "$node"}).InterfaceName
    Set-NcNetInterface -Name $nodeMgmtLif -Vserver $adminSVM -FailoverPolicy local-only -FailoverGroup $node"_mgmt" | Out-Null
    sleep -s 5
    Write-Host "Created node-mgmt failover group for node "$node
    Get-NcNetInterface -Role node-mgmt | Where-Object {$_.HomeNode -match "$node"} | select InterfaceName,FailoverGroup,FailoverPolicy
}
sleep -s 15
#'------------------------------------------------------------------------------
## Add licenses to cluster
Write-Msg "+++ Adding licenses +++"
if (Test-Path -Path $licensesPath) {
    $count_licenses = (get-content $licensesPath).count
    if ($count_licenses -ne 0) {
        Get-Content $licensesPath |  foreach { Add-NcLicense -license $_ }
        Write-Host "Licenses successfully added"
        Write-Host " "
    }
    else {
        Write-ErrMsg ("License file is empty. Please add the licenses manually")
    }
}
else {
    Write-ErrMsg ("License file does not exist. Please add the licenses manually")       
}
sleep -s 15
#'------------------------------------------------------------------------------
## Configure storage failover
Write-Msg  "+++ Configure SFO +++"
Write-Host "SFO Does not work with Simulators"
if ($nodes.count -gt 2) {
    foreach ($node in $nodes) {
        $sfo_enabled = Invoke-NcSsh "storage failover modify -node " $node " -enabled true"
        if (($sfo_enabled.Value.ToString().Contains("Error")) -or ($sfo_enabled.Value.ToString().Contains("error"))) {
            Write-ErrMsg ($sfo_enabled.Value)
        }
        else {
            Write-Host ("Storage Failover is enabled on node " + $node)
        }

	    $sfo_autogive = Invoke-NcSsh "storage failover modify -node " $node " -auto-giveback true"
        if (($sfo_autogive.Value.ToString().Contains("Error")) -or ($sfo_autogive.Value.ToString().Contains("error"))) {
                Write-ErrMsg ($sfo_autogive.Value)
        }
        else {
            Write-Host ("Storage Failover option auto giveback is enabled on node " + $node)
            Write-Host " "
        }
        sleep -s 2
    }
}
elseif ($nodes.count -eq 2) {
    # "cluster ha modify" is cluster-wide, so it only needs to run once
    $sfo_enabled = Invoke-NcSsh "cluster ha modify -configured true"
    if (($sfo_enabled.Value.ToString().Contains("Error")) -or ($sfo_enabled.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($sfo_enabled.Value)
    }
    else {
        Write-Host "Cluster HA is enabled for the two-node cluster"
        Write-Host " "
    }
}
else {
    Write-Host "No HA required for single node cluster. Continuing with the setup"
    Write-Host " "
}
sleep -s 15
#'------------------------------------------------------------------------------
## Unlock the diag user
Write-Msg "+++ Unlock the diag user +++"
try {
    Unlock-NcUser -username diag -vserver $ClusterName -ErrorAction stop |Out-Null
    Write-Host "Diag user is unlocked"
}
catch {
    Write-ErrMsg "Diag user is either unlocked or script could not unlock the diag user"
}
#'------------------------------------------------------------------------------
## Setup diag user password
Set-Ncuserpassword -UserName diag -password netapp123! -vserver $ClusterName | Out-Null
Write-Host "created diag user password"
sleep -s 15
#'------------------------------------------------------------------------------
## Create admin user for access to logs through http
Write-Msg "+++ create web log user +++"
Set-NcUser -UserName admin -Vserver $ClusterName -Application http -role admin -AuthMethod password | Out-Null
Write-Host "created admin user access for http log collection"
sleep -s 15
#'------------------------------------------------------------------------------
## Set Date and NTP on each node
Write-Msg  "+++ setting Timezones/NTP/Datetime +++"
foreach ($node in $nodes) {
    Set-NcTime -Node $node -Timezone $timeZone | Out-Null
    Set-NcTime -Node $node -DateTime (Get-Date) | Out-Null
}
New-NcNtpServer -ServerName $ntpServer -IsPreferred | Out-Null
Write-Host "NTP Sever setup complete"
sleep -s 15
#'------------------------------------------------------------------------------
## Remove 10 Gbe ports from Default broadcast domain
Write-Msg  "+++ Rmoving 10Gbe Ports from Default broadcast domain +++"
# remove ports from Default broadcast domain
$broadCastTemplate = Get-NcNetPortBroadcastDomain -Template
Initialize-NcObjectProperty -Object $broadCastTemplate -Name Ports | Out-Null
$broadCastTemplate.BroadcastDomain = "Default"
$defaultBroadCastPorts = ((Get-NcNetPortBroadcastDomain -Query $broadCastTemplate).Ports).Port
foreach ($bPort in $defaultBroadCastPorts) {
	if (($bPort -notlike "*$ClusterNameMgmtPort") -and ($bPort -notlike "*$NodeMgmtPort")) {
		Write-Host "Removing Port: " $bPort
		Set-NcNetPortBroadcastDomain -Name Default -RemovePort $bPort | Out-Null
	}	
}
sleep -s 15
#'------------------------------------------------------------------------------
## Create ifgroups and add ports to ifgrps
Write-Msg  "+++ starting ifgroup creation +++"
foreach ($node in $nodes) {
    try {
        New-NcNetPortIfgrp -Name a0a -Node $node -DistributionFunction port -Mode multimode_lacp -ErrorAction Stop | Out-Null
        Add-NcNetPortIfgrpPort -name a0a -node $node -port $ifgrp_a0a_port1 -ErrorAction Continue | Out-Null
        Add-NcNetPortIfgrpPort -name a0a -node $node -port $ifgrp_a0a_port2 -ErrorAction Continue | Out-Null
        Write-Host ("Successfully created ifgrp a0a on node " + $node)
    }
    catch {
        Write-ErrMsg ("Error exception in ifgrp a0a " + $node + " : $_.")
    }
}
sleep -s 15
#'------------------------------------------------------------------------------
## Enable cdpd on all of the nodes
Write-Msg  "+++ enable cdpd on nodes +++"
foreach ($node in $nodes) {
    $cdpd_cmd = Invoke-NcSsh "node run -node " $node " -command options cdpd.enable on"
    if (($cdpd_cmd.Value.ToString().Contains("Error")) -or ($cdpd_cmd.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($cdpd_cmd.Value)
    }
    else {
        Write-Host ("Successfully modified cdpd options for " + $node)
    }
}
sleep -s 15
#'------------------------------------------------------------------------------
## Set option disk.auto_assign on
Write-Msg  "+++ Setting disk autoassign +++"
foreach ($node in $nodes) {
    $set_disk_auto = Invoke-NcSsh "node run -node " $node " -command options disk.auto_assign on"
    if (($set_disk_auto.Value.ToString().Contains("Error")) -or ($set_disk_auto.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($set_disk_auto.Value)
    }
    else {
        Write-Host ("Successfully modified disk autoassign option on node " + $node)
    }   
}
sleep -s 15
#'------------------------------------------------------------------------------
## Set flexscale options
Write-Msg  "+++ Setting flexscale options +++"
foreach ($node in $nodes) {
	$flexscale_enable = Invoke-NcSsh "node run -node " $node " -command options flexscale.enable on" 
    if (($flexscale_enable.Value.ToString().Contains("Error")) -or ($flexscale_enable.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($flexscale_enable.Value)
    }
    else {
        Write-Host ("options flexscale.enable set to on for node " + $node)
    } 

	$flexscale_lopri = Invoke-NcSsh "node run -node " $node " -command options flexscale.lopri_blocks on"
    if (($flexscale_lopri.Value.ToString().Contains("Error")) -or ($flexscale_lopri.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($flexscale_lopri.Value)
    }
    else {
        Write-Host ("options flexscale.lopri_blocks set to on for node " + $node)
    } 

	$flexscale_data = Invoke-NcSsh "node run -node " $node " -command options flexscale.normal_data_blocks on"
    if (($flexscale_data.Value.ToString().Contains("Error")) -or ($flexscale_data.Value.ToString().Contains("error"))) {
        Write-ErrMsg ($flexscale_data.Value)
    }
    else {
        Write-Host ("options flexscale.normal_data_blocks set to on for node " + $node)
        Write-Host " "
    } 

}
sleep -s 15
#'------------------------------------------------------------------------------
## Disable flowcontrol on all of the ports
Write-Msg  "+++ Setting flowcontrol +++"
foreach ($node in $nodes) {
    try {
        Write-Host "Setting flowcontrol for ports on node: " $node
        get-ncnetport -Node $node | Where-Object {$_.Port -notlike "a0*"} | select-object -Property name, node | set-ncnetport -flowcontrol none -ErrorAction Stop | Out-Null
        sleep -s 15
        Get-NcNetPort -Node $node | Select-Object -Property Name,AdministrativeFlowcontrol | Format-Table -AutoSize
    }
    catch {
        Write-ErrMsg ("Error setting flowcontrol on node " + $node + ": $_.")
    }
}
sleep -s 15
#'------------------------------------------------------------------------------
## Create data aggregates
Write-Msg  "+++ Creating Data Aggregates +++"
# get each of the nodes
Get-NcNode | %{ 
    $nodeName = $_.Node
    # no dashes
    $newAggrName = $nodeName -replace "-", "_"
    # aggregate names can't start with numbers; strip any leading digits
    $newAggrName = $newAggrName -replace "^\d+", ""
    # append the data aggregate identifier
    $newAggrName = "$($newAggrName)_data_01"
    # create an aggregate
    $aggrProps = @{
        'Name' = $newAggrName;
        'Node' = $nodeName;
        'DiskCount' = $diskCount;
        'RaidSize' = $maxraidsize;
        'RaidType' = "raid_dp";
    }
    New-NcAggr @aggrProps | Out-Null
#
    sleep -s 15
    # enable free space reallocation
    Get-NcAggr $newAggrName | Set-NcAggrOption -Key free_space_realloc -Value on
}
#'------------------------------------------------------------------------------
Write-Host " "
Write-Host " "
stop-transcript
#'------------------------------------------------------------------------------

VMware ESXi 6.5 White Box Build

In the past few months I have been researching upgrading my home lab ESXi white box (Mac Mini Server Late 2012) to a white box with more resources, capable of running VMware ESXi 6.5.
My Mac Mini had a limit of 16 GB RAM, so the need to expand my home lab arose. I still wanted a single system to run the virtual machines in my home lab. During my research I came across various builds posted by home lab enthusiasts on their blogs and on Reddit. However, availability of parts in Australia restricted my options, and I finally settled on a motherboard and CPU that were reported to work with ESXi 6.0 and above out of the box. Sourcing suitable RAM was another big task, as it was the costliest component of all (I ordered a 128 GB kit).

I'm sharing my experience for those who might be interested in building their own ESXi white box at home.

Components Used:
PC_Components_List
  

For PC Parts i shopped around local stores in Australia:
Mwave
Scorptec
MSY
ebay
Gumtree

I had two Solid State Drives (250 GB each) in my Mac Mini, which I moved to the new white box. I ran CPU and memory stress tests on the new build, and everything reported normal. The software I used for the RAM and CPU tests provided bootable ISO files, so I booted from a USB drive and ran all tests without installing any operating system. I used Rufus to create the bootable USB drives.
RAM Test: Memory Test (RAM)
CPU Test: CPU Stress Test

I registered the virtual machines on the SSDs in my new white box and was up and running with my new lab in seconds.

Picture of my Build

PC_Build

VMware ESXi 6.5 has a fully functional web client built in. There is no longer a need to connect to a new ESXi host via the vSphere client.

ESXi

 

Hope you find this post useful.

Setup VPN to Access Home Lab

I have recently been working on a personal project to access my home lab from outside my home. My home lab has a Mac Mini Server 2012 running bare-metal ESXi instead of OS X.
I have an ADSL2 connection at home, where the modem and router functionality are handled by a single device. The modem/router does not have built-in VPN functionality. My ISP is Telstra here in Australia.
Modem/Router Model: Technicolor TG799vac (Telstra Gateway MAX)

I bought an ASUS RT-68U router; the stock firmware (ASUSWRT) on this router has an OpenVPN server built in. After some research, I formulated a plan to configure my current Telstra modem/router in bridge mode and use the ASUS RT-68U as the primary Wi-Fi device for my home network. Here is the plan:

  • Telstra modem has a LAN IP address of 10.0.0.138
    • Login to http://10.0.0.138 and take screenshots of all tabs that might be useful
    • Record your ISP username and figure out the password. You'll need these while setting up the ASUS RT-68U router
  • Power on the ASUS RT-68U router and check that it boots up (i.e. make sure it's not faulty)
  • Configure Telstra Modem/Router in Bridge Mode
    • Have a laptop directly connected with ethernet cable to Telstra router
    • Go to http://10.0.0.138
    • Go to Advanced
      • Turn off wifi for 2.4 and 5.0 GHZ (also guest wifi)
    • Go to Local Network
      • Scroll to Bridge Mode
      • Confirm
    • The router/modem restarts into bridge mode
    • The router will have RED LEDs, don’t worry
    • As a second measure, turn off wifi with the button on the front of the router/modem
    • After the reboot, the router is still accessible via http://10.0.0.138; however, it no longer has all of the tabs previously available
  • Now verify if internet is working by setting up a PPPoE connection
  • Connect ASUS RT-68U to Telstra modem
    • plug the LAN port from the Telstra modem into the WAN port of the ASUS RT-68U router
    • restart ASUS RT-68U
    • plug your laptop to one of the LAN ports on ASUS RT-68U
    • the ASUS router web page opens automatically; otherwise use the default IP address from the setup sheet that comes with the ASUS RT-68U
    • Setup ASUS RT-68U with your Telstra username and password and configure Wifi
    • Connect your devices to new wifi and verify everything works
  • For VPN, Go to Advanced Settings -> VPN -> OpenVPN
    • Setup using Advanced Settings, use TUN instead of TAP
    • The OpenVPN server on the ASUS RT-68U assigns an IP address in the range 10.8.0.0/24, which is different from the local LAN IP range. With this IP address you cannot ping any local devices; you will need to add a static route in the OpenVPN settings page to get this working
      • push “route <local-LAN-IP-range> <subnet-mask>”
      • e.g. push “route 192.168.2.0 255.255.255.0” (use the network address of your LAN)
      • OpenVPN-Route
  • Connect to the home VPN from outside on a Mac

Get-NodePerfData – Powershell Script to query NetApp Oncommand Performance Manager (OPM)

<#
script : Get-NodePerfData.ps1
Example:
Get-NodePerfData.ps1

This script queries OPM(version 2.1/7.0) Server and extract following performance counters for each node in clusters
    Date, Time, avgProcessorBusy, cpuBusy, cifsOps, nfsOps, avgLatency

All data is saved into a directory named for the current month, e.g. 1608 (YYMM)

#>
Function Get-TzDateTime{
   Return (Get-TzDate) + " " + (Get-TzTime)
}
Function Get-TzDate{
   Return Get-Date -uformat "%Y-%m-%d"
}
Function Get-TzTime{
   Return Get-Date -uformat "%H:%M:%S"
}
Function Log-Msg{
   <#
   .SYNOPSIS
   This function appends a message to log file based on the message type.
   .DESCRIPTION
   Appends a message to a log a file.
   .PARAMETER
   Accepts an integer representing the log file extension type
   .PARAMETER
   Accepts a string value containing the message to append to the log file.
   .EXAMPLE
   Log-Msg -logType 0 -message "Command completed successfully"
   .EXAMPLE
   Log-Msg -logType 2, -message "Application is not installed"
   #>
   [CmdletBinding()]
   Param(
      [Parameter(Position=0,
         Mandatory=$True,
         ValueFromPipeLine=$True,
         ValueFromPipeLineByPropertyName=$True)]
      [Int]$logType,
      [Parameter(Position=1,
         Mandatory=$True,
         ValueFromPipeLine=$True,
         ValueFromPipeLineByPropertyName=$True)]
      [String]$message
   )
   Switch($logType){
      0 {$extension = "log"; break}
      1 {$extension = "err"; break}
      2 {$extension = "err"; break}
      3 {$extension = "csv"; break}
      default {$extension = "log"}
   }
   If($logType -eq 1){
      $message = ("Error " + $error[0] + " " + $message)
   }
   $prefix = Get-TzDateTime
   ($prefix + "," + $message) | Out-File -filePath `
   ($scriptLogPath + "." + $extension) -encoding ASCII -append
}
function MySQLOPM {
    Param(
      [Parameter(
      Mandatory = $true,
      ParameterSetName = '',
      ValueFromPipeline = $true)]
      [string]$Switch,
      [string]$Query
      )

    if($switch -match 'performance') {
        $MySQLDatabase = 'netapp_performance'
    }
    elseif($switch -match 'model'){
        $MySQLDatabase = 'netapp_model_view'    
    }
    $MySQLAdminUserName = 'report'
    $MySQLAdminPassword = 'password123'
    $MySQLHost = 'opm-server'
    $ConnectionString = "server=" + $MySQLHost + ";port=3306;Integrated Security=False;uid=" + $MySQLAdminUserName + ";pwd=" + $MySQLAdminPassword + ";database="+$MySQLDatabase

    Try {
      [void][System.Reflection.Assembly]::LoadFrom("E:\ssh\L080898\MySql.Data.dll")
      $Connection = New-Object MySql.Data.MySqlClient.MySqlConnection
      $Connection.ConnectionString = $ConnectionString
      $Connection.Open()

      $Command = New-Object MySql.Data.MySqlClient.MySqlCommand($Query, $Connection)
      $DataAdapter = New-Object MySql.Data.MySqlClient.MySqlDataAdapter($Command)
      $DataSet = New-Object System.Data.DataSet
      $RecordCount = $dataAdapter.Fill($dataSet, "data")
      $DataSet.Tables[0]
      }

    Catch {
      Write-Host "ERROR : Unable to run query : $query `n$($Error[0])"
     }

    Finally {
      $Connection.Close()
    }
}
#'------------------------------------------------------------------------------
#'Initialization Section. Define Global Variables.
#'------------------------------------------------------------------------------
##'Set Date and Time Variables
[String]$lastmonth      = (Get-Date).AddMonths(-1).ToString('yyMM')
[String]$thismonth      = (Get-Date).ToString('yyMM')
[String]$yesterday      = (Get-Date).AddDays(-1).ToString('yyMMdd')
[String]$today          = (Get-Date).ToString('yyMMdd')
[String]$fileTime       = (Get-Date).ToString('HHmm')
[String]$workDay        = (Get-Date).AddDays(-1).DayOfWeek
[String]$DOM            = (Get-Date).ToString('dd')
[String]$filedate       = (Get-Date).ToString('yyyyMMdd')
##'Set Path Variables
[String]$scriptPath     = Split-Path($MyInvocation.MyCommand.Path)
[String]$scriptSpec     = $MyInvocation.MyCommand.Definition
[String]$scriptBaseName = (Get-Item $scriptSpec).BaseName
[String]$scriptName     = (Get-Item $scriptSpec).Name
[String]$scriptLogPath  = $scriptPath + "\Logs\" + (Get-TzDate) + "-" + $scriptBaseName
[System.Object]$fso     = New-Object -ComObject "Scripting.FileSystemObject"
[String]$outputPath     = $scriptPath + "\Reports\" + $thismonth
[string]$logPath        = $scriptPath+ "\Logs"

# MySQL Query to get objectid, name of all nodes
$nodes = MySQLOPM -Switch model -Query "select objid,name from node"

# Create hash of nodename and objid
$hash =@{}

foreach ($line in $nodes) {
    $hash.add($line.name, $line.objid)
}
# Create Log Directory
if ( -not (Test-Path $logPath) ) { 
       Try{
          New-Item -Type directory -Path $logPath -ErrorAction Stop | Out-Null
          Log-Msg 0 "Created Folder ""$logPath"""
       }
       Catch{
          Log-Msg 0 ("Failed creating folder ""$logPath"". Error " + $_.Exception.Message)
          Exit -1;
       }
    }

# Check hash is not empty, then query OPM server to extract counters
if ($hash.count -gt 0) {

    # If Report directory does not exist then create
    if ( -not (Test-Path $outputPath) ) { 
       Try{
          New-Item -Type directory -Path $outputPath -ErrorAction Stop | Out-Null
          Log-Msg 0 "Created Folder ""$outputPath"""
       }
       Catch{
          Log-Msg 0 ("Failed creating folder ""$outputPath"". Error " + $_.Exception.Message)
          Exit -1;
       }
    }
    # foreach node
    foreach ($h in $hash.GetEnumerator()) {
    
        $nodeperffilename  = "$($h.name)`_$filedate.csv"
        $nodePerfFile = Join-Path $outputPath $nodeperffilename

        # MySQL Query to query each object and save data to lastmonth directory
        MySQLOPM -Switch performance -Query "select objid,Date_Format(FROM_UNIXTIME(time/1000), '%Y:%m:%d') AS Date ,Date_Format(FROM_UNIXTIME(time/1000), '%H:%i') AS Time, round(avgProcessorBusy,1) AS cpuBusy,round(cifsOps,1) AS cifsOps,round(nfsOps,1) AS nfsOps,round((avgLatency/1000),1) As avgLatency from sample_node where objid=$($h.value)" | Export-Csv -Path $nodePerfFile -NoTypeInformation
        Log-Msg 0 "Exported Performance Logs for $($h.name)"
    }
} 

clustered Data ONTAP Upgrade Procedure from 8.2 to 8.3

Upgrade Prerequisites

Pre-upgrade Checklist
 • Send AutoSupport from all the nodes
 system node autosupport invoke -type all -node * -message "Upgrading to 8.3.2P5"

• Verify Cluster Health
 cluster show

• Verify Cluster is in RDB
 set advanced
 cluster ring show -unitname vldb
 cluster ring show -unitname mgmt
 cluster ring show -unitname vifmgr
 cluster ring show -unitname bcomd

• Verify vserver health
 storage aggregate show -state !online
 volume show -state !online
 network interface show -status-oper down
 network interface show -is-home false
 storage disk show -state broken
 storage disk show -state maintenance|pending|reconstructing

 • Revert any LIFs that are not home
 network interface revert -vserver <vserver-name> -lif <lif-name>
 system health status show
 dashboard alarm show

• Verify LIF failover configuration (data LIFs)
 network interface failover show

• Move all LS Mirror source Volumes to an aggregate on the node that will be upgraded last
vol move start -vserver svm1 -volume rootvol -destination-aggregate snowy008_aggr01_sas
vol move start -vserver svm2 -volume rootvol -destination-aggregate snowy008_aggr01_sas

• Modify existing failover-groups so that LIFs fail over only to the first two nodes to be upgraded
 snowy_CIFS_fg_a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy003 -port a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy004 -port a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy005 -port a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy006 -port a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy007 -port a0a-101
failover-groups delete -failover-group snowy_CIFS_fg_a0a-101 -node snowy008 -port a0a-101

snowy::> failover-groups show -failover-group snowy_CIFS_fg_a0a-101

snowy_NFS_fg_a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy003 -port a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy004 -port a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy005 -port a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy006 -port a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy007 -port a0a-102
failover-groups delete -failover-group snowy_NFS_fg_a0a-102 -node snowy008 -port a0a-102

snowy::> failover-groups show -failover-group snowy_NFS_fg_a0a-102

• Modify one LIF on each SVM to have its home node on one of the first two nodes
network interface modify -vserver svm1 -lif svm1_CIFS_01 -home-node snowy001 -auto-revert true
network interface modify -vserver svm2 -lif svm2_CIFS_01 -home-node snowy001 -auto-revert true

• Revert any LIFs that are not home
network interface revert *
###############################################################################
UPGRADE

Determine the current image & Download the new image on each node
 • Current Image
 system node image show

• Verify no jobs are running
 job show

• Install Data ONTAP on all the nodes from each Service Processor SSH console
 system node image update -node snowy001 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy002 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy003 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy004 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy005 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy006 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy007 -package https://webserver/832P5_q_image.tgz -replace-package true
 system node image update -node snowy008 -package https://webserver/832P5_q_image.tgz -replace-package true

• Disable 32 bit aggregate support
 storage aggregate 64bit-upgrade 32bit-disable

• Verify software is installed
 system node image show

• Set 8.3.2P5 image as default image
 system image modify {-node snowy001 -iscurrent false} -isdefault true
 system image modify {-node snowy002 -iscurrent false} -isdefault true
 system image modify {-node snowy003 -iscurrent false} -isdefault true
 system image modify {-node snowy004 -iscurrent false} -isdefault true
 system image modify {-node snowy005 -iscurrent false} -isdefault true
 system image modify {-node snowy006 -iscurrent false} -isdefault true
 system image modify {-node snowy007 -iscurrent false} -isdefault true
 system image modify {-node snowy008 -iscurrent false} -isdefault true
#################
###  REBOOT NODES 1 and 2
#################

Reboot the First two Nodes in the cluster

• Delete any running or queued aggregate, volume, SnapMirror copy, or Snapshot job
snowy::> job delete -id <job-id>

STEP 1: Reboot the node snowy002 first and wait for it to come up; then move on to partner node

storage failover show
storage failover takeover -bynode snowy001
"snowy002" reboots; verify that snowy002 is in the "Waiting for Giveback" state
There is an approximately 15-minute gap before giveback is initiated and services are given back to the original owner
storage failover giveback -fromnode snowy001 -override-vetoes true
storage failover show-giveback (keep checking "aggr show" until the aggregates return home)

• Verify the node booted up with 8.3.2P5 image
 system node image show
 Once the aggregates are home, verify the LIFs and revert any that are not home
 system node upgrade-revert show -node snowy002
 network interface revert *

• Verify that node’s ports and LIFs are up and operational
 network port show -node snowy002
 network interface show -data-protocol nfs|cifs -role data -curr-node snowy002

• Verifying the networking configuration after a major upgrade
 After completing a major upgrade to Data ONTAP 8.3.2P5, you should verify that the LIFs required for external server connectivity, failover groups, and broadcast domains are configured correctly for your environment.

1. Verify the broadcast domains:
network port broadcast-domain show
During the upgrade to the Data ONTAP 8.3 release family, Data ONTAP automatically creates broadcast domains based on the failover groups in the cluster.
For each layer 2 network, you should verify that a broadcast domain exists and that it includes all of the ports that belong to the network. If you need to make any changes, you can use the network port broadcast-domain commands.

2. If necessary, use the network interface modify command to change the LIFs that you configured for external server connectivity.

###########
STEP 2: Reboot the partner node (snowy001)

Move all the data LIFs from snowy001 to snowy002

• Takeover node
storage failover takeover -bynode snowy002 -option allow-version-mismatch
"snowy001" reboots; verify that snowy001 is in the "Waiting for Giveback" state
There is an approximately 15-minute gap before giveback is initiated and services are given back to the original owner
storage failover giveback -fromnode snowy002 -override-vetoes true
storage failover show (keep checking "aggr show" until the aggregates return home)
system node upgrade-revert show -node snowy001

• Once the aggregates are home, verify the LIFs and revert any that are not home
 network interface revert *

• Verify that node’s ports and LIFs are up and operational
 network port show -node snowy001
 network interface show -data-protocol nfs|cifs -role data -curr-node snowy001

• Verifying the networking configuration after a major upgrade
 After completing a major upgrade to Data ONTAP 8.3.2P5, you should verify that the LIFs required for external server connectivity, failover groups, and broadcast domains are configured correctly for your environment.

1. Verify the broadcast domains:
 network port broadcast-domain show
 During the upgrade to the Data ONTAP 8.3 release family, Data ONTAP automatically creates broadcast domains based on the failover groups in the cluster.
 For each layer 2 network, you should verify that a broadcast domain exists and that it includes all of the ports that belong to the network. If you need to make any changes, you can use the network port broadcast-domain commands.

2. If necessary, use the network interface modify command to change the LIFs that you configured for external server connectivity.

• Ensure that the cluster is in quorum and that services are running before upgrading the next pair of nodes:
 cluster show
 cluster ring show
#############
# NODES 3 and 4
#############
Follow the steps above and reboot the following nodes in this order:
 • snowy004
 • snowy003

#############
# NODES 5 and 6
#############
Follow the steps above and reboot the following nodes in this order:
 • snowy006
 • snowy005

#############
# NODES 7 and 8
#############
Follow the steps above and reboot the following nodes in this order:
 • snowy007
 • snowy008 (node 8 is the last node to be upgraded and rebooted; it hosts all LS Mirror source volumes)
###############################################################################
snowy::*> vol show -volume rootvol -fields aggregate
 (volume show)
 vserver      volume  aggregate
 ------------ ------- --------------------------
 svm1 rootvol snowy008_aggr01_sas
 svm2 rootvol snowy008_aggr01_sas

• Move all LS Mirror source Volumes to their original nodes
 vol move start -vserver svm1 -volume rootvol -destination-aggregate snowy007_aggr01_sas
 vol move start -vserver svm2 -volume rootvol -destination-aggregate snowy005_aggr01_sas

• Move all the LIFs back to their original home nodes
network interface modify -vserver svm1 -lif svm1_CIFS_01 -home-node snowy007
network interface modify -vserver svm2 -lif svm2_CIFS_01 -home-node snowy008

• Revert LIFs back to their home nodes
 network interface revert *

• Modify failover-groups and add ports from nodes 03 to 08
 snowy_CIFS_fg_a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy003 -port a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy004 -port a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy005 -port a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy006 -port a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy007 -port a0a-101
failover-groups create -failover-group snowy_CIFS_fg_a0a-101 -node snowy008 -port a0a-101

snowy_NFS_fg_a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy003 -port a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy004 -port a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy005 -port a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy006 -port a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy007 -port a0a-102
failover-groups create -failover-group snowy_NFS_fg_a0a-102 -node snowy008 -port a0a-102
##############################################################################
##########         ADD ABOVE PORTS TO BROADCAST DOMAINS     ##################
##############################################################################

Verify the Cluster is Healthy Post-upgrade

• Check Data ONTAP version on the cluster and each node
 set advanced
 version (This should report as 8.3.2P5)
 system node image show -fields iscurrent,isdefault
 system node upgrade-revert show
 The status for each node should be listed as complete.

• Check Cluster Health
 cluster show

• Check Cluster is in RDB
 set advanced
 cluster ring show -unitname vldb
 cluster ring show -unitname mgmt
 cluster ring show -unitname vifmgr
 cluster ring show -unitname bcomd
 cluster ring show -unitname crs

• Check vserver health
 storage aggregate show -state !online
 volume show -state !online
 network interface show -status-oper down
 network interface show -is-home false
 failover-groups show

• Check LIF failover configuration (data LIFs)
 network interface failover show

• Check for Network Broadcast Domains
 network port broadcast-domain show

• Check AV Servers are scanning the files

• Check DNS connectivity
 dns show -state disabled

• Check connectivity to Domain Controller
 cifs domain discovered-servers show -vserver svm1 -status ok
 (check for all Production vservers)

• Verify CIFS is working on all the SVMs
 Browse through shares on following SVMs from Windows Host
 \\svm1
 \\svm2

• Verify NFS is working
 Check connectivity to Unix/Linux Hosts where NFS exports are mounted

Volume Clone Split Extremely Slow in clustered Data ONTAP

Problem

My colleague had been dealing with growth on an extremely large volume (60 TB) for some time. After discussing with the business groups, it was agreed to split the volume into two separate volumes. The largest directory identified was 20 TB, which could be moved to its own volume. Discussions started on the best possible solution to get this job completed quickly.

Possible Solutions

  • robocopy / securecopy the directory to another volume. Past experience says this could be a lot more time consuming.
  • ndmpcopy the large directory to a new volume. The ndmpcopy session needs to be kept open, and if the job fails during transfer, we have to restart from the beginning. Also, there are no progress updates available.
  • clone the volume, delete the data not required, then split the clone. This seems to be a nice solution.
  • vol move. We don't want to copy the entire 60 TB volume and then delete data, so we didn't consider this solution.

So, we agreed on the third solution (clone, delete, split).

What actually happened

snowy-mgmt::> volume clone split start -vserver snowy -flexclone snowy_vol_001_clone
Warning: Are you sure you want to split clone volume snowy_vol_001_clone in Vserver snowy ?
{y|n}: y
[Job 3325] Job is queued: Split snowy_vol_001_clone.
 
Several hours later:
snowy-mgmt::> volume clone split show
Vserver  FlexClone            Processed Inodes  Total Inodes  Blocks Scanned  Blocks Updated  % Complete
-------  -------------------  ----------------  ------------  --------------  --------------  ----------
snowy    snowy_vol_001_clone                55         65562         1532838         1531276           0
 
Two Days later:
snowy-mgmt::> volume clone split show
Vserver  FlexClone            Processed Inodes  Total Inodes  Blocks Scanned  Blocks Updated  % Complete
-------  -------------------  ----------------  ------------  --------------  --------------  ----------
snowy    snowy_vol_001_clone               440         65562    1395338437      1217762917           0

This is a huge problem. The split operation will never complete in time.

What we found

We found the problem was with the way clone split works. Data ONTAP uses a background scanner to copy the shared data from the parent volume to the FlexClone volume. The scanner has one active message at any time, processing only one inode at a time, so the split tends to be faster on a volume with fewer inodes. The background scanner also runs at a low priority and can take a considerable amount of time to complete. This means that for a large volume with millions of inodes, the split operation takes a huge amount of time.

Workaround

“volume move a clone”

snowy-mgmt::*> vol move start -vserver snowy -volume snowy_vol_001_clone -destination-aggregate snowy01_aggr_01
  (volume move start)
 
Warning: Volume will no longer be a clone volume after the move and any associated space efficiency savings will be lost. Do you want to proceed? {y|n}: y

Benefits of vol move a FlexClone:

  • Faster than a FlexClone split.
  • Data can be moved to a different aggregate or node.

Reference

FAQ – FlexClone split

Error Handling in PowerShell Scripts

Introduction

I have been writing PowerShell scripts to address various problems with utmost efficiency. I have been incorporating error handling in my scripts; however, I recently refreshed my knowledge and am sharing it with fellow IT professionals. While running PowerShell cmdlets, you encounter two kinds of errors (Terminating and Non Terminating):

  • Terminating : These halt the function or operation, e.g. a syntax error or running out of memory. They can be caught and handled.


  • Non Terminating : These allow the function or operation to continue, e.g. file not found or permission issues; if the file is empty, the operation continues to the next piece of code. They are difficult to capture.

So how do you capture non-terminating errors in a function?

PowerShell provides various variables and actions to handle errors and exceptions:

  • $ErrorActionPreference : environment variable which applies to all cmdlets in the shell or the script
  • -ErrorAction : applies to specific cmdlets where it is applied
  • $Error : whenever an exception occurs it is added to the $Error variable. By default the variable holds the last 256 errors. $Error is an array where the first element is the most recent exception; as new exceptions occur, each new one pushes the others down the list.
  • -ErrorVariable : accepts the name of a variable; if the command generates an error, the error is placed in that variable.
  • Try .. Catch constructs : the Try block contains the command or commands that you think might cause an error; you have to set their -ErrorAction to Stop in order to catch the error. The Catch block runs if an error occurs within the Try block.
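A quick illustration of the $Error behaviour described above (a minimal sketch; the file name is made up so the cmdlet fails):

```powershell
# Trigger a non-terminating error; SilentlyContinue suppresses the red text
Get-Item 'no_such_file_12345.txt' -ErrorAction SilentlyContinue

# The most recent ErrorRecord is always at index 0
$Error[0].Exception.Message
```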

-ErrorAction : Use the -ErrorAction parameter to treat non-terminating errors as terminating. Every PowerShell cmdlet supports -ErrorAction. PowerShell halts execution on terminating errors; for non-terminating errors, -ErrorAction tells PowerShell how to handle the situation.

Available Choices

  • SilentlyContinue : error messages are suppressed and execution continues
  • Stop : forces execution to stop, behaves like a terminating error
  • Continue : default option. Errors will display and execution will continue
  • Inquire : prompt the user for input to see if we should proceed
  • Ignore : error is ignored and not logged to the error stream

function Invoke-SshCmd ($cmd){
    try {
        Invoke-NcSsh $cmd -ErrorAction Stop | Out-Null
        "The command completed successfully"
    }
    catch {
        Write-ErrMsg "The command did not complete successfully"
    }
}

$ErrorActionPreference : It is also possible to treat all errors as terminating using the $ErrorActionPreference variable. You can do this either for the script you are working with or for the whole PowerShell session.
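A minimal sketch of the preference variable in action (the file name is made up; with 'Stop' in effect, the normally non-terminating error becomes catchable):

```powershell
# Treat every error in this scope as terminating
$ErrorActionPreference = 'Stop'

try {
    Get-Content 'no_such_file_12345.txt' | Out-Null
    "The command completed successfully"
}
catch {
    "Caught: $($_.Exception.Message)"
}
```

Remember that the default is 'Continue'; set it back if you only want the stricter behaviour temporarily.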

-ErrorVariable : The example below captures the error in the variable "$x"

function Invoke-SshCmd ($cmd){
    try {
        Invoke-NcSsh $cmd -ErrorVariable x -ErrorAction Stop | Out-Null
        "The command completed successfully"
    }
    catch {
        Write-ErrMsg "The command did not complete successfully : $($x.Exception)"
    }
}

$x.InvocationInfo : provides details about the context in which the command was executed
$x.Exception : holds the error message string
If there is a further underlying problem, it is captured in $x.Exception.InnerException
The error message can be further broken down into:
$x.Exception.Message
and $x.Exception.ItemName
$($x.Exception.Message) is another way of accessing the error message inside a double-quoted string.

$Error : The example below captures the error in the default $Error variable

function Invoke-SshCmd ($cmd){
    try {
        Invoke-NcSsh $cmd -ErrorAction Stop | Out-Null
        "The command completed successfully"
    }
    catch {
        Write-ErrMsg "The command did not complete successfully : $($Error[0].Exception)"
    }
}

Query OnCommand Performance Manager (OPM) Database using PowerShell

Introduction

OnCommand Performance Manager (OPM) provides performance monitoring and event root-cause analysis for systems running clustered Data ONTAP software. It is the performance management part of OnCommand Unified Manager. OPM 2.1 is well integrated with Unified Manager 6.4. You can view and analyze events in the Performance Manager UI or view them in the Unified Manager Dashboard.

Performance Manager collects current performance data from all monitored clusters every five minutes (at 5, 10, 15, ... minutes past the hour). It analyzes this data to identify performance events and potential issues. It retains 30 days of five-minute historical performance data and 390 days of one-hour historical performance data. This enables you to view very granular performance details for the current month, and general performance trends for up to a year.

Accessing the Database

Using PowerShell you can query the MySQL database and retrieve information to create performance charts in Microsoft Excel or other tools. In order to access the OPM database, you'll need a user created with the "Database User" role.


The following databases are available in OPM 2.1:

  • information_schema
  • netapp_model
  • netapp_model_view
  • netapp_performance
  • opm

Of the above, the two databases with the most relevant information are "netapp_model_view" and "netapp_performance". The "netapp_model_view" database has tables that define the objects, and the relationships among the objects, for which performance data is collected, such as aggregates, SVMs, clusters, volumes, etc. The "netapp_performance" database has tables which contain the raw data collected, as well as periodic rollups used to quickly generate the graphs OPM presents through its GUI.

Refer to the MySQL function in my previous post on querying the OCUM database using PowerShell to connect to the OPM database.

Understanding the Database

OPM assigns each object (node, cluster, LIF, port, aggregate, volume, etc.) a unique ID. These IDs are independent of the IDs in the OCUM database. The IDs are stored in tables in the "netapp_model_view" database, and you can join the various tables on the object IDs.

Actual performance data is collected and stored in tables in the "netapp_performance" database. All table names carry the prefix "sample_". Each table row contains the OPM object ID for the object (node, cluster, LIF, port, aggregate, volume, etc.), the timestamp of the collection, and the raw data.
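Because the object IDs line up between the two databases, a model table can be joined to its performance table in a single query. A hypothetical sketch, assuming the MySQL helper function from the previous post and a database user permitted to read both schemas (table and column names as described above):

```powershell
# Join node names (netapp_model_view) to raw CPU samples (netapp_performance)
MySQL -Query "select n.name AS Node,
    Date_Format(FROM_UNIXTIME(s.time/1000), '%Y:%m:%d %H:%i') AS Time,
    round(s.cpuBusy,1) AS cpuBusy
  from netapp_performance.sample_node s
  join netapp_model_view.node n on n.objid = s.objid" | Format-Table -AutoSize
```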

A few useful database queries

The example below queries the database to retrieve the performance counters of a node.

Connect to the "netapp_model_view" database and list the objid and name of each node from the node table

MySQL -Query "select objid,name from node" | Format-Table -AutoSize

Connect to the "netapp_performance" database and export cpuBusy, cifsOps, and avgLatency from the sample_node table

MySQL -Query "select objid,Date_Format(FROM_UNIXTIME(time/1000), '%Y:%m:%d %H:%i') AS Time,cpuBusy,cifsOps,avgLatency from sample_node where objid=2" | Export-Csv -Path E:\snowy-01.csv -NoTypeInformation