mikepalic

Certifications & Being a SME.

More Specifically, What It’s Like Being a “Virtual” SME

Item Development Workshops (IDWs) take place when a certification test needs to be written or updated by a group of SMEs (Subject Matter Experts). My very first IDW was helping to write the NCDA in January of 2019, and it turned out to be one of the most interesting experiences of my career. The biggest surprise to me was how much work actually goes into writing a certification test. I have always struggled with standardized testing, but this gave me insight that clarified the logic of testing.

So far this year I’ve participated in three IDWs (with one more coming up in June 2020):

NetApp NCDA – NetApp Certified Data Administrator, ONTAP
https://www.netapp.com/us/services-support/university/certification/ncda.aspx

NetApp NCIE-SAN – NetApp Certified Implementation Engineer—SAN Specialist
https://www.netapp.com/us/services-support/university/certification/ncie-san.aspx

NetApp NCIE-DP – NetApp Certified Implementation Engineer – Data Protection Specialist
https://www.netapp.com/us/services-support/university/certification/ncie-data-protection.aspx

All NetApp Certification tests follow a similar methodology when they are being created and are all legally defensible. All questions from the above certs (as well as others) are taken from the following list of reference docs.

Credit: https://dilbert.com/strip/2000-08-31

NetApp Certification tests aren’t written for specific classes.

So what goes into a (good) test question?

First, let’s look at the parts of the questions.

Stem – This is the actual question being asked! It can be either a “recall” or a “reason” type question. Recall is a simple memory-based question, while reason questions have you work through a scenario.
Answer – It has to be clear and come from the documents on the reference lists.
Distractors – a.k.a. the wrong answers. They have to be real too, not made up, and they have to be relevant to the question as well as part of the NetApp ecosystem. We typically have to come up with 2-3 distractors per question. Questions are either choose 1 out of 4, 2 out of 4, or 3 out of 5. There will never be an “All of the Above” or “None of the Above” answer option.
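
For the programmers in the audience, the parts of a question can be sketched as a small data structure. This is only an illustration of the rules listed above, not a tool we actually use in an IDW; the class name, field names, and the sample question are all made up for this example.

```python
from dataclasses import dataclass
from typing import List

# (number of correct answers, total options): 1 of 4, 2 of 4, 3 of 5.
ALLOWED_FORMATS = {(1, 4), (2, 4), (3, 5)}
BANNED_OPTIONS = {"all of the above", "none of the above"}


@dataclass
class Item:
    """Hypothetical model of one test question."""
    stem: str               # the question being asked
    kind: str               # "recall" or "reason"
    answers: List[str]      # the correct option(s)
    distractors: List[str]  # the plausible wrong options

    def problems(self) -> List[str]:
        """Return the rule violations found; an empty list means none."""
        found = []
        if self.kind not in ("recall", "reason"):
            found.append(f"unknown question kind: {self.kind}")
        shape = (len(self.answers), len(self.answers) + len(self.distractors))
        if shape not in ALLOWED_FORMATS:
            found.append(f"unsupported format: choose {shape[0]} out of {shape[1]}")
        for option in self.answers + self.distractors:
            if option.strip().lower() in BANNED_OPTIONS:
                found.append(f"banned option: {option}")
        return found


# Made-up sample item: a "choose 1 out of 4" recall question.
sample = Item(
    stem="Which command changes the name of an ONTAP cluster?",
    kind="recall",
    answers=["cluster identity modify"],
    distractors=[
        "system node rename",
        "network interface rename",
        "storage aggregate rename",
    ],
)
print(sample.problems())
```

Note that the check only covers the mechanical rules. Whether a distractor is real, relevant, and backed by the reference docs still takes a room full of SMEs.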

So where do (good) questions come from?
Everywhere! (Technically.) I’ve been in IT for twenty-five years now, in a variety of roles, so a lot of my reason-based questions come from actual scenarios I’ve worked through, either in real life or while helping people in the NetApp Communities. Some of these scenarios have occurred the week before an IDW!

Typically during IDWs, SMEs travel to a NetApp office. So far I’ve been to RTP (Research Triangle Park, NC) and Boulder, CO. I typically arrive Sunday, have dinner with my fellow SMEs, and turn in before the start of the week. The week gets broken down like so:

Monday: Review existing questions; questions get put into three groups, Keep, Toss, Practice.
Tuesday: Write new questions; typically each SME gets 10-12 questions.
Wednesday: Tech Review
Thursday: Tech Review and fix / rework questions
Friday: Judge and score questions

After the day’s work, you go out, have dinner, do fun group activities, have a few beers and trade stories. Fun times are had by all.

Going Virtual.

Credit: https://www.imdb.com/title/tt0113481/

Back in March, the world took an odd turn. Lockdowns, shelter in place, social distancing, no traveling, and working from home became the new norm for a lot of people. Because of that, two of the three IDWs I’ve done this year have been virtual (NCIE-SAN & NCIE-DP).

So what’s it like to be a virtual SME?

So what’s different in-person versus virtual? A lot, honestly. Working on the questions in a room full of fellow SMEs is camaraderie at its finest: bouncing questions off one another, suggesting distractors (which is always fun), or re-working the stems so a wider audience can understand them more clearly.

When you do this remotely, you still (kind of) have that, but you’re also competing with the distractions that everyone has when working at home with the rest of the family present. My daughter likes to walk in and wave to folks while I’m on video, showing off her Legos, drawings, etc. I think my favorite part was when she got a bit shy the first few times she saw SANta on the screen!

We still kept to our day-to-day schedule. Everyone kept their video cameras on while we were working throughout the days. That helped keep up the camaraderie we all appreciated from the in-person IDWs.

So which do I prefer? Honestly, I do lean towards the on-site IDW mostly because I enjoy travel and spending time with my fellow SMEs. However, virtual works, given the current status of the world, and you get to sleep in your own bed every night.

The NCIE-DP IDW Team!

How To Rename an ONTAP Cluster and Its Nodes

Renaming a NetApp ONTAP cluster is something that I’ve been asked about more than a few times; usually it’s related to a cluster changing locations, or to a controller migration where you’ve added a new pair of controllers to the cluster and removed the old pair. It’s pretty straightforward and non-disruptive. It can be done manually, or even automated (but that’s another blog!).

Let’s start by looking at the cluster name (identity).

Mother::> cluster identity show
          Cluster UUID: 99a0be28-9999-99ea-9a99-00a098c34804
          Cluster Name: Mother
 Cluster Serial Number: 1-80-000011
      Cluster Location: Erehwon 
       Cluster Contact: @SpindleNinja

Note that the cluster has nodes that match the naming schema.

Mother::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
Mother-01             true    true
Mother-02             true    true
2 entries were displayed.

Let’s start to rename!

Mother::> cluster identity modify -name WOPR

Now we can see that the identity has changed.

WOPR::> cluster identity show
          Cluster UUID: 99a0be28-9999-99ea-9a99-00a098c34804
          Cluster Name: WOPR
 Cluster Serial Number: 1-80-000011
      Cluster Location: Erehwon
       Cluster Contact: @SpindleNinja

If we look at the nodes, however, they are still called by the old cluster’s name.

WOPR::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
Mother-01             true    true
Mother-02             true    true
2 entries were displayed.

To rename them, run the “system node rename” command.

WOPR::> system node rename -node Mother-01 -newname WOPR-01
[Job 962] Job succeeded: Rename of the node "Mother-01" to "WOPR-01" is successful.
WOPR::> system node rename -node Mother-02 -newname WOPR-02
[Job 963] Job succeeded: Rename of the node "Mother-02" to "WOPR-02" is successful.

Now, we can see that the names have changed.

WOPR::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
WOPR-01               true    true
WOPR-02               true    true
2 entries were displayed.

Note: If you just want to renumber the nodes, this is all you need to do: run the “system node rename” command and just change the -XX value. The names are all, technically, arbitrary anyway.

WOPR::> system node rename -node WOPR-02 -newname WOPR-99
[Job 976] Job succeeded: Rename of the node "WOPR-02" to "WOPR-99" is successful.
WOPR::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
WOPR-01               true    true
WOPR-99               true    true
2 entries were displayed.

And we can rename it back:

WOPR::> system node rename -node WOPR-99 -newname WOPR-02
[Job 977] Job succeeded: Rename of the node "WOPR-99" to "WOPR-02" is successful.
WOPR::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
WOPR-01               true    true
WOPR-02               true    true
WOPR-02               true true

Let’s take a look at the rest of the parts of the cluster.

Note that the admin and node vservers (SVMs) were renamed automatically, so there is no need to do anything here.

WOPR::> vserver show
                               Admin      Operational Root
Vserver     Type    Subtype    State      State       Volume     Aggregate
----------- ------- ---------- ---------- ----------- ---------- ----------
iSCSI       data    default    running    running     iSCSI_root N1_aggr1
WOPR        admin   -          -          -           -          -
WOPR-01     node    -          -          -           -          -
WOPR-02     node    -          -          -           -          -
4 entries were displayed.

The network LIFs are another story. They will need to be manually renamed.

WOPR::> net int show -lif Mother*
  (network interface show)
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
            Mother-01_clus1 up/up 169.254.206.139/16 WOPR-01       e0e     true
            Mother-01_clus2 up/up 169.254.46.169/16  WOPR-01       e0f     true
            Mother-02_clus1 up/up 169.254.172.145/16 WOPR-02       e0e     true
            Mother-02_clus2 up/up 169.254.214.1/16   WOPR-02       e0f     true
WOPR
            Mother-01_mgmt1 up/up 192.168.1.221/24   WOPR-01       e0M     true
            Mother-02_mgmt1 up/up 192.168.1.222/24   WOPR-02       e0M     true

6 entries were displayed.

Renaming is straightforward, just time-consuming.

WOPR::> net int rename -vserver Cluster -lif Mother-01_clus1 -newname WOPR-01_clus1
  (network interface rename)
WOPR::> net int rename -vserver Cluster -lif Mother-01_clus2 -newname WOPR-01_clus2
  (network interface rename)
WOPR::> net int rename -vserver Cluster -lif Mother-02_clus1 -newname WOPR-02_clus1
  (network interface rename)
WOPR::> net int rename -vserver Cluster -lif Mother-02_clus2 -newname WOPR-02_clus2
 (network interface rename)
WOPR::> net int rename -vserver WOPR -lif Mother-01_mgmt1 -newname WOPR-01_mgmt1
 (network interface rename)
WOPR::> net int rename -vserver WOPR -lif Mother-02_mgmt1 -newname WOPR-02_mgmt1
 (network interface rename)
WOPR::> net int show -lif Mother*
 (network interface show)
There are no entries matching your query.

(Now that looks better.)

WOPR::> net int show -lif WOPR*
  (network interface show)
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
            WOPR-01_clus1   up/up 169.254.206.139/16 WOPR-01       e0e     true
            WOPR-01_clus2   up/up 169.254.46.169/16  WOPR-01       e0f     true
            WOPR-02_clus1   up/up 169.254.172.145/16 WOPR-02       e0e     true
            WOPR-02_clus2   up/up 169.254.214.1/16   WOPR-02       e0f     true
WOPR
            WOPR-01_mgmt1   up/up 192.168.1.221/24   WOPR-01       e0M     true
            WOPR-02_mgmt1   up/up 192.168.1.222/24   WOPR-02       e0M     true
6 entries were displayed.
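
Since typing out one rename per LIF is the time-consuming part, here is a minimal sketch of how the command list could be generated instead of typed. It is not the automation mentioned earlier and it never talks to the cluster; it only builds the same “network interface rename” lines shown above so they can be reviewed and pasted. The function name and the hard-coded LIF list are assumptions for this example.

```python
def lif_rename_commands(lifs, old_prefix, new_prefix):
    """Build one rename command per LIF whose name starts with old_prefix.

    lifs is a list of (vserver, lif_name) pairs, for example copied by
    hand from the output of "net int show -lif Mother*".
    """
    commands = []
    for vserver, lif in lifs:
        if not lif.startswith(old_prefix):
            continue  # leave LIFs that do not follow the old naming alone
        new_name = new_prefix + lif[len(old_prefix):]
        commands.append(
            f"network interface rename -vserver {vserver} "
            f"-lif {lif} -newname {new_name}"
        )
    return commands


# The six LIFs from this lab cluster, typed in by hand.
lab_lifs = [
    ("Cluster", "Mother-01_clus1"),
    ("Cluster", "Mother-01_clus2"),
    ("Cluster", "Mother-02_clus1"),
    ("Cluster", "Mother-02_clus2"),
    ("WOPR", "Mother-01_mgmt1"),
    ("WOPR", "Mother-02_mgmt1"),
]

for command in lif_rename_commands(lab_lifs, "Mother", "WOPR"):
    print(command)
```

Printing the commands rather than running them keeps a human in the loop, which seems like the right default when renaming cluster and management LIFs.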

You also might want to check whether your aggrs need renaming. I don’t rename them here in my lab, but renaming an aggr with the “storage aggregate rename” command is also non-disruptive.

WOPR::> aggr show
Aggregate         Size Available Used% State   #Vols  Nodes            RAID Status
------------- -------- --------- ----- ------- ------ ---------------- ------------
N1_aggr1        3.05TB    1.98TB   35% online      23 WOPR-01          raid_dp,normal
N2_aggr1        3.05TB    2.19TB   28% online      10 WOPR-02          raid_dp,normal
root_aggr0_N1  368.4GB   17.85GB   95% online       1 WOPR-01          raid_dp,normal
root_aggr0_N2  368.4GB   17.85GB   95% online       1 WOPR-02          raid_dp,normal
4 entries were displayed.
WOPR::>

If we take a look at Active IQ Unified Manager, the name has been updated in there as well.

So there we have it. It’s a pretty straightforward procedure, even though one might think renaming would really confuse things in a cluster.

NetApp ONTAP 9.6 New Features and Functions (plus some new hardware!)

NetApp announced ONTAP 9.6 with new features and functions as well as a new mid-level NVMe hardware platform, the all NVMe A320.

Going forward (9.6 and up), all versions of ONTAP will be “long term supported”
New (and renamed) version of ONTAP System Manager that is based on REST APIs
Simpler FlexGroup management
Adaptive QoS support for NVMe/FC (maximums)
Additional host support for NVMe
FabricPool tiers to Google and Alibaba clouds
FlexCache extended to use with Cloud Volumes ONTAP
FlexGroups on MetroCluster
Larger ONTAP Select “Premium XL” license
Over-the-wire encryption with SnapMirror and FlexCache
Per tenant / SVM encryption key management
Aggregate-level encryption which enables aggregate level dedupe with NVE
Entry level AFF and FAS MetroCluster over IP
Easy deploy plugin for ONTAP Select in VMware
SnapMirror Synchronous support for NFSv4, SMB2 & 3
Self Encrypting Drives (SED) NVMe SSDs

Now let’s look at a few of my favorites:

New ONTAP Support Policy
Going forward, all ONTAP releases will be “long term supported.” Previously, “even” versions were short term supported (one year) and “odd” versions were long term supported (three years). The new support policy will be full support for three years, limited support for two years, and “self-service” support for three years after.
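
To make the arithmetic concrete, here is a small sketch that maps “years since release” to the phases described above. It assumes the three phases run back to back (3 + 2 + 3 = 8 years in total), which is my reading of the announcement rather than an official statement; the function name and the label for anything past year eight are made up for this example.

```python
# (phase name, length in years), assumed to run back to back.
PHASES = [
    ("full support", 3),
    ("limited support", 2),
    ("self-service support", 3),
]


def support_phase(years_since_release: float) -> str:
    """Return the support phase a release is in, given its age in years."""
    if years_since_release < 0:
        raise ValueError("release age cannot be negative")
    boundary = 0
    for name, length in PHASES:
        boundary += length
        if years_since_release < boundary:
            return name
    return "past the support window"  # label invented for this sketch


for age in (1, 4, 6, 9):
    print(age, "->", support_phase(age))
```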

FlexGroups
FlexGroups operate as a “scale-out NAS container” utilizing a single NAS point to allow access and automatic load distribution across multiple constituents, which rest on multiple aggregates across the ONTAP cluster. In ONTAP 9.6, FlexGroups are now supported on MetroCluster (FC and IP) and can be created on existing MetroCluster deployments after upgrading to ONTAP 9.6. Additional out-of-space protection was also introduced; it is called into action when one of the constituents gets a little fuller than the others. That constituent ends up “borrowing” space (up to 1%) from the other constituents to allow the write to complete.

Over-the-Wire Encryption With SnapMirror and FlexCache
SnapMirror has been around since just about the dawn of NetApp. Tried and true, it’s the backbone of replication and migration in ONTAP systems. Now with ONTAP 9.6, SnapMirror and SnapMirror Synchronous are encrypted (TLS v1.2) end to end, and encryption is enabled by default on all new SnapMirror relationships. End-to-end encryption is also now available on the new version of FlexCache, which made its debut with ONTAP 9.5. FlexCache is a feature of ONTAP that allows you to extend and/or accelerate data access within a cluster or, more typically, across the WAN to remote clusters.

Aggregate-Level Encryption:
NetApp Volume Encryption (NVE) was introduced back in ONTAP 9.1 as a quick and easy way to get encryption at rest on ONTAP. Shortly after, aggregate-level dedupe (ALD) was introduced in 9.2. The downside was that you could not do NVE and ALD at the same time, because each volume was encrypted with a different key for security. However, with ONTAP 9.6 you’re given the option to encrypt at an aggregate level, giving each volume the same key (technically), so ALD is able to read all the blocks across the volumes.

Entry Level AFF and FAS MetroCluster over IP:
The A220 and FAS2750 will now support MetroCluster over IP (MCC-IP), which was initially only available on the larger A700 and FAS9000 systems, and then later on the mid-level A300 and 8200 systems. This gives the entry level solutions the ability to utilize MetroCluster over IP functionality. To further lower the entry cost of MCC-IP, new deployments of MCC-IP are able to use existing switches within the customer’s datacenter for the ISL. (Note: Certain requirements apply to this option. Please contact your NetApp Partner or NetApp Rep for further details.)

New Platform AFF A320 All NVMe Controller
The A320 is the mid-level version of the all NVMe A800 that debuted last year, and it offers the same end-to-end NVMe connectivity. Onboard are 8x 100GbE (can also support 40GbE) ports for connectivity and shelf expansion. Ports “e0a” and “e0d” are reserved for shared cluster and HA interconnects. Each controller has two expansion slots that can be configured with either the 4 port 10GbE networking card, the 4 port 32Gb FC card, or the 25/100GbE RoCE card. Along with the release of the A320, an all NVMe expansion shelf, the NS224, will also be available.

Rear view of the A320: each controller has 8x 100GbE ports + 2 expansion slots.

What’s it like to write a tech cert test; contributing to the NetApp NCDA.

My wife dropped me off at O’Hare for my flight to Raleigh-Durham. Upon arrival at the hotel, I met a couple of members of the NetApp United crew at the bar. (This would become the theme for the week, as well as keeping tidy.*)

*We would later learn that tidy is Welsh slang for a few things depending on context.

Donny, me and Alun at the hotel bar.

It’s all about Psychometrics: “the science of measuring mental capacities and processes.”

On the first day, we learned the details of how to write a test and what makes a good question versus a bad question. So what actually goes into writing a technology-based certification test? Short answer: a lot. The long answer: a question is made up of the “stem” (aka the question), the answer(s), and the distractors. All parts need to be well thought out, including the distractors. We were not allowed to create faux distractors either; everything has to be a valid answer in the realm of NetApp. And adding even more difficulty, the questions need to be geared towards a Minimally Acceptable Candidate (MAC). The MAC for the NCDA is considered to be someone with 6-12 months of ONTAP administrator experience who still requires some supervision.

The NCDA NS0-160 Team

Before we could write any new questions, we needed to review the test blueprint. The blueprint is an outline of the various parts of the certification test. In this case, it covered which parts of NetApp ONTAP we wanted to include when testing the MAC. Examples ranged from general ONTAP and FAS design and functionality, to basic SnapMirror functions, and even some higher level functions like MetroCluster.

Once the blueprint and the number of questions were confirmed, it was time to start writing questions. I learned that writing questions, specifically writing good questions, is actually a lot harder than I originally thought. Oddly enough, the hardest part was coming up with the distractors, i.e. the wrong answers. You don’t want to make it too obvious or easy, and generally speaking even the distractors should be valid. For example, if your answers are a series of commands, each command needs to be valid inside of ONTAP, and any technology that’s referenced needs to be valid tech that exists, or once existed, inside of the NetApp universe.

Once all of the questions are written, then the real “fun” begins. It’s times like these that I think back to one of my very favorite quotes I learned back in my Rock Climbing and Alpine days.

“It doesn’t have to be fun for you to be having fun.”

Each question needed to be tech-reviewed by all of us SMEs in the room, as well as annotated with valid references to NetApp documentation. After each question passed the first round of tech review, there was a second pass in which the questions that needed it were re-reviewed and edited. The second time through went by quicker than the first for sure, due to the reduced number of questions. Once all the questions were finalized, we reviewed and weighted the questions.

From the NetApp Cafe

Switching to the subject of food for a minute (because it is never far from my mind), I was super impressed with the NetApp RTC Cafe. Each day there was always something delicious (and healthy) to be had at the various stations.

Each evening required some good R&R. Good food, drink, and company were a welcome respite from the brain drain. I am happy to report, I finally found a BBQ joint (Backyard BBQ Pit) I truly enjoyed in the Raleigh-Durham area. More importantly, lots of locally brewed beer was ONTAP at most of the local establishments we visited.

BBQ Plate

On Friday, it was (sadly) time to head home. What a good week of making new friends from around the world, learning, and having fun while working!

Flying home.

Hello World

So, ever get a great idea in your head and then go to carry it out only to realize you might have embarked on a much bigger journey than you bargained for? Yeah… staring at a blank page is kind of like that. Here I sit trying to write my first blog entry and get this party started, and whoosh. All the air and ideas slipped right on out of my brain.

I envision this blog’s purpose to be a place to reflect on technology, primarily data center infrastructure, with some random bits of nerd thrown in. Professionally, I’ve been in the IT industry for twenty years, but the geek force has been with me since birth.

SpindleNinja's blog

A blog about tech and stuff. My words are my own.

Author: mikepalic