Troubleshooting vRA upgrade – Aria Automation Series

While upgrading vRealize Automation to Aria Automation I ran into a few blocking issues. In this article I share the KBs and commands that helped me solve them and complete the upgrade.

Pre Checks fail due to low disk space on LCM root partition

In this case the EXTEND STORAGE function in the LCM settings does not help: the root partition is not extended

You can recover some space on the root partition using the commands in this KB; once the space is freed, re-run the pre-checks and continue with the upgrade

If you try to expand the disk space on LCM the job stops with the following error

The problem is caused by existing snapshots (left over from previous upgrades)

Simply delete all snapshots with DELETE ALL, then re-run EXTEND STORAGE

Upgrade Pre Checks fail for no apparent reason

It may be a stuck job trying to connect to the Aria Automation environment. In my case I ran into two problems.

The first was solved by running the commands in this KB; in the second case the Aria Automation services had not started properly.

To check the status of Aria Automation and its services, connect to the appliance via SSH as the root user.

First check whether the node (or nodes, in the case of a cluster) is in the Ready state
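For reference, a quick check can be done directly with kubectl from the appliance shell (a minimal sketch; the node name will of course match your appliance FQDN):

kubectl get nodes

The STATUS column should report Ready for every node.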

If the node is in the NotReady state it may not have started properly or has not yet completed startup. If after a while the status does not change to Ready try restarting the appliance.

Check the status of the services; all of them should be Started and Healthy

If the services are not Started, check the status of the pods in the prelude namespace
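For example, a minimal sketch of the check, listing the pods of the prelude namespace:

kubectl get pods -n prelude

All pods should be Running (or Completed, for one-shot jobs).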

You can check the deployment logs of the pods with the following command
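A hedged example, assuming you want the logs of a specific pod or deployment in the prelude namespace (the names in angle brackets are placeholders, replace them with the ones you need to inspect):

kubectl logs -n prelude <pod-name>
kubectl logs -n prelude deployment/<deployment-name>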

As you can see, the status of the services, the pods and the logs do not report any problem; in this case the system is working correctly.

For comparison, this is what the previous commands return when the appliance has not completed its reboot or is still starting up

Another blocking issue: not being able to map an image for the upgrade from LCM

This happens if you have not selected a Product support pack that supports the image you are trying to map (refer to this KB). Make sure you have selected and activated the latest PSPack available for the LCM release you are using.

Other useful commands to check your Aria Automation version and upgrade status
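As a reference, and hedging on the exact subcommands available on your build, vracli version prints the installed release, while the built-in help lists the upgrade-related subcommands:

vracli version
vracli --help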

I will be sure to update the article with solutions to other upgrade problems 🙂


Upgrade vRA 8.11.2 to 8.18.0 – Aria Automation Series

And now comes the best part 🙂 As you have seen, the installation with Easy Installer is not complicated; the upgrade requires more attention.

First you need to check the upgrade paths of the products involved and the corresponding compatibility matrix. During the entire upgrade process we must guarantee full interoperability between Lifecycle Manager, Workspace ONE and Aria Automation.

For Workspace ONE Access (IDM for short) it is simple: the current version is in the compatibility matrix with both Lifecycle Manager and Automation.

Things get a little more complicated with LCM and Automation.

Let's remember that the various versions must be compatible with each other!

Cross-referencing everything, and checking the various steps in the lab, I obtained the following upgrade path

Upgrade order  Product     Source Release  Destination Release
1              LCM         8.10            8.12
2              Automation  8.11.2          8.13.1
3              LCM         8.12            8.14
4              Automation  8.13.1          8.16
5              LCM         8.14            8.16
6              LCM         8.16            8.18
7              Automation  8.16            8.18

The resulting upgrade path is not simple, and in some steps I had to solve blocking problems. As always, it is worth keeping the infrastructure updated regularly: having to make so many jumps in a short time is much riskier.

Lifecycle Manager provides the Upgrade Planner tool within the environment that hosts Aria Automation; unfortunately version 8.10 (installed with Easy Installer) gave me various errors and was not usable 🙁

Only after the upgrade to 8.12 did it start to work, and then it stopped working correctly again a few releases later. My advice is to rely on the documented upgrade paths and the compatibility matrix site.

However, let's look at an image of the working tool; it reflects the correct upgrade path 🙂

From the environment select UPGRADE PLANNER, then you can specify the final release, select the product flags and run GENERATE UPGRADE PLAN

Here is the result

Once the upgrade path is defined, you can start the upgrade. Let's begin with Lifecycle Manager (LCM). The procedure is the same for all releases.

First we need to download the upgrade ISO from the Broadcom site; the file name is VMware-Aria-Suite-Lifecycle-Appliance-8.12.0.7-21628952-updaterepo.iso

The ISO must be loaded on a datastore of the cluster where LCM resides and connected to the appliance. Then, from LCM, go to Settings and System Upgrade. Select CDROM as the repository and then CHECK FOR UPGRADE.

If the ISO is correctly connected, the release for the upgrade will be detected.

NOTE: before starting the upgrade it is necessary to take a snapshot of LCM; if something goes wrong we can always roll back.

Let's proceed with the upgrade; to continue, enable the snapshot flag.

Run pre-checks and verify that everything is OK

Confirm with UPGRADE

LCM will update and restart, wait for services to come back up.

Upon completion you will be able to check the new release

Once the upgrade is complete, install the latest Product Support Pack: go to Settings and then Product Support Pack. The active pack is the one bundled with the upgrade; check whether more recent ones are available.

Select the latest one and click APPLY VERSION

We see that the latest Support Pack supports exactly the version of Aria Automation that we need for the upgrade step 2.

LCM must be restarted for the new configurations to take effect.

Let's move on to the second step: download the ISO for the Automation upgrade to release 8.13.1

This time the ISO must be loaded directly onto LCM; it can be transferred via SCP to the /data path

NOTE: if there is not enough space available, it will be necessary to remove the vra.ova used by Easy Installer for the initial installation. In addition to removing the image from Binary Mapping, it must be physically deleted from /data via the CLI
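A minimal sketch of the two operations, assuming the placeholder <lcm-fqdn> for the LCM appliance and the file names used in this article:

scp Prelude_VA-8.13.1.32340-22360938-updaterepo.iso root@<lcm-fqdn>:/data/
ssh root@<lcm-fqdn> rm /data/vra.ova   # only if you need to free space, see the NOTE above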

Mapping the ISO from Binary Mapping

The new image is available for upgrade!

Let's select the environment that hosts Aria Automation and click VIEW DETAILS

Before proceeding with the upgrade, perform a TRIGGER INVENTORY SYNC and verify that the job completes successfully

Back to the upgrade procedure, the new release is available

Enable the snapshot and rollback flags so you can roll back if there are problems during the upgrade

Let's check the hardware requirements for the new release by following the highlighted link

The required RAM is 48GB; my current deployment is MEDIUM, so I need to increase the RAM accordingly. To do this, use the day-2 operations from the environment: first POWER OFF, change the RAM from the vSphere Client, then POWER ON. Wait for the services to come back up (it takes a long time!)

Let's resume the upgrade by running the pre-checks

Let's go to the Upgrade Summary and click Submit to start the upgrade

Waiting for the upgrade process to complete successfully.

The whole process took about an hour; you can monitor the status of the services directly from the Aria Automation console. At the end of the upgrade our environment is updated 🙂

At this point we have performed the first two steps of the upgrade path; in the same way it is possible to perform all the other steps until reaching release 8.18 of both LCM and Aria Automation

NOTE: the Aria Automation upgrade from release 8.16 to 8.18 requires 54GB of RAM as a prerequisite.

These are the files to download to perform all the upgrades

Prelude_VA-8.13.1.32340-22360938-updaterepo.iso

Prelude_VA-8.16.0.33697-23103949-updaterepo.iso

Prelude_VA-8.18.0.35770-24024333-updaterepo.iso

VMware-Aria-Suite-Lifecycle-Appliance-8.12.0.7-21628952-updaterepo.iso

VMware-Aria-Suite-Lifecycle-Appliance-8.14.0.4-22630472-updaterepo.iso

VMware-Aria-Suite-Lifecycle-Appliance-8.16.0.4-23377566-updaterepo.iso

NOTE: The upgrade to LCM 8.18 can be done directly from the online repository 🙂

In the next article I will tell you how to solve some problems encountered during the upgrade and how to monitor the status of Aria Automation services via CLI


Install vRealize Automation 8.11.2 – Aria Automation Series

This is the first article in a series on Aria Automation. Yes, the title talks about the installation of the 8.11.2 release, when the product was still called vRealize Automation. Why talk about such an old version? Because it often happens that you have to make updates on old infrastructures to bring them to the latest available release. For me it is essential to have old releases available to verify the upgrade path to be used by the customer and verify each phase to ensure that everything is updated correctly. Better to gain experience in a laboratory than directly on the customer's production infrastructure 😉

I take this opportunity to create an installation guide to share with anyone who needs it.

vRealize Automation needs other components to operate: Workspace ONE Access and vRealize Suite Lifecycle Manager. The first is the identity manager that handles access for all vRealize/Aria products, while the second takes care of lifecycle management (configuration/upgrade).

The easiest way to install vRealize Automation is through Easy Installer, which is an ISO containing the software needed to install all the components mentioned. The ISO can be downloaded from the Broadcom website (you need an account entitled to download it).

Once the ISO has been downloaded, all you need to do is mount it on your workstation and start the installation process.

Then run the installer for your environment

The installation screen opens; click NEXT to continue.

Accept EULA

First of all, you need to enter the vcenter on which you will deploy Lifecycle Manager, the first appliance that will be installed.

Select Datacenter, cluster and datastore

Enter the information about the network the appliance will be connected to, as well as DNS and NTP

Enter the password that will be used for the root and configuration accounts (admin@local)

Specify the name to use for the VM, its IP address and its FQDN

It is also possible to install Identity Manager and vRealize Automation in one step; I prefer to do it later directly from Lifecycle Manager.

Let's go to the summary of what has been selected and confirm the installation

Wait for the process to finish

Installation completed successfully! Now we can connect to Lifecycle manager

To log in use the admin@local user with the password specified in the wizard

The Lifecycle Manager dashboard appears; select Locker to upload the certificates that will be used for the deployment of Workspace ONE and vRealize Automation

Import the certificates to be used, one for Workspace ONE and one for vRealize Automation. The certificates must be in PEM format and contain, in addition to the certificate itself, the root CA with which it was signed and the corresponding private key. The order within the PEM file must be: certificate + root CA + private key
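For example, assuming the three PEM files are named server.crt, rootca.crt and server.key (hypothetical names), the file to import can be assembled like this:

cat server.crt rootca.crt server.key > vra-fullchain.pem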

Repeat the same operations also for the vRealize Automation certificate

Let's proceed with the installation of Workspace ONE; at the same time the globalEnvironment that will host it will be created. Activate the flag for the installation, set the password to use for accessing Workspace ONE and select the destination Datacenter

Select Identity Manager, a new installation, the version and the deployment type; for my lab Standard is fine (for production environments obviously select Cluster)

Accept EULA

Select the previously imported certificate

Enter the vcenter, select the target cluster and all other necessary data

Enter the data relating to the network to which the appliance will be connected

Select the appliance size (in this case Medium) and all other data

Run Pre-checks to verify that all data is correct

Go to summary and hit submit to start deploying the appliance

Under Requests you will be able to follow all the installation steps

If the installation was successful you will be able to see the identity manager present in the globalEnvironment

Now that the globalEnvironment has been created and Workspace ONE is installed, we can proceed with creating the environment for vRealize Automation and installing it. The procedure is very similar to the previous one.

Select the component to install: vRealize Aria Automation

NOTE: the selection also includes all the other products of the vRealize suite, however the Easy Installer includes only the ISO for Automation

Accept EULA

Select the License to use for vRealize Automation

Select the certificate to use

Enter the vcenter, its target cluster and all other information needed for deployment

Specify network information

Select the size of the appliance, in this case Medium. Complete with all other information.

Perform pre-check to verify the information entered is correct

Go to the summary and proceed with the installation by pressing submit

Wait for the installation request to finish successfully

New environment is now available with vRealize Automation

Connect to Automation for the first login

The logon is redirected to Workspace ONE for authentication; the username and password to be used are those specified during the creation wizard

This concludes the first article in the series; in the next one we will see how to upgrade vRealize Automation to the latest version of Aria Automation


VCF 5.2 Administrator – my experience

The good and the bad of my job? You never stop learning 🙂

Keeping up with technology requires a lot of effort in terms of time and money, and staying on track can be stressful but challenging at the same time. That's why certifications are part of my continuing education; they are important milestones that demonstrate what has been achieved.

Why VMware Cloud Foundation Administrator? Because it represents the whole set of technologies I have been working on and focusing on in recent years. Also, a recent study (mentioned in this article and at the VMware Explore keynote) reports that there will be a return to "on-premises" infrastructure in the coming years.

I'm sure many are thinking about taking this exam, so I'm glad to share my personal experience with everyone!

Exam Details

Exam           VMware Cloud Foundation 5.2 Administrator
Code           2V0-11.24
Price          $250
Items          70
Time           135 minutes
Minimum score  300 (max 500)

NOTE: As announced last May, it is no longer necessary to take an official course in order to sit the exam. Fees have also changed: now all exam types (VCTA/VCP/VCAP) have the same price.

Preparing for the exam

Details of the exam are available at this link

The blueprint consists of 5 sections:

  1. IT Architectures, Technologies, Standards
  2. VMware by Broadcom Solution
  3. Plan and Design the VMware by Broadcom Solution
  4. Install, Configure, Administrate the VMware by Broadcom Solution
  5. Troubleshoot and Optimize the VMware by Broadcom Solution

For the exam, we can ignore sections 1 and 3 and focus on the remaining ones. A quick glance immediately reveals the weight of sections 4 (as many as 4 pages) and 5 on the exam. Study should be focused mainly on these sections.

The objectives are many and on all components of VCF. How to prepare for this exam?

Obviously by studying the official documentation; remember that VCF consists of many products, and for each you need to study the relevant guide:

  • Cloud Builder
  • SDDC Manager
  • VMware vCenter Server
  • VMware ESXi
  • VMware vSAN
  • VMware NSX
  • VMware Aria Suite Lifecycle

NOTE: Expect questions about each of the products listed 😉

Taking an official course can certainly accelerate learning; details of the courses can be found at this link.

Other valuable resources are the Hands-on Labs, which let you practice in virtual labs. Practice is essential to assimilate the theory.

My advice is to install VCF from scratch; for me it was a key experience. Getting hands-on in all phases of installation and configuration remains the best way to study: learning by doing!

If you do not have a cluster that meets the requirements to install VCF, you can try the nested version: the Holodeck Toolkit

NOTE: the previous link leads to a discussion on how to download the latest version of the Toolkit; this is the form to request the download

There are many articles on how to implement Holodeck, so I will leave the search to you 🙂

Exam

This is the classic VMware exam: single/multiple-choice questions, putting events in the right sequence, matching terms and definitions. Questions should be read carefully and slowly; the important thing is to stay clear-headed and focus on the answers.

Divide your time by the number of questions; do not spend more time than necessary on a question. If you find one challenging, don't freeze: move on, you can mark it and review it before the final submission.

Try to answer all questions; be careful about the number of answers you select for multiple-choice questions!

Some answers are obviously wrong, so a good strategy is to proceed by exclusion: the answers that remain are, with good probability, the right ones.

The questions are really varied, ranging from simple ones with immediate answers to those that require not only reasoning but also experience in using the tools. Well yes, I found a question regarding an NSX KB that allowed me to solve a problem not long ago 🙂 This does not mean that you have to read all existing KBs, but that the exam tests your actual experience with the various technologies.

Conclusion

Passing the VCF Administrator exam requires commitment, discipline and a good study strategy. Sharing my experience is a way to inspire and support anyone who wants to take this path. Happy studying and good luck!


Deploy NSX ALB for TKG Standalone

In previous articles we saw how to create a standalone TKG cluster using NSX ALB as a load balancer; let's now see how to install and configure it for TKG.

First we need to check which ALB version is in the compatibility matrix with the versions of vSphere and TKG we are using.

In my case these are vSphere 7.0U3 and TKG 2.4.

The check is necessary because not all ALB releases are compatible with vSphere and TKG, so let's look in detail at the matrix for the versions used.

The VMware Product Interoperability Matrix is available at this link.

The versions of ALB that can be used are 22.1.4 and 22.1.3; I chose the most recent.

The latest patches should also be applied to Release 22.1.4.

The files to download from the VMware site are as follows:

controller-22.1.4-9196.ova + avi_patch-22.1.4-2p6-9007.pkg

NOTE: Before starting the OVA deployment, verify that you have created the DNS records (A+PTR) for the controllers and the cluster VIP.

Start the deployment of the controller on our vSphere cluster.

Select the OVA file

Name the VM and select the target Datacenter and Cluster

On the summary screen ignore the certificate warnings, then NEXT

In the next steps, select the datastore and controller management network

Now enter the settings for the controller management interface

Select FINISH to start deployment

Turn on the VM and wait for the services to be active (takes a while)

The first time you log in, you will need to enter the admin user password

Enter the information necessary to complete the configuration and SAVE

First we apply the type of license to use; for the lab I used Enterprise with a 1-month evaluation

Change the cluster and node settings by entering the virtual address and FQDN, then SAVE.

NOTE: after saving the settings you will lose connectivity with the node for a few seconds

Connect to the controller using the VIP address

Apply the latest available patches

Select the previously downloaded file and upload it

Waiting for the upload to complete successfully

Start the upgrade

Leave the default settings and confirm

Follow the upgrade process, during the upgrade the controller will reboot

Patch applied!

Create the Cloud zone associated with our vCenter for automatic deployment of Service Engines

Enter the Cloud name and the vcenter connection data

Select the Datacenter, the Content Library if any. SAVE & RELAUNCH

Select the management network, subnet and default GW, and specify the pool used to assign static IP addresses to the Service Engines.

Create the IPAM profile that will be used for the VIPs required by the Tanzu clusters.

Save and return to the creation of Cloud zone. Confirm creation as well.

Cloud zone appears in the list, verify that it is online (green dot)

Generate a controller SSL certificate; this is the one used in the Tanzu standalone management cluster creation wizard

Enter the name and Common Name of the certificate, making sure it reflects the actual DNS record (FQDN) used by Tanzu to access the AVI controller. Fill in all the necessary fields.

Complete the certificate with the days of validity (365 or more if you want) and insert all the Subject Alternative Names with which the certificate can be invoked (IP addresses of the VIP and controllers; also include the FQDNs of the individual controllers)

The certificate will be used to access the controller VIP, set it for access.

Enable HTTP and HTTPS access as well as redirect to HTTPS.

Delete the current certificates and select the newly created one.

NOTE: once the new certificate is applied, you will need to reconnect to the controller

It remains to enter the default route for traffic leaving the VRF associated with our Cloud zone.

NOTE: verify that our Cloud zone is selected in the Select Cloud field.

Enter the default GW for all outgoing traffic

Check that the route is present

Complete our configuration by creating a Service Engine Group to be used for Tanzu. This will allow us to customize the configurations of the SEs used for Tanzu.

Enter the name and select the cloud zone.

Finally we have an ALB load balancer ready to serve our Tanzu clusters 🙂


ESXi Network Tools

Sometimes you have to troubleshoot an ESXi host for network problems.

Over time I have put together a small guide to help me remember the various commands; I am sharing it hoping it will be useful to everyone 🙂

esxcli network (here the complete list)

Check the status of firewall

esxcli network firewall get
Default Action: DROP
Enabled: true
Loaded: true

Enabling and disabling firewall

esxcli network firewall set --enabled false   (firewall disabled)

esxcli network firewall set --enabled true (firewall enabled)

TCP/UDP connection status

esxcli network ip connection list
Proto Recv Q Send Q Local Address                   Foreign Address       State       World ID CC Algo World Name
----- ------ ------ ------------------------------- --------------------- ----------- -------- ------- ----------
tcp        0      0 127.0.0.1:80                    127.0.0.1:28796       ESTABLISHED  2099101 newreno envoy
tcp        0      0 127.0.0.1:28796                 127.0.0.1:80          ESTABLISHED 28065523 newreno python
tcp        0      0 127.0.0.1:26078                 127.0.0.1:80          TIME_WAIT          0
tcp        0      0 127.0.0.1:8089                  127.0.0.1:60840       ESTABLISHED  2099373 newreno vpxa-IO
(output truncated)

Configured DNS servers and search domain

esxcli network ip dns server list

DNSServers: 10.0.0.8, 10.0.0.4

esxcli network ip dns search list

DNSSearch Domains: scanda.local

List of vmkernel interfaces

esxcli network ip interface ipv4 get
Name IPv4 Address   IPv4 Netmask  IPv4 Broadcast Address Type Gateway      DHCP DNS
---- -------------- ------------- -------------- ------------ ------------ --------
vmk0 172.16.120.140 255.255.255.0 172.16.120.255 STATIC       172.16.120.1 false
vmk1 172.16.215.11  255.255.255.0 172.16.215.255 STATIC       172.16.215.1 false

Netstacks configured on host (used on vmkernel interfaces)

esxcli network ip netstack list
defaultTcpipStack
Key: defaultTcpipStack
Name: defaultTcpipStack
State: 4660

vmotion
Key: vmotion
Name: vmotion
State: 4660

List of physical network adapters

esxcli network nic list
Name   PCI Device   Driver  Admin Status Link Status Speed Duplex MAC Address       MTU  Description
------ ------------ ------- ------------ ----------- ----- ------ ----------------- ---- -----------
vmnic0 0000:04:00.0 ntg3    Up           Down        0     Half   ec:2a:72:a6:bf:34 1500 Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
vmnic1 0000:04:00.1 ntg3    Up           Down        0     Half   ec:2a:72:a6:bf:35 1500 Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
vmnic2 0000:51:00.0 bnxtnet Up           Up          25000 Full   00:62:0b:a0:b2:c0 1500 Broadcom NetXtreme E-Series Quad-port 25Gb OCP 3.0 Ethernet Adapter
vmnic3 0000:51:00.1 bnxtnet Up           Up          25000 Full   00:62:0b:a0:b2:c1 1500 Broadcom NetXtreme E-Series Quad-port 25Gb OCP 3.0 Ethernet Adapter
vmnic4 0000:51:00.2 bnxtnet Up           Up          25000 Full   00:62:0b:a0:b2:c2 1500 Broadcom NetXtreme E-Series Quad-port 25Gb OCP 3.0 Ethernet Adapter
vmnic5 0000:51:00.3 bnxtnet Up           Up          25000 Full   00:62:0b:a0:b2:c3 1500 Broadcom NetXtreme E-Series Quad-port 25Gb OCP 3.0 Ethernet Adapter

vmkping (KB reference)

Command to send ICMP packets through vmkernel interfaces, very useful for checking MTU 🙂

usage examples

ping a host
vmkping -I vmk0 192.168.0.1

check MTU and fragmentation
vmkping -I vmk0 -d -s 8972 172.16.100.1

ping a host using the vmotion netstack
vmkping -I vmk2 -S vmotion 172.16.115.12

iperf ( good article here)

Very useful tool to check the actual usable bandwidth between two hosts: one host runs in server mode and the other in client mode

the tool is located at this path

/usr/lib/vmware/vsan/bin/iperf3

NOTE: in vSphere 8 you may get an "Operation not permitted" error at runtime; you can enable execution with the command

esxcli system secpolicy domain set -n appDom -l disabled

then re-enable enforcement with

esxcli system secpolicy domain set -n appDom -l enforcing

it is also necessary to disable the firewall to perform the tests

esxcli network firewall set --enabled false

usage example:

host in server mode; the -B option allows a specific address and interface to be used for the test

/usr/lib/vmware/vsan/bin/iperf3 -s -B 172.16.100.2

host in client mode; the -n option specifies the amount of data to be transferred for the test

/usr/lib/vmware/vsan/bin/iperf3 -n 10G -c 172.16.100.2

25G interface test result

[ ID] Interval        Transfer    Bitrate        Retr
[  5]   0.00-4.04 sec 10.0 GBytes 21.3 Gbits/sec 0    sender
[  5]   0.00-4.04 sec 10.0 GBytes 21.3 Gbits/sec      receiver

NOTE: at the end of the test remember to re-enable the firewall and the enforcement 🙂

nslookup and DNS cache (KB reference)

Sometimes it is necessary to verify that DNS name resolution is working properly on a host.

Use the nslookup command followed by the name to resolve

nslookup www.scanda.it

It may happen that changes to DNS records are not immediately picked up by ESXi hosts; this is due to the DNS query caching mechanism.

To clear the DNS cache, use the following command (KB reference)

/etc/init.d/nscd restart

TCP/UDP connectivity test

On ESXi hosts the netcat (nc) tool is available to verify TCP/UDP connectivity to another host.

nc
usage: nc [-46DdhklnrStUuvzC] [-i interval] [-p source_port]
[-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_version]
[-x proxy_address[:port]] [hostname] [port[s]]
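For example, a quick test of TCP reachability towards port 443 of another host (the address is just a placeholder); the -z option only checks that the port is listening, without sending data:

nc -z 192.168.0.10 443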

If you need to verify access to an HTTPS service and the validity of its SSL certificate, you can use the command

openssl s_client -connect www.dominio.it:443

pktcap-uw (KB reference)

Another very useful tool is pktcap-uw, which allows you to capture network traffic in full tcpdump style. The tool differs from tcpdump-uw in that it can capture traffic not only from vmkernel interfaces, but also from physical interfaces, switchports, and virtual machines.

Let's look at a few examples

Capturing traffic from the vmkernel interface vmk0

pktcap-uw --vmk vmk0

traffic capture from physical uplink vmnic3

pktcap-uw --uplink vmnic3

Capturing traffic from a virtual switch port

pktcap-uw --switchport <switchportnumber>

NOTE: To get the port number mapping and virtual nic of a VM use the command net-stats -l

It is also possible to retrieve LLDP protocol information from the uplinks used by a VSS (which does not support LLDP natively) with the following command

pktcap-uw --uplink vmnic1 --ethtype 0x88cc -c 1 -o /tmp/lldp.pcap > /dev/null && hexdump -C /tmp/lldp.pcap

The output will be in hexadecimal format and may be useful for performing port mapping of a host even on a Virtual Standard Switch.

I will keep updating the list with other useful commands.

 


Deploy TKG Standalone Cluster – part 2

Here is the second article; you can find the first one at this link.

Now that the bootstrap machine is ready we can proceed with the creation of the standalone cluster.

Let’s connect to the bootstrap machine and run the command that starts the wizard.

NOTE: by default the wizard listens on the loopback interface; if you want to reach it externally, just specify the --bind option with the IP of a local interface.

tanzu management-cluster create --ui 
or
tanzu management-cluster create --ui --bind 10.30.0.9:8080

By connecting with a browser to the specified address we will see the wizard page.

Select the cluster type to deploy, in this case vSphere.

NOTE: the SSH key is the one generated in the first part; paste it in full (the screenshot is missing the leading ssh-rsa AAAAB3N… )

Deploy TKG Management Cluster

Select the controlplane type, node size and balancer

NOTE: in this case I chose to use NSX ALB, which must already be installed and configured

Enter the specifications for NSX ALB

Insert any data in the Metadata section

Select the VM folder, Datastore, and cluster to be used for deployment

Select the Kubernetes network

If needed, configure the identity provider

Select the image to be used for the creation of the cluster nodes.

NOTE: this is the image previously uploaded and converted to a template

Select whether to enable CEIP

Deployment begins, follow the various steps and check for errors

The deployment takes some time to complete

It is now possible to connect to the cluster from the bootstrap machine and check that it is working

tanzu mc get

Download and install the Carvel tools on the bootstrap machine

Installation instructions can be found in the official documentation

Verify that the tools have been properly installed
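A quick, hedged check assuming the standard Carvel binaries (ytt, kapp, kbld, imgpkg) were installed in the PATH:

ytt --version
kapp --version
kbld --version
imgpkg --version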

Now we can create a new workload cluster.

After the management cluster is created, we find its definition file under the path ~/.config/tanzu/tkg/clusterconfigs
The file has a randomly generated name (9zjvc31zb7.yaml); it is then converted into a file with the specifications for creating the cluster (tkgvmug.yaml)

Make a copy of the file 9zjvc31zb7.yaml, naming it after the new cluster to be created (myk8svmug.yaml)

Edit the new file and change the CLUSTER_NAME variable, entering the name of the new cluster, as shown below
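The relevant line in myk8svmug.yaml should end up looking like this (the cluster name is the one chosen above):

CLUSTER_NAME: myk8svmug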

Launch the command to create the new cluster

tanzu cluster create --file ~/.config/tanzu/tkg/clusterconfigs/myk8svmug.yaml

Connect to the new cluster

tanzu cluster kubeconfig get --admin myk8svmug
kubectl config get-contexts
kubectl config use-context myk8svmug-admin@myk8svmug

We can now install our applications in the new workload cluster

 

 


Deploy TKG Standalone Cluster – part 1

I had the pleasure of attending the recent Italian UserCon with a session on Tanzu Kubernetes Grid and the creation of a standalone management cluster. Out of this experience comes this series of posts on the topic.

As mentioned above, this series of articles covers TKG Standalone version 2.4.0; it should be pointed out that the most common solution to use is TKG with Supervisor (refer to the official documentation)

But then when does it make sense to use TKG Standalone?

  • When using AWS or Azure
  • When using vSphere 6.7 (vSphere with Tanzu was only introduced in version 7)
  • When using vSphere 7 and 8 but need the following features : Windows Containers, IPv6 dual stack, and the creation of cluster workloads on remote sites managed by a centralized vcenter server

Let’s look at the requirements for creating TKG Standalone:

  • a bootstrap machine
  • vSphere 8, vSphere 7, VMware Cloud on AWS, or Azure VMware Solution

I have reported only the main requirements, for all details please refer to the official link

Management Cluster Sizing

Below is a table showing what resources to allocate for management cluster nodes based on the number of workload clusters to be managed.

In order to create the management cluster, it is necessary to import the images to be used for the nodes; the images are available from the VMware site downloads.

I recommend using the latest available versions:

  • Ubuntu v20.04 Kubernetes v1.27.5 OVA
  • Photon v3 Kubernetes v1.27.5 OVA

Once the image has been imported, it is necessary to convert it to a template.

Creating bootstrap machine

Maybe this is the most fun part 🙂 I chose a Linux operating system, specifically Ubuntu Server 20.04.

The recommended requirements for the bootstrap machine are as follows: 16GB RAM, 4 CPUs and at least 50GB of disk space.

Here are the details of mine

Update to the latest available packages

sudo apt update
sudo apt upgrade

Important! Synchronize time via NTP, for example as shown below.
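On Ubuntu 20.04 a minimal way to do this is with systemd-timesyncd (a sketch; adapt it if your environment requires a specific NTP server, which is configured in /etc/systemd/timesyncd.conf):

sudo timedatectl set-ntp true
timedatectl status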

If you are using the bootstrap machine in an isolated environment, it is useful to also install the graphical environment so that you can use a browser and other graphical tools.

apt install tasksel
tasksel install ubuntu-desktop
reboot

Install Docker
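The quickest way on Ubuntu 20.04 is the docker.io package from the Ubuntu repositories (a minimal sketch; installing Docker CE from the official Docker repository works just as well):

sudo apt install -y docker.io
docker --version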

Manage Docker as a non-root user

sudo groupadd docker
sudo usermod -aG docker $USER
docker run hello-world

Configure Docker to start automatically with systemd

sudo systemctl enable docker.service
sudo systemctl enable containerd.service

Enable the kernel module required by kind

sudo modprobe nf_conntrack

Install Tanzu CLI 2.4

Check the Product Interoperability Matrix to find which version is compatible with TKG 2.4

Once you have identified the compatible version, you can download it from vmware

Proceed to install the CLI in the bootstrap machine (as a non-root user)

mkdir tkg
cd tkg
wget https://download3.vmware.com/software/TCLI-100/tanzu-cli-linux-amd64.tar.gz
tar -xvf tanzu-cli-linux-amd64.tar.gz
cd v1.0.0
sudo install tanzu-cli-linux_amd64 /usr/local/bin/tanzu
tanzu version

Installing TKG plugins

tanzu plugin group search -n vmware-tkg/default --show-details
tanzu plugin install --group vmware-tkg/default:v2.4.0
tanzu plugin list

Download and install the Kubernetes CLI for Linux on the bootstrap machine

cd tkg
gunzip kubectl-linux-v1.27.5+vmware.1.gz
chmod ugo+x kubectl-linux-v1.27.5+vmware.1
sudo install kubectl-linux-v1.27.5+vmware.1 /usr/local/bin/kubectl
kubectl version --short --client=true

Enable autocomplete for kubectl and Tanzu CLI.

echo 'source <(kubectl completion bash)' >> ~/.bash_profile

echo 'source <(tanzu completion bash)' >> ~/.bash_profile

Finally, generate the SSH keys to be used in the management cluster creation wizard

ssh-keygen
cat ~/.ssh/id_rsa.pub

This last operation completes the first part of the article.

The second part is available here


NSX-T Upgrade

The NSX-T installation series started with 3.1.x; it's time to upgrade to 3.2 🙂

The upgrade is completely managed by NSX Manager; let's walk through the process starting from the official documentation.

The target version will be 3.2.2, because in this release the Upgrade Evaluation Tool is integrated into the pre-upgrade check phase. Therefore, you will not have to deploy its OVF separately 😉

Download the NSX 3.2.2 Upgrade Bundle from your My VMware account.

NOTE: The bundle exceeds 8GB of disk space.

I almost forgot: let's verify that the vSphere version is in the interoperability matrix with the target NSX-T version. My cluster is on 7.0U3, fully supported by NSX-T 3.2.2 🙂

Connect to NSX Manager via SSH and verify that the upgrade service is active.

Run the following command as the admin user: get service install-upgrade
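The output should look similar to the following (a sketch from memory, details may vary between releases):

Service name: install-upgrade
Service state: running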

The service is active, connect to the NSX Manager UI and go to System -> Upgrade

Select UPGRADE NSX

Upload the upgrade bundle

wait for the bundle to load (the process may take some time)

After uploading, pre-checks on the bundle begin.

Once the preliminary checks have been completed, it is possible to continue with the upgrade. Select the UPGRADE button.

Accept the license and confirm the request to start the upgrade.

Select the RUN PRE-CHECK button and then ALL PRE-CHECK.

The pre-checks begin; in case of errors, each problem must be solved until all the checks pass.

Proceed with the Edges update by selecting the NEXT button.

Select the Edge Cluster and start the update with the START button.

The update of the Edges that form the cluster begins; the process stops when it completes successfully or at the first error detected.

Once the upgrade has been successfully completed, run the POST CHECKS.

If all is well, continue with the upgrade of the ESXi hosts. Select NEXT.

Select the cluster and start the update with the START button.

At the end of the update run the POST CHECKS.

Now it remains to update NSX Manager, select NEXT to continue.

Select START to start upgrading the NSX Manager.

The upgrade process first performs pre-checks and then continues with the Manager upgrade.

The update continues with the restart of the Manager; a note reminds you that until the update is complete it will not be possible to connect to the UI.

Once the Manager has been restarted and the upgrade has been completed, it will be possible to access the UI and check the result of the upgrade. Navigate to System -> Upgrade

You can find the details of each upgrade stage. As you can see, the upgrade process is simple and structured.

Upgrade to NSX-T 3.2.2 completed successfully 🙂


Create host transport nodes

The last article in the series is about preparing ESXi hosts to turn them into Transport Nodes.

First you need to create some profiles to be used later for preparing hosts.

From the Manager console, move to System -> Fabric -> Profiles -> Uplink Profiles.

Select + ADD PROFILE

Enter the name of the uplink profile; if you are not using LAG (LACP), move to the next section.

Select the Teaming Policy (default Failover Order) and enter the name of the active uplink. Enter the VLAN ID, if any, to be used for the overlay network and the MTU value.

Move to Transport Node Profiles.

Select + ADD PROFILE

Enter the name of the profile, select the type of Distributed switch (leave the Standard mode), select the Compute Manager and the related Distributed switch.

In the Transport Zone section indicate the transport zones to be configured on the hosts.

Complete the profile by selecting the previously created uplink profile and the TEP address assignment method, and map the profile's uplink to that of the Distributed switch.

Create the profile with the ADD button.

Move to System -> Fabric -> Host Transport Nodes.

Under Managed By select the Compute Manager with the vSphere cluster to be prepared.

Select the cluster and CONFIGURE NSX.

Select the Transport Node profile you have just created and give APPLY.

Start the installation and preparation of the cluster nodes.

Wait for the configuration process to finish successfully and for the nodes to be in the UP state.

Our basic installation of NSX-T can finally be considered complete 🙂

From here we can start configuring the VM segments, dynamic routing with the outside world, and all the other security aspects!
