X-Git-Url: https://gerrit.akraino.org/r/gitweb?a=blobdiff_plain;f=doc%2Ftroubleshooting.md;h=c440e8ca512516d668d4523b7519a8deca172ad5;hb=HEAD;hp=6dc0011d4c694cbc0a2f3f1eb59af587b873b12f;hpb=641f56a07791b0a3eabd23c0a0696b7aa0cb675c;p=icn.git

diff --git a/doc/troubleshooting.md b/doc/troubleshooting.md
index 6dc0011..c440e8c 100644
--- a/doc/troubleshooting.md
+++ b/doc/troubleshooting.md
@@ -27,6 +27,10 @@ Examining the BareMetalHost resource of the failing machine and the
 logs of Bare Metal Operator and Ironic Pods may also provide a
 description of why the provisioning is failing.
 
+A description of the BareMetalHost states can be found in the [Bare
+Metal Operator
+documentation](https://github.com/metal3-io/baremetal-operator/blob/main/docs/baremetalhost-states.md).
+
 ### openstack baremetal
 
 In rare cases, the Ironic and Bare Metal Operator information may get
@@ -115,3 +119,32 @@ this, Flux will complete reconciliation successfully.
 Provisioning can take a fair amount of time; refer to [Monitoring
 progress](installation-guide.md#monitoring-progress) to see where the
 process is.
+
+A description of the BareMetalHost states can be found in the [Bare
+Metal Operator
+documentation](https://github.com/metal3-io/baremetal-operator/blob/main/docs/baremetalhost-states.md).
+
+## BareMetalHost never transitions from Available to Provisioned
+
+If the BareMetalHost has an owner but is not transitioning from
+Available to Provisioned, it is possible that the chart values are
+misconfigured. Examine the capm3-controller-manager logs for error
+messages:
+
+    # kubectl -n capm3-system logs capm3-controller-manager-7db896996c-7dls7 | grep ^E
+    ...
+    E0512 18:00:24.781426 1 controller.go:304] controller/metal3data "msg"="Reconciler error" "error"="Failed to create secrets: Nic name not found ens5" "name"="icn-nodepool-0" "namespace"="metal3" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="Metal3Data"
+
+In the above instance, the NIC name in the chart values (`ens5`) was
+incorrect; setting the correct name resolved the issue.
+
+## Vagrant destroy fails with `cannot undefine domain with nvram`
+
+The fix is to destroy and undefine each machine individually. For the
+default ICN virtual machine deployment:
+
+    vagrant destroy -f jump
+    virsh -c qemu:///system destroy vm-machine-1
+    virsh -c qemu:///system undefine --nvram --remove-all-storage vm-machine-1
+    virsh -c qemu:///system destroy vm-machine-2
+    virsh -c qemu:///system undefine --nvram --remove-all-storage vm-machine-2
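+
+For deployments with more or differently named machines, the same
+pair of `virsh` commands can be applied to each remaining libvirt
+domain in turn. A sketch, assuming the domains follow the default
+`vm-machine-N` naming used above:
+
+    # Hypothetical loop over the default domain names; adjust the
+    # list to match the machines defined in your deployment.
+    for vm in vm-machine-1 vm-machine-2; do
+        virsh -c qemu:///system destroy "$vm"
+        virsh -c qemu:///system undefine --nvram --remove-all-storage "$vm"
+    done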