X-Git-Url: https://gerrit.akraino.org/r/gitweb?a=blobdiff_plain;f=README.md;h=3f76e11dccc3d385b6253e1faecdf7fe4a5402a4;hb=308b436e60c4f9477641a196fe5a53996fd9bc92;hp=523b49d54e7a53aa2b7dc7f7765ac3e946ae699c;hpb=faf6493f117e1745a6875ecc67f2002d25db8ef7;p=icn.git

diff --git a/README.md b/README.md
index 523b49d..3f76e11 100644
--- a/README.md
+++ b/README.md
@@ -1,31 +1,561 @@
-# Integrated Cloud Native
+# Introduction
+ICN strives to automate the process of installing the local cluster
+controller to the greatest degree possible – "zero touch
+installation". Most of the work is done simply by booting up the jump
+server (Local Controller). Once booted, the controller is fully
+provisioned and begins to inspect and provision the bare metal
+servers, until the cluster is entirely configured. This document shows
+step-by-step how to configure the network and deployment architecture
+for the ICN blueprint.

-Work in progress
+# License
+Apache license v2.0

-For more information refer - https://wiki.akraino.org/pages/viewpage.action?pageId=11995140
+# Deployment Architecture
+The Local Controller is provisioned with the Metal3 Baremetal Operator
+and Ironic, which enable provisioning of bare metal servers. The
+controller has three network connections to the bare metal servers:
+network A connects bare metal servers, network B is a private network
+used for provisioning the bare metal servers and network C is the IPMI
+network, used for control during provisioning. In addition, the
+bare metal servers connect to the network D, the SRIOV network.

-## Build ISO
+![Figure 1](figure-1.png)*Figure 1: Deployment Architecture*

-We should run the following commands on a fresh ubuntu 18.04 of kernel version 4.15.0-45-generic
+- Net A -- Bare metal network, lab networking for ssh. It is used as
+  the control plane for k8s, used by OVN and Flannel for the overlay
+  networking.
+- Net B (internal network) -- Provisioning network used by Ironic to
+  do inspection.
+- Net C (internal network) -- IPMI LAN carrying the IPMI protocol used
+  for OS provisioning. The NICs support IPMI. The IP address should be
+  statically assigned via the IPMI tool or other means.
+- Net D (internal network) -- Data plane network for the Akraino
+  application. It uses SR-IOV networking and fiber cables, with Intel
+  25GB and 40GB FLV NICs.

-```bash
-cd ~
-mkdir -p workspace && cd workspace
-git clone "https://gerrit.akraino.org/r/icn" # may need to switch the branch based on your case
-sudo icn/tools/setup_build_machine.sh
-sudo icn/tools/collect.sh
-sudo icn/tools/create_usb_bootable.sh
+In some deployment models, you can combine Net C and Net A into the
+same network, but the developer must then manage IP address allocation
+between Net A and the IPMI addresses of the servers.
+
+Also note that the IPMI NIC may share the same RJ-45 jack with another
+one of the NICs.
+
+# Pre-installation Requirements
+There are two main components in ICN Infra: the Local Controller and
+the k8s compute cluster.
+
+### Local Controller
+The Local Controller will reside in the jump server to run the Metal3
+operator, Binary Provisioning Agent (BPA) operator and BPA REST API
+controller.
+
+### k8s Compute Cluster
+The k8s compute cluster will actually run the workloads and is
+installed on bare metal servers.
+
+## Hardware Requirements
+
+### Minimum Hardware Requirement
+All-in-one VM based deployment requires servers with at least 32 GB
+RAM and 32 CPUs.
+
+### Recommended Hardware Requirements
+Recommended hardware requirements are servers with 64GB Memory, 32
+CPUs and SRIOV network cards.
+
+## Software Prerequisites
+The jump server is required to be pre-installed with Ubuntu 18.04.
+
+## Database Prerequisites
+No prerequisites for ICN blueprint.
+
+## Other Installation Requirements
+
+### Jump Server Requirements
+
+#### Jump Server Hardware Requirements
+- Local Controller: at least three network interfaces.
+- Bare metal servers: four network interfaces, including one IPMI interface.
+- Four or more hubs, with cabling, to connect four networks.
+
+(Tested as below)
+Hostname | CPU Model | Memory | Storage | 1GbE: NIC#, VLAN (connected to Extreme 480 switch) | 10GbE: NIC#, VLAN, Network (connected to IZ1 switch)
+---------|-----------|--------|---------|--------------------------------------------------|------------------------------------------------------
+jump0 | Intel 2xE5-2699 | 64GB | 3TB (Sata)<br/>180 (SSD) | eth0: VLAN 110<br/>eno1: VLAN 110<br/>eno2: VLAN 111 |
+
+#### Jump Server Software Requirements
+ICN supports Ubuntu 18.04. The ICN blueprint installs all required
+software during `make install`.
+
+### Network Requirements
+Please refer to Figure 1 for all the network requirements of the ICN
+blueprint.
+
+Please make sure you have three distinct networks - Net A, Net B and
+Net C as mentioned in Figure 1. The Local Controller uses Net B and
+Net C to provision the OS on the bare metal servers.
+
+### Bare Metal Server Requirements
+
+### k8s Compute Cluster
+
+#### Compute Server Hardware Requirements
+(Tested as below)
+Hostname | CPU Model | Memory | Storage | 1GbE: NIC#, VLAN (connected to Extreme 480 switch) | 10GbE: NIC#, VLAN, Network (connected to IZ1 switch)
+---------|-----------|--------|---------|--------------------------------------------------|------------------------------------------------------
+node1 | Intel 2xE5-2699 | 64GB | 3TB (Sata)<br/>180 (SSD) | eth0: VLAN 110<br/>eno1: VLAN 110<br/>eno2: VLAN 111 | eno3: VLAN 113
+node2 | Intel 2xE5-2699 | 64GB | 3TB (Sata)<br/>180 (SSD) | eth0: VLAN 110<br/>eno1: VLAN 110<br/>eno2: VLAN 111 | eno3: VLAN 113
+node3 | Intel 2xE5-2699 | 64GB | 3TB (Sata)<br/>180 (SSD) | eth0: VLAN 110<br/>eno1: VLAN 110<br/>eno2: VLAN 111 | eno3: VLAN 113
+
+#### Compute Server Software Requirements
+The Local Controller will install all the software in compute servers
+from the OS to the software required to bring up the k8s cluster.
+
+### Execution Requirements (Bare Metal Only)
+The ICN blueprint checks all the precondition and execution
+requirements for bare metal.
+
+# Installation High-Level Overview
+Installation is a two-step process and everything starts with one
+command `make install`.
+- Installation of the Local Controller.
+- Installation of compute cluster.
+
+## Bare Metal Deployment Guide
+
+### Install Bare Metal Jump Server
+
+#### Creating a Node Inventory File
+
+##### Preconfiguration for the Local Controller in Jump Server
+The user is required to provide the IPMI information of the servers
+they connect to the Local Controller by editing the sample node JSON
+file icn/deploy/metal3/scripts/nodes.json.sample as shown
+below. This example only shows 2 servers, statically configured on the
+baremetal network. To add more servers, append another entry to the
+"nodes" array. If the baremetal network provides a DHCP server with gateway
+and DNS server information, just change the baremetal type to "ipv4_dhcp".
+ICN provides DHCP servers for the provisioning network.
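Each link in the inventory is identified by its MAC address, so a copied node entry with an unchanged address is an easy mistake when adding servers. The following is a minimal sketch of a pre-flight check and is not part of ICN: `duplicate_macs` is a hypothetical helper, and the `nodes.json` path in the usage comment is an assumption about where the edited inventory is saved.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of ICN): print every MAC address
# that appears more than once in an inventory file, one per line.
duplicate_macs() {
    # Extract MAC-shaped tokens, normalize hex case, and keep only repeats.
    grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' "$1" \
        | tr 'A-F' 'a-f' \
        | sort \
        | uniq -d
}

# Example usage (the nodes.json path is an assumption):
#   dups=$(duplicate_macs icn/deploy/metal3/scripts/nodes.json)
#   [ -z "$dups" ] || { echo "duplicate MACs: $dups" >&2; exit 1; }
```

The check is purely textual, so it works without a JSON parser, but it only catches repeated addresses; it does not confirm that an address belongs to the intended NIC.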
+
+`nodes.json.sample`
+``` json
+{
+  "nodes": [{
+    "name": "node1",
+    "ipmi_driver_info": {
+      "username": "admin",
+      "password": "admin",
+      "address": "10.10.10.11"
+    },
+    "os": {
+      "image_name": "focal-server-cloudimg-amd64.img",
+      "username": "ubuntu",
+      "password": "mypasswd"
+    },
+    "net": {
+      "links": [
+        {
+          "id": "baremetal_nic",
+          "ethernet_mac_address": "00:1e:67:fe:f4:19",
+          "type": "phy"
+        },
+        {
+          "id": "provisioning_nic",
+          "ethernet_mac_address": "00:1e:67:fe:f4:1a",
+          "type": "phy"
+        },
+        {
+          "id": "sriov_nic",
+          "ethernet_mac_address": "00:1e:67:f8:6a:41",
+          "type": "phy"
+        }
+      ],
+      "networks": [
+        {
+          "id": "baremetal",
+          "link": "baremetal_nic",
+          "type": "ipv4",
+          "ip_address": "10.10.110.21/24",
+          "gateway": "10.10.110.1",
+          "dns_nameservers": ["8.8.8.8"]
+        },
+        {
+          "id": "provisioning",
+          "link": "provisioning_nic",
+          "type": "ipv4_dhcp"
+        },
+        {
+          "id": "sriov",
+          "link": "sriov_nic",
+          "type": "ipv4",
+          "ip_address": "10.10.113.2/24"
+        }
+      ],
+      "services": []
+    }
+  },
+  {
+    "name": "node2",
+    "ipmi_driver_info": {
+      "username": "admin",
+      "password": "admin",
+      "address": "10.10.10.12"
+    },
+    "os": {
+      "image_name": "focal-server-cloudimg-amd64.img",
+      "username": "ubuntu",
+      "password": "mypasswd"
+    },
+    "net": {
+      "links": [
+        {
+          "id": "baremetal_nic",
+          "ethernet_mac_address": "00:1e:67:f1:5b:90",
+          "type": "phy"
+        },
+        {
+          "id": "provisioning_nic",
+          "ethernet_mac_address": "00:1e:67:f1:5b:91",
+          "type": "phy"
+        },
+        {
+          "id": "sriov_nic",
+          "ethernet_mac_address": "00:1e:67:f8:69:81",
+          "type": "phy"
+        }
+      ],
+      "networks": [
+        {
+          "id": "baremetal",
+          "link": "baremetal_nic",
+          "type": "ipv4",
+          "ip_address": "10.10.110.22/24",
+          "gateway": "10.10.110.1",
+          "dns_nameservers": ["8.8.8.8"]
+        },
+        {
+          "id": "provisioning",
+          "link": "provisioning_nic",
+          "type": "ipv4_dhcp"
+        },
+        {
+          "id": "sriov",
+          "link": "sriov_nic",
+          "type": "ipv4",
+          "ip_address": "10.10.113.3/24"
+        }
+      ],
+      "services": []
+    }
+  }]
+}
 ```

-The script builds an ISO based on the official ubuntu-18.04-server.iso. The generated ISO
-is located at `workspace/icn-ubuntu-18.04.iso`.
-All files under `icn` directory are copied into the ISO. During the installation of the ISO,
-these files are copied to infra-local-controller under `/opt/icn`.
+##### Local Controller Metal3 Configuration Reference
+- *nodes*: The array of nodes to add to the Local Controller.
+- *name*: This will be the hostname for the machine, once it is
+  provisioned by Metal3.
+- *ipmi_driver_info*: IPMI driver info is a JSON field. It currently
+  holds the IPMI information required for Ironic to send the IPMI tool
+  command.
+  - *username*: BMC username required to be provided for Ironic.
+  - *password*: BMC password required to be provided for Ironic.
+  - *address*: BMC server IPMI LAN IP address.
+- *os*: Bare metal machine OS information is a JSON field. It
+  currently holds the image name to be provisioned, username and
+  password for the login.
+  - *image_name*: Name of the image to provision. The image should be
+    in qcow2 format.
+  - *username*: Login username for the OS provisioned.
+  - *password*: Login password for the OS provisioned.
+- *net*: Bare metal network information is a JSON field. It describes
+  the interfaces and networks used by ICN. For more information,
+  refer to the *networkData* field of the BareMetalHost resource
+  definition.
+  - *links*: An array of interfaces.
+    - *id*: The ID of the interface. This is used in the network
+      definitions to associate the interface with its network
+      configuration.
+    - *ethernet_mac_address*: The MAC address of the interface.
+    - *type*: The type of interface. The valid value is "phy".
+  - *networks*: An array of networks.
+    - *id*: The ID of the network.
+    - *link*: The ID of the link this network definition applies to.
+    - *type*: The type of network, either dynamic ("ipv4_dhcp") or
+      static ("ipv4").
+    - *ip_address*: Only valid for type "ipv4"; the IP address of the
+      interface.
+    - *gateway*: Only valid for type "ipv4"; the gateway of this
+      network.
+    - *dns_nameservers*: Only valid for type "ipv4"; an array of DNS
+      servers.
+
+#### Creating the Settings Files
+
+##### Local Controller Network Configuration Reference
+The network configuration file is named "user_config.sh" and is
+located in the ICN parent directory.
+
+`user_config.sh`
+``` shell
+#!/bin/bash
+
+#Edge Location Provider Network configuration
+#Net A - Provider Network
+#If provider having specific Gateway and DNS server details in the edge location,
+#supply those values in nodes.json.
+
+#Ironic Metal3 settings for provisioning network
+#Interface to which Ironic provision network to be connected
+#Net B - Provisioning Network
+export IRONIC_INTERFACE="eno2"
+
+#Ironic Metal3 setting for IPMI LAN Network
+#Interface to which Ironic IPMI LAN should bind
+#Net C - IPMI LAN Network
+export IRONIC_IPMI_INTERFACE="eno1"
+```
+
+#### Running
+After configuring the node inventory file and network configuration
+files, please run `make install` from the ICN parent directory as
+shown below:
+
+``` shell
+root@pod11-jump:# git clone "https://gerrit.akraino.org/r/icn"
+Cloning into 'icn'...
+remote: Counting objects: 69, done
+remote: Finding sources: 100% (69/69)
+remote: Total 4248 (delta 13), reused 4221 (delta 13)
+Receiving objects: 100% (4248/4248), 7.74 MiB | 21.84 MiB/s, done.
+Resolving deltas: 100% (1078/1078), done.
+root@pod11-jump:# cd icn/
+root@pod11-jump:# vim Makefile
+root@pod11-jump:# make install
+```
+
+The following steps occur once the `make install` command is given.
+1. All the software required to run the bootstrap cluster is
+   downloaded and installed.
+2. A k8s cluster is installed to maintain the bootstrap cluster and
+   all the servers in the edge location.
+3. Metal3-specific network configuration, such as local DHCP server
+   networking for each edge location and Ironic networking for both
+   the provisioning network and the IPMI LAN network, is identified
+   and created.
+4. Metal3 is launched with IPMI configuration as configured in
+   "user_config.sh" and provisions the bare metal servers using IPMI
+   LAN network. For more information refer to the [Debugging
+   Failures](#debugging-failures) section.
+5. Metal3 launch verification runs with a timeout of 60 mins,
+   checking whether all the servers have been provisioned.
+   1. All servers are provisioned in parallel. For example, if your
+      deployment has 10 servers in the edge location, all 10
+      servers are provisioned at the same time.
+   2. Metal3 launch verification checks that all the servers are
+      provisioned and that the network interfaces are up and
+      provisioned with a provider network gateway and DNS server.
+   3. Metal3 launch verification checks the status of all servers
+      given in user_config.sh to make sure all the servers are
+      provisioned. For example, if 8 servers are provisioned and 2
+      servers are not provisioned, launch verification makes sure all
+      servers are provisioned before launching k8s clusters on those
+      servers.
+6. BPA bare metal components are invoked with the MAC address of the
+   servers provisioned by Metal3. They decide the cluster size and
+   also the number of clusters required in the edge location.
+7. BPA bare metal runs the containerized Kubernetes Reference
+   Deployment (KUD) as a job for each cluster. KUD installs the k8s
+   cluster on the slice of servers and installs ONAP4K8S and all other
+   default plugins such as Multus, OVN, OVN4NFV, NFD, Virtlet and
+   SRIOV.
+8. The BPA REST API agent is installed in the bootstrap cluster or
+   jump server; this installs rest-api, rook/ceph, and MinIO as the
+   cloud storage. This provides a way for users to upload their own
+   software, container images or OS image to the jump server.
+
+## Virtual Deployment Guide
+
+### Standard Deployment Overview
+![Figure 2](figure-2.png)*Figure 2: Virtual Deployment Architecture*
+
+Virtual deployment is used for the development environment using
+Vagrant to create VMs with PXE boot. No setting is required from the
+user to deploy the virtual deployment.
+
+### Snapshot Deployment Overview
+No snapshot is implemented in ICN R2.
+
+### Special Requirements for Virtual Deployment
+
+#### Install Jump Server
+Jump server is required to be installed with Ubuntu 18.04. This will
+install all the VMs and install the k8s clusters.
+
+#### Verifying the Setup - VMs
+To verify the virtual deployment, execute the following commands:
+``` shell
+$ vagrant up --no-parallel
+$ vagrant ssh jump
+vagrant@jump:~$ sudo su
+root@jump:/home/vagrant# cd /icn
+root@jump:/icn# make verifier
+```
+`vagrant up --no-parallel` creates three VMs: vm-jump, vm-machine-1,
+and vm-machine-2, each with 16GB RAM and 8 vCPUs. `make verifier`
+installs the ICN BPA operator and the ICN BPA REST API verifier into
+vm-jump, and then installs a k8s cluster on the vm-machine VMs using
+the ICN BPA operator. The BPA operator installs the multi-cluster KUD
+to bring up k8s with all addons and plugins.
+
+# Verifying the Setup
+The ICN blueprint checks the setup in both bare metal and VM
+deployments. The verify script checks that Metal3 provisioned the OS on
+each bare metal server, polling with a timeout period of 60 sec
+and an interval of 30. The BPA operator verifier checks whether the KUD
+installation is complete by issuing a plain curl command to the k8s
+cluster installed in the bare metal and VM setups.
+
+**Bare Metal Verifier**: Run `make bm_verifer` to verify the
+bare metal deployment.
+
+**Verifier**: Run `make vm_verifier` to verify the virtual
+deployment.
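The timeout-and-interval polling described above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the ICN verifier itself: `wait_until` and `all_hosts_provisioned` are hypothetical names, and the check assumes `kubectl` access to a cluster whose Metal3 BareMetalHost resources report `status.provisioning.state`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the ICN verifier): run a check command repeatedly
# until it succeeds or the timeout would be exceeded.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        # Give up once another sleep would carry us past the deadline.
        if (( SECONDS + interval > deadline )); then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Assumed check: every BareMetalHost reports the "provisioned" state.
all_hosts_provisioned() {
    local states state
    states=$(kubectl get baremetalhosts --all-namespaces \
        -o jsonpath='{.items[*].status.provisioning.state}') || return 1
    # No hosts at all counts as "not yet provisioned".
    [ -n "$states" ] || return 1
    for state in $states; do
        [ "$state" = "provisioned" ] || return 1
    done
    return 0
}

# Example: wait_until 60 30 all_hosts_provisioned
```

With a timeout of 60 and an interval of 30, the check runs at most three times (at roughly 0, 30 and 60 seconds) before the helper reports failure.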
+
+# Developer Guide and Troubleshooting
+For development, use the virtual deployment; it takes up to 10 mins to
+bring up the virtual BMC VMs with PXE boot. Virtual deployment works
+well for BPA operator development and for the Metal3 installation scripts.
+
+## Utilization of Images
+No images provided in this ICN release.
+
+## Post-deployment Configuration
+No post-deployment configuration required in this ICN release.
+
+## Debugging Failures
+* For a first-time installation, enable the KVM console on the trial or
+  lab servers using a Raritan console, or use the Intel web BMC console.
+
+  ![Figure 3](figure-3.png)
+* The deprovision state results in the Ironic agent sleeping before the
+  next heartbeat - it is not an error. It leaves the bare metal server
+  without an OS, running the ramdisk.
+* Deprovisioning in Metal3 is not straightforward - Metal3 moves through
+  several stages: provisioned, deprovisioning and ready. The ICN
+  blueprint takes care of navigating the deprovisioning states and
+  removing the BareMetalHost (BMH) custom resource in case of cleaning.
+* Manual cleaning or forced cleaning of a BMH resource results
+  in a hang state - use `make bmh_clean` to remove the BMH state.
+* Use the Ironic logs and the openstack baremetal command to see the
+  state of the server.
+* The baremetal operator logs report failures related to images or
+  image md5sum errors.
+* It is not possible to change the state from provision to deprovision
+  or deprovision to provision without completing that state. All the
+  issues are handled in ICN scripts.
+* k8s cluster failures can be debugged using the KUD Pod logs.
+
+## Reporting a Bug
+A Linux Foundation ID is required to file a bug against ICN:
+https://jira.akraino.org/projects/ICN/issues
+
+# Uninstall Guide
+
+## Bare Metal deployment
+The command `make clean_all` uninstalls all the components installed by
+`make install`
+* It deprovisions all the provisioned servers and removes them from
+  the Ironic database.
+* The baremetal operator is deleted, followed by the Ironic database
+  and container.
+* Network configuration such as the internal DHCP server, provisioning
+  interfaces and IPMI LAN interfaces is deleted.
+* docker images built during the `make install` are deleted, such as
+  all Ironic, baremetal operator, BPA operator and KUD images.
+* KUD will reset the bootstrap cluster - k8s cluster is torn down in
+  the jump server and all the associated docker images are removed.
+* All software packages installed by `make install_all` are removed,
+  such as Ironic, openstack utility tool, docker packages and basic
+  prerequisite packages.
+
+## Virtual deployment
+The command `make vm_clean_all` uninstalls all the components for the
+virtual deployments.
+
+# Troubleshooting
+
+## Error Message Guide
+The error messages are explicit; all messages are captured in the log
+directory.
+
+# Maintenance
+
+## Blueprint Package Maintenance
+No packages are maintained in ICN.
+
+## Software maintenance
+Not applicable.
+
+## Hardware maintenance
+Not applicable.
+
+## BluePrint Deployment Maintenance
+Not applicable.
+
+# Frequently Asked Questions
+**How to setup IPMI?**
+
+First, make sure the IPMI tool is installed on your servers; if not,
+install it using `apt install ipmitool`. Then, check the
+IPMI information of each server using the command `ipmitool lan
+print 1`. If the above command doesn't show the IPMI information, then
+set up the IPMI static IP address using the following instructions:
+- The easiest way to set up the IPMI topology in your lab is to use
+  the IPMI tool.
+- Using IPMI tool -
+  https://www.thomas-krenn.com/en/wiki/Configuring_IPMI_under_Linux_using_ipmitool
+- IPMI can also be configured in the BIOS settings.
+
+**BMC web console URL is not working?**
+
+It is hard to pinpoint the cause. If the URL is not available, check
+the output of `ipmitool bmc info` to find the issue.
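When the web console does not respond, a useful first step is confirming which address the BMC actually holds. The sketch below is not part of ICN: `bmc_ip` is a hypothetical helper, and it assumes the usual `Key : Value` layout of `ipmitool lan print` output with an `IP Address` row.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ICN): read `ipmitool lan print` output on
# stdin and print the value of the first "IP Address" row.
bmc_ip() {
    # Split each row on the colon and its surrounding spaces, so the
    # "IP Address Source" row is not mistaken for the address itself.
    awk -F' *: *' '$1 == "IP Address" && !seen++ { print $2 }'
}

# Example usage on a server (channel 1, as in the IPMI question above):
#   ip=$(sudo ipmitool lan print 1 | bmc_ip)
#   ping -c 3 "$ip"    # is the BMC reachable at all?
```

If the address printed here differs from the `address` recorded for the node in the inventory file, the console URL and the Ironic IPMI commands are pointed at the wrong host.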
+
+**No change in BMH state - provisioning state lasts more than 40 mins?**
+
+Generally, Metal3 provisioning of bare metal takes 20 - 30 mins. Look at
+the Ironic logs and the baremetal operator logs to check the state of
+the servers. The openstack baremetal node commands show the full state
+of the server, from power to storage.
+
+**Why is the provider network (baremetal network configuration) required?**
+
+Generally, provider network DHCP servers in a lab provide the router
+and DNS server details. In some labs, there is no DHCP server or the
+DHCP server does not provide this information.
+
+# License
+
+```
+/*
+* Copyright 2019 Intel Corporation, Inc
+*
+* Licensed under the Apache License, Version 2.0 (the "License");
+* you may not use this file except in compliance with the License.
+* You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+```

-## How to use the ISO to bootstrap a infra-local-controller
+# References

-1. We burn the ISO onto an USB strick.
-2. We plug this USB into a server and press the power-on button. (choose boot from the USB strick)
-3. The ubuntu 18.04 is supposed to be installed on the server, then it reboots automatically.
-4. Now we can login the server with the default user/password of icn/icn
-5. We can do anything we need here to install/configure/launch services.
+# Definitions, acronyms and abbreviations