Monday, April 30, 2018

[ansible] ansible-vault command reference






Common commands

ansible-vault is Ansible's built-in mechanism for encrypting and decrypting files; at the moment it can only operate on whole files.

  • Encrypt a file: ansible-vault encrypt group_vars/all
  • Decrypt a file: ansible-vault decrypt group_vars/all
  • View a file: ansible-vault view group_vars/all (similar to less group_vars/all)
  • Edit a file: ansible-vault edit group_vars/all (similar to vim group_vars/all)
  • Change the password: ansible-vault rekey group_vars/all
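An encrypted vars file is decrypted transparently at playbook run time, so nothing else in the playbook has to change. A hypothetical sketch (site.yml and db_password are made-up names for illustration):

```yaml
# site.yml (hypothetical): group_vars/all was encrypted with ansible-vault,
# but can still be referenced like any plain vars file
- hosts: all
  vars_files:
    - group_vars/all        # vault-encrypted
  tasks:
    - debug:
        var: db_password    # assumed to be defined inside group_vars/all
```

Run it with ansible-playbook site.yml --ask-vault-pass, or point --vault-password-file at a file that holds the vault password.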

Friday, April 20, 2018

[OpenStack][Neutron] Name resolution for Nova instances

The IP configuration a Nova instance receives over DHCP is handled by neutron-dhcp-agent, which is implemented internally with dnsmasq. Domain name resolution for Nova instances can therefore be set up in the following cases:


[Case 1: use a designated nameserver on a specific subnet]

  • Pass it when creating the subnet:
    neutron subnet-create --dns-nameserver DNS_RESOLVER
  • Or update an existing subnet:
    neutron subnet-update --dns-nameserver DNS_RESOLVER SUBNET_ID_OR_NAME


[Case 2: all subnets share the same nameservers]
  • Edit dhcp_agent.ini:
[DEFAULT]
dnsmasq_dns_servers = DNS_RESOLVER

[Case 3: all subnets use the host's DNS settings]

  • Edit dhcp_agent.ini:

[DEFAULT]
dnsmasq_local_resolv = True


https://docs.openstack.org/newton/networking-guide/config-dns-res.html

Wednesday, April 18, 2018

[Docker][Ubuntu] Running the Docker engine through an HTTP proxy on Ubuntu

[Ubuntu 16.04]
Services are managed by systemd in this release, so the proxy has to be configured as follows:
  1. sudo mkdir -p /etc/systemd/system/docker.service.d
  2. sudo vim /etc/systemd/system/docker.service.d/http-proxy.conf
  3. Add your proxy settings to the file:
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80/" "NO_PROXY=localhost,127.0.0.1,docker-registry.somecorporation.com"
  4. sudo systemctl daemon-reload
  5. sudo systemctl restart docker

[Ubuntu 14.04]
  1. sudo vim /etc/default/docker # edit your docker runtime context file
DOCKER_OPTS="--log-opt max-size=500m --insecure-registry {docker-registry} --dns 8.8.8.8" 
export http_proxy="http://{http-proxy}:8080"
export https_proxy="https://{http-proxy}:8080"
export no_proxy="localhost,127.0.0.1"
  2. sudo service docker restart




[OpenStack][GPU] GPU support in OpenStack

Current approaches to GPU support in OpenStack:

  • PCI passthrough
    – Nova VM-based compute (e.g. Libvirt+KVM) with PCI passthrough
  • Ironic
    – provision GPU-equipped compute nodes through Ironic
  • CPU pinning and NUMA
  • vGPU
  • RDMA (Remote Direct Memory Access)
    – a technique that lets the NIC on one host directly access memory on another host

vGPU

If the CPU supports Intel GVT-g, fully virtualized vGPUs are available. Performance is slightly below PCI passthrough, but a single GPU can be shared among multiple VMs (up to 15).



[How to enable vGPU]
Nova
1. Enable the vGPU type on nova-compute:
[devices]
enabled_vgpu_types = nvidia-35
2. On the Nova controller, configure a flavor to request one virtual GPU:
$ openstack flavor set vgpu_1 --property "resources:VGPU=1"

RDMA(Remote Direct Memory Access) 



GPU on K8S

  • nvidia-docker is supported starting with K8S 1.10
  • a pod consumes an entire GPU card; the card cannot be subdivided



[References]

[OpenStack][kolla] Kolla highlights in the Queens/Rocky releases

A quick summary of Kolla's highlights in Queens/Rocky:

Lighter-weight image builds

  • Starting in Queens, Docker image builds support squashing layers, merging multiple image layers into one
  • Docker multi-stage builds are expected to land in Rocky
  • Together these techniques let builds lean more on layering and reuse, trimming image build and transfer times.

Noticeably better Ceph support

Rolling update

  • Some services now support upgrades with minimal downtime
  • Keystone and Cinder were finished in Queens; the remaining services are being worked on for Rocky

Developer mode

  • Queens officially supports a "developer mode" deployment (pieces of these features have existed since Pike)
  • Setting *_dev_mode=true bind-mounts each project's source tree directly into the corresponding container path,
    so developers can edit the code in place.

Healthcheck and monitoring support

  • Kolla has experimented with several monitoring solutions before, without great results
  • As Prometheus matures, the Kolla community plans to provide Prometheus monitoring in Rocky.
    The current design heads toward Prometheus + Alertmanager + Gnocchi:
    Prometheus collects the metrics, Alertmanager handles alerting, and Gnocchi stores the monitoring data.

Database backup jobs

  • planned for the Queens/Rocky timeframe

Vitrage support

  • Vitrage deployment was added in Queens.
  • Vitrage is OpenStack's RCA (Root-Cause Analysis) project: it takes in OpenStack alarms, events, and so on,
    analyzes them centrally, and presents reports on a dashboard to make operating OpenStack easier.

Blazar support

  • Queens adds Blazar, a resource-reservation service: users can request a reservation of resources for a period of time,
    for later use.

Further details:

[OVS] OVS debugging skills




[OVS Debugging Skills]
  1. Check the basic connectivity state first: sudo ovs-vsctl show
  2. Use tcpdump or a similar tool to confirm the packets are actually being sent
  3. Use ovs-ofctl dump-flows {bridge} or ovs-dpctl dump-flows to check whether the packets hit any flows


Tuesday, April 17, 2018

[Linux] How to kill the session occupying a port


The command to kill the process occupying a port

lsof

sudo lsof -t -i {protocol}:{port} | xargs sudo kill -9  # {protocol} may be omitted; -t prints bare PIDs so xargs/kill receive clean input

For example: sudo lsof -t -i tcp:8080
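A minimal end-to-end sketch (assumes python3, lsof, and GNU xargs are available; port 8123 is an arbitrary choice): start a throwaway listener, then kill whatever holds the port.

```shell
# Start a dummy listener, then kill the PID(s) lsof reports for the port.
if command -v lsof >/dev/null 2>&1; then
  python3 -m http.server 8123 >/dev/null 2>&1 &
  sleep 1
  lsof -t -i tcp:8123 | xargs -r kill -9   # -r: do nothing when no PID is found (GNU xargs)
  echo "port 8123 cleared"
else
  echo "lsof not installed; skipping demo"
fi
```

Without -t, lsof prints full table rows and kill would receive garbage instead of PIDs.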

Monday, April 16, 2018

[ManageIQ] A quick look at ManageIQ

[Overview]
ManageIQ, maintained by Red Hat, is a graphical management tool that can integrate OpenStack, VMware, oVirt, and K8S

[Installation]

Docker

1. pull manageIQ container
     $ sudo docker pull manageiq/manageiq:gaprindashvili-2

2. start service
     $ sudo docker run --privileged -d -p 8443:443 manageiq/manageiq:gaprindashvili-2

3. Login service
     default login:  admin/smartvm


[Querying the DB]
ManageIQ uses PostgreSQL; there are two ways to inspect the database:

  1. The ops/explorer page in the GUI has a Database tab
  2. Through the container:
    • sudo docker exec -it manageiq bash # enter the manageiq container
    • su -l postgres # switch to the postgres user
    • psql
    • \c vmdb_production # connect to vmdb (the ManageIQ database)
    • \d # list the tables
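Once inside psql, a query along these lines can poke at the inventory (a sketch; the vms table and its columns are assumptions based on ManageIQ's schema):

```sql
-- hypothetical: list the first few VMs ManageIQ has discovered
\c vmdb_production
SELECT id, name, vendor FROM vms LIMIT 10;
```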


Installation demo screenshots



[java][gradle] Common Gradle commands





Common commands

Init Project

  • gradle init
  • gradle init --type java-library

Build Project

  • gradle clean build

Show all supported gradle tasks

  • gradle tasks

Static Code Analysis (YOU MUST DO THIS)

Unit Test

  • gradle test
  • gradle build -x test // skip tests
    the results can be inspected under the build/reports/ folder after the run

Integration Test

  • gradle integrationTest
  • gradle build -x integrationTest // skip integration tests
    the results can be inspected under the build/reports/ folder after the run

Liquibase

  • gradle update // run DB migrations up to the latest changeset
  • gradle validate // checks the changelog for errors
  • gradle dropAll // drops all database objects owned by the user
  • gradle rollback // rolls the database back to the state it was in when the tag was applied
NOTE
  • SQL file contents must use Unix line endings (\n), otherwise Liquibase may fail to read them
  • If Liquibase is bootstrapped through the Spring framework, add the following bean to the Spring configuration (e.g. configs.xml); see src/main/resources/liquibase-springintegration.xml under the root project for a reference:
  • <bean id="liquibase" class="liquibase.integration.spring.SpringLiquibase">
    <property name="dataSource" ref="dataSource" />
    <property name="changeLog" value="classpath:db/migration/changelog-master.yml" />
    <property name="dropFirst" value="true" />
    </bean>
  • Put the SQL files in the src/main/resources/db/migration folder
  • The Liquibase changelog (src/main/resources/db/changelog-master.xml) must declare the following:
    databaseChangeLog:
    - preConditions:
      - runningAs:
          dbms: postgresql
          username: postgres
    - changeSet:
        id: 01_00_c_global_schema_20151005
        author: steed
        changes:
        - sqlFile:
            path: ../migrate_sql/01_00_c_global_schema_20151005.sql
            relativeToChangelogFile: true
        rollback:
        - sqlFile:
            path: ../migrate_sql/01_00_c_global_schema_20151005_rollback.sql
            relativeToChangelogFile: true
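For the record, the update/validate/dropAll/rollback tasks above come from the Liquibase Gradle plugin; a minimal build.gradle activity block might look like this (the plugin version, JDBC URL, and credentials are illustrative assumptions):

```groovy
// hypothetical sketch of wiring the liquibase-gradle plugin
plugins {
    id 'org.liquibase.gradle' version '2.0.4'
}

liquibase {
    activities {
        main {
            changeLogFile 'src/main/resources/db/migration/changelog-master.yml'
            url          'jdbc:postgresql://localhost:5432/mydb'   // assumed target DB
            username     'postgres'
            password     'postgres'
        }
    }
}
```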

Packaging Jar/War

  • gradle jar
  • gradle war

Show Project Layout

  • gradle projects

Show Project dependencies


  • gradle dependencies

[OpenStack][DevStack] A pip version-number pitfall in DevStack


While installing the Pike release of DevStack recently, the stack simply refused to come up. The logs showed pip being upgraded to the freshly released pip 10 from pypi.org, which led me to the following interesting code fragment:

pip_version=$(python -c "import pip; \
print(pip.__version__.strip('.')[0])")

https://github.com/openstack-dev/devstack/blob/stable/pike/inc/python#L336

pip.__version__.strip('.') only strips dots from the ends of the version string, so taking [0] afterwards returns its first character, e.g. '9' from '9.x.y'. For versions below 10 that happens to work, but with pip 10 it yields '1', and the whole installation blows up. Here is my test session:


>>> import pip
>>> pip.__version__
'9.0.3'
>>> pip.__version__.strip('.')
'9.0.3'
>>> pip.__version__.strip('.')[0]
'9'
>>> x='10.0.0'
>>> x[0]
'1'
>>> pip.__version__.split('.')[0]
'9'
>>> x.split('.')[0]
'10'
>>>  
So split('.') is what should be used to extract the major version (strip() and split() are easy to mix up <_._>). I was weighing whether to report a bug, but the community has already fixed it in a newer revision; see:

  • https://github.com/openstack-dev/devstack/commit/f99d1771ba1882dfbb69186212a197edae3ef02c
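The fix boils down to a tiny helper: split on dots and cast to int so the comparison is numeric, not lexical. A minimal sketch (the function name is mine):

```python
def pip_major_version(version):
    """Extract the major version from a pip version string, e.g. '10.0.0' -> 10."""
    return int(version.split('.')[0])

print(pip_major_version('9.0.3'))    # -> 9
print(pip_major_version('10.0.0'))   # -> 10
```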

[OpenStack][Kolla] Building and deploying OpenStack with Kolla

[Kolla overview]
          Kolla started out as part of the TripleO project, but unlike TripleO it deploys OpenStack on Docker containers. Since the Ocata release, Kolla has been split into two parts, kolla and kolla-ansible/kolla-k8s:
  • kolla: builds production-ready images
  • kolla-ansible/kolla-k8s: deploys containerized OpenStack


[Generating kolla-build.conf and configuring the image build]
  1. Install tox: sudo pip install tox
  2. Change into the kolla project root: cd kolla/
  3. Generate kolla-build.conf: tox -e genconfig  # written to etc/ by default
  4. Edit kolla-build.conf:
    • To build from source:
      [heat-base]
      type = git
      location = https://github.com/openstack/heat.git
      reference = stable/pike
    • To build the Dockerfiles through a proxy:
      include_header = ./.header  # path of the custom header file
      include_footer = ./.footer  # path of the custom footer file

      [.header contents]
      ARG http_proxy=http://{proxy}:8080
      ARG https_proxy=https://{https-proxy}:8080
      ARG no_proxy=localhost

      [.footer contents]
      ARG http_proxy=""
      ARG https_proxy=""
      ARG no_proxy=""
     
      [Building the Kolla images]
      1. Option 1: python tools/build.py -b ubuntu {{service}}
        • e.g. sudo -E tools/build.py -b ubuntu glance neutron
      2. Option 2: kolla-build -b ubuntu {{service}}
        • the kolla-build command can be installed with sudo pip install kolla/


      [Kolla inventory]
      • network_interface - While it is not used on its own, this provides the required default for other interfaces below.
      • api_interface - This interface is used for the management network. The management network is the network OpenStack services uses to communicate to each other and the databases. There are known security risks here, so it’s recommended to make this network internal, not accessible from outside. Defaults to network_interface.
      • kolla_external_vip_interface - This interface is public-facing one. It’s used when you want HAProxy public endpoints to be exposed in different network than internal ones. It is mandatory to set this option when kolla_enable_tls_external is set to yes. Defaults to network_interface.
      • storage_interface - This is the interface that is used by virtual machines to communicate to Ceph. This can be heavily utilized so it’s recommended to put this network on 10Gig networking. Defaults to network_interface.
      • cluster_interface - This is another interface used by Ceph. It’s used for data replication. It can be heavily utilized also and if it becomes a bottleneck it can affect data consistency and performance of whole cluster. Defaults to network_interface.
      • tunnel_interface - This interface is used by Neutron for vm-to-vm traffic over tunneled networks (like VxLan). Defaults to network_interface.
      • neutron_external_interface - This interface is required by Neutron. Neutron will put br-ex on it. It will be used for flat networking as well as tagged vlan networks. Has to be set separately.
      • dns_interface - This interface is required by Designate and Bind9. Is used by public facing DNS requests and queries to bind9 and designate mDNS services. Defaults to network_interface.
      • bifrost_network_interface - This interface is required by Bifrost. Is used to provision bare metal cloud hosts, require L2 connectivity with the bare metal cloud hosts in order to provide DHCP leases with PXE boot options. Defaults to network_interface.
      • Note that if an interface is an OVS bridge, the interface name must be written br_ethX instead of br-ethX (note the underscore); you can check what ansible sees with ansible -m setup -i {inventory} all
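To make the interface variables above concrete, a hypothetical globals.yml fragment might look like this (NIC names and the VIP address are made up):

```yaml
# hypothetical globals.yml sketch
network_interface: "eth0"             # default for all interfaces below
api_interface: "{{ network_interface }}"
tunnel_interface: "{{ network_interface }}"
neutron_external_interface: "eth1"    # must be set separately
kolla_internal_vip_address: "10.10.10.254"
```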



        [Kolla deployment]
              Since kolla-ansible is the project most of the community uses today, the deployment commands below are for kolla-ansible.
        1. Check the inventory
        2. Check whether the services' ports are already in use: tools/kolla-ansible prechecks
        3. Deploy: sudo -E tools/kolla-ansible -i {{inventory for the target environment}} --configdir {{config directory}} --passwords {{passwords file for the environment}} deploy
        4. Neutron network settings are adjusted in ansible/group_vars/all inside the kolla-ansible project

        5. A kolla-ansible service role generally deploys through register => config => bootstrap => restart service
          • register: register the endpoints
          • config: generate the configuration
          • bootstrap: migrate the DB
          • restart service
          [Directory structure of each kolla-ansible role]
          • bootstrap: creates the service's database, i.e. runs *-manage db sync
          • bootstrap-service: starts the service container, but in the foreground
          • handlers/main.yml: declares the services to run as background daemons; the main config files and start parameters are defined here
          • kolla ships its own modules, kolla_toolbox and kolla_image, under ansible/library and loads them dynamically
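As an illustration of the handlers/main.yml pattern, a stripped-down handler might look like this (the kolla_docker module is Kolla's own; the service and variable names here are illustrative):

```yaml
# hypothetical handlers/main.yml sketch for a kolla-ansible role
- name: Restart glance-api container
  kolla_docker:
    action: "recreate_or_restart_container"
    name: "glance_api"
    image: "{{ glance_api_image_full }}"
    volumes: "{{ glance_api_default_volumes }}"
```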

          [Actions supported by kolla-ansible]
          • deploy
          • precheck
          • destroy
          • reconfigure
          • pull
          [Redeploying with kolla-ansible]
          • Manually:
            1. sudo docker ps -qa | xargs sudo docker rm -f  # remove all containers
            2. sudo docker volume ls -q | xargs sudo docker volume rm  # wipe all docker volume data
            3. sudo service docker restart  # restart the docker service
          [kolla-ansible database recovery]

          • When MariaDB's Galera complains about broken connections, run tools/kolla-ansible -i {inventory} --configdir {configdir} --passwords {password} mariadb_recovery

          [Evolution of the kolla-ansible playbooks]
          • Starting with the Ocata release, each role's start tasks are handed off to handlers, which deploy.yml triggers via meta: flush_handlers


          [Kolla HA]
          [Kolla and OVS]

          • OVS bridge setup: https://goo.gl/MHGSWG

            - name: Ensuring OVS bridge is properly setup
              command: docker exec openvswitch_db /usr/local/bin/kolla_ensure_openvswitch_configured {{ item.0 }} {{ item.1 }}
              register: status
              changed_when: status.stdout.find('changed') != -1
              when:
                - inventory_hostname in groups["network"]
                  or (inventory_hostname in groups["compute"] and computes_need_external_bridge | bool)
              with_together:
                - "{{ neutron_bridge_name.split(',') }}"
                - "{{ neutron_external_interface.split(',') }}"



          [Kolla community development]


          [Common Kolla problems]
          1. Hostname has to resolve to IP address of api_interface
          TASK [prechecks : fail] ********************************************************************************************************************************************************
          task path: /home/ubuntu/kolla/ansible/roles/prechecks/tasks/port_checks.yml:448
          [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: '{{ hostvars[item['item']]['ansible_' +
          hostvars[item['item']]['api_interface']]['ipv4']['address'] }}' not in '{{ item.stdout }}'

          fatal: [10.144.192.36]: FAILED! => {
              "msg": "The conditional check ''{{ hostvars[item['item']]['ansible_' + hostvars[item['item']]['api_interface']]['ipv4']['address'] }}' not in '{{ item.stdout }}'' failed. The error was: Invalid conditional detected: EOL while scanning string literal (<unknown>, line 1)\n\nThe error appears to have been in '/home/ubuntu/kolla/ansible/roles/prechecks/tasks/port_checks.yml': line 448, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- fail: msg=\"Hostname has to resolve to IP address of api_interface\"\n  ^ here\n"
          [ANS]
                 This commonly shows up in Kolla all-in-one deployments. Edit /etc/hosts on the remote host and make sure the IP bound to the api_interface NIC (e.g. eth0) and the hostname resolve to each other.

          1. Confirm that both the 127.0.0.1 localhost line and a <mgmt ip> <hostname> line exist; if it still fails, try removing any 127.0.1.1 localhost entry
          2. If an ansible error appears, check that the ansible version matches requirements: before Newton, ansible >= 2.1, <= 2.2 is recommended (our lab environment runs 2.1.6). A Kolla reviewer also suggested ansible < 2.4 for the Pike and Queens branches of kolla, which currently has the best coverage
          2. Ansible provisioning error caused by a group missing from the inventory
          TASK [nova : Ensuring config directories exist]
          fatal: [10.144.192.36]: FAILED! => 
          {"msg": "The conditional check 'inventory_hostname in groups[item.value.group]' failed. 
          The error was: error while evaluating conditional 
          (inventory_hostname in groups[item.value.group]): Unable to look up a name or 
          access an attribute in template string 
          ({% if inventory_hostname in groups[item.value.group] %} True {% else %} False {% endif %}).\n
          Make sure your variable name does not contain invalid characters like '-': 
          argument of type 'StrictUndefined' is not iterable\n
          \nThe error appears to have been in '/home/ubuntu/kolla-ansible/ansible/roles/nova/tasks/config.yml': 
          line 13, column 3, but may\n
          be elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:
          \n\n\n- name: Ensuring config directories exist\n  ^ here\n"}
              to retry, use: --limit @/home/ubuntu/kolla-ansible/ansible/site.retry
          [ANS] Check whether any group is missing from the inventory; failing that, consider whether the ansible version is the problem


          [References]

          [OpenStack][Ceilometer] event.sample in the Pike release of ceilometer



          1. The ceilometer collector has been removed and is no longer supported as of Pike. The collector lagged when pushing data to the backend, so for better performance the dispatchers are now used directly.

          2. The ceilometer database (MongoDB) was dropped after Ocata in favor of the Gnocchi backend; the pipeline configuration lives in pipeline.yml.

          [Linux] Safely cleaning up a full /boot partition

          A simple, reasonably safe way to purge old kernels (still, check that it suits your situation first):

          sudo dpkg --list 'linux-image*'| awk '{ if ($1=="ii") print $2}'| grep -v $(uname -r) | while read -r line; do sudo apt-get -y purge $line;done;sudo apt-get autoremove; sudo update-grub

          [Explanation] The command inverts the selection so that every installed kernel package except the one currently running gets purged, then autoremoves the now-unneeded packages and runs update-grub to refresh the boot menu. Make sure you are already booted into the newest kernel before running it.

                  https://gist.github.com/ipbastola/2760cfc28be62a5ee10036851c654600#case-ii-cant-use-apt-ie-boot-is-100-full
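The filtering step can be sanity-checked without touching dpkg; with a fake package list, grep -v $(uname -r) behaves like this (the package names below are made up):

```shell
# Simulate the selection: keep every kernel package except the "running" one
current="4.4.0-119-generic"
printf '%s\n' \
  linux-image-4.4.0-112-generic \
  linux-image-4.4.0-119-generic \
  linux-image-4.4.0-116-generic \
  | grep -v "$current"
# -> prints only the 112 and 116 packages
```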

          [OpenStack][Qinling] An introduction to "Qinling", OpenStack's Function-as-a-Service (serverless) project

          At the OpenStack Summit in Sydney last year (2017) I saw that the community was already working on the currently hot serverless area. It is still at an early stage, but worth playing with.


          [Qinling project overview]

                  The goal of the Qinling project is to become OpenStack's Function-as-a-Service, providing a platform for serverless functions (in the style of AWS Lambda, Google Cloud Functions, …)

                  The project was started by Catalyst IT and was presented at the Sydney summit; see the video links below [1][2]





          [Qinling project features]
          • supports multiple COEs (e.g. K8S / Swarm …)
          • supports different storage backends: local / swift / s3

          [Qinling installation]

          DevStack

          [[local|localrc]]
          RECLONE=True
          enable_plugin qinling https://github.com/openstack/qinling
          
          LIBS_FROM_GIT=python-qinlingclient
          DATABASE_PASSWORD=password
          ADMIN_PASSWORD=password
          SERVICE_PASSWORD=password
          SERVICE_TOKEN=password
          RABBIT_PASSWORD=password
          LOGFILE=$DEST/logs/stack.sh.log
          LOG_COLOR=False
          LOGDAYS=1
          
          ENABLED_SERVICES=rabbit,mysql,key,tempest



          [Qinling references]
          [1] Make your application Serverless: https://www.youtube.com/watch?v=NmCmOfRBlIU

          [2] (demo) Qinling - Function as a Service in OpenStack: https://www.youtube.com/watch?v=K2SiMZllN_A
          [3] Qinling source: https://github.com/openstack/qinling
          [4] http://qinling.readthedocs.io/en/latest/




          [OpenStack][Ironic] Ironic highlights in the Queens release

          1. Ironic rescue mode
          Starting with Queens, Ironic supports rescue mode: users can log in with a rescue password to troubleshoot an instance. (Nova instances have long had a rescue mode, so Ironic catches up with it in Queens.)

          2. Traits API support
          Nova now has the Placement service (a mechanism for registering, tracking, and scheduling compute resources). Starting with Queens, Ironic adds a traits API so that trait information can be registered with Nova's Placement API.
          Placement collects the various physical resources, exposes detailed information about them, and manages allocation and availability-zone associations, so the new traits API gives Nova the ability to account for and schedule Ironic resources.

          3. Neutron networking-baremetal
          Neutron has also started supporting bare metal. A first look shows fixes for incorrect port states, plus some routed-network functionality.