OpenStack is a cloud computing infrastructure used for managing cloud computing resources. I am new to the concept and am trying to learn it by following an excellent resource from edX. As I proceed, I want to perform as many practical experiments as possible and document the resulting takeaways to help my future self and anybody else who is interested. Being a cloud infrastructure, OpenStack depends on virtualization, so it is better to start with these concepts:
Virtualization, Containers and Cloud Computing
Virtualization manages and abstracts hardware resources between operating systems, much like an operating system performs a similar task for processes. Cloud computing uses shared resources on an on-demand basis. It operates on top of virtualization and container computing to eliminate on-premise hardware and thereby provide scalability and elasticity. The ultimate aim of cloud computing is to offer an on-demand, pay-as-you-go computing service, much like today's electrical infrastructure. As compute power becomes similar to electrical power, you will just need to connect, much like plugging in a cable to use electricity. The details of implementing, sustaining and distributing it will be handled by professionals and are usually not the concern of the end user. Having said this, today's cloud computing is still far from this ideal and is offered in three broad alternatives:
- Software as a service (SaaS), where the provider offers access to a specific application, such as Office 365. End users usually interact with a SaaS cloud.
- Platform as a service (PaaS), where the provider offers a suite that bundles hardware, storage, operating system and middleware. Build platforms may be thought of as an example.
- Infrastructure as a service (IaaS), where the provider offers infrastructure to host virtual machines. OpenStack, Microsoft Azure, VMware vCloud Air and Amazon Web Services are examples of IaaS.
Besides, cloud computing makes IT basics easy to access through self-service deployment, eliminating the need for an IT administrator to deploy a machine for you.
Virtualization comes in several forms:
- Hardware virtualization, software abstraction of hardware.
- Storage virtualization, software defined storage (SDS), abstraction of the actual disks and the computers accessing these disks.
- Network virtualization, software defined networking (SDN), abstraction of the physical network infrastructure to provide logical network infrastructures.
Virtualization provides efficient use of physical resources and power. In hypervisor-based virtualization, virtual machines run on top of a small, optimized kernel. KVM, Xen and VMware ESXi are well-known alternatives of this kind. In host-based virtualization, the virtualization software runs on a host operating system. VMware Player and VirtualBox are examples.
Containers are lightweight compared to virtual machines because they eliminate the need for each virtual machine to have its own kernel. They depend on the idea of sharing a single kernel between isolated user spaces, which form the containers. Multiple operating system instances use the same kernel, so this may be taken as virtualization at the operating system level. A container image contains applications, user libraries and dependencies, whereas kernel space components are provided by the host operating system. Every container relies on namespaces (which isolate its view of global system resources), cgroups (used to reserve and allocate resources), and a union file system. Containers are small compared to virtual machines, and many of them can run on top of a single kernel. Besides, a process in one container is not able to access resources in another container, so it is fairly secure. Running multiple copies of a single application is a perfect use case for containers. However, the isolation of containers is weaker than that of virtual machines, and if the kernel goes down, all containers go down with it.
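To make the namespace idea a bit more tangible, here is a minimal sketch that lists the namespaces the current process belongs to. It assumes a Linux system where /proc is mounted; the function name list_namespaces is my own and not part of any container tool. Two processes are in the same namespace when the identifiers printed for them match, so running this on the host and inside a container should show different values for most entries.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process, read from /proc.

    On Linux each entry under /proc/<pid>/ns is a symlink whose target looks
    like 'pid:[4026531836]'. On systems without /proc (or for an unknown pid)
    an empty dict is returned instead of raising.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return namespaces
    for name in sorted(entries):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may not be readable without extra privileges.
            continue
    return namespaces


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20s} {ident}")
```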
Now we have resources, how can we manage them?
Now we have hypervisors, and we have somewhat quantized compute resources. We will definitely want to control and interconnect these to provide on-demand, scalable compute power, and we will also want clever ways of storing input / output data. One solution for these is the OpenStack platform.
OpenStack is a bunch of infrastructure services, with the core ones being:
- Nova for compute, which is an interface to hypervisors.
- Swift for object storage, which performs distributed and replicated binary storage.
- Neutron for networking, which brings software defined networking to the cloud.
- Cinder for block storage, which enables persistent storage for virtual machines.
- Keystone for identity, which administers users, roles, tenants and services.
- Glance for images, which lets you deploy ready-made images instead of installing an operating system.
Nova is an interface to the hypervisor, which spawns, schedules and decommissions machines on demand. It is responsible for managing the compute instance life cycle. Nova has a distributed architecture: Nova agents run on the hypervisors, and the Nova service process runs on the cloud controller.
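As a way of picturing what "scheduling" means here, the toy sketch below filters out hosts that cannot fit a requested machine size and then prefers the host with the most free RAM. This is only an illustration under my own assumptions, not Nova's actual code: the names Host, Flavor, pick_host and spawn are hypothetical, and a real scheduler considers far more than vCPUs and memory.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A compute node with its remaining capacity (hypothetical model)."""
    name: str
    free_vcpus: int
    free_ram_mb: int


@dataclass
class Flavor:
    """The size of the requested instance (hypothetical model)."""
    vcpus: int
    ram_mb: int


def pick_host(hosts, flavor):
    """Drop hosts that cannot fit the flavor, then prefer the most free RAM."""
    candidates = [
        h for h in hosts
        if h.free_vcpus >= flavor.vcpus and h.free_ram_mb >= flavor.ram_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.free_ram_mb)


def spawn(hosts, flavor):
    """Place an instance on the chosen host and reserve its resources."""
    host = pick_host(hosts, flavor)
    if host is None:
        raise RuntimeError("no valid host found")
    host.free_vcpus -= flavor.vcpus
    host.free_ram_mb -= flavor.ram_mb
    return host.name


if __name__ == "__main__":
    hosts = [Host("compute1", 4, 8192), Host("compute2", 8, 16384)]
    print(spawn(hosts, Flavor(vcpus=2, ram_mb=4096)))
```

Decommissioning would be the reverse step: releasing the reserved vCPUs and RAM back to the host.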
Neutron allows software defined networking, which lets deployed instances have their own networks between them. It provides logical networks on top of the physical architecture.
Swift offers a distributed, replicated and scalable solution for binary object storage.
Swift provides a REST API for applications and distributes requests to multiple physical devices for replication, reliability, scalability and performance.
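To get a feeling for the REST API, the sketch below builds (but does not send) an upload request for a single object. It assumes the usual Swift layout where an object lives at storage URL / container / object and the token travels in the X-Auth-Token header. The host name, account, container and token values are placeholders I made up, and the helper names are my own.

```python
from urllib.parse import quote
from urllib.request import Request


def swift_object_url(storage_url, container, obj):
    """Join the account storage URL, container and object name into one URL.

    storage_url is the per-account endpoint, for example
    http://swift.example.com:8080/v1/AUTH_demo (placeholder value).
    Slashes in the object name are kept so pseudo-folders still work.
    """
    return "/".join(
        [storage_url.rstrip("/"), quote(container, safe=""), quote(obj)]
    )


def build_upload_request(storage_url, token, container, obj, data):
    """Prepare a PUT request that would store `data` as an object."""
    return Request(
        swift_object_url(storage_url, container, obj),
        data=data,
        method="PUT",
        headers={
            "X-Auth-Token": token,
            "Content-Type": "application/octet-stream",
        },
    )


if __name__ == "__main__":
    req = build_upload_request(
        "http://swift.example.com:8080/v1/AUTH_demo",  # placeholder endpoint
        "placeholder-token",
        "backups",
        "logs/app.log",
        b"hello swift",
    )
    print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen would actually send it; I left that out so the sketch runs without a Swift endpoint.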
As the storage of an instance booted from an image is ephemeral (like a live CD boot image), changes are not persistent. Cinder provides persistent block storage to instances. It may use backends such as LVM or Ceph, and Swift can serve as a target for volume backups.
Keystone provides a central repository for authentication and authorization. Services and their endpoints are registered in Keystone. Besides, users and roles are created and assigned to projects (formerly known as tenants), and this data is by default kept in MariaDB.
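To see how users and projects come together in practice, here is a minimal sketch of the JSON body a client would POST to Keystone's /v3/auth/tokens endpoint for password authentication scoped to a project. The function name and the user, password and project values are placeholders of mine, and I assume both the user and the project live in the domain named "Default". On success Keystone returns the token in the X-Subject-Token response header.

```python
import json


def password_auth_body(username, password, project, domain="Default"):
    """Build the request body for project-scoped password authentication."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            # The scope decides which project the token is valid for.
            "scope": {
                "project": {"name": project, "domain": {"name": domain}}
            },
        }
    }


if __name__ == "__main__":
    body = password_auth_body("demo", "placeholder-password", "demo-project")
    print(json.dumps(body, indent=2))
```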
Glance is used to store virtual machine disk images, which are then instantiated on demand. Images may either be downloaded from repositories or custom built to meet the requirements of an organization. Glance may use Swift or Ceph as object storage for scalability, or just use local storage for simple / small environments.
Horizon is a user friendly web interface dashboard for easy management of instances.
Ceilometer is used for metering and billing.
Heat is used for deploying stacks of instances.
Magnum is used as a container manager for OpenStack.
Congress is used as a policy enforcer in OpenStack.
Manila provides the OpenStack shared file system service.
Other important services handle time synchronization, the message queue and the database that stores cloud related information. A manual deployment of OpenStack requires these services to be set up manually.
OpenStack components are accessed through RESTful APIs to enable uniform access.
To sum up, the basic OpenStack nodes are:
The controller node typically performs the centralized controller functionality. It may be a single node or a cluster with redundancy and high availability. The network controller node provides network services to the cloud. There are compute nodes that run the Nova agents, and there are storage nodes containing Swift or Ceph. DevStack bundles all of these for a development and testing environment (it is not intended to be used in production).
OpenStack can be deployed by:
- Manual deployment
- Scripted deployment with PackStack and DevStack
- Large scale automatic deployment with TripleO and Director
As a starting point, I will take the easy path and deploy a DevStack instance. I will have a DevStack guest controlled by Vagrant. It seems like a matrix within a matrix, but as Vagrant provides a controlled, reproducible development environment, it will make my life easier in the long term and is worth the upfront effort. This will be covered in OpenStack | DevStack setup.