Cinder Client: absolute-limits

This is the first of what will hopefully be a series of posts covering the CLI commands available in the python-cinderclient package.

It is also a long-term goal to deprecate the Cinder CLI in favor of the newer OpenStack Client CLI (osc). The goal of the OpenStack Client is to unify the various project CLIs into one consistent CLI interface. Whenever possible, it’s highly recommended you learn – and get used to – using the osc CLI instead.
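For example, listing volumes with the two clients looks something like this, with both returning the same results:

$ cinder list
$ openstack volume list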

Command:

cinder absolute-limits

Description:

The absolute-limits command gets the limits that apply to the current user.

Details:

This is kind of an odd command. As best I can tell, this command, and the API it calls in the service, were carried over from when the Cinder project was split out of Nova. It has since changed in Nova, but the Cinder command remains the same.

One of the odd things about the command is that it returns a mix of different settings that apply to the current user. Half of the results are quota limits, half are API rate limits. These are really two wildly different concepts, and each already has its own separate API and CLI for getting the same information.

Another interesting thing about the rate limits: they don’t appear to work right now. Or at least I can’t figure out how to correctly configure them. I’ve filed a bug to make sure that gets figured out or documented.

$ cinder absolute-limits
 +--------------------------+-------+
 | Name                     | Value |
 +--------------------------+-------+
 | maxTotalBackupGigabytes  | 1000  |
 | maxTotalBackups          | 10    |
 | maxTotalSnapshots        | 10    |
 | maxTotalVolumeGigabytes  | 1000  |
 | maxTotalVolumes          | 10    |
 | totalBackupGigabytesUsed | 0     |
 | totalBackupsUsed         | 0     |
 | totalGigabytesUsed       | 1     |
 | totalSnapshotsUsed       | 0     |
 | totalVolumesUsed         | 1     |
 +--------------------------+-------+

Since I don’t have any rate limits working, the only results are the quota limits for my current user account. This would be the same as:

$ cinder quota-show $PROJECT_ID
 +----------------------------------------+-------+
 | Property                               | Value |
 +----------------------------------------+-------+
 | backup_gigabytes                       | 1000  |
 | backups                                | 10    |
 | gigabytes                              | 1000  |
 | gigabytes_DellStorageCenterISCSIDriver | -1    |
 | per_volume_gigabytes                   | -1    |
 | snapshots                              | 10    |
 | snapshots_DellStorageCenterISCSIDriver | -1    |
 | volumes                                | 10    |
 | volumes_DellStorageCenterISCSIDriver   | -1    |
 +----------------------------------------+-------+

It’s basically the same information (plus per-volume-type limits), formatted a little differently, with the extra inconvenience that you need to pass in $PROJECT_ID.
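If you don’t have the project ID handy, something like the following should look it up (assuming your credentials are already sourced and “demo” is your project name):

$ openstack project show -f value -c id demo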

OpenStack Client Equivalent

The OpenStack Client makes these calls consistent across multiple projects and separates the quota limit information from the rate limit information, which makes a lot of sense to me.

First, for quota-related limit information, it combines relevant data from Cinder, Neutron, and Nova into one common list of limits:

$ openstack limits show --absolute
 +--------------------------+-------+
 | Name                     | Value |
 +--------------------------+-------+
 | maxServerMeta            |   128 |
 | maxTotalInstances        |    10 |
 | maxPersonality           |     5 |
 | totalServerGroupsUsed    |     0 |
 | maxImageMeta             |   128 |
 | maxPersonalitySize       | 10240 |
 | maxTotalRAMSize          | 51200 |
 | maxServerGroups          |    10 |
 | maxSecurityGroupRules    |    20 |
 | maxTotalKeypairs         |   100 |
 | totalCoresUsed           |     1 |
 | totalRAMUsed             |  2048 |
 | maxSecurityGroups        |    10 |
 | totalFloatingIpsUsed     |     0 |
 | totalInstancesUsed       |     1 |
 | maxServerGroupMembers    |    10 |
 | maxTotalFloatingIps      |    10 |
 | totalSecurityGroupsUsed  |     1 |
 | maxTotalCores            |    20 |
 | totalSnapshotsUsed       |     0 |
 | maxTotalBackups          |    10 |
 | maxTotalVolumeGigabytes  |  1000 |
 | maxTotalSnapshots        |    10 |
 | maxTotalBackupGigabytes  |  1000 |
 | totalBackupGigabytesUsed |     0 |
 | maxTotalVolumes          |    10 |
 | totalVolumesUsed         |     1 |
 | totalBackupsUsed         |     0 |
 | totalGigabytesUsed       |     1 |
 +--------------------------+-------+

Then any set rate limits are shown separately with the command:

$ openstack limits show --rate

Which, again, returns nothing in my case. No output is shown if no rate limit settings are found.


OpenStack: Using provider networks with isolated storage traffic

OpenStack Neutron has a deployment option called provider networks that allows simpler connectivity to existing network infrastructure. This works great for my lab environment, where I just want my VMs to connect directly to my shared subnets.

There are a lot of tools for deploying OpenStack now (too many?), but I just used the official install guides for Ubuntu 16.04 found on the docs.openstack.org site.

It’s generally considered a best practice to isolate all storage traffic from your production traffic. My lab environment looks something like this:

[Diagram: lab layout with separate production and isolated storage (iSCSI) networks]

For the basic provider network I had set up, this was fine. Each compute node was set up with a NIC and IP on each of the “production” and “storage” networks. Cinder volumes could be mapped to the nova-compute hosts to provide persistent storage for the VMs. Life was good.

Then I had to complicate it. I wanted to spin up some devstack instances on VMs in this test cloud to try some things out. But with the default provider network settings, I only had access to the production subnet, so the VMs could not reach the iSCSI ports of my array.

This may be fine for some. If your production subnet has routing configured to be able to get to the iSCSI network, while not ideal, the VM’s storage traffic can get routed through and you probably won’t be able to tell any difference. But at least in my case, my iSCSI subnet is completely isolated so this would not work.

So the trick is to actually configure two provider networks: the default “provider” network from the installation guide, and a second “iscsi” (or whatever you want to name it) network. This provides your VMs with two NICs, just like the physical nova-compute hosts have.

Again, following the provider setup instructions from the docs, there are just a couple of tweaks needed to add an additional provider network.

First, in /etc/neutron/plugins/ml2/ml2_conf.ini you need to add the name of the second provider network. This can be whatever name you would like. In my case I just called it “iscsi”.

[ml2_type_flat]

#
# From neutron.ml2
#

# List of physical_network names with which flat networks can be created. Use
# default '*' to allow flat networks with arbitrary physical_network names. Use
# an empty list to disable flat networks. (list value)
flat_networks = provider,iscsi

Then in /etc/neutron/plugins/ml2/linuxbridge_agent.ini I had to add the mapping for that provider network to the local host’s interface.

[linux_bridge]

#
# From neutron.ml2.linuxbridge.agent
#

# Comma-separated list of <physical_network>:<physical_interface> tuples
# mapping physical network names to the agent's node-specific physical network
# interfaces to be used for flat and VLAN networks. All physical networks
# listed in network_vlan_ranges on the server should have mappings to
# appropriate interfaces on each agent. (list value)
physical_interface_mappings = provider:eno1,iscsi:eno2

Check your host’s ifconfig information to find the name of the interface that corresponds to each of your networks.
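ifconfig works fine for this, or the ip command shows the same thing a bit more compactly:

ip -o -4 addr show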

After following through the rest of the setup instructions, you eventually get to the point of configuring your networks in OpenStack, with provider network setup instructions included in the guide.

So after sourcing your admin credentials, you first need to define the network for your iSCSI traffic:

neutron net-create --shared --provider:physical_network iscsi --provider:network_type flat iscsi

Then create the subnet:

neutron subnet-create --name iscsi --allocation-pool start=10.23.139.200,end=10.23.139.254 iscsi 10.23.139.0/24

Of course, you will need to update the values in the allocation-pool to match your given environment.
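In my case I also wanted the iSCSI subnet to have no gateway so that storage traffic stays local to that network. If you want the same behavior, subnet-create should accept a --no-gateway flag, something like:

neutron subnet-create --name iscsi --no-gateway --allocation-pool start=10.23.139.200,end=10.23.139.254 iscsi 10.23.139.0/24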

Once this was defined, I was then able to launch an instance from this cloud and allocate both my normal “provider” production network, and my new “iscsi” storage network.

[Screenshot: launching an instance with both the provider and iscsi networks attached]

And that’s about it for configuration on the parent cloud end. Your new VM will now have two NICs. As long as you add them in the correct order (as above) it will boot up with access on the provider network.

The one additional step is that, by default, the second NIC in the guest VM (at least in the Ubuntu cloud image I was using) does not get configured or brought up. I’m sure there are scripts you can run using cloud-init to automate this, but I was too lazy to do that for now and just logged into the booted VM and did something like the following:

sudo su -
cd /etc/network/interfaces.d
cp eth0.cfg eth1.cfg
vi eth1.cfg   # change reference from eth0 to eth1
ifup eth1
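If you would rather automate that, one option is to pass a small script as user data when booting the instance so cloud-init runs it on first boot. A rough, untested sketch (the interface name may differ depending on the image):

#!/bin/bash
# Configure and bring up the second NIC on first boot
cat > /etc/network/interfaces.d/eth1.cfg << EOF
auto eth1
iface eth1 inet dhcp
EOF
ifup eth1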

The rest is just setting up whatever you would like to run within the nested VM. Since I set up the iSCSI network to not have a gateway, any local traffic for my iSCSI subnet gets routed out the eth1 interface, all other traffic gets routed out eth0. Pretty clean mirror of the physical host configuration it’s running on.

OpenStack, Dell SC, and Alerting

I recently had a good discussion about the “errors” reported from Dell SC storage when using OpenStack. I hadn’t really thought much about it before, but this highlighted a difference in functional requirements and expectations between running a “traditional” IT infrastructure vs running a “cloud” environment.

First, a little background on the Dell SC alerting I’m referring to.

The SC probably works a little differently than other arrays in this aspect. Access to volumes is controlled by specifying which initiators have access to which volumes. That part is probably not too different – most arrays have some sort of access control list (ACL) support. But the SC does this by creating what is called a Server Definition and then creating Mapping Profiles to control which initiators get told about which volumes on SCSI REPORT_LUN calls.

These server definitions group one or more initiators into a single manageable entity, modeling the physical or virtual host used to connect to the storage. Not only does this give the initiators a name and grouping, it also defines things like the operating system running on the host. This gives the storage some awareness of the capabilities and limitations of the host for things like LUN restrictions, multipathing support, max volume sizes, etc.

One setting on this server definition is “Alert On Lost Connectivity”, which tells the array whether to generate email notifications or SNMP traps if connectivity between any of the initiators and the storage array goes down. This defaults to enabled.

Traditional Infrastructure

For traditional environments, it makes a lot of sense for this alerting to be enabled. When running an Exchange server or VMware vSphere hosts, an admin would typically create iSCSI sessions from one or more of the host’s initiator ports to the array, then go to the Dell Storage Manager (DSM) and create a server definition for those new initiators.

This is a one-time operation for each server, and a lot of it is automated through things like our vSphere Client plugin and other integrations. But once the sessions are established, it’s just a matter of creating and deleting mapping profiles, and rescanning on the hosts, to add or remove volumes. Need a new datastore for your VMs? Just create the volume, map it to your ESX hosts, and rescan. No data path management needed.

So in this fairly static environment it makes sense to want an email alert at 2 AM if your Exchange database has lost connectivity.

Cloud Infrastructure

When running something like OpenStack, however, you definitely don’t want your storage array waking you up at 2 AM because a host went away. In automated infrastructures such as this, this kind of thing is expected and quite normal.

At a very high level, when a new Cinder volume is created, Cinder first talks to the backend driver to create the volume, then asks the driver for attachment information to give the Nova compute host. Nova then logs in to the provided targets and looks for the volume.
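Under the hood that login is standard iSCSI initiator work handled by os-brick on the compute node, conceptually something like the following (the target name and portal here are just placeholders):

iscsiadm -m node -T iqn.2002-03.com.example:target-0 -p 10.23.139.151:3260 --login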

There are a couple of interesting things here. First, if the Nova host logs in to one of the provided targets and doesn’t find the device there, it will log back out. Then later, once a volume is detached through OpenStack, the Nova host will log out of any targets that are no longer needed.

This dynamic connection handling is perfectly normal for infrastructure that is being automated by a cloud platform like OpenStack. But it’s a very different animal than your somewhat static Exchange and SQL Server environment. In this case you don’t want to “Alert On Partial Connectivity”, since partial connectivity (and lost connectivity) are expected and are the norm.

Dell SC Cinder Driver

In order to automate things, if a request comes in to the Cinder driver to attach to a host and there is no Server Definition found, the Dell SC Cinder driver will automatically create the server and associate any initiator ports with it. This makes it simple for OpenStack cloud deployments to just spin up new Nova compute instances and start using them.

Prior to the Newton release, we hadn’t realized the implication of this alerting being enabled by default, so it was unfortunately overlooked in our environments.

Starting with the Newton release we have changed this. One extra parameter is passed in to the call to create the Server Definition to turn off this flag. So hopefully users of Newton and later will never even notice this.

If you are using Cinder versions prior to Newton, or have upgraded and had Server Definitions created by these older versions, it’s a simple manual task to change this. In the DSM UI, just find the server, right-click and select Edit Settings, then in the server settings dialog uncheck the Alert On Partial Connectivity checkbox.

[Screenshot: DSM server Edit Settings dialog with the Alert On Partial Connectivity checkbox]

If you are unable to upgrade your Cinder version, but would like to have this fixed, luckily it is a very minor change. Just add this one line to your local installation to change the driver behavior:

Dell SC Driver Patch

Monitoring

So the question comes up of what to do for monitoring and alerting. It’s certainly useful to get an email alert if something goes wrong in your environment. Many, many tools have been created expressly for this purpose. You don’t want to wander in Monday morning only to realize that you’ve had some kind of outage since Friday night.

With something like OpenStack though, alerting at the storage array level is not the right place. You can and will have connections going down, but that’s completely normal and expected. From the array level, there is no way to know whether that server was supposed to go away or not. There’s just not the visibility into the environment needed in order to decide whether something was intentional or not.

You can (and should!) monitor your network devices. While the array does not know if a connection going away was on purpose or not, seeing a switch go down would be a surefire way of knowing there is an infrastructure issue that needs your attention. There are a lot of options here, from vendor-specific tools to open source monitoring systems. I won’t even attempt to make any recommendations here as it really depends on your environment and what other tools you have in place. Most likely you are already using something to do this, so it may just be a matter of checking with your vendor documentation on how best to monitor your network fabric.

To effectively monitor a cloud environment, you probably need a monitoring tool that actually understands the cloud. In order for a storage array to really be able to alert on a problem, it would need to have more awareness of what other operations are being performed in the environment (destroying an instance, migrating a volume) to know whether something going down was intentional or not. There needs to be something that knows about these operations and can evaluate whether something was intentional or the result of a failure somewhere in the stack.

Luckily there is a service designed to do exactly this for OpenStack: the Monasca project.

I’m not a Monasca expert, and I’m not involved in that project, so I won’t try to dig in too much there. But if you are looking at monitoring options for your cloud, that is where I would start looking. To effectively monitor your cloud, you need something that understands the cloud. I’m very interested in seeing where this project goes, and if other existing monitoring platforms add support to address this knowledge gap.

OpenStack Day India 2016

Last week I was able to attend the OpenStack Day India event in Bangalore, July 8-9. A big thank you to my employer, Dell, and NEC Technologies for making it possible for me to attend. And a big thank you to Sheel Rana from NEC for helping make my attendance possible and being an incredible guide and host while I was there.

This was a long way to go for a two-day conference, but I was actually kind of excited to go. Developers in India make up the second largest group of OpenStack contributors after the US. It was great to get out and support this strong community and do whatever I could to help contribute to its success.

Day 1: Workshops
The first day kicked off with Jonathan Bryce, executive director of the OpenStack Foundation, and Mark Collier, OpenStack Foundation Chief Operating Officer, providing some insight on the state of OpenStack while attendees got ready for some hands on workshops.

Over the next few hours folks worked on spinning up local devstack instances and walking through various OpenStack functionality to explore Nova, Neutron, and other core project functionality.

To close out the day, we switched over to more of a developer focus. I gave a walkthrough of how contributing to OpenStack works, from initially getting set up, to the basics of code reviews, to how to submit code changes, with various tips and tricks along the way.

There is a lot of context missing without the audio, but here are the slides from my first day presentation:

Day 2: Workshops and Keynotes
The second day was a little larger and was split between two tracks: a keynote and presentation track in the main auditorium, and a workshop/tech track covering various topics in another room.

Unfortunately I was only able to attend the presentation track, but there were a lot of interesting topics presented by Aptira, Red Hat, and other conference sponsors.

As part of these presentations, I gave an overview of the Cinder project:

There were some good questions from the audience, but I think the biggest value for me was all of the hallway conversations I was able to have with folks interested in OpenStack and Cinder, and being able to meet some existing Cinder contributors face to face for the first time. There were many names I knew from IRC and patches, so it was awesome to finally meet them in person and shake hands. That was one of the more rewarding parts of the event.

OpenStack Cinder Newton Design Summit Recap

Cinder Newton Design Summit Summary

At the Design Summit in Austin, the Cinder team met over three days
to go over a variety of topics. This is a general summary of the
notes captured from each session.

We were also able to record most sessions. Please see the
openstack-cinder YouTube channel for all its minute and tedious
glory:

https://www.youtube.com/channel/UCJ8Koy4gsISMy0qW3CWZmaQ

Replication Next Steps
Replication v2.1 was added in Mitaka. This was a first step in supporting
a simplified use case. A few drivers were able to implement support for
this in Mitaka, with a few already in the queue for support in Newton.

There is a desire to add the ability to replicate smaller groups of volumes
and control them individually for failover, failback, etc. Eventually we
would also want to expose this functionality to non-admin users. This will
allow tenants to group their volumes by application workload or other user
specific constraint and give them control over managing that workload.

It was agreed that it is too soon to expose this. We would
first like to get broader vendor support for the current replication
capabilities before we add anything more. We also want to improve the admin
experience with handling full site failover. As it is today, there is a lot
of manual work that the admin would need to do to be able to fully recover
from a failover. There are ways we can make this experience better. So before
we add additional things on top of replication, we want to make sure what
we have is solid and at least slightly polished.

Personally, I would like to see some work done with Nova or some third party
entity like Smaug or other projects to be able to coordinate activities on
the compute and storage sides in order to fail over an environment completely
from a primary to secondary location.

Related to the group replication (tiramisu) work was the idea of generic
volume groups. Some sort of grouping mechanism would be required to tie in
to that. We have a grouping today with consistency groups, but that has its
own set of semantics and expectations that doesn’t always fully mesh with
what users would want for group replication.

There have also been others looking at using consistency groups to enable
vendor-specific functionality that is not quite in line with what CGs are
meant for.

We plan on creating a new concept of a group that has a set of possible types.
One of these types will be consistency, with the goal that internally we can
shift things around to convert our current CG concept to be a group of type
consistency while still keeping the API interface that users are used to for
working with them.

But beyond that we will be able to add things like a “replication” type that
will allow users to group volumes that may or may not be able to be snapped in
an IO-order-consistent manner, but that can be acted on as a group to be
replicated. We can also expand this group type to other concepts moving
forward to meet other use cases without needing to introduce a wholly new
concept. The mechanisms for managing groups will already be in place and a new
type will be able to be added using existing plumbing.

Active/Active High Availability
Work continues on HA. Gorka gave an overview of the work completed so far and
the work left to do. We are still on the plan proposed at the Tokyo Summit,
just a lot of work to get it all implemented. The biggest variations are around
the host name used for the “clustered” service nodes and the idea that we will
not attempt to do any sort of automatic cleanup for in-progress work that gets
orphaned due to a node failure.

Mitaka Recap
Two sessions were devoted to going over what had changed in Mitaka. There were
a lot of things introduced that developers and code reviewers now need to be
aware of, so we wanted to spend some time educating everyone on these things.

Conditional DB Updates
To try to eliminate races (partly related to the HA work) we will now use
conditional updates. This eliminates the gap between checking a value and
setting it, making the check and the change one atomic DB update. This gives
better performance than locking around operations.

Microversions
API microversions were implemented in Mitaka. The new /v3 endpoint should be
used. Any change in the API should now be implemented as a microversion bump.
A devref was added to Cinder with details of how to use this and more detail
as to when a microversion is needed and when it is not.
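For example, with a recent python-cinderclient you should be able to request
a specific microversion against the v3 endpoint with something like:

cinder --os-volume-api-version 3.0 list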

Rolling Upgrades
A devref was added for rolling upgrades and versioned objects. We discussed the
need to make incremental DB changes rather than doing them all in one release.
First release: add the new column, write to both, read from the original.
Second release: write to both, read from the new column. Third release: the
original column can now be deleted.

Recommended service upgrade order: cinder-api, cinder-scheduler, cinder-volume,
cinder-backup. After all services are upgraded, send SIGHUP to each to release
version pins.
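One way to send that signal once everything is upgraded, assuming the services
are running under their usual process names:

pkill -HUP -f cinder-api
pkill -HUP -f cinder-scheduler
pkill -HUP -f cinder-volume
pkill -HUP -f cinder-backup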

Multinode grenade tests need to be set up to get regular testing on rolling
upgrades to ensure we don’t let anything through that will cause problems.

Some additional patches are in progress to fix a few internal things like
unversioned dicts that are passed around. This will help us make sure we don’t
change one of those structures in an incompatible way.

Scalable Backup
The backup service was decoupled from the c-vol service in Mitaka. This allows
us to move backup to other nodes to offload that work. We can also scale out
to allow more backup operations to work in parallel.

There was also some discussion of whether KVM's changed block tracking could be
used to further optimize this process.

Some other issues were identified with backup. It is currently a sequential
process, so backups can take a long time. Some work is being done to break out
the backup chunks into separate processes to allow concurrent processing.

The idea of having different backup types was brought up. This will require
more discussion. The idea is to have a way to configure different volumes to
be able to be backed up to different backup target types (e.g., one to a Swift
backend, one to a Ceph backend, one to GCS, etc.).

There are some implications for the HA work being done and how to do clean up.
Those issues will need to be fully identified and worked through.

Testing Process
We discussed the various testing available to us and how we can get better test
coverage. We also want to optimize the testing process so everyone isn't stuck
running long-running tests just to validate small changes.

For unit tests, tests that take more than 3-5 seconds to run will be moved out
to our integration tests. Most of these are doing more than just "unit" testing,
so it makes more sense for them to be separated out.

There’s some desire to gate on coverage, but it was discussed how that can be
a difficult thing to enforce. There was a lot of concern that a hard policy
around coverage would lead to a lot of useless unit tests that don’t really add
valuable testing and just add coverage over a given path to get the numbers up.

It may be nice to have something like our non-voting pylint jobs to help
detect when our test coverage decreases, but it's not clear whether the
additional infra load to run this would be worth it.

We have a functional test job that is currently non-voting. It has been broken
recently and due to it being non-voting it was missed for some time. This needs
to be fixed and the job changed to be voting. Since the majority of these tests
were moved from the unit tests, they should be fine to make voting once passing
again.

Having in-tree tempest tests was also discussed. We've used tempest for
testing things not in line with the mission of tempest. We also have tests in
there that may not really be relevant or needed by the other projects that get
them by running all tempest tests. For the ones that are really Cinder specific
we should migrate them from being in tempest to being in tree with Cinder. That
also gives us full control over what is included.

We also discussed plans for how to test API microversions and adding pylint
testing to os-brick.

CinderClient and OpenStackClient
A group of folks are working through a list of commands to make sure osc has
functional parity with cinderclient. We need to have the same level of support
in osc before we can even consider deprecating the cinder client CLI.

The idea was raised to create a wiki page giving a “lookup” of cinder client
commands to openstackclient commands to help users migrate over to the new
client. We will need to decide how to support this given long-term plans for the
wiki, but having the lookup posted somewhere should help.

We do have an extra challenge in that we added a cinderclient extension for
working with os-brick as a stand alone storage management tool. We will work
with the osc team to see what the options are for supporting something like
this and decide what the best end user experience would be for this. It may be
that we don’t deprecate python-cinderclient CLI just for this use case.

Unconference
Made midcycle plans based on survey results. (These have since changed after we
ran into complications with hotel availability.)

Talked briefly about non-Cinder volume replication and how to recover during
DR. Discussed whether all backends support managing existing volumes. How to
discover available volumes. What to do with snapshots on manage.

Nova Cross Project
All about multiattach. Matt Riedemann wrote up a nice recap of this discussion
already:

http://lists.openstack.org/pipermail/openstack-dev/2016-May/094018.html

Contributors Meetup
API capability and feature discovery.
What to do for backends that only support allocation in units greater than 1G.
Next steps for user message reporting.
Release priorities:

  • User messaging
  • Active/active high availability
  • Improved functional test coverage
  • Better support for cheesecake (repl v2.1)

Extending attached volume.
Discussed qemu-img convert error a couple drivers were seeing.
FICON protocol support.
Generic group discussion.
Force detach.
Replication questions.
QoS support for IOPs per GB.
Per tenant QoS.
Cinder callbacks to Nova for task completion.
Config options inheritance (defined in DEFAULT, applied to subsections)
Deadlines (http://releases.openstack.org/newton/schedule.html)
Status of image caching.
Driver interface compliance checks and interface documentation.

OpenStack Cinder Mitaka Design Summit Recap

Cinder Mitaka Design Summit Summary

Will the Real Block Storage Service Please Stand Up
Should Cinder be usable outside of a full OpenStack environment?
There are several solutions out there for providing a Software
Defined Storage service with plugins for various backends. Most
of the functionality used for these is already done by Cinder.
So the question is, should Cinder try to be that ubiquitous SDS
interface?

The concern is that Cinder should either try to address this
broader use case or be left behind. Especially since there is
already a lot of overlap in functionality, and end users are already
asking about it.

Some concern about doing this is whether it will be a distraction
from our core purpose – to be a solid and useful service for
providing block storage in an OpenStack cloud.

On the other hand, some folks have played around with doing this
already and found there really are only a few key issues with
being able to use Cinder without something like Keystone. Based on
this, it was decided we will spend some time looking into doing
this, but at a lower priority than our core work.

Availability Zones in Cinder
Recently it was highlighted that there are issues between AZs
used in Cinder versus AZs used in Nova. When Cinder was originally
branched out of the Nova code base we picked up the concept of
Availability Zones, but the idea was never fully implemented and
isn’t exactly what some expect it to be in its current state.

Speaking with some of the operators in the room, there were two
main desires for AZ interaction with Nova – either the AZ specified
in Nova needs to match one to one with the AZ in Cinder, or there
is no connection between the two and the Nova AZ doesn’t matter on
the Cinder side.

There is currently a workaround in Cinder. If the config file
value for allow_availability_zone_fallback is set to True and a
request for a new volume comes in with a Nova AZ that is not present, the
default Cinder AZ will be used instead.

A few options for improving AZ support were suggested. At least for
those present, the current “dirty fix” workaround is sufficient. If
further input makes it clear that this is not enough, we can look
in to one of the proposed alternatives to address those needs.

API Microversions
Some projects, particularly Nova and Manila, have already started
work on supporting API microversions. We plan on leveraging their
work to add support in Cinder. Scott D’Angelo has done some work
porting that framework from Manila into a spec and proof of concept
in Cinder.

API microversions would allow us to make breaking API changes while
still providing backward compatibility to clients that expect the
existing behavior. It may also allow us to remove functionality
more easily.

We still want to be restrictive about modifying the API. Just
because this will make changes slightly easier to make, they still
carry an ongoing maintenance cost, and a slightly higher one at that,
which we will want to limit as much as possible.

A great explanation of the microversions concept was written up by
Sean Dague here:

https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2/

Experimental APIs
Building on the work with microversions, we would use that to expose
experimental APIs and make it explicit that they are experimental
only and could be removed at any time, without the normal window
provided with deprecating other features.

Although there were certainly some very valid concerns raised about
doing this, and whether it would be useful or not, general consensus
was that it would be good to support it.

After further discussion, it was pointed out that there really isn’t
anything in the works that needs this right now, so it may be delayed.
The risk there is that if we wait to do it, it won't be ready to go
when we actually do need to use it for something.

Cinder Nova Interaction
Great joint session with some of the Nova folks. Talked through some
of the issues we’ve had with the interaction between Nova and Cinder
and areas where we need to improve it.

Some of the decisions were:

  • Working on support for multiattach. Will delay encryption support
    until non-encrypted issues get worked out.
  • Rootwrap issues with the use of os-brick. Priv-sep sounds like it
    is the better answer. Will need to wait for that to mature before
    we can switch away from rootwrap though.
  • API handling changes. A lot of cases where an API call is made and
    it is assumed to succeed. Will use event notifications to report
    results back to Nova. Requires changes on both sides.
  • Specs will be submitted for coordinated handling for extending
    volumes.
  • Volume related Nova bugs were highlighted. Cinder team will try to
    help triage and resolve some of those.
    https://bugs.launchpad.net/nova/+bugs?field.tag=volumes

Volume Manager Locks
Covered in cross-project discussion on DLM. Will use tooz as an
abstraction layer. Default will use local locks, so no change for
those that don’t need it, but ability to plug in DLMs like
ZooKeeper, etc., for those that need the DLM functionality.

C-Vol Active/Active HA
Discussed the desire to be able to run the c-vol service in an a/a
configuration. It can kind of be done now with things like pacemaker,
but there are definite issues and it is considered risky. The desire
is to build in support to Cinder to be able to run A/A, but we don’t
want to impose heavy requirements on operators running smaller
deployments or deployments that do not require A/A.

Based on the DLM discussion, we should be able to support this based
on end user configuration. If appropriate locking is in place within
the Cinder code and using the tooz locking abstraction, on single
node installations it will just work without extra overhead. If A/A
is desired, configuring tooz to use a distributed locking mechanism
will provide locking across nodes without changes to the code.

ABC Work
General agreement that our inheritance model is currently a mess
and needs to be cleaned up. Need to work through it. Either collapse
all into a common, simpler driver base or make our inheritance model
usable. Report capabilities rather than discovering them via inheritance?

Cinder Driver Interface
This is a common area of confusion for new driver developers and we
have even found that folks involved with the project for some time
weren’t always clear on what was expected for each call. This is an
attempt to capture those details more clearly without needing to
trace through all of the code to understand the basics.

Eric has started documenting our driver requirements and is making
great progress. This should also help to have a better reference as
we work through the ABC and inheritance cleanup.

Driver Deadlines
General consensus that past restrictions were a good attempt to bring
some order and focus to work during the cycle, but there were too
many different deadlines to keep track of and it didn’t really solve
all our problems as well as we had hoped.

Based on the discussion, deadlines for the Mitaka cycle have been
adjusted. See the mailing list post for full details:

http://lists.openstack.org/pipermail/openstack-dev/2015-November/078215.html

Working Session Sprints
We spent the day Friday working through a long list of various topics to
discuss. Some of the highlights from the discussion are:

  • Consistency group replication. We are looking in to what this would look
    like now, though there was a lot of concern about adding on top of basic
    replication before we at least have a few drivers implementing it. This
    spec should be used for awareness for the current drivers looking at
    replication support to make sure they are ready for the next step.
  • We have consistency groups, but some arrays support concepts such as
    replication groups or even arbitrary grouping of volumes. No changes
    planned at this point, but some ideas being floated around for having a
    more generic volume grouping to support other scenarios.
  • Snapshot backup. We support backing up volumes, but we don’t support
    backing up snapshots currently. A spec is in the works to make this a
    supported option.
  • Config file setting inheritance. To make it clear what settings are
    inherited by later sections and what are not, considering adding a new
    section to clearly break out what should apply to everything and what
    should be specific to each section.
  • We discussed if there was any functionality to pull in to the set of
    minimum required features. Nothing was selected at this time. We may
    add some after all current drivers have more functionality that can
    be considered de facto minimum features.
  • We need functional tests. Some work has started, but help is needed to
    expand this. Could be a big win for ensuring better test coverage.
  • Ongoing work for conversion to objects and RPC versioning. It was
    agreed this is something we want and it should be completed. This will
    allow rolling upgrades. We plan to support only N-1 upgrades.
  • Call out for attention to the LVM driver. Anyone who would like to
    help maintain it – help would be welcome.

Priorities for Mitaka

  • Active/Active High Availability
  • Rolling upgrade support
  • API Microversions
  • Functional testing
  • Completion of replication

For the first time, all Design Summit sessions were live streamed and
recorded for later playback. All sessions are available on the
openstack-cinder YouTube channel:

https://www.youtube.com/channel/UCJ8Koy4gsISMy0qW3CWZmaQ

A big thank you to Walter Boring for making this possible.

Python Service/Daemon

Lately I’ve been working on setting up an OpenStack CI system for work. It’s a set of Python scripts that perform tests against every patch submitted to the Cinder project and is available here:

https://github.com/stmcginnis/openstack-ci

To start I’ve just been kicking this off manually and watching output while I make sure things are working as I expect them to. I think I’ve gotten things to a point now where I’d like this to always run, even after unexpected events. We occasionally get power blips or other issues that have resulted in the CI not running until I got back and saw it was down.

Of course there are monitoring systems, such as Nagios, that can alert on these types of things, but I really don’t want to be on call if it’s a simple matter of just getting it running again. A perfect case for making this a daemon that automatically kicks off on bootup or restarts on error.

There were several steps needed to make this into a service. This particular system is running Ubuntu, so for the service handling I used Upstart.

That was fine for kicking off the script, but I also needed to refactor the Python code slightly to make it work as a daemonized service. For that I used the python-daemon package that implements PEP 3143, making it a “well-behaved” service.

There’s some good documentation out there on bits and pieces of these packages, but nothing I could find that tied them all together. There were some “non-obvious” things I had to do, especially around logging. Here are the basic steps to make this all work.

Upstart Service
To start with, in the /etc/init/ directory create a [servicename].conf file. Here’s mine (switching to test.py for illustration purposes):


# openstack-ci - OpenStack CI Testing Service

description "OpenStack CI Testing Service"
author "Sean McGinnis <my_name@farmer.com>"

# When to start the service
start on runlevel [2345]

# When to stop the service
stop on runlevel [016]

# Automatically restart process if crashed
respawn

# Essentially lets upstart know the process will detach itself to the background
expect fork

# Start the process (assumes "chmod 775 /opt/ci/test.py" was run)
exec /opt/ci/test.py

After creating the file run “initctl reload-configuration” to pick up the change.
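Putting that together with starting the service looks like this (run as root or with sudo):

sudo initctl reload-configuration
sudo service openstack-ci start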

Daemon Script
The python-daemon package takes care of daemonizing the script, but I also had to fork the process. Otherwise “service openstack-ci start” would hang. I also had to do a little work to get logging to work after daemonizing. That piece was a bit confusing until I figured out what was going on. Here’s an example of what I ended up with:


#!/usr/bin/python

import daemon
import lockfile
import logging
import os
import signal
import time

run = True

logging.basicConfig(format='%(asctime)s %(thread)d %(levelname)s: %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    level=logging.DEBUG,
                    filename='/var/log/ci.log')


class CI(object):
    def initial_program_setup(self):
        global run
        run = True
        logging.debug('Program setup called.')

    def do_main_program(self):
        global run
        logging.info('Running...')
        while run:
            time.sleep(15)
            logging.debug('CI test running...')

    def program_cleanup(self):
        global run
        run = False
        logging.debug('Stopping...')


def main():
    # Fork so the 'service start' call can return immediately
    pid = os.fork()
    if pid > 0:
        os._exit(0)

    ci_process = CI()

    # Make sure the logging file handles carry through daemonization
    handles = []
    logger = logging.getLogger()
    for handler in logger.handlers:
        handles.append(handler.stream.fileno())

    context = daemon.DaemonContext(
        working_directory='/opt/ci',
        pidfile=lockfile.FileLock('/var/run/ci.pid'),
        files_preserve=handles)

    context.signal_map = {
        signal.SIGTERM: ci_process.program_cleanup,
        signal.SIGHUP: 'terminate'}

    logging.info('Initializing process...')
    ci_process.initial_program_setup()

    logging.info('Calling do_main_program')
    with context:
        ci_process.do_main_program()


if __name__ == '__main__':
    main()

It doesn’t appear the SIGTERM signalling is being handled correctly. It immediately kills the process. That will definitely be an area I will be looking into.

Despite that piece, this is still an improvement. Running “service openstack-ci start” kicks it off and “service openstack-ci stop” terminates it. It also handles execution errors. While it’s running I can “pkill test” and upstart notices the termination and automatically kicks it off again. Definitely much more robust than me watching log files!
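A quick way to see the respawn behavior in action, assuming the same test.py path as above (the PID should change after the kill):

sudo service openstack-ci start
pgrep -af test.py
sudo pkill -f test.py
pgrep -af test.py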

Ignore Changing SSH Keys

In my lab I have a set of virtual machines that are automatically spun up for various tests. This setup is mostly hands off, but occasionally I do need to log in to one of these instances to check on something.

Since the instances are constantly changing, I was getting annoyed always getting the following prompt:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
af:92:a4:9c:ff:bf:0b:88:2a:62:90:c3:3e:2a:3f:74.
Please contact your system administrator.
Add correct host key in /home/smcginnis/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/smcginnis/.ssh/known_hosts:11
remove with: ssh-keygen -f "/home/smcginnis/.ssh/known_hosts" -R 10.125.202.186
ECDSA host key for 10.125.202.186 has changed and you have requested strict checking.
Host key verification failed.

I would then have to run the ssh-keygen -f command to remove the entry, then reconnect and accept the new host key.

It turns out there is a way to make it much simpler in this scenario. If you have a range of addresses that you know are going to be changing, you can configure SSH to just ignore these host key changes and connect. There are options on the command line to do this per instance, but less typing is better in my book.
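For reference, the per-connection equivalent looks something like this (substitute your own user and address):

ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ubuntu@10.125.202.186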

The solution is to add the following to your ~/.ssh/config file:

Host 10.125.202.*
UserKnownHostsFile /dev/null
StrictHostKeyChecking no

You can do an entire subnet as above, or you can specify individual hosts. More options can be found in the ssh_config man page.

SSH Key for Specific Host

This information is available elsewhere, but placing here for easy access.

I recently had the need to use a specific key file for an SSH connection. In this case it was for a git server that needed different credentials than I use for other hosts.

It turns out it is very easy to configure SSH to use a different key when connecting to a specific host. Just add the following to your ~/.ssh/config file.


Host git.host.name
Hostname git.host.name
IdentityFile ~/.ssh/special_key
IdentitiesOnly yes

That’s it. Now when attempting to establish an SSH connection to git.host.name, SSH will automatically know to use the key file specified for that host.
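If you ever need a one-off connection without the config entry, something like this should do the same thing (adjust the user name for your server):

ssh -i ~/.ssh/special_key -o IdentitiesOnly=yes git@git.host.name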

Linux sysprep

In the Windows world there is a utility called sysprep that can be used for, among other things, preparing a gold image of a virtual machine. Sysprep makes sure there isn’t any conflicting system information that could cause problems when two copies of the image run in the same domain.

More information on Windows sysprep can be found here: http://technet.microsoft.com/en-us/library/cc721940(v=ws.10).aspx

In my lab I have been deploying a lot of Linux hosts for various things lately. Looking for something similar I found a lot of individual steps to perform to get the same results, but no easy to use utilities like sysprep.

One site in particular had a great collection of various things to do in order to prepare a Linux machine for cloning. Thanks to Bob Plankers and contributors for collecting these in one spot:

http://lonesysadmin.net/2013/03/26/preparing-linux-template-vms/

Since this may be something I would be doing often I decided to take all of these ideas and put them in one easy script.

sysprep.sh

#!/bin/bash

# Make sure we are running as root
if [[ $EUID -ne 0 ]]; then
    echo "This script must be run as root" >&2
    exit 1
fi

# Clean out yum update gunk
yum clean all

# Get fresh logs and clear out old ones
logrotate -f /etc/logrotate.conf
rm -f /var/log/*-???????? /var/log/*.gz

# Clear audit and login logs
cat /dev/null > /var/log/audit/audit.log
cat /dev/null > /var/log/wtmp

# Clean up persistent device rules
rm -f /etc/udev/rules.d/70*

# Reset MAC address and UUID to prevent duplicates
shopt -s nullglob
for f in /etc/sysconfig/network-scripts/ifcfg-eth*; do
    /bin/sed -i '/^\(HWADDR\|UUID\)=/d' $f
done

# Clean out tmp
/bin/rm -fr /tmp/*
/bin/rm -fr /var/tmp/*

# Clean out SSH host keys
# Uncomment to use this, skipping for use in lab
#/bin/rm -f /etc/ssh/*key*

#
# Create script to run on first login to remind the user to
# change the host name and make any other needed changes.
# Added to .bash_profile:
#
# if [ -f ~/runonce.sh ]; then
#     ~/runonce.sh
#     rm -f ~/runonce.sh
# fi
#
cat > ~/runonce.sh << EOF
#!/bin/bash
echo "!!!"
echo "!!! Remember to change the hostname and anything else needed for this machine."
echo "!!!"
EOF
chmod +x ~/runonce.sh

# Clean up shell history
/bin/rm -f ~root/.bash_history
unset HISTFILE

Enjoy!