Posted By Kepler Lam

To complete the walkthrough of the demonstration, this last part discusses the other supporting Python script, nxapi_utils.py, which comes from Cisco. It provides the ExecuteiAPICommand function, which connects to the Nexus box and executes a command. The function simply returns the XML reply as text; that text can then be passed to minidom.parseString, which parses the fields and arranges the information into a Python XML DOM object (with hierarchy).

To get the value of a particular field, there are 2 methods:

1. Call GetNodeDataDom, passing the parsed XML and the field name as parameters.

2. Or call the DOM object's getElementsByTagName method and pass the result to NodeAsText.

Here is the script:

#!/usr/bin/env python
#
# tested with build n9000-dk9.6.1.2.I1.1.510.bin
#
# Copyright (C) 2013 Cisco Systems Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

 

import requests

def GetiAPICookie(url, username, password):
    xml_string="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?> \
      <ins_api>                         \
      <version>0.1</version>            \
      <type>cli_show</type>             \
      <chunk>0</chunk>                  \
      <sid>session1</sid>               \
      <input>show clock</input>         \
      <output_format>xml</output_format>\
      </ins_api>"
    try:
        r = requests.post(url, data=xml_string, auth=(username, password))
    except requests.exceptions.ConnectionError as e:
        print "Connection Error"
    else:
        return r.headers['Set-Cookie']

def ExecuteiAPICommand(url, cookie, username, password, cmd_type, cmd):
    headers = {'Cookie': cookie}

    xml_string="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?> \
     <ins_api>                          \
     <version>0.1</version>             \
     <type>" + cmd_type + "</type>      \
     <chunk>0</chunk>                   \
     <sid>session1</sid>                \
     <input>" + cmd + "</input>         \
     <output_format>xml</output_format> \
     </ins_api>"

    try:
        r = requests.post(url, headers=headers, data=xml_string, auth=(username, password))
    except requests.exceptions.ConnectionError as e:
        print "Connection Error"
    else:
        return r.text

def GetNodeDataDom(dom,nodename):
    # given a XML document, find an element by name and return it as a string
    try:
     node=dom.getElementsByTagName(nodename)
     return (NodeAsText(node))
    except IndexError:
     return "__notFound__"

def NodeAsText(node):
    # convert a XML element to a string
    try:
        nodetext=node[0].firstChild.data.strip()
        return nodetext
    except (IndexError, AttributeError):
        # no matching element, or the element has no text child
        return "__na__"

The main challenge in using the script is that you need to know the structure of the returned XML and the corresponding field names. This is easily solved with the NX-API Developer Sandbox, which shows the XML returned for a given command.
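To make the two lookup methods concrete, here is a minimal sketch written for Python 3. The reply text is a hypothetical sample: the field names host_name and kickstart_ver_str are the ones used in this series, but the surrounding layout is an assumption, and the helper names node_as_text, get_node_data and list_tag_names are made up for this illustration (they only mirror NodeAsText and GetNodeDataDom). The last helper lists the element names, which may help when exploring an unfamiliar reply.

```python
from xml.dom import minidom

# Hypothetical reply text: the layout is an assumption for illustration only.
SAMPLE_REPLY = """<?xml version="1.0"?>
<ins_api>
  <outputs>
    <output>
      <body>
        <host_name>n9k-lab</host_name>
        <kickstart_ver_str>6.1(2)I1(1)</kickstart_ver_str>
      </body>
      <code>200</code>
    </output>
  </outputs>
</ins_api>"""


def node_as_text(nodes, default="__na__"):
    """Return the stripped text of the first node, or default if absent or empty."""
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.data.strip()


def get_node_data(dom, name):
    """Method 1: look a field up by name on the whole document."""
    return node_as_text(dom.getElementsByTagName(name))


def list_tag_names(dom):
    """List every element name once, in document order, to explore the structure."""
    names = []
    for element in dom.getElementsByTagName("*"):
        if element.tagName not in names:
            names.append(element.tagName)
    return names


dom = minidom.parseString(SAMPLE_REPLY)
host_name = get_node_data(dom, "host_name")                              # method 1
version = node_as_text(dom.getElementsByTagName("kickstart_ver_str"))    # method 2
print(host_name, version)
print(list_tag_names(dom))
```

Both methods end up in the same place; the second one is handy when you already hold a sub-element and only want to search below it.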

 


 
Posted By Kepler Lam

Second part of Intf.py:

def show_interfaces(IP):

   form_str="""
   <form action="/cgi-bin/WebMgr.py" method="post">
      Switch IP Address List: <input type="text" name="IP" value=%s>
      <input type="submit" value="Show">
   </form>
   """%IP_STRING

   if intf:
      print "<h2>Nexus Web Manager - Show Interface</h2>"
      print form_str
      print """
      <table>
        <tr>
          <th>Interface</th>
          <th>Description</th>
          <th>HW Addr</th>
          <th>Speed</th>
          <th>In Bytes</th>
          <th>Out Bytes</th>
          <th>Duplex</th>
        </tr>
        <tr>
      """

   else:
      print "<h2>Nexus Web Manager - Interface Management</h2>"
      print form_str
      print """
      <table>
        <tr>
          <th>Interface</th>
          <th>State</th>
          <th>Vlan</th>
          <th>Port Mode</th>
          <th>Show Interfaces</th>
        </tr>
        <tr>
      """

   url = 'http://'+IP+'/ins/'
   cookie=GetiAPICookie(url, user, password)
   run_cfg = ""   # stays empty unless a single interface was requested

   if intf:
      dom = minidom.parseString(ExecuteiAPICommand(url, cookie, user, password, "cli_show", "show interface %s"%intf))
      intfdict=getIntf(dom)
      dom = minidom.parseString(ExecuteiAPICommand(url, cookie, user, password, "cli_show_ascii", "show run interface %s"%intf))
      run_cfg = NodeAsText(dom.getElementsByTagName("body"))
   else:
      dom = minidom.parseString(ExecuteiAPICommand(url, cookie, user, password, "cli_show", "show interface"))
      intfdict=getIntf(dom)
      dom = minidom.parseString(ExecuteiAPICommand(url, cookie, user, password, "cli_show", "show interface switchport"))
      intfdict=getSwitchport(dom,intfdict)

   for interface in intfdict.keys():
      print "<tr>"
      if intf:
         print("<td>%s </td>" % (interface))
         print("<td>%s </td>" % (intfdict[interface]['desc']))
         print("<td>%s </td>" % (intfdict[interface]['HWaddr']))
         print("<td>%s </td>" % (intfdict[interface]['speed']))
         print("<td>%s </td>" % (intfdict[interface]['inbytes']))
         print("<td>%s </td>" % (intfdict[interface]['outbytes']))
         print("<td>%s </td>" % (intfdict[interface]['duplex']))
      else:
         print("<td>%s </td>" % (interface))
         print("<td>%s </td>" % (intfdict[interface]['state']))
         print("<td>%s </td>" % (intfdict[interface]['access_vlan']))
         print("<td>%s </td>" % (intfdict[interface]['mode']))

         form_str="""<td>
           <form action="/cgi-bin/Intf.py" method="post">
           <input type="hidden" name="Intf" value=%s>
           <input type="hidden" name="IP_LIST" value=%s>
           <input type="hidden" name="IP" value=%s>
           <input type="submit" value="Show">
           </form>
         </td>
         """%(interface,IP_STRING,IP)
         print form_str

      print("</tr>")

   return run_cfg

#################
#  MAIN MODULE  #
#################

# First things first: credentials. They should be parsed through sys.argv[] ideally ..
form = cgi.FieldStorage()
# Get data from fields
IP = form.getvalue('IP')
IP_STRING = form.getvalue('IP_LIST')
intf = form.getvalue('Intf')
user="admin"
password="dummy"

print("Content-type:text/html")
print """
<head>
<style>
table {
  font-family: arial, sans-serif;
  border-collapse: collapse;
  width: 100%;
}

td, th {
  border: 1px solid #dddddd;
  text-align: left;
  padding: 8px;
}

</style>
</head>
<body>
"""

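run_cfg=show_interfaces(IP)

# close the interface table opened inside show_interfaces()
print "</table>"

if intf:
   # show the running configuration of the selected interface in its own table
   print "<table><tr><td>"
   print run_cfg.replace("\n","<br />\n")
   print "</td></tr></table>"

print "</body>"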


Click here to continue to the last part.


 
Posted By Kepler Lam

So the flow differs slightly depending on whether an interface name is passed to the Intf.py script. When it receives an interface name, it runs "show interface interface_name" and "show run interface interface_name" to get the details and displays them.

[Figure: intf2]
Here is Intf.py (as it is too long, I will continue it in another blog entry):

#!/usr/bin/env python
#
import cgi, cgitb

from xml.dom import minidom
from nxapi_utils import *

def getIntf(xml):
    interfaces = xml.getElementsByTagName("ROW_interface")

    # build a dictionary of interfaces keyed by interface name
    # the format of the dictionary is as follows:
    # intfdict = {'intf': {'state': ..., 'desc': ..., 'speed': ..., ...}}
    intfdict = {}
    for interface in interfaces:
        intfname  =  NodeAsText(interface.getElementsByTagName("interface"))
        intfstate =  NodeAsText(interface.getElementsByTagName("state"))
        intfdesc  =  NodeAsText(interface.getElementsByTagName("desc"))
        intfhwaddr =  NodeAsText(interface.getElementsByTagName("eth_hw_addr"))
        intfspeed =  NodeAsText(interface.getElementsByTagName("eth_speed"))
        intfduplex =  NodeAsText(interface.getElementsByTagName("eth_duplex"))
        intfinbytes =  NodeAsText(interface.getElementsByTagName("eth_inbytes"))
        intfoutbytes =  NodeAsText(interface.getElementsByTagName("eth_outbytes"))
        intfdict[intfname]={'state': intfstate, \
                          'desc': intfdesc, \
                          'access_vlan': "N/A", \
                          'mode': "L3", \
                          'speed': intfspeed, \
                          'duplex': intfduplex, \
                          'inbytes': intfinbytes, \
                          'outbytes': intfoutbytes, \
                          'HWaddr': intfhwaddr}
    return intfdict

def getSwitchport(xml,intfdict):
    interfaces = xml.getElementsByTagName("ROW_interface")

    # add the switchport details to the dictionary built by getIntf
    # each existing entry gains (or overwrites) these keys:
    # intfdict = {'intf': {'switchport': ..., 'access_vlan': ..., 'mode': ...}}
    for interface in interfaces:
        intfname  =  NodeAsText(interface.getElementsByTagName("interface"))
        intfswitchport =  NodeAsText(interface.getElementsByTagName("switchport"))
        intfvlan  =  NodeAsText(interface.getElementsByTagName("access_vlan"))
        intfmode =  NodeAsText(interface.getElementsByTagName("oper_mode"))
        intfdict[intfname]['switchport']= intfswitchport
        intfdict[intfname]['access_vlan']= intfvlan
        intfdict[intfname]['mode']= intfmode
    return intfdict
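The two-pass approach above (build the dictionary from "show interface", then overlay the "show interface switchport" fields) can be tried without a switch. The following is a minimal sketch in Python 3; the two XML samples are hypothetical and only reuse the tag names that appear in getIntf and getSwitchport, and the helper names text_of, rows and build_interfaces are assumptions for this illustration. Unlike the original, it skips switchport rows whose interface is not already in the dictionary.

```python
from xml.dom import minidom

# Hypothetical replies: tag names follow getIntf/getSwitchport, values are made up.
SHOW_INTERFACE = """<body><TABLE_interface>
  <ROW_interface><interface>Ethernet1/1</interface><state>up</state>
    <eth_speed>10 Gb/s</eth_speed></ROW_interface>
  <ROW_interface><interface>mgmt0</interface><state>up</state></ROW_interface>
</TABLE_interface></body>"""

SHOW_SWITCHPORT = """<body><TABLE_interface>
  <ROW_interface><interface>Ethernet1/1</interface><access_vlan>10</access_vlan>
    <oper_mode>access</oper_mode></ROW_interface>
</TABLE_interface></body>"""


def text_of(row, tag, default="__na__"):
    """Text of the first <tag> below row, or default when missing or empty."""
    nodes = row.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.data.strip()


def rows(xml_text):
    """All ROW_interface elements of one reply."""
    return minidom.parseString(xml_text).getElementsByTagName("ROW_interface")


def build_interfaces(show_interface_xml, show_switchport_xml):
    intfdict = {}
    # Pass 1: every interface starts as a routed (L3) port without a VLAN.
    for row in rows(show_interface_xml):
        intfdict[text_of(row, "interface")] = {
            "state": text_of(row, "state"),
            "speed": text_of(row, "eth_speed"),
            "access_vlan": "N/A",
            "mode": "L3",
        }
    # Pass 2: overlay the switchport fields for interfaces that have them.
    for row in rows(show_switchport_xml):
        entry = intfdict.get(text_of(row, "interface"))
        if entry is None:
            continue  # interface unknown to pass 1: skip instead of raising KeyError
        entry["access_vlan"] = text_of(row, "access_vlan")
        entry["mode"] = text_of(row, "oper_mode")
    return intfdict


interfaces = build_interfaces(SHOW_INTERFACE, SHOW_SWITCHPORT)
for name, info in interfaces.items():
    print(name, info)
```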

Click here to continue.


 
Posted By Kepler Lam

In the NX-OS 9K training, I created a sample web portal (by modifying some code from GitHub) to demonstrate the usage of the NX-API, and I want to share it here.

The purpose of the web portal is simply to let a user enter a list of Nexus 9K management IP addresses and then display some version information for each of them. The user can then select one of the 9Ks to view information about its interfaces, and finally select an interface to view more detail.

The portal is quite straightforward. The first screen is just a plain form that prompts the user to enter the list of IP addresses. The form action then calls the Python script WebMgr.py to process the form data (which is just the list of IP addresses).

[Figure: indexhtml]

Here's the HTML:

<h2>Nexus Web Manager</h2>

<form action="/cgi-bin/WebMgr.py" method="post">
     Switch IP Address List: <input type="text" name="IP">
     <input type="submit" value="Show">
</form>

The WebMgr.py Python script retrieves the form data with form.getvalue('IP'), then connects to each of the IP addresses (the username and password are HARD CODED inside the script; as an exercise, the reader can modify the form to prompt the user for them). It uses the Nexus API to run "show version", parses some of the information, and displays the results one by one in a table. In addition, the last table column contains a form button whose action calls Intf.py, passing the corresponding Nexus 9K management IP to it.

[Figure: Webmgr]
Here is WebMgr.py:

#!/usr/bin/env python
#

import cgi, cgitb

from xml.dom import minidom
from nxapi_utils import *

#################
#  MAIN MODULE  #
#################

# First things first: credentials. They should be parsed through sys.argv[] ideally ..
form = cgi.FieldStorage()
# Get data from fields
IP_STRING = form.getvalue('IP')
IP_LIST=IP_STRING.split(",")
user="admin"
password="dummy"

print("Content-type:text/html")

print """
<head>
<style>
table {
  font-family: arial, sans-serif;
  border-collapse: collapse;
  width: 100%%;
}

td, th {
  border: 1px solid #dddddd;
  text-align: left;
  padding: 8px;
}

</style>
</head>
<body>

<h2>Nexus Web Manager</h2>

<form action="/cgi-bin/WebMgr.py" method="post">
     Switch IP Address List: <input type="text" name="IP" value=%s>
     <input type="submit" value="Show">
</form>
<table>
  <tr>
    <th>IP Address</th>
    <th>Hostname</th>
    <th>Version</th>
    <th>Show Interfaces</th>
  </tr>
  <tr>
"""%format(IP_STRING)

for IP in IP_LIST:
   url = 'http://'+IP+'/ins/'
   cookie=GetiAPICookie(url, user, password)
   dom = minidom.parseString(ExecuteiAPICommand(url, cookie, user, password, "cli_show", "show version"))
   host_name=GetNodeDataDom(dom,"host_name")
   kickstart_ver_str=GetNodeDataDom(dom,"kickstart_ver_str")

   print "<tr>"
   print("<td>%s </td>" % (IP))
   print("<td>%s </td>" % (host_name))
   print("<td>%s </td>" % (kickstart_ver_str))

   form_str="""<td>
     <form action="/cgi-bin/Intf.py" method="post">
     <input type="hidden" name="IP_LIST" value=%s>
     <input type="hidden" name="IP" value=%s>
     <input type="submit" value="Manager">
     </form>
   </td>
   """%(IP_STRING,IP)
   print form_str

   print("</tr>")

print "</table>"
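The script above splits the form value on commas and trusts the result. As a possible hardening step, the sketch below (Python 3, standard library only) trims whitespace, drops empty items and separates out anything that is not a valid IP address. The function name parse_ip_list is an assumption for this illustration and is not part of the original scripts.

```python
import ipaddress


def parse_ip_list(raw):
    """Split a comma-separated string into (valid, rejected) lists of strings."""
    if not raw:
        return [], []          # form field missing or empty
    valid, rejected = [], []
    for item in raw.split(","):
        candidate = item.strip()
        if not candidate:
            continue           # ignore empty items such as "a,,b"
        try:
            valid.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            rejected.append(candidate)
    return valid, rejected


valid, rejected = parse_ip_list("10.1.1.1, 10.1.1.2,,not-an-ip")
print(valid)
print(rejected)
```

Only the entries in the first list would then be used to build the NX-API URL; the rejected ones could be reported back on the page.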

Inside the Intf.py script, just as in WebMgr.py, after retrieving the form data (the IP address), it uses the Nexus API to run the "show interface" and "show interface switchport" commands to get information about the interfaces, such as status and VLAN. Again, the result is displayed as a table, and the last column contains a form button to display the details of that interface. This time the form action calls the Intf.py script again with one additional piece of information: the interface name.

[Figure: intf1]
 

Please follow this link to continue with next part.


 
Posted By Kepler Lam

As in the previous blog, I want to compare and relate some Cisco network features with the AWS VPC. Here let's look at the NAT feature in AWS. I want to focus on the concept and mechanism; please refer to the AWS documentation for the detailed configuration.

If you are familiar with the NAT function in Cisco routers, there are basically 3 different types:

  1. One-to-one (static NAT)
  2. Many-to-many (dynamic NAT)
  3. Many-to-one (PAT)

Actually, you can have the same 3 kinds of NAT configuration in an AWS VPC. To understand this, you need to understand the logical layer 3 architecture of the VPC and the address assignment in AWS.

In fact, the logical routing structure of AWS is quite straightforward. After you create the VPC, you have a VPC router that routes between the internal subnets (with private IP addresses) within the VPC. To reach the Internet, there is another Internet router that is logically connected to the VPC router, and the VPC router has a default route pointing to the Internet router, as in the following diagram:

[Figure: AWSnat]
 

The Internet Router is also responsible for the NAT.

Obviously, to access the Internet your instance (VM) requires a global IP address. In AWS, there are 2 kinds of global IP address (the names are a bit confusing):

  1. Elastic IP – by default, AWS lets every account allocate up to five global Internet IP addresses (per region). These IP addresses are owned by your account, so they are not shared with others, and you can freely map them to any private IP address in your VPCs. But AWS DOES charge for an Elastic IP if you allocate it but do not associate it with any instance, or if it is associated but the instance is stopped (please refer to the AWS pricing; I am not focusing on price in this technical blog, but just be aware of it).
  2. Public IP - allocated from a pool of global IP addresses maintained by Amazon. This address pool is shared by all users and addresses are assigned dynamically, which means you may not be able to keep one permanently: AWS may release the IP address from your instance and assign it another one. If you require a persistent global IP address, you should use an Elastic IP.

Now let's discuss how to implement the 3 different types of NAT in AWS.

One-to-one

One-to-one NAT is usually used because your instance is acting as a public server that requires a fixed global IP address. From the above discussion of global IP address types, you have most likely already figured out which kind of global IP to use. Yes, Elastic IP. You need to take one of the Elastic IP addresses from your account and assign it to the interface of the instance.

Note that from the configuration point of view, it seems that the interface now has 2 IP addresses: one private address from the internal subnet, and the Elastic IP (like multihoming). But actually it does not! The private-to-public address translation occurs on the Internet router, just like normal NAT in a standard network.

Many-to-many

You may want to use this kind of NAT if your instance needs to run an application that is not PAT friendly, such as an application that requires a fixed port number. If so, you can either enable the allocation of a public IP (not Elastic) for instances on a subnet, or enable the allocation directly on the instance itself. Just as in the one-to-one case, this public IP is not actually configured on the interface of the instance (which still has only the private IP address); instead the mapping is implemented in the Internet router.

Many-to-one

This is actually the most common case, for instances that just need to access the Internet as clients. They can share a common global IP address, using different port numbers when going out to the Internet.

To use this kind of NAT, you need to allocate a NAT Gateway or a NAT instance. From a functional point of view, the two are more or less the same; the only difference is that a NAT instance is implemented by a Linux instance, which is actually cheaper (as AWS does charge for NAT Gateway usage). The NAT Gateway (or instance) needs to be assigned an Elastic IP address.

The NAT Gateway is just like a one-armed router. Its interface is also on a subnet of your VPC with a private IP address, just like the other instances, and the default route in the VPC router needs to be changed to point to the NAT Gateway. For traffic bound for the Internet, the VPC router sends the packet to the NAT Gateway, which changes the source address to its own interface address (which is still a private IP) using PAT, i.e. the port number may change. The NAT Gateway itself has a default route to the Internet Gateway (where NAT also occurs), so the packet is then sent to the Internet Gateway, which changes the source IP address to the Elastic IP address of the NAT Gateway.
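To illustrate the many-to-one idea, here is a toy PAT table in Python 3. It is only a conceptual model of port address translation in general, not a description of how AWS implements the NAT Gateway; the class name PatTable, the starting port and the addresses are all assumptions chosen for the example.

```python
import itertools


class PatTable:
    """Toy many-to-one NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._in = {}    # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the (public_ip, public_port) used for this private flow."""
        key = (private_ip, private_port)
        if key not in self._out:
            public_port = next(self._next_port)
            self._out[key] = public_port
            self._in[public_port] = key
        return self.public_ip, self._out[key]

    def translate_inbound(self, public_port):
        """Return the private (ip, port) for a reply, or None if unknown."""
        return self._in.get(public_port)


pat = PatTable("203.0.113.10")
a = pat.translate_outbound("10.0.1.5", 40000)
b = pat.translate_outbound("10.0.2.7", 40000)   # same source port, other host
print(a, b)
print(pat.translate_inbound(b[1]))
```

Two hosts using the same source port end up on different public ports of the one shared address, which is why the return traffic can still be delivered to the right instance.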

 

[Figure: AWSpat]
 

 


 
Posted By Kepler Lam

In this blog entry, I want to compare some basic concepts of Cisco ACI with the AWS VPC. ACI is Cisco's SDN solution for building a private cloud, while the AWS VPC service is a public cloud solution for the data center network.
Before the discussion, let's look at the term SDN first. There are different interpretations of SDN, so what is its most fundamental meaning? After I discuss the traditional hardware-based network, you should be able to define SDN. Think about the old days: you have 2 sets of servers and, for security reasons, you want to put them in 2 different "domains", i.e. subnets. Obviously, you also need to allow them to communicate. What network devices do you need? This is actually the most basic form of network: you may deploy 2 switches (or one switch with 2 VLANs) and connect the switches with a router, as in the figure below:

[Figure: ACIvsVPC_Phys]

What is the corresponding logical network? In today's data center, how would you set up the corresponding infrastructure? First of all, we don't use physical servers anymore; VMs are deployed instead. If I refer to these as Software Defined Servers, then you should understand what I mean by Software Defined Network. We want to use software to create (define) a logical network, then use this logical network to connect those logical servers, which are the VMs, as below:

[Figure: ACIvsVPC_Logic]

That's the motivation for ACI. Of course we still need a physical network (just like physical servers), which consists of a set of Nexus 9Ks, but on top of it we use an overlay to create logical (or virtual) networks (similar to the concept of creating VMs inside a physical server). Consider that any physical network topology can logically be viewed as a layer 3 core network connecting different layer 2 segments, as in the figure below:

[Figure: ACIvsVPC_Gen]

No matter how many routers are inside the layer 3 core, it can be reduced to one single router, as in Figure 2.
So Figure 2 is the most basic form of a network; it is the fundamental building block. In ACI, it is referred to as a context (technically speaking, it is a VRF), while in AWS it is called a VPC. Of course, you can create many contexts within one tenant; similarly, you can have multiple VPCs in your AWS account.
In ACI, a layer 2 domain is called a Bridge Domain (BD), while VPC just uses the term subnet. When you create a bridge domain, you assign the subnet by actually assigning the default gateway IP address. Hosts attached to the bridge domain can use that IP as their gateway. A subnet in a VPC, by contrast, defines only the subnet address, and the default gateway IP seems a bit magic: when you start a VM (AWS refers to it as an instance), it automatically gets an IP address from the subnet and has its default gateway set (AWS hands these out through DHCP, and the VPC router takes the first host address of the subnet).

From a security point of view, ACI has one more level inside the Bridge Domain, called the EPG (End Point Group). You can have multiple EPGs within a BD. There is no traffic control within an EPG; to allow traffic between 2 different EPGs, you need to define contracts (somewhat like ACLs without IP addresses) between them.
Hosts are assigned to an EPG. For bare metal, you assign the connected physical port. For VMs, Cisco integrates with the hypervisor system (VMware, Hyper-V): the EPG is mapped to a Port Group in vCenter and assigned to the vNIC of the VMs. The advantage of using EPGs is that contracts are decoupled from IP addresses, and EPG membership does not change when VMs are moved across different ESXi servers using vMotion.
In the case of VPC, you don't need to manage a separate VM system, because the AWS EC2 service already provides the VM service. In fact you can only launch VMs (instances); you don't need bare metal servers anymore. Thus VPC doesn't require something like an EPG, but you can assign a security group (somewhat like a port ACL) to control the inbound and/or outbound traffic of an individual instance, or use a network ACL (like a router ACL) assigned to the corresponding subnet to control traffic to and from the subnet.
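The EPG and contract idea can be modeled in a few lines. The sketch below (Python 3) is a deliberately simplified illustration and not the ACI object model: real contracts have subjects and filters, and the class name Fabric, its methods and the sample endpoints are assumptions made up for this example. It only captures the two rules described above: traffic inside an EPG is not controlled, and traffic between EPGs needs a contract.

```python
from dataclasses import dataclass, field


@dataclass
class Fabric:
    epg_of: dict = field(default_factory=dict)    # endpoint name -> EPG name
    contracts: set = field(default_factory=set)   # (consumer EPG, provider EPG, port)

    def attach(self, endpoint, epg):
        """Put an endpoint (bare metal port or VM) into an EPG."""
        self.epg_of[endpoint] = epg

    def add_contract(self, consumer_epg, provider_epg, port):
        """Allow consumer_epg to reach provider_epg on one port."""
        self.contracts.add((consumer_epg, provider_epg, port))

    def allowed(self, src, dst, port):
        src_epg, dst_epg = self.epg_of[src], self.epg_of[dst]
        if src_epg == dst_epg:
            return True            # no traffic control inside an EPG
        return (src_epg, dst_epg, port) in self.contracts


fabric = Fabric()
fabric.attach("web1", "web")
fabric.attach("web2", "web")
fabric.attach("db1", "db")
fabric.add_contract("web", "db", 3306)

print(fabric.allowed("web1", "web2", 22))    # same EPG
print(fabric.allowed("web1", "db1", 3306))   # covered by the contract
print(fabric.allowed("web1", "db1", 22))     # no contract for this port
```

Note that the rule refers to EPG names only, never to IP addresses, which is the point of the comparison with an ACL.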

The following figure shows the above concepts:

[Figure: Ctx_VPC]
 


 
Posted By Kepler Lam

I want to change to a non-technical topic. Though I am not a travel blogger, this time I really want to share my aurora (northern lights) viewing experience in Yellowknife (Canada) just last month (late September).

[Figure: day1]

The weather there was really nice and not cold, around 5-8°C at night. We stayed 3 nights in YK, and thank God I had a spectacular view of the aurora on the first night and for most of the 2nd night (photo above). Though the 3rd night was quite cloudy, we were still able to see a bit of aurora (below). When I saw this amazing show in the sky, how could I not sing the hymn How Great Thou Art!

[Figure: day3]
 

Some people told me that they went to Europe to see the aurora but couldn't see much, and some couldn't see any at all. There are 2 factors affecting the visibility of the aurora:

  1. The weather – the sky should be clear, without much cloud
  2. The strength of the aurora

Located in the northern interior of Canada, Yellowknife has a higher chance of clear skies and it is also under the Aurora Oval, i.e. it satisfies both conditions above. The YellowKnife Village website mentions that the aurora can be seen there up to 240 days per year.

To see the aurora, it obviously needs to be dark. In summer, sunset is very late, so the recommended viewing season is from late August to early April (excluding October). Personally, I think those who (like me) live in a warmer place had better go before the end of September, as the temperature is still acceptable. One of my friends visited in late August and also saw a very beautiful aurora without it being too cold.

Besides the high chance of seeing the aurora, the cost is not that high (compared to Europe). Excluding the round-trip air ticket between Hong Kong and Vancouver, and counting only the trip starting from Vancouver:

  1. Round-trip air ticket between Vancouver and Yellowknife – around CAD 750/person
  2. 3-night hotel plus tour (aurora village at night) package - around CAD 700/person

To fit your budget, I suggest 3 different options:

  1. Book a flight+hotel package, then book the tour separately for each night
  2. Book the flight only, then book a hotel+tour package
  3. Book a flight+hotel+tour package from a tour agent

Option 1 is the most flexible and should be the cheapest. Say you go for 3 nights: you may already have seen the aurora on the first 2 nights, and if the weather forecast for the 3rd night is cloudy, you can save money by not going on the 3rd night. But you need to consider the availability of the night tour (booked on the same day), especially in peak season, as it may be full.

So as a compromise (which is what I did), you can consider option 2. Though you may waste money and time if the weather forecast for a particular night is unfavorable, you are guaranteed a seat.

Option 3 may be the most expensive, but it is obviously the easiest way. However, we found that most travel agents only provide winter tours (not before November).

As for photography, I am not an expert at all. But as a general recommendation, before you go you should get familiar with the manual operation of your camera (at least know how to set the ISO level, shutter speed and aperture) and with how to use your tripod to point at different areas of the sky. You know what, I spent less than half an hour practicing, just a few hours before going to view the aurora!
 


 
Posted By Kepler Lam

In a customized OpenStack training, the client asked for an example of using Python code to create an instance. Here I want to share how to do it.

Basically the script consists of the following steps:

  1. Import the python client library
  2. Get the authentication information
  3. Create a client object by passing the authentication parameters
  4. Use the client object to perform different operations

Let me show you the code for each step.

Import the python client library

from novaclient import client

Get the authentication information

Create a dictionary variable and get the credential information from corresponding environment variables.

import os

def get_nova_creds():
    d = {}
    d['username'] = os.environ['OS_USERNAME']
    d['api_key'] = os.environ['OS_PASSWORD']
    d['auth_url'] = os.environ['OS_AUTH_URL']
    d['project_id'] = os.environ['OS_TENANT_NAME']
    return d
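If one of the environment variables is not set, the function above fails with a bare KeyError. A possible variation, sketched below for Python 3, collects all missing variables and reports them together. The function name get_nova_creds_checked, the REQUIRED table and the optional environ parameter are assumptions added for this illustration; the keys and variable names are the ones used above.

```python
import os

# Keyword expected by the client -> environment variable (same pairs as above).
REQUIRED = {
    "username": "OS_USERNAME",
    "api_key": "OS_PASSWORD",
    "auth_url": "OS_AUTH_URL",
    "project_id": "OS_TENANT_NAME",
}


def get_nova_creds_checked(environ=None):
    """Return the credential dictionary, or raise listing every missing variable."""
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in REQUIRED.items()}


# Demo with a plain dictionary standing in for os.environ (values are made up).
demo_env = {
    "OS_USERNAME": "demo",
    "OS_PASSWORD": "secret",
    "OS_AUTH_URL": "http://controller:5000/v2.0",
    "OS_TENANT_NAME": "demo-project",
}
creds = get_nova_creds_checked(demo_env)
print(sorted(creds))
```

Passing a dictionary instead of reading os.environ directly also makes the function easy to try out without sourcing an OpenStack RC file first.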

Create a client object by passing the authentication parameters

creds = get_nova_creds()
nova = client.Client('2.0',**creds)

Use the client object to perform different operations

Once you have created the Nova client object, you can use it to look up the information (such as image, flavor, etc.) required to launch a new instance. The following shows an example:

image = nova.images.find(name="cirros-0.3.4-x86_64")
flavor = nova.flavors.find(name="m1.tiny")
network = nova.networks.find(label="mynet2")

Now you can use the client object to launch a new instance:

server = nova.servers.create(name = "myinstance4",
                                 image = image.id,
                                 flavor = flavor.id,
                                 nics = [{'net-id':network.id}],
                                 )

 

 


 
Posted By Kepler Lam

When a tenant network is created in any project, Neutron allocates a unique VLAN number (which OpenStack refers to as the segment ID) for that tenant network. Note that this VLAN number is ONLY used in the physical network, NOT inside the OVS of the compute node. This is the most confusing point, as OpenStack beginners often have the misconception that the segment ID is used internally in the compute nodes.

Now let’s see what happens in the compute node. When you create a new VM and attach it to your tenant network, the Nova component of OpenStack finds a compute node on which to launch your VM. Neutron then allocates an “internal” VLAN in the OVS inside that compute node and puts your VM in that internal VLAN. Neutron instructs the OVS to map the internal VLAN traffic to the physical VLAN when it goes out through the physical NIC of the compute node, and vice versa. Obviously, if 2 VMs on the same compute node are on the same tenant network, they are put in the same internal VLAN.

The term “internal” here means local: the VLAN value in the OVS may differ from one compute node to another, even for the same tenant network. But they are all translated to the same VLAN of the provider network when traffic goes onto the physical network. In short, the provider VLAN bridges the internal VLANs of all compute nodes for the same tenant network.
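The local-versus-provider mapping can be modeled with a small table per compute node. The sketch below (Python 3) is a conceptual toy, not Neutron or OVS code: the class name ComputeNodeOVS, its methods and the VLAN numbers are assumptions chosen for the example. It shows two nodes using different internal VLANs for the same segment ID, and both translating back to that segment ID on the way out.

```python
class ComputeNodeOVS:
    """Toy model of one compute node: internal VLANs are allocated per node."""

    def __init__(self, name, first_local_vlan=1):
        self.name = name
        self._next_vlan = first_local_vlan
        self.local_vlan_of = {}   # segment ID -> internal (local) VLAN

    def plug_vm(self, segment_id):
        """Attach a VM to a tenant network; reuse the internal VLAN if one exists."""
        if segment_id not in self.local_vlan_of:
            self.local_vlan_of[segment_id] = self._next_vlan
            self._next_vlan += 1
        return self.local_vlan_of[segment_id]

    def egress(self, local_vlan):
        """Translate internal VLAN -> segment ID when a frame leaves the node."""
        for segment_id, vlan in self.local_vlan_of.items():
            if vlan == local_vlan:
                return segment_id
        raise KeyError(local_vlan)


node1 = ComputeNodeOVS("compute1")
node2 = ComputeNodeOVS("compute2")
node1.plug_vm(300)          # another tenant network already lives on compute1
v1 = node1.plug_vm(200)     # tenant network with segment ID 200
v2 = node2.plug_vm(200)     # same tenant network on the other node
print(v1, v2)
print(node1.egress(v1), node2.egress(v2))
```

The internal VLAN numbers differ between the nodes, yet both frames carry segment ID 200 on the physical network, which is what puts the two VMs in the same layer 2 domain.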

[Figure: tnet2pnet]

Why is it so complicated? Why not use the VLAN number directly inside the OVS so that no translation is required? From my point of view, if you use VLAN for the provider network, it does seem unnecessary. But what about VXLAN and GRE? The VLAN address space is not large enough for a one-to-one mapping to VXLAN or GRE segment IDs, so it makes sense to use a local VLAN number within the compute node: you will never put all tenant networks on one single node, so the number of VLANs should be enough within a single node. In fact, Cisco DFA uses a similar trick in the leaf switch.

Once you understand the above implementation, there should be no magic in the case of using GRE or VXLAN as the provider network.

OK, let’s see what happens if we use VXLAN (or GRE) for the physical network. The only difference is that when you create a tenant network, instead of allocating a VLAN for that network segment, OpenStack allocates a VXLAN ID (VNID) (or, for GRE, a key) for that network. Once again this ID is referred to as the segment ID.

Then when you create an instance (VM) on that tenant network, once again OpenStack finds a compute node on which to launch your VM. The VM is still put onto one of the internal VLANs of the OVS in that compute node, because internally OVS uses VLAN (neither VXLAN nor GRE). OpenStack instructs the OVS to remember the mapping between the internal VLAN and the segment ID, so when an Ethernet frame leaves the compute node for the physical network, the OVS encapsulates it in the corresponding VXLAN using the mapped VNID (segment ID). Similarly, in the case of GRE, the frame is encapsulated in the corresponding GRE tunnel with the mapped key.

In simple words, it just makes use of VXLAN (or GRE) to bridge the internal VLANs of the compute nodes, so that all those VLANs are in the same layer 2 domain.


 
Posted By Kepler Lam

This continues my previous blog entry, in which I mentioned that OpenStack can make use of VLANs in the physical network for tenant network segregation.

Yet what is the limitation of VLANs? What is the maximum number of VLANs you can use? Yes, only 4K, and they must be shared by all the tenants in your cloud. Also, the physical NICs of all your compute nodes need to be on the same layer 2 network.

Then what’s the solution? If you have followed my previous blogs, you will have figured out that VXLAN is one of the promising solutions, as the VNID of VXLAN is a 24-bit address space, i.e. 16 million LAN segments. Moreover, with VXLAN the physical NICs of the compute nodes need not be on the same layer 2 network; they can be in different subnets of the physical network, so the nodes can be anywhere in your data center.

Besides VXLAN, Neutron provides another option, which is the traditional GRE tunnel. GRE is just like VXLAN in that both are tunneling technologies that use an IP network to carry encapsulated Ethernet frames. However, GRE is point-to-point in nature, while VXLAN can make use of IP multicast to carry multi-destination Ethernet frames. The GRE header has a 32-bit key field that can be used to identify the tenant network.
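The "4K" and "16 million" figures follow directly from the width of each identifier field. The short Python 3 calculation below makes the arithmetic explicit; the dictionary and function names are just for this example. (For VLAN, IDs 0 and 4095 are reserved by 802.1Q, so 4094 of the 4096 values are usable.)

```python
# Width in bits of the identifier each technology uses for a network segment.
ID_BITS = {
    "VLAN ID (802.1Q)": 12,
    "VXLAN VNID": 24,
    "GRE key": 32,
}


def id_space(bits):
    """Number of distinct values an identifier of the given width can take."""
    return 2 ** bits


for name, bits in ID_BITS.items():
    print(f"{name}: {bits} bits -> {id_space(bits):,} values")
```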

To summarize, you have 3 choices:

  1. Use VLAN
  2. Use GRE
  3. Use VXLAN

Let me discuss the details one by one.

If you want to use VLAN, your compute nodes should reside in the same layer 2 domain of your physical network, and the physical NICs of your compute nodes need to be connected to trunk ports of the uplink switch. All those trunk ports need to be in the same layer 2 domain, i.e. they cannot be routed, as in the figure below:

[Figure: OSVlan]

In the traditional Cisco 3-tier data center design, a layer 2 domain resides within a single aggregation block. As the layer 2 boundary is between the aggregation and the core, your compute nodes cannot be attached to access switches in different aggregation blocks unless you extend layer 2 over the core.

[Figure: 3tier]

That’s the reason Cisco Nexus provides the FabricPath technology, so that you can extend layer 2 anywhere in your data center. Cisco DFA and ACI are similar solutions.

Coming back to OpenStack, I will discuss the relationship between the tenant network and the VLAN of your physical network in the next blog entry. Please follow here to part 3 of this blog.


 

 

 