Posted By Kepler Lam

As in the previous blog, I want to compare and relate some Cisco network features with the AWS VPC. This time, let's look at the NAT feature in AWS. I will focus on the concept and mechanism; please refer to the AWS documentation for the detailed configuration.

If you are familiar with the NAT function in Cisco routers, there are basically 3 different types:

  1. One-to-one (static NAT)
  2. Many-to-many (dynamic NAT)
  3. Many-to-one (PAT)

You can have these same 3 kinds of NAT configuration in an AWS VPC. To understand how, you first need to understand the logical layer 3 architecture of the VPC and the way AWS assigns addresses.

In fact, the routing (logical) structure of AWS is quite straightforward. After you create a VPC, you have a VPC Router that routes between the internal subnets (with private IP addresses) within the VPC. To reach the Internet, there is an Internet Router logically connected to the VPC Router, and the VPC Router has a default route pointing to it. See the following diagram:



The Internet Router is also responsible for NAT.

Obviously, to access the Internet your instance (VM) requires a global IP address. In AWS, there are 2 kinds of global IP address (the naming is a bit confusing):

  1. Elastic IP – AWS allows up to five Elastic IP addresses per account (per region) by default. These addresses are owned by your account, so they are not shared with others, and you can freely map one to any private IP address in your VPCs. But AWS DOES charge for an Elastic IP that you allocate but do not associate with any instance, or that is associated with a stopped instance (please refer to AWS pricing; I am not focusing on price in this technical blog, just be aware of it).
  2. Public IP – allocated from a pool of global IP addresses maintained by Amazon. This pool is shared by all users and dynamically assigned, which means you may not keep the same address permanently: AWS may release the IP address from your instance and assign a different one. If you require a persistent global IP address, you should use an Elastic IP.

Now let's discuss how to implement the 3 different types of NAT in AWS.


One-to-one (static NAT)

One-to-one NAT is typically used when your instance acts as a public server that requires a fixed global IP address. From the discussion of global IP address types above, you can probably already figure out which kind to use: yes, an Elastic IP. You allocate one of the Elastic IP addresses of your account and associate it with the interface of the instance.

Note that from the configuration point of view, it looks as if the interface now has 2 IP addresses – the private address of the internal subnet and the Elastic IP (like a multihomed host) – but actually it does not! The private-to-public address translation occurs on the Internet Router, just like ordinary NAT in a standard network.


Many-to-many (dynamic NAT)

You may want this kind of NAT if your instance runs an application that is not PAT friendly, such as one that requires a fixed port number. If so, you can either enable automatic assignment of a public IP (not Elastic) for instances on a subnet, or enable it directly on the instance itself. Just as in the one-to-one case, this public IP is not actually configured on the interface of the instance (which still has only the private IP address); the mapping is implemented in the Internet Router.


Many-to-one (PAT)

This is the most common case, for instances that just need to access the Internet as clients. They can share a common global IP address, distinguished by different port numbers, when going out to the Internet.

To use this kind of NAT, you need to deploy a NAT Gateway or a NAT instance. From a functional point of view they are more or less the same; the main difference is that a NAT instance is an ordinary Linux instance, which is usually cheaper (AWS does charge for NAT Gateway usage). Either one must be assigned an Elastic IP address.

The NAT Gateway is just like a one-armed router: its interface is on a private subnet like any other instance of your VPC, and the VPC Router's default route must be changed to point to the NAT Gateway. For Internet-bound traffic, the VPC Router sends the packet to the NAT Gateway, which changes the source address to its own interface address (still a private IP) using PAT, i.e. the source port number may change. The NAT Gateway in turn has a default route to the Internet Gateway, where the one-to-one NAT occurs. So the packet is sent to the Internet Gateway, which changes the source IP address to the Elastic IP address of the NAT Gateway.




Posted By Kepler Lam

In this blog entry, I want to compare some basic concepts of Cisco ACI with the AWS VPC. ACI is Cisco's SDN solution for building a private cloud, while the AWS VPC service is a public cloud solution for the data center network.
Before the discussion, let's look at the term SDN first. There are different interpretations of SDN, but what is its most fundamental meaning? After I discuss the traditional hardware-based network, you should be able to define SDN yourself. Think back to the old days: if you had 2 sets of servers and, for security reasons, wanted to put them in 2 different "domains", i.e. subnets, while still allowing them to communicate, what network devices would you need? This is the most basic form of network: you might deploy 2 switches (or one switch with 2 VLANs) and connect the switches with a router, as in the figure below:


What is the corresponding logical network? In today's data center, how would you set up the corresponding infrastructure? First of all, we no longer use physical servers; VMs are deployed instead. If I call those Software Defined Servers, then you should understand what I mean by Software Defined Network: we want to use software to create (define) a logical network, and then use that logical network to connect the logical servers, which are the VMs. As below:


That's the motivation of ACI. Of course we still need a physical network (just like physical servers), which consists of a set of Nexus 9Ks, but on top of it we use an overlay to create logical (or virtual) networks (similar to the concept of creating VMs inside a physical server). For any physical network topology, you can logically view it as a core layer 3 network connecting different layer 2 segments, just like the figure below:


No matter how many routers are inside the layer 3 core, it can be collapsed to one single router, as in Figure 2.
So Figure 2 is the most basic form of a network: the fundamental building block. In ACI it is referred to as a context (technically speaking, a VRF), while in AWS it is called a VPC. Of course, you can create many contexts within one tenant; similarly, you can have multiple VPCs in your AWS account.
In ACI, a layer 2 domain is called a Bridge Domain (BD), while VPC just uses the term subnet. When you create a bridge domain, you assign the subnet by actually assigning the default gateway IP address; hosts attached to the bridge domain can use that IP as their gateway. A subnet in VPC, on the other hand, defines only the subnet address, and the default gateway IP is a bit of magic: when you start a VM (AWS refers to it as an instance), it automatically gets an IP address from the subnet together with the default gateway IP (AWS provides this through its built-in DHCP service).

From the security point of view, ACI has one more level inside the Bridge Domain, called the EPG (endpoint group). You can have multiple EPGs within a BD. There is no traffic control within an EPG; to allow traffic between 2 different EPGs, you need to define contracts (somewhat like ACLs without IP addresses) between them.
Hosts are assigned to EPGs: for bare metal, the connected physical port is assigned; for VMs, Cisco integrates with the hypervisor system (VMware, Hyper-V), so the EPG is mapped to a Port Group in vCenter and assigned to the vNIC of the VMs. The advantages of using EPGs are that contracts are decoupled from IP addresses, and that EPG membership does not change when VMs move across ESXi servers via vMotion.
In the case of VPC, you don't need to manage a separate VM system, because the AWS EC2 service already provides VMs. In fact you can only launch VMs (instances); you don't need bare metal servers anymore. Thus VPC doesn't require something like an EPG, but you can assign a security group (somewhat like a port ACL) to control inbound and/or outbound traffic of an individual instance, or use a network ACL (like a router ACL) assigned to the corresponding subnet to control traffic to and from that subnet.
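As a toy illustration of the EPG model (the EPG names and contract pairs below are made up, and real contracts carry provider/consumer roles and filters that I ignore here): traffic inside one EPG is always allowed, while traffic between two EPGs must match a contract.

```python
# Toy model of ACI's EPG/contract policy: intra-EPG traffic is permitted,
# inter-EPG traffic needs an explicit contract between the two EPGs.
contracts = {("web", "app"), ("app", "db")}   # hypothetical contract pairs

def is_allowed(src_epg, dst_epg):
    if src_epg == dst_epg:
        return True                            # no filtering within an EPG
    # contracts treated as unordered pairs for simplicity
    return (src_epg, dst_epg) in contracts or (dst_epg, src_epg) in contracts

print(is_allowed("web", "web"))  # True  (same EPG)
print(is_allowed("web", "app"))  # True  (contract exists)
print(is_allowed("web", "db"))   # False (no contract)
```

Notice that the decision never looks at an IP address, which is the point made above about contracts being decoupled from addressing.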

The following figure shows the above concepts:



Posted By Kepler Lam

Now for a change to a non-technical topic. Though I am not a travel blogger, this time I really want to share my aurora (northern lights) viewing experience in Yellowknife (Canada) just last month (late September).


The weather there was really very nice and not cold, around 5-8°C during the night. We stayed 3 nights in YK, and thank God I had a very spectacular view of the aurora on the first night and most of the 2nd night (photo above). Though the 3rd night was quite cloudy, we could still see a bit of aurora (below). When I saw this amazing show in the sky, how could I not sing the hymn How Great Thou Art!


Some people told me that they went to Europe to see the aurora but couldn't see much, and some saw nothing at all. There are 2 factors affecting the visibility of the aurora:

  1. The weather – there should be a clear sky, without much cloud
  2. Strength of the Aurora

Located in the northern interior of Canada, Yellowknife has a higher chance of clear skies and is also under the auroral oval, i.e. it satisfies both conditions above. The Yellowknife village website mentions that there are up to 240 days per year on which the aurora can be seen.

To see the aurora, it obviously needs to be dark, and in summer the sun sets very late. So the recommended viewing season is from late August to early April (excluding October). Personally, for those (like me) who live in a warmer place, it is better to go before the end of September, while the temperature is still acceptable. One of my friends who visited in late August also saw a very beautiful aurora without it being too cold.

Besides the high chance of seeing the aurora, the cost is not that much (compared to Europe). Excluding the round-trip air ticket between Hong Kong and Vancouver, counting just the trip starting from Vancouver:

  1. Round-trip air ticket between Vancouver and Yellowknife – around CAD 750/person
  2. 3-night hotel plus tour (Aurora Village at night) package – around CAD 700/person

To fit your budget, I suggest 3 different options:

  1. Book a Flight+Hotel package, then book the tour separately each night
  2. Book the flight only, then book a Hotel+Tour package
  3. Book a Flight+Hotel+Tour package from a tour agent

Option 1 is the most flexible and should be the cheapest. Say you go for 3 nights: if you are already able to see the aurora on the first 2 nights and the weather forecast for the 3rd night is cloudy, you can save money and skip the 3rd night. But you need to consider the availability of the night tour (booked on the same day), especially in peak season, as it may be full.

As a compromise (which is what I did), you can consider option 2. Though you may waste money and time if the weather forecast for a particular night is not favourable, you are guaranteed a seat.

Option 3 is probably the most expensive, but it is obviously the easiest way. However, we found that most travel agents only provide winter tours (not before November).

As for photography, I am not an expert at all. But as a general recommendation, before you go you should get familiar with the manual operation of your camera (at least know how to set the ISO level, shutter speed and aperture) and with using your tripod to point to different areas of the sky. You know what, I spent less than half an hour practising, just a few hours before going to view the aurora!

Posted By Kepler Lam

In a customized OpenStack training, the client asked for an example of using Python code to create an instance. Here I want to share how to do it.

Basically the script consists of the following steps:

  1. Import the python client library
  2. Get the authentication information
  3. Create a client object by passing the authentication parameters
  4. Use the client object to perform different operations

Let me show you the code for each step.

Import the python client library

import os

from novaclient import client

Get the authentication information

Create a dictionary variable and get the credential information from corresponding environment variables.

def get_nova_creds():
    d = {}
    d['username'] = os.environ['OS_USERNAME']
    d['api_key'] = os.environ['OS_PASSWORD']
    d['auth_url'] = os.environ['OS_AUTH_URL']
    d['project_id'] = os.environ['OS_TENANT_NAME']
    return d

Create a client object by passing the authentication parameters

creds = get_nova_creds()
nova = client.Client('2.0',**creds)

Use the client object to perform different operations

Once you create the Nova client object, you can use it to find out information (such as image, flavor etc.) that are required to launch a new instance. Following shows an example:

image = nova.images.find(name="cirros-0.3.4-x86_64")
flavor = nova.flavors.find(name="m1.tiny")
network = nova.networks.find(label="mynet2")

Now you can use the client object to launch a new instance:

server = nova.servers.create(name="myinstance4",
                             image=image,
                             flavor=flavor,
                             nics=[{'net-id': network.id}])
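The create call returns while the instance is still building, so a common pattern is to poll its status afterwards. The helper below is a generic sketch of that loop: the `get_status` callable is a stand-in for something like `lambda: nova.servers.get(server.id).status` (the status names follow Nova's usual BUILD/ACTIVE/ERROR convention).

```python
import time

def wait_for_status(get_status, target="ACTIVE", timeout=60, interval=1):
    """Poll get_status() until it returns `target`; raise on error/timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = get_status()
        if status == target:
            return status
        if status == "ERROR":
            raise RuntimeError("instance went into ERROR state")
        time.sleep(interval)
    raise TimeoutError("instance did not reach %s within %ss" % (target, timeout))

# With the nova client object above, you would call something like:
# wait_for_status(lambda: nova.servers.get(server.id).status)
```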



Posted By Kepler Lam

When a tenant network is created in any project, Neutron allocates a unique VLAN number (which OpenStack refers to as the segment ID) for that tenant network. Note that this VLAN number is used ONLY on the physical network and NOT inside the OVS of the compute node. This is the most confusing part, as OpenStack beginners often have the misconception that the segment ID is used internally in the compute nodes.

Now let's see what happens in the compute node. When you create a new VM and attach it to your tenant network, the Nova component of OpenStack finds a compute node to launch your VM. Then Neutron allocates an "internal" VLAN on the OVS inside that compute node and puts your VM on that internal VLAN. Neutron instructs the OVS to map the internal VLAN traffic to the physical VLAN when it goes out of the physical NIC of the compute node, and vice versa. Obviously, if 2 VMs are on the same tenant network, they will be put on the same internal VLAN.

The term "internal" here means local: the VLAN value on the OVS may be different on different compute nodes, even for the same tenant network. But the traffic is translated to the same VLAN of the provider network when it goes onto the physical network. In short, the provider VLAN bridges the internal VLANs of all compute nodes for the same tenant network.
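To make this bookkeeping concrete, here is a toy Python sketch (the network names, internal VLANs and provider segment IDs are all invented): each compute node allocates its local OVS VLANs independently, but every node maps the same tenant network to the same provider segment ID on the wire.

```python
# Provider segment IDs are global: one per tenant network, shared by all nodes.
provider_segment = {"tenant-net-A": 101, "tenant-net-B": 102}  # physical VLANs

class ComputeNode:
    """Each node hands out internal OVS VLANs in its own local order."""
    def __init__(self):
        self.local_vlan = {}          # tenant network -> internal OVS VLAN
        self.next_vlan = 1

    def plug_vm(self, net):
        if net not in self.local_vlan:     # first VM of this net on this node
            self.local_vlan[net] = self.next_vlan
            self.next_vlan += 1
        return self.local_vlan[net]

node1, node2 = ComputeNode(), ComputeNode()
node2.plug_vm("tenant-net-B")              # net B happens to land first on node2
# Same tenant network, possibly different internal VLANs per node...
print(node1.plug_vm("tenant-net-A"), node2.plug_vm("tenant-net-A"))  # 1 2
# ...but the same provider VLAN once traffic hits the physical network:
print(provider_segment["tenant-net-A"])    # 101
```

The OVS flow rules that Neutron installs are, in effect, the translation between `local_vlan` and `provider_segment` at the node's physical NIC.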


Why is it so complicated? Why not directly use the same VLAN number inside the OVS, so that no translation is required? From my point of view, if you use VLAN for the provider network, the translation does seem unnecessary. But what about VXLAN and GRE? The address space of VLAN is not large enough for a one-to-one mapping to VXLAN or GRE segment IDs, so it makes sense to use a local VLAN number within each compute node: you will never put all tenant networks on one single node, so the VLAN range is more than enough within a single node. In fact, Cisco DFA uses a similar trick in the leaf switch.

Once you understand the above implementation, there should be no magic in the case of using GRE or VXLAN as the provider network.

OK, let's see what happens if we use VXLAN (or GRE) for the physical network. Actually the only difference is that when you create a tenant network, instead of allocating a VLAN for that network segment, OpenStack allocates a VXLAN ID (VNID) (or, for GRE, a key) for that network. Once again this ID is referred to as the segment ID.

Then when you create an instance (VM) on that tenant network, OpenStack again finds a compute node to launch your VM. The VM is still put on one of the internal VLANs of the OVS in that compute node, because internally OVS uses VLANs (not VXLAN or GRE). Nonetheless, OpenStack instructs the OVS to remember the mapping between the internal VLAN and the segment ID. So when an Ethernet frame goes out of the compute node to the physical network, the OVS encapsulates it in the corresponding VXLAN using the mapped VNID (segment ID). Similarly, in the case of GRE, the frame is encapsulated in the corresponding GRE tunnel with the mapped key.

In simple words, VXLAN (or GRE) is just being used to bridge the internal VLANs of the compute nodes, so that all those VLANs end up in the same layer 2 domain.

Posted By Kepler Lam

This continues my previous blog entry, in which I mentioned that OpenStack can make use of VLANs in the physical network for tenant network segregation.

Yet what is the limitation of VLAN? What is the maximum number of VLANs you can use? Yes, only 4K, and that has to be shared among all the tenants in your cloud. Also, all your compute nodes' physical NICs need to be on the same layer 2 network.

Then what's the solution? If you have followed my previous blogs, you will have figured out that VXLAN is one of the promising solutions, as the VNID of VXLAN supports a 24-bit address space, i.e. 16 million LAN segments. Moreover, with VXLAN the compute nodes' physical NICs need not be on the same layer 2 network; they can be in different subnets of the physical network, and therefore anywhere in your data center.

Besides VXLAN, there is another option that Neutron provides: the traditional GRE tunnel. GRE is just like VXLAN in that both are tunneling technologies that use an IP network to encapsulate Ethernet frames. However, GRE is point-to-point in nature, while VXLAN can use IP multicast to carry multi-destination Ethernet frames. The GRE header has a 32-bit key field that can be used to identify the tenant network.

To summarize, you have 3 choices:

  1. Use VLAN
  2. Use GRE
  3. Use VXLAN

Let me discuss the details one by one.

If you want to use VLAN, your compute nodes must reside in the same layer 2 domain of your physical network, and the physical NIC of each compute node needs to be connected to a trunk port of the uplink switch. All those trunk ports need to be in the same layer 2 domain, i.e. traffic between them cannot be routed. Just like the figure below:


In the traditional Cisco 3-tier data center design, layer 2 domains reside within the same aggregation block. As the layer 2 boundary is between the aggregation and the core, your compute nodes cannot be attached to access switches in different aggregation blocks unless you extend layer 2 over the core.


That's the reason Cisco Nexus provides the FabricPath technology, so that you can extend layer 2 anywhere in your data center. Cisco DFA and ACI are similar solutions.
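For reference, the VLAN choice is configured in Neutron's ML2 plugin. A typical fragment looks like the following (the physical network name `physnet1` and the VLAN range are just illustrative values; each tenant network then gets its segment ID from this range):

```ini
# /etc/neutron/plugins/ml2/ml2_conf.ini (illustrative values)
[ml2]
type_drivers = flat,vlan
tenant_network_types = vlan

[ml2_type_vlan]
# tenant networks get their segment IDs (VLANs) from this range on physnet1
network_vlan_ranges = physnet1:100:199
```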

Talking back to OpenStack, the next topic is the relationship between the tenant network and the VLANs of your physical network. Let me discuss that relationship in the next blog entry. Please follow here to part 3 of this blog.

Posted By Kepler Lam

Recently I had a consultation request for OpenStack and Cisco DFA integration, which is a relatively new solution. I found that although there are documents and blogs discussing the network architecture of OpenStack, most of them focus on the internal network inside the compute node (hypervisor). There seems to be little information about the relationship between the physical network (which OpenStack refers to as the provider network) and the tenant network (a network defined in OpenStack). In this blog I try to bridge that gap, and I hope it will make it easier for networking people to understand how to design the physical network for OpenStack.

First, a brief introduction to OpenStack (in case you don't know what it is). OpenStack is not a single-vendor product; instead it is open source software developed by the community, who refer to it as a cloud operating system. From my point of view, if you are familiar with VMware, OpenStack is somewhat like vCenter: you can use OpenStack to manage different compute nodes, and create and launch VMs (OpenStack refers to them as instances) on those compute nodes. Of course, you can also create different LAN segments (virtual in some sense) for the VMs.

In a cloud environment, one of the important features is multi-tenancy support. OpenStack allows the administrator to create different tenants (referred to as projects). As a tenant administrator, you can create different tenant networks in your project and attach instances to different tenant networks. Basically, a tenant network is a layer 2 domain (like a traditional VLAN, or what Cisco now calls a bridge domain in ACI terms). So VMs on different tenant networks should be in different subnets, and a layer 3 device is needed to route traffic between them.

The following figure illustrates how the OpenStack web UI (dashboard) looks after creating 2 tenant networks, each with 2 VMs:


What I am going to discuss is the relationship between the tenant network and the provider (physical) network. Let's first look at the whole network picture. Your compute nodes are physically connected to your physical network, but the VMs are not directly connected to it; instead they are logically connected to an internal switch, the OVS, within the compute node. So logically it's like the following diagram (in fact it's more complicated than that; internally there is more than one bridge, but I want to simplify the picture so that networking people can grasp it quickly).


Now the question is: how is each tenant network mapped to the physical network? Recall that each tenant network (whether within the same tenant or different tenants) is a layer 2 domain, so the most natural way is to use VLANs in the physical network for layer 2 segregation. Indeed, one of the methods is to configure the OpenStack network component (called Neutron) to use a different VLAN for each tenant network's traffic.

However, besides VLAN, OpenStack provides 2 more solutions. Please follow here to part 2 of this blog.


Posted By Kepler Lam

In the DESIGN course, there was a question about how to bridge network segments in different sites onto the same layer 2 network. One of the obvious solutions is using a GRE tunnel between routers.

But here I am not going to discuss how to configure a GRE tunnel, as there are many documents on that. What I am going to discuss is: if you cannot change the router configuration (say, the router is not managed by you), can you directly bridge 2 or even more Windows PCs onto the same layer 2 network? The solution is the UBridge tool.

Some time ago, I discussed how to use the Ethernet-over-UDP feature of UBridge. With the new release, you can now use VXLAN encapsulation. Moreover, you can use head-end replication to bridge more than 2 PCs.

For example, suppose you have 3 PCs on 3 sites with IP addresses, and respectively, and you want to bridge them all over the subnet

Then you can click here to download the UBridge tool and install WinPcap. Create a loopback interface (or use an existing interface other than the one used for the VTEP IP) on each PC. Configure the loopback interfaces of the 3 PCs with IP addresses, and respectively.

Now, in PC1, execute:

C:\> ubdg 5000#V:E@ 5000#V:E@ 5000#W:E

Select the loopback interface as prompted.

Similarly, under PC2:

C:\> ubdg 5000#V:E@ 5000#V:E@ 5000#W:E

Similarly, under PC3:

C:\> ubdg 5000#V:E@ 5000#V:E@ 5000#W:E

You should be able to ping among the loopback interfaces of all the PCs.

There are other design choices too: for example, if you don't want a full-mesh configuration, you can use a hub-and-spoke design with the relay mode, and you can further encrypt the traffic. Please refer to my other VXLAN blog entries.

Posted By Kepler Lam

In this last part of this series of blog entries, I will discuss the encryption feature of UBridge (an open source tool for the Windows platform) for VXLAN. As mentioned before, VXLAN just carries Ethernet frames over an IP network without any encryption. Since the original design of VXLAN was for use within the data center only, the lack of encryption was not a big deal.

But if you extend your VXLAN over a public network, encryption may become an important feature. For such a requirement, UBridge now supports encryption of the original Ethernet frame.

Currently, only pre-shared key (PSK) configuration with the Blowfish encryption algorithm is supported. Stronger encryption algorithms may be supported in a future release; the original goal is for UBridge to be an ad hoc tool used, for example, in testing environments, not for prolonged use.

Like other PSK technologies, it is susceptible to brute force attack, so use it at your own discretion.


To configure encryption, you just need to define a PSK and configure it after the remote VTEP parameter, separated by an asterisk. Then when UBridge sends an Ethernet frame over that leg, the frame will be encrypted. Obviously, all other VTEP(s) need to be configured with the same key.

Following my previous example of the provider and access mode VTEPs in Part 3, suppose you want to encrypt traffic between them. Only the corresponding legs need to be configured with a key. Say you want to use 'secret' as the key.

Then for the provider mode leg:

  • Provider leg: 5000#V:E@*secret:8472

Now execute ubdg:

C:> ubdg 5000#V:E@ 5000#V:E@*secret:8472

For the access mode VTEP:

  • Access mode VXLAN leg: 5000#V:E@*:*secret:8000

Now execute ubdg:

C:> ubdg 5000#V:E@*:*secret:8000 5000#W:E

Posted By Kepler Lam

As discussed in the part 1 blog entry, all VTEPs must use the same UDP port number within the same VXLAN. This is a hindrance to extending the VXLAN to VTEPs behind a PAT device, unless static NAT or port mapping is used.

Furthermore, even unicast mode, which can be used among VTEPs over the Internet, still requires the VTEP addresses to be known to each other. That further prevents a VTEP device from moving around behind NAT devices, as its IP address is no longer static in such a case.

It is not surprising that VXLAN has such limitations, since most VTEP devices are static devices that do not move around. But UBridge runs on the Windows platform, i.e. you can run it on your laptop PC and move around anywhere.

Now, with a new innovation called dynamic VTEP, you can still bridge your Windows laptop even from behind a PAT device over the Internet, just like a remote access VPN. Dynamic VTEP consists of two different VTEP modes:

  1. Provider mode VTEP (like a VPN server) – allows a VTEP device to be configured without specifying the remote VTEPs; it dynamically learns the remote VTEP address and UDP port number.
  2. Access mode VTEP (like a VPN client) – allows a VTEP device without a fixed global IP to be placed behind any PAT device.

Note that the provider mode VTEP still requires a global IP and a fixed port number, though it can be statically mapped behind a NAT device.

To configure a provider mode VTEP and relay to another VXLAN, you just need to create a leg and specify '0' as the remote VTEP parameter, and create other legs for the other VXLAN domains.

For example, to configure your Windows machine as a provider mode VTEP: suppose your PC has IP address, which will be the VTEP address connected to the IP network, and you need to relay remote access mode VTEPs to an existing data center VXLAN that uses standard multicast mode with multicast group address, VNID 5000 and UDP port number 8472. (Of course, HER mode VTEPs can also be relayed.)


Then create 2 legs:

  • Bridge to data center VXLAN: 5000#V:E@
  • Provider leg: 5000#V:E@

Now execute ubdg:

C:> ubdg 5000#V:E@ 5000#V:E@

For the access mode VTEP, just put an asterisk in the local VTEP address parameter. For example, suppose your local PC is now behind a PAT router and has an internal IP address, and suppose the above provider mode VTEP is mapped to a global IP with UDP port 8000. You want to bridge a local loopback interface of your PC to the data center VXLAN. Then create 2 legs:

  • Access mode VXLAN leg: 5000#V:E@*:
  • Winpcap leg for your loopback (UBridge will let you select the NIC to be bridged): 5000#W:E

Now execute ubdg:

C:> ubdg 5000#V:E@*: 5000#W:E

Select the loopback interface when prompted.



