DAT: Distributed ARP Table for B.A.T.M.A.N.

Antonio Quartulli

Abstract

In wireless mesh networks handling broadcast packets in an efficient way is still an open problem. The ARP-REQ is a packet sent in broadcast and it is stricly needed to establish a IP stream between two hosts. In case of packet loss it will obviously increase the initial connection delay. In a layer 2 mesh DAT wants to optimise the ARP-REQ/REP procedure to avoid the use of broadcast communications as much as possible and rely on the much efficient unicast ones mixing mesh and DHT

Additional Information

 D.A.T. Distributed ARP Table

===================================

To make ARP mechanism more reliable it would be better to avoid the use of broadcast packets, which introduce a lot of delay due to their poor reliability in particular in large mesh networks.  DAT uses the DHT (Distributed Hash Table) concept to store and retrieve information over the network. In particular, instead of broadcasting an ARP_REQUEST packet, a node should calculate the node(s) address on which the information has been stored and directly contact it using unicast. 

Moreover, each node should keep a ARP cache so that, if a new client comes and ask for an already known IP address, the node should directly reply instead of requesting a translate.

The ARP cache should have a timeout such that, if packets directed or coming from a particular IP are not seen for a certain period, the entry has to be deleted (Inspect every IP packet?! A better solution could be to use a timeout bigger than the classical ARP-table timeout and to refresh the entry time using successive ARP-reply).

1. Advertising addresses

-------------------------

An inspection mechanism should be provided such that, for any ARP_REPLAY packet coming into the soft interface, the SRC IP has to be used as _key_ for a local hash table. If an entry is already present than the advertising procedure can terminate. Otherwise:
- an entry containing [IP,MAC] should be added to the local hashtable
- the value of K1 = DHT_HASH1(IP) should be calculated 3 orig_node, such that   their O1, O2 and O3 are the nearest to but not greater than K1, have to be retrived (where Oi = DHT_HASH2(orig_node->orig))
- a DHT_STORE(IP,MAC) message should be sent to such orig_nodes so that they can store the couple [IP,MAC] in their hashtable.
At this point the advertising procedure can terminate.

2. Retriving addresses

-----------------------

When an ARP_REQUEST message goes into the soft interface, a snooping mechanism should detect it and instead of forwarding it as usual, the requested IP is used as follow:
the IP should be used as key in the local hash table and, if an entry of the form [IP,MAC] is found, an ARP_REPLY with MAC as result is delivered.
otherwise:
- the value of K1 = DHT_HASH1(IP) should be calculated
- 3 orig_node, such that their O1, O2 and O3 are the nearest to but not greater than K1, have to be retrived (where Oi = DHT_HASH2(orig_node->orig))
- a DHT_GET(IP) message should be sent to such orig_nodes so that, if one of them had the couple [IP,MAC] in its hash table, it could reply with a DHT_REPLY(MAC) message
- the requesting node should at this point deliver the ARP_REPLY containing the retrived MAC address and store the couple [IP,MAC] for further requests.
The retriving phase is finished.  If no one of the three contacted nodes will reply with a MAC address, the ARP_REQUEST has to be broadcasted over all the network. See the next section for this.

3. Handling Unreachability

---------------------------

Due to the dinamic and expansive nature of mesh networks, it is possible to build such topologies where some nodes have a different knowledge of the network with respect to others. For example, since the starting TTL value of the OGM is 50, in a network where the diameter is greater than this value, there could be nodes that will never heard nodes on the other side of network. This problem will clearly lead to a DHT inconsistency, since, for example, a node A could store a couple [IP,MAC] on a node K that A can reach but B cannot. At this point B will try to send the DHT_GET message but it will choose an uncorrect destination as it don't know anything about node K.  
To avoid this problem several countermeasure have to be taken:
- storing the information on 3 nodes will, at least, reduce the probability of getting into this issue
- a fallback mechanism should be taken into account so that, if no one of the 3 orig_nodes reply with a DHT_REPLY (we could think to a DHT_NACK to avoid waiting for timeouts)
- a unicast flooding mechanism can be used.
It is possible to see how this situation is going to happen each time an host issues an ARP_REQUEST packet requesting the MAC addr for a never seen IP: no nodes in the network will know the MAC address and then none of three queried node will reply with the requested data.

The broadcast procedure has room for improvement.

4. Managing hash tables

------------------------

The local hash table should provide a purging mechanism so that obsolete entries can be purged out as soon as they are not used anymore.