Let's Traceroute!

Background:

I am always amazed by the beauty of Linux commands. A few days ago, I was tinkering with different Linux commands and one of them was traceroute. Exploring it left me wondering how it works. Traceroute is a network troubleshooting utility that comes with Linux/Unix systems to trace the path your network packets take from source to destination.

Let’s say you are connecting to google.com from your local machine. Traceroute can help you trace the path your packets take to reach the Google server. Try this out in your terminal:

traceroute www.google.com

Your output will vary, but the shape will be similar: one numbered line per hop, showing the router's address and the round-trip times of the probes sent to it.

Concept:

The important concept here is to understand ‘TTLs’ in packets. Time To Live (TTL) is the number of router hops a packet may still make before it is discarded. Every IP packet carries a TTL field in its header. The sender initialises it to some value (commonly 64 or 128 depending on the operating system; 255 is the maximum the 8-bit field can hold), and each router that forwards the packet decrements it by one. So,

why do we even need TTL?

TL;DR: to stop lost packets from clogging the network. For example, suppose router A forwards to router B, router B to router C, and router C back to router A. Without a limit, a packet caught in this circular path would go around indefinitely.

The TTL value acts as a "hop limit" to prevent this. The idea is that if a packet has not reached its destination within that many hops, something is probably wrong with the route. So the router that decrements the TTL to zero discards the packet instead of forwarding it.
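To make the loop scenario concrete, here is a minimal sketch of a packet stuck in the A, B, C circle. The router names, the `ROUTING_LOOP` table and the `forward` function are all made up for illustration; real routers obviously do far more than this.

```python
# Hypothetical three-router loop: A -> B -> C -> A (names are illustrative).
ROUTING_LOOP = {"A": "B", "B": "C", "C": "A"}


def forward(start: str, ttl: int) -> tuple[str, int]:
    """Pass a packet around the loop until its TTL runs out.

    Returns the router that dropped the packet and how many routers handled it.
    """
    router = start
    handled = 0
    while True:
        handled += 1
        ttl -= 1                       # every router decrements the TTL by one
        if ttl <= 0:
            return router, handled     # TTL expired here: the packet is dropped
        router = ROUTING_LOOP[router]  # otherwise it goes to the next router


dropped_at, handled = forward("A", ttl=5)
print(f"Dropped at router {dropped_at} after {handled} routers handled it")
```

Without the TTL check the `while True` loop would never end, which is exactly the problem the field exists to prevent.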

Simple implementation (yes, simple, promise 😬):

Traceroute works by sending a series of probe packets, each with a gradually increasing "time to live" (TTL) value. The probes in this post are ICMP echo requests (ICMP, the Internet Control Message Protocol, is a network-layer protocol used for error reporting and diagnostics, with messages typically generated by routers and hosts). Many Unix traceroute implementations send UDP probes by default instead, but the TTL trick is the same. Each router on the way to the destination decrements the TTL value by one, and the router at which it reaches zero discards the probe and sends an ICMP Time Exceeded message back to the source host. The first probe (TTL 1) therefore reveals the first router, the second probe (TTL 2) the second router, and so on until the destination itself replies. By analysing the sequence of routers that respond and the round-trip time to each of them, network administrators can identify and diagnose problems with network connectivity.
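The key point, that each probe reveals exactly one router, can be sketched in a few lines. This is only a simulation under my own assumptions: `PATH`, `send_probe` and `trace` are hypothetical names, the addresses are invented, and the `max_ttl=30` default is just meant to mirror a common traceroute default.

```python
# Hypothetical path: three routers, then the destination (addresses are made up).
PATH = ["10.0.0.1", "10.0.0.2", "20.0.0.2", "50.0.0.3"]


def send_probe(path: list[str], ttl: int) -> tuple[str, str]:
    """Simulate one probe and return (responder, reply_type).

    A probe with TTL n expires at the n-th node on the path. If the TTL is
    large enough to reach the last node, the destination itself answers.
    """
    if ttl >= len(path):
        return path[-1], "echo reply"
    return path[ttl - 1], "time exceeded"


def trace(path: list[str], max_ttl: int = 30) -> list[str]:
    """Send probes with TTL 1, 2, 3, ... and collect whoever answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, reply = send_probe(path, ttl)
        hops.append(responder)
        print(f"{ttl:2d}  {responder}  ({reply})")
        if reply == "echo reply":  # the destination answered, so we are done
            break
    return hops


trace(PATH)
```

The list returned by `trace` is the route: one entry per TTL value tried.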

By now we've got the basic idea of how traceroute works. What we want as output is a trace of each attempt: the TTL used, the hops taken, and the ICMP response received.

The algorithm below, implemented in Python, prints out the journey our packet takes between hypothetical local and destination IPs.

# =================== CONFIG START =============================
network = {
    "127.0.0.1": "10.0.0.2", 
    "10.0.0.2": "20.0.0.2", 
    "20.0.0.2": "50.0.0.3"
}

local_ip = "127.0.0.1"
destination_ip = "50.0.0.3"

MAX_ALLOWED_HOPS = 255
# ==================== CONFIG END ==============================

# =================== HELPER START =============================
from typing import Optional


def get_next_hop(current_ip: str) -> Optional[str]:
    """
    Return the next hop IP address, or None if there is no onward route.
    """
    return network.get(current_ip)
# =================== HELPER END =============================

# =================== MAIN START =============================
def traceroute(local_ip: str, destination_ip: str) -> None:
    """
    Trace the path of a packet from the local IP to the destination IP.
    It uses the Time to Live (TTL) value of the packet to perform network hops till the destination is found or the maximum allowed number of hops is reached.
    It also prints out the ICMP responses encountered in each iteration.
    """
    is_found = False
    allowed_hops = 1  # start with a single hop allowed, i.e. ttl = 1; increased after each iteration

    while not is_found:
        if allowed_hops > MAX_ALLOWED_HOPS:  # give up once the ttl can go no higher
            raise Exception("Hops exceeded the maximum limit. Path not found!")

        current_ip = local_ip
        print(f"Allowed hops(TTL) -> {allowed_hops}")
        print(f"Source: {current_ip}")

        for i in range(allowed_hops):  # perform hopping till ttl
            current_ip = get_next_hop(current_ip)
            if current_ip is None:  # dead end: no route onwards from the previous hop
                raise Exception("No route to destination. Path not found!")
            print(f"Hop #{i+1}: {current_ip}")
            if current_ip == destination_ip:
                print("ICMP RESPONSE: Echo Reply!")
                print(f"Found destination {destination_ip}... with total hops {allowed_hops}")
                is_found = True
                break

        if not is_found:
            allowed_hops += 1  # destination not reached in this iteration: increase the ttl and restart the journey
            print("ICMP RESPONSE: Time Exceeded!")

        print("=======================================================")
# =================== MAIN END =============================

# Call main function
traceroute(local_ip=local_ip, destination_ip=destination_ip)

Please note that traceroute may not work every time, since network admins can decide to block ICMP traffic flowing through their routers for a variety of reasons (security and protection from DoS attacks, bandwidth conservation, compliance).
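When a hop stays silent, traceroute typically prints asterisks for it and carries on with the next TTL. Here is a rough sketch of that behaviour, assuming a made-up path where one router never answers; `PATH`, `SILENT`, `probe_reply` and `trace_with_timeouts` are hypothetical names, and a real tool would wait for a timeout rather than look up a set.

```python
from typing import Optional

# Hypothetical path and a router configured not to send ICMP replies.
PATH = ["10.0.0.1", "10.0.0.2", "20.0.0.2", "50.0.0.3"]
SILENT = {"10.0.0.2"}


def probe_reply(ttl: int) -> Optional[str]:
    """Return the address that answers a probe with this TTL, or None on timeout."""
    node = PATH[min(ttl, len(PATH)) - 1]  # the node where this probe ends up
    return None if node in SILENT else node


def trace_with_timeouts(max_ttl: int = 30) -> list[str]:
    """Build one output line per TTL, using '* * *' for hops that never answer."""
    lines = []
    for ttl in range(1, max_ttl + 1):
        reply = probe_reply(ttl)
        lines.append(f"{ttl:2d}  {reply if reply else '* * *'}")
        if reply == PATH[-1]:  # the destination answered, stop probing
            break
    return lines


print("\n".join(trace_with_timeouts()))
```

Note that the trace still completes: the silent router hides itself, but the hops behind it answer as usual.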

This is a sample implementation; I haven’t checked out the real one. But if you’re a curious soul, you can check out OpenBSD’s implementation here.

Such an interesting command and such a clever implementation by engineers! What do you think?