Traceroute and MTR not working in Virtual machines (Waiting for reply / no reply) and NAT explanation

So today I hit a bit of a weird situation while playing with an embedded version of Ubuntu running on Windows…

Usually, when I get started with a VM, I trigger a quick ping just to double-check it's got connectivity. This time I tried MTR instead (I was curious whether the embedded version would download packages as nicely as a normal version of Ubuntu would… not sure what I was expecting), and it showed “waiting for reply” for all the intermediate hops. Having never seen this before, I was stumped and didn’t know what was wrong at first…

I thought to myself, this is weird, and tested it on one of my other Ubuntu VMs that I’ve had running in VMware Workstation. It had a similar issue… very weird!

I then try a tracepath on both and see a “no reply” message on the Workstation VM (the embedded VM didn’t respond at all).

So I think maybe something is wrong with my firewall and try turning it off: no luck. Next, I fire up Wireshark and, interestingly, the replies from the intermediate hops show up in the capture on my host OS, but the capture on the VM shows nothing except the reply from the actual destination!

So I start thinking, what could stop this traffic?

I don’t have any firewall rules set up anywhere, and my host OS is receiving the packets.

I start thinking maybe there’s some stupid logic that says if we have an errored packet or a bad-state packet, just drop it and don’t bother forwarding it… But that makes no sense. Why would that be implemented?

And then I realized… NAT (Network Address Translation) is probably the thing stopping it. It hadn’t even occurred to me to consider it: my local router handles NAT, so why would NAT block anything? But no, there is another NAT layer between the host OS and the VM!

Reviewing the VM network settings, I found the interface had been left as a NAT interface rather than bridged. Once I flipped it to bridged and the VM picked up an address via DHCP, it all worked…

For some of you out there this is probably pretty obvious, but as someone who had never hit that weird “Waiting for reply” response from MTR, it had me stumped, and I couldn’t find anything online that pointed to the cause…

If you’re still struggling to understand why NAT would have caused it, take a look back at NAT fundamentals, but here is my attempt at explaining it in basic terms:

When the initial ICMP messages are sent, the NAT service (running inside VMware, I assume) reviews the IP header. In my case it sees the destination IP as 1.1.1.1 and knows the packet was sent from VM1, so it says to itself: VM1 is initiating a connection to 1.1.1.1, I’ll rewrite the source address in the IP header before sending the message out to the wider network, so that any reply from 1.1.1.1 comes back to me and I know where to send it!

Now, as the Wireshark capture on the Windows host shows, we get many ICMP “Time to live exceeded” messages in response (this is expected, as it is how MTR/traceroute work), for example from 4.69.133.238. The NAT service then reviews its current connections, doesn’t see any existing connection to 4.69.133.238, and drops the packet because it doesn’t know what to do with it!

When we switch the VMware interface to bridged, the VM gets its own IP address on the LAN (rather than sitting behind the host OS’s IP via NAT). My home router then knows to send traffic for that VM address to my computer’s NIC, and it gets forwarded on to the VM directly without passing through the VMware NAT layer, hence we see the replies from every hop!
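To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not VMware's actual implementation: the `NaiveNat` class, the `trace` function, the VM and host addresses, and the first hop are all made up for illustration (only 1.1.1.1 and 4.69.133.238 come from my capture). It assumes the NAT layer tracks flows by destination IP only and does not look at the original packet header embedded in an ICMP error, which a more thorough NAT could use to match the reply.

```python
from dataclasses import dataclass, field

# Hypothetical addresses for illustration only.
VM_IP = "192.168.80.128"
HOST_IP = "192.168.1.50"
# Hops on the way to the destination; the last entry is the destination itself.
PATH = ["192.168.1.1", "4.69.133.238", "1.1.1.1"]


@dataclass
class NaiveNat:
    """Toy NAT that remembers outbound flows by destination IP only."""
    flows: dict = field(default_factory=dict)  # remote IP -> internal VM IP

    def outbound(self, src, dst):
        # Record the flow; the packet leaves with the host's address as source.
        self.flows[dst] = src
        return HOST_IP

    def inbound(self, reply_src):
        # Return the VM to forward to, or None when no flow matches.
        return self.flows.get(reply_src)


def reply_source(ttl):
    """Source IP of the reply triggered by a probe sent with this TTL."""
    # A probe expires at hop number `ttl`, or reaches the destination.
    return PATH[min(ttl, len(PATH)) - 1]


def trace(dst, nat=None):
    """Simulate a traceroute to dst; nat=None models a bridged interface."""
    results = []
    for ttl in range(1, len(PATH) + 1):
        if nat is not None:
            nat.outbound(VM_IP, dst)
        src = reply_source(ttl)
        if nat is not None and nat.inbound(src) is None:
            results.append("no reply")  # dropped: no flow to this hop's IP
        else:
            results.append(src)
    return results


print("NAT:    ", trace("1.1.1.1", NaiveNat()))
print("Bridged:", trace("1.1.1.1"))
```

In NAT mode only the final hop shows up, because 1.1.1.1 is the only address the toy NAT has a flow for; in bridged mode every hop's reply reaches the VM, which matches what I saw in MTR.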

I hope that all makes sense. If not, feel free to leave a comment and I can try to explain further! But yet again, like everything on this site, it’s all for my personal notes, so if/when I hit this issue again in a few years I’ll know straight away: ahhh, it’s because of NAT!
