The Internet: Burning Questions
(cross-posted from my personal blog)
I’m about to start a learning project and I’m paying extra close attention to “What feels most interesting?” rather than merely “What am I supposed to know?” When I stopped to think about it, there are plenty of very specific (and vague) things that I’m curious about regarding how the Internet and the network stack operate. Below is my lightly edited brainstorm of “What’s confusing and interesting about computer networking?”
Another way of framing this is that I find myself more easily bored when being told “Here’s how a system works” compared to when I go, “How the hell would I build this? None of my pieces seem to fit.… maybe if I....”
If you know anything about networking and would enjoy giving me answers/hints/nudges on any of these questions, go ahead!
How does anything find where it’s going?
What local knowledge does any given router have, and what search algorithm does it use to eventually end up in the right place?
What (if any) are the guarantees of this (or other) algos?
How often do you not find the node you’re looking for?
Is it deterministic and/or predictable? I.e., will a request always follow the same path?
Is there an average number of “hops” till success?
After an initial request has found a path to the end node, is that path then cached for use in ongoing exchanges? Or is the original search process efficient enough that you don’t really get any gains from that? Does IP even allow for specifying a specific path?
How the hell are IP addresses and DNS records regulated?
My current understanding is there’s some central committee that doles out IP addresses based on geographic considerations.
So an IP address only needs to be unique. It doesn’t matter what IP you have, so it seems like the main job here is just to avoid collisions.
I’m guessing the answer to “where does a given IP address live?” is stored in a distributed way across many routers across the whole Internet.
What do you have to do to be a DNS service provider? It seems like they all have to coordinate on not letting multiple people get the same domain name, and they are also in charge of maintaining the mapping between URLs and IPs.
How does my computer access/interact with the DNS records?
Is it a bootstrap thing, where a few DNS servers have a “forever” unchanging IP address, and I just ask them for a translation?
I think I’ve read that my computer also caches DNS records, but how does it know when those records have been updated? If the old IP address is still available it seems like I could go to the wrong site without noticing, unless my computer is constantly checking DNS servers.
That “sounds” expensive, but is it?
How do people who sell domain names handle collisions? What happens if two people try to buy the same domain name at once from different domain name vendors? (If it were the same vendor, I assume it would be a straightforward “we queue all requests and process them one at a time” thing. That doesn’t seem doable with multiple companies, unless they all forwarded to a third party, but wow that seems like a lot.)
What does the topology of the internet look like?
Is there a N-degrees of separation thing?
Are there any bottlenecks?
Like, what’s the min amount of routers I’d have to turn off to disconnect a city/state/country from the rest of the Internet?
What governs Internet speeds?
Is speed different from bandwidth?
K, so one can pay for faster or slower Internet. What are the resources that are shuffled around by ISPs that make that speed difference?
What metrics should I use to think about Internet usage? (total GB/time?)
What metrics should I use to think about Internet… provision? (number of parallel requests that can be handled? Total amount of GB? GB/time served?)
^ The above two questions are in the context of doing Fermi estimates on what sort of network infrastructure is needed to support X amount of consumption.
General resistance to attacks/natural-disasters/acts of god?
Those fiber optic cables that line the oceans, should I be worried about anything happening to them?
What’s the largest Internet outage in history?
How resilient is Internet infrastructure compared to, say, the power grid?
Wooooooooooooaaaah how the hell does satellite Internet work?
I can imagine “big space computer” being strong enough to send signals to earth, but can a phone send back?
Oh wait, I remember satellite Internet coming with a big box.
Is the main cost of a satellite Internet receiver mostly “big transmitter and power system to operate it”?
What are an Internet satellite’s service capabilities? What’s stopping them from taking over from cable-driven ISP setups?
What are big player application layer protocols besides HTTP and SMTP?
What’s really the difference between them? How weird would it be to send email using HTTP? How weird would it be to send webpages using SMTP?
What (if any exist) does a suuuuper specific application protocol look like?
Where on your computer does networking stuff live?
Is the TCP/HTTP code in the kernel? .txt file on your desktop?
What does the networking card do?
What does network monitoring look like?
Related to topology, does the NSA have like 6 hubs they can look at and see most traffic?
How much “extra” does a router need to also store/monitor the traffic going through it?
The Great Firewall of China?
I heard something about DNS poisoning, what’s that?
Expect an update in the future with what I learn!
Fun! And wide-ranging to the point that this is perhaps weeks of reading to get a rudimentary understanding and months to really get it by setting up a lab network and trying stuff out. I do recommend putting in the time, and depending on your career goals, an internship or job at an ISP or cloud provider could be a great choice. If you can answer most of these questions in a little bit of detail, you’re a shoo-in to get such a job. A few pointers (I can give more detail, but it might be best to separate out topics into multiple shortform topics that get replies, rather than a 4-deep outline of topics in one post).
> How does anything find where it’s going?
https://en.wikipedia.org/wiki/Router_(computing)
https://en.wikipedia.org/wiki/Border_Gateway_Protocol
For local packets from your computer to the first IP router, see https://en.wikipedia.org/wiki/Address_Resolution_Protocol and https://en.wikipedia.org/wiki/Bridging_(networking) .
There is no end-to-end route specification; each hop decides what the next hop should be. https://en.wikipedia.org/wiki/Source_routing used to be a thing, but it’s pretty much never seen in the wild.
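To make "each hop decides the next hop" a little more concrete, here is a minimal sketch of the kind of lookup a single router might do, assuming a longest-prefix-match rule over a small forwarding table. The table contents, hop names, and the `next_hop` function are all made up for illustration; real routers build their tables from protocols like BGP and use far faster data structures.

```python
import ipaddress

# Hypothetical forwarding table for one router: prefix -> next hop name.
# Every entry here is invented for illustration.
FORWARDING_TABLE = {
    "0.0.0.0/0": "isp-upstream",      # default route: matches everything
    "10.0.0.0/8": "internal-core",
    "10.1.0.0/16": "branch-office",
    "10.1.2.0/24": "lab-switch",
}


def next_hop(destination, table=FORWARDING_TABLE):
    """Return the next hop for the most specific prefix containing destination.

    Returns None if no prefix in the table matches.
    """
    addr = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr not in network:
            continue
        # Longest prefix wins: a /24 is more specific than a /16 or /8.
        if best_network is None or network.prefixlen > best_network.prefixlen:
            best_network, best_hop = network, hop
    return best_hop


if __name__ == "__main__":
    for dest in ["10.1.2.7", "10.1.9.9", "10.200.0.1", "8.8.8.8"]:
        print(dest, "->", next_hop(dest))
```

The router only answers "where next?", never "what is the whole path?", which is why the same destination can take different paths as tables change.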
> How the hell are IP addresses and DNS records regulated?
https://www.iana.org/numbers
https://www.iana.org/domains (but this goes MUCH deeper with the number of TLDs and delegation nowadays).
> What do you have to do to be a DNS service provider?
https://en.wikipedia.org/wiki/Domain_name_registrar
(not needed for local use—anyone can run a local https://en.wikipedia.org/wiki/Name_server that is authoritative for any domain they like, including masking or overriding public domain names).
> How does my computer access/interact with the DNS records?
Varies widely by OS and configuration, but your computer typically has a local resolver which delegates to an upstream resolver (which delegates further and/or resolves starting from the root servers). It generally gets the correct local upstream resolver via https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol at the same time as it gets its IP address and local router address.
> I think I’ve read stuff that my computer also caches dns records, but how does it know when those records have been updated?
https://en.wikipedia.org/wiki/Domain_Name_System#DNS_resource_records contain a TTL (time to live) which is simply the time that it’s reasonable to cache for. Stupid simple, so very reliable. Yes, it does cause problems when things change more rapidly than expected.
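As a rough illustration of how TTL-based caching behaves, here is a minimal sketch of a cache that forgets a record once its TTL has elapsed. The `TtlCache` class and its method names are hypothetical, not any real resolver's API, and the address is from a range reserved for documentation.

```python
import time


class TtlCache:
    """Toy name -> address cache where each entry expires after its TTL."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the behavior can be checked without waiting.
        self._clock = clock
        self._entries = {}  # name -> (address, expires_at)

    def put(self, name, address, ttl_seconds):
        """Store a record along with the time at which it stops being valid."""
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller has to ask upstream again.
            del self._entries[name]
            return None
        return address


if __name__ == "__main__":
    cache = TtlCache()
    cache.put("example.com", "192.0.2.10", ttl_seconds=300)
    print(cache.get("example.com"))
```

Note that the cache never checks whether the record changed upstream; it just trusts the TTL, which is exactly why a change made before the TTL runs out can leave clients with a stale answer.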
> What does the topology of the internet look like?
https://internet-map.net/
> Are there any bottlenecks?
ROFL. This is many (many) people’s life’s work.
> Like, what’s the min amount of routers I’d have to turn off to disconnect a city/state/country from the rest of the Internet?
To disconnect my house, just one (indeed, any one of three—my ISP’s immediate upstream router, my home cable modem, or my home router). To fully disconnect a distributed organization (city/state/country), probably thousands or tens of thousands, and not all in one place nor owned by one company.
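The "any one of three" point can be sketched on a toy graph: remove one node at a time and check whether the house can still reach the rest of the network. The topology below is entirely invented (including the two redundant backbone routers), and the function names are hypothetical; it only illustrates the idea of a single point of failure.

```python
from collections import deque

# Hypothetical toy topology as adjacency lists; all names are made up.
TOPOLOGY = {
    "home-pc": ["home-router"],
    "home-router": ["home-pc", "cable-modem"],
    "cable-modem": ["home-router", "isp-edge"],
    "isp-edge": ["cable-modem", "backbone-a", "backbone-b"],
    "backbone-a": ["isp-edge", "internet"],
    "backbone-b": ["isp-edge", "internet"],
    "internet": ["backbone-a", "backbone-b"],
}


def reachable(graph, start, goal, removed=frozenset()):
    """Breadth-first search from start to goal, skipping removed nodes."""
    if start in removed or goal in removed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in graph[node]:
            if neighbor not in seen and neighbor not in removed:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def single_points_of_failure(graph, start, goal):
    """Nodes (other than the endpoints) whose removal alone cuts start from goal."""
    return sorted(
        node
        for node in graph
        if node not in (start, goal)
        and not reachable(graph, start, goal, removed={node})
    )


if __name__ == "__main__":
    print(single_points_of_failure(TOPOLOGY, "home-pc", "internet"))
```

In this toy graph the three devices on the single path out of the house each show up as a single point of failure, while neither backbone router does, because the other one still provides a route.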
> What governs Internet speeds?
Money. Money to pay for more/better wires/fibers and the equipment to get more bandwidth on those things. And money to convince your ISP to allocate more of its bandwidth to you.
> Is speed different from bandwidth?
Search on latency vs bandwidth vs throughput. Also consider burst vs average measurements. Need more specific questions for further discussion.
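As a starting point for that search, here is a back-of-the-envelope sketch, assuming a deliberately simplified model where a transfer costs one fixed latency delay plus the time to push the bits through the link. It ignores protocol overhead, TCP ramp-up, congestion, and packet loss, and both function names are made up for illustration.

```python
def transfer_time_seconds(size_bytes, bandwidth_bits_per_second, latency_seconds):
    """Simplified model: one latency delay plus serialization time on the link."""
    return latency_seconds + (size_bytes * 8) / bandwidth_bits_per_second


def effective_throughput_bps(size_bytes, bandwidth_bits_per_second, latency_seconds):
    """Bits actually delivered per second over the whole transfer."""
    total = transfer_time_seconds(size_bytes, bandwidth_bits_per_second, latency_seconds)
    return (size_bytes * 8) / total


if __name__ == "__main__":
    bandwidth = 100e6  # assumed 100 Mbit/s link
    latency = 0.05     # assumed 50 ms delay
    for label, size in [("1 kB", 1_000), ("1 GB", 1_000_000_000)]:
        seconds = transfer_time_seconds(size, bandwidth, latency)
        mbps = effective_throughput_bps(size, bandwidth, latency) / 1e6
        print(f"{label}: {seconds:.5f} s, about {mbps:.2f} Mbit/s effective")
```

Under these assumptions the small transfer is dominated by latency and sees only a sliver of the link's bandwidth, while the large one gets close to the full rate. That is one reason "speed" depends on what you are doing, not just on the number your ISP advertises.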
> General resistance to attacks/natural-disasters/acts of god?
There is a _LOT_ of redundancy in the systems, so it’s pretty well-protected against random outages. It’s still pretty common to have a widespread impact when a backhoe hits a fiber line, but it only affects a small portion of traffic (it sucks for those who don’t have redundant routes, though). There is more encryption and cryptographic authentication than there used to be, but it’s pure speculation about what directed large-scale logical (not physical) attacks are still possible.
> Those fiber optic cables that line the oceans, should I be worried about anything happening to them?
I don’t know, do you like worrying? Do you think you’d change any behaviors for the worry?
> What are big player application layer protocols besides HTTP and SMTP
https://en.wikipedia.org/wiki/Category:Application_layer_protocols and it’s worth paying special attention to https://en.wikipedia.org/wiki/Category:Transport_Layer_Security as a common security mechanism underneath a lot of protocols.
> Where on your computer does networking stuff live?
Everywhere! Change my estimate to “years” if you intend to know how to design a chipset that accelerates wifi and TLS session establishment. The https://en.wikipedia.org/wiki/OSI_model is a complete lie for any actual implementation, but it’s a good way to think about the layered abstractions required to make networking tractable.
> I heard something about DNS poisoning, what’s that?
This one you can google for yourself, right?
Also, you didn’t ask, but a related rabbit-hole to fall into is how TCP/IP actually works, including https://en.wikipedia.org/wiki/Transmission_Control_Protocol to turn a bunch of independent unreliable packets into a stream, and https://en.wikipedia.org/wiki/Network_address_translation for how the border between private and public networks (including in your home) typically operates.
Thanks for the detailed reply! I like your idea of splitting this up as multiple shortform bits. I mostly went with one massive list because that’s how I brainstormed it before I thought of posting it. I plan to make some more of these in the future any time I’m about to investigate a new topic.
Is your day job doing network stuff?
Not really—I’ve been closer and further from the networking details at various points in my life, currently more about data and modeling than about actually moving the bits. Knowing this stuff to some degree is necessary to be able to work with the real networking gurus when something goes wrong, though.