Depicted sound of the dialup modem
If you ever connected to the Internet before the 2000s, you probably remember that your modem made a peculiar sound. But despite becoming so familiar, it remained a mystery for most of us. What you’re hearing is often called a handshake: the start of a telephone conversation between two modems. They are trying to find a common language and determine the weaknesses of the telephone channel, which was originally meant for human speech.
The first thing we hear is a landline dial tone. The modem now knows it’s connected to a phone line and can dial a number. The number is signaled to the network using DTMF (Dual-Tone Multi-Frequency) signaling.
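To make the "dual-tone" part concrete, here is a minimal Python sketch of how one dialed key can be turned into sound. Each key is the sum of one low-frequency "row" tone and one high-frequency "column" tone from the standard DTMF keypad grid. The function names, the 8 kHz sample rate, and the 100 ms duration are assumptions chosen for illustration, not something a particular modem is known to use.

```python
import math

# Standard DTMF grid: each key combines one row tone and one column tone (Hz).
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair for a single keypad symbol."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_FREQS[r], COL_FREQS[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_tone(key, duration=0.1, sample_rate=8000):
    """Return audio samples in [-1, 1] for one key press (illustrative defaults)."""
    low, high = dtmf_frequencies(key)
    count = round(duration * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate)
               + math.sin(2 * math.pi * high * n / sample_rate))
        for n in range(count)
    ]
```

Dialing a whole number would then amount to concatenating `dtmf_tone(digit)` for each digit, with a short silence in between.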
The remote modem answers with a distinct tone that our calling modem can recognize. They then exchange short bursts of binary data to assess what kind of protocol is appropriate.
Now the modems must address the problem of echo suppression. When people talk, usually only one person speaks while the other listens. The telephone network uses this fact and temporarily silences the return channel to suppress any confusing echoes of the talker’s own voice. Modems don’t like this at all, as they can very well talk at the same time (full-duplex). The answering modem now puts on a special answer tone that will disable any echo suppression circuits on the line. The tone also has periodic “snaps” (180° phase transitions) that aim to disable yet another type of circuit called an echo canceller.
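The "snaps" are easy to reproduce in a few lines. The sketch below generates a steady tone and inverts its sign at regular intervals, which is the same thing as a 180° phase jump. The 2100 Hz frequency and 450 ms reversal interval follow the answer tone as commonly described for ITU-T V.25; treat those values, the function name, and the sample rate as assumptions of this example.

```python
import math


def answer_tone(duration=1.0, sample_rate=8000, freq=2100.0,
                reversal_interval=0.45):
    """Sine tone whose phase flips by 180 degrees every `reversal_interval` s."""
    samples_per_reversal = round(reversal_interval * sample_rate)
    samples = []
    for n in range(round(duration * sample_rate)):
        # Count how many reversals have happened so far; an odd count means
        # the tone is currently inverted (multiplying by -1 is a 180° shift).
        flips = n // samples_per_reversal
        sign = -1.0 if flips % 2 else 1.0
        samples.append(sign * math.sin(2 * math.pi * freq * n / sample_rate))
    return samples
```

Played back, each sign flip produces a brief click on top of the otherwise steady tone.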
Now the modems will list their supported modulation modes and try to find one that both know. They also probe the line with test tones to see how it responds to tones of different frequencies, and how much it attenuates the signal. They exchange their test results and decide on a speed that is suitable for the line.
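The "find one that both know" step can be pictured as intersecting two lists and taking the fastest entry. The sketch below is a deliberate simplification: real modems signal their capabilities as bit fields inside the handshake and then refine the speed using the line probing results, whereas this example only compares names. The table of peak rates and the function name are illustrative assumptions.

```python
# Illustrative peak data rates (bit/s) for a few ITU-T modulation standards.
MODE_SPEEDS = {
    "V.34": 33600,
    "V.32bis": 14400,
    "V.32": 9600,
    "V.22bis": 2400,
}


def pick_common_mode(ours, theirs):
    """Return the fastest mode both sides support, or None if there is none."""
    common = set(ours) & set(theirs)
    if not common:
        return None
    # Every mode name is expected to be a key of MODE_SPEEDS.
    return max(common, key=lambda mode: MODE_SPEEDS[mode])
```

For example, a caller offering V.34, V.32bis and V.22bis that reaches an older modem supporting only V.32bis, V.32 and V.22bis would settle on V.32bis.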
After this, the modems will go to scrambled data. They put their data through a special scrambling formula before transmission to make its power distribution more even and to make sure there are no patterns that are suboptimal for transfer. They listen to each other sending a series of binary 1’s and adjust their equalizers to optimally shape the incoming signal.
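The scrambling idea can be sketched as a self-synchronizing scrambler: each transmitted bit is the data bit XORed with two earlier transmitted bits, and the receiver undoes this using the bits it has already received. The tap positions 18 and 23 used here are an assumption of this example (they are, as far as I recall, the ones used by the calling modem in V.32 and V.34); the function names are hypothetical.

```python
def scramble(bits, taps=(18, 23)):
    """Self-synchronizing scrambler: out[n] = in[n] ^ out[n-18] ^ out[n-23]."""
    history = [0] * max(taps)  # history[i] is the output bit from i+1 steps ago
    out = []
    for bit in bits:
        for tap in taps:
            bit ^= history[tap - 1]
        out.append(bit)
        history = [bit] + history[:-1]
    return out


def descramble(bits, taps=(18, 23)):
    """Inverse operation: out[n] = in[n] ^ in[n-18] ^ in[n-23]."""
    history = [0] * max(taps)  # history of *received* (scrambled) bits
    out = []
    for received in bits:
        bit = received
        for tap in taps:
            bit ^= history[tap - 1]
        out.append(bit)
        history = [received] + history[:-1]
    return out
```

Feeding a constant stream such as `[1] * 64` into `scramble` yields a mix of ones and zeros, which is the point: long runs of identical bits no longer reach the line unchanged, and `descramble` recovers the original stream.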
Soon after this, the modem speaker will go silent and data can be put through the connection.
Below is a spectrogram of the handshake audio (by Oona Räisänen). Some signals are labeled according to which party transmitted them, and also have an explanation below.
Why can we hear this? Back in the day, telephone lines were used for audio. The first modems even used the telephone receiver like people do, by talking into the mouthpiece, until newer modems were developed that could connect directly to the phone line. Even then, the idea of not hearing what’s happening on a phone line you’re calling on was quite new, and modems would default to exposing the user to the handshake audio. In order to silence the handshake, you could send the ATM0 command to the modem before dialing.