Cryptography 5: Stream Cipher Vulnerabilities and the Hybrid Argument

Posted on: August 14, 2025 | Reading time: 12 minutes | Word count: 2507 words | Author: Dang Duong Minh Nhat

Hello everyone, today we’ll continue with episode 5 in this series. The topic will be exploring the limitations of stream ciphers, and at the same time, introducing the hybrid argument. I recommend reading the previous posts in this series first. Alright, let’s get started.

Limitations of Stream Ciphers: Attacks on the One-Time Pad

Although stream ciphers achieve semantic security, they are quite fragile and become insecure if misused.

The Two-Time Pad is Insecure

A well-designed stream cipher can securely encrypt a single message from Alice to Bob. However, Alice may want to send multiple messages to Bob. Suppose, for simplicity, Alice wants to encrypt two messages, $m_1$ and $m_2$.

The naive approach is to encrypt both messages with the same stream cipher key $s$:

$$ c_1 \leftarrow m_1 \oplus G(s) \quad \text{and} \quad c_2 \leftarrow m_2 \oplus G(s) \tag{1} $$

A moment’s thought reveals that this construction is very insecure. An attacker who intercepts $c_1$ and $c_2$ can compute:

$$ \Delta := c_1 \oplus c_2 = (m_1 \oplus G(s)) \oplus (m_2 \oplus G(s)) = m_1 \oplus m_2 $$

From this, they obtain the XOR of $m_1$ and $m_2$. Unsurprisingly, English text contains enough redundancy that, given $\Delta = m_1 \oplus m_2$, an attacker can recover both $m_1$ and $m_2$ in the clear (plaintext). Therefore, construction $(1)$ leaks the plaintext after observing just two sufficiently long ciphertexts.

Construction $(1)$ is often jokingly called the two-time pad. As we have just analyzed, the two-time pad is completely insecure. In particular, a stream cipher key must never be used to encrypt more than one message. In situations where a key must be used multiple times, stream ciphers should never be used directly.

The incorrect reuse of a stream cipher key is a common mistake in real-world systems. For example, a protocol called PPTP allows two parties, $A$ and $B$, to send encrypted messages to each other. In Windows NT, Microsoft implemented PPTP using a stream cipher called RC4. The initial version encrypted messages from $A$ to $B$ with the same RC4 key as messages from $B$ to $A$. Consequently, by simply eavesdropping on two encrypted messages traveling in opposite directions, an attacker could recover the plaintext of both messages.

Another interesting story about the two-time pad is told by Klehr, who details how Russian spies in the US during World War II sent messages to Moscow using one-time pads. The system had a critical flaw, as Klehr explains:

During World War II, the Soviet Union could not produce a sufficient quantity of one-time pads to meet the huge demand. So, they used some one-time pads twice, thinking this would not endanger the system. US counter-intelligence during WWII collected all international telegrams sent and received. Starting in 1946, in cooperation with the British, they launched an effort to decrypt Soviet messages and, thanks to the error of reusing one-time pads as two-time pads, they were able, over the next 25 years, to decrypt about 2,900 messages (equivalent to 5,000 pages) out of the hundreds of thousands of messages sent between 1941 and 1946 (when the Soviets switched to a different system).

This decryption effort was codenamed Project Venona. The Venona documents are most famous for exposing Julius and Ethel Rosenberg, providing evidence of their involvement with the Soviet spy network. Starting in 1995, all 3,000 decrypted Venona messages were declassified.

The One-Time Pad is Malleable

Although semantic security ensures that an attacker cannot read the plaintext, it provides no guarantee of integrity. When using a stream cipher, an attacker can modify a ciphertext, and this modification will never be detected by the decryption algorithm. Even worse, we will see that by modifying the ciphertext, an attacker can control how the decrypted plaintext is changed.

Suppose an attacker intercepts the ciphertext:

$$ c := E(s, m) = m \oplus G(s) $$

The attacker modifies $c$ to:

$$ c’ := c \oplus \Delta $$

where $\Delta$ is chosen arbitrarily by the attacker. When the decryption algorithm receives the modified message, it will output:

$$ D(s, c’) = c’ \oplus G(s) = (c \oplus \Delta) \oplus G(s) = m \oplus \Delta $$

Thus, even without knowing $m$ or $s$, the attacker can cause the decrypted plaintext to become $m \oplus \Delta$ for a $\Delta$ of their choosing. We say that stream ciphers are malleable, because an attacker can create predictable changes to the plaintext.

Illustrative Example

A simple example where malleability can help an attacker is in an encrypted file system.

Suppose specifically that Bob is a professor and Alice and Molly are students. Bob’s students submit their homework assignments via email, and Bob saves these emails to a drive encrypted with a stream cipher. An email always starts with a standard header. To simplify, let’s assume an email from Alice always starts with the string "From:Alice".

Now, suppose Molly can access Bob’s drive and find the ciphertext of Alice’s homework email. Molly can claim credit for Alice’s work as follows:

She simply XORs the appropriate 5-character string into the ciphertext at positions 6 through 10 to change the header "From:Alice" to "From:Molly".
Molly performs this change only on the ciphertext and does not need to know Bob’s secret key.
Bob will never know that the header has been changed. When grading, Bob will think the assignment is from Molly, and Molly will get the grade instead of Alice.

Of course, for this attack to be effective, Molly must have a way to find Alice’s email on Bob’s encrypted drive. However, in some encrypted file systems, the file’s metadata—such as filename, modification time, etc.—is not encrypted. With this metadata, Molly could easily find the ciphertext of Alice’s email and carry out the attack.

Combining PRGs

Next, we will discuss two constructions that allow creating new PRGs from old ones. These methods allow for increasing the output space size of the original PRG while preserving its security. Perhaps more important than the constructions themselves is the proof technique, known as the hybrid argument. This technique is widely used in modern cryptography.

Parallel Construction

Suppose $G$ is a PRG defined on $(\mathcal{S}, \mathcal{R})$. Suppose in some application, we want to use $G$ multiple times. We want all outputs of $G$ to be computationally indistinguishable from random elements of $\mathcal{R}$. If $G$ is a secure PRG and the seeds are generated independently, this will hold.

We can model the multiple uses of $G$ as a new PRG $G’$. That is, we define a new PRG $G’$ that applies $G$ to $n$ different seeds and concatenates the outputs. Then $G’$ is defined on $(\mathcal{S}^n, \mathcal{R}^n)$, and for $s_1, \dots, s_n \in \mathcal{S}$:

$$ G’(s_1, \dots, s_n) := (G(s_1), \dots, G(s_n)). $$

We call $G’$ the $n$-wise parallel composition of $G$. The value $n$ is called the repetition parameter and is required to be poly-bounded.

Theorem 1. If $G$ is a secure PRG, then the $n$-wise parallel composition $G’$ of $G$ is also a secure PRG.

That is, for any PRG adversary $\mathcal{A}$ attacking $G’$ in the PRG attack game, there exists a PRG adversary $\mathcal{B}$ attacking $G$ in the PRG attack game, where $\mathcal{B}$ is a simple wrapper around $A$, such that:

$$ \mathtt{PRG}\texttt{adv}[\mathcal{A}, G’] = n \cdot \mathtt{PRG}\texttt{adv}[\mathcal{B}, G]. $$

Proof for the case n = 2

Suppose $\mathcal{A}$ is an efficient PRG adversary with advantage $\epsilon$ in attacking $G’$ in the PRG attack game. We want to show that $\epsilon$ is negligible if $G$ is a secure PRG.

Game 0 (experiment 0 of the PRG attack game with $A$ and $G’$):

$s_1 \xleftarrow{R} \mathcal{S}$, $r_1 \leftarrow G(s_1)$
$s_2 \xleftarrow{R} \mathcal{S}$, $r_2 \leftarrow G(s_2)$
Send $(r_1, r_2)$ to $\mathcal{A}$.

Let $p_0$ be the probability that $\mathcal{A}$ outputs $1$ in this game.

Game 1 (hybrid experiment):

$r_1 \xleftarrow{R} \mathcal{R}$
$s_2 \xleftarrow{R} \mathcal{S}$, $r_2 \leftarrow G(s_2)$
Send $(r_1, r_2)$ to $\mathcal{A}$.

Game 1 does not correspond fully to either Experiment 0 or 1, but is a hybrid experiment: $r_1$ is changed from a pseudorandom value to a truly random one.

Let $p_1$ be the probability that $\mathcal{A}$ outputs $1$ in Game 1 and set $\delta_1 := |p_1 - p_0|$. If $G$ is secure, $\delta_1$ is negligible.

Since $\mathcal{A}$ is just a distinguisher for “pseudorandom” vs. “random,” it will output 1 if, by its criteria, the pair $(r_1, r_2)$ looks like a pair where both are from the PRG (like Game 0).
If $\mathcal{A}$ cannot distinguish, its rate of outputting 1 will be close to that in Game 0.
If $\mathcal{A}$ recognizes that $r_1$ is truly random or that the pair $(r_1, r_2)$ was not generated from a pair of PRGs (like Game 0), it will tend to output 1 less often.

We define an adversary $\mathcal{B}_1$ against $G$:

Receive $r \in \mathcal{R}$ from its challenger.
Set $r_1 \leftarrow r$.
Generate $s_2 \xleftarrow{R} \mathcal{S}$, $r_2 \leftarrow G(s_2)$.
Send $(r_1, r_2)$ to $\mathcal{A}$.
Output the result of $\mathcal{A}$.

When $\mathcal{B}_1$ is in Experiment 0, it simulates Game 0; when in Experiment 1, it simulates Game 1. Therefore, the advantage of $\mathcal{B}_1$ is equal to $\delta_1$.

Game 2:

$r_1 \xleftarrow{R} \mathcal{R}$
$r_2 \xleftarrow{R} \mathcal{R}$
Send $(r_1, r_2)$ to $\mathcal{A}$.

Let $p_2$ be the probability that $\mathcal{A}$ outputs $1$ in Game 2. Game 2 is exactly Experiment 1 of the PRG attack game against $G’$.

Set $\delta_2 := |p_2 - p_1|$. Similarly, we can define a $\mathcal{B}_2$ such that its advantage is $\delta_2$.

Since $\epsilon = |p_2 - p_0| = |p_2 - p_1 + p_1 - p_0| \le |p_2 - p_1| + |p_1 - p_0| = \delta_1 + \delta_2$ and both are negligible, $\epsilon$ is also negligible.

A Second Proof for n = 2

Combine $\mathcal{B}_1$ and $\mathcal{B}_2$ into a single adversary $\mathcal{B}$:

Receive $r \in \mathcal{R}$ from the challenger.
Choose $\omega \in {1, 2}$ uniformly at random.
Run $\mathcal{B}_{\omega}$ with input $r$ (pass $r$ to $\mathcal{B}_{\omega}$).
Output the result of $\mathcal{B}_\omega$.

Letting $W_0$ and $W_1$ be the events that $\mathcal{B}$ outputs $1$ in Experiment 0 and 1, respectively, we have:

$$ \begin{aligned} \Pr[W_0] &= \Pr[W_0 \mid \omega = 1] \Pr[\omega = 1] + \Pr[W_0 \mid \omega = 2] \Pr[\omega = 2] \\ &= \frac{1}{2} \left( \Pr[W_0 \mid \omega = 1] + \Pr[W_0 \mid \omega = 2] \right) \\ &= \frac{1}{2} (p_0 + p_1) \end{aligned} $$

and

$$ \begin{aligned} \Pr[W_1] &= \Pr[W_1 \mid \omega = 1] \Pr[\omega = 1] + \Pr[W_1 \mid \omega = 2] \Pr[\omega = 2] \\ &= \frac{1}{2} \left( \Pr[W_1 \mid \omega = 1] + \Pr[W_1 \mid \omega = 2] \right) \\ &= \frac{1}{2} (p_1 + p_2). \end{aligned} $$

Thus, the advantage of $\mathcal{B}$ is:

$$ \delta = \left|\Pr[W_1] - \Pr[W_0]\right| = \frac{|p_2 - p_0|}{2} = \frac{\epsilon}{2}. $$

Therefore, $\epsilon = 2\delta$, and since $\delta$ is negligible, $\epsilon$ must also be negligible.

Now, we finally present the proof of Theorem 1 for the general case where $n$ is bounded by a polynomial.

Proof Idea. We can try to extend the first strategy mentioned above from $n = 2$ to an arbitrary $n$. In other words, we can construct a sequence of $n + 1$ games, starting with a challenger generating a sequence $(G(s_1), \dots, G(s_n))$ of pseudorandom elements, then replacing each element one by one with a truly random element from $\mathcal{R}$, and ending with the sequence $(r_1, \dots, r_n)$ of entirely truly random elements from $\mathcal{R}$.

The hope is that the adversary will not detect any single replacement, since $G$ is a secure PRG; however, to prove this formally, we would need to construct $n$ different adversaries, each attacking a slightly different variant of $G$. This is challenging when $n$ is not an absolute constant but only poly-bounded.

Proof. Let $\mathcal{A}$ be an efficient PRG adversary playing the PRG attack game with $G’$.

First, we introduce a sequence of $n + 1$ hybrid games: $\text{Hybrid } 0, \text{Hybrid } 1, \dots, \text{Hybrid } n.$

For $j = 0, 1, \dots, n$, Hybrid $j$ is a game between $\mathcal{A}$ and a challenger, who prepares an $n$-tuple of values where the first $j$ values are truly random, and the remaining $n - j$ values are pseudorandom outputs of $G$. Specifically:

$r_1 \xleftarrow{R} \mathcal{R}$ $\vdots$
$r_j \xleftarrow{R} \mathcal{R}$
$s_{j+1} \xleftarrow{R} \mathcal{S}$, $r_{j+1} \leftarrow G(s_{j+1})$ $\vdots$
$s_n \xleftarrow{R} \mathcal{S}$, $r_n \leftarrow G(s_n)$
Send $(r_1, \dots, r_n)$ to $\mathcal{A}$.

As usual, $\mathcal{A}$ outputs $0$ or $1$ at the end of the game.

Let $p_j$ be the probability that $\mathcal{A}$ outputs $1$ in Hybrid $j$. Observe that $p_0$ is precisely the probability that $\mathcal{A}$ outputs $1$ in Experiment 0 of the PRG attack game, while $p_n$ is the probability that $\mathcal{A}$ outputs $1$ in Experiment 1.

Therefore:

$$ \mathtt{PRG}\text{adv}[\mathcal{A}, G’] = |p_n - p_0|. \tag{2} $$

Now, we define a PRG adversary $\mathcal{B}$ that plays the PRG attack game against $G$, operating as follows:

Receive $r \in \mathcal{R}$ from its challenger.
$\mathcal{B}$ acts as a challenger to $\mathcal{A}$:
- Choose $\omega \xleftarrow{R} {1, \dots, n}$.
- Generate $r_1, \dots, r_{\omega-1} \xleftarrow{R} \mathcal{R}$.
- Set $r_\omega \leftarrow r$.
- Generate $s_{\omega+1} \xleftarrow{R} \mathcal{S}$, $r_{\omega+1} \leftarrow G(s_{\omega+1})$, and so on up to $r_n$.
- Send $(r_1, \dots, r_n)$ to $\mathcal{A}$.
Finally, $\mathcal{B}$ outputs whatever $\mathcal{A}$ outputs.

Let $W_0$ be the event that $\mathcal{B}$ outputs $1$ in Experiment 0 and $W_1$ be the event that $\mathcal{B}$ outputs $1$ in Experiment 1.

The key observation is:

When fixing $\omega = j$ for some $j \in {1, \dots, n}$, Experiment 0 of the attack game for $\mathcal{B}$ is equivalent to Hybrid $j-1$.
And Experiment 1 is equivalent to Hybrid $j$.

Therefore:

$$ \Pr[W_0 \mid \omega = j] = p_{j-1}, \quad \Pr[W_1 \mid \omega = j] = p_j. $$

It follows that:

$$ \Pr[W_0] = \sum_{j=1}^n \Pr[W_0 \mid \omega = j] \Pr[\omega = j] = \frac{1}{n} \sum_{j=1}^n p_{j-1}, $$

and similarly:

$$ \Pr[W_1] = \sum_{j=1}^n \Pr[W_1 \mid \omega = j] \Pr[\omega = j] = \frac{1}{n} \sum_{j=1}^n p_j. $$

Finally:

$$ \mathtt{PRG}\text{adv}[\mathcal{B}, G] = \left| \Pr[W_1] - \Pr[W_0] \right| = \frac{1}{n} \left| \sum_{j=1}^n p_j - \sum_{j=1}^n p_{j-1} \right| = \frac{1}{n} \left| p_n - p_0 \right|. $$

Combining this with $(2)$ gives:

$$ \mathtt{PRG}\text{adv}[\mathcal{A}, G’] = n \cdot \mathtt{PRG}\text{adv}[\mathcal{B}, G]. $$

Since we assume $G$ is a secure PRG, $\mathtt{PRG}\text{adv}[\mathcal{B}, G]$ is negligible. As $n$ is poly-bounded, it follows that $\mathtt{PRG}\text{adv}[\mathcal{A}, G’]$ is also negligible. This proves the theorem. $\square$

Throughout this post, we have journeyed from the practical pitfalls of stream ciphers to the solid theoretical foundations for proving security. The core lesson is that cryptographic security depends not only on the algorithm itself but also on its correct implementation and usage. The “two-time pad” serves as a stark warning about the importance of key management, while malleability underscores that integrity is just as crucial as confidentiality. Finally, the “hybrid argument” technique is not merely an academic exercise but an essential tool demonstrating how cryptographers build confidence in complex systems through rigorous proof. Understanding both sides of this coin—attacks and proofs—is a critical step toward mastering the art of cryptography.

References

Dan Boneh and Victor Shoup. A Graduate Course in Applied Cryptography. Retrieved from https://toc.cryptobook.us/

Connect with Me

Connect with me on Facebook, LinkedIn, via email at dangduongminhnhat2003@gmail.com, GitHub, or by phone at +84 829 258 815.