Cryptography 1: Perfect Security and the Limits of Perfect Security

Posted on: March 28, 2025 | Reading time: 21 minutes | Word count: 4337 words | Author: Dang Duong Minh Nhat

Welcome to the Cryptography Knowledge Sharing Series!

Hello everyone! I’m excited to share with you the knowledge I’ve gained about Cryptography. This is the first article in a series on cryptography, where I will introduce fundamental and core concepts of this fascinating field. Before we begin, I recommend having some foundational mathematical knowledge, particularly in discrete probability, which you can explore at Discrete Probability - High School Mathematics Extensions, and basic number theory available in A Computational Introduction to Number Theory and Algebra – Victor Shoup.

In this series, I primarily reference A Graduate Course in Applied Cryptography by Dan Boneh and Victor Shoup, an excellent and accessible introductory book. You can check out the table of contents here.

I hope you follow and support this series! If you have any questions or suggestions, feel free to connect with me. Now, let’s dive into the first chapter: Shannon Cipher and Perfect Security! 🔐

Definition of Shannon Cipher

We begin with the concept of encryption, assuming that Alice and Bob share a secret key $k$ . Alice wants to send a message $m$ to Bob over a network while keeping $m$ confidential from any eavesdropper. We will develop fundamental techniques to address this problem.

Beyond transmitting messages over a network, these techniques also allow Alice to store a file securely on a disk, ensuring that no one else can access it while she retains the ability to retrieve it later.

A fundamental mechanism for encrypting a message using a secret key is called a cipher or an encryption scheme. A simpler form of cipher, known as Shannon cipher, consists of a pair of functions, denoted as $\mathcal{E} = (E, D)$ :

Encryption function $E$ takes as input a key $k$ and a message $m$ (also known as plaintext) and produces a ciphertext $c = E(k, m)$ . We say that $c$ is the ciphertext of $m$ under key $k$ .
Decryption function $D$ takes as input a key $k$ and a ciphertext $c$ and recovers the message $m = D(k, c)$ . We say that $m$ is the plaintext decrypted from $c$ using key $k$ .
Correctness: The decryption process must accurately recover the original message. That is, for all keys $k$ and messages $m$ , we must have: $D(k, E(k, m)) = m$

Formally, let:

$\mathcal{K}$ be the set of all possible keys (key space)
$\mathcal{M}$ be the set of all possible messages (message space)
$\mathcal{C}$ be the set of all possible ciphertexts (ciphertext space)

Then, the encryption and decryption functions can be described as $E: \mathcal{K} \times \mathcal{M} \to \mathcal{C}$ and $D: \mathcal{K} \times \mathcal{C} \to \mathcal{M}$ . We define an encryption scheme $\mathcal{E}$ over the tuple $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ .

To better understand this process, consider the following encryption and decryption workflow. Suppose Alice and Bob want to use an encryption scheme for secure communication. They must first agree on a secret key $k \in \mathcal{K}$ . Once they have the key:

When Alice wants to send a message $m \in \mathcal{M}$ to Bob, she encrypts it using key $k$ to obtain the ciphertext $c = E(k, m) \in \mathcal{C}$ . She then sends $c$ to Bob over a communication channel.
When Bob receives the ciphertext $c$ , he decrypts it using key $k$ : $m = D(k, c)$ . Due to the correctness property, Bob is guaranteed to recover the exact original message $m$ .
We assume that the ciphertext $c$ remains unchanged during transmission from Alice to Bob. The goal of encryption is to ensure that even if an eavesdropper intercepts $c$ , they cannot deduce significant information about the message $m$ .

In practice, keys, messages, and ciphertexts are typically represented as sequences of bytes.

Keys usually have a fixed length, e.g., 128-bit (16 bytes).
Messages and ciphertexts can have either fixed or variable lengths, ranging from a single bit to large files like 1GB videos, 10MB songs, or 1KB emails.
Besides bit strings, they can also be mathematical objects such as integers, polynomials, matrices, or elements of algebraic groups.

In theory, we often assume that $\mathcal{K}, \mathcal{M}, \mathcal{C}$ are finite sets. This simplification helps in modeling encryption schemes, though real-world systems may support messages of unbounded length. I will introduce various applications of these concepts in upcoming sections.

One-Time Pad (OTP) Cipher

The OTP cipher is a Shannon cipher $\mathcal{E} = (E, D)$ , where the key, message, and ciphertext are all bit strings of the same length $L$ :

$\mathcal{K} := \mathcal{M} := \mathcal{C} := \left\lbrace 0,1 \right\rbrace^L.$

For a given key $k \in \left\lbrace 0,1 \right\rbrace^L$ and a message $m \in \left\lbrace 0,1 \right\rbrace^L$ , encryption and decryption are defined as follows:

Encryption:
$E(k, m) := k \oplus m$
Decryption:
$D(k, c) := k \oplus c$

where $\oplus$ denotes the bitwise XOR (exclusive-OR) operation. The XOR operation has the following properties:

Commutativity: $x \oplus y = y \oplus x$
Associativity: $x \oplus (y \oplus z) = (x \oplus y) \oplus z$
Identity element: $x \oplus 0^L = x$
Inverse property: $x \oplus x = 0^L$

To verify correctness:

$D(k, E(k, m)) = D(k, k \oplus m) = k \oplus (k \oplus m) = (k \oplus k) \oplus m = 0^L \oplus m = m.$

Variable-Length One-Time Pad

This scheme is similar to OTP but allows messages of variable length up to $L$ .

Key space: $\mathcal{K} := \left\lbrace 0,1 \right\rbrace^L$
Message & ciphertext spaces: $\mathcal{M} := \mathcal{C} := \left\lbrace 0,1 \right\rbrace^{\leq L}$

For a message $m$ of length $\ell$ , encryption and decryption are defined as:

Encryption:
$E(k, m) := k[0 \dots \ell -1] \oplus m$
Decryption:
$D(k, c) := k[0 \dots \ell -1] \oplus c$

where $k[0 \dots \ell -1]$ represents the first $\ell$ bits of the key $k$ .

Substitution Cipher

A substitution cipher is an encryption method in which each character of the plaintext is replaced with another character according to a fixed rule.

Alphabet: A finite set of characters, e.g., the 26 English letters (A–Z) and space.
Message & ciphertext spaces: The set of all fixed-length strings of length $L$ , denoted as $\mathcal{M} = \mathcal{C} = \Sigma^L$ .
Key space: Each key is a permutation (rearrangement) of the alphabet $\Sigma$ , meaning the key space has a size of $\left| \mathcal{K} \right| = \left| \Sigma \right|!$ (which is extremely large).

Given a permutation key $k$ , which defines a unique mapping of characters:

Encryption: Each character in the plaintext is replaced by its corresponding character in the permutation $k$ :

Here, $m[i]$ represents the $i$ -th character of the message $m$ .
Decryption: The inverse permutation $k^{-1}$ is applied to recover the original message:

The function $k^{-1}$ is the inverse mapping of $k$ , allowing decryption back to the original plaintext.

Additive One-time Pad

This cipher is defined as follows:

Key, message, and ciphertext spaces:
$\mathcal{K} = \mathcal{M} = \mathcal{C} = \left\lbrace 0, \dots, n-1 \right\rbrace.$
Encryption:
$E(k, m) = (m + k) \mod n.$
Decryption:
$D(k, c) = (c - k) \mod n.$

Correctness verification:

$D(k, E(k, m)) = (m + k - k) \mod n = m.$

Perfect Security

So far, we have only defined the basic syntax and correctness requirements of a Shannon cipher. The next crucial question is: What does it mean for a cipher to be secure? Intuitively, a secure cipher is one in which an encrypted message remains well hidden, even if an adversary intercepts the ciphertext. However, translating this intuition into a mathematically rigorous and practically meaningful definition is a major challenge. While cryptographic systems have been used for centuries, it is only in the past few decades that mathematically sound security definitions have been developed.

Next, we will formalize the notion of perfect security—the gold standard of security (at least when considering encryption of a single message, without integrity concerns). We will see that this level of security is achievable; in fact, we will prove that the one-time pad satisfies this definition. However, the one-time pad is impractical because the key must be as long as the message: if Alice wants to send a 1GB file to Bob, they must share a 1GB key! Unfortunately, this is unavoidable: we will also prove that any cipher achieving perfect security must have a key space at least as large as the message space. This result motivates the development of a weaker yet still practical definition of security, allowing long messages to be encrypted with shorter keys.

If Alice encrypts a message $m$ using a key $k$ , and an eavesdropper intercepts the ciphertext $c$ , then $m$ remains secret only if $k$ is unpredictable. At a minimum, $k$ must be randomly chosen from a sufficiently large key space. When we say that $m$ is well hidden, we mean that determining $m$ from $c$ without knowing $k$ should be infeasible. However, this is not enough. Although the adversary does not know $k$ , we assume they know the encryption algorithm and the key distribution. In fact, we assume that each time a message is encrypted, the key $k$ is chosen uniformly at random from the key space.

Additionally, the adversary may have some prior knowledge about the encrypted message. Based on context, they might know that the set of possible messages is small and even have some probability distribution over these messages. For example, suppose the adversary knows that Alice’s message $m$ is either

$m_0 = \text{“ATTACK AT DAWN”}$

$m_1 = \text{“ATTACK AT DUSK”}$

and that Alice is equally likely to choose either message. Before seeing the ciphertext $c$ , the adversary has a 50% probability of guessing correctly. However, after observing $c$ , both messages might still be possible—there could exist keys $k_0$ and $k_1$ such that:

$E(k_0, m_0) = c, \quad E(k_1, m_1) = c.$

Thus, the adversary cannot be certain whether $m = m_0$ or $m = m_1$ . However, they can still make an educated guess. Suppose there are 800 keys $k_0$ satisfying $E(k_0, m_0) = c$ and 600 keys $k_1$ satisfying $E(k_1, m_1) = c$ . Then, the adversary’s optimal guessing probability is

$\frac{800}{800 + 600} \approx 57%.$

This is slightly better than the 50% probability without knowing the ciphertext. The formal definition of perfect security eliminates this advantage entirely—meaning that learning the ciphertext does not increase the probability of correctly guessing the message, nor does it reveal any partial information about the message.

Without further delay, we formally define perfect security. In this definition, we consider a randomized experiment where the key $\mathbf{k}$ is chosen uniformly from the key space. We denote $\mathbf{k}$ as a random variable representing this key. For each message $m$ , we define $E(\mathbf{k}, m)$ as another random variable, representing the result of encrypting $m$ using the randomly chosen key $\mathbf{k}$ . Thus, every message $m$ induces a distinct random variable $E(\mathbf{k}, m)$ .

Definition (perfect security)

Let $\mathcal{E} = (E, D)$ be a Shannon cipher defined over the set $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ . Consider a randomized experiment in which the random variable $k$ is uniformly distributed over the key space $\mathcal{K}$ . If for all $m_0, m_1 \in \mathcal{M}$ and for all $c \in \mathcal{C}$ , we have:

$\Pr[E(k, m_0) = c] = \Pr[E(k, m_1) = c],$

then $\mathcal{E}$ is said to be a perfectly secret Shannon cipher.

There are multiple equivalent formulations of perfect security that we shall explore. Below are some equivalent statements:

Theorem 1 Let $\mathcal{E} = (E, D)$ be a Shannon cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ . Then, the following conditions are equivalent:

$\mathcal{E}$ is perfectly secret.
For every $c \in \mathcal{C}$ , there exists an integer $N_c$ (which may depend on $c$ ) such that for every $m \in \mathcal{M}$ , we have: $\left| \left \lbrace k \in \mathcal{K} \mid E(k, m) = c \right \rbrace \right| = N_c.$
If the random variable $k$ is uniformly distributed over $\mathcal{K}$ , then each random variable $E(k, m)$ , for every $m \in \mathcal{M}$ , is also uniformly distributed.

We shall prove this theorem. First, we can restate condition (2) as follows: for every $c \in \mathcal{C}$ , there exists a probability $P_c$ (depending on $c$ ) such that for every $m \in \mathcal{M}$ , we have:

$\Pr[E(\mathbf{k}, m) = c] = P_c.$

Here, $\mathbf{k}$ is a random variable uniformly distributed over $\mathcal{K}$ , which implies that $P_c = \frac{N_c}{|\mathcal{K}|}$ , where $N_c$ is the number of keys $k$ satisfying $E(k, m) = c$ . From this, it is evident that $E(k, m)$ is also uniformly distributed, making statements (2) and (3) equivalent.

Next, we prove that $(1) \Rightarrow (2)$ . Suppose $\mathcal{E}$ is perfectly secret. We aim to establish condition (2). Fix an arbitrary ciphertext $c \in \mathcal{C}$ . Choose any message $m_0 \in \mathcal{M}$ . Define $P_c := \Pr[E(k, m_0) = c]$ . By the definition of perfect security, for every $m \in \mathcal{M}$ , we have:

$\Pr[E(k, m) = c] = \Pr[E(k, m_0) = c] = P_c.$

This proves condition (2).

Finally, we prove that $(2) \Rightarrow (1)$ . Assume (2) holds, and we show that $\mathcal{E}$ is perfectly secret. For any two messages $m_0, m_1 \in \mathcal{M}$ and any ciphertext $c \in \mathcal{C}$ , condition (2) implies that:

$\Pr[E(k, m_0) = c] = P_c = \Pr[E(k, m_1) = c].$

Since this holds for all $m_0, m_1$ , $\mathcal{E}$ satisfies the definition of perfect security.

Returning to Shannon’s examples, we first prove that the one-time pad achieves perfect security. Suppose $\mathcal{E} = (E, D)$ is a one-time pad cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ , where $\mathcal{K} := \mathcal{M} := \mathcal{C} := \left \lbrace0,1\right \rbrace^L$ .

For any fixed message $m \in \left \lbrace0,1\right \rbrace^L$ and ciphertext $c \in \left \lbrace0,1\right \rbrace^L$ , there exists a unique key $k \in \left \lbrace0,1\right \rbrace^L$ satisfying:

$k \oplus m = c,$

specifically, $k := m \oplus c$ . Thus, $\mathcal{E}$ satisfies condition (ii) in Theorem 1 (with $N_c = 1$ for each $c$ ).

Now consider the variable-length one-time pad. This scheme does not satisfy our definition of perfect security because the ciphertext length directly reveals the plaintext length. Suppose we select a one-character message $m_0$ and a two-character message $m_1$ . Assume $c$ is a one-character ciphertext, and $k$ is a random key uniformly distributed over the key space. Then,

$\Pr[E(k, m_0) = c] = \frac{1}{2}, \quad \Pr[E(k, m_1) = c] = 0.$

This provides a direct counterexample to the definition of perfect security.

We note that the variable-length one-time pad cannot satisfy perfect security simply because any ciphertext reveals the length of its corresponding plaintext. However, in a certain sense (not yet formally defined), this is the only information leaked. Whether this is a flaw in the scheme or in the definition of perfect security is debatable.

On one hand, if messages vary significantly in length, one can pad them to a uniform size. However, this may be impractical due to bandwidth inefficiencies.
On the other hand, in some applications, leaking only the message length can be dangerous. For example, encrypting the response “yes” or “no” would leak the answer unless “no” is padded to match the length of “yes.”

Finally, let us examine the substitution cipher. There are multiple ways to see that this scheme is not perfectly secret. For example, consider two messages $m_0, m_1 \in \Sigma^L$ where the first two characters of $m_0$ are identical, but those of $m_1$ differ: $m_0[0] = m_0[1]$ but $m_1[0] \neq m_1[1]$ . If $k$ is a permutation of $\Sigma$ , then for any ciphertext $c$ , we have:

$c[0] = c[1] \text{ if } c = E(k, m_0), \quad c[0] \neq c[1] \text{ if } c = E(k, m_1).$

Thus, $E(k, m_0)$ and $E(k, m_1)$ have distinguishable distributions, violating perfect security.

A more practical attack exploits structured headers in encrypted emails, such as “FROM” appearing in predictable locations. Knowing this, an attacker can infer portions of the plaintext, demonstrating that substitution ciphers fail to achieve perfect security.

In contrast, the additive one-time pad can be easily verified to be perfectly secret, as it satisfies condition (ii) in Theorem 1 (with $N_c = 1$ for each $c$ ).

The next two theorems introduce alternative characterizations of perfect security. In the first theorem, we assume that an eavesdropper applies a logical predicate $\phi$ to the ciphertext they intercept. The predicate $\phi$ is a Boolean function over the ciphertext space, which could be as simple as a parity check function (i.e., checking whether the number of 1-bits in the ciphertext is even or odd), or it could be a more sophisticated statistical test. Regardless of how intricate $\phi$ may be, perfect security ensures that its value on the ciphertext reveals no information whatsoever about the original message.

Theorem 2. Let $\mathcal{E} = (E, D)$ be a Shannon cipher defined over the tuple $(\mathcal{K}, \mathcal{M}, \mathcal{C})$ . Consider a random experiment where the key $\mathbf{k}$ is a uniformly distributed random variable over the key space $\mathcal{K}$ . Then, $\mathcal{E}$ is perfectly secret if and only if, for every predicate $\phi$ over the ciphertext space $\mathcal{C}$ and for all $m_0, m_1 \in \mathcal{M}$ , we have:

$\Pr[\phi(E(\mathbf{k}, m_0))] = \Pr[\phi(E(\mathbf{k}, m_1))].$

The forward direction of this theorem follows from a straightforward computation.

Assume that $\mathcal{E}$ is perfectly secret, and let $\phi$ be an arbitrary predicate, with two messages $m_0$ and $m_1$ .
Define the set $S := \left \lbrace c \in \mathcal{C} \mid \phi(c) \right \rbrace$ , which consists of all ciphertexts for which $\phi$ evaluates to true. This allows us to express:

$\Pr[\phi(E(\mathbf{k}, m_0))] = \sum_{c \in S} \Pr[E(\mathbf{k}, m_0) = c].$

By the definition of perfect security, we have:

$\sum_{c \in S} \Pr[E(\mathbf{k}, m_0) = c] = \sum_{c \in S} \Pr[E(\mathbf{k}, m_1) = c] = \Pr[\phi(E(\mathbf{k}, m_1))].$

Now, for the reverse direction, suppose that $\mathcal{E}$ is not perfectly secret. This means there exist messages $m_0, m_1$ and a ciphertext $c$ such that $\Pr[E(\mathbf{k}, m_0) = c] \neq \Pr[E(\mathbf{k}, m_1) = c]$ .

Consider the predicate $\phi$ that evaluates true for ciphertext $c$ and false for all other ciphertexts. Then, we obtain:

$\Pr[\phi(E(\mathbf{k}, m_0))] = \Pr[E(\mathbf{k}, m_0) = c] \neq \Pr[E(\mathbf{k}, m_1) = c] = \Pr[\phi(E(\mathbf{k}, m_1))].$

This contradiction establishes the theorem.

The next theorem provides an alternative formulation of perfect security, stating that a ciphertext reveals no information about the original message.

Suppose that $\mathbf{m}$ is a random variable distributed over the message space $\mathcal{M}$ (not necessarily uniformly).
Suppose that $\mathbf{k}$ is a uniformly distributed random variable over the key space $\mathcal{K}$ and is independent of $\mathbf{m}$ .
Define $\mathbf{c} := E(\mathbf{k}, \mathbf{m})$ as the random variable distributed over the ciphertext space $\mathcal{C}$ .

Corollary: perfect security ensures that $\mathbf{c}$ and $\mathbf{m}$ are statistically independent. One way to express this independence is:

$\Pr[\mathbf{m} = m \mid \mathbf{c} = c] = \Pr[\mathbf{m} = m], \quad \forall m \in \mathcal{M}, c \in \mathcal{C}, \text{ with positive probability.}$

Intuitive Interpretation: After observing a ciphertext $c$ , we gain no additional information about the original message $m$ compared to before seeing $c$ . Another way to describe this independence is:

$\Pr[\mathbf{c} = c \mid \mathbf{m} = m] = \Pr[\mathbf{c} = c], \quad \forall m \in \mathcal{M}, c \in \mathcal{C}.$

This means that the choice of message $m$ does not affect the distribution of the ciphertext.

Remark: The assumption that $\mathbf{m}$ and $\mathbf{k}$ are independent random variables is reasonable. When using any encryption scheme, choosing a key based on the message or choosing a message based on the key is generally a bad idea.

Theorem 3. Let $\mathcal{E} = (E, D)$ be a Shannon cipher defined over the tuple $(\mathcal{K, M, C})$ . Consider a random experiment where $\mathbf{k}$ and $\mathbf{m}$ are random variables satisfying:

$\mathbf{k}$ is uniformly distributed over $\mathcal{K}$ ,
$\mathbf{m}$ is distributed over $\mathcal{M}$ ,
$\mathbf{k}$ and $\mathbf{m}$ are independent.

Define the random variable $\mathbf{c} := E(\mathbf{k}, \mathbf{m})$ . Then:

If $\mathcal{E}$ is perfectly secret, then $\mathbf{c}$ and $\mathbf{m}$ are independent.
Conversely, if $\mathbf{c}$ and $\mathbf{m}$ are independent, and every message in $\mathcal{M}$ has a nonzero probability, then $\mathcal{E}$ is perfectly secret.

Proof. We will prove the theorem in both directions.

(1) If $\mathcal{E}$ is perfectly secret, then $\mathbf{c}$ and $\mathbf{m}$ are independent.

For any $m \in \mathcal{M}$ and $c \in \mathcal{C}$ , we need to show:

$\Pr[\mathbf{c} = c \wedge \mathbf{m} = m] = \Pr[\mathbf{c} = c] \Pr[\mathbf{m} = m].$

Expanding the left-hand side:

$\begin{aligned}\Pr[\mathbf{c} = c \wedge \mathbf{m} = m] &= \Pr[E(\mathbf{k}, \mathbf{m}) = c \wedge \mathbf{m} = m] \\&= \Pr[E(\mathbf{k}, m) = c \wedge \mathbf{m} = m].\end{aligned}$

Since $\mathbf{k}$ and $\mathbf{m}$ are independent, we obtain:

$\Pr[E(\mathbf{k}, m) = c \wedge \mathbf{m} = m] = \Pr[E(\mathbf{k}, m) = c] \Pr[\mathbf{m} = m].$

To establish independence, we need to show:

$\Pr[E(\mathbf{k}, m) = c] = \Pr[\mathbf{c} = c].$

Indeed, we have:

$\Pr[\mathbf{c} = c] = \Pr[E(\mathbf{k}, \mathbf{m}) = c].$

Applying the law of total probability:

$\begin{aligned}\Pr[\mathbf{c} = c] &= \sum_{m_0 \in \mathcal{M}} \Pr[E(\mathbf{k}, \mathbf{m}) = c \wedge \mathbf{m} = m_0] \\&= \sum_{m_0 \in \mathcal{M}} \Pr[E(\mathbf{k}, m_0) = c \wedge \mathbf{m} = m_0].\end{aligned}$

Since $\mathbf{k}$ and $\mathbf{m}$ are independent:

$\Pr[E(\mathbf{k}, m_0) = c \wedge \mathbf{m} = m_0] = \Pr[E(\mathbf{k}, m_0) = c] \Pr[\mathbf{m} = m_0].$

From the assumption that $\mathcal{E}$ is perfectly secret, we know that:

$\Pr[E(\mathbf{k}, m_0) = c] = \Pr[E(\mathbf{k}, m) = c] \quad \forall m_0 \in \mathcal{M}.$

Substituting this into the summation:

$\Pr[\mathbf{c} = c] = \sum_{m_0 \in \mathcal{M}} \Pr[E(\mathbf{k}, m) = c] \Pr[\mathbf{m} = m_0].$

Since the probabilities $\Pr[\mathbf{m} = m_0]$ sum to 1 over $\mathcal{M}$ , we conclude:

$\Pr[\mathbf{c} = c] = \Pr[E(\mathbf{k}, m) = c].$

Thus, $\mathbf{c}$ and $\mathbf{m}$ are independent.

(2) If $\mathbf{c}$ and $\mathbf{m}$ are independent, then $\mathcal{E}$ is perfectly secret.

Assume that $\mathbf{c}$ and $\mathbf{m}$ are independent, and that every message in $\mathcal{M}$ has nonzero probability. Consider any $m \in \mathcal{M}$ and $c \in \mathcal{C}$ . We will prove that:

$\Pr[E(\mathbf{k}, m) = c] = \Pr[\mathbf{c} = c],$

which immediately implies perfect security. Since $\Pr[\mathbf{m} = m] \neq 0$ , we obtain:

$\begin{aligned}\Pr[E(\mathbf{k}, m) = c] \Pr[\mathbf{m} = m] &= \Pr[E(\mathbf{k}, m) = c \wedge \mathbf{m} = m] \quad \text{(since $\mathbf{k}$ and $\mathbf{m}$ are independent)} \\&= \Pr[E(\mathbf{k}, \mathbf{m}) = c \wedge \mathbf{m} = m] \\&= \Pr[\mathbf{c} = c \wedge \mathbf{m} = m] \\&= \Pr[\mathbf{c} = c] \Pr[\mathbf{m} = m] \quad \text{(since $\mathbf{c}$ and $\mathbf{m}$ are independent)}.\end{aligned}$

Dividing both sides by $\Pr[\mathbf{m} = m]$ (which is nonzero), we conclude:

$\Pr[E(\mathbf{k}, m) = c] = \Pr[\mathbf{c} = c] = \Pr[E(\mathbf{k}, \mathbf{m})].$

Since this holds for all $m \in \mathcal{M}$ and $c \in \mathcal{C}$ , the encryption scheme $\mathcal{E}$ is perfectly secret.

The bad news is saved for last. The following theorem shows that perfect security is such a strong concept that no encryption scheme can do better than the one-time pad (OTP): the key must be at least as long as the message.

As a consequence, perfectly secure encryption schemes are impractical in most real-world applications. For example, if Alice wants to send Bob a 1GB video file securely, they must agree on a secret key of at least 1GB in length beforehand. This leads to the following theorem.

Theorem 4 (Shannon’s Theorem). Let $\mathcal{E} = (E, D)$ be a Shannon encryption scheme defined over $(\mathcal{K, M, C})$ . If $\mathcal{E}$ achieves perfect security, then:

$|\mathcal{K}| \geq |\mathcal{M}|.$

As with the previous theorems, we will prove this statement. Suppose, for contradiction, that $|\mathcal{K}| < |\mathcal{M}|$ . We will show that $\mathcal{E}$ cannot be perfectly secure by demonstrating the existence of two messages $m_0, m_1$ and a ciphertext $c$ such that:

$\Pr[E(\mathbf{k}, m_0) = c] > 0, \quad \text{(1)}$

$\Pr[E(\mathbf{k}, m_1) = c] = 0. \quad \text{(2)}$

where $\mathbf{k}$ is a uniformly distributed random variable over $\mathcal{K}$ .

Constructing the Counterexample

Choose an arbitrary message $m_0 \in \mathcal{M}$ and a key $k_0 \in \mathcal{K}$ .
Define $c := E(k_0, m_0)$ .
Clearly, Equation (1) holds because there is at least one key $k_0$ that encrypts $m_0$ to $c$ .

Next, consider the set:

$S := \left \lbrace D(k_1, c) \mid k_1 \in \mathcal{K} \right \rbrace.$

The set $S$ contains all possible messages that could be decrypted from ciphertext $c$ using any key in $\mathcal{K}$ . Since the number of possible decryptions cannot exceed the number of available keys, we have:

$|S| \leq |\mathcal{K}| < |\mathcal{M}|.$

Because $|S| < |\mathcal{M}|$ , there must exist at least one message $m_1 \in \mathcal{M}$ not included in $S$ . In other words,

$m_1 \notin S.$

This means that no key in $\mathcal{K}$ can decrypt $c$ to $m_1$ . We now prove this formally.

Contradiction Proof
Assume, for contradiction, that there exists a key $k_1$ such that:

$E(k_1, m_1) = c.$

By the correctness of the encryption scheme (i.e., decryption with the correct key must recover the original message), we have:

$D(k_1, c) = D(k_1, E(k_1, m_1)) = m_1.$

However, this contradicts our choice of $m_1 \notin S$ , meaning that such a key cannot exist. Therefore, we conclude:

$\Pr[E(k, m_1) = c] = 0.$

Final Argument
From Equations (1) and (2), we see that there exists at least one ciphertext $c$ that can only be generated by some messages but not others. This means that the probability distribution of ciphertexts is not uniform across all messages, violating the definition of perfect security.

Since perfect security requires that every message be equally likely to produce any given ciphertext, we conclude that $\mathcal{E}$ cannot be perfectly secure, proving the necessary condition:

$|\mathcal{K}| \geq |\mathcal{M}|.$

While Perfect Security represents an idealized form of encryption, it is ultimately impractical due to its strict key-length requirement. Shannon’s Theorem proves that the secret key must be at least as long as the message to achieve this level of security, limiting its usability in real-world scenarios.

References

Dan Boneh and Victor Shoup. A Graduate Course in Applied Cryptography. Retrieved from https://toc.cryptobook.us/

Connect with Me

Connect with me on Facebook, LinkedIn, via email at dangduongminhnhat2003@gmail.com, GitHub, or by phone at +84 829 258 815.