In quantum mechanics we assume the following:

  1. Each observable is associated with a Hermitian operator on a Hilbert space H. Its eigenvalues are real, and eigenstates with different eigenvalues are orthogonal to each other, so the eigenstates can be chosen to form an orthonormal basis of H.
  2. Upon observation, the measured value is one of the eigenvalues, and the wave function collapses onto a corresponding eigenstate.
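For a finite-dimensional toy version of assumption 1, the claims can be checked numerically. This is a minimal sketch, assuming NumPy and a randomly generated 4x4 Hermitian matrix as a stand-in for an observable; the matrix and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 4x4 Hermitian matrix: M + M^dagger is always Hermitian.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T

# General-purpose eigensolver (does not assume Hermiticity):
# the eigenvalues still come out real up to rounding error.
eigenvalues = np.linalg.eigvals(A)
eigenvalues_are_real = np.allclose(eigenvalues.imag, 0.0, atol=1e-10)

# Hermitian eigensolver: the columns of V are the eigenstates.
w, V = np.linalg.eigh(A)
eigenstates_are_orthonormal = np.allclose(V.conj().T @ V, np.eye(4))
```

Both flags should come out `True`, which is the matrix analogue of "real eigenvalues, orthonormal eigenbasis".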

I don’t want to discuss the deeper interpretational questions here, since I have no real insight into them. That’s why I take the Copenhagen interpretation: shut up and compute!

This article is a note on my understanding of the physical and mathematical meaning of commutators, mainly an answer to the question: “why do commuting operators have a common set of eigenstates?”


  • Suppose $\hat{A}, \hat{B}$ are operators $H \to H$.
  • Then their commutator, defined as $[\hat{A}, \hat{B}] \equiv \hat{A}\hat{B} - \hat{B}\hat{A}$, is also an operator $H \to H$.
  • We are particularly interested in observables, so unless stated otherwise, the operators mentioned below are assumed to be Hermitian.
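In finite dimensions the definition translates directly into matrix arithmetic. The sketch below assumes NumPy and uses the Pauli matrices as example Hermitian operators; the helper name `commutator` is introduced here for illustration.

```python
import numpy as np

def commutator(A, B):
    """Return [A, B] = AB - BA for two square matrices."""
    return A @ B - B @ A

# Pauli matrices: three Hermitian 2x2 operators.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

C = commutator(sigma_x, sigma_y)

# [sigma_x, sigma_y] = 2i sigma_z
matches_known_result = np.allclose(C, 2j * sigma_z)
# The commutator of two Hermitian operators is anti-Hermitian: C^dagger = -C.
is_anti_hermitian = np.allclose(C.conj().T, -C)
```

Note that the commutator of two Hermitian operators is anti-Hermitian, which is why a factor of $i$ shows up in results such as $[\hat{X}, \hat{P}] = i\hbar$.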

Uncertainty Principle

The most famous pair of operators might be the position operator $\hat{X}$ and the momentum operator $\hat{P} = -i\hbar\partial_x$ (1-d case, position representation). Before doing the easy algebra, their domains and codomains really bothered me:

  1. Consider the simplified case (more general ones are similar) where the position x alone describes our system.
  2. Can H be $L^2(x)$? Well, that seems problematic: $\hat{X} \psi(x)$ might not be normalizable. Technically, the domain of $\hat{X}$ is $D(\hat{X}) = \{\psi \in L^2(x) \,|\, \hat{X} \psi \in L^2(x) \}$, which is a proper subset of the codomain $L^2(x)$. Similarly for $\hat{P}$: the derivative of a differentiable function need not itself be differentiable.
  3. It doesn’t help to define the operators over a broader function space, for example $\mathbb{C}^{\mathbb{R}}$. The eigenstates $\delta(x - x_0)$ of $\hat{X}$ still behave like a basis, so $\hat{X}$ seems to make sense. However, the eigenstates of $\hat{P}$, namely the Fourier basis, only span the space of square-integrable functions, i.e. $L^2(x)$.
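  4. The trick is to add delicate restrictions to $L^2(x)$ so that the wave functions become realistic. Here we require the wave functions in H to be square-integrable, smooth ($C^{\infty}$), and compactly supported. (Strictly speaking, this is a dense subspace of $L^2(x)$ rather than a complete Hilbert space, but it is enough for the algebra below.)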
  5. It’s easy to verify that $H \subset D(\hat{X}) \cap D(\hat{P})$ and $\forall \psi(x) \in H, \hat{X}\psi(x) \in H \land \hat{P}\psi(x) \in H$. Finally we get a suitable function space for the commutator.
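On this space the commutator can be evaluated by letting both orderings act on an arbitrary $\psi \in H$ and applying the product rule (a short worked check using only the definitions above):

$$[\hat{X}, \hat{P}]\psi = x\,(-i\hbar\,\partial_x \psi) - (-i\hbar)\,\partial_x (x\psi) = -i\hbar\, x\,\partial_x \psi + i\hbar\,\psi + i\hbar\, x\,\partial_x \psi = i\hbar\,\psi$$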

Now we have $[\hat{X} , \hat{P}] = i\hbar$, a constant multiple of the identity operator. The physical interpretation is “position and momentum along the same direction can’t be observed simultaneously”:

  1. Position x is observed iff the wave function collapses onto an eigenstate of $\hat{X}$, and likewise momentum p with $\hat{P}$.
  2. Suppose both quantities are measured simultaneously. Then there must be a shared eigenstate $\Phi_{x,p}(x)$ for the collapsed wave function to reside in. We denote its eigenvalues as $x, p$.
  3. A contradiction occurs:
    $$i \hbar \Phi_{x,p} = [\hat{X}, \hat{P}] \Phi_{x,p} = x p \Phi_{x,p} - x p \Phi_{x,p} = 0$$
  4. No non-zero wave function satisfies this equation, so the hypothesis is rejected.
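The commutator value $[\hat{X}, \hat{P}] = i\hbar$ used in step 3 can also be checked numerically. The sketch below is only an approximation under stated assumptions: it uses NumPy, sets $\hbar = 1$, replaces the derivative with finite differences on a grid, and takes a Gaussian as the test function (smooth and rapidly decaying, though not strictly compactly supported). The helper names `X` and `P` are illustrative.

```python
import numpy as np

hbar = 1.0                          # assumption: natural units
x = np.linspace(-10.0, 10.0, 2001)  # grid with spacing 0.01
psi = np.exp(-x**2 / 2)             # smooth test function, negligible at the grid edges

def P(f):
    """Momentum operator -i*hbar*d/dx, approximated by finite differences."""
    return -1j * hbar * np.gradient(f, x)

def X(f):
    """Position operator: multiplication by x."""
    return x * f

# Apply both orderings to psi and compare with i*hbar*psi.
commutator_on_psi = X(P(psi)) - P(X(psi))
max_error = np.max(np.abs(commutator_on_psi - 1j * hbar * psi))
```

The residual `max_error` is small and shrinks as the grid is refined, since it comes only from the finite-difference approximation of the derivative.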

Mathematically, we can apply the Cauchy-Schwarz inequality to prove the Robertson uncertainty relation:

$$\sigma_{A}\sigma_{B} \geq \left| \frac{1}{2i} \langle [\hat{A}, \hat{B}] \rangle \right| = \frac{1}{2} \left| \langle [\hat{A}, \hat{B}] \rangle \right|$$

In the case of position and momentum, we get $\sigma_x \sigma_p \geq \hbar/2$, the very “Heisenberg uncertainty principle” found in high-school physics textbooks.
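The Robertson relation holds for any pair of Hermitian operators, so it can be spot-checked on a small example. This is a minimal sketch, assuming NumPy, the Pauli matrices $\sigma_x, \sigma_y$ as the two observables, and a random normalized qubit state; the helper names `expectation` and `std_dev` are illustrative.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def expectation(op, state):
    """<state| op |state> for a normalized state vector."""
    return np.vdot(state, op @ state)

def std_dev(op, state):
    """Standard deviation sqrt(<op^2> - <op>^2) of a Hermitian operator."""
    mean = expectation(op, state).real
    mean_sq = expectation(op @ op, state).real
    return np.sqrt(max(mean_sq - mean**2, 0.0))

# Random normalized two-component state.
rng = np.random.default_rng(1)
state = rng.normal(size=2) + 1j * rng.normal(size=2)
state /= np.linalg.norm(state)

# Left and right sides of the Robertson relation.
lhs = std_dev(sigma_x, state) * std_dev(sigma_y, state)
rhs = 0.5 * abs(expectation(sigma_x @ sigma_y - sigma_y @ sigma_x, state))
relation_holds = lhs >= rhs - 1e-12
```

A numerical check like this is of course no proof; it only illustrates that the bound is respected for the sampled state.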

Commuting Operators

The real question is: what happens when two operators commute, i.e. $[\hat{A}, \hat{B}] = 0$? From the definition we can easily derive:

Prop 1. For any $\Phi_a(x)$ such that $\hat{A}\Phi_a(x) = a \Phi_a(x)$, we have $\hat{A}(\hat{B}\Phi_a(x)) = \hat{B}(\hat{A}\Phi_a(x)) = a \hat{B}\Phi_a(x)$. That is, $\hat{B}$ acting on any eigenstate of $\hat{A}$ yields either zero or another (possibly the same) eigenstate of $\hat{A}$ with the same eigenvalue.
Prop 2. Vice versa, with the roles of $\hat{A}$ and $\hat{B}$ exchanged.
Prop 3. Suppose a is a non-degenerate eigenvalue of $\hat{A}$, i.e. it has only one normalized eigenstate $\Phi_a(x)$ (up to a phase). Then by Prop 1, $\hat{B} \Phi_a(x)$ must be proportional to $\Phi_a(x)$, i.e. $\Phi_a(x)$ is also an eigenstate of $\hat{B}$.
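A small matrix example makes Props 1 and 3 concrete. The sketch below assumes NumPy; the 3x3 matrices `A` and `B` are hypothetical, chosen so that `A` has a doubly degenerate eigenvalue 1 and a non-degenerate eigenvalue 2, and so that the two matrices commute.

```python
import numpy as np

# Hypothetical example: eigenvalue 1 of A is doubly degenerate, eigenvalue 2 is not.
A = np.diag([1.0, 1.0, 2.0])
# B mixes the two degenerate directions of A and leaves the third alone, so [A, B] = 0.
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 5.0]])

commute = np.allclose(A @ B, B @ A)

# Prop 1: B maps an eigenstate of A (eigenvalue 1) to another eigenstate
# of A with the same eigenvalue.
phi = np.array([1.0, 0.0, 0.0])
B_phi = B @ phi                      # equals (0, 1, 0): a different state in the same eigenspace
prop1_holds = np.allclose(A @ B_phi, 1.0 * B_phi)

# Prop 3: eigenvalue 2 is non-degenerate, so its eigenstate is also an eigenstate of B.
chi = np.array([0.0, 0.0, 1.0])
prop3_holds = np.allclose(B @ chi, 5.0 * chi)
```

Here `phi` is an eigenstate of `A` but not of `B`, which is exactly why the degenerate case needs the extra argument that follows.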

We have shown that any non-degenerate eigenstate of $\hat{A}$ is a shared eigenstate. To prove the compatibility theorem, “commuting operators possess a complete set of common eigenstates”, by symmetry we only need to show that “for any degenerate eigenvalue a of $\hat{A}$, there are shared eigenstates whose span covers all the eigenstates of a”.

  1. Both operators are Hermitian, thus linear. All eigenstates with eigenvalue a, together with the zero vector, form a linear subspace $H_a$ of $H$ (for simplicity, consider only the finite-dimensional case).
  2. For any eigenstate $\Phi_b(x)$ of $\hat{B}$, project it onto both $H_a$ and its orthogonal complement $H_c$: $\Phi_b(x) = \beta_a + \beta_c$.
  3. $b\beta_a + b\beta_c = b\Phi_b(x) = \hat{B}\Phi_b(x) = \hat{B}\beta_a + \hat{B}\beta_c$. By Prop 1, $\hat{B}\beta_a$ lies in $H_a$. Since $\hat{B}$ is Hermitian, $\langle \hat{B}\beta_c, \phi \rangle = \langle \beta_c, \hat{B}\phi \rangle = 0$ for every $\phi \in H_a$, so $\hat{B}\beta_c$ lies in $H_c$, which is perpendicular to $H_a$.
  4. Hence $b\beta_a = \hat{B}\beta_a$ and $b\beta_c = \hat{B}\beta_c$: each non-zero component is an eigenstate of $\hat{B}$ with eigenvalue b. $\beta_a$ is also a shared eigenstate since it lies in $H_a$.
  5. Consider a basis $S_B$ of $H$ consisting of eigenstates of $\hat{B}$. Their $\beta_a$ components with different eigenvalues b are orthogonal, and the non-zero ones must span $H_a$; otherwise the span of $S_B$ would not cover the whole space. Components sharing the same b can be orthogonalized within that eigenspace of $\hat{B}$ if necessary.
  6. Choose a linearly independent set of the non-zero $\beta_a$ parts, and for the rest of the space use the $\beta_c$ components. This yields a new basis whose members in $H_a$ are shared eigenstates.
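For matrices, a common basis can be built in a few lines. The sketch below assumes NumPy and uses a closely related but different route from the projection argument above: instead of projecting eigenstates of $\hat{B}$ onto $H_a$, it diagonalizes the restriction of $\hat{B}$ inside each eigenspace $H_a$ of $\hat{A}$. The function name `common_eigenbasis`, the tolerance `tol`, and the example matrices are all hypothetical.

```python
import numpy as np

def common_eigenbasis(A, B, tol=1e-9):
    """Return a unitary V whose columns are eigenvectors of both A and B.

    Assumes A and B are Hermitian matrices with [A, B] = 0.
    """
    eigvals, U = np.linalg.eigh(A)   # eigenvalues ascending, columns orthonormal
    columns = []
    start, n = 0, len(eigvals)
    while start < n:
        # Collect the block of (numerically) equal eigenvalues: the subspace H_a.
        stop = start + 1
        while stop < n and abs(eigvals[stop] - eigvals[start]) < tol:
            stop += 1
        basis_a = U[:, start:stop]
        # B leaves H_a invariant, so its restriction is a small Hermitian matrix.
        B_restricted = basis_a.conj().T @ B @ basis_a
        _, W = np.linalg.eigh(B_restricted)
        # Rotate the basis of H_a into eigenvectors of B.
        columns.append(basis_a @ W)
        start = stop
    return np.hstack(columns)

# Example: eigenvalue 1 of A is doubly degenerate, and B mixes that eigenspace.
A = np.diag([1.0, 1.0, 2.0])
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 5.0]])

V = common_eigenbasis(A, B)
A_diag = V.conj().T @ A @ V   # diagonal in the common basis
B_diag = V.conj().T @ B @ V   # diagonal in the same basis
```

In the resulting basis each vector is labelled by a pair of eigenvalues (a, b), and in this example the pairs are all distinct even though a alone is degenerate.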

So far, we have not only proved the assertion but also come up with a way to actually construct the common basis. For infinite-dimensional cases, things become much trickier, but the basic intuition is similar. To see this, consider the commutator $[\hat{X}, \hat{Y}] = 0$ of two of the 3-d position operators.

Physically, if two observables commute, measuring one does not disturb the other. Their information is additive, and neither of them alone is enough to describe nature. No wonder system states are specified by a complete set of commuting observables (CSCO).