
Section 3.3 Ideals and quotients

Let \(R,S\) be rings. A ring homomorphism \(\varphi:R\to S\) is (as usual) a structure-preserving map, in this case taking sums to sums and products to products. It is in particular a homomorphism of the additive groups of the rings, so its kernel is \(\ker \varphi :=\{r\in R\mid \varphi(r) = 0\}\text{,}\) and is an ideal of the ring \(R.\)
The fundamental homomorphism theorem for rings says that given a ring homomorphism \(\varphi:R\to S\) and any ideal \(I\subseteq \ker\varphi,\) there is a well-defined ring homomorphism \(\varphi_*:R/I \to S\) with \(\varphi_*\circ\pi=\varphi.\) Here \(\pi: R\to R/I\) is the usual projection. In particular the first isomorphism theorem says
\begin{equation*} R/\ker \varphi \cong \Im \varphi. \end{equation*}

Exercise 3.3.1.

Let \(R,S\) be rings with identities and \(\varphi:R \to S\) a (nontrivial) ring homomorphism.

(a)

Show that \(\varphi(1_R)\) is an idempotent in \(S\text{,}\) so \(\varphi(1_R)=1_S\) or \(\varphi(1_R)\) is a zero divisor in \(S.\)
Answer.
\(s:= \varphi(1_R) = \varphi(1_R \cdot 1_R) = \varphi(1_R)\varphi(1_R)=s^2 \) so \(\varphi(1_R)\) is an idempotent. Rearranging we see that
\begin{equation*} \varphi(1_R) (\varphi(1_R) - 1_S) = 0 \end{equation*}
from which the conclusion follows.

(b)

Conclude that if \(S\) is an integral domain, then \(\varphi(1_R)=1_S\text{.}\)
Let \(A\) be a commutative ring with identity, and let \(A[x]\) denote the ring of polynomials in the variable \(x\) with coefficients in \(A.\)

Proof.

Let \(p(x) = b_0+ b_1x + \cdots + b_n x^n\) be the usual expression for the polynomial. Consider the polynomial \(q(x) = p(x+a).\) After a bit of algebra, the polynomial \(q(x)\) has the expression \(q(x) = c_0 + c_1x + \cdots + c_nx^n,\) for some (uniquely determined) \(c_i\in A.\) But then
\begin{equation*} p(x) = q(x-a) = c_0 + c_1(x-a) + \cdots + c_n (x-a)^n. \end{equation*}
For the second statement, consider the evaluation homomorphism \(ev_a: A[x] \to A\) given by \(p(x) \mapsto p(a)\text{.}\) It is immediate that \(ev_a\) is a surjective homomorphism. To compute its kernel, let \(p \in A[x]\) and write \(p\) as \(p(x) = c_0 + c_1(x-a) + \cdots + c_n (x-a)^n.\) Then
\begin{equation*} ev_a(p) = 0\iff p(a)=0 \iff c_0 =0\text{.} \end{equation*}
It follows that \(\ker ev_a = (x-a)\text{,}\) and the result follows from the first isomorphism theorem.
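The "bit of algebra" in the proof can be carried out mechanically. The following is a minimal sketch, assuming integer coefficients and a coefficient list \([b_0, \dots, b_n]\); the names `shift_coefficients` and `evaluate` are ours and purely illustrative. Repeated division by \(x-a\) produces the \(c_k\text{,}\) and the first one is \(c_0 = p(a)\text{,}\) which is the fact used to compute \(\ker ev_a\text{.}\)

```python
def evaluate(coeffs, x):
    """Horner evaluation of b_0 + b_1 x + ... + b_n x^n at x."""
    value = 0
    for b in reversed(coeffs):
        value = value * x + b
    return value


def shift_coefficients(coeffs, a):
    """Return [c_0, ..., c_n] with p(x) = sum of c_k (x - a)^k,
    where coeffs = [b_0, ..., b_n] are the usual coefficients of p."""
    work = list(coeffs)
    shifted = []
    while work:
        # Synthetic division of the current polynomial by (x - a),
        # processing coefficients from the highest degree down.
        partial = []
        carry = 0
        for b in reversed(work):
            carry = carry * a + b
            partial.append(carry)
        # The last value is the remainder, i.e. the value at a.
        shifted.append(partial.pop())
        # What is left is the quotient, listed from highest degree to lowest.
        work = partial[::-1]
    return shifted


# p(x) = 2 + 3x + x^2 rewritten in powers of (x - 1)
p = [2, 3, 1]
c = shift_coefficients(p, 1)
print(c)                       # coefficients c_0, c_1, c_2
print(c[0] == evaluate(p, 1))  # c_0 agrees with p(1)
```

For \(p(x) = 2 + 3x + x^2\) and \(a = 1\) this recovers \(p(x) = 6 + 5(x-1) + (x-1)^2\text{,}\) and \(c_0 = 6 = p(1)\text{.}\)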
We would like to understand finitely generated ideals as well as their quotient rings. Recall a simple but important idea. Suppose that \(S,T\) are subsets of a commutative ring \(R,\) and we wish to compare the ideals \((S)\) and \((T).\) Then
\begin{equation*} (S) = (T) \iff S\subseteq (T) \text{ and } T \subseteq (S), \end{equation*}
which follows from the simpler statement that \((S) \subseteq (T)\) if and only if \(S\subseteq (T).\) Analogous statements hold for groups generated by a set or subspaces spanned by a collection of elements. Consider a few examples.

Example 3.3.2.

Let \(f(x) = x-3\) and \(g(x) = (x-3)(x-5)+7\) be polynomials in \(\Z[x].\) Compare the ideal \(I = (f,g)\) in \(\Z[x]\) versus \(\Q[x].\)
Solution.
Often it is useful to replace one set of generators of an ideal by a simpler set of generators, using the observation above. We claim that in either ring, \(\Z[x]\) or \(\Q[x]\text{,}\)
\begin{equation*} I = (f,g) = (f,7). \end{equation*}
We need only check that \(f,g \in (f,7)\) and \(f,7 \in (f,g)\text{.}\) But of course \(g = f(x-5) + 7 \in (f,7)\) and \(7 = g - f(x-5) \in (f,g)\text{.}\)
So now we simply consider the ideal \(I = (x-3,7)\text{.}\) Since \(7 \in \Q^\times = \Q[x]^\times\text{,}\) viewed as an ideal in \(\Q[x]\text{,}\) \(I = \Q[x]\text{.}\)
Viewed as an ideal in \(\Z[x]\text{,}\) \(I\) is a proper ideal, indeed a maximal ideal, as we shall show a bit later by proving \(\Z[x]/I = \Z[x]/(x-3,7) \cong \Z/7\Z.\)
For now, if you wish to prove it is a proper ideal, it suffices to show that 1 cannot be written as \(h\cdot 7 + h'\cdot (x-3)\) for \(h, h' \in \Z[x].\)

Example 3.3.3.

Every ideal in \(\Z\) is principal: \(I = n\Z =(n)\) for some \(n\in \Z.\)
Solution.
Every ideal \(I\) is a subgroup of the additive group of \(\Z\text{,}\) a cyclic group, so \(I = n\Z\) as a group, but this is also an ideal of \(\Z.\)

Example 3.3.4.

Let \(n\in \Z\text{.}\) The ideal \(I = (n,x)\) is a proper ideal of \(\Z[x]\) iff \(n \ne \pm 1.\)
Solution.
If \(n = \pm 1,\) then \(I\) contains a unit in \(\Z[x]\text{,}\) so \(I = \Z[x].\) For the converse, we assume that \(n \ne \pm 1\) and show that \(1 \not\in I\text{.}\) We proceed by contradiction and suppose that
\begin{equation*} 1 = f\cdot n + g\cdot x \end{equation*}
for some \(f,g \in \Z[x].\)
Notice that \(g\cdot x\) contributes zero to the constant term of \(f\cdot n + g\cdot x\) no matter the choice of \(g\text{,}\) so if \(a_0\) is the constant term of \(f\text{,}\) in order for \(1 = f\cdot n + g\cdot x\text{,}\) it is necessary that
\begin{equation*} 1 = a_0\cdot n. \end{equation*}
But that demands that both \(a_0\) and \(n\) be units of \(\Z\text{,}\) that is, elements of \(\Z[x]^\times = \Z^\times =\{\pm1\}\text{,}\) contrary to our assumption on \(n.\)
Now we would like to consider quotients of rings, and in particular of polynomial rings, but first we recall a few definitions. If \(I,J\) are two ideals of a commutative ring \(R\) with identity, recall the meaning of \(I+J\text{,}\) \(IJ\text{,}\) and \(I\cap J\) (see Definition 1.5.5).

Definition 3.3.5.

Let \(R\) be a commutative ring with identity, and \(P\) an ideal of \(R.\) Then \(P\) is a prime ideal iff
  • \(P\) is a proper ideal
  • For every \(a,b \in R\text{,}\) if \(ab \in P\text{,}\) then either \(a \in P\) or \(b \in P.\)
We remark that in a non-commutative ring, a different definition is required: \(P\) is a prime ideal iff \(P\) is proper and for any ideals \(I,J \subset R\text{,}\) \(IJ \subseteq P\) implies \(I \subseteq P\) or \(J \subseteq P.\) If the ring is commutative, this definition is equivalent to the previous one.

Definition 3.3.6.

Let \(R\) be a commutative ring with identity, and \(M\) an ideal of \(R.\) Then \(M\) is a maximal ideal iff
  • \(M\) is a proper ideal
  • Whenever \(I\) is an ideal of \(R\) with \(M \subseteq I \subseteq R\text{,}\) then either \(I = M\) or \(I = R.\)

Definition 3.3.7.

Let \(R\) be a commutative ring with identity. Then two ideals \(I,J\) of \(R\) are said to be comaximal iff \(I+J = R.\)

Example 3.3.8.

If \(M_1\ne M_2\) are maximal ideals in a commutative ring \(R\) with identity, then they are comaximal ideals.
Solution.
\(M_i \subsetneq M_1+M_2 \subseteq R\) and since the \(M_i\) are maximal, \(M_1+M_2=R.\)

Exercise 3.3.2.

Consider the ideals \(I = (x-2)\text{,}\) \(J=(x+2)\) in \(R[x]\) with \(R = \Z\) or \(\Q.\)

(a)

Show that \(I,J\) are comaximal ideals in \(\Q[x],\) but not in \(\Z[x].\)
Solution.
It is easy to see that \(4\in I+J\text{:}\) indeed \(4 = (x+2) - (x-2).\) Since 4 is a unit in \(\Q[x]\text{,}\) \(I+J = \Q[x]\text{.}\) For \(\Z[x]\text{,}\) notice that every element of \(I+J\) has an even constant term, so \(1 \notin I+J.\)

Exercise 3.3.3.

Let \(R\) be a commutative ring with identity, with \(I,J\) ideals in \(R.\)

(a)

Show that
\begin{equation*} (I+J)(I\cap J) \subseteq IJ \subseteq I\cap J \end{equation*}
Solution.
It is clear that \(IJ \subseteq I\cap J\) since \(I,J\) are ideals of \(R.\) For the other inclusion, let \(i\in I, j\in J, k\in I\cap J\text{.}\) It is enough to check that \((i+j)k \in IJ\text{.}\) Certainly \(ik \in IJ\) since \(k\in J\text{,}\) and \(jk \in JI\) since \(k \in I\text{,}\) but \(IJ=JI\) since \(R\) is commutative.

(b)

Show that if \(I,J\) are comaximal ideals, then
\begin{equation*} IJ = I\cap J \end{equation*}
Solution.
Since \(I+J = R\text{,}\) the above inclusions now read
\begin{equation*} I\cap J \subseteq IJ \subseteq I\cap J\text{.} \end{equation*}
Recall the Chinese Remainder Theorem for rings.

Proof (sketch).

From the exercise above we know that \(IJ = I\cap J.\) Consider the natural projections
\begin{equation*} R \to R/I \times R/J \text{ given by } r\mapsto (r+I, r+J). \end{equation*}
It is immediate to check this is a homomorphism with kernel \(I\cap J\text{.}\) The issue is surjectivity. Since \(I,J\) are comaximal, \(I+J=R\text{,}\) so choose \(i\in I,\) and \(j\in J\) with \(i+j = 1.\) Choose an arbitrary element \((a+I, b+J)\) in the codomain, and put \(r = bi + aj.\) We claim that
\begin{equation*} r+I = a+I \text{ and } r+J = b+J. \end{equation*}
We observe that
\begin{equation*} r+I = a+I \iff r-a \in I \iff (bi + aj) -a \in I\iff a(j-1) \in I \end{equation*}
But \(i+j=1\text{,}\) so \(j-1 = -i \in I.\) A similar argument shows that \(r+J = b+J.\)
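The surjectivity argument is constructive, and it can be tried out in the familiar case \(R = \Z\text{,}\) \(I = m\Z\text{,}\) \(J = n\Z\) with \(m,n\) positive and relatively prime. The following is a minimal sketch under that assumption; the function names `extended_gcd` and `crt_preimage` are ours. The extended Euclidean algorithm supplies \(i \in I\) and \(j\in J\) with \(i+j=1\text{,}\) and then \(r = bi + aj\) exactly as in the proof.

```python
def extended_gcd(m, n):
    """Return (g, s, t) with g = gcd(m, n) and s*m + t*n == g."""
    old_r, r = m, n
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def crt_preimage(a, b, m, n):
    """Return r with r = a (mod m) and r = b (mod n), for coprime m, n > 0."""
    g, s, t = extended_gcd(m, n)
    if g != 1:
        raise ValueError("the ideals mZ and nZ are not comaximal")
    i = s * m  # i lies in I = mZ
    j = t * n  # j lies in J = nZ, and i + j == 1
    return b * i + a * j


r = crt_preimage(2, 3, 3, 5)
print(r % 3, r % 5)  # the residues of r modulo 3 and modulo 5
```

Note that \(r\) is only determined modulo \(mn\text{,}\) which reflects the fact that the kernel of the map is \(I\cap J = IJ\text{.}\)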

Example 3.3.10.

Familiar examples include
  • If \(m,n\) are relatively prime integers in \(\Z\text{,}\) then \(m\Z+n\Z = \Z\text{,}\) so
    \begin{equation*} \Z/mn\Z \cong \Z/m\Z \times \Z/n\Z \end{equation*}
    a result we have used for groups in talking about elementary divisors and invariant factors.
  • From an exercise above and Proposition 3.3.1, you can conclude
    \begin{equation*} \Q[x]/(x^2-4) \cong \Q[x]/(x-2) \times \Q[x]/(x+2)\cong \Q \times \Q. \end{equation*}
  • We still need to work out a more robust analog of the first example for polynomial rings, but that requires material from the next section.
Also recall how to characterize prime and maximal ideals via their induced quotients.

Example 3.3.12.

  • \((0)\) is a prime ideal in any integral domain. \((0)\) is a maximal ideal in any field.
  • If \(n\ne 0\text{,}\) then \(n\Z\) is a prime ideal in \(\Z\) iff \(n\Z\) is a maximal ideal in \(\Z\) iff \(n\Z=p\Z\) where \(p\) is a prime.
  • The principal ideal \((x)\) is a prime ideal in \(R[x]\) iff \(R\) is an integral domain, while \((x)\) is a maximal ideal in \(R[x]\) iff \(R\) is a field.
We mention another very useful proposition in dealing with quotients of polynomial rings. Here is the background. Given a commutative ring \(R\) with identity, fix an element \(a \in R.\) The natural projection \(R \to R/(a)\) extends to one \(R[x] \to (R/(a))[x]\) sending \(f(x) \in R[x]\) to \(\overline f(x) \in (R/(a))[x]\) by viewing the coefficients of \(f\) in \(R/(a),\) often described as reducing them modulo the ideal \((a).\)
Before proving the proposition, we should do an example to motivate what appears quite technical.

Example 3.3.14.

Let \(f(x) = x-3\) and \(g(x)=(x-3)(x-5) + 7\) in \(\Z[x].\) Show that the ideal \(I = (f,g)\) is maximal in \(\Z[x].\)
Solution.
In Example 3.3.2, we showed that \(I = (f,g) = (f,7)\text{.}\) Now we use the proposition:
\begin{equation*} \Z[x]/I = \Z[x]/(f(x),7) \cong (\Z/7\Z)[x]/(\overline f(x)) = (\Z/7\Z)[x]/(x-\overline 3)\cong \Z/7\Z \end{equation*}
the last by Proposition 3.3.1. Since \(\Z[x]/I\) is a field, Proposition 3.3.11 tells us that \(I\) is maximal in \(\Z[x].\)
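One way to read the chain of isomorphisms above is that the composite map \(\Z[x] \to \Z/7\Z\) reduces coefficients modulo 7 and then evaluates at \(\overline 3\text{.}\) The following is a small sketch of that map, assuming polynomials are stored as coefficient lists \([b_0, b_1, \dots]\text{;}\) the helper name `evaluate_mod` is ours. Both generators of \(I\) land on \(0\text{,}\) while \(1\) does not.

```python
def evaluate_mod(coeffs, point, modulus):
    """Horner evaluation of b_0 + b_1 x + ... at `point`, reduced mod `modulus`."""
    value = 0
    for b in reversed(coeffs):
        value = (value * point + b) % modulus
    return value


f = [-3, 1]      # x - 3
g = [22, -8, 1]  # (x - 3)(x - 5) + 7 = x^2 - 8x + 22

print(evaluate_mod(f, 3, 7))    # image of f in Z/7Z
print(evaluate_mod(g, 3, 7))    # image of g in Z/7Z
print(evaluate_mod([7], 3, 7))  # image of the constant 7
print(evaluate_mod([1], 3, 7))  # image of the constant 1
```

Since \(f\text{,}\) \(g\text{,}\) and \(7\) all map to \(0\) while \(1\) maps to \(1\text{,}\) this also gives a concrete reason that \(1 \notin I\text{.}\)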

Proof of Proposition 3.3.13.

Consider the composition of natural surjective homomorphisms \(\varphi\text{:}\)
\begin{equation*} R[x] \to (R/(a))[x] \to (R/(a))[x]/(\overline f(x))\text{.} \end{equation*}
We need only show its kernel is \((a,f)\) to complete the proof by the first isomorphism theorem. By walking the elements \(a,f\) through the compositions, it is easy to see that \((a,f) \subseteq \ker\varphi.\) For the reverse, take \(g(x) = b_0 + \cdots + b_nx^n \in R[x]\text{.}\) Then \(g \in \ker \varphi\) if and only if \(\overline g \in (\overline f(x)).\)
Write \(f(x) = c_0+ \cdots + c_mx^m \in R[x],\) and suppose that \(\overline g(x) = \overline f(x)\cdot \overline h(x),\) where \(\overline h(x) = \overline a_0 + \cdots + \overline a_d x^d.\) Then the coefficient of \(x^j\) in \(\overline g\) is
\begin{equation*} \overline b_j = \sum_{k=0}^j \overline c_k \overline a_{j-k}, \end{equation*}
or, considering the arithmetic in \(R/(a),\)
\begin{equation*} b_j + aR = \sum_{k=0}^j c_ka_{j-k} + aR\text{.} \end{equation*}
So if we fix \(h(x) = a_0 + \cdots + a_dx^d \in R[x]\text{,}\) then \(b_j - \sum_{k=0}^j c_ka_{j-k}\text{,}\) which is the \(j\)th coefficient of \(g-fh\text{,}\) lies in \(aR.\) This implies that
\begin{equation*} g-fh = \sum_{j=0}^n [b_j - \sum_{k=0}^j c_ka_{j-k}] x^j\in aR[x]\text{,} \end{equation*}
or \(g \in (a, f) \subseteq R[x].\)
It is useful to say a few words about polynomial rings, beginning with some elementary properties.
Polynomial rings in several variables play a fundamental role in algebraic geometry, so we should say a few words about the different ways to view elements of those rings. For example, we have the natural corollary to the above result:
The idea behind the proof of the above corollary involves how to view the polynomial ring and its elements. We examine this in more detail below, but briefly we want to think of the polynomial ring, \(R[x_1, \dots, x_n]\text{,}\) in \(n\) variables with coefficients in \(R\) as the polynomial ring in one variable, \(S[x_n]\text{,}\) with coefficients in \(S=R[x_1, \dots, x_{n-1}]\text{.}\) Given this view, the proof of the corollary is by induction on \(n\) with the base case being Proposition 3.3.15.
The comments above apply to polynomial rings in any number of variables, but for simplicity of exposition, we consider the case of two variables.
One way to view elements of \(R[x,y]\) is as finite sums of monomials \(x^iy^j\) with coefficients \(a_{ij} \in R:\)
\begin{equation*} p(x,y) = \sum_{i,j} a_{ij}x^iy^j\text{.} \end{equation*}
Alternatively, we can think of \(p \in R[x,y]\) as an element of \((R[x])[y]\text{,}\) so that we could write
\begin{equation*} p(x,y) = \sum_{i,j} a_{ij}x^iy^j = \sum_k \phi_k(x)y^k = \phi_0(x) + \phi_1(x)y + \cdots + \phi_n(x) y^n\text{,} \end{equation*}
where the “coefficients” \(\phi_k(x)\) lie in \(R[x]\text{.}\)

Example 3.3.17.

Different views of a polynomial in \(\Z[x,y]\text{:}\)
\begin{align*} \Z[x,y]:\amp\quad 2 + 7y + 4xy + 3x^5y\\ \Z[x][y]:\amp\quad 2 + (7+4x+3x^5)y\\ \Z[y][x]:\amp\quad (2+7y)+(4y)x + (3y)x^5 \end{align*}
We see that in the second representation, the polynomial is simply linear in \(y.\)
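The change of viewpoint is just a regrouping of the same data. The following is a minimal sketch, assuming a polynomial in \(\Z[x,y]\) is stored as a dictionary sending the exponent pair \((i,j)\) of \(x^iy^j\) to its coefficient \(a_{ij}\text{;}\) the representation and the function names are ours, chosen only for illustration.

```python
from collections import defaultdict

# 2 + 7y + 4xy + 3x^5 y, stored as {(i, j): a_ij} for the monomial x^i y^j
p = {(0, 0): 2, (0, 1): 7, (1, 1): 4, (5, 1): 3}


def as_polynomial_in_y(poly):
    """View poly in (Z[x])[y]: map k to the coefficient of y^k, stored as {i: a_ik}."""
    grouped = defaultdict(dict)
    for (i, j), a in poly.items():
        grouped[j][i] = a
    return dict(grouped)


def as_polynomial_in_x(poly):
    """View poly in (Z[y])[x]: map k to the coefficient of x^k, stored as {j: a_kj}."""
    grouped = defaultdict(dict)
    for (i, j), a in poly.items():
        grouped[i][j] = a
    return dict(grouped)


print(as_polynomial_in_y(p))  # keys 0 and 1 only: linear in y
print(as_polynomial_in_x(p))  # keys 0, 1 and 5: degree 5 in x
```

In the first regrouping the coefficient of \(y\) is the dictionary for \(7+4x+3x^5\text{,}\) matching the second line of the display above.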
Changing the perspective on how to view a polynomial affects both your intuition and the tools you might bring to bear to understand the polynomial. For example, we have some inkling of how to factor polynomials in one variable, but have almost no intuition when there is more than one variable.

Example 3.3.18.

Viewed as polynomials in \(\Q[x]\text{,}\) we immediately know that \(x^2 -4\) factors non-trivially, but that \(x^2 + 4\) does not (is irreducible).
Similarly, we see that \(x^2+2y -y^2 -1\) factors, not by viewing it in \(\Q[x,y]\text{,}\) but by thinking of it as analogous to \(x^2-4\) in \(\Q[y][x]\text{,}\) namely as
\begin{equation*} x^2 - (y-1)^2 = (x - (y-1)) (x+ (y-1)). \end{equation*}
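A factorization like this is easy to check by multiplying out the right-hand side. The following is a small sketch, assuming the same kind of dictionary representation of two-variable polynomials, with exponent pairs \((i,j)\) for \(x^iy^j\) as keys; the function name `multiply` is ours.

```python
def multiply(p, q):
    """Multiply two polynomials stored as {(i, j): coefficient of x^i y^j}."""
    product = {}
    for (i1, j1), a in p.items():
        for (i2, j2), b in q.items():
            key = (i1 + i2, j1 + j2)
            product[key] = product.get(key, 0) + a * b
    # Drop monomials whose coefficients cancelled.
    return {key: c for key, c in product.items() if c != 0}


left = {(1, 0): 1, (0, 1): -1, (0, 0): 1}   # x - (y - 1) = x - y + 1
right = {(1, 0): 1, (0, 1): 1, (0, 0): -1}  # x + (y - 1) = x + y - 1

expanded = multiply(left, right)
print(expanded)  # coefficients of the expanded product, keyed by (i, j)
```

The cross terms \(xy\) and \(x\) cancel, leaving \(x^2 - y^2 + 2y - 1\text{,}\) which is \(x^2 - (y-1)^2\) written out.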