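Arguing by contradiction, assume that $f$ is not continuous at $p$. This means that there is an $\epsilon\gt0$ such that for every $\delta\gt0$ there is a point $x\in B_\delta(p)$ with $f(x)\not\in B_\epsilon(f(p))$.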
Let $\epsilon$ be the number provided by this statement. Then we may choose $\delta$ to be any positive number. We choose $\delta = \frac1n$ with $n\in\N$. For our choice of $\delta = \frac1n$ there is a point $x_n\in X$ with $x_n\in B_{1/n}(p)$, i.e. $d(x_n, p) \lt \frac1n$, and $f(x_n)\not\in B_{\epsilon}(f(p))$, i.e. $d(f(x_n), f(p)) \geq \epsilon$.
It follows that the sequence $x_n$ converges to $p$. By assumption, it also follows that $f(x_n)\to f(p)$. But this is impossible because $f(x_n)\not\in B_{\epsilon}(f(p))$ for all $n\in\N$.
We have proved the first half of the theorem and now consider the converse statement: suppose $f$ is continuous at $p$, and let $x_n$ be a sequence of points with $x_n\to p$. We must show that $f(x_n)\to f(p)$.
To prove $f(x_n)\to f(p)$ let $\epsilon\gt0$ be given. Since $f$ is continuous at $p$ there is a $\delta\gt0$ such that $f(x)\in B_\epsilon(f(p))$ for all $x\in B_\delta(p)$. We also are given that $x_n\to p$, so there is an $N_\delta\in \N$ such that $x_n\in B_\delta(p)$ for all $n\geq N_\delta$. It then follows that for all $n\geq N_\delta$ we have $x_n\in B_\delta(p)$, and therefore $f(x_n) \in B_\epsilon\bigl(f(p)\bigr)$.
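The sequential criterion can also be used numerically to expose a discontinuity. The sketch below is only an illustration: the function `step` and the sequence $x_n = \frac1n$ are hypothetical examples, not taken from the text. The points $x_n$ approach $p=0$, while the values $f(x_n)$ stay at distance $1$ from $f(p)$.

```python
def step(x):
    """Hypothetical example: jumps at 0, so it is not continuous there."""
    return 1.0 if x > 0 else 0.0

p = 0.0
xs = [1.0 / n for n in range(1, 1001)]  # x_n = 1/n, so x_n -> p

# Distances d(x_n, p) shrink, but d(f(x_n), f(p)) does not.
dist_in_domain = [abs(x - p) for x in xs]
dist_in_image = [abs(step(x) - step(p)) for x in xs]

print(dist_in_domain[-1], dist_in_image[-1])
```

A finite computation like this proves nothing by itself, but it matches the role of $\epsilon$ in the proof: here any $\epsilon\leq1$ works.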
If $g(p)\neq 0$ then $q(x) = \frac{f(x)}{g(x)}$ is also continuous at $p$.
The quotient function $q(x)$ may not be defined for all $x\in X$, but since $g(p)\neq0$ there is a neighborhood $B_r(p)$ on which $g(x)\neq0$, and thus the function $q(x) = f(x)/g(x)$ is well defined on this neighborhood.
To prove that $u(x) = f(x)+g(x)$ is continuous at $p$, we recall the sequential continuity theorem, which tells us that we have to prove that $u(x_n)\to u(p)$ for every sequence $x_n\to p$. Given such a sequence we know that $f(x_n)\to f(p)$ and $g(x_n)\to g(p)$, because $f$ and $g$ both are continuous at $p$. Since $f(x_n)$ and $g(x_n)$ are sequences of real numbers we can apply the limit properties and conclude that \[ \lim u(x_n) = \lim \bigl(f(x_n) + g(x_n)\bigr) = \lim f(x_n) + \lim g(x_n) = f(p) + g(p) = u(p). \] Thus $u=f+g$ is indeed continuous at $p$.
To show this, we let $x_j\in\R^n$ be any sequence of points with $x_j\to p$ and show that $f(x_j)\to f(p)$. Let the coordinates of $x_j$ be $x_j = (x_{j1}, x_{j2}, \dots, x_{jn})$, and let the coordinates of $p$ be $p=(p_1, p_2, \dots, p_n)$. It then follows from $x_j\to p$ that $x_{j1} \to p_1$. This means that $f(x_j) \to f(p)$, and thus we have shown that $f$ is continuous at $p$.
To prove this we consider the two functions $g(x, y)=x$ and $h(x, y) = y$. These are both coordinate functions, so they are continuous at any point. We can write our function $f$ as \[ f(x, y) = \frac{g(x, y) h(x, y)}{1+h(x, y)^2}. \] By the theorem about sums, products, and quotients of continuous functions we know that $gh$, $h^2$, $1+h^2$, and hence $gh/(1+h^2)$ are continuous at every point $p$. The quotient causes no trouble because $1+h^2\geq 1$ never vanishes.
The next theorem gives a description of continuity in terms of what a function does to open subsets of the metric space.
$f$ is continuous if and only if $f^{-1}(O)$ is open in $X$ for every open subset $O\subset Y$.
and: $f$ is continuous if and only if $f^{-1}(C)$ is closed in $X$ for every closed subset $C\subset Y$.
We recall that, by definition, for any subset $A\subset Y$ \[ f^{-1}(A) = \{ x\in X \mid f(x) \in A\}. \]
Since $f(p)\in O$, and since $O$ is open, there is an $\epsilon\gt0$ such that $B_\epsilon(f(p)) \subset O$. Since $f$ is continuous, there is a $\delta\gt0$ such that $f\bigl(B_\delta(p)\bigr) \subset B_\epsilon(f(p))$. This implies that all points $x\in B_\delta(p)$ get mapped into $O$, and thus they belong to $f^{-1}(O)$. Conclusion: $B_\delta(p) \subset f^{-1}(O)$.
Next, we prove the converse. Suppose that $f:X\to Y$ has the property that $f^{-1}(O)$ is open in $X$ for any open $O\subset Y$. We will show that $f$ is continuous at any point $p\in X$.
To prove that $f$ is continuous at $p$, let $\epsilon\gt0$ be given. Then $B_\epsilon(f(p))$ is an open subset of $Y$, and it follows that $f^{-1}\bigl(B_\epsilon(f(p))\bigr)$ is an open subset of $X$. Since $p\in f^{-1}\bigl(B_\epsilon(f(p))\bigr)$, there is a $\delta\gt0$ such that \[ B_\delta(p) \subset f^{-1}\bigl(B_\epsilon(f(p))\bigr). \] This implies that for every $x\in B_\delta(p)$ one has $f(x) \in B_\epsilon(f(p))$. Therefore $f$ is continuous at $p$. Since $p$ is an arbitrary point in $X$ it follows that $f$ is continuous on $X$.
Since the space $X$ is compact, the sequence $x_n$ has a convergent subsequence $x_{n_k}$. Let $p_+ = \lim x_{n_k}$. Continuity of $f$ then implies that $f(x_{n_k}) \to f(p_+)$.
If the function were unbounded then we would have $f(x_{n_k}) \geq n_k\geq k$, which cannot be because the sequence of numbers $f(x_{n_k})$ converges. Therefore the function is bounded.
Since $f$ is bounded, our sequence of points is such that \[ M-\frac1{n_k} \leq f(x_{n_k}) \leq M \] for all $k$. Taking the limit $k\to\infty$ we see that $f(x_{n_k}) \to M$. But we already had $f(x_{n_k}) \to f(p_+)$, so we end up with $f(p_+) = M$.
We define a sequence of intervals $[a_n, b_n]$ ($n\geq 0$) by induction. First, $[a_0, b_0] = [a,b]$. For every $n\geq0$ we set $c_n= (a_n+b_n)/2$. If $f(c_n)=0$ we are done, for we have found a solution to $f(c)=0$. Otherwise we either have $f(c_n)\gt0$ or $f(c_n)\lt 0$.
If $f(c_n)\gt0$ then we set $[a_{n+1}, b_{n+1}] = [a_n, c_n]$, otherwise we set $[a_{n+1}, b_{n+1}] = [c_n, b_n]$. In both cases we end up with an interval $[a_{n+1}, b_{n+1}]$ for which $f(a_{n+1}) \lt 0 \lt f(b_{n+1})$. The length of the new interval is exactly half the length of the previous interval, so, $b_n-a_n = 2^{-n}(b-a)$.
Since the $[a_n, b_n]$ form a nested family of intervals, the sequences $a_n$ and $b_n$ are nondecreasing and nonincreasing, respectively, and thus $a_n\to c$ for some $c\in[a,b]$ while $b_n\to c'$ for some $c'\in [a,b]$. Since $b_n-a_n\to0$, the two limits $c$ and $c'$ coincide.
The function $f$ is continuous, so $a_n\to c$ implies $f(a_n)\to f(c)$. Since $f(a_n)\lt 0$ for all $n$, we get $f(c) = \lim f(a_n) \leq 0$. On the other hand $b_n\to c$ and $f(b_n)\gt0$ for all $n$, so $f(c) = \lim f(b_n) \geq 0$. We conclude that $f(c)=0$.
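The interval-halving construction in this proof is also a practical algorithm. The following is a minimal sketch, assuming $f(a)\lt0\lt f(b)$ as in the proof. The function name `bisect_root`, the tolerance `tol`, the step cap `max_steps`, and the test function $x^3-x-1$ are illustrative choices, not part of the text.

```python
def bisect_root(f, a, b, tol=1e-12, max_steps=200):
    """Halve [a, b] as in the proof, assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    for _ in range(max_steps):
        c = (a + b) / 2          # midpoint c_n
        fc = f(c)
        if fc == 0:
            return c             # exact solution found
        if fc > 0:
            b = c                # keep [a_n, c_n]
        else:
            a = c                # keep [c_n, b_n]
        if b - a < tol:
            break
    return (a + b) / 2

# Illustrative use: x^3 - x - 1 changes sign on [1, 2].
root = bisect_root(lambda x: x**3 - x - 1, 1.0, 2.0)
print(root)
```

The loop maintains the invariant $f(a_n)\lt0\lt f(b_n)$ from the proof, and the interval length is halved at each step, so the returned midpoint lies within `tol` of a zero of $f$ unless the step cap is reached first.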
This argument shows that $x\mapsto\sqrt{x}$ is well defined as an inverse of $x\mapsto x^2$. The argument works for many more functions. The following theorem gives the most general version of the argument.
To prove this, consider $f(x) = x^2-a$. Clearly $f(0)\lt 0$, so all we have to do is find a number $b\gt0$ with $f(b)\gt0$, i.e. $b^2\gt a$. Once we have such a number, the intermediate value theorem implies the existence of an $x\in (0, b)$ with $f(x)=0$, i.e. $x^2=a$. One possible choice of $b$ is $b=1+a$: since $b=1+a\gt 1$ we have $b^2\gt b =1+a \gt a$, so $f(b)\gt0$, as required.
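The existence argument can be turned into a computation by halving the interval $[0, 1+a]$ repeatedly. This is only a sketch under the assumption $a\gt0$; the name `sqrt_by_bisection` and the fixed number of steps are illustrative choices, not part of the text.

```python
def sqrt_by_bisection(a, steps=60):
    """Approximate the positive solution of x*x = a for a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    # f(x) = x^2 - a satisfies f(0) = -a < 0 and f(1 + a) > 0.
    lo, hi = 0.0, 1.0 + a
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid > a:
            hi = mid             # the solution lies in [lo, mid]
        else:
            lo = mid             # the solution lies in [mid, hi]
    return (lo + hi) / 2

print(sqrt_by_bisection(2.0))
```

The upper endpoint $b=1+a$ is the choice made in the proof; any other $b$ with $b^2\gt a$ would serve equally well.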
Note that the two sequences $p_n$ and $q_n$ do not have to converge. They can jump all over the place in $X$, as long as they get closer to each other as $n\to\infty$.
Let $\epsilon\gt0$ be given. Then, because $f$ is uniformly continuous, there is a $\delta\gt0$ such that $d(f(p), f(q)) \lt \epsilon$ for all $p,q\in X$ with $d(p,q)\lt \delta$. We also know that $d(p_n, q_n)\to0$, so there is an $N_\delta\in\N$ such that $d(p_n, q_n) \lt \delta$ for all $n\ge N_\delta$. Thus, if $n\geq N_\delta$, we have $d(p_n, q_n)\lt \delta$, and therefore $d(f(p_n), f(q_n))\lt \epsilon$. We have shown that $d(f(p_n), f(q_n))\to0$.
Next, we assume that $f$ has the property that $d(f(p_n), f(q_n)) \to 0$ for any two sequences $p_n, q_n \in X$ with $d(p_n, q_n)\to0$, and we prove by contradiction that $f$ is uniformly continuous.
If $f$ is not uniformly continuous, then there is an $\epsilon\gt0$ such that for any $\delta\gt0$ there exist points $p,q\in X$ with $d(p,q)\lt \delta$ and $d(f(p), f(q)) \geq\epsilon$. For each $n\in\N$ we choose $\delta=\frac1n$ and let $p_n, q_n\in X$ be points for which $d(p_n, q_n)\lt \frac1n$ and $d(f(p_n), f(q_n))\geq\epsilon$ hold. We now have a contradiction: $d(p_n, q_n)\to0$, so by assumption $d(f(p_n), f(q_n))\to0$, yet $d(f(p_n), f(q_n))\geq\epsilon$ for all $n$.
To see why, consider the sequences $p_n=\sqrt{n}$ and $q_n = p_{n-1} = \sqrt{n-1}$. Then \[ |p_n-q_n| = \sqrt{n} - \sqrt{n-1} = \frac 1 {\sqrt{n}+\sqrt{n-1}} \lt \frac1{\sqrt{n}} \to0, \] while \[ f(p_n) - f(q_n) = \bigl(\sqrt{n}\bigr)^2 - \bigl(\sqrt{n-1}\bigr)^2 = 1. \] So we have two sequences with $|p_n-q_n|\to0$ but $|f(p_n)-f(q_n)|\not\to 0$.
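A quick numerical check of these two sequences is sketched below. The helper name `gap_and_image_gap` and the sample values of $n$ are illustrative choices. For each $n$ the code computes $|p_n-q_n|$ and $f(p_n)-f(q_n)$ with $f(x)=x^2$.

```python
import math

def gap_and_image_gap(n):
    """Return (|p_n - q_n|, f(p_n) - f(q_n)) for f(x) = x^2."""
    p, q = math.sqrt(n), math.sqrt(n - 1)
    return p - q, p * p - q * q

rows = [gap_and_image_gap(n) for n in (10, 1_000, 100_000)]
for gap, image_gap in rows:
    print(gap, image_gap)
```

The first column shrinks toward zero while the second stays at $1$ up to rounding error, which is exactly the behaviour that rules out uniform continuity.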
Since $X$ is compact, we can extract a subsequence $n_k$ for which $p=\lim p_{n_k}$ exists. Then $q_{n_k}$ also converges to $p$, because for every $k\in\N$ we have \[ d(p, q_{n_k}) \leq d(p, p_{n_k}) + d(p_{n_k}, q_{n_k}) \leq d(p, p_{n_k}) + \frac1{n_k}. \] As $k\to\infty$ both terms on the right go to zero, so $q_{n_k}\to p$.
Since $f$ is continuous, we also have $f(p_{n_k})\to f(p)$ and $f(q_{n_k})\to f(p)$.
At this point we run into the following contradiction: we know that \[ d(f(p_{n_k}), f(q_{n_k}))\leq d(f(p_{n_k}), f(p)) + d(f(p), f(q_{n_k})) \to 0, \] so $d(f(p_{n_k}), f(q_{n_k}))\to 0$. On the other hand we also have $d(f(p_{n_k}), f(q_{n_k}))\geq \epsilon$. The contradiction shows that $f$ must be uniformly continuous after all.