A quick analysis here is based on the fact that y = x^2. First, work out a big change between two points: the gradient between x = 1 and x = 2 is (4 - 1) / (2 - 1) = 3. BUT we know the gradient is constantly changing, so this is only an average over a large change. That is not sufficient to model how the curve changes at every point. What we need to do is approximate in a region where the curve matches our quick "gradient = change in y over change in x" model as closely as possible.

This is easy. If we zoom in on a curve enough, it begins to look like a straight line. Don't believe me? The earth is curved when seen from space, but if you zoom in closely enough it appears flat. So instead we're going to look at the tiniest change in x, from x to (x + h), where h is tiny.

Then if y = x^2:

gradient = ((x + h)^2 - x^2) / ((x + h) - x)
         = (x^2 + 2xh + h^2 - x^2) / h
         = (2xh + h^2) / h
         = h(2x + h) / h
         = 2x + h

Now all we do is shrink h towards 0. (We can't set h = 0 from the start, since we'd be dividing by zero, but once the h on the bottom has cancelled, nothing stops us.) As h approaches 0, 2x + h approaches 2x, which is the gradient over essentially no change in x whatsoever. So if y = x^2, then the gradient dy/dx = 2x. We can explore this with other examples!
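
If you want to check the argument above with actual numbers, here is a minimal Python sketch. The function names f and gradient are made up for this illustration, and x = 3 is just an arbitrary point to test at. It computes "change in y over change in x" for smaller and smaller h, so you can watch the result settle towards 2x.

```python
def f(x):
    """The curve from the text: y = x^2."""
    return x ** 2


def gradient(x, h):
    """Change in y over change in x, measured between x and x + h."""
    return (f(x + h) - f(x)) / h


# The "big change" from the text: between x = 1 and x = 2 (so h = 1).
print("average gradient from x=1 to x=2:", gradient(1.0, 1.0))

# Now shrink h at an arbitrary point (x = 3, chosen only for illustration).
# The algebra says each result should be 2x + h, so about 6 + h here.
x = 3.0
for h in (1.0, 0.1, 0.01, 0.001, 0.0001):
    print(f"h = {h:<7} gradient = {gradient(x, h)}")
```

The printed gradients should creep towards 6, which is 2x at x = 3. They will not land on it exactly, partly because h never actually reaches 0 and partly because of ordinary floating-point rounding.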