The derivative tells us how much the output of a function changes when we change the input by a tiny amount. For the function f(x) = x², we can think of the input as being a line of length x, and the output as being the area of a square of side length x. We use the letter “d” to mean “small change”, so that “dx” means “a small change in x”, and “df” means “a small change in f”.

The derivative is (by definition) the gradient of the graph at each point. We know that the gradient of a straight line is change in y over change in x. If we zoom in far enough, the graph of x² looks like a straight line with gradient df/dx, so that’s what we need to find.

So, let’s add a tiny length dx to the line and see what the corresponding change df in the area of the square will be. The input is now a line of length x+dx, and the output is the area of a square with side length x+dx. We can split this area into: a square of side length x, two rectangles each of area x*dx, and a tiny square of area dx*dx. The “new part” of the area consists of the two rectangles and the tiny square, so df is equal to 2x*dx + dx*dx. Dividing by dx gives df/dx = 2x + dx. But dx was supposed to be a tiny change, and we can make it as small as we like. This makes df/dx get as close to 2x as we want it to, so it makes sense to say df/dx = 2x.
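The square argument can also be checked with numbers. The sketch below is only an illustration: the function names (area, change_in_area, slope) and the choice of x = 3 are made up for this example. It uses exact fractions so that rounding does not blur the picture.

```python
from fractions import Fraction

def area(x):
    """Area of a square with side length x, i.e. f(x) = x^2."""
    return x * x

def change_in_area(x, dx):
    """df: the extra area gained when the side grows from x to x + dx."""
    return area(x + dx) - area(x)

def slope(x, dx):
    """df/dx for a given non-zero dx."""
    return change_in_area(x, dx) / dx

x = Fraction(3)  # illustrative side length, chosen arbitrarily
for dx in [Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]:
    df = change_in_area(x, dx)
    # The new area is two x-by-dx rectangles plus one dx-by-dx square.
    assert df == 2 * x * dx + dx * dx
    print(f"dx = {float(dx)}: df/dx = {float(slope(x, dx))}")
```

With x = 3, the printed values of df/dx are 6.1, 6.01 and 6.001: each one is exactly 2x + dx, and they close in on 2x = 6 as dx shrinks.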