I understand how Gauss-Legendre quadrature works, how it's derived, etc. The proof has been shown to me. That is to say, I know how the Legendre polynomials are derived, I know they are orthogonal, I know we sample the function at the roots of the polynomial, I know how the weights are calculated, and I see that it is basically a Riemann sum in that we are approximating the area with rectangles.
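For concreteness, here is a minimal sketch of the rule as described: sample the function at the roots of a Legendre polynomial and form a weighted sum. It assumes NumPy and uses `numpy.polynomial.legendre.leggauss` for the nodes and weights; the helper name `gauss_legendre` and the test integrand are my own choices for illustration.

```python
import numpy as np

def gauss_legendre(f, a, b, n):
    """Approximate the integral of f over [a, b] with an n-point rule."""
    # x: roots of the degree-n Legendre polynomial on [-1, 1]; w: matching weights.
    x, w = np.polynomial.legendre.leggauss(n)
    # Affine change of variables from [-1, 1] to [a, b].
    t = 0.5 * (b - a) * x + 0.5 * (b + a)
    # Weighted sum of samples, scaled by the Jacobian of the map.
    return 0.5 * (b - a) * np.sum(w * f(t))

# Illustrative integrand (an assumption, not from any particular source).
approx = gauss_legendre(np.exp, 0.0, 1.0, 3)
exact = np.e - 1.0
print(approx, exact, abs(approx - exact))
```

Structurally this is the same "sum of heights times widths" as a Riemann sum; only the sample locations and the widths differ.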
But the left, midpoint, and right Riemann sums, Simpson's rule, etc. have all been shown to me too, and they are not as impressive. What I don't understand is what makes Gauss-Legendre quadrature so effective. Where does it derive its stellar accuracy? And why should it be so accurate anyway? The derivation is an interesting proof of concept that a numerical approximation can be done this way, and is interesting in its own right, but seeing that it is so drastically more accurate than a Riemann sum is a completely different matter.
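To show the kind of gap I mean, here is a rough comparison of a midpoint Riemann sum against Gauss-Legendre using the same number of function evaluations. This is only a sketch: it assumes NumPy, the helper names `midpoint` and `gauss_legendre` are hypothetical, and the integrand (exp on [0, 1]) is just a smooth example I picked.

```python
import numpy as np

def midpoint(f, a, b, n):
    # Midpoint Riemann sum with n equal-width rectangles.
    h = (b - a) / n
    t = a + h * (np.arange(n) + 0.5)
    return h * np.sum(f(t))

def gauss_legendre(f, a, b, n):
    # n-point Gauss-Legendre rule mapped from [-1, 1] to [a, b].
    x, w = np.polynomial.legendre.leggauss(n)
    t = 0.5 * (b - a) * x + 0.5 * (b + a)
    return 0.5 * (b - a) * np.sum(w * f(t))

exact = np.e - 1.0  # integral of exp over [0, 1]
errors = {}
for n in (2, 4, 8):
    mid_err = abs(midpoint(np.exp, 0.0, 1.0, n) - exact)
    gl_err = abs(gauss_legendre(np.exp, 0.0, 1.0, n) - exact)
    errors[n] = (mid_err, gl_err)
    print(f"n={n}: midpoint error {mid_err:.3e}, Gauss-Legendre error {gl_err:.3e}")
```

For a smooth integrand like this one, the midpoint error shrinks slowly as n grows, while the Gauss-Legendre error drops by many orders of magnitude. Less smooth integrands may not show such a large gap.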
I don't see why the orthogonality of the Legendre polynomials is so important, or how it plays such a key role in the accuracy. I don't see why we choose the roots of the polynomials as the sampling points of a function, as opposed to any other set of points with appropriately derived weights. Is the norm used in deriving the Legendre polynomials unique, or can other norms and polynomials prove equally (or more) effective? Does Gauss-Legendre quadrature work by approximating an arbitrary function by a polynomial? And if so, wouldn't we get more accuracy by making those polynomials generic rather than restricting them to linear combinations of Legendre polynomials?
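One way to make the "any other set of points with appropriate weights" question concrete is to take equally spaced points, derive weights for them by matching the integrals of the monomials, and check which monomials each rule integrates exactly on [-1, 1]. The sketch below assumes NumPy; `interpolatory_weights` and `exactness_degree` are hypothetical helpers written for this experiment, and the tolerance is an arbitrary choice.

```python
import numpy as np

def monomial_integral(k):
    # Exact integral of x^k over [-1, 1]: 2/(k+1) for even k, 0 for odd k.
    return (1 - (-1) ** (k + 1)) / (k + 1)

def interpolatory_weights(nodes):
    # Solve sum_i w_i * x_i^k = integral of x^k, for k = 0 .. n-1.
    n = len(nodes)
    V = np.vander(nodes, n, increasing=True).T  # row k holds x_i^k
    moments = np.array([monomial_integral(k) for k in range(n)])
    return np.linalg.solve(V, moments)

def exactness_degree(nodes, weights, max_degree=20, tol=1e-12):
    # Largest d such that x^0 .. x^d are all integrated to within tol.
    for k in range(max_degree + 1):
        if abs(np.sum(weights * nodes ** k) - monomial_integral(k)) > tol:
            return k - 1
    return max_degree

n = 4
equal_nodes = np.linspace(-1.0, 1.0, n)
equal_weights = interpolatory_weights(equal_nodes)
gl_nodes, gl_weights = np.polynomial.legendre.leggauss(n)

equal_degree = exactness_degree(equal_nodes, equal_weights)
gl_degree = exactness_degree(gl_nodes, gl_weights)
print("equally spaced:", equal_degree)
print("Legendre roots:", gl_degree)
```

Both rules use four samples and weights chosen to be exact for as many monomials as the weights alone allow. The difference in the reported degrees is what I would like to understand in terms of orthogonality.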