How can a function be “perpendicular” to another function? What does the integral have to do with anything? Here is some intuition for the concept of “orthogonal” functions.

Questions for thought:

* How might one sample and weight points to approximate 2D functions on a square? A spherical surface? A cylinder? A manifold? A measurable set?

Can you give me an example of two orthogonal functions from R -> R?

Sure: sin(x) and sin(2x), which are orthogonal on the interval [0, 2π]. You can also orthogonalize any functions of your choosing by using the Gram-Schmidt procedure.
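To make this concrete, here is a small numerical sanity check (a sketch using NumPy, not from the original post): it approximates the inner product ⟨f, g⟩ = ∫₀^{2π} f(x) g(x) dx with a Riemann sum and shows that sin(x) and sin(2x) integrate to zero against each other, while sin(x) against itself does not.

```python
import numpy as np

# Approximate <f, g> = integral of f(x) * g(x) over [0, 2*pi]
# with a uniform Riemann sum (very accurate for periodic functions).
N = 10_000
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = 2.0 * np.pi / N

def inner(f, g):
    return np.sum(f(x) * g(x)) * dx

print(inner(np.sin, lambda t: np.sin(2 * t)))  # ~0: sin(x) and sin(2x) are orthogonal
print(inner(np.sin, np.sin))                   # ~pi: sin(x) is not orthogonal to itself
```

Just as with finite-dimensional vectors, “orthogonal” simply means the inner product is zero; here the integral plays the role of the dot product.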

Hi,

Great presentation, thank you. I was also wondering how the weighting function w(x) comes into all this? i.e. http://mathworld.wolfram.com/OrthogonalFunctions.html

Could you please comment. Thank you.

Hey Maciej, sorry for the delay!

Most mathematicians will say there is no meaning to the weighting function and it is just a convenience. However, I disagree.

My personal interpretation of the weighting function is that it accounts for the “units”, or “relative importance” of the different parts of the function.

For example if you measure the x-coordinates of 2D vectors in inches, and the y-coordinates in feet, then to find the angle between vectors measured in such a way, you will have to multiply the y-coordinate by the weight w=12 (inches/foot) before taking the dot product – otherwise you would give undue weight to the coordinate direction measured in inches.
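The mixed-units example above can be checked directly. This is a sketch (my own illustration, with made-up vectors): two vectors that are physically 45 degrees apart, but measured with x in inches and y in feet, appear nearly parallel under the naive dot product until the feet are converted to inches with the weight w = 12.

```python
import numpy as np

# Two physical vectors: (12 in, 12 in) and (12 in, 0 in),
# but measured with x in inches and y in FEET:
u = np.array([12.0, 1.0])   # (12 in, 1 ft)
v = np.array([12.0, 0.0])   # (12 in, 0 ft)

def angle(a, b):
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(cos))

# Naive dot product on mixed units gives the wrong angle:
print(angle(u, v))          # ~4.76 degrees -- looks almost parallel

# Weight the y-coordinates by w = 12 inches/foot first:
w = np.array([1.0, 12.0])
print(angle(w * u, w * v))  # 45 degrees, the true angle
```

Without the weight, the inch-measured direction dominates the dot product; the weight restores the “relative importance” of each coordinate.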

A function is an infinite dimensional vector. In some sense each different point x in the domain represents a different “direction” in the function space, and the value f(x) is the “coordinate of f in the x-direction”. So, the weighting function accounts for the different “units” of the function in different parts of the domain, or different “relative importance” of different parts of the domain.
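A classical instance of this idea, which I'll sketch here as an illustration (not from the original discussion), is the Chebyshev weight w(x) = 1/√(1 − x²) on [−1, 1]: the Chebyshev polynomials T0(x) = 1 and T2(x) = 2x² − 1 are not orthogonal under the plain integral, but they are orthogonal once the weight is included. The substitution x = cos θ removes the endpoint singularity for the numerical integral.

```python
import numpy as np

# Chebyshev polynomials T0(x) = 1 and T2(x) = 2x^2 - 1 on [-1, 1].
T0 = lambda x: np.ones_like(x)
T2 = lambda x: 2.0 * x**2 - 1.0

N = 100_000

# Unweighted inner product: midpoint rule on [-1, 1].
x = np.linspace(-1.0, 1.0, N, endpoint=False) + 1.0 / N
unweighted = np.sum(T0(x) * T2(x)) * (2.0 / N)

# Weighted inner product with w(x) = 1 / sqrt(1 - x^2).
# Substituting x = cos(theta) turns it into a plain integral over [0, pi].
theta = np.linspace(0.0, np.pi, N, endpoint=False) + np.pi / (2 * N)
weighted = np.sum(T0(np.cos(theta)) * T2(np.cos(theta))) * (np.pi / N)

print(unweighted)  # ~ -2/3: NOT orthogonal without the weight
print(weighted)    # ~ 0: orthogonal under the Chebyshev weight
```

In the “units” interpretation, the Chebyshev weight says the endpoints of [−1, 1] carry more importance than the middle, and orthogonality is defined relative to that choice.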

Similarly, imagine you measured a 2D vector but your x and y measurement devices were misaligned at a skew angle instead of orthogonal. Then you couldn’t just do x1*x2+y1*y2 to get the angle – you would overemphasize components whose measurement directions are nearly aligned, and underemphasize components whose measurement directions are not. To correct this and get the true angle, you would have to insert a correction matrix (the Gram matrix of the measurement directions) into the dot product. In probability theory, this skewness between measurement directions becomes correlation between observables, and the correction matrix becomes a covariance kernel that you may see in place of the weighting function.
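Here is a small sketch of that correction (my own example, with a 60-degree skew chosen for illustration): measured coordinates refer to a skewed basis, and the Gram matrix G of the basis vectors restores the true angle.

```python
import numpy as np

# Measurement axes skewed at 60 degrees instead of 90:
# measured coordinates (a, b) mean the physical vector a*e1 + b*e2.
e1 = np.array([1.0, 0.0])
e2 = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])

# Gram (metric) matrix of the skewed basis: G[i, j] = e_i . e_j.
B = np.column_stack([e1, e2])
G = B.T @ B

u = np.array([1.0, 0.0])   # measured coordinates of e1 itself
v = np.array([0.0, 1.0])   # measured coordinates of e2 itself

def angle(a, b, metric):
    cos = (a @ metric @ b) / np.sqrt((a @ metric @ a) * (b @ metric @ b))
    return np.degrees(np.arccos(cos))

print(angle(u, v, np.eye(2)))  # 90: naive dot product on raw coordinates is wrong
print(angle(u, v, G))          # 60: metric-corrected angle is right
```

A diagonal metric is exactly a weighting function (each direction gets its own scale); off-diagonal entries encode the skew, which is why the covariance kernel generalizes w(x).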

Hope this makes some sense?