

Many interesting and important techniques center around the ideas of embedding a set of points in a higher-dimensional space, or projecting a set of points into a lower-dimensional space. It's important to become comfortable with these geometrical ideas and with their algebraic (and computational) counterparts.

Suppose that we have a point b in n-dimensional space - that is, a (column) vector with n components - and we want to find its distance to a given line, say the line from the origin through the point a (another column vector with n components). We want the point p on the line through a that is closest to b. (The point p is also the projection of the point b onto the line through a.)

It should be obvious to your geometrical intuition that the line from b to p is perpendicular to a. It should be obvious to your algebraic intuition that any point on the line through point a can be represented as c*a, for some scalar c. Another simple fact of the algebra of vectors is that the vector from point c*a to point b is b-c*a.

Since the vector b-c*a must be perpendicular to the vector a, it follows that their inner product must be 0. Thus (in Matlab-ese)

a'*(b-c*a) = 0

and therefore, with a bit of algebraic manipulation,

c = (a'*b)/(a'*a)

This allows us to compute the location of p = c*a and the distance from b to the line, which is the length of b-p. We can re-arrange the terms (since a'*a is just a scalar) to yield

p = ((a*a')/(a'*a))*b

This is the vector b pre-multiplied by the term (a*a')/(a'*a). What is the dimensionality of that term? Well, given that a is an n-by-1 column vector, a*a' is n-by-1 times 1-by-n, which is n-by-n, and a'*a is just a scalar, so the whole term is an n-by-n matrix. In other words, we're projecting point b onto the line through point a by pre-multiplying b by an n-by-n projection matrix (call it P), which is particularly easy to define:

P = (a*a')/(a'*a)
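Here is a minimal numerical sketch of the derivation in Matlab/Octave; the particular vectors a and b are invented for illustration and are not from the original notes.

a = [2; 1; 2];        % direction of the line through the origin (n-by-1)
b = [1; 3; 5];        % the point we want to project (n-by-1)

c = (a'*b)/(a'*a);    % scalar coefficient from the perpendicularity condition
p = c*a;              % projection of b onto the line through a

P = (a*a')/(a'*a);    % the n-by-n projection matrix
disp(norm(P*b - p));  % P*b reproduces p (prints something near 0)
disp(norm(b - p));    % distance from b to the line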
You can forget the derivation, if you want (though it's good to be able to do manipulations of this kind), and just remember this simple and easy formula for a projection matrix. Note that this works for vectors in a space of any dimensionality.

Note also that if we have a collection of m points, arranged in the columns of an n-by-m matrix B, then we can project them all at once by the simple matrix multiplication

B1 = P*B

which produces a new n-by-m matrix B1 whose columns contain the projections of the columns of B.
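A quick sketch of the batch version, again with an invented cloud of points rather than anything from the original notes:

a = [2; 1; 2];          % direction of the line
P = (a*a')/(a'*a);      % 3-by-3 projection matrix onto the line through a

B = randn(3, 5);        % five made-up points in 3-D, one per column
B1 = P*B;               % column k of B1 is the projection of column k of B

disp(rank(B1));         % every projected point lies on one line, so the rank is 1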
Embedding: losing a line and finding it again

Let's start with a simple and coherent set of numbers: the integers from 1 to 100. If we add a couple of other dimensions whose values are all zero, the nature of the set remains clear by simple inspection of the matrix of values. But now let's apply a random linear transformation (implemented as multiplication by a random 3x3 matrix). Now it's not quite so clear what we started with, just from inspection of the numbers: we've embedded the one-dimensional set of points in a three-dimensional space, and after the random linear transformation, the description of the points seems to use all three dimensions. But the structure - the fact that all the data points fall on a single line - is still evident in a 3-D plot.
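A minimal sketch of the experiment in Matlab/Octave; the variable names and the use of plot3 are illustrative choices, not necessarily those of the original notes.

x = 1:100;                % the one-dimensional set of points: the integers from 1 to 100
X = [x; zeros(2, 100)];   % add two dimensions whose values are all zero (3-by-100)

T = randn(3, 3);          % a random 3x3 linear transformation
Y = T*X;                  % after this, the numbers no longer look simple by inspection

plot3(Y(1,:), Y(2,:), Y(3,:), 'o');  % but the points still fall on a single line
grid on;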
