Conceptual explanation of Linear Transformations or Linear Maps: Linear Algebra
Introduction
Linear maps, also known as linear transformations, are a fundamental concept in linear algebra. They are functions that map one vector space to another while preserving the operations of vector addition and scalar multiplication. Refer to the previous post. They play a crucial role in many areas of mathematics, physics, and engineering.
In physics, linear maps are used to model the behavior of systems under different conditions, and in engineering, they are used to design control systems and signal-processing algorithms. In computer science, linear maps are used in image processing, machine learning, and data compression. They are also used in cryptography to encrypt and decrypt messages.
In data science, linear maps are widely used in a variety of tasks such as PCA, LDA, linear regression, linear classifiers, linear dimensionality reduction, image compression, and so on. In this article, we will explore the basic properties of linear maps and investigate a few of these data science tasks in detail in the sections below.
What is a linear map?
A linear map, denoted by T, is a function that maps vectors in a vector space V to vectors in a vector space W.
It is defined by a set of rules that specify how to transform each vector in V into a corresponding vector in W.
Example: Given a vector x in V, the linear map T will map it to a vector y in W according to the rule T(x) = Ax, where A is a matrix. This matrix is called the matrix representation of the linear map T.
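As a minimal sketch of this idea (the matrix A and vector x below are made-up values chosen only for illustration), we can apply a linear map through its matrix representation using NumPy:

```python
import numpy as np

# Hypothetical matrix representation of a linear map T: R^2 -> R^2
A = np.array([[2, 0],
              [1, 3]])

x = np.array([1, 4])   # a vector in V = R^2

y = A @ x              # T(x) = Ax, the corresponding vector in W = R^2
print(y)               # [ 2 13]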
As we know, a linear map preserves the operations of vector addition and scalar multiplication. This means that if x and y are vectors in V, and a is a scalar, then
T(x + y) = T(x) + T(y) -> vector addition
T(ax) = aT(x) -> scalar multiplication
A simple example of a linear map can be represented by a function T(x) = 2x, which takes a vector x and multiplies it by 2.
For this map, T(x + y) = 2(x + y) = 2x + 2y = T(x) + T(y). The same property, T(x + y) = T(x) + T(y), holds for any linear map T, regardless of the specific function used to represent it.
T(ax) = aT(x) is the other defining property of linear transformations. A simple example of this property is a transformation that scales a 2D vector by a fixed factor k. Let T be the transformation T(x) = kx that scales a 2D vector x by the factor k. Then T(ax) = k(ax) = a(kx) = aT(x), so T(ax) = aT(x) holds for this scaling transformation.
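As a quick numerical check (a sketch only, using the scaling map T(x) = 2x from the earlier example and made-up vectors), both properties can be verified directly:

```python
import numpy as np

def T(v):
    # The scaling map from the example above: multiply a vector by 2
    return 2 * v

x = np.array([1.0, 3.0])
y = np.array([-2.0, 5.0])
a = 4.0

# Vector addition is preserved: T(x + y) == T(x) + T(y)
print(np.allclose(T(x + y), T(x) + T(y)))   # True

# Scalar multiplication is preserved: T(a x) == a T(x)
print(np.allclose(T(a * x), a * T(x)))      # True
```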
These properties allow us to use the same operations on the vectors in V and W, making it easier to work with the transformed vectors. Together, they are known as the “linearity” of the map.
Two other basic examples of linear maps are the identity map and the zero map. The identity map is a linear map that maps each vector in V to itself, and the zero map is a linear map that maps each vector in V to the zero vector in W. Scalar multiples of the identity map are linear as well: any map of the form T(x) = ax, where a is a scalar, satisfies both properties above.
We will look at some more interesting material, so please hold tight, and let’s see some real-world examples.
Why is it called a linear transformation or linear map?
The main reason is that it preserves the operations of vector addition and scalar multiplication. These operations are known as “linear operations” because they can be represented by linear equations. A linear equation is an equation of the form y = mx + b, where m is a scalar coefficient (the slope) and b is a constant, so y is a linear combination of x plus the constant b.
In the case of a linear map T, we can think of the input vector x as the variable x in a linear equation and the output vector T(x) as the variable y. The matrix A that represents the linear map T plays the role of the coefficient m in the equation (for a linear map, the constant term b is zero).
Some examples of Linear transformations:
As we saw earlier, linear maps are widely used in data science for a variety of tasks. Some situations where linear maps can be particularly useful include:
1. High-dimensional data: When working with high-dimensional data, it can be difficult to visualize and analyze the data. Linear maps such as Principal Component Analysis (PCA) can be used to reduce the dimensionality of the data while retaining as much variation in the data as possible.
Principal component analysis (PCA) is a technique that transforms high-dimensional data into lower dimensions while retaining as much information as possible.
Example:
Let’s create a simple dataframe having 3 columns with 5 rows each.
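Here is a minimal sketch of such a dataframe (the column values below are made up so that the three columns are positively correlated):

```python
import pandas as pd

# Hypothetical data: 3 positively correlated columns, 5 rows
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [2, 4, 5, 8, 10],
    "col3": [3, 6, 9, 11, 15],
})
print(df)
```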
In this example col1, col2 and col3 are positively linearly correlated with each other, which means we can assume that:
If col1 increases, there is a possibility that col2 increases.
If col2 increases, there is a possibility that col3 increases and vice versa.
PCA assumes the relationships between the columns are linear and applies a linear transformation to project the data onto n_components dimensions without losing much information, as in the sketch below.
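A minimal sketch of applying PCA with scikit-learn to the dataframe above (n_components=2 is an arbitrary choice for illustration):

```python
from sklearn.decomposition import PCA

# Reduce the 3 correlated columns to 2 principal components
pca = PCA(n_components=2)
reduced = pca.fit_transform(df)

print(reduced.shape)                  # (5, 2)
print(pca.explained_variance_ratio_)  # most variance is captured by the first component
```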
6. Linear Regression: Linear regression is a linear map that finds the best-fitting line that approximates a set of data points.
E.g.: In the example below, we have two variables, x and y, where x is an integer giving each student’s marks out of 20 and y is an integer giving each student’s attitude score out of 100. We draw a regplot with a regression line fitted between marks and attitude, and we can see the linear trend between x and y.
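A sketch of such a plot in code (the marks and attitude values below are made up purely for illustration):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical student data: marks out of 20 (x) and attitude score out of 100 (y)
data = pd.DataFrame({
    "marks":    [5, 8, 10, 12, 14, 16, 18, 20],
    "attitude": [30, 40, 48, 55, 62, 70, 80, 90],
})

# regplot fits and draws a regression line between marks and attitude
sns.regplot(x="marks", y="attitude", data=data)
plt.show()
```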
The formula for linear regression is y = mx + c, where y is the dependent variable, c is the y-intercept, and m is the slope that multiplies x. Now we need to find the slope and y-intercept so that we can predict y for a given value of x.
Step 1: To find the slope (m) of the regression line, we use the formula m = r(Sy/Sx), where r is the Pearson correlation coefficient, Sy is the standard deviation of y, and Sx is the standard deviation of x.
Step 2: Once we find the slope using the above formula, we then find the y-intercept, c = ȳ - m·x̄, where ȳ is the mean of the y sample, x̄ is the mean of the x sample, and m is the slope.
Let’s first calculate the slope and then the y-intercept:
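Here is a sketch of this calculation with NumPy, using the hypothetical marks/attitude data from the plot above:

```python
import numpy as np

x = data["marks"].to_numpy()
y = data["attitude"].to_numpy()

# Step 1: slope m = r * (Sy / Sx)
r = np.corrcoef(x, y)[0, 1]     # Pearson correlation coefficient
m = r * (y.std() / x.std())     # slope of the regression line

# Step 2: intercept c = mean(y) - m * mean(x)
c = y.mean() - m * x.mean()

print(m, c)
```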
Now that we have the slope and y-intercept, let’s apply them to the linear regression formula and see how the linearity works here:
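As a sketch, continuing with the m and c computed above, we plug a new mark of 15 into y = mx + c:

```python
x_new = 15                # a new student's marks out of 20
y_pred = m * x_new + c    # predicted attitude score for that student
print(y_pred)
```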
What we are saying here is: if I have a new value of x (a student’s mark) of 15, then with the calculated slope and y-intercept I can predict y (the student’s attitude score), assuming x and y have a linear relationship.
Conclusion:
Linear maps are useful in data science because they are simple to implement, computationally efficient, and can be easily understood by humans. They can also be represented by matrices, which makes them easy to implement in computer programs. However, it’s important to keep in mind that linear maps may not always be the best choice for a particular problem, as other types of transformations, such as non-linear maps, may be more appropriate.