The Dot Product

Let’s say a shop’s inventory lists the unit price and quantity on hand for each product it carries. Suppose the shop has 10 small storage boxes each worth \$5, 7 medium storage boxes each worth \$10, and 5 large storage boxes each worth \$15. The price vector p and the quantity vector q can then be written as:

p = \begin{bmatrix}5\\10\\15\end{bmatrix}, q = \begin{bmatrix}10\\7\\5\end{bmatrix}

The net worth of the boxes is calculated as:

(10)(\$5)+(7)(\$10)+(5)(\$15) = \$195
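The same pairwise multiply-and-sum can be checked in plain Python (independent of S2; the variable names are just for illustration):

```python
# unit prices and quantities from the inventory example
p = [5, 10, 15]   # price vector
q = [10, 7, 5]    # quantity vector

# multiply entries in pairs and sum the products
net_worth = sum(pi * qi for pi, qi in zip(p, q))
print(net_worth)  # prints 195
```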

This operation of multiplying the corresponding entries of two vectors and summing the products arises often in applications of linear algebra, and it is also foundational in basic linear algebra theory.

Definition: The dot product of two vectors x, y in \mathbb{R}^n is defined as:

x . y = x_1y_1+x_2y_2+x_3y_3+\cdots+x_ny_n

Let us solve an example using S2.

Q: Determine x . y given x = \begin{bmatrix}1\\2\\3\\4\end{bmatrix} and y = \begin{bmatrix}4\\5\\6\\7\end{bmatrix}

					%use s2
// define vectors x and y
var x = DenseVector(arrayOf(1.0, 2.0, 3.0, 4.0))
var y = DenseVector(arrayOf(4.0, 5.0, 6.0, 7.0))

// elementwise product of x and y; the dot product is the sum of its entries
val P = x.multiply(y)
[4.000000, 10.000000, 18.000000, 28.000000] 

Summing the entries of P gives x . y = 4 + 10 + 18 + 28 = 60.
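The result can be cross-checked outside S2 with a few lines of plain Python:

```python
x = [1.0, 2.0, 3.0, 4.0]
y = [4.0, 5.0, 6.0, 7.0]

# dot product: sum of the elementwise products
dot = sum(xi * yi for xi, yi in zip(x, y))
print(dot)  # prints 60.0
```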

Geometrical Connection:

The dot product x . y can also be calculated with the help of the angle \theta between these two vectors as follows:

x . y = \left|x\right|\left|y\right|\cos \theta

Let us find the dot product of the vectors in the figure below:

From the figure, vector x can be written as: (3\vec{i}+5\vec{j})-(0\vec{i}+2\vec{j}) = (3\vec{i}+3\vec{j})

\implies x = 3\vec{i}+3\vec{j}

Similarly, vector y can be written as: (4\vec{i}+2\vec{j})-(0\vec{i}+2\vec{j}) = (4\vec{i}+0\vec{j})

\implies y = 4\vec{i}

Given, \theta = 45^\circ

x . y = \left|x\right|\left|y\right|\cos \theta

= \left|3\vec{i}+3\vec{j}\right|\left|4\vec{i}\right|\cos 45^\circ

= (\sqrt{3^2+3^2}) (\sqrt{4^2}) \left(\frac{1}{\sqrt{2}}\right)

= (3\sqrt{2}) (4) \left(\frac{1}{\sqrt{2}}\right)

\implies x . y = 12
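The componentwise formula and the |x||y|cos θ formula can be compared numerically. Here is a plain-Python sketch (not S2) using the vectors x = 3i + 3j and y = 4i from the figure:

```python
import math

x = (3.0, 3.0)  # 3i + 3j
y = (4.0, 0.0)  # 4i

# componentwise dot product
dot = x[0] * y[0] + x[1] * y[1]

# |x| |y| cos(theta) with theta = 45 degrees
geometric = math.hypot(*x) * math.hypot(*y) * math.cos(math.radians(45))

print(dot)                  # prints 12.0
print(round(geometric, 9))  # prints 12.0 (up to floating-point rounding)
```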

Note: x . y = 0 if and only if the vectors x and y are orthogonal (perpendicular to each other).

Real World Application of Vector Dot Product

In Natural Language Processing, one of the basic ways to compare a finite number of text documents is to use
vectorized word counts. Let’s suppose the documents have a combined total of n distinct words, which are arranged in some order. Each document is then associated with a vector of length n whose i^{th} entry indicates the number of times the i^{th} word occurs in the associated document.

One way to measure similarity between two documents is to take the dot product of the associated
unit vectors: If two documents A and B have associated vectors a and b respectively, their similarity is defined by:

Similarity, S(A, B) = \frac{a . b}{\left|a\right| \left|b\right|} 

As defined earlier, x . y = \left|x\right| \left|y\right| \cos \theta

\implies \cos \theta = \frac{x . y}{\left|x\right| \left|y\right|}

Since word-count vectors have only nonnegative entries, x . y \geq 0, so 0 \leq \cos \theta \leq 1

 \implies 0 \leq \frac{x . y}{\left|x\right| \left|y\right|} \leq 1

Hence, 0 \leq S(A, B) \leq 1 for any two documents A and B.

  • Documents with no words in common are associated with orthogonal vectors and thus have 0 similarity.
  • If two documents have similarity 1, their associated vectors are scalar multiples of each other, meaning that they have the same words and that the words appear in the same proportions.

Let us try to find the vectorized word count similarity between the following sentences:

A: “Julie likes John more than Kelly likes John”

B: “Joel likes Jonam more than Julie likes John”

Respective word counts in A and B:

Word      Count in A    Count in B
Julie         1             1
John          2             1
Kelly         1             0
Joel          0             1
Jonam         0             1
likes         2             2
more          1             1
than          1             1

Therefore, the two vectors are:

a = [1, 2, 1, 0, 0, 2, 1, 1] and b = [1, 1, 0, 1, 1, 2, 1, 1]

S(A, B) = \frac{a . b}{\left|a\right| \left|b\right|}

 \implies S(A, B) = \frac{1+2+0+0+0+4+1+1}{\sqrt {12} \sqrt {10}} = \frac{9}{\sqrt{120}}

 \implies S(A, B) \approx 0.8216

Hence, the similarity between the two given sentences A and B is S(A, B) \approx 0.822.
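The whole calculation, from raw sentences to the similarity score, can be sketched in plain Python (outside S2; the word-splitting and variable names here are illustrative):

```python
import math
from collections import Counter

A = "Julie likes John more than Kelly likes John"
B = "Joel likes Jonam more than Julie likes John"

# vectorized word counts over the combined vocabulary
count_a, count_b = Counter(A.split()), Counter(B.split())
vocab = sorted(set(count_a) | set(count_b))
a = [count_a[w] for w in vocab]  # Counter returns 0 for missing words
b = [count_b[w] for w in vocab]

# cosine similarity S(A, B) = a.b / (|a| |b|)
dot = sum(ai * bi for ai, bi in zip(a, b))
similarity = dot / (math.sqrt(sum(ai * ai for ai in a)) *
                    math.sqrt(sum(bi * bi for bi in b)))
print(round(similarity, 3))  # prints 0.822
```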

The Transpose

The dot product gives us a compact way to express the formula for an entry of a matrix product: to obtain
the (i, j)^{th} entry of a matrix product AB, we dot the i^{th} row of A and the j^{th} column of B.

However, the matrix product by itself is not quite flexible enough to handle a common use case: suppose we have two matrices A and B which contain tabular data stored in the same format.

For example, suppose that the columns of A store the vectorized word counts for a series of emails sent from Kelly, while B stores vectorized word counts for a series of emails sent from Ana. If we want to calculate the similarity of each of Kelly’s emails to each of Ana’s emails, then we want to dot the columns of A, not its rows, with the columns of B.

So we define the transpose A^T of a matrix A to be the matrix resulting from switching A’s rows and columns.

Definition: If A is an m \times n matrix, then its transpose A^T is defined to be the matrix with n rows whose i^{th} row is equal to the i^{th} column of A, for each i from 1 to n.

If A = \begin{bmatrix}1 & 2 & 3\\4 & 5 & 6\end{bmatrix}, then A^T = \begin{bmatrix}1 & 4\\2 & 5\\3 & 6\end{bmatrix}

Let us implement the same using S2.

					%use s2
// define a matrix
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(4.0, 5.0, 6.0)))
println(A)

// B = transpose of A
val B = A.t()
println("\n\nTranspose of A:\n")
println(B)
	[,1] [,2] [,3] 
[1,] 1.000000, 2.000000, 3.000000, 
[2,] 4.000000, 5.000000, 6.000000, 

Transpose of A:
	[,1] [,2] 
[1,] 1.000000, 4.000000, 
[2,] 2.000000, 5.000000, 
[3,] 3.000000, 6.000000, 

Note: The transpose is a linear operator, meaning that (cA + B)^T = cA^T + B^T, where c is a constant and A and B are matrices.
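The linearity identity can be spot-checked numerically. This plain-Python sketch (not S2; the nested-list matrices and helper name are illustrative) compares both sides of (cA + B)^T = cA^T + B^T:

```python
def transpose(M):
    """Swap rows and columns of a nested-list matrix."""
    return [list(row) for row in zip(*M)]

A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8, 9], [10, 11, 12]]
c = 3

# left side: (cA + B)^T
left = transpose([[c * a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)])
# right side: cA^T + B^T
right = [[c * a + b for a, b in zip(ra, rb)]
         for ra, rb in zip(transpose(A), transpose(B))]

print(left == right)  # prints True
```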

Matrix Properties

Now that we know what the transpose of a matrix is, let’s learn about various matrix properties and their illustrations in S2.

Symmetric Matrix

A matrix whose (i,j)^{th} entry is equal to its (j,i)^{th} entry for all i and j is known as a Symmetric Matrix.

In other words, if A is an n \times n matrix satisfying the equation A = A^T, we say that A is symmetric.

					%use s2
// define a matrix
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(2.0, 4.0, 5.0),
    doubleArrayOf(3.0, 5.0, 6.0)))
println(A)

// prints 'true' if A is symmetric, else prints 'false'
val precision = 1e-15
var symmetric = true
for (i in 1..A.nRows()) {
    for (j in 1..A.nCols()) {
        if (Math.abs(A.get(i, j) - A.get(j, i)) > precision) symmetric = false
    }
}
println(symmetric)
	[,1] [,2] [,3] 
[1,] 1.000000, 2.000000, 3.000000, 
[2,] 2.000000, 4.000000, 5.000000, 
[3,] 3.000000, 5.000000, 6.000000, 
true
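The same check can be mirrored in plain Python (outside S2), comparing each entry with its mirror across the diagonal:

```python
A = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]

# A is symmetric when every (i, j) entry equals the (j, i) entry
n = len(A)
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
print(is_symmetric)  # prints True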


Skew-Symmetric Matrix

If A is an n \times n matrix satisfying the equation A^T = -A, we say A is Skew-Symmetric.

					%use s2
// define a matrix
var A = DenseMatrix(arrayOf(
    doubleArrayOf(0.0, -6.0, 4.0),
    doubleArrayOf(6.0, 0.0, -5.0),
    doubleArrayOf(-4.0, 5.0, 0.0)))
println(A)

// prints 'true' if A is skew-symmetric, else prints 'false'
val precision = 1e-15
var skewSymmetric = true
for (i in 1..A.nRows()) {
    for (j in 1..A.nCols()) {
        if (Math.abs(A.get(i, j) + A.get(j, i)) > precision) skewSymmetric = false
    }
}
println(skewSymmetric)
	[,1] [,2] [,3] 
[1,] 0.000000, -6.000000, 4.000000, 
[2,] 6.000000, 0.000000, -5.000000, 
[3,] -4.000000, 5.000000, 0.000000, 
true
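Likewise, a plain-Python sketch of the skew-symmetry test (each entry must equal the negated mirror entry):

```python
A = [[0, -6, 4],
     [6, 0, -5],
     [-4, 5, 0]]

# A is skew-symmetric when every (i, j) entry equals -1 times the (j, i) entry
n = len(A)
is_skew = all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))
print(is_skew)  # prints True
```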


Idempotent Matrix

If A is an n \times n matrix satisfying the equation A = A*A or A = A^2, we say A is Idempotent.

					%use s2
// define a matrix
var A = DenseMatrix(arrayOf(
    doubleArrayOf(2.0, -2.0, -4.0),
    doubleArrayOf(-1.0, 3.0, 4.0),
    doubleArrayOf(1.0, -2.0, -3.0)))
println(A)

// prints 'true' if A is idempotent, else prints 'false'
val precision = 1e-15
val A2 = A.multiply(A)
var idempotent = true
for (i in 1..A.nRows()) {
    for (j in 1..A.nCols()) {
        if (Math.abs(A2.get(i, j) - A.get(i, j)) > precision) idempotent = false
    }
}
println(idempotent)
	[,1] [,2] [,3] 
[1,] 2.000000, -2.000000, -4.000000, 
[2,] -1.000000, 3.000000, 4.000000, 
[3,] 1.000000, -2.000000, -3.000000, 
true
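As a cross-check outside S2, the product A.A can be formed with a triple loop in plain Python and compared with A:

```python
A = [[2, -2, -4],
     [-1, 3, 4],
     [1, -2, -3]]

# A2 = A.A (matrix product): entry (i, j) is row i of A dotted with column j of A
n = len(A)
A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]

# A is idempotent when A.A equals A
print(A2 == A)  # prints True
```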


Orthogonal Matrix

If A is an n \times n matrix satisfying the equation A.A^T = A^T.A = I where I is an Identity Matrix of order n, we say A is Orthogonal.

					%use s2
// define a matrix
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 0.0, 0.0),
    doubleArrayOf(0.0, 0.0, -1.0),
    doubleArrayOf(0.0, -1.0, 0.0)))
println(A)

// prints 'true' if A is orthogonal, else prints 'false'
val precision = 1e-15
val AAt = A.multiply(A.t())
var orthogonal = true
for (i in 1..A.nRows()) {
    for (j in 1..A.nCols()) {
        val expected = if (i == j) 1.0 else 0.0
        if (Math.abs(AAt.get(i, j) - expected) > precision) orthogonal = false
    }
}
println(orthogonal)
	[,1] [,2] [,3] 
[1,] 1.000000, 0.000000, 0.000000, 
[2,] 0.000000, 0.000000, -1.000000, 
[3,] 0.000000, -1.000000, 0.000000,
true
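Finally, a plain-Python cross-check that A.A^T is the identity matrix (entry (i, j) of A.A^T is row i of A dotted with row j of A, since column j of A^T is row j of A):

```python
A = [[1, 0, 0],
     [0, 0, -1],
     [0, -1, 0]]

n = len(A)
# A.A^T: dot row i of A with row j of A
AAt = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(n)]
       for i in range(n)]

# A is orthogonal when A.A^T is the identity matrix
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
print(AAt == identity)  # prints True
```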