What is the standard algorithm for multiplication

The street algorithm

The Strassen algorithm is a recursive algorithm based on the divide-and-conquer principle. This is done in two steps:
  1. The problem is divided into sub-problems of the same size as possible of the same type as the original problem. These sub-problems can be solved independently of one another using the same algorithm.
  2. The solutions to the sub-problems are combined to form a solution to the original problem.
This division is repeated recursively until the sub-problems are trivial and can be solved immediately. The simplest case of matrix multiplication is the multiplication of two 2x2 matrices A, B
Define the auxiliary variables
p1 : = (a11 + a22) * (b11 + b22)
p2 : = (a21 + a22) * b11
p3 : = a11 * (b12 - b22)
p4 : = a22 * (b21 - b11)
p5 : = (a11 + a12) * b22
p6 : = (a21 - a11) * (b11 + b12)
p7 : = (a12 - a22) * (b21 + b22)
then applies to the product
as you can easily confirm by simple recalculation:
c11 = p1 + p4 - p5 + p7
c12 = p3 + p5
c21 = p2 + p4
c22 = p1 + p3 - p2 + p6
This algorithm for matrix multiplication, the Strassen algorithm, requires seven multiplications and 18 additions or subtractions in this case, while the standard algorithm only requires 8 multiplications and 4 additions. In the case of 2x2 matrices, the Strassen algorithm is clearly inferior, but Strassen recognized that the formulas (5.3) and (5.4) remain valid if the aij and bij are matrices themselves. For this general case, the algorithms Am, k, the two matrices of order m2k multiply, defined by induction on k. If n is not an even number, then the last column of C is calculated according to the usual methods and the procedure is applied to the remaining matrices of dimension n-1.

Definition 5.5.2: A.m, 0 be the standard algorithm for matrix multiplication (this requires m3 Multiplications and m2(m-1) additions). Is Am, k already known, then let Am, k + 1 defined as follows:

Let the matrices A and B be of order m2k + 1 are multiplied, then A, B and A.B are divided into blocks

where Ai, j, Bi, j, Ci, j Matrices of the order m2k and computes the auxiliary matrices of order m2k

P.1 : = (A11 + A22) * (B.11 + B22)
P.2 : = (A21 + A22) * B11
P.3 : = A11 * (B.12 - B.22)
P.4 : = A22 * (B.21 + B11)
P.5 : = (A11 + A12) * B22
P.6 : = (A21 - A11) * (B.11 + B12)
P.7 : = (A12 - A22) * (B.21 + B22)

C.11 = P1 + P4 - P5 + P7
C.12 = P3 + P5
C.21 = P2 + P4
C.22 = P1 + P3 - P2 + P6

by using the algorithm A for the multiplicationm, k and uses the usual (element-wise) algorithm for addition and subtraction.

The matrices P1, P2, ..., P7 can all be calculated at the same time. This also applies to the matrices C.11, ..., C22, although their computational effort compared with that for the auxiliary matrices P1, P2, ..., P7 is relatively small, so that the parallel calculation is often not worthwhile. The Strassen algorithm is definitely well suited for use on parallel computers.

Here is an implementation of the Strassen algorithm in C. It is neither efficient, nor pretty, nor is it guaranteed to be bug free. In addition, it does not release allocated memory again. However, it should demonstrate how and that the algorithm works.

Complexity of the street algorithm

The Strassen algorithm enables the calculation of the product of two matrices of order n = 2k with 7k Multiplications and less than 6.7k Additions and subtractions, more precisely:
Kmult(n) = nlog27 Multiplications and
Kadd(n) = 6 (nlog27 - nlog24) Additions / subtractions
This algorithm has an asymptotic complexity O (nlog27), which is significantly lower than that of the standard algorithm. For matrices with general order (not necessarily n = 2k) the following applies:

Theorem 5.5.1: The product of two square matrices of order n can be calculated using the Strassen algorithm with less than 28.nlog27 arithmetic operations are calculated.

Software for the street algorithm

Developing efficient programs from the basic concept of the Strassen algorithm is not an easy task. The Strassen algorithm was therefore only considered a theoretically interesting example for reducing complexity in matrix multiplication for a long time. However, there have recently been very efficient implementations for single-processor computers and parallel computers by C. C. Douglas, M. Heroux, G. Slishman and R. M. Smith. Implementations of the Strassen algorithm have also already been included in the software libraries of computer manufacturers (e.g. and in the ESSL library).

The "limit dimension" nmin, from which a Strassen implementation requires less computing time than the corresponding BLAS-3 program or, on most computers, lies between 32 and 256. With larger matrices, there is a significant reduction in the required floating point operations and a corresponding reduction in computing time.