This paper presents an implementation of coarse-grained parallel matrix multiplication (C = A × B) with two data partitioning schemes on a cluster of PCs. Most existing studies have proposed medium-grained parallel matrix multiplication on hypercube-connected or mesh-connected parallel computers. We study and implement a practical parallel matrix multiplication based on the MPMD model on a cluster of PCs using the MPI (Message Passing Interface) standard. In particular, two data partitioning schemes for decomposing matrix A and matrix B with a balanced workload are presented: 1) row-block partitioning and 2) checkerboard-block partitioning. We also introduce a modified parallel matrix multiplication that supports a parallel all-pairs shortest paths approach. Finally, the performance of sequential and parallel matrix multiplication is compared and evaluated in terms of response time, speedup, and efficiency. In our experimental results, the performance of the matrix multiplication improved by up to 50% when the number of processors (p) was increased by one.
Keywords: Parallel matrix multiplication, data-block partitioning, parallel all-pairs shortest paths, cluster of PCs, MPMD (Multiple Program Multiple Data), MPI (Message Passing Interface)
Corresponding author: E-mail: s6063611@kmitl.ac.th
Samutrak*, P., Boonniyom, J., & Srisawat, J. (2018). Parallel Matrix Multiplication on a Cluster of PCs. Current Applied Science and Technology, 34-42.
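The row-block scheme and the evaluation metrics named in the abstract can be illustrated with a small sketch. It is not the authors' MPI implementation: it simulates the p workers sequentially in plain Python, and every function name (row_blocks, block_product, row_block_multiply, min_plus, speedup, efficiency) is hypothetical. It also assumes that the "modified" multiplication for all-pairs shortest paths is the usual min-plus product, in which multiplication is replaced by addition and summation by minimum; the abstract does not state this.

```python
import math


def row_blocks(n, p):
    """Split n rows into p contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more row
        blocks.append((start, start + size))
        start += size
    return blocks


def block_product(a_rows, b, combine=lambda x, y: x * y, fold=sum):
    """Multiply a block of rows of A by all of B (the work of one worker)."""
    cols = list(zip(*b))  # columns of B
    return [[fold(combine(x, y) for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def row_block_multiply(a, b, p, **ops):
    """Simulate p workers one after another; each handles one row block of A."""
    c = []
    for lo, hi in row_blocks(len(a), p):
        c.extend(block_product(a[lo:hi], b, **ops))  # stands in for a gather step
    return c


def min_plus(a, b, p):
    """Assumed shortest-path variant: (+, *) replaced by (min, +)."""
    return row_block_multiply(a, b, p, combine=lambda x, y: x + y, fold=min)


def speedup(t_seq, t_par):
    """Sequential response time divided by parallel response time."""
    return t_seq / t_par


def efficiency(t_seq, t_par, p):
    """Speedup per processor."""
    return speedup(t_seq, t_par) / p


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(row_block_multiply(a, b, p=2))

    inf = math.inf
    d = [[0, 3, inf], [inf, 0, 1], [2, inf, 0]]  # edge weights, inf = no edge
    print(min_plus(d, d, p=2))  # shortest paths using at most two edges
```

In an actual MPI program each (lo, hi) block would be sent to a separate process and the partial results gathered on a root process; the loop in row_block_multiply only stands in for that exchange.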
