create a wrapper for cblas_sgemm that can transpose the input matrices in swift

You can create a wrapper for the cblas_sgemm function that transposes the input matrices by relying on cblas_sgemm itself: its CBLAS_TRANSPOSE arguments (transA and transB) tell it whether to treat each input as transposed. Here is an example implementation of the wrapper function:

main.swift
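import Accelerate

/// Computes C = alpha * op(A) * op(B) + beta * C for row-major matrices,
/// where op(X) is X or its transpose depending on transA / transB.
/// op(A) is M x K, op(B) is K x N, and C is M x N.
func transpose_sgemm(_ transA: CBLAS_TRANSPOSE,
                     _ transB: CBLAS_TRANSPOSE,
                     _ M: Int32,
                     _ N: Int32,
                     _ K: Int32,
                     _ alpha: Float,
                     _ A: UnsafePointer<Float>,
                     _ lda: Int32,
                     _ B: UnsafePointer<Float>,
                     _ ldb: Int32,
                     _ beta: Float,
                     _ C: UnsafeMutablePointer<Float>,
                     _ ldc: Int32) {

    // Row-major leading dimensions are the stored row lengths:
    // A is stored M x K for CblasNoTrans and K x M otherwise,
    // B is stored K x N for CblasNoTrans and N x K otherwise.
    precondition(lda >= (transA == CblasNoTrans ? K : M), "lda is too small")
    precondition(ldb >= (transB == CblasNoTrans ? N : K), "ldb is too small")
    precondition(ldc >= N, "ldc is too small")

    // cblas_sgemm applies the transposes itself, so A and B are passed
    // through unchanged; no temporary transposed copies are needed.
    cblas_sgemm(CblasRowMajor, transA, transB, M, N, K,
                alpha, A, lda, B, ldb, beta, C, ldc)
}

In this implementation, the wrapper does not copy or transpose A and B itself. cblas_sgemm computes C = alpha * op(A) * op(B) + beta * C, where op(X) is X for CblasNoTrans and the transpose of X for CblasTrans. The transA and transB arguments are therefore forwarded unchanged, and BLAS reads the original buffers in transposed order. This avoids the temporary arrays and the extra pass over memory that an explicit transpose with vDSP_mtrans would need, and it avoids transposing twice, which is what happens if the inputs are transposed by hand and CblasTrans is passed as well.

M, N and K always describe the matrices after transposition: op(A) is M x K, op(B) is K x N, and C is M x N. The leading dimensions describe the matrices as stored. In row-major order, lda must be at least K when transA is CblasNoTrans and at least M when it is CblasTrans; ldb must be at least N when transB is CblasNoTrans and at least K when it is CblasTrans; ldc must be at least N.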
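
To make the effect of the transpose flags concrete, here is a minimal pure-Swift sketch of the same computation for row-major storage. The name referenceGemm and its Bool flags are hypothetical stand-ins chosen for illustration, not part of Accelerate. A triple loop like this is much slower than cblas_sgemm and is only meant for checking results on small inputs.

```swift
/// Reference for C = alpha * op(A) * op(B) + beta * C with row-major storage.
/// `referenceGemm` is a hypothetical helper for illustration, not an Accelerate API.
func referenceGemm(transA: Bool, transB: Bool,
                   m: Int, n: Int, k: Int,
                   alpha: Float, a: [Float], lda: Int,
                   b: [Float], ldb: Int,
                   beta: Float, c: inout [Float], ldc: Int) {
    for i in 0..<m {
        for j in 0..<n {
            var sum: Float = 0
            for p in 0..<k {
                // op(A)[i, p]: A is stored k x m when transposed, m x k otherwise.
                let aip = transA ? a[p * lda + i] : a[i * lda + p]
                // op(B)[p, j]: B is stored n x k when transposed, k x n otherwise.
                let bpj = transB ? b[j * ldb + p] : b[p * ldb + j]
                sum += aip * bpj
            }
            c[i * ldc + j] = alpha * sum + beta * c[i * ldc + j]
        }
    }
}

// A is stored 2 x 3 (k x m), so transposed A is 3 x 2: [[1, 4], [2, 5], [3, 6]].
let a: [Float] = [1, 2, 3,
                  4, 5, 6]
// B is stored 2 x 2 (n x k) and is the identity, so transposed B is the identity too.
let b: [Float] = [1, 0,
                  0, 1]
var c = [Float](repeating: 0, count: 3 * 2)

referenceGemm(transA: true, transB: true, m: 3, n: 2, k: 2,
              alpha: 1, a: a, lda: 3, b: b, ldb: 2,
              beta: 0, c: &c, ldc: 2)
print(c)  // [1.0, 4.0, 2.0, 5.0, 3.0, 6.0], i.e. transposed A
```

Multiplying by the identity makes the result easy to read: C is simply the transpose of A, laid out row by row.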

You can use the wrapper function as follows:

main.swift
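let M: Int32 = 3
let N: Int32 = 4
let K: Int32 = 2
let alpha: Float = 1.0
let beta: Float = 0.0

// A is stored as K x M and B as N x K (row-major), so that
// transposed A is M x K and transposed B is K x N.
var A = [Float](repeating: 0.0, count: Int(K * M))
var B = [Float](repeating: 0.0, count: Int(N * K))
var C = [Float](repeating: 0.0, count: Int(M * N))

// fill A and B with some values...

// lda = M and ldb = K are the stored row lengths of A and B; ldc = N.
transpose_sgemm(CblasTrans, CblasTrans, M, N, K, alpha, A, M, B, K, beta, &C, N)

// C (M x N) is now the result of transposed A times transposed B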
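
Because the wrapper takes raw pointers, neither Swift nor cblas_sgemm can check that the arrays are large enough for the given dimensions and leading dimensions. A small check along the lines of the following sketch can catch a mismatched leading dimension or array count before the call. minimumCount is a hypothetical helper, not part of Accelerate, and it assumes row-major storage.

```swift
/// Smallest number of elements a row-major buffer must hold for a matrix
/// stored with `rows` rows, `cols` columns and leading dimension `ld`.
/// Hypothetical helper for illustration; not an Accelerate API.
func minimumCount(rows: Int, cols: Int, ld: Int) -> Int {
    precondition(ld >= cols, "leading dimension must be at least the stored row length")
    guard rows > 0, cols > 0 else { return 0 }
    // The last row starts at (rows - 1) * ld and only needs `cols` elements.
    return (rows - 1) * ld + cols
}

let m = 3, n = 4, k = 2

// With both inputs transposed, A is stored k x m and B is stored n x k.
let needA = minimumCount(rows: k, cols: m, ld: m)   // 6
let needB = minimumCount(rows: n, cols: k, ld: k)   // 8
let needC = minimumCount(rows: m, cols: n, ld: n)   // 12

print(needA, needB, needC)
```

Comparing these values with A.count, B.count and C.count before calling the wrapper turns a silent out-of-bounds read or write into an immediate, debuggable failure.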