From: darek
Newsgroups: comp.lang.ada
Subject: GNAT vs Matlab - operation on multidimensional complex matrices
Date: Mon, 23 Mar 2020 16:16:18 -0700 (PDT)
Message-ID: <6c2c0e35-af07-4169-8be5-464ec7fd0fd5@googlegroups.com>

Hi Everyone,

I am working on radar signal processing algorithms that use multidimensional complex arrays. To my surprise, some Matlab functions perform much better than the equivalent compiled Ada code. Let's start with a simple problem: computing the sum of all elements in a 3D real array and in a 3D complex array.

The Ada code is as follows:

with Ada.Text_IO;
with Ada.Real_Time; use Ada.Real_Time;
with Ada.Unchecked_Deallocation;
with Ada.Numerics.Long_Complex_Types; use Ada.Numerics.Long_Complex_Types;
with Ada.Text_IO.Complex_IO;

procedure TestSpeed is

   package TIO renames Ada.Text_IO;
   package CTIO is new Ada.Text_IO.Complex_IO(Ada.Numerics.Long_Complex_Types);

   subtype mReal is Long_Float;

   NumIteration : constant := 1_000;
   NumChannels  : constant := 64;
   NumRanges    : constant := 400;
   NumAngles    : constant := 30;

   type tCubeReal is array (1..NumChannels, 1..NumAngles, 1..NumRanges) of mReal;
   type tCubeRealAcc is access all tCubeReal;
   --for tCubeReal'Alignment use 8;

   type tCubeComplex is array (1..NumChannels, 1..NumAngles, 1..NumRanges) of Complex;
   type tCubeComplexAcc is access all tCubeComplex;
   --for tCubeComplex'Alignment use 16;

   RealCubeAcc    : tCubeRealAcc;
   SReal          : mReal;
   ComplexCubeAcc : tCubeComplexAcc;
   SComplex       : Complex;

   procedure Free is new Ada.Unchecked_Deallocation(tCubeReal, tCubeRealAcc);
   procedure Free is new Ada.Unchecked_Deallocation(tCubeComplex, tCubeComplexAcc);

   --| -------------------------------------------------------------------------
   -- Sums all elements of the real cube NumIteration times and prints the timing.
   procedure SpeedSumRealCube (NumIteration : Integer; Mtx : in tCubeReal; S : out mReal) is
      Ts  : Time;
      TEx : Time_Span;
      t   : mReal;
   begin
      Ts := Clock;
      S := 0.0;
      for k in 1..NumIteration loop
         for m in Mtx'Range(1) loop
            for n in Mtx'Range(2) loop
               for p in Mtx'Range(3) loop
                  S := S + Mtx(m, n, p);
               end loop;
            end loop;
         end loop;
      end loop;
      TEx := Clock - Ts;
      TIO.New_Line;
      TIO.Put_Line("Computation time:" & Duration'Image(To_Duration(TEx)));
      t := mReal(To_Duration(TEx))/mReal(NumIteration);
      TIO.Put_Line("Computation time per iteration:" & t'Image);
   end SpeedSumRealCube;

   --| -------------------------------------------------------------------------
   -- Same loop nest as above, but accumulating Complex values.
   procedure SpeedSumComplexCube (NumIteration : Integer; Mtx : in tCubeComplex; S : out Complex) is
      Ts  : Time;
      TEx : Time_Span;
      t   : mReal;
   begin
      Ts := Clock;
      S := 0.0 + i * 0.0;
      for k in 1..NumIteration loop
         for m in Mtx'Range(1) loop
            for n in Mtx'Range(2) loop
               for p in Mtx'Range(3) loop
                  S := S + Mtx(m, n, p);
               end loop;
            end loop;
         end loop;
      end loop;
      TEx := Clock - Ts;
      TIO.New_Line;
      TIO.Put_Line("Computation time:" & Duration'Image(To_Duration(TEx)));
      t := mReal(To_Duration(TEx))/mReal(NumIteration);
      TIO.Put_Line("Computation time per iteration:" & t'Image);
   end SpeedSumComplexCube;

   --| -------------------------------------------------------------------------
begin
   TIO.Put_Line("Real cube");
   TIO.Put_Line("Real type size is:" & Integer(mReal'Size)'Image);

   -- Both cubes are allocated on the heap and filled with ones.
   RealCubeAcc := new tCubeReal;
   RealCubeAcc.all := (others => (others => (others => 1.0)));
   SpeedSumRealCube(NumIteration => NumIteration,
                    Mtx          => RealCubeAcc.all,
                    S            => SReal);
   TIO.Put_Line("Sum is:" & SReal'Image);

   TIO.Put_Line("Complex cube");
   TIO.Put_Line("Complex type size is:" & Integer(Complex'Size)'Image);

   ComplexCubeAcc := new tCubeComplex;
   ComplexCubeAcc.all := (others => (others => (others => 1.0 + i * 1.0)));
   SpeedSumComplexCube(NumIteration => NumIteration,
                       Mtx          => ComplexCubeAcc.all,
                       S            => SComplex);
   TIO.Put("Sum is:");
   CTIO.Put(SComplex);

   Free(ComplexCubeAcc);
   Free(RealCubeAcc);
end TestSpeed;

1. Compiled with: gcc version 9.2.0 (tdm64-1) (and GNAT Community Edition 2019), with the -O2 optimisation level.
2. CPU: AMD64 Family 23 Model 24 Stepping 1 CPU0 2300 AMD Ryzen 7 3750H with Radeon Vega Mobile Gfx
3. Win10 64bit

The results of the program execution:

Computation time: 0.616710300
Computation time per iteration: 6.16710300000000E-04
Sum is: 7.68000000000000E+08
Complex cube
Complex type size is: 128
Computation time: 3.707091000
Computation time per iteration: 3.70709100000000E-03
Sum is:( 7.68000000000000E+08, 7.68000000000000E+08)

The executable produced by the gcc provided with GNAT Community Edition gave very similar results.

Now the more interesting part: the Matlab code.

Matlab version: Matlab 2019b, 64bit

function [] = TestSpeed
    NumIterations = 1000;
    NumChannels = 64;
    NumRanges = 400;
    NumAngles = 30;

    %--| real matrix
    ReMtx = ones(NumChannels, NumAngles, NumRanges);
    tic
    SReal = ComputeSum(NumIterations, ReMtx);
    TExR = toc; %cputime-ts;
    fprintf('TExe:%f, sum real=%f\n', TExR, SReal);

    %--| complex matrix
    CplxMtx = complex(ReMtx, ReMtx);
    %ts = cputime;
    tic
    SCplx = ComputeSum(NumIterations, CplxMtx);
    TExC = toc; %cputime - ts;
    fprintf('TExe:%f, sum complex= <%f,%f> \n', TExC, real(SCplx), imag(SCplx));
    fprintf('Complex operations are %f times slower\n', TExC/TExR);
end % function

function [S] = ComputeSum(NumIterations, Mtx)
    S = 0;
    for i = 1:NumIterations
        S = S + sum(sum(sum(Mtx)));
    end % for
end % function

The results of the program execution:

TExe:0.260718, sum real=768000000.000000
TExe:0.789778, sum complex= <768000000.000000,768000000.000000>
Complex operations are 3.029242 times slower

What is wrong with my code? Is the Ada compiler doing a bad job here?

In the Matlab code, the computation that uses complex addition is ~3 times slower than the one for real numbers (0.79 s vs 0.26 s). In the Ada code, the complex operations are ~6 times slower than the real ones (3.71 s vs 0.62 s).

Did I miss something somewhere? Any ideas how I can improve the performance of the Ada program (different array layout, magic pragmas, or magic compiler switches)?

It seems that Matlab is performing really well here ...

Any suggestions are very welcome.

Regards,
Darek