Update compiler flag args to speed up Dreamcast projects

Ian Micheal

Update compiler flag args to speed up Dreamcast projects

Post#1 » Sat Jan 19, 2019 3:01 am

Hi, I'm sharing this. I tested all of these on lots of projects, and you can get a lot of speed on projects using these flags, including -Os, plus my chart. I spent a lot of time on this back in the day.



Most people use -O3 but don't know what it enables. Here is what it enables:

Code: Select all

GNU CPP version 3.0.4 (cpplib) (Hitachi SH)
GNU C version 3.0.4 (sh-elf)
        compiled by GNU C version 3.2 20020927 (prerelease).
options passed:  -lang-c -v -I/usr/local/dc/kos-1.1.9/include
 -I/usr/local/dc/kos-1.1.9/libc/include
 -I/usr/local/dc/kos-1.1.9/kernel/arch/dreamcast/include -DGNUC=3
 -DGNUC_MINOR=0 -DGNUC_PATCHLEVEL=4 -Dsh -DELF -Dsh
 -DELF -Acpu=sh -Amachine=sh -DOPTIMIZE -DSTDC_HOSTED=1
 -DLITTLE_ENDIAN -DSH4_SINGLE_ONLY -DSDL -DLSB_FIRST -DALIGN_LONG
 -DINLINE -DDC -D_arch_dreamcast -ml -m4-single-only -O3
options enabled:  -fdefer-pop -foptimize-sibling-calls -fcse-follow-jumps
 -fcse-skip-blocks -fexpensive-optimizations -fthread-jumps
 -fstrength-reduce -fpeephole -fforce-mem -ffunction-cse -finline-functions
 -finline -fkeep-static-consts -fcaller-saves -freg-struct-return
 -fdelayed-branch -fgcse -frerun-cse-after-loop -frerun-loop-opt
 -fdelete-null-pointer-checks -fschedule-insns2 -fsched-interblock
 -fsched-spec -fbranch-count-reg -freorder-blocks -frename-registers
 -fcommon -fgnu-linker -fregmove -foptimize-register-move -fargument-alias
 -fstrict-aliasing -fident -fpeephole2 -fguess-branch-probability
 -fmath-errno -m1 -m2 -m3 -m3e -m4-single-only -m4-nofpu -ml
When you pass -O3, these are the flags it enables by default.


As you can see, it does not enable -funroll-loops. By adding flags on top of -O3 you can tune your compile for more speed.
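To check what an optimization level enables on a newer toolchain, something like the following may help. It is only a sketch: it assumes a GCC recent enough to support -Q --help=optimizers (the 3.0.4 release in the dump does not have it), and the only_enabled helper is a name made up for this example.

```shell
CC="${CC:-gcc}"   # assumption: point CC at your cross compiler, e.g. sh-elf-gcc

# Print only the flag names that the compiler reports as [enabled].
only_enabled() {
    awk '$2 == "[enabled]" { print $1 }'
}

if command -v "$CC" >/dev/null 2>&1; then
    # Show whether any unroll-related flag is on at -O3.
    "$CC" -O3 -Q --help=optimizers 2>/dev/null | only_enabled | grep unroll \
        || echo "no unroll flags enabled at -O3"
fi
```

Swapping -O3 for another level, or dropping the grep, lists everything that level turns on.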


# -falign-functions -falign-loops -falign-labels -falign-jumps
# -fstrict-aliasing -ffast-math -fomit-frame-pointer \
# -fdelete-null-pointer-checks -funroll-all-loops -fno-optimize-sibling-calls \
# -falign-loops -ffloat-store \
# -frename-registers
# -funroll-all-loops -> Adds about 15% speed performance
# -fno-optimize-sibling-calls -> Adds speed
# -funroll-all-loops (in environ-dc.sh) -> Adds speed
# -fomit-frame-pointer -> small effect?
# -falign-loops -> small improvement

# -falign-labels

# -falign-functions=32 -> no effect ?
# -fssa -> no effect ?
# -fexpensive-optimizations -> no effect
# -mbigtable -> no effect
# -mfmovd -> no effect
# -fno-builtin -> no effect
# -fno-gcse -> no effect
# -falign-jumps -> no effect
# -falign-jumps=32 -> no effect

# -fno-guess-branch-probability -> slower
# -fmove-all-movables (except drawgfx.o) -> slower
# -finline-functions & -finline-limit=10000 -> slows things down some
# -fno-strict-aliasing -> display error?
# -fssa & -fdce -> no display!
# -mrelax -> segmentation fault in compilation
# -freduce-all-givs -> drawgfx won't compile
# -fno-for-scope -fno-delayed-branch -> Fixes pc-rel too far

Try it on your projects: add the combination I found best to your makefile.





This chart is based on a MAME driver.
KOS flags used together with all the others:
-Wall -ml -mbigtable -mnomacsave -m4-single-only -pipe = faster compiling and better memory management. No speed-up to the main emulation at all. These set up the CPU mode and FPU mode; -pipe makes compiling faster, with no speed-up other than that.


Optimization levels with no other flags. This is the standard GCC optimization level test.

-O9 -> speed = 27 fps, bin size 1922 KB (1.9 MB), this is larger
-O8 -> speed = 27 fps, bin size 1877 KB (1.8 MB), same size as the levels below
-O7 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O6 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O5 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O4 -> speed = 26 fps, slower. Strange, as the GCC docs say there is only -O3
-O3 -> speed = 28 fps. Again, why is this faster than -O9 if -O9 is not a real level?
-O2 -> speed = 27 fps, 1 fps slower than -O3
-O1 -> speed = 24 fps. This shows -O2 and -O3 at least work.
-Os -> speed = 26 fps, slower by 1 fps but 1.722 MB bin size. That can make or break loading a large ROM.
-O0 -> speed = 13 fps. Ouch!
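GCC documents levels only up to -O3, and higher numbers are expected to behave the same. One way to check this on your own toolchain is to compile a small file at two levels and compare the assembly. This is a rough sketch: t.c and the same_codegen helper are made-up names, and a single tiny function will not show every difference a full MAME build could.

```shell
CC="${CC:-gcc}"   # assumption: set CC=sh-elf-gcc (or similar) for a Dreamcast toolchain
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# A tiny test function; any C file from your project would do instead.
cat > "$workdir/t.c" <<'EOF'
int sum(const int *v, int n)
{
    int i, s = 0;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}
EOF

# Succeeds when two optimization levels emit identical assembly for t.c.
same_codegen() {
    "$CC" "$1" -S -o "$workdir/a.s" "$workdir/t.c" &&
    "$CC" "$2" -S -o "$workdir/b.s" "$workdir/t.c" &&
    cmp -s "$workdir/a.s" "$workdir/b.s"
}

if command -v "$CC" >/dev/null 2>&1; then
    for level in -O2 -O4 -O9; do
        if same_codegen -O3 "$level"; then
            echo "$level: same assembly as -O3"
        else
            echo "$level: differs from -O3"
        fi
    done
else
    echo "$CC not found; skipping"
fi
```

If -O4 and above come out identical to -O3, fps differences between them in a chart like this are more likely run-to-run variation than a real compiler effect.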


Optimization flag settings




-funroll-all-loops -> added almost 150 KB of bloat, ROM will not load
-fschedule-insns2 -> small bin size, no speed-up at all but smoother fps
-fstrict-aliasing -> added 1 to 2 fps, worth using, but less smooth (jerky)
-fexpensive-optimizations -> speed went from 28 fps to 25, 2 fps loss
-fomit-frame-pointer -> smaller bin size, 1 fps loss. This is a shock!


The best setting ended up being:

-O4 -fomit-frame-pointer -ffast-math -fno-optimize-sibling-calls

This is for a MAME driver, but it can be of use elsewhere.

Normal project:

-O3 -fno-for-scope -fno-delayed-branch -fno-optimize-sibling-calls -funroll-all-loops -fschedule-insns2 -fexpensive-optimizations -fomit-frame-pointer -fstrict-aliasing -ffast-math

Gains I found were up to 5% to 10%.
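One possible way to wire the normal project flags into a KOS-style build is to export them from the environment script. This is a sketch under assumptions: it assumes the makefiles read their C flags from KOS_CFLAGS (check your own environ-dc.sh), the BASE_FLAGS and TUNE_FLAGS names are made up here, and the optimization level is written as -O3 with a capital letter O, which is how GCC spells it.

```shell
# Baseline KOS flags quoted in the post (CPU/FPU mode, warnings, -pipe).
BASE_FLAGS="-Wall -ml -mbigtable -mnomacsave -m4-single-only -pipe"

# The "normal project" tuning flags from the post, at -O3 (capital letter O).
TUNE_FLAGS="-O3 -fno-for-scope -fno-delayed-branch -fno-optimize-sibling-calls \
-funroll-all-loops -fschedule-insns2 -fexpensive-optimizations \
-fomit-frame-pointer -fstrict-aliasing -ffast-math"

# Assumption: the project's makefiles pick their C flags up from KOS_CFLAGS.
export KOS_CFLAGS="$BASE_FLAGS $TUNE_FLAGS"

echo "$KOS_CFLAGS"
```

Keeping the tuning flags in their own variable makes it easy to drop one flag at a time and re-measure fps on hardware, which is how the chart was built.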

Hope someone finds it useful.

Ian Micheal

Re: Update compiler flag args to speed up Dreamcast projects

Post#2 » Mon Jul 12, 2021 7:16 am

This is for developers, so you don't have to do it all yourself. All of these were tested on hardware, not emulators, which are useless for this. You will see up to a 4 fps improvement on your projects most of the time.

I did all these tests back in the day so you don't have to, and yes, -Os can be faster in some use cases if your code is not ordered.
