Update Compiler flag's args to speed up dreamcast projects

Ian Micheal

Post#1 » Sat Jan 19, 2019 3:01 am

Hi, I'm sharing this. I tested all of these on lots of projects, and you can get a lot of speed out of a project using these -O flags plus my chart. I spent lots of time on this back in the day.



Most people use -O3 but don't know what it enables. Here's what it enables:


GNU CPP version 3.0.4 (cpplib) (Hitachi SH)
GNU C version 3.0.4 (sh-elf)
        compiled by GNU C version 3.2 20020927 (prerelease).
options passed:  -lang-c -v -I/usr/local/dc/kos-1.1.9/include
 -I/usr/local/dc/kos-1.1.9/libc/include
 -I/usr/local/dc/kos-1.1.9/kernel/arch/dreamcast/include -DGNUC=3
 -DGNUC_MINOR=0 -DGNUC_PATCHLEVEL=4 -Dsh -DELF -Dsh
 -DELF -Acpu=sh -Amachine=sh -DOPTIMIZE -DSTDC_HOSTED=1
 -DLITTLE_ENDIAN -DSH4_SINGLE_ONLY -DSDL -DLSB_FIRST -DALIGN_LONG
 -DINLINE -DDC -D_arch_dreamcast -ml -m4-single-only -O3
options enabled:  -fdefer-pop -foptimize-sibling-calls -fcse-follow-jumps
 -fcse-skip-blocks -fexpensive-optimizations -fthread-jumps
 -fstrength-reduce -fpeephole -fforce-mem -ffunction-cse -finline-functions
 -finline -fkeep-static-consts -fcaller-saves -freg-struct-return
 -fdelayed-branch -fgcse -frerun-cse-after-loop -frerun-loop-opt
 -fdelete-null-pointer-checks -fschedule-insns2 -fsched-interblock
 -fsched-spec -fbranch-count-reg -freorder-blocks -frename-registers
 -fcommon -fgnu-linker -fregmove -foptimize-register-move -fargument-alias
 -fstrict-aliasing -fident -fpeephole2 -fguess-branch-probability
 -fmath-errno -m1 -m2 -m3 -m3e -m4-single-only -m4-nofpu -ml
When you use -O3, these are the flags it enables by default.


As you can see, it does not enable -funroll-loops. By adding flags to -O3 you can tune your compile for more speed.
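If you want to check this list on your own toolchain, here is a rough sketch. It assumes a newer GCC (4.3 or later), where `-Q --help=optimizers` exists; the 3.0.4 compiler shown above predates that option. The compiler name `sh-elf-gcc` and the helper `enabled_flags` are just assumptions for illustration, so adjust them to your setup.

```shell
#!/bin/sh
# Keep only the flag names that the compiler reports as enabled.
# Reads the output of `-Q --help=optimizers` on stdin.
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }'
}

# Assumption: the cross compiler is called sh-elf-gcc; override with CC=...
CC="${CC:-sh-elf-gcc}"

if command -v "$CC" >/dev/null 2>&1; then
    # Needs GCC 4.3 or newer.
    "$CC" -Q --help=optimizers -O3 | enabled_flags
else
    echo "compiler '$CC' not found; set CC to your cross compiler" >&2
fi
```

Run it once with -O2 and once with -O3 and compare the two lists to see what the higher level really adds.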


# -falign-functions -falign-loops -falign-labels -falign-jumps
# -fstrict-aliasing -ffast-math -fomit-frame-pointer \
# -fdelete-null-pointer-checks -funroll-all-loops -fno-optimize-sibling-calls \
# -falign-loops -ffloat-store \
# -frename-registers
# -funroll-all-loops -> Adds about 15% speed performance
# -fno-optimize-sibling-calls -> Adds speed
# -funroll-all-loops (in environ-dc.sh) -> Adds speed
# -fomit-frame-pointer -> small effect?
# -falign-loops -> small improvement

# -falign-labels

# -falign-functions=32 -> no effect ?
# -fssa -> no effect ?
# -fexpensive-optimizations -> no effect
# -mbigtable -> no effect
# -mfmovd -> no effect
# -fno-builtin -> no effect
# -fno-gcse -> no effect
# -falign-jumps -> no effect
# -falign-jumps=32 -> no effect

# -fno-guess-branch-probability -> slower
# -fmove-all-movables (except drawgfx.o) -> slower
# -finline-functions & -finline-limit=10000 -> slows things down some
# -fno-strict-aliasing -> display error?
# -fssa & -fdce -> no display!
# -mrelax -> segmentation fault in compilation
# -freduce-all-givs -> drawgfx won't compile
# -fno-for-scope -fno-delayed-branch -> fixes "pc-rel too far"

Try it on your projects: add the combination I found best to your makefile.





This chart is based on a MAME driver.
KOS flags used with all the others:
-Wall -ml -mbigtable -mnomacsave -m4-single-only -pipe = faster compiling and better memory management. No speed-up to the main emulation at all. These set up the CPU mode and FPU mode; -pipe makes compiling faster, with no speed-up other than that.


Optimization levels with no extra flags. This is a test of the standard GCC optimization levels:

-O9 -> speed = 27 fps, bin size 1922 KB (1.9 MB); this is larger
-O8 -> speed = 27 fps, bin size 1877 KB (1.8 MB); same size as the levels below
-O7 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O6 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O5 -> speed = 27 fps, bin size 1877 KB (1.8 MB)
-O4 -> speed = 26 fps; slower. Strange, as the GCC docs say there is only -O3
-O3 -> speed = 28 fps; again, why is this faster than -O9 if -O9 is not a real level?
-O2 -> speed = 27 fps; 1 fps slower than -O3
-O1 -> speed = 24 fps; this shows -O2 and -O3 at least work
-Os -> speed = 26 fps; slower by 1 fps, but the bin size is 1.722 MB. That can make or break loading a large ROM.
-O0 -> speed = 13 fps. Ouch!
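
To put the fps numbers above in relative terms, here is a small sketch that expresses one build's speed as a percentage of a reference build. The helper name `pct_of` is made up for this example; the inputs are the fps values from the chart.

```shell
# Speed of one build as a percentage of a reference build.
# Usage: pct_of <fps> <reference_fps>
pct_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b * 100 }'
}

pct_of 13 28   # -O0 vs -O3, prints 46.4
pct_of 24 28   # -O1 vs -O3, prints 85.7
```

So on this driver -O0 runs at less than half the speed of -O3, while -O1 already gets most of the way there.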


Optimize Flag settings




-funroll-all-loops -> added almost 150 KB of bloat to the bin size; the ROM will not load
-fschedule-insns2 -> small bin size, no speed-up at all, but smoother fps
-fstrict-aliasing -> added 1 to 2 fps, worth using, but less smooth (jerky)
-fexpensive-optimizations -> speed dropped from 28 fps to 25 fps, a loss of roughly 2 to 3 fps
-fomit-frame-pointer -> smaller bin size, 1 fps loss; this is a shock!


The best setting ended up being:

-O4 -fomit-frame-pointer -ffast-math -fno-optimize-sibling-calls

This is for a MAME driver, but it can be of use elsewhere.

For a normal project:

-O3 -fno-for-scope -fno-delayed-branch -fno-optimize-sibling-calls -funroll-all-loops -fschedule-insns2 -fexpensive-optimizations -fomit-frame-pointer -fstrict-aliasing -ffast-math

I found gains of 5% to 10%.
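
One possible way to wire the normal-project flags into a build is to append them to `KOS_CFLAGS` in your environment script (the notes above mention environ-dc.sh). This is only a sketch: the variable name `TUNED_FLAGS` is made up here, and it assumes your makefiles pick up `KOS_CFLAGS` from the environment.

```shell
# Flag set listed above for a normal project (TUNED_FLAGS is a hypothetical name).
TUNED_FLAGS="-O3 -fno-for-scope -fno-delayed-branch -fno-optimize-sibling-calls \
-funroll-all-loops -fschedule-insns2 -fexpensive-optimizations \
-fomit-frame-pointer -fstrict-aliasing -ffast-math"

# Append to whatever KOS_CFLAGS already holds, then export it for make.
KOS_CFLAGS="${KOS_CFLAGS:+$KOS_CFLAGS }$TUNED_FLAGS"
export KOS_CFLAGS

echo "$KOS_CFLAGS"
```

Do a clean rebuild after changing the flags and measure again, since the chart shows the same flag can help one project and hurt another.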

Hope someone finds it useful.
https://www.youtube.com/channel/UCeVCRA ... whHyKp6OsA my youtube channel of my projects running :)

https://discord.gg/ZHb4rCq discord for my projects :)

https://twitter.com/IanMicheal10 Reach me on twitter:)
