Golden Sun Hacking Community
13, April, 2021, 07:17:35 AM *
Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length
News: Be sure to check out gsmagic, a WIP tool that has map editing capabilities!
 
  Home   Forum   DC Wiki Help Search Calendar Downloads Login Register  
Pages: [1]   Go Down
  Print  
Author Topic: Camelot's C compiler  (Read 948 times)
0 Members and 3 Guests are viewing this topic.

New User


Coins: 1
Offline Offline

Gender: Male
Posts: 8

« on: 11, January, 2021, 03:55:23 AM »

Has anyone done any research into identifying or reproducing Camelot's C compiler?

I've been looking at agbcc, which I guess is supposed to be a reproduction of (or at least very close to) Nintendo's actual toolchain (apparently Cygnus GNUPro). The output is similar enough that I'm willing to believe Camelot's compiler is also a gcc, but there are some major differences too. These same patterns occur in Camelot's other games as well. I have not looked at many non-Camelot games, but the few I checked all look much more like agbcc.

I. Calls clobber r4

Camelot's code actually departs from the ARM standard by making r4 a call-clobbered (caller-save) register. This actually makes me think they modified the compiler themselves (or engaged Cygnus to do it for them) as I haven't found any commercial compiler offering this ability. Getting gcc/agbcc to do this is a trivial change in the Thumb config.

Code:
--- a/gcc/thumb.h
+++ b/gcc/thumb.h
@@ -405,7 +405,7 @@
 #define CALL_USED_REGISTERS    \
 {                              \
   1,1,1,1,                     \
-  0,0,0,0,                     \
+  1,0,0,0,                     \
   0,0,0,1,                     \
   1,1,1,1,1                    \
 }

II. Register allocation order

I think it's fairly clear from reading the code that the low registers are allocated in reverse: r3, r2, r1, r0. This actually looks suspiciously similar to what agbcc's ARM compiler does. Prioritizing ip (r12) looks similar too, but it's not a straightforward comparison because r12 is a high register.

Code:
/* The order in which register should be allocated.  It is good to use ip
   since no saving is required (though calls clobber it) and it never contains
   function parameters.  It is quite good to use lr since other calls may
   clobber it anyway.  Allocate r0 through r3 in reverse order since r3 is
   least likely to contain a function parameter; in addition results are
   returned in r0.
   */
#define REG_ALLOC_ORDER      \
{                                   \
     3,  2,  1,  0, 12, 14,  4,  5, \
     6,  7,  8, 10,  9, 11, 13, 15, \
    16, 17, 18, 19, 20, 21, 22, 23, \
    24, 25, 26     \
}

agbcc's Thumb compiler does not define REG_ALLOC_ORDER at all, so the registers are just allocated in order.

III. Instruction scheduling

This one is harder to explain, or might be evidence that the compiler is not gcc/agbcc based.

agbcc's Thumb compiler does not support delay slots or function units for instruction scheduling. However, Camelot's compiler clearly has something of the sort. Here's a simple example, GS1 at 0x08004458:

Code:
ldr r1, =0x03001CB4
ldr r3, =0x41C64E6D
ldr r2, [r1]
add r0, r2, #0
mul r0, r3
ldr r3, =12345
add r0, r0, r3
str r0, [r1]
lsl r0, r0, #8
lsr r0, r0, #16
bx lr

My attempt at the corresponding C code, and agbcc's output (Compiler Explorer -- edited to inline the constants):

Code:
extern volatile unsigned rng_state;

unsigned short random(void) {
unsigned new_state = rng_state * 0x41C64E6D + 12345;
rng_state = new_state;
return new_state >> 8;
}

Code:
ldr r2, =0x03001CB4
ldr r1, [r2]
ldr r0, =0x41C64E6D
mul r0, r0, r1
ldr r1, =12345
add r0, r0, r1
str r0, [r2]
lsl r0, r0, #8
lsr r0, r0, #16
bx lr

The register allocation is reversed like I mentioned above (r2 → r1, r1 → r2, r0 → r3). Ignoring that, the interesting thing is the order of the first three instructions. You see that Camelot's compiler pipelined a second LDR while waiting for the first, but agbcc just lets the stall happen. This is a tiny example, but the same pattern exists throughout. Here's GS1 at 0x0808B25C:

Code:
push {r5, r6, lr}
ldr r2, =0x02000240
mov r3, #224
mov r12, r2
lsl r3, r3, #1
ldr r4, =0x0809E270
add r3, r12
mov r2, #0
ldrsh r0, [r3, r2]
ldmia r4!, {r2}

It consistently tries to insert at least one instruction (and I think it prefers two) after any load, before using the result. I don't think it's aware of memory regions -- it looks like it uses the same delay no matter whether the load is from IRAM, ERAM, or ROM.

IV. Strange literal pools

The above all make sense as performance improvements. This last one has me a bit stumped.

Usually a literal pool holds the literals in the order they're used. In some cases, though, a few literals are stored at the beginning of the pool instead. These early ones are also strange because sometimes they hold small numbers which could be encoded as immediates or synthesized -- sometimes even zero.

Here's an example from GS1 at 0x0800C1BA:

Code:
add r2, r6, #0
ldr r3, [pc, #4]
add r2, #84
strb r3, [r2, #0]
b 0x0800C230
.word 0
.word 0x0FFF

There's no need for a literal here -- mov r3, #0 is perfectly valid Thumb. The zero is also out of order: the second literal is referenced earlier, at 0x0800C172.

Using literal values this way has to be bad for performance. In this specific case the pool is nearby, but in most other cases it falls well outside of prefetch range (8 opcodes according to endrift). I assume, then, that these are compiled as placeholders, and are filled in by something else in the toolchain. The linker doesn't usually fill in values inside functions as far as I know, but if they modified the compiler, they could have modified the linker too...



Has anyone else done research in this area? Please post links!
Logged
View Profile
Salanewt
His Sexiness
Global Moderator

Oh yeah, baby!

Prodigy
*

Coins: 204
Offline Offline

I am: A part of the organization of Cool Cats, but more of a dog person in reality.
Emblems: Have a nice day.
Posts: 4613

« Reply #1 on: 11, January, 2021, 11:44:45 PM »

Hey, it's very nice to see some research happening in this area!

I can't say I'm an expert on any of this sadly, so I can't really answer too many questions but I feel like GS2 is subtly different from the first for a lot of this. Have you looked into it at all?
Logged

Oh yeah baby, £ me harder.

Fusion is just a cheap tactic to make weak Adepts stronger.

Yoshi's Lighthouse is a hacking website in progress. Why not check it out if you like Yoshi or the Mario & Luigi games?
View Profile WWW

New User


Coins: 1
Offline Offline

Gender: Male
Posts: 8

« Reply #2 on: 12, January, 2021, 02:20:44 AM »

I feel like GS2 is subtly different from the first for a lot of this. Have you looked into it at all?

That's a very good question, and no, I've only skimmed GS2 at a very high level, haven't started disassembling it systematically yet.

Guessing from the indirect call tables (which are another mystery of their own), it looks like GS1's 0x08004458 is at 0x08014878 in GS2 (stub at 0x080000F8 for both). The code is almost, but not quite identical:

Code:
ldr r1, =0x030011BC
ldr r3, =0x41C64E6D
ldr r2, [r1]
add r0, r2, #0
mul r0, r3
mov r3, #0xC0
lsl r3, r3, #6
add r3, #0x39
add r0, r0, r3
str r0, [r1]
lsl r0, r0, #8
lsr r0, r0, #16
bx lr

That's interesting indeed, the second constant (the increment) is still 0x3039 (12345), but now it's synthesized instead of loaded. That looks like a performance improvement to me.

I couldn't find my second example in GS2 - the GS1 code I pasted is from the function that determines the battle background based on the current area, if you know where that is.

So, a different comparison. Here's the beginning of AgbMain, in GS1 it's at 0x08002E00:

Code:
push {r5, lr}
ldr r2, =0x40000B0
ldr r3, =0xC5FF
ldrh 1, [r2, #10]
and r3, r1
strh r3, [r2, #10]
ldr r3, =0x7FFF
ldrh r1, [r2, #10]
and r3, r1
strh r3, [r2, #10]
ldrh r3, [r2, #10]
sub sp, #4
ldr r2, =0x4014
ldr r3, =0x04000204
strh r2, [r3]
mov r0, sp
mov r5, #0
mov r1, #0xC0
str r5, [r0]
ldr r3, =0x040000D4
lsl r1, r1, #18
ldr r2, =0x85001E00
stmia r3!, {r0, r1, r2}
sub r3, #12

And the same code in GS2, at 0x0801319C:

Code:
push {r5, lr}
mov r2, #0x80
lsl r2, r2, #19
add r2, #0xB0
ldrh r1, [r2, #10]
mov r3, #0xC5
lsl r3, r3, #8
add r3, #0xFF
and r3, r1
strh r3, [r2, #10]
mov r3, #0xFE
ldrh r1, [r2, #10]
lsl r3, r3, #7
add r3, #0xFF
and r3, r1
strh r3, [r2, #10]
sub sp, #4
ldrh r3, [r2, #10]
mov r2, #0x80
ldr r3, =0x04000204
lsl r2, r2, #7
add r2, #20
strh r2, [r3]
mov r0, sp
mov r3, #0x80
mov r5, #0
lsl r3, r3, #19
mov r1, #0xC0
str r5, [r0]
add r3, #0xD4
lsl r1, r1, #18
ldr r2, =0x85001E00
stmia r3!, {r0, r1, r2}
sub r3, #12

The actual values are the same: (0x80<<19)+0xB0 is 0x040000B0, (0xC5<<8)+0xFF is 0xC5FF, and so on. But GS2 is definitely more aggressive about synthesizing the constants! I like it.

That function also has an example of one of the "strange literals" I mentioned. GS1, the value 0x0140 is at 0x08002E90 and the ldr is at 0x08002E54. GS2, the value is at 0x08013244 and the ldr is at 0x08013204. Both of them, the literal is at the beginning of the pool even though its ldr is later in the sequence.

Looks like all the other patterns I noticed are still there: r4 is still caller-save, r0-r3 are still allocated in reverse, loads are still delayed by ~2 instructions.

Any other interesting differences?
« Last Edit: 12, January, 2021, 02:25:33 AM by tarpman » Logged
View Profile
Daddy Poi's Oily Gorillas
Fox McCloud, the Hacking Doctor
Mercury Clan

Prodigy
*

Coins: 1002
Offline Offline

I am: certainly not a Gallant!
Clan Position: Head Gallant
Posts: 2495

« Reply #3 on: 12, January, 2021, 03:30:25 AM »

There are still some functions that push/pop r4... such as from 08016E0C in GS2. (Relating to Save Game mechanics)

And also same for the audio section. (081C0000) ... We know Camelot uses the Sappy Engine... The code here looks kinda weird at times, though.... so it could have been compiled separately. (Kinda like how DLLs are.)
« Last Edit: 12, January, 2021, 03:35:44 AM by Daddy Poi's Oily Gorillas » Logged

Golden Sun Docs: Broken Seal - The Lost Age - Dark Dawn | Mario Sports Docs: Mario Golf & Mario Tennis | Misc. Docs
Refer to Yoshi's Lighthouse for any M&L hacking needs...

Sometimes I like to compare apples to oranges. (Figuratively) ... They are both fruits, but which one would you eat more? (If taken literally, I'd probably choose apples.)
Maybe it is over-analyzing, but it doesn't mean the information is useless.


The only GS Discord servers with significance are:
Golden Sun Hacking Community
GS Speedrunning
/r/Golden Sun
GS United Nations
Temple of Kraden

Can you believe how small the Golden Sun Community is?

2+2=5 Don't believe me? Those are rounded decimal numbers. Take that, flat earth theorists! :)
View Profile

New User


Coins: 1
Offline Offline

Gender: Male
Posts: 8

« Reply #4 on: 13, January, 2021, 01:30:34 AM »

There are still some functions that push/pop r4... such as from 08016E0C in GS2. (Relating to Save Game mechanics)

And also same for the audio section. (081C0000) ... We know Camelot uses the Sappy Engine... The code here looks kinda weird at times, though.... so it could have been compiled separately. (Kinda like how DLLs are.)

Yeah - thanks for bringing that up. The audio code doesn't exhibit any of the Camelot patterns, so I suspect you're right about it being built separately. Some of it could be handwritten assembly, but I don't think all of it is. I understand the sound library is third-party and shared with other games, so I wonder if it might have been provided as pre-built libraries instead of source code... I haven't been into the save/load code yet, so can't speak to that part.
Logged
View Profile
Pages: [1]   Go Up
  Print  
 
Jump to:  

Chatbox
28, March, 2021, 06:10:33 PM
Salanewt: Tiny announcement down here that a new animation pack is now available for download! The Animation Overhaul is required; see the downloads page for more details.
03, March, 2021, 03:16:30 AM
Daddy Poi's Oily Gorillas: I'm happy ACE is possible and easy in GS1. :D Thank you Tret. Now if we can get GS2 Air's Rock to do ACE, wouldn't that be something.
18, February, 2021, 03:25:26 PM
Salanewt: The one here has everything: http://forum.goldensunhacking.net/index.php?action=downloads;sa=view;down=132
18, February, 2021, 02:03:20 PM
VardenSalad: It looks like Atrius' editor download is broken again. The only thing in the .zip is the background image
13, February, 2021, 03:15:41 PM
MaxiPower: Welcom back Charon. been a long time.
11, February, 2021, 09:40:04 PM
Daddy Poi's Oily GorillasSo how long is this site going to be "This website is still under construction, please excuse the mess." :D
04, February, 2021, 12:56:21 AM
charon the ferryman: LMAO i love 2010 tech
04, February, 2021, 12:56:07 AM
charon the ferryman: release me from my discord prison meme mortals
15, November, 2020, 03:45:26 PM
Salanewt: We're more alive than the board implies; most discussion takes place over on the Discord server.
15, November, 2020, 03:45:06 PM
Salanewt: Haha, heya!
15, November, 2020, 02:07:50 PM
Frog: Nice to see this place is still alive...ish anyway xD
30, October, 2020, 11:10:23 PM
Salanewt: ... Actually, it's almost time for me to post an update on the forum. It's been coming along really nicely overall, and it was only a couple months or so ago that I added some new status effects to the game.
30, October, 2020, 11:08:41 PM
Salanewt: I need to get it done to start making progress on the parts of the AI Overhaul I want to start working on.
30, October, 2020, 11:08:14 PM
Salanewt: It's going well! I wanted to get more of my djinn reform done this week but I've been busy with other things sadly.
30, October, 2020, 09:58:14 PM
KyleRunner: Hi guys! How are you doing?
13, October, 2020, 11:19:21 AM
MaxiPower: first chapter of my golden sun inspired webtoon is now live. be cool if yas could check it out. https://www.webtoons.com/en/challenge/legacy-of-the-lost-dragoon/the-northern-reaches/viewer?title_no=508487&episode_no=6
31, August, 2020, 08:40:32 AM
Daddy Poi's Oily Gorillas: Being able to convert IPS files to text files to also supporting commenting, and convert back to IPS after changes.... sounds like a fun strategy? 
27, August, 2020, 10:37:30 PM
Salanewt: Hey, thanks! 
25, August, 2020, 12:59:22 AM
Foreclosure: Yo! Looking forward for this AI OVerhaul patch when it's finished :) :) 
19, August, 2020, 07:00:17 PM
Salanewt: That's awesome! I'm pretty good; been working on an AI Overhaul project and have made some nice progress in the past couple weeks; the goal for tonight is to add Deadbeard to GS2 and then I can move onto something new.
19, August, 2020, 01:14:56 PM
Awec: Ye not bad, just checked back in for updates on GS Reloaded really. Although now I'm looking back through some of my old forum posts, which is fun :D how are you?
19, August, 2020, 01:43:07 AM
Salanewt: It sure is! How have you been?
16, August, 2020, 09:40:57 AM
Awec: Yo nice to see this forum is still going
14, August, 2020, 09:06:44 PM
Salanewt: Hey! How goes it?
01, August, 2020, 01:31:53 PM
KyleRunner: Hello, people!
08, July, 2020, 08:57:11 PM
Salanewt: That awkward feeling when almost a quarter of a year goes by without anyone posting a message in this chatbox.
27, April, 2020, 07:54:10 PM
Daddy Poi's Oily Gorillas: I'm having a thought. So I know Atrius's editor is slow at saving sprites.... So I've been thinking.   I am considering another console program.  This time, for importing and exporting sprites/animations. Image files (for sprites), binary files (for animation data), and maybe binary or text files for Settings. Would certainly be cool if it works out. - Console programs are fun, when you can (at least temporarily) cheat out making a full interface editor for them. =P
25, April, 2020, 10:42:07 PM
Daddy Poi's Oily Gorillas: Tried to list those in order of importance, but anyway....
25, April, 2020, 10:37:59 PM
Daddy Poi's Oily Gorillas: GSHC, GS Speedrunning, /r/Golden Sun, GS United Nations, ToK.  Sixth one is border-line I'm not sure.
25, April, 2020, 08:00:21 PM
Daddy Poi's Oily Gorillas: (low as in <100)

Affiliates
Temple of Kraden Golden Sunrise
Powered by MySQL Powered by PHP Powered by SMF 1.1.21 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!
Page created in 0.08 seconds with 23 queries.