FS#11666 - codeclib arm hand-tuned assembly in fft-ffmpeg doesn't work on Android/-PIC builds
Opened by Dave Hooper (stripwax) - Monday, 11 October 2010, 23:53 GMT
Last edited by Dave Hooper (stripwax) - Friday, 12 November 2010, 10:19 GMT
See mailing list thread entitled "Segfault with FasterMDCT patch and -fPIC" for more details.
The hand-rolled asm in fft-ffmpeg_arm.h uses a combination of preprocessor macros and individually named registers to make as much use of stm/ldm as possible (which requires register ordering constraints) without writing all of the asm manually. In theory this still leaves gcc free to make register allocations, local optimisations, etc.
However, it doesn't seem to work very well. In particular, with -fPIC in a regular ARM environment gcc reserves r10 for its own use (as the PIC register), and the current code doesn't seem to let gcc know that I am also using r10. As a result, gcc doesn't preserve r10 when it ought to, resulting in data aborts.
This patch serves three purposes:
1. Remove a number of the preprocessor macros in fft-ffmpeg_arm.h, in favour of static inlines, hopefully resulting in increased readability
2. Propagate address increments outside the scope of the local function (by simply returning the updated address pointer) for a slight reduction in stacking/unstacking and pointer arithmetic (observed through disassembly)
3. Reduce the number of manually-selected specific register allocations. In particular, the register used for "t1" in the TRANSFORM_xxx macros does not need to be manually allocated if we make some small rearrangements. This also means the code no longer needs to specifically reference r10, which hopefully means gcc will track the PIC register correctly now.
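To illustrate points 1 and 2, here is a minimal sketch in plain C (not ARM asm, and not the code from the patch) of a static inline that returns the advanced pointer instead of a macro that bumps the caller's pointer as a side effect. The type cplx and the functions butterfly_step and butterfly_pass are hypothetical names made up for this example, and the butterfly itself is a simple radix-2 add/subtract chosen only to give the function something to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a complex sample; not the codeclib type. */
typedef struct { int re, im; } cplx;

/*
 * One add/subtract butterfly on z[0] and z[1].
 * Instead of advancing the caller's pointer through a macro, the
 * function returns the updated address, so the caller can keep the
 * pointer in a register across calls once this is inlined.
 */
static inline cplx *butterfly_step(cplx *z)
{
    int t_re = z[0].re;   /* temporaries are ordinary locals, so the */
    int t_im = z[0].im;   /* compiler is free to pick their registers */

    z[0].re = t_re + z[1].re;
    z[0].im = t_im + z[1].im;
    z[1].re = t_re - z[1].re;
    z[1].im = t_im - z[1].im;

    return z + 2;         /* propagate the address increment outwards */
}

/* Runs butterfly_step over n elements (n must be even); returns the end. */
static cplx *butterfly_pass(cplx *z, size_t n)
{
    assert(n % 2 == 0);

    cplx *end = z + n;
    while (z < end)
        z = butterfly_step(z);
    return z;
}
```

Because the temporaries are plain locals rather than named registers, nothing in this form refers to a specific register such as r10; whether the real asm can be arranged the same way depends on the stm/ldm ordering constraints mentioned above.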
Friday, 12 November 2010, 10:19 GMT
Reason for closing: Accepted
Additional comments about closing: submitted in r28262