Rockbox mail archive
Subject: Re: Iriver: 48kHz Ogg turns on radio?
From: Pedro Vasconcelos <pbv_at_st-andrews.ac.uk>
Date: Sat, 16 Jul 2005 17:36:18 +0100

On Sat, 16 Jul 2005 15:08:31 +0200
Magnus Holmgren <lear_at_algonet.se> wrote:

> Pedro Vasconcelos wrote:
>
> > The optimizations that I made were fairly straightforward: writing short
> > asm routines for 32-bit arithmetic in the hooks provided and placing
> > some critical arrays in the fast IRAM.
>
> Any idea how much the cosine table is used? It can be made to fit
> (perhaps by throwing out some of the window lookup tables that aren't
> used anyway), but when I tried that, I didn't notice much of a difference.

I tried keeping the most heavily used data in IRAM: the PCM buffer used in
floor synthesis, the sine/cosine tables and the window lookup tables.
Because space is limited, only the most commonly used window sizes (256 and
2048) are kept in IRAM. Also, the Vorbis spec defines two floor encoding
methods, and current encoders use only floor1, so only the floor1 tables are
kept in IRAM. Right now Rockbox Tremor uses almost all of the 32 KB of IRAM
available to a codec, and I can't see much more we could put there that
would make much of a difference.

> > The other difficulty is the lack of profiling on the actual iriver
> > hardware. I have done Tremor profiling on my P4 to get an idea of what
> > were the critical functions, but the Coldfire is very different (cache,
> > pipelines, etc) so it is all guesswork.
>
> Still, that should give an indication over what functions are used a
> lot, to see what to focus on and what to put in IRAM... Have you asked
> on the Tremor mailing list about profiling data (on ARM or ColdFire
> hardware)?
Sure, I have done that profiling, and it does give an indication of the most
relevant functions. Here is the profile obtained from decoding a 2-minute Q6
song:

  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 20.30      4.41     4.41    346878     0.01     0.01  mdct_butterfly_generic
 13.79      7.41     3.00     15778     0.19     0.60  mdct_backward
 10.06      9.59     2.19   6065325     0.00     0.00  decode_packed_entry_number
  9.92     11.74     2.15    453691     0.00     0.01  vorbis_book_decodevv_add
  8.20     13.53     1.78     15778     0.11     0.11  _vorbis_apply_window
  7.09     15.06     1.54   7847603     0.00     0.00  oggpack_look
  5.82     16.33     1.26    164430     0.01     0.01  render_line
  3.98     17.20     0.86     15778     0.05     0.06  mdct_bitreverse
  2.85     17.82     0.62                              vorbis_synthesis_blockin

The mdct functions seem like the best candidates for optimisation, which is
also what the people on the Tremor list suggested. The most promising route
seems to be replacing the MDCT with an FFT, together with some shuffling of
the data, to give a mathematically equivalent answer. From what I gather,
the FFT algorithm is much more regular and can be optimised to use the MAC
pipeline more effectively. But I have no idea how to do this replacement.

Pedro

_______________________________________________
http://cool.haxx.se/mailman/listinfo/rockbox

Received on 2005-07-16
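
The MDCT-to-FFT replacement mentioned above can be illustrated with a small
floating-point sketch. It is not Tremor or Rockbox code: the function names
are hypothetical, the arithmetic is floating point rather than Tremor's
fixed point, and the scaling follows the plain textbook inverse MDCT, which
may differ from Tremor's conventions. For simplicity it uses a zero-padded
FFT of length 2M for M coefficients. Practical implementations usually get
the same result from a much shorter complex FFT, which is not shown here.
The idea is only that a pre-twiddle, an FFT and a post-twiddle reproduce the
direct cosine sums.

```python
import cmath
import math


def imdct_direct(coeffs):
    """Reference inverse MDCT straight from the definition (O(M^2))."""
    m = len(coeffs)
    n0 = 0.5 + m / 2.0
    return [
        sum(c * math.cos(math.pi / m * (n + n0) * (k + 0.5))
            for k, c in enumerate(coeffs))
        for n in range(2 * m)
    ]


def inverse_fft(values):
    """Unnormalised radix-2 inverse DFT; len(values) must be a power of two."""
    size = len(values)
    if size == 1:
        return list(values)
    even = inverse_fft(values[0::2])
    odd = inverse_fft(values[1::2])
    half = size // 2
    out = [0j] * size
    for n in range(half):
        twiddle = cmath.exp(2j * math.pi * n / size) * odd[n]
        out[n] = even[n] + twiddle
        out[n + half] = even[n] - twiddle
    return out


def imdct_via_fft(coeffs):
    """Same result as imdct_direct, computed with one length-2M FFT."""
    m = len(coeffs)  # assumed to be a power of two, as Vorbis block sizes are
    n0 = 0.5 + m / 2.0
    # Pre-twiddle each coefficient, then zero-pad from M to 2M points.
    padded = [c * cmath.exp(1j * math.pi * n0 * k / m)
              for k, c in enumerate(coeffs)]
    padded += [0j] * m
    spectrum = inverse_fft(padded)
    # Post-twiddle and keep the real part.
    return [
        (cmath.exp(1j * math.pi * (n + n0) / (2 * m)) * spectrum[n]).real
        for n in range(2 * m)
    ]
```

Calling both functions on the same power-of-two-length list of coefficients
should give outputs that agree to within floating-point rounding. The FFT
version does its work in regular butterfly passes of complex
multiply-and-add steps, which is the kind of structure that might suit a MAC
pipeline.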