Rockbox

  • Status Closed
  • Percent Complete 100%
  • Task Type Patches
  • Category LCD
  • Assigned To No-one
  • Operating System iPod 5G
  • Severity Low
  • Priority Very Low
  • Reported Version Daily build (which?)
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: Rockbox
Opened by Buschel - 2007-11-02
Last edited by amiconn - 2007-11-26

FS#8075 - 5G LCD speed up

As I am not sure whether my last patch was closed by accident, here is the current update of the LCD optimization for LCD/YUV, based on amiconn's rework.

Changes:
- move the outer loop (--height) into the asm routine for lcd_update_rect()
- combine the two line updates into a single asm routine for lcd_yuv_blit()
- use 32-bit accesses for the chroma buffer; this needs more space but is a lot faster because stm/ldm can be used
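The space/speed trade-off of the 32-bit chroma buffer can be sketched in C roughly as follows. This is only an illustration, not the asm from the patch: the struct name chroma_terms, its field names and the two helper functions are hypothetical, and it assumes three chroma-derived terms are kept per pixel pair. Storing each term as a full word lets a whole entry be written or read as three consecutive words, which is the access pattern stm/ldm serve well.

```c
#include <stdint.h>

/* Hypothetical layout: one entry per horizontal pixel pair, three
 * chroma-derived terms stored as full 32-bit words (12 bytes). */
typedef struct {
    int32_t rv;   /* red contribution derived from V */
    int32_t guv;  /* green contribution derived from U and V */
    int32_t bu;   /* blue contribution derived from U */
} chroma_terms;

/* First line of a line pair: save the terms for later reuse. */
static inline void chroma_store(chroma_terms *slot,
                                int32_t rv, int32_t guv, int32_t bu)
{
    const chroma_terms t = { rv, guv, bu };
    *slot = t;  /* three consecutive word stores */
}

/* Second line of a line pair: reload the saved terms. */
static inline chroma_terms chroma_load(const chroma_terms *slot)
{
    return *slot;  /* three consecutive word loads */
}
```

Compared with byte-sized storage the buffer is larger, but no narrow loads or stores are needed in the inner loop.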

Results (@80MHz):
LCD 1/1 screen: 101fps (same)
LCD 1/4 screen: 399fps (+2%)
YUV 1/1 screen: 28.5fps (+4%)
YUV 1/4 screen: 112.5fps (+4%)

Closed by amiconn
2007-11-26 23:48
Reason for closing: Accepted

I needed to correct one issue that could cause a crash with full-width videos (but not with test_fps, which is kind of strange).

Change:
- calculate the correct length of the chroma buffer (width/2 * 3, not width*3). By the way, the buffer length is also wrong in trunk.
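As a quick check of the buffer length, here is a minimal C sketch. The helper name chroma_buffer_words is hypothetical, and it assumes the buffer holds three 32-bit entries per horizontally subsampled pixel pair, which is how width/2 * 3 reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of 32-bit words in the chroma buffer.
 * Assumes 3 entries per pixel pair, so width/2 pairs per line. */
static size_t chroma_buffer_words(unsigned width)
{
    return (size_t)(width / 2) * 3;
}

/* Same length expressed in bytes. */
static size_t chroma_buffer_bytes(unsigned width)
{
    return chroma_buffer_words(width) * sizeof(uint32_t);
}
```

For a full-width line of 320 pixels this gives 480 words (1920 bytes), where width*3 would give 960 words.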

This one was tested on all available resolutions of Elephants Dream as well as via test_fps.

Just the next speed-up for lcd_yuv_blit().

Changes:
- use pixel packing in each loop iteration (2 pixels) and write them with a single str
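The packing idea, sketched in C rather than asm. The function names are hypothetical, and the halfword order (first pixel in the low 16 bits) is an assumption for a little-endian target; the order the LCD controller actually expects may differ.

```c
#include <stdint.h>

/* Pack two 16-bit pixels into one 32-bit word.
 * Assumption: the first pixel goes into the low halfword. */
static inline uint32_t pack_two_pixels(uint16_t first, uint16_t second)
{
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* Write one row two pixels at a time: one 32-bit store per pixel pair
 * instead of two 16-bit stores. An odd trailing pixel is not handled. */
static void write_packed_row(volatile uint32_t *dst,
                             const uint16_t *src, unsigned width)
{
    for (unsigned x = 0; x + 1 < width; x += 2)
        *dst++ = pack_two_pixels(src[x], src[x + 1]);
}
```

This halves the number of store instructions in the inner loop.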

Results (@80MHz) vs. trunk:
YUV 1/1 screen: 29.5fps (+8%)
YUV 1/4 screen: 117fps (+8%)

Just a new idea: as we now have the ability to set the destination address, we could drop the chroma buffer entirely and write all 4 pixels: 2 pixels for the first line, then update the destination address and do the 2 pixels for the next line. When doing so, there is no need to save the chroma bytes anymore. I'll try to change the code :o)

Dropped this idea. We might save 1x stmia (3*2+1) and 1x ldmia (3*2+1) per 4 pixels, but in exchange we would need to add at least 2x str + 2x ldrh + several movs for setting the LCD registers, as there are not enough ARM registers left…

Put some more details in the comments, and in particular corrected the YUV conversion formula.
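The formula itself is not quoted in this task. For orientation only, here is a sketch of the widely published BT.601 video-range conversion in 8.8 fixed point, reduced to RGB565. The coefficients and function names are assumptions and may differ from what the patch comments state.

```c
#include <stdint.h>

/* Clamp an intermediate result to the 0..255 range. */
static inline int clamp_u8(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Assumed BT.601 video-range conversion, coefficients scaled by 256:
 *   R = 1.164*(Y-16) + 1.596*(Cr-128)
 *   G = 1.164*(Y-16) - 0.391*(Cb-128) - 0.813*(Cr-128)
 *   B = 1.164*(Y-16) + 2.018*(Cb-128)
 * Division is used instead of a right shift so that negative
 * intermediates stay well defined; they clamp to 0 anyway. */
static uint16_t yuv_to_rgb565(uint8_t y, uint8_t cb, uint8_t cr)
{
    const int c = 298 * ((int)y - 16);
    const int d = (int)cb - 128;
    const int e = (int)cr - 128;

    const int r = clamp_u8((c + 409 * e + 128) / 256);
    const int g = clamp_u8((c - 100 * d - 208 * e + 128) / 256);
    const int b = clamp_u8((c + 516 * d + 128) / 256);

    /* 5 bits red, 6 bits green, 5 bits blue */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

The three chroma products (409*e, -100*d - 208*e, 516*d) depend only on Cb and Cr, so they can be computed once per 2x2 pixel block and reused for all four luma samples.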

Removed the changes for lcd_update_rect() and kept the changes for lcd_yuv_blit() as discussed in IRC.

With this patch the YUV blit is about 8% faster than svn:
30MHz → 1/1: 11.0fps / 1/4: 43.5fps
80MHz → 1/1: 29.5fps / 1/4: 117.0fps
