Rockbox

  • Status Closed
  • Percent Complete
    100%
  • Task Type Patches
  • Category LCD
  • Assigned To No-one
  • Operating System All players
  • Severity Low
  • Priority Very Low
  • Reported Version Daily build (which?)
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: Rockbox
Opened by idak - 2008-10-05
Last edited by Unhelpful - 2008-12-26

FS#9458 - resize bitmap on load with smooth scaling

This patch can resize bitmap on load with smooth scaling.
This is quick and dirty hack,
but I think this is good start point to do this…

The task blocks these from closing
ID Project Summary Priority Severity Assigned To Progress
9665 Rockbox  FS#9665 - pictureflow resize-on-load  Very Low Low
100%
9683 Rockbox  FS#9683 - pluggable scaler output for greylib  Very Low Low
100%
9689 Rockbox  FS#9689 - greylib pictureflow for greyscale targets  Very Low Low
100%
Closed by  Unhelpful
2008-12-26 08:02
Reason for closing:  Accepted
Additional comments about closing:   Warning: Undefined array key "typography" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 371 Warning: Undefined array key "camelcase" in /home/rockbox/flyspray/plugins/dokuwiki/inc/parserutils.php on line 407

Committed as r19592.

Project Manager

Some remarks:

- Comments in the code explaining what things do

- There are many > 80 columns lines

- There are // comments

- There are some #if 0 lines

In general I think it looks a fine effort to really add this feature “the right way”!

With this added, would it be possible to remove the current load bmp function? It seems like a waste to have two…

Akio,

You describe this as “a good start”. Are you continuing to work on it? What improvements are you planning?

As Daniel said, my first reaction was also that we probably won’t want two read_bmp functions.

My other reaction was the static “cache” buffer. I haven’t looked at the code in detail, but am wondering if it would make sense to have three “unsigned char[]” arrays (one each for R, G and B), rather than a single 32-bit buffer. This would cut the size of the array by 25%.

But I guess that’s what we’re looking for in general - optimisations to try and both simplify the code, and reduce the memory requirements (for both code and cached line data).

Dave.

idak commented on 2008-10-09 16:08

Thank you for some comments.
# Yes, old one is my first quick hack.

First, I merged those two functions into one.

idak commented on 2008-10-18 15:53

Finally, I add 2bpp-LCDs resize function.
I think smooth scaling algorithm is overkill for 2bpp-LCDs.
So I use simple scaling algorithm (nearest neighbour).

That’s all I’m planning…

idak commented on 2008-10-25 15:02

1) separate resize algorithm into another file.
2) use three “unsigned char[]” arrays (one each for R, G and B)
instead of a single 32-bit buffer.
#but I use a single 32-bit buffer to read one line…

Firstly, I think the patch is looking good - separating the resize functionality into other files is a good idea.

However, I also think we need to make this patch as efficient as possible before it can be committed. I’m also thinking that maybe it needs to be committed in smaller parts, to make it easier to review (and to justify all the changes).

I’ve just tried compiling this patch for the ipod color - it adds 5024 bytes to the binary size (i.e. code and initialised data), and a total
of 17056 bytes of RAM usage. (taking the numbers from the rockbox-info.txt file created with current SVN and SVN with this patch).

Looking at the code, I can see the following buffers:

uint32_t bmpbuf[MAX_WIDTH] - this used to be on the stack (and hence in IRAM - meaning faster access on some targets). IIUC, MAX_WIDTH is LCD_WIDTH for mono targets, and MAX(500,LCD_WIDTH) for other targets. This means a maximum of 2000 bytes for most targets.

You also have 5 cache lines, with each of those lines being an array of MAX_LINE 3-byte structs, plus a 16-bit y value - meaning 5 * (3 *
MAX_WIDTH + 2) bytes = a maximum of 7510 bytes.

Adding these up, I get 2000+7510+5024 = 14534, so I’m not sure where the other 2522 bytes have gone….

A MAX_WIDTH of 500 seems too small to me - I would expect people to want to use larger bitmaps, and if we support resizing, I think we should try and support any bmp size. I also have some album art which is just slightly more than 500 pixels (501 or 502). So I would want 512 as a minimum. My thoughts are that this should be 2*LCD_WIDTH, and then for any images larger than that, simply skip pixels in the horizontal direction when reading a line in order to reduce the size to LCD_WIDTH*2. This obviously isn’t ideal, but I think that a non-optimal resize for very large images is better than simply rejecting them.

Also, why is the value 5 chosen for the cache size? The comment in the code says “must be > 2”, so why isn’t it 3?

A general comment on the code - I see that you’re using “short” in a lot of places This often has the effect of increasing memory usage, rather than decreasing it - especially when these variables are local (i.e. allocated on the stack). Whereever possible, just use “int”, as by definition, that’s the easiest integer type for the target CPU to deal with.

Looking at just the changes to bmp.c, a major change appears to be to change the format of the bitfields[] array. This seems to be adding a lot of extra code - are there savings to be made elsewhere by changing this format? If not, then I would leave it as it was.

I would like to see this committed, and my approach would be to commit in smaller parts as follows:

1) Refactor bmp.c to incorporate all the changes required for resizing, but without actually adding the resizing itself. This patch should have negligible impact on binary size and memory usage, and the resizing patch itself should then only make minimal changes to this file.

2) Add the resizing feature, but without using it in the core - so we can see the impact of that feature by itself, and if possible, find optimisations.

3) Finally, add the code which uses the new resize feature for album-art (and elsewhere, if applicable).

idak commented on 2008-10-28 16:19

Thank you for a lot of feedback and comments!
I will make this patch into smaller parts.

And… > Also, why is the value 5 chosen for the cache size? The comment in the code
> says “must be > 2”, so why isn’t it 3?
Sorry. This is my complete WRONG comment. The detail is the following.

— “Bilinear” algorithm only needs 2 cache lines.
But in some case (downscaling 1/[integrals]), downscaling with “Bilinear” is the
same as “Nearest Neighbour” (named Simple Resize).
Ex: Downscaling with “Bilinear” and “Nearest Neighbour” from 500×500 to
100×100 are the same, and not so smooth.

In general, downscaling algorithm ranking is as follow.
#Yes, there are a lot of another resize algorithm

GOOD ←———- ————→ BAD
“Lanczos3” »[GREAT WALL]» “Area Sampling” > “Bicubic” >= “Bilinear” >= “Nearest Neighbour”

So, Imlib2 (and old GIMP?) uses “Area Sampling” for downscaling.
# “Area Sampling” algorithm is only for downscaling

I implement “Area Sampling” for downscaling both ways (down x/down y).
# and I implement “bilinear” for (up x/up y), (up x/down y), (down x/up y).
# This implement is different from smooth_resize_bitmap() (and Imlib2).

“Area Sampling” algorithm needs “src/dst” lines.
Ex: Downscaling from 500×500 to 100×100 needs 5 cache lines.

I thought most use-case was from 500×500 to 100×100, so I chose 5 cache lines….

And if cache line is not enough in downscaling with “Area
Sampling”, I use “Bilinear” for downscaling…

idak commented on 2008-11-02 06:09

I made the following resize patches.

(1) Refactoring bmp.[ch]
(2) Adding the resizing feature
(3) Adding the new resize feature for album-art
(4) Adding the new resize feature for sliding puzzle plug-in

idak commented on 2008-11-02 11:00

fix (1) and (2)… ((3) and (4) is the same)

idak commented on 2008-11-05 14:17

fix (2).
(fix downscaling algorithm selection)

Hi, I’m trying to understand the area sampling algorithm.

From my understanding of area sampling, you take several pixels over an area (in this case 5×5?) and then average them. The average is then used as input to some other interpolation algorithm. Since the pixels are averaged, higher frequencies are rejected before interpolation. Is this what you are doing here?

“Bilinear” algorithm only needs 2 cache lines.
>But in some case (downscaling 1/[integrals]), downscaling with “Bilinear” is the
>same as “Nearest Neighbour” (named Simple Resize).
>Ex: Downscaling with “Bilinear” and “Nearest Neighbour” from 500×500 to
>100×100 are the same, and not so smooth.

Does area sampling yield any improvement over bilinear for interpolation by less then a factor of 2? I.E. 150×150 > 100×100? I think in this case there should be no difference?

I thought most use-case was from 500×500 to 100×100, so I chose 5 cache lines….

So this patch is limited to resampling by less then a factor of 5? Still that might be more then enough, since the eventual goal is to integrate this into a JPEG decoder, and JPEG can do optimal factor of 8 (in each dimension) resizing “free”. So 5×8 = 40x reduction. If the target size is 100×100, that would mean jpegs under 4000×4000 would be within that range.

That said, for JPEG users, the most interesting case (assuming we ever get a JPEG decoder anyway) will be resizing by less then a factor of 2.

idak commented on 2008-11-09 10:07
Hi, I’m trying to understand the area sampling algorithm.
Thank you for some comments!
From my understanding of area sampling,
> you take several pixels over an area (in this case 5×5?)
> and then average them.
> The average is then used as input to some other interpolation algorithm.
> Since the pixels are averaged,
> higher frequencies are rejected before interpolation.
> Is this what you are doing here?

No, the average(area_sample()) is used to output image,
and it doesn’t input to some other algorithm.
# “Area sampling” algorithm is one of downscaling algorithm
If you want to read unoptimized “area sampling” algorithm,
please check Imlib2-1.4.0 “src/lib/scale.c:L1489-”.

Does area sampling yield any improvement over bilinear for interpolation
> by less then a factor of 2?
> I.E. 150×150 > 100×100?
> I think in this case there should be no difference?
The output image is different from the other.
But, I think the difference is very small.
So this patch is limited to resampling by less then a factor of 5?
> Still that might be more then enough,
> since the eventual goal is to integrate this into a
> JPEG decoder, and JPEG can do optimal factor
> of 8 (in each dimension) resizing “free”.
> So 5×8 = 40x reduction. If the target size is 100×100,
> that would mean jpegs under 4000×4000 would be within that range.
> That said, for JPEG users, the most interesting case
> (assuming we ever get a JPEG decoder anyway)
> will be resizing by less then a factor of 2.

In new patch,
If “area sampling” cannot use (original width is over 500px),
I select “Nearest Neighbour” or “bilinear” algorithm.
If original width is less than 1250, I use “bilinear”.
And “Nearest Neighbour” can be used under 2500 width.

idak commented on 2008-11-09 10:08

fix (2)
(fix “area sampling” algorithm)

idak commented on 2008-11-11 14:34

One more fix for (2)
(fix cache line size calculation.)

idak commented on 2008-11-17 13:09

fix for (2) again
(fix corner case of aspect ratio calculation)

Here’s a start at my attempt to reduce the memory usage / code size of this a bit. Nothing really changes for non-color targets. Color targets lose the nearest-neighbor scaler, and get a single smooth scaler that uses area downscaling and bilinear upscaling, independently on each axis, for essentially any reasonable input and output size. The attached replaces the 01+02 patches from this set, and works with the existing 03+04 patches. It can be further improved to reduce the raw bitmap cache to less than one whole line, but only after the non-color scalers are reworked to handle that, as well. The interface of the color scalers with the refactored bitmap loader is also not the prettiest. I’ve only tested hardware on gigabeat s, but sim for ipod 4G grey seems to work fine, as well. Any comments or questions welcome, just ask here or address Unhelpful on IRC.

This converts the full-line static read buffer to a partial-line on-stack read buffer. *only* the scaled-color and unscaled cases are covered by this, right now - build on grayscale targets is broken, and special remote format handling on color targets is disabled via a nasty hatchet job.. The rest should not be difficult to convert, only needing the addition of a callback for skipping lines, and some variables to track the skipped pixels in the input buffer.

The horizontal-packed 2-bit greyscale targets are now working, in sim at least - this should cover the greyscale ipods. The nearest-neighbor scaler is called unconditionally on targets that use it, and is designed to be fairly quick even when not scaling. I would would appreciate any reports on color or supported greyscale targets, as I can only test on Sansa E2xx and Gigabeat S. I still need to add yields for slower color targets, where the area scaler might take long enough to stall audio.

Added support for vertical-packed greyscale, works in sim but is again only used on targets I don’t have.

Added support for vertical-interleaved grayscale, works in sim.

Fixed color targets that broke with the first greyscale releases, unified all nearest-neighbor scalers, added yields roughly per-tick, though I doubt the nearest-neighbor scaler will run through multiple ticks often. Works on sims for iaudio m3/m5, ipod4g, iriver h1x0, gigabeat f/x, and on hardware for sansa e200, gigabeat s. archos, and probably other mono targets, still have some difficulties in sim, but appear to function correctly.

Per amiconn’s suggestion on IRC, scaling is disabled completely on mono targets, which go back to a direct-output case in bmp.c, modified only to use the partial-line bmp reader.

idak commented on 2008-11-27 13:12
Unhelpful
Thank you for a lot of your works.
I am not so good at resize algorithm.
So please change as you like it.

And, I found some corner-case bugs.

(1) 15bit depth
If bitmap depth is 15bit,
you must change depth 16 before lseek() in read_part_line() like the
following (this was already fixed my latest 01_refactor_bmp-011102-p1.patch).
===
read_part_line(…)
{

  cur_col += cols;
  if (cur_col == width)
  {
      if (depth == 15) <<<<<<
          depth = 16;  <<<<<<
      int pad = padded_width - ((width * depth + 7) >> 3);
      if (pad > 0)
          lseek(fd, pad, SEEK_CUR);
      cur_col = 0;
      cur_row += 1;
  }

===

(2) Omitted maxheight or maxwidth
You should support that Album Art tag’s maxheight or maxwidth is omitted like
the following (taken from http://www.rockbox.org/twiki/bin/view/Main/AlbumArt),
===
%Cl|50|70|||
#displays the album art at position x=50, y=70. No maxwidth or maxheight are
#given, so no resizing is done.
===


Lastly… Upscaled image is so bad from my patch (White is not white).
#And I mainly use upscaling.
Do you intend to fix this?

1) I’ll look into addressing that. I started with 081111 of your patch #2, I’ll have to look over the newer revision and add the fixes.

2) I’m not sure the best way to address this. I’m thinking that a missing constraint should either be set to the screen dimension in that direction, or to match the other constraint. Another possibility would be to take the specification literally, and clip instead of resizing. I think clamping constraints to screen size, and setting them to screen size if not specified, is probably the most reasonable thing to do for this case.

3) The scaler is sub-pixel accurate, so *one* white pixel between non-white pixels in the input may not be quite white, if it does not align correctly with the output pixels. Major changes in brightness or color that are visible in large portions of the image should not be happening, however - if you’re seeing that, there’s a bug somewhere that I need to fix.

Addressed #1 as described - sorry, I have no idea how I missed that there was an update since the original bmp_refactor patch.
Addressed #2 via clamping constraints to screen size - the scaler is not capable of outputting >screen width, and clamping is less code than clipping.
Not sure about #3, I ran a set of color test squares through sim at <screen and >screen sizes, and got the same colors both ways. They look the same on my Sansa as well. Was the “not white” white dim, or did it have a color to it?

Also fixed a few compile-time warnings on unused and “unitialized” variables.

idak commented on 2008-11-28 15:56

#3
please test attached “cover.bmp” on ipod video sim.
(from 100×100 to 120×120)

I never saw anything like that when testing, and I think it had to do with the target size of the scale. It turned out to be an off-by-one error causing the saturated values to overflow, and get wrapped to nearly zero. It’s fixed in this latest patch, as well a few other minor fixes, mostly stripping some unneeded things on mono targets, and adding more correct rounding in the color scaler.

Also attached are rockbox-info.txt for svn trunk and both resize patches, for m:robe 100, iPod 4G, and Sansa e200, to give an idea of the effects on binsize for mono, grayscale, and color targets. The color targets are the only ones where the memory cost should vary much, since there are static buffers totalling 24*LCD_WIDTH for use by the color scalers.

idak commented on 2008-11-29 06:02

Do you test on 2bpp-LCD target?
At least ipod 3G sim’s all bmp image is broken…

Yes, but up until your problem report on the bad white output, I was doing all of my testing on sliding_puzzle. The nearest-neighbor scaler was only returning true for success, ie 1, instead of the bitmap data size. Fixed in this patch.

idak commented on 2008-11-29 10:07

I think your “01+02_resize_on_load9.patch” and “01+02_resize_on_load10.patch” is completely the same.

You’re right, I committed those changes to another branch, not the one I’ve been generating the patches from. Sorry about that, this a few other small fixes since then, and line-length fixes for bmp.c and resize.c.

idak commented on 2008-11-29 12:38

I think something bad is in your 2bpp-LCD.
Please check the attached dump image.

Thanks for the extra testing. I won’t have a chance to look for a fix for at least a few hours, but let me know if you find anything else. The M3-remote-on-M5 also has broken icons.

Found it, the 4-bit loader was still testing if the image had even width, rather than the current chunk.

Mono bitmap problems fixed, both the read_part_line portion and other bugs in the direct-output case for mono-only targets and the scaler_nearest mono output block. Bitmaps now display correctly on ondio sim, and in remote on h1x0 sim.

idak commented on 2008-11-30 11:18

I think your patch still breaks the following remotes.
o “iriver H300”(11) remote (1bpp)
o “iaudio X5”(30) remote (2bpp)
o “iaudio M5”(31) remote (2bpp)

Rewrite of read_part_line to reduce size and number of reads. Now performs one read per call to read in the entire chunk worth of input data. Very verbose debugging added, activate by defining LOUD_BMP_LOADS and LOUD_SCALERS. Loading mono bitmaps for display on the H300 main screen and remote is broken, as are mono bitmaps for main screen on H100. Verified the X5/M5 remote issues, not sure if X5 uses the same M3 remote that M5 uses.

Debugging defines for scaler and loader defines were changed to better match other rockbox feature defines. scale_nearest has been simplified a good deal, and I am seeing good bitmaps on M5, X5, and H300 remotes now. There is a very odd problem with *some* mono bitmaps not loading correctly. The DancePuffDuo theme is broken pretty much everywhere, even in places where other mono bitmaps load correctly (for example, cabbiev2 theme on H300 remote). There are a few other themes where mono bitmaps are not right - the remote part of the boxes theme works on H300, but there are apparently broken mono bitmaps used in the main-display boxes theme. I’ll keep working on it when I have time, but I don’t seem to be getting anywhere with this problem.

Some more minor fixups, fixed mono bitmap handling, things seem to be correct on pretty much everything now. Sims tested for Ondio SP (no scaling enabled on this target, but bmp files in themes work), E200, H100, H300, iPod 4G, iPod Video, iaudio M3, M5, X5.

Sorry about the disastrous concatenation of patches. Here’s the correct patch.

idak commented on 2008-12-04 17:25

I think your patch can’t resize bmp on the following main (colour) display.
o iriver H300 (11)
o iAudio X5 (30)

Fixed, a missing “else” in the color + remote case was causing both scalers to be called, which made the reader go past the last row of the image. Retested on X5, H300, S.

Latest binsize reductions for mono case, some will probably result in small reductions for others as well. Remove readshort/readlong in favor of letoh*. Also included are the contents of rockbox-info.txt for both vanilla SVN and resize-patched builds for Recorder v2, H100, H300, X5, E200, and iPod Video.

I saw some image corruption on S60 hardware starting playback and then loading an image in sliding_puzzle, I suspect that sliding_puzzle and the album art loader used the scaler at the same time and overwrote each others’ data in the static buffers used by the color scalers. This could be resolved either by locking around used of the color scalers, or by allocating space for them dynamically from the buffer provided by the caller to load the bitmap, which would mean that only album art or plugins could use the color scaler, but would reduce RAM usage greatly on color targets with wide displays.

The old patch indeed used mutex while scaling.

Experimental reworking of the new resizer to use the buffer passed for the target bitmap as scaling buffer. The color scaler needs this space to be at least width*24 bytes larger than the data for the final scaled bitmap, and will fail the load if there is not enough space. The sliding puzzle patch also needed reworking for this, to use the plugin buffer instead of a fixed array exactly the size of the loaded bitmap. This reduces the RAM size increase on iPod Video, which needed the most space for the static buffers, from 11.8KiB to 4.4KiB. The disadvantage is that scaling is no longer available for color displays to users that can not provide the extra buffer space, essentially limiting its use to album art and plugins.

Latest versions of my work for the whole patchset. Rebuffer is forced on WPS load if the new WPS has albumart that is different from the old one - the audio will stop briefly when this happens, but it is a direct result of user input, so it shouldn’t be too shocking. The color upscaler is now optional, although it is presently activated by default. Removing it saves about 1.4-1.6KiB depending on target.

The 01+02 patch from the last set had a bit of merge conflict junk in it, which should be corrected in this one. The rest of the patches from the previous set should work with it.

Correction of #03 for a build failure on targets without HAVE_ALBUMART

Latest combined patch. More correct initialization of WPS AA state on first load, fixing forced rebuffer on load of any WPS with AA to correctly happen only when the old WPS lacked AA or the sizes differ.

idak commented on 2008-12-08 13:52

I found some bugs in your patch on ipod video sim.

o can’t upscale bmp
o can’t keep aspect ratio with upscaling

I think this is depend on “HAVE_UPSCALE” definition.

idak commented on 2008-12-08 13:57

x can’t keep aspect ratio with _upscaling_
o can’t keep aspect ratio with _downscaling_

HAVE_UPSCALER was not defined correctly, logic for resize was not correct with it undefined, fixed both.

idak commented on 2008-12-10 19:30

Unhelpful:
I found my bug in SVN.
apps/recorder/bmp.h: L.174
“(LCD_REMOTE_DEPTH == 2) && (LCD_REMOTE_PIXELFORMAT == VERTICAL_INTERLEAVED)” in get_totalsize() is unnecessary for iriver H300 (1bpp remote).

This is my latest local work. Mono targets have a faster, smaller, and somewhat less accurate brightness(), other targets have a different version that is still faster, but without as large a change in binsize or accuracy. A number of inline functions for bitmap sizes are replaced with macros, and a new macro is added which allows a static buffer to be made the correct size for a scaled image load, based on format and desired output size. All of these should reduce to constants if their inputs are. plugin.h now includes the core bmp.h, to make these macros available in plugins, and MAX_WIDTH in bmp.h is replaced by BM_MAX_WIDTH to prevent potential conflicts (there is an actual one in minesweeper). Extra format flags are stripped together via &= ~1. Some multiplications are saved in the color scalers, for about a 2.6% speedup on ARM.

Further scaler work, binsize is reduced on color targets, about 700B saved on arm, 1000B on coldfire. Nearest-neighbor scaler is removed, area/linear scalers reworked to support greyscale, cost about 1000B on greyscale targets. Foundations for custom output formats are in place, via an output_row function pointer in scaler_context, currently always set to output_row_native. Still needs a solution for passing such a function to the loader, loader calculating fb sizes for custom formats, but should not need much more work to support scaled output for greylib in plugins. Linear scaler is faster, with fewer multiplies, at the cost of an extra line of temp buffer - this was needed anyway to support output_row. A number of minor fixes for long lines or excess whitespace are also included.

idak commented on 2008-12-17 12:06

for scaler_postfreeze_hq_greyscale_20081217.patch
o can’t compile sliding_puzzle (and a lot of warning) on iAudio M3 (34)
o “apps/recorder/bmp.c:604: warning: unused variable ‘fb_width’” on 1bpp-LCD

Both fixed with this patch.

* move recalc_size into resize.c, with declaration in resize.h, in anticipation of other scaler frontends, along with some other struct definitions
* on greyscale-only targets, loop over each bitmap segment in read_part_line, and replace each pixel with its greyscale value - a small binsize savings vs using the inline brightness() function in several other places
* use Bayer dither values calculated from a 16-byte dither_table instead of the 16×16 dither_matrix, replacing a 2D array lookup per-pixel with a 1D array lookup and bitwise XOR
* fix some long lines
* update some docs for the changes since the HQ greyscale branch
* always use the high-quality brightness(), except on mono targets, and remove the “medium” quality one

Barring any bugs found, or further ideas for improvement, I think this is ready for commit. Binsize improves for color targets, is slighly worse for greyscale ones, but with improved visual quality, and is about the same for mono. Next stage is work on output plugging, an output plugin in greylib for 8-bit greyscale output, and a test case for greylib output, as nothing currently loads bitmaps for greylib use (I think).

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing