This is the bug/patch tracker for Rockbox. Click here for more information.
Quick links: Bugs · Patches · Rockbox frontpage
FS#9458 - resize bitmap on load with smooth scaling
Attached to Project:
Rockbox
Opened by Akio Idehara (idak) - Sunday, 05 October 2008, 15:55 GMT+2
Last edited by Andrew Mahone (Unhelpful) - Friday, 26 December 2008, 09:02 GMT+2
Opened by Akio Idehara (idak) - Sunday, 05 October 2008, 15:55 GMT+2
Last edited by Andrew Mahone (Unhelpful) - Friday, 26 December 2008, 09:02 GMT+2
|
DetailsThis patch can resize bitmap on load with smooth scaling.
This is quick and dirty hack, but I think this is good start point to do this... |
This task blocks these from closing
FS#9665 - pictureflow resize-on-load
FS#9683 - pluggable scaler output for greylib
FS#9689 - greylib pictureflow for greyscale targets
Closed by Andrew Mahone (Unhelpful)
Friday, 26 December 2008, 09:02 GMT+2
Reason for closing: Accepted
Additional comments about closing: Committed as r19592.
Friday, 26 December 2008, 09:02 GMT+2
Reason for closing: Accepted
Additional comments about closing: Committed as r19592.
- Comments in the code explaining what things do
- There are many > 80 columns lines
- There are // comments
- There are some #if 0 lines
In general I think it looks a fine effort to really add this feature "the right way"!
With this added, would it be possible to remove the current load bmp function? It seems like a waste to have two...
You describe this as "a good start". Are you continuing to work on it? What improvements are you planning?
As Daniel said, my first reaction was also that we probably won't want two read_bmp functions.
My other reaction was the static "cache" buffer. I haven't looked at the code in detail, but am wondering if it would make sense to have three "unsigned char[]" arrays (one each for R, G and B), rather than a single 32-bit buffer. This would cut the size of the array by 25%.
But I guess that's what we're looking for in general - optimisations to try and both simplify the code, and reduce the memory requirements (for both code and cached line data).
Dave.
# Yes, old one is my first quick hack.
First, I merged those two functions into one.
I think smooth scaling algorithm is overkill for 2bpp-LCDs.
So I use simple scaling algorithm (nearest neighbour).
That's all I'm planning...
2) use three "unsigned char[]" arrays (one each for R, G and B)
instead of a single 32-bit buffer.
#but I use a single 32-bit buffer to read one line...
However, I also think we need to make this patch as efficient as possible before it can be committed. I'm also thinking that maybe it needs to be committed in smaller parts, to make it easier to review (and to justify all the changes).
I've just tried compiling this patch for the ipod color - it adds 5024 bytes to the binary size (i.e. code and initialised data), and a total
of 17056 bytes of RAM usage. (taking the numbers from the rockbox-info.txt file created with current SVN and SVN with this patch).
Looking at the code, I can see the following buffers:
uint32_t bmpbuf[MAX_WIDTH] - this used to be on the stack (and hence in IRAM - meaning faster access on some targets). IIUC, MAX_WIDTH is LCD_WIDTH for mono targets, and MAX(500,LCD_WIDTH) for other targets. This means a maximum of 2000 bytes for most targets.
You also have 5 cache lines, with each of those lines being an array of MAX_LINE 3-byte structs, plus a 16-bit y value - meaning 5 * (3 *
MAX_WIDTH + 2) bytes = a maximum of 7510 bytes.
Adding these up, I get 2000+7510+5024 = 14534, so I'm not sure where the other 2522 bytes have gone....
A MAX_WIDTH of 500 seems too small to me - I would expect people to want to use larger bitmaps, and if we support resizing, I think we should try and support any bmp size. I also have some album art which is just slightly more than 500 pixels (501 or 502). So I would want 512 as a minimum. My thoughts are that this should be 2*LCD_WIDTH, and then for any images larger than that, simply skip pixels in the horizontal direction when reading a line in order to reduce the size to LCD_WIDTH*2. This obviously isn't ideal, but I think that a non-optimal resize for very large images is better than simply rejecting them.
Also, why is the value 5 chosen for the cache size? The comment in the code says "must be > 2", so why isn't it 3?
A general comment on the code - I see that you're using "short" in a lot of places This often has the effect of increasing memory usage, rather than decreasing it - especially when these variables are local (i.e. allocated on the stack). Whereever possible, just use "int", as by definition, that's the easiest integer type for the target CPU to deal with.
Looking at just the changes to bmp.c, a major change appears to be to change the format of the bitfields[] array. This seems to be adding a lot of extra code - are there savings to be made elsewhere by changing this format? If not, then I would leave it as it was.
I would like to see this committed, and my approach would be to commit in smaller parts as follows:
1) Refactor bmp.c to incorporate all the changes required for resizing, but without actually adding the resizing itself. This patch should have negligible impact on binary size and memory usage, and the resizing patch itself should then only make minimal changes to this file.
2) Add the resizing feature, but without using it in the core - so we can see the impact of that feature by itself, and if possible, find optimisations.
3) Finally, add the code which uses the new resize feature for album-art (and elsewhere, if applicable).
I will make this patch into smaller parts.
And...
> Also, why is the value 5 chosen for the cache size? The comment in the code
> says "must be > 2", so why isn't it 3?
Sorry. This is my complete WRONG comment. The detail is the following.
---
"Bilinear" algorithm only needs 2 cache lines.
But in some case (downscaling 1/[integrals]), downscaling with "Bilinear" is the
same as "Nearest Neighbour" (named Simple Resize).
Ex: Downscaling with "Bilinear" and "Nearest Neighbour" from 500x500 to
100x100 are the same, and not so smooth.
In general, downscaling algorithm ranking is as follow.
#Yes, there are a lot of another resize algorithm
============================================================================================
GOOD <----------- -------------> BAD
"Lanczos3" >>[GREAT WALL]>> "Area Sampling" > "Bicubic" >= "Bilinear" >= "Nearest Neighbour"
============================================================================================
So, Imlib2 (and old GIMP?) uses "Area Sampling" for downscaling.
# "Area Sampling" algorithm is only for downscaling
I implement "Area Sampling" for downscaling both ways (down x/down y).
# and I implement "bilinear" for (up x/up y), (up x/down y), (down x/up y).
# This implement is different from smooth_resize_bitmap() (and Imlib2).
"Area Sampling" algorithm needs "src/dst" lines.
Ex: Downscaling from 500x500 to 100x100 needs 5 cache lines.
I thought most use-case was from 500x500 to 100x100, so I chose 5 cache lines....
And if cache line is not enough in downscaling with "Area
Sampling", I use "Bilinear" for downscaling...
(1) Refactoring bmp.[ch]
(2) Adding the resizing feature
(3) Adding the new resize feature for album-art
(4) Adding the new resize feature for sliding puzzle plug-in
((3) and (4) is the same)
(fix downscaling algorithm selection)
From my understanding of area sampling, you take several pixels over an area (in this case 5x5?) and then average them. The average is then used as input to some other interpolation algorithm. Since the pixels are averaged, higher frequencies are rejected before interpolation. Is this what you are doing here?
>"Bilinear" algorithm only needs 2 cache lines.
>But in some case (downscaling 1/[integrals]), downscaling with "Bilinear" is the
>same as "Nearest Neighbour" (named Simple Resize).
>Ex: Downscaling with "Bilinear" and "Nearest Neighbour" from 500x500 to
>100x100 are the same, and not so smooth.
Does area sampling yield any improvement over bilinear for interpolation by less then a factor of 2? I.E. 150x150 > 100x100? I think in this case there should be no difference?
>I thought most use-case was from 500x500 to 100x100, so I chose 5 cache lines....
So this patch is limited to resampling by less then a factor of 5? Still that might be more then enough, since the eventual goal is to integrate this into a JPEG decoder, and JPEG can do optimal factor of 8 (in each dimension) resizing "free". So 5x8 = 40x reduction. If the target size is 100x100, that would mean jpegs under 4000x4000 would be within that range.
That said, for JPEG users, the most interesting case (assuming we ever get a JPEG decoder anyway) will be resizing by less then a factor of 2.
Thank you for some comments!
> From my understanding of area sampling,
> you take several pixels over an area (in this case 5x5?)
> and then average them.
> The average is then used as input to some other interpolation algorithm.
> Since the pixels are averaged,
> higher frequencies are rejected before interpolation.
> Is this what you are doing here?
No, the average(area_sample()) is used to output image,
and it doesn't input to some other algorithm.
# "Area sampling" algorithm is one of downscaling algorithm
If you want to read unoptimized "area sampling" algorithm,
please check Imlib2-1.4.0 "src/lib/scale.c:L1489-".
> Does area sampling yield any improvement over bilinear for interpolation
> by less then a factor of 2?
> I.E. 150x150 > 100x100?
> I think in this case there should be no difference?
The output image is different from the other.
But, I think the difference is very small.
> So this patch is limited to resampling by less then a factor of 5?
> Still that might be more then enough,
> since the eventual goal is to integrate this into a
> JPEG decoder, and JPEG can do optimal factor
> of 8 (in each dimension) resizing "free".
> So 5x8 = 40x reduction. If the target size is 100x100,
> that would mean jpegs under 4000x4000 would be within that range.
> That said, for JPEG users, the most interesting case
> (assuming we ever get a JPEG decoder anyway)
> will be resizing by less then a factor of 2.
In new patch,
If "area sampling" cannot use (original width is over 500px),
I select "Nearest Neighbour" or "bilinear" algorithm.
If original width is less than 1250, I use "bilinear".
And "Nearest Neighbour" can be used under 2500 width.
(fix "area sampling" algorithm)
(fix cache line size calculation.)
(fix corner case of aspect ratio calculation)
Thank you for a lot of your works.
I am not so good at resize algorithm.
So please change as you like it.
And, I found some corner-case bugs.
(1) 15bit depth
If bitmap depth is 15bit,
you must change depth 16 before lseek() in read_part_line() like the
following (this was already fixed my latest 01_refactor_bmp-011102-p1.patch).
===
read_part_line(...)
{
...
cur_col += cols;
if (cur_col == width)
{
if (depth == 15) <<<<<<
depth = 16; <<<<<<
int pad = padded_width - ((width * depth + 7) >> 3);
if (pad > 0)
lseek(fd, pad, SEEK_CUR);
cur_col = 0;
cur_row += 1;
}
===
(2) Omitted maxheight or maxwidth
You should support that Album Art tag's maxheight or maxwidth is omitted like
the following (taken from http://www.rockbox.org/twiki/bin/view/Main/AlbumArt),
===
%Cl|50|70|||
#displays the album art at position x=50, y=70. No maxwidth or maxheight are
#given, so no resizing is done.
===
----
Lastly...
Upscaled image is so bad from my patch (White is not white).
#And I mainly use upscaling.
Do you intend to fix this?
2) I'm not sure the best way to address this. I'm thinking that a missing constraint should either be set to the screen dimension in that direction, or to match the other constraint. Another possibility would be to take the specification literally, and clip instead of resizing. I think clamping constraints to screen size, and setting them to screen size if not specified, is probably the most reasonable thing to do for this case.
3) The scaler is sub-pixel accurate, so *one* white pixel between non-white pixels in the input may not be quite white, if it does not align correctly with the output pixels. Major changes in brightness or color that are visible in large portions of the image should not be happening, however - if you're seeing that, there's a bug somewhere that I need to fix.
Addressed #2 via clamping constraints to screen size - the scaler is not capable of outputting >screen width, and clamping is less code than clipping.
Not sure about #3, I ran a set of color test squares through sim at <screen and >screen sizes, and got the same colors both ways. They look the same on my Sansa as well. Was the "not white" white dim, or did it have a color to it?
Also fixed a few compile-time warnings on unused and "unitialized" variables.
please test attached "cover.bmp" on ipod video sim.
(from 100x100 to 120x120)
Also attached are rockbox-info.txt for svn trunk and both resize patches, for m:robe 100, iPod 4G, and Sansa e200, to give an idea of the effects on binsize for mono, grayscale, and color targets. The color targets are the only ones where the memory cost should vary much, since there are static buffers totalling 24*LCD_WIDTH for use by the color scalers.
At least ipod 3G sim's all bmp image is broken...
and "01+02_resize_on_load10.patch" is completely the same.
Please check the attached dump image.
o "iriver H300"(11) remote (1bpp)
o "iaudio X5"(30) remote (2bpp)
o "iaudio M5"(31) remote (2bpp)
o iriver H300 (11)
o iAudio X5 (30)
I saw some image corruption on S60 hardware starting playback and then loading an image in sliding_puzzle, I suspect that sliding_puzzle and the album art loader used the scaler at the same time and overwrote each others' data in the static buffers used by the color scalers. This could be resolved either by locking around used of the color scalers, or by allocating space for them dynamically from the buffer provided by the caller to load the bitmap, which would mean that only album art or plugins could use the color scaler, but would reduce RAM usage greatly on color targets with wide displays.
o can't upscale bmp
o can't keep aspect ratio with upscaling
I think this is depend on "HAVE_UPSCALE" definition.
o can't keep aspect ratio with _downscaling_
I found my bug in SVN.
apps/recorder/bmp.h: L.174
"(LCD_REMOTE_DEPTH == 2) && (LCD_REMOTE_PIXELFORMAT == VERTICAL_INTERLEAVED)"
in get_totalsize() is unnecessary for iriver H300 (1bpp remote).
o can't compile sliding_puzzle (and a lot of warning) on iAudio M3 (34)
o "apps/recorder/bmp.c:604: warning: unused variable 'fb_width'" on 1bpp-LCD
* on greyscale-only targets, loop over each bitmap segment in read_part_line, and replace each pixel with its greyscale value - a small binsize savings vs using the inline brightness() function in several other places
* use Bayer dither values calculated from a 16-byte dither_table instead of the 16x16 dither_matrix, replacing a 2D array lookup per-pixel with a 1D array lookup and bitwise XOR
* fix some long lines
* update some docs for the changes since the HQ greyscale branch
* always use the high-quality brightness(), except on mono targets, and remove the "medium" quality one
Barring any bugs found, or further ideas for improvement, I think this is ready for commit. Binsize improves for color targets, is slighly worse for greyscale ones, but with improved visual quality, and is about the same for mono. Next stage is work on output plugging, an output plugin in greylib for 8-bit greyscale output, and a test case for greylib output, as nothing currently loads bitmaps for greylib use (I think).