Rockbox

Tasklist

FS#11927 - Queue overflow during database initialisation

Attached to Project: Rockbox
Opened by Steve Bavin (pondlife) - Friday, 11 February 2011, 10:54 GMT
Last edited by Jonathan Gordon (jdgordon) - Wednesday, 02 March 2011, 05:22 GMT
Task Type Patches
Category Database
Status New
Assigned To No-one
Operating System All players
Severity Low
Priority Normal
Reported Version Daily build (which?)
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 0
Private No

Details

While the database initialisation is running, the tagcache queue can overflow due to system broadcast messages. This can be seen by running a simulator (it will crash with KERNEL_ASSERT after a while) or on a target by enabling KERNEL_OBJECT_CHECKS, then initialising the database and repeatedly connecting/disconnecting a charger.

The same problem may well happen with other queues (dircache, for example).
This task depends upon

Comment by Steve Bavin (pondlife) - Friday, 11 February 2011, 10:55 GMT
If I debug and put a breakpoint in kernel.c, I get the following backtrace when the problem occurs:
#0 queue_post (q=0x567e88, id=-1610612735, data=0) at /home/Steve/rockbox/firmware/kernel.c:623
#1 0x004700f1 in queue_broadcast (id=-1610612735, data=0) at /home/Steve/rockbox/firmware/kernel.c:844
#2 0x0046a682 in battery_status_update () at /home/Steve/rockbox/uisimulator/common/powermgmt-sim.c:65
#3 0x0046a739 in battery_level () at /home/Steve/rockbox/uisimulator/common/powermgmt-sim.c:106
#4 0x0042e824 in gui_statusbar_draw (bar=0x599f2c, force_redraw=false, vp=0x9c40e0)
at /home/Steve/rockbox/apps/gui/statusbar.c:200
#5 0x00435353 in do_non_text_tags (gwps=0x55bfd0, info=0x22f750, element=0x9c4150, vp=0x9c40e0)
at /home/Steve/rockbox/apps/gui/skin_engine/skin_render.c:221
#6 0x004359a9 in skin_render_line (line=0x9c4124, info=0x22f750)
at /home/Steve/rockbox/apps/gui/skin_engine/skin_render.c:449
#7 0x00435efb in skin_render_viewport (viewport=0x9c4124, gwps=0x55bfd0, skin_viewport=0x9c40e0,
refresh_type=1703936) at /home/Steve/rockbox/apps/gui/skin_engine/skin_render.c:626
#8 0x00436140 in skin_render (gwps=0x55bfd0, refresh_mode=1703936)
at /home/Steve/rockbox/apps/gui/skin_engine/skin_render.c:721
#9 0x0043121d in skin_update (skin=CUSTOM_STATUSBAR, screen=SCREEN_MAIN, update_type=1703936)
at /home/Steve/rockbox/apps/gui/skin_engine/skin_display.c:90
#10 0x0042f9ea in sb_skin_update (screen=SCREEN_MAIN, force=false)
at /home/Steve/rockbox/apps/gui/statusbar-skinned.c:156
#11 0x004308be in viewportmanager_redraw (data=0x0) at /home/Steve/rockbox/apps/gui/viewport.c:240
#12 0x00470a41 in send_event (id=2050, data=0x0) at /home/Steve/rockbox/firmware/events.c:85
#13 0x00401394 in get_action_worker (context=4, timeout=100, get_context_map=0)
at /home/Steve/rockbox/apps/action.c:177
#14 0x004016d5 in get_action (context=4, timeout=100) at /home/Steve/rockbox/apps/action.c:335
#15 0x00405f43 in do_menu (start_menu=0x489424, start_selected=0x22ff38, parent=0x0, hide_theme=false)
at /home/Steve/rockbox/apps/menu.c:378
#16 0x004197a2 in root_menu () at /home/Steve/rockbox/apps/root_menu.c:634
#17 0x00405180 in main (argc=4, argv=0x3e4f98) at /home/Steve/rockbox/apps/main.c:196
Comment by Steve Bavin (pondlife) - Friday, 11 February 2011, 11:05 GMT
(gdb) display q->read
1: q->read = 7
(gdb) display q->write
2: q->write = 24
Comment by Steve Bavin (pondlife) - Tuesday, 22 February 2011, 11:55 GMT
This problem is a genuine queue overflow. It happens because the simulator broadcasts SYS_CHARGER_CONNECTED, SYS_CHARGER_DISCONNECTED and SYS_BATTERY_UPDATE messages. In this particular case, these accumulate in the tagcache queue - which isn't popped during the scan and overflows after 15 events have occurred.

I will define KERNEL_OBJECT_CHECKS to allow a panic on queue overflow then try to repro this on a target by initialising the database then repeatedly connecting/disconnecting the charger.

I'm unsure as to the correct fix:
- A larger queue would help, but in theory could still overflow. How long is a piece of string?
- I can make the tagcache code drop the offending messages. But does this need doing elsewhere too?
Comment by Steve Bavin (pondlife) - Tuesday, 22 February 2011, 14:28 GMT
OK, reproduced on target, will change the description and category shortly.
Comment by Steve Bavin (pondlife) - Tuesday, 22 February 2011, 16:46 GMT
The attached patch fixes this for me (tested on Gigabeat F device and H300 sim) by discarding any unwanted system events from the tagcache/dircache queues (the only 2 that are peeked). Opinions on whether this is a good idea or not are sought.
Comment by Michael Sevakis (MikeS) - Tuesday, 08 March 2011, 11:15 GMT
Try this out, hot off my fingertips and not even compiled once so who knows what'll happen (got no time to check it right now). I very well may need it anyway in some form to deal with stale notification removal and may be more useful. Stick it somewhere in kernel.c to test. Consider it the strpbrk of event queue APIs since it can peek and optionally remove messages in queued order from a list of messages to match.
Comment by Steve Bavin (pondlife) - Thursday, 10 March 2011, 08:04 GMT
Hi Mike,

Not sure I understand - should the tagcache peek be using this, and passing ALL system messages to be removed? (The basic problem is that SYS messages are being broadcast, but never removed/handled in particular queues.)
Comment by sideral (sideral) - Thursday, 14 April 2011, 13:22 GMT
Hi pondlife, I think your patch goes into the right direction. A few comments:

For dircache, it appears to be OK to remove all remaining system events. However, tagcache_thread seems to be interested in one additional system event, SYS_TIMEOUT, and I think this one shouldn't be removed from the queue in this case.

Also, in general posting events to full queues should never crash the kernel. Either the event should not be posted (and the sender notified) or the sender should block until there's room in the queue (doesn't work when queuing events for oneself).

Loading...