#24 open
marc

debug output for unknown segfault

Reported by marc | July 3rd, 2009 @ 05:10 PM

hi,

python-spidermonkey used as serverside JS in cherrypy with mongodb segfaults when i'm doing more than one simultaneous request. when running normal JS code without mongodb, everything works fine - no segfaults even under high load.

is there a way to turn on debugging output in python-spidermonkey to find out where the segfault comes from?

thanks in advance,

cheers marc

Comments and changes to this ticket

  • Paul J. Davis

    Paul J. Davis July 7th, 2009 @ 02:37 PM

    • Tag set to execute, javascript, segfault, threading
    • State changed from “new” to “open”

    Marc,

    The easiest way to get a traceback will be to start up your webserver, and then use gdb to attach to the process id before starting your tests. Should be something like:

    $ gdb
    > attach $PID
    ...
    > continue
    

    Don't forget the continue once it pauses at a breakpoint, otherwise Python will appear deadlocked.
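
Editor's note: as a complement to attaching gdb, Python 3.3 and later ship a standard-library module, faulthandler, that can dump the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV, or after a timeout if the process appears hung. It postdates this thread and only shows Python frames, not frames inside the C extension, so it narrows down which Python call was active rather than replacing gdb. The helper names below are made up for illustration; only the faulthandler calls are real API.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks for all threads on a fatal signal (e.g. SIGSEGV).

    `stream` must be a file object with a real file descriptor; the
    default is sys.stderr, resolved at call time.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


def arm_hang_watchdog(seconds, stream=None):
    """Dump all thread tracebacks once if the process is still running after `seconds`."""
    target = stream if stream is not None else sys.stderr
    faulthandler.dump_traceback_later(seconds, repeat=False, file=target, exit=False)


def disarm_hang_watchdog():
    """Cancel a pending watchdog dump, e.g. after a request finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

A server would typically call enable_crash_tracebacks() once at startup, and arm and disarm the watchdog around a request that is suspected of hanging.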

    When you say it handles normal JS code without mongodb, do you mean that you can run at high concurrency with python-spidermonkey?

    Also of note, a Context should never be used in a thread that didn't create it. While it is possible to use the C-API to clear and set the thread, I'd have to do that on every callback. I haven't gotten around to figuring out if there's a way to test whether it's being called in the same thread or not though, so it's trusting you right now.
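
Editor's note: until the library can check this itself, the "one Context per thread" rule can be enforced from the calling side. The sketch below is a hypothetical pure-Python guard, not part of python-spidermonkey: ThreadBound, ThreadBoundError and context_for_this_thread are names invented here, and `factory` stands for whatever call creates a Context in your setup.

```python
import threading


class ThreadBoundError(RuntimeError):
    """Raised when a wrapped object is used outside its creating thread."""


class ThreadBound:
    """Hypothetical wrapper that remembers the thread that created it."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self._owner = threading.get_ident()

    def __getattr__(self, name):
        # Only reached for attributes not found on the wrapper itself,
        # i.e. everything that should be forwarded to the wrapped object.
        if threading.get_ident() != self._owner:
            raise ThreadBoundError(
                "object created in thread %d used from thread %d"
                % (self._owner, threading.get_ident())
            )
        return getattr(self._wrapped, name)


_local = threading.local()


def context_for_this_thread(factory):
    """Return this thread's own guarded context, creating it on first use."""
    ctx = getattr(_local, "ctx", None)
    if ctx is None:
        ctx = _local.ctx = ThreadBound(factory())
    return ctx
```

In a threaded server such as cherrypy, each request handler would call context_for_this_thread(factory) instead of sharing one module-level Context, so an accidental cross-thread use raises a Python exception instead of reaching the C code.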

    Paul Davis

  • marc

    marc July 8th, 2009 @ 08:57 AM

    hi paul,

    thank you for your reply.

    running js code with no extra added globals works perfectly, even under high concurrency. but using external objects from python breaks the execution.

    mike from mongodb has constructed a testcase without using mongodb, just plain python that deadlocks:

    http://mongodb.pastebin.com/m2ecdfebb

    i know you are involved in couchDB, i'm sorry to mention mongodb here :)

    thanks in advance,

    cheers marc

  • Paul J. Davis

    Paul J. Davis July 8th, 2009 @ 09:26 AM

    marc,

    That test case reproduces the bug here. I just ran it quickly this morning before work and it's not pointing at anything blatant in the spidermonkey code, which means this is gonna be a fun one to debug.

    > i know you are involved in couchDB, i'm sorry to mention mongodb here :)

    lol, last I checked we aren't sworn enemies or anything. :)

    Do you have a link to your web framework? I'm pretty interested in giving it a try.

  • marc

    marc July 8th, 2009 @ 09:49 AM

    hi paul,
    good to hear that you are not enemies :) it's not really a webframework i'm building, just playing around with cherrypy and python-spidermonkey to get comfortable with serverside JS. here is a testcase that i've constructed for the mongodb guys, 'cause i've initially thought this was a bug in the pymongo driver.

    test case: http://groups.google.com/group/mongodb-user/msg/5d58d2b138297d81?

    i wanted to know if this bug is my fault or spidermonkey related, so i've tried the same thing with pyjscore and it deadlocks too. damn :)

    i feel a bit stupid just sending around testcases without having the ability to help you in debugging this code. but i'm very happy that you have "fun" with it :)

    cheers marc

  • Paul J. Davis

    Paul J. Davis July 8th, 2009 @ 07:38 PM

    Marc,

    Did a quick bit of reading. The first step I'm going to have to do is go through and wrap lots of code in calls to GIL protection clauses. This won't be the most trivial thing so it'll have to wait until the weekend. Might be a good time to do some refactoring though.

    Paul

  • marc

    marc July 16th, 2009 @ 06:41 AM

    hi paul,

    is there any news on the deadlock topic? is there anything i can provide to help you with this issue (yeah i know, stupid question :)? is it possible to replace spidermonkey later with tracemonkey without rewriting the complete python-spidermonkey code?

    thanks and cheers,

    marc

  • Paul J. Davis

    Paul J. Davis July 16th, 2009 @ 11:24 AM

    marc,

    Doh! I totally meant to get on earlier this week and leave an update on this. I started going through the code last weekend to figure out what I think I'll need to change. My guess is that somewhere the GIL is being released that I'm not expecting. My answer is to go through and add safeguards to all code that could possibly be called from a thread that doesn't have the GIL. Hopefully this is the issue.

    Going through the different bits to put in the GIL stuff is going to take a while as it reaches into the code a bit. In the meantime I've been contemplating how best to implement the whole thing. Sometimes I wish C had try/catch.

    Anyway, that's the current status. I think I know what to do, it's just a matter of finding time. I have an experiment to try on some other code this weekend; if that goes better than expected or blows up quickly I might get to it this weekend. If not, hopefully either over a couple of nights next week or next weekend.

  • marc

    marc July 16th, 2009 @ 11:43 AM

    hi paul,

    that's great news. thanks for the quick update :)

    cheers marc

  • Paul J. Davis

    Paul J. Davis July 16th, 2009 @ 03:25 PM

    Interesting development. I just happened to glance past the docs on building C++ extensions again. It caught my eye in that having try/catch/finally in the Spidermonkey code would be extremely helpful. I just sat down and put a try/catch in spammodule.c and built it as an extension and it works fine.

    Looks like I'll be upgrading python-spidermonkey to (ab)use C++ instead of straight C. This should make quite a bit of the internals far simpler as well. I'll probably start making updates in the next couple days on a branch to see how it works out.

    Hopefully I don't get struck down for abusing exception handling like this.

  • Paul J. Davis

    Paul J. Davis July 26th, 2009 @ 11:08 PM

    For anyone paying attention, I've started working on rewriting python-spidermonkey in C++. The code will be up at [1] as it appears. My initial prodding with the GIL state calls didn't give me any straight answers, so with regard to this ticket it may be a bit of a haul.

    Paul

    [1] http://github.com/davisp/python-spidermonkey/tree/cxxrewrite

  • marc

    marc July 28th, 2009 @ 09:02 AM

    hi paul,

    thanks for your effort. i've tested your newly submitted code on mac os x and unfortunately, there was no magic, it still locks...but i'm looking forward to your c++ implementation :) i've tried to figure out if this is a spidermonkey related problem, but other implementations like pyjscore (http://code.google.com/p/pyjscore/) have the same issue.

    cheers marc

  • Paul J. Davis

    Paul J. Davis July 28th, 2009 @ 12:47 PM

    Marc,

    Oh I haven't gotten around to the underlying issue yet. But at least
    you've verified that the new code builds and runs for you so there's
    that. Once I get the code all translated to C++ then I'll look at how
    to attack that threading issue. I'm still surprised it even exists
    because I never touch the GIL myself. I tend to wonder if this isn't
    some horrid sort of interaction between JS and Python that is just
    gonna make me go bonkers.

    Paul

  • marc

    marc August 18th, 2009 @ 02:28 AM

    Hi Paul,

    i was a bit curious whether Ruby in combination with Spidermonkey and MongoDB results in the same locking problem. Using Johnson (http://github.com/jbarnette/johnson/) and MongoDB for Ruby together with Rack (http://rack.rubyforge.org/) and Thin (http://code.macournoyer.com/thin/) works impressively well. No locking, no freezing.

    Ok, Ruby has a completely different architecture behind it, but maybe the Johnson code can help to find a solution for the GIL problem?

    Cheers Marc

  • Paul J. Davis

    Paul J. Davis August 18th, 2009 @ 02:59 AM

    marc,

    Sorry I haven't gotten a chance to follow up on this more thoroughly. Been a bit busy with other projects.

    I'm not sure about Johnson and Ruby. I'd have to learn Ruby's threading model and I doubt that it'd end up being very intuitive. I actually had a friend familiar with the GIL look at this and when he ran it on Windows it didn't fail. I fear that this is going to be a very obscure bug when I figure it out eventually.

    I'm still waiting on Spidermonkey 1.8.1 to come out and hopefully as I push through that update I'll get to the bottom of this error. But I still keep thinking that I haven't the slightest idea how to even start effectively diagnosing this. Eventually I'll get a free weekend to sink in and pick apart what's going on, but I don't see this being a fun journey either way.

    Paul

Python/JavaScript bridge module, making use of Mozilla's spidermonkey JavaScript implementation. Allows implementation of JavaScript classes, objects and functions in Python, and evaluation and calling of JavaScript scripts and functions respectively. Borrows heavily from Claes Jacobssen's Javascript Perl module, in turn based on Mozilla's 'PerlConnect' Perl binding.