Looking back at the WebKit GTK hackfest

Saturday, January 23, 2010

In December, I attended the WebKit GTK hackfest, which has been summed up nicely in many other places. Some of the things I worked on (apart from getting my luggage):

With the closing of 20736, WebKit GTK should now properly support windows with RGBA colormaps. This means that WebKit GTK can now be used to create nice desktop widgets or non-rectangular applications without worrying about BadMatch errors. Surprisingly, non-rectangular windows are one of the most requested features for Titanium, and we now support them on OS X, Linux and Windows via the transparent-background property in tiapp.xml.



The first big part of my clipboard / drag-and-drop reorganization also landed. Once I get the remaining patches together, DOM DataTransfer access to WebKit GTK clipboard and drag-and-drop data will work properly. With luck one day you'll be able to drop your photos and videos onto your web browser and have them upload seamlessly.

The hackfest itself was in A Coruña, a picturesque Galician city right on the coast of the Atlantic (which I managed to call the Pacific once, against all odds). I also had a rather abrupt introduction to the bounty of European languages — Spain has four recognized regional languages apart from Spanish (Castilian).



Xan and me; I am probably trying to convince him that drag-and-drop is the future.

On a personal note, the hackfest solidified my love for open source software. I want to keep doing this, however possible. Evan Martin made a really good point about open source at some point during the hackfest. He said that people sometimes think that being open source means that you can simply host an archive of your source code somewhere. There is more to it though, including having an open and active community and understanding and responding to your users. Some of this is just good software development practice, but being open source requires doing even more work to open the development process.

One important aspect of this work is having community discussions and allowing the community to help make decisions. One look at the WebKit development mailing list shows this working. There are many interests tied up with WebKit and some of them are direct competitors, yet somehow they manage to coordinate development on a humongous project. LWN.net recently posted a great article related to this topic, which I think exemplifies what not to do.

Managing the Python GIL via RAII

Tuesday, October 6, 2009

One of the killer features of C++ is RAII. RAII means that the amount of special-case cleanup code in the case of exceptions or early exits is minimized. For more on exactly how this happens, I recommend the Wikipedia article linked above.

This feature became useful to me when making our Python Kroll module work properly with the Python GIL. There were two usage patterns I was interested in for this work.

Letting Python code in other threads run

PyThreadState* threadState = PyEval_SaveThread();
...do some expensive non-Python work here...
PyEval_RestoreThread(threadState);

Taking back the GIL to use the Python API

PyGILState_STATE gstate = PyGILState_Ensure();
...use the Python API here...
PyGILState_Release(gstate);

The problem with this approach is that if an uncaught exception escapes from anywhere between the ellipses, the cleanup code (PyEval_RestoreThread or PyGILState_Release) will not run, likely leaving the program in a bad state. It turns out that RAII has a very elegant solution to this problem.

// Holds the GIL for as long as the object is in scope.
class PyLockGIL
{
public:
    PyLockGIL() : gstate(PyGILState_Ensure())
    { }

    ~PyLockGIL()
    {
        PyGILState_Release(gstate);
    }

private:
    PyGILState_STATE gstate;
};

// Releases the GIL for as long as the object is in scope.
class PyAllowThreads
{
public:
    PyAllowThreads() : threadState(PyEval_SaveThread())
    { }

    ~PyAllowThreads()
    {
        PyEval_RestoreThread(threadState);
    }

private:
    PyThreadState* threadState;
};
The acquisition and release of the GIL are simply wrapped in the constructors and destructors of objects that are allocated on the stack. Here are the previous examples using these new objects:

{
    PyAllowThreads allow;
    ...do some expensive non-Python work here...
}

{
    PyLockGIL lock;
    ...use the Python API here...
}
Notice that braces can be used to fine-tune the amount of code that runs with the desired GIL state. The destructor for each object is called as soon as it goes out of scope.
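To see why the destructor-based cleanup survives exceptions, here is a minimal sketch that does not touch Python at all. ScopedFakeLock, riskyWork, and runDemo are hypothetical names I made up for illustration; the guard only records events, standing in for the real PyGILState_Ensure / PyGILState_Release pair.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the GIL guard: it only records events, so the
// ordering can be inspected without linking against Python.
class ScopedFakeLock
{
public:
    explicit ScopedFakeLock(std::vector<std::string>& eventLog) : events(eventLog)
    {
        events.push_back("acquire");
    }

    ~ScopedFakeLock()
    {
        events.push_back("release");
    }

    // Copying a guard would release twice, so forbid it.
    ScopedFakeLock(const ScopedFakeLock&) = delete;
    ScopedFakeLock& operator=(const ScopedFakeLock&) = delete;

private:
    std::vector<std::string>& events;
};

// Simulates work that fails part-way through while the lock is held.
void riskyWork(std::vector<std::string>& events)
{
    ScopedFakeLock lock(events);
    events.push_back("work");
    throw std::runtime_error("something went wrong");
}

// Runs riskyWork and returns the recorded event order.
std::vector<std::string> runDemo()
{
    std::vector<std::string> events;
    try
    {
        riskyWork(events);
    }
    catch (const std::runtime_error&)
    {
        events.push_back("caught");
    }
    return events;
}
```

The recorded order from runDemo() is acquire, work, release, caught: stack unwinding runs the guard's destructor before the handler is entered, which is exactly the property the hand-written cleanup code lacked. Deleting the copy operations is a design choice of this sketch, since a copied guard would release twice.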

End-run around registration-free COM

Thursday, April 23, 2009

With the introduction of registration-free COM, Windows now allows you to create COM objects without registering their DLLs in the registry. This means that you can include your COM DLLs with your application files and run the application without being an administrator and mucking with regsvr32. Instead you include (or embed in your executable/DLL) a manifest which specifies the location of the COM DLL relative to your application.

This "new" approach (it was introduced with Windows XP, I believe) is not flexible enough for Titanium, which doesn't know until run-time which WebKit DLL it will be using. Writing this manifest file into the application directory and restarting, or copying the DLLs into the application directory, is a less than ideal solution. While the copy may work in Vista because of the wonky UAC virtualization of write permissions (that's an entirely different rant), copying an entire DLL and its dependencies at startup is expensive and defeats the purpose of DLLs.

This situation bothered me until I stumbled upon this blog post in the murky underbelly of the interblogs. It basically describes an approach using the somewhat thinly documented (perhaps intentionally) activation context API.

This Windows nightmare has a happy ending in that we were able to simply point an activation context at the WebKit DLL's manifest path. Here's what it looks like (this is originally from Gopalakrishna's blog which I linked to above):


// This is a normal CoCreateInstance -- done as if the COM DLL
// was actually registered with the system -- it will fail
IMyComponent* pObj = NULL;
HRESULT hr = CoCreateInstance(
    CLSID_MyComponent,
    NULL,
    CLSCTX_INPROC_SERVER,
    IID_IMyComponent,
    (void**)&pObj);

// Build an activation context from the manifest. Backslashes in the
// path must be escaped inside a C++ string literal.
ACTCTX actctx;
ZeroMemory(&actctx, sizeof(actctx));
actctx.cbSize = sizeof(actctx);
actctx.lpSource = "C:\\Path\\To\\My\\Manifest.manifest";
HANDLE pActCtx = CreateActCtx(&actctx);
ULONG_PTR lpCookie;

if (pActCtx != INVALID_HANDLE_VALUE)
{
    if (ActivateActCtx(pActCtx, &lpCookie))
    {
        // CoCreateInstance should succeed now.
        hr = CoCreateInstance(
            CLSID_MyComponent,
            NULL,
            CLSCTX_INPROC_SERVER,
            IID_IMyComponent,
            (void**)&pObj);
        DeactivateActCtx(0, lpCookie);
    }
    ReleaseActCtx(pActCtx);
}

Avoiding the Logging Performance Hit

Saturday, April 4, 2009

Sometimes you have a function or a method which more often throws away its arguments than actually uses them. Quite possibly the most common example of this situation is logging. Often you'll see a snippet like this:

log.Debug("Processing " + index + " of " + count);

It's just the way things are that debug log statements are used more liberally than error statements. Even if log.Debug(...) doesn't actually log its argument, your program will still pay the penalty of doing the string concatenation every time. As a result, many programmers are forced to write code that looks like this:

if (log.IsDebug())
{
    log.Debug("Processing " + index + " of " + count);
}

Now the internal state of the logger must be examined externally for every debug statement. D solves this issue through lazily evaluated function arguments. In D you can change the signature of your function to look like this:

void Debug(lazy char[] entry)
{
    // debugEnabled is a flag set elsewhere ("debug" itself is a reserved word in D)
    if (debugEnabled)
        fwritefln(logfile, entry());
}

This is really just syntactic sugar around D's delegate parameters in which you can pass a function as an argument. The end result, though, is that the statement that composed the argument won't be evaluated until it is actually used.
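For comparison, the delegate idea underneath lazy can be sketched in C++11 or later by passing a callable. This is only an illustration under my own assumptions: LazyLogger, lastEntry, and countEvaluations are hypothetical names, not part of D or Kroll.

```cpp
#include <functional>
#include <string>

// Hypothetical logger: the message is passed as a callable and is only
// built when debug output is enabled.
class LazyLogger
{
public:
    explicit LazyLogger(bool debugEnabled) : debugEnabled(debugEnabled) { }

    // Returns true when the entry was evaluated and recorded.
    bool Debug(const std::function<std::string()>& entry)
    {
        if (!debugEnabled)
            return false; // entry() is never called, so no string is built

        lastEntry = entry();
        return true;
    }

    std::string lastEntry;

private:
    bool debugEnabled;
};

// Counts how many times the message is actually composed.
int countEvaluations(bool debugEnabled)
{
    LazyLogger log(debugEnabled);
    int evaluations = 0;
    int index = 3;
    int count = 10;

    log.Debug([&] {
        ++evaluations;
        return "Processing " + std::to_string(index) + " of " + std::to_string(count);
    });

    return evaluations;
}
```

With debugging off the lambda body never runs, so the concatenation costs nothing. The catch is that every call site has to spell out the lambda by hand, which is the boilerplate D's lazy keyword hides.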

Titanium and Kroll are written in C++ though, which doesn't have the same kind of syntactic niceties that D does. Recently, when putting the first bits of the logging system into Kroll, I decided that I wasn't satisfied with the Java-style debugging which I wrote above. I wanted to fully encapsulate the behavior of the logger and avoid paying the penalty of unneeded string processing. One common approach for C/C++ is to use macros, but as Walter Bright explains in his essay on lazy argument evaluation, this isn't the best solution.

Luckily, C and C++ have variadic arguments, which allow us both to write easy-to-read logging code and to avoid paying a penalty for those statements which should essentially be no-ops. I harnessed the power of a seldom-used friend from the C library for this task: vsnprintf. vsnprintf operates much like snprintf, except that instead of taking a variable list of arguments it takes a va_list (basically a variadic list of arguments that can be passed forward to other functions). Here is the resulting code from Kroll:

std::string Logger::Format(const char* format, va_list args)
{
    // Protect the buffer
    Poco::Mutex::ScopedLock lock(this->mutex);

    vsnprintf(Logger::buffer, LOGGER_MAX_ENTRY_SIZE - 1, format, args);
    Logger::buffer[LOGGER_MAX_ENTRY_SIZE - 1] = '\0';
    std::string text = buffer;
    return text;
}

void Logger::Log(Level level, const char* format, va_list args)
{
    Poco::Logger& loggerImpl = Poco::Logger::get(name);

    // Don't do formatting when this logger filters the message.
    // This prevents unnecessary string manipulation.
    if (level >= (Level) loggerImpl.getLevel())
    {
        std::string messageText = Logger::Format(format, args);
        this->Log(level, messageText);
    }
}

void Logger::Debug(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    this->Log(LDEBUG, format, args);
    va_end(args); // every va_start needs a matching va_end
}

A typical logging statement looks like this:

log.Debug("Processing %i of %i", index, count);
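To make the saving concrete, here is a self-contained miniature of the same pattern that can be compiled without Poco. Treat it as a sketch under my own assumptions rather than the Kroll code: MiniLogger, its Level values, lastEntry, formatCalls, and the 256-byte buffer are all hypothetical.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical miniature logger: it keeps the last formatted entry in
// memory instead of handing it to a real logging backend.
class MiniLogger
{
public:
    enum Level { LDEBUG = 0, LINFO = 1, LERROR = 2 };

    explicit MiniLogger(Level threshold) : formatCalls(0), threshold(threshold) { }

    void Debug(const char* format, ...)
    {
        va_list args;
        va_start(args, format);
        Log(LDEBUG, format, args);
        va_end(args);
    }

    std::string lastEntry;
    int formatCalls; // how many times vsnprintf actually ran

private:
    void Log(Level level, const char* format, va_list args)
    {
        // Filtered messages return before any formatting happens.
        if (level < threshold)
            return;

        char buffer[256];
        // vsnprintf writes at most sizeof(buffer) bytes, including the
        // terminating '\0', and truncates longer output.
        std::vsnprintf(buffer, sizeof(buffer), format, args);
        lastEntry = buffer;
        ++formatCalls;
    }

    Level threshold;
};
```

A logger built with MiniLogger::LERROR leaves formatCalls at zero after a Debug call, while one built with MiniLogger::LDEBUG formats the entry once. One caveat: the arguments themselves are still evaluated at the call site, so this only skips the formatting work, not the cost of computing an expensive argument.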

DBus and Threads

Friday, March 6, 2009

Sometimes you'll be using DBus with threads and notice intermittent segfaults with stack traces like this:

#0 0xb4d58d29 in _dbus_watch_invalidate (watch=0x0) at dbus-watch.c:147
#1 0xb4d57066 in free_watches (transport=0x979c0d0) at dbus-transport-socket.c:82
#2 0xb4d57e7e in socket_disconnect (transport=0x979c0d0) at dbus-transport-socket.c:908
#3 0xb4d561e8 in _dbus_transport_disconnect (transport=0x979c0d0) at dbus-transport.c:494
#4 0xb4d56c68 in _dbus_transport_queue_messages (transport=0x979c0d0) at dbus-transport.c:1137
#5 0xb4d3dc24 in _dbus_connection_get_dispatch_status_unlocked (connection=0x979c4b0) at dbus-connection.c:3983
#6 0xb4d3b695 in check_for_reply_and_update_dispatch_unlocked (connection=0x979c4b0, pending=0xb3c00550) at dbus-connection.c:2235
#7 0xb4d3b883 in _dbus_connection_block_pending_call (pending=0xb3c00550) at dbus-connection.c:2337
#8 0xb4d501c1 in dbus_pending_call_block (pending=0xb3c00550) at dbus-pending-call.c:707
#9 0xb4d7aedf in dbus_g_proxy_end_call_internal (proxy=0x9776f90, call_id=30, error=0xb4d2d30c, first_arg_type=158972744, args=0xb4d2d2e4) at dbus-gproxy.c:2256
#10 0xb4d7b90e in dbus_g_proxy_call (proxy=0x9776f90, method=0xb4dca4bb "Get", error=0xb4d2d30c, first_arg_type=64) at dbus-gproxy.c:2584
This happens because DBus doesn't do locking by default. I presume this is for performance reasons. Luckily, the solution to this problem is very simple: call dbus_threads_init_default() before you try accessing DBus.

Compiling D Source is Easy

Sunday, January 18, 2009

There are several compilers available for D. The original D compiler (the one written by Walter) is DMD. It has an open-source front-end (the part that generates the IL) and a proprietary, closed-source back-end (the part that converts the IL into machine code). Two other, fully open-source compilers exist, GDC and LDC, which use the GCC and LLVM back-ends respectively.

I like using GDC, because it's open-source and has better support for shared library linking on Unices. It's fairly easy to switch between GDC and DMD, as GDC includes a gdmd command which emulates the command-line behavior of DMD.

Previously I was using DSSS to build my D projects. While DSSS is very nice in many ways, it felt odd using a build tool that was really only popular in the D community. After we started using SCons at work, I was positively pickled to find that it supported building D code. SCons works just as easily for D code as it does for building other things.

Say you have a source file containing your ground-breaking implementation of the Smash Mouth Optimization (see Computational Football vol. 23), SmashMouth.d. To build this with SCons, simply create a SConstruct file that looks like this:
BuildDir('build', 'src')
e = Environment()
e.Program('build/SmashMouth.d')
This assumes that you have your source in a folder called src and you want to build into a folder called build (I know, I know...this is a pretty brazen assumption). Running scons should produce something like this:
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
gcc -o build/Test build/Test.o -lgphobos -lpthread -lm -lgphobos -lgphobos -lgphobos
scons: done building targets.
You can see here that scons decided to build using GDC and Phobos. The SCons D scanner is pretty smart about what compilers you have on your system. It's also configurable. Okay, here comes the magical wonderland part.

What happens when you decide that you want SmashMouthOptimizer to be a shared library? You have some options here.

  • You could simply look through all the man pages of the D compilers you are using, hunt for all the relevant linking options, and finally write conditional code in your Makefile to use them.

  • On the other hand, you could just change your SConstruct to look like this:
    BuildDir('build', 'src')
    e = Environment()
    e.SharedLibrary('build/SmashMouth.d')
    Then let scons do this:
    scons: Reading SConscript files ...
    scons: done reading SConscript files.
    scons: Building targets ...
    gdmd -I. -c -ofbuild/SmashMouth.os build/SmashMouth.d
    gcc -o build/libSmashMouth.so -shared build/SmashMouth.os
    scons: done building targets.

Different People

Saturday, January 3, 2009

Being There warms my heart for several reasons. One of the most notable is that Hal Ashby managed to turn a Cheech and Chong song into some kind of religious experience with umbrellas. There is also the unexpected vignette during the ending, which many people seem to hate. I have no idea why.

One of the lesser-known reasons, though, is that it exposes you to some truly awesome music from somewhere you didn't expect. If I had to construct a rigid philosophical framework, I think this would be a major tenet. You might be asking, "What is that unexpected source, Martin?"

Sesame Street


Here's the rundown: Buffy Sainte-Marie is talking to Big Bird about some of their interpersonal problems. Big Bird, of course, is suffering unshakable angst, so Buffy is forced to crack it in typical Sesame Street fashion. The music starts at 2:05, by the way.


