Improved drawing performance in MyPaint brush engine

A first set of performance improvements for the brush engine has just landed in MyPaint master. The goals for this work for me were, in priority: a) Making sure that moving to a GEGL backend in MyPaint does not reduce performance, b) Improve performance when integrating the MyPaint brush engine in other applications, and lastly c) Improving the performance in MyPaint itself.

TL;DR: * Users of the soon-to-be-released MyPaint 1.1 should experience about 15% faster drawing of strokes for medium to big brushes. * Switching to the GEGL based backed for MyPaint 1.2 is now both feasible and highly desirable from a performance perspective.


Optimizations

The optimizations are implemented through three complimentary strategies:

1. Deferred data access to minimize fetching and updating of tiles

All dab drawing operations that happen as a result of a motion update event are queued up. When the brush engine has calculated where all dabs should go, tiles are fetched, all dabs drawn and the tiles updated. This in contrast to before where each dab drawing operation would fetch and update tiles.

2. Coarse grained parallelism using multi-threading via OpenMP directives

The tiles to be processed are divided evenly between processing threads (one per core). Each tile is processed completely independent of other tiles, so there is no locking or synchronization in the drawing code. The tile backing store must naturally be thread-safe and may ensure this using locks.

3. Fine grained parallelism using SSE via GCC auto-vectorization

Within each tile we attempt to make use of auto-vectorization to create the brush dab mask and do the composition of the dab onto the tile. Currently this is only implemented for a part of the mask calculation.

Results

Details of the results and how they can be reproduced is found in the original email thread.

Gains for MyPaint 1.1

Starting with the lowest priority goal, but the most relevant to users; performance impact on MyPaint right now.

Surface drawing results for existing Python-based backend

In terms of raw speed of drawing brushes to onto the underlying surface, speedups range from 20% to 50% for larger brushes (16 px+). This sets an upper boundary for the speedup perceived by the user.

Looking at the UI-enabled benchmarks of MyPaint, which is doing everything a normal application instance does, including layer compositing and rendering to screen, around 15% speedup was observed.  As the UI benchmark only tests a single brush at size=8.0px, it is possible that larger brushes will a higher speedup.

Users of the soon-to-be-released MyPaint 1.1 should experience about 15% faster drawing of strokes for medium to big brushes.

Note that the backend in use does not make use of the multi-threading introduced by (2) due to the tile store not being thread-safe and that it already had a cache to mitigate the problem fixed by (1).
Note

GEGL-based backend results, outlook for MyPaint 1.2

Surface drawing results, GEGL versus Python-based surface.
Test results by Till Hartmann on his Phenom II X6.

So in terms of raw surface rendering speed, the GEGL based backend is now significantly faster than the Python-based one. With 1 and 2 threads it is respectively up to 25% and 100% faster for big brush sizes. With 6 threads, it can be up to 4 times faster.

Switching to the GEGL based backed for MyPaint 1.2 is now both feasible and highly desirable from a performance perspective.

Note that to see UI performance increases approaching the raw surface drawing performance increase we may also need to do the layer compositing multi-threaded.

Gains in other applications

I’m trying to convince the Krita guys to update to the new version and to provide some feedback on the impact. Other consumers of the MyPaint brush engine do not tend to communicate much with us (some are proprietary).
I have strong hopes that (1) should increase their performance radically as their tile get/set cost is significantly higher than in the MyPaint case: They need to convert between the internal Krita and the MyPaint brush engine working colorspace each time. They may also be able to enable multi-threading and see speedups similar to the GEGL-based backend as a result.

Future Work

This is only lays the groundwork of better optimized MyPaint brush engine, many areas have room for improvement. For one only a small subset of the heavy code is vectorized. There may be inner loops that can be tweaked. It may be that, with a different tile access pattern compared to before, a different tile size would be more ideal. Perhaps doing the expensive calculation of the brush dab could be avoided some times by caching them… Thinking bigger, one could move all the drawing (and rendering) to the GPU.

More details on these ideas can be found here. If you are interested in working on any of it, get in touch and start hacking!

Participating in Libre Graphics Research Unit residency

Following the Piksels & Lines research meeting in June and being accepted in the call for proposals I am now taking part in the Piksel & Lines Orchestra residency, hosted by Piksel and LGRU, together with media artist and engineer Brendan Howell. Together we will further develop the Piksels & Lines Orchestra, a system that turns the traditional libre graphics tools (like MyPaint, GIMP, Inkscape, Scribus) into instruments for use in an performance setting.

A prototype of this system is to be demonstrated at Piksel X in Bergen, November 21-25th. Piksel is an annual festival where artists and developers working with free and open source software, hardware and art come together. Its diverse program will include presentations, workshops, performances, and installations.

Within the Piksels & Lines Orchestra residency we will also realize an artwork that will be performed in Madrid, April at the Future Tools Conference. This event combines Libre Graphics Meeting, an annual artist and developer meeting around free and open source graphics software with Interactivos?’13, two weeks of project-centric workshops focused on collaborative creation using open hardware and open software. The call for projects is now open, focusing on “Tools for a Read-Write World”.

Don’t miss any of these events if you are interested in the intersection and interaction between artistic works and open tools!

from more than a dozen countries

Hello Oslo, Hello Squarehead Technology

Two years after moving to Berlin and joining Openismus, it is time for another big change. I learned a lot while at Openismus, and had a lot of fun both in and outside of working hours. Those of you who have been to the barbecue parties know what a great bunch of people they are. I’m very glad to see that they are now going strong again.

In December I will join Squarehead Technology in Oslo as a Software Developer. There I will work on their advanced microphone array systems for acoustic cameras and acoustical zoom. The role includes both real-time programming, digital signal processing of audio/video and embedded Linux, which is pretty much exactly what I was looking for.

Here is a very quick demo of the technology, used for noise analysis: Nor 848 video

The array microphone system records 200+ channels of audio simultaneously. By exploiting the time difference between channels, digital signal processing can extract and/or visualize audio content at different positions in the recording. This can be done both in real time, and in retrospect (unlike parabolic microphones).

Fun times ahead!

Piksels and Lines – Libre Graphics Research Unit seminar in Bergen

Last week I was so lucky as to attend the 3rd Libre Graphics Research Unit (LGRU) meeting in beautiful Bergen, Norway.

The meeting was titled Piksels and Lines and had “a particular focus on improvements, interoperability between and fringe use of F/LOSS graphic bitmap and vector software, as well as generative software used in performative contexts.”

The meeting was structured into three different areas: Seminar, Workshop and Performance.

Seminar

The attendants that were invited for the meeting each did a presentation of their choosing. They were recorded and are available in the online archive of the seminar. The video quality leaves something to be desired, but the audio is generally good.

The presentations I found particularly interesting were:

I gave a presentation titled MyPaint and cross-application workflows. It was an introduction to MyPaint as a creative tool, how it combines raster and vector (piksels and lines) concepts, and my perspective on interoperability between libre graphics applications.

 

 

Workshop

I had hoped to hack some code for one of my existing ideas during the workshops. That did not happen. Instead I ended up hacking specifications. Maybe that is just as good. Hacking one can always do later, hashing out and documenting ideas has to be done while it is fresh.

First the results of some discussions with Øyvind Kolås, the GEGL maintainer:

A journal for GEGL: transaction log over changes made to a GEGL graph. Specification. Discussion. This feature would allow for applications based on GEGL to:

  • Implement non-linear histories (undo/redo), and a timeline of the changes
  • Store the history in a document like OpenRaster
  • Share the history between different applications
  • Let multiple applications to work on the same document at the same time

A strategy for improved file format support in GEGL, and using this to improve  file support and interoperability in libre graphics applications. Proposed plan.

Executing this plan would move a lot of the existing file format support from GIMP (PSD, XCF, OpenRaster) down into GEGL so that it can be reused across applications. And would then let GEGL provide image support plugins for GdkPixbuf and QImage – so that at the very least – previews will work everywhere.

 

Chatting with Egil Möller, creator of Sketchspace, also resulted in:

A web based system supporting a continuous work-flow from free-hand
sketch to finished product. Concept and mockups

“Imagine starting from a freehand drawing or imported raster image and gradually refining this into a technical document with illustrations, UML-diagrams or even running code or a 3d model.”

Refining here means that the user guides the tool to transform freehand sketch into vector paths, then into vector shapes, then into something domain-specific and formal like UML – by adding additional data like annotations, strengthening of lines to “disambiguating” the transformation.

Needless to say, this is more a visionary thing. Realizing this would involve finding good solutions to a fair amount of computer vision problems.

Making GEGL available for use in web-based applications. Proposal.

More on the concrete side: allow GEGL to be used in interactive or batch-oriented web applications, or in native applications based on web technologies (Javascript, HTML5 user interfaces).

Some of the discussions also resulted in me writing down the strategy for GEGL integration in MyPaint and the related ideas/plans for how to improve the performance of the MyPaint brush engine.

Now we just need to implement all the stuff… Contributions welcomed!

 

Performance

Since Piksel, with a long history in generative performance arts, was the hosting organization it was not surprising a project in that area materialized.

A workshop session hosted by media artist Brendan Howell called Demonstrating the Unexpected came up with the idea of the Piksels & Lines Orchestra (PLO): think of the collaborative use of our traditional libre graphics software as an orchestra. The applications, from MyPaint to Scribus, are instruments; the people using them players; a performance the use of these instruments. Can we create an experience for an audience based on this framework? How would it sound? How would it look?

Architecture diagram - no need for them to be dull looking.

Having plenty of code-crafting people available, the next afternoon it was decided to spend a couple of hours realizing a prototype. The LGRU blog has the details. We recorded video of our initial performances with this prototype as well, but that has sadly not made it online yet…

 

Thanks!

Thanks a lot to Piksel and LGRU for sponsoring my attendance, and the EU Culture Programme and Bergen municipality for funding activities that support libre graphics and free culture!

MyPaint and goats at LGM2012

Already covered in the news from LGM was the release of GIMP 2.8, and that GIMP 2.10 will be fully GEGLified. The goat-invasion branch which has most of that work, the result of 3 weeks of pippin and mitch on a couch hacking together, has already landed in master. This means that GIMP now has support for high bit-depth workflows for most operations. Finally.

Putting the goat in MyPaint

During LGM I started working on using GEGL in MyPaint. I have already mentioned this idea several times, so it was time to stop talking and get hacking.

As a first step in making use of GEGL I wanted to replace the current surface implementation with one based on GeglBuffer. Since GeglBuffer already provides tiling, and can store any buffer data supported by Babl this turned out to be easy. Øyvind (pippin) added the semi-quirky pixel format we currently use* in MyPaint to Babl, and I was able to get a rough working GEGL based Surface implementation the first evening.

The MyPaint brush engine working on top of GeglBuffer

* RGBA premultiplied alpha, in 16 bit unsigned integers with  2^15 being the maximum value.

The next couple of days went to moving to the GeglBufferIterator API instead of gegl_buffer_{get,set} to have zero-copy access to improve performance, and improving GEGL and GEGL-GTK so that some of the hacks in the initial implementation could be removed.

Most of the work is in the gegl branch of MyPaint. A simple test application, mypaint-gegl.py, is included, and you can read README.gegl for how to try it out. Warning: only intended for curious developers at this stage.

A lots of work remains to be done for MyPaint to be able to fully use GEGL. The progress is tracked in two bugs, one for MyPaint work and one for GEGL issues. Because one cannot combine PyGObject with PyGTK, it will likely not be possible to fully integrate GEGL in MyPaint before porting to PyGI and GTK+ 3.

Oh, in case the goat references are lost on you – check the GEGL page on wikipedia.

Application-hosted and compositor-hosted Maliit

The standard way of deploying Maliit is to have a single maliit-server instance (per user session), hosting the actual input method (virtual keyboard, handwriting). Applications then communicate with the server (and by extension, the IM) through an IPC.

This allows for a single instance of Maliit to serve all applications, which is memory efficient and robust. A crash in a Maliit IM plugin cannot take down the application and risk loss of significant user data. The disadvantage is the increased system complexity (a separate server process needs to be running at all times*) and requiring compositing of the application and input method windows. The latter can be quite challenging to do in a well-performing way on low-powered mobile/embedded devices. See Jans blogpost for how we handled that on the Nokia N9.

* By default we make use of DBus autostarting, of course.

Application-hosted Maliit

To make Maliit more suitable for systems where only a single application runs (embedded) or compositing performance is not good enough, we now also allow Maliit to be “application-hosted”: the Maliit server and input method plugins lives in the application process, not a separate server process. Enabling this feature has been a long running task of mine: All the code in input-context and server was made transport independent, a direct transport (no IPC) was introduced, and setting up the server for a given configuration (X11, QPA, app-hosted) was simplified. Other motivations for this work include being able to run the server and IM plugin easily for automated end-to-end system or acceptance testing, or just to easily start the server with a given IM plugin loaded for quick manual testing during development (see Michaels merge request).

An example application exists as part of the Maliit SDK that demonstrates the feature: maliit-exampleapp-embedded

Maliit running in application-hosted mode: The Maliit Server and input method plugin is embedded in the application instead of running in a standalone server process.

This works by having a special input-context “MaliitDirect” which instead of connecting to the server over DBus, creates the server and a direct connection. As when running standalone the server will instantiate and manage the necessary input method plug-ins.

Because the IM does not have its own window in this configuration, the application is responsible for retrieving the IM widget from the server, and re-parenting it into the appropriate place in the widget hierarchy. For all other purposes the application uses the same interface as if the IM was hosted remotely, making sure the abstraction is not broken and that one can easily use the application with Maliit deployed in different configuration.

This feature currently works with Qt4 applications, and is in Maliit since the latest release (0.90.0). One issue is that with the current input method API, the plugin assumes a fullscreen window; overlays extending the base area of the IM will be clipped and size needs to be overridden. This is something we are fixing in the new improved API.

Compositor-hosted Maliit

Another approach to make rendering perform better is to host the input method in the process responsible for the compositing. This also reduces the number or processes involved in rendering/compositing, and the associated overhead. This could be a X11 compositing window manager (like KWin or mcompositor), but a more realistic use-case is a Wayland compositor (for instance based on QtCompositor).

The API allows the consumer to inject an class instance for the configuration dependent logic, allowing to integrate the Maliit server with the logic in the rest of the compositor. Applications will use the normal “Maliit” inputcontext and communicate to the server through an IPC like DBus.

After the work with application-hosted Maliit, this feature was completed by making the server and connection libraries available as public API. The API is available in the latest  Maliit release (0.90.0), but is considered unstable until Maliit hits the 1.0 mark.

 

 

 

 

Maliit on Windows: Basic build working

Enabling third-party developers of input methods is one of the primary goals in the Maliit project. In an attempt to improve this story I spent some time on getting Maliit to work on Windows.

Since we use Qt there were few changes needed to the code, but since we use qmake, quite many to the build system. One of the bigger changes was making glib-dbus and qdbus  optional, which is also useful for Maliit on embedded systems.

With the Windows build fixes merge requests for maliit-framework and for maliit-plugins, one can now build Maliit on Windows and run the provided example applications. This feature is currently being reviewed and should be in the next Maliit release.

Thanks to the standalone viewer application for Maliit Keyboard this allows one to develop new features, theming and language layouts for it on Windows.

Sadly loading an input method plugin in the maliit server crashes for an unknown reason. With my limited Windows software development experience I was not able to solve this within the couple of days I had available. This is necessary for application-hosted Maliit to work and to enable general development of Maliit input method plugins (not just maliit-keyboard). Help would be much appreciated, even just someone checking  if it is reproducible on another Windows system.

Also left on the todo-list due to lack of time is to set up a Windows build slave for the Maliit buildbot, to test that the build continues to work on Windows and to produce executables.