Planet Collabora

October 18, 2014

Jeremy Whiting

Upcoming KDE Applications 14.12 release prep

Hello,

In preparing for the upcoming releases of KDE Applications 14.12 (2014 Month 12) I realized the other day that we have an interesting situation.  For Qt4 based applications there's libkdeedu which contains the kvtml parsing and manipulating code and also a handful of .kvtml files that KAnagram and KHangMan use to get their word lists. KAnagram has been ported to Qt5 and KDE Frameworks for some time now, and will have it's first Qt 5 based release at the end of this year. It uses libkeduvocdocument which was ported to Qt 5 also at about the same time this year. libkeduvocdocument uses Qt 5 and KDE Frameworks 5, and also ships the same handful of kvtml files that libkdeedu ships. (libkdeedu has been split for the Qt 5 based releases) KHangMan hasn't yet been ported to frameworks and Qt5 or at least the port isn't stable yet, so it will depend on libkdeedu still, as will KWordQuiz and Parley from what I understand. So we have two libraries that ship the same files, makes them not coinstallable. So we'll be moving the kvtml files out of the libraries and into kdeedu-data soonish to solve this problem. The moral of this story is to look around, see what will be released using Qt5 in the upcoming release, and what will be using Qt4 still. https://community.kde.org/Frameworks/Application-release-status-December-2014 may help also. If you maintain an application and haven't put your application on that page under the Qt4 or Qt5 tables yet, please do, the more we coordinate, the better this release will be.

Thanks, and keep up the good work all.

October 11, 2014

Jeremy Whiting

Simple Elegance

I just noticed a couple of features today and yesterday in plasma next and kwin that I appreciate and wanted to thank whoever thought of adding them. Both are simple but very handy to have. I'm talking about the little X buttons on both the wallpaper configuration dialog and the kwin present windows effect. I don't use either of these features very often, but yesterday when I was testing knewstuff with the wallpaper config it was very handy to be able to delete the installed wallpaper from the wallpaper selection dialog. Then just now it was very handy to be able to close extra windows I had open that I no longer need when I was in the present windows effect looking at what I need to be doing next. Makes it very simple to clean up a workspace.

Just throwing this out there, thanks whoever added these simple nice features.

October 07, 2014

Jeremy Whiting

Plasma Next Improvements and KApplication -> QApplication gotchas

tldr, if you port from KApplication to QApplication, remove the %i from the Exec lines of your .desktop files.

Hey all, so I've been running plasma-next on my main development machine for a month or two now and have definitely enjoyed the speed at which improvements come. Just in time for plasma 5.1 release timezone support was added to the digital clock, so you can choose which timezones you want to see, use your mouse scrollwheel to change which one is visible, and so on. Also because our translators are awesome we got a last minute change in after the string freeze with their permission to show all the configured timezones in the clock's tooltip (similar but not identical to how it worked in kdelibs4 based plasma times). I enjoy the new Breeze theme and icons, and the alternatives system for switching between different k-menus, different task managers (I'm currently enjoying the icons only one) etc. is so handy.

In other news something that has been bothering me for a while is that okular just didn't want to get launched from any visual launcher. Clicking on a pdf in dolphin acted like it was launching, but no Okular ui would appear. Clicking view book on pdf books in Calibre would do likewise. So I spent a couple of days adding debug messages to kinit and klauncher trying to figure out what was going on. Kate launches just fine, so I tried copying the kate.desktop to okularApplication_pdf.desktop and replaced kate with okular etc. and that worked fine also.

So today I asked Albert if he had any ideas and got thinking it had to be something in the .desktop files themselves. So I uncommented another qDebug line in klauncher that said exactly what it was asking kinit to start and found it using "/usr/local/bin/okular blah.pdf --icon okular. So I tried the same from a terminal and found that okular's binary failed to launch because it doesn't understand the --icon parameter. A bit of digging found that KApplication handled that argument, while QApplication doesn't and the frameworks port of okular like a good example ported from KApplication to QApplication already.
Klauncher puts --icon blah in when you have %i in the Exec line of the .desktop file. So if you port from KApplication to QApplication, be sure to remove the %i from the .desktop files of your application also.

October 06, 2014

Alban Crequy

Improving the security of D-Bus

In the last months, I have been working on improving the security of D-Bus, mainly to make it more resistant to denial of service attacks. This work was sponsored by Collabora.

Eight security issues were discovered, fixed and attributed a CVE. They were found by looking at the source code (in D-Bus and Linux' af_unix implementation), checking existing issues in the D-Bus bugzilla and a bit of luck.

Security issues fixed in D-Bus

• CVE-2014-3477(Bug #78979): dbus-daemon sent an AccessDenied error to the service instead of a client when the client is prohibited from accessing the service, which allowed local users to cause a denial of service (initialization failure and exit) or possibly conduct a side-channel attack via a D-Bus message to an inactive service.
• CVE-2014-3532(Bug #80163): when running on Linux 2.6.37-rc4 or later, local users could cause a denial of service (system-bus disconnect of other services or applications) by sending a message containing a file descriptor, then exceeding the maximum recursion depth before the initial message is forwarded.
• CVE-2014-3533(Bug #80469): dbus-daemon allowed local users to cause a denial of service (disconnect) via a certain sequence of crafted messages that caused the dbus-daemon to forward a message containing an invalid file descriptor.
• CVE-2014-3635(Bug #83622): an off-by-one error in dbus-daemon allowed remote attackers to cause a denial of service (dbus-daemon crash) or possibly execute arbitrary code by sending one more file descriptor than the limit, which triggered a heap-based buffer overflow or an assertion failure.
• CVE-2014-3636(Bug #82820): a denial-of-service vulnerability in dbus-daemon allowed local attackers to prevent new connections to dbus-daemon, or disconnect existing clients, by exhausting descriptor limits.
• CVE-2014-3637(Bug #80559): malicious local users could create D-Bus connections to dbus-daemon which could not be terminated by killing the participating processes, resulting in a denial-of-service vulnerability.
• CVE-2014-3638(Bug #81053): dbus-daemon suffered from a denial-of-service vulnerability in the code which tracks which messages expect a reply, allowing local attackers to reduce the performance of dbus-daemon.
• CVE-2014-3639(Bug #80919): dbus-daemon did not properly reject malicious connections from local users, resulting in a denial-of-service vulnerability.

Other fixes

In addition to fixing specific bugs, I also explored ideas to restrict the number of D-Bus connections a process or a cgroup could create. After discussions with upstream, those ideas were not retained upstream. But while working on cgroups, my patch for parsing /proc/pid/cgroup was accepted in Linux 3.17.

Identify bogus D-Bus match rules

D-Bus security issues are not all in dbus-daemon: they could be in applications misusing D-Bus. One common mistake done by applications is to receive a D-Bus signal and handle it without checking it was really sent by the expected sender. It seems impossible to check the code of all applications potentially using D-Bus in order to see if such a mistake is done. Instead of looking the code of random applications, my approach was to add a new method GetAllMatchRulesin dbus-daemon to retrieve all match rules and look for suspicious patterns. For example, a match rule for NameOwnerChanged signals that does not filter on the sender of such signals is suspicious and it worth checking the source code of the applications to see if it is legitimate. With this method, I was able to fix bugs in Bluez, ConnMan, Pacrunner, Ofonoand Avahi.

GetAllMatchRulesis released in dbus 1.9.0 and it is now possible to try it without recompiling D-Busto enable the feature. I have used a script to tell me which processes register suspicious match rules. I would like if there was a way to do that in a graphical interface. It's not ready yet, but I started a patch in D-Feet.

October 01, 2014

Philip Withnall

Dynamic relocs, runtime overflows and -fPIC

While merrily compiling something a little while ago, my linker threw me this gem of an error message (using GNU gold):

error: libmumble.a(libmumble.o): requires dynamic R_X86_64_PC32 reloc against 'g_strdup' which may overflow at runtime; recompile with -fPIC

or, if you’re using GNU ld (the two linkers have different error messages for the same problem):

error: mumble.o: relocation R_X86_64_PC32 against symbol g_strdup' can not be used when making a shared object; recompile with -fPIC

I recompiled everything with -fPIC, and magically the problem went away. But I didn’t understand why. I finally got a bit of time to investigate, so here we go.

tl;dr: This is caused by linking a shared library (which requires position-independent code, PIC) to a static library (which has not been compiled with PIC). You need to either link the shared library against a shared version of the static code (such as is produced automatically by libtool), or re-compile the static library with PIC enabled (-fPIC or -fpic).

To understand this, we need a brief introduction to the different types of linking, and how static objects and libraries differ from shared (or dynamic) objects and libraries. Let’s run with a minimal working example: two C files, shared.c and static.c. static.c is compiled to a static archive, libstatic.a (without position-independent code, PIC), and shared.c is compiled to a shared object, libshared.so, which links against libstatic.a.

What is a static object? It’s one where all symbol references are resolved at compile time. What’s a dynamic object? One where symbol references can be resolved at runtime. This means that dynamic objects have to have relocations performed as they’re loaded, which incurs a load-time penalty, but allows for shared libraries and symbol interpositing.

It is these relocations which cause the problem hinted at by the error message above. Each relocation is effectively a note to the runtime loader instructing it to replace a symbol reference in the dynamic object being loaded, with an address calculated at load time.

There are various types of relocations, defined by the platform ABI, as they are specific to the processor’s instruction set. For a more in-depth account of them, see Relocations, Relocations by Michael Guyver. In this case, the R_X86_64_PC32 relocation was chosen by the compiler, which is defined by the AMD64 ABI (Table 4.10). What does that mean? Each relocation type is essentially a mathematical function to define the address of a relocated symbol, given the information in various symbol, section and relocation tables in the dynamic object. The ABI defines R_X86_64_PC32 as . Less succinctly, it is the offset of the referenced symbol, plus a constant adjustment (the addend) minus the offset of the relocation. This is all explained brilliantly by Michael Guyver on his blog.

So, with our example, we get the error:

$make libshared.so cc -Wall -c -o shared.o shared.c cc -Wall -c -o static.o static.c ar rcs libstatic.a static.o cc -shared -o libshared.so shared.o libstatic.a /usr/bin/ld: error: shared.o: requires dynamic R_X86_64_PC32 reloc against 'my_static_function' which may overflow at runtime; recompile with -fPIC collect2: error: ld returned 1 exit status make: *** [libshared.so] Error 1 If we look at the disassembly of the shared object: $ objdump -d shared.o

shared.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <my_shared_function>:
0:    55                       push   %rbp
1:    48 89 e5                 mov    %rsp,%rbp
4:    e8 00 00 00 00           callq  9 <my_shared_function+0x9>
9:    5d                       pop    %rbp
a:    c3                       retq

we can see at offset 4 that the callq instruction (calling my_static_function()) leaves 4 bytes for the address of the function to call (actually, callq is instruction-pointer-relative, so the 4 bytes are for the offset of the function from the RIP register).

As the code in libstatic.a is not PIC, it has to be loaded at a fixed offset in a process’ address space. The shared library, libshared.so, must be capable of being loaded anywhere in an address space. This would be fine if the callq instruction could take an absolute address to call, as the linker could substitute in the absolute address of my_static_function() (as is done on 32-bit systems). However, it cannot – it only has 4 bytes of operand to play with, rather than the 8 needed for a 64-bit address – so linking has to fail. And that’s why we get an error which talks about overflow.

What happens if libstatic.a is compiled with PIC enabled? Not a whole lot changes, actually. The disassembly of libstatic.a remains unchanged. shared.o gains a global object table (GOT) section and its relocation for the my_static_function() call changes from R_X86_64_PC32 to R_X86_64_PLT32 — a procedure linkage table (PLT) relocation using the GOT. We can see that in action in the disassembly of the successfully-linked libshared.so (with irrelevant bits omitted):

$objdump --disassemble libshared.so libshared.so: file format elf64-x86-64 Disassembly of section .plt: 00000000000005f0 <my_static_function@plt>: 5f0: ff 25 fa 13 00 00 jmpq *0x13fa(%rip) # 19f0 <_GLOBAL_OFFSET_TABLE_+0x28> 5f6: 68 02 00 00 00 pushq$0x2
5fb:    e9 c0 ff ff ff           jmpq   5c0 <_init+0x20>

Disassembly of section .text:

00000000000006e8 <my_shared_function>:
6e8:    55                       push   %rbp
6e9:    48 89 e5                 mov    %rsp,%rbp
6ec:    e8 ff fe ff ff           callq  5f0 <my_static_function@plt>
6f1:    5d                       pop    %rbp
6f2:    c3                       retq
6f3:    90                       nop

00000000000006f4 <my_static_function>:
6f4:    55                       push   %rbp
6f5:    48 89 e5                 mov    %rsp,%rbp
6f8:    5d                       pop    %rbp
6f9:    c3                       retq
6fa:    66 90                    xchg   %ax,%ax

Firstly, the callq instruction in my_shared_function() has acquired a non-zero operand. This is a constant offset from the instruction pointer at that instruction which references the entry for my_static_function() in the PLT, which we can see as my_static_function@plt in the .plt section. Rather than being the code for the my_static_function(), this is actually a ‘trampoline’ which loads the address of my_static_function() from the GOT, then jumps to it. The GOT is set up by the runtime loader, and allows for the address of my_static_function() to be changed; for example when relocating it, or when interpositing a different version using LD_PRELOAD. By default, the GOT entry for my_static_function() will point to the implementation in the .text section, as linked in from libstatic.a.

This trampolining through a PLT and GOT is the standard solution for producing position independent code, and demonstrates three things:

• Exported functions incur a runtime cost (in the PLT) on every call. This can be eliminated for private symbols (or internal calls to public symbols, with -Bsymbolic), but not (easily) for public ones, as explained by Ian Lance Taylor. This cost is only three instructions; as they change control flow, they could be relatively expensive, but are probably also catered specifically for in modern superscalar 64-bit processors, as the majority of the code they execute will do indirect function calls this way. So the cost can be safely ignored for all but rather specific use cases.
• Position independent code is easy to achieve, and the indirection it requires brings other benefits like the ever-useful LD_PRELOAD, used by developer tools everywhere.
• Marking internal functions as static is important, because ELF exports functions by default, so internal function calls end up being indirected through the PLT if you omit the static modifier. (Though note that none of the functions here could have been marked as such, as they were all in different compilation units.)

So in summary:

• The “requires dynamic R_X86_64_PC32 reloc against ‘mumble’ which may overflow at runtime; recompile with -fPIC” error is caused by attempting to link a shared library against a static object.
• One solution is to compile a position-independent version of the static object. libtool does this automatically, so why aren’t you using libtool?
• Another (highly related) solution is to link against a shared version of the static object.
• This isn’t an issue on 32-bit systems because PIC is possible by default on those systems, since instruction operands are wide enough to contain absolute symbol addresses .
• Compiling with position independent code introduces a procedure linkage table (PLT) and global offset table (GOT) for each object file, which are very hard to eliminate if you want to avoid the (small) function call overhead they introduce.
• So you should avoid PIC if compiling for constrained targets like embedded devices.
• But use it otherwise (e.g. on desktop systems) for the flexibility (the use of shared libraries!) and security (address space layout randomisation) it affords.

Source code for the example here is available on gitorious in the public domain.

September 17, 2014

Philip Withnall

Long live gnome-common? Macro deprecation

gnome-common is shrinking, as we’ve decided to push as much of it as possible upstream. We have too many layers in our build systems, and adding an arbitrary dependency on gnome-common to pull in some macros once at configure time is not helpful — there are many cases where someone new has tried to build a module and failed with some weird autotools error about an undefined macro, purely because they didn’t have gnome-common installed.

So, for starters:

What does this mean for you, a module maintainer? Nothing, if you don’t want it to. gnome-common now contains copies of the autoconf-archive macros, and has compatibility wrappers for them.

In the long term, you should consider porting your build system to use the new, upstreamed macros. That means, for each macro:

2. Adding the macro to EXTRA_DIST in Makefile.am.
3. Ensuring you have ACLOCAL_AMFLAGS = -I m4 \${ACLOCAL_FLAGS} in your top-level Makefile.am.
4. Updating the macro invocation in configure.ac; just copy out the shim from gnome-common.m4 and tidy everything up.

Here’s an example change for GNOME_CODE_COVERAGE ? AX_CODE_COVERAGE in libgdata.

It seems from the comments that there’s more discussion to be had about the best way to implement this. So hold off on these changes for the moment!

This is the beginning of a (probably) long road to deprecating a lot of gnome-common. Macros like GNOME_COMPILE_WARNINGS and GNOME_COMMON_INIT are next in the firing line — they do nothing GNOME-specific, and should be part of a wider set of reusable free software build macros, like the autoconf-archive. gnome-common’s support for legacy documentation systems (DocBook, anyone?) is also getting deprecated next cycle.

Comments? Get in touch with me or David King (amigadave). This work is the (long overdue) result of a bit of discussion we had at the Berlin DX hackfest in May.

September 08, 2014

Xavier Claessens

OpenGlucose: Again

Made progress this weekend on OpenGlucose. The GUI is still ugly but it has the info I want.

Important things on my wish-list, when I have time:

1. Handling the units. My only device is mg/dl but other countries uses mmol/L. Since I’m living in Canada where they use mmol/L I should grab a new device with those units, so I’ll be able to compare logs and see how to know in which unit the device is configured.
2. Make printable report, I’ve heard doctors like that. OTOH, I shouldn’t encourage using unofficial app for medical purpose.
3. Support other FreeStyle devices. I’m pretty sure they all have the same kind of format so most of the parser should be reusable, I hope. I should be able to get a spare FreeStyle Freedom Lite in a few weeks.
4. Publish an ubuntu package on a PPA.
5. CSV export.
6. Ideas?

I’m curious, does someone else own an InsuLinx and tried OpenGlucose yet? Or even tried to support another device?

September 04, 2014

Nick Richards

First as tragedy

Nine years ago I drew some diagrams attempting to reverse engineer the coffee, milk and foam proportions available in the beverages served at Eat. I'm not going to pretend that this was particularly original work but when walking past one of their branches the other day I noticed some uncannily familiar design.

Actually pretty decent communication, well done. Avoid the actual drinks there though.

September 01, 2014

Marco Barisione

A web browser for the Raspberry Pi

As I previously mentioned, Collabora has been working with the Raspberry Pi Foundation on various projects including a web browser optimised for the Raspberry Pi.
Since the first beta release we have made huge improvements; now the browser is more responsive, it’s faster, and videos work much better (the first beta could play 640×360 videos at 0.5fps, now we can play 25fps 1280×720 videos smoothly). Some web sites are still a bit slow (if they are heavy on the JavaScript side), but there’s not much we can do for web sites that, even on my laptop with an Intel Core i7, use 100% of one of the cores for more than ten seconds!

The browser is based on Gnome Web (Epiphany) using WebKit 1 (i.e. the non-multi-process version of WebKit).

Our main achievements are:

• Progressive tiled rendering for smoother scrolling (as mobile browsers do)
• Startup is three times faster
• Avoid useless image format conversions
• Hardware decoding of videos (through gst-omx)
• Hardware scaling of videos (again, through gst-omx)
• Reduction of the number of memory copies to play videos
• Faster fullscreen playback using dispmanx directly (a bit buggy at the moment, we are working on it)
• Memory and CPU friendly tab management
• JavaScript JIT fixes for ARMv6
• Disk image cache (decoded images are kept in memory mapped files in a cache, saving CPU)
• Memory pressure handler support

No video displayed here? Watch the video on Youtube.
The Raspberry Pi web browser (mp4 video file)

To install the browser, just update your Raspbian and install the “epiphany-browser” package:

sudo apt-get update
sudo apt-get install epiphany-browser


Thanks to all the people at Collabora that, at some point or another, helped on this project: Julien Isorce, Emanuele Aina, ChangSeok Oh, Tomeu Vizoso, Pekka Paalanen, André Moreira Magalhaes, Derek Foreman, Gustavo Noronha, Danilo Cesar, Emilio Pozuelo Monfort and Jonny Lamb (I hope I haven’t forgotten anybody!).
Also thanks to the Raspberry Pi Foundation, and in particular to Eben Upton, for their commitment to making browsing on the Pi better, and to Ben Avison for his work on optimising pixman and libav for ARMv6.

Update: people have reported a few bugs since the release, in particular a problem with Raspbian configured to use 24-bit or 32-bit mode for graphics. We should be able to fix this in a week or so.
Another problem is that Vimeo videos stopped working. This seems to be due to a change made by Vimeo that broke playback also on other browsers and on Android.

August 27, 2014

Nicolas Dufresne

GStreamer gains V4L2 Mem-2-Mem support

Two years since my last post, that seems a long time, but I was busy becoming a GStreamer developer. This story started late in 2009. A team at Samsung (and many core Linux contributors) started adding new type of drivers to the Linux Media Infrastructure API (also known as Video4Linux 2). They introduced video decoding, encoding and video post-processing support through a class of drivers called memory-to-memory.

At the end of 2012, my employer, Collabora was chosen to implement a proof of concept, enabling hardware decoding support to the Cotton Candy, a USB stick size computer based on Samsung Exynos 4412 and built by FXI. The new element has been developed by Sebastien Dröge and was called mfcdec. All this being demonstration code, it never got close to being useful in production..

At the end of 2013, we got contracted again, to bring the demonstration code toward production code. At this point, we took the decision that we where no longer going to build an Exynos specific decoder, but instead re-use the existing GStreamer V4L2 support and do it the “right” way.

It took nearly three months, but with the help of my colleague Julien Isorce, we managed to upstream and ship hardware decoding support for the Cotton Candy. The new element is called v4l2videoNdec, where videoN is that name of the driver node (to allow having multiple decoder at the same time). The element was well suited for static pipeline and embedded applications, but not as flexible as software decoders for desktop.

At the beginning of 2014, we started a new project with Endless Mobile. This time, the goal was to do hardware accelerated decoding also on an Exynos 4412 platform, but in a desktop environment base on Gnome Shell. Two main issues had to be addressed. The buffer pool in GstV4l2 did not track it’s memory, and the color format produced by this decoder could not be color converter using GLES2 shader (not enough coordinate precision). We had to implement a custom memory allocator and rewrite most of the v4l2 buffer pool code. To handle the color format, we had to implement an element that wraps hardware video converter in order to obtain video frames in a format that can be uploaded to GLES2.

As of today, all this effort has landed into GStreamer and is now part of 1.4 GStreamer release. Some of my colleagues went even further by demonstrating during SIGGRAH the benefits of using V4L2 decoder when combining DMABUF and Wayland. Other team, including Pengutronix on Freescale CUDA and STE have started testing against this new promising decoder which finally brings a standard and low level way of decoding medias on Linux.

August 26, 2014

Xavier Claessens

OpenGlucose: continued

I started working on the UI to display the results:

It is made using a GtkApplicationWindow containing a WebkitWebView, the content is made with HTML/CSS/JS with jquery and the chart is made using jqplot.

To make testing easier, I also added a dummy device that has random data, it can be enabled by setting OPENGLUCOSE_DUMMY_DEVICE=1 in your env.

A lot more work is needed, but that’s a start.

August 17, 2014

Xavier Claessens

New project: OpenGlucose

Hello there,

I recently got diagnosed with a diabetes type 1. Like all diabetics, I got a glucometer device that comes with a windows/mac closed-source application. That’s clearly not acceptable for a freedom lover! So here is my new challenge: reverse-engineer the USB protocol of my Abbott FreeStyle InsuLinx device, and write an open source Linux application for it.

And here it is: https://github.com/xclaesse/OpenGlucose

So far it only fetch the bare minimum information from the device and print them in the terminal. More GUI/features will come later.

If you’re a geek diabetic, your help is welcome!

August 15, 2014

Olivier Crête

GNOME.Asia Summit 2014

Everyone has been blogging about GUADEC, but I’d like to talk about my other favorite conference of the year, which is GNOME.Asia. This year, it was in Beijing, a mightily interesting place. Giant megapolis, with grandiose architecture, but at the same time, surprisingly easy to navigate with its efficient metro system and affordable taxis. But the air quality is as bad as they say, at least during the incredibly hot summer days where we visited.

The conference itself was great, this year, co-hosted with FUDCon’s asian edition, it was interesting to see a crowd that’s really different from those who attend GUADEC. Many more people involved in evangelising, deploying and using GNOME as opposed to just developing it, so it allows me to get a different perspective.

On a related note, I was happy to see a healthy delegation from Asia at GUADEC this year!

August 13, 2014

The problem

These days there's quite good support for CPU scaling in the mainline kernel, and many ARM SoCs are making use of it already. But in modern hardware with lots of very fast external memory, running the memory bus at its maximum frequency drastically reduces the amount of time that the device can run when on battery.

A problem that many teams are finding when trying to upstream their power management code is that there's currently no way for several clock consumers to influence the frequency of the memory bus. There has been a few tries to upstream the solutions currently in vendor trees, but so far no acceptable solution has been found.

I'm helping to upstream some of the stuff in the ChromeOS tree, and this issue is currently blocking very interesting work from reaching mainline.

The past

In the vendor tree for Tegra this is addressed by creating virtual clocks that are child of the clock that wants to be influenced. Depending on the type of the virtual clock, setting its rate will influence the rate of its parent clock by setting a floor or ceiling value.

In Qualcomm's vendor tree for the Snapdragon family of SoCs, the concept of a voter clock is introduced. Drivers can vote on the rate of a given clock by "voting" through a child clock, so not that different to how Tegra does it.

Both approaches have the critical disadvantage of adding clk instances for things that aren't real clocks, thus making the API considerably more confusing for relatively little gain.

Both vendor trees have additional API for registering bandwidth needs: tegra_isomgr and msm_bus_scale. They bear quite some resemblance with each other and with pm_qos_interface, but both are tightly tied to specificities of their platforms.

The discussion was brought back to life a couple of months ago when a patch was posted for allowing the tegra-drm driver to set the frequency rate of the external memory controller based on the amount of bandwidth that was needed by the display controller for refreshing the display. Of course, that patch was rejected because there are other components that need to have a say in the frequency rate of the memory bus.

But in that discussion some kind of plan took form and I have been working on making something from it that can be merged upstream.

A possible future

There's so far two main additions to existing frameworks, with the rationale being explained further below:
• Add per-user floor and ceiling constraints to the Common Clock Framework, so drivers can set maximum and minimum frequency rates that the clock should respect. Patchset here.
• Add a PM_QOS_MEMORY_BANDWIDTH class to pm_qos, for drivers to register their expected bandwidth needs. Patchset here.
The idea is for the following agents to be able to influence the current frequency of the memory bus:
• Thermal: a cooling device would call clk_set_ceiling_rate to cap the memory bus to a frequency based on the current temperature.
• Power: a battery driver would set a ceiling in the same way, based on the remaining capacity.
• Devfreq: a devfreq driver wrapping a power management unit such as the ACTMON on Tegra or the PPMU on Exynos would set a floor frequency based on the current load stats.
• Cpufreq: a cpufreq driver would set a floor frequency based on the current CPU frequency.
• Devices that can anticipate how much memory bandwidth will need (such as the display controller, the camera, multimedia codecs, an ISP, USB, etc) would register their requirements in the PM_QOS_MEMORY_BANDWIDTH class. The EMC driver would be listening for notifications and setting a floor frequency based on the aggregated bandwidth that is needed.
The impression so far is that this approach matches the needs of the Tegra and Exynos SoCs, and people working on Rockchip upstreaming are evaluating it. Others working on other SoCs are very welcome to look at it and comment, so the result is also useful to them and they can improve their power management in mainline without having to refactor things later.

August 08, 2014

Frederic Plourde

Gecko on Wayland

At Collabora, we’re always on the lookout for cool opportunities involving Wayland and we noticed recently that Mozilla had started to show some interest in porting Firefox to Wayland. In short, the Wayland display server is becoming very popular for being lightweight, versatile yet powerful and is designed to be a replacement for X11. Chrome and Webkit already got Wayland ports and we think that Firefox should have that too.

Some months ago, we wrote a simple proof-of-concept basically starting from actual Gecko’s GTK3 paths and stripping all the MOZ_X11 ifdefs out of the way. We did a bunch of quick hacks fixing broken stuff but rather easily and quickly (couple days), we got Firefox to run on Weston (Wayland official reference compositor). Ok, because of hard X11 dependencies, keyboard input was broken and decorations suffered a little, but that’s a very good start! Take a look at the below screenshot :)

August 07, 2014

Frederic Plourde

Firefox/Gecko : Getting rid of Xlib surfaces

Over the past few months, working at Collabora, I have helped Mozilla get rid of Xlib surfaces for content on Linux platform. This task was the primary problem keeping Mozilla from turning OpenGL layers on by default on Linux, which is one of their long-term goals. I’ll briefly explain this long-term goal and will thereafter give details about how I got rid of Xlib surfaces.

LONG-TERM GOAL – Enabling Skia layers by default on Linux

My work integrated into a wider, long-term goal that Mozilla currently has : To enable Skia layers by default on Linux (Bug 1038800). And for a glimpse into how Mozilla initially made Skia layers work on linux, see bug 740200. At the time of writing this article, Skia layers are still not enabled by default because there are some open bugs about failing Skia reftests and OMTC (off-main-thread compositing) not being fully stable on linux at the moment (Bug 722012). Why is OMTC needed to get Skia layers on by default on linux ?  Simply because by design, users that choose OpenGL layers are being grandfathered OMTC on Linux… and since the MTC (main-thread compositing) path has been dropped lately, we must tackle the OMTC bugs before we can dream about turning Skia layers on by default on Linux.

For a more detailed explanation of issues and design considerations pertaining turning Skia layers on by default on Linux, see this wiki page.

MY TASK – Getting rig of Xlib surfaces for content

Xlib surfaces for content rendering have been used extensively for a long time now, but when OpenGL got attention as a means to accelerate layers, we quickly ran into interoperability issues between XRender and Texture_From_Pixmap OpenGL extension… issues that were assumed insurmountable after initial analysis. Also, and I quote Roc here, “We [had] lots of problems with X fallbacks, crappy X servers, pixmap usage, weird performance problems in certain setups, etc. In particular we [seemed] to be more sensitive to Xrender implementation quality that say Opera or Webkit/GTK+.” (Bug 496204)

So for all those reasons, someone had to get rid of Xlib surfaces, and that someone was… me ;)

The Problem

So problem was to get rid of Xlib surfaces (gfxXlibSurface) for content under Linux/GTK platform and implicitly, of course, replace them with Image surfaces (gfxImageSurface) so they become regular memory buffers in which we can render with GL/gles and from which we can composite using GPU. Now, it’s pretty easy to force creation of Image surfaces (instead of Xlib ones) for just all content layers in gecko gfx/layers framework, just force gfxPlatformGTK::CreateOffscreenSurfaces(…) to create gfxImageSurfaces in any case.

Problem is, naively doing so gives rise to a series of perf. regressions and sub-optimal paths being taken, for example, to copy image buffers around when passing them across process boundaries, or unnecessary copying when compositing under X11 with Xrender support. So the real work was to fix everything after having pulled the gfxXlibSurface plug ;)

The Solution

First glitch on the way was that GTK2 theme rendering, per design, *had* to happen on Xlib surfaces. We didn’t have much choice as to narrow down our efforts to the GTK3 branch alone. What’s nice with GTK3 on that front is that it makes integral use of cairo, thus letting theme rendering happen on any type of cairo_surface_t. For more detail on that decision, read this.

Upfront, we noticed that the already implemented GL compositor was properly managing and buffering image layer contents, which is a good thing, but on the way, we saw that the ‘basic’ compositor did not. So we started streamlining basic compositor under OMTC for GTK3.

The core of the solution here was about implementing server-side buffering of layer contents that were using image backends. Since targetted platform was Linux/GTK3 and since Xrender support is rather frequent, the most intuitive thing to do was to subclass BasicCompositor into a new X11BasicCompositor and make it use a new specialized DataTextureSource (that we called X11DataTextureSourceBasic) that basically buffers upcoming layer content in ::Update() to an gfxXlibSurface that we keep alive for the TextureSource lifetime (unless surface changes size and/or format).

Performance results were satisfying. For 64 bit systems, we had around 75% boost in tp5o_shutdown_paint, 6% perf gain for ‘cart’, 14% for ‘tresize’, 33% for tscrollx and 12% perf gain on tcanvasmark.

To see the code that we checked-in to solve this, look at those 2 patches :

https://hg.mozilla.org/mozilla-central/rev/a500c62330d4

https://hg.mozilla.org/mozilla-central/rev/6e532c9826e7

Cheers !

July 25, 2014

Pekka Paalanen

Wayland protocol design: object lifespan

Now that we have a few years of experience with the Wayland protocol, I thought I would put some of my observations in writing. This, what will hopefully become a series rather than just one post, considers how to design Wayland protocol extensions the right way.

This first post considers protocol object lifespan and the related races between the compositor/server and the client. I assume that the reader is already aware of the Wayland protocol basics. If not, I suggest reading Chapter 4. Wayland Protocol and Model of Operation.

How protocol objects are created

On a new Wayland connection, the only object that exists is the wl_display which is a specially constructed object. You always have it, and there is no wire protocol for creating it.

The only thing the client can create next is a wl_registry through the wl_display. Registry is the root of the whole interface (class) hierarchy. Wl_registry advertises the global objects by numerical name, and using wl_registry.bind request to bind to a global is the first normal way to create a protocol object.

Binding is slightly special still, as the protocol specification in XML for wl_registry uses the new_id argument type, but does not specify the interface (class) for the new object. In the wire protocol, this special argument gets turned into three arguments: interface name (string), interface version (uint32_t), and the new object ID (uint32_t). This is unique in the Wayland core protocol.

The usual way to create a new protocol object is for the client to send a request that has a new_id type of argument. The protocol specification (XML) defines what the interface is, so there is no need to communicate the interface type over the wire. All that is needed on the wire is the new object ID. Almost all object creation happens this way.

Although rare, also the server may create protocol objects for the client. This happens by having a new_id type of argument in an event. Every time the client receives this event, it receives a new protocol object.

As all requests and events are always part of some interface (like a member of a class), this creates an interface hierarchy. For example, wl_compositor objects are created from wl_registry, and wl_surface objects are created from wl_compositor.

Object creation never fails. Once the request or event is sent, the new objects it creates exists, period. This keeps the protocol asynchronous, as there is no need to reply or check that the creation succeeded.

How protocol objects are destroyed

There are two ways to destroy a protocol object. By far the most common one is to have a request in the interface that is specified to be a destructor. Most often this request is called "destroy". When the client code calls the function wl_foobar_destroy(), the request is sent to the server and the client side proxy (struct wl_proxy) for the object gets destroyed. The server then handles the destructor request at some point in the future.

The other way is to destroy the object by an event. In that case, no destructor must be defined in the interface's protocol specification, and the event must be clearly documented to be destructive as there is no automation nor safeties for this. This is for cases where the server decides when an object dies, and requires extreme care in protocol design to work right in all cases. When a client receives such an event, all it can do is destroy the proxy. The (in)famous example of an interface like this is wl_callback.

Enter the boogeyman: races

It is very important that both the client and the server agree on which protocol objects exist. If the client sends a request on, or references as an argument, an object that does not exist in the server's opinion, the server raises a protocol error, and disconnects the client. Obviously this should never happen, nor should it happen that the server sends an event to an object that the client destroyed.

Wayland being a completely asynchronous protocol, we have no implicit guarantees. The server may send an event at the same time as the client destroys the object, and now the event targets an object the client does not know about anymore. Rather than the client shooting itself dead (that's the server's job), we have a trick in libwayland-client: it silently ignores events to destroyed objects, until the server confirms that the object is truly gone.

This works very well for interfaces where the destructor is a request. If the client first sends the destructor request and then sends another request on the destroyed object, it just shot its own head off - no race needed.

Things get tricky for the other case, destructor events. The server may send the destructor event at the same time the client is sending a request on the same object. When the server finally gets the request, the object is already gone, and the client gets taken behind the shed and shot. Therefore pretty much the only safe way to use destructor events is if the interface does not define any requests at all. Ever, not even in future extensions. Furthermore, objects with that interface should not be used as arguments anywhere, or you may hit the race. That is why destructor events are difficult to use right.

The boogeyman's brother

There is yet another nasty race with events that create objects, i.e. server-created objects. If the client is destroying the (parent) object at the same time as the server is sending an event on that object, creating a new (child) object, the server cannot know if the client actually handled the event or not. If the client ignored the event, it will never tell the server to destroy that new object, and you leak in the server.

You could try to make your way out of that pitfall by writing in your protocol specification, that when the (parent) object is destroyed, all the child objects will be destroyed implicitly. But then the client must not send the destructor request for the child objects after it has destroyed the parent, because otherwise the server sees requests on objects it does not know about, and kicks you in the groin, hard. If the child interface defines a destructor, the client cannot destroy its proxies after destroying the parent object. If the child interface does not define a destructor, you can never free the server-side resources until the parent gets destroyed.

The client could destroy all the child objects with a defined destructor in one go, and then immediately destroy the parent object. I am not sure if that works, but it might. If it does not, you have to specify a whole tear-down protocol sequence. The client tells the server it wants to destroy the parent object, the server acks and guarantees it no longer sends any events on it, then the client actually destroys the parent object. Hey, you have a round-trip and just turned a beautiful asynchronous protocol into synchronous, congratulations!

Concluding with recommendations

Here are my recommendations when designing Wayland protocol extensions:
• Always make sure there is a guaranteed way to destroy all objects. This may sound obvious, but we have fixed several cases in the Wayland core protocol where there was no way to destroy a created protocol object such, that all resources on both server and client side could be freed. And there are still some cases not fixed.
• Always define a destructor request. If you have any doubt whether your new interface needs a destructor request, just put it there. It is more awkward to add later than normal requests. If you do not have one, the client cannot tell the server to free those protocol object resources.
• Do not use destructor events. They are hard to design right, and extending the interface later will be a bitch. The client cannot tell the server to free the resources, so objects with destructor events should be short-lived, and the destruction must be guaranteed.
• Do not use server-side created objects without a serious thought. Designing the destruction sequence such that it never leaks nor explodes is tricky.

July 10, 2014

Jeremy Whiting

Plasma Next is pretty darn stable

Today I wanted to share some of my experiences with using Plasma Next for the past couple of weeks. Since I had been working on some frameworks development (just a small bit here and there), I thought I'd try running Plasma Next a couple of weeks ago to see how things were coming along and to be able to work on and test some things I helped with back in KDE 4.0 days.

I have to say I'm very impressed with the stability.  I hit two issues since then, and one of the issues has been fixed. The issue I hit that has already been fixed was a crash in yakuake or konsole when closing a tab that caused the whole application to crash. I looked into the Konsole codebase with Eike Hein's guidance, but Argonel ultimately found the best patch to fix the problem.
The second issue I hit with Plasma Next has to do with disconnecting and reconnecting an external monitor. I don't do that very often at all, but when I tried last weekend I got a variety of issues. For example sometimes when disconnecting Plasma (or maybe KSmServer?) crashes and I am taken back to the sddm login screen. Other times when connecting my external screen my panel ends up floating on the external monitor but nothing on it is clickable.

I just realized this post probably sounds like a rant or complaining about Plasma Next, but that's not what I intended at all. The main point I wanted to get across is that I haven't used Plasma more than once in the past 2 weeks since Plasma Next is stable enough for my usage.

Now for the obligatory desktop screenshot:

So good job to all the people that have worked on this new iteration of the Plasma Desktop.

P.S. One other minor thing I miss from plasma is the ability to show multiple timezone's times in the clock's tooltip. I'll see if I can get that fixed though. :)

July 09, 2014

Jeremy Whiting

Qt5/KDE Frameworks porting steps

As I said in my last post I would elaborate about how porting of libkeduvocdocument (name pending currently) from Qt4 and kdelibs4 to Qt5 and KDE Frameworks happened.

Commits can be seen here but it went like this:
1. Change CMakeLists.txt to look for frameworks and Qt5 packages.
2. Try to build, fix any errors. All while checking the Porting Notes.
3. Port away from deprecated methods.
4. Port away from kdelibs4support.

I forget which part of the above involved each of these, but this is what was changed:
Ported from KUrl to QUrl.
Ported from KStandardDirs to QStandardPaths
Ported from KGlobal::locale() to QLocale
Ported away from other deprecated methods and classes.

So rinse and repeat until it's in a state where you are happy with it.
Note that step 4 above isn't strictly necessary, and is similar to porting Qt4 applications away from Qt3Support (Some kde4 applicationss never were ported away from Qt3Support sadly...) Yes KMouth, I'm looking at you.

July 07, 2014

Jeremy Whiting

Libkeduvocdocument Qt5/KDE Frameworks port

Hello all. Yes I'm still alive. Yes I'm still doing KDE stuff as I find time or make time. I'll report in the next few posts about what is happening and where we are going.
One of the things that happened recently was the port of libkeduvocdocument to Qt5 and frameworks. Vishesh started the effort, and I completed it with some review by Aleix Pol. It was decided as documented here that since libkdeedu only contains libkeduvocdocument it should be split up. Upon further investigation we realized that the other parts of libkdeedu are not used anymore. Besides the icons subfolder that are still looking for a home, the rest of the git repo is only libkeduvocdocument related, so we decided to just rename the git repository for the frameworks and going forward release. So the libkdeedu git repository holds the kde sc 4 codebase, while the libkeduvocdocument git repo holds the qt5 and frameworks based code. Both contain the history so all history is preserved.

I'll write next time about the steps taken to port the library to Qt5 and KDE Frameworks.

June 25, 2014

Emilio Pozuelo Monfort

Firefox and GTK+ 3

Lately at Collabora I have been working on helping Mozilla with the GTK+ 3 port of Firefox.

The problem

The issue we had to solve is that GTK+ 2 and GTK+ 3 cannot be loaded in the same address space. Moving Firefox from GTK+ 2 to GTK+ 3 isn’t a problem, as only GTK+ 3 gets loaded in its address space, and everything is fine. The problem comes when you load a plugin that links to GTK+ 2, e.g. Flash. Then, GTK+ 2 and GTK+ 3 get both loaded, GTK+ detects that, and aborts to avoid bigger problems. This was tracked as bug #624422.

More specifically, Firefox links to libxul.so, which in turn links to GTK+. These days, the plugins are loaded in a separate process, plugin-container, which communicates with the Firefox process through IPC. If plugin-container didn’t link to GTK+, there would be absolutely no problem, as the browser (Firefox) process could link to GTK+ 3 and plugin-container could load any plugin, including GTK+ 2 ones. However, although plugin-container doesn’t directly use GTK+, it links to libxul.so for IPC, which brings GTK+ into its address space.

The solution

In order to solve this, we evaluated various options. The first one was to split libxul.so in two parts, one with the IPC code and lower level stuff, which wouldn’t link to GTK+, and another side with the rest of the code, including all the widget and toolkit integration, which would obviously link to GTK+. However this turned not to be possible as the libxul code was too intricate.

While this solution is somewhat hacky, it means we didn’t need to mess with libxul, splitting it in two just for the Linux/GTK+ port’s sake. And when the GTK+ 2 plugins become irrelevant, or NPAPI support is removed (as it recently happened in Chrome), we should be able to easily revert this and use GTK+ 3 everywhere.

Wayland

On an unrelated note, we have looked a bit at porting Firefox to Wayland. Wayland is designed to be a replacement for X11, and is becoming very popular in the digital TV and set top box space. Those obviously need HTML engines and web browsers, and with WebKit and Chrome already having Wayland ports, we think Firefox shouldn’t fall behind.

For this, the GTK+ 3 port was a prerequisite, but that isn’t enough. There are many X11 uses on the Firefox codebase, most of which are guarded by #ifdef MOZ_X11, though not all of them are. We got Firefox to start on Weston (the Wayland reference compositor) with a bunch of hacks, one of which broke keyboard input (but avoided a segfault). As you can see from the screenshot, things aren’t perfect, but it’s at least a good start!

June 21, 2014

Nick Richards

Pinpoint COPR Repo

A few years ago I worked with a number of my former colleagues to create Pinpoint, a quick hack that made it easier for us to give presentations that didn't suck. Now that I'm at Collabora I have a couple of presentations to make and using pinpoint was a natural choice. I've been updating our internal templates to use our shiny new brand and wanted to use some newer features that weren't available in Fedora's version of pinpoint.

There hasn't been an official release for a little while and a few useful patches have built up on the master branch. I've packaged a git snapshot and created a COPR repo for Fedora so you can use these snapshots yourself. They're good.

June 20, 2014

Philip Withnall

gobject-introspection gets a development mailing list

Public service announcement: if you’re a bindings author, or are otherwise interested in the development of GIR annotations, the GIR format or typelib format, please subscribe to the gir-devel-list mailing list. It’s shiny and new, and will hopefully serve as a useful way to announce and discuss changes to GIR so that they’re suitable for all bindings.

Currently under discussion (mostly in bug #719966): changes to the default nullability of gpointer nodes, and the addition of a (never-null) annotation to complement (nullable).

June 10, 2014

GStreamer on wayland with GTK+

During the past few months I’ve been occasionally working on integrating GStreamer better with wayland. GStreamer already has a ‘waylandsink’ element in gst-plugins-bad, but the implementation is very limited. One of the things I’ve been working on was to add GstVideoOverlay support in it, and recently, I managed to make this work nicely embedded in a GTK+ window.

gtk+ video player demo running on weston

I’m happy to say that it works pretty well, even though GTK does not support wayland sub-surfaces, which was being thought of as a problem initially. It turns out there is no problem with that, and both the GTK and GstVideoOverlay APIs are sufficient to make this work. However, there needs to be a small addition in GstVideoOverlay to allow smooth resizing. Currently, I have a GstWaylandVideo API that includes those additions.

This essentially means that as soon as this work is merged (hopefully soon), there is nothing stopping applications like totem from being ported to wayland :D

I believe embedding waylandsink in Qt should also work without any problems, I just haven’t tested it.

If you are interested, check the code. The code of the above running demo can also be found here, and the ticket for merging this branch is being tracked here.

I should say many thanks here to my employer, Collabora, for sponsoring this work.

June 05, 2014

Nick Richards

Suburbs Driven by Trains Not Cars

I was reading Maciej Cegłowski's excellent talk, 'The Internet With A Human Face' and aside from it's persuasive argument found something interesting in the tension between his argument rooted in American reality and his German audience. I've also recently been reading 'Concretopia', a book about the post war rebuilding of Britain and realised that there, in the gap between the wars was the suburbia he wasn't aware of. I am of course talking of Metroland, where I grew up.

This environment was enabled by a fascinating business model where the railway company created their own demand. They built houses in the middle of nowhere near London where transport to work required a train journey, giving themselves ongoing revenue and financing the building of the railway (and more). Most recently London has seen a similar excercise in commercial demand creation with Arsenal financing their new stadium (and a bit more) with property development on the old stadium.

As such Metro-land was inherently commuter based from the beginning and unsurprisingly it remains so today, although the nearby presence of the M25 has drawn off some of the rail traffic that existed before. But what's different about a train driven suburbia? It's still strip development, as the logic of the tracks wants a direct and ideally flat route. However it also leads to a more freeway style of node location due to the distance required between stations if the network is to be efficient. There is a greater density of housing than you'd expect of car driven suburbia as well since most houses needed to be within walking or cycling range of the station. This distinctive landscape of small towns and villages where there's nothing to do is characteristic of Metro-land. A parade is different to a strip mall, but they're two sides of the same coin.

All that aside, the artistic flowering of this railway suburbia wasn't just in the lovely graphic design and dead on branding of the Metropolitan Railway company, the poetry of Betjeman, music of Elton John and writing of J.G Ballard are just as distinctively suburban and just as meaningful. In the end railway suburbia is fundamentally different and it comes down to a key phrase in Maciej's talk:

When everyone has a car, it means you can't get anywhere without one. Instead of freeing you, the car becomes a cage.

The railway isn't a cage, it's a highway to possibility.

Pekka Paalanen

From pre-history to beyond the global thermonuclear war

This is a short and vague glimpse to the interfaces that the Linux kernel offers to user space for display and graphics management, from the history to what is hot and new, to what might perhaps be coming after. The topic came current for me when I started preparing Weston for global thermonuclear war.

The pre-history

In the age of dragons, kernel mode setting did not exist. There was only user space mode setting, where the job of the kernel driver (if any) was simply to give user space direct access to the graphics card registers. A user space driver (well, Xorg video DDX, really, err... or what it was at the time of XFree86) would then poke the card registers to set a mode. The kernel had no idea of anything.

The kernel DRM infrastructure was started as an out-of-tree kernel module for cooperating between multiple programs wanting to access the graphics card's resources. Later it was (partially?) merged into the kernel tree (the year is a lie, 2.3.18 came out in 1999), and much much later it was finally deleted from the libdrm repository.

The middle age

For some time, the kernel DRM existed alongside user space mode setting. It was a dark time full of crazy hacks to keep it all together with duct tape, barbwire and luck. GPUs and hardware accelerated OpenGL started to come up.

The new age

With the invent of kernel mode setting (KMS), the DRM kernel drivers got in charge of the graphics card resources: outputs, video modes, memory allocations, hotplug! User space mode setting became obsolete and was eventually killed. The kernel driver was finally actually in control of the graphics hardware.

KMS probably started with just setting the main framebuffer (primary plane) for each "CRTC" and programming the video mode. A CRTC is for "cathode-ray tube controller", but essentially means a block that reads memory (a framebuffer) and produces a bitstream according to video mode timings. The bitstream is directed into an "encoder", which turns it into a proper physical/analogue signal, like VGA or digital DVI. The signal then exits the graphics card though a "connector". CRTC, encoder, and connector are the basic concepts in KMS API. Quite often these can be combined in some restricted ways, like a single CRTC feeding two encoders for clone mode.

Even ancient hardware supported hardware cursors: a small sprite that was composited into the outgoing video signal on the fly, which meant that it was very cheap to move around. Cursor being so special, and often with funny color format (alpha!), got its very own DRM ioctl.

There were also hardware overlays (additional or secondary planes) on some hardware. While the primary framebuffer covers the whole display, an overlay is another buffer (just like the cursor) that gets mixed into the bitstream at the CRTC level. It is like basic compositing done on the scanout hardware level. Overlays usually had additional benefits, for example they could apply scaling or color space conversion (hello, video players) very efficiently. Overlays being different, they too got their very own DRM ioctls.

The KMS user space ABI was anything but atomic. With the X11 tradition, it wasn't too important how to update the displays, as long as the end result eventually was what you wanted. Race conditions in content updates didn't matter too much either, as X was racy as hell anyway. You update the CRTC. Then you update each overlay. You might update the cursor, too. By luck, all these updates could hit the same vblank. Or not. Or you don't hit vblank at all, and get tearing. No big deal, as X was essentially all about front-buffer rendering anyway. (And then there were huge efforts in trying to fix it all up with X, GLX, Mesa and GL-compositors, and avoid tearing, and it ended up complicated.)

With the advent of X compositing managers, that did not play well with the  awkward X11 protocol (Xv) or the hardware overlays, and with rise of the  GPU power and OpenGL, it was thought that hardware overlays would  eventually die out. Turned out the benefits of hardware overlays were too great to abandon, and with Wayland we again have a decent chance to make the most of them while still enjoying compositing.

The global thermonuclear war (named after a git branch by Rob Clark)

The quality of display updates became important. People do not like tearing. Someone actually wanted to update the primary framebuffer and the overlays on the same vblank, guaranteed. And the cursor as the cherry on top.

We needed one ABI to rule them all.

Universal planes brings framebuffers (primary planes), overlays (secondary planes) and cursors (cursor planes) together under the same API. No more type specific ioctls, but common ioctls shared by them all. As these objects are still somewhat different, overlays having wildly differing features and vendors wanting to expose their own stuff, object properties were invented.

An object property is essentially a {key, value} pair. In the API, the name of a key is a string. Each object has its own set of keys. To use a key, you must know it by name, fetch the handle, and then use the handle when setting the value. Handles seem to be per-object, so make sure to fetch them separately for each.

Atomic mode setting and nuclear pageflip are two sides of the same feature. Atomicity is achieved by gathering a set of property changes, and then pushing them all into the kernel in a single ioctl call. Then that call either succeeds or fails as a whole. Libdrm offers a drmModePropertySet for gathering the changes. Everything is exposed as properties: the attached FB, overlay position, video mode, etc.

Atomic mode setting means setting the output modes of a single graphics device, more or less. Devices may have hard to express limitations. A simple example is the available scanout memory bandwidth: You can drive either two mid-resolution outputs, or one high-resolution output. Or maybe some crtc-encoder-connector combination is not possible with a particular other combination for another output. Collecting the video mode, encoder and connector setup over the whole grahics card into a single operation avoids flicker. Either the whole set succeeds, or it fails. Without atomic mode setting, changing multiple outputs would not only take longer, but if some step failed, you'd have to undo all earlier steps (and hope the undo steps don't fail). Plus, there would be no way to easily test if a certain combination is possible. Atomic mode setting fixes all this.

Nuclear pageflip is about synchronizing the update of a single output (monitor) and making that atomic. This means that when user space wants to update the primary framebuffer, move the cursor, and update a couple of overlays, all those changes happen at the same vblank. Again it all either succeeds or fails. "Every frame is perfect."

And then there shall be ponies (at the end of the rainbow)

Once the global thermonuclear war is over, we have the perfect ABI for driving display updates.

Well, almost. Enter NVidia G-Sync, or AMD's FreeSync which is actually backed by a VESA standard. Dynamically variable refresh rate. We have no way yet for timing display updates in DRM. All we can do is kick out a display update, and it will hopefully land on the next vblank, whenever that is. But we can't tell the DRM when we would like it to be. Everything so far assumes, that the display refresh rate is a constant, apart from an explicit mode switch. Though I have heard that e.g. Chrome for Intel (i915, LVDS/eDP reclocking) has some hacks that opportunistically drops the refresh rate to save power.

There is also a culprit in the DRM of today (Jun 3rd, 2014). You can schedule a pageflip, but if you have pending rendering on that framebuffer for the same GPU as were you are presenting it, the pageflip will not happen until the rendering completes. And you do not know when it will complete, which means you do not know if you will hit the very next vblank or something later.

If the rendering GPU is not the same graphics device that presents the framebuffer, you do not get synchronization at all. That means that you may be scanning out an incomplete rendering for a frame or two, or you have to stall the GPU to make sure it is done before scheduling the page flip. This should be fixed with the fences related to dma-bufs (Hi, Maarten Lankhorst).

And so the unicorn keeps on running.

May 08, 2014

Philip Withnall

How widely is the GNOME stack used?

After a couple of discussions at the DX hackfest about cross-platform-ness and deployment of GLib, I started wondering: we often talk about how GNOME developers work at all levels of the stack, but how much of that actually qualifies as ‘core’ work which is used in web servers, in cross-platform desktop software1, or commonly in embedded systems, and which is security critical?

On desktop systems (taking my Fedora 19 installation as representative), we can compare GLib usage to other packages, taking GLib as the lowest layer of the GNOME stack:

Package Reverse dependencies Recursive reverse dependencies
glib2 4001
qt 2003
libcurl 628
boost-system 375
gnutls 345
openssl 101 1022

(Found with repoquery --whatrequires [--recursive] [package name] | wc -l. Some values omitted because they took too long to query, so can be assumed to be close to the entire universe of packages.)

Obviously GLib is depended on by many more packages here than OpenSSL, which is definitely a core piece of software. However, those packages may not be widely used or good attack targets. Higher layers of the GNOME stack see widespread use too:

Package Reverse dependencies
cairo 2348
gdk-pixbuf2 2301
pango 2294
gtk3 801
libsoup 280
gstreamer 193
librsvg2 155
gstreamer1 136
clutter 90

(Found with repoquery --whatrequires [package name] | wc -l.)

Widely-used cross-platform software which interfaces with servers2 includes PuTTY and Wireshark, both of which use GTK+3. However, other major cross-platform FOSS projects such as Firefox and LibreOffice, which are arguably more ‘core’, only use GNOME libraries on Linux.

How about on embedded systems? It’s hard to produce exact numbers here, since as far as I know there’s no recent survey of open source software use on embedded products. However, some examples:

So there are some sample points which suggest moderately widespread usage of GNOME technologies in open-source-oriented embedded systems. For more proprietary embedded systems it’s hard to tell. If they use Qt for their UI, they may well use GLib’s main loop implementation. I tried sampling GPL firmware releases from gpl-devices.org and gpl.nas-central.org, but both are quite out of date. There seem to be a few releases there which use GLib, and a lot which don’t (though in many cases they’re just kernel releases).

Servers are probably the largest attack surface for core infrastructure. How do GNOME technologies fare there? On my CentOS server:

• GLib is used by the popular web server lighttpd (via gamin),
• the widespread logging daemon syslog-ng,
• all MySQL load balancing via mysql-proxy, and
• also by QEMU.
• VMware ESXi seems to use GLib (both versions 2.22 and 2.24!), as determined from looking at its licencing file. This is quite significant — ESXi is used much more widely than QEMU/KVM.
• The Amanda backup server uses GLib extensively,
• as do the clustering solutions Heartbeat and Pacemaker.

I can’t find much evidence of other GNOME libraries in use, though, since there isn’t much call for them in a non-graphical server environment. That said, there has been heavy development of server-grade features in the NetworkManager stack, which will apparently be in RHEL 7 (thanks Jon).

So it looks like GLib, if not other GNOME technologies, is a plausible candidate for being core infrastructure. Why haven’t other GNOME libraries seen more widespread usage? Possibly they have, and it’s too hard to measure. Or perhaps they fulfill a niche which is too small. Most server technology was written before GNOME came along and its libraries matured, so any functionality which could be provided by them has already been implemented in other ways. Embedded systems seem to shun desktop libraries for being too big and slow. The cross-platform support in most GNOME libraries is poorly maintained or non-existent, limiting them to use on UNIX systems only, and not the large OS X or Windows markets. At the really low levels, though, there’s solid evidence that GNOME has produced core infrastructure in the form of GLib.

1. As much as 2014 is the year of Linux on the desktop, Windows and Mac still have a much larger market share.

2. And hence is security critical.

3. Though Wireshark is switching to Qt.

Tomeu Vizoso

GNOME API reference at the DevX hackfest

Last week I spent a few days at the Developer Experience hackfest and got to have some fun again with the API reference generator in gobject-introspection.

Jon intended to hack on the XSLT stylesheets in yelp-tools/xsl, but after some trying he got discouraged by the amount of work that would take to get them to generate the HTML that we are interested in. We also discussed the benefits of using Mallard for generating the reference docs, and given that we want to generate a single output, we couldn't see much value in the level of indirection that Mallard adds.

Thus, we considered generating HTML directly from the GIR files, but shortly after Alberto Ruiz came by and offered to explore a client-side-only solution involving processing JSON files with JavaScript.

He very quickly got something relatively complete, which is very encouraging, but even more so is seeing how other projects are generating their API references that way, for example: http://docs.sencha.com/extjs/4.2.2/#!/api/Ext.Class

Aspects such as as-you-type search would bring the online documentation on par with the Devhelp experience. Plus they have some niceties such as a symbol browser and links to annotated source code.

I have hacked a (yet another) --write-json-files switch to g-ir-doc-tool that will output the content of the GIR file to JSON, but indexed and formatted as needed by the JS side of things. See this branch for the code.

That branch also adds support for Markdown rendering using python-markdown, but some more code needs to be written to implement the extensions to Markdown that the Gtk+ docs are using.

It was great to talk about all this and more with old and new friends in Berlin, so I'm very grateful to the GNOME Foundation for organizing it and sponsoring travel, and to Endocode for providing a venue. And special thanks to Chris Kühl for the great organization!

May 06, 2014

Philip Withnall

Berlin DX hackfest and Clang

Last week I was in Berlin at the GNOME DX hackfest. My goal for the hackfest was to do further work on the fledgling gnome-clang, and work out ways of integrating it into GNOME. There were several really fruitful discussions about GIR, static analysis, Clang ASTs, and integration into Builder which have really helped flesh out my plans for gnome-clang.

The idea we have settled on is to use static analysis more pervasively in the GNOME build process. I will be looking into setting up a build bot to do static analysis on all GNOME modules, with the dual aims of catching bugs and improving the static analyser. Eventually I hope the analysis will become fast enough and accurate enough to be enabled on developers’ machines — but that’s a while away yet.

(For those who have no idea what gnome-clang is: it’s a plugin for the Clang static analyser I’ve been working on, which adds GLib- and GObject-specific checks to the static analysis process.)

One key feature I was working on throughout the hackfest was support for GVariant format string checking, which has now landed in git master. This will automatically check variadic parameters against a static GVariant format string in calls to g_variant_new(), g_variant_get() and other similar methods.

For example, this can statically catch when you forget to add one of the elements:

/*
* Expected a GVariant variadic argument of type ‘int’ but there wasn’t one.
*         floating_variant = g_variant_new ("(si)", "blah");
*                                           ^
*/
{
floating_variant = g_variant_new ("(si)", "blah");
}

Or the inevitable time you forget the tuple brackets:

/*
* Unexpected GVariant format strings ‘i’ with unpaired arguments. If using multiple format strings, they should be enclosed in brackets to create a tuple (e.g. ‘(si)’).
*         floating_variant = g_variant_new ("si", "blah", 56);
*                                           ^
*/
{
floating_variant = g_variant_new ("si", "blah", 56);
}`

After Zeeshan did some smoketesting of it (and I fixed the bugs he found), I think gnome-clang is ready for slightly wider usage. If you’re interested, please install it and try it out! Instructions are on its home page. Let me know if you have any problems getting it running — I want it to be as easy to use as possible.

Another topic I discussed with Ryan and Christian at the hackfest was the idea of a GMainContext visualiser and debugger. I’ve got some ideas for this, and will hopefully find time to work on them in the near future.

Huge thanks to Chris Kühl and Endocode for the use of their offices and their unrivalled hospitality. Thanks to the GNOME Foundation for kindly sponsoring my accommodation; and thanks to my employer, Collabora, for letting me take community days to attend the hackfest.

May 04, 2014

Xavier Claessens

OTR in Empathy

It was a really popular Empathy feature request, it even has a bounty on it, and today I just did it! The patches are available in bugzilla, apply on telepathy-gabble and empathy master.

Here is a screencast of OTR in action between Empathy and Pidgin:

I hacked it on my free time, let’s see if crowd funded projects really works. If you like that feature feel free to make a donation on the bounty: https://freedomsponsors.org/core/issue/333/telepathy-should-support-otr-encryption.