vkmark: more than a Vulkan benchmark

Ever since Vulkan was announced a few years ago, the idea of creating a Vulkan benchmarking tool in the spirit of glmark2 had been floating in my mind. Recently, thanks to my employer, Collabora, this idea has materialized! The result is the vkmark Vulkan benchmark, hosted on github:

https://github.com/vkmark/vkmark

Like its glmark2 sibling project, vkmark’s goals are different from the goals of big, monolithic and usually proprietary benchmarks. Instead of providing a single, complex benchmark, vkmark aims to provide an extensible suite of targeted, configurable benchmarking scenes. Most scenes exercise specific Vulkan features or usage patterns (e.g., desktop 2.5D scenarios), although we are also happy to have more complex, visually intriguing scenes.

Benchmarking scenes can be configured with options that affect various aspects of their rendering. We hope that the ease with which developers can use different options will make it painless to perform targeted tests and eventually provide best practices advice.

A few years ago we were pleasantly surprised to learn that developers were using glmark2 as a testing tool for driver development, especially in free (as in freedom) software projects. This is a goal that we want to actively pursue for vkmark, too. The flexible benchmarking approach is a natural fit for this kind of development; the developer can start with getting the simple scenes working and then, as the driver matures, move to scenes that use more advanced features. vkmark has already proved useful in this regard, being an valuable testing aid for my own experiments in the Mesa Vulkan WSI implementation.

With vkmark we also want to be on the cutting edge of software development practices and tools. vkmark is a modern, C++14 codebase, using the vulkan-hpp bindings, the Meson build system and the Catch test framework. To ensure a high quality codebase, the core of vkmark is developed using test-driven development.

It is still early days, but vkmark already has support for X11, Wayland and DRM/KMS, and provides two simple scenes: a “clear” scene, and a “cube” scene that renders a simple colored cube based on the vkcube example (which is itself based on kmscube). The future looks bright!

We are looking forward to getting more feedback on vkmark and, of course, contributions are always welcome!

Advertisements

C vs C++11: C++ goes to eleven!

One of the top web results when searching for “C vs C++” is Jakob Østergaard’s article of the same name. In his article, Jakob presents the challenge of writing a program that counts the unique words in a text file, and tries out various versions he got or created himself. Although Jakob’s text can’t really be considered a comprehensive comparison of C vs C++, it does provide some insight into how powerful C++ can be “out of the box”.

The original C++ implementation given by Jakob is:

#include <set>
#include <string>
#include <iostream>

int main(int argc, char **argv)
{
    // Declare and Initialize some variables
    std::string word;
    std::set<std::string> wordcount;
    // Read words and insert in rb-tree
    while (std::cin >> word) wordcount.insert(word);
    // Print the result
    std::cout << "Words: " << wordcount.size() << std::endl;
    return 0;
}

Unfortunately, the concise and highly readable solution presented above leaves a lot to be desired on the performance front. So, I set out to improve it, trying to also take advantage of any relevant C++11 features. My updated C++11 version is:

#include <unordered_set>
#include <string>
#include <iostream>

int main(int argc, char **argv)
{
    // Declare and Initialize some variables
    std::string word;
    std::unordered_set<std::string> wordcount;
    std::ios_base::sync_with_stdio(false);
    // Read words and insert in set
    while (std::cin >> word) wordcount.insert(std::move(word));
    // Print the result
    std::cout << "Words: " << wordcount.size() << std::endl;
    return 0;
}

There are three changes in the new code. The first change is using the new C++11 std::unordered_set container instead of std::set. Internally, unordered_set uses a hash table instead of balanced tree, losing support for item ordering, but gaining significantly in average performance.

The second change is actually an old C++ option, not particular to C++11: disabling stdio synchronization. This is a big performance booster for intensive I/O. It is highly recommended to turn synchronization off, unless you really, really need to use the C and C++ standard streams at the same time.

The third change is explicitly taking advantage of C++11 move semantics (std::move()). In my benchmarks the change didn’t have a noticable impact, perhaps because the compiler was eliding the copy anyway, or because the strings were small enough that a copy and a move weren’t significantly different in performance.

To test the different versions, I created a series of word files containing 4 million words each, each one consisting of a different number of unique words. The tested versions include all the versions from Jakob’s article, plus the new cpp4, c2, and python versions.

Name Description SLOC
cpp1 Original C++ version 11
cpp1-fixed “Fixed” C++ version (using scanf) 12
cpp2 C++ version of c1 100
cpp3 Jakob’s Ego-booster 83
cpp4 C++11 version 12
c1 C hash 71
c2 Glib hash 73
py Python 5

Here are the run time results:

Here are the results for the maximum RSS:

The updated C++11 version (cpp4) is about 5 times (!) faster than the original, partly because of using unordered_map, and partly because of not synchronizing with stdio. The memory usage has decreased by a decent amount, too! For lower numbers of unique words the performance results are somewhat mixed, but, as the number of unique words grows, the C++11 and Glib versions become clear winners. C++ goes to 11, indeed!

Based on the results above, here are some tips:

  1. Rolling your own implementation is probably not worth it.
  2. In C++11, when you don’t need item ordering, you are probably better off using the unordered variants of the containers (but don’t forget to benchmark).
  3. If you use standard streams, and don’t need to be in sync with stdio streams, be sure to turn synchronization off. If you need to be in sync, try hard to stop needing it!
  4. If you just want to quickly create something having decent performance, consider using python.

You can find the code and scripts used for benchmarking here. To create the sample text files (‘make texts’) you need to have an extracted copy of scowl in the project directory.

Changing gdm/lightdm user login settings programmatically

Recently, as part of the automated testing efforts in Linaro, I needed to programmatically change the default X session for a user. It used to be the case that editing ~/.dmrc was enough to achieve this. Display managers (DMs), such as gdm and lightdm, would read the contents of the ~/.dmrc at login time to decide which language and X session to use (among other settings). When a user changed a setting through the GUI, the DM would write the new settings to ~/.dmrc (only after successfully logging in, of course).

However, all of this changed with the introduction of AccountsService. As the name suggests, AccountsService provides a service for the management of user accounts, and among other things, some of the login settings that were previously configured in ~/.dmrc. It offers this functionality through the org.freedesktop.Accounts DBus service on the system bus.

Modern Display Managers use AccountsService to manipulate user  login settings, falling back to ~/.dmrc only when the service is unavailable (at least in the case of lightdm). What this means for the end-user is that editing ~/.dmrc manually has no effect. For example, you can try changing the Session field in ~/.dmrc all you want, but next time you log in you will end up in the same session and your ~/.dmrc changes will have been overwritten.

In order to actually change any settings we need to communicate with AccountsService through DBus. The first step is to find out the object that corresponds to the user we want to change the settings for. The path of this object is of the form /org/freedesktop/Accounts/. <USER> is usually of the form User<UID> but there is no guarantee of that.  Thankfully, the /org/freedesktop/Accounts object provides the org.freedesktop.Accounts.FindUserByName and org.freedesktop.Accounts.FindUserById methods, which we can use to get the object path for a user.

Having the user object path, we can use the org.freedesktop.Accounts.User.* methods on the user object to change the required settings.

We can use the dbus-send program to perform the operations mentioned above. For example:

$ USER_PATH=$(dbus-send --print-reply=literal --system --dest=org.freedesktop.Accounts /org/freedesktop/Accounts org.freedesktop.Accounts.FindUserById int64:1000)
$ dbus-send --print-reply --system --dest=org.freedesktop.Accounts $USER_PATH org.freedesktop.Accounts.User.SetXSession string:’xterm’

As I needed to get and set the X session for a user in a user-friendly manner,  I decided to create a small python program instead. You can find the program here: user_xsession.py

You can use user_xsession.py like:

$ ./user_xsession.py [--user-id <ID>|--user-name <NAME>] set <SESSION>
$ ./user_xsession.py [--user-id <ID>|--user-name <NAME>] get

where <SESSION> is one of the sessions available in /usr/share/xsessions/ . Note that you may need to run as root, depending on the account you want to change the settings for.

For example:

$ ./user_xsession.py --user-id 1000 set xterm

You can easily tweak the program to change another setting instead of the X session.

Update 2012-08-10: Fixed problem with wordpress converting -- (double-dash) to – (en-dash) in code snippets.

The NP-completeness of tax solidarity

At last, it’s Friday! After a long week of hard work you deserve some time to relax and meet for a drink with friends. You make a few phone calls, everything is arranged and a few hours later you are out having a great time. And then the time comes to pay for the drinks. If you are lucky, there is a separate receipt for every drink  If you are not lucky (and you usually aren’t), there is one receipt for all the drinks. Now you have to spend the next five minutes trying to decide who gets the receipt.

Why all the fuss about receipts, you may ask? If you live in Greece (and perhaps other places) in this time of economic woe, then this scenario is all too familiar. The Greek government, in its never-ending battle with tax evasion has come up with a plan! Every citizen must present an amount in receipts which depends on their income (using a tiered system). Any difference from the required amount (up to 15000 €) leads to either a penalty or credit equal to 10% of the difference. Note, that you can’t present receipts that are used in some other way to lower your taxable income (rent, utilities etc).

The main idea is to make citizens ask for receipts, so that businesses are forced to issue them. The secondary idea is to make citizens spend more. So much for savings…

In any case, near the end of the year, everyone in your group of close friends (the ones you go out with more often) ends up with a bunch of receipts. Some have enough receipts to cover their limit, some do not.  Furthermore, some of the receipts are shared, in the sense that the one who has the receipt has paid for only some of the items on it.  Being such good friends you decide that you should help each other out, so that, if possible, everyone reaches their limit and saves money.

Practical tax solidarity

There are several practical ways to help each other and they all share a common core.  The receipts of each individual are first split into a private and public part.  Each individual keeps their private part and the public part is split among the individuals that need it. By combining different ways to decide what receipts are public and private and how to split the public ones we can get several methods to help our friends. Some interesting methods are:

private public split method
1 nothing all equally among all
2 all until limit the rest among those that need it until limit, rest back to owners or equally among all
3 all until limit + 15000€ the rest among those that need it until limit, rest back to owners or equally among all
4 non-shared shared equally among all
5 non-shared + shared until limit rest equally among all

If you are really good friends and the spirit of solidarity is high within the group, you may decide to try to split the receipts as evenly as possible within the group (method 1). This method maximizes the collective gain, but hurts the individuals that have originally a larger amount in receipts. This may not be a problem, though, and may actually be the most fair way to handle the situation if the number of shared receipts in the group is high.

Another way (method 2) is for every person to keep enough of their receipts so that they reach the limit (if they can, of course) and add the rest to the group’s surplus collection. The receipts from the surplus collection are then distributed to those that don’t have enough receipts to reach the limit on their own.  Any remaining receipts are then either  returned to their original owners or equally split within the group.

A variation of the previous method is method 3. In this case we use limit + 15000€ for the private receipts, so that individuals get the maximum benefit from their receipts. Unfortunately, this method rarely leaves any receipts for the surplus and is quite unfair if there is a large number of shared receipts.

Methods 4 and 5 assume that each receipt can be easily identified as shared or non-shared. In method 4 individuals keep only the receipts that are non-shared and add the shared receipts to the surplus. In method 5 individuals first use their non-shared receipts and then start using the shared ones until they reach the limit. If the receipts were originally distributed randomly, then method 4 is the fairest way to handle things. On the other hand, if shared receipts were originally given to the person with the highest expense then perhaps method 5 is a bit more fair.

Handling NP-completeness

Whatever way you choose to help your friends, the same basic problem must be solved: how can you evenly partition a set of receipts among a number of individuals?

This problem, as innocent as it may look, is actually NP-complete! Its official name is the “partition problem” and in its optimization version is stated as:

Partition a multiset of integers S into n subsets S_1, S_2, ..., S_n such that max(sum(S_1), sum(S_2), ..., sum(S_n)) is minimized.

In plain words, split a multiset of integers into n subsets as evenly (as far as the sum of each subset is concerned) as possible. The classic version involves only two subsets. The difficulty lies in the fact that we are not allowed to split each integer (receipt amount in our case) arbitrarily.

Being NP-complete practically means that in the worst case we have to go through all the partitioning combinations to find the best. The combinations being persons^receipts, this is *not* what I call a fun past-time.

So, is this all part of an evil plot by the tax service to get our hard-earned income? Even if it is, we are not completely helpless! The partition problem is often called the “Easiest Hard Problem”. The reason is that under some conditions, which have been extensively studied, there are heuristic methods that provide very good results.

These conditions state that if we take n numbers randomly selected from the range [1, 2^m] then if k = m / n < 1 ( \Rightarrow m < n) there is a large probability that there exist many perfect partitions. Of course this also works for numbers from arbitrary ranges because we can always map a range [a, b] to [1, b - a +1]. On the other hand if k > 1 the probability of perfect partitions existing for a given multiset abruptly drops.

Intuitively, the above condition makes perfect sense: it is much more probable to be able to split a multiset into subsets with equal sums if that multiset has many small values (where small is relative to the cardinality of the set). As the values get larger and/or the set size becomes smaller it is much less probable to be able to find a perfect split.

So, how does our problem instance fare with regard to this criterion?  First of all, we have to make a minor adjustment because the condition refers to integers but cash amounts are real numbers with two decimal digits.  This can be easily countered by expressing all amounts as euro-cents. The amount on each receipt is hardly ever outside the [1, 6553600] range (\approx 2^{23}) in euro-cents (unless, of course, you end up buying a new yacht every time you go out for groceries). Also, let’s be insanely over-conservative here and say that a normal person gets 3 receipts per week. These give us an m of 23 and an n of 3 * 52 = 156, therefore k = 23 / 156 < 1.

Now that we know that there is a good chance of being able to solve the problem, how do we actually go about it? The algorithm is surprisingly straightforward. Of course, it does not strictly solve the problem, but it produces an approximate solution that is good enough for our purposes. The algorithm is:

1. Sort the numbers in descending order   [ O(nlogn) on average ]
2. Process each number (in sorted order): [ Θ(n) ]
  2.1 Find the subset with the least sum  [ O(logp) using priority queues ]
  2.2 Add the number to the subset        [ Θ(1) ]

where n: number of receipts
      p: number of persons

This gives us T(n, p) = O(nlogn + nlogp + n). For a constant number of persons this leads to T(n) = O(nlogn). Not bad for “solving” an NP-complete problem! This algorithm has been proved to produce results relatively close to the optimal solution. Its discrepancy (the difference in value of the sums of the produced subsets) is O(1/n). There are other  algorithms that can do a bit better, but they are considerably more complicated (and not worth it for our puny purpose).

Finally, because “an ounce of action is worth a ton of theory”, I have developed a proof of concept application that implements some of the methods mentioned above. You can find it at https://code.launchpad.net/~afrantzis/+junk/tax-solidarity-trunk. It is written in python and licensed under the AGPL 3.0+.

Take that, tax service! YOUR NEFARIOUS SCHEME WILL NEVER SUCCEED!

References

  • Karmarker, Narenda and Karp, Richard M., The Differencing Method of Set Partitioning, Techreport, UC Berkley, 1983
  • Brian Hayes, The Easiest Hard Problem, American Scientist, March-April 2002
  • Stephan Mertens, The Easiest Hard Problem: Number Partitioning, Computational Complexity and Statistical Physics, 2006