B. Stroustrup about C++

My long-term (continuous for at least 25 years) hobby is history, and I spent significant time in university and later studying philosophy. This has given me a rather conscious view of where my intellectual sympathies lie and why. Among the long-standing schools of thought, I feel most at home with the empiricists rather than with the idealists - the mystics I just can't appreciate. That is, I tend to prefer Aristotle to Plato, Hume to Descartes, and shake my head sadly over Pascal. I find comprehensive "systems" like those of Plato and Kant fascinating, yet fundamentally unsatisfying in that they appear to me dangerously remote from everyday experiences and the essential peculiarities of individuals.

I find Kierkegaard's almost fanatical concern for the individual and keen psychological insights much more appealing than the grandiose schemes and concern for humanity in the abstract of Hegel or Marx. Respect for groups that doesn't include respect for individuals of those groups isn't respect at all. My C++ design decisions have their roots in my dislike for forcing people to do things in some particular way. In history, some of the worst disasters have been caused by idealists trying to force people into "doing what is good for them." Such idealism not only leads to suffering among its innocent victims, but also to delusion and corruption of the idealists applying the force. I also find idealists prone to ignore experience and experiment that inconveniently clashes with dogma or theory. Where ideals clash and sometimes even when pundits seem to agree, I prefer to provide support that gives the programmer a choice.

Bjarne Stroustrup, "The Design and Evolution of C++"

On code style guide

When I write code I follow a very relaxed style guide. The ultimate principle is that every word in your program must serve a purpose. Even consistency is not considered a good thing if it doesn't make code more readable or safe.

1. Comments are not (always) good.

Quite the opposite. The very necessity to write comments means that there is something
wrong with the code (or the language). Don't write comments, write code. If an algorithm is unclear, rewrite it.
If it's too large, decompose it. If it's unclear what a function does, come up with a better name.
If there is a constraint that function parameters need to meet, write this constraint in the language you're using.
As a last resort, write a comment, but remember: your compiler won't read it.
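A minimal sketch of what "write the constraint in the language" can look like in C++. The names Mode and describe are hypothetical, invented only for this illustration. The idea is that an enum replaces a comment like "mode must be 0 or 1", a reference replaces "must not be null", and whatever the type system cannot express becomes an assertion rather than a comment.

```cpp
#include <cassert>
#include <string>

// Hypothetical example type: names the only legal values, so no comment
// like "mode must be 0 (read) or 1 (write)" is needed.
enum class Mode { Read, Write };

// Taking a reference instead of a pointer states "never null" in the signature.
inline std::string describe(const std::string &name, Mode mode) {
    // A constraint the type cannot express is still checked by the program,
    // not merely mentioned in a comment.
    assert(!name.empty());
    return name + (mode == Mode::Write ? " (write)" : " (read)");
}
```

Calling describe("log.txt", Mode::Write) yields the string "log.txt (write)"; passing an arbitrary integer as the mode simply does not compile.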

2. A program should be const-correct, but you don't have to use const whenever an object is constant.

If a function parameter points to a constant, it makes sense to use const, but it is OK to leave the pointer itself non-const even if it's not modified inside the function: making it const doesn't really make the code safer, it only adds one (useless) word to your program.
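To illustrate the difference, here is a small sketch with a hypothetical helper, count_char. The const on the pointee is part of the function's contract; a second const on the pointer itself would only restrict the function's private copy of the pointer.

```cpp
#include <cassert>
#include <cstddef>

// "const char *s" promises callers that the characters are not modified.
// Writing "const char *const s" would additionally freeze the local pointer,
// which callers never observe, so it is left out here.
inline std::size_t count_char(const char *s, char c) {
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {  // advancing the local copy of the pointer is harmless
        if (*s == c) ++n;
    }
    return n;
}
```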

If a function argument points to a constant, but the function is a simple utility used only inside one compilation unit, it is OK to omit const. Especially if you would otherwise have to write two functions (constant and non-constant) with the same semantics. You will easily fix this once you really need two separate functions.

3. Inconsistent but right is better than consistent and wrong.

People tend to prefer following a meaningless rule to changing it even if it's clear to everyone that the rule is no longer relevant. This results in ugly code that everyone is afraid to change, because "who knows why it was written like that? I don't believe it was for nothing. The author knew better." and so on.

For new languages it's not that bad, but for languages like C or C++ it's overwhelming how much so-called conventional wisdom there is that has been irrelevant for decades but is still followed by programmers at big companies all over the world.

4. There is nothing more valuable than common sense.

You are allowed (and encouraged) to ignore a rule if you see that it's not applicable to your problem. But remember, you are responsible for the code you write. If you don't really know why the rule is there, you shouldn't just ignore it: better safe than sorry. Having said that, I think it's better if you do find out why and decide whether it is applicable or not. Don't use "better safe than sorry" as an excuse for replicating meaningless garbage.

Why I so often come to realize I hate C++ and still use it

It will be a story.

Yesterday at work I submitted a little patch for review. It was a simple improvement: I replaced a std::vector of pairs, which was manually sorted right after construction and accessed via std::lower_bound, with a boost::container::flat_map. It's simple, you get the idea, right?

The diff was extremely simple and obvious, but there was a teeny-tiny drawback: the original vector stored not standard pairs, but custom structures. And there was a perfect reason for it: these pairs were substitution rules. The definition looked like this: struct Rule { std::string source; std::string destination; };. (OK, it was full of ugly prefixes, because like any other company we have a lot of bullshit in our code style guide, but essentially it was the same.)
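For reference, the original approach can be sketched roughly like this. Only the Rule structure comes from the story; the helper names by_source, sort_rules and find_rule are made up for the illustration, and the real code surely looked different.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Rule {
    std::string source;
    std::string destination;
};

// Ordering used both for sorting and for checking the precondition below.
inline bool by_source(const Rule &a, const Rule &b) { return a.source < b.source; }

// Sort once, right after construction, so lookups can use binary search.
inline void sort_rules(std::vector<Rule> &rules) {
    std::sort(rules.begin(), rules.end(), by_source);
}

// Returns the rule whose source equals the key, or nullptr if there is none.
inline const Rule *find_rule(const std::vector<Rule> &rules, const std::string &key) {
    assert(std::is_sorted(rules.begin(), rules.end(), by_source));
    auto it = std::lower_bound(
        rules.begin(), rules.end(), key,
        [](const Rule &r, const std::string &k) { return r.source < k; });
    return (it != rules.end() && it->source == key) ? &*it : nullptr;
}
```

Note that the Rule structure survives here precisely because std::lower_bound accepts any element type together with a comparator.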

Anyway, my colleague looked at this diff and the first thing he saw was not the replacement std::vector => boost::container::flat_map, but the fact that I had removed this Rule structure. So he immediately commented: "What was wrong with the original structure?"

I have to admit, I answered very quickly and without hesitation. I wrote something like this: hey, just look below, I replaced the vector with a flat_map, that's why there is no more Rule, but std::pair instead.

But later I realized that I was being like that PHP guy who can give a "perfect explanation" why the string "00-00-0000" is parsed to 1st of January 1979 (or whatever it was, I don't remember the joke exactly). The point is that my brain is so damaged by C++ that it took me a whole day to understand that his question was perfectly valid, while my answer was total BS.

There is no reason whatsoever why I can't (or shouldn't) store our original Rule structure inside a boost::container::flat_map! Any explanation is just another way of saying: yeah, sorry, it's C++, you can't do anything about it.

But if you consider yourself a hacker, you don't give up on problems, you try to solve them. And it doesn't matter if the problem ain't worth shit, you can't sleep well unless you find a way.

Well, in this case it was not hard at all. As Woody Allen once said, be original, but when you have to steal, steal from the best. Lots of good people (not those OOP-brain-washed fuckers) who came to C++ by mistake, or are forced to use it because that's where the money is, have been writing generic code for ages. Look at boost::geometry. The authors don't make assumptions about how you store your geometries. They don't assume you store a point as boost::geometry::point or whatever. No, you are free, as you should be, to choose for yourself the structure you think is best for your geometries.

So, the solution is simple: just rewrite half of the STL to make it work with a "pair" concept. Allow users to define their own access methods as specializations of std::pair_traits... blah-blah-blah. You get it.

If you don't feel pity for me already, you will once you know I actually couldn't sleep well that night.

And then, I actually started to write code that would solve this problem (you still remember what the problem was, right?).

I wrote something like "namespace std { template<typename Key, typename Value> struct pair_traits { const Key &first_get..." and said to myself: oh my God!!! Really? Is that how I think beautiful code should look?
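For the curious, a traits-based "pair concept" might be sketched as follows. This is not the code from that night, only an illustration under assumptions: pair_traits, find_by_key and the sketch namespace are hypothetical names, and the template lives in its own namespace because user code is not allowed to add new templates to namespace std.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

struct Rule {
    std::string source;
    std::string destination;
};

namespace sketch {

// Default: anything with .first and .second (e.g. std::pair) is a "pair".
template <typename T>
struct pair_traits {
    static const auto &key(const T &p) { return p.first; }
    static const auto &value(const T &p) { return p.second; }
};

// User-supplied specialization: teaches generic code how to read a Rule.
template <>
struct pair_traits<Rule> {
    static const std::string &key(const Rule &r) { return r.source; }
    static const std::string &value(const Rule &r) { return r.destination; }
};

// A generic algorithm written against the traits instead of std::pair.
template <typename Iter, typename Key>
Iter find_by_key(Iter first, Iter last, const Key &k) {
    using T = typename std::iterator_traits<Iter>::value_type;
    for (; first != last; ++first) {
        if (pair_traits<T>::key(*first) == k) return first;
    }
    return last;
}

}  // namespace sketch

// Usage check: the same algorithm works for Rule and for std::pair.
inline void pair_traits_example() {
    std::vector<Rule> rules{{"a", "1"}, {"b", "2"}};
    auto it = sketch::find_by_key(rules.begin(), rules.end(), std::string("b"));
    assert(it != rules.end());
    assert(sketch::pair_traits<Rule>::value(*it) == "2");
}
```

It works, and it also shows how much ceremony is needed for something this small.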

So, my personal problem with C++ is that I know it quite well, and I can probably find a solution to any problem and feel happy about it like a child, but in the end the code will be so fucking ugly that I will never be proud of it. The best I will be able to say is "look how well I know C++" to people who want to hire me. But with friends, I will always be saying: C++ is crap, the best language is Z-talk, which I'm developing (for how many years now?), just wait a bit.

Yeah, sad story.

New pkgsrc packages for Mac OS X Yosemite

(UPD: there is a newer version of the pkgin repo, based on pkgsrc-2015Q2. See my post in Russian blog.)


I started to build packages for Mac OS X Yosemite from pkgsrc-2014Q3.

As always, to bootstrap the pkgin package manager, run:

curl http://umc8.ru/~a/packages/Darwin-14.0.0/current/pkg.tar.bz2 | bzip2 -dc | sudo tar -C /usr -xf -

See details in my old post.

Just to remind you, these are built with clang for x86_64.

See also: Pkgsrc binary packages for Mac OS X by Jonathan Perkin

Megatools are in pkgsrc-wip

I hope you know what mega.co.nz is. If you don't, I recommend having a look.

In short, it's a secure cloud service with end-to-end encryption.

Well, anyway, megatools is a set of utilities to access your files at mega.co.nz via command line interface. I tried to build it on my Mac this morning and it was surprisingly easy. And what is the next thing to do if you see something opensource and good? Right, you add it to pkgsrc!

Unfortunately, I am not a pkgsrc committer, so I was able to add megatools to pkgsrc-wip only. But that's still a success.

If you don't know what pkgsrc is, look at the docs.

If you don't want to know what pkgsrc is, but still want to try megatools, you can install pkgin and use my binary repo. See how to do it in one line of shell code. After installing pkgin, you can get megatools (as well as tons of other useful software) by running:

pkgin install megatools

Have fun! And don't forget to tell me if anything goes wrong!


Common Lisp-style macros in Racket

Today I pulled myself together and read a couple of sections of the Racket documentation. Now I finally know how to implement Common Lisp-style macros in Racket. I'm still not 100% sure it will work exactly as expected, but the common cases work right.

Here it is:

(define-syntax define-macro
  (syntax-rules ()
   ((_ name fn)
    (define-syntax (name stx)
      (datum->syntax stx (apply fn (cdr (syntax->datum stx))))))))

(define-macro mac
  (lambda (sig . body)
    (cond ((symbol? sig) `(define-macro ,sig ,@body))
          ((pair? sig)   `(define-macro ,(car sig) (lambda ,(cdr sig) ,@body)))
          (else           (error "wrong use of mac")))))

This will give you the macro-defining macro mac as in Paul Graham's Arc, except that I defined it the Scheme way. You will easily figure out how to use it from this example:

(mac (with var exp . body)
    `((lambda (,var) ,@body) ,exp))

(with a 2 (+ a  4)) ; => 6

Binary pkgsrc packages for FreeBSD-10.0

I started to build packages from pkgsrc on my FreeBSD machine.

If you want to try, run:

fetch -q -o - http://umc8.ru/~a/packages/FreeBSD-10.0/current/pkg.tar.bz2 | bzip2 -dc | sudo tar -C /usr -xf -

This fetches and extracts the minimal distribution of pkgin.

After that you'll have to add /usr/pkg/sbin and /usr/pkg/bin to your path to be able to run pkgin.

If you don't know how to use pkgin, you will probably want to have a look at the docs.

If you're happy with pkgin, I'm glad that I could help.

If not, please tell me why. Then run sudo rm -r /usr/pkg /usr/pkg-current and continue living with FreeBSD's pkgng.

Ripple daemon in pkgsrc-wip

Just added rippled (Ripple peer-to-peer network daemon) to pkgsrc-wip. Ripple is a peer-to-peer payment system created by Ripple Labs Inc.

Here is how you can install it on Mac OS X. First, you need to bootstrap pkgin, see my previous post.

And then run: pkgin update && pkgin install rippled

If you're not using Mac OS X, you can still build it yourself using pkgsrc.

If you're not using pkgsrc, you're missing out ;-).

Binary pkgsrc packages for Mac OS X Mavericks (Darwin 13.1.0)

As I already wrote, I build pkgsrc packages for Mac OS X. Some time ago I updated to Darwin 13.1.0.

To use the new repository, you need to change the path in /usr/pkg/etc/pkgin/repositories.conf to http://umc8.ru/~a/packages/Darwin-13.1.0/current/All

If you want to install everything from scratch, run the following:

curl http://umc8.ru/~a/packages/Darwin-13.1.0/current/pkg.tar.bz2 | bzip2 -dc | sudo tar -C /usr -xf -

Again, you need to add /usr/pkg/sbin and /usr/pkg/bin to your path. That can be done by putting these paths in /etc/paths.d/90-pkgsrc-current or by adding the line PATH=$PATH:/usr/pkg/sbin:/usr/pkg/bin to .profile in your home directory.

If you want to know more about pkgin, look at the official pkgin web page.

If you have any questions, ask here or send me an email.

P. S. If you want to try pkgin but are afraid that it will be difficult to remove, please note that I configured it so that everything can be removed with just this one command: rm -rf /usr/pkg /usr/pkg-current

Thoughts about z-talk

Decided to write down some ideas about z-talk.

The language

Z-talk is, of course, a Lisp, and a Lisp-1 at that. It will support traditional CL-style macros as well as first-class continuations. It will be much like PG's Arc and Clojure, but it will neither compile to the JVM nor run on top of Racket (mzscheme), which should keep it very lightweight, much like Scheme48. Unlike Scheme48, it will support CL-style macros and will not require any top-level declarations that are inspected at compile time. It will support automatic compilation and a byte-code cache, much like Python.

Implementation

I've been back and forth on the question of what to implement z-talk in, and now I think C++ (mostly plain C) will do just fine. Another very attractive option is to use the PreScheme compiler... You know what? I just changed my mind back again. PreScheme is very small, easy to use and seems much more powerful than C++ or C. OK, back to PreScheme! Stick to it!

Anyway, z-talk will compile to byte-code. The compilation unit is one expression (not one file). Conses, vectors and strings are mutable. But I'm thinking about making variable bindings immutable once established. This might greatly simplify the implementation of environment structures and allow inlining of procedures. Users will be advised to use mutable boxes (or cells, whatever the name) to modify lexically bound values. But I'm still not sure about that.

Data representation

Values in z-talk are either immediate or non-immediate. Immediate values are represented by a single machine word, exactly as in Chicken Scheme. But there are fewer types of them in z-talk: fixnums, characters and nil. Nil is of its own type and is not a symbol. There is, however, a symbol nil bound to the nil value. Lists are terminated by nil. There is no boolean type: nil is false, any other value is true.
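A rough sketch of such a word-sized representation is shown here. The concrete tag bits are assumptions picked for the illustration (low bit 1 for fixnums, low four bits 1010 for characters, one fixed pattern for nil, low two bits 00 for aligned pointers); they are not z-talk's actual layout.

```cpp
#include <cassert>
#include <cstdint>

using Word = std::uintptr_t;  // one machine word holds any value

constexpr Word kNil = 0x6;    // assumed bit pattern ...0110 for the nil value

constexpr bool is_fixnum(Word w) { return (w & 0x1) == 0x1; }
constexpr bool is_char(Word w) { return (w & 0xF) == 0xA; }
constexpr bool is_nil(Word w) { return w == kNil; }
// Pointers to aligned objects have their two low bits clear; immediates do not.
constexpr bool is_immediate(Word w) { return (w & 0x3) != 0; }
// No boolean type: nil is false, everything else is true.
constexpr bool is_true(Word w) { return !is_nil(w); }

// Fixnums lose one bit of range to the tag.
inline Word make_fixnum(std::intptr_t n) { return (static_cast<Word>(n) << 1) | 0x1; }

// Assumes two's complement and an arithmetic right shift for negative values
// (guaranteed since C++20, and what mainstream compilers do anyway).
inline std::intptr_t fixnum_value(Word w) {
    assert(is_fixnum(w));
    return static_cast<std::intptr_t>(w) >> 1;
}

inline Word make_char(char32_t c) { return (static_cast<Word>(c) << 4) | 0xA; }

inline char32_t char_value(Word w) {
    assert(is_char(w));
    return static_cast<char32_t>(w >> 4);
}
```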

Non-immediate values are pointers to objects stored in blocks. There is a separate list of blocks for each object type. Blocks are aligned so that it is easy to find the beginning of a block by dropping the last few bits of a pointer to an object. This is to avoid storing a type tag in every object. The type tag is stored in the header of each block instead.
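The pointer-masking trick can be sketched as follows. The block size of 4096 bytes, the header fields and the Type enumerators are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical set of object types; one list of blocks would exist per type.
enum class Type : std::uint8_t { Cons, Symbol, VectorDescriptor };

constexpr std::size_t kBlockSize = 4096;  // assumed; must be a power of two
static_assert((kBlockSize & (kBlockSize - 1)) == 0, "block size must be a power of two");

struct BlockHeader {
    Type type;          // shared by every object in the block
    BlockHeader *next;  // next block holding objects of the same type
};

// A block is aligned to its own size, so every address inside it shares
// the same high bits as the header at its start.
struct alignas(kBlockSize) Block {
    BlockHeader header;
    unsigned char payload[kBlockSize - sizeof(BlockHeader)];
};
static_assert(sizeof(Block) == kBlockSize, "header and payload must fill the block exactly");

// Drop the low bits of an object's address to reach its block header.
inline BlockHeader *header_of(const void *object) {
    assert(object != nullptr);
    auto addr = reinterpret_cast<std::uintptr_t>(object);
    return reinterpret_cast<BlockHeader *>(addr & ~static_cast<std::uintptr_t>(kBlockSize - 1));
}

inline Type type_of(const void *object) { return header_of(object)->type; }
```

The price of tag-free objects is that every block must be allocated at an address that is a multiple of the block size.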

Small objects of fixed size are stored just like that. Memory is allocated from singly linked lists of free cells.
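A free-list allocator of this kind can be sketched in a few lines. The Pool class and its template parameters are hypothetical; a real implementation would carve the cells out of the typed blocks instead of a fixed array.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// A free cell reuses its own storage to hold the link to the next free cell.
struct FreeCell {
    FreeCell *next;
};

template <std::size_t CellSize, std::size_t CellCount>
class Pool {
    static_assert(CellSize >= sizeof(FreeCell), "a cell must be able to hold a link");
    static_assert(CellSize % alignof(FreeCell) == 0, "cells must stay aligned");

public:
    // Initially every cell is free, so thread all of them onto the list.
    Pool() {
        for (std::size_t i = 0; i < CellCount; ++i) release(storage_ + i * CellSize);
    }

    // Pops the head of the free list; returns nullptr when the pool is exhausted.
    void *allocate() {
        if (free_ == nullptr) return nullptr;
        FreeCell *cell = free_;
        free_ = cell->next;
        return cell;
    }

    // Pushes a cell back onto the head of the free list.
    void release(void *p) {
        assert(p != nullptr);
        free_ = new (p) FreeCell{free_};
    }

private:
    alignas(FreeCell) unsigned char storage_[CellSize * CellCount];
    FreeCell *free_ = nullptr;
};
```

Both allocation and release are a couple of pointer moves, which is the main attraction of keeping objects of one size together.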

Objects of variable size are stored differently. Vectors are referenced through descriptors, much as an STL vector object in C++ is a small handle to separately allocated storage. Descriptors are stored the same way as any other small fixed-size objects. Strings are much like vectors. Symbols are unique in that all equal symbols are just pointers to the same location. I still haven't decided on the best layout scheme for symbols, but here is one important constraint: it must be easy to look them up by their textual representation. So they must be arranged in some kind of hash table.
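The uniqueness constraint on symbols is usually called interning, and a minimal sketch of it fits in a dozen lines. SymbolTable is a hypothetical name, and std::unordered_set merely stands in for whatever hash layout z-talk ends up with; it is convenient here because pointers to its elements stay valid when the table grows.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

class SymbolTable {
public:
    // Returns the one and only address for this name, creating the symbol
    // on first use. Equal names always yield the same pointer.
    const std::string *intern(const std::string &name) {
        return &*table_.insert(name).first;
    }

private:
    std::unordered_set<std::string> table_;
};

// Usage check: symbol equality becomes pointer equality.
inline void interning_example() {
    SymbolTable symbols;
    const std::string *a = symbols.intern("foo");
    const std::string *b = symbols.intern("foo");
    const std::string *c = symbols.intern("bar");
    assert(a == b);
    assert(a != c);
}
```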