Category archives: Golang

Golang: lack of generics bothers me

I think I will give up Go, mainly because of the lack of generics. What bothers me is that I can’t see how to write all-purpose algorithm functions like the ones C++ has (I love them), «std::remove_if» for example. Without them, you have to write the same little pieces of code again, and again, and again, or use casts everywhere (not great for a language with strong static typing).
The built-in functions (like «copy») can do that kind of magic, but you, the developer, can’t.
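
To make it concrete, here is what such a helper looks like once you write it for a single concrete type (a rough sketch; the name removeIfInt and the predicate signature are mine):

  package main

  import "fmt"

  // removeIfInt keeps only the elements for which pred returns false.
  // Without generics, the same body has to be rewritten for []string,
  // []Node, ... or degraded to []interface{} plus casts.
  func removeIfInt(s []int, pred func(int) bool) []int {
      out := s[:0] // reuse the backing array, compact in place
      for _, v := range s {
          if !pred(v) {
              out = append(out, v)
          }
      }
      return out
  }

  func main() {
      even := func(v int) bool { return v%2 == 0 }
      fmt.Println(removeIfInt([]int{1, 2, 3, 4}, even)) // prints [1 3]
  }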
Sure, it’s possible to do what the «sort» package does: pass the function an interface that performs the operations on the data (like «Swap», «Len», «Less»). But if I want to implement my «remove_if» that way, implementing such an interface everywhere will be a drag.
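
Something like this, for instance (only a sketch; the interface and its method names are invented):

  package removeif

  // RemoveIfInterface is what a sort-style "remove_if" would ask its
  // callers to implement (the names are invented for this sketch).
  type RemoveIfInterface interface {
      Len() int
      Match(i int) bool  // should element i be removed?
      Copy(dst, src int) // copy element src over element dst
  }

  // RemoveIf compacts the collection in place and returns the new
  // length, much like std::remove_if returns its new logical end.
  func RemoveIf(data RemoveIfInterface) int {
      n := 0
      for i := 0; i < data.Len(); i++ {
          if !data.Match(i) {
              data.Copy(n, i)
              n++
          }
      }
      return n
  }

And every slice type would still need its own three-method boilerplate just to call it once; hence the drag.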
The same problem shows up if you want to create a generic data structure, like a «set» or a B-tree of anything (interface or native type), while keeping type safety.
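
For example, the obvious «set» without generics stores «interface{}» and gives up type safety entirely (again, just a sketch):

  package main

  import "fmt"

  // Set stores anything, so nothing stops you from mixing types.
  type Set map[interface{}]struct{}

  func (s Set) Add(v interface{})           { s[v] = struct{}{} }
  func (s Set) Contains(v interface{}) bool { _, ok := s[v]; return ok }

  func main() {
      s := Set{}
      s.Add(42)
      s.Add("oops") // compiles fine: the type safety is gone
      for v := range s {
          if n, ok := v.(int); ok { // casts everywhere
              fmt.Println(n)
          }
      }
  }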

That’s a pity, because Go has some great features. Maybe I will try Rust.

Golang, OpenStreetMap, threads

In my previous attempt at parsing OpenStreetMap data, I found XML parsing slow. Fortunately, I realized that the data is also available in Google Protocol Buffers format, and

  • it’s 30-40% smaller than bzip2 XML (and you know, bzip2 requires a fair amount of CPU power)
  • it’s much faster to parse: with Osmosis, reading the PBF file on my quad-core took 14 seconds, while reading the bzip2-compressed XML with the help of lbunzip2 (a multi-threaded decompressor) took 1 min 50 s. Ouch!

There is a Go library to handle Protocol Buffers, so I tried to write a PBF reader in this language to see how efficient it would be. My program worked like this (a stripped-down sketch follows the list):

  • the main thread would read blocks from the file and pass them to the workers through a channel
  • each worker (a goroutine) would decompress its block, unmarshal the data and process it
  • once there were no blocks left to process, the workers’ results would be merged into a single image
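
Roughly, the fan-out/fan-in skeleton looks like this (the Block and Result types are placeholders, and the real code reads and unmarshals protobuf blobs instead of doing placeholder work):

  package main

  import (
      "fmt"
      "runtime"
      "sync"
  )

  // Block stands for one raw, still-compressed blob read from the PBF
  // file; Result is whatever a worker extracts from it. Both are
  // placeholders for this sketch.
  type Block []byte
  type Result struct{ nodes int }

  func process(blocks <-chan Block) []Result {
      workers := runtime.NumCPU() + 1 // cores + 1 gave the best timing
      out := make(chan Result)
      var wg sync.WaitGroup

      for i := 0; i < workers; i++ {
          wg.Add(1)
          go func() {
              defer wg.Done()
              for b := range blocks {
                  // decompress the block, unmarshal the protobuf
                  // messages, do the actual work...
                  out <- Result{nodes: len(b)} // placeholder
              }
          }()
      }

      // close the result channel once every worker is done
      go func() { wg.Wait(); close(out) }()

      // merge the partial results
      var merged []Result
      for r := range out {
          merged = append(merged, r)
      }
      return merged
  }

  func main() {
      runtime.GOMAXPROCS(runtime.NumCPU()) // let the runtime use all the cores
      blocks := make(chan Block)
      go func() {
          for i := 0; i < 10; i++ {
              blocks <- make(Block, i) // stand-in for reading the file
          }
          close(blocks)
      }()
      fmt.Println(len(process(blocks)))
  }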

So what kind of performance did I get? I got the best result with a number of workers equal to the number of cores + 1 (so 5 workers): about 28 seconds. I cannot directly compare this with Osmosis (not all the cases are handled yet), but it’s quite acceptable.

I find Go a nice language to use, and it compiles very, very quickly. I struggled a little bit with some points, and not everything is clear to me yet. It feels strange not to program in an OO way. And I’m not sure yet whether I should spawn tons of goroutines or use a pool of workers, or whether I should pass callback functions or channels.

It’s also a pity that the tremendous performance is not there yet. Go is supposed to be «close to the metal», «a language for systems programming», but for the moment (after 3 years) it is not as fast as Java.

Go XML SAX-like parsing is slow

I wanted to write an OpenStreetMap XML processor in Go, hoping to get a performance boost over my Python implementation. It ended up being slower. Python uses Expat (written in C), and maybe the Go package «encoding/xml» is not the state of the art in optimization.

I wrote simple programs handling the «start element» event (the Go version is sketched after the list). In 10 seconds, each could parse the following amount of XML data (Athlon II X4 620):

  • PyPy: did not run because of a bug (no progressive parsing)
  • Go: 70 MB
  • Python 2.7: 210 MB
  • Python 3.2: 215 MB
  • Java 7: 460 MB
  • C++ / libxml: 675 MB
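
For reference, the Go test amounted to something like this (a sketch; «map.osm» is just a placeholder filename):

  package main

  import (
      "encoding/xml"
      "fmt"
      "os"
  )

  func main() {
      f, err := os.Open("map.osm") // placeholder OSM XML file
      if err != nil {
          panic(err)
      }
      defer f.Close()

      dec := xml.NewDecoder(f)
      count := 0
      for {
          tok, err := dec.Token()
          if err != nil { // io.EOF at the end of the document
              break
          }
          // only the "start element" events are counted
          if start, ok := tok.(xml.StartElement); ok {
              count++
              _ = start.Name.Local // "node", "way", "tag", ...
          }
      }
      fmt.Println("start elements:", count)
  }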

I tried to use Expat or libxml in Go, but for the moment it is just too complicated. From Go code, it’s easy to call C functions located in shared libraries, but if you need to pass callback functions written in Go to a library written in C, you have to do dirty things (write C wrappers inside a Go package that contains C code).

That’s a pity, because the Go compiler automatically generates C wrappers for your exported Go functions, but you cannot get a raw pointer to these wrappers (with one, I would have been able to pass my callbacks to libxml or Expat)… See you later, Go.