I think I will give up Go, mainly because of the lack of generics. What bothers me is that I can’t see how to write all-purpose algorithm functions like the ones C++ has (I love them), for example «std::remove_if». Without them, you have to write the same little pieces of code again, and again, and again. Or use casts everywhere (not great for a language with strong static typing).
The built-in functions (like «copy») can do this kind of magic, but you, the developer, can’t.
Of course, it’s possible to do it the way the «sort» package does: pass the function an interface that performs the operations on the data (like «Swap», «Len», «Less»). But if I want to implement my own «remove_if», implementing such an interface for every type will be a drag.
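To make the point concrete, here is a rough sketch of what a «remove_if» in the style of the «sort» package could look like. The names «Remover», «RemoveIf» and «IntSlice» are made up for this illustration and do not exist in the standard library; the predicate receives an index rather than a value, because the algorithm cannot know the element type.

```go
package main

import "fmt"

// Remover is a hypothetical interface, modeled on sort.Interface.
// It is all the algorithm knows about the container.
type Remover interface {
	Len() int
	Swap(i, j int)
}

// RemoveIf moves every element for which pred(i) is false to the front,
// keeping their relative order, and returns how many were kept.
// As with std::remove_if, the caller truncates the container afterwards.
func RemoveIf(data Remover, pred func(i int) bool) int {
	kept := 0
	for i := 0; i < data.Len(); i++ {
		if !pred(i) {
			data.Swap(kept, i)
			kept++
		}
	}
	return kept
}

// IntSlice is the boilerplate that has to be repeated for every element type.
type IntSlice []int

func (s IntSlice) Len() int      { return len(s) }
func (s IntSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }

func main() {
	numbers := IntSlice{1, 2, 3, 4, 5, 6}
	// Remove the even numbers.
	n := RemoveIf(numbers, func(i int) bool { return numbers[i]%2 == 0 })
	numbers = numbers[:n]
	fmt.Println(numbers) // prints [1 3 5]
}
```

It works, but the wrapper type and its two methods have to be written again for every slice type, which is exactly the drag described above.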
The same problem arises if you want to create a generic data structure, like a «set» or a B-tree of anything (interface or native type), while keeping type safety.
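As an illustration of the type-safety problem, here is a minimal sketch of a «set of anything» built on the empty interface. The «Set» type and its methods are hypothetical names for this example only.

```go
package main

import "fmt"

// Set is a hypothetical all-purpose set: it accepts any comparable value.
type Set map[interface{}]struct{}

// Add inserts a value of any type.
func (s Set) Add(v interface{}) { s[v] = struct{}{} }

// Has reports whether the value is in the set.
func (s Set) Has(v interface{}) bool {
	_, ok := s[v]
	return ok
}

func main() {
	ids := Set{}
	ids.Add(42)
	ids.Add("forty-two") // compiles: nothing says this is a set of ints

	// Reading values back requires a type assertion (a cast) every time.
	for v := range ids {
		if n, ok := v.(int); ok {
			fmt.Println("int:", n)
		} else {
			fmt.Println("not an int:", v)
		}
	}
}
```

The compiler cannot reject the string, so the mistake only shows up at run time, in the type assertion.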
That’s a pity, because Go has some great features. Maybe I will try Rust.
In my previous attempt at parsing OpenStreetMap data, I found parsing XML slow. Fortunately, I realized that the data is also available in Google Protocol Buffer format, and:
- it’s 30-40% smaller than the bzip2-compressed XML (and you know, bzip2 requires a fair amount of CPU power)
- it’s much faster to parse: with Osmosis, reading a PBF file on my quad-core took 14 seconds, while reading the bzip2-compressed XML with the help of lbunzip2 (a multi-threaded decompressor) took 1 min 50 s. Ouch!
There is a Go library to handle Protocol Buffers, so I tried to write a PBF reader in this language to see how efficient it would be. My program worked like this:
- the main thread reads blocks from the file and passes them to the workers through a channel
- each worker (a goroutine) decompresses the block, unmarshals the data and processes it
- when there are no blocks left to process, the results of the workers are merged into a single image
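The structure above can be sketched roughly as follows. This is not the original program: «processBlock» is a hypothetical stand-in for the decompress, unmarshal and process step (here it only counts bytes), and the merge step adds up integers instead of building an image.

```go
package main

import (
	"fmt"
	"sync"
)

// processBlock stands in for "decompress, unmarshal, process".
// In this sketch it simply returns the block size.
func processBlock(block []byte) int {
	return len(block)
}

// run feeds the blocks to a pool of workers and merges their results.
func run(blocks [][]byte, workers int) int {
	jobs := make(chan []byte)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			partial := 0
			// Each worker consumes blocks until the channel is closed.
			for block := range jobs {
				partial += processBlock(block)
			}
			results <- partial
		}()
	}

	// Close the results channel once every worker has reported.
	go func() {
		wg.Wait()
		close(results)
	}()

	// The main goroutine plays the role of the file reader.
	for _, block := range blocks {
		jobs <- block
	}
	close(jobs)

	// Merge the partial results of the workers.
	total := 0
	for partial := range results {
		total += partial
	}
	return total
}

func main() {
	blocks := [][]byte{[]byte("abc"), []byte("de"), []byte("fghi")}
	fmt.Println(run(blocks, 5)) // prints 9
}
```

Each worker keeps its own partial result and sends it only once, so the workers never contend on shared state while processing.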
So what kind of performance did I get? I got the best result with a number of workers equal to the number of cores + 1 (so 5 workers): about 28 seconds. I cannot compare this result directly with Osmosis (my program does not handle all the cases), but it’s quite acceptable.
I find Go a nice language to use, and it compiles very, very quickly. I struggled a little with some points, and not everything is clear to me yet. It feels strange not to program in an OO way. And I can’t tell whether I should spawn tons of goroutines or use a pool of workers, or whether I should pass callback functions or channels.
It’s also a pity that the tremendous performance is not there yet. Go is supposed to be «close to the metal», «a language for system programming», but for the moment (after 3 years) it is not as fast as Java.
I wanted to write an OpenStreetMap XML processor in Go, hoping to get a performance boost over my Python implementation. It ended up being slower. Python uses Expat (written in C), and maybe the Go package «encoding/xml» is not the state of the art in optimization.
I wrote simple programs handling the «start element» event. In 10 seconds, I could parse the following amounts of XML data (Athlon II X4 620):
- PyPy: did not run because of a bug (no progressive parsing)
- Go: 70 MB
- Python 2.7: 210 MB
- Python 3.2: 215 MB
- Java 7: 460 MB
- C++ / libxml: 675 MB
I tried to use Expat or Libxml from Go, but for the moment it is just too complicated. In Go code, it’s easy to call C functions located in shared libraries, but if you need to pass callback functions written in Go to a library written in C, you have to do dirty things (create wrappers in a Go module containing C code).
That’s a pity, because the Go compiler automatically generates C wrappers for your exported Go functions, but you cannot get a raw pointer to these wrappers (that way I would have been able to pass my callbacks to LibXML or Expat)… See you later, Go.