I wanted to write an OpenStreetMap XML processor in Go, hoping for a performance boost over my Python implementation. It ended up being slower. Python uses Expat (written in C), and maybe Go's encoding/xml package is not the state of the art in optimization.
I wrote simple programs that handle only the "start element" event. In 10 seconds, each could parse the following amount of XML data (on an Athlon II X4 620):
- PyPy: did not run because of a bug (no progressive parsing)
- Go: 70 MB
- Python 2.7: 210 MB
- Python 3.2: 215 MB
- Java 7: 460 MB
- C++ / libxml: 675 MB
I tried to use Expat or libxml from Go, but for the moment it is just too complicated. In Go code, it’s easy to call C functions located in shared libraries, but if you need to pass callback functions written in Go to a library written in C, you have to do dirty things (create wrappers in a Go module that contains C code).
That’s a pity, because the Go compiler automatically generates C wrappers for your exported Go functions, but you cannot get a raw pointer to these wrappers (with one, I could have passed my callbacks to libxml or Expat)… See you later, Go.
Ping : Golang, Openstreetmap, threads | Fabsk.eu
Hi, any code by chance? I am trying to read OSM XML and am not sure what kind of struct definitions I should be using to capture the XML.
Sorry, I don’t have useful code, only this snippet. If you plan to read huge OSM data, I suggest that you use the PBF (protocol buffers) format instead; it’s much faster for a program to read. I can’t remember exactly how, but there is a program that will generate all the Go structures for you from the PBF specification. If you want to go with PBF, I can publish my code (which is 2 years old, so maybe outdated).
Thanks mate. I took your snippet and rewrote it so it’s neater and captures all the important information. Code here: https://github.com/twitchyliquid64/go-osm-parse/blob/master/main.go