

{"id":963,"date":"2012-11-19T16:47:30","date_gmt":"2012-11-19T15:47:30","guid":{"rendered":"http:\/\/fabsk.eu\/blog\/?p=963"},"modified":"2012-11-29T19:47:55","modified_gmt":"2012-11-29T18:47:55","slug":"golang-openstreetmap-threads","status":"publish","type":"post","link":"https:\/\/fabsk.eu\/blog\/2012\/11\/19\/golang-openstreetmap-threads\/","title":{"rendered":"Golang, Openstreetmap, threads"},"content":{"rendered":"<p>In my <a title=\"Go XML sax-like parsing is slow\" href=\"http:\/\/fabsk.eu\/blog\/2012\/11\/04\/go-xml-sax-like-parsing-is-slow\/\">previous attempt<\/a> at parsing OpenStreetMap data, I found parsing XML data slow. Fortunately I realized that the data was also <a title=\"Geofabrik OpenStreetMap Extracts\" href=\"http:\/\/download.geofabrik.de\/openstreetmap\/\">available<\/a> in <a href=\"https:\/\/developers.google.com\/protocol-buffers\/\">Google Protocol Buffer<\/a> <a title=\"Openstreetmap PBF file format\" href=\"http:\/\/wiki.openstreetmap.org\/wiki\/PBF_Format\">format<\/a>, and<\/p>\n<ul>\n<li>it&rsquo;s 30-40% smaller than bzip2-compressed XML (and bzip2, as you know, requires a fair amount of CPU power)<\/li>\n<li>it&rsquo;s much faster to parse: with <a href=\"http:\/\/wiki.openstreetmap.org\/wiki\/Osmosis\">Osmosis<\/a>, reading a PBF file on my quad-core took 14 seconds, while reading the bzip2-compressed XML with the help of lbunzip2 (a multi-threaded decompressor) took 1 min 50 s. Ouch!<\/li>\n<\/ul>\n<p>There is a <a href=\"https:\/\/code.google.com\/p\/goprotobuf\/\">Go library to handle Protocol Buffers<\/a>, so I tried to write a PBF reader in this language to see how efficient it would be. 
My program worked like this:<\/p>\n<ul>\n<li>the main thread would read blocks from the file and pass them to the workers using a channel<\/li>\n<li>each worker (a goroutine) would decompress the block, unmarshal the data and process it<\/li>\n<li>once there were no blocks left to process, the results of the workers would be merged into a single image<\/li>\n<\/ul>\n<p>So what kind of performance did I get? I got the best result with a number of workers equal to the number of cores + 1 (so 5 workers): about 28 seconds. I cannot compare this result directly with Osmosis (my program does not handle all the cases), but it&rsquo;s quite acceptable.<\/p>\n<p><a href=\"http:\/\/fabsk.eu\/blog\/wp-content\/uploads\/2012\/11\/golang_threads.png\"><img loading=\"lazy\" decoding=\"async\" title=\"Golang OSM threads perf\" src=\"http:\/\/fabsk.eu\/blog\/wp-content\/uploads\/2012\/11\/golang_threads.png\" alt=\"\" width=\"886\" height=\"465\" \/><\/a><\/p>\n<p>I find Go a nice language to use, and it compiles very, very quickly. I struggled a little with some points, and not everything is clear to me yet. It feels strange not to program in an OO way. And I am not sure whether I should launch tons of goroutines or use a pool of workers, or whether I should pass callback functions or channels.<\/p>\n<p>It&rsquo;s also a pity that the tremendous performance is not there yet. Go is supposed to be \u00abclose to the metal\u00bb, \u00aba language for system programming\u00bb, but for the moment (after 3 years) it is not as fast as Java.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In my previous attempt at parsing OpenStreetMap data, I found parsing XML data slow. 
Fortunately I realized that the data was also available in Google Protocol Buffer format, and it&rsquo;s 30-40% smaller than bzip2 XML (and you know, bzip2 requires a fair amount of CPU power) it&rsquo;s much faster to parse: with Osmosis, reading a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24,25,26],"tags":[],"class_list":["post-963","post","type-post","status-publish","format-standard","hentry","category-dev","category-golang","category-openstreetmap"],"_links":{"self":[{"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/posts\/963","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/comments?post=963"}],"version-history":[{"count":13,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/posts\/963\/revisions"}],"predecessor-version":[{"id":978,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/posts\/963\/revisions\/978"}],"wp:attachment":[{"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/media?parent=963"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/categories?post=963"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fabsk.eu\/blog\/wp-json\/wp\/v2\/tags?post=963"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}