Sunday, July 18, 2010

OGR DXF Upgrade

Well, I just finished up a week of work implementing PCIDSK Vector write and update support. My next task is an upgrade to the OGR DXF driver requested by Stadt Uster to better meet their production requirements.

They need the ability to control layer naming, produce dash patterns, and write objects as block references.

Currently the DXF writer takes everything except the entities section directly from a template file. This has meant it was not practical to create layers, line styles and block references on the fly. This is going to have to change now, though we will continue to use the template header extensively.

The first approach considered was just to require the user to develop a template header with all the layer names, line styles and block references predefined. This might have been adequate for Uster, who have specific needs; once a template was prepared they could generally just reuse it for additional products. But it would have made the new capabilities of very little use to other users of GDAL.

So the planned approach has two parts. First we scan the template header for layer definitions and block definitions (and possibly line style definitions). Then, as we go through the entities, if we find these layers or blocks referenced we just use them directly.

However, if the objects being written to the DXF file reference layers (based on the "Layer" attribute) that are not defined in the template header, we will automatically create a new layer in the header, matching the configuration of the default layer ("0"). This means we can automatically create layers that have their own identity even if they are otherwise indistinguishable from one another.
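As a rough illustration of the scan-then-create idea, here is a minimal Python sketch. It is not the driver code (which is C++ inside GDAL); the function names and the use of plain dicts for features are assumptions made up for this example. It relies only on the DXF convention that a file is a sequence of group code and value lines, with layer records living in a table named LAYER.

```python
def read_pairs(text):
    """Split DXF text into (group code, value) pairs; DXF alternates code and value lines."""
    lines = [line.strip() for line in text.splitlines()]
    return [(int(lines[i]), lines[i + 1]) for i in range(0, len(lines) - 1, 2)]


def template_layer_names(header_text):
    """Collect the names of LAYER records inside the LAYER table of a template header."""
    names = set()
    in_layer_table = False
    in_layer_record = False
    previous = None
    for code, value in read_pairs(header_text):
        if code == 0:
            # A new record starts; it is a layer record only inside the LAYER table.
            in_layer_record = in_layer_table and value == "LAYER"
            if value == "ENDTAB":
                in_layer_table = False
        elif code == 2:
            if previous == (0, "TABLE"):
                # The code 2 pair right after TABLE names the table itself.
                in_layer_table = value == "LAYER"
            elif in_layer_record:
                names.add(value)
        previous = (code, value)
    return names


def layers_to_create(template_names, features):
    """Return layer names referenced by features but absent from the template, in first-seen order.

    Features are plain dicts here (a hypothetical stand-in for OGR features);
    a feature without a "Layer" attribute falls back to the default layer "0".
    """
    missing = []
    for feature in features:
        name = feature.get("Layer", "0")
        if name not in template_names and name not in missing:
            missing.append(name)
    return missing


# A tiny hand-written header fragment containing layers "0" and "ROADS".
sample_header = "\n".join([
    "0", "TABLE", "2", "LAYER", "70", "2",
    "0", "LAYER", "2", "0", "70", "0", "62", "7", "6", "CONTINUOUS",
    "0", "LAYER", "2", "ROADS", "70", "0", "62", "1", "6", "CONTINUOUS",
    "0", "ENDTAB",
])
sample_features = [{"Layer": "ROADS"}, {"Layer": "PARCELS"}, {"Layer": "PARCELS"}, {}]

known = template_layer_names(sample_header)
new_layers = layers_to_create(known, sample_features)
```

In this sketch only "PARCELS" would need a new header entry, cloned from the configuration of layer "0".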

For block references we use a roughly similar approach. We prescan for block definitions, but then we extend them with any entities written to an OGR layer named "Blocks". Normally all DXF entities are exposed through an OGR layer called "entities", though when writing we accept any layer name, except now for "Blocks", which is special. Then when we write we allow these blocks to be referenced based on a BlockName attribute.

Corresponding behavior will also be available when reading. If desired (a config option turned on) we will expose block definitions as a Blocks layer, and each block reference in the entities layer will just be a point feature with insertion information. The default behavior will remain what it is now, which is to inline copies of the block geometries for each block reference, as this is the only approach that will be handled gracefully by most applications or writers.

For line styles it is not clear that it is helpful to predefine them (though I'll examine that during implementation). The only aspect we are interested in preserving and producing is dash-dot patterns. While writing, I will create a new line style for each distinct pattern encountered, and then write them all out to the header.
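One way to picture "a line style per pattern" is a small registry keyed on the pattern itself. The Python sketch below is only an illustration; the class name, the generated names, and the simplified record layout are assumptions, not what the driver will emit. The group codes are the usual LTYPE ones (2 name, 73 element count, 40 total length, 49 per element, with negative values meaning gaps), and newer DXF versions would also need handles and subclass markers that are left out here.

```python
class LineTypeRegistry:
    """Hand out one generated line type name per distinct dash pattern (hypothetical helper)."""

    def __init__(self):
        self._names = {}  # pattern tuple -> generated name, in first-seen order

    def name_for(self, pattern):
        """Return the name for a pattern, creating a new one on first sight."""
        key = tuple(float(x) for x in pattern)
        if key not in self._names:
            self._names[key] = "AutoLineType-%d" % (len(self._names) + 1)
        return self._names[key]

    def ltype_records(self):
        """Build simplified LTYPE records as lists of (group code, value) pairs."""
        records = []
        for pattern, name in self._names.items():
            record = [
                (0, "LTYPE"),
                (2, name),
                (70, 0),
                (3, ""),                                # description left empty
                (72, 65),                               # alignment code, always 65
                (73, len(pattern)),                     # number of dash elements
                (40, sum(abs(x) for x in pattern)),     # total pattern length
            ]
            # Positive lengths are dashes, negative are gaps, zero is a dot.
            record.extend((49, x) for x in pattern)
            records.append(record)
        return records


registry = LineTypeRegistry()
first = registry.name_for([4, -2])
again = registry.name_for((4.0, -2.0))   # same pattern, so the same name comes back
other = registry.name_for([1, -1, 0, -1])
records = registry.ltype_records()
```

Features then carry only the generated name, and the accumulated records are written to the header when the file is finalized.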

Currently the DXF writer copies the template header to the output file, then writes entities, then appends the trailer template. In the future I will need to write the entities to a temporary file, storing up block, layer and line style definitions in memory. Then on closing the dataset I will need to compose the full header, then append the entities and the trailer.
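The deferred-write flow might look roughly like the following Python sketch. It is a toy model under stated assumptions: the class and method names are invented, the templates are plain strings, and the "composed header" is just the template plus a DXF comment (group code 999) listing the layers seen, standing in for real layer definitions.

```python
import os
import shutil
import tempfile


class DeferredDXFWriter:
    """Toy writer: entities go to a temp file, the header is composed at close time."""

    def __init__(self, path, header_template, trailer_template):
        self.path = path
        self.header_template = header_template
        self.trailer_template = trailer_template
        self.layer_names = []  # definitions accumulated in memory while writing
        self._entities = tempfile.NamedTemporaryFile("w+", suffix=".tmp", delete=False)

    def write_entity(self, layer_name, entity_text):
        """Remember the layer and spool the entity text to the temporary file."""
        if layer_name not in self.layer_names:
            self.layer_names.append(layer_name)
        self._entities.write(entity_text)

    def compose_header(self):
        """Stand-in for real header composition: template plus a comment naming the layers."""
        return self.header_template + "999\nlayers: %s\n" % ", ".join(self.layer_names)

    def close(self):
        """Write header, then spooled entities, then trailer; always remove the temp file."""
        try:
            self._entities.seek(0)
            with open(self.path, "w") as out:
                out.write(self.compose_header())
                shutil.copyfileobj(self._entities, out)
                out.write(self.trailer_template)
        finally:
            self._entities.close()
            os.remove(self._entities.name)
```

The try/finally in close() addresses only part of the cleanup problem: if the process dies before close() is called, the temporary file is still left behind.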

I dislike this pattern. It introduces a need for temporary files, which can lead to surprising disk use requirements. Also, if something goes wrong the temp files may not get cleaned up properly. We also lose any hope of streaming operation. However, to achieve what we want with the DXF driver it seems unavoidable.

Hopefully I'll have the DXF changes in trunk within a couple weeks. If there are folks interested in DXF generation, keep an eye on SVN for updates. Beta testers appreciated!