- Andi's notes (en)
OpenOffice uses a relatively simple file format consisting of zipped XML files. Unfortunately, OpenOffice isn't used in most German companies, nor in any of our pilot users' companies.
Open XML is Microsoft's answer to the OpenDocument format. The two formats seem to be very similar.
The filter layer could be implemented by copying the ODT solution and adapting it to the specifics of Open XML. An alternative could be the OpenXML PHP API.
Using the same mechanisms for ODT and OpenXML would be preferable IMHO.
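A minimal sketch of what a shared mechanism could look like, assuming both formats are handled as ZIP archives whose body sits at a conventional member path (`content.xml` for ODT, `word/document.xml` for .docx as Word writes it). The names `MAIN_PARTS`, `detect_format` and `read_main_xml` are hypothetical and not part of any existing filter layer.

```python
import zipfile
import xml.etree.ElementTree as ET

# Archive member that holds the document body, per format (assumed paths).
MAIN_PARTS = {
    "odt": "content.xml",
    "docx": "word/document.xml",
}


def detect_format(archive):
    """Guess the format from the members present in the archive."""
    names = set(archive.namelist())
    for fmt, part in MAIN_PARTS.items():
        if part in names:
            return fmt
    raise ValueError("neither an ODT nor a .docx archive")


def read_main_xml(source):
    """Return (format, root element) for a path or file-like object."""
    with zipfile.ZipFile(source) as archive:
        fmt = detect_format(archive)
        with archive.open(MAIN_PARTS[fmt]) as part:
            return fmt, ET.parse(part).getroot()
```

Only the table of member paths differs between the two formats here; everything after parsing would still need format-specific handling of the XML vocabulary.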
A Word-generated .docx is not easily parsable, even when document templates were used to create it. There seem to be no semantics in the structure; everything is done through the same elements with different styles attached!
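To illustrate the problem, here is a rough sketch that lists every paragraph of a .docx body together with the style identifier attached to it. It assumes the root element of `word/document.xml` has already been parsed; `paragraphs_with_styles` is a hypothetical helper name. The element is always `w:p`, so any role such as heading or quote would have to be inferred from the style identifier.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def paragraphs_with_styles(root):
    """Yield (style identifier or None, text) for every w:p element."""
    for para in root.iterfind(".//w:p", NS):
        # The paragraph style, if any, is referenced in w:pPr/w:pStyle.
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        # The visible text is spread over w:t elements inside runs.
        text = "".join(t.text or "" for t in para.iterfind(".//w:t", NS))
        yield style_id, text
```

A paragraph without a `w:pStyle` element yields `None`, which would typically mean the default paragraph style applies.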
To check:
The .doc format is probably the most widely used format, but it is a proprietary binary blob.
3rd party tools: