A client site recently implemented PeopleSoft CRM, which handles Word 2007 documents as attachments. This site's CRM is configured to FTP all attachments to and from a fixed location and does not manipulate the file data. The CRM database has a table tracking file names, and there is a configuration file associating file name extensions with applications. For example, if a Word 2007 document is attached as "filename" rather than "filename.docx", it will not open correctly from CRM. This spurred me to look again at how PassPort / Asset Suite works with embedded documents.
The question came to mind: "What if the OLE wrapper is removed from Word 2007 embedded documents?" So I tried it by a) reading the data out of tidblob and writing it to a file, b) uncompressing the file, c) using a hex editor to strip the OLE data out of the file, and d) writing the file data back to tidblob as big B data (letting Portal/J, or P/J, know not to try to uncompress it). Accessing the object via P/J on a PC with Word 2003 did not work. When Word 2003 attempted to open the file, it didn't know what the data was and prompted for what looked like code pages (Select the encoding that makes your document readable: "USA-ASCII", "Western European", etc.). Every selection failed; P/J hung and had to be killed via the Windows Task Manager.
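Step c) could in principle be scripted rather than done in a hex editor. The following is only a minimal sketch, assuming the blob has already been read out of tidblob and uncompressed (the compression scheme is not covered here), and assuming the Word 2007 package sits in the OLE wrapper as one contiguous run of bytes, which OLE does not guarantee. The function name extract_embedded_docx is my own, not part of any product. It relies on the fact that a .docx file is a ZIP archive: it looks for the first ZIP local file header and the last end-of-central-directory record and keeps what lies between them.

```python
import io
import struct
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_LOCAL = b"PK\x03\x04"   # ZIP local file header signature
ZIP_EOCD = b"PK\x05\x06"    # ZIP end-of-central-directory signature
EOCD_FIXED_LEN = 22         # fixed part of the EOCD record, in bytes


def extract_embedded_docx(data: bytes) -> bytes:
    """Return the ZIP package found inside an uncompressed, OLE-wrapped blob.

    Hypothetical helper: assumes the package is stored contiguously.
    """
    start = data.find(ZIP_LOCAL)
    eocd = data.rfind(ZIP_EOCD)
    if start < 0 or eocd < start or eocd + EOCD_FIXED_LEN > len(data):
        raise ValueError("no complete ZIP package found in blob")

    # The last two bytes of the fixed EOCD record hold the comment length.
    (comment_len,) = struct.unpack_from("<H", data, eocd + 20)
    package = data[start:eocd + EOCD_FIXED_LEN + comment_len]

    # Sanity check: a Word 2007 package must contain [Content_Types].xml.
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        if "[Content_Types].xml" not in zf.namelist():
            raise ValueError("ZIP found, but it is not an Office package")
    return package
```

If the sanity check raises, the package was probably fragmented inside the wrapper or was not a Word 2007 document to begin with, and the blob should be left alone.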
The P/J server was then changed via the P/J admin so that RichText = Word.Document.12. When P/J with Word 2003 opens the document, it now seems to know the document is a Word document and prompts for the type of Word document, e.g., "Word 97", "Word 2003", "Word 2007". Selecting "Word 2007" opens the document successfully. This might be contingent on the panel (sym) definition's OLE Data Dictionary element being associated with RichText.
When Word 2003 saves the document, the tidblob data is rewritten as an OLE object (tidoblok.ole_object_class = Word.Document.12 and tidblob.ole_object_type = b) with what appears to be junk Office Open XML (i.e., Word 2007) data carried along. The junk data does not appear to affect the usability of the document and can only be seen by reading the data out of tidblob and uncompressing it. Word 2007 is able to read the document, and if Word 2007 writes it, the document is saved in tidblob as an OLE object.
I assume that if Word 2003 rewrites a Word 2007 document that contains features unique to Word 2007, those features will be lost, but I haven't seen this yet.
It should be possible to create a process that periodically looks for Word 2007 documents with OLE wrappers and removes the wrapper. All(?) versions of Word would then be able to read the document, and Word 2007 would be able to maintain it cleanly. Make certain that the Word 2007 default save format is .docx. Hmm... need to re-verify this.
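The "looks for" half of such a process might be sketched as below. This is an assumption-laden illustration, not a tested job: the row shape (key, ole_object_class, data) is hypothetical, the data is assumed to be already uncompressed, and the database access is left out because the table layout beyond the columns named above isn't described here. It classifies each blob by its leading signature bytes and yields only the keys of Word.Document.12 objects that are OLE files with a ZIP package inside.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_LOCAL = b"PK\x03\x04"                      # ZIP local file header signature


def classify(data: bytes) -> str:
    """Classify an uncompressed blob by its signature bytes."""
    if data.startswith(ZIP_LOCAL):
        return "bare-package"          # already a plain .docx-style ZIP
    if data.startswith(OLE_MAGIC):
        # OLE file; a ZIP header inside suggests a wrapped Word 2007 package.
        return "ole-wrapped-package" if ZIP_LOCAL in data else "ole-binary"
    return "unknown"


def candidates(rows):
    """Yield keys of rows whose OLE wrapper could be removed.

    rows: iterable of (key, ole_object_class, data) tuples (hypothetical shape).
    """
    for key, ole_object_class, data in rows:
        if (ole_object_class == "Word.Document.12"
                and classify(data) == "ole-wrapped-package"):
            yield key
```

Filtering on both the class name and the content means an older binary Word document, which is also an OLE file, is never selected for unwrapping.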