Thursday, July 25, 2019

Parse XML with a namespace using XSLT

There is an XML file (sample.xml) whose elements use a namespace prefix. I want to retrieve the <w:t> value of each <w:r>, but only for <w:r> elements that do not contain a <w:pict> child. For the XML document below, I want to generate the following output:
<paragraph>This is the text that i need to retrieve...</paragraph>
    
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="sample.xsl"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
    <w:p> <!-- Current Node -->
        <w:r>
            <w:t>
                 This is the
            </w:t>
        </w:r>
        <w:r>
            <w:pict>
                <w:p>
                    <w:r>
                        <w:t>
                            I dont need this
                        </w:t>
                    </w:r>
                </w:p>
            </w:pict>
        </w:r>
        <w:r>
            <w:pict>
                <w:p>
                    <w:r>
                        <w:t>
                            I dont need this too
                        </w:t>
                    </w:r>
                </w:p>
            </w:pict>
        </w:r>
        <w:r>
            <w:t>
                 text that i need to retrieve...
            </w:t>
        </w:r>
    </w:p>
</w:body>
</w:document>
        
The corresponding XSLT file (sample.xsl) is as follows:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<xsl:template match="/">
  <html>
    <body>
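      <table border="1"><tr><td>
        <![CDATA[<paragraph>]]>
          <!-- Runs that neither contain nor sit inside a w:pict -->
          <xsl:for-each select="//w:r[not(w:pict) and not(ancestor::w:pict)]">
            <!-- Trim each run's text; separate runs with one space -->
            <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
            <xsl:value-of select="normalize-space(w:t)"/>
          </xsl:for-each>
        <![CDATA[</paragraph>]]></td></tr>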
      </table>
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>
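
The stylesheet above wraps the result in an HTML page and writes the paragraph tags as escaped text, which suits viewing in a browser. If the goal is a real <paragraph> element, a variant along the following lines might work. This is only a sketch under a few assumptions: the paragraphs of interest are direct children of <w:body>, each run carries a single <w:t>, and the transform is run by a standalone XSLT 1.0 processor rather than by the browser.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <!-- Emit plain XML instead of an HTML page. -->
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Assumption: only top-level paragraphs (children of w:body) matter. -->
  <xsl:template match="/">
    <xsl:apply-templates select="w:document/w:body/w:p"/>
  </xsl:template>

  <xsl:template match="w:p">
    <paragraph>
      <!-- Direct-child runs that do not wrap a w:pict. -->
      <xsl:for-each select="w:r[not(w:pict)]">
        <!-- Put a single space between consecutive runs. -->
        <xsl:if test="position() &gt; 1">
          <xsl:text> </xsl:text>
        </xsl:if>
        <!-- Trim the indentation around each run's text. -->
        <xsl:value-of select="normalize-space(w:t)"/>
      </xsl:for-each>
    </paragraph>
  </xsl:template>

</xsl:stylesheet>
```

Saved under a hypothetical name such as paragraph.xsl, it could be run with a command-line processor, for example `xsltproc paragraph.xsl sample.xml`. Because the template starts from the top-level <w:p> and looks only at its direct <w:r> children, the runs nested inside <w:pict> are never visited, so no ancestor test is needed.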
