Validating XML against an XML Schema with xerces-c

2015-07-10 10:24
The official xerces-c website has an article that explains how to perform XML Schema validation: http://xerces.apache.org/xerces-c/schema-3.html

The sample code given there:

// Instantiate the DOM parser.
XercesDOMParser parser;
parser.setDoNamespaces(true);
parser.setDoSchema(true);
parser.parse(xmlFile);


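However, the sample code on its own has no visible effect: no validation error is ever reported.

Two more things have to be done before calling XercesDOMParser::parse:

1. Call XercesDOMParser::setValidationScheme to set the validation scheme, either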

parser.setValidationScheme( XercesDOMParser::Val_Auto);

or

parser.setValidationScheme( XercesDOMParser::Val_Always);


2. Call XercesDOMParser::setErrorHandler. Its argument must point to an object of a class derived from ErrorHandler.

See the example below.

address.xml:

<?xml version="1.0" encoding="utf-8"?>
<Address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="address.xsd">
<Recipient>Mr. Walter C. Brown</Recipient>
<House>good</House>
<Street>Featherstone Street</Street>
<Town>LONDON</Town>
<PostCode>EC1Y 8SY</PostCode>
<Country>UK</Country>
</Address>


address.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Recipient" type="xs:string" />
        <xs:element name="House" type="xs:string" />
        <xs:element name="Street" type="xs:string" />
        <xs:element name="Town" type="xs:string" />
        <xs:element name="County" type="xs:string" minOccurs="0" />
        <xs:element name="PostCode" type="xs:unsignedInt" />
        <xs:element name="Country" minOccurs="0">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="IN" />
              <xs:enumeration value="DE" />
              <xs:enumeration value="ES" />
              <xs:enumeration value="UK" />
              <xs:enumeration value="US" />
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>


new_address.cpp:

#include <stdio.h>
#include <xercesc/parsers/XercesDOMParser.hpp>
#include <xercesc/sax/SAXException.hpp>
#include <xercesc/sax/SAXParseException.hpp>
#include <xercesc/dom/DOMException.hpp>
#include <xercesc/sax2/DefaultHandler.hpp>
#include <xercesc/util/PlatformUtils.hpp>
#include <xercesc/util/XMLException.hpp>
#include <xercesc/util/XMLString.hpp>

using namespace XERCES_CPP_NAMESPACE;

// Receives the problems found by the parser, including schema validation errors.
class SchemaErrorHandler : public DefaultHandler
{
public:
    SchemaErrorHandler() {}
    ~SchemaErrorHandler() {}

    void warning(const SAXParseException& exc)
    {
        report( "warn", exc );
    }

    void error(const SAXParseException& exc)
    {
        report( "error", exc );
    }

    void fatalError(const SAXParseException& exc)
    {
        report( "fatal", exc );
    }

    void resetErrors()
    {
        printf( "nothing\n" );
    }

private:
    // XMLString::transcode() returns a newly allocated string,
    // so it has to be released after printing.
    static void report(const char* level, const SAXParseException& exc)
    {
        char* msg = XMLString::transcode( exc.getMessage() );
        printf( "%s in line:%lu, col:%lu, %s\n", level,
                (unsigned long)exc.getLineNumber(),
                (unsigned long)exc.getColumnNumber(), msg );
        XMLString::release( &msg );
    }
};

int main(int argc, char* argv[] )
{
    if ( argc < 2 )
    {
        printf( "must specify a file\n" );
        return -1;
    }

    // Must be called before any other xerces-c API is used.
    XMLPlatformUtils::Initialize();

    XercesDOMParser parser;
    SchemaErrorHandler handler;
    try
    {
        parser.setErrorHandler( &handler );
        parser.setDoNamespaces(true);
        parser.setDoSchema(true);
        //parser.setValidationScheme( XercesDOMParser::Val_Auto);
        parser.parse( argv[1] );
    }
    catch ( SAXException& e )
    {
        char* msg = XMLString::transcode( e.getMessage() );
        printf( "msg:%s\n", msg );
        XMLString::release( &msg );
        return -2;
    }
    catch ( XMLException& e )
    {
        char* msg = XMLString::transcode( e.getMessage() );
        printf( "code:%d, msg:%s\n", (int)e.getCode(), msg );
        XMLString::release( &msg );
        return -3;
    }
    catch ( DOMException& e )
    {
        // The DOMException message is an XMLCh string and needs transcoding too.
        char* msg = XMLString::transcode( e.getMessage() );
        printf( "code:%d, msg:%s\n", (int)e.code, msg );
        XMLString::release( &msg );
        return -4;
    }

    return 0;
}

Note that the code above has this line commented out:

//parser.setValidationScheme( XercesDOMParser::Val_Auto);


Compile and run. No error is reported:

[xuzhina@localhost sample]$ g++ -g -o new_address new_address.cpp -lxerces-c
[xuzhina@localhost sample]$ ./new_address address.xml
[xuzhina@localhost sample]$




Next, uncomment

parser.setValidationScheme( XercesDOMParser::Val_Auto);

but comment out

parser.setErrorHandler( &handler );

Compile and run. Again, no error is reported:

[xuzhina@localhost sample]$ g++ -g -o new_address new_address.cpp -lxerces-c
[xuzhina@localhost sample]$ ./new_address address.xml
[xuzhina@localhost sample]$




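Finally, uncomment

parser.setErrorHandler( &handler );

as well, so that both calls are active, then compile and run. This time the validation error is reported: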

[xuzhina@localhost sample]$ ./new_address address.xml
error in line:8, col:31, value 'EC1Y 8SY' does not match regular expression facet '[+\-]?[0-9]+'
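
The facet quoted in the message, [+\-]?[0-9]+, is the lexical pattern that xs:unsignedInt inherits from the integer types. The following is only a rough sketch of that check using the standard library, not the code xerces-c actually runs; the function name looksLikeUnsignedInt is made up for illustration, and the range limit 4294967295 is the usual maximum of xs:unsignedInt.

```cpp
#include <regex>
#include <string>

// Simplified approximation of the xs:unsignedInt check (NOT xerces-c's code):
//   1. drop surrounding whitespace,
//   2. apply the pattern facet quoted in the error message,
//   3. check the value range 0..4294967295.
bool looksLikeUnsignedInt(const std::string& raw)
{
    // Trim leading and trailing XML whitespace.
    const char* ws = " \t\r\n";
    const std::string::size_type first = raw.find_first_not_of(ws);
    if (first == std::string::npos)
        return false;                       // empty or whitespace only
    const std::string::size_type last = raw.find_last_not_of(ws);
    const std::string value = raw.substr(first, last - first + 1);

    // Same language as [+\-]?[0-9]+ : optional sign, then digits only.
    static const std::regex pattern("[+-]?[0-9]+");
    if (!std::regex_match(value, pattern))
        return false;

    // Split off the sign and ignore leading zeros.
    const bool negative = (value[0] == '-');
    const bool hasSign = (value[0] == '+' || value[0] == '-');
    std::string digits = hasSign ? value.substr(1) : value;
    const std::string::size_type nonZero = digits.find_first_not_of('0');
    if (nonZero == std::string::npos)
        return true;                        // zero, in any spelling
    if (negative)
        return false;                       // negative values are out of range
    digits = digits.substr(nonZero);
    if (digits.size() > 10)
        return false;                       // more digits than 4294967295 has
    return std::stoull(digits) <= 4294967295ULL;
}
```

With this sketch, looksLikeUnsignedInt("EC1Y 8SY") returns false because the letters and the inner space break the pattern, while looksLikeUnsignedInt("49") returns true.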


Run xmllint on the same files to compare the result:

[xuzhina@localhost sample]$ xmllint --schema address.xsd address.xml
<?xml version="1.0" encoding="utf-8"?>
<Address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="address.xsd">
<Recipient>Mr. Walter C. Brown</Recipient>
<House>good</House>
<Street>Featherstone Street</Street>
<Town>LONDON</Town>
<PostCode>EC1Y 8SY</PostCode>
<Country>UK</Country>
</Address>
address.xml:8: element PostCode: Schemas validity error : Element 'PostCode': 'EC1Y 8SY' is not a valid value of the atomic type 'xs:unsignedInt'.
address.xml fails to validate


PS:

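In XML Schema, xs:string accepts the text of any other type, because every character sequence is a valid string. For example, if the House element contains a number such as 49, neither xmllint nor xerces reports anything wrong with it. I once lost a whole afternoon to this.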