Information Extraction and Integration Technology Based on Metadata and XML

  • Abstract: To obtain a unified data format that simplifies data manipulation and processing, this paper presents a method that uses metadata-based template customization to extract information. The method extracts information from unstructured text, converts it into a uniform XML format, and then integrates the XML data into a relational database. It has been applied successfully in the enterprise informatization project of a shipyard and offers a new Word-document-oriented solution to the enterprise information integration problem.
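
The pipeline summarized in the abstract (template-driven extraction, conversion to XML, loading into a relational database) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The template format (a field name mapped to a regular expression), the field names, the sample text, and the `parts` table are all hypothetical, and the text is assumed to have already been read out of the Word document.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template metadata: field name -> regex with one capture group.
# The paper's actual template format is not given in the abstract.
TEMPLATE = {
    "part_no": r"Part No\.:\s*(\S+)",
    "material": r"Material:\s*(.+)",
    "quantity": r"Quantity:\s*(\d+)",
}

# Hypothetical text, assumed to be already extracted from a Word document.
SAMPLE_TEXT = """Part No.: HX-1024
Material: marine steel plate
Quantity: 12
"""


def extract(text, template):
    """Apply each template pattern to the text and collect the matched fields."""
    record = {}
    for field, pattern in template.items():
        match = re.search(pattern, text)
        if match:
            record[field] = match.group(1).strip()
    return record


def to_xml(record, root_tag="record"):
    """Serialize the extracted fields as one flat XML element."""
    root = ET.Element(root_tag)
    for field, value in record.items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")


def load_xml(conn, xml_text, template, table="parts"):
    """Create a table with one TEXT column per template field and insert the record."""
    fields = list(template)
    columns = ", ".join(f'"{name}" TEXT' for name in fields)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    root = ET.fromstring(xml_text)
    row = [root.findtext(name) for name in fields]  # None for a missing field
    placeholders = ", ".join("?" for _ in fields)
    conn.execute(f'INSERT INTO "{table}" VALUES ({placeholders})', row)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    xml_text = to_xml(extract(SAMPLE_TEXT, TEMPLATE))
    load_xml(conn, xml_text, TEMPLATE)
    print(xml_text)
    print(conn.execute('SELECT * FROM "parts"').fetchall())
```

Keeping the template as data rather than code means a new document type only needs a new set of patterns. The XML step gives every document type the same intermediate form before it reaches the database.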

     
