Content-based document conversion

This project develops a methodology and a prototype system for extracting content-based information from input documents and converting those documents into XML documents that conform to pre-specified DTDs. The system guides an end user in identifying and developing concepts and attributes in an input document and mapping them to the document structure specified in the DTD. Conversion rules are derived from this mapping and then used to create and embed tags in the input document, producing an XML document that is valid with respect to the DTD.
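
As a rough illustration of what a conversion rule might look like, the sketch below pairs each concept with an element name and a pattern that recognizes the concept in plain text. Everything in it is an assumption made for illustration: the rule format, the element names (document, title, author, email), the regular expressions, and the helper names convert and to_xml are hypothetical and are not taken from the prototype system. For brevity the sketch builds a new XML tree instead of embedding tags in place, and it does not validate the result against a DTD, which Python's standard library does not support.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical conversion rules: each pairs an element name (assumed to be
# declared in some DTD) with a pattern that recognizes the concept in text.
RULES = [
    ("title", re.compile(r"^Title:\s*(.+)$", re.MULTILINE)),
    ("author", re.compile(r"^Author:\s*(.+)$", re.MULTILINE)),
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
]


def convert(text, root_name="document"):
    """Apply each rule once and collect the matches under a root element."""
    root = ET.Element(root_name)
    for element_name, pattern in RULES:
        match = pattern.search(text)
        if match is None:
            continue  # concept not present in this input
        child = ET.SubElement(root, element_name)
        # Use the captured group when the pattern defines one,
        # otherwise the whole match.
        child.text = match.group(1) if pattern.groups else match.group(0)
    return root


def to_xml(text):
    """Serialize the converted tree to a string (escaping is handled by ET)."""
    return ET.tostring(convert(text), encoding="unicode")


if __name__ == "__main__":
    sample = "Title: Annual Report\nAuthor: Jane Doe\nContact: jane@example.org\n"
    print(to_xml(sample))
```

Keeping the rules as data, separate from the code that applies them, mirrors the idea that the rules come out of a user-guided mapping step and can change without touching the converter.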

If you would like more information about this work, please contact Aryya Gangopadhyay at gangopad@umbc.edu or by phone: +1-410-455-2620.