Sunday, January 13, 2013

Databases (Course / Tutorial), Part 2 - XML Documents

XML DATA

XML can be considered an alternative to the relational data model. There are situations in which XML is more suitable than the relational model. In general, you can say that XML is much more flexible because it was originally designed to share data. A relational model, for example, requires a schema, which means you have to plan and design up front, based on your understanding of the data model you are building.

Today, in many cases, XML is used as an intermediary layer between users and a database. For example, eBay offers an XML interface that can be used to access its data and build applications on it without directly accessing the database. It could therefore be said that XML is an excellent way to have computers talk to each other. For example, airlines publish XML documents at specific locations on the internet that allow programmers and travel agencies to access their schedules and availability and to create services like CheapOair or Kayak.

XML Structure (Well-Formed)

 
 <Bookstore>  
      <Book ISBN="978-456321789" Price="15" Edition="2nd">  
           <Title>Learn Python</Title>  
           <Authors>  
                <Author>  
                     <First_Name>Pythonian</First_Name>  
                     <Last_Name>Pythoner</Last_Name>  
                </Author>  
                <Author>  
                     <First_Name>Ricky</First_Name>  
                     <Last_Name>Rich</Last_Name>  
                </Author>  
           </Authors>  
      </Book>  
      <Book ISBN="4218-127894413" Price="12">  
           <Title>Swimming</Title>  
           <Authors>  
                <Author>  
                     <First_Name>Michael</First_Name>  
                     <Last_Name>Phelps</Last_Name>  
                </Author>  
           </Authors>  
           <Remark>  
                Buy this book bundled with a CD - Great Deal  
           </Remark>  
      </Book>  
 </Bookstore>  

As seen above, every opening tag has a matching closing tag, and the document is indented to make the hierarchy easy to read. The tags can have any name you like or your project needs. Notice that the second book doesn't have an Edition attribute, and that it has a Remark element the first book lacks. This is part of the great flexibility of XML. The structure of an XML document can then be thought of as:
  • Tagged Elements or Nested Elements
  • Attributes (in the elements)
  • Text
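
To make these three parts concrete, here is a minimal Python sketch that reads a shortened copy of the bookstore document with the standard library module xml.etree.ElementTree. The choice of Python and the variable names are my own assumptions for illustration; any language with an XML parser would work the same way.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Bookstore document above (authors left out).
BOOKSTORE_XML = """
<Bookstore>
  <Book ISBN="978-456321789" Price="15" Edition="2nd">
    <Title>Learn Python</Title>
  </Book>
  <Book ISBN="4218-127894413" Price="12">
    <Title>Swimming</Title>
    <Remark>Buy this book bundled with a CD - Great Deal</Remark>
  </Book>
</Bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)

books = []
for book in root.findall("Book"):           # nested elements
    books.append({
        "isbn": book.get("ISBN"),           # attribute
        "edition": book.get("Edition"),     # None when the attribute is missing
        "title": book.findtext("Title"),    # text inside an element
        "remark": book.findtext("Remark"),  # None when the element is missing
    })

for entry in books:
    print(entry)
```

The missing Edition attribute and the missing Remark element simply come back as None, which is the flexibility described above.
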
Well-formed XML is the most flexible kind of XML. The basic structural requirements of well-formed XML are:
  1. Single root element (in our example we start with <Bookstore>, so all our other XML content must come before the closing </Bookstore>).
  2. Matched tags and proper nesting (every element must be closed inside the element that contains it. For example, <Book><Title>...</Book></Title> is wrong because <Title> is closed after <Book>. The indentation above only makes the hierarchy visible; it has no effect on whether the document is well-formed).
  3. Unique attributes within elements (for example, having two ISBN attributes in the same <Book> tag is not allowed, so the document would not be well-formed).
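
The three requirements can be tried out with a small Python sketch. It assumes the standard library parser xml.etree.ElementTree; the helper name is_well_formed and the sample strings are hypothetical and only serve as an illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One good document and one violation of each requirement.
samples = {
    "good": '<Bookstore><Book ISBN="1"><Title>A</Title></Book></Bookstore>',
    "two roots": '<Book ISBN="1"/><Book ISBN="2"/>',
    "bad nesting": '<Book><Title>A</Book></Title>',
    "duplicate attribute": '<Book ISBN="1" ISBN="2"/>',
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, "->", "well-formed" if ok else "rejected")
```

Only the first sample is accepted; the other three are rejected by the parser before any application code sees the data.
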

An XML parser is used to check the three structural requirements of a well-formed document. Once the document has been parsed, the parser exposes the result through a standard interface such as DOM (Document Object Model) or SAX (Simple API for XML).
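
As a rough illustration of the two interfaces, the sketch below reads the same small document once with DOM and once with SAX, using Python's standard xml.dom.minidom and xml.sax modules. The document string and the TitleHandler class are assumptions made for this example.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = (
    '<Bookstore>'
    '<Book ISBN="978-456321789"><Title>Learn Python</Title></Book>'
    '<Book ISBN="4218-127894413"><Title>Swimming</Title></Book>'
    '</Bookstore>'
)

# DOM: the whole document is loaded into a tree that can be navigated.
dom = minidom.parseString(XML_TEXT)
dom_titles = [
    node.firstChild.data for node in dom.getElementsByTagName("Title")
]

# SAX: the parser reports events (start tag, text, end tag) as it reads.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        if self._in_title:
            self._buffer.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "Title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)

print(dom_titles)
print(handler.titles)
```

DOM is convenient when the document fits in memory and you want to move around in it; SAX only keeps what the handler chooses to store, which suits large documents.
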

Well-formed XML is useful when data is not structured, meaning that it is irregular. If there is uncertainty about what kind of data a document may carry, then a more structured design approach (for example DTD/XSD, explained below) could be too complex to use.

Display/Render XML

There are two options for displaying XML documents in a nicely organized way: CSS or XSL. CSS is best known for styling HTML; for XML, however, XSL (Extensible Stylesheet Language) is often used.

A CSS/XSL interpreter is required. The interpreter applies the rules in the stylesheet to convert the XML into the generated HTML output. The XML document should still be parsed first to check its structural consistency.
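
The parse-then-convert flow can be sketched in Python. Note that this is not XSL: the standard library has no XSL interpreter, so the single conversion rule (each Book becomes an HTML list item) is hand-written in a hypothetical render_html function, purely to show the idea of rules that turn XML into HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

XML_TEXT = (
    '<Bookstore>'
    '<Book ISBN="978-456321789" Price="15"><Title>Learn Python</Title></Book>'
    '<Book ISBN="4218-127894413" Price="12"><Title>Swimming</Title></Book>'
    '</Bookstore>'
)

def render_html(xml_text):
    """Turn each Book into an HTML list item (hand-written rule, not XSL)."""
    root = ET.fromstring(xml_text)  # parsing also checks well-formedness
    items = []
    for book in root.findall("Book"):
        title = escape(book.findtext("Title", default=""))
        price = escape(book.get("Price", ""))
        items.append(f"<li>{title} ({price})</li>")
    return "<ul>" + "".join(items) + "</ul>"

html_output = render_html(XML_TEXT)
print(html_output)
```

With a real XSL stylesheet the rule would live in a separate file and the interpreter would apply it, so the same XML could be rendered in different ways without touching the data.
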

Valid XML

The previously mentioned well-formed XML is suitable for many different purposes. However, there are cases in which an XML document must also follow content-specific rules. Such rules are attached to an XML document using a DTD (Document Type Definition) or an XML Schema (XSD); of the two, XML Schema is the more powerful and more widely used.

The validation process is similar to the one for well-formed XML. The only difference is that the parser now also takes the content-specific specification into account and verifies that the content of the document follows it.
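
The difference between "well-formed" and "valid" can be illustrated with a short Python sketch. Python's standard library parsers do not validate against a DTD or XSD, so the two content-specific rules below (every Book needs an ISBN attribute and exactly one Title) are hand-coded in a hypothetical check_bookstore function. In a real project those rules would be written in a DTD or XSD and enforced by a validating parser.

```python
import xml.etree.ElementTree as ET

def check_bookstore(xml_text):
    """Return a list of rule violations; an empty list means 'valid'."""
    root = ET.fromstring(xml_text)  # step 1: the document must be well-formed
    errors = []
    # Step 2: content-specific rules (assumed for this example).
    if root.tag != "Bookstore":
        errors.append("root element must be Bookstore")
    for i, book in enumerate(root.findall("Book"), start=1):
        if book.get("ISBN") is None:
            errors.append(f"Book {i}: missing ISBN attribute")
        if len(book.findall("Title")) != 1:
            errors.append(f"Book {i}: expected exactly one Title")
    return errors

valid_doc = '<Bookstore><Book ISBN="1"><Title>A</Title></Book></Bookstore>'
invalid_doc = (
    '<Bookstore><Book><Title>A</Title><Title>B</Title></Book></Bookstore>'
)

print(check_bookstore(valid_doc))
print(check_bookstore(invalid_doc))
```

Both documents are well-formed, but only the first one satisfies the content rules.
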

DTD

DTD is a specification language. It allows you to define:
  • Elements
  • Attributes
  • Nesting
  • Ordering
  • Frequency of Occurrences
  • ID and IDREF(S): special attribute types that act as pointers within an XML document
Why use a DTD or XSD? The answer depends on how much flexibility you need and on your validation requirements. For example, if incoming XML data comes with a DTD/XSD, the data can be validated against that structural standard before your application uses it, which saves the processing and analysis time you would otherwise spend on consistency checks. Furthermore, documents with a defined structure make it easier to use the information they contain. A DTD/XSD is also useful for documenting data exchange processes.

Conclusion

In the next post I will include examples of both XSD and DTD. The most important takeaway from this post is to understand what XML is and how its structure makes it useful. As you've seen in the code above, it is really simple. Tags can be named according to your needs, and there are a few basic well-formedness rules to follow if you are working with data that does not require a fixed structure. If the data does require a structure, you now know that there are ways of giving the document a sort of schema, as in relational databases.
