Data Structure and Serialization Format
Outline
• Review Basic SQL Statements (create, insert, select, update)
• Data Serialization Formats
• Introduction to XML and XSD
• JSON and JSON Schema
• Apache AVRO
• Basic Linux Commands
Data Serialization
• Serialization – Introduction
• https://www.youtube.com/watch?v=6MisF1sxBTo
• https://www.youtube.com/watch?v=kfVPLSj6Rqw
What is serialization?
“The process of converting an object (or a graph of objects) into a linear sequence of bytes for either storage or transmission to another location.”
What is deserialization?
“The process of taking in stored information and recreating objects from it.”
• Wikipedia currently has a decent overview of serialization
• http://en.wikipedia.org/wiki/Serialization
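The two definitions above can be illustrated with a short round trip. This is only a sketch using Python's standard json module; the order record is a hypothetical object made up for the example, and JSON is just one of many possible byte formats.

```python
import json

# Hypothetical in-memory object: a plain dict standing in for an application record.
order = {"id": 17, "customer": "Tove", "items": ["pen", "notebook"], "paid": True}

# Serialization: object -> linear sequence of bytes (JSON text encoded as UTF-8).
payload = json.dumps(order).encode("utf-8")

# Deserialization: bytes -> a new object with the same content.
restored = json.loads(payload.decode("utf-8"))

print(type(payload).__name__)   # the serialized form is a bytes value
print(restored == order)        # same content
print(restored is order)        # but a different object in memory
```

The bytes in payload could equally be written to a file or sent over a network; the receiver only needs to know that they are UTF-8 encoded JSON.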
How to deserialize badly
• If a program is both the serializer and deserializer, things are straightforward.
• If a program receives a serialized file from another company:
• What does the program need to know to rebuild an identical object in memory?
• Text or binary format
• If binary, need to know everything
• If text, what byte encoding?
• If valid XML, we can at least read in the data
• But whether we know what to do with it is uncertain
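The questions about byte encoding and binary layout can be made concrete with a small sketch. It assumes a sender that writes UTF-8 text and a little-endian 32-bit integer; the sample string and value are hypothetical. In both cases a receiver that guesses wrong gets no error, only wrong data.

```python
import struct

# Text: the same bytes read under two encodings.
text = "Crème brûlée"
data = text.encode("utf-8")        # what the sender wrote
right = data.decode("utf-8")       # receiver knows the encoding
wrong = data.decode("latin-1")     # receiver guesses wrong; no exception is raised

# Binary: the same four bytes read under two byte orders.
raw = struct.pack("<I", 1)         # unsigned 32-bit integer, little-endian
as_little = struct.unpack("<I", raw)[0]
as_big = struct.unpack(">I", raw)[0]

print(right == text, wrong == text)
print(as_little, as_big)
```

This is why a binary format requires knowing everything about the layout in advance, while a text format still requires at least the encoding.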
Some serialization formats
• Binary
• .csv
• XML
• SOAP
• JSON
• Protocol Buffers
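To compare two of the listed formats, the sketch below writes the same records as CSV and as JSON using only the Python standard library. The records are hypothetical sample data. One visible difference is that CSV carries no type information, so every value comes back as a string.

```python
import csv
import io
import json

# Hypothetical sample records.
records = [
    {"title": "Everyday Italian", "year": 2005, "price": 30.0},
    {"title": "Learning XML", "year": 2003, "price": 39.95},
]

# Serialize to CSV text (header row plus one row per record).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["title", "year", "price"])
writer.writeheader()
writer.writerows(records)
csv_text = buf.getvalue()

# Serialize to JSON text.
json_text = json.dumps(records)

# Deserialize both and compare what comes back.
from_csv = list(csv.DictReader(io.StringIO(csv_text)))
from_json = json.loads(json_text)

print(type(from_csv[0]["year"]).__name__)    # CSV values are read back as strings
print(type(from_json[0]["year"]).__name__)   # JSON keeps the number type
```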
HTML and XML
• HTML was designed to display data.
• XML stands for eXtensible Markup Language.
• XML was designed to carry data, not to display data.
• XML tags are defined by you, not predefined.
• XML tags are self-descriptive.
XML
• An XML document does not DO anything.
• It is just information wrapped in tags.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
• With XML, you create your own tags.
• XML is not a replacement for HTML.
• XML is now as important for the Web as HTML was to the foundation of the Web.
• XML is a widely used format for transmitting data between applications.
XML Separates Data from HTML
• If dynamic data is written directly into an HTML document, it takes a lot of work to edit the HTML each time the data changes.
• With XML, data can be stored in separate XML files. This way you can concentrate on using HTML/CSS for display and layout, and be sure that changes in the underlying data will not require any changes to the HTML.
• With JavaScript, you can read an external XML file and update the data content of your web page.
An Example XML Document
<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
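• The first line is the XML declaration. It defines the XML version (1.0) and the character encoding (UTF-8).
• The next line is the root element, <note>.
• Lines 3-6 are the child elements (<to>, <from>, <heading>, <body>).
• The last line closes the root element.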
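As a rough illustration of how a program reads the document above, the following sketch parses it with Python's standard xml.etree.ElementTree module. The variable names are hypothetical; the XML is the note example itself.

```python
import xml.etree.ElementTree as ET

# The note document, as bytes, so the parser honors the declared encoding.
NOTE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
"""

root = ET.fromstring(NOTE_XML)      # root is the <note> element

# Iterating over an element yields its child elements in document order.
fields = {child.tag: child.text for child in root}

print(root.tag)
for tag, value in fields.items():
    print(f"{tag}: {value}")
```

The parser recovers the structure, but it is still up to the program to decide what a "to" or a "heading" means.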
XML Documents -- a Tree Structure
• XML documents must contain a root element
• All elements can have sub-elements (child elements):
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
XML Bookstore Example
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
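The tree structure of the bookstore document can be walked in code. The sketch below, again using the standard xml.etree.ElementTree module, reads each book's attributes and child elements into a dictionary; the dictionary keys and the conversion of year and price to numbers are choices made for this example, not something the XML itself prescribes.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

books = []
for book in root.findall("book"):           # direct children named <book>
    books.append({
        "category": book.get("category"),   # attribute on <book>
        "title": book.findtext("title"),    # text of the <title> child
        "lang": book.find("title").get("lang"),
        "year": int(book.findtext("year")),       # XML text is always a string
        "price": float(book.findtext("price")),   # so numbers must be converted
    })

for b in books:
    print(b["category"], "|", b["title"], "|", b["year"])
```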
XML Syntax Rules
• XML documents must have a root element
• XML elements must have a closing tag
• XML tags are case sensitive
• XML elements must be properly nested
• XML attribute values must be quoted
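An XML parser enforces these rules and rejects a document that breaks any of them. The sketch below checks one deliberately broken snippet per rule using Python's standard xml.etree.ElementTree; the helper is_well_formed and the sample snippets are hypothetical names and data introduced for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One correct document, then one violation of each syntax rule.
samples = {
    "well formed": '<note lang="en"><to>Tove</to></note>',
    "no single root element": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "tag case mismatch": "<note><To>Tove</to></note>",
    "improper nesting": "<note><b><i>text</b></i></note>",
    "unquoted attribute value": "<note lang=en><to>Tove</to></note>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}

for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'rejected'}")
```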
XML Schema
• An XML Schema describes the structure of an XML document.
• An XML document with correct syntax is called “Well Formed”.
• An XML document validated against an XML Schema is both “Well Formed” and “Valid”.
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>