Friday, September 26, 2008

Reading Notes - Week 5

Bryan:
XML is a subset of the Standard Generalized Markup Language (SGML).

XML allows users to:

  • bring multiple files together to form compound documents
  • identify where illustrations are to be incorporated into text files, and the format used to encode each illustration
  • provide processing control information to supporting programs, such as document validators and browsers
  • add editorial comments to a file
XML is a formal language that can be used to pass information about the component parts of a document to another computer system.

It provides a formal syntax for describing the relationships between the entities, elements and attributes that make up an XML document.

To let a computer check a document's structure, users must create a Document Type Definition (DTD) that formally identifies the relationships between the various elements that form their documents.

Where elements can have variable forms, or need to be linked together, they can be given suitable attributes to specify the properties to be applied to them.

An XML file normally consists of three types of markup, the first two of which are optional:

  1. An XML declaration (written like a processing instruction) identifying the version of XML being used, the way in which it is encoded, and whether it references other files or not.
  2. A document type declaration that either contains the formal markup declarations in its internal subset (between square brackets) or references a file containing the relevant markup declarations (the external subset).
  3. A fully-tagged document instance which consists of a root element, whose element type name must match that assigned as the document type name in the document type declaration, within which all other markup is nested.
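
The three parts listed above can be seen together in one small file. The following is only a sketch: the memo document, its element names and its attribute are invented for illustration, and Python's standard xml.etree.ElementTree is used simply because it is widely available. It checks that the file is well-formed but does not validate it against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; every name and value below is made up for illustration.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST memo priority (low|high) "low">
]>
<memo priority="high">
  <to>Week 5 readers</to>
  <body>Notes are posted.</body>
</memo>
"""

# Part 1 is the XML declaration, part 2 the document type declaration with an
# internal subset in square brackets, part 3 the document instance.
# Parsing from bytes lets the parser honour the declared encoding.
root = ET.fromstring(DOCUMENT.encode("utf-8"))

# The root element name matches the name given after <!DOCTYPE.
root_name = root.tag
child_names = [child.tag for child in root]
priority = root.get("priority")
```

Here root_name is the same name that follows <!DOCTYPE, which is the matching rule described in item 3.
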
XML-coded files are, by their nature, ideal for storing in databases. Because XML files are both object-orientated and hierarchical in nature, they can be adapted to virtually any type of database, though care sometimes needs to be taken to ensure that enough structural data is retained in the database to reconstruct the original file.


Ogbuji:

XML is based on Standard Generalized Markup Language (SGML), defined in ISO 8879:1986 [ISO Standard]. It represents a significant simplification of SGML, and includes adjustments that make it better suited to the Web environment.

An entity catalog can be used to specify the location from which an XML processor loads a DTD, given the system and public identifiers for that DTD. System identifiers are usually given by Uniform Resource Identifiers (URIs).

A URI is just an extension of the URLs familiar from Web browsers and the like. All URLs are also URIs, but URIs also include URNs. [Note to self: URLs and URNs are both kinds of URI, so URI is the broader term.]
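
A quick way to see the relationship is to look at the scheme of each identifier. This is a minimal sketch using Python's urllib.parse; the URN below uses the reserved "example" namespace and is purely hypothetical.

```python
from urllib.parse import urlparse

identifiers = [
    "http://www.w3.org/TR/xlink/",   # a URL: says where to fetch the resource
    "urn:example:reading-notes",     # a URN (hypothetical): names without locating
]

# Both strings are URIs; only the first carries a network location.
schemes = [urlparse(uri).scheme for uri in identifiers]
locations = [urlparse(uri).netloc for uri in identifiers]
```

Both strings parse as URIs, but only the URL has a host to contact.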

In XML Namespaces, each vocabulary is called a namespace, and there is a special syntax for expressing vocabulary markers. Each element or attribute name can be connected to one namespace.
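
As a rough illustration of that syntax, the sketch below declares two prefixes and uses the same local name, address, in both vocabularies. The namespace URIs and the content are hypothetical. ElementTree reports each name in "{namespace}local" form.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and content, chosen only for illustration.
DOC = """<contact xmlns:post="http://example.org/postal"
         xmlns:mail="http://example.org/email">
  <post:address>1 Example Street</post:address>
  <mail:address>reader@example.org</mail:address>
</contact>"""

root = ET.fromstring(DOC)

# Each element name is connected to exactly one namespace.
qualified_names = [child.tag for child in root]

# A prefix map lets searches use short prefixes instead of full URIs.
prefixes = {"post": "http://example.org/postal",
            "mail": "http://example.org/email"}
postal = root.find("post:address", prefixes).text
email = root.find("mail:address", prefixes).text
```

The prefixes themselves carry no meaning; only the URI they are bound to identifies the vocabulary.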

XML Base [W3C Recommendation] provides a means of associating XML elements with URIs in order to more precisely specify how relative URIs are resolved in relevant XML processing actions.
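
To make the resolution rule concrete, here is a small sketch. It assumes a made-up catalog document whose href attributes hold relative URIs; the resolve_links helper is hypothetical and not part of any library. Each xml:base value is resolved against the base inherited from the parent element.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document; the href attribute is an assumption for illustration.
DOC = """<catalog xml:base="http://example.org/docs/">
  <item href="week5/notes.xml"/>
  <section xml:base="archive/">
    <item href="week1/notes.xml"/>
  </section>
</catalog>"""


def resolve_links(element, base=""):
    """Collect href values resolved against the nearest xml:base in scope."""
    # An xml:base value may itself be relative to the inherited base.
    base = urljoin(base, element.get(f"{{{XML_NS}}}base", ""))
    links = []
    href = element.get("href")
    if href is not None:
        links.append(urljoin(base, href))
    for child in element:
        links.extend(resolve_links(child, base))
    return links


links = resolve_links(ET.fromstring(DOC))
```

The nested section shows why the association matters: the same kind of relative path resolves differently depending on which element it sits in.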

The XML Infoset defines an abstract way of describing an XML document as a series of objects, called information items, with specialized properties. This abstract data set incorporates aspects of XML documents defined in XML 1.0, XML Namespaces, and XML Base. The XML Infoset is used as the foundation of several other specifications that try to break down XML documents.

A physical representation of an XML document, called the canonical form, accounts for the variations allowed in XML syntax without changing meaning.
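
A short sketch of the idea, assuming Python 3.8 or later, where xml.etree.ElementTree.canonicalize implements the C14N 2.0 flavour of canonical XML. The two note documents are invented; they differ in quoting, attribute order, spacing inside the tag, and empty-element syntax, none of which changes the meaning.

```python
import xml.etree.ElementTree as ET

# Two hypothetical spellings of the same document.
variant_a = "<note id='1' lang=\"en\"><body/></note>"
variant_b = '<note   lang="en"   id="1"><body></body></note>'

# Canonicalization normalizes quoting, attribute order and empty elements,
# so equivalent documents end up as the same string.
canon_a = ET.canonicalize(variant_a)
canon_b = ET.canonicalize(variant_b)
same_document = canon_a == canon_b
```

Comparing canonical forms is what makes tasks such as digital signatures over XML practical, since a byte-for-byte comparison of the raw files would fail.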

XPointer is a language that can be used to refer to fragments of an XML document.

XLink offers one-way links like those in HTML (simple links), as well as more complex links that can have multiple end-points (extended links), and even links that are not expressed in the linked documents, but rather in special hub documents (called linkbases).


XML Tutorial:

The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.

An XML Schema:

  • defines elements that can appear in a document
  • defines attributes that can appear in a document
  • defines which elements are child elements
  • defines the order of child elements
  • defines the number of child elements
  • defines whether an element is empty or can include text
  • defines data types for elements and attributes
  • defines default and fixed values for elements and attributes
XML Schema became a W3C Recommendation on 2 May 2001.

One of the greatest strengths of XML Schemas is their support for data types.

With support for data types:

  • It is easier to describe allowable document content
  • It is easier to validate the correctness of data
  • It is easier to work with data from a database
  • It is easier to define data facets (restrictions on data)
  • It is easier to define data patterns (data formats)
  • It is easier to convert data between different data types
A simple element is an XML element that can contain only text.

Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex type. But the attribute itself is always declared as a simple type.

A complex element is an XML element that contains other elements and/or attributes.

There are four kinds of complex elements:

  • empty elements
  • elements that contain only other elements
  • elements that contain only text
  • elements that contain both other elements and text
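
The simple versus complex distinction shows up directly in a schema file. The sketch below is an assumption-laden illustration: the schema and its names are invented, and because Python's standard library has no XML Schema validator, the code only reads the schema as ordinary XML and classifies each top-level declaration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: "title" is simple, "note" is complex (children + attribute).
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title" type="xs:string"/>
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="sent" type="xs:date"/>
      </xs:sequence>
      <xs:attribute name="priority" type="xs:integer" default="1"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(XSD)


def describe(declaration):
    """Label a declaration complex if it carries an xs:complexType child."""
    is_complex = declaration.find(f"{XS}complexType") is not None
    return declaration.get("name"), "complex" if is_complex else "simple"


# Only top-level declarations are inspected here.
top_level = [describe(decl) for decl in schema.findall(f"{XS}element")]

# The attribute itself is declared with a simple type and a default value.
attribute = schema.find(f".//{XS}attribute")
attribute_info = (attribute.get("name"), attribute.get("type"), attribute.get("default"))
```

The xs:sequence element fixes the order of the child elements, and the type attributes supply the data types that a DTD cannot express.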

WSDL is an XML-based language for describing Web services and how to access them.

WSDL describes a web service, along with the message format and protocol details for the web service.


Bergholz:

Extensible Markup Language (XML) is a semantic language that lets you meaningfully annotate text. Meaningful annotation is, in essence, what XML is all about.

DTDs let users specify the set of tags, the order of tags, and the attributes associated with each tag.

Elements can have zero or more attributes, which are declared in an ATTLIST declaration (<!ATTLIST ...>).

Using namespaces avoids name clashes (that is, situations where the same tag name is used in different contexts). For instance, a namespace can identify whether an address is a postal address, an e-mail address, or an IP address.

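Unfortunately, namespaces and DTDs do not work well together: DTDs are not namespace-aware, so a DTD treats a prefixed name such as post:address as one literal name rather than as a name bound to a namespace.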

XML extends HTML’s linking capabilities with three supporting languages:

  • XLink (http://www.w3.org/TR/xlink/), which describes how two documents can be linked;
  • XPointer, which enables addressing individual parts of an XML document; and
  • XPath, which is used by XPointer to describe location paths.
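
A location path is easiest to understand by running one. The sketch below uses the limited XPath subset supported by Python's xml.etree.ElementTree; the notes document and its contents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only to give the paths something to select.
DOC = """<notes>
  <week n="4"><reading author="Someone">earlier reading</reading></week>
  <week n="5">
    <reading author="Bryan">first reading</reading>
    <reading author="Ogbuji">second reading</reading>
  </week>
</notes>"""

root = ET.fromstring(DOC)

# Step by step: child "week" elements whose n attribute is 5, then their readings.
week5 = root.findall("./week[@n='5']/reading")
authors = [reading.get("author") for reading in week5]

# "//" searches at any depth below the starting element.
second = root.find(".//reading[@author='Ogbuji']").text
```

Each step in the path narrows the selection, which is the mechanism XPointer relies on to address an individual part of a document.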

The Extensible Stylesheet Language (XSL) is actually two languages: a transformation language (called XSL transformations, or XSLT) and a formatting language (XSL formatting objects).

Although DTDs were the first proposal to provide for a standardized data exchange between users, they have disadvantages. Their expressive power seems limited, and their syntax is not XML. Several approaches address these disadvantages by defining a schema language (rather than a grammar) for XML documents:

  • document definition markup language (DDML), formerly known as XSchema,
  • document content description (DCD),
  • schema for object-oriented XML (SOX), and
  • XML-Data (replaced by DCD).

The W3C’s XML Schema activity takes these four proposals into consideration.
