Document Management

Document management in MonetDB/XQuery is done using XQuery queries. To make this possible, we extended XQuery with new built-in functions, under the namespace pf (i.e. http://pathfinder-xquery.org/).
We recommend reading this section to learn about the difference between documents and collections, as well as between read-only and updatable collections. Once you understand these concepts, note that an alternative way of adding, deleting and inspecting XML databases is the Administrative GUI. Additional information can be found in the Document Management section of the Reference Manual.

Adding a Document

Suppose we want to add the XML document HelloWorld.xml to our database. Of course, you first need to start your Mserver (see the Hello World example).
Open an interactive mclient XQuery session and type:
> pf:add-doc("http://monetdb.cwi.nl/XQuery/files/HelloWorld.xml",
"HelloWorld.xml")
Send this query to the server by closing with CTRL-D (CTRL-Z on Windows)
on a new (empty) line at the more> prompt.
The "HelloWorld.xml" document is now stored persistently in the database!
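The first argument of pf:add-doc() is the URL of the document; the second is the logical name under which it is stored.

XQuery Access to the Database: we can now run XQuery queries that retrieve the document (Hello World Example 8) by its logical name, directly from the database: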
> doc("HelloWorld.xml")
<?xml version="1.0" encoding="utf-8"?>
<doc>
<greet kind="informal">Hi </greet>
<greet kind="casual">Hello </greet>
<location kind="global">World</location>
<location kind="local">Amsterdam</location>
</doc>
For large documents, you can see a strong performance difference between querying web documents and documents added to the database. The former must always be fetched (e.g. using HTTP) and read in full, whereas the latter are directly available, and MonetDB/XQuery exploits the database indices it creates on them automatically. You can also reference the document stored in the database by its original URL (i.e. "http://monetdb.cwi.nl/XQuery/files/HelloWorld.xml"). Unlike document names, URLs in the database do not need to be unique: the same URL may be stored multiple times under different names.
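As a small illustration of addressing a stored document by its original URL, the query below is a sketch that assumes HelloWorld.xml was added with pf:add-doc() exactly as shown earlier. The path and predicate are ordinary XQuery and are chosen here only for illustration; they select the greet element whose kind attribute is "casual".

```xquery
(: Sketch: address the stored document by its original URL
   instead of its logical name "HelloWorld.xml".
   Assumes the document was added as shown earlier. :)
doc("http://monetdb.cwi.nl/XQuery/files/HelloWorld.xml")//greet[@kind = "casual"]
```

Replacing the URL with the logical name "HelloWorld.xml" should address the same stored document.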
A difference between documents that were explicitly added with

Web Access to the Database: If you have MonetDB/XQuery running on your local machine, you can also access the stored document in your browser at http://localhost:50001/xrpc/doc/HelloWorld.xml. That is, all documents are accessible on the built-in HTTP server of MonetDB/XQuery, by prefixing the document name with xrpc/doc. This built-in HTTP server runs on port 50001 by default (mapi_port+1). If MonetDB/XQuery runs on a different machine than the one you use to browse this documentation, point your browser to http://machine:50001/xrpc/doc/HelloWorld.xml instead.

Deleting a Document

Start an mclient XQuery session and type:
> pf:del-doc("HelloWorld.xml")
After this, the query
Document Collections

MonetDB/XQuery groups documents in so-called collections. This can be useful for organizing large numbers of documents. Storing documents together in the same collection also makes opening and querying many (small) documents much more efficient. As an example, we can store the XML documents book.xml and bib.xml together in a collection "my-collection" using the following XQuery:
> for $name in ("book.xml", "bib.xml")
let $url := concat("http://monetdb.cwi.nl/XQuery/files/", $name)
return pf:add-doc($url, $name, "my-collection")
Note that this XQuery executes the pf:add-doc() function once for each document name. If a document is added to the database without this third parameter (collection name), it is in fact stored in a new collection that has the same name as the document. If a collection by that name already exists, the new document is added to that collection.

Opening All Documents in a Collection
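There are two similar functions by the name collection: fn:collection() and pf:collection().
Typing the following query retrieves the author elements from all documents in the collection: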
> fn:collection("my-collection")//author
<author>Serge Abiteboul</author>
<author>Peter Buneman</author>
<author>Dan Suciu</author>
<author><last>Stevens</last><first>W.</first></author>
<author><last>Stevens</last><first>W.</first></author>
<author><last>Abiteboul</last><first>Serge</first></author>
<author><last>Buneman</last><first>Peter</first></author>
<author><last>Suciu</last><first>Dan</first></author>
Note that we get all authors from document bib.xml (i.e. the first three) before those of book.xml. This is because "bib.xml" was shredded before "book.xml", hence all node identifiers of "bib.xml" precede those of "book.xml".
The difference between fn:collection() and pf:collection() becomes visible when we count the nodes each of them returns:
> count(fn:collection("my-collection"))
2
>
> count(pf:collection("my-collection"))
1
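Because count(fn:collection("my-collection")) yields 2, fn:collection() appears to return one item per document, so it can drive a per-document loop. The following is a minimal sketch under that assumption, and it also assumes "my-collection" was loaded as shown earlier; it counts the author elements in each document separately.

```xquery
(: Sketch: one author count per document in the collection.
   Assumes fn:collection() returns one node per stored document,
   as the count of 2 above suggests. :)
for $doc in fn:collection("my-collection")
return count($doc//author)
```

The same loop over pf:collection("my-collection") would iterate only once, since that function returns a single node.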
On collections that contain many (more than 1000) documents, we advise use of
Inspecting the XML Database

We can list all collections in our database using the following XQuery:
> pf:collections()
<collection updatable="false"
size="0 MB"
numDocs="2">my-collection</collection>
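We get information in the form of collection elements, one per collection. The documents in the database can be listed in a similar way: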
> pf:documents()
<document updatable="false"
url="http://monetdb.cwi.nl/XQuery/files/book.xml"
collection="my-collection">book.xml</document>,
<document updatable="false"
url="http://monetdb.cwi.nl/XQuery/files/bib.xml"
collection="my-collection">bib.xml</document>,
In the pf:documents().
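Since pf:documents() returns ordinary elements, its result can be filtered with standard XQuery. The sketch below relies only on the collection attribute visible in the output above; the collection name is the one used in this section's examples.

```xquery
(: Sketch: names of all documents stored in "my-collection".
   Uses only the collection attribute shown in the
   pf:documents() output above. :)
for $d in pf:documents()
where $d/@collection = "my-collection"
return string($d)
```

The same pattern should work on pf:collections(), for example to select collections by their updatable or numDocs attribute.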
Updatable vs. Read-Only Collections

From MonetDB/XQuery 0.16 on, it is possible to update XML documents. Each document belongs to one collection, even if a document was added without mentioning a collection (in which case it is stored in a single-document collection by the same name as the document). Collections can be either updatable or read-only, where read-only is the default. Thus, all the collections we created so far (i.e. "HelloWorld.xml" and "my-collection") were read-only. Once a collection is created, its mode (read-only or updatable) cannot be changed anymore.
To get an updatable collection, pass an extra fourth parameter to pf:add-doc() when the collection is created:
> pf:add-doc("http://monetdb.cwi.nl/XQuery/files/HelloWorld.xml",
"greetings.xml", "greetings.xml", 10)
The extra parameter indicates the percentage of free space that will be left on the table pages internally, to cheaply accommodate inserts.

We now invite you to continue reading the tutorial on XML Updates.
© 1994-2011 CWI