4.2.1 Document Management Functions
The pf:add-doc() function adds a new XML document, available at some URI, to the database under
a logical name (the second parameter). An optional third parameter names a collection, which
makes it possible to add the document to an existing document collection. All documents in
a collection store their data together in the same MonetDB tables. If you have many (thousands
or more) presumably small XML documents, it is advisable to store them together in one or a few
collections. When only two parameters are passed to pf:add-doc(), the default behavior is to
store the document in a collection of its own, with the same name as the document. Each such
single-document collection leads to the creation of a couple of relational tables, with the
associated table-header and MonetDB metadata overhead, so a large set of small documents stored
this way may cause millions of tables.
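As a minimal sketch of the three-parameter form described above, the query below adds one small document to a shared collection. The URI, the document name "order-0001.xml", and the collection name "orders" are hypothetical placeholders, not names taken from this manual.

```xquery
(: Sketch only: the URI, document name and collection name are hypothetical. :)
(: Shreds the document at the given URI and stores it under the logical      :)
(: name "order-0001.xml" inside the collection "orders", so that it shares   :)
(: its MonetDB tables with the other documents of that collection.           :)
pf:add-doc("http://example.org/data/order-0001.xml", "order-0001.xml", "orders")
```

Omitting the third argument would instead place the document in a collection of its own, named "order-0001.xml".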
Normally, collections are created read-only, meaning that updates to them are prohibited and
cause runtime errors. To allow updates, documents have to be shredded explicitly as updatable
by passing a fourth parameter to pf:add-doc(). This parameter must have a value between 1 and 99
and indicates the percentage of unallocated space that should be left per page to accommodate
future updates. All documents inside the same collection are either all updatable or all read-only.
Note that once a collection has been created by the first pf:add-doc(), its status cannot
be changed anymore. There is a workaround, based on the backup/restore mechanism.
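The four-parameter form could look as follows. This is a sketch under assumptions: the URI, the document name, and the collection name "orders-rw" are hypothetical, and the value 10 is only an illustrative choice within the allowed range of 1 to 99.

```xquery
(: Sketch only: the URI, names and the percentage value are hypothetical.   :)
(: The fourth argument asks for the document to be shredded as updatable,   :)
(: leaving 10 percent of each page unallocated for future updates.          :)
(: If this call is the first one to create "orders-rw", it fixes the whole  :)
(: collection as updatable.                                                  :)
pf:add-doc("http://example.org/data/order-0001.xml", "order-0001.xml", "orders-rw", 10)
```

A larger percentage leaves more room for later updates at the cost of more storage up front.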
pf:add-doc($uri as xs:string, $name as xs:string)
pf:add-doc($uri as xs:string, $name as xs:string, $coll as xs:string)
pf:add-doc($uri as xs:string, $name as xs:string, $coll as xs:string, $perc as xs:integer)
pf:del-doc($name as xs:string)
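For completeness, a sketch of the pf:del-doc() signature in use; the document name is a hypothetical placeholder and is assumed to be a logical name that was earlier passed to pf:add-doc().

```xquery
(: Sketch only: "order-0001.xml" is a hypothetical logical document name. :)
(: Removes the document stored under that name from the database.         :)
pf:del-doc("order-0001.xml")
```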
A query that calls any of these functions does not return a result, much like a query that
uses the XQuery Update Facility (XQUF).
However, this family of MonetDB/XQuery extension functions is not considered the same
as XQUF update queries. In fact, it is specifically forbidden to mix XQUF
updates and document management commands in the same transaction.
We should note that MonetDB/XQuery, apart from atomicity with respect to document management
(i.e. a document management query either fully succeeds or fully fails), also provides
durability and some form of isolation. Isolation, however, is not perfect.
It may happen that a read-only or update query that started before a document management
query committed ends up seeing its effects. That is, when this concurrent query reaches
a call to fn:doc(), the call is evaluated with respect to the actual state of the database
at that time. This is a deviation from snapshot isolation, which demands that fn:doc() be
evaluated with respect to the database state at the *start* of the query.
On the other hand, once a query has gained access to a document, the query caches it in its
database snapshot such that subsequent calls to fn:doc() will continue to find it, regardless
of whether it has been deleted since.
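The caching behavior described above can be illustrated with a query that opens the same document twice. This is only a sketch: the document name is hypothetical, and the point at which a concurrent pf:del-doc() commits is marked in a comment as an assumed interleaving, not something the query controls.

```xquery
(: Sketch only: "order-0001.xml" is a hypothetical logical document name. :)
let $first := fn:doc("order-0001.xml")
(: Assumed interleaving: a concurrent query running                       :)
(: pf:del-doc("order-0001.xml") commits at this point.                    :)
let $second := fn:doc("order-0001.xml")
(: Per the text above, the second call still finds the document, because  :)
(: this query cached it in its snapshot on first access.                  :)
return (fn:count($first//*), fn:count($second//*))
```

Had the deletion committed before the first fn:doc() call was reached, that first call would, by the same description, already be evaluated against the state of the database after the deletion.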