100 Languages Speedrun: Episode 58: XQuery

This content originally appeared on DEV Community and was authored by Tomasz Wegrzanowski

XQuery is an XML processing language similar to XSLT, except XQuery had at least enough sense to use real syntax instead of trying to code in XML.

Let's see how it turned out.

Hello, World!

I'll be using the BaseX implementation of XQuery; you can get it with brew install basex.

First let's create a simple document hello.xml:

<?xml version="1.0" ?>
<persons>
  <person>
    <name>Alice</name>
  </person>
  <person>
    <name>Bob</name>
  </person>
</persons>

Then with this hello.xquery:

<messages>
  {
    for $name in doc("hello.xml")//name
    return <message>Hello, {data($name)}!</message>
  }
</messages>

We can get our answer:

$ basex hello.xquery
<messages>
  <message>Hello, Alice!</message>
  <message>Hello, Bob!</message>
</messages>

By default the output lacks the XML header and the final newline.

What we can notice here:

  • unlike with XSLT, we did not pass in any document to process; the XQuery script itself specifies which documents to open with doc("name.xml")
  • we can run any XPath on the document, as in doc("hello.xml")//name
  • we can switch between XML and code: XML tags start XML mode, and {...} starts code mode
  • the core of XQuery is the FLWOR (for, let, where, order by, return) expression
  • variables use the $ prefix
  • to get the text content of a node $name, use data($name) - otherwise XQuery would insert <name>Alice</name>
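
To make the FLWOR clauses a bit more concrete, here is a minimal sketch that uses all five of them. It works on an inline copy of the sample data, so it does not depend on hello.xml. The $persons variable and the extra Carol entry are made up for illustration.

```xquery
(: Inline copy of the sample data; Carol is an invented extra entry :)
let $persons :=
  <persons>
    <person><name>Carol</name></person>
    <person><name>Alice</name></person>
    <person><name>Bob</name></person>
  </persons>
return
  <messages>
  {
    for $person in $persons/person       (: for: iterate over nodes :)
    let $name := data($person/name)      (: let: bind the text content :)
    where $name != "Bob"                 (: where: filter out one name :)
    order by $name                       (: order by: sort alphabetically :)
    return <message>Hello, {$name}!</message>
  }
  </messages>
```

Run with basex, this should greet Alice and then Carol, skipping Bob. Since $name is already bound with data(), the {$name} inside the message inserts plain text rather than a <name> element.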

Loop

Neither the thing we're selecting from nor the thing we're generating has to be XML. Note that XQuery normally inserts newlines between returned items, but not after the last one.

(: This is XQuery loop :)
for $n in (1 to 20)
return $n

$ basex loop.xquery
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

XQuery comments use the unusual (: ... :) syntax.

FizzBuzz

for $n in (1 to 100)
let $fizz := $n mod 3 = 0
let $buzz := $n mod 5 = 0
return (
  if ($fizz and $buzz)
    then "FizzBuzz"
  else if ($buzz)
    then "Buzz"
  else if ($fizz)
    then "Fizz"
    else $n
)

This almost works, except it lacks the final newline after the last element.

FizzBuzz with correct newlines

If we want to take control over spacing, this becomes more complicated.

declare option output:method "text";
declare option output:item-separator "";

for $n in (1 to 100)
let $fizz := $n mod 3 = 0
let $buzz := $n mod 5 = 0
return (
  if ($fizz and $buzz)
    then "FizzBuzz&#10;"
  else if ($buzz)
    then "Buzz&#10;"
  else if ($fizz)
    then "Fizz&#10;"
    else concat($n, "&#10;")
)

  • we need to switch output:method to text
  • we need to switch output:item-separator to an empty string - we do not want a separator, we want a terminator for each element
  • it might seem we could get away with leaving things as they are and adding &#10; at the end - this is fine for FizzBuzz, but it would be incorrect for an empty result set, so it's not really a great practice
  • &#10; (or an equivalent XML character reference such as &#xA;) generates a newline
  • ordinary string literals have no interpolation, so we need to use concat
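
As an alternative sketch (not the approach used above), the same "terminator, not separator" idea can be expressed with the standard string-join function. Every item gets its own newline, and the pieces are then glued together into a single string, so the item separator no longer matters. The range is shortened to 15 here just to keep the output small.

```xquery
declare option output:method "text";

(: Each item carries its own newline, then everything is joined with no
   separator. For an empty sequence string-join returns an empty string,
   so nothing at all is printed. :)
string-join(
  for $n in (1 to 15)
  return concat(
    if ($n mod 15 = 0) then "FizzBuzz"
    else if ($n mod 5 = 0) then "Buzz"
    else if ($n mod 3 = 0) then "Fizz"
    else $n,
    "&#10;"
  ),
  ""
)
```

Because the whole result is one string, this variant behaves the same for an empty input range: no stray newline is emitted.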

Fibonacci

There are a few more serialization methods like csv and json. Let's try csv serialization for Fibonacci numbers.

declare option output:method "csv";
declare option output:csv "header=yes";

declare function local:fib($i as xs:integer) as xs:integer {
  if ($i <= 2)
    then 1
    else local:fib($i - 1) + local:fib($i - 2)
};

<csv>{
  for $n in (1 to 30)
  return <record>
    <N>{$n}</N>
    <Fib>{local:fib($n)}</Fib>
  </record>
}</csv>

Here's what it does:

$ basex fib.xquery
N,Fib
1,1
2,1
3,2
4,3
5,5
6,8
7,13
8,21
9,34
10,55
11,89
12,144
13,233
14,377
15,610
16,987
17,1597
18,2584
19,4181
20,6765
21,10946
22,17711
23,28657
24,46368
25,75025
26,121393
27,196418
28,317811
29,514229
30,832040

Step by step:

  • we switch to CSV output mode with declare option output:method "csv";
  • we turn on headers with declare option output:csv "header=yes";
  • while we're actually going to output CSV, we need to pretend we're generating XML
  • the <csv> and <record> tag names are not special, anything will do
  • declare function local:fib($i as xs:integer) as xs:integer { ... } declares a function and all the types it takes and returns
  • we need to put the function in some namespace, and local: is a reasonable choice here; a bare declare function fib(...) wouldn't work
  • there's no return inside the function, it's just an expression - XQuery return has very little to do with what return means in other languages
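
The points about function declarations can be sketched with a second, accumulator-style version of Fibonacci. This is an illustrative variant, not the code from the episode: local:fib-iter is a hypothetical helper name, and it avoids the exponential blowup of the doubly recursive version by carrying the two most recent values along.

```xquery
(: Hypothetical helper: $a is the current Fibonacci number, $b the next one :)
declare function local:fib-iter(
  $i as xs:integer,
  $a as xs:integer,
  $b as xs:integer
) as xs:integer {
  (: the body is a single expression; there is no return keyword here :)
  if ($i <= 1)
    then $a
    else local:fib-iter($i - 1, $b, $a + $b)
};

declare function local:fib($i as xs:integer) as xs:integer {
  local:fib-iter($i, 1, 1)
};

for $n in (1 to 10)
return local:fib($n)
```

This should print the first ten Fibonacci numbers (1, 1, 2, 3, 5, 8, 13, 21, 34, 55). The only return in the script belongs to the FLWOR expression at the bottom, not to either function.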

JSON Input

For this I got Cat Facts JSON.

<catfacts>{
  for $name in json:doc("catfacts.json")//text
  return <fact>{data($name)}</fact>
}</catfacts>

$ basex catfacts.xquery
<catfacts>
  <fact>Cats make about 100 different sounds. Dogs make only about 10.</fact>
  <fact>Domestic cats spend about 70 percent of the day sleeping and 15 percent of the day grooming.</fact>
  <fact>I don't know anything about cats.</fact>
  <fact>The technical term for a cat’s hairball is a bezoar.</fact>
  <fact>Cats are the most popular pet in the United States: There are 88 million pet cats and 74 million dogs.</fact>
</catfacts>

json:doc parses a JSON document and turns it into an XML-like structure with some weird tags like <verified type="boolean">true</verified>, <json type="array">, <status type="object"> etc.

It looks really weird when printed as XML, but it's not too bad to query it like it's XML.
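
To look at that intermediate structure directly, a small sketch like the following may help. It assumes BaseX's json:parse function, which takes a JSON string instead of a file name, and the JSON literal is invented for illustration rather than taken from the Cat Facts data.

```xquery
(: json:parse is assumed here as the string-input counterpart of json:doc
   in BaseX's JSON module. The JSON below is an invented sample. :)
json:parse('[{"text": "Cats purr.", "verified": true}]')
```

With BaseX's default conversion this should print an XML tree along the lines described above: a <json type="array"> root holding a <_ type="object"> element with <text> and <verified type="boolean"> children.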

JSON Output

And of course we can do it the other way. It might seem like I'm doing a lot of CSV and JSON in an XML episode, but that's likely quite representative of the real world. XML based systems are a sizable minority, and a lot of transformation tasks will be about getting data into and out of XML.

For example this exciting code:

declare option output:method "json";

<json type="array">{
  for $n in (1 to 10)
  return <_ type="object">
    <number type="number">{$n}</number>
    <odd type="boolean">true</odd>
    <even type="boolean">false</even>
  </_>
}</json>

Generates the following:

$ basex oddeven.xquery
[
  {
    "number":1,
    "odd":true,
    "even":false
  },
  {
    "number":2,
    "odd":true,
    "even":false
  },
  {
    "number":3,
    "odd":true,
    "even":false
  },
  {
    "number":4,
    "odd":true,
    "even":false
  },
  {
    "number":5,
    "odd":true,
    "even":false
  },
  {
    "number":6,
    "odd":true,
    "even":false
  },
  {
    "number":7,
    "odd":true,
    "even":false
  },
  {
    "number":8,
    "odd":true,
    "even":false
  },
  {
    "number":9,
    "odd":true,
    "even":false
  },
  {
    "number":10,
    "odd":true,
    "even":false
  }
]

You could even use XQuery to process inputs and outputs neither of which is XML, for example turning some JSON into CSV. It would basically construct those intermediate XML structures when loading data or before saving it.
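
A rough sketch of such a JSON to CSV conversion, combining the pieces from the earlier sections, might look like this. The sample JSON, the Text and Verified column names, and the use of BaseX's json:parse on an inline string are all assumptions made to keep the example self-contained.

```xquery
declare option output:method "csv";
declare option output:csv "header=yes";

(: Invented sample input; a real script would use json:doc("catfacts.json") :)
let $facts := json:parse('[
  {"text": "Cats purr.", "verified": true},
  {"text": "Cats nap a lot.", "verified": false}
]')
return
  <csv>{
    (: each JSON array item is expected to show up as a <_> element :)
    for $fact in $facts//_
    return <record>
      <Text>{data($fact/text)}</Text>
      <Verified>{data($fact/verified)}</Verified>
    </record>
  }</csv>
```

The JSON is first turned into the XML-like structure, the FLWOR expression reshapes it into records, and the CSV serializer then flattens those records into rows.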

Should you use XQuery?

I'd not recommend it for most people.

However, unlike XSLT, which is completely insane, XQuery is a legitimate tool created by some sane people for a legitimate purpose.

It's just that for the vast majority of people XML processing is not common enough to warrant learning a whole new language, and general purpose programming languages tend to be pretty much just as concise and expressive at transforming XML into other XML, while being so much better at all the associated tasks like fetching data from the internet or databases, dealing with JSON or CSV, and any nontrivial data transformation. Ruby with Nokogiri in particular is far better than any of the XML-specific languages, but Python and others are totally adequate as well.

This is different from jq, which I definitely recommend, as JSON processing is a far more common task, and jq is extremely concise for shell one-liners. In a way what jq does is closer to XPath (which you can use with your regular language) than to either XQuery or XSLT.

Then again, if you find yourself processing XML a lot, and especially if your default language doesn't have anything as nice as Ruby's Nokogiri, XQuery might be worth checking out.

Code

All code examples for the series will be in this repository.

Code for the XQuery episode is available here.




Tomasz Wegrzanowski | Sciencx (2022-01-18T21:07:14+00:00) 100 Languages Speedrun: Episode 58: XQuery. Retrieved from https://www.scien.cx/2022/01/18/100-languages-speedrun-episode-58-xquery/
