Dec 11, 2010

Java + XML

Started trying to parse some XML in Java. I figured this would be simple, considering that both technologies are so well developed - but they're so well developed that there are too many choices out there, and it's confusing. Surprisingly, even though almost all the work is pre-2005, I had a hard time finding a single, succinct introduction to help choose which technology to use. There's a lot of documentation out there, and it took me a long time to read through various API specs, other docs, and tutorials before I felt like I had a firm grasp on the world of XML parsing/manipulation in Java. I assume that if you're interested in XML, you already know what it is; if not, there are plenty of primers on that topic. Here's some info on how to work with XML in Java.

First of all,

DTD, Document Type Definition: text file in a standardized format that defines the rules of a specific type of XML file, and the allowed elements, attributes, and structure.

XSD, XML Schema: equivalent of a DTD, but written in XML. Newer.

Note: you would only use one of either DTD or XSD, more DTD in the past, XSD for newer stuff.
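To make the schema idea a bit more concrete, here's a minimal sketch of checking a document against an XSD with the javax.xml.validation classes that ship with the JDK. The schema itself (a book element with a required title attribute) and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdValidationExample {
    // Hypothetical schema: a <book> element with a required "title" attribute.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='book'>"
      + "<xs:complexType>"
      + "<xs:attribute name='title' type='xs:string' use='required'/>"
      + "</xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML string conforms to the schema above.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no custom ErrorHandler set, validation errors surface as SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<book title='Dune'/>"));
        System.out.println(isValid("<book/>"));
    }
}
```

In a real project the schema would normally live in its own .xsd file rather than in a string.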

DOM, Document Object Model: a parser API (i.e. a bunch of defined classes) that represents an XML or HTML document as a tree of Node objects. After parsing, the user can navigate the tree to find and manipulate the data they are looking for. While conceptually simple, using it is tedious because it's generic. You can navigate to an Element, but you have to use String objects to find the particular element or attribute you are looking for every time. Because the whole document is held in memory as a tree, it ends up having high memory usage and is slow. DOM can be used for XML or HTML, and the standard Java interfaces for XML live in the org.w3c.dom package.
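Here's a rough sketch of what that string-based navigation looks like in practice, using the DOM parser bundled with the JDK. The library/book document shape and the names DomExample and titles are just assumptions for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomExample {
    // Parses the whole document into a tree, then collects every book's title attribute.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Elements and attributes are looked up by String name every time.
        NodeList books = doc.getElementsByTagName("book");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            result.add(book.getAttribute("title"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book title='Dune'/><book title='Emma'/></library>";
        System.out.println(titles(xml));
    }
}
```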

SAX, Simple API for XML: a competing parser API designed for performance, but it does not define how a document is represented - that's left to the user. Instead it uses a callback model that calls user-implemented functions as the parser hits the start and end of elements (attributes are handed over along with the element start) and the text in between. The user can represent the document any way they like, from a tree of specific classes for pertinent datatypes, to a linked list or array or hash table if that's more convenient. This may be straightforward for a very simple document, or time consuming if there are many types of elements/attributes. The standard Java API lives in the org.xml.sax package.
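A small sketch of the callback model, assuming a made-up document of book elements with title and year attributes. The Book class is my own hypothetical object model, not something SAX provides. The handler builds a plain list as the parser streams through the input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxExample {
    // Hypothetical user-defined object model for one element type.
    static class Book {
        final String title;
        final int year;
        Book(String title, int year) {
            this.title = title;
            this.year = year;
        }
    }

    static List<Book> parse(String xml) throws Exception {
        final List<Book> books = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Called by the parser each time an opening tag is read.
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("book".equals(qName)) {
                    books.add(new Book(attrs.getValue("title"),
                                       Integer.parseInt(attrs.getValue("year"))));
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return books;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book title='Dune' year='1965'/>"
                   + "<book title='Emma' year='1815'/></library>";
        for (Book b : parse(xml)) {
            System.out.println(b.title + " (" + b.year + ")");
        }
    }
}
```

No tree is ever built here; anything the handler doesn't store is gone once the parser moves on.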

Note: you would only use one of either DOM or SAX. DOM is heavyweight but allows you to navigate the tree, and is generally less work. SAX is faster, but you have to create your own object model, which may be simple or hard, depending on the complexity of your XML schema. It's most likely more work than using the DOM if your XML is complicated.

JAXP, Java API for XML Processing: A standard API that lets you interact with XML documents, primarily by parsing, and then transforming from source to destination formats. For example, writing to file is implemented as a transform. The API is under javax.xml.parsers and javax.xml.transform. It works with both SAX and DOM parsers, and using JAXP hides many of the details of the underlying parser. This is one of the simpler ways to go initially, but look up some samples to help you along, since the API docs are not so intuitive. JAXP also includes ways to transform XML documents using XSLT stylesheets, and to validate your XML against a Schema. I found that using JAXP took much less code for my purposes, which was just to write some structures to XML.
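As a sketch of the "writing is a transform" idea: build a small DOM tree, then push it through an identity Transformer (one created without a stylesheet) into a string. Swapping the StringWriter for a File in the StreamResult would write to disk instead. The element names and the JaxpWriteExample class are assumptions for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class JaxpWriteExample {
    static String buildXml(String title) throws Exception {
        // Build a tiny document in memory: <library><book title="..."/></library>
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .newDocument();
        Element library = doc.createElement("library");
        doc.appendChild(library);
        Element book = doc.createElement("book");
        book.setAttribute("title", title);
        library.appendChild(book);

        // newTransformer() with no stylesheet gives an identity transform:
        // it copies the source (the DOM tree) to the result (a character stream).
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildXml("Dune"));
    }
}
```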

All the APIs described so far will give you Element or Attribute objects, but then you'd check the string inside to figure out what type of Element it is, and then look for the appropriate Attribute inside. Now it would be much simpler if you actually had Java classes representing the element and attribute types, and simply navigated that. One way to do this is to code up your own Java classes to represent the various element and attribute types, and instantiate them as you go with a SAX parser.

An alternative is to code up your own Java classes that wrap the DOM classes, so that your own classes perform the appropriate DOM manipulation operations, and expose a much simpler interface to the user. Too bad there's so much manual work involved. You would have thought that with a DTD or Schema, you would have all that information about your XML document... That's where JAXB comes into play.

JAXB, Java Architecture for XML Binding: JAXB is an intermediate "compiler" that generates Java classes based on an XML Schema. You can then build these auto-generated classes into your app without the tedium of dealing with DOM or SAX. Juicy!

Sep 27, 2010

Samples.







I like the KiD CUDi one the best.

Feb 9, 2010

Movies.

I meant to write something last month, after Sundance. But I didn't get around to it. I do have to say though, that it was fun times, and far more accessible than I had imagined. Movie tickets did require a bit of getting up early and waiting in line effort, but I think that's part of the experience. Considering going again next year, ping me if you're interested.

Restrepo
Was interested in this movie because it was shot by the pretty darn awesome photojournalist Tim Hetherington and journalist Sebastian Junger. This is a real war movie, and like none I've ever seen. Through ten trips over the course of a year to the same camp, the viewers experience a sense of intimacy with real soldiers. It's an hour and a half of the raw experience of being a soldier - digging, camping, patrolling, fighting. Post-tour interviews are moments of calm in the film between literal cinema verite experiences in the Korengal valley of Afghanistan, shooting at distant invisible targets, and being shot at. It's just a sense of audio calmness when the guns stop shooting though; the viewer still bears witness to the emotional turmoil of the young veterans trying to internalize the experience and bring it back home.

The film doesn't portray anything that goes on back in the US. There's no high level discussion on how the war is waged, it's pretty much only what's going on on the ground. However, despite the lack of commentary, I got a frightening sense that the war to win over the hearts and minds of the people was being lost by a bunch of macho and culturally ignorant kids, fumbling their way through botched negotiations with village elders over collateral damage.

If the purpose of a documentary is to educate, this film is about as hands on and practical as it gets. I left feeling like I understood what it's like to be "over there". All the print articles you've read about the war in Afghanistan don't hold a candle to this. Apparently the rights have been sold to National Geographic. I'd expect to see it on TV sometime this year. Watch it.

Enter the Void
I'm probably not qualified to write a review for this movie, since I walked out about halfway through. Apparently a crowd favourite during the Midnight Madness run at TIFF, this movie did not work, at all, as a morning showing. I was tempted to stay until the end to ask the director why he hated his audience so. An epilepsy-inducing opening credit sequence physically assaults the viewer with heavy techno music and gaudy flashing colours making the worst internet banner ad seem like a GAP commercial. And the only times the film stops beating the viewer are when it subjects you to boredom of equally painful proportions.

While there was some impressive camerawork done in the opening scene, a continuous real-time shot of the last 20 minutes of the main character's life, the horrible acting, worse writing and flatlining-from-boredom timing just killed it for me. The entire production was an experiment in creative camerawork, but there's not much film in there. On top of that, the gut-wrenching shakycam had me covering my eyes in an attempt to postpone my nonexistent breakfast from making a return visit. While I hear there's some semblance of character development a bit later, I think everything was pretty much laid out in the first 20 minutes. After that, it's a flood of shock tactics, drugs, sex and violence. Unfortunately, no rock and roll. I'm curious if there's a twist ending. Someone post the last 5 minutes on YouTube please.

Departures
I'm not sure where I read or heard about this movie. It might be that it won an Oscar. I was spellbound by the way this film masterfully danced between comedy and drama. You'd think from the synopsis that it's a film about a laid-off concert cellist who goes to work in a funeral home, but it turns out to be a movie about dreams, regrets, family, honour, respect, love, loss and redemption. You might knock a few points off for some fairly contrived situations, to make all the pieces fit - but the film comes together so well that it's a worthy cause in itself. You'll enjoy this one.

Examined Life
I was impressed that this was a National Film Board of Canada production, though it seemed mostly shot in New York City. You pretty much watch a bunch of modern day philosophers talk their stuff, while walking around, or rowing a boat, or riding around town in a car. Fill it with some cutscenes of things that happen around town, driving down a road, or walking down a path. If you're lucky, the scenery might match the topic of discussion. That's about it.

The content really is the talk. Hearing articulate thoughts presented eloquently from the mouths of the philosophers themselves is a vastly different experience than reading it off of a thick stack of pages. Some of it sounds like self-justifying academic bullshit, but there's definitely a few gems in there, as long as you can get through the more obtuse ones droning on. I particularly liked the Martha Nussbaum and Slavoj Zizek segments for their pragmatic analyses of our current society.



(note that the last two films were not viewed at Sundance, they're both available on Netflix streaming)

Jan 13, 2010

One shall stand, one shall fall.

I don't know how this will turn out, but the trailer is chock full of AWESOME.

Bumblebee looks like Bumblebee. There's a lot of recognizable characters, including Trypticon and Omega Supreme.
Thank God Michael Bay did not direct this.