Affordance Critique
Wednesday, July 1, 2009 at 6:48AM
The Textual Analysis Example 1 series began by examining several different input sources and concluded with a psychological payoff for the effort invested. Along the way, I examined assembling the text, making notes and observations about particular stretches of text, and leveraging spatial and colour cues available within map view to enable emergent insights.
The generic phases of this form of textual analysis are:
- Capture
- Pre-processing
- Import into Tinderbox
- Arrangement
- Analysis
So, how well did the tool chain function? How could it have functioned better? How might it have functioned with different affordances? And how well did the initial Tinderbox functionality manifesto hold up in this particular example of textual analysis?
I'll examine each question in light of each of the phases.
Capture
In this series, I hinted at four different forms in which the text could have been captured:
- as a manuscript image
- as an image of a published text
- as a machine-recognized PDF text
- as digitised text
It's clear that each of these scenarios needs to be supported within the analytical tool chain.
Pre-processing
In the case of the machine-recognised PDF text, the conversion of text from an image to a stream of machine-recognised characters constitutes pre-processing.
In the case of the digitised text, I related in detail how I used Microsoft Word to prepare the text for two different analyses. The first analysis consisted of a corpus study focused on word frequencies. The second analysis was the qualitative textual analysis within Tinderbox. The pre-processing began by stripping away artefacts within the text stream that were not the focus of the study; it finished by adding deictic markers to the text stream.
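The word-frequency step of the corpus study can be illustrated with a minimal Python sketch. This is not the tool I actually used; the tokenising rule and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count case-folded word tokens (letters only, optional internal apostrophe)."""
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    return Counter(tokens)

# Made-up sample; a real study would read the pre-processed text stream instead.
sample = "In the beginning was the Word, and the Word was with God."
freq = word_frequencies(sample)

# The most frequent words are candidates for the key words of the analysis.
for word, count in freq.most_common(3):
    print(word, count)
```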
There are some obvious reasons to retain the mechanics of this step outside of Tinderbox: we already have "good software" to do this, and Tinderbox is already feature-rich, so we don't want to bloat it. However, when considered within the context of Audenaert's idea of representing an underlying document on a series of layers, it begins to seem more natural to expect Tinderbox to support the separation of the text onto the layers.
If Tinderbox supported the separation of a single text stream into multiple layers, then regular expressions could be used to automate the separation across the multiple layers.
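As a rough sketch of what regex-driven layer separation might look like, the Python below pulls two kinds of artefact out of a text stream and leaves a clean base text. The layer names, the patterns and the sample stream are all hypothetical; the real artefacts depend entirely on the source text, and nothing here reflects an existing Tinderbox feature.

```python
import re

# Hypothetical layer patterns; adjust to the artefacts of the actual text.
LAYER_PATTERNS = {
    "verse_numbers": re.compile(r"\b\d+\b"),
    "footnote_markers": re.compile(r"\[[a-z]\]"),
}

def separate_layers(stream):
    """Return (base_text, layers): each layer lists the matches removed from the stream."""
    layers = {}
    base = stream
    for name, pattern in LAYER_PATTERNS.items():
        layers[name] = pattern.findall(base)   # record what belongs on this layer
        base = pattern.sub("", base)           # then strip it from the base text
    base = re.sub(r"\s+", " ", base).strip()   # tidy whitespace left by removals
    return base, layers

stream = "1 In the beginning[a] was the Word. 2 He was with God[b]."
base_text, layers = separate_layers(stream)
print(base_text)
print(layers)
```

A fuller version would also record the character offset of each match so that every layer could be laid back over the base text in the right place.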
Import into Tinderbox
Should the manuscript or an image of the published text have been the focal point of the research, the import format would have been an image. Should the machine-recognised PDF file have been directly imported, it would have been a PDF file. As digitised text, the import could have been encoded as ASCII, UTF-8 or UTF-16, and formatted as plain text, RTF, HTML or OPML.
I noticed that on import the em-dash failed to be correctly imported: it was represented as a series of three or four separate characters. I'm unsure what caused this, but I suspect that Tinderbox is due for some improvement in its support for Unicode.
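One plausible explanation, which I have not confirmed, is an encoding mismatch: an em-dash occupies three bytes in UTF-8, and if those bytes are read with a single-byte encoding, each byte shows up as its own character. The Python sketch below demonstrates the effect; the choice of Windows-1252 and Mac Roman as the mistaken encodings is an assumption, not a diagnosis of what Tinderbox did.

```python
em_dash = "\u2014"                      # the em-dash character
utf8_bytes = em_dash.encode("utf-8")    # three bytes: E2 80 94

# Decoding those bytes with a single-byte encoding yields one character per byte.
as_cp1252 = utf8_bytes.decode("cp1252")
as_mac_roman = utf8_bytes.decode("mac_roman")

print(len(utf8_bytes))                  # number of bytes in the UTF-8 form
print(repr(as_cp1252), len(as_cp1252))
print(repr(as_mac_roman), len(as_mac_roman))
```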
Arrangement & Analysis
The particular arrangement affordances I exploited were selected to make explicit use of Tinderbox's current features. The reasons I was able to explode the text into a verse-by-verse representation were:
- the text is already segmented by verse
- arranging by verses enables reflowing of text for visual exploration
- arranging by verses enables agents to return results, approaching the functionality of a concordance
- I knew I could use adornments to mark out larger text segments
- by arranging the text verse-by-verse, I was able to use links to mark semantic relations between verses
It is probable that there are very few texts in which one is likely to want to do this. Indeed, snippetising the text has its costs:
- it is more difficult to read the text as a flowing whole, especially where the verse division occurs within a clause complex
- it is very difficult to mark out multiple overlapping and hierarchic syntagms, particularly if those syntagms span more than one clause
- adornments are represented only in map view, and thus cannot be manipulated programmatically
- I needed to split the verses into even smaller syntagms in order to more richly represent semantic relations
Other issues become apparent when you begin to use Tinderbox in the way I did. For example, making notes within the note body, along with the target text, increases the noise returned by the agents: sometimes the agent returns results based on my notes, rather than on the text. One key piece of functionality Tinderbox needs is to enable focusing a search on the underlying text as distinguished from the commentary on the text. Moreover, adding user-defined rich text fields would also be an enhancement worth considering.
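The distinction between searching the underlying text and searching the commentary can be sketched as follows. This is an illustrative Python data model only; the `Note` class and its field names are hypothetical and do not correspond to Tinderbox attributes.

```python
from dataclasses import dataclass

@dataclass
class Note:
    name: str
    text: str              # the underlying text being analysed
    commentary: str = ""   # the analyst's own observations

def find_in_text(notes, word):
    """Return names of notes whose underlying text (not commentary) contains word."""
    needle = word.lower()
    return [n.name for n in notes if needle in n.text.lower()]

notes = [
    Note("v1", "In the beginning was the Word", "Compare the use of 'light' in v4."),
    Note("v4", "In him was life, and the life was the light of men"),
]

# Only v4 matches: the mention of 'light' in v1's commentary is ignored.
print(find_in_text(notes, "light"))
```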
What if? --- Projecting new affordances
Were Tinderbox to implement the functionality described in my manifesto, I would have:
- left the text as a single document
- marked each clause complex with a syntagm lense having a prototype of ClauseComplex.
- marked each independent and dependent clause with a syntagm lense having a prototype of Clause.
- employed agents to mark the key words (as identified by corpus word counts) and their close synonyms and related words with a syntagm lense having unique prototypes descending from a CohesionRepetition prototype
- continued to use links to mark semantic relations between syntagm lenses
- made the inductive analysis entries within the syntagm lense as a separate field from the text.
Concordance Agent
In this study, I used agents to return verses containing particular words. The functionality I truly want is:
- to be able to define an agent to return the target word, with the surrounding linguistic context to either side of the particular word; i.e. a concordance display
- the agent would be capable of automatically creating the syntagm lense based on user-defined criteria: e.g. target word only, x number of words to each side; the syntagm lense can then be manually adjusted once the analyst views the results
- the agent would run once, and then switch itself off; there's no need for it to run continuously, as it is focused on an underlying text that doesn't change
- the agent would have the option of linking all the words together with a given semantic label (using links, probably); this is used for cohesion analysis
- all syntagm lenses are created on a unique overlay that is generated for that particular agent; the analyst may have a control allowing the created syntagm to be assigned to a particular existing overlay
- all the syntagms are visible underneath the agent; they are also visible on the text itself (when the relevant overlay is switched on)
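The concordance display itself, the target word with a fixed number of words to either side, is a keyword-in-context listing. A minimal Python sketch, assuming simple whitespace-and-punctuation tokenising and a made-up sample sentence:

```python
import re

def concordance(text, target, width=3):
    """Return (left context, hit, right context) for each occurrence of target."""
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == target.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

sample = "In the beginning was the Word, and the Word was with God."
for left, hit, right in concordance(sample, "word", width=2):
    print(f"{left:>12} [{hit}] {right}")
```

Because the underlying text doesn't change, the result can be computed once and stored, which matches the run-once behaviour described in the list above.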
Syntagm Lense
When an analyst marks out a piece of text, and creates a syntagm lense out of it, the textual focus of the syntagm lense is the particular segment of text that was marked out. When you open the syntagm lense (like a Tinderbox note) the target text is visible in the syntagm lense's text slot. However, within Tinderbox, that particular passage of text isn't duplicated. Furthermore, when you open the syntagm lense note, you can create further syntagm lenses right on the text itself. When you close the note, and have the appropriate overlay switched on, that additional syntagm lense is viewable directly on the underlying text. In fact, all syntagm lenses, however they are created, are viewed directly on top of the underlying text, given the appropriate overlays being switched on.
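One way to get this behaviour without duplicating text is stand-off annotation: each syntagm lense stores only character offsets into the single underlying document, plus its prototype and overlay. The Python below is a sketch under that assumption; the `Lens` class, its fields and the overlay names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lens:
    start: int        # offset of the first character covered
    end: int          # offset one past the last character covered
    prototype: str
    overlay: str

    def text(self, document):
        """Look the passage up in the document rather than storing a copy."""
        return document[self.start:self.end]

def visible(lenses, overlays_on):
    """Lenses to draw over the text, given the overlays currently switched on."""
    return [lens for lens in lenses if lens.overlay in overlays_on]

document = "In the beginning was the Word, and the Word was with God."
lenses = [
    Lens(0, len(document), "ClauseComplex", "syntax"),
    Lens(0, document.index(","), "Clause", "syntax"),
    Lens(25, 29, "CohesionRepetition", "cohesion"),
]

print(lenses[1].text(document))
print([lens.prototype for lens in visible(lenses, {"cohesion"})])
```

Overlapping and nested lenses fall out naturally, since nothing prevents two offset ranges from covering the same characters.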
Syntagm lenses also act as a true lense over the particular piece of text. They would have the power to change the font face, font size, color, underlining, bolding and italicising of the underlying text. They also control the background colour, semi-transparent foreground colour, border width, border colour and border line style. All of these properties are inheritable from syntagm lense prototypes. All of the properties can be applied on top of the underlying text.
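Inheritance of display properties from prototypes could be modelled as a chain of lookups, where a lense consults its own settings first and then each prototype in turn. The Python sketch below uses `collections.ChainMap` for this; the property names and values are invented for illustration.

```python
from collections import ChainMap

# Hypothetical defaults shared by every syntagm lense.
syntagm = {"font_size": 12, "background": "none", "border_width": 0}

# A prototype overrides only what differs from its parent.
cohesion_repetition = ChainMap({"background": "yellow", "border_width": 1}, syntagm)

# A more specific prototype descending from CohesionRepetition.
key_word = ChainMap({"border_colour": "red"}, cohesion_repetition)

print(key_word["font_size"])      # inherited from the base defaults
print(key_word["background"])     # inherited from CohesionRepetition
print(key_word["border_colour"])  # set on the specific prototype
```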
Clearly, if you have multiple syntagm lenses, you're going to want to manually pull them off the underlying text and re-arrange them in order to reflow the text. And once you've reflowed the text, you may very well decide to pull apart the interior of a clause complex and graphically re-arrange it. There's a bunch of affordances here that need to be explored in order to come up with appropriate functionality. With so many selectable components in such a close proximity, the means of selecting and discriminating between multiple targets is essential.
Syntagm lenses provide link anchors in order to attach multiple named source and destination links. This is clearly superior to Tinderbox's existing word-based hyperlinks, because word-based hyperlinks:
- are not named
- provide no anchoring point for incoming links
- are not suitable for multiple links containing different semantic relationships.
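To make the contrast concrete, here is a small Python sketch of named links anchored on lenses, where the same pair of lenses can carry several links with different semantic labels and incoming links can be queried directly. The `Link` class, the lense identifiers and the relation names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    source: str     # identifier of the source lense
    target: str     # identifier of the destination lense
    relation: str   # the semantic label that names the link

links = [
    Link("clause1", "clause2", "elaboration"),
    Link("clause1", "clause2", "repetition"),
    Link("clause3", "clause2", "contrast"),
]

def incoming(links, lens_id):
    """All named links arriving at a lense, something word-based links cannot anchor."""
    return [(link.source, link.relation) for link in links if link.target == lens_id]

print(incoming(links, "clause2"))
```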
Manuscript functionality
If I were conducting the study using the manuscript image as the base text, I would need a way to specify syntagm lenses purely as a graphical representation. I would need the ability to specify a graphical region as being a particular syntagm lense. This graphical region would have similar shaping behaviour as that used by Apple in the iPhone 3.0 copy functionality.
Moreover, few would want to work directly with the manuscript at all times. Most analysts would transcribe the manuscript into a digitally-encoded textual representation. There seems to be a difference between treating the digital transcription as a separate text and treating it as a stand-in for the manuscript. (As a stand-in, for example, one might conduct a search on the digital representation, and return the portions of the manuscript that match the search highlighted.)
Some analysts may want to compare multiple manuscripts. To do this, they may make direct linkages between the manuscript images. Or they may create stand-in digital representations, and then use text differencing tools to highlight the differences between the two versions.
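The text-differencing approach can be sketched with Python's standard `difflib` module, comparing two transcriptions word by word. The two sample readings are invented for illustration.

```python
import difflib

def word_differences(a, b):
    """Return (tag, words_in_a, words_in_b) for each stretch where the texts differ."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    return [
        (tag, a_words[i1:i2], b_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Made-up readings standing in for transcriptions of two manuscripts.
reading_a = "in the beginning was the word"
reading_b = "in the beginning was a word"
print(word_differences(reading_a, reading_b))
```

Each reported difference carries word positions, so it could in principle be mapped back to the corresponding region of each manuscript image.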
PDF functionality
If I were conducting the study using the PDF document as the base text, there is a question regarding the treatment of multiple-page documents. If the PDF is imported as a multi-page document, should the pages display on the map-like view as separate, independent pages? Or should they display as part of a pile, with forward and backward controls to flip pages? If the latter, then the overlays would need to change automatically when you flipped the page. But also, if the latter, then you'd definitely want the ability to split the stack into multiple piles.
Analysis of the original manifesto
The manifesto to adapt Tinderbox to the affordances directly supporting textual analysis is reasonably well supported by this analysis. That document certainly never envisaged applying layers in order to analytically separate collagic elements within the underlying document; this analysis has shown that in some cases that functionality could very well be useful.
The need for overlays is vital for managing the visual space. Unlike the spatial hypertext design for multiple spaces for hierarchical representation, textual analysis needs to visually represent hierarchy (rank) in place. So the ability for the analyst to switch on or off overlays is essential to managing that space in order to focus attention on what is vital at any stage in the analysis or synthesis.
Because of its increased length, this paper has expanded the description of various elements. Some of the thinking regarding piles and comparative textual criticism certainly represents an extension of the underlying ideas. Those base ideas were all present in the original manifesto: being able to create syntagm lenses, hyperlink between them, place them on overlays, and distinguish between an underlying text and the analytical text that is built on top of it.