Deep Text: Using Text Analytics to Create New Ways to Structure and Access Information
Wednesday, March 22, 2017 from 9:00AM – 5:00PM
Tom, from your perspective, why is text analytics relevant to the IA community?
Text analytics is relevant to the IA community in two basic ways. The first is that IAs are one of the major sources of input into developing the categorization and metadata generation structures and rules that text analytics uses to create sophisticated information access and to present information in multiple ways. Having an IA perspective during development often offsets the tendency of taxonomists and librarians to develop overly complicated structures.
The second is in the actual design of information access and the use of information in complex applications. Text analytics enables complex structuring of information through such capabilities as multiple categorizations, faceted metadata, sentiment analysis, graph databases, and ontology-driven displays of related information using multiple kinds of relatedness. All this complexity presents both a challenge for IA, to develop ways that take advantage of the richness of these information structures, and an opportunity to create new directions for the field and new job opportunities.
What’s the biggest challenge in learning text analytics?
There are a number of significant challenges in learning and practicing text analytics, starting with the fact that most people and organizations do not have a good idea of what it is or what its business value is. The complexity and variety of the software used to do text analytics is another major issue. Related to this is the variety of techniques used in text analytics, including one major split between text mining techniques, which focus mostly on words as data, and taxonomy and categorization techniques, which focus on concepts and language.
However, the single biggest challenge is building resources and applications that deal with the complexity of language. Unlike structured information, text is messy, with few hard and fast answers, but that messy complexity is also the source of its power and depth.
How do people factor into text analytics?
People connect to text analytics in several ways. The first is as input into text analytics. Many vendors claim that their software is completely automatic, with no human input needed. This tends to be the cheapest approach, but it also has low accuracy. So to do text analytics well, humans are needed: taxonomists and others, including input from IAs.
However, the main area of human interaction is in the presentation of complex information. Text analytics opens up new ways to present information for human consumption. One example is that experts chunk information differently than non-experts. Text analytics can characterize content as expert or not, and it can also be used to semi-automatically generate an expertise profile based on a person's writings. This enables information to be presented in more targeted ways than a one-size-fits-all approach.
A related capability is the ability to apply different types of organization schemas for different types of consumption. For example, some people tend to use primarily taxonomic subject categories (this document refers to encryption, which is a subcategory of security). Other people tend to use functional relationships (this document refers to how different modes of transportation impact factors as varied as travel time and stress levels).
There is no single correct way to organize information. This workshop discusses how to determine which methods work for which types of people, and how to determine which type a person is based on an analysis of their documents. This enables information architecture to present information in ways suited to different types of people or to different types and purposes of content.
It also relates to information architecture in questions such as how best to design a user interface that presents multiple ways of organizing information without becoming too confusing to use.
What is the one thing you want folks to take away from your workshop?
Text analytics is a platform for structuring information that has the power to radically change how information is used in organizations. If done right, it can change information overload from a problem into a rich resource for multiple information applications: everything from search and search-based applications that work, to social media monitoring that delivers deep insights into customers and employees, to whole new classes of applications.
Tom Reamy is Chief Knowledge Architect at KAPS Group and author of Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value From Social Media, and Add Big(ger) Text to Big Data. You can find him on Twitter at @TomReamy.