To categorize information, its content has to be negotiated in some way. Because topics and scenarios are so diverse, it is difficult to determine what a piece of information is actually about. Using simple keywords as identifiers for a category gives wrong results in several cases. As described in the previous chapters, a unique category correlation requires an intelligent approach. Consider the following two example emails:
1. My boss told me to brief you on the project.
2. Hi Sandy, my boss told me to brief the new one on the project. I will come home later.
Both example emails (1 and 2) touch on a business topic and, when simple keywords (boss, brief, project) are used, would both be correlated with the business category. For the second email this is wrong: only the first email belongs to the business category, whereas the second one is a private email.
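The failure of keyword correlation on these two emails can be reproduced with a minimal sketch. The function names, the keyword set, and the threshold of two matching keywords are assumptions made for illustration only; they do not describe any particular categorization system.

```python
import re

# Hypothetical keyword list for the "business" category (illustration only).
BUSINESS_KEYWORDS = {"boss", "brief", "project"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_category(text, keywords=BUSINESS_KEYWORDS, threshold=2):
    """Return "business" if at least `threshold` distinct keywords occur."""
    hits = keywords.intersection(tokenize(text))
    return "business" if len(hits) >= threshold else "other"


email_1 = "My boss told me to brief you on the project."
email_2 = ("Hi Sandy, my boss told me to brief the new one on the project. "
           "I will come home later.")

# Both emails contain all three keywords, so both end up in "business",
# although the second one is a private email.
print(keyword_category(email_1))
print(keyword_category(email_2))
```

The keyword check only sees that the words occur; it has no notion of who is being addressed or why, which is exactly the information that separates the two emails.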
Information Type Negotiation
To categorize information sent by email intelligently, the information type has to be negotiated. This means that the categorization process must rate, or even understand, the content in some way. Correlating by keywords does not work in all cases and is therefore not applicable. To determine the type of information (as described in [emailtypes]), the categorization process needs knowledge about what belongs to each category. This knowledge base can be regarded as a database of operational experience. Comparing the new information with the information already stored for each category, using different algorithms, is a difficult but feasible approach.
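One possible way to compare new information with a knowledge base is sketched below. It uses bag-of-words cosine similarity, which is only one of many candidate algorithms and is chosen here for brevity, not because the text prescribes it. The knowledge base contents, the function names, and the rule of taking the best-matching stored example per category are all assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: a few already categorized messages per category.
KNOWLEDGE_BASE = {
    "business": [
        "Please brief the team on the project status before the meeting.",
        "The boss wants the project report by Friday.",
    ],
    "private": [
        "Hi Sandy, I will come home later tonight.",
        "See you at home, I will be late for dinner.",
    ],
}


def bag_of_words(text):
    """Count how often each lowercase alphabetic word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if one is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def category_scores(text, knowledge_base=KNOWLEDGE_BASE):
    """Score each category by its best-matching stored example."""
    new = bag_of_words(text)
    return {
        category: max(cosine_similarity(new, bag_of_words(example))
                      for example in examples)
        for category, examples in knowledge_base.items()
    }


def categorize(text, knowledge_base=KNOWLEDGE_BASE):
    """Return the category whose stored examples are most similar to the text."""
    scores = category_scores(text, knowledge_base)
    return max(scores, key=scores.get)


print(categorize("I will come home later."))
print(categorize("The boss wants a project report."))
```

With such a small knowledge base the result depends heavily on the stored examples, and frequent words such as "the" can dominate the score. A comparison of this kind still rates surface word overlap only; it does not understand the content.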
Negotiation of Human Language
One of the biggest problems of content negotiation, especially for Internet email, is the use of human language. Human language differs so much from machine language that no computer is able to understand it fully. The barriers in computational linguistics and content negotiation are obvious: human language is complex and does not always follow strict rules. High intelligence is required to understand it, particularly when it is misspelled, sarcastic, or ironic. The variety of different languages makes it even harder for a computer to understand.
Semiotics, which covers the three main language-related subtopics of computational linguistics (syntax, semantics, and pragmatics), is the main problem of today’s content negotiation approaches.
Syntax comprises the formal rules of a language, including its alphabet, the permitted combinations, and the spelling rules.
Semantics is the meaning of a piece of information, that is, the intention of a message (the information content that is encoded syntactically).
Pragmatics allows information to carry two or more contradicting semantics. This happens, for example, through sarcasm or irony, when joking, or simply through language-specific problems.