An approach to establishing the setting or context of a given text or content, especially with regard to improving term-translation algorithms in machine translation. This approach allows term-replacement algorithms to select target-language terms from lists of candidate terms according to the subject treated in a text and according to the semantic setting of the source term. As a corollary of the approach, content or text can be graded by subject matter by means of lexical indicators or standard subject-matter glossaries. Indeed, by finely tuning the algorithms, a context profile can be produced across the entire text or content, further refining the setting or context of a given term.
Hierarchy of context
An attempt to recognize and differentiate between different contexts, and to establish a measure or pattern that can be run over pieces of content to identify the related contexts.
- Set of contexts for content
- Adaptive context fitting
- Lexical feedback schemes
- Use of terms to establish context gradients for texts.
- Key terms are needed as context indicators.
- Two outputs: 1) context gradients across a text; 2) a context hierarchy for the whole text.
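As a minimal sketch of the first output, a context gradient can be computed as the density of subject indicator terms inside a sliding window; the indicator lists, window size, and scoring below are invented for illustration and are not part of the original notes.

```python
# Sketch: a context gradient across a text, computed as the fraction of
# subject indicator terms in each sliding-window position.
# The indicator term lists are invented for illustration.
INDICATORS = {
    "electronics": {"signal", "logic", "circuit", "voltage"},
    "cooking": {"recipe", "oven", "simmer", "flour"},
}

def context_gradient(words, subject, window=20):
    """Return, per window position, the fraction of subject indicator terms."""
    terms = INDICATORS[subject]
    grad = []
    for i in range(0, max(1, len(words) - window + 1)):
        chunk = words[i:i + window]
        grad.append(sum(w.lower() in terms for w in chunk) / len(chunk))
    return grad

text = ("the circuit drives a logic signal while the voltage is stable "
        "then we discuss a recipe with flour in the oven").split()
print(context_gradient(text, "electronics", window=5))
```

Peaks in the gradient mark the stretches of the text where a subject dominates, which is what a term-replacement algorithm would consult when disambiguating a source term.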
For example, a subject is an object like "electronics", characterized by:
- a neural network of electronics terms with different weightings (signal, logic, etc.);
- a network of glossaries of terms for different subjects, which is run over the text;
- the results of such a comparison run yield a ranked hierarchy of subjects, drawn up from the weightings of the different glossaries;
- a subject glossary contains the major lexical indicators for the subject; it can be finely or coarsely tuned.
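The comparison run described above might be sketched as follows; the glossaries, weights, and scoring rule are invented assumptions for illustration, not the original scheme.

```python
# Sketch: run weighted subject glossaries over a text and rank subjects.
# Each glossary maps indicator terms to weights (finely or coarsely tuned);
# the glossaries and weights here are invented.
GLOSSARIES = {
    "electronics": {"signal": 2.0, "logic": 1.5, "circuit": 2.0},
    "linguistics": {"lexical": 2.0, "synonym": 1.5, "meaning": 1.0},
}

def subject_hierarchy(text):
    """Score each subject by the summed weights of its indicators, ranked."""
    words = text.lower().split()
    scores = {}
    for subject, glossary in GLOSSARIES.items():
        scores[subject] = sum(glossary.get(w, 0.0) for w in words)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(subject_hierarchy("the circuit sends a logic signal to the board"))
# → [('electronics', 5.5), ('linguistics', 0.0)]
```

The ranked list is the "hierarchy list of subjects" the notes mention; tuning the glossaries finer or coarser changes how sharply subjects separate.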
"A human encountering a new concept derives the meaning via an accumulation of experience of the contexts in which the concept appears. This opens the door to "learning" the meaning of a concept through how it appears within the context of other concepts. Following this idea, a representational model of semantic memory has been developed called Hyperspace Analogue to Language (HAL), which automatically constructs a high-dimensional semantic space from a corpus of text (Burgess et al., 1998; Burgess & Lund, 1997; Lund & Burgess, 1996).
HAL is a so-called global co-occurrence model. It acquires vector representations of meaning by capitalizing on the large-scale co-occurrence information inherent in the text corpus.
HAL vectors show compatibility with human information processing and are therefore interesting as computational representations of "meaning" that are cognitively, rather than logically, motivated."
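As a rough sketch of the HAL-style construction (not the cited authors' implementation), each word's vector can be built from distance-weighted co-occurrence counts over a sliding window, with closer neighbours weighted more heavily; the tiny corpus and window size below are invented.

```python
# Sketch of HAL-style co-occurrence vectors: each word's vector holds
# distance-weighted counts of the words preceding it within a sliding
# window (weight = window - distance + 1, so nearer words count more).
from collections import defaultdict

def hal_vectors(words, window=3):
    """Build {word: {context_word: weight}} from a token list."""
    vec = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            vec[w][words[j]] += window - d + 1
    return {w: dict(ctx) for w, ctx in vec.items()}

toks = "the barn owl saw the barn mouse".split()
v = hal_vectors(toks)
print(v["owl"])  # → {'barn': 3.0, 'the': 2.0}
```

Comparing such vectors (e.g. by cosine similarity) is how "meaning" acquired purely from context can be measured between words.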
Bruza, P.D., & Song, D. (2003). Towards Context Sensitive Information
Distributed Systems Technology Centre, The University of Queensland, Australia
Peter Bruza, Robert McArthur and Dawei Song
Distributed Systems Technology Centre,
University of Queensland, Australia 4067
"An important factor contributing to lack of awareness is our limited ability to process and absorb information as quickly as it is growing. Information-theoretic research into consciousness has revealed that our heads have a surprisingly narrow bandwidth and process information slowly - in the region of between 10^7 and 10^11 bits per second. For any given second, only about 16 to 20 bits of information enter into consciousness. Consciousness, therefore, "is highly entropic, a thermodynamically costly state for the human system to be in". Human beings, it seems, make do with little information, as that is all they can handle.
There is a growing body of research from cognitive science in which corpus-based representations of terms and concepts are being computed which correlate with human processing."
1. Austin, J. (1998): Zen and the brain: Towards an understanding of meditation and consciousness. MIT Press.
2. Gabbay, D. and Woods, J. (2000): 'Abduction', Lecture notes from ESSLLI 2000 (European Summer School on Logic, Language and Information). Online: http://www.cs.bham.ac.uk/~esslli/notes/gabbay.html
NOBAYASHI Masamichi "Things and Words: the Complex of Meaning Hidden in the Complication of Word-indications"
Keywords: complex of meaning, contradictory words, lattices of synonyms, negative word, word understanding
If an object is that which we experience in daily life, speakers are clearly conscious of whether a word indicates the object or not. A word is recognized to exist only when it indicates the object. It is a matter of common knowledge that a word has only its affirmative form and meaning.
A pair of distinctive features, affirmative and negative can be assumed as a premise of the formation of word meaning. Word meanings are formed by the selection of the affirmative features alone. Let us suppose that words at the level of "langue" take affirmative features as their component of word meaning. Then, words with negative features can also be assumed at the deeper level of "langue". Therefore concurrence of two lineages of words, affirmative and negative ones which are contradictory of each other, can be assumed as a premise of the formation of a word.
It is said that contradictions constitute reality. Indeed, it is a fact that using the affirmative and negative lineages of synonyms together, which are brought about by the crossing of contradictory concepts, pinpoints an object. The contradictory words hidden in the complication of word indications must be found. This is what is called the "lattices of synonyms".
In this paper, the author intends to make the description of lexical structures objective, based on the scheme of the binary and complementary structure of cognitive language, that is, the "lattices of synonyms" with which speakers constitute their reality.
Quote from Devlin:
"Information = data + meaning
Knowledge = internalized information + ability to utilize the information"
K. Devlin: Infosense - turning information into knowledge. Freeman, New York, 1999.
The phrase decoding algorithm is based on a beam search that explores the hypothesis space breadth-first. The algorithm consists of three phases. The first, cept induction, initiates the search by generating possible cepts from a given input sequence of words. After the induction, the second phase constructs phrases from the given cepts; a phrase is defined as a sequence of cepts with inserted zero-fertility words, and does not necessarily have grammatical coherence. When all the input words have been consumed, the third phase constructs the final sentence by prepending/appending zero-fertility words.
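The three phases can be caricatured in code as follows; the lexicon, the zero-fertility word list, and the single-candidate heuristics are all invented, and a real decoder would keep a scored beam of partial hypotheses at every step rather than taking first candidates.

```python
# Highly simplified sketch of the three-phase phrase decoding flow.
# The lexicon and zero-fertility list are invented; no beam scoring here.
LEXICON = {"watashi": ["i"], "hon": ["book"], "yomu": ["read"]}
ZERO_FERTILITY = ["the", "a"]  # target words with no source counterpart

def induce_cepts(source_words):
    """Phase 1: generate candidate target cepts for each source word."""
    return [LEXICON.get(w, [w]) for w in source_words]

def build_phrases(cepts):
    """Phase 2: combine cepts into a phrase (naively: first candidate each)."""
    return [candidates[0] for candidates in cepts]

def finish_sentence(phrase):
    """Phase 3: insert zero-fertility words into the finished phrase
    (naive heuristic: one article after the first word)."""
    out = list(phrase)
    if len(out) >= 2:
        out.insert(1, ZERO_FERTILITY[1])
    return " ".join(out)

cepts = induce_cepts("watashi hon yomu".split())
sentence = finish_sentence(build_phrases(cepts))
print(sentence)  # → "i a book read" (no reordering in this toy sketch)
```

The sketch deliberately omits reordering and scoring; its point is only the pipeline shape — cepts, then phrases, then zero-fertility completion.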
Statistical Machine Translation Decoder Based On Phrase
by Taro Watanabe and Eiichiro Sumita
Kyoto University, Japan
The framework of statistical machine translation formulates the problem of translating a sentence in a language J into another language E as the maximization problem of the conditional probability Ê = argmax_E P(E|J) (Brown et al., 1993). Applying Bayes' rule gives Ê = argmax_E P(E)P(J|E). The former term P(E) is called the language model, representing the likelihood of E. The latter term P(J|E) is called the translation model, representing the generation probability from E into J.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della
Pietra, and Robert L. Mercer. (1993). The mathematics of statistical machine translation: Parameter estimation.
Computational Linguistics, 19(2):263-311.
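The noisy-channel argmax above can be made concrete with a toy candidate set; all probabilities below are invented for illustration.

```python
# Toy noisy-channel decoding: Ê = argmax_E P(E) * P(J|E).
# The candidate translations and their probabilities are invented.
candidates = {
    "i read a book":   {"lm": 0.04,  "tm": 0.30},   # P(E), P(J|E)
    "me book reading": {"lm": 0.001, "tm": 0.50},
}

def decode(cands):
    """Pick Ê maximizing language model times translation model."""
    return max(cands, key=lambda e: cands[e]["lm"] * cands[e]["tm"])

print(decode(candidates))  # → "i read a book"
```

Note how the fluent candidate wins despite its lower P(J|E): the language model P(E) penalizes disfluent output, which is exactly the division of labour Bayes' rule buys.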
If you get the meaning of my "Meinung" (German for "opinion")!
Would you like some details on the author?