Saturday, 16 June 2018

In the context of information retrieval, a thesaurus (plural: "thesauri") is a form of controlled vocabulary that seeks to dictate the semantic manifestations of metadata in the indexing of content objects. A thesaurus minimizes semantic ambiguity by ensuring uniformity and consistency in the storage and retrieval of the manifestations of content objects. ANSI/NISO Z39.19-2005 defines a content object as "any item that is to be described for inclusion in an information retrieval system, website, or other source of information". The thesaurus aids the assignment of preferred terms to convey the semantic metadata associated with the content object.

A thesaurus guides both the indexer and the searcher to choose the same preferred term, or combination of preferred terms, to represent a given subject. ISO 25964, the international standard for information retrieval thesauri, defines a thesaurus as a "controlled and structured vocabulary in which concepts are represented by terms, organized so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms."

A thesaurus is composed of at least three elements: (1) a list of words (or terms); (2) the relationships among those words (or terms), indicated by their relative position in a hierarchy (e.g. parent/broader term, child/narrower term, synonym, etc.); and (3) a set of rules on how to use the thesaurus.
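
The three elements can be pictured as a small data structure. The following is a minimal sketch in Python, not an implementation of any standard: the names `Concept`, `Thesaurus`, `add`, `link` and `preferred_term` are hypothetical, and the two usage rules it enforces (hierarchical links are reciprocal, and every synonym points to exactly one preferred term) are simplifying assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept, labelled by its preferred term (element 1: the terms)."""
    preferred: str
    broader: set = field(default_factory=set)    # parent / broader terms
    narrower: set = field(default_factory=set)   # child / narrower terms
    use_for: set = field(default_factory=set)    # non-preferred synonyms


class Thesaurus:
    def __init__(self):
        self.concepts = {}   # lower-cased preferred term -> Concept
        self.lead_in = {}    # lower-cased synonym -> preferred term

    def add(self, preferred, use_for=()):
        """Register a concept and its synonyms."""
        concept = Concept(preferred, use_for=set(use_for))
        self.concepts[preferred.lower()] = concept
        for synonym in use_for:
            # Assumed rule: each synonym leads to exactly one preferred term.
            self.lead_in[synonym.lower()] = preferred
        return concept

    def link(self, broader, narrower):
        """Element 2: record a hierarchical relationship.

        Assumed rule: the link is stored in both directions.
        Raises KeyError if either term has not been added.
        """
        self.concepts[broader.lower()].narrower.add(narrower)
        self.concepts[narrower.lower()].broader.add(broader)

    def preferred_term(self, word):
        """Element 3, in miniature: map any known word to its preferred term."""
        key = word.lower()
        if key in self.concepts:
            return self.concepts[key].preferred
        return self.lead_in.get(key)   # None when the word is unknown
```

With this sketch, an indexer who adds "bovine spongiform encephalopathy" with the synonyms "BSE" and "mad cow disease" gets the same preferred term back from `preferred_term` whichever of the three forms is typed. Which form is treated as preferred is a choice made here for the example only.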

History

Wherever there is a large collection of information, whether on paper or in computers, scholars have faced a challenge in pinpointing the items they seek. Using classification schemes to arrange documents in order was only a partial solution. Another approach was to index the contents of the documents using words or terms, rather than classification codes. In the 1940s and 1950s some pioneers, such as Calvin Mooers, Charles L. Bernier, Evan J. Crane and Hans Peter Luhn, collected their index terms in various kinds of list that they called a "thesaurus" (by analogy with the well-known thesaurus developed by Peter Roget). The first such list put seriously to use in information retrieval was the thesaurus developed in 1959 at the E I Dupont de Nemours Company.

The first two of these lists to be published were the Thesaurus of ASTIA Descriptors (1960) and the Chemical Engineering Thesaurus of the American Institute of Chemical Engineers (1961), a descendant of the Dupont thesaurus. More followed, culminating in the influential Thesaurus of Engineering and Scientific Terms (TEST), published jointly by the Engineers Joint Council and the US Department of Defense in 1967. TEST did more than serve as an example; its Appendix 1 presented thesaurus rules and conventions that have guided thesaurus construction ever since. Hundreds of thesauri have been produced since then, perhaps thousands. The most notable innovations since TEST have been: (a) extension from monolingual to multilingual capability; and (b) addition of a conceptually organized display to the basic alphabetical presentation.

Here we mention only a few national and international standards that have built firmly on the ground rules set out in TEST:

  • UNESCO Guidelines for the establishment and development of monolingual thesauri. 1970 (followed by later editions in 1971 and 1981)
  • DIN 1463 Guidelines for the establishment and development of monolingual thesauri. 1972 (followed by later editions)
  • ISO 2788 Guidelines for the establishment and development of monolingual thesauri. 1974 (revised 1986)
  • ANSI American National Standard for Thesaurus Structure, Construction, and Use. 1974 (revised 1980 and superseded by ANSI/NISO Z39.19-1993)
  • ISO 5964 Guidelines for the establishment and development of multilingual thesauri. 1985
  • ANSI/NISO Z39.19 Guidelines for the construction, format, and management of monolingual thesauri. 1993 (revised 2005 and renamed Guidelines for the construction, format, and management of monolingual controlled vocabularies)
  • ISO 25964 Thesauri and interoperability with other vocabularies. Part 1 (Thesauri for information retrieval) published 2011; Part 2 (Interoperability with other vocabularies) published 2013

The clearest trend across this history of thesaurus development has been the move from small-scale, isolated use to a networked world. Access to information was notably enhanced when thesauri crossed the divide between monolingual and multilingual applications. More recently, as the titles of the latest ISO and NISO standards show, there is recognition that thesauri need to work together with other forms of vocabulary or knowledge organization system, such as subject heading schemes, classification schemes, taxonomies and ontologies. The official website for ISO 25964 gives more information, including a reading list.


Purpose

In information retrieval, a thesaurus can be used as a form of controlled vocabulary to aid in indexing appropriate metadata for information-bearing entities. A thesaurus helps express the manifestations of a concept in a prescribed way, which improves precision and recall. Because the language is uniform, the semantic conceptual expressions of information-bearing entities are easier to locate. In addition, a thesaurus maintains a hierarchical listing of terms, usually single words or bound phrases, that helps the indexer narrow the terms and limit semantic ambiguity.

The Art & Architecture Thesaurus, for example, is used by many museums around the world to catalogue their collections. AGROVOC, the thesaurus of the UN Food and Agriculture Organization, is used to index and/or search its AGRIS database of worldwide literature on agricultural research.


Structure

Information retrieval thesauri are formally organized so that the relationships between concepts are made clear. For example, "citrus fruits" might be linked to the broader concept "fruits" and to the narrower concepts "oranges", "lemons", etc. When the terms are displayed online, the links between them make it easy to browse the thesaurus and select useful terms for a search. When a single term could have more than one meaning, such as tables (furniture) or tables (data), the meanings are listed separately so that the user can choose which concept to search for and avoid retrieving irrelevant results. For any one concept, all known synonyms are listed, such as "mad cow disease", "bovine spongiform encephalopathy", "BSE", etc. The idea is to guide all indexers and all searchers to use the same term for the same concept, so that search results will be as complete as possible. If the thesaurus is multilingual, equivalent terms in other languages are shown too. Following international standards, concepts are generally arranged hierarchically within facets or grouped by theme or topic. Unlike a general thesaurus used for literary purposes, information retrieval thesauri typically focus on one discipline, subject or field of study.
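
To make these relationships concrete, here is a hedged Python sketch of how a search interface might use them, built only from the examples in the paragraph above. The dictionaries `NARROWER`, `USE` and `HOMOGRAPHS` and the functions `resolve` and `expand` are hypothetical names, and treating "bovine spongiform encephalopathy" as the preferred term is an assumption made for illustration.

```python
# Hypothetical miniature thesaurus; real thesauri hold far more terms.
NARROWER = {
    "fruits": ["citrus fruits"],
    "citrus fruits": ["oranges", "lemons"],
}

# Synonym -> preferred term. Which synonym is preferred is an assumption.
USE = {
    "mad cow disease": "bovine spongiform encephalopathy",
    "bse": "bovine spongiform encephalopathy",
}

# Terms with more than one meaning, listed separately with qualifiers.
HOMOGRAPHS = {
    "tables": ["tables (furniture)", "tables (data)"],
}


def resolve(term):
    """Return the candidate preferred terms for what the searcher typed.

    An ambiguous term yields several candidates for the user to choose from;
    a synonym yields its preferred term; anything else is returned unchanged.
    """
    key = term.strip().lower()
    if key in HOMOGRAPHS:
        return list(HOMOGRAPHS[key])
    return [USE.get(key, key)]


def expand(term):
    """Return the term followed by all of its narrower terms, depth-first."""
    result = [term]
    for child in NARROWER.get(term, []):
        result.extend(expand(child))
    return result
```

In this sketch, `resolve("BSE")` leads the searcher to the same preferred term the indexer used, `resolve("tables")` offers both qualified meanings, and `expand("fruits")` adds "citrus fruits", "oranges" and "lemons" to the query so that documents indexed under narrower concepts are also retrieved.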


See also

  • Controlled vocabulary
  • ISO 25964
  • Thesaurus


External links

  • Official site for ISO 25964
  • Taxonomy Warehouse

Source of the article: Wikipedia
