CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

A Collection of Persian-English Documents

Article title: A Collection of Persian-English Documents
National article ID: ICEE15_248
Published in the 15th Iranian Conference on Electrical Engineering, 1386 (Solar Hijri)
Author details:

Sadra Abedinzadeh - Faculty of Electrical and Computer Engineering, School of Engineering, University of Tehran, Tehran, Iran
Fattaneh Taghiyareh - Faculty of Electrical and Computer Engineering, School of Engineering, University of Tehran, Tehran, Iran
Farhad Oroumchian - Faculty of Information Technology, University of Wollongong (Dubai campus), Dubai, UAE; Control and Intelligent Processing Center of Excellence, University of Tehran, Iran

Abstract:
The development of Language Engineering (LE) and Information Retrieval (IR) applications requires the availability of a sizeable, reliable, and representative collection of documents. Moreover, Cross-Language Information Retrieval (CLIR) systems have recently come into wide use due to the explosion of non-English documents. However, the lack of such a collection for CLIR involving Persian retrieval is a major drawback for research in this field. This paper describes a 90 MB Persian-English collection of 7073 documents, generated from the website of Wikipedia, an open encyclopedia, and represented in XML format. We also use the RSLP collection description schema to describe our collection.

Keywords:
Collection of Documents, Bilingual, Persian-English, Wikipedia, RSLP schema

Dedicated article page and full-text download: https://civilica.com/doc/25317/