Research and Development of an Opinion Text Classification System

Date: 2020-11-17 10:25  Source: 毕业论文

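With the rapid development and spread of Internet technology, a vast amount of textual information now exists in machine-readable form, and its volume grows sharply every day. We have moved from an era of information scarcity to an era of information abundance. How to automatically classify, organize, and manage this enormous body of literature, documents, and data (much of which is text) has become a research topic of considerable practical importance.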

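Automatic text classification, or text categorization for short, is a research topic that closely combines pattern recognition and natural language processing. This thesis reports two experiments: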

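1. The first experiment applies different term-weighting schemes and feature selection methods to a hotel-review corpus and a notebook (laptop) review corpus. Three weighting schemes are used: Boolean weight (BOOL), absolute term frequency (TF), and TFIDF. Three feature selection methods are used: document frequency (DF), information gain (IG), and the chi-square statistic (CHI). The classifier is a support vector machine (SVM). The results show that Boolean weighting (BOOL) gives the highest accuracy under every feature selection method, and that the three feature selection methods achieve essentially the same accuracy.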

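2. The second experiment applies negation transfer to the hotel and notebook corpora and then repeats the procedure of the first experiment. Comparing the results before and after negation transfer shows that accuracy improves by about 1%, indicating that negation transfer plays an important role in text classification.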
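
The three weighting schemes and the DF feature selection named in the first experiment can be illustrated with a minimal sketch. The function names, the toy reviews, and the TFIDF variant (tf * ln(N / df)) are assumptions made for illustration; the abstract does not say which TFIDF formula or which tools the thesis used.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Map each term to the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term at most once per document
    return df


def select_by_df(docs, k):
    """DF feature selection: keep the k terms found in the most documents."""
    df = document_frequency(docs)
    # Sort by descending DF; break ties alphabetically for determinism.
    return sorted(df, key=lambda term: (-df[term], term))[:k]


def weight_vector(tokens, vocab, scheme, df, n_docs):
    """Represent one tokenized document as a weight vector over vocab."""
    tf = Counter(tokens)
    if scheme == "BOOL":
        return [1.0 if tf[term] > 0 else 0.0 for term in vocab]
    if scheme == "TF":
        return [float(tf[term]) for term in vocab]
    if scheme == "TFIDF":
        # Assumed variant: tf * ln(N / df); other TFIDF variants exist.
        return [tf[term] * math.log(n_docs / df[term]) for term in vocab]
    raise ValueError(f"unknown weighting scheme: {scheme}")


# Toy, already-tokenized reviews (hypothetical data, not the thesis corpora).
docs = [
    ["room", "clean", "clean", "good"],
    ["room", "dirty", "bad"],
    ["screen", "good", "good"],
]
df = document_frequency(docs)
vocab = select_by_df(docs, 3)
for scheme in ("BOOL", "TF", "TFIDF"):
    print(scheme, weight_vector(docs[2], vocab, scheme, df, len(docs)))
```

The resulting vectors would then be passed to an SVM classifier. BOOL discards how often a term occurs, which is one plausible reason it can do well on short reviews where a single occurrence of an opinion word already carries the signal.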

Keywords: text classification, term weighting, feature selection, negation transfer

Graduation Project Report (Thesis): English Abstract

Title: Research and Development of a Sentiment Text Classification System

Abstract

   With the rapid development and popularization of Internet technology, a large amount of textual information now exists in machine-readable form, and its volume increases every day; we have moved from an era of information scarcity to an era of information abundance. How to automatically classify, organize, and manage this multitude of literature, documents, and data (a large part of which is text) has become a very important research topic.

   Automatic text classification, also called text categorization, is a research topic that closely combines pattern recognition and natural language processing. This thesis reports two experiments:

1. The first experiment applies different term-weighting and feature selection methods to hotel and notebook reviews. The term-weighting methods are BOOL, TF, and TFIDF; the feature selection methods are document frequency (DF), information gain (IG), and CHI; the classifier is a support vector machine (SVM). The results show that BOOL term weighting gives the highest accuracy under each feature selection method, and that the three feature selection methods give essentially the same accuracy.

2. The second experiment applies negation transfer (negative inversion) to the hotel and notebook reviews and then repeats the procedure of the first experiment. Comparing the results before and after this step shows that accuracy increases by about 1%, indicating that negation transfer plays an important role in text classification.
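
The abstract does not describe how the negation step of the second experiment is implemented. One common approach, shown here purely as an assumed sketch, moves the negation onto the words in its scope by prefixing them with a marker until the next clause boundary, so that "clean" and "not clean" become different features. The cue list, the boundary set, the `NOT_` prefix, and the function name are all hypothetical, and English tokens are used only for readability.

```python
NEGATION_CUES = {"not", "no", "never"}         # hypothetical cue list
CLAUSE_BOUNDARIES = {",", ".", ";", "!", "?"}  # negation scope ends here


def transfer_negation(tokens, prefix="NOT_"):
    """Prefix every token that follows a negation cue, up to a clause boundary."""
    out = []
    in_scope = False
    for token in tokens:
        if token in CLAUSE_BOUNDARIES:
            in_scope = False  # punctuation closes the negation scope
            out.append(token)
        elif token in NEGATION_CUES:
            in_scope = True   # drop the cue; its meaning moves to later tokens
        elif in_scope:
            out.append(prefix + token)
        else:
            out.append(token)
    return out


review = ["the", "room", "is", "not", "clean", ",",
          "but", "the", "staff", "are", "nice"]
print(transfer_negation(review))
```

After this preprocessing, term weighting and feature selection would run on the transformed tokens exactly as in the first experiment.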

Keywords: text classification, term weighting, feature selection, negation transfer

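Contents

1 Introduction

1.1 Background

1.2 Purpose and significance of the research

1.3 Current state of related technology in China and abroad

1.3.1 Main research topics in opinion text classification

1.3.2 Current applications of opinion text classification

Source: Research and Development of an Opinion Text Classification System: http://www.751com.cn/jisuanji/lunwen_64961.html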
