Extracting Features from Text Data
References
Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, 2008 (Cambridge University Press) - Covers text representations such as the bag-of-words model and TF-IDF in detail; a foundation for text feature extraction.
Efficient Estimation of Word Representations in Vector Space, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, 2013, arXiv preprint arXiv:1301.3781, DOI: 10.48550/arXiv.1301.3781 - Introduces Word2Vec, pioneering work on efficiently learning high-quality distributed word representations.
GloVe: Global Vectors for Word Representation, Jeffrey Pennington, Richard Socher, and Christopher Manning, 2014, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (Association for Computational Linguistics), DOI: 10.3115/v1/D14-1162 - Proposes GloVe, an unsupervised learning algorithm for obtaining word vectors that capture global corpus statistics.
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Daniel Jurafsky and James H. Martin, 2025 - A comprehensive textbook spanning NLP, from classical text feature extraction to modern word and sentence embedding techniques (third edition draft).
© 2025 ApX Machine Learning