Text prediction method based on multi-label attributes and improved maximum entropy model

Abstract

A core task in text prediction is to use the label attributes of texts, together with an algorithm and a heuristic approach, to discover the associations between texts and extract the texts that are useful to the user. This paper proposes a new method for this task. First, multi-label attributes are chosen as the feature structure of a text, and each text is classified and assigned attribute values according to a discrimination method based on statistical data. Second, to account for the relations between texts, we improve the traditional maximum entropy method so that the number of leading texts and the number of subsequent texts can be controlled at the same time. The method strengthens the association between texts, and the label attributes give the obtained texts a more unified direction and a higher correlation, which allows similar texts to be predicted. Experiments show that considering the multi-label attributes of texts and controlling the number of leading and subsequent texts improves recall and precision compared with similar existing methods.
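The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each text is represented by a set of label attributes, that context is encoded as binary features taken from a configurable number of leading and subsequent texts, and that the classifier is a standard maximum entropy model (multinomial logistic regression) trained by plain gradient ascent. The names `context_features`, `predict_proba`, `train`, the feature naming scheme, the toy data, and the target classes are all hypothetical.

```python
import math
from collections import defaultdict

def context_features(seq, i, n_lead, n_next):
    """Binary features for text i: its own label attributes plus those of up
    to n_lead leading and n_next subsequent texts (hypothetical scheme)."""
    feats = {f"self:{lab}" for lab in seq[i]}
    for d in range(1, n_lead + 1):
        if i - d >= 0:
            feats |= {f"lead{d}:{lab}" for lab in seq[i - d]}
    for d in range(1, n_next + 1):
        if i + d < len(seq):
            feats |= {f"next{d}:{lab}" for lab in seq[i + d]}
    return feats

def predict_proba(weights, feats, classes):
    """Maximum entropy distribution: p(c | x) is proportional to
    exp(sum of the weights of the active (feature, class) pairs)."""
    scores = {c: sum(weights.get((f, c), 0.0) for f in feats) for c in classes}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train(samples, classes, epochs=300, lr=0.2):
    """Gradient ascent on the average log-likelihood. The gradient for each
    (feature, class) pair is the observed count minus the expected count."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, y in samples:
            p = predict_proba(weights, feats, classes)
            for f in feats:
                for c in classes:
                    grad[(f, c)] += (1.0 if c == y else 0.0) - p[c]
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights

# Toy data (invented for illustration): an ordered sequence of texts, each
# described by a set of label attributes, with a hypothetical target class.
seq = [{"sports", "news"}, {"sports"}, {"finance"},
       {"finance", "news"}, {"sports"}, {"finance"}]
targets = ["relevant", "relevant", "other", "other", "relevant", "other"]
classes = ["relevant", "other"]

# n_lead and n_next control how many leading and subsequent texts contribute.
samples = [(context_features(seq, i, n_lead=1, n_next=1), y)
           for i, y in enumerate(targets)]
weights = train(samples, classes)
predicted = [max(p, key=p.get)
             for p in (predict_proba(weights, f, classes) for f, _ in samples)]
```

In this sketch, changing `n_lead` and `n_next` widens or narrows the context window independently on each side, which is one simple way to read the abstract's "control the number of leading text and subsequent text"; the paper's actual formulation may differ.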

Keywords

Text prediction, maximum entropy model, leading text, subsequent text, label attribute

