Bibliography

1

Wirote Aroonmanakun. Thoughts on word and sentence segmentation in Thai. In Proceedings of the Seventh Symposium on Natural Language Processing, Pattaya, Thailand, December 13–15, 85–90. 2007.

2

Steven Bird. NLTK: the natural language toolkit. In Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions, 69–72. 2006.

3

David M Blei, Andrew Y Ng, and Michael I Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022, 2003.

4

Tom B Brown and others. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.

5

Pattarawat Chormai, Ponrawee Prasertsom, Jin Cheevaprawatdomrong, and Attapol Rutherford. Syllable-based neural Thai word segmentation. In Donia Scott, Nuria Bel, and Chengqing Zong, editors, Proceedings of the 28th International Conference on Computational Linguistics, 4619–4637. Barcelona, Spain (Online), December 2020. International Committee on Computational Linguistics. URL: https://aclanthology.org/2020.coling-main.407, doi:10.18653/v1/2020.coling-main.407.

6

Matthew Honnibal and Ines Montani. spaCy 2: natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. To appear, 2017.

7

Tibor Kiss and Jan Strunk. Unsupervised multilingual sentence boundary detection. Computational Linguistics, 32(4):485–525, 2006. URL: https://aclanthology.org/J06-4003, doi:10.1162/coli.2006.32.4.485.

8

Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, and Yongbin Li. A survey on out-of-distribution detection in NLP. Transactions on Machine Learning Research, 2023.

9

Peerat Limkonchotiwat, Wannaphong Phatthiyaphaibun, Raheem Sarwar, Ekapol Chuangsuwanich, and Sarana Nutanong. Domain adaptation of Thai word segmentation models using stacked ensemble. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 3841–3847. Online, November 2020. Association for Computational Linguistics. URL: https://aclanthology.org/2020.emnlp-main.315, doi:10.18653/v1/2020.emnlp-main.315.

10

Peerat Limkonchotiwat, Wannaphong Phatthiyaphaibun, Raheem Sarwar, Ekapol Chuangsuwanich, and Sarana Nutanong. Handling cross- and out-of-domain samples in Thai word segmentation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 1003–1016. Online, August 2021. Association for Computational Linguistics. URL: https://aclanthology.org/2021.findings-acl.86, doi:10.18653/v1/2021.findings-acl.86.

11

Lalita Lowphansirikul, Charin Polpanumas, Attapol T Rutherford, and Sarana Nutanong. A large English–Thai parallel corpus from the web and machine-generated text. Language Resources and Evaluation, 56(2):477–499, 2022.

12

Tomáš Mikolov, Stefan Kombrink, Lukáš Burget, Jan Černocký, and Sanjeev Khudanpur. Extensions of recurrent neural network language model. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 5528–5531. IEEE, 2011.

13

David D Palmer. Tokenisation and sentence segmentation. Handbook of natural language processing, pages 11–35, 2000.

14

Richard E Pattis. Karel the robot: a gentle introduction to the art of programming. John Wiley & Sons, 1994.

15

Slav Petrov, Dipanjan Das, and Ryan McDonald. A universal part-of-speech tagset. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2089–2096. Istanbul, Turkey, May 2012. European Language Resources Association (ELRA). URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf.

16

Chanatip Saetia, Ekapol Chuangsuwanich, Tawunrat Chalothorn, and Peerapon Vateekul. Semi-supervised Thai sentence segmentation using local and distant word representations. arXiv preprint arXiv:1908.01294, 2019.

17

Sorratat Sirirattanajakarin, Duangjai Jitkongchuen, and Peerasak Intarapaiboon. BoydCut: bidirectional LSTM-CNN model for Thai sentence segmenter. In 2020 1st International Conference on Big Data Analytics and Practices (IBDAP), 1–4. IEEE, 2020.

18

Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 1631–1642. 2013.

19

Srivatsan Srinivasan and Chris Dyer. Better Chinese sentence segmentation with reinforcement learning. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 293–302. 2021.

20

Jakapun Tachaiya, Joobin Gharibshah, Kevin E Esterling, and Michalis Faloutsos. RAFFMAN: measuring and analyzing sentiment in online political forum discussions with an application to the Trump impeachment. In Proceedings of the International AAAI Conference on Web and Social Media, volume 15, 703–713. 2021.

21

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, and others. Llama 2: open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.

22

Ashish Vaswani and others. Attention is all you need. Advances in Neural Information Processing Systems, 2017.

23

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, and others. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824–24837, 2022.

24

Nianwen Xue and Yaqin Yang. Chinese sentence segmentation as comma classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 631–635. 2011.

25

Sumeth Yuenyong and Virach Sornlertlamvanich. TranSentCut-transformer based Thai sentence segmentation. Songklanakarin Journal of Science and Technology, 44(3):852–860, 2022.

26

Caleb Ziems, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, and Diyi Yang. Can large language models transform computational social science? Computational Linguistics, 50(1):237–291, 2024.