Hate Speech Scientific Project

Goal:

To recognize parts of a text that contain hate speech or vulgarisms.

Possible applications:

Management of discussion forums / detection of spam or abuse.
"Postprocessing" for biased generative language models: preventing them from generating inappropriate responses.

Plan:

Perform a review of the state of the art
Pick established (English) corpora
Formalize the problem: sentiment classification, topic recognition, keyword selection
Propose a preliminary system by reproducing an existing approach
Create a small evaluation set in Slovak
Try a multilingual/cross-lingual approach, possibly using machine translation
Annotate a bigger Slovak corpus
Identify and publish the scientific contribution
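
Once a small Slovak evaluation set exists, any candidate system can be scored against it. The following is a minimal sketch in plain Python, assuming a binary labelling scheme; the label names "hate" and "ok" and the function name binary_metrics are illustrative assumptions, not part of the project.

```python
from collections import Counter


def binary_metrics(gold, predicted, positive="hate"):
    """Precision, recall and F1 for the positive class.

    `gold` and `predicted` are equal-length sequences of labels.
    The label name "hate" is a hypothetical choice for this sketch.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == positive and g == positive:
            counts["tp"] += 1  # correctly flagged
        elif p == positive:
            counts["fp"] += 1  # flagged, but gold says otherwise
        elif g == positive:
            counts["fn"] += 1  # missed positive example

    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gold = ["hate", "hate", "ok", "ok"]
    predicted = ["hate", "ok", "hate", "ok"]
    print(binary_metrics(gold, predicted))
```

Reporting precision and recall separately is useful here because hate speech corpora are typically imbalanced, so accuracy alone can look good for a system that flags nothing.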

Future Tasks:

Evaluate an existing multilingual model, e.g. https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi
Translate an existing English dataset into Slovak using the OPUS English-Slovak Marian NMT model. Train a Slovak monolingual model.
Train, fine-tune, or prompt a large language model.
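
The translation task could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's pipeline: the model identifier "Helsinki-NLP/opus-mt-en-sk", the row layout with "text" and "label" keys, and the helper names make_marian_translator and translate_dataset are all assumptions. The translator is passed in as a function so that the dataset logic can be tried without downloading a model.

```python
from typing import Callable, Dict, Iterable, List


def make_marian_translator(model_name: str = "Helsinki-NLP/opus-mt-en-sk"):
    """Build a translate(texts) function backed by a Marian NMT model.

    The model name is an assumption; check that it matches the intended
    OPUS English-Slovak model before use.
    """
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    def translate(texts: List[str]) -> List[str]:
        batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**batch)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate


def translate_dataset(
    rows: Iterable[Dict[str, str]],
    translate: Callable[[List[str]], List[str]],
    batch_size: int = 16,
) -> List[Dict[str, str]]:
    """Translate the "text" field of each row and keep its label."""
    rows = list(rows)
    out = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        translated = translate([row["text"] for row in chunk])
        for row, target in zip(chunk, translated):
            # Keep the source text so translations can be spot-checked later.
            out.append(
                {"text": target, "label": row["label"], "source_text": row["text"]}
            )
    return out
```

A call such as translate_dataset(english_rows, make_marian_translator()) would then produce Slovak rows with the original labels attached. Whether labels survive translation unchanged is itself an assumption worth checking on a sample, since machine translation may soften or distort vulgar expressions.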

In progress tasks:

Annotate a Twitter dataset. Possible guidelines: https://developers.perspectiveapi.com/s/about-the-api-training-data?language=en_US
Annotate a Facebook dataset. Use different guidelines, e.g. sentence-level annotation for context-sensitive hate.
Prepare the existing Slovak Twitter dataset; train and evaluate a model.

Finished tasks:

Perform preliminary experiments with hate speech detection (Bulburu)
Prepare an annotation infrastructure for Facebook data annotation (Ferko)
Gather Facebook data and prepare it for annotation (Ferko)

People:

Ján Staš
Daniel Hládek
Zuzana Sokolová
Vladimír Ferko
Tetiana Mohorian
Patrik Pokrivčák

Former participants:

Links:

https://europeanonlinehatelab.com/
https://hatespeechdata.com/
https://oznacuj-dezinfo.kinit.sk/
